Matlab's Loss is Nobody's Gain

Joseph D. Darcy darcy@cs.berkeley.edu
Mon Sep 7 23:51:34 PDT 1998


> David Chase's comments on http://www.cs.berkeley.edu/~wkahan/JAVAhurt.pdf

> p.11 I don't buy the "more than one" Complex class claims.  Or
> rather, that there will be no problem of confusion, unless the three
> Complex classes are made interchangeable behind either an abstract
> base class or an interface (which would make them slower and
> difficult to inline and optimize).  One complex class might be
> called java.lang.Complex, another might be org.fsf.lang.Complex, yet
> another could be edu.uc.berkeley.kahan.Complex.  Code written to use
> one would not (and could not) use another unless its source code
> were modified.

Well, that depends.  A class with the same name could be evolved over
time (or replaced by another version) without any change in the client
code (dynamic linking is the norm in Java).  Different VM providers
might also include classes with the same name but slightly different
semantics.  In general, someone may want to write software (e.g. an
equation solver for complex numbers) that can work with whatever
version of complex someone else may want to use.
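
For instance, such a solver could be written against a shared
interface.  A minimal sketch (the interface and its methods are my
invention, not from any actual proposal):

public interface Complex {
    double re();                  // real part
    double im();                  // imaginary part
    Complex plus(Complex other);  // complex addition
    Complex times(Complex other); // complex multiplication
}

A solver written against this interface would work with
java.lang.Complex, org.fsf.lang.Complex, or any other implementation,
at the cost of the interface dispatch Chase mentions above.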

[the complex type]

> p.15, the same.  What is it about (x,y) versus
> x+y*I that produces these anomalies?  To me (a
> compiler backend person) these differences appear
> to be mere syntactic sugar; there must be something
> peculiar in the way that Fortran systematically
> translates the complex operations.

When double and complex numbers are combined, Fortran promotes the
double to a complex number with a zero imaginary component.  This
causes a number of problems:

* the sign of zero is not preserved properly (leading to the graphical
errors discussed in the JavaHurt document)

* spurious exceptions are possible (introducing a zero imaginary
component and multiplying by a complex infinity produces a spurious
NaN, etc.)

These same problems would arise if all purely imaginary values were
represented as complex numbers with a zero real component.
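
To see where the spurious NaN comes from, apply the usual formula
(a + bi)(c + di) = (ac - bd) + (ad + bc)i after promotion.  A minimal
sketch in Java (the class name and pair-of-doubles representation are
mine):

public class PromotionNaN {
    public static void main(String[] args) {
        // the complex infinity (inf + inf*i)
        double a = Double.POSITIVE_INFINITY, b = Double.POSITIVE_INFINITY;
        // the real number 2.0 after promotion to (2.0 + 0.0*i)
        double c = 2.0, d = 0.0;

        double re = a * c - b * d; // inf*2 - inf*0 = inf - NaN = NaN
        double im = a * d + b * c; // inf*0 + inf*2 = NaN + inf = NaN
        System.out.println(re + " + " + im + "i"); // NaN + NaNi

        // Scaling by the real number directly gives the expected answer:
        System.out.println(a * 2.0 + " + " + b * 2.0 + "i"); // Infinity + Infinityi
    }
}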

> p. 16

> "Linguistically legislated exact reproducibility is unenforceable"
>  ????  What about Sun's Java trademark?  They can, in theory,
>  block anyone from using the word "Java" who does not conform
>  exactly.  "J++", of course, is not trademarked by Sun.

If users had to wait for a Java VM that strictly followed the Java
specification, we would still be waiting.  Even Sun's own VMs have
fallen short in the floating point area; base conversion wasn't
correct until 1.1, and at least some versions of 1.1.n have
non-conforming transcendental functions.

As you point out later, Java already includes several sources of
non-determinism such as threads and finalize methods (not to mention
more mundane issues like different amounts of memory from system to
system).

> p. 20, quibbling about use of the term "exception"
> in Java not matching IEEE-754.  This is petty and
> distracting.  Java's use of the word follows
> Modula-3 and C++, which follow Modula-2+, which
> follows Cedar, which follows Mesa, which probably
> got it from somewhere else.

While the wording is pejorative, the use and meaning of terminology is
an important issue.  Even among different programming languages, the
word "exception" has slightly different semantics.  For example,
exceptions may be synchronous or asynchronous.  Exceptions might also
be precise (as in Java) or imprecise (as in Ada).  Some languages,
like Mesa and PL/I, allow resumption of the block exited by the
exception while others, like Java and C++, do not.  Clearly such
differences affect how exception handlers are written.

In IEEE 754-speak, an exception doesn't necessarily imply anything
about control flow; an exception is an event, and the policy to apply
when that event occurs (traps or flags) is up to the programmer.  "The
Java(TM) Language Specification" states that "Java floating-point
operators produce no exceptions" (section 11).  This says that Java
floating point operators don't throw Java-language exceptions (which
roughly correspond to a restricted form of trapping).  What about
sticky flags, the other IEEE 754 exception handling policy?  They are
simply omitted from Java.

> p.47
> However, the speed claims for use of "double" as an
> intermediate value not slowing down Intel are not
> correct (as near as I can tell, without making other
> changes to the language). 

[on the x86, the calculations on the double format can have a
discrepancy due to a problem with double rounding on underflow.
Getting rid of this small discrepancy can be expensive.]

> but for float
> it is sufficient (?) to merely load, perform the
> double op, and store (I'm almost certain this is
> correct for multiplication, and I think it works
> for division).

Yes, promoting float operands to double, doing the operation in
double, and rounding the double answer back to float gives the same
answer as doing the operation all in float.  In general this
relationship holds between any two IEEE 754 floating point formats if
the wider format has at least 2p + 2 significand bits, where p is the
number of significand bits of the narrower format.  For float p = 24
and for double p = 53; 53 >= 2*24 + 2 = 50.  (For a sketch of the
proof, see Goldberg's appendix to Hennessy & Patterson's "Computer
Architecture: A Quantitative Approach, Second Edition.")
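
The multiplication case is easy to spot-check (random sampling is of
course no substitute for the proof Goldberg sketches; the class name
is mine):

import java.util.Random;

public class FloatViaDouble {
    public static void main(String[] args) {
        Random r = new Random(42);
        for (int i = 0; i < 1000000; i++) {
            float a = Float.intBitsToFloat(r.nextInt());
            float b = Float.intBitsToFloat(r.nextInt());
            float direct = a * b;                              // all in float
            float widened = (float) ((double) a * (double) b); // via double
            // floatToIntBits canonicalizes NaNs, so NaN == NaN here
            if (Float.floatToIntBits(direct) != Float.floatToIntBits(widened))
                System.out.println("mismatch: " + a + " * " + b);
        }
        System.out.println("done");
    }
}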

>  Replacing float ops with double
> ops will slow programs down on Intel.

Yes, double rounding on underflow can cause problems on the x86.  A
slight change to the language's semantics can avoid that problem (VMs
on the x86 generally don't handle double rounding on underflow
according to the Java spec).  So, for the techniques currently used in
practice, using double instead of float on the x86 shouldn't introduce
much of a speed penalty.  Alternatively, Roger Golliver has come up
with a new technique to make the x86 avoid the double rounding problem
for marginally more cost than the currently used store-reload method.
Golliver's technique is described in more detail in the forthcoming
report from the Java Grande Numerics Working Group (see
www.javagrande.org after September 15).

> p. 56
> The paper needs a better example to justify access
> to directed rounding modes.  Debugging tools are
> often extralingual, which is another way of saying
> that debugging tools need not be specified as part
> of the language.  Support for this directed rounding
> debugging trick can be added either:

[use a special rounding mode debugger or hack the bytecode]

Even to use traditional debugging tools, there often has to be
co-operation between the compiler and the debugger, shared formats for
symbol table information, etc.  (Tools such as Purify may collude with
the compiler by recognizing compiler and platform specific code
generation idioms.)  Ideally, a debugger runs in an environment as
close as possible to the production code environment; for example,
certain types of bugs, such as race conditions, can vanish when a
debugger is used.  Changing the rounding mode dynamically allows a
more controlled experiment to be performed; using a different dynamic
rounding mode shouldn't have any performance impact (as long as the
values stay in approximately the same range).  Sometimes taking a one
or two order-of-magnitude performance hit is not acceptable.

It is not unheard of for languages to impose semantically transparent
requirements on compiler and runtime implementations.  For example,
Scheme requires that tail calls be optimized to allow recursion to be
used to implement looping without overflowing the call stack.

> From the point-of-view of someone who prefers to
> view programs as having a single defined behavior
> (and not a collection of behaviors parameterized
> by a collection of global knobs) I've never quite
> understood the use of mode bits to control FP rounding;
> the DEC Alpha approach makes much more sense to me.

The DEC Alpha also supports dynamic rounding modes.  On the Alpha
architecture, two bits of the floating point opcode indicate which
rounding mode to use; three of the four bit patterns select a static
rounding mode, and the fourth indicates that the rounding mode
currently in the floating point control register should be used.
Round to nearest is one of the three rounding modes that can be
hard-coded into the instruction; this is how Alpha compilers generate
code by default.  Compiling a program to use any non-default or
dynamic rounding mode requires special compiler flags.

> I can see where it might be helpful to modify it in
> a lexical sense (I thought that Borneo worked
> this way, perhaps I am wrong)

The Borneo language supports rounding mode control with lexical scope;
the lexically set value may be dynamic at runtime.  Borneo views using
dynamic rounding modes to find roundoff problems as a debugging
technique.  As a debugging technique, it does not get support in the
language per se.  Instead, Borneo has certain code generation
requirements to ensure that the debugging technique can be used.

> Also, Borneo's implementation strategy (native calls
> that tweak the rounding modes) is unlikely to work in
> all JVM's; upon hearing about Borneo's trick,

The Borneo document discusses several options to implement rounding
mode and sticky flags support.  One option is to use a modified VM
that has instruction-level support for these operations.  Until such a
VM is available, to allow Borneo code to run on existing JVMs, Borneo
specifies how to desugar new language constructs into Java with native
method calls.  For example, in Borneo the block

{
  rounding Math.TO_POSITIVE_INFINITY;
  Some code...
}

is equivalent to the Java program

{
  int savedRM = getRound();   // save the current rounding mode
  try
    {
      // rounding Math.TO_POSITIVE_INFINITY;
      setRound(Math.TO_POSITIVE_INFINITY);
      Some code...
    }
  finally
    {
      setRound(savedRM);      // restore it, even if an exception was thrown
    }
}

(assuming the Java compiler doesn't optimize in nasty ways).  The
finally block is necessary to reset the rounding mode in case an
exception occurs in "Some code..."  Since there are asynchronous
exceptions in Java, an exception can potentially be thrown between any
two instructions.

> my reaction was that there was another piece of machine
> state that I need to remember to reset upon return from
> a native call. In other words, Borneo exploits a bug in
> current Java implementations.

As listed in the first edition of "The Java(TM) Programming Language,"
one situation where native methods are appropriate is when "an
application must use system-specific features not provided by Java
classes."  IEEE 754 sticky flags and rounding modes are such features.
Yes, it is possible to write a Java class that implements IEEE 754
arithmetic, but only with about a 30X speed penalty.  Why should
programmers wanting to use these features in Java have to make their
machines run like machines from 5 years ago just because the designers
of Java didn't deem fuller IEEE 754 floating point support to be
worthwhile?

While it is possible to classify not restoring the rounding mode to
round to nearest after a native method call as a bug, I think such a
change would most likely be made deliberately by the programmer.
There are other situations where compilers refrain from performing
transformations that might otherwise make sense.  For example, gcc
doesn't optimize away a loop with no body.  Why?  Because the
programmer is probably writing an empty loop to make the computer
pause; optimizing away the loop would defeat the intent of the
programmer's code.  Similarly, while acting in accordance with the
Java specification, always resetting the rounding mode to round to
nearest after a native method call may defeat the intent of someone's
code.

>  (I'm not doing this to be
> perverse, I'm implementing Java in conformance with
> Sun's spec, which includes continuing to run in the
> proper fashion after native calls.

Native methods can do much worse damage to a running Java program's
state than changing the rounding mode.  From the JNI (Java Native
Interface) specification: "The programmer must not pass illegal
pointers or arguments of the wrong type to JNI functions.  Doing so
could result in arbitrary consequences, including a corrupted system
state or VM crash."

A reasonably fast Borneo implementation must use native methods to
change the rounding mode and manipulate the sticky flags.
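
A sketch of the native hooks such an implementation might expose,
echoing the getRound/setRound calls in the desugaring above (the class
and method names here are illustrative, not Borneo's actual API):

public final class FpMode {
    static { System.loadLibrary("fpmode"); }  // platform-specific glue

    public static native int  getRound();           // read the FPU rounding mode
    public static native void setRound(int mode);   // set it
    public static native int  getFlags();           // read the IEEE 754 sticky flags
    public static native void clearFlags(int mask); // clear selected flags
}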

> p.75 Again, note the distinction between Java source
> compatibility and Java VM compatibility.  The changes
> at the VM level from 1.0 to 1.1 to 1.2 have been
> relatively minor (it depends somewhat on the VM
> implementation strategy, but in my experience there
> were no changes from 1.0 to 1.1, and only the addition

Going from Java 1.0 to 1.1 adds reflection and JNI.  Implementing
reflection requires some changes to the VM since reflection basically
queries the runtime state of the VM.  These changes from 1.0 to 1.1
certainly aren't inconsequential; Microsoft's refusal to implement JNI
in its VM was part of the impetus for the ongoing lawsuit between Sun
and Microsoft.

> Because of the Java/JVM split, relaxing associativity
> is likely to be tricky to specify in a way that is
> useful, safe, and efficient.  Compilation from Java
> to bytecodes is (currently) not machine-specific, so
> there's little machine-specific reassociation that
> can occur here.

Yes, some thought must go into specifying what taking advantage of
associativity can entail.  For example, since relaxing associativity
is done for speed, Java's precise exception requirement should also be
relaxed.
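
To make the interaction concrete (the example is mine, not Chase's):

// Summing an array is the classic candidate for exploiting
// associativity: a compiler would like to rewrite this loop as
// several interleaved partial sums.  But then at no point during the
// loop does "sum" hold the exact sequential prefix sum, so if an
// exception arrives mid-loop (and Java permits asynchronous ones),
// the precise state the spec requires simply doesn't exist.
static double sum(double[] a) {
    double sum = 0.0;
    for (int i = 0; i < a.length; i++)
        sum += a[i];
    return sum;
}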

>  On the other hand, all the lexical
> structure present at the source level is lost at
> the bytecodes, so a bytecode-to-native compiler
> (whether batch or better-late-than-never) will not
> see the boundaries across which reassociation should
> not use.

The compiler can emit an extra class file attribute that lists what
ranges of bytecodes can be re-associated.
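
One hypothetical shape for such an attribute, in the struct-like
notation of the JVM class file specification (the name and layout are
inventions for illustration):

ReassociableRanges_attribute {
    u2 attribute_name_index;  // constant pool index of "ReassociableRanges"
    u4 attribute_length;
    u2 range_count;
    {   u2 start_pc;          // first bytecode offset of a re-associable region
        u2 end_pc;            // offset just past the region
    } ranges[range_count];
}

Since the class file format requires VMs to silently ignore attributes
they do not recognize, older VMs would simply run the code without
re-associating.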

>  (There's no distinction between temporaries
> and variables at the bytecode level, either.)

That depends on the compilation strategy.  It is certainly possible to
store each intermediate result in an explicit compiler-generated local
variable.  However, a more natural compilation will use the operand
stack to hold (at least some) anonymous values.
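
For example, javac compiles d = a + b * c into something like the
following (a javap-style listing; the local variable slots are
illustrative), where the product b * c exists only on the operand
stack:

dload_1   // push a
dload_3   // push b
dload  5  // push c
dmul      // b * c -- this intermediate lives only on the operand stack
dadd      // a + (b * c)
dstore 7  // store the result into d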

-Joe Darcy
darcy@cs.berkeley.edu




