Java floating point implementation bugs

Thu Aug 12 09:16:28 PDT 1999

This was also posted to RISKS digest, but it seemed somewhat
appropriate for this forum.  See also

  http://www.naturalbridge.com/floatingpoint/

===================================================
Jitterbugs, or Risks of JIT compilation

NaturalBridge has developed a set of tests to ensure
the conformance of our static Java floating point
compilation with the specifications published by Sun
Microsystems. In the course of running these
conformance tests on competing virtual machines
(VMs), we discovered a serious and unexpected result
that potentially affects anyone developing Java
applications, even applications that make no use of
floating point.

Most advanced JITs (just-in-time compilers), such as
Sun's distributions of HotSpot and the Symantec JIT
(both for the Intel/Windows platform), generally
operate in at least two modes: a fast translation but
slow evaluation mode ("low-speed"), and a slow
translation but fast evaluation mode
("high-speed"). JIT technology is ideal for
applications like web browsers where the code cannot
possibly be compiled ahead-of-time because it is not
available. However, the use of two execution modes
has the disadvantage of introducing a new category of
bugs and substantially complicating testing.

We have discovered one area of Java where at least
two advanced JITs use a technique that passes the
conformance and correctness tests in low-speed mode,
but switch to a faster, non-conforming technique in
high-speed mode. We believe this is a consequence of
late compilation hiding the bugs from tests: a
virtual machine may appear to work because its
high-speed evaluation is never tested. One bug we
discovered was serious enough that we believe it
could hardly have escaped a competently written test
suite, had the high-speed mode actually been tested.
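
One way to probe for this kind of mode-dependent
behavior is to evaluate a function once while the code
is still cold, call it many more times, and then compare
the result bits. The following is a minimal sketch, not
the NaturalBridge test suite. The class name, method
name, and warm-up count are hypothetical, and nothing
guarantees that a particular VM will recompile within
the chosen number of calls.

```java
import java.util.function.DoubleUnaryOperator;

/** Hypothetical harness: does a function's result change once the JIT warms up? */
class JitWarmupCheck {

    // Written on every warm-up call so the loop cannot be discarded as dead code.
    static volatile double sink;

    /**
     * Evaluates f on each input once (cold), calls it warmupCalls more times
     * per input, then evaluates it again (warm). Returns how many inputs
     * produced different result bits in the two passes.
     */
    static int countDrift(DoubleUnaryOperator f, double[] inputs, int warmupCalls) {
        // Cold pass: on a dual-mode VM this most likely runs in low-speed mode.
        long[] cold = new long[inputs.length];
        for (int i = 0; i < inputs.length; i++) {
            cold[i] = Double.doubleToLongBits(f.applyAsDouble(inputs[i]));
        }

        // Warm-up: repeated calls give the VM a reason to recompile f.
        // The count needed is VM-specific; warmupCalls is only a guess.
        for (int n = 0; n < warmupCalls; n++) {
            for (double x : inputs) {
                sink = f.applyAsDouble(x);
            }
        }

        // Warm pass: compare bit patterns with the cold results.
        int drift = 0;
        for (int i = 0; i < inputs.length; i++) {
            long warm = Double.doubleToLongBits(f.applyAsDouble(inputs[i]));
            if (warm != cold[i]) {
                drift++;
            }
        }
        return drift;
    }

    public static void main(String[] args) {
        double[] inputs = { Math.PI, Math.PI / 2, Math.pow(2, 62), Math.pow(2, 65) };
        int drift = countDrift(Math::sin, inputs, 100_000);
        System.out.println("inputs whose Math.sin result changed: " + drift);
    }
}
```

A result of zero does not prove conformance. It only
shows that no change was observed for these inputs and
this amount of warm-up.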

The Trig Bug
------------

The discrepancies we discovered are in the
implementation of java.lang.Math.sin and
java.lang.Math.cos. The Java Language Specification
mandates results "as if" computed by a straightforward
translation of the Freely Distributable Libm (fdlibm,
available from netlib) from C to Java. Some
implementations appear to use fdlibm in low-speed
mode, but switch to the Pentium FSIN and FCOS
instructions (which are about twice as fast as an
optimized version of fdlibm) in high-speed
mode. Unfortunately, these instructions are not
suitable for this purpose; they generate incorrect
answers for many inputs:

1.  The Pentium FSIN instruction near PI and the FCOS
instruction near PI/2 produce answers that differ
from the correct results (as defined both by
mathematics and the Java Language Spec) in as many as
38 bits. This behavior is a natural consequence of
range reduction with an insufficiently precise
approximation of PI.

2.  The Pentium FSIN and FCOS instructions are only
defined on the range
(-2^63,+2^63). When passed an input outside of that
range, they return the input and set an error
bit. Some VMs we have tested do not check this bit
(or otherwise reduce the range) and thus in
high-speed mode java.lang.Math.sin(2^65) returns
2^65.  The proper result should be a number between
-1 and 1.

3.  Even in the hardware's defined input range, the
two implementations diverge for large inputs. At the
end of the defined range, there is a 2% difference
between 68-bit argument reduction and precise
argument reduction.
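
The loss of precision in item 1 can be illustrated with
a little arithmetic. For x very close to PI, sin(x) is
essentially PI - x, so an error in the value of PI used
for the subtraction shows up directly in the result. The
sketch below is an assumption-laden model, not a
description of the hardware: it takes the 68-bit figure
from item 3, assumes the approximation is PI truncated
to that many significant bits, and uses a hypothetical
class and method names. It estimates how many leading
bits of the result survive for x = Math.PI.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

/** Hypothetical model of argument reduction with a truncated value of pi. */
class PiReductionError {

    // First 50 decimal digits of pi, far more than this calculation needs.
    static final BigDecimal PI_REF =
            new BigDecimal("3.14159265358979323846264338327950288419716939937510");

    /** pi truncated to the given number of significant bits (two are integer bits). */
    static BigDecimal truncatedPi(int significantBits) {
        BigDecimal scale = new BigDecimal(BigInteger.ONE.shiftLeft(significantBits - 2));
        BigInteger scaled = PI_REF.multiply(scale).toBigInteger(); // drops the fraction
        return new BigDecimal(scaled).divide(scale); // exact: divisor is a power of two
    }

    /** Approximate number of correct leading bits in (piApprox - x) for x = Math.PI. */
    static double correctBitsNearPi(int significantBits) {
        BigDecimal x = new BigDecimal(Math.PI);       // exact value of the double
        BigDecimal exact = PI_REF.subtract(x);        // sin(x) ~ pi - x this close to pi
        BigDecimal approx = truncatedPi(significantBits).subtract(x);
        double relErr = approx.subtract(exact).abs().doubleValue() / exact.doubleValue();
        return -Math.log(relErr) / Math.log(2.0);     // bits of agreement
    }

    public static void main(String[] args) {
        double good = correctBitsNearPi(68);
        System.out.println("correct bits with 68-bit pi: " + good);
        System.out.println("incorrect bits out of 53:    " + (53 - good));
    }
}
```

Under these assumptions the model should report roughly
15 correct bits, which leaves about 38 of the 53
significand bits wrong, in line with the figure quoted
in item 1.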

We have made tests available that demonstrate this
bug, as well as other floating-point problems.  We also
have a more complete description of the problems with
the Intel transcendental instructions.

See
  http://www.naturalbridge.com/floatingpoint/floattestdesc.html
  http://www.naturalbridge.com/floatingpoint/intelfp.html
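
For readers who want a quick check of their own, the
following is a rough sketch in the spirit of those
tests, not the tests themselves. It assumes a Java
release that provides java.lang.StrictMath (added after
this note was written), whose results are specified in
terms of fdlibm, and uses it as the reference. The class
and method names are hypothetical. The implementation
under test is passed in as a parameter, and the
tolerance maxUlps is left to the caller; 0 demands
bit-for-bit agreement with the reference.

```java
import java.util.function.DoubleUnaryOperator;

/** Hypothetical conformance check against the fdlibm-based StrictMath.sin. */
class TrigConformanceCheck {

    // Maps a double to a long so that numeric order matches integer order
    // (+0.0 and -0.0 both map to 0).
    static long orderedBits(double d) {
        long bits = Double.doubleToLongBits(d);
        return bits < 0 ? Long.MIN_VALUE - bits : bits;
    }

    /** Steps from a to b counted in representable doubles; intended for values in [-1, 1]. */
    static long ulpDistance(double a, double b) {
        return Math.abs(orderedBits(a) - orderedBits(b));
    }

    /** True if impl's sine of x is within maxUlps of StrictMath.sin(x). */
    static boolean conforms(DoubleUnaryOperator impl, double x, long maxUlps) {
        double got = impl.applyAsDouble(x);
        double ref = StrictMath.sin(x);
        if (Double.isNaN(ref)) {
            return Double.isNaN(got);   // NaN and infinite inputs
        }
        if (!(Math.abs(got) <= 1.0)) {
            return false;               // e.g. sin(2^65) returned as 2^65, or NaN
        }
        return ulpDistance(got, ref) <= maxUlps;
    }

    public static void main(String[] args) {
        // Inputs taken from the three problem areas listed above.
        double[] inputs = { Math.PI, Math.pow(2, 62), Math.pow(2, 65) };
        for (double x : inputs) {
            System.out.println(x + " -> " + conforms(Math::sin, x, 0));
        }
    }
}
```

Because the high-speed mode may only engage after many
calls, a check like this is most useful when it is
repeated after the VM has had time to recompile the code
under test.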

Significance Beyond Floating Point 
---------------------------------

The points that we wish to stress are: 

1.  This is a new category of bug. People have dealt
with compiler bugs in the past, usually by testing
(after compilation) exactly what they ship. Here,
not only is such testing impossible, but the
shipped application runs the risk of encountering the
bug mid-execution. We don't really know if it matters
that java.lang.Math.sin changes mid-execution, but
the programmer probably doesn't expect it.

2.  JITs make it too easy to miss VM implementation
mistakes. Unless the regression tests for a virtual
machine are carefully written, they may not ever
exercise the high-speed mode. As JITs become even
more sophisticated, more sensitive to program
behavior, and acquire both more optimization levels
and larger collections of situation-specific
optimizations, the testing harness will have to
improve with them.

Such specialization is claimed to deliver very high
performance, but in practice (as demonstrated above)
compilers do have bugs, and it is foolish to pretend
that they do not. Batch compiler vendors reduce the
number of bugs encountered by encouraging the use of
a small set of well-tested optimization flags (for
instance, the "-fast" option of Sun's C and Fortran
compilers), and batch compiler users reduce their
exposure to bugs by testing the same compiled code
that they intend to ship.

3.  Dual-mode evaluation also makes it very difficult
for someone on the outside to evaluate the quality of
a virtual machine. Performance is easily measured,
but how can a user tell if a vendor has cut corners
in their high-speed mode?

4.  Dynamic compilation is becoming very
complex. Optimizing compilers are known for their
complexity, but they take one input and produce one
output as an off-line problem. Improving interpreter
performance is a fine goal, but attempting to meet or
exceed the performance of batch optimizing compilers
requires all the same complex transformations that
batch compilers apply, but with tight time and space
constraints in a multithreaded environment.

David Chase and Ken Zadeck
chase@naturalbridge.com
zadeck@naturalbridge.com