Comments on "Proposal for Extension of Java(TM) Floating Point Semantics, Revision 1"

Joseph D. Darcy, darcy@cs.berkeley.edu
Tue Aug 11 01:47:56 PDT 1998


Sun's "Proposal for Extension of Java(TM) Floating Point Semantics,
Revision 1" (henceforth abbreviated as PEJFPS) has two goals:

* grant some access to extended precision hardware, where available

* allow the x86 to run Java floating point quickly (avoiding
stiff performance penalties from trying to conform exactly to Java's
semantics [Coo, Gol])

PEJFPS addresses these issues by:

* in some contexts, allowing extended formats to be used instead of
pure float and double

* giving the virtual machine wide latitude in deciding when, where,
and whether extended precision will be used


"Except for timing dependencies or other non-determinisms and given
sufficient time and sufficient memory space, a Java program should
compute the same result on all machines and in all implementations."
		    --Preface to The Java(TM) Language Specification

Programmers have different needs for different programs.  Sometimes
bit-wise reproducibility is needed and warranted.  At other times, the
speed of the program is more important than the exact answer computed.
Existing Java semantics favor the former requirement over the latter.
However, in addressing the speed issue, PEJFPS destroys a program's
*predictability*.

For reasons both intrinsic and practical, Java has never lived up to
its "write once, run anywhere" slogan.  Even single-threaded Java
programs are non-deterministic because the garbage collection
algorithm determines the order in which finalize methods are called
([JLS] section 12.6).
However, by specifying the sizes of primitive types and by giving
precise rules for expression evaluation order, Java is much more
predictable than other contemporary languages such as C and C++.  This
increased predictability is the achievable fraction of "write once,
run anywhere."  Therefore, by removing Java's predictability PEJFPS
spoils an important feature of Java while contradicting previously
espoused Java philosophy.
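The difference is easy to exhibit.  Because Java fixes both the sizes
of primitive types and the left-to-right order of operand evaluation
([JLS] section 15.7), the toy method below (class name hypothetical)
has exactly one legal result on every conforming VM; the corresponding
C expression does not:

```java
// Java fixes the size of int and evaluates operands left to right,
// so this expression has exactly one legal value; in C the same
// expression is undefined behavior.
public class EvalOrder {
    public static int sample() {
        int i = 1;
        return i++ + i++;   // 1 + 2: every conforming VM returns 3
    }
    public static void main(String[] args) {
        System.out.println(sample());   // prints 3
    }
}
```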

What does the PEJFPS allow?
---------------------------

Many surprising and perverse situations are allowed by PEJFPS.  Most
stem from the license granted to the virtual machine by PEJFPS section
5.1.8:

"Within an FP-wide expression, format conversion allows an
implementation, at its option, to perform any of the following
operations on a value:"

* promote float to float extended and round float extended to float
* promote double to double extended and round double extended to double

Some of the permissible anomalies this sanctions are discussed below.

How do you know what you are getting?
-------------------------------------

PEJFPS adds various class fields (WIDEFP_MAX_EXPONENT,
WIDEFP_MIN_EXPONENT, etc.) to indicate the range and precision of the
formats that the VM uses in FP-wide contexts (PEJFPS section
20.9-20.10).  It is *permissible* for a VM to set these fields to
indicate extended precision and then, in accordance with section 5.1.8,
round every result to the corresponding base format (float or double).
Therefore, the implementation can legally indicate extended precision
is being used and then not actually use extended precision at all.
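The practical consequence is that a program needing to know must probe
at runtime rather than trust the WIDEFP fields.  A minimal sketch of
such a probe (class and method names hypothetical): under strict
double semantics it yields Infinity, while an implementation keeping
intermediates in double extended would yield 1.0e308.

```java
// Probe whether floating point intermediates carry extra exponent range.
public class WidthProbe {
    public static double probe() {
        double big = 1.0e308;
        // Strict double: big * big overflows to Infinity, and
        // Infinity / big stays Infinity.  With double extended
        // intermediates the product is finite and 1.0e308 comes back.
        return big * big / big;
    }
    public static void main(String[] args) {
        System.out.println(probe());
    }
}
```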

Can assigning a float to a double overflow?
-------------------------------------------

It is possible under PEJFPS for FP-wide contexts to promote float
values to double extended and keep double values as double.
Therefore, assigning a float to a double can overflow!  Even FORTRAN
guarantees width(float) <= width(double).
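The mechanism behind this anomaly is visible even in strict Java: a
float product that overflows in float is finite when carried out in a
wider format.  Under PEJFPS, a float intermediate held in double
extended could likewise exceed double's range, making the subsequent
assignment to double overflow.  A strict-Java illustration (class name
hypothetical, not PEJFPS syntax):

```java
public class FloatOverflow {
    // Overflows in float: Float.MAX_VALUE squared exceeds float range.
    public static float inFloat() {
        return Float.MAX_VALUE * Float.MAX_VALUE;   // Infinity
    }
    // Finite in double: the same product is about 1.16e77.
    public static double inDouble() {
        return (double) Float.MAX_VALUE * Float.MAX_VALUE;
    }
    public static void main(String[] args) {
        System.out.println(inFloat());    // Infinity
        System.out.println(inDouble());   // finite
    }
}
```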

What can a JIT do?
------------------

In a virtual machine composed of an interpreter and a JIT, it may be
faster to interpret code that is only run once (or a few times) than
to compile and execute it.  Therefore, a JIT could perform on-line
profiling to determine which methods are profitable to compile down
to native instructions.  A JIT might also have a limited-size code
cache that evicts compiled code that hasn't been run in a while.

Consider the possible behavior of the code below in the environment
described above:

widefp static double dot(double a[], double b[])
{
  double sum = 0.0;
  for(int i = 0; i < a.length; i++)
    sum += a[i] * b[i];
  return sum;
}

public static void main(String[] args)
{
  double a[] = {Double.MAX_VALUE, -Double.MAX_VALUE, 1.0};
  double b[] = {Double.MAX_VALUE,  Double.MAX_VALUE, 1.0};

  for(int i=0; i < 10; i++)
    System.out.println(dot(a,b));	// What will be printed?

  // ... other code ...

  System.out.println(dot(a,b));		// What will be printed?
}

If dot(a, b) is run with FP-strict semantics, a NaN is generated
(Double.MAX_VALUE * Double.MAX_VALUE overflows to infinity and
(infinity - infinity) is NaN).  If dot(a, b) is run with FP-wide
semantics, 1.0 is returned.
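The strict half of this claim can be checked in ordinary Java today
(class name hypothetical; without widefp only the strict result is
observable):

```java
public class StrictDot {
    public static double dot(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++)
            sum += a[i] * b[i];
        return sum;
    }
    public static void main(String[] args) {
        double m = Double.MAX_VALUE;
        // m * m overflows to Infinity; Infinity + (-Infinity) is NaN,
        // and NaN propagates through the final addition.
        double[] a = {m, -m, 1.0};
        double[] b = {m,  m, 1.0};
        System.out.println(dot(a, b));   // NaN under strict semantics
    }
}
```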

Assume the VM described above uses strict semantics when interpreting
(to reuse an existing code base) and uses wide semantics when
compiling, for better performance.  Therefore, the first five calls to
dot could return NaN (strict interpreting) and the next five could
return 1.0 (wide compiling).  The last call to dot could also return
NaN since dot could have been evicted from the code cache.

PEJFPS allows calls to the same method with the same arguments to vary
over time even in the absence of race conditions and dependence on
external state.  (Successive calls to a random number generator
(hopefully) give different answers depending on state external to the
method.)

Although Sun's HotSpot may always compile a method as soon as it is
called, other JVM JITs could use different policies.

A complication for JIT writers
------------------------------

In Java, it is illegal for one method to override another if the
methods have different return types ([JLS] section 8.4.8.3).  Having a
known return type eases code generation.  For example, since the
caller knows what type to expect, better register allocation and stack
management can be performed.  In contrast, under PEJFPS the format
returned by a floating point method is not knowable at compile time.

class A {
  strictfp double foo(){...}
}

class B extends A {
  widefp double foo(){...}
}


static double bar(){
  A x = (<<a random predicate>>? new B():
                                 new A());

  return x.foo();	// does strict foo or wide foo get called?
}

Dynamic dispatch is a fundamental feature of Java.  Therefore, it is
not always possible to know at compile time which method a particular
call site will call at runtime (because of dynamic linking, the
method that eventually gets called might not even be written yet at
compile time).  In the code for bar above, either B.foo() or A.foo() gets
called, depending on some predicate opaque to the compiler.
Therefore, at compile time, the compiler does not know if a strict 64
bit double or an 80 bit double extended value will be returned
(assuming the widefp context actually uses double extended values).
"The presence or absence of widefp and strictfp modifiers has no
effect whatsoever on the rules for overriding methods and implementing
abstract methods," PEJFPS section 8.4.6.1. Therefore, compiling the
above classes and methods to native code may be problematic unless the
VM enacts some convention such as storing all floating point stack
values as double extended.

On the x86, distinguishing between double and double extended at
compile time is less important since the floating point registers
promote all values to double extended anyway.  However, on
architectures that have orthogonal support for different floating
point formats (separate instructions for each format, fewer registers
for wider formats), knowing the width of the return value is important.

A related problem arises for register spilling; namely, at what width
values get written to memory.  On recent x86 processors, it is
faster to spill 64 bit double values than to spill 80 bit double
extended values (the 64 bit spill instruction has lower latency even
ignoring reduced memory traffic).  Therefore, the speed goals of
PEJFPS tempt a VM implementor to unpredictably spill double values to
memory instead of double extended values, breaking referential
transparency of expressions and reintroducing precision anomalies
found on the Sun III compilers.

No extended format arrays
-------------------------

The loss of storage equivalence between variables and array elements
precludes employing the Java programming idiom of using a one-element
array to pass an extended floating point value by reference (see
http://www.afu.com/javafaq.html section 6, question 21).
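For reference, the idiom reads as follows (class and method names
hypothetical).  Since Java passes primitives by value, the one-element
array is the standard way to let a callee update a caller's floating
point variable; without extended arrays, no such idiom exists for
extended values:

```java
public class ByRef {
    // The array reference, not the double, is copied into the callee,
    // so the caller observes the update to sum[0].
    public static void accumulate(double[] sum, double term) {
        sum[0] += term;
    }
    public static void main(String[] args) {
        double[] acc = {0.0};
        accumulate(acc, 2.5);
        accumulate(acc, 4.0);
        System.out.println(acc[0]);   // prints 6.5
    }
}
```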

Additionally, allowing extended precision arrays would be useful in
solving certain linear algebra problems.

Miscellaneous problems
----------------------

At variance with both IEEE 754 and the current Java specification,
PEJFPS does not require subnormal support in widefp methods (p. 30).
Therefore, widefp variables might be able to represent *fewer*
floating point values than corresponding strictfp variables.
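Strict Java, following IEEE 754, supports gradual underflow, so values
below the smallest normalized double remain representable; a widefp
implementation permitted to flush subnormals to zero would lose them.
A small demonstration (class name hypothetical; Double.MIN_NORMAL is a
constant added to the Double class in later Java releases):

```java
public class Subnormal {
    // Double.MIN_NORMAL (2^-1022) is the smallest normalized double;
    // half of it is a subnormal, which gradual underflow keeps nonzero.
    public static double halfMinNormal() {
        return Double.MIN_NORMAL / 2.0;
    }
    public static void main(String[] args) {
        // true with gradual underflow; an implementation flushing
        // subnormals to zero would print false.
        System.out.println(halfMinNormal() > 0.0);
    }
}
```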

PEJFPS apparently does not allow a fused multiply-accumulate (fused
mac) to be used.  Therefore, PEJFPS ameliorates Java's performance
problems on the x86 but not on the PowerPC.

Having widefp be the default can break existing Java programs that
rely on Java's strict floating point semantics (admittedly such
programs are in the minority).

PEJFPS lacks any mention of library support.  For example, can
extended format values be printed out and read in?

Another way
-----------

The problems identified above are not inherent in any proposal that
addresses the two goals of PEJFPS, allowing better execution speed on
the x86 and providing (at least limited) access to hardware-supported
extended precision.

Many of the problems above are due to the ability of the compiler to
unpredictably choose to use extended precision in an unnameable format.
The easiest way to avoid that problem is to introduce a third
primitive floating point type "indigenous."  The indigenous type
corresponds to the widest format with hardware support on the
underlying processor: 80 bit double extended on the x86, 64 bit
double elsewhere.  Adding a third type restores a close correspondence
between hardware formats and floating point types.  Although
indigenous varies from platform to platform, it is fixed for a given VM
and cannot be varied at runtime.  The new type can also be used in all
floating point contexts, such as arrays and object fields.

The speed problem can be addressed by modifying Java's default
floating point semantics.  For example, running the x86 with the
rounding precision set to double can be deemed sufficient (this allows
greater exponent range but the same precision).  However, some
precautions must be taken to avoid programmer confusion.  For example,
all stores to named variables should round to true float or double.
Alternatively, the current practice of storing after every floating
point operation can be codified (this retains a very minor double
rounding on underflow discrepancy and carries some performance
degradation).
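The double rounding discrepancy can be demonstrated with the
float/double pair (the double/double extended case is analogous; class
name hypothetical): rounding an exact value first to a wider format
and then to the target can give a different answer than rounding it
once.

```java
public class DoubleRounding {
    // Exact value 1 + 2^-25 + 2^-53.  Rounded directly to float it lies
    // just above the midpoint between 1.0f and the next float, so the
    // correct single rounding is up, to 1 + 2^-24.  Rounding to double
    // first discards the 2^-53 bit (ties-to-even), leaving exactly
    // 1 + 2^-25, which then ties down to 1.0f: a double rounding error.
    public static float twiceRounded() {
        double viaDouble = 1.0 + 0x1p-25 + 0x1p-53;
        return (float) viaDouble;
    }
    public static void main(String[] args) {
        float onceRounded = Math.nextUp(1.0f);   // correctly rounded result
        System.out.println(twiceRounded() == onceRounded);   // prints false
    }
}
```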

A new declaration, anonymous <FloatingPointType>, can be used to
control the expression evaluation policy.  The anonymous declaration
promotes all floating point operands narrower than the target type to
the target type and performs all operations on that type.  Stores to
narrower floating point variables are implicitly narrowed as
necessary.  Essentially, "anonymous indigenous" uses pre-ANSI C
expression evaluation rules.  "anonymous indigenous" allows good
hardware utilization (for speed) and provides the benefits of
extra-precise expression evaluation (where available).  "anonymous
double" can be used for better bit-wise reproducibility.  "anonymous
float" means to use the existing Java rules.

The indigenous type and anonymous declarations are described in more
detail in "Borneo 1.0: Adding IEEE 754 floating point support to Java"
[Dar].  One of Borneo's design goals is keeping Java predictable while
enhancing its floating point support.  Borneo also includes support
for required IEEE 754 features omitted both in the original Java
specification and PEJFPS.

-Joe Darcy
darcy@cs.berkeley.edu

References
----------

[Coo] Jerome Coonen, "A Note on Java Numerics," message on the Numeric
interest mailing list January 1997,
http://www.validgh.com/numeric-interest/numeric-interest.archive/numeric-interest.9701

[Dar] Joseph D. Darcy, "Borneo 1.0:  Adding IEEE 754 floating point
support to Java," M.S. thesis, University of California, Berkeley, May
1998, http://www.cs.berkeley.edu/~darcy/Borneo

[Gol] Roger A. Golliver, "First-implementation artifacts in Java(TM)"

[JLS] James Gosling, Bill Joy, and Guy Steele, The Java(TM) Language
Specification, Addison-Wesley, 1996. 



