Comments on Sun's Proposal for Extension of Java Floating Point in JDK 1.2
Samuel Figueroa
figueroa at apple.com
Mon Aug 10 09:44:52 PDT 1998
Here are my personal comments on Sun's Proposal for Extension of Java
Floating Point in JDK 1.2 (see http://java.sun.com/feedback/fp.html).
Executive summary:
Sun's desire to improve the floating-point aspects of the Java language, as
evidenced by the recently announced proposal, is welcome. Sun should be
applauded for not yielding to the temptation of proposing rash, radical changes
to the Java language, since this could invite strong disagreement among Java
users and licensees, possibly resulting in no changes in the near future - a
few well thought out improvements are much better than no improvements at all.
However, some small changes would significantly strengthen Sun's current
proposal. Specifically, floating-point arithmetic in widefp methods should be
more predictable (to reduce the "write once, debug everywhere" syndrome), and
implementors should be allowed to take advantage of the fused multiply-add
instruction that many processors have - potential performance improvement
should not be confined to Intel-style processors only. These changes are
sketched below in outline-like form (and summarized in the conclusion), along
with some suggestions for additional possible improvements that will hopefully
be taken into consideration.
A guess as to what might have been the goals of Java's designers in the area of
floating-point arithmetic
- for the original Java spec:
a) bit-for-bit identical results on all conforming implementations
b) make floating-point arithmetic available in the most straightforward
way possible on the grounds that the simplest semantics best serves the
naive user
c) don't complicate things with all the frills of the IEEE Standard, which
are probably of use only to experts
d) oh, and by the way, this can be implemented very efficiently on SPARC
processors, whose arithmetic is, after all, a model implementation of
the IEEE Standard
- reasoning for the current proposal seems to be along the line of:
not all processors have floating-point engines like that of the SPARC;
let's see if we can loosen the semantics a little bit so that in
particular, JVMs running on Intel processors can perform reasonably well
What should have been the goal of the Java spec in the area of floating-point
arithmetic?
- in retrospect, given the wide popularity of Java and the desire to use it
for such a wide variety of applications, the goal should have been to
make it easier for naive users to write numerical code that is
satisfactorily robust, while not unduly impacting performance or
throwing unnecessary roadblocks in numerical experts' paths;
this means:
a) allow appropriate use of wider precision to protect naive users from
their own mistakes, or at least so as to allow implementations to
provide greater accuracy
b) give implementors flexibility for the sake of performance on a wide
variety of processors - not just SPARC and Intel
c) provide at least the essential "expert" features; these features don't
have to be easily accessible if controlling language complexity is a
consideration ("expert" features include:
- controlling expression evaluation mode, possibly on an
operation-by-operation basis
- manipulating the rounding mode
- accessing the sticky status flags
- floating-point trap handling
- enabling/disabling features such as fused multiply-add or abrupt
underflow ("flush to zero")
- enabling/disabling double rounding and extra range on
double-extended-based processors
- determining whether certain features are available, such as support
for precision wider than double, fused multiply-add, abrupt underflow,
and trap handling)
d) make floating-point semantics sufficiently predictable so that error
analysis at least becomes tractable
e) as a concession to those [marketing folks?] who feel they absolutely
need this, make getting bit-for-bit identical results across diverse
implementations achievable as a secondary consideration
To what extent do the Java spec and current proposal achieve the ideal goals?
- in theory, floating-point semantics can be predictable, and bit-for-bit
identical results are possible, though in practice this is not currently
the case
- the current proposal allows some flexibility (at the expense of
predictability) so that JVMs running on Intel processors can have better
performance; performance on virtually all other processors is unchanged
with this proposal
How both the Java spec and current proposal fall short of the ideal goals
- the current proposal falls short on the first four of the five goals
above:
a) it doesn't do enough to protect naive users from their mistakes, since
implementations are not required to evaluate expressions using wider
precision, nor even to be consistent about it
b) it misses potential performance improvements on processors other than
Intel and SPARC
c) it doesn't provide access to "expert" features
d) floating-point semantics are no longer predictable (except in strictfp
mode, in which case performance may not be acceptable)
- the original Java spec did meet the fourth goal (floating-point
semantics were predictable), but at the expense of a very significant
performance penalty on Intel processors and potentially less
protection for naive users (especially when programmers do not use
robust formulas that produce acceptable results even when intermediate
results are not computed to high accuracy)
How could Java be changed to better meet the ideal goals?
- What specific modifications to the current proposal would give the
greatest "bang for the buck"?
a) permit wider precision, but in a way that is predictable
- in all explicitly widefp code, all expressions should be required to
always be evaluated in double or always in double extended, depending
on which of these two generally leads to better performance; if the
underlying arithmetic engine does not support double extended,
expressions would always be evaluated in double
- results should always be narrowed to the appropriate format on
assignment in order to reduce surprising behavior - unfortunately,
this excludes having variables with wider precision, but it makes the
language simpler (variables with wider precision would need to be
called something other than double - it's too confusing to have double
sometimes mean double and sometimes not)
- narrowing should also be required when casting, when calling strictfp
methods, and when returning to strictfp methods (i.e., if the
underlying arithmetic engine supports double extended, parameters in
widefp methods should be passed in double extended format, and fp
return values of widefp methods should be in double extended format;
this would make the behavior of functional notation identical to infix
notation, so that "add(a/b, c) * d" would give the same result as
"(a/b + c) * d", assuming add() simply adds two numbers together)
- narrowing should not be allowed in any other cases
- implicitly widefp methods should all be treated as either explicitly
strictfp methods or as explicitly widefp methods, at the
implementors' option, never sometimes one way and other times another
way
- a globally-accessible constant should be made available to indicate
whether implicitly widefp methods are always treated as explicitly
strictfp or widefp methods
- another globally-accessible constant should be made available to
indicate whether expressions within widefp methods are evaluated to
double or double extended precision
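To make the predictability argument concrete, here is a small sketch in ordinary Java (using an explicit cast to play the role of the proposed narrowing-on-assignment rule) contrasting strict per-operation float arithmetic with evaluation in a wider format followed by a single narrowing:

```java
public class WideEvalDemo {
    public static void main(String[] args) {
        float a = 16777216f; // 2^24: adjacent floats are 2 apart here
        float b = 1f;

        // Strict evaluation: every operation rounds to float, so each
        // increment of 1 is lost (16777217 rounds back to 16777216).
        float strict = (a + b) + b;                  // 16777216

        // Wide evaluation: intermediates carried in double, with a
        // single narrowing on assignment, as advocated above.
        float wide = (float) (((double) a + b) + b); // 16777218

        System.out.println("strict = " + strict);
        System.out.println("wide   = " + wide);
    }
}
```

Requiring every widefp implementation to behave like exactly one of these two evaluations, consistently, is what would make error analysis tractable.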
b) permit, but not require, fused multiply-add in widefp mode
- interpreted code would probably never use it, whereas compiled code
would probably use it whenever possible
- predictability is probably not as critical a consideration, since no
new formats are involved - values in fp registers do not have more
precision than values stored in memory
- however, whenever necessary, control over when fused multiply-add must
be used and when it must not could be achieved by invoking new methods
to be added to java.lang.Math, instead of using infix notation for
arithmetic expressions
- all implementations should be required to provide a fused
multiply-add method in java.lang.Math for each of the fp data types;
these methods should be required to obey the standard semantics for
fused multiply-add (exact product, rounding only after the addition),
even if they must be implemented in software
- a type-specific constant should be made available to indicate whether
there is hardware support for fused multiply-add, i.e., whether the
fused multiply-add methods in java.lang.Math are implemented purely
in software or not; this would allow the language processor to choose
between two different algorithms - one that exploits fused
multiply-add and one that doesn't - based on which algorithm is faster
- if performance is important, a JIT or traditional compiler would be
used, either of which would be capable of exploiting fused multiply-
add, if available
- if an interpreter is used, performance is presumably not important,
so even if a bad choice is made about which algorithm to use (e.g.,
the algorithm that exploits fused multiply-add is chosen even though
the interpreter cannot actually use the fused multiply-add
instruction), it doesn't matter - the right result will be obtained
anyway, because places that actually require the fused multiply-add
operation will invoke the fused multiply-add method
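As it happens, Java eventually adopted exactly what is proposed here: Math.fma, added in Java 9, computes the exact product and rounds only once after the addition, and implementations must honor those semantics even where no hardware instruction exists. A small sketch of why the single rounding matters - recovering the rounding error of a product, which two-step evaluation loses entirely:

```java
public class FmaDemo {
    public static void main(String[] args) {
        double x = 1.0 + Math.ulp(1.0);   // 1 + 2^-52
        double prod = x * x;              // exact square is 1 + 2^-51 + 2^-104,
                                          // which rounds to 1 + 2^-51

        // Two-step evaluation: the product is rounded before the
        // subtraction, so the residual is lost completely.
        double naive = x * x - prod;      // 0.0

        // Fused multiply-add: exact product, one rounding at the end,
        // so the residual 2^-104 is recovered exactly.
        double residual = Math.fma(x, x, -prod);

        System.out.println("naive    = " + naive);
        System.out.println("residual = " + residual);
    }
}
```

This residual-recovery trick is the building block of double-double arithmetic and compensated algorithms, which is why software access to a correct fused multiply-add matters even on processors without the instruction.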
c) add methods to java.lang.Math that would allow one to write fp
arithmetic expressions in functional notation
- the semantics of these methods would be identical to the current Java
semantics for fp operations
- fused multiply-add should be one of these methods
- these methods can be used to avoid double rounding at all costs in
(implicitly or explicitly) widefp methods, and to either force the
use of fused multiply-add, or prevent its use
- these methods would likely be used only by "experts," and even then
only rarely
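A minimal sketch of what such functional-notation methods might look like. The class and method names below (StrictOps, add, mul, div) are invented here for illustration - nothing like this exists in java.lang - and the strictfp modifier stands in for whatever mechanism would guarantee one rounding to double per operation:

```java
// Hypothetical sketch - StrictOps is not a real API.
final strictfp class StrictOps {
    private StrictOps() {}

    // Each method performs exactly one fp operation, rounded to double,
    // regardless of any wide evaluation mode in effect in the caller.
    static double add(double a, double b) { return a + b; }
    static double mul(double a, double b) { return a * b; }
    static double div(double a, double b) { return a / b; }
}
```

Inside a widefp method, an expert could then write StrictOps.mul(StrictOps.add(StrictOps.div(a, b), c), d) to force each intermediate result of (a/b + c) * d to be rounded to double.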
- What else would it take to make Java a more ideal language for numerical
computing?
a) allow slightly more flexibility in strictfp mode to lessen performance
impact
- allow double rounding in strictfp mode when result is within double's
denormal range (i.e., on Intel processors, it should be sufficient to
set rounding precision to double, then store-load after every fp
operation)
- bit-for-bit identical results could still be very nearly always
achievable, and performance could at least double in some cases,
though performance might not equal that of a widefp method
b) add a data type so wider precision can be referred to by name
- this can be similar to Real Java's doubleN or C's long double data
type; i.e., double extended precision if the underlying arithmetic
engine supports it and performance is close to double arithmetic,
double otherwise
- this would be the format used when evaluating explicitly widefp code
- one possible name for this data type could be "widefp double"
- this will be especially important in the future as wider precision
becomes more commonly supported in hardware
c) provide more complete support for the IEEE Standard by making available:
- signaling NaNs
- more than one quiet NaN
- float version of square root
- different rounding modes
- (sticky) status flags
- maybe even trap handling
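One item on this list, a float version of square root, can already be expressed correctly with what the language provides: Math.sqrt is correctly rounded in double, and double's 53 significand bits exceed the 2x24+2 bits needed to make the second rounding harmless for square root, so narrowing the double result yields the correctly rounded float result. A sketch:

```java
final class FloatMath {
    // Correctly rounded single-precision square root.
    // Math.sqrt is correctly rounded in double; for square root,
    // narrowing that result to float introduces no double-rounding
    // error, so the float result is correctly rounded as well.
    static float sqrt(float x) {
        return (float) Math.sqrt(x);
    }
}
```

The other items (signaling NaNs, rounding modes, sticky flags, traps) genuinely require new language or library support.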
d) allow implementations to provide greater accuracy, e.g., for elementary
functions
- give programmers a choice between fast, accurate, and standard
(i.e., current) versions of the elementary functions
e) for the sake of completeness, one might consider adding a nonstandard
mode that would permit, e.g., abrupt underflow ("flush to zero"), some
compiler optimizations that are currently forbidden, and unpredictable
(but possibly more efficient) expression evaluation; however, this
kind of feature probably doesn't fit very well with the rest of the
language - though note that the spirit of the current proposal is
somewhat along these lines, since widefp mode is very loosely
specified, unlike the rest of the Java language
Conclusions and recommendations
- Is the current proposal ready to be set in stone?
No, because:
a) behavior of fp arithmetic is not predictable except in strictfp mode,
which exacts a very high performance penalty on some processors; i.e.,
the current proposal's widefp mode almost amounts to nonstandard
(do-whatever-you-want) arithmetic
b) the current proposal does not allow the use of fused multiply-add
c) allowing double to really mean double extended is too confusing and
complicates the language unnecessarily
d) the proposal strongly tempts implementors to disregard its
requirements (e.g., in strictfp mode) for the sake of marketing
advantages, such as greater performance, or to provide an
implementation with good performance but poor usability due to the
poor quality of its arithmetic
- Can this proposal be strengthened significantly without redesigning the
whole language?
Yes, a few small changes would improve the proposal significantly:
a) fp arithmetic in widefp methods should be more predictable
b) fused multiply-add should be allowed in widefp methods
c) experts should be able to control when widefp arithmetic and fused
multiply-add are used (by marking methods as explicitly widefp or
strictfp, as the current proposal allows, and by calling methods in
java.lang.Math when double rounding or fused multiply-add must be
avoided at all costs, or to force fused multiply-add to be used);
nonexpert users will rarely need to make use of these features
d) maybe allow double rounding in strictfp mode when result is in denormal
range, since this can improve performance significantly, and happens
very rarely
e) potential performance improvement might be slightly less (on Intel
processors, but somewhat more on other processors) than what the
current proposal allows, but performance improvement would still be
very significant, and the quality of Java's floating-point arithmetic
would be enhanced substantially over the current proposal
- What additional modifications should be made in the area of floating-point
arithmetic?
Provided Sun does not make this the last time Java's floating-point
aspects are improved, this proposal can be kept modest.
Sun should view this proposal as a first step towards making Java a more
convenient language for numerical programming, one that:
a) has more complete support for the IEEE Standard
b) allows implementors to provide greater accuracy (and quite possibly
better performance) than is currently allowed, particularly with
regard to the math library; providing an additional floating-point
type similar to C's long double that is at least as wide as double,
if not wider, would also help in this regard