Even more on Java numerics

David G. Hough at validgh dgh
Thu Feb 20 22:25:24 PST 1997


Coonen's comments having propagated to another mailing list, java-for-cse,
I posted a reference to my response at http://www.validgh.com/java, which
in turn elicited a reply by Russell Carter.    Some of his points (with >)
may be of interest to numeric-interest, as well as my comments (no >):

> 1. On x86, and especially for Pentium and Pentium Pro CPUs, the statement
>    that "the performance penalty for adhering to Java's model are not
>    overwhelming..." is inaccurate at best.  As David should well know it
>    is the ability to keep intermediate results on the stack, along with
>    judicious use of fxch that give these at best 1 result per cycle
>    processors their rather surprising sustained floating point performance.
>    Inducing an extra store/load at every floating point op will cause
>    a minimum of a factor of 2 in sustained performance. It surely looks
>    fortunate that Sun has the floating point architecture it does.

>    And I might also observe that it is possible to buy several compilers
>    that do The Right Thing for x86, and pgcc does not do too badly for free.
>    So I think that maybe David's concern for the competence of x86 compiler
>    writers is misplaced, also.
> 
> Taking the long view, it is likely that Sun will find that the market will
> settle these issues.  It's a nice try at hamstringing wintel/IBM, but
> I don't think it will work.

As the founders of Java have observed, the original target was embedded
processors, of which Intel and IBM have plenty that would not find the
Java language definition an impediment.  Several chip companies besides Sun
are working on Java chip implementations, so there's no particular advantage
to any one of them.    In any event, the Java language definition does not
cause Pentium or PowerPC chips to "run SLOW" compared to other architectures,
although Pentium and PowerPC might run slower than they would if
unconstrained.    The major performance issues with these chips are these:

1a) The x86 architecture is rather constrained by having only eight
floating-point registers; whatever is done in general to match load/store
bandwidth to that constraint will also tend to relieve any performance
deficit due to the loads and stores inserted to get Java-specified rounding
(see the sketch below of what that rounding requires).    Indeed we can
speculate that P7/Merced has already solved this problem, since the
shortcomings of the x86 architecture have been known for years and some
concessions must have been made to HP.    Intel could lay such speculations
to rest by being more forthcoming about its plans.

1b) The x86 architecture doesn't seem to have produced many compilers robust
enough for scientific programming; none of those I have tested can compile and
run UCBTEST and the LAPACK test and timing programs at maximum optimization
levels, and problems are common at lower optimization levels too.
This suggests to me that, although x86 floating-point units may be the most
numerous, they are not particularly predominant among systems used primarily
for performance-sensitive floating-point computation.

2) For whatever reason, PowerPCs have been popular only in Macintosh
computers, which are not much used for performance-sensitive floating-point
computation.    I have not had any reason to evaluate PowerPC compilers.

And the net of these considerations is that not many users of these computers
would notice the alleged performance deficits due to Java's definition, and
there are avenues to higher numerical performance on systems built from these
chips that would yield far greater improvements than relaxing the definition
of Java.
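
To make the constraint in 1a concrete, here is a minimal sketch of what
"Java-specified rounding" means; the class name and the particular numbers
are mine, chosen only to exhibit an expression whose value depends on whether
intermediates are kept in 80-bit extended registers or rounded to 64-bit
double after every operation, as Java requires:

    public class RoundEachOp {
        public static void main(String[] args) {
            double big = 1.0e308;
            // Rounded to 64-bit double after each operation, big * 10.0
            // overflows to infinity, and infinity / 10.0 is still infinity --
            // the answer Java requires on every platform.
            double strict = (big * 10.0) / 10.0;
            System.out.println(strict);   // prints Infinity
            // Held in an 80-bit extended register, with its wider exponent
            // range, the product would not overflow and the division would
            // deliver 1.0e308; forcing the rounded answer is what generates
            // the extra store/load traffic on a conforming x86.
        }
    }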

> 2. As a practical matter the fused multiply operations on Power2 CPUs,
>    and the extended precision registers found in x86, have had essentially
>    NO discernible effect on the reproducible accuracy of nearly 
>    all practical scientific calculations.

It's certainly true that a lot of common scientific code, which runs
acceptably on IBM 370, DEC VAX, Crays, and IEEE 754 systems of various
flavors, is not much affected by arithmetic details, because all such effects
have been forcibly suppressed by great programming effort over many years.
The focus of Java is to get the same effective portability without such
great effort.
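
For those who have not run into it, here is a small sketch of the kind of
effect at issue; it relies on features of much later Java versions
(hexadecimal floating-point literals and Math.fma) purely to make the fused
result visible, and the numbers are mine:

    public class FusedVsUnfused {
        public static void main(String[] args) {
            double a = 1.0 + 0x1p-27;    // a*a is not exactly representable
            double p = a * a;            // the product rounded to double
            // Two roundings: a*a rounds to p, so the difference is exactly 0.0.
            double unfused = a * a - p;
            // One rounding: the fused multiply-add sees the exact product and
            // returns the rounding error of a*a, about 5.55e-17.
            double fused = Math.fma(a, a, -p);
            System.out.println(unfused + "  " + fused);
        }
    }

The fused result is the more accurate one here, but it is different, and that
difference is exactly the sort of platform-to-platform variation under
discussion.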

>    In fact, I would go so far as to say that Java numerical results are
>    going to differ across platforms a lot more from the immaturity of the
>    output formatting code than from any adherence to individual CPU
>    architectural enhancements.

These differences are bugs and will be so handled.    Unlike other languages,
Java will not permit hardware bugs, optimizer errors, and library mistakes
to masquerade as differences in roundoff or expression evaluation.    
Write once, test once, run anywhere is the goal.

That's a simplification of course; if you are paranoid about those hardware
bugs, optimizer errors, and library mistakes, you still need to test Java
programs on each platform.    But a satisfactory error analysis on one
platform will hold for all, and discrepancies can be usefully analyzed,
because they do mean that something is wrong; there is no likelihood that
they will ultimately have to be tolerated because somebody, rightly or
wrongly, concluded that the discrepancies were "just roundoff".
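
As a sketch of the reproducibility that treating such differences as bugs
implies (the class name is mine; the conversion behavior is what the
documentation of java.lang.Double specifies), every conforming implementation
must print the same decimal string here and recover exactly the same bits
from it:

    public class SameStringEverywhere {
        public static void main(String[] args) {
            double x = 0.1;
            // Double.toString emits just enough decimal digits to distinguish
            // the value from its neighbors, so a conforming implementation
            // prints "0.1" here, and converting the string back recovers the
            // identical double.
            String s = Double.toString(x);
            double y = Double.valueOf(s).doubleValue();
            System.out.println(s);
            System.out.println(Double.doubleToLongBits(x) ==
                               Double.doubleToLongBits(y));   // prints true
        }
    }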



