IEEE arithmetic history

David G. Hough at validgh dgh
Tue Sep 12 09:49:48 PDT 1995


It was certainly the intent of the 754 committee that the 8087 be grandfathered
in as conformant, since Intel appeared to be making the first attempt to
implement most of the draft standard in hardware.  Intel decided to tape out
before the committee did, so there are some things in the x86 architecture
to this day that seemed like good ideas at the time but don't look so good
in retrospect.

I remember Kahan remarking that the precision control that Intel had specified -
which dealt with precision only and not exponent range - was satisfactory
for the types of programs, e.g. accounting and spreadsheets, that he expected
would want exact bitwise reproducibility of results among IEEE implementations.
In practice, it didn't work out that way because IBM declined to make 8087's
standard on PC's - possibly for good reasons involving supply or competitive
issues with the Apple ][, which had particularly bad software floating point -
and so hardware floating point was uncommon on PC's until
486DX's became standard on the low end, with the result that all kinds of
software sort-of-IEEE floating point were devised at first.  And the kinds
of programs for which bitwise reproducibility has really been an issue in
my experience have typically been technical ones that hit lots of exceptional
cases.  In retrospect the MC68881 decision to have precision control affect
exponent range was the right one - it avoids that pesky little message
from Paranoia about "except possibly for double rounding during gradual underflow."

In any case, my point is that regardless of legalistic arguments about
wording, the intent of the standard was that systems incorporating the 8087
in appropriate ways would be conformant.    There are indeed problems with
extended precision, which perhaps were not fully appreciated in advance,
and might have been avoided if all PC's had 8087's installed and Intel or IBM
had been more aggressive about promulgating what we would now call an ABI for
PC's, in the way that Apple tried to promote the Macintosh software arithmetic.
I have summarized the issues elsewhere as follows:

     The most frequent type of problem unique to x86 and Motorola m68k
systems, rather than RISC or older CISC architectures, relates to the
extended-precision floating-point data registers.  The original intent of the
IEEE 754 standard was that loops like
 
        do 1 i=1,n
 1      sum = sum + x(i)*y(i)
 
would be computed with the variable sum held in an extended-precision
register, and the product x(i)*y(i) formed in extended precision and added in
extended precision to sum.  That way roundoff would be minimized and many
gratuitous intermediate overflows and underflows eliminated, without any change
to the source code.  This effect is readily achieved with internal register
allocation mechanisms embodied in optimization models developed for other
systems, which assume that floating-point registers hold single or double
precision variables exactly, so that loads and stores are free of arithmetic
consequences.
 
     However, such register allocation runs afoul of common programming paradigms like
 
 1      continue
                n = n + 1
                b = b + b
                t = b + 1
                t = t - b
                t = t - 1
        if (t .eq. 0) goto 1
 
intended to determine precision at run time, and similar methods intended to
detect underflow and overflow thresholds.
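
     The doubling loop above can be sketched in C as follows.  This is only
an illustration: the function name significand_bits_double is made up, the
starting values b = 1 and n = 0 are assumed because the fragment does not show
them, and the volatile qualifiers are this sketch's way of forcing every
assignment to b and t to be rounded to double, which is exactly what the loop
needs in order to work on a machine with extended registers.

```c
#include <float.h>

/* Count the significand bits of double with the doubling loop.
   Assumed initial values: b = 1, n = 0 (not shown in the original).
   volatile forces each assignment to be stored as a double, so the
   values of b and t are never left unrounded in an extended register. */
int significand_bits_double(void)
{
    volatile double b = 1.0;
    volatile double t;
    int n = 0;

    do {
        n = n + 1;
        b = b + b;      /* b = 2**n                                  */
        t = b + 1.0;    /* exact only while 2**n + 1 is representable */
        t = t - b;
        t = t - 1.0;    /* zero as long as the addition was exact     */
    } while (t == 0.0);

    return n;
}
```

On a binary IEEE system the returned count should agree with DBL_MANT_DIG
from <float.h>.  Without the rounding forced on b and t, a compiler that keeps
them in extended registers could report the register precision instead.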
 
     So the correct usage of extended-precision registers must be accompanied
by language extensions to allow explicit declaration of variables like sum to
be extended precision, and by optimizer discipline to ensure that variables like
b and t, corresponding to explicit assignments in the source code, are always
rounded to their declared storage precision.  This entails an efficiency
loss, since a compound "round to storage precision in extended register and
store to memory" instruction is missing from x86 and m68k; instead a round and
store must be followed by a load of the rounded value, and the optimizer must
not delete that load as redundant.    Furthermore, function arguments and
results must be rounded to their declared precision rather than evaluated in
unrounded extended precision.  There is some uncertainty about whether, in

        if (x .eq. y*z) ...

the extended-precision temporary y*z should be rounded to the storage
precision of x, y, and z prior to the comparison; either way will disappoint some
programs.  In general it remains desirable, for the reasons expressed
originally in IEEE 754, that anonymous temporary expressions like (a * z + b) * z
should always be evaluated in extended and not rounded to storage precision
until forced to do so to conform to an explicit source code assignment,
function argument, or function value.
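
     The two readings of the comparison x .eq. y*z can be sketched in C.
Both function names are made up for this example, and the volatile store is
just one way to force the round-to-storage step.  How the anonymous temporary
in the second function is evaluated depends on the compiler and target; C99
reports the evaluation mode through FLT_EVAL_METHOD in <float.h>.

```c
/* Reading 1: the product is treated as an explicit assignment and is
   rounded to double (via the volatile store) before the comparison. */
int equals_rounded_product(double x, double y, double z)
{
    volatile double p = y * z;
    return x == p;
}

/* Reading 2: the product stays an anonymous temporary.  It may be
   compared in extended precision or after rounding to double,
   depending on the compiler, so this result is not portable. */
int equals_anonymous_product(double x, double y, double z)
{
    return x == y * z;
}
```

If x was itself assigned from y*z and rounded to double, the first function
reports equality, while the second may or may not, which is the
disappointment mentioned above.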


