(from comp.arch) rounding vs. chopping example from Nelson Beebe

Thu Jun 20 06:52:01 PDT 1991

Subject: truncating vs rounding floating-point arithmetic
Newsgroups: comp.arch

Nelson Beebe (beebe@math.utah.edu) recalled the following message that
he sent to a colleague:

>> The following little program can be used to illustrate the effect that
>> truncating arithmetic has on your larger program:
>> 
>>       real dt,t0,t1,t2,tend
>>       integer n
>> 
>>       n = 0
>>       dt = 0.018
>>       t0 = 4000.0
>>       tend = 5000.0
>>       t1 = t0
>>       t2 = t0
>> 
>>  10   n = n + 1
>>       t1 = t1 + dt
>>       t2 = t0 + float(n)*dt
>>       if (t2 .lt. tend) go to 10
>>       write (6,*) t1,t2,(t1 - t2)/t2
>>       end
>> 
>> On the IBM 3090, this single precision version prints:
>> 
>>    4879.89844       5000.00781     -0.240218341E-01
>> 
>> That is, the relative error is 2.4%.  On the Sun 4, it produces
>> 
>>     5003.70    5000.01    7.37889E-04
>> 
>> The effect of truncating arithmetic on the running sum is large.
>> 
>> The double precision version is:
>> 
>>       double precision dt,t0,t1,t2,tend
>>       integer n
>> 
>>       n = 0
>>       dt = 0.018D+00
>>       t0 = 4000.0D+00
>>       tend = 5000.0D+00
>>       t1 = t0
>>       t2 = t0
>> 
>>  10   n = n + 1
>>       t1 = t1 + dt
>>       t2 = t0 + dfloat(n)*dt
>>       if (t2 .lt. tend) go to 10
>>       write (6,*) t1,t2,(t1 - t2)/t2
>>       end
>> 
>> The IBM 3090 result is
>> 
>> 5000.00799995563648       5000.00799999999981     -0.887265231637227285E-11
>> 
>> The Sun 4 result is
>> 
>> 5000.0080000016    5000.0080000000    3.2341579848518D-13

Note that satisfactory results are obtained if you use enough precision
or if you round rather than chop.  Also note that this is not the program
that failed, but rather a drastic simplification of the user's actual
application to reveal the essential problem.  It's a simple example where
the superior statistics of rounding rather than chopping imply a broader
domain of applicability for a particular program.
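
For readers without a chopping machine at hand, the single-precision
experiment can be imitated in software.  What follows is only a rough Python
sketch, not Beebe's program: the names quantize and run are made up for
illustration, and it assumes that each arithmetic result is formed exactly
and then cut to 6 base-16 digits (3090 short format) or 24 bits (IEEE
single), which ignores the guard-digit details of real hardware.

```python
import math

def quantize(x, bits_per_digit, digits, chop):
    """Cut x to `digits` significant digits in base 2**bits_per_digit.

    chop=True discards the excess digits (truncation toward zero);
    chop=False rounds to nearest, ties to even.
    """
    if x == 0.0:
        return 0.0
    _, e2 = math.frexp(x)                 # x = m * 2**e2 with 0.5 <= |m| < 1
    e = -(-e2 // bits_per_digit)          # ceil(e2 / bits_per_digit)
    shift = bits_per_digit * (e - digits) # one unit in the last place is 2**shift
    scaled = math.ldexp(x, -shift)        # exact scaling by a power of two
    n = math.trunc(scaled) if chop else round(scaled)
    return math.ldexp(n, shift)

def run(bits_per_digit, digits, chop):
    """Imitate the Fortran loop, quantizing after every operation."""
    q = lambda v: quantize(v, bits_per_digit, digits, chop)
    dt, t0, tend = q(0.018), q(4000.0), q(5000.0)
    t1 = t2 = t0
    n = 0
    while True:
        n += 1
        # For the values in this program, Python's 53-bit doubles hold each
        # sum and product exactly before it is quantized.
        t1 = q(t1 + dt)            # running sum: one quantization per step
        t2 = q(t0 + q(n * dt))     # recomputed from scratch each step
        if not (t2 < tend):
            break
    return t1, t2, (t1 - t2) / t2

if __name__ == "__main__":
    print("base 16, 6 digits, chopped:", run(4, 6, True))
    print("base 2, 24 bits, rounded:  ", run(1, 24, False))
```

Under these assumptions the chopped base-16 run should drift low by about
2.4%, in line with the 3090 figures above, while the rounded binary run
stays within about 0.1%.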

Correct rounding and chopping, and several other good paradigms, can be
characterized by the property

	The rounded computed result is chosen from the two machine-representable
	numbers nearest the unrounded infinitely-precise exact result,
	according to a rule that depends only on the infinitely-precise
	exact result, and not on the operands or operation (or phase of moon). 

Most "fast" sort-of-rounding or sort-of-chopping schemes invented by 
hardware designers eventually frustrate error analysts because they 
can't be so characterized.
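
To make the property concrete, here is a small Python sketch under stated
assumptions: binary arithmetic with a fixed number of significant bits, no
exponent limits, and exact results modeled by fractions.Fraction.  The
function names (neighbors, chop, nearest_even, round_exact) are invented for
this illustration.  The point is that each rule sees only the exact result
and its two machine neighbors, never the operands or the operation.

```python
from fractions import Fraction
import math

def neighbors(exact, digits):
    """Return (lo, hi): the machine numbers with `digits` significant bits
    that bracket `exact` (lo <= exact <= hi).  lo == hi when `exact` is
    itself representable."""
    if exact == 0:
        return Fraction(0), Fraction(0)
    a = abs(exact)
    # Find e with 2**(e-1) <= a < 2**e.
    e = a.numerator.bit_length() - a.denominator.bit_length()
    if a >= Fraction(2) ** e:
        e += 1
    ulp = Fraction(2) ** (e - digits)     # spacing of machine numbers near a
    scaled = exact / ulp
    return math.floor(scaled) * ulp, math.ceil(scaled) * ulp

def chop(exact, lo, hi):
    """Chopping: take the neighbor nearer zero."""
    return lo if exact >= 0 else hi

def nearest_even(exact, lo, hi):
    """Round to nearest; on a tie take the neighbor with an even last bit."""
    below, above = exact - lo, hi - exact
    if below != above:
        return lo if below < above else hi
    lo_units = lo / (hi - lo)             # integer-valued count of spacings
    return lo if lo_units.numerator % 2 == 0 else hi

def round_exact(exact, digits, rule):
    """Apply `rule` to the exact result; nothing else is consulted."""
    lo, hi = neighbors(exact, digits)
    return lo if lo == hi else rule(exact, lo, hi)

if __name__ == "__main__":
    x = Fraction(1, 10)                   # exact result, however it arose
    print(round_exact(x, 24, chop), round_exact(x, 24, nearest_even))
```

Because round_exact takes only the exact value, 1/10 rounds the same way
whether it arose as 1/10, as 3/10 - 1/5, or as 1/2 * 1/5.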

As for the first IBM RS/6000 implementation, I have heard that the original
floating-point unit was designed to implement IBM 370 arithmetic, and was
changed to IEEE 754 format relatively late in the game.  If true, then it
would not be surprising that some aspects of 754 were problematic to add in.
The interesting question would then be which aspects of IEEE arithmetic will
be really problematic for a high-performance RS/6000 implementation designed
from scratch to support 754.