Satan loves extended precision

uunet!ichips.intel.com!jmarshal uunet!ichips.intel.com!jmarshal
Thu Jan 19 12:40:11 PST 1995


	On chips like the 486DX that have internal extended precision registers,
	is it possible to calculate the single or double precision result of a
	single operation by performing the operation in extended precision, then
	storing the result to memory?  Rounding a result to 80 bits, then to
	53 bits can produce a different answer than rounding directly to 53 bits.
	However, it is possible for a chip that stores 80-bit extended precision
	numbers internally to save a little extra state to ensure that when a
	number is stored to memory, it is rounded just as it would have if it had
	originally been rounded to 53 bits, thereby complying with the IEEE 754
	standard.  Does the 486DX do this?  If not, is there any way to correctly
	simulate 53-bit rounding - preferably using portable C code?  Does the
	Pentium (or other chips) have this flaw as well?
	
	Here's a brief explanation of why I'm concerned:  I have an application in
	which I need to calculate a sum, and then calculate the exact roundoff
	error associated with that sum.  Given an FPU with exact rounding (the
	default configuration for an IEEE 754 conformant FPU), a well-known
	procedure due to Dekker can produce the desired results:
	
	    Assume |a| >= |b| (if not, swap a and b)
	    x := a + b
	    e := x - a
	    y := b - e
	
	This simple algorithm is guaranteed to produce a y such that a+b = x+y
	_exactly_.  (In other words, y is the roundoff error associated with the
	double-precision floating-point statement "x := a + b".)  Furthermore,
	x and y are both double-precision values, and y is no greater than half
	an ulp of x.  I might try to make this procedure work on a 486DX in C by
	declaring x to be a "volatile" variable, so that the result of the addition
	is stored to memory and then reloaded before execution of the first
	subtraction.  However, the procedure can fail if the addition is computed
	by first rounding the result to 80 bits of precision, and then to 53 bits.
	
	I wonder if it is possible to obtain the desired results on a 486DX,
	especially if I limit myself to C code that is portable across a wide
	variety of processors (which I suspect would remove the option of
	switching the 486DX into some sort of double-precision mode).
	
	Any information would be greatly appreciated,
	Jonathan Shewchuk
	School of Computer Science
	Carnegie Mellon University
	jrsacs.cmu.edu


Unfortunately Jonathan, there is no portable way on a extended precision family 
to guarantee single rounding to x precision. As you mentioned, storing to 
memory still doesn't produce single rounding if the precision mode in the FP 
control word is not set accordingly. Remember that many FP architectures have 
this dilemma with  run-time precision control either through compile-time or as 
a C function, the RS6000, 68881, etc... which have register precision control 
will face this portability issue.

Kahan has lately been involved in the ANSI/XJ11 committee on proposing and an
IEEE Floating Point C compiler standard which would solve this concern by
relying on the compiler to modify precision control of the FP unit when a
data declaration(single, double, long double) or proto-type is recognized
for a given arithmetic operation. Although this will be a performance
bottleneck, it would guarantee precise rounding as specified in IEEE 754
(not IEEE 854).


								Jon



More information about the Numeric-interest mailing list