some history of infinity in IEEE arithmetic

David Hough uunet!Eng.Sun.COM!David.Hough
Sat Jan 4 14:28:19 PST 1992


The initial drafts of what became IEEE arithmetic included a mode bit for
infinity treatment.   The modes mainly affected comparisons: in the default
projective mode, all infinities
compared equal regardless of sign bit; sums or differences or sqrts
of infinities were always NaNs.   The optional affine mode allows 
infinities to be ordered with respect to finite numbers by their sign bits
and generally corresponds to what most people would expect who are only
thinking in terms of real arithmetic on a line.

But a contingent from Apple of which I was a part got the draft standard
"simplified" by making the affine mode default, and also eliminating the
warning mode for subnormal operands and unnormalized extended operands.
Making normalizing mode the default has made life tougher for hardware
implementers ever since, but was probably a good thing.  Making affine mode
the default has made life easier for programmers that don't think in complex
arithmetic terms, but tougher for those that do.

As has been mentioned by others, it's often very
natural to think of complex zero and infinity in polar terms, as points with a
direction attached which may or may not be meaningful.  This corresponds to
the treatment of signed zero in real IEEE arithmetic and to the defunct 
projective mode treatment of signed infinity in real IEEE arithmetic, for the
sign bit is enough to record the direction.

But the typical efficient representations of complex numbers as pairs of reals
get in the way.   For finite nonzero numbers, w == z iff the real and
imaginary parts are equal, and this is efficient to test.   
For zeros, the same component testing works fine too, but the direction
information is reduced to two sign bits.

The most natural definition of complex infinity is that if either part
is an infinity and neither is NaN, then the composite entity is a complex
infinity.   The directional information consists of the signs of the parts
and the magnitude of the finite or second infinite part.

In the default projective mode all such infinities should compare equal, but
I don't know if anybody does that:  

Consider

	complex w,z,u,v,p
	w = (1.0e20,0.0)
	print *,' w  ',w
	z = (0.0,1.0e20)
	print *,' z ',z
	u = w * z
	print *,' u=w*z  should be inf ',u
	v = exp(100.0)
	print *,' v should be inf ',v
	print *,' v .eq. u should be true ',v .eq. u
	print *,' v .ne. u should be false ',v .ne. u
	p = u * v
	print *,' p=u*v should be inf ',p
	w=(1000.0,0.0)
	print *,' w ',w
	z=cexp(w)
	print *,' cexp(w) should be inf ',z 
	end

SunOS:

 w    (    1.00000E+20,  0.)
 z   (  0.,    1.00000E+20)
 u=w*z  should be inf   (  0., Inf           )
 v should be inf   ( Inf           ,  0.)
 v .eq. u should be true   F
 v .ne. u should be false   T
 p=u*v should be inf   ( NaN           , Inf           )
 w   (   1000.000,  0.)
 cexp(w) should be inf   ( Inf           , NaN           )
 Note: the following IEEE floating-point arithmetic exceptions 
 occurred and were never cleared; see ieee_flags(3M): 
 Inexact;  Overflow;  Invalid Operand; 
 Note: IEEE Infinities were written to ASCII strings or output files; see econvert(3). 
 Note: IEEE NaNs were written to ASCII strings or output files; see econvert(3). 
 Sun's implementation of IEEE arithmetic is discussed in 
 the Numerical Computation Guide.

HP-UX:

  w   (1.00000E+20,.0)
  z  (.0,1.00000E+20)
  u=w*z  should be inf  (.0,+INF0)
  v should be inf  (+INF0,.0)
  v .eq. u should be true  F
  v .ne. u should be false  T
  p=u*v should be inf  (NaN0,+INF0)
  w  (1000.0,.0)
  cexp(w) should be inf  (+INF0,NaN0)

AIX:

  w  (0.1000000020E+21,0.0000000000E+00) 
  z (0.0000000000E+00,0.1000000020E+21) 
  u=w*z  should be inf (0.0000000000E+00,INF) 
  v should be inf (INF,0.0000000000E+00) 
  v .eq. u should be true F 
  v .ne. u should be false T 
  p=u*v should be inf (NaNQ,INF) 
  w (1000.000000,0.0000000000E+00) 
  cexp(w) should be inf (INF,NaNQ) 

IRIX:

  w   (1.0000000E+20,0.0000000E+00)
  z  (0.0000000E+00,1.0000000E+20)  
  u=w*z  should be inf  (0.0000000E+00,Infinity)
  v should be inf  (Infinity,0.0000000E+00)
  v .eq. u should be true  F
  v .ne. u should be false  T
  p=u*v should be inf  (NaN,Infinity)
  w  (1000.000,0.0000000E+00)
  cexp(w) should be inf  (Infinity,NaN)


Industry standard implementation leads to industry standard bugs!
I don't remember ever receiving any bug reports about the foregoing kinds
of behavior; perhaps user expectations are too low.

Defining complex arithmetic operations in hardware, with trapping to 
supervisor software in bad cases, is probably the only way these problems
can be addressed efficiently.   There are microcoded complex arithmetic
functions on the Sun-3x/FPA+, but sales of the underlying CPU were so poor
that nobody ever found out about them.   And I didn't think to catch the
cases above; the results on FPA+ are no better.  But the microcode (or hardware
on a RISCier system) could have been constructed to take a hardware trap
on cases generating a NaN so they could be sorted out by supervisor software
without slowing down the normal cases.   Of course even the "correct"
results won't satisfy everybody all the time... that's the nature of
exceptions.   For instance, I don't know whether enforcing a single 
canonical representation for complex infinity would be good for conformal
mapping; a single canonical representation for zero would eliminate the
possibility of distinguishing sides of slits along the axes.

At this point I would be reluctant to prescribe exactly what should happen
in a language standard.
If NCEG hides its head in the sand on these issues, Fortran style,
that wouldn't be as bad as choosing the wrong behavior by democratic vote.
On the other hand, if the correct behavior can be identified at least
for IEEE systems, then it should be specified in the IEEE section of NCEG.

The underlying problem is the fact that complex numbers can almost always
be thought of as a field themselves or as length-2 vectors of real numbers
in cartesian form, or a length-2 vectors of real numbers in polar form,
whichever is most convenient.   Sometimes the difference becomes
manifest, as above, but at other times one form sheds light on another.
For instance, in deciding how csqrt and clog should be defined with respect
to the signs of zero imaginary parts, I think it's helpful to refer 
to the polar forms, whence

for sqrt,
	s(Im(sqrt(z))) = s(sin(arg(z)/2)) = s(arg(z)) = s(Im(z)).

for log
	s(Im(log(z))) = s(arg(z)) = s(Im(z))

so the sign bit s of the imaginary part of a complex sqrt or log should agree
with the sign bit of the imaginary part of the operand.  Hence restricting
the imaginary part of clog to satisfy -pi < ... <= +pi
would be a mistake; -pi <= ... <= +pi is what's needed. 	 



More information about the Numeric-interest mailing list