Correlated roundoff, the Hubble, and CFP: informal experiment

Douglas Priest uunet!Eng.Sun.COM!Douglas.Priest
Thu Jan 20 20:53:32 PST 1994


Dean Schulze recently asked about examples of computations in which
roundoff errors are correlated so as to reduce their net effect.  Here
is a simple example which is perhaps more interesting for its connection
with the Hubble repair mission than for the computation itself.

Last October, Stefano Casertano of Johns Hopkins University posted a
message to comp.lang.fortran asking about a problem he had encountered
using a public domain Gaussian random number generator due to R. Brent
(Algorithm 488: A Gaussian Pseudo-random Number Generator, Comm. ACM 17
(1974), pp. 704-706).  The problem turned out to be a bug in the random
number routine.  I suggested a simple fix based on an idea I came across
during my Ph.D. research, and I asked him for permission to cite his
example along with a brief description of his application.

He had written a program to simulate the effect of cosmic rays on a
charge-coupled device (CCD) used in the Wide-Field/Planetary Camera 2,
which was installed on the Hubble Space Telescope as part of last month's
repair mission.  The program used the Gaussian random number generator
to compute the probabilities that electrons scattered by cosmic rays
passing through the silicon layer of the CCD would be ``herded'' by the
local electric field into different detector cells.  At one point,
after uneventfully generating some tens of millions of numbers, the
random number generator went into an infinite loop.

The problem arose as follows.  The random number generator computes a
floating point number u between zero and one (possibly zero, but strictly
less than one).  It then uses a loop to count the number of leading ones
in the fixed point binary representation of u.  Since the routine assumes
a binary floating point arithmetic, the loop is written as follows:

   10 u = u + u
      if ( u .lt. 1e0 ) goto 20
      ...
      u = u - 1e0
      goto 10
   20 ...

Before the routine returns, it computes a new value of u to be used on
the next invocation.  Using a second number v lying between zero and u,
inclusive, the routine computes the new u by the statement

      u = (u-v)/(1e0-v)

In Casertano's case, the program happened by chance to produce a value
of u which was the largest floating point number less than one and a value
of v close to 0.3.  The values were such that the exact values of u-v and
1-v both lay exactly halfway between adjacent pairs of representable
numbers, and when rounded according to the IEEE 754 default rounding rule,
both were rounded to the same number.  Thus the new u was assigned the
value one, and when the routine was next invoked, the loop above never
terminated: doubling u = 1 gives 2, and subtracting 1 gives 1 again,
forever.
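
For concreteness, here is a small C program (in modern C, entirely my
own sketch rather than anything from Brent's routine) that reproduces
the effect in IEEE 754 single precision.  The value of v below is one
I constructed to match the description above; Casertano's actual value
was presumably different.  The volatile qualifiers force each
intermediate result to be rounded to storage precision even on
extended-precision hardware.

#include <stdio.h>

int
main()
{
    /* u is the largest single precision number below one; v is an
       illustrative value near 0.3, chosen so that the exact values
       of u-v and 1-v are adjacent halfway cases */
    volatile float u = 16777215.0f / 16777216.0f;   /* 1 - 2^-24   */
    volatile float v = 10066327.0f / 33554432.0f;   /* ~0.29999995 */
    volatile float num = u - v;     /* rounds to 11744052 * 2^-24 */
    volatile float den = 1.0f - v;  /* rounds to the same number  */

    printf( "new u = %.9g\n", num / den );   /* prints 1 */
    return 0;
}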

Note that if 0 <= v <= u < 1, then in exact arithmetic (u-v)/(1-v) <= u,
so in the absence of roundoff, the routine would never set u = 1.  Based
on this observation, one might fix the subroutine by computing the new
u as

      u = min(u, (u-v)/(1e0-v))
      
at a cost of an additional test and branch (or a conditional assignment,
on architectures that have such an instruction).
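
On the same constructed values as above, this variant behaves as
intended; here is a quick check (again my own sketch, with the min
written as an explicit test):

#include <stdio.h>

int
main()
{
    volatile float u = 16777215.0f / 16777216.0f;   /* 1 - 2^-24   */
    volatile float v = 10066327.0f / 33554432.0f;   /* ~0.29999995 */
    volatile float num = u - v;
    volatile float den = 1.0f - v;
    volatile float q = num / den;      /* comes out as exactly 1 */
    float unew = ( q < u ) ? q : u;    /* the min() fix          */

    printf( "new u = %.9g\n", unew );  /* prints 0.99999994, not 1 */
    return 0;
}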

My suggested fix costs only one additional subtraction instead:

      u = (u-v)/(1e0-(u-(u-v)))

One can show that the u computed by this formula is never larger than
the previous u provided the floating point arithmetic is ``faithful'',
i.e., it delivers the exact result of an operation when that result is
representable and delivers one of the two representable numbers nearest
the exact result otherwise.  Faithful arithmetics include IBM 370, DEC
VAX, and those which strictly conform to IEEE 754.  (Proving this claim
using nothing more than faithfulness is not altogether trivial.)
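
The intuition is that the same rounded difference u-v appears in both
the numerator and the denominator, so the two roundoff errors are
correlated rather than independent.  Applied to the constructed values
used above (again my own sketch), the fix yields a new u equal to the
old one, and in particular not equal to one:

#include <stdio.h>

int
main()
{
    volatile float u = 16777215.0f / 16777216.0f;   /* 1 - 2^-24   */
    volatile float v = 10066327.0f / 33554432.0f;   /* ~0.29999995 */
    volatile float d = u - v;        /* the rounded difference u-v    */
    volatile float e = u - d;        /* exact here: the v the
                                        numerator actually "saw"      */
    volatile float den = 1.0f - e;   /* denominator consistent with d */
    float unew = d / den;

    printf( "old u = %.9g\n", u );      /* 0.99999994            */
    printf( "new u = %.9g\n", unew );   /* 0.99999994, below one */
    return 0;
}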

This fix is evidently quite portable, but will it work under Intel x86/
Motorola 68k-style arithmetics?  (I distinguish these from what I call
_strictly_ IEEE conforming arithmetics such as SPARC, PA-RISC, MIPS,
etc.)  My fix certainly will work under these arithmetics provided each
operation is rounded to the same precision, be it register precision
(double extended) or storage precision (single, in the example); even
twice-rounded arithmetic is faithful.  But there is no portable way to
specify which operations will be rounded to which precision.  Instead,
one would have to prove the fix correct for every possible combination
of precisions.  This same problem afflicts many other examples which
could be classified as employing correlated roundoff, and some of them
_won't_ work with double rounding.
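
For readers who have not seen double rounding in action, here is a
small example of my own in which double plays the role of register
precision and float that of storage precision.  The product below is
rounded first to double and then to float; both roundings are
faithful, but the result differs from what a single correct rounding
to float would give:

#include <stdio.h>

int
main()
{
    /* a*b is exactly 1 + 2^-24 + (2^28 - 1)*2^-104, just above the
       midpoint between the floats 1 and 1 + 2^-23, so correct
       rounding to float would give 1 + 2^-23 */
    volatile double a = 1.0 + 1.0 / 4503599627370496.0;         /* 1 + 2^-52          */
    volatile double b = 1.0 + 268435455.0 / 4503599627370496.0; /* 1 + (2^28-1)*2^-52 */
    volatile double p = a * b;      /* first rounding: exactly 1 + 2^-24 */
    volatile float  f = (float) p;  /* second rounding: a tie, to even   */

    printf( "f = %.9g\n", f );      /* prints 1, not 1.00000012 */
    return 0;
}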

Unfortunately, the only way I can think of to make such programs portable
is to have a portable way to test for such arithmetic, and simply abort
the program if it is detected.  Devising such a test would be easier but
for the fact that it depends not only on the machine's arithmetic, but
also on the compiler.  I think the following test might do the trick,
but since I don't have access to every combination of machine and
compiler worth checking, I would like to ask readers of this mailing list
to try the following program.  If you find a machine and compiler for
which the program guesses wrong (either detects x86/68k-style arithmetic
on a machine with something other than x86/68k hardware or fails to
detect x86/68k arithmetic on a machine with such hardware), please let
me know.  Note that this only applies to systems which claim to conform
to IEEE 754; some systems and/or compilers have certain modes in which
conformance is not guaranteed (e.g., using the -fast option with SunPro
compilers).

#include <stdio.h>

int
main()
{
    double  x, y, z, w;

    z = 1.0 / ( 65536.0 * 256.0 );              /* z = 2^-24  */
    w = z * z * z * z / ( 65536.0 * 65536.0 );  /* w = 2^-128 */
    w = w * w * w * w;                          /* w = 2^-512 */
    x = ( 2.0 - z * z ) * w / 4.0;
    y = ( 1.0 + z ) * w / 2.0;
    z = ( 1.0 + z - z * z ) * w * w / 4.0;
    /*
     * The constants are chosen so that the product x*y falls in the
     * gradual underflow range of double precision, where strict IEEE
     * 754 arithmetic rounds it to exactly z; arithmetic that forms
     * the product with extended precision and range gives a different
     * value even if it is afterwards stored to double.
     */
    if ( x * y != z )
        printf( "This looks like an Intel x86/Motorola 68k.\n" );
    return 0;
}

Thanks,

Douglas M. Priest
SunPro Floating Point Group
Sun Microsystems, Inc.
(but only speaking for myself)


