Fiddling with machine epsilon in Fortran

Tim Peters uunet!ksr!tim
Sat Jun 9 23:20:00 PDT 1990


>  >[tim p]
>  >...
>  >      IF (EPS .GT. 0. .AND. 1.+EPS .EQ. 1.) THEN
>  >C        EPS is positive & smaller than machine epsilon
>  >
>  >and this is about as portable a cheap test as there is; an experienced
>  >Fortran programmer would have no fears about relying on it.  If the
>  >compiler were really free to do its "mathematically equivalent" tricks
>  >here, it could act as if the user had written [... IF(.FALSE.)...]
>
>  [jim meyering]
>  ...
>  However, I hope that few experienced Fortran programmers
>  (at least not programmers working on numerical applications)
>  claim your test is portable.  Although many people use such
>  tests, they are very fragile.  As far as I know, nothing in
>  the language specifications prevents the compiler from
>  generating code to compute 1.+EPS in an extended precision
>  register and compare the (wider than REAL) result to 1.
>
>  (Caveat: I'm not an expert at all.  Please correct me if
>  you think I'm wrong.)
>
>  Obviously, such code will give the wrong answer for some
>  values of EPS smaller than the true machine epsilon.

The short answer is "yup, it's portable" (at least among systems with
conforming compilers <grin>).  Unfortunately, the explanation rests on
more tedious "language lawyer" hair-splitting.

Section 6.1.4 of the 77 std defines the type and interpretation of
arithmetic expressions.  Assuming a "REAL EPS" in the code above,
the types of the expressions "1.", "EPS", "1.+EPS", and "0." are all
explicitly defined to be REAL by that section of the std.  That's not
the whole story, though.  To ease into it, assume a "REAL X,Y,Z" and
consider

      Z = X * Y

It's a fact that 77 defines the type of "X*Y" to be REAL here.  It's a
matter of opinion whether the "but the processor can evaluate any
mathematically equivalent expression so long as it respects
parentheses" clause allows the compiler to evaluate X*Y in an extended
precision (non-REAL) and cut it back before storing.  The latter is
"opinion" because only God knows what "mathematically equivalent" is
supposed to mean -- Fortran doesn't even try to say, and classical
mathematics has nothing convincing to say about it either (the issue
at question is an artifact of running on a computer to begin with).

My own opinion is that yes, it's within the letter of the std to widen
the computation, but that it's also a sleazy trick to play on users
(Fortran 77 defines two precisions (REAL & DOUBLE), and there's not a
hint in the std that an implementation might spring a third (fourth
...) precision on them when their back is turned -- so I would
certainly never do this to a user by default).

Next step:

      Z = X + Y + Z

The same "mathematically equivalent" opinion applies here; my own
opinion is that a processor *can*, e.g., evaluate X+Z in infinite
precision, cut the result of that back to a REAL'th+1 precision format
but retaining only 2 significant bits, and then add that to Z.  That
wouldn't be *reasonable*, I'm just trying to make the point that those
whose opinion is that the processor *can't* do it all in an extended
precision are probably reading much more into the std that it actually
says.

Almost there:

      Z = (X + Y) + Z

This one's different!  The type of "(X+Y)" is clearly defined to be
REAL, and while I think the processor can do anything it likes to
evaluate X+Y, its freedom stops cold at the parentheses.  I.e., (X+Y)
*has* to be cut back to type REAL before proceeding, although nothing
prevents the processor from widening it again after it's been cut
back.  It's possible to argue that it's mathematically equivalent to
widen REAL(X+Y) by restoring exactly the bits that were thrown away,
but I think that stretches "mathematically equivalent" to the point
where it loses all meaning (and/or stretches the meaning of "respect
parentheses" to an equally useless extreme).

Finally, since "1.+EPS.EQ.1." is defined to mean "((1.+EPS)-(1.)).EQ.0."
and the type of "(1.+EPS)" is REAL, the parentheses force a conforming
processor to cut 1.+EPS back to REAL before proceeding (although
having done so I think it's free to widen the result again).  Because
of this, "extra bits" can't survive into the equality test even if the
REAL result is later re-widened.  I.e., it's about as portable as such
things can be.

Some final points on this:

  - Since everything ultimately rests on what "mathematically
    equivalent" and "respect parentheses" mean, and since Fortran
    doesn't define those phrases, there are endless opportunities for
    quibbling about it.  In the end, I think the equally vague <grin>
    "spirit of Fortran" counts most with me, and I think it's very
    much in that spirit to read "respect parentheses" restrictively --
    parens, along with breaking expressions into multiple statements,
    are the only defenses Fortran affords the user against the
    compiler's otherwise limitless license to "optimize".  Anyone
    is free to disagree with that, but they'd be wrong <grin>.

  - Since half of Fortran compiler developers these days are reluctant
    draftees from C compiler projects <0.4 grin>, it's not a *good*
    idea to rely on compilers getting the hard or subtle parts right.
    As a general matter of defensive coding, people "should" be writing
    tests like "1.0+EPS.EQ.1." by computing 1.0+EPS in a separate
    statement (see the sketch after this list).  Of course, any compiler
    developer perverse enough to
    ignore Fortran's parens is likely perverse enough to ignore
    Fortran's statement boundaries too -- there really is no final
    defense against people determined to "help" you short of staying
    away from them <no grin>.
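
For concreteness, here's a minimal sketch of that "separate statement"
form of the test; the temporary's name (ONEPLS) is mine, not anything
blessed by a std or library:

      REAL EPS, ONEPLS
C     Funnel 1.+EPS through a REAL variable before the comparison;
C     assignment forces the value to type REAL, so extended-precision
C     bits can't sneak into the test (barring a perverse compiler).
      ONEPLS = 1. + EPS
      IF (EPS .GT. 0. .AND. ONEPLS .EQ. 1.) THEN
C        EPS is positive & smaller than machine epsilon
      ENDIF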

>  In fact, on 680x0 based machines this is almost invariably
>  the case with optimizing compilers.

I know, and I think they're in violation of IEEE-754 too (under any
reasonable mapping of Fortran's operator symbols into 754's
operations).

>  If I want to compare a value to the machine epsilon, I use the
>  routine EPSLON from EISPACK.  It is very cheap (computationally)
>  and robust.

Humbug <grin>.  I've had the pleasure of tracking down two program
failures due to this routine in the last couple years.  In order to
protect the guilty I won't name names (& no, surprisingly enough, Cray
wasn't one of them <grin>), but on one system the routine went into an
infinite loop (yes, it was the case that eps.eq.0 at the end, and no,
the base was not a power of three); on the other it returned an eps
twice too large.
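
For readers who haven't seen it, a from-memory paraphrase of EPSLON's
skeleton -- close enough to follow the A, B, C arithmetic below, but
don't mistake it for the genuine EISPACK source:

      REAL FUNCTION EPSLON (X)
      REAL X
      REAL A, B, C, EPS
C     Paraphrased from memory: leans on 4./3. not being exactly
C     representable in binary, and (the flaw at issue here) on the
C     division being rounded to nearest.
      A = 4.0E0/3.0E0
   10 B = A - 1.0E0
      C = B + B + B
      EPS = ABS(C - 1.0E0)
      IF (EPS .EQ. 0.0E0) GO TO 10
      EPSLON = EPS*ABS(X)
      RETURN
      END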

In different guises, the algorithm's flaw is in assuming the machine
will do a to-nearest rounding on the division.  E.g., consider a
754-like machine with a 4-bit fraction (just to make one of the
problems obvious), and suppose you call the routine when round-to-0 or
round-to-minus-infinity is in effect.  Then

  4./3. = 1.01010101010101010101... (base two)

so A gets set to 1.010 via truncation.  Then B = A-1.0 = 1.010-1.0 =
0.010, then C = B+B+B = 0.010 + 0.010 + 0.010 = 0.100 + 0.010 = 0.110,
then eps = abs(C-1.) = abs(0.110 - 1.) = abs(-0.010) = 0.010 when it
should be 0.001.  The infinite loop case I leave as an irritating
exercise for the reader <grin>.

I don't remember whether EPSLON works on a Cray, but if it does it's
working purely by accident (a Cray divide may be a couple ulp off).

I don't have the draft Fortran 90 std handy, but as I recall they
finally put an end to this nonsense with a new intrinsic function
(& it only took 40 years to get it <grin> ...).

>  Unfortunately, this isn't just a Fortran problem.  People write the
>  same junk in C, Lisp, ... .

Amen, Jim!  But I don't blame the programmers as much as I used to,
having been burned a few <ahem> times myself -- floating point is just
one flat-out *nasty* abstraction.

>  [code for EPSLON deleted -- don't want it to spread <grin>!]

pleased-to-meet-you!-ly y'rs  - tim

Tim Peters   Kendall Square Research Corp
tim@ksr.com,  ksr!tim@harvard.harvard.edu


