Microsoft C miscompiles FDLIBM's "hypot"

Cleve Moler moleramathworks.com
Sun Nov 12 13:02:50 PST 2000


Here is a posting I recently made to the MATLAB news group,
comp.soft-sys.matlab, and some additional information from
a colleague, Greg Wolodkin.

  -- Cleve Moler
  moleramathworks.com


A posting from Steve Vavasis a couple of days ago uncovered a bug
in MATLAB 5.3 on the PC that we hadn't seen before.  It's present
in Releases 11 and 11.1, as well as the Student Version of 5.3.
It turns out that we have already fixed it in MATLAB 6.0.

Several computations in MATLAB make use of a kernel function known
as "pythagorean addition" or "hypotenuse":

    hypot(a,b) = sqrt(a^2 + b^2)

Complex absolute value uses hypot:

    abs(x+i*y) = hypot(x,y)

The vector 2-norm, norm(v), uses hypot:

    s = 0;
    for k = 1:length(v)
       s = hypot(s,v(k));
    end

We make use of a single kernel for these operations because
we want to be able to compute hypot(a,b) without suffering the
overflow or underflow that might result from squaring very
large or very small values of a and b.  For example, if
a = 4e200 and b = 3e200, then hypot(a,b) should be 5e200,
but a^2 and b^2 both overflow to Inf and the obvious
implementation returns Inf.  If a = 4e-200 and b = 3e-200, then
hypot(a,b) should be 5e-200, but a^2 and b^2 underflow and
the obvious implementation produces 0.

Most of the elementary math function libraries, libm, that 
accompany C compilers include hypot, although it's unclear whether
or not it is part of the C standard.  Before MATLAB 5.3, we tried
to use these vendor libraries, but we had frequent problems with
improper scaling and inconsistent handing of exceptions.  For
MATLAB 5.3, we adopted Sun's "freely distributable" math library,
fdlibm, which includes robust and accurate code for hypot.

Unfortunately, the fdlibm hypot function uses a potentially
ambiguous construction involving aliased pointers to specify bit
operations on floating point numbers.  The optimized code
produced by the Microsoft C compiler for the PC reorders some
of the integer and floating point operations so that, in 
certain situations, the correct result is computed, but forgotten.

On the PC, hypot(a,b) with a >= b > 0 produces an incorrect
zero result when a > 2^500 and a/2 >= b >= a/2^60.

For Vavasis's vector, v = test_x, these conditions are satisfied
at the final step of the norm calculation when a = norm(v(1:7))
and b = abs(v(8)).  The computed norm(v) is zero.

A simplier example is the absolute value of a large complex number

    abs(10^150*(12+5i))

It turns out that we had already fixed the code for MATLAB 6.
On Linux, the GNU C compiler complained about the aliased pointers
and we changed the fdlibm code to use a union construction.
The optimizer in Microsoft C is no longer confused and so hypot,
complex abs and vector norm all work correctly for large arguments.

  -- Cleve Moler
  moleramathworks.com

------------------------

The problem is basically one of "pointer bashing".  ANSI C says
you can cast a pointer of one type to another, and back, and get the
same thing you started with.  It doesn't guarantee, however, that you
take the address of a double, cast it to an (int *), work on the
(int *), and have it affect the original double.

A simple case would be a function which computes abs(x+1.4).

  double plusabs(double x) {
      x += 1.4;
      HI(x) &= 0x7fffffff;
      return x;
  }

A good optimizing compiler will return x directly from the floating point stack,
ignoring the integer math you did on the sign bit.

Declaring x volatile will force the values into memory, and work around the
problem.  Compiling with no optimization will do the same.  The solution we've
used is a union, which tells the compiler you want to think of this data as
both a double and a pair of unsigned longs:

#if CPU_NUM_FORMAT==FIEEE_LE
/* Little-endian */
typedef union {
    double value;
    struct {
        uint32_T lw;
        uint32_T hw;
    } words;
} ieee_double_T;

#elif CPU_NUM_FORMAT==FIEEE_BE
/* Big-endian */
typedef union {
    double value;
    struct {
        uint32_T hw;
        uint32_T lw; 
    } words;
} ieee_double_T;

#endif  

   -- Greg Wolodkin
   gregamathworks.com




More information about the Numeric-interest mailing list