Status of a Hard/Software Pentium FDIV Workaround

Mon Dec 5 03:21:34 PST 1994

        Status of a Hard/Software Pentium FDIV Workaround
                 and of a Pentium-aware MATLAB

A couple of days ago, Tim Coe, Terje Mathisen and I proposed a
workaround for the Pentium FDIV bug.  This is a short report on
the status of that proposal.

We have now joined forces with Peter Tang, a computer scientist from
Argonne National Laboratory who is currently visiting the Chinese
University of Hong Kong, and with a team of computer scientists and
engineers from Intel Corporation.  It is our intention to refine and
implement the workaround in such a way that it can be used for
compilers, libraries, and floating-point-intensive applications.  The
result will be an efficient and reliable assembly language macro
replacement for the Pentium FDIV instruction.  This macro can be
inserted by software developers into their source code or emitted by
compilers.

At the MathWorks, MATLAB is providing a prototypical implementation
and test bed for the approach.  It had been our intention to have
a Pentium-aware version of MATLAB available by now, but the expanded
scope of the project will add another week or two to the release date.

The basis for the workaround is Coe's characterization, now confirmed
by Tang and Intel, of bit patterns that must be present in the
divisors of operand pairs that lead to the error.  A quick test
of the divisor is done before each FDIV is attempted.  The absence
of the bit pattern indicates that the FDIV can be done safely.  The
presence of the pattern does not guarantee that the error will
occur; it is only a signal that it might.  In this case, scaling
both operands by 15/16 takes the divisor out of the unsafe region
and ensures that the subsequent FDIV will be fully accurate.
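
As an illustration only, the control flow just described can be
sketched in Python rather than in the assembly language macro that is
under development.  The function name guarded_divide and the
caller-supplied predicate is_at_risk are hypothetical names chosen
for this sketch; they are not part of the actual workaround.

```python
def guarded_divide(x, y, is_at_risk):
    """Divide x by y, rescaling both operands first if the divisor is flagged.

    is_at_risk is a hypothetical, caller-supplied test on the divisor.
    """
    if is_at_risk(y):
        # Scaling both operands by 15/16 leaves the mathematical quotient
        # unchanged but moves the divisor out of the flagged region.
        scale = 15.0 / 16.0
        return (x * scale) / (y * scale)
    # Divisor not flagged: the plain division is taken as safe.
    return x / y


# Usage: force the rescaled path for the well-known operand pair.
print(guarded_divide(4195835.0, 3145727.0, lambda y: True))
```

In this sketch the scaled operands are rounded to double precision,
so for general operands the rescaled quotient may differ from x/y in
the last bit.  A production version would have to deal with that.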

With this approach, it is not necessary to test the magnitude of
the residual resulting from a division.  It is known a priori that
all divisions will produce fully accurate results.  An additional
test can compare the result of scaled and unscaled divisions and
thus count the number of FDIV errors that would occur on an
unmodified Pentium.  We will offer this test in MATLAB, but it
may be desirable to turn it off for maximum execution speed.

A simple characterization of the regions to be avoided expresses
floating point numbers in the form

    (n + f)*2^e

where, using MATLAB notation, n is an integer in the range 16:31
and f is a fraction in the range [0,1-eps/2].  The at-risk
divisors occur in the five bands characterized by

    n = 17, 20, 23, 26 or 29

If the test were based on this fact alone, then 5/16 of uniformly
distributed operands (five of the sixteen values of n) would be
rescaled.
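
The decomposition and the coarse band test above might be written as
follows.  This is a minimal sketch that assumes IEEE double precision
operands and nonzero, finite divisors; the names decompose, in_band
and AT_RISK_N are made up for the example.

```python
import math

AT_RISK_N = {17, 20, 23, 26, 29}   # the five bands named in the text


def decompose(y):
    """Return (n, f, e) with abs(y) == (n + f) * 2**e, n in 16..31, 0 <= f < 1."""
    m, e2 = math.frexp(abs(y))     # abs(y) == m * 2**e2 with 0.5 <= m < 1
    scaled = m * 32.0              # exact, since 32 is a power of two
    n = int(scaled)                # integer part of n + f
    return n, scaled - n, e2 - 5


def in_band(y):
    """Coarse test: True when the divisor falls in one of the five bands."""
    n, _, _ = decompose(y)
    return n in AT_RISK_N


for y in (1.0, 3.0, 17.0, 3145727.0):
    print(y, decompose(y), in_band(y))
```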

A more refined characterization reduces the width of the bands,
and hence the number of rescalings required, by counting the
leading high-order bits of f that are equal to one.  A test
requiring the first six bits of f to be one reduces the frequency
of rescaling by a factor of 2^6, to 5/1024.
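
The 5/1024 figure can be checked by enumerating every ten-bit leading
pattern of a divisor (four bits selecting n, six bits of f).  The
sketch below does so; needs_rescale is a hypothetical name, and the
six-bit condition is expressed as f >= 1 - 2^(-6), which is equivalent
to the first six bits of f all being one.

```python
import math
from fractions import Fraction

AT_RISK_N = {17, 20, 23, 26, 29}


def needs_rescale(y, bits=6):
    """True if y lies in a band and the top `bits` bits of f are all ones."""
    m, _ = math.frexp(abs(y))      # abs(y) == m * 2**e with 0.5 <= m < 1
    scaled = m * 32.0              # n + f, exact power-of-two scaling
    n = int(scaled)
    f = scaled - n
    return n in AT_RISK_N and f >= 1.0 - 2.0 ** -bits


# One representative divisor per 10-bit leading pattern: 16 + k/64.
hits = sum(needs_rescale(16 + k / 64.0) for k in range(16 * 64))
print(Fraction(hits, 16 * 64))     # prints 5/1024
```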

For example, the denominator in Coe's now famous ratio

    4195835/3145727

is

    3145727 = 3*2^20-1 = 23.99999237060547*2^17 

In this case, n = 23 and f = 1-2^(-17).  The 17 consecutive high
order ones in f make this example an instance of worst-case error.
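
The arithmetic in this example can be confirmed with a few lines of
Python.  The helper leading_ones is a hypothetical name; it simply
shifts bits out of f one at a time and counts ones until the first
zero.

```python
import math


def leading_ones(f, limit=48):
    """Count consecutive one bits at the high-order end of a fraction 0 <= f < 1."""
    count = 0
    while count < limit:
        f *= 2.0                   # shift the next bit into the units place
        if f < 1.0:                # that bit is zero: stop counting
            break
        f -= 1.0                   # remove the one bit just counted
        count += 1
    return count


m, e2 = math.frexp(3145727.0)      # 3145727 == m * 2**e2 with 0.5 <= m < 1
scaled = m * 32.0                  # n + f
n = int(scaled)
f = scaled - n
print(n, e2 - 5, leading_ones(f))  # prints 23 17 17
```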

Together with the team at Intel, we are now in the process of
refining, proving correct, implementing, testing, timing, validating
and documenting this approach.  We will continue to report on our
progress.

(For background information on the Pentium FDIV problem, including
our previous postings, see World Wide Web site www.mathworks.com
or anonymous FTP site ftp.mathworks.com, directory
/pub/tech-support/moler/Pentium.)

    -- Cleve Moler
    moler@mathworks.com