report from grenoble - fused multiply add - correctly-rounded dot product

David G. Hough on validgh dgh
Sun Aug 4 12:13:01 PDT 1991


Article: 1360 of sci.math.num-analysis
From: pmontgom@euphemia.math.ucla.edu (Peter Montgomery)
Newsgroups: comp.arch,sci.math.num-analysis
Subject: Re: Multiply/accumulate et al
Date: 2 Aug 91 05:46:04 GMT
Organization: UCLA Mathematics Dept.

	During the 10th IEEE Symposium on Computer Arithmetic (ARITH 10)
in Grenoble, France June 26-28, 1991, there was considerable discussion
about possible changes in the way floating point arithmetic 
should be handled by computers.

	On the first morning, D. W. Matula of Southern Methodist 
University (Texas) presented a paper by himself, by G. Bohlender 
and W. Walter of University of Karlsruhe (Germany) and by P. Kornerup
of Odense University (Denmark) titled "Semantics for Exact
Floating Point Operations".  The paper proposes that the
lower part of a sum or product, or the remainder from a divide
or square root, always be available to the user.  For example,
if A and B are floating point numbers, and ROUND(A+B) is their
sum rounded to the nearest floating point number, then
A+B - ROUND(A+B) is always representable exactly 
as a floating point number (unless an exception occurs); 
this error in the sum should be available to the user.
This lower part might be made available through a new opcode, 
a new rounding mode, or an extra output operand on the FADD instruction.
Likewise for the error A*B - ROUND(A*B) in multiplication.
For division and square root, the remainders A - B*ROUND(A/B)
and A - ROUND(SQRT(A))**2 are exactly representable and should
be available.  Since the latter must be calculated anyway 
(though not to full precision) when determining how to round
A/B and SQRT(A), the additional hardware cost should be slight.

	That evening and the next two days, some of us talked
informally about this and other proposals.  There was general
support for having the lower halves available, since they can
be used (for example) to support 128-bit arithmetic given
primitives for 64-bit arithmetic.  

	Some anticipated that 128-bit floating point arithmetic 
(quad precision) would someday become the norm, as double precision 
is in today's scientific programs (though one attendee wanted a 96-bit 
format in order to use less storage).

	When asked about having a multiply/add (MAC) which behaves
like IBM's (no intermediate rounding), none of the numerical
analysts in attendance found it adequate (though it
can be used to get the lower parts for multiply, divide,
and square root).  For example, consider a complex multiply
(a, b) * (c, d) = (a*c - b*d,  a*d + b*c).
If we use an IEEE multiply with rounding to compute a*d followed by
a MAC for the + b*c, then a*d + b*c may not equal b*c + a*d = c*b + d*a, 
and complex multiplication won't be commutative.
On the other hand, being able to get the upper and lower halves for
the products a*c, b*d, a*d, and b*c would let the complex product
be evaluated accurately with some more instructions.
Those in attendance did welcome a MAC with intermediate rounding,
if it performs the same as individual multiply and add but is faster.

	G. Bohlender, D. Cordes, and U. Kulisch (all from Karlsruhe, Germany)
distributed a proposal for a "vector extension" to IEEE.
A "dot precision" or DP data format [they desire a better name]
would be capable of holding an exact dot product,
such as a0*b0 + a1*b1 + a2*b2, where a0, b0, a1, b1, a2, b2
are IEEE double precision numbers (the number of
summands may be any integer representable on the host machine).  
Since the exponent range for double precision operands
spans about 2048 binary digits, the products span about 4096 bits,
and a DP register will consume about 4130 bits.  The proposers
claim that the three operations of initializing a DP to a product,
adding a product to a DP, and rounding a DP to a floating point
value will allow exact arithmetic and make linear algebra packages
much better.  For example, when solving a linear system
A*X = B with A, B known, and an approximate solution vector
X0, the error vector B - A*X0 can be calculated exactly using DP's,
and A*X1 = ROUND(B - A*X0) solved approximately to get a better approximation
X0 + X1 for X.  Recommended additional operations on DP's included 
addition, subtraction, and comparison of two DP values.
The advocates also claimed it would add only 20% to the size of a
floating point chip.  Hardware designers in attendance complained
that the huge register size of a DP would degrade context switching time.
I am skeptical, in part because my own programs need multiple precision
INTEGER dot products but could not easily benefit from this proposal:
there is no proposed way to extract the lower bits of a DP (or to
round/truncate it to an integer and return the integer), and there is
no provision to scale a DP by a power of 2.  And while sums of products
can be evaluated exactly by DP's, they will not handle other expressions
such as polynomial evaluation.
--
        Peter L. Montgomery 
        pmontgom@MATH.UCLA.EDU 
        Department of Mathematics, UCLA, Los Angeles, CA 90024-1555
This mathematician needs a job.  All I've got so far is jury duty.


