Floating Point Instructions

David Hough sun!Eng!David.Hough
Fri Jan 10 19:20:10 PST 1992


> 
>   1)  Instruction Counts
> 
>     Throughout computer history, there have been numerous 
>     studies of instructions and their relative frequencies.
>     Have there been any studies of the relative 
>     frequencies of just floating point operations ?

Can't this data be relatively easily derived from Pixie and Spix traces?
But does it really matter?
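Deriving the counts is the easy part.  A minimal sketch in C, assuming a
textual trace with one mnemonic per line - the format and the opcode list
are illustrative stand-ins, not the actual Pixie or Spix output:

    /* Tally floating-point opcode frequencies from a textual
     * instruction trace read on stdin, one mnemonic per line.
     * Prefix matching deliberately lumps together single- and
     * double-precision variants like fadds/faddd. */
    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        static const char *fpops[] = { "fadd", "fsub", "fmul",
                                       "fdiv", "fsqrt" };
        long counts[5] = {0, 0, 0, 0, 0};
        long total = 0;
        char line[128];
        int i;

        while (fgets(line, sizeof line, stdin) != NULL) {
            total++;
            for (i = 0; i < 5; i++)
                if (strncmp(line, fpops[i], strlen(fpops[i])) == 0)
                    counts[i]++;
        }
        for (i = 0; i < 5; i++)
            printf("%-6s %10ld  (%.2f%% of %ld)\n", fpops[i], counts[i],
                   total ? 100.0 * counts[i] / total : 0.0, total);
        return 0;
    }

The harder question is whether the aggregate numbers mean anything: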

False Proposition:  The typical user executes a typical program.
False Proposition:  The typical computer executes an average program mix.

I believe each individual workstation, 
and each departmental time-shared machine,
executes a rather specific program or a rather variable
mix of programs that is not necessarily very similar to those found
elsewhere.   What gets users' attention is performance anomalies -
performance much worse than expected, whether rightly or wrongly.
Although most people aren't aware that existing SPARC processors lack
integer multiplication, division, and remainder, number theorists are very
keenly aware.   Similarly, most SPARC owners may not know whether their
machines have hardware sqrt, or whether they are compiling properly to exploit
it, but people doing geometric optics with CODEV most definitely depend on
hardware sqrt whether they know it or not.   The new SPEC suite will include a
benchmark program derived from CODEV.
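
To make the number-theory point concrete: a modular-exponentiation inner
loop is almost nothing but integer multiplies and remainders, so on a
machine that traps those to software routines, the traps dominate the run
time.  A minimal illustrative sketch (not tuned code; the modulus is kept
below 2^16 so products fit in 32 bits):

    #include <stdio.h>

    /* Compute (base ** exp) mod m by repeated squaring.  With m < 2**16
     * all intermediate products fit in 32 bits, so this is safe even
     * where unsigned long is only 32 bits wide. */
    unsigned long modexp(unsigned long base, unsigned long exp,
                         unsigned long m)
    {
        unsigned long result = 1;

        base %= m;
        while (exp > 0) {
            if (exp & 1)
                result = (result * base) % m;  /* multiply + remainder */
            base = (base * base) % m;          /* multiply + remainder */
            exp >>= 1;
        }
        return result;
    }

    int main(void)
    {
        printf("3**1000000 mod 65521 = %lu\n", modexp(3, 1000000, 65521));
        return 0;
    }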

While the happy campers are usually pretty quiet, the disappointed ones tend
to make a lot of noise (especially on the net) that may induce FUD feelings
in other prospective purchasers.

So which is better, making 99% of the users 1% faster or 1% of the users
2X faster?  Even if these decisions could be quantified so precisely, the 
answer isn't clear.
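
Weighting by user count doesn't settle it either.  Taking the numbers at
face value:

    99% of users 1% faster:  0.99 x 1.01 + 0.01 x 1.00 = 1.0099
     1% of users 2X faster:  0.99 x 1.00 + 0.01 x 2.00 = 1.0100

Both work out to about a 1% aggregate gain, yet the two outcomes feel
entirely different to the people involved.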
> 
>     In a similar vein, are there any studies concerning
>     the relative frequencies of double and single 
>     precision ?

Similarly, statistical studies might not be conclusive in any event.
32-bit single precision is very important to some customers, 
not at all to others.   How do you average that out?

>   4)  Denormals
> 
>     Some microprocessors are handling denormal inputs in
>     hardware.  Is this justified ?
> 
For some users and some programs (especially single precision), yes.
How much is it worth to look 2X or more faster
on one class of programs than your competitors?   How much overall system
benefit do you get by NOT supporting subnormals in hardware?  I'd guess
that the main benefit of not supporting subnormals is faster time to market
rather than more gates available for something else. 
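
For the skeptical, a minimal sketch of how a single-precision program
wanders into the subnormal range through gradual underflow - the scaling
loop is illustrative, but on implementations that trap subnormal operands
to software, each iteration spent in that range can cost far more than a
normal multiply:

    #include <stdio.h>
    #include <float.h>

    int main(void)
    {
        float x = 1.0e-30f;   /* still a normal single-precision value */
        int i;

        for (i = 0; i < 40; i++) {
            x *= 0.5f;        /* below FLT_MIN (~1.18e-38): subnormal */
            printf("step %2d: x = %g%s\n", i, (double)x,
                   (x != 0.0f && x < FLT_MIN) ? "  (subnormal)" : "");
        }
        return 0;
    }

The same loop in double precision stays normal for another 900-odd
halvings, which is one reason single-precision codes meet subnormals
sooner.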

>   - Accordingly, the performance of the Division and
>     Square Root can be reduced appropriately.

I'd like to encourage all SPARC competitors to take this advice to heart.
The exceptional job that the TI 8847 did on these instructions 
helped make up for some SPARC shortcomings in other areas at various times.

>   - I think FP instructions should be optimized for
>     short alignment/normalization shifts, and the
>     longer shifts should take more time.  Handling
>     LZA for double precision is difficult with respect
>     to time-space tradeoffs.  Handling LZA for quad
>     precision may preclude adoption of it by designers.

Fortunately this one doesn't depend too much on specific applications.
It can be resolved empirically, I think, by simulating the adds in the
SPEC floating-point suite.   MIPS was built on this kind of analysis!
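
A sketch of what that simulation might look like: histogram the alignment
shift (binary exponent difference) for the operand pairs of each add.  In
a real study the pairs would come from an instrumented run of the SPEC
floating-point suite; reading them from stdin here is just an illustrative
stand-in, and normalization shifts could be gathered the same way from the
results:

    #include <stdio.h>
    #include <stdlib.h>
    #include <math.h>

    int main(void)
    {
        long hist[64] = {0};
        double a, b;
        int ea, eb, shift;

        while (scanf("%lf %lf", &a, &b) == 2) {
            frexp(a, &ea);            /* extract binary exponents */
            frexp(b, &eb);
            shift = abs(ea - eb);     /* alignment distance for the add */
            if (shift > 63)
                shift = 63;           /* clamp the tail into one bucket */
            hist[shift]++;
        }
        for (shift = 0; shift < 64; shift++)
            if (hist[shift])
                printf("shift %2d: %ld adds\n", shift, hist[shift]);
        return 0;
    }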
> 
>   - We should spend more hardware on handling denormals
>     than on handling division and square root instructions. 

I'm in favor of hardware subnormals, sqrt, and division, but I think I've
listed them in increasing order of importance.

>   - We need an additional program in the SPEC benchmark
>     that is concerned with the accuracy of floating
>     point computations as well as the speed.

I agree that performance and conformance ought to be reported together,
but that's beyond SPEC's current charter; at present they don't accept any
test programs that raise exceptions.   It's hard enough to find non-exceptional
realistic applications that produce comparable results on all SPEC members'
platforms.
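
If SPEC's charter ever does expand, a report might pair the two figures
directly.  A minimal sketch - the forward-versus-reverse single-precision
summation is only a placeholder for a realistic application, but it shows
the shape of a combined speed-and-accuracy result:

    #include <stdio.h>
    #include <time.h>

    int main(void)
    {
        const int n = 1000000;
        float fwd = 0.0f, rev = 0.0f;
        clock_t t0, t1;
        int k;

        t0 = clock();
        for (k = 1; k <= n; k++)
            fwd += 1.0f / (float)k;   /* forward: large terms first */
        for (k = n; k >= 1; k--)
            rev += 1.0f / (float)k;   /* reverse: small terms first */
        t1 = clock();

        printf("time:     %.3f s\n", (double)(t1 - t0) / CLOCKS_PER_SEC);
        printf("accuracy: |forward - reverse| = %g\n",
               (double)(fwd > rev ? fwd - rev : rev - fwd));
        return 0;
    }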

>   - We need additional programs in the SPEC benchmark
>     that show computations in application specific disciplines,
>     including graphics, finite-element analysis, etc ...
>     Are these disciplines any different in terms of floating
>     point computations ?

Everybody should study the PERFECT benchmarks as well.   They are all 
Fortran floating point, and several appear to be very crufty old codes,
but they are all different.
You learn a lot from trying to compile and run them.   They tend to be
"harder" somehow than the SPEC programs; they are very
resistant to "matrix300" type tricks.   You also learn other interesting
facts, such as that an unformatted intermediate file measuring 95 MB on SunOS,
AIX, and HP-UX runs to 380 MB on IRIX.   I have been unable to get all the
PERFECT programs to compile and run successfully on our competitors' machines,
but in some cases we don't have the most recent software releases.


