Floating Point Instructions

Bill Voegtli uunet!mips.com!voegtli
Fri Jan 10 13:12:52 PST 1992



  This newsgroup has quieted down; and as a hardware
  guy, the long long vs int64 discussions just don't
  grab me, so .....


  I have some questions concerning floating point
  operations and would appreciate comments.


  1)  Instruction Counts

    Throughout computer history, there have been numerous 
    studies of instructions and their relative frequencies.
    Have there been any studies of the relative 
    frequencies of just floating point operations ?

    There is a trend in RISC microprocessors to add
    additional FP instructions for Square Root, 
    Inverse Square Root, ...  Are these instructions 
    justified ?
  
    There is an old "rule of thumb" that :
      - Add, Subtract, and Multiply should have
        roughly equal latencies.
      - Divide latency may be 3-4 times that of
        Multiply/Add/Subtract.
      - Square Root latency may be 3-4 times that
        of Divide.
    Assuming acceptable latency should track frequency
    of occurrence, do real-world programs really
    reflect these ratios ?
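    A back-of-the-envelope sketch of why that rule of thumb can
    hold: with an entirely hypothetical operation mix (the numbers
    below are assumptions, not measured data) and latencies in the
    3-4x ratios above, a slow divide barely moves the average.

    ```python
    # Rule-of-thumb latencies in cycles (assumed, not any real chip):
    latency = {"add": 3, "sub": 3, "mul": 3, "div": 12, "sqrt": 40}

    # Hypothetical dynamic-frequency mix (made up for illustration):
    mix = {"add": 0.45, "sub": 0.15, "mul": 0.30, "div": 0.08, "sqrt": 0.02}

    # Frequency-weighted average latency per FP operation:
    avg = sum(mix[op] * latency[op] for op in mix)
    print(f"average FP latency: {avg:.2f} cycles")

    # Halving divide latency (12 -> 6 cycles) saves only mix['div'] * 6:
    saving = mix["div"] * 6
    print(f"saving from 2x faster divide: {saving:.2f} cycles ({saving/avg:.1%})")
    ```

    With this (assumed) mix, doubling divide speed buys only about a
    tenth of the average latency, which is the Amdahl's-law argument
    for relaxing divide and square root.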


  2)  Alignment & Normalization 

    In 1965, Sweeney (of SRT fame) analysed a bunch
    of programs to determine the frequency of alignment
    and normalization shifts AND the distances.  This
    work influenced the hexadecimal radix of the 360
    family.  Are there any more recent studies ?
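    For concreteness, the alignment shift before an FP add is just
    the exponent difference of the operands.  A toy model (ignoring
    IEEE special cases like zeros, infinities, and NaNs) that one
    could run over real operand traces to gather Sweeney-style
    shift-distance histograms:

    ```python
    import math

    def alignment_shift(a: float, b: float) -> int:
        """Bit positions the smaller operand's significand must be
        shifted right to match exponents before an add (toy model;
        ignores IEEE special cases)."""
        (_, ea), (_, eb) = math.frexp(a), math.frexp(b)
        return abs(ea - eb)

    # Operands of similar magnitude need short shifts ...
    print(alignment_shift(1.0, 1.5))      # 0
    # ... while widely different magnitudes force long ones.
    print(alignment_shift(1.0, 2.0**40))  # 40
    ```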


  3)  Precision

    In a similar vein, are there any studies concerning
    the relative frequencies of double and single 
    precision ?  I thought SPUR was only double precision,
    but I never heard any conclusion as to whether this 
    is a good idea.
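    To make the stakes of the single-vs-double question concrete,
    here is a small sketch that rounds a double through IEEE single
    storage format (via Python's struct module), showing the
    precision given up when a value is kept in single:

    ```python
    import struct

    def to_single(x: float) -> float:
        """Round a double to IEEE single precision and back, by
        packing it into 32-bit storage format."""
        return struct.unpack("f", struct.pack("f", x))[0]

    third = 1.0 / 3.0
    print(f"double: {third:.17g}")            # ~16 significant decimal digits
    print(f"single: {to_single(third):.17g}")  # only ~7 survive
    ```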


  4)  Denormals

    Some microprocessors are handling denormal inputs in
    hardware.  Is this justified ?

    There are well-known programs that exhibit underflow,
    but how often does it occur ?  Does the usual Flush-to-Zero
    mode have a real impact on the "typical" mix of scientific
    programs ?
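    One concrete thing Flush-to-Zero costs: with gradual underflow
    (denormals), x - y == 0 implies x == y; under FTZ that guarantee
    is lost.  A sketch below, where flush_to_zero is my own toy model
    of an FTZ mode, not any particular hardware's behavior:

    ```python
    import math
    import sys

    def flush_to_zero(x: float) -> float:
        """Toy model of a flush-to-zero mode: any nonzero result
        smaller in magnitude than the smallest normal number
        becomes a (signed) zero."""
        if x == 0.0 or abs(x) >= sys.float_info.min:
            return x
        return math.copysign(0.0, x)

    tiny = sys.float_info.min      # smallest normal double, ~2.2e-308
    x, y = tiny * 1.5, tiny * 1.25

    # With gradual underflow the difference is a nonzero denormal:
    print(x - y)
    # Under FTZ it flushes to zero, even though x != y:
    print(flush_to_zero(x - y))
    ```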



OPINIONS :

  - I would be very surprised if divisions comprised
    more than 10% of a mix of scientific programs.

  - Accordingly, the latency of Division and
    Square Root can be relaxed without much overall cost.

  - I think FP instructions should be optimized for
    short alignment/normalization shifts, and the
    longer shifts should take more time.  Handling
    LZA (leading-zero anticipation) for double precision
    is difficult with respect to time-space tradeoffs.
    Handling LZA for quad precision may preclude its
    adoption by designers.

  - We should spend more hardware on handling denormals
    than on handling division and square root instructions. 

  - We need an additional program in the SPEC benchmark
    that is concerned with the accuracy of floating
    point computations as well as the speed.  It galls 
    me to see some microprocessors with high SPEC marks 
    that are notoriously inaccurate.

  - We need additional programs in the SPEC benchmark
    that show computations in application specific disciplines,
    including graphics, finite-element analysis, etc ...
    Are these disciplines any different in terms of floating
    point computations ?

-- 
UUCP: {ames,decwrl,prls,pyramid}!mips!voegtli    (or voegtli@mips.com)
USPS: MIPS Computer Systems, 930 Arques Ave, Sunnyvale CA, 94086

Standard Disclaimer :  I speak from MIPS, not for MIPS.



