Floating Point Instructions <9201102112.AA20434adalek.mips.com>

sun!Eng!Keith.Bierman
Mon Jan 13 10:14:12 PST 1992


>....

    There is a trend in RISC microprocessors to add
    additional FP instructions for Square Root, 
    Inverse Square Root, ...  Are these instructions 
    justified ?

Generally, yes. There is a synergy between what is in the machine and
what savvy programmers code.

"square-root" based formulations of some algorithms are numerically
more stable than non-sqrt formulations. All sorts of pain and
suffering result from forcing numerical coders to choose.
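
As a concrete sketch (my own toy example, not anything from the
original question): Cholesky factorization, the kernel inside
square-root filters, needs one sqrt per column, so the stable
formulation is exactly the one a slow sqrt penalizes:

    /* Minimal sketch (mine, not from the post): Cholesky factorization
     * of a symmetric positive-definite A into L*L^T.  One sqrt per
     * column, so sqrt cost hits the stable formulation directly. */
    #include <math.h>
    #include <stdio.h>

    #define N 3

    int cholesky(double a[N][N], double l[N][N])
    {
        for (int j = 0; j < N; j++) {
            double s = a[j][j];
            for (int k = 0; k < j; k++)
                s -= l[j][k] * l[j][k];
            if (s <= 0.0)
                return -1;                /* not positive definite */
            l[j][j] = sqrt(s);            /* one sqrt per column */
            for (int i = j + 1; i < N; i++) {
                double t = a[i][j];
                for (int k = 0; k < j; k++)
                    t -= l[i][k] * l[j][k];
                l[i][j] = t / l[j][j];
            }
        }
        return 0;
    }

    int main(void)
    {
        double a[N][N] = {{4, 2, 2}, {2, 5, 3}, {2, 3, 6}};
        double l[N][N] = {{0}};
        if (cholesky(a, l) == 0)
            printf("L[2][2] = %g\n", l[2][2]);   /* prints 2 */
        return 0;
    }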

It is not sufficient merely to observe what coders *have done*; one
must also ponder what they were trying to accomplish.

In some sort of ideal universe the costs would be as similar as
possible (and faster than light) so that programmers could worry about
really important things. Having some key operation (say divide) be
much more costly than others means numerical analysts have to worry
about the machine, not the algorithm and not the data .... ill-conditioned
systems, unstable algorithms, and crummy machines are very hard to
tell apart just from symptoms. And of course, making it go fast is
often *necessary*; there is no point in solving the problem too late
for the Control system ;>
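
A small made-up illustration of the kind of contortion I mean: when
divide is much slower than multiply, coders hoist one reciprocal out
of the loop and multiply by it, accepting answers that can differ in
the last bit from straight division:

    /* Sketch (mine): trading n divides for 1 divide + n multiplies.
     * The two results may differ in the last bit or two -- the
     * algorithm is being bent to fit the machine. */
    #include <stdio.h>

    void scale_divide(double *x, int n, double d)
    {
        for (int i = 0; i < n; i++)
            x[i] /= d;                   /* n divides */
    }

    void scale_reciprocal(double *x, int n, double d)
    {
        double r = 1.0 / d;              /* 1 divide ... */
        for (int i = 0; i < n; i++)
            x[i] *= r;                   /* ... and n multiplies */
    }

    int main(void)
    {
        double a[3] = {1.0, 2.0, 3.0};
        double b[3] = {1.0, 2.0, 3.0};
        scale_divide(a, 3, 49.0);
        scale_reciprocal(b, 3, 49.0);
        printf("%.17g vs %.17g\n", a[2], b[2]);  /* may differ in the last bit */
        return 0;
    }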

>  2)  Alignment & Normalization 

>    In 1965, Sweeney (of SRT fame) analysed a bunch

Machines that have alignment requirements (e.g. MIPS, SPARC) are
harder to move codes to. If ISVs don't port (and validate) their
codes, the effective computation speed is zero.
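
For example (a toy of my own, not taken from any ISV code): pulling a
double out of a byte stream with a pointer cast is fine on machines
that tolerate misaligned loads, but can trap, or be fixed up very
slowly by the kernel, on strict-alignment machines; memcpy is the
portable rewrite that has to be made and then re-validated:

    /* Sketch (mine): reading a double from a byte buffer. */
    #include <stdio.h>
    #include <string.h>

    double load_unportable(const char *buf)
    {
        return *(const double *)buf;   /* undefined if buf is misaligned */
    }

    double load_portable(const char *buf)
    {
        double d;
        memcpy(&d, buf, sizeof d);     /* alignment-safe everywhere */
        return d;
    }

    int main(void)
    {
        char buf[1 + sizeof(double)];
        double x = 3.25;
        memcpy(buf + 1, &x, sizeof x);           /* odd offset: typically misaligned */
        printf("%g\n", load_portable(buf + 1));  /* prints 3.25 on any machine */
        return 0;
    }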

Denorms happen pretty often in some applications. In one large class
of algorithms, one tends to solve for deviations from some sort of
nominal (e.g. computing a "correction term"). As the problem
progresses, the desired solution is closer and closer to zero. If
denorms cost a lot, the application runs slower and slower as accuracy
requirements are raised. This isn't what folks expect!
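
Here is a little sketch (mine, exaggerated for brevity) of a
correction term sliding into the subnormal range; on hardware that
handles denorms with a trap or microcode assist, it is precisely the
well-converged end of the run that slows down:

    /* Sketch (mine): a shrinking "correction term" crosses into the
     * subnormal range just as the answer gets good. */
    #include <math.h>
    #include <stdio.h>

    int main(void)
    {
        double correction = 1.0;
        int steps = 0;

        while (correction != 0.0) {
            correction *= 0x1p-40;       /* each "iteration" shrinks it */
            steps++;
            if (fpclassify(correction) == FP_SUBNORMAL)
                printf("step %d: correction %g is subnormal\n",
                       steps, correction);
        }
        printf("underflowed to zero after %d steps\n", steps);
        return 0;
    }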

>  3)  Precision
>
>    In a similar vein, are there any studies concerning
>    the relative frequencies of double and single 

As has been pointed out, one workload is nothing like someone else's.
Trying to find a single magic workload is doomed to failure. Doing
vastly worse than users expect on any metric is often enough to get
them to stick with their current vendor.

Single precision, IEEE, is good enough for a wide range of estimation
problems I used to deal with. Nonetheless, many customers insisted on
double precision .... because that was what was required on their old
IBM and UNIVAC machines. I suspect that if SP is much faster, many
will migrate back (or already have; I left that area some time ago).
Decisions about mission-critical codes are made slowly. Far more
slowly than we evolve chips!!!
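
For a rough sense of what "good enough" means (my numbers come from
the IEEE 754 formats, not from any customer code): single carries a
24-bit significand, about 7 decimal digits, while double carries 53
bits, about 16; data measured to three or four digits fits
comfortably in single:

    /* Sketch (mine): the precision each IEEE format actually carries. */
    #include <float.h>
    #include <stdio.h>

    int main(void)
    {
        printf("float : %2d significand bits, eps = %g\n",
               FLT_MANT_DIG, FLT_EPSILON);
        printf("double: %2d significand bits, eps = %g\n",
               DBL_MANT_DIG, DBL_EPSILON);
        /* A value measured to 0.1% is already uncertain in its 4th
           digit, far above FLT_EPSILON ~ 1.2e-7. */
        return 0;
    }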





