bit-for-bit IEEE compliance vs performance

Thu Sep 28 10:27:01 PDT 1995

Jonathan Thornburg suggests pragmas for identifying sensitive and non-sensitive
floating-point source code.

I suspect that's overkill, and anyway most people couldn't tell where to put
the pragmas.   I think a simpler solution has already been implemented,
at least at Sun and probably elsewhere.

Beyond what you can get with the optimization flags -O or -O[345...], there
are additional compiler optimization flags that could in principle be applied
on a per-function basis by separate compilation.

-fsimple, for any architecture, instructs the compiler to optimize as if
floating-point variables had no modes or exceptions or side effects.   There
have been arguments about how much else to put into that option, e.g. whether
division by an explicit constant should be changed into multiplication.   The
important thing to recognize is that it hardly ever makes a noticeable
difference on realistic applications, but it's there for benchmarking and
other purposes.
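
To make the division-by-constant argument concrete, here is a minimal sketch
in C.  The function names are invented for illustration, and the volatile
qualifiers are only there to keep a compiler from rewriting or folding the
arithmetic on its own.  It assumes IEEE double arithmetic in the default
round-to-nearest mode.

```c
/* Divide by the constant 10.0 as written: one correctly rounded operation. */
double div_by_ten(double x) {
    volatile double d = 10.0;   /* volatile: keep the division a division */
    volatile double q = x / d;  /* volatile: round the result to double */
    return q;
}

/* The rewritten form: multiply by the double nearest to 1/10.
 * That constant is already rounded, so the product is rounded twice. */
double mul_by_tenth(double x) {
    volatile double r = 0.1;
    volatile double p = x * r;
    return p;
}

/* Returns 1 when the two forms yield different doubles for this x. */
int forms_differ(double x) {
    return div_by_ten(x) != mul_by_tenth(x);
}
```

For x = 3.0 the two forms disagree in the last bit; for x = 1.0 they agree.
A difference of that size is what the arguments are about, which is also why
it so rarely shows up in the final answers of realistic applications.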

-nofstore, for x86 architectures, instructs the compiler not to force
storage (in order to force rounding to storage precision) on explicit
assignments.   Its opposite, -fstore, is the default.   This particular
option no doubt can make a noticeable difference in performance, and it
has analogs in other compilers: -ffloat-store in GCC and -mp in the
Intel reference compilers.
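
What the forced store changes can be sketched in C as follows.  This is an
illustration only, not what any compiler actually emits: the function names
are made up, long double stands in for an x87 extended register, and the
volatile assignment stands in for the store that -fstore forces.  It assumes
that long double is wider than double, as it is with the 80-bit x87 format.

```c
#include <float.h>

/* Sum kept in the wider format, as a register-resident intermediate
 * might be when stores are not forced (the -nofstore situation). */
long double sum_in_register(double a, double b) {
    return (long double)a + (long double)b;
}

/* Sum forced through a double in memory, which rounds it to storage
 * precision (the -fstore situation). */
double sum_stored(double a, double b) {
    volatile double s = (double)((long double)a + (long double)b);
    return s;
}

/* Returns 1 if the wider intermediate carries bits that the store to
 * double discards, 0 otherwise. */
int store_changes_value(double a, double b) {
    return sum_in_register(a, b) != (long double)sum_stored(a, b);
}
```

With a = 1.0 and b = 0x1p-60, sum_stored returns exactly 1.0, while
sum_in_register keeps the small term wherever long double has at least 61
significand bits.  The extra store is where the performance cost comes from.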

Beyond that, as I have remarked before, the x86 implementation is practically
by definition IEEE 754 compliant.    IEEE 754 allows quite a bit of latitude
in a number of aspects.    Unfortunately, this latitude has been exploited
more often for ill than for good, and so it is probably not as good an idea
as obtaining uniform numerical results now appears to be.