inlining pow(x,two)

Wed Mar 27 09:25:23 PST 1991

Investigating a customer's performance complaint led me to wonder what 
current consensus there is about inlining pow(x,two) in an ANSI-C
compiler, where two is known to be some kind of 2 or 2.0 etc.  
The highest-performance thing to do is to compute
	t=x
	t*t
to avoid computing x twice (it may be an expression with function calls)
but that doesn't meet the requirement of setting errno=ERANGE
in case of overflow.  Some kind of machine-dependent test, not expressible
in C, has to be performed to determine whether overflow occurred
in the multiplication; the cost of that test will dwarf the cost of
the multiplication on all highly-pipelined systems, whether IEEE or not,
because of the cost of the conditional branch.
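
For illustration only, here is a minimal sketch of what such an inlined
expansion might look like in a later C dialect (C99) on an IEEE 754
system, where an overflowing multiply yields infinity and isinf() is
available.  Neither assumption holds for ANSI C in general, which is the
point made above.  The helper name square_with_errno is hypothetical and
not from any library.

```c
#include <errno.h>
#include <math.h>

/* Hypothetical helper: square the argument once, then emulate the
   overflow reporting that pow() is required to perform.
   Assumes IEEE 754 doubles, where overflow produces infinity. */
static double square_with_errno(double x)
{
    double t = x;      /* the argument expression is evaluated only once */
    double r = t * t;

    /* Finite input but infinite result means the multiply overflowed.
       This test and branch are the overhead discussed above. */
    if (isinf(r) && !isinf(t))
        errno = ERANGE;

    return r;
}
```

The multiply itself is one instruction; everything after it exists only
to maintain errno.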

And that's why errno has to go, not to mention matherr.

In Fortran, there's no doubt; for the fragment

	subroutine pow2(x,y,z)
	real*8 x,y,z
	y=x**2
	z=x**2.0d0
	end

Sun Fortran 1.3.1 at -O4 produces

        save    %sp,-72,%sp
        ldd     [%i0],%o0
        std     %o0,[%fp-8]
        ldd     [%fp-8],%f0
        fmuld   %f0,%f0,%f0
        std     %f0,[%i1]
        ldd     [%fp-8],%f4
        fmuld   %f4,%f4,%f4
        std     %f4,[%i2]
        ret
        restore

which could be improved upon but gets rid of the external references.