Numerical operators in NCEG

Sat Feb 3 16:37:50 PST 1990

     Sun engineering people can get a formatted copy of this document via
tbl ~dgh/memo/nceg.operators | eqn | troff -ms
Be aware that ~dgh is moving from teak to turnpike tomorrow.

     I think the way to deal with a variety of issues relating to mathematical
operators in C is to declare them as mathematical operators and have __NCEG__
compilers know about them.  Let's see how this would work:

     You'd compile (in Unix) cc -nceg, for instance.  That would cause
__NCEG__ to be defined during compilation, and NCEG-appropriate error handling
to be set at program startup time.  <math.h> would look this this:

#ifndef __NCEG__

enum version { c_issue_4, ansi_1, strict_ansi };
                                        /* SVR4 feature */

extern double acos(double);             /* Conventional definitions */
extern double asin(double);
        ...

#else /* __NCEG__ */

enum version { c_issue_4, ansi_1, strict_ansi, __nceg__ };
                                        /* Slight extension Sun and others
                                         would have to do anyway */

#if first way   /* nceg compiler has to keep an internal list anyway */

#__nceg_operators

#elif second way /* OR pretend nceg compiler doesn't keep an internal list */

                /* A maximal listing follows that is subject to trimming */
#operator abs atof aint ceil floor fmod max mmax min mmin nint square strtod pow
#operator copysign fpclass logb nextafter remainder scalb dprod
#operator sqrt hypot cbrt ldexp
#operator acos asin atan atan2 cos sin tan cosh sinh tanh asinh acosh atanh
#operator exp expm1 exp2 exp10 log log1p log2 log10 compound annuity
#operator j0 j1 jn y0 y1 yn rand48 erf erfc lgamma gamma

#elif third way /* OR nobody agrees whether #pragma can do anything useful,
                so define a new concept that can affect semantics */

#__nceg_pragma operator abs atof ceil floor ...

#endif /* __NCEG__ */

extern enum version _lib_version;       /* Global variable that tells
                                        functions how to handle exceptions
                                        at run time */

     What's an operator?  An operator is a reserved token known to the com-
piler like sizeof or + that applies to all numeric types.  Like +, the excep-
tion environment is not specified in general although it would be a good idea
for the new operators to have at least as good exception handling facilities
as the old ones like +. Thus the exception-handling requirements are more
stringent for VAXes than Crays and more stringent yet for IEEE systems.  The
last is the topic for another discussion.

     Some people claim that ANSI-C allows operator implementation.  What it
appears to allow is the following:  if <math.h> is included then the compiler
is permitted to know what the functions sqrt, exp, pow, etc. mean, and then
possibly implement them without calling functions sqrt(), exp(), and pow().
The compiler is not permitted to bypass the errno requirements, however, and
although you can recognize sqrt(float) as special, I don't see where the stan-
dard permits an implementation to recognized sqrt(long double) as permitting a
call to a long double function of a long double argument.  To preserve the
ANSI-C sqrt() semantics the compiler must convert the argument to double and
produce a double result.

     I'm not perturbed by the fact that understanding these operators makes a
NCEG compiler more complicated than a plain X3J11 one. It's still far less
complicated than a Fortran compiler.  And you don't have to implement NCEG to
comply with X3J11.

     Some people claim that C++ allows operator definition for arbitrary
types, and that is true.  But does a C++ compiler have any more built-in
knowledge of complex arithmetic than a C compiler has of sqrt?  If not, it
would not be likely to generate as good code as a Fortran compiler, for
instance.

     Besides exception handling, operators transfer the mental burden from the
application writer to the compiler/library implementer to document and
remember all the names for all the functions for all the old types int, long,
float, double, long double, and for new types that people might invent such as
complex int, complex long, complex float, complex double, complex long double,
interval float, ... complex interval long double, ...  I don't know whether
any of these beyond complex double will make it into the final NCEG report.

     What are the type semantics of these numerical operators ?  Corbett fig-
ured this out a number of years ago in the Fortran context, but I can give you
some flavor:  generally speaking the type of the result of an operator is the
smallest type encompassing its operands and destination.

        encompass(ieee single, 32-bit int) = ieee double
        encompass(ieee single, ieee double) = ieee double
        encompass(ieee single, 16-bit int) = ieee single
        encompass(n-bit int, m-bit unsigned) = whatever ANSI-C says
etc.  Exceptions include atof and strtod in which the types of the operands
are irrelevant.

     The destination type is always known in C, I think, unlike Fortran.  The
destination type of x = op(y) is the type of x; the destination type of (cas-
type) op(y) is castype; the destination type of op(z) in printf("..",op(z)...)
is int or double depending on z; Corbett's papers appeared in SIGPLAN and SIG-
NUM and possibly elsewhere.

     Implementation: it's not necessary to provide a hidden library function
for every combination of possible operands and results.  How many cases you
choose to implement is a quality-of-implementation issue.  For instance,
separate implementations of transcendental functions for integer arguments and
results aren't generally worthwhile.  But implementers would be well advised
to recognize important special cases like pow(floating,int) and
scalb(floating,int).

     The non-ANSI-C operators on the list are mostly from SVR4, plus a few
more.  It's not important now to get bogged down in the details of what's on
the list and what's not.  Here are some reminders about how some of them work;
most were mentioned in my X3J11 commentaries:

mmax and mmin
     are magnitude max and min.

aint and nint
     converts to the nearest integral value, by rounding toward zero or round-
     ing to nearest biased, respectively.

square(x)
     produces x*x

dprod(x,y)
     produces x*y in the smallest type that can hold all products of typex *
     typey without roundoff error or exponent spill, or in long double other-
     wise.

pow(x,y)
     is not obliged to produce the same results when x is floating and y is
     int or unsigned, as when x is floating and y is floating.

ldexp(x,y) and scalb(x,y)
     differ on systems with non-binary radix R by producing x*2**y and x*R**y
     respectively for integral y.

fmod(x,y) and remainder(x,y)
     both produce exact results, but the sign of fmod agrees with the sign of
     x while the sign of (IEEE 754) remainder may not agree with that of x or
     y.

exp1(x) and log1p(x)
     compute exp(x)-1 and log(1+x), both essential for accurate calculations
     involving exponential growth, such as interest rates, but that does not
     relieve us of the obligation to provide

compound(x,y) and annuity(x,y)
     since these most common financial functions are so often miscomputed if
     not built in.

rand48(x)
     is a SVR4 random number generator with commendably long period and pro-
     portionate expense.

lgamma(x)
     computes log(gamma(x)) for gamma(x) > 0 and handles zero and negative
     gamma(x) like log(x) does.

gamma(x)
     computes gamma(x) as it should have from the beginning.