slightly revised NCEG memo on exception handling and operators

Mon Mar 5 20:53:24 PST 1990

The following is the memo I'll be bringing to the NCEG meeting Weds
in NY.

     Sun engineering people can get a formatted copy of this document via
tbl ~dgh/memo/nceg.operators | eqn | troff -ms

NCEG Proposal - Exception Handling

     The following outlines a way of describing exceptions in an NCEG report
that encompasses most existing implementations without too much work, while
pointing out preferable alternatives. I'm interested in your comments as to
whether this line of thinking is worth pursuing.  SHALL constrains all NCEG
implementations; SHOULD is a recommendation.

     Basically the idea is that an NCEG programmer can rely on this: for each
type of exception, there will be one uniform treatment on a system, and the
programmer can find out what it is in general terms without risking an abort.
What the programmer can't rely on is that any particular type of exception
handling will be available on a particular system, nor that all exceptions can
be handled the same way on a particular system.  An NCEG system conforming to
IEEE allows a programmer to assume a great deal more.

4.1.7 Floating-Point Exceptions

     Operators and functions which take floating-point operands or produce
floating-point results may produce a variety of exceptions on different sys-
tems.  Some of these are inherent in the underlying mathematical functions of
real variables; others are consequences of the finiteness of the set of
representable floating-point numbers.   In the following, "functions" and
"operators" are interchangeable; the distinction between functions and opera-
tors is important in language definition and compiler implementation but both
represent mathematical functions.

4.1.7.1 - Exceptions arising from mathematical functions

     These exceptions must be dealt with in defining mathematical functions,
independent of any consideration of computers.

     SINGULARITY/POLE/DIVISION-BY-ZERO exception.  If division is viewed as a
function of finite real operands producing a finite real result, then an
operation like 1/0 has no meaning.  However the real number system can be
extended by adding an unsigned point at infinity or by adding two signed
infinities; then it may be meaningful to assign infinite results to operations
like 1/0.  Thus a SINGULARITY exception arises when an infinite result is
assigned to a function of finite real operands.  Since the meaningfulness of
such assignment varies among applications, an exception is appropriate to
alert applications for which such assignments are inappropriate and on systems
or for types for which such assignments are impossible.

     From the point of view of traditional complex analysis, the singularity
exceptions encompass poles like 1/0 and logarithmic singularities like log(0),
but not removable or essential singularities.  Applications like continued
fractions are common enough to suggest that singularity exceptions be dis-
tinguished from the following:

     INVALID/DOMAIN exception.  In contrast to the foregoing, 0/0 can't be
assigned a value consistently over broad classes of applications.  If 0/0
arises as lim(x->a) f(x)/g(x), for instance, the appropriate limit depends on
f and g.  Each application is a special case.  Thus an INVALID (operand)
exception arises when no finite or infinite real result can be meaningfully
assigned to a function of finite or infinite real operands.  Hence the real
function sqrt(-1) causes an invalid exception but a complex function would
not.

     An implementation may also define invalid exceptions on attempts to use
non-numeric representations like IEEE signaling NaN or VAX reserved operand.
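
As a minimal sketch of the distinction drawn in this section, the following C
function classifies a division of finite operands before it is performed:
nonzero/zero is a SINGULARITY, zero/zero is INVALID.  The enum and function
names are hypothetical and are not part of any NCEG or X3J11 interface.

```c
/* Hypothetical names, for illustration only. */
typedef enum {
    DIV_OK,           /* nonzero divisor: ordinary division */
    DIV_SINGULARITY,  /* nonzero/zero: an infinite result could be assigned */
    DIV_INVALID       /* zero/zero: no result can be assigned consistently */
} div_class_t;

/* Classify num/den for finite operands.  Non-numeric operands such as
   NaNs are outside the scope of this sketch. */
div_class_t classify_division(double num, double den)
{
    if (den != 0.0)
        return DIV_OK;
    return (num != 0.0) ? DIV_SINGULARITY : DIV_INVALID;
}
```

An application for which an infinite quotient is meaningful (a continued
fraction, say) could proceed on DIV_SINGULARITY and stop only on DIV_INVALID.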

4.1.7.2 - Exceptions arising from finite floating-point arithmetic.

     These exceptions don't occur in the context of exact mathematics but are
artifacts of finite computer number representations.

     INEXACT exception.  All fixed-length numeric data types can only
represent a finite number of real values.  Consequently, in order to continue,
a computed real value that is not representable exactly must be rounded
according to some rule to produce a representable number.  Thus an INEXACT
exception arises when an unrepresentable real value is replaced by some dif-
ferent representable value.

     IEEE 754 and 854 systems detect these normal inexact exceptions.  There
is an implementation-defined class of "normal" rounding errors; other excep-
tions are defined for abnormal ones, which are detected on most systems:

     INTEGER OVERFLOW exception.  Integer addition, subtraction, and multipli-
cation are usually exact, while division is expected to produce an error < 1
in absolute magnitude.  Thus an INTEGER OVERFLOW exception occurs when a
larger than normal roundoff error, >= 1 in magnitude, occurs in storing a
value into an integer or unsigned type.  Many C implementations don't detect
integer overflow.
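
Since many C implementations don't detect integer overflow, a program that
cares has to test for it itself.  The following is one conventional way to do
that for addition, written as a sketch; checked_add() is a hypothetical helper
name, not something X3J11 or NCEG defines.

```c
#include <limits.h>

/* Hypothetical helper: store a + b in *sum and return 1 when the sum is
   representable as an int; return 0 and leave *sum untouched otherwise.
   The test is done before adding, so no overflowing addition is executed. */
int checked_add(int a, int b, int *sum)
{
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return 0;
    *sum = a + b;
    return 1;
}
```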

     FLOATING-POINT OVERFLOW exception.  Every fixed-length floating-point
data type has a largest finite representable value.  Associated with this
bound is the possibility of unusually large errors in converting an arbitrary
real value to a floating-point representation.  Thus a FLOATING-POINT OVERFLOW
exception arises when a large real value can't be stored with only normal
roundoff error.

     UNDERFLOW exception.  Every fixed-length floating-point data type has a
smallest positive representable value different from zero. Associated with
this bound is the possibility of unusually large errors in converting an arbi-
trary real value to floating-point representation.  Thus an UNDERFLOW excep-
tion arises when a small real value can't be stored with only normal roundoff
error.

     IEEE 754 and 854 systems detect underflow in some cases for results that
are intermediate in size between the smallest normal and smallest subnormal
values, as well as for non-zero results that round to zero.
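
To make the two bounds concrete, here is a small sketch that sorts a double by
magnitude using only the <float.h> limits.  It assumes an IEEE 754 double type
with infinities and gradual underflow; the enum and function names are
hypothetical.

```c
#include <float.h>

/* Hypothetical names, for illustration only. */
typedef enum { MAG_NAN, MAG_ZERO, MAG_SUBNORMAL, MAG_NORMAL, MAG_INFINITE }
    mag_class_t;

/* Classify x by magnitude.  A result beyond DBL_MAX is what a default
   overflow response leaves behind on an IEEE system; a nonzero result
   below DBL_MIN is in the subnormal range where underflow may arise. */
mag_class_t classify_magnitude(double x)
{
    double mag = (x < 0.0) ? -x : x;
    if (x != x)        return MAG_NAN;
    if (mag == 0.0)    return MAG_ZERO;
    if (mag < DBL_MIN) return MAG_SUBNORMAL;
    if (mag > DBL_MAX) return MAG_INFINITE;
    return MAG_NORMAL;
}
```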

4.1.7.3 - Exceptions arising from roundoff magnification or algorithmic limitations

     LARGE ERROR exception.  Nobody has done this yet, but in a context of
correctly-rounded elementary transcendental functions, one can imagine imple-
mentations which indicate a larger-than-normal rounding error from time to
time, but leave the result as is if the user-indicated action is to ignore the
large error exception.  Thus a LARGE ERROR exception arises when a real value
can't or won't be stored with only normal roundoff error, for reasons other
than extremely large or small magnitude.  Generally speaking a large error
exception could be viewed as indicating that an unusually large roundoff
error - as defined by an implementation - was tolerated for better efficiency.
For instance, in IEEE 754 or 854 implementations, base conversions that are
not quite correctly rounded SHOULD indicate this exception, but by better
algorithms, better implementations SHOULD avoid this exception.

     SVID defines exception types TLOSS and PLOSS for "total and partial loss
of significance" in results of trigonometric and Bessel functions.  PLOSS
signifies function results that are highly sensitive to roundoff errors in the
function arguments, even though the functions themselves might be exact, and
thus is related to significance exceptions signaled on some older systems.
The ultimate source of the roundoff sensitivity is subtraction of nearly-equal
quantities, at least one of which is affected by roundoff.  Thus PLOSS arises
in ordinary subtraction, in remainder-type operations like fmod(), and in
log(x) and pow(x,y) for x near 1, as well as in trigonometric argument reduc-
tion.  Since roundoff and subtraction of similar quantities are endemic in
floating-point computation, the general problem of bounding roundoff requires
analytical error analysis or computational bounds using means such as interval
arithmetic.  Because there is no natural boundary between normal roundoff
accumulation and PLOSS situations, NCEG implementations SHOULD avoid indicat-
ing these exceptions.
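
The mechanism behind PLOSS can be shown with ordinary arithmetic.  In the
sketch below, which assumes IEEE 754 double arithmetic, the subtraction is
exact; the damage was done when 1 + e was rounded, and the subtraction merely
exposes it.  recover_increment() is a hypothetical name.

```c
/* Hypothetical demonstration: try to recover e from (1 + e) - 1. */
double recover_increment(double e)
{
    volatile double shifted = 1.0 + e;  /* rounded to double here */
    return shifted - 1.0;               /* exposes the earlier rounding */
}
```

With e = 0.5 the value comes back intact.  With e = 1e-16, less than half a
unit in the last place of 1.0, the sum rounds to 1.0 and the function returns
zero: a total loss of significance from one addition and one subtraction, with
nothing about either operation to mark it as abnormal.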

     TLOSS signifies function results which are completely incorrect due to
algorithmic limitations.  For NCEG systems, such results SHOULD be avoided by
using algorithms without such limitations.

4.1.7.4 - Methods for handling exceptions

     Systems implement a variety of exception-handling mechanisms; these are
often provided inconsistently.  These include:

Ignore and continue
     with a default result.

Set errno
     per X3J11, and continue with a default result.  Setting the global errno
     to EDOM is appropriate for singularity and invalid exceptions, and to
     ERANGE is appropriate for inexact, large error, integer overflow,
     floating-point overflow, or floating-point underflow exceptions.

Call matherr()
     per SVID.  matherr() is a fixed error-handling function that by default
     will fix a substitute numerical value and may also print a message.  User
     programs may redefine matherr().

Set an accrued exception bit
     per IEEE 754 or 854, and continue with a default result. The accrued
     exceptions may be tested for NCEG IEEE systems by calling
     getfpexception().

SIGFPE
     generating a SIGFPE on the occurrence of an exception.  IEEE 754 or 854
     systems sometimes define SIGFPE to be their response to IEEE 754
     "trapped" exceptions.  An implementation may define a default result that
     will be used if the SIGFPE returns normally.

abort
     and terminate the process.
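
Of the mechanisms listed above, "set errno and continue with a default result"
is the one that can be shown with nothing but standard library calls.  The
sketch below wraps strtod(); parse_double() is a hypothetical name, and
reporting EDOM for unparsable input is this sketch's own choice, since strtod()
itself does not set errno in that case.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical wrapper.  Returns 0 on success, EDOM when no number could
   be parsed (the atof("xyz") case), or ERANGE when strtod() reports a
   range error.  *value always receives strtod()'s default result. */
int parse_double(const char *text, double *value)
{
    char *end;

    errno = 0;                 /* strtod() never clears errno itself */
    *value = strtod(text, &end);
    if (end == text)
        return EDOM;           /* this sketch's choice, not strtod()'s */
    if (errno == ERANGE)
        return ERANGE;
    return 0;
}
```

Note that the caller has to clear errno beforehand and inspect it afterwards,
which is part of why this style of handling is easy to get wrong.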

4.1.7.5 - NCEG requirements for exception handling

     X3J11-defined operators such as +-*/, and algebraic functions such as
sqrt and strtod, that have floating-point operands or results SHALL handle
each exception in a uniform manner, whether the operator or function is imple-
mented in hardware or software.  The handling of different exceptions need not
be uniform, however.  In addition, transcendental functions SHOULD handle
exceptions other than inexact uniformly with algebraic functions and opera-
tors.  How, or whether, inexact exceptions of transcendental functions are
detected is not specified for NCEG systems.

     Implementations define default results when execution is to continue with
or without noting the exception.  Suitable default results for most invalid
and TLOSS exceptions would be IEEE NaNs, VAX Reserved Operands, or Cray
indefinites.  Suitable default results for most singularity exceptions would
be IEEE infinities or largest possible representable magnitudes.  The integer
overflow, floating-point overflow, underflow, large error, PLOSS, and inexact
exceptions all suggest their appropriate default results: representable values
chosen to be close to the exact result according to some implementation-
defined rule.

     The following functions support run-time inquiries:

        typedef enum { nceg_ignore, nceg_errno, nceg_matherr,
                       nceg_status, nceg_sigfpe, nceg_abort, nceg_other
                     } nceg_handler_t;

        typedef enum { nceg_invalid, nceg_singularity,
                       nceg_integer_overflow, nceg_floating_overflow,
                       nceg_underflow, nceg_inexact, nceg_large,
                       nceg_ploss, nceg_tloss
                     } nceg_exception_t;

        nceg_handler_t get_nceg_handler ( nceg_exception_t ) ;
        int set_nceg_handler ( nceg_exception_t, nceg_handler_t ) ;

get_nceg_handler() returns the current type of handling for a particular
exception.  set_nceg_handler() allows setting the handling for that exception.
set_nceg_handler() returns 1 if the request succeeds, 0 if it fails because
the implementation doesn't support that type of handling for that exception.
An implementation may support more than one type of exception handling per
exception type.

     NCEG implementations of IEEE 754 and 854 SHALL handle exceptions as fol-
lows:

                    Exception           Default   Optional

                    invalid             status    sigfpe
                    singularity         status    sigfpe
                    large               ignore    sigfpe
                    integer overflow    ignore    ignore
                    floating overflow   status    sigfpe
                    underflow           status    sigfpe
                    inexact             status    sigfpe
                    tloss               ignore    ignore
                    ploss               ignore    ignore

Other NCEG implementations SHALL provide the same handling, except that the
defaults may be different for implementation efficiency reasons.  In such
cases, the omitted IEEE-style default handling SHOULD be available as an
option.
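
To show how the inquiry functions and the IEEE table fit together, here is a
bookkeeping-only sketch.  The type and function names are the ones proposed
above; the table layout and the rule "a request succeeds only for the default
or the optional handling" are assumptions of this sketch, and nothing here
actually installs a trap or touches the floating-point hardware.

```c
typedef enum { nceg_ignore, nceg_errno, nceg_matherr, nceg_status,
               nceg_sigfpe, nceg_abort, nceg_other } nceg_handler_t;

typedef enum { nceg_invalid, nceg_singularity, nceg_integer_overflow,
               nceg_floating_overflow, nceg_underflow, nceg_inexact,
               nceg_large, nceg_ploss, nceg_tloss } nceg_exception_t;

/* Default and optional handling transcribed from the IEEE table; the
   third field is the handling currently in force. */
static struct {
    nceg_handler_t dflt;
    nceg_handler_t optional;
    nceg_handler_t current;
} handling[] = {
    [nceg_invalid]           = { nceg_status, nceg_sigfpe, nceg_status },
    [nceg_singularity]       = { nceg_status, nceg_sigfpe, nceg_status },
    [nceg_integer_overflow]  = { nceg_ignore, nceg_ignore, nceg_ignore },
    [nceg_floating_overflow] = { nceg_status, nceg_sigfpe, nceg_status },
    [nceg_underflow]         = { nceg_status, nceg_sigfpe, nceg_status },
    [nceg_inexact]           = { nceg_status, nceg_sigfpe, nceg_status },
    [nceg_large]             = { nceg_ignore, nceg_sigfpe, nceg_ignore },
    [nceg_ploss]             = { nceg_ignore, nceg_ignore, nceg_ignore },
    [nceg_tloss]             = { nceg_ignore, nceg_ignore, nceg_ignore },
};

nceg_handler_t get_nceg_handler(nceg_exception_t e)
{
    return handling[e].current;
}

/* Returns 1 if the request succeeds, 0 if this (mock) implementation
   does not support that handling for that exception. */
int set_nceg_handler(nceg_exception_t e, nceg_handler_t h)
{
    if (h != handling[e].dflt && h != handling[e].optional)
        return 0;
    handling[e].current = h;
    return 1;
}
```

A portable program would call set_nceg_handler() for the handling it prefers
and fall back to whatever get_nceg_handler() reports when the request fails.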

4.1.7.6 - Catalog of mathematical exceptions for X3J11 operators and functions

     Most X3J11 operators and functions may suffer from integer overflow,
floating overflow, underflow, large, and inexact exceptions because of inevit-
able finite data representations.  They SHOULD NOT suffer from tloss or ploss
exceptions.  The following catalogs mathematical exceptions that may occur for
finite operands:

        Operator             Exception     Example
        /                    singularity   nonzero/zero
        /                    invalid       zero/zero
        atof strtod *scanf   invalid       atof("xyz");
        acos,asin            invalid       argument magnitude > 1
        atan2                invalid       atan2(0,0)
        fmod                 invalid       fmod(x,0)
        log,log10            singularity   log(0)
        log,log10            invalid       log(-1)
        pow                  invalid       pow(negative,non-integral)
        pow                  singularity   pow(zero,negative)
        sqrt                 invalid       sqrt(-1)
                                           ...table to be continued
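
A few rows of the catalog, written out as argument checks.  This is only a
sketch of how the log and asin/acos entries read as code; the names are
hypothetical and finite arguments are assumed.

```c
/* Hypothetical names, for illustration only. */
typedef enum { ARG_OK, ARG_SINGULARITY, ARG_INVALID } arg_class_t;

/* log, log10: log(0) is a singularity, log(negative) is invalid. */
arg_class_t classify_log_argument(double x)
{
    if (x == 0.0) return ARG_SINGULARITY;
    if (x < 0.0)  return ARG_INVALID;
    return ARG_OK;
}

/* acos, asin: invalid when the argument magnitude exceeds 1. */
arg_class_t classify_asin_argument(double x)
{
    return (x > 1.0 || x < -1.0) ? ARG_INVALID : ARG_OK;
}
```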

Operators

     I think the way to deal with a variety of issues relating to mathematical
operators in C is to declare them as mathematical operators and have __NCEG__
compilers know about them.  Let's see how this would work:

     You'd compile (in Unix) cc -nceg, for instance.  That would cause
__NCEG__ to be defined during compilation, and NCEG-appropriate error handling
to be set at program startup time.  <math.h> would look like this:

#ifndef __NCEG__

enum version { c_issue_4, ansi_1, strict_ansi };
                                        /* SVR4 feature */

extern double acos(double);             /* Conventional definitions */
extern double asin(double);
        ...

#else /* __NCEG__ */

enum version { c_issue_4, ansi_1, strict_ansi, __nceg__ };
                                        /* Slight extension Sun and others
                                         would have to do anyway */

#if first way   /* nceg compiler has to keep an internal list anyway */

#__nceg_operators

#elif second way /* OR pretend nceg compiler doesn't keep an internal list */

                /* A maximal listing follows that is subject to trimming */
#operator abs atof aint ceil floor fmod max mmax min mmin nint square strtod pow
#operator copysign fpclass logb nextafter remainder scalb dprod
#operator sqrt hypot cbrt ldexp
#operator acos asin atan atan2 cos sin tan cosh sinh tanh asinh acosh atanh
#operator exp expm1 exp2 exp10 log log1p log2 log10 compound annuity
#operator j0 j1 jn y0 y1 yn rand48 erf erfc lgamma gamma

#elif third way /* OR nobody agrees whether #pragma can do anything useful,
                so define a new concept that can affect semantics */

#__nceg_pragma operator abs atof ceil floor ...

#endif /* first, second, or third way */

#endif /* __NCEG__ */

extern enum version _lib_version;       /* Global variable that tells
                                        functions how to handle exceptions
                                        at run time */

     What's an operator?  An operator is a reserved token known to the com-
piler like sizeof or + that applies to all numeric types.  Like +, the excep-
tion environment is not specified in general although it would be a good idea
for the new operators to have at least as good exception handling facilities
as the old ones like +. Thus the exception-handling requirements are more
stringent for VAXes than Crays and more stringent yet for IEEE systems.  The
last is the topic for another discussion.

     Some people claim that ANSI-C allows operator implementation.  What it
appears to allow is the following:  if <math.h> is included then the compiler
is permitted to know what the functions sqrt, exp, pow, etc. mean, and then
possibly implement them without calling functions sqrt(), exp(), and pow().
The compiler is not permitted to bypass the errno requirements, however, and
although you can recognize sqrt(float) as special, I don't see where the stan-
dard permits an implementation to recognize sqrt(long double) as permitting a
call to a long double function of a long double argument.  To preserve the
ANSI-C sqrt() semantics the compiler must convert the argument to double and
produce a double result.

     I'm not perturbed by the fact that understanding these operators makes an
NCEG compiler more complicated than a plain X3J11 one. It's still far less
complicated than a Fortran compiler.  And you don't have to implement NCEG to
comply with X3J11.

     Some people claim that C++ allows operator definition for arbitrary
types, and that is true.  But does a C++ compiler have any more built-in
knowledge of complex arithmetic than a C compiler has of sqrt?  If not, it
would not be likely to generate as good code as a Fortran compiler, for
instance.

     Besides exception handling, operators transfer the mental burden from the
application writer to the compiler/library implementer to document and
remember all the names for all the functions for all the old types int, long,
float, double, long double, and for new types that people might invent such as
complex int, complex long, complex float, complex double, complex long double,
interval float, ... complex interval long double, ...  I don't know whether
any of these beyond complex double will make it into the final NCEG report.
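
For a sense of what "one name for all types" looks like to the application
writer, here is a sketch using the _Generic selection that later C standards
(C11 onward) provide.  This is not the #operator mechanism proposed in this
memo, only the nearest thing that can be compiled today; square_f, square_d,
and square_ld are hypothetical names of the sort the application writer would
otherwise have to remember.

```c
/* One hidden function per floating type. */
float       square_f(float x)        { return x * x; }
double      square_d(double x)       { return x * x; }
long double square_ld(long double x) { return x * x; }

/* The single name the application writer uses.  Integer arguments fall
   through to the double version, a choice made for this sketch. */
#define square(x) _Generic((x),        \
        float:       square_f,         \
        long double: square_ld,        \
        default:     square_d)(x)
```

Here square(2.0f) has type float and square(2.0L) has type long double, with
no separate names visible in the calling code.  Unlike the operators proposed
here, the selection sees only the operand type, never the destination.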

     What are the type semantics of these numerical operators?  Corbett
figured this out a number of years ago in the Fortran context, but I can give
you some flavor:  generally speaking the type of the result of an operator is
the smallest type encompassing its operands and destination.

        encompass(ieee single, 32-bit int) = ieee double
        encompass(ieee single, ieee double) = ieee double
        encompass(ieee single, 16-bit int) = ieee single
        encompass(n-bit int, m-bit unsigned) = whatever ANSI-C says
etc.  Special cases include atof and strtod in which the types of the operands
are irrelevant.
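
One reading of the first and third examples is that IEEE single carries a
24-bit significand, so it holds every 16-bit int exactly but not every 32-bit
int, while IEEE double holds both.  The sketch below checks that reading on a
system where float is IEEE single; int32_fits_float() is a hypothetical name.

```c
#include <stdint.h>

/* Returns 1 when v converts to float without rounding, 0 otherwise.
   The comparison is done in double, which represents every int32_t and
   every float exactly. */
int int32_fits_float(int32_t v)
{
    volatile float narrowed = (float)v;
    return (double)narrowed == (double)v;
}
```

For instance 16777216 (2**24) survives the conversion but 16777217 does not,
which is why a 32-bit int operand pushes the encompassing type up to double.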

     The destination type is always known in C, I think, unlike Fortran.  The
destination type of x = op(y) is the type of x; the destination type of
(castype) op(y) is castype; the destination type of op(z) in
printf("..",op(z)...) is int or double depending on z.  Corbett's paper
appeared in SIGPLAN and SIGNUM and possibly elsewhere.

     Implementation: it's not necessary to provide a hidden library function
for every combination of possible operands and results.  How many cases you
choose to implement is a quality-of-implementation issue.  For instance,
separate implementations of transcendental functions for integer arguments and
results aren't generally worthwhile.  But implementers would be well advised
to recognize important special cases like pow(floating,int) and
scalb(floating,int).
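
As an example of why pow(floating,int) deserves its own case, here is a sketch
of the usual repeated-squaring method, which needs no logarithm or exponential
at all.  pow_int() is a hypothetical name for what would be a hidden library
routine, and nothing here is claimed to be how any particular implementation
does it.

```c
/* Hypothetical hidden routine for pow(x, n) with integer n, by repeated
   squaring.  Uses about 2*log2(|n|) multiplications. */
double pow_int(double x, int n)
{
    /* Magnitude of n, computed in unsigned arithmetic so that the most
       negative int does not overflow. */
    unsigned int e = (n < 0) ? 0u - (unsigned int)n : (unsigned int)n;
    double result = 1.0;
    double base = x;

    while (e != 0) {
        if (e & 1u)
            result *= base;
        e >>= 1;
        if (e != 0)
            base *= base;   /* skip the final, unneeded squaring */
    }
    return (n < 0) ? 1.0 / result : result;
}
```

Because each multiplication rounds, the result can differ in the last bits
from what a general pow(floating,floating) would return for the same values.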

     The non-ANSI-C operators on the list are mostly from SVR4, plus a few
more.  It's not important now to get bogged down in the details of what's on
the list and what's not.  Here are some reminders about how some of them work;
most were mentioned in my X3J11 commentaries:

mmax and mmin
     are magnitude max and min.

aint and nint
     convert to an integral value, by rounding toward zero or rounding to
     nearest biased, respectively.

square(x)
     produces x*x

dprod(x,y)
     produces x*y in the smallest type that can hold all products of typex *
     typey without roundoff error or exponent spill, or in long double other-
     wise.

pow(x,y)
     is not obliged to produce the same results when x is floating and y is
     int or unsigned, as when x is floating and y is floating.

ldexp(x,y) and scalb(x,y)
     differ on systems with non-binary radix R by producing x*2**y and x*R**y
     respectively for integral y.

fmod(x,y) and remainder(x,y)
     both produce exact results, but the sign of fmod agrees with the sign of
     x while the sign of (IEEE 754) remainder may not agree with that of x or
     y.

expm1(x) and log1p(x)
     compute exp(x)-1 and log(1+x), both essential for accurate calculations
     involving exponential growth, such as interest rates, but that does not
     relieve us of the obligation to provide

compound(x,y) and annuity(x,y)
     since these most common financial functions are so often miscomputed if
     not built in.

rand48(x)
     is an SVR4 random number generator with commendably long period and
     proportionate expense.

lgamma(x)
     computes log(gamma(x)) for gamma(x) > 0 and handles zero and negative
     gamma(x) like log(x) does.

gamma(x)
     computes gamma(x) as it should have from the beginning.
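
The sign rules given above for fmod and remainder can be illustrated with an
integer analogue, sketched below.  The real functions operate on floating
values; these two hypothetical routines only mimic their rounding rules
(quotient truncated toward zero versus quotient rounded to nearest, ties to
even) so the difference in sign is easy to see.

```c
/* fmod-style: quotient truncated toward zero, so the result takes the
   sign of x.  This is what C's % operator does (guaranteed since C99). */
long fmod_style(long x, long y)
{
    return x % y;
}

/* IEEE-remainder-style: quotient rounded to nearest, ties to even, so
   the result may disagree in sign with both x and y.  Assumes y != 0 and
   operands small enough that 2*|r| does not overflow. */
long remainder_style(long x, long y)
{
    long q = x / y;
    long r = x % y;
    long abs_r = (r < 0) ? -r : r;
    long abs_y = (y < 0) ? -y : y;

    /* If the truncated quotient is not the nearest (or, on a tie, not
       even), step it away from zero, which shifts r by |y|. */
    if (2 * abs_r > abs_y || (2 * abs_r == abs_y && q % 2 != 0))
        r += (r > 0) ? -abs_y : abs_y;
    return r;
}
```

For x = 5 and y = 3, fmod_style gives 2 while remainder_style gives -1, since
5/3 rounds to 2 and 5 - 2*3 = -1.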