NaNs

Tue Aug 28 12:27:48 PDT 1990

You won't be able to figure out what to do about NaNs by asking users what
they want.   That would be like asking users what they wanted from a
floating-point arithmetic standard before one had been written down.
They would have asked for, and gotten, too little.

IEEE 754 came into being more by thinking about what providers of high-quality
mathematical software, much but not all of it portable, could and would exploit 
if it were widely available and fully supported, balancing that against 
probable cost.

The only thing fully portable mathematical software can presume from a 754
system is that there is at least one quiet NaN and one signaling NaN.
The purpose of the signaling NaN is to interrupt or invalidate computation
if used.  The purpose of the quiet NaN is to indicate that an invalid
computation occurred and to fill the place where the invalid result
would go.

Distinguishing individual quiet and signaling NaNs is already somewhat
system-dependent, but that's not necessarily bad; one of the 
often-forgotten goals listed in the ANSI-C rationale is to permit 
nonportable coding in a portable language.

Consequently, for SunOS sys4-3.2 and 4.0 I enhanced scanf to recognize
+-NaN and +-NaN(...) without yet attaching any particular significance
to the string within the parens that would ultimately describe the
NaN.  I could interpret the string later without changing the syntax.
As for case, there's no significance attached to upper or lower case e or E
in exponents in C or Fortran, so I attach none to the letters representing
infinity or NaN.

In general, if you write any number, including inf or NaN,
to sufficiently wide unformatted binary output and
read it back in, it's unchanged in any bit.   This is the ideal
that formatted conversions should approach.  If the field width
is wide enough to permit unambiguous specification of any finite
floating-point number, it should permit unambiguous specification
of any NaN, so that it can be written out and read back in unchanged.

That's fine for similar systems.  For different systems, it should
suffice that a signaling NaN written on one system be read as a
signaling NaN on another, not necessarily the same one, since "the same"
meaning may not be available.   Similarly, a quiet NaN written on
one system should be read as a quiet NaN on another.

This requires that the distinction between signaling and quiet
NaNs be portable, and that any system be able to read in the NaN
output of another system and interpret it as quiet or signaling,
even if it misinterprets anything beyond that.  Along these lines:

> 3) Full support for all NaNs.  In addition to above, support a string
>    representation that allows input and displays output of the general
>    bit pattern that NaNs are allowed to have.
>  
>    A suggested general NaN syntax:
>      NaN(n-char-sequence)
>  
>      n-char-sequence:
>        SQ-bit SQ-value(opt)
>  
>      SQ-bit:
>        S or Q
>  
>      SQ-value:
>        "," integer-constant
>  
>    The syntax  of integer-constant is from  ANSI C section  3.1.3.2 that
>    allows for  unsigned decimal or  octal or hexadecimal  constants with
>    any number of digits.
>  
>    The semantics of the  general NaN will need some words  to the effect
>    that  the value  is  a 22-bit  unsigned  integer  for floats,  51-bit
>    unsigned integer for doubles, and a  ??-bit unsigned integer for long
>    double (where ??   depends upon the implementation,  but is typically
>    62).  That is, the type  of these integer-constants do not correspond
>    to the int, long int, unsigned long int, ... of C.
>  
>    Some sample general NaNs are:
>     -NaN(Q,0x3fffff)
>     +NaN(S,017777777)
>      NaN(Q,4194303)
>      NaN(S)
>      NaN(S,0x5555555555555)
>  
>    If this level were adopted, we would still allow shorthands for the
>    common cases, as per level 2:  NaN, NaNS, and NaNQ.
>  
> Since a signaling NaN cannot be generated by any IEEE-754/854 operation,
> there are two ways to create signaling NaNs.

One important intended source of signaling NaNs is uninitialized storage.
Setting storage to all one bits generates signaling NaNs on Sun-3's
and Sun-4's, for instance.  Another possible source of signaling NaNs
is special system hardware or software that deals with exceptions by
inserting signaling NaNs and continuing; subsequent references to those
signaling NaNs cause the intended value to be fetched from a heap and
used to compute a result.  Mary Payne suggested such an approach for
over/underflow about ten years ago, and 754 was intended to allow such
a thing to be done.  Another possibility is to use a signaling NaN
to represent, for instance, exact pi or e; there are lots of issues
to be worked out to make that useful.   I'm not aware of anybody using
signaling NaNs at present except for detecting uninitialized storage,
at least in the Unix or PC worlds; in the more homogeneous Mac
world, there may have been some applications.