[Cfp-interest 2048] Re: Underflow

Mon Jul 5 09:50:12 PDT 2021

On Fri, 2 Jul 2021 12:08:49 -0700 Jim Thomas wrote:
>
>> Both
>>  Float from Float * Float, eg (MinFloat*(1.f+FLT_EPSILON))  *  ( 1.f - FLT_EPSILON)  is MinFloat after rounding
>>  Float from Float / Float
>> Are not wider precision (everything done as Float), but have a rounding.
>
>I don't understand this response.
>
>For example, sin(x) might just return x after comparing x with an underflow threshold. What was rounded?

The mathematical result was rounded.

The Taylor series expansions:
  sin(x) = x - x**3/6 + ...
 asin(x) = x + x**3/6 + ...
help determine when underflow should happen.
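
As a rough illustration of "the mathematical result was rounded", here is a sketch that evaluates just the first two Taylor terms in double. The function names are made up for this note, and this is not how a real libm computes sin or asin. For tiny x the cubic term is far below half an ulp of x, so the sum rounds back to x even though the mathematical value differs from x.

```c
#include <float.h>

/* First two Taylor terms only, evaluated in double arithmetic.
   Illustration only: the names are hypothetical and a real libm
   does not compute sin/asin this way. */
double sin_two_terms(double x)  { return x - x * x * x / 6.0; }
double asin_two_terms(double x) { return x + x * x * x / 6.0; }

/* Nonzero exactly when both two-term sums round back to x itself,
   i.e. when the cubic correction is lost to rounding. */
int cubic_term_is_lost(double x)
{
    return sin_two_terms(x) == x && asin_two_terms(x) == x;
}
```

With IEEE double, cubic_term_is_lost(DBL_MIN) is nonzero, while cubic_term_is_lost(0.5) is zero. Note that the intermediate x*x*x itself underflows for x near DBL_MIN, which is separate from the question of whether sin(x) as a whole should.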

Let min be the minimum normal number.
Let den be the minimum subnormal number.

Consider these cases:

sin(min), which is just under min and rounds to min
asin(min), which is just over min and rounds to min
hypot(0,min), which is exactly min
scalbn(min,-1), which is exactly min/radix
nextafter(min,0), which is exactly the largest subnormal

sin(den), which is just under den and rounds to den
asin(den), which is just over den and rounds to den
hypot(0,den), which is exactly den
scalbn(den,1), which is exactly radix*den
nextafter(0,1), which is exactly den

So, 
sin(min) should be underflow
asin(min) should not be underflow
hypot(0,min) should not be underflow
scalbn(min,-1) should not be underflow
nextafter(min,0) should not be underflow (but is required to be)

sin(den) should be underflow
asin(den) should be underflow
hypot(0,den) should not be underflow
scalbn(den,1) should not be underflow
nextafter(0,1) should not be underflow (but is required to be)

To me, that means the mathematical result determines when the result
is underflow: its magnitude is nonzero and below min, and it is not
equal to the returned value (that is, the returned value is inexact).

So, I am in favor of something like:

  The result underflows (unless specified otherwise) if the magnitude
  (absolute value) of the mathematical result is nonzero and less than
  the minimum normal number in the type and not equal to the result in
  the type. [footnote 249]
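
The wording above can be read as a three-part predicate. A minimal sketch for float, assuming the caller can supply the mathematical result exactly as a double (true for the Float * Float example quoted at the top, not true in general); underflows_float is a name invented for this note:

```c
#include <float.h>
#include <stdbool.h>

/* Hypothetical predicate for the proposed wording, specialised to float.
   `exact` stands in for the mathematical result and is assumed to hold
   it without error; `returned` is the value delivered in the type. */
bool underflows_float(double exact, float returned)
{
    double mag = exact < 0.0 ? -exact : exact;

    return mag != 0.0                    /* nonzero */
        && mag < (double)FLT_MIN         /* below the minimum normal number */
        && exact != (double)returned;    /* not equal to the result in the type */
}
```

For the product (FLT_MIN*(1.f+FLT_EPSILON)) * (1.f-FLT_EPSILON), the exact value FLT_MIN*(1 - FLT_EPSILON**2) fits in a double, the float result rounds to FLT_MIN, and the predicate says underflow. For FLT_MIN/2 returned exactly, it says no underflow.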

---
Fred J. Tydeman        Tydeman Consulting
tydeman at tybor.com      Testing, numerics, programming
+1 (702) 608-6093      Vice-chair of PL22.11 (ANSI "C")
Sample C99+FPCE tests: http://www.tybor.com
Savers sleep well, investors eat well, spenders work forever.