[Cfp-interest 2254] Re: [SC22WG14.20776] Overflow, normalized numbers, N2805 and N2806

Joseph Myers joseph at codesourcery.com
Mon Nov 1 14:45:23 PDT 2021


On Sat, 30 Oct 2021, Jim Thomas wrote:

> > On Oct 29, 2021, at 9:58 AM, Joseph Myers <joseph at codesourcery.com> wrote:
> > 
> > On Thu, 28 Oct 2021, Jim Thomas wrote:
> > 
> >> The intention of N2806 was not to change the definition of normalized, 
> >> but to simplify it and eliminate the typo. The new wording of the 
> >> definition of normalized floating-point numbers “all x with f1 > 0” is 
> >> in the context of the floating-point model described in the preceding 
> >> two paragraphs where x is given by the summation in paragraph 3 right 
> >> after “p, emin, and emax are fixed constants” in paragraph 2. For a 
> >> given type, the implementation defines fixed p, emin, and emax and 
> >> provides all the normalized floating-point numbers (represented by the 
> >> summation), along with signed or unsigned zero, in the type. The type 
> >> may include other numbers, including ones larger than the maximum 
> >> normalized floating-point number. The normalized floating-point numbers 
> >> give a common (parameterized) range of numbers for C portability.
> > 
> > That's not how I read the wording, so maybe N2806 needs more work (in 
> > which case N2805 and N2843, and maybe N2842, should be deferred if we 
> > don't have a fixed version of N2806 in time).  I read it as saying that 
> > f_1 > 0 makes a number normalized (whether or not all values of the f_i 
> > for that exponent result in a value representable in the type).
> 
> I agree that f1 > 0 just defines normalized. It’s the preceding words 
> “Floating types shall be able to represent signed zeros or an unsigned 
> zero (all fk == 0) and all normalized floating-point numbers …” that 
> state the requirement. Isn’t that clear? If not we can work on a 
> rewording.

I think the old wording (+ typo fix) has the effect that a value with a 
given exponent is considered normalized only if all values with f_1 > 0 
and that exponent in the floating-point model are representable in the 
type (and so LDBL_NORM_MAX < LDBL_MAX for certain formats).  I don't 
think the wording in N2806 has that effect.
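
To make the difference between the two readings concrete, here is the 
arithmetic under an assumed double-double-like long double format with 
b = 2, p = 106 and emax = 1024 (these parameters are my illustration, 
not taken from the papers).  The model, and its largest value at the 
top exponent, are

  x = s \cdot b^{e} \sum_{k=1}^{p} f_k \, b^{-k}, \quad 0 \le f_k < b,
  \quad \text{normalized iff } f_1 > 0

  (1 - b^{-p}) \, b^{e_{\max}} = (1 - 2^{-106}) \cdot 2^{1024}

If that largest value is not representable in the type, then under the 
old wording (+ typo fix) no value with e = emax counts as normalized, 
so LDBL_NORM_MAX would be at most (1 - 2^{-106}) \cdot 2^{1023}, below 
LDBL_MAX.  Under the N2806 wording as I read it, a representable value 
with f_1 > 0 and e = emax is still normalized, whether or not every 
choice of the f_k at that exponent is representable.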

That old wording, with LDBL_NORM_MAX < LDBL_MAX, was fine because it 
didn't affect overflow / fpclassify / isnormal: those could still treat 
values that have the full precision, but an exponent for which not all 
choices of digits are representable, as not overflowing, and as normal 
because they fit none of the other classifications.  Nor did it affect 
LDBL_MAX_EXP.
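
As a quick check of that status quo, a small C program can print the 
relevant <float.h> macros and the classification of LDBL_MAX.  This is 
only a sketch of mine, not something from the papers; LDBL_NORM_MAX is 
a C23 <float.h> addition, so it is guarded, and %La assumes hexadecimal 
floating-point printf support.

  #include <float.h>
  #include <math.h>
  #include <stdio.h>

  int main(void)
  {
  #ifdef LDBL_NORM_MAX
    /* C23: maximum normalized number; may be smaller than LDBL_MAX for
       formats like the ones discussed here.  */
    printf("LDBL_NORM_MAX = %La\n", LDBL_NORM_MAX);
  #endif
    printf("LDBL_MAX      = %La\n", LDBL_MAX);
    printf("LDBL_MAX_EXP  = %d\n", LDBL_MAX_EXP);
    /* With the existing definitions, the largest finite value is
       classified as normal even if it exceeds LDBL_NORM_MAX.  */
    printf("isnormal(LDBL_MAX)                = %d\n", isnormal(LDBL_MAX));
    printf("fpclassify(LDBL_MAX) == FP_NORMAL = %d\n",
           fpclassify(LDBL_MAX) == FP_NORMAL);
    return 0;
  }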

I don't think the wording should be changed to introduce an ambiguity.  
And I think that, for formats where numbers with the largest exponent 
have the same precision as for smaller exponents but the largest finite 
value is not the one suggested by the generic floating-point model, the 
existing definitions of overflow, fpclassify / isnormal and LDBL_MAX_EXP 
are the right ones, and it would be bad to change those definitions 
(directly or indirectly) so that values with that exponent are considered 
overflowing, or are not classified as normal, or so that LDBL_MAX_EXP 
changes.
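
A short sketch of what keeping those definitions means in practice (my 
example; whether the computed value really lies above LDBL_NORM_MAX 
depends on the format - for an IBM double-double style long double I 
believe it does, but that is an assumption on my part):

  #include <fenv.h>
  #include <float.h>
  #include <math.h>
  #include <stdio.h>

  #pragma STDC FENV_ACCESS ON

  int main(void)
  {
    feclearexcept(FE_OVERFLOW);
    /* A finite result close to (but below) LDBL_MAX; on formats where
       LDBL_NORM_MAX < LDBL_MAX it may exceed LDBL_NORM_MAX while still
       having the largest exponent.  */
    volatile long double x = LDBL_MAX;
    volatile long double y = x * 0.75L;
    /* Under the existing definitions no overflow is raised and the
       result is classified as normal; redefining overflow and isnormal
       in terms of the normalized range would change both answers.  */
    printf("y = %La\n", y);
    printf("FE_OVERFLOW raised = %d\n", fetestexcept(FE_OVERFLOW) != 0);
    printf("isnormal(y)        = %d\n", isnormal(y));
    return 0;
  }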

For formats that have supernormal values (a range where precision is 
less than for smaller exponents), it seems more reasonable to change the 
definitions so that supernormal values are considered overflowing, are 
not classified as normal, and are not considered for LDBL_MAX_EXP.

-- 
Joseph S. Myers
joseph at codesourcery.com

