[Cfp-interest 2267] Fwd: [SC22WG14.20798] Overflow, normalized numbers, N2805 and N2806

Jim Thomas jaswthomas at sbcglobal.net
Sun Nov 14 09:53:03 PST 2021


This is a message I sent to Rajan as background information and response possibilities for the WG14 discussion this coming week about the issues Joseph Myers raised about our recent proposals and support for double-double formats (see Cfp-interest 2247, 2254, and 2259).

- Jim Thomas

> Begin forwarded message:
> 
> From: Jim Thomas <jaswthomas at sbcglobal.net>
> Subject: Re: [SC22WG14.20798] Overflow, normalized numbers, N2805 and N2806
> Date: November 12, 2021 at 4:08:45 PM PST
> To: Rajan Bhakta <rbhakta at us.ibm.com>
> Cc: Joseph Myers <joseph at codesourcery.com>, "Fred J. Tydeman" <tydeman.fred at gmail.com>
> 
> 
> Rajan,
> 
> Here are some thoughts about presenting CFP proposals N2805, N2806, N2842, and N2843, in the context of the issues Joseph raised.
> 
> Note that Joseph is CCed, in hopes he will correct any misrepresentations of his concerns and review the suggestions.
> 
> - Jim
> 
> Background
> 
> C99 and C18 have no explicit requirement about which model numbers are in a type. However, there are implicit requirements. For example, the definitions of the characteristics macros include formulas referring to the model parameters. The type_MAX formula (1 − b^(−p)) * b^emax implies that the maximum representable finite number in the type is the maximum normalized floating-point number in the model. Likewise, the type_MIN formula b^(emin−1) implies that the minimum normalized positive floating-point number in the type is the minimum positive normalized floating-point number in the model.
> 
> Neither C99 nor C18 has an explicit requirement for floating types to include all normalized floating-point numbers in the model. However, the type_DIG formula is derived assuming the density of floating-type representations given by the model. Users might make the same assumption.
> 
> This C99/C18 specification works for IEC 60559 formats and others whose internal representations correspond closely enough to the C model, but can be problematic for others, notably double-double formats. (Double-double formats represent numbers as pairs of doubles, but implementations put different constraints on which pairs constitute valid representations; there isn’t a standard double-double format.)
> 
> The current C23 draft includes changes intended to accommodate double-double formats, e.g. revised type_MAX and new type_NORM_MAX.
> 
> The current C23 draft includes (what is intended to be) an explicit requirement for floating types to include all normalized floating-point numbers in the model. It also clarifies that the model parameters are fixed for each floating type. Thus (referring to the model formula) all x with f1 > 0, with any f2 … fp, and with any e such that emin <= e <= emax, are the normalized floating-point numbers for the given parameters, and all must be representable in the type.
> 
> The CFP proposals
> 
>  N2805
> 
>  Joseph’s concern (for double-double formats) is that this definition requires overflows in cases where the result is in the range of full precision representations. A possible replacement for #5:
> 
> A floating result overflows if a finite result value with ordinary accuracy would have magnitude (absolute value) too large to represent with full precision in the specified type. A result that is an exact infinity does not overflow. …
> 
> A problem with this definition (and others) is the requirement in the next sentence:
> 
> If a floating result overflows and default rounding is in effect, then the function returns the value of the macro HUGE_VAL, HUGE_VALF, or HUGE_VALL according to the return type, with the same sign as the correct value of the function.
> 
> (This requirement seems to be the only definition for the value of the HUGE_VAL macros. Vincent recently raised issues with HUGE_VAL, which are on the agenda for the next CFP meeting.)
> 
> A possible solution might be to append after the second sentence in #5:
> 
> Alternatively, for types with reduced-precision representations of values beyond the overflow threshold, the function may return a representation of the result with less than full precision for the type.
> 
> N2806
> 
> I suggest (see background above) you present this paper as-is. See if the committee wants a rewording of the definition of normalized floating-point number and/or the requirement that all normalized floating-point numbers in the model be represented in the type. Ask the committee if they would like #4 expanded into separate paragraphs as Joseph suggests (with normalized, subnormal, and unnormalized being defined in separate paragraphs).
> 
> N2842
> 
> Joseph’s concern (for double-double formats) is that this proposal does not classify as normal (or as anything else) numbers that are in the range of full precision representations but beyond the range for normalized floating-point numbers. A possible change would be to insert before the last sentence in new paragraph 0:
> 
> Larger magnitude finite numbers represented with full precision in the type may also be classified as normal.*)
> 
> *) Double-double formats can provide full precision representations of numbers greater than the maximum normalized floating-point number (5.2.4.2.2).
> 
> N2843
> 
> Joseph’s concern (for double-double) is that the change requires LDBL_MAX_EXP to be smaller than the exponent for the largest-magnitude full-precision values in the type. A possible change would be to add a qualification (as was done for FLT_MAX, etc.) to the formula:
> 
>  - maximum integer such that FLT_RADIX raised to one less than that power is a representable finite floating-point number. If that representable finite floating-point number is normalized, emax
> 
>  
>  
