[Cfp-interest 2723] Re: definition of "floating types"

Jim Thomas jaswthomas at sbcglobal.net
Thu Mar 2 10:11:00 PST 2023



> On Mar 2, 2023, at 8:01 AM, Vincent Lefevre <vincent at vinc17.net> wrote:
> 
> On 2023-03-02 16:17:27 +0100, Vincent Lefevre wrote:
>> On 2023-03-01 20:58:57 -0800, Jim Thomas wrote:
>>> On Mar 1, 2023, at 8:39 AM, Vincent Lefevre <vincent at vinc17.net> wrote:
>>>> IMHO, the standard should require that the evaluation format
>>>> associated with float be float_t, and similarly for double and
>>>> long double.
>>>> 
>>>> Would this break actual implementations?
>>> 
>>> Maybe yes. In the past, there were implementations that evaluated
>>> expressions in wider registers but (when they ran out of registers)
>>> stored intermediate values into narrower storage formats.
>> 
>> However, a narrower storage format, e.g. float, could be regarded as
>> the evaluation format (e.g. double), but with a reduced accuracy of
>> the operation (as the accuracy is implementation-defined). With an
>> x86 processor, this is similar to an operation performed on double
>> but where the processor is configured to round in single precision
>> (even though no float type is involved).
> 
> In practice, there is even an issue with FLT_EVAL_METHOD = 2:
> 
> #include <stdio.h>
> #include <float.h>
> #include <math.h>
> 
> int main (void)
> {
>  volatile double x = 1.0, y = 0x1p-55;
> 
> #if __STDC__ == 1 && __STDC_VERSION__ >= 199901 && defined(__STDC_IEC_559__)
>  printf ("__STDC_IEC_559__ defined, FLT_EVAL_METHOD = %d\n",
>          (int) FLT_EVAL_METHOD);
> #endif
> 
>  printf ("d = %.17g\n", (double) (fma(x,x,y) - fmal(x,x,y)));
> 
>  return 0;
> }
> 
> gives
> 
> __STDC_IEC_559__ defined, FLT_EVAL_METHOD = 2
> d = -2.7755575615628914e-17
> 
> under Linux x86 with both GCC 12 and Clang 13, using the 32-bit ABI
> (optimizations do not seem to matter), while I would expect d = 0.

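Evaluation methods don’t affect function calls, including library functions. (TS 18661-5 provides a way of having them do so.) So fma() must produce a result in double format, whatever the evaluation method. In your example the exact value 1 + 2^-55 is therefore rounded to 1.0 by fma(), while fmal() can represent it exactly in the x87 extended format, which gives d = -2^-55 (about -2.78e-17).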

- Jim Thomas

> 
> -- 
> Vincent Lefèvre <vincent at vinc17.net> - Web: <https://www.vinc17.net/>
> 100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
> Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)



