[Cfp-interest 2243] Re: Usual arithmetic conversions for long double and _Float64

Sun Oct 24 14:25:40 PDT 2021

> On Oct 17, 2021, at 9:20 PM, David Olsen <dolsen at nvidia.com> wrote:
> 
> On an implementation where double, long double, and _Float64 all have the same format of IEEE 64-bit, what is the result type of an operation with operands of long double and _Float64?  My reading of the usual arithmetic conversions in section X.4.2 of N2601 <http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2601.pdf> is that the result type is _Float64, because interchange floating types are preferred over standard floating types when the two types have the same sets of values.  Is my interpretation correct?

Yes.
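
As a rough illustration of that reading, here is a small C sketch that asks the compiler which type it gives long double + _Float64 and compares the answer with what the X.4.2 rule would predict from the <float.h> parameters alone. The helper names (sum_type_name, predicted_type_name, agrees_with_reading) are made up for this example, and it assumes a compiler that provides _Float64 and defines FLT64_MANT_DIG when __STDC_WANT_IEC_60559_TYPES_EXT__ is set; otherwise it just reports "unsupported".

```c
/* Ask <float.h> for the _FloatN macros (TS 18661-3 / C23 Annex H). */
#define __STDC_WANT_IEC_60559_TYPES_EXT__ 1
#include <float.h>
#include <string.h>

/* Hypothetical helper: the type of long double + _Float64 as a string,
   or "unsupported" when <float.h> does not advertise _Float64. */
const char *sum_type_name(void)
{
#ifdef FLT64_MANT_DIG
    return _Generic((long double)1 + (_Float64)1,
                    long double: "long double",
                    _Float64:    "_Float64",
                    default:     "other");
#else
    return "unsupported";
#endif
}

/* Hypothetical helper: the type the X.4.2 reading predicts, judged only
   from the <float.h> parameters (an assumption of this sketch). */
const char *predicted_type_name(void)
{
#ifdef FLT64_MANT_DIG
    if (LDBL_MANT_DIG == FLT64_MANT_DIG &&
        LDBL_MIN_EXP == FLT64_MIN_EXP &&
        LDBL_MAX_EXP == FLT64_MAX_EXP)
        return "_Float64";      /* same set of values: interchange type preferred */
    if (LDBL_MANT_DIG >= FLT64_MANT_DIG &&
        LDBL_MIN_EXP <= FLT64_MIN_EXP &&
        LDBL_MAX_EXP >= FLT64_MAX_EXP)
        return "long double";   /* long double is strictly wider */
    return "no prediction";     /* parameters alone do not decide */
#else
    return "unsupported";
#endif
}

/* Returns 1 if the compiler's result matches the prediction,
   or if no prediction was made; 0 otherwise. */
int agrees_with_reading(void)
{
    const char *predicted = predicted_type_name();
    return strcmp(predicted, "no prediction") == 0 ||
           strcmp(predicted, sum_type_name()) == 0;
}
```

Printing sum_type_name() on two implementations with different long double formats would show the portability difference being discussed: the same source expression can have a different type on each.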

> I am working on modifying the C++ floating-point proposal P1467 so that the usual arithmetic conversion rules match those in N2601 and C23, as was requested in the joint SG22/CFP meeting.  I am having trouble with this one situation.
>  
> On implementations where long double is bigger than double, long double + _Float64 -> long double.  I think the usual arithmetic conversion rules should be consistent so that long double + _Float64 -> long double all the time, even when long double has the same set of values as _Float64.  This will help users when porting code between implementations that have different representations for long double. 

How specifically would this help? There’s still the issue of long double arithmetic possibly being less robust than _Float64.

> Is it possible to tweak the usual arithmetic conversion rules in C23 to make this change?

Are you suggesting that there be a special rule that applies only to long double + _Float64 when they have the same formats?

>  
> (I believe that long double + _Float64 is the only combination of standard and interchange floating types where the result type can change between conforming implementations.  I guess long double + _Float128 -> long double if long double is bigger than _Float128, but I don’t think any such implementations exist.)

If long double were double-double then long double and _Float128 would be unordered.

If all possible C implementations are considered, I think there are several exceptions. For example, long double could be narrower than _Float64. (This is not true with strict conformance to the C annex for IEEE floating types.)
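
To make "wider", "narrower", and "unordered" concrete, here is a simplified C sketch that compares two formats by their <float.h>-style parameters. The struct, enum, and function names are invented for this example. It assumes radix-2 formats with contiguous significands and looks only at normal numbers, so it does not model double-double, whose value set cannot be described by these three parameters.

```c
/* Hypothetical description of a radix-2 format in <float.h> terms. */
struct fmt_params {
    int mant_dig;   /* significand precision in bits, like LDBL_MANT_DIG */
    int min_exp;    /* like LDBL_MIN_EXP */
    int max_exp;    /* like LDBL_MAX_EXP */
};

enum fmt_relation { REL_SAME, REL_WIDER, REL_NARROWER, REL_UNORDERED };

/* Relation of format a to format b under the simplified model:
   a covers b when it has at least b's precision and exponent range. */
enum fmt_relation fmt_compare(struct fmt_params a, struct fmt_params b)
{
    int a_covers_b = a.mant_dig >= b.mant_dig &&
                     a.min_exp <= b.min_exp &&
                     a.max_exp >= b.max_exp;
    int b_covers_a = b.mant_dig >= a.mant_dig &&
                     b.min_exp <= a.min_exp &&
                     b.max_exp >= a.max_exp;

    if (a_covers_b && b_covers_a)
        return REL_SAME;        /* same set of values */
    if (a_covers_b)
        return REL_WIDER;       /* a's values are a superset of b's */
    if (b_covers_a)
        return REL_NARROWER;    /* a's values are a subset of b's */
    return REL_UNORDERED;       /* neither set contains the other */
}
```

With binary64 as {53, -1021, 1024}, the x87 80-bit format {64, -16381, 16384} compares as wider and binary32 {24, -125, 128} as narrower. A made-up format with more precision but less range, such as {64, -125, 128}, comes out unordered.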

- Jim Thomas

>  
> (I have no problem with double + _Float64 -> _Float64.  That makes sense, and I have changed the C++ proposal to do that.  It is only long double + _Float64 that I have an issue with.)
>  
>   - David Olsen