[Cfp-interest 1365] Re: (SC22WG14.16713) N2380: printf of NaN()

Sun Jul 28 11:43:26 PDT 2019

> On May 15, 2019, at 8:28 AM, Martin Sebor <msebor at gmail.com> wrote:
> 
> On 5/15/19 3:37 AM, Jens Gustedt wrote:
>> Hello JF,
>> On Tue, 14 May 2019 12:33:24 -0700 JF Bastien <cxx at jfbastien.com> wrote:
>>> Specifically, I think C should instead support:
>>> 
>>> 1. Extraction of NaN integer payload from double / float / long
>>>    double.
>>> 2. Creation of NaN with integer payload (without going through
>>>    character sequences).
>> C in its current flux has this already in Annex F, `getpayload` and
>> `setpayload`. Would you want them to be mandatory?
>>> The specification of NaN-related *n-char-sequence* should then be
>>> constrained to match the restrictions imposed on integer payloads
>>> (i.e. must be positive, maximum value).
>>> 
>>> I'd then like to understand what encodings must be supported: does the
>>> integral encoding support decimal only, or does it support
>>> hexadecimal (and must it be preceded by 0x)? I think this
>>> determination should be made by surveying existing implementations.
>>> 
>>> Then, and only then, does it make sense to figure out the maximum
>>> number of characters of NaN-related *n-char-sequence* as proposed by
>>> N2380.
>> I can't follow you here. This macro is intended to provide the
>> knowledge to the user how large a buffer should be if they are
>> expecting that a NaN could be printed. This makes sense to me
>> regardless what the encodings could be. The user here is just at the
>> receiving end and tries to deal with buffer overflows.

IEC 60559 does not attach any semantics to NaN payloads, beyond recommending propagation rules. It notes that implementations might encode diagnostic information in payloads. There have been various ideas about how to do that, and some of them have been implemented. No use of payloads has been compelling enough to suggest that further standard specification would be more valuable than leaving implementations free to give payloads their own meaning, if any. Thus, payloads are intentionally minimally specified, and hence non-portable. The printf specification for NaNs is intended to serve both implementations that attach meaning to payloads and ones that do not.

> FWIW, the macro is just a band aid on a small subset of
> the problem described in N2301:
> 
> 1) there is no requirement/guarantee that the printf output
>  is the same even for the same representation of a NaN

Implementations should be free to print whatever seems appropriate for any meaning they give payloads. 

Or is the “problem” referring to behavior on one implementation? For implementations that do not conform to Annex F, is there a requirement that printf output be the same for the same representation of a number? For Annex F, such a requirement (for consistency on a given implementation) seems ok, though of uncertain value.

> 
> 2) there is no guarantee that what printf outputs for a NaN
>  can be parsed by scanf to get the same NaN back

What does this mean? There’s no 1-1 correspondence between NaNs and their I/O string representations. 

Implementations that attach no meaning to NaN payloads should print (-)nan or maybe (-)nan(), but they still need to scan printf output from other systems.
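
To illustrate what a program can and cannot rely on here, a minimal sketch in C using only standard library calls. The helper names are hypothetical. It checks only that a NaN printed with %f scans back as some NaN; it makes no claim about the sign or the payload, since neither is guaranteed to survive the trip.

```c
#include <assert.h>
#include <math.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: format x with printf's %f conversion, then parse the
   resulting text back with strtod and return the parsed value. */
static double print_then_scan(double x)
{
    char buf[512];  /* assumed large enough for this sketch only */
    int n = snprintf(buf, sizeof buf, "%f", x);

    /* Stop if formatting failed or the output did not fit. */
    assert(n >= 0 && (size_t)n < sizeof buf);
    return strtod(buf, NULL);
}

/* Hypothetical helper: returns 1 if x is a NaN and the value obtained by
   printing and rescanning it is also a NaN, 0 otherwise. Payload and sign
   are deliberately not compared. */
static int nan_survives_round_trip(double x)
{
    return isnan(x) && isnan(print_then_scan(x));
}
```

The fixed-size buffer is itself an assumption that runs into point 3 below: nothing in the current text bounds the length of the n-char-sequence.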

> 
> 3) there is no limit on the amount of output for a NaN

Right. This is an unnecessary risk.
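
Absent a bound, one way a caller can sidestep the risk is to ask snprintf for the required length first and allocate accordingly. A sketch under that assumption follows; format_double is a hypothetical helper name, not something proposed in N2380 or N2301.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: returns a malloc'd string holding x formatted with
   %f, or NULL on failure. The caller frees the result. */
static char *format_double(double x)
{
    /* With a null buffer and size 0, snprintf writes nothing and returns
       the number of characters the output needs, excluding the
       terminating null character. */
    int n = snprintf(NULL, 0, "%f", x);
    if (n < 0)
        return NULL;

    char *buf = malloc((size_t)n + 1);
    if (buf == NULL)
        return NULL;

    /* The second call formats into a buffer of exactly the measured size. */
    int m = snprintf(buf, (size_t)n + 1, "%f", x);
    assert(m == n);
    (void)m;  /* avoid an unused-variable warning when NDEBUG is defined */
    return buf;
}
```

This costs a second formatting pass and a heap allocation per value, which is the overhead a compile-time maximum would let callers avoid.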
> 
> 4) there is no way for printf callers to choose which of
>  the two forms of NaN to format, so no portability

Payloads aren’t portable. It would help portability if the user could ask for the (-)nan form. With the (-)nan(n-char-sequence) form, there can be no expectation of implementation-independent printf output. A user option to get the nan(n-char-sequence) form can’t determine the n-char-sequence, so it would only serve to get nan() instead of nan.

- CFP group

> 
> When we discussed N2301 there was 12/0/1 consensus to address
> "the issue" in the paper.  All of the above is "the issue."
> 
> The group then started bike-shedding how "the issue" should
> be addressed and someone had the bright idea that precision
> would be a better way to do it than the pound AKA hash flag
> in the proposal.  A straw poll of that idea was 7/4/3, which
> by our arbitrary standards of interpretation was viewed as
> direction to proceed.  But precision isn't a viable mechanism
> for selecting between the NaN formats because it would screw
> up the formatting of finite numbers.  So with that, "the issue"
> has morphed into just (3) above which is the subset N2380 tries
> to solve.  In practice (3) isn't a real problem because no sane
> implementation would produce more output than the number of bits
> in a NaN, so a portable program can conservatively allocate at
> least that much space for each number and be assured it won't
> overflow.  Unlike (3), though, all the other aspects of
> the issue are real or at least far more likely.  N2301 solved
> all of them, including (3).  But in our wisdom, we chose to
> solve the part that doesn't affect anyone and call it good.
> 
> Martin
