narrowing casts <9212291826.0.UUL1.3#20134aplauger.UUCP>

queueing-rmail uunet!lupine!segfault!rfg
Tue Dec 29 21:23:29 PST 1992


P.J. Plauger writes:

  In reply to rfg:
  
  Standard C explicitly settled on the model that all type casts behave
  as if they assign the value to a variable of the appropriate type.
  Thus, a cast to a narrower type stuffs a value through a knothole.
  Don't ask me to cite chapter and verse...

It looks like I won't have to.

I did a bit more research (i.e. RTFM'ing) and found that much of what I
was asking about in my previous posting (relating to floating-point
comparisons and floating-point casts) is already covered by the NCEG's
"Floating Point C Extensions" document.

As a matter of fact, I think the following paragraph (from section 3.2.3.1
of that document) pertains quite directly to the example I posted.  (Note
that on x86 machines, the `long double' type is really equivalent to the
80-bit FP format used by the x87 floating-point coprocessor, except that
such numbers, when stored to memory, are normally allocated a full 12 bytes
of storage...  just because 12 is a rounder number than 10 :-)

	"A minimum evaluation format of `long double' is common on extended-
	based architectures.  Programs that run under one of the other
	expression evaluation methods generally run at least as well when
	all expressions are evaluated to `long double'.  Most program
	failures due to extra precision arise from inconsistent use (see B.5)."

Section B.5 says explicitly that assignments and casts "must perform their
specified conversion" and then cites the ANSI C standard, section 3.2.1.5,
which contains a footnote saying exactly the same thing.

So I'm feeling pretty confident now about what the required semantics of
casts are, but I'm still more than a little bit worried about the semantics
of comparison operators when one or more of their operands have been
evaluated in the "minimum evaluation format" on some machine (e.g. an x86)
for which the "minimum evaluation format" is something wider than the
"semantic type" of the operands in question.

Specifically, if I understand the rules correctly, it seems that if I
have a type `double' variable named `x', and if it contains some non-NaN
floating point value, then it appears that implementations which are fully
"conforming" to both the ANSI C standard and the latest NCEG specs are
permitted to yield zero for the expression:

		(x * x) == (double) (x * x)

If that's true, then I might offer the observation that this will come as a
bit of a surprise to some C programmers, especially given that an expression
like `x * x' has type `double', thus making the cast appear to be a no-op.
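
To make the comparison above concrete, here is a minimal sketch.  It assumes
a C99 or later compiler, where <float.h> may define FLT_EVAL_METHOD (a macro
that postdates this discussion); both function names are made up for
illustration.  Whether the comparison yields zero for a given `x' depends on
the implementation's evaluation format, so no particular output is claimed.

```c
#include <assert.h>
#include <float.h>

/* Hypothetical helper: report the evaluation method the compiler
   advertises (0 = evaluate to the semantic type, 1 = float and double
   evaluated as double, 2 = everything evaluated as long double), or -1
   if it is indeterminable or not advertised at all. */
int evaluation_method(void)
{
#ifdef FLT_EVAL_METHOD
    return (int) FLT_EVAL_METHOD;
#else
    return -1;
#endif
}

/* Hypothetical helper: the expression from the text.  The left operand
   may keep extra precision; the cast on the right must narrow to double. */
int square_equals_cast_square(double x)
{
    assert(x == x);  /* precondition: x is not a NaN */
    return (x * x) == (double) (x * x);
}
```

When the product is exactly representable (say `x' is 2.0) the result is 1
under any evaluation method.  For a value such as 0.1, an implementation that
evaluates in a wider format may yield 0.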

Of course, this kind of surprising result is not limited to just x86
processors.  Nor is it limited to just "extended-based" architectures
(as the "Floating Point C Extensions" document defines that term).  It
seems that such surprises can (and probably already do) arise also on
"double-based" architectures for expressions like:

		(y * y) == (float) (y * y)

... where `y' is some floating-point variable with a non-NaN value (although
a zero result in this case would probably be less surprising than in the case
involving type `double' operands and casts shown earlier, due to "C's
tradition of wide evaluation").
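
The effect of wide evaluation on the `float' case can be reproduced on any
machine by doing the widening by hand.  The following is only a sketch,
assuming IEEE single and double formats; the function name is hypothetical.
It forms the product in `double', narrows a copy to `float', and compares the
two, which is what a "double-based" implementation is permitted to do for the
expression above.

```c
#include <assert.h>

/* Hypothetical helper: emulate (y * y) == (float) (y * y) on an
   implementation that evaluates float expressions in double. */
int wide_square_equals_narrow_square(float y)
{
    double wide;
    float narrow;

    assert(y == y);                    /* precondition: y is not a NaN */
    wide = (double) y * (double) y;    /* product kept at double precision */
    narrow = (float) wide;             /* the cast stuffs it through the knothole */
    return wide == narrow;             /* narrow is promoted back to double */
}
```

With `y' set to 0.1f the exact product needs more significand bits than a
`float' holds, so the two sides differ and the function returns 0.  With
`y' set to 0.5f the product is exactly representable and it returns 1.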


In view of these unsettling examples, I feel compelled to offer yet another
small (and unsolicited) suggestion to the NCEG and, in particular, to the
floating-point working group of NCEG.  I have some doubts that this new
suggestion will be acted upon in any way (because it is bound to be quite
controversial) but I'll offer it to you folks anyway, and let you make up
your own minds.

Basically, it seems to me that "surprising" results from floating-point
comparison operators could easily be eliminated via a simple stipulation
that all such comparisons should be implemented using the floating-point
format which corresponds to the "semantic type" of the operands, irrespective
of the "minimum evaluation format" normally used by the implementation
for other kinds of FP operations.

It seems to me that that would solve the whole problem.
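
The stipulation above concerns what the compiler does, but a programmer can
approximate its effect by hand today.  The following is a rough sketch, not
the proposed rule itself: it assumes (per the cast and assignment rule cited
earlier) that storing into a `double' object discards any extra precision.
The function names are hypothetical, and `volatile' is used only to
discourage a compiler from keeping the operands in wider registers.

```c
#include <assert.h>

/* Hypothetical helper: compare two values only after both have been
   narrowed to their semantic type, double. */
int equal_as_double(double a, double b)
{
    volatile double na = a;   /* assignment performs the conversion */
    volatile double nb = b;
    return na == nb;
}

/* Hypothetical usage: the "surprising" comparison, rewritten so that
   both sides pass through double before being compared. */
int square_self_equal(double x)
{
    assert(x == x);           /* precondition: x is not a NaN */
    return equal_as_double(x * x, x * x);
}
```

Because widening a `double' to a larger format is exact, the final comparison
gives the same answer whatever format it is carried out in.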

Naturally, if such a stipulation *were* made, the first question that would
come up is "What do we do when the two operands have two different semantic
types?"  Fortunately, this question is already answered by section 3.2.1.5
of the ANSI C standard.  It already requires that in cases where a binary
operator has operands which have two different FP types, the narrower one
is promoted to the type of the wider one.  Case closed.
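
A small illustration of that promotion rule, assuming IEEE single and double
formats (the function names are hypothetical).  When a `float' is compared
with a `double', the `float' is promoted, so the comparison asks whether the
`float' value, taken exactly, equals the `double' value.

```c
#include <assert.h>

/* Hypothetical helper: mixed comparison; f is promoted to double. */
int mixed_equal(float f, double d)
{
    assert(f == f && d == d);   /* precondition: neither is a NaN */
    return f == d;
}

/* Hypothetical helper: narrow d to float first, then compare. */
int narrowed_equal(float f, double d)
{
    assert(f == f && d == d);   /* precondition: neither is a NaN */
    return f == (float) d;
}
```

For example, mixed_equal(0.1f, 0.1) returns 0, because the nearest `float'
and the nearest `double' to one tenth are different numbers, whereas
narrowed_equal(0.1f, 0.1) returns 1.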

It probably goes without saying that there might be some performance penalty
associated with using the "semantic type" rather than the "minimum evaluation
format" for FP comparisons, but I think that any possible (small) performance
penalty would be far outweighed by the advantages gained by using the semantic
type as the "implementation type" for FP comparisons.  Specifically, the
advantages that I can think of right off the top of my head are:

	1)	Various "nasty surprises" (like those illustrated above) are
		eliminated.

	2)	The programmer would gain control over the precise semantics
		of his comparisons.

	3)	The portability of code would tend to be improved because the
		semantics of comparisons would be based directly upon C's
		data types (some of which are quite portable... at least
		among IEEE machines) rather than being based upon the
		vagaries of the "minimum evaluation format" (which tend to
		be more variable from implementation to implementation).

Point #2 is perhaps the most important.  In C, we have a strong tradition
of giving the programmer complete control over semantics, right down to the
bare silicon.  But with the current rules, it appears that I'm not permitted
to take control over the format used to perform FP comparisons, even in the
cases where I actually *need* that control.  (I can live with having addition,
subtraction, multiplication, and division done in a wider-than-I-wanted
format, because I can always toss out the extra significance later on, but
in the case of comparisons, there simply *is* no "later on"... the result
yielded is of type int, and if the input operands had too much precision,
I don't just get back an answer which is too precise... I get back an answer
that is WRONG.)

'Nuff said.


// Ron ("Loose Cannon") Guilmette    uucp: ...uunet!lupine!segfault!rfg
//
// 	"On the one hand I knew that programs could have a compelling
// 	 and deep logical beauty, on the other hand I was forced to
// 	 admit that most programs are presented in a way fit for
// 	 mechanical execution, but even if of any beauty at all,
// 	 totally unfit for human appreciation."
// 						-- Edsger W. Dijkstra


