ANSI C & perfect decimal <-> fp conversions

David G. Hough dgh
Fri Jun 1 13:14:00 PDT 1990


> 
> Another layer of confusion:  I read that section of the C std as
> constraining only the accuracy of decimal->fp conversion performed *by*
> the compiler on floating literals appearing in the program text.  I
> read the "perfect rounding" section in the NCEG paper as constraining
> not only that, but also the accuracy of conversions in both directions
> performed by the std library routines (printf, scanf, atof, etc).  Does
> the ANSI std actually constrain more than the compiler?  Or does the
> NCEG proposal actually constrain just the compiler?  (note in passing:
> if the NCEG proposal does constrain the libraries too, it seems to go
> beyond 754's section 5.6 not only in accuracy but also in requiring that
> compile-time and run-time conversions yield the same results).

It was always the intent of IEEE 754 that compile-time and run-time
expression evaluation (including base conversion) yield the same results.
It's somewhat contentious whether compile-time expression evaluation
should honor rounding modes and generate exceptions just as if the expression
were evaluated at run time.  It is certainly the intent of NCEG to ratify
the non-contentious part of that requirement; whether X3J11 requires
that as well (in the normal mode of compiling in the execution environment),
I leave it to the X3J11 members to opine.
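
For concreteness, here is the sort of consistency that intent implies, in my
own words rather than the committee's: the compiler's conversion of a decimal
literal and the library's run-time conversion of the same decimal string
should produce identical bits.

	#include <stdio.h>
	#include <stdlib.h>

	int main(void)
	{
	    double compile_time = 3.14159265358979323846;          /* converted by the compiler */
	    double run_time     = atof("3.14159265358979323846");  /* converted by atof at run time */

	    /* With consistent compile-time and run-time base conversion,
	       the two values are bit-for-bit identical. */
	    printf("%s\n", compile_time == run_time ? "same" : "different");
	    return 0;
	}
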
> 
> I agree there are real benefits to perfect conversions, but the
> networking benefits specifically would (I think) carry a lot more weight
> if the proposal were being made under the auspices of a group aiming to
> improve the usefulness of C in networked applications/environments.  I
> don't know how important networking considerations are under NCEG's
> charter.  I.e., benefits & costs are relative to the players involved,
> and it's at least conceivable to me that networking benefits are no more
> relevant to the NCEG effort than would be the also-real benefits of
> mandating arbitrary-precision rational arithmetic.

This is like saying that NCEG shouldn't think about vector or multiprocessor
systems, and that another group should worry about those (for multiprocessors
there are already too many such groups; the same goes for vector systems, if
you count Fortran-90's array processing as vector processing).
> 
> No argument, but I think the proper place to fix 754 problems is in
> revisions to 754; it's not really a *C* issue.

Everybody has a finite amount of time and energy for standards work.  One
of the main problems with IEEE 754 and 854 is that no bindings to standard
languages were specified.  The bindings between C and 754 are a legitimate
C issue, judging by the size of this mailing list (130 people not counting
Sun employees).  X3J11 decided a long time ago that libraries were part of
its language standard, since you don't have any useful portability otherwise.

> Darned right the sloppy methods I'm using now don't meet normal 754
> requirements -- they're whatever came with the Berkeley UNIX(tm)
> distribution, and probably don't even meet Cray's stds <grin>.

The machine-independent base conversion that accompanied 4.2 BSD wasn't
so hot; it last appeared in SunOS 2.0.  4.4 BSD should be better.

By the way, I should mention that the "machine independent" code that
I intend to adapt for public distribution is only independent among 
machines for which
	short = 16 bits
	int = long = pointer = 32 bits
What I distribute will have float pack/unpack routines for IEEE 754 single,
double, quad, and extended, but adapting to other floating-point formats
should be relatively simple - and will be left as an exercise for the student.
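
To make concrete what pack/unpack means under those assumptions, here is a
hypothetical unpacking of an IEEE 754 double into 32-bit pieces.  This is
only an illustration, not the interface of the code I intend to distribute,
and it assumes sizeof(double) == 8 with the bytes stored most-significant
first (adjust the indexing for little-endian machines).

	#include <string.h>

	struct unpacked_double {
	    int sign;                /* 0 or 1 */
	    int biased_exponent;     /* 11-bit field, 0..2047 */
	    unsigned long frac_high; /* top 20 bits of the significand */
	    unsigned long frac_low;  /* bottom 32 bits of the significand */
	};

	void unpack_double(double x, struct unpacked_double *u)
	{
	    unsigned char b[8];
	    unsigned long hi, lo;

	    memcpy(b, &x, 8);        /* assumes sizeof(double) == 8 */
	    hi = ((unsigned long)b[0] << 24) | ((unsigned long)b[1] << 16)
	       | ((unsigned long)b[2] << 8)  |  (unsigned long)b[3];
	    lo = ((unsigned long)b[4] << 24) | ((unsigned long)b[5] << 16)
	       | ((unsigned long)b[6] << 8)  |  (unsigned long)b[7];

	    u->sign            = (int)((hi >> 31) & 1);
	    u->biased_exponent = (int)((hi >> 20) & 0x7FF);
	    u->frac_high       = hi & 0xFFFFFUL;
	    u->frac_low        = lo;
	}
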
> 
> Agreed that you can usually get what you design for, and add that I
> think the appalling state of fp arithmetic these days is a consequence
> of putting accuracy at the bottom of the design list.

More likely a consequence of putting time to market at the top of the list.

> And also good examples of implementations unsuitable for supercomputers
> (because the 8847 fdiv/sqrt don't pipeline -- the supercomputer game is
> more concerned with maximizing issue rate than minimizing latency).

This may be a red herring - the 8847 uses Newton algorithms with a correction
step at the end.  The instructions could probably be pipelined if it were
worth the trouble.  IBM has demonstrated that if you can issue a multiply-add
every cycle along with the necessary loads and stores, you can do pretty well
on many problems that supercomputers do well on.   This just means that
the supercomputer niche will continue to shrink as a fraction of the total
market.

> 
> >  The only place in 754 where a tradeoff between speed and accuracy is
> >  permitted is in the one specific instance of base conversion of large
> >  exponents.
> 
> the tradeoffs 754 permits between The Right Thing and Speed and/or Cost.
 ...
> <grin>?!"); or look at its odd permissiveness in how underflow may be
> detected.  The only reason I can think of for these concessions to The
> Wrong and/or The Lazy is that even 754 had to draw lines somewhere on
> "practicality" grounds.

As far as underflow goes, there was no demonstrable advantage to mandating
one of the various permitted options over the others, and there were good
implementation reasons to permit more than one approach.  Time has demonstrated
that this wasn't such a good idea, though for reasons other than those we
considered at the time.

> Honest -- I'm not against it.  But I have yet to hear anyone *ask* for
> it.

Just yesterday I marked a bug report "not going to fix".  The customer
wanted identical numerical results on Sun-3's with software floating
point, 68881's, and FPA's.  I've gotten similar requests from other people
who should know better, at Mathematica for instance.  Anyway, if the
customer had wanted to know about Sun-4's, I could have answered "on the
agenda - to be implemented some day".
> 
> Take a personal check <smile>?

Sun decided to support me in this endeavor, so I'm OK.  The people needing
support are elsewhere; some may choose to identify themselves on this list.
> 
> ???  While I'm a compiler jockey by trade, due to a sequence of
> historical accidents I became a lightning rod for customer complaints
> about CRI's fp arithmetic, and it really was the case that addition
> gripes outnumbered all others put together by an easy 10-to-1 ratio.

Probably because the addition anomaly was easier to see and understand.

> often in *practice*, but they were routinely thrust into difficulties by
> the drift in a long chain of adds.  Since they continued to get results

If you don't know the error, then you don't know the answer.  Most customers
don't want to know:

> Just as a point of interest, the only repeated "customer-type" gripes
> I ever heard about CRI's division were after stuff like the Fortran
> 
>       X = 6.
>       Y = 3.
>       I = X/Y
>       PRINT *, I
> 
> printed "1" due to a too-small reciprocal and/or a too-small multiply
> leading to the fractional portion of a bit-too-small quotient getting
> truncated away by the conversion.  People were (understandably!) *livid*
> about this when it happened

That this sort of thing happens in general is nothing to get excited about;
(int)(3.0*(2.0/3.0)) or its equivalent is going to get you on any kind of computer.
That 6.0/3.0 is not exactly 2.0 is more irritating, but only because you know
the answer and hence the error.
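
A small C illustration of the mechanism, for anyone who hasn't been bitten by
it.  nextafter (in <math.h> on systems that provide it) is used here only to
manufacture a quotient one unit in the last place below the exact answer,
standing in for an arithmetic whose divide comes out a hair too small.

	#include <stdio.h>
	#include <math.h>

	int main(void)
	{
	    double exact  = 6.0 / 3.0;            /* 2.0 when the quotient is correctly rounded */
	    double sloppy = nextafter(2.0, 0.0);  /* one ulp below 2.0: a too-small quotient */

	    printf("%d\n", (int)exact);   /* prints 2 */
	    printf("%d\n", (int)sloppy);  /* prints 1; conversion to int truncates toward zero */
	    return 0;
	}
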

> Ah, but don't you think you might be undervaluing the worth of perfect
> conversions here?  The monotonicity and "if you print it & read it back
> you'll get back what you started with" guarantees perfect conversions
> provide are worth quite a bit to the careful numerical programmer,
> regardless of whose crazy arithmetic they're using -- even if they're
> just using a laptop computer on a plane ride.  *If* it's practical
> ("fast enough") then I think the general scientific community would eat
> it up -- and it's something NCEG C could offer that Fortran doesn't.  I
> have no doubts about the benefits, just concerned about the costs and
> the propriety of mandating something that apparently *isn't* public art.

If I ever get the code published, anybody will be welcome to adapt it to their
crazy arithmetic system!
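
To be concrete about the read-back guarantee being discussed, here is a
minimal sketch assuming IEEE 754 double and correctly rounded conversions in
both printf and atof; seventeen significant decimal digits are enough to
distinguish every double.

	#include <stdio.h>
	#include <stdlib.h>

	int roundtrips(double x)      /* x finite; returns 1 if print-and-read-back recovers x */
	{
	    char buf[32];

	    sprintf(buf, "%.17g", x); /* print with 17 significant digits */
	    return atof(buf) == x;    /* read it back and compare */
	}

With perfectly rounded conversions this returns 1 for every finite double;
with sloppy conversions it can fail, and that failure is exactly what the
careful programmer is worried about.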

> >  Our routines, which are accurate over the full range, take 860 cycles
> >  to do
> >  	ecvt (value, 17, &decpt, &sign);
> >  and 903 to do
> >  	atof (s);
> >  using 64-bit integer arithmetic.  About half the time in the atof call
> >  is parsing, and not conversion.  The ecvt time is mostly conversion.

Independent of the data?  That's sort of amazing.
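
For readers who haven't met ecvt: it isn't part of ANSI C, but on many
systems it is declared in <stdlib.h>, returns a string of significant digits
with no decimal point, and reports the point's position and the sign
separately.  A small usage sketch:

	#include <stdio.h>
	#include <stdlib.h>   /* ecvt() on many systems; not in ANSI C */

	int main(void)
	{
	    int decpt, sign;
	    char *digits = ecvt(-0.15625, 17, &decpt, &sign);

	    /* For -0.15625 one would expect digits = "15625000000000000",
	       decpt = 0 (the point falls before the first digit, i.e. the
	       value is 0.15625 * 10^0), and sign nonzero (negative). */
	    printf("digits=%s decpt=%d sign=%d\n", digits, decpt, sign);
	    return 0;
	}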


