Performance vs. determinism in expression evaluation

David Hough sun!Eng!dgh
Sat Mar 30 13:59:53 PST 1991


The issues of extended-precision register allocation, correctly-rounded
this and that, etc. have led me to wonder if there is a reasonable way to
proceed that allows future portable programs to know something with certainty,
in the face of all the hardware and software
optimizations past, present, and future, without impinging on the performance
of typical existing programs on typical existing systems, and without
requiring full implementations of interval arithmetic as language types.

I don't think the answer lies in conventional environmental inquiries;
they're too vague:  Does IBM 370 subtraction round or chop?  What about
Cray division?   Do we want environmental inquiries for every operation
as a function of its operands?

The only common arithmetic systems that define every result purely as a
rounding of the correct unrounded result, independent of operation
and operands, as far as I know, are VAX and IEEE.

I don't think the answer lies in conventional implementation-defined pragmas
that tell how to evaluate the following expression; they're not portable.

I can imagine a couple of operators that might be helpful.
To avoid premature heated discussion of syntax in specific languages, I'll
just specify them in functional notation, but they are necessarily operators
known to compilers.

----------------------------------------

ruleround(expression)
ruleround(expression,type or scalar)

ruleround computes the rounded value of an expression,
rounded to the format of the destination, if it appears in an assignment or
as a function argument, or to the type of the second argument.
Rounding is by an implementation-defined rule that rounds the
infinitely precise correct result independent of operands or operations.
On IEEE systems, it's the current rounding mode; on DEC systems, it's 
"biased" rounding; on IBM 370 and Crays, it's probably round toward zero.
You could have a macro like ANSI-C FLT_ROUNDS with four IEEE
directions plus biased rounding for VAX as possibilities,
and possibly something different for IBM 370 and another for
Cray depending on what rule their compilers decided to implement.
The existing ANSI-C "indeterminate" rounding specification won't do. 
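As an illustration only (not part of the proposal), the round-toward-zero
flavor of ruleround for a single multiplication can be sketched by computing
the exact product with rationals and then adjusting the hardware result by at
most one ulp; the name ruleround_mul and the fixed round-toward-zero rule are
my assumptions here, standing in for whatever rule an implementation defines:

```python
import math
from fractions import Fraction

def ruleround_mul(x, y):
    """Sketch of ruleround(x*y) under a fixed round-toward-zero rule.

    The exact product is computed as a rational; the hardware
    round-to-nearest result is then nudged one ulp toward zero
    if it overshot the exact value in magnitude.
    """
    exact = Fraction(x) * Fraction(y)    # infinitely precise product
    z = x * y                            # round-to-nearest hardware result
    if abs(Fraction(z)) > abs(exact):    # rounded away from zero?
        z = math.nextafter(z, 0.0)       # step one ulp toward zero
    return z
```

Since round-to-nearest is within half an ulp of the exact product, one step
toward zero always suffices; the result never exceeds the exact product in
magnitude, which is the "tricky cases have to be tested for" cost mentioned
below for machines whose native arithmetic doesn't already follow the rule.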

How complicated an expression a compiler is willing to accept here is a quality
of implementation issue, but at a minimum a compiler must accept
x+y, x-y, x*y, x/y, x%y, sqrt(x), for scalars x and y, strto("string"),
and a few others.  In this context decimal strings and constants are treated
as if infinitely precise.  Compilers generate compile-time errors for 
expressions that are too complicated.

How efficient is this?  ruleround(x+y), ruleround(x-y), ruleround(x*y),
and ruleround(x/y) can be computed with normal arithmetic on
IEEE and VAX systems, but x+y and x-y on IBM 370
have a few tricky cases that fail to round
toward zero and thus have to be tested for.  I don't know how much slower
x/y would be on a Cray.  However most of these sensitive spots are not
in inner loops, so efficiency doesn't matter too much.

Among other things x=ruleround(x) provides an escape mechanism for
existing extended-precision implementations.  Programmers can portably
specify round to storage format in this way,
when they really need it, without compromising performance
when it doesn't matter.
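To illustrate the round-to-storage-format escape (a simulation only: Python
carries no extended-precision registers, so IEEE single stands in for the
storage format and double for the wider evaluation format, and the function
name is my invention):

```python
import struct

def ruleround_to_single(x):
    """Sketch of x = ruleround(x) with IEEE single as the storage format.

    Packing through the 'f' struct format rounds the double-precision
    value x to the nearest representable float32, discarding the extra
    precision a wider evaluation format may have carried.
    """
    return struct.unpack('f', struct.pack('f', x))[0]

# A value representable in the wide format but not in the storage format:
wide = 1.0 + 2.0**-30
stored = ruleround_to_single(wide)   # rounds to 1.0 in single precision
```

The point of the escape is visible here: wide != stored, so a program that
needs the storage-format value can say so portably instead of relying on the
compiler to spill the register.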

Overflow and underflow can't occur; how values too large or too small
for the format are rounded is part of the rule.

----------------------------------------

intervalbound ( expression )

intervalbound() returns an interval, consisting of a lower bound lb
and an upper bound ub - languages that don't want to have interval types
need to define a syntax for assignment to a list of function return values,
e.g.
	list (lb, ub) = intervalbound(expression)
The interval is guaranteed
to bound the value of expression.

How complicated an expression a compiler is willing to accept here is a quality
of implementation issue, but at a minimum a compiler must accept
x+y, x-y, x*y, x/y, x%y, sqrt(x), for scalars x and y, strto("string"),
and a few others.  In this context decimal strings and constants are treated
as if infinitely precise.  Compilers generate compile-time errors for
expressions that are too complicated, and run-time errors for expressions
they can't handle: on IEEE systems intervalbound() can't generate overflow,
underflow, or inexact, but on VAX and IBM 370 systems, 
intervalbound() can fail due
to overflow since there's no infinity symbol to use to bound the correct
result.

How wide an interval a compiler will generate is another quality of 
implementation issue.  IEEE implementations that generate (-inf,+inf)
will be conforming but low quality!  Better quality 
implementations, for instance,
would return lb==ub when the answer is exactly representable.
I would guess that intervalbound() would be implemented in the greatest
hardware precision available.
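A minimal sketch of what a tight implementation could do for a single
addition, assuming IEEE double and using one-ulp adjustments of the
round-to-nearest result (the function name is my assumption); it returns
lb == ub exactly when the sum is representable, as a quality implementation
should:

```python
import math
from fractions import Fraction

def intervalbound_add(x, y):
    """Sketch of intervalbound(x+y) for IEEE doubles.

    Returns (lb, ub) with lb <= x+y <= ub, and lb == ub
    exactly when the sum is representable.
    """
    exact = Fraction(x) + Fraction(y)    # infinitely precise sum
    z = x + y                            # round-to-nearest result
    if Fraction(z) == exact:
        return z, z                      # exact: degenerate interval
    if Fraction(z) < exact:
        return z, math.nextafter(z, math.inf)    # widen upward one ulp
    return math.nextafter(z, -math.inf), z       # widen downward one ulp
```

For example, intervalbound_add(0.25, 0.5) yields a degenerate interval at
0.75, while intervalbound_add(0.1, 0.2) yields a one-ulp-wide interval
bracketing the inexact sum.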

How useful is intervalbound in the absence of complete interval arithmetic
in a language?
It provides a way of determining when arithmetic is exact that is
implementation-independent, at least among high-quality
implementations.  Among compilers that accept slightly more complicated
expressions - three rational operands are probably sufficient - it provides
a way to determine when functions being zero'ed are close enough to a zero:
lb<=0 and ub >=0.  Of course these compilers must have access to interval
arithmetic in a run-time library.
 
----------------------------------------

next(scalar)

next(x) simply returns the next larger machine-representable number from x
toward +infinity.  Compare with IEEE 754 nextafter(x,y) which is 
more expensive to implement in hardware and more often than not less useful.
Certainly in environmental inquiry cases, you almost always want
either next(x) or -next(-x), typically to determine next(x)-x.
"scalar" here means a simple variable or constant, not an array or expression.
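A sketch of the proposed semantics (next is not a standard function; the
trailing underscore avoids shadowing Python's builtin) in terms of the more
general nextafter the text compares it against:

```python
import math

def next_(x):
    """Sketch of the proposed next(x): x's neighbor toward +infinity.

    Expressed via the more general two-argument nextafter, which the
    text notes is costlier in hardware than this one-direction form.
    """
    return math.nextafter(x, math.inf)

# The typical environmental-inquiry uses:
ulp_at_one = next_(1.0) - 1.0    # gap to the next representable number
below_one = -next_(-1.0)         # neighbor of 1.0 toward -infinity
```

Note the asymmetry: next_(1.0) - 1.0 is one ulp of 1.0, while
1.0 - (-next_(-1.0)) is half that, since representable numbers are denser
just below a power of two.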


