Instruction Set Architecture support for Interval Arithmetic

David Hough sun!Eng!dgh
Sat Mar 2 16:42:36 PST 1991


Interval arithmetic is a method for bounding the effects of
computational uncertainties due to uncertain input data, substitution
of rational arithmetic operations for analytical procedures, and
rounding errors in the arithmetic and base conversion.  Instead of a
single number x, the value of a variable is represented by a pair of
numbers [xl,xr], the left and right endpoints of an interval that
contains the actual value of the variable.

Analytical and statistical methods can be applied to the problem of
bounding uncertainty in computed solutions, but are too difficult or
too unreliable to hold much promise of general applicability.  Interval
analysis has been believed to be much more promising... for years.  So
far the lack of convenient language support and efficient hardware
support has prevented this potential from being fully realized.

Miranker and Kulisch and his colleagues at Karlsruhe have developed an
interval analysis paradigm based upon computing correctly-rounded
scalar products sum(xi*yi) of two vectors x and y.  There have been
microcoded implementations of their paradigm on some IBM 370 mainframes
and some German 370 compatibles, and language support in the form of
Pascal and Fortran preprocessors that generate the needed instructions
and function calls.  Many common scientific computations can be fit to
sleep in the Procrustean bed of the correctly-rounded scalar product
paradigm.  However cost-effectiveness is questionable both with respect
to software development effort and sleepy run-time performance.

Conventional interval analysis is not so cheap and easy either, but it
might be more so if language support were added to a portable common
base, such as gf77 if such a thing existed, and if the hardware
support were relatively efficient; that hardware support is the topic
explored below.
The attractiveness of any kind of error assessment declines rapidly if
computing it costs more than 2-3X as much as computing a conventional
answer.  As mentioned in a previous note, there is still a considerable
analytical burden in finding the correct formulation of a problem for
interval methods: simply replacing all REAL*n declarations with
REALINTERVAL*2n only works for the simplest problems, and some problems
just don't lend themselves to boxy error bounds.

Anyway, a conventional interval [a,b] is defined to mean
	{ z | a <= z <= b } 
so a <= b.  

These interior intervals could be augmented by legitimizing exterior
intervals, for which a > b, defined to mean
	{ z | z <= b or z >= a }
so such intervals contain infinity.
This is a natural generalization in IEEE arithmetic, which allows
nonstop division by zero by default; if there are no exterior
intervals, then division by an interval containing zero must
be seriously exceptional, since there is no way of indicating
a suitable result.
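
A minimal C sketch of the membership test these definitions imply,
assuming the interval struct from the earlier sketch:

	#include <stdbool.h>

	/* An interior interval (xl <= xr) contains z when
	   xl <= z <= xr; an exterior interval (xl > xr) contains z
	   when z >= xl or z <= xr, so it wraps through infinity. */
	bool interval_contains(interval v, double z)
	{
	    if (v.xl <= v.xr)
	        return v.xl <= z && z <= v.xr;
	    return z >= v.xl || z <= v.xr;
	}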

In IEEE arithmetic, signed zeros and infinities can be exploited for
some problems by adopting appropriate conventions such as

	[-0,+0] or [+0,-0] means the point at zero with no sign interpretation 
	[-0,-0] means negative zero 
	[+0,+0] means positive zero 
	[-0, +x] means [0,x] 
	[+0, +x] means (0,x]
	[-inf,+inf] means the whole number line 
	[+inf,-inf] means the point at infinity with no sign interpretation 
	[-inf,-inf] means negative infinity 
	[+inf,+inf] means positive infinity 
	[x,+inf] means [x,inf) 
	[x,-inf] means [x,inf]

Whether an interval whose endpoint is zero is open or closed can be
important for symmetric conformal mapping involving complex functions
with slits.  Note that the rules for signed zeros in IEEE arithmetic
are intended to support such usage.
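
For illustration, in C99 terms, signbit() distinguishes the signed
zeros these conventions rely on, since -0.0 == +0.0 under ordinary
comparison; a sketch under the conventions listed above:

	#include <math.h>
	#include <stdbool.h>

	/* Under the conventions above, a lower endpoint of +0 means
	   the interval is open at zero ([+0,+x] means (0,x]), while
	   -0 means closed at zero ([-0,+x] means [0,x]). */
	bool lower_endpoint_open_at_zero(double a)
	{
	    return a == 0.0 && !signbit(a);   /* +0 => (0,x] */
	}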

Note that some easy operations in normal arithmetic - often called
"scalar" in this context - are more complicated in interesting ways
when applied to intervals:

	-[a,b] = [-b,-a]
	abs[a,b] = [a,b]                if sign(a) == sign(b) == +
		 = [-b,-a]              if sign(a) == sign(b) == -
		 = [0, max(|a|,|b|)]    if sign(a) != sign(b)
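
These rules translate directly into C; a sketch assuming the interval
struct above.  Negation and |x| are exact in IEEE arithmetic, so no
directed rounding is needed:

	#include <math.h>

	interval interval_neg(interval v)
	{
	    interval r = { -v.xr, -v.xl };   /* -[a,b] = [-b,-a] */
	    return r;
	}

	interval interval_abs(interval v)
	{
	    if (v.xl >= 0.0)                  /* both endpoints >= 0 */
	        return v;
	    if (v.xr <= 0.0)                  /* both endpoints <= 0 */
	        return interval_neg(v);
	    interval r = { 0.0, fmax(-v.xl, v.xr) };  /* straddles 0 */
	    return r;
	}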

Interior addition: is easy
	[a,b]+[c,d]=[a +l c, b +r d] 
where +l means addition with
directed rounding to the left (negative direction) while +r means
rounding to the right (positive direction).  From the point of view of
implementation in instruction set architectures, a 128-bit register
that can hold one complex pair of 64-bit values can also hold one
interval pair of 64-bit values.
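
In software the directed roundings can be had from C99's fesetround();
a sketch (dedicated hardware would encode the rounding direction in
the instruction rather than switching modes):

	#include <fenv.h>
	#pragma STDC FENV_ACCESS ON

	/* [a,b] + [c,d] = [a +l c, b +r d] */
	interval interval_add(interval u, interval v)
	{
	    interval r;
	    int save = fegetround();
	    fesetround(FE_DOWNWARD);    /* +l: round toward -infinity */
	    r.xl = u.xl + v.xl;
	    fesetround(FE_UPWARD);      /* +r: round toward +infinity */
	    r.xr = u.xr + v.xr;
	    fesetround(save);
	    return r;
	}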

Exterior addition: is more complicated.  Exterior intervals with
infinite endpoints are really interior intervals closed at infinity; so
assume exterior intervals with finite endpoints.

	When a>b and c<=d, all finite,
	[a,b]+[c,d]=[a +l c, b +r d]

the endpoints pair up and round in the same directions as in the
interior case, each endpoint rounded away from the excluded gap to
keep the result set an enclosure; but if the result interval would
overlap and appear to be an interior interval then it must be replaced
by [-inf,+inf].

	When a>b and c>d, all finite,
	[a,b]+[c,d]=[-inf,+inf].
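
A sketch covering the exterior cases, reusing interval_add() from the
previous sketch, since the endpoint pairing and rounding directions
are unchanged:

	#include <math.h>

	interval interval_add_any(interval u, interval v)
	{
	    interval whole = { -INFINITY, INFINITY };
	    if (u.xl > u.xr && v.xl > v.xr)   /* exterior + exterior */
	        return whole;
	    interval r = interval_add(u, v);  /* same endpoint pairing */
	    if ((u.xl > u.xr || v.xl > v.xr) && r.xl <= r.xr)
	        return whole;                 /* gap closed up */
	    return r;
	}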

Multiplication: has several cases, depending on the signs of a, b, c,
and d; there are 16 possibilities.  Excluding exterior intervals leaves
9 possibilities.  We can multiply an operand and the result by -1 if
the operand has both endpoints negative; that leaves 4 possibilities,
which commutativity reduces to three interesting interior cases:

	when 0<=a<=b and 0<=c<=d:
	[a,b] * [c,d] = [ a *l c, b *r d ]

	when 0<=a<=b and c<=0<=d:
	[a,b] * [c,d] = [ b *l c, b *r d ]

	when a<=0<=b and c<=0<=d:
	[a,b] * [c,d] = [ min(b *l c, a *l d), max(a *r c, b *r d) ]
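
For contrast, a brute-force C sketch that skips the sign dispatch and
takes the min and max over all four endpoint products in each rounding
direction; the case analysis above exists precisely to avoid most of
these eight multiplications (struct and rounding usage as in the
earlier sketches):

	#include <fenv.h>
	#include <math.h>
	#pragma STDC FENV_ACCESS ON

	interval interval_mul(interval u, interval v)
	{
	    interval r;
	    int save = fegetround();
	    fesetround(FE_DOWNWARD);
	    r.xl = fmin(fmin(u.xl * v.xl, u.xl * v.xr),
	                fmin(u.xr * v.xl, u.xr * v.xr));
	    fesetround(FE_UPWARD);
	    r.xr = fmax(fmax(u.xl * v.xl, u.xl * v.xr),
	                fmax(u.xr * v.xl, u.xr * v.xr));
	    fesetround(save);
	    return r;
	}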

Hardware implementation: given the previously posted architecture for
high-performance double precision complex multiply-add, two of the
adders or multipliers could be exploited in parallel to handle the
common cases: interior addition, and interior multiplication of two
operands of definite sign.  [a,b] is definitely interior positive if
0<=a<=b and definitely interior negative if a<=b<=0.

Sums or products involving one or two exterior intervals, and products
involving one or two interior intervals of indefinite sign, require
more complicated logic and in a true RISC design ought to be relegated
to software, at least initially - once interval applications become
common enough to be a SPEC category it may be discovered that some of
these "unusual" cases are common enough that hardware "microcoding" is
appropriate.

Conventional arithmetic on a Sun-4, for instance, executes common cases
in hardware and otherwise [as for subnormal operands and results]
causes an "incomplete instruction" hardware trap, which eventually
leads to kernel code which decodes the trapping instruction, emulates
it in software, places the result in the intended destination, and
continues.  This is very slow, awkward to debug, may inhibit system
response by blocking lower-priority traps during trap handling, and
may cause the kernel size to be larger even when no applications cause
incomplete traps.

A better mechanism for supporting incomplete instructions would be to
have user-mode trap handlers that could intercept incomplete traps (via
a better mechanism than SIGFPE, or at least a different, undocumented
one) and handle them in user mode with minimal overhead over a
conventional function call.  Outstanding instructions in the queue
prior to the trapping one must be handled in a way that doesn't
jeopardize the integrity of the system.  Because the recomputation
code is dynamically loaded only when needed, there is no run-time
overhead when it is not used.  Note that putting the recomputation code
in user mode
is not to allow user programmability of recomputation, but to avoid the
problems inherent in extensive computational kernel code.
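
A skeleton of such a user-mode handler, using POSIX sigaction() for
concreteness; emulate_and_resume() is hypothetical, standing in for
the decode/emulate/resume machinery described above:

	#include <signal.h>

	/* Hypothetical: decode the faulting instruction from the
	   saved context, emulate it in software, deposit the result
	   in the destination, and advance the PC. */
	extern void emulate_and_resume(void *uctx);

	static void incomplete_handler(int sig, siginfo_t *info,
	                               void *uctx)
	{
	    (void)sig; (void)info;
	    emulate_and_resume(uctx);
	}

	int install_incomplete_handler(void)
	{
	    struct sigaction sa;
	    sa.sa_sigaction = incomplete_handler;
	    sigemptyset(&sa.sa_mask);
	    sa.sa_flags = SA_SIGINFO;
	    return sigaction(SIGFPE, &sa, (struct sigaction *)0);
	}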

This is how recomputation works in principle on the Sun-3 FPA and
Sun-3x FPA+; in
reality there is a privileged hardware trap that gets converted to
SIGFPE, and a user-mode SIGFPE handler in libc.  So the SIGFPE overhead
gets added to the kernel trap overhead and the emulation overhead
inherent in the Sun-4 approach.   Sun-3 programmers idly enabling
SIGFPE handlers of their own via signal() or sigvec() don't get very
far, which is a weakness of handling incomplete instructions in user
mode by overloading another mechanism.  Some seldom-seen documentation
in the floating-point programmer's guide hinted at ways to enable SIGFPE
handlers without interfering with the SIGFPE recomputation process.


