64 bit integers -- the real problem
David Boundy
uunet!apollo.com!boundy
Thu Jan 2 06:36:17 PST 1992
In the discussion about 64 bit integers, a most important point has gone
undiscussed. Recall that a data structure is a data type and operations on
that type. An assumed part of that definition is that those operations be
"useful!" Most of the proposals floated have failed this "usefulness" test.
Short form: Whatever else NCEG does or does not do w.r.t. 64 bit integer
types, NCEG really should propose a reasonable model for integer arithmetic
semantics to repair the brain damage in the ANSI C standard.
Long form:
I start with two theses. First, a programming language is "good" to the
degree that it makes it easy to write correct programs. A good programming
language does not have features that seduce all but the wariest programmers
into writing programs that don't work. Programs that have an obvious naive
interpretation should be interpreted in the naive and obvious way.
Second, the most useful behavior for computer integer arithmetic is to model
pure algebraic integers as closely as possible. Just as mathematicians
arrived at our current notions of integers over several millennia because they
usefully model the real world, a computer arithmetic usefully solves real
problems to the degree that it models mathematical integers. But, of course,
the essential problem in defining computer arithmetic systems is that
computers are constrained to model infinite number systems within a costly,
finite, physical implementation.
A model for correct evaluation of integer expressions.
A C compiler stands on solid ground when it exploits a programmer's assertion
that the values that will be stored in an explicitly-typed lvalue won't exceed
the bounds of its type. Once a programmer has, for instance, declared a
signed 16-bit variable, he promises not to set it to a value outside of
[ -32768, 32767 ]. And, of course, most processors physically prevent it.
It just happens that 2's complement implementations make modular arithmetic
behavior most convenient for the overflow cases. Almost always when the
difference between unbounded integers and modular integers is exposed, it is
at best an irritating behavior; it should not be the central model presented
to the programmer, only the consistent fallback when no non-irritating
alternative exists. The goal should be to do the arithmetic as if on the
unbounded integers of mathematics, and lose bits only at the points where the
programmer has provided explicit size assertions, guaranteeing that no useful
bits will be discarded.
*** CENTRAL PARAGRAPH *** HEREINAFTER REFERRED TO AS "***" ***
We thus come to a utopian characterization of "correct" integer arithmetic:
a correct rendering of an integer assignment statement is indistinguishable
from one that extends all right-hand side (RHS) operands to infinite width
( either zero-extending or sign-extending as appropriate ), evaluates the
RHS in infinite width, then deposits the low-order bits into the left-hand
side (LHS) variable. ( "assignment" includes evaluation of the actual
parameters of a function call, evaluating an expression that is the operand
of a type cast, or any other action that computes a value into a cell of a
programmer-defined type. )
Interesting result of *** -- utopia isn't expensive in most cases.
To "correctly" evaluate assignments whose RHSs use only the right-to-left
operators -- add, subtract, multiply, and, or, xor, left-shift, negate, and
not -- it is necessary and sufficient to extend all operands to at
least the width of the LHS, do the computation in that width, and deposit
the low-order bits into the LHS. The high-order bits may be ignored or
manipulated with arbitrary malice!
Division ( and its cousins mod and right-shift ) is essentially the entire
problem -- either the language design center must be a scheme that makes
them work, or it must explicitly decide where they break. Division ( I
use "divide" to cover divide, mod, and >> ) has an irritating property: the value of any
bit of either the dividend or divisor can affect the value of any bit of the
result. Thus, most current architectures -- and the C language -- implement
both a signed divide ( which implicitly sign-extends both operands to infinite
width before dividing them ), and an unsigned divide ( which implicitly
zero-extends both operands ).
So, now let's take a look at the ingredients we need to mix together to
cook a useful integer data structure. First, the data types. Most C compilers
implement the following kinds of integers, though one or more of these classes
may be empty on a particular implementation:
* A "native" integer, "int". Typically this is the width of the
integer registers in the smallest member of the target
architecture's family.
* "Narrow" integers. "short" and "char" in most implementations,
these are integer formats that are narrower than the "native"
format. If a narrow operand is consumed by a divide, either the
architecture contains instructions operating on these narrower
operands, or the operands need to be extended to the width of
native integer.
* Some C implementations include "convenient wide" integers --
"long" or "long long" that is wider than "int". Typically this
is the width of the integer registers in the largest member of
an architecture family, or the width of the operands of an
instruction that operates on multiple-register operands.
* "Bignums". These are much wider than the machine's registers.
Bignums may either be unbounded LISP-like bignums, or may be
statically-sized multiple-word operands. Static-size bignums
can be implemented in multiple registers. On register-poor
machines, or for unbounded values, bignums must be implemented
completely in memory using add-with-carry instructions, long
multiplication and division, etc.
[ I absolutely don't care what the types are called. If we do the following
correctly, and the program uses ANSI prototypes instead of allowing types to
default, essentially everything else can be solved with typedefs -- enough
just works that I think essentially all my customers will be happy. Without
the following, no nomenclature for types will be portable. ]
Now the interesting part of our data structure, the operations:
Unfortunately, language designers and implementors are forced to trade off
the correctness of mathematical integers against the cost of multi-instruction
sequences. There are four interesting points on the correctness vs. execution
time spectrum:
* Scheme 1. Fully correct arithmetic at unbounded cost.
The most correct integer arithmetic scheme is an unbounded bignum-based
implementation where the width of operands is determined dynamically.
This implements full mathematically correct unbounded-value integer
arithmetic. This is appropriate in some languages, not in C.
* Scheme 2. Fully correct arithmetic at modest cost.
Scheme 1 ignores the hints given by the programmer in the declarations:
the programmer promises to the compiler that the values of variables are
bounded as specified by their types. In scheme two, the compiler exploits
this bounds information to do all arithmetic in the minimum width required
to fully meet the correctness criterion of ***, potentially using
statically-sized bignums for intermediate values.
Thus, a RHS not containing divides can be evaluated in the width of the
LHS ( a property shared by schemes 2, 3, and 4 ). If a RHS does contain
a divide, all RHS intermediate results that eventually are used by the
divide must be computed to enough bits to contain the widest possible
range of those results, even if this requires doing this intermediate
arithmetic in an "inconveniently wide" static bignum format.
* Scheme 3: A reasonable approximation to fully correct arithmetic at
almost no cost over incorrect arithmetic.
In the third scheme, all intermediates are the width of the widest atom
in the statement, or the native integer format if that is wider. Thus,
width information must be synthesized bottom-up and inherited top-down.
Thus, if the LHS is the only wide integer, all RHS atoms are promoted to
the width of the LHS before the arithmetic is performed.
* Scheme 4: Really Stupid arithmetic.
Though several language standards discuss multiple representations of
integers, most do not specify explicit promotion rules, specifying only
general arithmetic rules that lead implementors to something close to
scheme 3. ANSI C, however, specifies explicit promotion rules that
emphasize execution speed over any pretense of correctness: the width
of the intermediates is only a bottom-up property -- the width of the
widest constituent operand, or "int" format if that is wider. Thus,
this assignment
signed_64_bit_int_a = signed_32_bit_int_a * signed_32_bit_int_b;
if the native integer is 32 bits, will be evaluated in 32 bits, then
sign-extended to the width of the LHS. Thus, the high-order bits of the
LHS are all but guaranteed to be incorrect.
Under this scheme, all programmers using integers wider than "int" must
examine their programs *very carefully* *by hand* to ensure that the
correct explicit widening casts have been used.
The reason that the masses haven't risen up and burned down most vendors' R&D
facilities is that most of us have implementations where "int" is the widest
format; thus the problem doesn't arise. But the fact that we haven't seen it
heretofore doesn't mean it isn't a problem; ask PC programmers. Several of the proposals
that have been floated in this forum would expose it.
The cost differential between scheme 3 and scheme 4:
* All arithmetic is done in a width "convenient" for the hardware
( assuming that the language implementation does not define a
primitive data type that the underlying hardware finds totally
"inconvenient" ).
* The only time wider-than-native arithmetic is required is when
the programmer has specified a wider-than-native operand or
result. The program is explicitly telling the compiler that a
wider-than-native ( though still "convenient" ) operation is
required to do the arithmetic correctly.
* Assuming that "convenient wide" operations execute at the same
speed as "native width" instructions, the only additional cost
of scheme 3 over scheme 4 is when the LHS is wider than native,
and then each native-or-narrower operand must be pre-extended to
wide format, rather than the entire RHS being post-extended.
Summation: There are two possibilities:
1) If NCEG is unwilling to unbreak the ANSI standard, then NCEG should
recommend that all implementations define "int" to be the widest
integer datatype in the machine. Congratulations, Cray.
2) My preference: NCEG should recommend something like scheme three to
replace ANSI's scheme four.
-boundless