integer sizes

uunet!homxb.att.com!wilber
Fri Dec 20 13:30:00 PST 1991


I figure I may as well put in my two bits on the "integer sizes" debate.  This
time I argue for the most conservative approach, so I'll probably have the
compiler writers on my side.  There is clearly a need for some guaranteed way
of getting integers that have at least 64 bits.  So I'm in favor of the
proposal to make an "extra long" type, guaranteed to be at least 64 bits (and
at least as big as long), with the corresponding constants in limits.h.  Some
compilers already use "long long" for this purpose but, as Earl Killian has
pointed out, this name is ugly and can't be typedefed or #defined to anything.
Suppose the new type is called the "moby".  A compiler writer would be free to
make mobies bigger than 64 bits; e.g., long might be 64 bits and moby might
be 128 bits.  I'm not in favor of any more extensive modifications to the
language, in particular such things as declaring integers with ranges or
specific numbers of bits.
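
To make this concrete, here is roughly what the proposal amounts to; the
MOBY_* values below are just the minimums a 64 bit type implies, not from
any existing header:

    /* Hypothetical limits.h additions for the proposed type. */
    #define MOBY_MAX  9223372036854775807      /* at least 2^63 - 1 */
    #define MOBY_MIN  (-9223372036854775807-1)

    /* On compilers that already provide "long long", the proposed
     * type is one typedef away: */
    typedef long long moby;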

There are three degrees of specificity that a programmer may wish to request
when declaring an integer:

1. Give me any integer type that's "at least this big".

2. Give me the smallest integer type that's "at least this big".  Although you
   can always get integers capable of holding 32 bits by specifying "long",
   on some computers this will give you 64 bit integers, which wastes a lot of
   storage for a 10,000,000 element array.  So if ints are 32 bits you'd rather
   use them.

3. Give me an integer type that has "*exactly* this many bits".  (The example
   was given of using the expression "(x << 24) >> 24" to do a signed
   truncation of an int to 8 bits, which will fail if ints are smaller or
   bigger than 32 bits.)
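
To make case (3) concrete, here is the idiom in question plus a
width-independent rewrite; the function names are mine, and the original
idiom also assumes that >> of a negative int is an arithmetic shift:

    /* Sign-extend the low 8 bits of x.  This works only when ints
     * are exactly 32 bits wide. */
    int trunc8(int x)
    {
        return (x << 24) >> 24;
    }

    /* A rewrite that works for any int width of at least 16 bits:
     * mask to 8 bits, then flip and subtract to propagate the sign. */
    int trunc8_any(int x)
    {
        return ((x & 0xff) ^ 0x80) - 0x80;
    }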

Case (1) is handled by the ANSI requirement that shorts are at least 16 bits
and longs are at least 32 bits, and by the proposed new requirement that mobies
are at least 64 bits.

Case (3) is inconsistent with the C philosophy of providing primitive types
that are efficient and map well to the underlying hardware.  To provide a
true 32 bit int on a machine with a 36 bit word size would require masking
steps after every integer operation, a big penalty for a questionable gain.
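
As a sketch of the cost, this is roughly the code a compiler would have to
generate for a single addition if it faked exact 32 bit ints in 36 bit
registers (written here as C on a hypothetical machine whose longs are 36
bits, rather than in its assembly):

    #define MASK32 0xFFFFFFFFL

    long add32(long a, long b)
    {
        long sum = (a + b) & MASK32;   /* the extra masking step */
        if (sum & 0x80000000L)         /* bit 31 set: result is negative */
            sum |= ~MASK32;            /* re-extend sign into bits 32-35 */
        return sum;
    }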

Case (2) seems to be the principal reason for the various proposals to allow
for specifying integer ranges or numbers of bits.  But there is no need to
extend the language to get this.  Just write a header file that tests the
constants in limits.h to create appropriate types int32, uint32, etc.  This
header file only has to be created once.  It is no harder to say "int32" than
"int<32>", or whatever other peculiar syntax may have been proposed.

The only people who lose are those who require some guaranteed way of
specifying an integer with more than 64 bits.  But even among numerical
programmers, the fraction who require 128 or 256 or 2048 bit integers is quite
small.  I question the wisdom of forcing every C compiler to implement
arbitrary precision integer arithmetic as built-in functionality in order to
satisfy them.  I don't think that NCEG should be trying to invent the ideal
language "D"; it should simply try to patch up the most serious flaws in C.

I don't think NCEG should take any position on whether compilers for 64 bit
computers should have a "compatibility mode" that makes longs 32 bits (instead
of 64).  Such modes are necessary only for poorly written programs, and if a
manufacturer has many customers who use such poorly written programs, it will
have all the incentive it needs to provide the compatibility mode; NCEG's
recommendation, yea or nay, becomes irrelevant.

Nor do I think NCEG should make any recommendation on whether ints should be 64
or 32 bits.  The best argument for making ints 64 bits on 64 bit machines is
that they are used to index arrays that might have more than 2^31 elements.
One problem with this is that even on 64 bit machines it may be useful to have
both 16 and 32 bit integers, and to do that you want ints to be 32 bits so that
shorts can be 16.  Since there are sensible arguments to be made for both
sides, I don't think any definite recommendations should be made.

If I wanted to be certain I could index giant arrays on a 64 bit machine, I'd
typedef idx to the right thing and use idx rather than int or long.
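
Something like this, where the one typedef is the only machine-dependent
line; moby is the proposed type above, so long is the stand-in on today's
compilers:

    typedef long idx;    /* would become moby, or whatever spans memory */

    void clear(double *a, idx n)
    {
        idx i;
        for (i = 0; i < n; i++)
            a[i] = 0.0;
    }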

Bob Wilber     wilber@homxb.att.com


