Anti-decay operator <9212142157.AA16773ajuniper16.cray.com>

Mon Dec 21 14:26:16 PST 1992

Dave Becker wrote:

  I have some comments on your proposal to add first class arrays to C...

Just to clarify, as I noted in my earlier message, ANSI C already *has*
arrays as types, but whether we can properly call them "first-class" or
not is a debatable point.

  I must say that your proposal is the cleanest attempt I have
  seen thus far for making arrays first class (or more first
  class) in C.

Thanks.

  The problem is that once first class arrays are
  added to C, you are forced to add a lot of other things.  Here
  is a summary of what always seems to happen when first class
  arrays are added to C:

     1) New syntax is added to create "first class" arrays.
        This is required because traditional C style arrays
        are still needed.

The amount of new syntax (and new semantics) added by my proposal is quite
minimal, and generally "fits" into the existing grammar of ANSI C quite
easily.  (I would like to see *any* array proposal which adds *less* new
syntax than my proposed language extensions do!)

     2) Operator overloading or creation of new operators/intrinsics
        are needed for operands of "first class" array type.

The overloading of operators in ANSI C is *not* something new which I have
suggested adding out-of-the-blue.  Quite the contrary, operator overloading
(i.e. making operators work on different types of operands) already exists
in ANSI C, and has existed since day one.  If it didn't then you would be
unable to use `+' on both integral and real values.  What I suggested was
*not* the invention of a totally new (and previously unheard of) mechanism
to the language, but rather the mere extension (in the most intuitive way)
of operator overloading (which the language already supports) to array types.
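
As a small illustration of the built-in overloading referred to above, here is a sketch in plain standard C. The function names are invented for this example and are not part of the proposal. The same `+` token performs integer addition, floating-point addition, or pointer arithmetic, selected purely by operand type:

```c
#include <stddef.h>

/* Integer addition: both operands have type int. */
int add_ints(int a, int b)
{
    return a + b;
}

/* Floating-point addition: both operands have type double. */
double add_doubles(double a, double b)
{
    return a + b;
}

/* Pointer arithmetic: pointer plus integer, scaled by sizeof(int). */
const int *advance(const int *p, ptrdiff_t n)
{
    return p + n;
}
```

Extending `+` to a pair of array operands would add one more row to this table of operand-type combinations, rather than introduce a new mechanism.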

     3) Shape conformability is a concern and requires more syntax.

I proposed array-type casts.  Was there something specific which was wrong
with that?

     4) Function interface becomes very complicated.

As I noted in my previous message, I consider this to be an entirely separate
and orthogonal issue (or set of issues, if you prefer).  Fortunately, as I
noted, the approach to adding array-handling features to C which I suggested
allows these "function interface" issues to be dealt with separately (if
and only if that seems particularly necessary).  I'm not sure if other
approaches to array handling in C can likewise permit us to treat these
"function interface" issues as separate issues.

     5) Control flow complexities are introduced.

This is (I believe) a red herring.  (See below.)

     6) Non-elemental operations require either additional syntax,
        new operators, or intrinsics.

You probably have a good point here, but it all depends upon how much
additional functionality you need (and expect) from a new array-handling
capability in C.  If your needs are small, and your expectations modest,
then what I proposed is more than adequate.  If however, you really won't
be happy until C looks like PL/1, then I'll be forced to admit that what
I proposed did not provide a way to do a "matrix multiply" (as that is
usually defined within the numerical computing community).  I'll say more
about this below.

  I would like to now walk through your proposal and show you how
  the above six points do occur in your proposal.

  > So that's what I decided would be the best thing.  Add a new postfix
  > operator `[]' which, when applied to a preceding operand which (nominally)
  > has some array type, simply prevents the type from decaying into a
  > pointer type.  This is simple to understand, explain, and implement,
  > and it doesn't conflict in any way with any of the existing syntax or
  > semantics of either C or C++.  (Note that the rules for this new operator
  > would require that the operand be of some array type.)

  So here is point #1 - adding new syntax for getting "first class" arrays.

This objection, while fundamentally true, is essentially a red herring.
As I've said, the amount of new syntax I proposed is altogether trivial
(and is probably *less* than the amount suggested in other array-handling
proposals).

  > First, what do we do about comparison operators?  What does the expression:
  > 
  > 			left[] <= right[]
  > 
  > really mean?

  > 
  > Naturally, some users would prefer to have comparison operators work one
  > way, and other users would prefer to have them work the other way.  As long
  > as we only have one set of comparison operators, we cannot accommodate both.

  Here enters point #2 - operator overloading/new operators.  I believe you
  have just hit the tip of the iceberg with this one.  Basically you have
  to go through all of the C operators and decide what the meaning of
  each operator is with one or two first class array operands.

Once again, I think this is much ado about nothing.  In my posting, I
suggested that the most reasonable (and intuitive) thing to do would be
to simply say that when some existing operator is applied to an operand
(or to a pair of compatible operands) of some array type (of some
arbitrary dimensionality) then the result is itself an array type value,
compatible with the input operand types, and that result value simply
consists of the set of values yielded by the application of the given
operator to each of the elements of the input operand(s) (element-by-element).

To claim that it will be necessary to think separately about the semantics
of each individual operator now provided by C (for scalar operands) when
those same operators are applied to arrays is silly.  One general rule
can describe the results in all such cases.
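
The single general rule can be emulated in today's C with one loop that is parameterized by the scalar operation. This is only a sketch of the semantics, not of how a compiler would implement the extension; the names `binary_op`, `op_add`, `op_le`, and `elementwise` are hypothetical, and the element type is fixed to int for brevity:

```c
#include <stddef.h>

/* Any scalar binary operator, wrapped as a function. */
typedef int (*binary_op)(int, int);

int op_add(int a, int b) { return a + b; }
int op_le(int a, int b)  { return a <= b; }

/* The general rule: result[i] = op(left[i], right[i]) for every i.
   All three arrays are assumed to have the same length (same "shape"). */
void elementwise(int *result, const int *left, const int *right,
                 size_t len, binary_op op)
{
    for (size_t i = 0; i < len; i++)
        result[i] = op(left[i], right[i]);
}
```

With left = {1, 5, 3} and right = {2, 4, 3}, applying `op_le` yields {1, 0, 1} and applying `op_add` yields {3, 9, 6}. The loop itself never changes; only the scalar operation does.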

  What does "x += vec[]", "!vec[]",  or "vec[] = 2" mean?

You've given three separate expressions here.  The answer for each one
seems obvious.

In the case of "x += vec[]", if we assume that x is a scalar, then the
expression is illegal because the += operator is only defined for either
a pair of scalar operands or a pair of array operands (which have compatible
"shapes").  I never suggested that the += operator should be defined between
operands of different dimensionality or incompatible shapes (although it
could be if that seems warranted).

In the case of "!vec[]" the result is a vector consisting of the element-by-
element logical negation of each element of the input vector.

In the case of "vec[] = 2" I must again note that `=' is yet another operator
for which I DID NOT suggest defining a meaning for operands of different
dimensionality (or incompatible shape).  Such a meaning *could* be defined
(if warranted) but I did not suggest doing that.
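
Of the three expressions, only "!vec[]" is well-formed under the rules as stated. A rough standard-C equivalent of its result is sketched below; the function name `vec_logical_not` is made up for illustration:

```c
#include <stddef.h>

/* out[i] = !in[i]: each element becomes 1 if the input element was zero,
   and 0 otherwise.  The result has the same length as the input. */
void vec_logical_not(int *out, const int *in, size_t len)
{
    for (size_t i = 0; i < len; i++)
        out[i] = !in[i];
}
```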

  Just as your analysis of comparison operators demonstrates, there are no
  simple rules that you can apply here.  In some cases you want both vector
  and scalar results for the same operator.

I agree that in some cases you may wish that you could get scalar results
from some operators, even when the input operands are vectors (or matrices).
That fact, in and of itself does not mean that "there are no simple rules"
which can be applied.  Rather, it only means that the simple rules which
I have already suggested will not fulfill all possible needs of all possible
programmers in all possible situations.  I can live with that.  The scheme
I proposed was designed to be very simple to understand, very simple to
implement, and entirely "upward compatible" with both the letter and the
spirit of ANSI C.  The goals for this scheme were very modest and I think
the scheme achieves those goals well.  The fact that it doesn't fully
address the broader set of goals which you (and others) may have for a
new array-handling scheme in C can either be viewed as a failure of my
proposal to address the *real* goals, or as an argument in favor of
adopting more modest goals.

  > But wait!  What about unary operators applied to array of unknown length?
  > Also, what about binary operators where both operands are of some unknown
  > length?

  > 	void vector_add (int vec1[], int vec2[], unsigned len)
  > 	{
  > 		vec1[] += (int[len]) vec2[];
  > 	}
  > 
  > To make this work, some special "type compatibility" rules would have to
  > apply to values which have some "dynamic array type".  Specifically, we
  > would want to stipulate that (for all applicable binary operations) a
  > value of some dynamic array type is *only* compatible with a value of
  > some *incomplete* array type which has the same element type.

  Welcome to point #3, shape compatibility.  Once first class arrays are added,
  you have to ask what is the behavior of operations that are not shape
  compatible?  Although you can make some default shape compatibility promotion
  rules, at some point you need to add syntax similar to your cast technique.

For the record, allow me to note that my suggestions for adding "array casts"
and also "dynamic array casts" to the language DO NOT add any new syntax.
They do add some new semantics however.
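
To see that the cast notation itself is nothing new, consider the closest thing available in standard C: a cast to a pointer-to-array type whose length is a run-time value. The sketch below assumes a C99-or-later compiler with variable length array support (optional in C11), and it is an approximation of the proposed "dynamic array cast", not the proposal itself:

```c
/* Roughly what  vec1[] += (int[len]) vec2[];  would compute.
   len must be nonzero, because an array type of length zero is not
   valid in standard C. */
void vector_add(int vec1[], int vec2[], unsigned len)
{
    /* Ordinary cast syntax, naming an array type of run-time length. */
    int (*dst)[len] = (int (*)[len]) vec1;
    int (*src)[len] = (int (*)[len]) vec2;

    for (unsigned i = 0; i < len; i++)
        (*dst)[i] += (*src)[i];
}
```

The parenthesized type name followed by an operand is the existing cast grammar; what differs in the proposal is the meaning given to a cast whose target is an array type.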

  > Well, the more astute readers will have noticed that I glossed over a rather
  > important point in my examples above.  Specifically, given a function like:
  > 
  > 	void vector_add (int vec1[], int vec2[], unsigned len)
  > 	{
  > 		/* ... */
  > 	}
  > 
  > the ANSI C standard says that this function is really equivalent to:
  > 
  > 	void vector_add (int *vec1, int *vec2, unsigned len)
  > 	{
  > 		/* ... */
  > 	}
  > 
  > etc.

  It is becoming more apparent to me that issue #4 (function interface
  complexity) is one of the more difficult issues to solve when adding
  first class arrays to C.

I disagree.  As I noted in my previous posting (and again above) the rather
trivial additions to the language which I have suggested (i.e. the anti-
decay operator, extension of all operators to arrays, array casts, and
dynamic array casts) can be used together in ways which make it unnecessary
to do *anything* to change the current rules of ANSI C with respect to
passing or returning arrays.  (Something *could* be done in this area also,
but that is an entirely separate issue.)

  It is really a trade off between ease of
  programming and execution efficiency.  Take for example "sin(vect[])".
  What does this mean?

Looks bogus to me.  Once again, I think you are asking for vastly more new
functionality than my modest proposal intended to provide.

  Does the behavior change depending upon whether a
  prototype of sin is present which expects a vector?

In my previous message, I suggested that the decay of array types to pointer
types in the type specifications for formal parameters could be (and perhaps
should be) left alone.  If that were done, then it would be impossible to
declare a function which "expects a vector".  Functions taking pointers
would still be allowed however.
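
The existing parameter rule can be checked directly in standard C. The sketch below uses C11 `_Generic`, which postdates this discussion; the names `first_element`, `as_pointer_form`, and `TAKES_INT_POINTER` are invented for the example:

```c
/* Declared with array syntax in the parameter list ... */
int first_element(int vec[], unsigned len)
{
    return len ? vec[0] : 0;
}

/* ... yet it initializes, with no cast, a function pointer whose
   parameter is written as a pointer: the two types are identical. */
int (*const as_pointer_form)(int *, unsigned) = first_element;

/* Evaluates to 1 when f has type int (int *, unsigned), else 0. */
#define TAKES_INT_POINTER(f) \
    _Generic(&(f), int (*)(int *, unsigned): 1, default: 0)
```

`TAKES_INT_POINTER(first_element)` evaluates to 1, which is why a function that "expects a vector" cannot be declared as long as this adjustment stays in place.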

  Can a user write a function that will accept an argument of any shape?

No, but a user could (with the extensions I proposed) write a function
which accepts a pointer to a "first element" and then use the anti-decay
operator to turn that into an array of some sort, and then use a dynamic
array cast to turn that into an arbitrarily-shaped array value.
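
The "pointer to a first element, then reshape" idea has a close analogue in standard C. In the sketch below, a flat buffer is viewed as a rows-by-cols matrix through a pointer to a variable length array type (this assumes C99 or later with VLA support). The function `column_sum` and its parameters are hypothetical, chosen only to show one buffer being given two different shapes:

```c
#include <stddef.h>

/* Sum one column of a buffer that the caller wants treated as a
   rows x cols matrix.  The buffer must hold at least rows * cols ints,
   cols must be nonzero, and col must be less than cols. */
long column_sum(int *first, size_t rows, size_t cols, size_t col)
{
    /* Reinterpret the flat buffer: m[r] is row r, an array of cols ints. */
    int (*m)[cols] = (int (*)[cols]) first;
    long sum = 0;

    for (size_t r = 0; r < rows; r++)
        sum += m[r][col];
    return sum;
}
```

Given the six values 1 through 6, the call column_sum(buf, 2, 3, 1) sees the rows {1,2,3} and {4,5,6}, while column_sum(buf, 3, 2, 1) sees {1,2}, {3,4}, and {5,6}. The function's interface is still just a pointer plus dimensions.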

  And what is of intense interest for me recently - how does the complexity of
  distributed memory impact all this?  There is no way around it - either
  function interface has to change significantly or the user's hands are
  tied in what they can express.

The user's hands are always tied by the language rules.  This is sometimes
more true in C than it is in other languages (e.g. APL, PL/1).  If this
really bothers you, perhaps you should be coding in one of these other
languages (rather than C).  (Note that C "ties one's hands" by disallowing
BCD arithmetic also.  Why do I suddenly get this strange feeling that there
is a "Commercial C Extensions Group" out there somewhere, coalescing as we
speak?)

  Issue #5 (control flow complexity) was not covered in your discussion of your
  proposal. What is the behavior of "if (vec1[] <= vect2[])" or even better yet
  what is the behavior of "while (vect1[] <= vect2[])"?  I am assuming
  that "<=" results in a vector of values.

Correct, and the ANSI C standard has some very specific rules regarding
the types of expressions which are allowed to appear as the "condition"
part of an `if' or a `while' statement.  In particular, no permission
is ever given for these "condition expressions" to have types which are
array types.
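
Because the controlling expression of an `if' or `while' must have scalar type, a vector of comparison results would have to be reduced to a single value before it could steer control flow. The two reductions below are a sketch in standard C; the names `all_le` and `any_le` are hypothetical and are not part of the proposal:

```c
#include <stdbool.h>
#include <stddef.h>

/* True only if left[i] <= right[i] holds for every element. */
bool all_le(const int *left, const int *right, size_t len)
{
    for (size_t i = 0; i < len; i++)
        if (!(left[i] <= right[i]))
            return false;
    return true;
}

/* True if left[i] <= right[i] holds for at least one element. */
bool any_le(const int *left, const int *right, size_t len)
{
    for (size_t i = 0; i < len; i++)
        if (left[i] <= right[i])
            return true;
    return false;
}
```

Writing `if (all_le(a, b, n))` or `if (any_le(a, b, n))` forces the programmer to say which of the two readings of a vector comparison is wanted, and the condition stays a scalar.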

  > The idea might also extend nicely from vectors to matrices, if we think
  > of those as undergoing (in ANSI C) multiple stages of "type decay" (as I
  > believe they do).  Then we could write:
  > 
  > 		int sink[100][100];
  > 		int source[100][100];
  > 
  > 		void matrix_multiply ()
  > 		{
  > 			sink[][] *= source[][];
  > 		}
  > 
  > (Note that this is definitely *not* the kind of matrix multiply that those
  > in the numerical computing community would like to have.)

  Last but not least, here is a reference to point #6 - non-elemental operations
  are difficult.  With your array syntax proposal, a user can not write a
  general matrix multiply or a transpose.  This is true with any "first class
  array proposal" that I have seen.  A linear algebra programmer is forced to
  use "for" loops in all cases except for a few trivial operations.

I could live with that.

Once again, this illustrates an "impedance mismatch" between your expectations
and my goals (for my simple array-handling proposals).
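
The gap between the two kinds of "multiply" is easy to see side by side. The sketch below is plain standard C with a small fixed dimension; `DIM`, `elementwise_multiply`, and `linear_algebra_multiply` are names chosen for this illustration only:

```c
#define DIM 2

/* What  sink[][] *= source[][];  would mean under the element-by-element
   rule: each element is multiplied by its counterpart. */
void elementwise_multiply(int sink[DIM][DIM], int source[DIM][DIM])
{
    for (int i = 0; i < DIM; i++)
        for (int j = 0; j < DIM; j++)
            sink[i][j] *= source[i][j];
}

/* The linear-algebra product: result[i][j] is the dot product of
   row i of a with column j of b.  result must not alias a or b. */
void linear_algebra_multiply(int result[DIM][DIM],
                             int a[DIM][DIM], int b[DIM][DIM])
{
    for (int i = 0; i < DIM; i++)
        for (int j = 0; j < DIM; j++) {
            int sum = 0;
            for (int k = 0; k < DIM; k++)
                sum += a[i][k] * b[k][j];
            result[i][j] = sum;
        }
}
```

For a = {{1,2},{3,4}} and b = {{5,6},{7,8}}, the element-wise form gives {{5,12},{21,32}}, while the linear-algebra form gives {{19,22},{43,50}}. The second needs the inner loop over k, and that loop is exactly the non-elemental part a purely element-by-element rule cannot express.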

I believe that there may be a simple way to extend the proposals I have
already put forth (for array handling) so as to make it slightly easier
to write (for example) a "normal" matrix multiply routine (and I'll post
that separate idea in a separate message) but let me reiterate once again
that my proposals so far have been INTENTIONALLY MODEST.  Perhaps that means
that these proposals are not of sufficient value to the broader numerical
computing community to warrant their added complexity.  I don't know.

  So in conclusion, I believe your proposal will end up being as large a
  proposal as C* (or at least the same order of magnitude).  I believe that
  any attempt to add first class arrays to C will result in an unacceptably
  large change to the language.

I disagree with both of these assertions.  As long as the goals are kept
modest, I see no reason to expand dramatically upon what I have already
proposed, so the "size" of my proposal will remain quite small.  Separately,
I must also say that I believe that what I have proposed *does* make arrays
into more of a first-class data type in C (although not with all of the
bells and whistles some would like) and it does so via very modest changes
to the language.

// Ron ("Loose Cannon") Guilmette    uucp: ...uunet!lupine!segfault!rfg
//
// 	"On the one hand I knew that programs could have a compelling
// 	 and deep logical beauty, on the other hand I was forced to
// 	 admit that most programs are presented in a way fit for
// 	 mechanical execution, but even if of any beauty at all,
// 	 totally unfit for human appreciation."
// 						-- Edsger W. Dijkstra