A few thoughts regarding Cray's "Variable Length Array Proposal" <9302040004.AA14589awillow29.cray.com>

Wed Feb 3 17:28:36 PST 1993

This is my response to Tom MacDonald's response to my response to his
VLA proposal.

I'll try to be brief.

  > For example, using Tom's terminology, what does it mean for a type to
  > "contain" another type?  I have only a vague idea based upon my own best
  > guess.  From that guess, I'd have to conclude that (under this proposal):
  > 
  > 		void foobar (int n)
  > 		{
  > 			static int (*ap) [n];
  > 		}
  > 
  > ... would be illegal.  But I can see no practical reason why it should be!
  > (Note that in this example, the type of `ap' *involves* a VLA type, but it
  > is not itself a VLA type.)

  You are correct that this is not allowed by the proposal.  Defining the
  behavior of this example raises some questions.  If each time `foobar'
  is called with a different value of `n' then does the size of the array
  being pointed at by `ap' change?

The size, as understood by the compiler (and as understood by the program
itself, at run-time), does change, yes.

  If it does change, then it raises questions about what does `static' mean?

`static' in this case means exactly what it has always meant.  The variable
`ap' gets allocated only once, before the program even begins execution.
I don't see why there should be any mystery here.

  If it does not change then when is `n' evaluated?

`n' is evaluated each time the declaration of `ap' is elaborated... and we
should probably agree that the declaration of `ap' gets elaborated each
time we enter ap's scope.
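
To make these semantics concrete, here is a minimal sketch.  It assumes a
compiler that accepts a `static' pointer to a variable length array (C99
later permitted exactly this form).  The return value, the local `storage'
array, and the assert are my own additions for illustration; they are not
part of the proposal.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical variant of foobar that reports the size of the array
   type `ap' points at on this particular call. */
size_t foobar(int n)
{
    assert(n > 0);              /* a VLA dimension must be positive */

    static int (*ap)[n];        /* ONE pointer object, allocated once;
                                   `n' is read each time this is elaborated */
    int storage[n];             /* automatic VLA for `ap' to point at */

    ap = &storage;
    size_t size = sizeof *ap;   /* n * sizeof(int), computed at run time */
    ap = NULL;                  /* don't leave a dangling pointer behind */
    return size;
}
```

The pointer itself lives in static storage for the whole run of the
program; only the type it points to is re-evaluated on each entry.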

  It's a good point though.  The proposal has gone back and forth with
  this example a couple of times.  I do not have a very strong opinion
  on this issue.  However, I could not think of a test case where it
  mattered if the declaration were static or automatic.

It matters to implementors.  You want to strike that perfect balance
between ease of implementation and power for the users.  In this case,
I believe that the rules which the implementors have to make their
compilers enforce will be simpler to implement if you allow the above
example.

  > >      ...Is  this  fundamental change  to  C's  lexical  ordering
  > >      within  a  prototype acceptable?
  > 
  > No way!  It creates a horrendous amount of problems (some of which are
  > described in the proposal itself) and does so for little (if any) *real*
  > additional functionality.  TAKE THIS OUT!

  A large part of the rationale is dedicated to this very subject.
  My belief is that, if this change to the lexical ordering is taken
  out, then the entire proposal should be abandoned.

The rest of the proposal is admirable, and I see no reason to believe
that it would not continue to be so, even in the absence of this
unfortunate breach of C's existing lexical scoping rules (which
generally go a long way towards permitting simple-minded one-pass
compilation).

  First, it is a
  very common coding style to put the most important parameters first.

I'm glad you said that.  It seems that we are in agreement after all!
Of course you realize that the array dimensions are the "most important"
parameters, right? (1/2 :-)

  In general, the order of parameters doesn't matter.

Exactly so.  This is precisely the point I would make if I were trying
to illustrate why there is no need to deviate from ANSI C's existing
scoping rules (which will make life harder for implementors).  The
very thought that some names might have "backward reaching" scopes
sounds to me... well... backwards!

  Without this
  change to the lexical ordering rules it is not possible to declare
  numerous well defined routines (e.g., BLAS) and keep the same order
  of the parameters.

Same order as what???  Are we to be convinced that there are certain
functions (written in entirely different languages) whose calling
conventions are so deeply burned into the hearts and minds of certain
numerical programmers that those programmers would suffer irreparable
mental collapses (and cease to function as programmers) if they had
to call the same functions in a slightly different way when working in
an entirely different language?  This stretches credulity.
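
As a rough sketch of how little is lost under ordinary forward scoping:
the dimensions simply come first, and anyone who insists on an array-first
calling order can have it through a thin wrapper.  Both function names are
hypothetical, and the code assumes C99-style VLA parameters.

```c
#include <assert.h>

/* Dimensions first, so `rows' and `cols' are already in scope when
   the array parameter is declared.  No backward scoping needed. */
double sum_matrix(int rows, int cols, double a[rows][cols])
{
    double total = 0.0;

    for (int i = 0; i < rows; i++)
        for (int j = 0; j < cols; j++)
            total += a[i][j];
    return total;
}

/* Hypothetical wrapper preserving an array-first parameter order. */
double sum_matrix_legacy(double *a, int rows, int cols)
{
    assert(rows > 0 && cols > 0);
    return sum_matrix(rows, cols, (double (*)[cols])a);
}
```

A caller holding `double m[2][3]' can write either `sum_matrix(2, 3, m)'
or `sum_matrix_legacy(&m[0][0], 2, 3)'.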

  ... You say it
  "creates a horrendous amount of problems" but I contend that they
  are only problems for the vendors and not for the users.

Am I to infer from this that the problems faced by implementors are not
real problems, that such problems will not have any effect upon the
speed with which implementations will be delivered to users, that they
will not have any effect upon the reliability of implementations, and
that (in general) we simply should not care one whit about the problems
faced by implementors?

I think not.

The problems faced by implementors are real, and they have a profound
effect upon what users get, and upon *when* they get it.  It is all too
easy for *any* user community to say "We don't care about your problems.
We want the world and we want it now!"  But such an attitude doesn't
help anything.  Good languages (like C) are based upon a careful balance
between what is powerful to use and what is reasonably easy to implement.

"Backward scopes" are indeed *possible* to implement, but I don't think
that you will find a single implementor who will say that they will
be nearly as easy to implement as traditional forward-looking scopes.

A balance has to be struck, and given how obviously easy it is for users
to put their parameters in the proper left-to-right order, this seems
like a totally unnecessary annoyance to burden the implementors with.
Wouldn't you rather have your implementor working on better reliability
of the *whole* compiler, or better conformance to the ANSI C standard,
or better code optimization?

'nuff said.

  Cray Research has already implemented it.

I have agreed that it is *possible* to implement "backward scopes".  It
is also possible to (re-)implement the entire NORAD Air Defense System.
It may even be possible to implement Star Wars (aka SDI).  The fact that
something is technologically possible is not (in and of itself) an argument
for doing it (or for forcing others to re-do it).

  I'm sure this hasn't changed your mind.

Likewise (I imagine).

  > >  ... If the object is variably qualified...
  >                         ^^^^^^^^^^^^^^^^^^
  > PLEASE PLEASE use a different term (e.g. "variable sized").  The term
  > "qualified" already has an established (and very different) meaning
  > with respect to ANSI/ISO C programs.

  How about variably modified?...

Anything will do... as long as you don't try to overload the meaning of
"qualified".

  > >  ... and the block is
  > >  entered by a jump to a labeled statement, then the  behavior
  > >  is  undefined.
  >        ^^^^^^^^^
  > 
  > Humm... Later on you say that jumping into a block past a declarative
  > statement which requires the elaboration of a variable-length array type
  > specification is simply illegal, and that a compile-time diagnostic is
  > required!  So which is it?  Compile-time diagnostic or undefined behavior
  > at run-time?

  Good point.  How about undefined behavior?  It's similar to bypassing
  an initializer of an automatic object and then referencing the object.
  I'll recommend that the constraint be removed...

Yikes!  That is exactly the *opposite* of what I was hoping for!  I think
it is a good idea to *require* a diagnostic for attempts to jump into
a block in such a way that you *avoid* elaborating declarations which
involve VLA types.

Note that C++ already has similar rules.  In C++ you may not jump into a
block if doing so will cause you to miss the elaboration of declarations
which include initializations.

(I never liked those nasty old goto's anyway, so as far as I'm concerned,
the more we clamp down on the most ill-structured uses of them, the better.)
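
For illustration, a small sketch of the kind of jump at issue.  The
function and label names are hypothetical, and the offending `goto' is
left in a comment so that the example still compiles on a compiler that
diagnoses it.

```c
#include <assert.h>

/* Hypothetical example: sums 0..n-1 by way of a VLA declared in a block. */
int sum_upto(int n)
{
    int total = 0;

    assert(n > 0);

    /* goto fill;   <-- the jump in question: it would enter the block
                        without elaborating `buf', so `buf' would have
                        neither storage nor a known size */
    {
        int buf[n];             /* elaborated on normal entry to the block */
    /* fill: */
        for (int i = 0; i < n; i++)
            buf[i] = i;
        for (int i = 0; i < n; i++) {
            total += buf[i];
            if (total > 1000)
                goto done;      /* jumping OUT of the block is harmless */
        }
    }
done:
    return total;
}
```

A required diagnostic catches the commented-out jump at compile time,
instead of leaving the programmer to discover it at run time.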

  > The rationale for this `[*]' notation is not provided (as far as I could
  > see).  Why is it needed?  What does it do for us?  Can you get rid of it
  > if you get rid of the weird exception to the normal lexical scoping rules
  > for formal parameter names?

  I'll add some more rationale.  Getting rid of the exception to the lexical
  scoping rules doesn't eliminate the need for the *.

I think it does.  (There goes another whole big batch of unnecessary
complexity!)

  Consider the following prototype:

  	void f(int, int[*][*]);

  currently names are not required if there is no actual definition
  of the function.  Without the `*' a name would be required.

And what would be wrong with that???  Is there some reason that:

	void f(int n, int[n][n]);

... would be "bad"?
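
Here is a short sketch of the two spellings side by side, assuming a
compiler with C99-style VLA parameters.  The function `fill' and its body
are hypothetical.

```c
#include <assert.h>

/* The size is named and the array is not; ordinary forward scoping
   is all that is needed. */
void fill(int n, int[n][n]);

/* The `[*]' spelling from the proposal; it redeclares the same function. */
void fill(int, int[*][*]);

/* Hypothetical definition matching both prototypes. */
void fill(int n, int a[n][n])
{
    assert(n > 0);
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            a[i][j] = i * n + j;
}
```

Either prototype lets a caller write `fill(2, m)' for an `int m[2][2]'.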

  > >  7.6.2.1 (4.6.2.1) The longjmp function
  > >  
  > >  Description
  > >  
  > >       If a longjmp function invocation causes the termination
  > >  of  a  function  or  block  in  which  variable length array
  > >  objects are still allocated, then the behavior is undefined.
  >                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  > 
  > No good.  The behaviour *must* be defined.  Otherwise you have invented
  > a highly crippled VLA feature.

  Really, it could be defined.  However, in many implementations this
  could mean storage is lost.  This is just trying to acknowledge
  that issue.  If a VLA is allocated on the heap...
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

... then that would be a poor implementation.  Why would anyone want to
implement this VLA feature in such a way that `auto' VLA objects get
allocated out in the heap?  That would be a rotten way to do it, and it
would make longjmp unreliable (as you have noted).

All `auto' objects should go on the stack.  That's what `auto' means!
If a programmer wants a VLA object from the heap, let him ask for that
explicitly via malloc().  Conversely, if I explicitly ask for something
to be allocated on the stack (via an explicit or implicit `auto'
storage class specifier) then PLEASE give it to me on the stack as I
requested.

  ...easy to see how storage might get lost.  However, you are right that
  a smart longjmp could figure it all out as it unwinds through the
  call chain.

If you allocate all `auto' objects on the stack (including auto VLA objects)
as you should, then you don't need to get into any hairy stack unwinding
schemes in order to implement longjmp.  You can just do it in the simple
way it is already being done now on most machines, i.e. restore all the
registers (including the stack pointer and frame pointer) and be done with it.
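
A minimal sketch of the case under discussion, assuming the implementation
keeps automatic VLAs in the owning function's stack frame.  All of the
names here are hypothetical.

```c
#include <assert.h>
#include <setjmp.h>

static jmp_buf env;
static int seen;                /* value recorded just before the jump */

/* Hypothetical worker: owns an automatic VLA, then bails out via longjmp. */
static void worker(int n)
{
    assert(n > 0);

    int scratch[n];             /* automatic VLA, assumed to live in this frame */

    scratch[n - 1] = n;
    seen = scratch[n - 1];
    longjmp(env, 1);            /* abandons worker's frame, VLA included */
}

int run(int n)
{
    if (setjmp(env) == 0) {
        worker(n);              /* never returns normally */
        return -1;
    }
    return seen;                /* reached only by way of longjmp */
}
```

With `scratch' on the stack, restoring the saved stack pointer discards it
along with the rest of worker's frame, so nothing is left to clean up.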

  Thank you for your comments.  I would much rather have critical comments
  because they help me focus on weak parts of the proposal.

You are welcome.  Thanks for taking it all in stride.  As I said, I still
think the fundamental ideas are all sound... but I like to nit-pick the
details.

// Ron Guilmette
// Critic at Large