A few thoughts regarding Cray's "Variable Length Array Proposal"

Wed Feb 3 16:04:04 PST 1993

Thank you for your comments.  I'm sure they will lead to a better
final proposal.

> I read Cray's VLA proposal, and, in general, I think it is going in the
> right direction.  I would like to pick a few nits however, and perhaps
> suggest that a bit more crafting and wordsmithing is in order (here and
> there).
> 
> >      Only  identifiers  with  automatic  storage duration can
> >      have a type that  contains  a  VLA  type.

These words are in the cover letter.  I agree they are not precise.
The proposal itself says:

   Only identifiers with block scope can have a variably qualified type.
   Objects with static storage duration shall not be declared with a
   variably qualified type.  Only  ordinary  identifiers  (as  defined
   in 6.1.2.3 (3.1.2.3)) may be declared with a variable length array type.

I'm sorry for the confusion.  Where the cover letter and the proposal
conflict, the proposal always wins.

> The proposal explicitly defines the term "an identifier with automatic
> storage duration".  But that definition given DOES NOT include typedef
> names.  Yet later on in the proposal it is noted quite clearly that
> typedef names may indeed be defined in terms of VLA types.

The intent was that typedefs are included.  A typedef is an identifier,
it is an ordinary identifier, it is not a static object, therefore
it can have a VLA type.
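
A minimal sketch of what that intent permits, assuming a compiler that
implements the proposal (the names `fill_and_sum' and `vec_t' are
invented for illustration and do not appear in the proposal):

```c
/* Hypothetical example: a block-scope typedef name with a VLA type,
   used to declare an automatic object.  Assumes n > 0. */
int fill_and_sum(int n)
{
    typedef int vec_t[n];   /* typedef name with a VLA type, block scope */
    vec_t v;                /* automatic object declared through it */
    int total = 0;

    for (int i = 0; i < n; i++) {
        v[i] = i + 1;       /* store 1..n */
        total += v[i];
    }
    return total;           /* sum of 1..n */
}
```

The typedef name is not itself an object, so the restriction on static
storage duration never applies to it; only the object `v' has storage.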

> I think the proposal ought not to try to talk about VLA *objects* on the
> one hand and then about VLA typedefs on the other.  Rather, it should
> just say that VLA *type specifications* can only appear in block scope
> (or in prototype scope) and then say that objects explicitly declared
> as either `static' or `extern' may not have types which are "VLA types".

Since both `static' and `extern' declarations produce objects with
static storage duration, this proposal elected to say that
"objects with static storage duration" can't have a VLA type.

> I think this would simplify the presentation, and also help to clarify a
> number of subtle points.

I don't think the specification is wrong, but I'm willing to think about
a rewording that might be easier to understand.  Perhaps more rationale.

> For example, using Tom's terminology, what does it mean for a type to
> "contain" another type?  I have only a vague idea based upon my own best
> guess.  From that guess, I'd have to conclude that (under this proposal):
> 
> 		void foobar (int n)
> 		{
> 			static int (*ap) [n];
> 		}
> 
> ... would be illegal.  But I can see no practical reason why it should be!
> (Note that in this example, the type of `ap' *involves* a VLA type, but it
> is not itself a VLA type.)

You are correct that this is not allowed by the proposal.  Defining the
behavior of this example raises some questions.  If `foobar' is called
each time with a different value of `n', does the size of the array
pointed at by `ap' change?  If it does change, what does `static'
mean?  If it does not change, when is `n' evaluated?

It's a good point though.  The proposal has gone back and forth with
this example a couple of times.  I do not have a very strong opinion
on this issue.  However, I could not think of a test case where it
mattered if the declaration were static or automatic.  If you can
think of an example where:

	void foobar (int n) {

		auto int (*ap) [n];
	}

would not work, I'd love to see it.  It would certainly sway my opinion
toward allowing this declaration.  BTW, Example 2 in the proposal clearly
states that your example contains an error.
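
For reference, a minimal sketch of the automatic form in use, assuming
a compiler that implements the proposal (the names `sum_row' and `buf'
are invented for illustration):

```c
/* Hypothetical example: sums n ints through an automatic pointer to a
   VLA.  Assumes buf points at n or more ints and n > 0. */
int sum_row(int n, int *buf)
{
    int (*ap)[n] = (int (*)[n]) buf;  /* automatic pointer to array of n ints */
    int total = 0;

    for (int i = 0; i < n; i++)
        total += (*ap)[i];            /* index through the pointed-at array */
    return total;
}
```

Here `n' is evaluated on every call, so the pointed-at array type always
matches the current call, which is the question the `static' form leaves
open.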

[.. stuff deleted ..]

> >     -  For function prototypes, this proposal allows an iden-
> >      tifier that is declared as a parameter and appears in an
> >      array size expression to have its scope extended to  the
> >      beginning of the parameter-type-list (rather than begin-
> >      ning just after the completion of its  declarator)...
> 
> >      ...Is  this  fundamental change  to  C's  lexical  ordering
> >      within  a  prototype acceptable?
> 
> No way!  It creates a horrendous amount of problems (some of which are
> described in the proposal itself) and does so for little (if any) *real*
> additional functionality.  TAKE THIS OUT!

A large part of the rationale is dedicated to this very subject.
My belief is that, if this change to the lexical ordering is taken
out, then the entire proposal should be abandoned.  First, it is a
very common coding style to put the most important parameters first.
In general, the order of parameters doesn't matter.  Without this
change to the lexical ordering rules it is not possible to declare
numerous well defined routines (e.g., BLAS) and keep the same order
of the parameters.
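
To make the ordering constraint concrete, here is a rough sketch of
what a routine is forced to look like when the scope extension is not
available.  The routine name is invented and is not an actual BLAS
entry point:

```c
/* Hypothetical routine: scales an n-by-n matrix in place.
   Without the scope extension, n must be declared before a, because
   the declarator of a refers to n. */
void scale_matrix(int n, double a[n][n], double alpha)
{
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            a[i][j] *= alpha;
}
```

An existing interface that lists the array first and its dimension
afterwards could not be prototyped this way without reordering its
arguments.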

The paper clearly points out that there is an equal number of votes on
both sides.  The nceg@cray.com Email list has debated this subject
at length and it turns out to be a religious argument.  You say it
"creates a horrendous amount of problems" but I contend that they
are only problems for the vendors and not for the users.  The users
do not want to be concerned with argument ordering.  It is also
possible to implement because there is an existence proof.  Cray
Research has already implemented it.

I'm sure this hasn't changed your mind.  There is another proposal
by David Prosser that should come out before too long (I hope).
This solves the problem by allowing declarations such as:

	float (*ap)[?][?];

and the `?' gets filled in with a cast such as:

	ap = (float (*)[n][m]) x;

If it is a parameter, then it is filled in at the call site.  These
pointers are sometimes called "fat" pointers because they contain
more information than just an address.  It's another approach to
consider.  I happen to believe that the Cray VLA proposal is easier
for users to understand because it doesn't involve complicated
declarations such as "pointer to 2-D arrays" that are needed for the
other approach.  Those things scare your average Fortran programmer.

It's the center of the controversy, however.

> >  6.1.2.4 (3.1.2.4) Storage Duration of Objects
> 
> >  ... If the object is variably qualified...
>                         ^^^^^^^^^^^^^^^^^^
> PLEASE PLEASE use a different term (e.g. "variable sized").  The term
> "qualified" already has an established (and very different) meaning
> with respect to ANSI/ISO C programs.

How about "variably modified"?  Since a pointer to a VLA is variably
qualified yet has a fixed size, "variable sized" doesn't seem right.

> >  ... and the block is
> >  entered by a jump to a labeled statement, then the  behavior
> >  is  undefined.
>        ^^^^^^^^^
> 
> Humm... Later on you say that jumping into a block past a declarative
> statement which requires the elaboration of a variable-length array type
> specification is simply illegal, and that a compile-time diagnostic is
> required!  So which is it?  Compile-time diagnostic or undefined behavior
> at run-time?

Good point.  How about undefined behavior?  It's similar to bypassing
an initializer of an automatic object and then referencing the object.
I'll recommend that the constraint be removed.  An implementation is
still at liberty to diagnose it but is not required to.

> >  6.3.3.4 (3.3.3.4) The sizeof operator
> >  
> >  Semantics
> >  
> >       When applied to an operand that  has  array  type,  the
>                                          ^^^
> 
> Don't forget about the operands which *are* array types, e.g.:
> 
> 		typedef int array_type[n];
> 
> 			... sizeof (array_type) ...

OK - sounds good.
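
A small sketch of that case, assuming the array bound is fixed at the
point the typedef is reached, which is what 6.5.6 of the proposal
intends (the function name `typedef_bytes' is invented for
illustration):

```c
#include <stddef.h>

/* Hypothetical example: sizeof applied to a type name that denotes a
   VLA type.  Assumes n > 0. */
size_t typedef_bytes(int n)
{
    typedef int array_type[n];   /* bound evaluated here */

    n += 10;                     /* later changes to n do not affect array_type */
    return sizeof (array_type);  /* operand *is* an array type */
}
```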

> >  Rationale:
> >  
> >       The notion of ``size'' is an  important  part  of  such
> >  operations  as  pointer increment, subscripting, and pointer
> >  difference.  Although the sizeof operator will now produce a
> >  value  computed at runtime, there still exists a consistency
> >  when applied to the previously mentioned operations.
> 
> This needs some fleshing out.  I know what this is *trying* to say, but
> the effects of VLA types on pointer arithmetic really need to be described
> in more detail.  An example of pointer arithmetic on a pointer to a VLA
> type would be helpful.

I will add an example.
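
Something along these lines, perhaps.  This is only a sketch, assuming
a compiler that implements the proposal; the function names are
invented for illustration:

```c
#include <stddef.h>

/* Hypothetical example: byte distance covered by one increment of a
   pointer to an array of n ints.  Assumes n > 0. */
size_t row_stride(int n)
{
    int grid[2][n];          /* two rows of n ints each */
    int (*p)[n] = grid;      /* points at row 0 */
    int (*q)[n] = p + 1;     /* increment is scaled by sizeof(int[n]) */

    return (size_t)((char *)q - (char *)p);
}

/* Hypothetical example: pointer difference counts whole rows. */
ptrdiff_t rows_between(int n)
{
    int grid[3][n];
    int (*first)[n] = &grid[0];
    int (*last)[n] = &grid[2];

    return last - first;     /* two rows apart, whatever n is */
}
```

The size used to scale the increment and the difference is the same
run-time value that sizeof reports, which is the consistency the
rationale refers to.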

> >  6.5.2 (3.5.2) Type Specifiers
> >  
> >  Constraint
> >  
> >       Only identifiers with block scope can have  a  variably
> >  qualified  type.
> 
> What about prototype scope?

Another good point.

> >  ISSUES
> >  
> >  Overview:
> >  
> >       This section discusses two  issues  about  declarations
> >  containing  variable  length  array specifiers (i.e., VLAs).
> >  First, VLAs must be declared at either block scope or  func-
> >  tion  prototype scope, and must have automatic storage dura-
> >  tion (i.e., no static storage duration objects).
> 
> Again, I think it is inappropriate to even discuss VLA *objects* at
> such length.  Just say that VLA *types* can only appear in block scope
> or prototype scope and that captures the major thrust.  Doing that should
> also avoid any doubts about whether or not a type like `int (*)[n]' could
> possibly be a function return type.  (It can't.)
> 
> >  ...then it must be an ordinary identifier.
> 
> What is an "ordinary identifier"?
> 

An ordinary identifier is defined in section 3.1.2.3 of the ANSI standard.
The first mention of ordinary identifiers is made in section 6.5.2 (3.5.2)
Type Specifiers, and a reference is made to the appropriate section of the
ANSI standard.

> >  Constraints
> >  
> >       The [ and ] shall delimit an expression or *.
> 
> The rationale for this `[*]' notation is not provided (as far as I could
> see).  Why is it needed?  What does it do for us?  Can you get rid of it
> if you get rid of the weird exception to the normal lexical scoping rules
> for formal parameter names?

I'll add some more rationale.  Getting rid of the exception to the lexical
scoping rules doesn't eliminate the need for the *.  Consider the
following prototype:

	void f(int, int[*][*]);

Currently, names are not required if there is no actual definition
of the function.  Without the `*', a name would be required.
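
A sketch of how such a prototype pairs with its definition, assuming a
compiler that implements the proposal (the function `trace' is invented
for illustration):

```c
/* Prototype without parameter names: [*] stands for an unspecified
   variable bound, so no identifier is needed here. */
double trace(int, double[*][*]);

/* Hypothetical definition: here the bounds must be named.
   Assumes a is an n-by-n matrix and n > 0. */
double trace(int n, double a[n][n])
{
    double t = 0.0;

    for (int i = 0; i < n; i++)
        t += a[i][i];        /* sum of the diagonal */
    return t;
}
```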

> >  ...  In a parameter-type-
> >  list, if an identifier is both declared as a  parameter  and
> >  appears in  an array size expression that is not a constant
> >  expression, then the scope of that identifier is extended to
> >  the beginning of the parameter-type-list (rather than begin-
> >  ning just after the completion of its declarator).
> 
> You have no idea what trouble this will cause!

I've implemented it.  It exists.  It's nothing to be afraid of.
Perhaps you could elaborate on exactly what you are concerned about.

> >  Overview:
> >  
> >       By far the most controversial issue involves the ``lex-
> >  ical  ordering  problem''  that is presented when prototypes
> >  with variably qualified parameters are  used  and  the  size
> >  expression  involves  an  identifier  that  is  not visible.
> 
> As well it should be!
> 
> >  Currently, programmers do not  need  to  concern  themselves
> >  with the order in which formal parameters are specified, and
> >  one common programming style is to declare the  most  impor-
> >  tant  parameters  first.   Consider  the following prototype
> >  definition:
> >  
> >  Example 14
> >  
> >    /* prototype declaration for old-style definition */
> >  
> >    void f(double a[*][*], int n);
> >  
> >    void f(a, n)
> >       int n;
> >       double a[n][n];
> >    {
> >       /* ... */
> >    }
> >  
> >  The order in which the names are specified in the  parameter
> >  list  does not depend on the order of the parameter declara-
> >  tions themselves.  The accompanying prototype declaration is
> >  compatible  with this definition and thus it seems appropri-
>                                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^
> >  ate to allow the following prototype definition in  lieu  of
>    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> >  the previous old-style definition.
>    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> 
> Why not just disallow both??  You could have a rule which said that in
> the case of an old-style (non-prototyped) declaration a given name of a
> given formal parameter could *never* be used in any way in the specifica-
> tion of the type of any other formal parameter whose name appeared earlier
> than the given parameter name in the (old-style) parameter name list.

I can see that we are going to have to agree to disagree on this one.
I stated my opinion earlier.  It hasn't changed.

> >  6.5.6 (3.5.6) Type definitions
> >  
> >  Constraints
> >  
> >       Typedef declarations which specify a variably qualified
> >  type  shall  have  block scope.  The array size specified by
> >  the variable length array type shall  be  evaluated
> >  at the time the type definition is declared... 
>    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> 
> When is a type definition "declared"?  (Next week perhaps? :-)
> 
> This really needs some serious wordsmithing.  The ANSI/ISO C standard
> somehow managed to totally avoid introducing the concept of the dynamic
> "elaboration" of declarative statements, although how this was done is
> still a bit of an enigma (to me at least).
> 
> In contrast, if you look at the Ada standard (Oh God!  He said the `A' word!)
> you will see that the authors of that document had the foresight and clarity
> to understand that when dynamic types are involved, you simply cannot avoid
> talking about the "dynamic elaboration" of declarative statements.  Of
> course, when I say "talk about it" I really mean "define it".  You have
> to define *when* things get elaborated, and what effect such elaborations
> have.
> 
> I don't think there is any way to sweep this problem under the carpet
> anymore for C... not if we are going to have VLA types!  Somebody will
> *have* to define the "elaboration" of declarative statements AS A DYNAMIC
> PROCESS which occurs at certain specified points in time.

You've raised a good point again.  I will work on this.

> >  6.7.1 (3.7.1) Function Definitions
> >  
> >  On entry to the function all size  expressions  of  variably
> >  qualified parameters are evaluated.
> 
> How about saying instead "On entry to a given scope, all declarative
> statements of the scope are elaborated in turn, in the sequence in
> which they appear."?

I think this could work.  I'll have to see how I might work this
into the document.

> >  7.6.2.1 (4.6.2.1) The longjmp function
> >  
> >  Description
> >  
> >       If a longjmp function invocation causes the termination
> >  of  a  function  or  block  in  which  variable length array
> >  objects are still allocated, then the behavior is undefined.
>                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> 
> No good.  The behaviour *must* be defined.  Otherwise you have invented
> a highly crippled VLA feature.

Really, it could be defined.  However, in many implementations this
could mean storage is lost.  This is just trying to acknowledge
that issue.  If a VLA is allocated on the heap, then it is
easy to see how storage might get lost.  However, you are right that
a smart longjmp could figure it all out as it unwinds through the
call chain.

Thank you for your comments.  I would much rather have critical comments
because they help me focus on weak parts of the proposal.  You've given
me plenty to think about.  We'll vote on it :^).

Tom MacDonald
tam@cray.com
uunet!cray!tam