The need for two passes was not immediately evident
during the design and implementation of the FFE code
that was to produce GBEL.
Only after a few kludges had been implemented,
to handle things like an incorrectly guessed ASSIGN label nature,
did enough evidence pile up to make it clear
that std.c had to be introduced to intercept, save,
and then revisit, as part of a second pass,
the digested contents of a program unit.
Other such missteps have occurred during the evolution of the FFE, because of the different goals of the FFE and the GBE.
Because the GBE's original, and still primary, goal
was to directly support the GNU C language,
the GBEL, and the GBE itself,
require more complexity
on the part of most front ends
than they require of gcc's.
For example,
the GBEL offers an interface that permits the gcc front end
to implement most, or all, of the language features it supports,
without the front end having to
make use of non-user-defined variables.
(It's almost certainly the case that all of K&R C,
and probably ANSI C as well,
is handled by the gcc front end
without declaring such variables.)
The FFE, on the other hand, must resort to a variety of "tricks" to achieve its goals.
Consider the following C code:
    int
    foo (int a, int b)
    {
      int c = 0;

      if ((c = bar (c)) == 0)
        goto done;

      quux (c << 1);

    done:
      return c;
    }
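Note what kinds of objects are declared, or defined, before their use, and before any actual code generation involving them would normally take place: the return type of the function, its entry point, its dummy arguments, its variables, and the initial values of those variables.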
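Whereas, the following items can, and do, suddenly appear "out of the blue" in C: label references (goto done precedes the definition of done) and function references (bar and quux are called without any prior declaration).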
Not surprisingly, the GBE faithfully permits the latter set of items to be "discovered" partway through GBEL "programs", just as they are permitted to in C.
Yet, the GBE has tended, at least in the past, to be reluctant to fully support similar "late" discovery of items in the former set.
This makes Fortran a poor fit for the "safe" subset of GBEL. Consider:
      FUNCTION X (A, ARRAY, ID1)
      CHARACTER*(*) A
      DOUBLE PRECISION X, Y, Z, TMP, EE, PI
      REAL ARRAY(ID1*ID2)
      COMMON ID2
      EXTERNAL FRED

      ASSIGN 100 TO J
      CALL FOO (I)
      IF (I .EQ. 0) PRINT *, A(0)
      GOTO 200

      ENTRY Y (Z)
      ASSIGN 101 TO J
200   PRINT *, A(1)
      READ *, TMP
      GOTO J
100   X = TMP * EE
      RETURN
101   Y = TMP * PI
      CALL FRED
      DATA EE, PI /2.71D0, 3.14D0/
      END
Here are some observations about the above code, which, while somewhat contrived, conforms to the FORTRAN 77 and Fortran 90 standards:
* The return type of function X is not known until the DOUBLE PRECISION line has been parsed.
* Whether A is a function or a variable is not known until the PRINT *, A(0) statement has been parsed.
* The bounds of the array argument ARRAY depend on a computation involving the subsequent argument ID1 and the blank-common member ID2.
* Whether Y and Z are local variables, additional function entry points, or dummy arguments to additional entry points is not known until the ENTRY statement is parsed.
* Similarly, whether TMP is a local variable is not known until the READ *, TMP statement is parsed.
* The initial values for EE and PI are not known until after the DATA statement is parsed.
* Whether FRED is a function returning type REAL or a subroutine (which can be thought of as returning type void or, to support alternate returns in a simple way, type int) is not known until the CALL FRED statement is parsed.
* Whether 100 is a FORMAT label or the label of an executable statement is not known until the X = statement is parsed. (These two types of labels get very different treatment, especially when ASSIGN'ed.)
* That J is a local variable is not known until the first ASSIGN statement is parsed. (This happens after executable code has been seen.)
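The observation about the return type of X can be made concrete with a minimal sketch in C. It assumes the default Fortran implicit typing rule (names beginning with I through N are INTEGER, all others REAL) and no IMPLICIT statement in the program unit. The type and function names (ftype, symbol, implicit_type, symbol_first_seen, symbol_declare) are hypothetical and are not taken from the FFE sources.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical type codes; illustrative only, not FFE identifiers.  */
enum ftype { TYPE_INTEGER, TYPE_REAL, TYPE_DOUBLE };

struct symbol
{
  const char *name;
  enum ftype type;
  int explicitly_typed;   /* Nonzero once a type statement names it.  */
};

/* Default implicit typing: I through N are INTEGER, the rest REAL.  */
static enum ftype
implicit_type (const char *name)
{
  int c;

  assert (name != NULL && name[0] != '\0');
  c = toupper ((unsigned char) name[0]);
  return (c >= 'I' && c <= 'N') ? TYPE_INTEGER : TYPE_REAL;
}

/* First sight of a name, as in FUNCTION X: only a guess is possible.  */
static struct symbol
symbol_first_seen (const char *name)
{
  struct symbol s;

  s.name = name;
  s.type = implicit_type (name);
  s.explicitly_typed = 0;
  return s;
}

/* A later type statement, as in DOUBLE PRECISION X, replaces the guess.  */
static void
symbol_declare (struct symbol *s, enum ftype type)
{
  s->type = type;
  s->explicitly_typed = 1;
}
```

In this sketch, symbol_first_seen ("X") yields TYPE_REAL, and only the later symbol_declare call changes it to TYPE_DOUBLE. Anything generated between the two calls would have been built on the wrong type, which is the kind of early guess the text describes.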
Very few of these "discoveries" can be accommodated by the GBE as it has evolved over the years. The GBEL doesn't support several of them, and those it might appear to support don't always work properly, especially in combination with other GBEL and GBE features, as implemented in the GBE.
(Had the GBE and its GBEL originally evolved to support g77,
the shoe would be on the other foot, so to speak--most, if not all,
of the above would be directly supported by the GBEL,
and a few C constructs would probably not, as they are in reality,
be supported.
Both this mythical GBE and today's real one cater to the GBEL
by sometimes scrambling around, cleaning up after themselves--after
discovering that assumptions made earlier during code generation
are incorrect.
That's not a great design, since it indicates significant code
paths that might be rarely tested but used in some key production
environments.)
So, the FFE handles these discrepancies--between the order in which it discovers facts about the code it is compiling, and the order in which the GBEL and GBE support such discoveries--by performing what amounts to two passes over each program unit.
(A few ambiguities can remain at that point,
such as whether, given EXTERNAL BAZ
and no other reference to BAZ
in the program unit,
it is a subroutine, a function, or a block-data--which, in C-speak,
governs its declared return type.
Fortunately, these distinctions are easily finessed
for the procedure, library, and object-file interfaces
supported by g77.)
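The two-pass idea described above can be sketched in C as follows. This is an illustration under stated assumptions, not the design of std.c: the first pass saves each digested statement and records what it learns about labels, and the second pass revisits the saved statements once every fact is available. All names here (stmt, unit, unit_begin, unit_save, unit_replay) are hypothetical, and the model tracks only whether an ASSIGN'ed label turns out to be a FORMAT label or an executable one.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical statement kinds; illustrative only, not FFE identifiers.  */
enum stmt_kind { STMT_ASSIGN, STMT_FORMAT_DEF, STMT_EXEC_DEF };

struct stmt
{
  enum stmt_kind kind;
  int label;
};

#define MAX_STMTS 32
#define MAX_LABEL 1000

struct unit
{
  struct stmt saved[MAX_STMTS];      /* Statements kept for pass 2.  */
  size_t n_saved;
  signed char is_format[MAX_LABEL];  /* -1 unknown, 0 executable, 1 FORMAT.  */
};

/* Start a program unit with no saved statements and no label facts.  */
static void
unit_begin (struct unit *u)
{
  u->n_saved = 0;
  memset (u->is_format, -1, sizeof u->is_format);
}

/* Pass 1: save the statement and record any label fact it settles.  */
static void
unit_save (struct unit *u, struct stmt s)
{
  assert (u->n_saved < MAX_STMTS);
  assert (s.label >= 0 && s.label < MAX_LABEL);
  u->saved[u->n_saved++] = s;
  if (s.kind == STMT_FORMAT_DEF)
    u->is_format[s.label] = 1;
  else if (s.kind == STMT_EXEC_DEF)
    u->is_format[s.label] = 0;
}

/* Pass 2: revisit saved ASSIGN statements with complete label facts.
   Returns the number of ASSIGNs whose label was never defined.  */
static int
unit_replay (const struct unit *u, int *n_format, int *n_exec)
{
  int unresolved = 0;
  size_t i;

  *n_format = 0;
  *n_exec = 0;
  for (i = 0; i < u->n_saved; i++)
    {
      const struct stmt *s = &u->saved[i];

      if (s->kind != STMT_ASSIGN)
        continue;
      if (u->is_format[s->label] == 1)
        ++*n_format;
      else if (u->is_format[s->label] == 0)
        ++*n_exec;
      else
        ++unresolved;
    }
  return unresolved;
}
```

Because no output is produced until unit_replay runs, an ASSIGN that precedes the definition of its label needs no guess and no later cleanup. The cost is that the whole program unit must be held in memory before anything is emitted.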