Building a Code Generator

In this video, which follows on from the last
one, with tidying up, first we fixed the bug in the loop that I had before, and the bug
was that I had condition for the end of the loop and the incremental pointer loop swapped,
and I was also trying to go forward in the loop when I should have gone backwards. Now
I have got the correct loop and the correct amount of indentation and I’ve made a few
other changes to my PrintTree that are cosmetic they just make the output of the tree look
better. If you can see in the command prompt window, this is the kind of tree we now get
for this small test program. So you can see it is indented, which clearly shows the layout
of the syntax. We’ve got a program, inside that statements and inside that a statement,
and inside that an if statement, and then inside that a condition, and inside that condition
is an expression and the next thing underneath the if statement is a statement and that is
an assignment statement and it says its an assignment to c there and you can see in he
expression we’ve got b. You can see this little three line program that its parsed, underneath
there. The next thing to do is to write a code generator to generate code. Well that’s why we have spent so much time
on the print tree as generating code is just to print out some code instead of the tree.
So, we probably just need a similar template to this PrintTree, so that’s what we can do.
We can then add, and this is not a DEBUG so it goes outside the “ifdef DEBUG”, we could
actually make something called WriteCode or GenerateCode or PrintCode or Code or something
like that. It takes in a tree and then is going to output some code. Well I suspect
its going to have the same “if (t ” is NULL do nothing, and I suspect it is going to be
a recursive tree walk, much like that. Instead of PrintTree we want to WriteCode of t->first,
WriteCode of t->second, WriteCode of t->third. Then the otherr thing we’ve got to do is put
the declaration, and a call for it. So, we will need a call WriteCode from the main program
here passing in our ParseTree that we previously printed, and we need to add a template for
it, which is not ifdef DEBUG, it goes in underneath there. We can WriteCode and it takes an argument
of TERNARY_TREE. Thats our definitions of the main structure of that. One of the hints
of how a code generator is structured is we can look at the earlier calculator example
hat we made, and here it had a routine called “evaluate”, and what it did was a switch on
the nodeIdentifier, and I’m going to copy this chunk of code, and did a different thing,
depending on what kind of node it had. So here we can go to our WriteCode routine and
paste in that little thing, but its not going to do things that a calculator did, which
is return a value or do a calculation, but it might have this overall structure. Now,
one of the things that we usually will, that we need to do next, is to actually design
what code we want out, and that’s the problem that we’ve got. So if we don’t know what code
we want out, for what code we’ve got in, we can’t write code to implement it. We’ve got
to design what we’re going to do. So we really have to start with our BNF, or something like
that, that defines the language and start to think about what we want to do.
So for a program statement, what do we want printed out? Well maybe we would want something
like this: we would need to look for a program, what a main program looks like in C. What
does a main program look like in C/ We can go and look at our main program. It starts
“int main(void)” seems to be what it has. So we could do something like that. So, when
we have a program we would want to put “int main(void) {” and then something and then
a “}”. That will be our design for that. So we need to also think what do we want to
output when we get some statements. So if we’ve got some statement we basically want
the code generated for the first statement and then we want to output a semicolon and
then we want the code generated for the second one, and we probably want to put a newline
in there. OK. What do we want printed when we get a statement?
Well probably there is nothing much to do there. What do we want if we’ve got an if
statement? Hmm. Well probably I want to put the C word if and then I want whatever is
in the condition and then I probably want an open parenthesis and then I probably want
to generate the code for the first statement and then I probably want a close curly bracket.
Now, we’ve already removed the with, so we can pick what we want to do for the while
statement. Hmm. Well perhaps that’s while, some condition,
do curly bracket and something. Something like that is going to be the C equivalent.
What about the assignment statement? What shall I design for that one? Well that’s going
to be simple; it’s going to be the identifier and an equal sign and then the expression,
not a “:=”. Let’s scroll down a little bit. So for the condition I probably generate the
same for the expression I probably generate the same, and so on. So they’re all very similar
thereafter. So, that’s my design. I can go and save that. That’s the first point
to make: without a design, I can’t write the program. So now I need to go to my code generator
with my design in mind. So now I know that in the case of a PROGRAM I now know what to
do. I can printf this string that I’ve designed here, and then probably a newline, and then
in the “…” I want to write the code for that main body part and then I probably want
to print the closing curly bracket, and then I’m probably done. So, then for the next one,
when I’m dealing with some STATEMENTS I probably want to write the code for
the first part and then perhaps print a semicolon
and a newline and then write the code for the second part, and then I’m probably done.
So for statement, I’m going to let this default action at the bottom deal with it. So for
an IF_STATEMENT I am going to print out the string “if (” and then I’m going to write
out the code for the first part which is the condition and then I’m going to print the
closing parenthesis, open curly and perhaps a newline and then I’m going to write out
the code on the second part, which is the statement and then, I’m probably done. So,
well I could just similarly do the WHILE_STATEMENT, and you can see in my design its getting very
similar. I can printf a while, open parenthesis and then I can write out the code for the
condition, which is first, and then I can probably print closing parenthesis and the
word do and an opening curly bracket and perhaps a newline, and I can write out the code for
the second part, and I can return. So that’s the while loop.
So I can go an ASSIGNMENT_STATEMENT. What do I want to do? So for the assignment statement,
on the left hand side we’ve got an identifier, so we want to print that identifier, so how
do we print an identifier? The last time we printed an identifier was in PrintTree, so
we’ll put something for that. Let’s go and have a look, if we go back up, how did we
print an identifier? It was here. So I could have put that in a method for printing an
identifier, or I can just steal it. That’s what I will do. It’s quite odd to print the
word “Identifier”, I just want to print the identifier, otherwise its an unknown identifier.
So I’ve printed the identifier there, and now I can print the equal sign, and then I
want to write code for whatever is on the right hand side, and I’m probably done; and
just for completeness, here, if I’ve got a NUMBER_VALUE, then I can print that one; and
where’s the code to print a NUMBER_VALUE? It’s up here. We just printed it like that.
The after I’ve done that I can probably just return.
And so I’ve gone through all the possibilities in my design and anything else is caught by
the default at the bottom. So its hard to see it all that in one go; I’ll just put that
on one screen and you can see its just a section of alternatives of what to print for that
piece of code. So we can now try and go through bison and check we haven’t made typing mistakes.
Now we can build it, but we don’t want to set the debug flag any more. So now we can
run it through my test program. So you can see its now printed out main, its printed
out the if statement, its translated the assignment, it hasn’t done condition because I haven’t
generated code for the condition. So we have now translated this program into C, and I
can see where I’ve missed the closing bracket off the if statement. There we are. I’ve just
fixed that one, and I’ve probably, looked like I’ve made the same mistake on the while
statement here of mismatched brackets. So, if I now wanted to go a bit further, I shouldn’t
just read that code on the screen. Once I’ve run that I need to put that in “test-program.c”,
so I use my compiler to take the input language, our pascal like language, generate C in that
file, which I can now stick through the gcc. So I can now get a “test-program” dot executable
by compiling “test-program.c”. Now it doesn’t compile because of two things: I haven’t got
declarations of variables and I haven’t finished the thing. I’m just showing you. When yours
is complete don’t forget to compile the C code you generated and run it to check its
got the correct result. We’ve no gone all the way through from the PrintTree to the
code generator for a pascal-like language, should show you how to do all of those steps.

Leave a Reply

Your email address will not be published. Required fields are marked *