>>So today we continue

our discussion of syntax directed translation. And syntax directed translation,

what we are doing here is that we have the grammar. And in this grammar we add

what we call embedded actions. So these are the actions that

we embed into our grammar. And these actions will

get executed when? When will these actions

get executed? Yes.>>When it’s parsing?>>Yes when it’s parsing. But, at what point in the

part during the parts?>>During the reduction.>>Yeah exactly, whenever

it makes a reduction. So whenever a reduction

happens, you know, whenever this reduction happens

we would execute this action. And this reduction may

happen multiple times during the derivation. And it may happen

at different levels. So this can happen

multiple times. Okay? This is assuming an LR1

parser that passes, you know, in a bottom-up manner. So you know, if we look at this,

you know, this example string. By the way maybe we should

do a little bit of tracing of the parsing of this string. This is what we have

on the stack. Initially the stack is

empty and this is the string and this is the action. So initially we have, you know A or X equals A times 3 plus

B. This is our string. And our stack is

initially empty. Now we will not show the

building of the LR1 tables and all of these details. But we will assume that

we have an LR1 parser that is working correctly

on this. Now on this LR1 parser what

would be the first step here?>>Shift.>>It will shift. So so after it shifts,

we ill get X here and we will get equals

A times 3 plus B here. Now X is an identifier. So X is an, you know, this is

an identifier, this is a number, this is a, sorry, this is

another identifier, a number, and another identifier. So now what will it do? Will it reduce at this point? Would it reduce this

to identifier and then identifier to factor? No it will not reduce. How will it know

not to reduce here? Of course we are not showing

the LR1 parsing details. But how will it know

not to reduce that it needs to

shift not reduce? Yeah the lookahead exactly. So there will be a rule, you

know to reduce by rule 10. The reduction by rule 10 will

not have a lookahead of equal. It will have a lookahead of

something that follows a factor. And what follows a factor

is a multiply or an add or it could be an end of file. But a factor is never

followed by an equal. Okay? So that’s why

by the lookahead, there will be somewhere

in there, in one of the states there

will be a reduction, you know, factor, factor identifier. But the lookahead for that will

be anything that follows the.>>Plus.>>Yeah plus, minus, multiply,

divide but it will yeah. And it will not have

a, or end of file, and it will not have an equal. So in this case it

will just shift again. And when it shifts

we’ll have X equals. And that will result

in another shift. Then the other shift will the

other shift will put the X equals A Now what’s the action? Now what’s the right action? So when we reduce this X equals

A does this match anything? By the way, can replace

with an identifier. So do we have X equals

identifier somewhere here? No.>>What about 1 [inaudible]?>>No not yet. So what is reducible here is. This is an identifier. We can reduce an

identifier to a factor. So what will happen here is that it will reduce the

identifier to a factor. So it’s going to reduce

by rule number 10. Reduce 10 and this is going

to give us X equals factor. Then factor can be

reduced to a term. So by the way, you are here,

you know, in this derivation to an identifier to a factor

then the factor gets reduced to a term. So then we reduce by rule

number 7 and this is going to do X equals term and we still

have multiply 3 plus B. Now term will this then get reduced to

an expression at this point?>>No.>>No. It will not reduce

this term to an expression. Why? Because of the lookahead. We have a lookahead of multiply. And expression cannot be

followed by a multiply because a multiply can appear

in the middle of an expression. An expression may

be followed by, you know, closed parenthesis. But it may not be

closed by a multiply. The multiply will

appear in the middle of an expression it will

not follow an expression. So here, you know, we will

not substitute expression for term, we will just shift. Here, you know, we are not

showing the LR1 details. But we are assuming that we have

an LR1 parser that’s working correctly on this. So it’s going to

shift X equals term and then 3 plus B. Sorry,

the multiply is here. And this, of course,

does not match anything so it’s going to shift again. And this is going to be X

equals term multiplied by 3. And this is the 3. Now what is the right

thing to happen to the 3?>>Reduces.>>Yeah reduces to a factor. So reduce, reduce by rule 9. Of course this is then 3

plus B and this is going to do X term times number.>>Wait why is, why

is that 3 plus B since it got moved

over into the stack?>>3 plus, oh sorry. Yeah so here shifting

so there is no 3. Okay. So when we shifted yeah

the 3 the symbol can appear only in the stack or on the

string it cannot be in both. Yeah so that was an error. Okay. So now we have the plus

and we reduce the 3 to a number. Then we reduce the

number to a factor?>>Isn’t 3 already a number?>>I’m sorry?>>Isn’t 3 already a number? Because we didn’t reduce

the A to an item first, so.>>Okay, okay. In fact yeah this is

not even a reduction. So 3 to a number is

not even a reduction. So 3 is reduced to a factor. So in fact, you know, I

should have, you know, I should have written identifier

number ID from the beginning. Because the parser works with

what it gets from the scanner. And the scanner will send

it identifier number ID. It will not send it to

actual place, pocket. Yeah so here it’s term times

factor plus B. Now the term times factor gets

reduced to a term. Reduce to a term and

that’s by rule 5. So this is going to be X

equals term times factor. So yeah so this is the term

times factor it gets reduced to a term. And this term will gt

reduced to an expression. Reduce the term to an

expression by rule 4. and this is going to

be X equals expression. And now we will do what? We will shift. So now we will shift and

we will shift which means that we will have X

equals expression plus and that’s a B. Then

we need to shift again. Shift again. And shifting again will give

us X equals expression plus B. And expression plus B

then we do reductions. So the B gets reduced to

a factor then to a term. So we have a sequence

of reductions. So reduce by 10, reduce 10 and this gives us X

equals expression.>>Isn’t B already

an identifier?>>Yeah. So by rule number 10

we’re reducing an identifier to a factor. And then we reduce by,

we reduce the factor to a term which is by 7. Reduce by 7. We get X equals expression

plus term. And then expression plus

term gets reduced by rule 2 to an X equals expression. Okay. In fact, we

forgot the semicolon. But that’s fine it’s

not that big of a deal. We can remove it

from the grammar. It’s not the point here. So X equals expression. You know, with the semicolon, the semicolon you’d

have a shift. So if you add the

semicolon you’d have this and then you will, you

know, you will shift and then you will shift

and get the semicolon. So you will eventually

get this and reduce. X equals expression, you will

reduce this to an assignment. Reduce this to an

assignment by rule number 1. So X equals, sorry

this is an assignment. So why are we doing this? Now here, the derivation

is a sequence of shifts and reductions. And some of these reductions, with some of these reductions

an action will take place and with some of them no

action will take place. So now let’s identify

the reductions on which an action

will take place. The actual reduction are, in which an action will take

place are which reductions? Give me.>>Number 9 and 10.>>Okay. So I reduce 10? No. So there is no action

because we are reducing by 10. So this is, well yeah, we are creating a node

for identifier yeah. So this is yeah so we are

doing a make node here for the identifier make

an identifier node. And on 9 when we reduce by 9

we are making a number node. And when we are reducing by

5, okay, when we are reducing by 5 we are making

a multiply node. And this is by 5, this is by. The point is that, you know

reductions like this, you know, for example reduce,

reducing by 4 which is substitute an

expression for a term, this is not a, there is no

actual, there is no action here. There is no action. So I’m just making

a substitution. So reducing by 2 here will

result in adding an add node. Make add. Okay. And then we make an

assignment when we reduce by 1. So now how does this work? Well, in the way we

do this in practice is that you will write a grammar with these embedded

actions in it. These embedded actions

would not have pseudo code, they’d have C code. So you’d write code in C.

then that’s all what you do. The parser generator is going

to generate and LR1 parser for this grammar that you wrote. And in that LR1 parser,

whenever it does a reduction, whenever it does a reduction

according to a certain rule, it’s going to generate the code that you wrote for

that reduction. So whenever it, you know,

it performs this reduction, it’s going to execute, you know,

whatever code you put in here. And this code you can put

any cod that you want. So this can be a line,

2 lines, 1,000 lines. So you control it. You tell it what to do when

it performs this reduction these reductions. So you are basically saying okay

generate the LR1 tables for me and generate an actual parser

in C. But in that parser that has the C code in it,

whenever there is a reduction by this rule, I want you

to execute this code. So you tell it the code that

it executes whenever it does the reduction. So then it will do

whatever you want. So you control it. It only generates

the parser for you. Okay? And in order to do

this you have to think about the actions

that you wanted to do at each, for each reduction. You want to think about

your, which actions which of these reductions

result in actual actions and which ones are the ones that will not result

in actual actions. So just be passing

things up, you know, for example when

we reduce a factor to a term there is no actual,

there is no actual action, so we just pass the

information up. You need to use this $

notation, $ notation. And this $ notation,

the $$ stands for? What does $$ stand for?>>You want to cancel the tree?>>Well it stands, yeah, it

stands for the current value which is the value of the

left hand side of the rule. So $$ is always associated

with this symbol on the left hand side. And, the $ number, $1,

$2, $3 they correspond to the symbols to the right. And make sure that you

count all the symbols. So this is 1 symbol. So this is $1. The equal is $2. And the expression is $3. So every symbol count. And the symbol would be a

terminal or a non-terminal. Okay? All right. So now any questions on this? So in order to understand

this $ notation we’d have to give another example. We’ll just, in a minute, we’ll

go through another example that instead of constructing

an abstract syntax tree, we’ll compute an estimate

of the computational cost of a given computation. We’ll see what that means. But, before I move to that, are

there any questions on this? Yes.>>How come there’s no

action for reduce 10?>>There is no action for what?>>Reduce 10?>>In reduce 10? Yeah there is always an action. So you make an ID note yeah. Make ID note. Yeah for every reduce

10 there is an action. So if we missed that, yeah

so on 4 there is no action, 7 there is no action, yeah, yeah 4 and 7 do not

result in actual actions. So we just passed those up. Okay. So any questions

on what’s on the board? Okay. Now using the

same grammar, you know, we don’t necessarily

have to use this to construct an abstract

syntax tree. We will go through

an example right now to understand how this works and how the information

is passed up. Okay. So assuming

that we know the cost of the different

operations, so we know. Okay. So let’s. Okay. Let’s assume that the

cost of an add equals 1 unit. The cost of a subtract

equals 1 unit. The cost of a multiply

equals say 3 units. The cost of a divide

equals 7 units. And the cost of a load

equals say 4 units. The cost of a store

equals 3 units. Now given these costs, we would

like to write embedded actions in this grammar to compute to

estimate the computational cost of a given, you know,

of a given string which is an assignment

statement. So this will accept only

assignment statements. So in this case, I was going

to say, we will say that, okay, we will think about the leaves because the computation

will start from the leaves. So for an identifier here,

in order to compute this, we will assume that, you know, the code that will get generated

is going to load A from memory. So for every identifier reduced to a factor we will have an

embedded action $$ equals the cost of a load. And the cost of the load

is, in this case, 4. So whenever there is an

identifier we’re assuming that we, that the generated

code will be a load. Now is this assumption, is

this a valid assumption or not? Does it make sense

or not in general? What do you expect? You know we haven’t talked

much about code generation. But is it a reasonable

assumption to say that whenever there’s an

identifier it will get loaded from memory?>>Well if it’s already loaded.>>Yeah.>>The problem is where

do you have to store.>>No well store is

a different story. But the question is not

every identifier that appears in an expression will

get loaded into memory. So for example, if you

have something like this. X equals A times B plus

C plus A plus, you know, 3 times A plus 7 times A. So

the code that will get generated for this is not going to load

A, you know, 4 times here.>>No.>>It’s not going to

load the A 4 times. It’s going to load it

once then reuse it. Okay. But now, well, in order

to do something like this, what do you think, how do

you think compilers do this, you know, in order to find out

whether this A has been loaded or not, you know, whether

we have an instruction that does load A

into some register?>>Looking at a table.>>Yeah exactly. So we have to track

this in a table. We have to track

this information in some data structure. And on data structure

that we can use, this is the symbol table. So in fact, one of the most

important data structures in a compiler is

the symbol table. And that symbol table will have,

you know, all the symbols in it with some, you know,

with some information. So it will have, in this

case, you know there is A and there is B, C. And

it will have information about each symbol. One of the information,

one of the pieces of information is

the memory location. Symbol table. And then you can

have memory location. So the memory location is

the location of this variable in memory which is

what we care about. And in this you can add is

loaded a flag indicating if this variable has been

loaded or not or is loaded at least in current block. And so if it’s loaded, if

this is loaded then we will, the cost is the cost of a load. If it’s not loaded then we

can say that the cost is 0. So in this case, you know, the

code that you can add here, you can add an if statement. If. Identifier or if the

actual, you know, token is loaded then

$$ equals 0, else $$ equals the

cost of a load. So if it’s already

loaded, you know, basically this incurs

a search in that table or in some data structure

that has information about this particular identifier

whether it has been loaded in this block. Because it may have been

loaded in another block. Now don’t worry about

the concept of a block we will

get to this later. But for now, we want it to

be loaded, you know, recently and in the current block of code that we are currently

processing. It may have been loaded

previously in a piece of code that is distant from the

current piece of code that we are processing. So it could be too distant. But let’s not worry about

these details for now. So the idea here is that a variable will not

get loaded all the time. Now for this number,

you know, we’ll do, what’s the corresponding

operation for a number?>>Merging [inaudible].>>Not now we’re not

building the tree here. So for a number, if this

is, if identifier is loaded, we can do a load

immediate, you know, usually it’s the compiler does

a load immediate for a number. So the cost of load

immediate is say 1. So here we say that,

okay, it’s $$ is the, equals the cost of

load immediate. Okay. Now the interesting stuff. Now what should we do here? The cost of a factor is?>>The cost of the expression?>>The cost of the expression

because these are just, you know, open parenthesis

closed parenthesis. And how do we express this? $$ equals. How do we express this

using the $ notation? Yeah.>>$2?>>Yeah exactly. It’s $2. Why $2 because this

is $1, this is $2, this is $3. So you count all the symbols

whether they are terminals or non-terminals. And what’s the symbol

number of expression on the right hand

side of the, you know, the right hand side

of the production? On the right hand side of the

production there are 3 symbols. Symbol number 1 is

open parenthesis, symbol number 2 is expression, symbol number 3 is

closed parenthesis. Okay. So the symbol that we

care about, the symbol number 2. Okay. Now when we substitute

factor to term, so this is, again this is a fast action. So we are just saying $$ equals.>>$1?>>$1 yes. Now what should we, should,

what should be the cost of term divided by factor?>>Cost of dividing?>>Well not, that’s

not sufficient. Because if you’re dividing, so

you have a divide I this tree and you have a dividing

a term and the factor.>>Okay you’ve got to make

sure that everything’s loaded in the register so

you can do the divide.>>Well we are dealing with a

term and a factor at this level. We are not at that level. You know this is the least level at the least level we

may load or not load. But at this level we have a

term and we have a factor. So what’s the cost of the divide

in terms of the cost of the term and the cost of the factor? Because this term can

have a whole tree. It can be a huge

term and this factor. By the way, we’ll go

through a concrete example. And this factor can have a

huge tree behind, below it. So whatever this term is,

we want to take its cost and the cost of the factor

and then add to that.>>The cost of the divide.>>The cost of the

divide, exactly. So how do we express this

using the $ notation?>>It would be $$ equals $1

plus cost of divide plus $3?>>Yeah exactly yeah. It’s $$ plus $ equals $1 plus

cost of divide, divide plus $3. So when we are doing the divide,

this is $1 and this is $3. Okay? Now obviously this

is going to be similar. So $$ is $1 plus cost

of multiply plus $3.>>That would mean that the add and subtract would

be similar to it.>>Yeah and this

would be similar. Term is going to

be $$ equals $1. So this is just a pass. We’ll go through a

complete example now and it will become obvious. Expression again this is

going to be a $$ equals $1 which is the expression plus

cost of what, subtract plus $3. And this is $$ equals $1

plus cost of add plus $3. Now what should we write here?>>That’d be $3 plus

the cost of storing?>>Yeah exactly. $3 plus the cost of storing. This identifier is not something

that we’re going to load. Right? So we will store

it at a given address. So there is no loading here. So $$ is going to be

the cost of expression which is $3 plus

the cost of store. So the only thing that requires

computation is the expression. Then we will have to assign it to store it at the

given address. How will we know the address? We will look up the

address for this identifier in the symbol table and then

we will know the address to store it to. And finally, you know, in the

last rule, what should we write? $$ equals what?>>$1.>>$1 yeah. Okay. So let’s trace it. Let’s trace it on some example. Okay. Should we use the same

example or a different example. I think we’ve done

these, does it, oh for, I deleted the numbers

but that’s fine. Yeah okay. So now, okay so let’s

take this example. X equals A times 3 plus B

for which we have the 3 here. So now, with this 3, what will

happen is that, the first thing that will happen is

we will reduce an item to file it to a factor. So when we reduce

the item to file to a factor we will calculate

the cost of the load. And the cost of the

load is going to be, the cost for factor

is going to be 4. Okay? So this is going to be 4. Then we will do a reduction

from factor to term. So when we reduce factor to term

we’re just passing the 4 up. So this is what a

rule like this means. So this 4 which is the cost

of a load, load is going to be get passed up to this. So the cost of this

load the term is 4. So this $$ is the value

that is associated with a given node in the tree. So in each node in the tree

there is a value associated with it. And this $$ is this value. And these actions define

the value of a node in terms of the values of the

children of that node. So this is what kind

of tree traversal? Yeah. What kind of debt first.>>Post order.>>It’s post order yeah

because you are defining, you are defining the value

of the current node in terms of the values of the children. So you have to process the

children first then process the term. So processing term

after children, this is post processing,

post order. Okay. So now term is 4. Again the next step

is going to be number, reducing the number to a factor. And for a number, for a load

immediate we said it’s going to be a 1. So the 1 is going to be here. Then, when we do this

reduction, terms times factor, terms time factor, when we

do this reduction, the cost. So we are here now. At this point $$ is this. And $1 is the term. $2 is the multiplication. $3 is the factor. So what we are saying here is

the cost of this term is going to be $1, which is the 4, so at this point $1 is a 4

plus the cost of a multiply. And what did we put there

a 3 for the multiply? Yeah 3. So it’s going to

say okay 3 and 4 and 1. So this is going to be an 8. So the term, the value for

the term is going to be an 8. Okay. So this is an important

step in this calculation. When this reduction takes

place we calculate the value of the node in terms of

the values of the children. So the values of

this children is 4, the value of this child is 1. And we add to this the

cost of the multiply. So in order to compute this we

need 8 units of computation. Similarly, you know, here the

identifier it’s going to be a 4, it will get passed

to the factor. And this is going

to be, you know, from factor to term

you just pass it down. And from term to expression,

when we reduce term to expression, when we

pass, when we reduce term to expression you just pass up. So you pass the 8 up. So now you have 8 and

4 and you add them.>>Which is the add operation.>>The add as a cost of 1. So in this case you

would get a 13.>>Right.>>So the cost of computing

this is going to be a 13. And again, when you do this,

at this point, you know, $$ is the expression,

$1 is this expression, this is $2, and this is $3. So when we are computing

this expression, the parent expression is $$. This, the left hand

side, is this expression. And the right hand

side, sorry the, on the right hand side you have

this expression and the cost of an add and this

expression which is the cost of a term which is $3. Okay? So this is here. Now we’ll substitute. So this is the expression. Then the final step

is the store. So this takes 13 units of

computation to compute. And the cost of the

store did we put 3 or 4?>>3.>>3. So 13 plus 3

this is going to be.>>16.>>16. And again, when you

look at this assignment, this is going to be your $1,

this is going to be your $2, and this is going to be your $3. And in fact $4 is the semicolon. Okay. So and yeah. So we have a semicolon here

but it doesn’t matter it’s. So this is, the semicolon’s

going to be $4. You know, don’t forget

any symbol. Here this symbol

does not matter. Okay? Yeah. Yeah sure.>>I understand the

instruction scheduling part. But why are we doing, computing

the tops of each operation? Is that useful later on?>>This is an exercise. Now this is an exercise

at this point. So we’re doing it

right now in class because this is an exercise

that explains how this works. From a practical point of

view is it ever useful? Yes. So it’s, sometimes

it’s useful because sometimes the compiler, even at the high

level representation of the code may want to estimate

the number of operations needed to compute a certain expression

or a certain block of code. So it’s useful to compute

an estimate of, you know, evaluating a certain

block of code or a certain function

or, you know, whatever. So for example, you know,

at some point I work on the prefetcher

in the GCC compiler. The prefetcher is a

compiler optimization state that adds prefetch instructions to prefecth memory locations

before they are needed so that when you need them they’re

already in the cache. And, in order to figure out the

prefetching in a given loop. Now in order to figure out, you

know, how early you will need to prefetch something in

a certain loop iterations to use it in the next iteration. Right. So you want to prefetch, prefetch A of, you

know, I plus 1. Well, you are here in iteration

I, you prefetch I plus delta. So you need to prefetch

something for the next iteration. Now how far ahead

should this be? You know, should you fetch for

iteration number 2 or 3 or 4? It just depends on the body of

the loop, what you are executing for the loop and how much

computation you expect to happen inside the loop so you

know whether to fix, you know, 3 iterations ahead or

10 iterations ahead or 100 iterations ahead, depending on how

bit the loop is. And, in order to know how big

the loop is, you need to come up with some estimate like

this of the operation. So it’s not just counting. Counting instruction

is not enough because some operations are

more expensive than others. You know a divide is more

expensive than a multiply. So yeah in practice

compilers may want to do this kind of computation. But we are doing it here only to understand the embedded

actions and the $ notations. So that’s why we

are doing it here to understand how the embedded

action and the $ notation work. Any other questions? Okay. So well, okay so let

me then ask the question, let’s try to do a different

question, this same thing. Let’s do right embedded actions that we’ll count the

number of divide operations. Count the number of divide

operations in the expression. So how do we modify

these actions? So instead of computing an

estimate of the computation of cost, we would like to just

count how many divides there are in an expression. So if you have an

expression like, you know, X equals A divided

by D plus 3 divided by 4 plus C times D

plus F divided by N. If you have something like

this then you want this to give you the value 3. So if you have an

expression like this, we have 3 divides in this. Now how do you modify these

actions to count the number of divides in an expression? So let’s go through

them one by one. So what should we right here? $$ equal 0 yeah. This is the leaf. So this is $$ equals 0. For the number same thing. So these are the leaves. $$ equals 0. What should we write here?>>Stay as is.>>Stays the same because

the number of divides in this thing is

equal to the number of divides in the expression. Okay. Now what should

we do here?>>Same.>>Same thing. Now what should we do here.>>$$ equals $$ plus 1.>>Yeah exactly. So it’s $1 plus $3 plus 1. So we can do it this

way $3 plus 1. So the number of divides

is the number of divides in this term plus the number

of divides in this factor which may have many

divides in it, plus the 1. Okay? So now what

should we do here?>>Same thing, put a 0.>>$1 plus $3.>>$1?>>Plus $3.>>Plus $3. And that’s it. Okay. So this is

we just pass up. This is $.>>Same as multiply.>>Yeah. So this is plus $3. And this is $1 plus $3. And this is $$ equals.>>$3?>>$3. Yeah. Okay. So this was a question

in the quiz last semester. So let’s expand the

grammar a little bit. So this is go equals. No instead of a sign let’s

do a recursion on assign. So let’s do block equals,

sorry, block is a sign or block oh yeah we have

to, we have to have 2 rules. So let’s put it here. Block is assign or

a block assign. So what does this mean? What does it mean?>>Block is one or

more assignments.>>Block is one ore

more assignments or you can have any list of assessments an arbitrarily

long list of assessments. Now, in this case, you know, obviously what we write

here is $$ equals.>>$1.>>And here $$ equals. Yeah it’s S1 plus

S2, S1 plus $2. Why? Because we are saying that, because we are doing

the recurgent. So we’re saying that when we

do the recurgent the number of divides in this,

the big block is equal, this block times the assign. And what will this block

have at some point? So, you know, if we have 10

assignments in this block, when do the final reduction,

the final reduction, this assign is going to

be the last assignment and this is going to

have 9 assignments in it.