>>So today we talk about

global register allocation. So in the previous lectures, we talked about local

register allocation within the same basic block. Now the problem with local

register allocation is that it doesn’t allow you

to assign a virtual register to a physical register and keep

the value across basic blocks. So, if you have something

like this define our one, use R1, use R1, and you have. In another basic

block, you have use R1. In local register allocation,

there is no way you can assign, you know, this R1 and

this R1 to the same, to the same physical register. Why? Because you are allocating

each block separately. So you are invoking that local

register allocation algorithm. You are invoking it on this

basic block, and it’s working on this basic block, it

doesn’t know anything about this basic block

or this basic block. And when it’s working

on this basic block, it doesn’t know anything about

what’s in this basic block. So this means that when you

do local register allocation, you cannot keep a

value in a register, and you keep it alive

across basic block. But you’ll have to do

in this case is okay, you can probably allocate

our one physical register, like P1 for example, but here in

this basic block, you don’t know about this basic block. So anything that is used in

this basic block will have to get loaded before it’s used. So you have to load R1 here. The only way you can do it with

local register allocation is to load R1 from memory. You have to load it from

memory and then use it. This means that with

local register allocation, you will have lots of loads

at basic block boundaries. And in fact, you will have

loads and you will have stores. You need stores because

in order for this to work correctly,

we have to what? What do we have to

insert to store? [ Inaudible Responses ] Well at the end of

this basic block or after the last definition

of this basic block. So in this basic

block, this happens to be the last definition,

right. So if this is the

last definition, we should insert

a, sorry, store R1. We can do it here. It doesn’t make a

difference in this case. So you can either put the

stores at the very end of the basic block

or you can put them after the last definition

of the basic block. So, what does this mean? It means that when you do

local register allocation, you will end up adding all

of these loads and stores that will make sure that the

values are stored in memory after a definition,

and a value is loaded from memory before a use. Okay. Is the idea clear here

why, you know, the difficulty of or what’s, well in fact why

local register allocation is not very efficient? Yeah. Okay. So in this case, when we

have, when we have a register that is used in this

basic block but is defined in another basic block, we say that this register

belongs to the live in set. So for each basic block, it is a

live in set, a set of registers that are live at this

point, so they’re defined in other basic blocks but

are used int his basic block. And the live-in set

here will have R1 in it. So if we had, you know,

for example, you define R1. So if we are to define R2 and

define, it’s getting messy, okay, define R2, define R3. And we have, use R3 and

we have a use R2 here. So in this case, in the live

in set for this basic block, we have R1 and what is live in? Live in is the register that is

defined in another basic block that is a predecessor basic

block in the control flow graph and is used in this basic block. So in this case, what is

live here is R1 and R3. While the live in here for

this basic block is R2. Now what’s live out

of this basic block, so what is live out? Well in fact, all

of them are live out because they are used late. So live out is going

to have R1, R2 and R3. Okay. So if you had,

you know, for example, if we let’s extend it, let’s

extend this basic block. So if we had define R4 and

use R4, and R4 is not used in any basic block below this,

so let’s call it basic block 1, and this is basic block 2,

and this is basic block 3. So, R1, R2, R3 are live out

because they are used later in successive basic blocks. While R4 here is not live

out because it’s defined here and it’s used here, and it’s

not used below this basic block. Yes.>>Live out for 1, 2, 3 only

because they’ve been stored?>>No, no, no. Because they are used in

successive basic blocks. Because R1 is defined

here and used here, okay. R2 is defined here

and used here. And R3 is defined

here and used here. So you’ll have, you know,

you’ll have a live range. So R1, the define for R1

you have this, define R1, and this is use of R1, so

this is a whole live range. And for R2, you have this

live range, and for R3, it’s defined here,

and it’s used here, so you’ll have this live range. Okay? So, live means

live at this point. It means it has been

defined above this point. But it will be used

below this point. So this applies to R1, R2, R3,

but it does not apply to R4. R4 has been defined above this

point, but there is no use of R4 below this point. So live out means it will

be used below this point. It will be used later. It will be used in a

successive basic block in the control flow graph. While live in here, live in

means that there is a register that is used in this basic

block but was defined in a previous basic block or

in a predecessor basic block. A basic block that is, you

know, above this basic block in the control flow graph. Although the meaning of above

and below may not make sense when we have cycles

in the graph, right. Because if the graph has cycles,

then you can no longer talk about above and below, right. So if you have something like

this, then you can no longer say that this is above

or below this. It’s a cycle. But we are, at least, you know,

without the cycle, we can above and below makes sense. It makes sense to

say above and below. Okay? So, any questions on

the concept of liveness? Okay. So when we have live

in and live out registers, if we are doing local

register allocation, if we are doing local register

allocation, we will need to store all live out

registers, so store live out. So in this case you’ll need

to store the live outs, and you need to load

the live ins. So you store these. Why? Because in some, this

is when you are doing, when you are doing local

register allocation, when we do local

register allocation, we must store all live out

first at the end of the block and load all live in registers

at the start of each block. At the end of each block. So for each block, we will

find the live in and live out. How do we find the live

in and the live out? There is an analysis for

finding live in and live out. It’s not in today’s lecture

covering how we can compute the live in and live out. But it’s something that we will

be covering in a future lecture. Okay? So this is the, what

we need to do with local, with local register allocation. Which means that we

will have lots of loads and stores that may be avoided. If you can do global

register allocation, look at all basic blocks

at once, you may avoid, you may be able to avoid all of

these stores for the live out and these loads for

the live ins. Okay. So let’s look

at an example. So these are virtual

registers, you know. A, B, C, D and E are

virtual registers. Okay. Now let’s look at the

live in and live out sets. Okay, so for this

basic block, live in, there is nothing live in, so

live in, so we’re assuming that this is the

empty basic block. So live in here is going

to be the empty set. Okay? Now, what’s live out?>>A.>>Live out means something

defined here and used later. So –>>So it’s just A.>>Yeah, just A. Why? Because E is not used in any

of the subsequent blocks. So live out is going to be

A. Now what’s live in here? What’s live in?>>Nothing.>>It could be A.>>Yeah, still be A.

Still, A would be live because even though A is not

used in this basic block, it is still used below

this basic block. So there is a live

range like, you know, try to imagine the live range. So there is a live range

that is going this way. Okay. So this live range, this

is a possible bath in the code. So if this path in the code is

taken, A will still be live here because it’s used here. It’s not used in

this basic block, but it’s used in

a successor block. So, A will still be

live in in this case. Okay, so live in is

going to be A. Live in here will have A. Now,

what’s live out for this?>>A and D.>>Yeah, so, live out, A is live

because at this point it’s live. It’s defined here. And so A is used

below this point, and D is used below this point. So and A and D both

are live out. And same applies to this. Live out equals A and D. Okay. And these, in fact, in general,

the live in for this will be in general, what will be

the relation between live in for a load, for a basic block

like this and the live outs of its predecessor basic blocks.>>It’s equal.>>It’s a union of all –>>The union, exactly. Generally speaking, it’s

going to be the union. So this example, they

happen to be the same. So the union is going

to give the same set. But generally speaking, what

is live out, what is live in here is the union

of these live outs. So if we had a C here or an

F or a G, it would have added to the live out to

this, live in, sorry. Live in here equals

A and D. Okay? So now we know the

live in and live out. Now remember, now we are trying to do global register

allocation, so we are trying to avoid, avoid spilling

the live outs and loading the live ins. So when you do local, you spill

or you store the live outs and you load the live ins. Now with global register

allocation, we’re trying to avoid this. And we are trying to keep

these virtual registers in the same physical

register across basic blocks. So we would like to keep A

in the register and would like to keep it in [inaudible]

until it’s used here. Okay, so throughout this

entire live range of A, now this is the definition,

and this is the use, we would like to keep

it in a register. So before we look at the live

ranges here for the global, let’s look at live ranges in the local sense just

because it’s easier. Before we look at global

live ranges, let’s see, let’s make sure that we

understand local live ranges. So if we have things like in

the same basic block define, you know, define A, use

A, define B, use A, use B, define C, use A, use C.

If we have a sequence of defines and uses like this. The live range is going

to be, is going to start at the definition, and it

will end at the last use. So the live range for

A is going to be this. This is live range for A. The

live range for B is going to be, this is defined B, this is used

B, there is only one use of B, so this is live range

of B. And what about C? C is defined here and is used

here, so this is the live range of C. So within a basic block

or in straight line code, live ranges are actually

live intervals. So in straight line code, in straight line code,

or within a basic block, live ranges are live intervals, which means that

they are linear. Everything is linear. Now, now why are live

ranges interested for register allocation? Why do you think when we do

register allocation we care about live ranges?>>So you don’t,

because if you’re going to use a value later on,

later on in that block, you need to have it in a

register so it could be used.>>Okay.>>But if a value’s no longer

needed, it can be dumped, for a different value to

be used for something else.>>Okay, but how

does that notion of live range help us with this? So we’re trying to map virtual

registers to physical registers, and we have these live ranges. So, if we have two live

ranges that overlap, then what does this tell

us about the allocation to physical registers? So in this case, for example, the live ranges of

A and B overlap. So there is an overlap

between the two live ranges. This means that we –>>You need at least

two physical registers.>>Yeah, yeah, two physical

registers, they cannot be placed in the same physical registers. So the key idea is if

two live ranges overlap, if the live ranges of two

virtual registers overlap, you cannot put them in the

same physical register. Okay? If the live ranges are

two virtual registers overlap, you can’t put them in the same physical register. So in fact, this is often, the

compiler’s often represent this by a conflict graph or

an interference graph. So in this case, you know,

A conflicts with B overlaps with B, but does

A overlap with C?>>No.>>Yes. A overlaps with

C, but do C and B overlap?>>No.>>No. So B and C

do not overlap. So if we draw a conflict graph

like this, conflict graph, sometimes it’s called

conflict graph. Sometimes they call

it interference graph. Okay. Just another name, shows

conflicts are interferences. So here, A and B conflict. So there is an edge that

indicates that A and B conflict with each other and A and

C conflict with each other, but B and C do not

conflict with each other. Now in order to find, to

allocate physical registers for these virtual registers,

we can think of this problem as a graph coloring problem. So how many people are

familiar with graph coloring? Okay, so what does it mean? Yeah.>>The graph coloring problem

is like you have like a map and you don’t want two

colors to touch each other.>>Okay. So here we

don’t have a map. We have a graph. So in the context of a graph,

what does it mean to color it? Yeah. It is the same

idea, the same idea, where in the map you don’t

want, you know, two countries that are next to each other,

adjacent, you don’t want them to have the same color. You want different colors. So adjacent, so adjacency

in the context of a graph is represented

by edges. So A and B are adjacent. So we don’t want them

to have the same color. So we can say okay, this

is red, and this has to be a different color. This is blue. Now C is adjacent to A, right. But it’s not adjacent to b.

So adjacency in the context of graph, it means there

is an edge between them. The X and Y are adjacent. There’s an edge between them. So now C can be colored blue,

but you cannot color it red. Right. So this is the

graph coloring problem. So now in this case, one of

these colors will correspond to what in the context

of register allocation?>>It will correspond

to physical registry.>>Physical registers,

yeah exactly. So now, colors correspond

to physical registers. And these nodes are

virtual registers. And we would like to do this

coloring using the minimum number of colors or minimum

number of physical registers. Now in general, for

a general graph, the graph coloring

problem is NP complete. So graph coloring

is NP complete. [ Inaudible ] Okay. So which means that there

is no polynomial time algorithm, there is no known

polynomial time algorithm that will take an

arbitrary graph and color it optimally

using the minimum number of colors all the time. So it’s, you know, you

may find an algorithm that will give you the minimum

number of colors for say 99% of the instances, if

it’s a good algorithm. So it would give you the

minimum number of colors for 99% of the instances

but not for 100%. So there is no algorithm

that is guaranteed to give you the minimum

number of colors for every single instance that

you can input to that algorithm. Okay, so but there are

heuristics, there are heuristics for doing graph coloring. So there are reasonable

heuristics for doing graph coloring. Now let’s, let’s switch

to the global problem. So far we have been talking

about this local problem. We understand what

the live range means. We understand why we are

interested in the conflict between live ranges

or the overlapping or interference between

life ranges. And the relation between

register allocation and graph coloring,

how you can think of the register allocation

problem as a graph coloring problem. You want to color this graph

using the minimum number of colors. Now, what applies to local

register allocation applies to global register

allocation, but here, the notion of a live

range is more complex because it’s no longer linear. You know, our live ranges

here are no longer linear. For example, the live range

for A is going to split here. A is defined here, and it’s

live here and live here. So it’s going to split,

then it’s going to merge. So, a live range in the

global context, in the context of a control flow graph with multiple basic blocks

is not going to be linear. But still, we can talk

about live ranges, even in the context of global. So now, you know, for

example, the live range of A is going to be like this. So A is defined here, and this

is the last use of A, so A, you can think of the live

range of A as, it starts here, and then it goes through all of this basic block,

and it ends here. But this is not the only path.>>Yeah, it’s actually used in

that other range block too, A.>>Yeah, and it goes here,

and it goes through this. But it keeps going

because it’s used here. So this is, so this is what

the live range of A looks like. It’s not a line,

it’s not an interval. It’s not linear. It’s, you know, it

looks more like a web. You know, sometimes they

call the live ranges and the global context webs

because you have lines all over, and they split and then merge. So they don’t look like lines. So this is for A. Now

the live range of E, because E is not used in

any other basic block, so E is basically defined

and used locally, right. So its live range is going

to be an interval like this. This is the live range for

E. The live range for B, so B is defined here

and is used here. So B is another local live

range, which is an interval. For C, another local one. But for D, so D is defined

here, so its live range is going to be this, but there

is another live range, another line, which is this. Okay? So, what does this mean? It means that when we look for

the conflicts or the overlaps between live ranges

in the global context, we have to be more careful. So let’s see if we

can determine now which live ranges

interfere and which don’t. Because if we can draw

that interference graph, then we can think of the problem

as a graph coloring problem. So let’s construct

the interference gap. Let’s start with A. So with which live ranges

does A conflict? So, okay, A and B. Does

it conflict with B? Yeah, so A goes this way. So in fact, it’s

conflicting with everyone. So it’s conflicting

with B. It’s conflicting with C. It’s conflicting with

E. It’s conflicting with, even with D. So if you look at

the live range for A, it overlaps with all other live

ranges, because it’s very long, and it, and this is the last use

of K. So A is going to conflict with all, it’s going to conflict

with B. It’s going to conflict with C, D and E. It

conflicts with all of them. Okay. So this is, you know, this part of constructing the

conflict graph is something that has to be done carefully,

because it’s kind of tricky. Okay? So we know

the interferences of A. Now, what about B? So maybe we should

ask the question, with which other live

ranges does B conflict?>>D.>>It conflicts with D.

Does it conflict with A? Yes, it already conflicts

with it. So B conflicts with D. Okay. So if we look at C, so

here’s the live range for C, with which –>>Just A.>>Yeah, it only conflicts

with A, so it’s already, we already have an edge from

C to A. Now, what about D? Just A and B.>>Well, it conflicts with

A, and it conflicts with B, but both of these conflicts

have already been captured in the graph. Now what about E?>>Just A.>>It only conflicts

with A. So this is going to be our conflict graph. Now looking at this

conflict graph, of course, you know with simple instances

like this, we can easily figure out the optimal solution, right. So we only have five nodes, so

by inspection, we can figure out the optimal solution here,

the minimum number of colors. What’s the minimum number

of colors that we need here?>>Three.>>Yeah. Why is it three?>>Because there’s

three connected nodes.>>Yeah, exactly. Because A, B and D, so

these are connected. So these are all, not only

connected, but each one of them is connected

to the two others. So this subgraph in red,

what do we call this?>>That’s a cycle.>>No, and when a number of nodes are all

connected to each other –>>A complete graph.>>Well, it’s a complete

subgraph. So what do we call it? What’s the name for

a complete subgraph? So have you heard of clique? It’s a click. So this is a clique

of size three. So clique, it’s a

subset of the nodes that are all connected together. They form a subgraph,

a complete subgraph. So this is clique

of a size three, a clique that includes

three nodes, this means that you cannot do

it with less than three colors. So, at best, you will

need three colors. So, if in this case, if the target machine has three

physical registers or more, we don’t have a problem. We can color this. You’re assuming that you know, the graph coloring

algorithm is going to discover the best coloring. And there are good heuristics

for doing graph coloring. They are not perfect,

because it’s NP complete. The problem is NP complete. But there are good heuristics that will discover the

optimal solution most of the time, for most instances. So assuming that we

have a good heuristic and we have three physical

registers, we have no problem. We can color them, so we can

say, for example, that this is, you know, this is

blue, and this is red. And this is maybe green. Okay. And see, you can

give it any color but blue, so it can color it red. And E, you color it

any color but blue, so it can color it red. So this is if you have

three physical registers. What if you have two now? What if you have two

physical registers? So what will make sense to do if

we have two physical registers? Well, in this case,

in this case, having two physical registers

means that you cannot, there is no way you can

allocate a physical register for every live range. So you have so many conflicts that you cannot keep each live

range in a physical register. Now, you know, before

we move to this, remember that if we have

three physical registers here and we manage to, you

know, to map each one to a physical register, now

we have this A, we will put it in one physical register,

and it can keep it through all basic blocks. So unlike local register

allocation, you will no longer need to do

your stores here and loads here. Okay. So if you have

three physical registers, you can keep A in a physical

register, and you don’t need to store A here and

load here and here. Okay, now if we have two, we have to victimize

one live range. And the live range that we will

victimize is going to be one of three live ranges, all right, the ones that have

this big conflicts. So these three cannot be kept in the same register

at the same time. So we need to victimize

one of them.>>D.>>Now how do, what will

make sense to consider when selecting a live range

to victimize or to spill? To make a spilling decision. Well, compilers will

look at how, you know, the cost of this spill. Well, compilers will

look at how, you know, the cost of this spill. So how much, they will

estimate the spill cost. What’s this spill cost? The spill cost is how

much it will cost me to spill this live range. And spilling a live range

means you insert a store after every definition, and you

insert a load before every use. So spilling, spilling

a live range means inserting a store

after every definition. Spilling live range of X,

spilling the live range of X means inserting a store

after every definition of X and a load before every use of

X. So in this case, for example, if we choose to spill A,

spilling A, we haven’t decided to spill A. But just assuming

A will be the register that will end up

getting spilled, we will put a store here,

store A, after the definition. And this is the use of,

no, there is no A here. It is a use of A here, so

what do we insert here? Load. Load A. And what else?>>There’s one more use.>>Yeah, there is a use. So what do we insert?>>Load.>>Load. Load A. So,

what’s spilling means is that we decided, we haven’t

decided this for A yet, but if we decide, if we end up

spilling A, we will put a store after A, after every definition,

so that we store it in memory. And we put a load

before every use. What does it mean? It means that we

spilled it to memory. It’s no longer in the register. And A is no longer

in the register. And it means that after

we are done with it, we just put it in memory. And whenever we need

it, we have to load it or fetch it from memory. Okay, so this is

what spilling means? Now, given this definition

of spilling or how spilling is

implemented, will it make sense to spill the register that has

lots of definitions and uses or the register that has fewer?>>Fewer.>>Yeah, the one that has fewer. Because the one that

has fewer definitions and uses will require

fewer loads and stores. Our objective is to minimize

the number of loads and stores that we will be inserting

into the code. So we prefer to spill

the register that has fewer defs and uses. But the number of defs and uses

is not the only consideration. We also have to take

into account what? Well, not all basic

blocks are going to get executed the

same number of times. Because some basic blocks

may be inside loops. Some basic blocks

may be outside loops. So if a register is used

only once in a basic block that is inside a deeply

nested loop, then that load that we will be inserting, that load will get executed

a large number of times. So if there is a block and

this block is inside of a loop, and we just put one load here, and this loop gets

executed a million times, then this load will get

executed a million times. So, we’ll have to take

into account the frequency of execution of the

loops, if we know that. Of course, you know, compile

time, remember that this is, all of this is done

at compile time. So the compiler doesn’t know

precisely what, you know, how many times each basic

block will get executed. Because not all basic blocks,

not all loop, trip counts or loop, the number of

iterations of the loops, not all of them are known

at the compile time. So you can have something

like this 4I equals zero, I less than X, I plus,

plus, and you have a loop. And then read X. So X is read

from the input or from a file. So if X is read from the screen

or from a file or whatever, at compile time when the

compiler is looking at this, it doesn’t know how many times

this loop will get executed, right. Yeah. So in general, the

compiler doesn’t know at compile time how many

times a loop gets executed. It doesn’t know how many times

a basic block will get executed, but usually it comes

up with some estimates. And it can distinguished

between basic blocks that are inside loops and basic

blocks that are outside loops. So if there is a, you know,

if there is code in here, basic block here, then we

know that it’s inside the loop and most likely it will get

executed more frequently than other basic blocks.>>So it gets a higher

priority for register hours.>>Yeah, exactly. So we try not to spill, we

try not to spill inside loops. We try to spill outside loops. And again spilling, you know,

adding these loads and stores. We try to do it outside

loops, not inside loops. But the point that I’m trying

to make here is at compile time, the compiler doesn’t

have precise information. It may come up with

some estimates. So in fact, register

allocation would be based on these estimates. So if the compiler comes

up with, for example, this loop will get, this basic

block will get executed once, and then it will split. So this is going to be half,

we’ll have a weight of a half. And this will have

a weight of a half. But this is inside of a loop. So, this is going to get

executed ten times, for example. So, the register allocator

is going to do the analysis, the analysis based on

the execution frequency with these estimates

that are, like I said, usually are not accurate. Now given these estimates, which register do you

think it will be spilling?>>D.>>Yeah. Well no, D is here. So D is here. So A and D are here.>>B.>>Yeah, so B because B is the

one that is outside the loop. So if it spills D, sorry,

B, B is only in this loop. And it will insert, you

know, a store here, store B, and here it will insert what?>>Load.>>Load B.>>Oh, okay.>>Okay. So in this case, it’s

going to split this live range. So the live range of

B is this and this. So now if it does this, B

will no longer conflict with, will it still conflict with D?>>Yeah D.>>It will still conflict

with D and A. So in fact, what this means is that the

compiler, with this algorithm, we will identify the live

range that is least expensive to spill, which is in

this case, B. We spill it, and then we try again. We construct the

conflict graph again, and we try to color again. This may or may not help. So the spilling may

or may not help. May or may not make

things easier. Things may continue to be hard. And in this case,

it didn’t help. So if it doesn’t help, it

will make another decision. Okay. It will, it

will try now to take, to spill the next cheapest

live range to spill. And now it’s not obvious,

so we’ll, you know, we have to do some calculations. In fact in general,

it’s, you know, one way of calculating this is

to count the number of loads and the number of stores and

to multiply each load or store with the execution

frequency of that basic block. And that will give you

an accurate estimate of the spill cost. And since we are running

out of time, you now, we will not get the chance

to do this in detail. But if you do this, if

you count how many stores and loads you will need, and

then you multiply each one by the execution frequency, you

will get a higher cost for A because A has more

definitions and uses than D.