Branches and Binary Operators¶
Let’s add branches and binops!
Branches¶
data Expr =
-- ...
| EIf Expr Expr Expr
Examples¶
Treat 0 as false, non-zero as true
if 10:
22
else:
sub1(0)
-- 22
if sub1(1):
22
else:
sub1(0)
-- -1
Quiz: https://tiny.cc/cse110a-if3-ind -> 0
Assembly Control Flow¶
To compile branches, we will use assembly labels - landmarks from which execution can be started from or skipped to
- Also, comparisons:
cmp a1, a2
performs a numeric comparison between the 2 args
stores the result in a special processor flag
- And jumps:
jmp LABEL
,je LABEL
,jne LABEL
uses the result of the flag set by the most recent
cmp
Quiz: https://tiny.cc/cse110a-control-ind -> A
Example:
if 10:
22
else:
33
-- compiles to
mov eax, 10
cmp eax, 0
je if_false
if_true:
mov eax, 22
jmp if_exit
if_false:
mov eax, 33
if_exit:
Strategy¶
to compile:
if eCond:
eThen
else:
eElse
we will:
compile eCond
compare the result in eax to 0
jump if the result is zero to a special false label
otherwise continue into eTrue, then jump to exit label
What about multiple if-statements?
We need a way to generate unique labels for each branch, since assembly can’t reuse labels
Types¶
data Expr a =
-- ...
| EIf (Expr a) (Expr a) (Expr a) a
We add tags of type a
for each subexpression (e.g. source-position info for error messages)
Let’s define a name for Tag
:
type Tag = Int
and we now use:
type BareE = Expr ()
type TagE = Expr Tag
Now, extend the assembly:
data Label
= BranchFalse Tag
| BranchExit Tag
data Instruction
= ...
| ICmp Arg Arg
| ILabel Label -- Create a label
| IJmp Label -- Jump always
| IJe Label -- Jump if equal
| IJne Label -- Jump if not-equal
Transforms¶
Tags¶
We don’t want the programmer to put in tags, so let’s add a tag step to our pipeline!
> let e = parseStr "if 1: 22 else: 33"
> e
If (Number 1 ()) (Number 22 ()) (Number 33 ()) ()
> label e
If (Number 1 ((),0)) (Number 22 ((),1)) (Number 33 ((),2)) ((),3)
- The key work is done by
doTag i e
: recursively walk over the
BareE
(e) starting tagging at counter ireturn a pair
(i', e')
of the updated counter and tagged expr
Quiz: https://tiny.cc/cse110a-tag-ind -> D
We can now tag the entire program by calling doTag with the initial counter, and throwing away the final counter.
CodeGen¶
compile env (If eCond eTrue eFalse i)
= compile env eCond ++ -- start with the input
[ICmp (Reg EAX) (Const 0), -- test if it's false
IJe (BranchFalse i)] ++ -- if it is, jump to the false label
compile env eTrue ++ -- otherwise keep doing true
[IJmp (BranchExit i)] ++ -- then leave
[ILabel (BranchFalse i)] ++ -- the false label
compile env eFalse ++ -- do the false
[ILabel (BranchExit i)] -- the leave label
Binary Operations¶
Quiz: http://tiny.cc/cse110a-sub-ind -> B
Examples¶
Numbers are easy:
-- 33 - 10
mov eax, 33
sub eax, 10
-- 2 * 2
mov eax, 2
mul eax, 2
If either operand is a number, just use it as a literal
But what about:
let x = 10,
y = 20,
z = 30
in
x + (y * z)
-- ==
let x = 10,
y = 20,
z = 30,
tmp = y * z
in
x + tmp
-- ==
mov eax, 10
mov [ebp-4*1], eax -- x = 10
mov eax, 20
mov [ebp-4*2], eax -- y = 20
mov eax, 30
mov [ebp-4*3], eax -- z = 30
mov eax, [ebp-4*2] -- y
mul eax, [ebp-4*3] -- y * z
mov [ebp-4*4], eax -- tmp = y * z
mov eax, [ebp-4*1] -- x
add eax, [ebp-4*4] -- x + tmp
So, if either operand is a variable, just grab it from the stack
Quiz: https://tiny.cc/cse110a-nest-ind -> C
But then what about (1 + 2) * (3 + 4)
?
Both 1
and x
(numbers and variables) are immediate operations - the operand values don’t require any computation
Administrative Normal Form (ANF)¶
An expression is in ANF if all primitive operations have immediate arguments
Quiz: https://tiny.cc/cse110a-anf-ind -> C
Not in ANF: (1 + 2) * (3 + 4)
in ANF:
let t1 = 1 + 2,
t2 = 3 + 4
in
t1 * t2
Strategy¶
We can convert any expression to ANF by adding temporary variables for subexpressions, so..
we add a Normalize step to our code pipeline to make things ANF before we get to codegen
Text =Parse> AST =Norm> ANF =Tag> ANF-Tag =CodeGen> ASM
Types¶
data Prim2 = -- called prim2 since it takes 2 args
Plus | Minus | Times
data Expr a =
...
| Prim2 Prim2 (Expr a) (Expr a) a
-- so, 2+3 becomes
Prim2 Plus (Number 2 ()) (Number 3 ()) ()
-- assembly:
data Instruction =
...
| IAdd Arg Arg
| ISub Arg Arg
| IMul Arg Arg
ANF¶
We can define a separate type for ANF, but that’s boring
so let’s write a function that describes immediate code:
isImm :: Expr a -> Bool
isImm (Number _ _) = True
isImm (Var _ _) = True
isImm _ = False
-- and isANF:
isAnf :: Expr a -> Bool
isAnf (Number _ _) = True
isAnf (Var _ _) = True
isAnf (Prim2 _ e1 e2 _) = isImm e1 && isImm e2
isAnf (If e1 e2 e3 _) = isAnf e2 && isAnf e3
isAnf (Let x e1 e2 _) = isAnf e2
Now, we define an ANF expression any Expr
s.t. isAnf expr -> True
.
type BareE = Expr ()
type AnfE = Expr ()
type AnfTagE = Expr Tag
type ImmTagE = Expr Tag
Compiling¶
ANF -> ASM¶
Going from AnfTagE
to Asm
is easy, since both operands for every binop will be an immediate.
compile :: Env -> TagE -> Asm
compile env (Prim2 o v1 v2)
= [ IMov (Reg EAX) (immArg env v1),
(prim2 o) (Reg EAX) (immArg env v2) ]
prim2 :: Prim2 -> Arg -> Arg -> Instruction
prim2 Plus = IAdd
prim2 Minus = ISub
prim2 Times = IMul
immArg :: Env -> ImmTag -> Arg
immArg _ (Number n _) = Const n
immArg env (Var x _) = RegOffset ESP i
where
i = fromMaybe err (lookup x env)
err = error "unbound variable"
Quiz: https://tiny.cc/cse110a-anf2-ind -> E
Bare -> ANF¶
AKA A-Normalization
The base cases are easy:
anf (Number n) = Number n
anf (Var x) = Var x
But…
Note
Example 1: 1 + 2 + 3
needs to become:
let t1 = 1 + 2
in
t1 + 3
Example 2: ((1 + 2) + 3) + 4
becomes
let t1 = 1 + 2,
t2 = t1 + 3
in
t2 + 4
Example 3: ((1 + 2) + 3) + ((4 + 5) + 6)
becomes
let t1 = 1 + 2,
t2 = t1 + 3,
t3 = 4 + 5,
t4 = t3 + 6
in
t2 + t4
- So, let’s write a helper function
imm :: BareE -> ([(Id, AnfE)], ImmE)
, which returns: - a pair containing:
a list of pairs of
ti, ai
containing new temp vars bound to ANF exprsan immediate value
v
such that:
let t1 = a1,
...
in
v
- Then, we can:
invoke
imm
on both the operandsconcat the let bindings
apply the binop to the immediates
anf (Prim2 o e1 e2) = lets (b1s ++ b2s) (Prim2 o (Var v1) (Var v2))
where
(b1s, v1) = imm e1
(b2s, v2) = imm e2
lets :: [(Id, AnfE)] -> AnfE -> AnfE
lets [] e = e
lets ((x,e):bs) e' = Let x e (lets bs e')
-- and lets and ifs are just recursive
anf (Let x e1 e2) = Let x e1' e2'
where
e1' = anf e1
e2' = anf e2
anf (If e1 e2 e3) = If e1' e2' e3'
where
e1' = anf e1
e2' = anf e2
e3' = anf e3
-- imm
imm :: BareE -> ([(Id, AnfE)], ImmE)
imm (Number n l) = ( [], Number n l )
imm (Id x l) = ( [], Id x l )
imm (Prim2 o e1 e2) = ( b1s ++ b2s ++ [(t, Prim2 o v1 v2)]
, Id t )
where
t = makeFreshVar ()
(b1s, v1) = imm e1
(b2s, v2) = imm e2
imm e@(If _ _ _) = immExp e
imm e@(Let _ _ _) = immExp e
immExp :: AnfE -> ([(Id, AnfE)], ImmE)
immExp e = ([(t, e')], t)
where
e' = anf e
t = makeFreshVar ()
So what’s up about makeFreshVar
?
Well… we’re going to have to pass around a counter.
anf :: Int -> BareE -> (Int, AnfE)
imm :: Int -> AnfE -> (Int, [(Id, AnfE)], ImmE)
-- code not included. Check out the slides, page 20.
fresh :: Int -> (Id, Int)
fresh n = (n+1, "t" ++ show n)