Exercises for Week 6#

Exercise overview#

In this exercise we programmatically generate LLVM code for the following function (in pseudo-code)

var x = read_integer () 
if x >= 0 then
   x = x + 1 
else
   x = x - 1
print_integer x

We start off by studying example final LLVM code that we may want to generate. We use alloca for allocating space on the stack for variable x, and note that we do not use phi instructions here.

declare i64 @read_integer()
declare void @print_integer(i64) 
define void @dolphin_main () {
 %x0 = alloca i64
 %t1 = call i64 @read_integer ()
 store i64 %t1, i64* %x0
 %t2 = icmp sge i64 %t1, 0
 br i1 %t2, label %then3, label %else4
then3:
 %t6 = load i64, i64* %x0
 %t7 = add i64 %t6, 1
 store i64 %t7, i64* %x0
 br label %merge5
else4:
 %t8 = load i64, i64* %x0
 %t9 = sub i64 %t8, 1
 store i64 %t9, i64* %x0
 br label %merge5
merge5:
 %t10 = load i64, i64* %x0
 call void @print_integer (i64 %t10)
 ret void
}

Save this in prog.ll and compile together with the runtime library provided for Assignment 4.

clang prog.ll runtime.c

Run and ensure that the program behaves as expected.

Skeleton#

One particular aspect of the CFG Builder library is that the CFG construction is non-destructive. All the helper functions there return a new version of the CFG Builder instead of modifying it in-place. Moreover, much of the code generation does not depend on the state of the CFG builder but is generally about extending it. For this reason, we use the notion of CFG transformers, that we dub buildlets, and that have type cfgbuilder -> cfgbuilder so that they can be conveniently combined.

We exemplify the usage of buildlets in the rest of the exercise. We start off with the following skeleton

module Sym = Symbol
open Ll
let symbol = Sym.symbol 

let fresh_symbol =
  let c = ref 0 in
  fun initial ->
    let n = !c in c := n + 1; symbol (initial ^ (string_of_int n))


exception NotImplemented
(* string -> insn -> CfgBuilder.buildlet * operand *)
let add_instruction_with_fresh s i = 
  raise NotImplemented


(* bop -> operand -> CfgBuilder.buildlet *)
let change_by_one_buildlet op loc = 
  let b_load, load_op = add_instruction_with_fresh "t" (Load (I64, loc)) in
  let b_binop, op = add_instruction_with_fresh "t" (Binop (op, I64, load_op, IConst64 1L)) in 
  let b_save = CfgBuilder.add_insn (None, Store (I64, op, loc)) in 
  CfgBuilder.seq_buildlets [ b_load; b_binop; b_save] 

(* CfgBuilder.buildlet *)
let exercise_buildlet = 


(* let b_update_add = change_by_one_buildlet Add alloca_op in  *)
(* ... *)
(* let b_update_sub = change_by_one_buildlet Sub alloca_op in  *)


(* CfgBuilder.seq_buildlets [ b_update_add; ... ; b_update_sub ] *)
  CfgBuilder.term_block (Ret (Void, None)) (* replace this with your code *)


let p : prog = 
  let b = exercise_buildlet CfgBuilder.empty_cfg_builder in 
  let cfg = CfgBuilder.get_cfg b in 
  let f = { fty = ([], Void); param = []; cfg = cfg} in 
  {
  tdecls = [];
  extgdecls = [];
  gdecls = [];
  extfuns = [ (symbol "print_integer",  ([I64], Void))
            ; (symbol "read_integer", ([], I64))];  
  fdecls = [ (symbol "dolphin_main", f )]
}

let _ = Printf.printf "%s" (Ll.string_of_prog p)

Building parts#

A common task in code generation is to generate a fresh name for some instruction. To help us with this, we start off by writing an auxiliary function add_instruction_with_fresh that takes a string and an instruction, and returns a pair of two things:

  • a buildlet function that corresponds to the instruction

  • a freshly generated new name that is used by the instruction and can be used by the caller of this function

We exemplify the usage of this function in change_by_one_buildlet that takes a binary operation, a memory location (presumably a result of alloca or some other memory allocation), and does the following:

  • it loads the value from memory

  • performs the binop

  • saves the result in-place

It also returns a builder that is a sequence of the underlying buildlet combinators. Do check out the definition of CfgBulider.seq_buildlets and make sure you understand it.

Putting it all together#

Write the main body of the example. Implement the function exercise_buildlet. For the bodies of the ‘then’ and ‘else’ branches, do make use of the helper function change_by_one_buildlet. Once done, use the generated LLVM code and test your solution.