Basic optimizations#

Important

This chapter is in a draft status. Mistakes are possible. Comments and clarifications are particularly welcome.

This section explains basic optimization techniques for the language of Expression Programs.

The auxiliary files for this section are available as a download basic-opt-demo.zip.

Intermediate language#

As a first stepping stone we introduce an intermediate language of simplified expression programs, with the following definitions.

(* sprog.ml *)
type binop = Eprog.binop
(* type binop = | Add | Sub | Mul | Div *)
type varname = Eprog.varname

type operand = Lit of int 
             | Var of varname 

type sstmt = 
   | ValBinop of varname * binop * operand * operand 
   | Input of varname

type sprog = sstmt list * operand  

An important characterization of this intermediate language is that the program representation is flattened. There are no tree-like structures to be found in here. In particular, check the declaration of the type operand that we now use in the right-hand-side of the statements and the return position.

Flattening of expressions to trees is a relatively standard transformamtion in compilers. In the auxiliary material this functionality is implemented in module eprogToSprogDestDriven.ml. We explain a few key ingredients of this module below.

Fresh name generation in OCaml#

An important ingredient of many compiler translations is generation of fresh names. Consider the code below.

let fresh_variable, current_counter_value = 
  let ctr = ref 0 in 
  ( fun x -> let c = !ctr in ctr := c + 1; x ^ (string_of_int c))
  , fun () -> !ctr

There is quite a bit packed in this short snippet. The top-level let binding declares a tuple (in many places OCaml syntax allows omitting the tuple-surrounding parenthesis for clarity) of two values fresh_variable and current_counter_values. These values are functions bound to the values fun x -> ... and fun () -> ... respectively. Both of these functions have access to the counter ctr that is declared in the scope of the let statement as a reference with the initial value set to 0. This scoped declaration implements encapsulation of the counter implementation. For outer code, there is no access to the raw value of the counter other than through those functions above. For example, the above snippet exposes no ways of resetting the counter value.

Before reading further, check your understanding of this code. What type do these functions have?

Constant and copy propagation, the need for an environment#

The shape of our target language introduces one subtlety. Consider the source-level program Example.example_program_1.

let example_program_1: eprog = (
      [Val ("a", Int 5);
       Val ("b", Var "a")],
      Var "b"
)

corresponding (in an imaginary source syntax) to something like

let a = 5 
let b = a 
in b

Now, observe that the language of sprog has no facilities for representing a non-computational move instruction. Neither of the assignments to a and b can be represented! It is then the job of this translation to do an on-the-fly substitution for such moves.

For the example above, the translation is

return 5

This already implements two forms of optimization known as constant propagation and copy propagation. To achieve this, we need some notion of environment. For simplicity, we let the environment type be just an associative list for association of variable names to operands.

type t_env = (varname * operand) list

Flattening translation of expressions#

The main workhorse of flattening of expressions is the function tr_expr defined as follows. This function bakes-in a simplifiying assumption that there must be a way to ensure that the names generated by the fresh name generator may never (!) coincide with the names used in the program source.

(* Simplified translation that assumes there is a way to avoid name clashing
   with source defined variables not starting with a $ symbol. *)
let rec tr_expr env (dest:varname option) (e:Eprog.expr) = 
  match e with 
    Int x -> [], Lit x 
  | BinOp (op, e1, e2) -> 
      let i1, o1 = tr_expr env None e1 in  
      let i2, o2 = tr_expr env None e2 in 
      let x = 
          match dest with 
            None -> fresh_variable "$t"
          | Some y -> y in
      i1 @ i2 @ [ ValBinop (x, op, o1, o2)], Var x
  | Eprog.Var x -> []
          match List.assoc_opt x env with 
            None -> Var x 
          | Some o -> o  

The type of tr_expr is t_env -> varname option -> Eprog.expr -> sstmt list * operand. The first argument is the environment we introduced above. The second argument is the optional destination that we discuss below. The third argument is the expression to translate. This function returns the list of statements that correspond to the body of the expression (remember it may be rather deep) and the operand where the result is to be found.

Let us review the three outer cases that we match on.

  1. When the source-level expression is just an integer literal, e.g., Int 5, there are no statements that correspond to it, and so the result of this case is an empty list, paired with the literal in the target language Lit 5.

  2. When the source-level expression is a complex binop, we recursively translate the subexpression.

    Here, let us revisit the argument dest. It specifies what variable to use for storing the result of complex binops. It is optional because for the subexpressions there is no expected source-level destination, and in that case we use a fresh name.

    The final list of instruction we return is the concatenation of what is returned from the recursive calls, together with one more instruction implementing the binop of the current expression. Note how we consult the destination with a match, and propagate the the resulting operand back to the caller through the Var operand constructor.

  3. The varname case is the one place where the environment is accessed. We check if the variable is associated with an operand in the the environment and use it – this is the on-the-fly substitution alluded to earlier. If there is no association (this is possible, if the variable is declared by an input statement – see code below), we keep the variable name.

This idiom of passing over the destination by the caller is an instance of a more general technique known as destination-driven code generation.

Translation of statements#

Translation of statements has type t_env -> Eprog.estmt -> t_env * Sprog.sstmt list

This return value is a tuple of an environment that is updated and a list of sprog statements.

let tr_stmt env = 
  let open Eprog in function 
    Val (x,e) -> 
        let i, o = tr_expr env (Some x) e 
        in (x, o)::env, i
  | Input x -> 
        env, [Sprog.Input x]

We consider each of the outer matches in detail.

  1. For variables, we call into the tr_exp defined above, explicitly providing the destination. Here i is the returned list of instructions, and o is the operand. The returned environment extends the original with the entry (x,o).

  2. Inputs are simply translated to inputs. We do not extend the environment, because of how it tr_exp handles this.

Because expressions now create a list of statements, note that the result type of this function is a list.

Translation of the whole program#

Translation of the whole program is done as follows. We use the built-in List.fold_left_map that combines both map and fold operations, and remember to translate the return expression. The last concatenation is a bit clunky, but it is necessary.

let eprog_to_sprog (stmts, e) = 
  let env', stmts' = List.fold_left_map tr_stmt [] stmts in
  let e_insts, e_op = tr_expr env' None e in 
  ((List.concat stmts') @ e_insts), e_op

Alternative translation#

We also include a module eprogToSprog.ml that uses more efficient environments and slightly different traversal structure. We leave the examination of the source code of this module to you as an exercise.

Optimization driver and the main function#

We organize our optimizations using a simple loop.

(* opt.ml *)
let rec opt sprog = 
  let s_opt = 
    sprog 
    |> ConstFold.const_fold_sprog 
    |> CSE.cse
    |> DCE.dce 
  in if s_opt = sprog then sprog 
  else opt s_opt 

The function opt repeatedly applies the sequence of different optimizations, until the program representation no longer changes. The main function of the whole program lives in file main.ml.

let _ = 
  let s1 = EprogToSprogDestDriven.eprog_to_sprog 
           Examples.example_program_1 in
  (* print the program after translating to sprog 
     representation before any optimizations *)
  let _ = Printf.printf "%s\n" (Sprog.string_of_sprog s1) in  
  let s2 = Opt.opt s1 in
  Printf.printf "-----------\n";
  (* final program after optimization *)
  Printf.printf "%s\n" (Sprog.string_of_sprog s2) 

We encourage you to modify the main function above as you wish, for example, plugging in the pretty printer for the source programs that you have developed as part of your assignment.

Constant folding#

Optimization driver, with extra debugging#

We start by tweaking our top-level optimization function so that we (1) turning off other optimizations, and (2) print out the updated programs after each iteration of opt.

(* opt.ml, tweaked temporarily *)
let rec opt sprog = 
  let s_opt = ConstFoldSimpl.const_fold_0 sprog 
  in if s_opt = sprog then sprog 
  else 
    let _ = Printf.printf "--\n" in
    let _ = Printf.printf "%s\n" (Sprog.string_of_sprog s_opt) in  
    opt s_opt

We start with a version that does nothing

let const_fold_0 x = x

Consider example 2

let val a = 5
let val b = 5
in a + b

Running our implementation on this example demonstrates the flattening (and the associated constant/copy propagation). But afterwards, the program does not change.

$t0 = 5 + 5;
return $t0
-----------
$t0 = 5 + 5;
return $t0

Let us now implement the first version of this optimization

let subst_oper env = function 
   Lit n -> Lit n
 | Var x  -> 
      match List.assoc_opt x env with 
        None -> Var x 
      | Some op -> op
      
let f_op = let open Eprog in function 
    Add -> ( + ) | Sub -> ( - ) | Mul -> ( * ) | Div -> ( / )

let const_fold_stmt_1 env stmt = 
   match stmt with 
   | ValBinop (x, op, Lit a, Lit b) -> 
        (x, Lit ((f_op op) a b)) :: env,
        stmt (* keep the statement, even though it's dead code *)
   | ValBinop (x, op, o1, o2) -> 
        env, 
        ValBinop (x, op, subst_oper env o1, subst_oper env o2)
   | Input _ -> env, stmt

let const_fold_stmt_1 (stmts, e) = 
   let env', stmts'  = List.fold_left_map const_fold_stmt_1 [] stmts in 
   let e' = subst_oper env' e in 
   (stmts', e')

The overall structure should by now look familiar.

  1. Just like before, we use an environment for copy/constant propagation. We stick to a list-based environment in this simple version.

  2. Function const_fold_stmt_1 distinguishes three cases.

    1. The case when the binop is over two constants. In this case, we extend our environment with statically precomputed result of the binop. We keep the statement around, even though we expect it to be dead code from now on (Why?!)

    2. The general binop case. This means that one of the operands must be a variable. We proceed to propagate the substitution of the operands; leaving the environment unchanged.

    3. The input case that does changes neither the program nor the environment.

Let’s run this on example 2. We get the following

$t0 = 5 + 5;
return $t0
--
return 10
-----------
return 10

This works as expected. Let’s try another test asg1_task4.

$t0 = 10 + 49;
$t1 = 6 / $t0;
$t2 = $t1 + 10;
$t3 = 70 * 77;
$t4 = 12 / 9;
$t5 = $t3 - $t4;
$t6 = $t5 + 5;
$t7 = $t2 * $t6;
return $t7
--
$t1 = 6 / 59;
$t2 = $t1 + 10;
$t5 = 5390 - 1;
$t6 = $t5 + 5;
$t7 = $t2 * $t6;
return $t7
--
$t2 = 0 + 10;
$t6 = 5389 + 5;
$t7 = $t2 * $t6;
return $t7
--
$t7 = 10 * 5394;
return $t7
--
return 53940
-----------
return 53940

So far so good. There are opportunities to do more rewriting for when the value of variables is not known, but algebraic laws allow us to do simplifications, e.g., in let y = 0 + x . The full implementation of constant folding in constFold.ml does this, do check it out!

Dead-code elimination#

Implementation of dead-code elimination is in the module DCE.ml

Common subexpression elimination#

Implementaiton of common subexpression elimination is in the module CSE.ml