4. Dolphin – Phase 1#
Attention
This is a group assignment. You should have formed groups by now. If not, please do so as soon as possible. There will an announcement on BrightSpace explaining how you should register your groups on BrightSpace.
4.1. Assignment overview#
This assignment covers translating a fragment of Dolphin language, in the AST form, into LLVM--
.
There are 3 tasks and 3 questions in this assignment.
Questions marked for glory are optional.
Do them only if you feel ambitious and have the time.
4.1.1. What you need to get started#
For
LLVM--
: make sure you understand everything from the lectures onLLVM--
. That is, everything in theLLVM--
chapter except for aggregate and named types, GEP, and phi nodes.For semantic analysis: this will be covered in lectures.
For programmatic
LLVM--
code generation: this is partly covered in TA classes for week 5; and also will be complemented in lectures. This part, in spirit, resembles programmatic generation of x86 code we hand in assignment 2; the constant/copy propagation technique that you will need to use for conversion to LLVM is also covered in the notes on basic optimizations.For OCaml, the assignment uses OCaml records which are covered in Section 3.4 of the OCaml book.
Important
You will need files symbol.ml
, symbol.mli
, cfgBuilder.ml
, cfgBuilder.mli
, and ll.ml
as they have been released in the TA classes.
4.1.2. What you need to hand in#
Please hand in a .zip
file containing the following
A brief report documenting your solution. Acceptable report formats are
.pdf
,.rtf
, and.md
. For each task and question, briefly (1 – 4 sentences) describe your implementation or answer. Write concisely.All the source files needed to reproduce your solution. This also includes the
C
code provided. Please explain in your report how the solution could be reproduced, e.g., callingmake
(if you have made aMakefile
), the command line to callclang
, etc.
Note
If you are answering any of the glory questions, please submit them as separate solutions, i.e., in a separate, clearly marked subfolder in the .zip
file of your submission.
Important
Make sure to understand all the code you hand in, including what is copied from here. (The code for pretty printing (typed) ASTs is an exception here; see the appendix below.)
4.2. The Abstract Syntax Tree (AST) of Dolphin (phase 1)#
In this phase, a Dolphin program is simply a sequence of statements. Intuitively, this is to be understood as the body of the main function.
The OCaml types for the AST describing programs is given below:
(* -- Use this in your solution without modifications *)
type ident = Ident of {name : string;}
type typ = | Int | Bool
type binop =
| Plus | Minus | Mul | Div | Rem | Lt | Le | Gt | Ge | Lor | Land | Eq | NEq
type unop = | Neg | Lnot
type expr =
| Integer of {int : int64}
| Boolean of {bool : bool}
| BinOp of {left : expr; op : binop; right : expr}
| UnOp of {op : unop; operand : expr}
| Lval of lval
| Assignment of {lvl : lval; rhs : expr;}
| Call of {fname : ident; args : expr list}
and lval =
| Var of ident
type statement =
| VarDeclStm of {name : ident; tp : typ option; body : expr}
| ExprStm of {expr : expr option}
| IfThenElseStm of {cond : expr; thbr : statement; elbro : statement option}
| CompoundStm of {stms : statement list}
| ReturnStm of {ret : expr}
type program = statement list
Action item
Put the AST declarations above in a module called Ast
.
That is, make a file called ast.ml
and copy the code above into it.
Do not change the code above.
The types ident
(program identifiers), typ
(types), binop
(binary operators), and unop
(unary operations) are self-explanatory.
Expressions consist of the following (in the order they appear in type expr
above):
Integer literals
Booleans literals
Binary operations consisting of the a left operand, a binary operation, and a right operand
Unary operations consisting of a unary operation and an operand
L-values, i.e., values that can appear on the left-hand-side of an assignment expression (see below), in our fragment the only possible l-values are variables
Assignments consisting of an l-value, and right-hand-side expression assigned to the l-value in question
Function call consisting of a function name (an identifier), and a list of arguments. Note: even though the phase-1 fragment of Dolphin has no function definitions, programs do have access to a very limited standard library, customised for this phase.
Statements consist of the following:
Variable declarations consisting of a name (identifier), optionally a type, and a body (an initialization expression). The optionality of the type annotation for variable declarations can be seen in the difference following two declarations of variable
x
invar x = 10;
andvar x : int = 10;
as written in Dolphin’s syntax.Expression statements which allow us to use expressions as statements (this is restricted in semantic analysis). Note that the expression is optional. A
None
expression statement corresponds to the empty statement, i.e.,;
on its own in languages like C and Java.If statements where
cond
is the condition (an expression),thbr
is the statement for the then branch, andelbro
is the optional else branch.Compound statements: these correspond to statement blocks. Think of code blocks in C and Java enclosed in curly braces. Note: compound statements serve two purposes. They delimit the scope of declarations, and allow us to use a statement blocks inside other statements, e.g., a statement block as the body of the then, or else branch of an if statement.
Return statement where the result of an expression is returned. (Recall that in this phase, the program being the body of the main function should return an integer.)
4.3. The Typed Abstract Syntax Tree (AST) of Dolphin (phase 1)#
The typed AST differs from the AST in that it includes all the necessary type information for (LLVM) code generation. The differences between the two notions of AST can be subtle. Please pay attention to the details.
The OCaml types for the typed AST describing programs is given below:
(* -- Use this in your solution without modifications *)
module Sym = Symbol
type ident = Ident of {sym : Sym.symbol}
type typ = Void | Int | Bool | ErrorType
type binop = Plus | Minus | Mul | Div | Rem | Lt | Le | Gt | Ge | Lor | Land | Eq | NEq
type unop = Neg | Lnot
type expr =
| Integer of {int : int64}
| Boolean of {bool : bool}
| BinOp of {left : expr; op : binop; right : expr; tp : typ}
| UnOp of {op : unop; operand : expr; tp : typ}
| Lval of lval
| Assignment of {lvl : lval; rhs : expr; tp : typ}
| Call of {fname : ident; args : expr list; tp : typ}
and lval =
| Var of {ident : ident; tp : typ}
type statement =
| VarDeclStm of {name : ident; tp : typ; body : expr}
| ExprStm of {expr : expr option}
| IfThenElseStm of {cond : expr; thbr : statement; elbro : statement option}
| CompoundStm of {stms : statement list}
| ReturnStm of {ret : expr}
type param = Param of {paramname : ident; typ : typ}
type funtype = FunTyp of {ret : typ; params : param list}
type program = statement list
Action item
Put the AST declarations above in a module called TypedAst
.
That is, make a file called typedAst.ml
and copy the code above into it.
Do not change the code above.
Below, we enumerate and explain the main differences between (untyped) ASTs we saw earlier and typed ASTs:
Types now include the
Void
type, and theErrorType
type. While the source (ASTs) do not have void in them, library functions, which are featured in this phase 1 Dolphin fragment, can involve theVoid
type. It is needed for error recovery where the compiler does not merely stop when it first encounters a type error in the program, but continues and reports all errors that it detects.Identifiers, instead using strings for names, use symbols. Symbols are much more efficient to use than strings.
Many expressions, e.g., unary and binary operations, now also include a type. This extra type, is the type of the entire expression, that is, a binary operation for comparison, with both left and right expressions bing integers, will have its as
Bool
. Note also how variable declarations, which optionally included types in ASTs, always include a type in typed ASTs (it is not an option anymore). Note that assignments are also accompanied by a type. The reason for this is that an assignment, in addition to storing the right-hand-side value into the relevant memory address, also produces this right-hand-side value as a result. That is, the expressionx = y = 1 + 4
is a valid expression, which should be understood in two steps, in the first step assigns 5 toy
and results in 5, in the second step, the result of the first step, i.e., 5, is assigned tox
. The following is also a valid expression:x = (y = 4 - 2) + 7
. It assigns 2 toy
and 9 tox
.
Inclusion of the typing information allows one to directly determine the type of an expression with a shallow inspection of the expression when necessary, e.g., when we need to decide what LLVM type to use when generating LLVM--
code.
This can be done using the following OCaml functions:
let rec type_of_expr = function
| TAst.Integer _ -> TAst.Int
| TAst.Boolean _ -> TAst.Bool
| TAst.BinOp {tp; _} -> tp
| TAst.UnOp {tp; _} -> tp
| TAst.Lval lvl -> type_of_lval lvl
| TAst.Assignment {tp; _} -> tp
| TAst.Call {tp; _} -> tp
and type_of_lval = function
| TAst.Var {tp; _} -> tp
4.4. Semantic analysis of Dolphin programs (phase 1)#
The first task asks you to implement semantic analysis.
The semantic analysis must check the program semantically.
That is, it must make sure that the program satisfies to the criteria for being a valid program that we can produce executable code for.
At a high level view this process takes an AST (something of type Ast.program
above) and produces a typed AST (something of type TypedAst.program
above).
This requires to determine all the necessary types to produce a typed AST, e.g., the types of all variable declaration, if omitted, must be inferred.
This, however, is not the only requirement.
There are a few extra requirements, some of which are specific to this phase.
In particular, not all OCaml values of type TypedAst.program
are semantically valid programs.
The following are extra conditions, that should be guaranteed by semantic analysis, and can therefore be assumed when producing LLVM--
code.
No expression, except for a call to a function with
Void
return type can have typeVoid
. This also includes the type (inferred for) declared variables. That is, a programvar x = f();
wheref
is a function withVoid
return type is not semantically valid. Hence, one needs to be careful when inferring types.Arithmetic operations
Plus
,Minus
,Mul
,Div
, andRem
can only be applied to integer expressions and produce integer results.Comparison operators
Lt
,Le
,Gt
, andGe
(in this phase) can only be applied to integer operands and result in a boolean.Comparison operators
Eq
andNEq
require the two operands to have the same (non-void) type and result in a boolean.Boolean operations
Lor
andLand
can only be applied to boolean operands and result in a boolean. These operators are short-circuiting (explained later).Variables should be properly resolved and type checked. Note that this is a non-local requirement. In other words, semantic analysis must use environment to remember the type of all variables that are in scope.
A well-typed assignment must have a right-hand-side that has the same (non-void) type as the l-value it is being assigned to.
A valid call must correspond to a call to a known function, in this phase just library functions. The types and number of arguments must match the parameters of the function. Furthermore, the type of the overall expression must the same as the return type of the function.
The body given for variable declaration should match its type, if provided. If not provided, the type of the variable is inferred to be the type of the body expression which must not be void.
A valid expression statement can either be an assignment, or a function call. Any other expression is not considered valid as an expression statement.
The condition of an if statement must be an expression of type boolean. The then branch and the else branch (if present) must valid statements.
A compound statement is valid if all its statements are valid. Any variable declared inside a compound statement are only valid within that block (and obviously only after that declaration itself within the block).
Variables can shadow one another. That is, when used, a variable refers to its closest declaration, e.g.,
var x = 10; var x = 12; print_integer(x); print_integer(x);
is valid and should print 12 twice,var x = 10; {var x = 12; print_integer(x);} print_integer(x);
should print 12 followed by 10. However,var x = 10; {var x = 12;} print_integer(x); print_integer(x);
should print 10 twice, etc.Note that function names and local variables live in the same scope. This means that local variables can mask functions. That is,
var f = 10; f(12);
should always produce an error even if functionf
exists and has the appropriate type.Return statements are only valid (in this phase) if the expression returned is an integer.
All programs (in this phase) must always end in a return statement (which must be an integer).
Note
In case of questions regarding ambiguity in the above semantic rules ask questions on the forum. We will try our best to answer promptly. In case necessary, e.g., if you are in doubt and there is no enough time (please do not leave assignments for the last minute!), use your best judgment (you must be able to reasonably defend it) and explain your reasoning in your report. We will do our best to be understanding.
Important
The details of type checking and type inference are discussed in class.
Task 1: Implement semantic analysis
Implement semantic analysis for phase 1 of Dolphin as described above.
The following template OCaml code is a good starting point to embark in this assignment.
Here, we assume that a module Errors
exists which declares an exception TypeError
and a number of errors, e.g., the TypeMismatch
error used below. This is explained below.
exception Unimplemented (* your code should eventually compile without this exception *)
let typecheck_typ = function
| Ast.Int -> TAst.Int
| Ast.Bool -> TAst.Bool
(* should return a pair of a typed expression and its inferred type. you can/should use typecheck_expr inside infertype_expr. *)
let rec infertype_expr env expr =
match expr with
| _ -> raise Unimplemented
and infertype_lval env lvl =
match lvl with
| _ -> raise Unimplemented
(* checks that an expression has the required type tp by inferring the type and comparing it to tp. *)
and typecheck_expr env expr tp =
let texpr, texprtp = infertype_expr env expr in
if texprtp <> tp then raise Unimplemented;
texpr
(* should check the validity of a statement and produce the corresponding typed statement. Should use typecheck_expr and/or infertype_expr as necessary. *)
let rec typecheck_statement env stm =
match stm with
| _ -> raise Unimplemented
(* should use typecheck_statement to check the block of statements. *)
and typecheck_statement_seq env stms = raise Unimplemented
(* the initial environment should include all the library functions, no local variables, and no errors. *)
let initial_environment = raise Unimplemented
(* should check that the program (sequence of statements) ends in a return statement and make sure that all statements are valid as described in the assignment. Should use typecheck_statement_seq. *)
let typecheck_prog prg = raise Unimplemented
We recommend that you create three separate modules for this task. One module is for the semantic analysis following the example above; let us call this the Semant
module.
The other two should be the Errors
module, and the Env
module, roughly following the structures below.
(These modules will be discussed further in class.)
(* Errors module *)
module Sym = Symbol
module TAst = TypedAst
module TPretty = TypedPretty
type error =
| TypeMismatch of {expected : TAst.typ; actual : TAst.typ}
(* other errors to be added as needed. *)
(* Useful for printing errors *)
let error_to_string err =
match err with
| TypeMismatch {expected; actual; _} -> Printf.sprintf "Type mismatch: expected %s but found %s." (TPretty.typ_to_string expected) (TPretty.typ_to_string actual)
(* Env module *)
exception Unimplemented (* your code should eventually compile without this exception *)
type environment = | (* to be defined *)
(* create an initial environment with the given functions defined *)
let make_env function_types = raise Unimplemented
(* insert a local declaration into the environment *)
let insert_local_decl env sym typ = raise Unimplemented
(* lookup variables and functions. Note: it must first look for a local variable and if not found then look for a function. *)
let lookup_var_fun env sym = raise Unimplemented
- Question 1
Reflect on the usage of type inference as opposed to type checking in your implementation. Briefly describe what principle guides you in writing your code, that is when you decide to use one vs the other.
- Question 2
Reflect on the error recovery strategy in your implementation. How do you implement it?
4.5. Code generation#
The compiler must generate valid LLVM code for all semantically valid programs.
For this purpose we use the cfgBuilder
module.
Task 2: Implement code generation
Implement code generation for phase 1 of Dolphin as described above.
There are a few key points to pay attention to when producing code for Dolphin programs:
Dolphin has a guaranteed order of evaluation: left to right. For instance, in binary operations, the left operand is executed first. Once the result of the first operand is obtained, then the program continues with executing the right operand, followed by the operation itself. Similarly, function arguments are computed from left to right before the function call.
Pay attention to the short-circuiting nature of boolean and (
&&
) and or (||
) operations. These operations first compute the left operand. If the left operand determines the result of the operation (if it isfalse
in case of&&
ortrue
in case of||
), the right operand is not computed. This is the standard semantics for such operations. Here, the LLVM program generated must perform the analysis of branch appropriately so as to not compute the right operand, if not necessary. To maintain SSA form, we must either allocate temporary stack space for the result (as we have seen earlier), or usephi
nodes (this is one of the glory points below).As we have done before when translating programs by hand to LLVM, local variables are allocated on the stack (using
LLVM--
‘salloca
instruction). We need an environment to remember the assignedLLVM--
register for each variable and itsLLVM--
type. This is in many ways similar to the environment we use for semantic analysis. However, we no longer need to have functions and their types in the environment. All the necessary information, i.e., the return type, and the type of arguments, is already stored in the typed AST.Code generation for division must produce code that checks whether the divisor is zero. If so, it should print an error and exit. That is, the “division by zero” hardware exception should never occur in a Dolphin program.
- Question 3 (glory **)
Use
phi
nodes to re-implement short-circuiting behavior of booleanand
andor
operators.
Task 3: Testing and consolidation
In this task, we consolidate the two parts from the previous tasks, and test our project in an end-to-end fashion.
Write a top-level function
compile_prog
that given an AST, runs the semantic analysis on it. If there are any errors, they should be printed on standard error output, and the program exits with exit code 1. If there are no errors, it proceeds to generate the LLVM translation. The result of the translation should be output on standard output, and the program exits with exit code 0.Write a test corpus consisting of 10 or more example programs that cover all the features of the language. It is important that the tests you write are relevant. Think about negative and positive tests. For negative tests, the critical aspect is the semantic analysis. In particular, if your semantic analysis reports a particular type of an error, your test corpus should include a program that exhibits that kind of error. For positive tests, your tests should cover all features of the language, i.e., all possible binary operations, calls to the standard library, etc.
Attention
Do not take this part of the assignment lightly. Use this task as a vehicle for discovering bugs and logical errors in your project.
4.6. Appendix#
You will need libraries printbox
and printbox-text
to use pretty printers below. These can be installed using opam using the following command opam install printbox printbox-text
These pretty printers produce a so-called box
which is the terminology that printbox
uses to refer to formatted, structured texts.
A box
ca be printed as follows:
PrintBox_text.output stdout (Pretty.program_to_tree prog)
This will print the AST of the program as a tree.
4.6.1. pretty printer for ASTs (module Pretty
)#
module PBox = PrintBox
open Ast
(* producing trees for pretty printing *)
let typ_style = PBox.Style.fg_color PBox.Style.Green
let ident_style = PBox.Style.fg_color PBox.Style.Yellow
let fieldname_style = ident_style
let keyword_style = PBox.Style.fg_color PBox.Style.Blue
let info_node_style = PBox.Style.fg_color PBox.Style.Cyan
let make_typ_line name = PBox.line_with_style typ_style name
let make_fieldname_line name = PBox.line_with_style fieldname_style name
let make_ident_line name = PBox.line_with_style ident_style name
let make_keyword_line name = PBox.line_with_style keyword_style name
let make_info_node_line info = PBox.line_with_style info_node_style info
let ident_to_tree (Ident {name}) = make_ident_line name
let typ_to_tree tp =
match tp with
| Bool -> make_typ_line "Bool"
| Int -> make_typ_line "Int"
let binop_to_tree op =
match op with
| Plus -> make_keyword_line "Plus"
| Minus -> make_keyword_line "Minus"
| Mul -> make_keyword_line "Mul"
| Div -> make_keyword_line "Div"
| Rem -> make_keyword_line "Rem"
| Lt -> make_keyword_line "Lt"
| Le -> make_keyword_line "Le"
| Gt -> make_keyword_line "Gt"
| Ge -> make_keyword_line "Ge"
| Lor -> make_keyword_line "Lor"
| Land -> make_keyword_line "Land"
| Eq -> make_keyword_line "Eq"
| NEq -> make_keyword_line "NEq"
let unop_to_tree op =
match op with
| Neg -> make_keyword_line "Neg"
| Lnot -> make_keyword_line "Lnot"
let rec expr_to_tree e =
match e with
| Integer {int; _} -> PBox.hlist ~bars:false [make_info_node_line "IntLit("; PBox.line (Int64.to_string int); make_info_node_line ")"]
| Boolean {bool; _} -> PBox.hlist ~bars:false [make_info_node_line "BooleanLit("; make_keyword_line (if bool then "true" else "false"); make_info_node_line ")"]
| BinOp {left; op; right; _} -> PBox.tree (make_info_node_line "BinOp") [expr_to_tree left; binop_to_tree op; expr_to_tree right]
| UnOp {op; operand; _} -> PBox.tree (make_info_node_line "UnOp") [unop_to_tree op; expr_to_tree operand]
| Lval l -> PBox.tree (make_info_node_line "Lval") [lval_to_tree l]
| Assignment {lvl; rhs; _} -> PBox.tree (make_info_node_line "Assignment") [lval_to_tree lvl; expr_to_tree rhs]
| Call {fname; args; _} ->
PBox.tree (make_info_node_line "Call")
[PBox.hlist ~bars:false [make_info_node_line "FunName: "; ident_to_tree fname];
PBox.tree (make_info_node_line "Args") (List.map (fun e -> expr_to_tree e) args)]
and lval_to_tree l =
match l with
| Var ident -> PBox.hlist ~bars:false [make_info_node_line "Var("; ident_to_tree ident; make_info_node_line ")"]
let rec statement_to_tree c =
match c with
| VarDeclStm {name; tp; body} -> PBox.tree (make_keyword_line "VarDeclStm")
[PBox.hlist ~bars:false [make_info_node_line "Ident: "; ident_to_tree name];
PBox.hlist ~bars:false [make_info_node_line "Type: "; Option.fold ~none:PBox.empty ~some:typ_to_tree tp];
PBox.hlist ~bars:false [make_info_node_line "Body: "; expr_to_tree body]]
| ExprStm {expr; _} -> PBox.hlist ~bars:false [make_info_node_line "ExprStm: "; Option.fold ~none:PBox.empty ~some:expr_to_tree expr]
| IfThenElseStm {cond; thbr; elbro; _} ->
PBox.tree (make_keyword_line "IfStm")
([PBox.hlist ~bars:false [make_info_node_line "Cond: "; expr_to_tree cond]; PBox.hlist ~bars:false [make_info_node_line "Then-Branch: "; statement_to_tree thbr]] @
match elbro with None -> [] | Some elbr -> [PBox.hlist ~bars:false [make_info_node_line "Else-Branch: "; statement_to_tree elbr]])
| CompoundStm {stms; _} -> PBox.tree (make_info_node_line "CompoundStm") (statement_seq_to_forest stms)
| ReturnStm {ret; _} -> PBox.hlist ~bars:false [make_keyword_line "ReturnValStm: "; expr_to_tree ret]
and statement_seq_to_forest stms = List.map statement_to_tree stms
let program_to_tree prog =
PBox.tree (make_info_node_line "Program") (statement_seq_to_forest prog)
4.6.2. pretty printer for ASTs (module TypedPretty
)#
module Sym = Symbol
module PBox = PrintBox
open TypedAst
let typ_to_string = function
| Void -> "void"
| Int -> "int"
| Bool -> "bool"
| ErrorType -> "'type error'"
(* producing trees for pretty printing *)
let ident_to_tree (Ident {sym}) = Pretty.make_ident_line (Sym.name sym)
let typ_to_tree tp =
match tp with
| Void -> Pretty.make_typ_line "Void"
| Int -> Pretty.make_typ_line "Int"
| Bool -> Pretty.make_typ_line "Bool"
| ErrorType -> PBox.line_with_style (PBox.Style.set_bg_color PBox.Style.Red PBox.Style.default) "ErrorType"
let binop_to_tree op =
match op with
| Plus -> Pretty.make_keyword_line "PLUS"
| Minus -> Pretty.make_keyword_line "Minus"
| Mul -> Pretty.make_keyword_line "Mul"
| Div -> Pretty.make_keyword_line "Div"
| Rem -> Pretty.make_keyword_line "Rem"
| Lt -> Pretty.make_keyword_line "Lt"
| Le -> Pretty.make_keyword_line "Le"
| Gt -> Pretty.make_keyword_line "Gt"
| Ge -> Pretty.make_keyword_line "Ge"
| Lor -> Pretty.make_keyword_line "Lor"
| Land -> Pretty.make_keyword_line "Land"
| Eq -> Pretty.make_keyword_line "Eq"
| NEq -> Pretty.make_keyword_line "NEq"
let unop_to_tree op =
match op with
| Neg -> Pretty.make_keyword_line "Neg"
| Lnot -> Pretty.make_keyword_line "Lor"
let rec expr_to_tree e =
match e with
| Integer {int; _} -> PBox.hlist ~bars:false [Pretty.make_info_node_line "IntLit("; PBox.line (Int64.to_string int); Pretty.make_info_node_line ")"]
| Boolean {bool; _} -> PBox.hlist ~bars:false [Pretty.make_info_node_line "BooleanLit("; Pretty.make_keyword_line (if bool then "true" else "false"); Pretty.make_info_node_line ")"]
| BinOp {left; op; right; tp; _} -> PBox.tree (Pretty.make_info_node_line "BinOp") [typ_to_tree tp; expr_to_tree left; binop_to_tree op; expr_to_tree right]
| UnOp {op; operand; tp; _} -> PBox.tree (Pretty.make_info_node_line "UnOp") [typ_to_tree tp; unop_to_tree op; expr_to_tree operand]
| Lval l -> PBox.tree (Pretty.make_info_node_line "Lval") [lval_to_tree l]
| Assignment {lvl; rhs; tp; _} -> PBox.tree (Pretty.make_info_node_line "Assignment") [typ_to_tree tp; lval_to_tree lvl; expr_to_tree rhs]
| Call {fname; args; tp; _} ->
PBox.tree (Pretty.make_info_node_line "Call")
[typ_to_tree tp;
PBox.hlist ~bars:false [Pretty.make_info_node_line "FunName: "; ident_to_tree fname];
PBox.tree (Pretty.make_info_node_line "Args") (List.map (fun e -> expr_to_tree e) args)]
and lval_to_tree l =
match l with
| Var {ident; tp} -> PBox.hlist ~bars:false [Pretty.make_info_node_line "Var("; ident_to_tree ident; Pretty.make_info_node_line ")"; PBox.line " : "; typ_to_tree tp;]
let rec statement_to_tree c =
match c with
| VarDeclStm {name; tp; body; _} ->
PBox.tree (Pretty.make_keyword_line "VarDeclStm")
[PBox.hlist ~bars:false [Pretty.make_info_node_line "Ident: "; ident_to_tree name];
PBox.hlist ~bars:false [Pretty.make_info_node_line "Type: "; typ_to_tree tp];
PBox.hlist ~bars:false [Pretty.make_info_node_line "Body: "; expr_to_tree body]]
| ExprStm {expr; _} -> PBox.hlist ~bars:false [Pretty.make_info_node_line "ExprStm: "; Option.fold ~none:PBox.empty ~some:expr_to_tree expr]
| IfThenElseStm {cond; thbr; elbro; _} ->
PBox.tree (Pretty.make_keyword_line "IfStm")
([PBox.hlist ~bars:false [Pretty.make_info_node_line "Cond: "; expr_to_tree cond]; PBox.hlist ~bars:false [Pretty.make_info_node_line "Then-Branch: "; statement_to_tree thbr]] @
match elbro with None -> [] | Some elbr -> [PBox.hlist ~bars:false [Pretty.make_info_node_line "Else-Branch: "; statement_to_tree elbr]])
| CompoundStm {stms; _} -> PBox.tree (Pretty.make_info_node_line "CompoundStm") (statement_seq_to_forest stms)
| ReturnStm {ret; _} -> PBox.hlist ~bars:false [Pretty.make_keyword_line "ReturnValStm: "; expr_to_tree ret]
and statement_seq_to_forest stms = List.map statement_to_tree stms
let program_to_tree prg =
PBox.tree (Pretty.make_info_node_line "Program") (statement_seq_to_forest prg)
4.6.3. C runtime#
#include <stdio.h> /* make sure these two includes are */
#include <inttypes.h> /* present in the start of your C file */
// your LLVM program must produce a function called dolphin_main of the following type.
extern int64_t dolphin_main();
int64_t read_integer () {
int64_t value;
printf("Please enter an integer: ");
scanf("%" PRId64 "" , &value);
return value;
}
void print_integer (int64_t value) {
printf("%" PRId64 "\n" , value);
}
int main(){
return dolphin_main();
}
The following OCaml code defines the type of the two library functions. Use this code, placed in a separate module, to bootstrap your semantic analysis and code generation.
module TAst = TypedAst
module Sym = Symbol
let make_ident name = TAst.Ident {sym = Sym.symbol name}
let library_functions =
[
(Symbol.symbol "read_integer", TAst.FunTyp {ret = TAst.Int; params = []});
(Symbol.symbol "print_integer", TAst.FunTyp {ret = TAst.Void; params = [Param {paramname = make_ident "value"; typ = TAst.Int}]})
]