3. LLVM-- programming#

3.1. Assignment overview#

This assignment covers writing and debugging LLVM-- programs with a focus on loops and conditionals. There are 3 tasks and 6 questions in this assignment. Questions marked for glory are optional. Do them only if you feel ambitious and have the time.

3.1.1. What you need to get started#

Make sure you understand everything from the lectures on LLVM-- (the lectures on Friday, Sep 15th, and Wednesday, Sep 20th). That is, everything in LLVM-- chapter except for aggregate and named types, GEP, and phi nodes.

Important

Revisit Arithmetic Expressions in OCaml and x86 Assembly and be sure you recall the specialties of your setup with respect to architecture and/or operating system.

3.1.2. What you need to hand in#

Please hand in a .zip file containing the following

  1. A brief report documenting your solution. Acceptable report formats are .pdf, .rtf, and .md. For each task and question, briefly (1 – 4 sentences) describe your implementation or answer. Write concisely.

  2. All the source files needed to reproduce your solution. This also includes the C code provided. Please explain in your report how the solution could be reproduced, e.g., calling make (if you have made a Makefile), the command line to call clang, etc. You do not need to provide any code for Task 3. Just write the answer in the report.

Note

If you are answering any of the glory questions, please submit them as separate solutions, i.e., in a separate, clearly marked subfolder in the .zip file of your submission.

Important

Make sure to understand all the code you hand in, including what is copied from here.

3.2. Write LLVM-- by hand#

Consider the following Dolphin program.

void number_perfect(n : int){
  var i = 1;
  while(i <= n){
    var temp = 0;
    for (var j = 1; j < i; j = j + 1){
      let d = i % j;
      temp = temp + (d == 0 ? j : 0);
    }
    if(temp == i)
      print_perfect(i);
    i = i + 1;
  }
}

The function print_perfect is defined as follows in a C file that is to be linked with the LLVM file you produce:

#include <stdio.h>         /* make sure these two includes are    */
#include <inttypes.h>      /* present in the start of your C file */

void print_perfect(int64_t i) {
  printf("%" PRId64 ": perfect\n", i);
}

int main() {
  number_perfect(10000);
  return 0;
}

The following is the expected output:

6: perfect
28: perfect
496: perfect
8128: perfect

Task 1: Rewrite in your favorite language

Pick your favorite programming language (second-favourite if your favourite is already Dolphin) and write the program above, including the part written in C above. In your report, explain briefly, in a short paragraph, what the program does. Write comments explaining the most important lines of code in the code you hand in (the code in your favorite language). Try to keep your program as close as you can to the code given above. A note about perfect numbers: a number is said to be perfect if it is equal to the sum of all its proper divisors, i.e., not the number itself.

Task 2: Translate to LLVM--

Translate the Dolphin code above, the function number_perfect, into LLVM--.

Remember to test your LLVM-- program by linking and running with different inputs by changing the number with which the function number_perfect is called (on the C side). Recall that this can be done using the following commands:

clang perfect.ll main.c
./a.out

where perfect.ll is your LLVM-- code and main.c is the C code above.

Question 1

How many basic block does your program have? Explain how you count basic blocks.

Question 2

How do you handle division by zero? Do you produce code for checking that the divisor is non-zero? Why?

Question 3 (glory **)

Use Phi nodes to translate the ternary operator (?:) above.

Question 4 (glory ***)

This is an extension of question 3 above. Write your LLVM-- code without the use of the alloca instruction.

Task 3: Debug LLVM--

Consider the following Dolphin program.

int rec_fun(acc : int, n : int){
  if(n > 0)
    return rec_fun(acc + 2, n - 1);
  return acc;
}

First, convince yourself that when called rec_fun(0, -10) will return 0.

A buggy compiler could produce the following code for this program:

define i64 @rec_fun (i64 %acc, i64 %n) {
 %n6 = alloca i64
 %acc5 = alloca i64
 store i64 %acc, i64* %acc5
 store i64 %n, i64* %n6
 %load_local_var7 = load i64, i64* %n6
 %arith_comp_op8 = icmp slt i64 %load_local_var7, 0
 br i1 %arith_comp_op8, label %ifthenelse_true_branch9, label %ifthenelse_false_branch10
ifthenelse_true_branch9:
 %load_local_var12 = load i64, i64* %acc5
 %arith_bin_op13 = add i64 %load_local_var12, 2
 %load_local_var14 = load i64, i64* %n6
 %arith_bin_op15 = sub i64 %load_local_var14, 1
 %call16 = call i64 @rec_fun (i64 %arith_bin_op13, i64 %arith_bin_op15)
 ret i64 %call16
after_return17:
 br label %ifthenelse_merge11
ifthenelse_false_branch10:
 br label %ifthenelse_merge11
ifthenelse_merge11:
 %load_local_var18 = load i64, i64* %acc5
 ret i64 %load_local_var18
after_return19:
 unreachable
}

In the LLVM-- code above, when the function rec_fun(0, -10) is called it crashes with a segmentation fault.

Identify the issue in the code above that causes the segmentation fault. Explain why the segmentation fault occurs.

Question 5

Make an educated guess as to what the compiler has done wrong to produce such a buggy code (this is about the bug in the compiler that leads to the bug in the code of Task 3).

Question 6

Find at least one other way a subtle mistake in the code generation phase of the compiler can cause a segmentation fault in a program like the one above.