Exercises for Week 40#

Important

These exercises are in a draft status.

Exercise overview#

This week’s exercises focus on testing and validation tools for compilers. We’ll explore techniques for testing lexers, pattern recognition, and building comprehensive end-to-end testing frameworks for your compiler implementation.

Part 1: Pattern recognition in lexical analysis#

Many real-world applications require recognizing patterns beyond traditional programming language tokens. This exercise extends pattern recognition capabilities.

Task: Extending pattern recognition#

Starting with the hashtags example (available on Brightspace), extend the lexer to also identify email addresses in the input text.

Your implementation should:

  • Recognize hashtags (e.g., #compiler, #testing)

  • Identify email addresses (e.g., student@university.edu)

  • Handle edge cases like emails with dots, hyphens, and underscores

  • Properly tokenize mixed content containing both patterns

Part 2: Dummy parsers for lexer testing#

A dummy parser is a parser that, rather than building an AST from the lexed tokens, simply emits each token as it is read from the input. This is especially useful for verifying that your lexer works as expected.

Task: Create a dummy parser#

Write a parser that reads tokens from your lexer and outputs them in a readable format. Your dummy parser should:

  • Read tokens one by one from the lexer

  • Display each token with its type and value

  • Handle all token types your lexer produces

  • Report the line and column position if available

This tool will help you debug lexer issues by making token streams visible and verifiable.

Task: Test your lexer#

Use your dummy parser to test various Dolphin source files:

  • Create test inputs with edge cases (comments, string literals, numbers, operators)

  • Verify that tokens are correctly identified and positioned

  • Check that whitespace and comments are handled properly

Task: Testing pattern recognition#

Create comprehensive test cases for your pattern recognizer:

  • Valid and invalid email formats

  • Hashtags with various characters

  • Mixed text with multiple patterns

  • Edge cases and boundary conditions

Part 3: End-to-end testing framework#

Unlike other software domains, compiler testing generally does not involve extensive unit testing. Instead, most positive tests are end-to-end, while negative tests are end-to-wherever-the-error-occurs. This makes the test suite more robust to modifications of the AST and the addition of features.

Task: Design your testing framework#

Develop an end-to-end testing framework for your compiler. You may choose to use a unit testing framework to write your end-to-end tests, such as OUnit or Alcotest, or you may choose to hand-write your test execution.

For each test, your framework should:

  1. Read Dolphin source code from a test file

  2. Attempt to compile the source code to a binary

  3. Handle two cases:

    • Compilation fails: Compare the error message to an expected error

    • Compilation succeeds: Execute the compiled code and compare output to expected output

Task: Implement test categories#

Organize your tests into categories:

Positive tests (should compile and run):

  • Basic arithmetic expressions

  • Variable declarations and assignments

  • Control flow (if/else, loops)

  • Functions and recursion

  • Arrays and records

Negative tests (should fail with specific errors):

  • Syntax errors

  • Type errors

  • Undefined variables

  • Division by zero

  • Array bounds violations

Task: Automation and reporting#

Make your test suite production-ready:

  • The entire suite should run without human interaction

  • Provide clear output showing which tests passed/failed

  • Generate a summary report with statistics

  • Make it easy to add new test cases

  • Support running individual test categories

Consider implementing features like:

  • Timeout handling for infinite loops

  • Memory leak detection

  • Performance regression testing

  • Coverage reporting for your compiler code

Advanced task: Differential testing#

If time permits, implement differential testing:

  • Compare your compiler’s output with a reference implementation

  • Test optimization passes by comparing optimized vs unoptimized output

  • Verify that semantics-preserving transformations actually preserve semantics