Cover
Hanspeter Mössenböck
Compiler Construction
Fundamentals and Applications

Springer 2025
ISBN 978-3-031-84812-4 (print)
ISBN 978-3-031-84813-1 (eBook)

The book covers the practical fundamentals of compiler construction, from lexical analysis and syntax analysis to semantic processing and code generation. Other topics include the description of translation processes using attributed grammars and the use of the compiler generator Coco/R to automatically generate the core parts of a compiler. As a running example, a compiler for MicroJava - a simple Java-like programming language - is developed, which generates executable bytecode similar to Java bytecode.


Lecture Slides

The book originated from a compiler construction course that the author has been giving for many years at the Johannes Kepler University Linz (Austria) and at Oxford Brookes University (England). It can be used as a textbook for an introductory compiler course or for self-study in practice. As a supplement to the book, you can find the lecture slides here in PowerPoint format (under the Creative Commons License CC BY-NC-SA 4.0).

  1. Overview
  2. Lexical Analysis
  3. Syntax Analysis
  4. Semantic Processing and Attributed Grammars
  5. Symbol Table
  6. Code Generation
  7. The Compiler Generator Coco/R
  8. Bottom-up Parsing

If you cannot open the slides in PowerPoint, because you do not have MS Office or some other compatible program (e.g. OpenOffice, LibreOffice, Keynote) on your computer, you can download the corresponding PDFs from here.


The MicroJava Compiler

As a running example, the book describes the implementation of a compiler for MicroJava, a simple Java-like programming language from which bytecode (similar to Java bytecode) is generated. The source code of the MicroJava compiler and the MicroJava VM can be downloaded from here.

  • Download: Download the file Compiler.zip, expand it, go to the directory Compiler and execute the batch file build.bat, which compiles the MicroJava compiler.
  • Compilation: To compile a MicroJava program, execute the batch file MJ.bat in the Compiler directory (e.g., MJ Eratos.mj). The compiler generates an object file (e.g., Eratos.obj) and outputs the generated bytecode to the console for checking. Eratos is an example program that calculates and outputs the prime numbers up to a certain upper limit.
  • Execution: To run the compiled MicroJava program, execute the batch file run.bat in the Compiler directory (e.g., run Eratos.obj [-debug]). After starting Eratos, enter the upper limit as a number (e.g., 100 plus return key). By activating the debug option, the interpreted bytecode instructions as well as the current contents of the expression stack are displayed on the console.
  • Decoding: To decode a compiled MicroJava program, execute the command java MJ.Decode fileName.obj in the Compiler directory. The decoded bytecode instructions are displayed on the console.
  • MicroJava VM: If you want to study the MicroJava VM including the bytecode interpreter in source code, you can take a look at the file MJ/CodeGen/Run.java.

The Compiler Generator Coco/R

The book describes the compiler generator Coco/R, which can be used to generate a scanner and a parser of a compiler (or a compiler-like tool) from a compact compiler specification. Coco/R accepts LL(*) grammars and generates parsers in recursive descent. There are versions for Java, C#, C++ and many other languages. Coco/R is open source and can be downloaded together with the user manual from https://ssw.jku.at/coco/.

For the Java version of Coco/R, it is sufficient to download the files Coco.jar, Scanner.frame and Parser.frame and place them in the same directory as the compiler specification (*.atg) to be processed with Coco/R.


Downloads for Individual Chapters of the Book

  • Chapter 2: Lexical Analysis, Exercise 11
    Download the file TestScanner.zip and expand it. Place your files Scanner.java and Token.java developed as part of the exercise into the directory Compiler/MJ/, go to the Compiler directory and execute runTestScanner.bat.
  • Chapter 3: Syntax Analysis, Exercise 12
    Download the file TestParser.zip and expand it. Place your files Scanner.java, Token.java and Parser.java developed as part of the exercise into the directory Compiler/MJ/, go to the Compiler directory and execute runTestParser.bat.
  • Section 7.3: Building Abstract Syntax Trees with Coco/R
    Section 7.3. of the book describes how to use the compiler generator Coco/R to generate a compiler that builds an abstract syntax tree from a MicroJava program. The source code of this example can be downloaded from Taste.zip. Expand this file, go to the directory Taste and execute build.bat there. You can then run the generated compiler, e.g., by the command java Taste Test.tas.

Sample Solutions to the Exercises

The book contains more than 70 exercises on all topics of the book, which are provided at the end of each chapter. Here are the sample solutions, but try to solve the exercises on your own, before looking at the solutions.

Chapter 1: Overview
- 1. Grammar of E-Mail Addresses
- 2. Grammar of Simplified Boolean Expressions
- 3. Grammar of Roman Numbers
- 4. Terminal Start and Successor Symbols (1)
- 5. Terminal Start and Successor Symbols (2)
- 6. Elimination of Left Recursion (1)
- 7. Elimination of Left Recursion (2)
- 8. Syntax Trees (1)
- 9. Syntax Trees (2)
- 10. Ambiguity

Chapter 2: Lexical Analysis
- 1. Regular Grammars (1)
- 2. Regular Grammars (2)
- 3. Regular Grammars (3)
- 4. Conversion of a Regular Grammar into a DFA
- 5. Conversion of a DFA into a Regular Grammar
- 6. Introduction of a New Token
- 7. Processing Nested Bracket Comments
- 8. Scanning Names
- 9. Scanning Numbers
- 10. Scanning Character Constants
- 11. Implementation of the Scanner for MicroJava

Chapter 3: Syntax Analysis
- 1. Pushdown Automaton
- 2. Recursive Descent (1)
- 3. Recursive Descent (2)
- 4. Recursive Descent (3)
- 5. Recursive Descent (4)
- 6. Recursive Descent (5)
- 7. LL(1) Property (1)
- 8. LL(1) Property (2)
- 9. LL(1) Property (3)
- 10. Syntax Error Handling with General Anchors
- 11. Syntax Error Handling with Specific Anchors
- 12. Implementation of the Parser for MicroJava

Chapter 4: Attributed Grammars
- 1. Understanding Attributed Grammars
- 2. Roman Numbers
- 3. Calculating with Clock Times
- 4. Address Assignment in Variable Declarations
- 5. Courses and Participants
- 6. Evaluation of Prefix Expressions
- 7. Scripting Language for Boolean Expressions
- 8. Translation from XML to JSON

Chapter 5: Symbol Table
- 1. Symbol Table with Simple Types
- 2. Symbol Table with Arrays
- 3. Symbol Table with Classes
- 4. Object and Structure Nodes
- 5. Type Checking
- 6. Implementation of Structural Equivalence
- 7. Parameters
- 8. Indirect Recursion
- 9. Implementation of the Symbol Table for MicroJava

Chapter 6: Code Generation
- 1. Operands of Code Generation (1)
- 2. Operands of Code Generation (2)
- 3. Control Structures (1)
- 4. Control Structures (2)
- 5. Compound Boolean Expressions
- 6. Method Call
- 7. Function Call
- 8. Increment and Decrement Statement
- 9. Language Extension: Special While Loop
- 10. Language Extension: Enumeration Types
- 11. Language_Extension: Data Type boolean
- 12. Implementation of the Code Generator for MicroJava

Chapter 7: The Compiler Generator Coco/R
- 1. Path Calculation
- 2. Interpreter for Boolean Expressions
- 3. Processing a Phone Book
- 4. Complexity Analysis
- 5. Extraction of Documentation Comments

Chapter 8: Bottom-up Syntax Analysis
- 1. Table Generation (1)
- 2. Table Generation (2)
- 3. Table Generation (3)
- 4. LR Error Handling (1)
- 5. LR Error Handling (2)