|
The book covers the practical fundamentals of compiler construction, from lexical analysis
and syntax analysis to semantic processing and code generation. Other topics include the
description of translation processes using attributed grammars and the use of the compiler
generator Coco/R to automatically generate the core parts of a compiler.
As a running example, a compiler for MicroJava - a simple Java-like programming language -
is developed, which generates executable bytecode similar to Java bytecode.
The book originated from a compiler construction course that the author has been giving
for many years at the Johannes Kepler University Linz (Austria)
and at Oxford Brookes University (England).
It can be used as a textbook for an introductory compiler course or for self-study in practice.
As a supplement to the book, you can find the lecture slides here in PowerPoint format (under the
Creative Commons License CC BY-NC-SA 4.0).
- Overview
- Lexical Analysis
- Syntax Analysis
- Semantic Processing and Attributed Grammars
- Symbol Table
- Code Generation
- The Compiler Generator Coco/R
- Bottom-up Parsing
If you cannot open the slides in PowerPoint, because you do not have MS Office or some
other compatible program (e.g. OpenOffice, LibreOffice, Keynote) on your computer,
you can download the corresponding PDFs from here.
As a running example, the book describes the implementation of a compiler for MicroJava,
a simple Java-like programming language from which bytecode (similar to
Java bytecode) is generated. The source code of the MicroJava compiler and the MicroJava VM
can be downloaded from here.
-
Download: Download the file Compiler.zip, expand it,
go to the directory Compiler and execute the batch file build.bat,
which compiles the MicroJava compiler.
-
Compilation: To compile a MicroJava program, execute the batch file MJ.bat
in the Compiler directory (e.g., MJ Eratos.mj).
The compiler generates an object file (e.g., Eratos.obj) and outputs the generated
bytecode to the console for checking. Eratos is an example program that calculates and outputs the prime numbers
up to a certain upper limit.
-
Execution: To run the compiled MicroJava program, execute the batch file run.bat
in the Compiler directory (e.g., run Eratos.obj [-debug]).
After starting Eratos, enter the upper limit as a number (e.g., 100 plus return key). By activating the
debug option, the interpreted bytecode instructions as well as the current contents
of the expression stack are displayed on the console.
-
Decoding: To decode a compiled MicroJava program, execute the command
java MJ.Decode fileName.obj in the Compiler directory.
The decoded bytecode instructions are displayed on the console.
-
MicroJava VM: If you want to study the MicroJava VM including the bytecode interpreter in source code,
you can take a look at the file MJ/CodeGen/Run.java.
The book describes the compiler generator Coco/R, which can be used to generate a scanner and a parser
of a compiler (or a compiler-like tool) from a compact compiler specification. Coco/R accepts LL(*) grammars
and generates parsers in recursive descent. There are versions for Java, C#, C++ and many other languages.
Coco/R is open source and can be downloaded together with the user manual from
https://ssw.jku.at/coco/.
For the Java version of Coco/R, it is sufficient to download the files Coco.jar,
Scanner.frame and Parser.frame and place them in the same
directory as the compiler specification (*.atg) to be processed with Coco/R.
-
Chapter 2: Lexical Analysis, Exercise 11
Download the file TestScanner.zip and expand it. Place your files
Scanner.java and Token.java developed as part
of the exercise into the directory Compiler/MJ/, go to the
Compiler directory and execute runTestScanner.bat.
-
Chapter 3: Syntax Analysis, Exercise 12
Download the file TestParser.zip and expand it. Place your files
Scanner.java, Token.java and
Parser.java developed as part of the exercise into the directory
Compiler/MJ/, go to the Compiler directory and
execute runTestParser.bat.
-
Section 7.3: Building Abstract Syntax Trees with Coco/R
Section 7.3. of the book describes how to use the compiler generator Coco/R to generate a compiler
that builds an abstract syntax tree from a MicroJava program. The source code of this example can be downloaded
from Taste.zip. Expand this file, go to the directory Taste
and execute build.bat there. You can then run the generated compiler, e.g.,
by the command java Taste Test.tas.
The book contains more than 70 exercises on all topics of the book, which are provided at the end of each chapter.
Here are the sample solutions, but try to solve the exercises on your own, before looking at the solutions.
Chapter 1: Overview
- 1. Grammar of E-Mail Addresses
- 2. Grammar of Simplified Boolean Expressions
- 3. Grammar of Roman Numbers
- 4. Terminal Start and Successor Symbols (1)
- 5. Terminal Start and Successor Symbols (2)
- 6. Elimination of Left Recursion (1)
- 7. Elimination of Left Recursion (2)
- 8. Syntax Trees (1)
- 9. Syntax Trees (2)
- 10. Ambiguity
Chapter 2: Lexical Analysis
- 1. Regular Grammars (1)
- 2. Regular Grammars (2)
- 3. Regular Grammars (3)
- 4. Conversion of a Regular Grammar into a DFA
- 5. Conversion of a DFA into a Regular Grammar
- 6. Introduction of a New Token
- 7. Processing Nested Bracket Comments
- 8. Scanning Names
- 9. Scanning Numbers
- 10. Scanning Character Constants
- 11. Implementation of the Scanner for MicroJava
Chapter 3: Syntax Analysis
- 1. Pushdown Automaton
- 2. Recursive Descent (1)
- 3. Recursive Descent (2)
- 4. Recursive Descent (3)
- 5. Recursive Descent (4)
- 6. Recursive Descent (5)
- 7. LL(1) Property (1)
- 8. LL(1) Property (2)
- 9. LL(1) Property (3)
- 10. Syntax Error Handling with General Anchors
- 11. Syntax Error Handling with Specific Anchors
- 12. Implementation of the Parser for MicroJava
Chapter 4: Attributed Grammars
- 1. Understanding Attributed Grammars
- 2. Roman Numbers
- 3. Calculating with Clock Times
- 4. Address Assignment in Variable Declarations
- 5. Courses and Participants
- 6. Evaluation of Prefix Expressions
- 7. Scripting Language for Boolean Expressions
- 8. Translation from XML to JSON
Chapter 5: Symbol Table
- 1. Symbol Table with Simple Types
- 2. Symbol Table with Arrays
- 3. Symbol Table with Classes
- 4. Object and Structure Nodes
- 5. Type Checking
- 6. Implementation of Structural Equivalence
- 7. Parameters
- 8. Indirect Recursion
- 9. Implementation of the Symbol Table for MicroJava
Chapter 6: Code Generation
- 1. Operands of Code Generation (1)
- 2. Operands of Code Generation (2)
- 3. Control Structures (1)
- 4. Control Structures (2)
- 5. Compound Boolean Expressions
- 6. Method Call
- 7. Function Call
- 8. Increment and Decrement Statement
- 9. Language Extension: Special While Loop
- 10. Language Extension: Enumeration Types
- 11. Language_Extension: Data Type boolean
- 12. Implementation of the Code Generator for MicroJava
Chapter 7: The Compiler Generator Coco/R
- 1. Path Calculation
- 2. Interpreter for Boolean Expressions
- 3. Processing a Phone Book
- 4. Complexity Analysis
- 5. Extraction of Documentation Comments
Chapter 8: Bottom-up Syntax Analysis
- 1. Table Generation (1)
- 2. Table Generation (2)
- 3. Table Generation (3)
- 4. LR Error Handling (1)
- 5. LR Error Handling (2)
|