Compiler Construction
HIGH-LEVEL LANGUAGE
- uses notation close to human language
- lets the programmer focus on logic rather than on the complex computer architecture
WE NEED A COMPILER!
• Compiler phases
– Lexical analysis
– Syntax analysis
– Semantic analysis
– Intermediate (machine-independent) code generation
– Intermediate code optimization
– Target (machine-dependent) code generation
– Target code optimization
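Figure: structure of a compiler. The source program flows through the stages below; the symbol table manager and the error handler interact with every stage.
– Lexical analyzer
– Syntax analyzer
– Semantic analyzer
– Intermediate code generator
– Code optimizer
– Target code generator and optimizer
The figure labels the early stages as the front end (analysis) and the later stages as the back end (synthesis).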
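Lexical Analysis
Source code:
    int main() {
        for (i = 0; i < 5; i++) {
            printf("Hello World");
        }
    }
Token stream:
    int  main  (  )  {  for  (  i  =  0  ;  i  <  5  ;  i  ++  )  {  printf  (  "Hello World"  )  ;  }  }
• Tokens are recognized by feeding the input characters to a finite automaton, which either accepts or rejects each candidate lexeme.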
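As a rough illustration of this phase, the sketch below splits the example program into tokens. It is only a sketch under stated assumptions: Python's `re` engine stands in for the finite automaton, and the token classes (`STRING`, `NUMBER`, `IDENT`, `OP`, `PUNCT`) and the keyword set are illustrative choices covering just this example, not a full C lexer.

```python
import re

# Token classes for a tiny C-like subset (illustrative, not a full C lexer).
TOKEN_SPEC = [
    ("STRING", r'"[^"\n]*"'),
    ("NUMBER", r"\d+"),
    ("IDENT",  r"[A-Za-z_]\w*"),
    ("OP",     r"\+\+|[=<+]"),      # "++" is tried before a single "+"
    ("PUNCT",  r"[(){};]"),
    ("SKIP",   r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))
KEYWORDS = {"int", "for"}

def tokenize(source):
    """Return a list of (kind, lexeme) pairs; raise ValueError on a bad character."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise ValueError(f"lexical error at position {pos}: {source[pos]!r}")
        kind, lexeme = match.lastgroup, match.group()
        if kind == "IDENT" and lexeme in KEYWORDS:
            kind = "KEYWORD"
        if kind != "SKIP":              # whitespace separates tokens but is not one
            tokens.append((kind, lexeme))
        pos = match.end()
    return tokens

program = 'int main() { for (i = 0; i < 5; i++) { printf("Hello World"); } }'
lexemes = [lexeme for _, lexeme in tokenize(program)]
```

Here `lexemes` holds the token stream in source order, and a character that no pattern accepts (for example `£`) is reported as a lexical error.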
Syntax Analysis
• Context-free grammar
• Checks whether the token stream meets the grammatical specification of
the language and generates the syntax tree.
• If the program is grammatically correct, this phase generates an internal
representation that is easy to manipulate in later phases, typically a syntax
tree (also called a parse tree).
• The grammar of a programming language is typically described by a
context-free grammar, which also defines the structure of the parse tree.
• Such grammars are commonly written in a notation such as Backus-Naur Form (BNF).
Syntax Analysis
CFG for arithmetic expressions:
<expression> --> number
<expression> --> ( <expression> )
<expression> --> <expression> + <expression>
<expression> --> <expression> - <expression>
<expression> --> <expression> * <expression>
<expression> --> <expression> / <expression>
Parsing: 1-1+1*1
Parse tree:
              -
            /   \
       number     +
                /   \
           number     *
                    /   \
               number   number
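The grammar above is ambiguous and left-recursive, so a simple top-down parser cannot use it as written. The sketch below is therefore an assumption-laden variant: it uses the common layered rewriting into expression, term and factor (names chosen here, not taken from the slides), which gives `*` and `/` higher precedence and makes all operators left-associative. Under that rewriting, `1-1+1*1` yields a tree rooted at `+`, which is a different tree from one rooted at the first `-`; both are derivable from the original grammar.

```python
import re

def tokenize(text):
    """Split an arithmetic expression into number and operator tokens."""
    tokens = re.findall(r"\d+|[-+*/()]", text)
    if "".join(tokens) != text.replace(" ", ""):
        raise SyntaxError("unexpected character in input")
    return tokens

class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        token = self.peek()
        self.pos += 1
        return token

    def expression(self):               # expression := term (("+" | "-") term)*
        node = self.term()
        while self.peek() in ("+", "-"):
            op = self.advance()
            node = (op, node, self.term())
        return node

    def term(self):                     # term := factor (("*" | "/") factor)*
        node = self.factor()
        while self.peek() in ("*", "/"):
            op = self.advance()
            node = (op, node, self.factor())
        return node

    def factor(self):                   # factor := number | "(" expression ")"
        token = self.advance()
        if token is not None and token.isdigit():
            return ("number", int(token))
        if token == "(":
            node = self.expression()
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        raise SyntaxError(f"unexpected token {token!r}")

def parse(text):
    """Return a nested-tuple syntax tree, or raise SyntaxError."""
    parser = Parser(tokenize(text))
    tree = parser.expression()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return tree

tree = parse("1-1+1*1")
```

Calling `parse("1+*1")` raises `SyntaxError`, because no rule allows `*` to follow `+` directly.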
Semantic Analysis
• Semantic analysis is applied by a compiler to discover
the meaning of a program by analyzing its parse tree or
abstract syntax tree.
• A program without grammatical errors is not necessarily
a correct program.
Semantic Analysis
• Static semantic checks: performed at
compile time.
• Dynamic semantic checks: performed at
run time, and the compiler produces code
that performs these checks.
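The difference between the two kinds of check can be sketched with array indexing. This is a minimal illustration, not how any particular compiler works: the function names are hypothetical, the "generated code" is Python source text, and `exec` stands in for running the compiled program.

```python
def check_index_type(index_type):
    """Static check: done by the compiler, before the program runs."""
    if index_type != "int":
        raise TypeError(f"array index must be int, not {index_type}")

def emit_indexing(array_name, index_name):
    """Dynamic check: the compiler emits code that tests the bound at run time."""
    return (
        f"if not (0 <= {index_name} < len({array_name})):\n"
        f"    raise IndexError('index out of bounds')\n"
        f"result = {array_name}[{index_name}]\n"
    )

def compile_and_run(index_type, variables):
    check_index_type(index_type)           # compile time: only types are known
    generated = emit_indexing("a", "i")    # code generation: the check is emitted
    exec(generated, {}, variables)         # run time: actual values are known
    return variables["result"]
```

The type of the index is known at compile time, so a wrong type is rejected before any code is produced. The value of the index is known only when the program runs, so the bound test has to travel with the generated code.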
Code Generation and Intermediate
Code Forms
• A typical intermediate form of code produced by the
semantic analyzer is an abstract syntax tree (AST)
• The AST is annotated with useful information such
as pointers to the symbol table entry of identifiers
AST of the code:
    while b ≠ 0
        if a > b
            a := a − b
        else
            b := b − a
    return a
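One way to picture the AST for this code is as a set of node objects. The sketch below is an assumption-heavy illustration: the node classes (`While`, `If`, `Assign`, and so on) are names chosen here, and the small tree-walking `execute` function is included only to show that the tree carries the full meaning of the program.

```python
from dataclasses import dataclass

@dataclass
class Num:
    value: int

@dataclass
class Var:
    name: str

@dataclass
class BinOp:
    op: str              # one of "-", ">", "!="
    left: object
    right: object

@dataclass
class Assign:
    target: str
    value: object

@dataclass
class If:
    test: object
    then: list
    orelse: list

@dataclass
class While:
    test: object
    body: list

@dataclass
class Return:
    value: object

# The AST of the subtraction-based program shown above.
GCD_AST = [
    While(BinOp("!=", Var("b"), Num(0)), [
        If(BinOp(">", Var("a"), Var("b")),
           [Assign("a", BinOp("-", Var("a"), Var("b")))],
           [Assign("b", BinOp("-", Var("b"), Var("a")))]),
    ]),
    Return(Var("a")),
]

OPS = {"-": lambda x, y: x - y, ">": lambda x, y: x > y, "!=": lambda x, y: x != y}

def evaluate(node, env):
    """Compute the value of an expression node."""
    if isinstance(node, Num):
        return node.value
    if isinstance(node, Var):
        return env[node.name]
    return OPS[node.op](evaluate(node.left, env), evaluate(node.right, env))

def execute(statements, env):
    """Run a statement list; return the value of the first Return reached, else None."""
    for stmt in statements:
        if isinstance(stmt, Assign):
            env[stmt.target] = evaluate(stmt.value, env)
        elif isinstance(stmt, If):
            branch = stmt.then if evaluate(stmt.test, env) else stmt.orelse
            result = execute(branch, env)
            if result is not None:
                return result
        elif isinstance(stmt, While):
            while evaluate(stmt.test, env):
                result = execute(stmt.body, env)
                if result is not None:
                    return result
        elif isinstance(stmt, Return):
            return evaluate(stmt.value, env)
    return None
```

For positive inputs, `execute(GCD_AST, {"a": 12, "b": 18})` walks the tree and returns the greatest common divisor of the two numbers.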
Code Generation and Intermediate
Code Forms
• There are other intermediate code forms
such as three-address code and static
single assignment (SSA) form.
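To make three-address code concrete, the sketch below flattens an expression tree into instructions with at most one operator each. The nested-tuple tree format and the temporary names `t1`, `t2`, ... are assumptions made for this illustration.

```python
import itertools

def to_three_address(tree):
    """Flatten a nested (op, left, right) tuple into three-address instructions."""
    code = []
    temps = (f"t{n}" for n in itertools.count(1))   # fresh temporary names

    def visit(node):
        if not isinstance(node, tuple):
            return str(node)             # a variable name or a constant
        op, left, right = node
        left_name = visit(left)
        right_name = visit(right)
        temp = next(temps)
        code.append(f"{temp} = {left_name} {op} {right_name}")
        return temp

    result = visit(tree)
    return code, result

# (a - b) + (c * 2)
code, result = to_three_address(("+", ("-", "a", "b"), ("*", "c", 2)))
```

For this input, `code` is `["t1 = a - b", "t2 = c * 2", "t3 = t1 + t2"]`. Each temporary is assigned exactly once, which is the property that SSA form requires of every variable in the program.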
Target Code Generation and Optimization
• From the machine-independent form, the compiler generates
assembly or object code.
• This machine-specific code is optimized to exploit
specific hardware features.
• In short, a compiler's code generator converts the
intermediate representation produced by intermediate
code generation and optimization into a form
(e.g., machine code) that can be readily executed by
a machine.
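The last step can be sketched for a made-up target. Everything about the target here is hypothetical: a stack machine with the instructions `PUSH`, `LOAD`, `ADD`, `SUB`, `MUL` and `DIV` (integer division), plus a tiny interpreter that plays the role of the hardware. A real back end would also choose registers and apply machine-specific optimizations, which this sketch leaves out.

```python
import operator

OPCODES = {"+": "ADD", "-": "SUB", "*": "MUL", "/": "DIV"}
ARITHMETIC = {
    "ADD": operator.add,
    "SUB": operator.sub,
    "MUL": operator.mul,
    "DIV": operator.floordiv,    # integer division on this hypothetical machine
}

def generate(tree):
    """Emit stack-machine instructions for an (op, left, right) tuple tree."""
    if isinstance(tree, int):
        return [("PUSH", tree)]          # constant operand
    if isinstance(tree, str):
        return [("LOAD", tree)]          # variable operand
    op, left, right = tree
    # Post-order: both operands are on the stack before the operator runs.
    return generate(left) + generate(right) + [(OPCODES[op], None)]

def run(program, memory):
    """Execute the instructions on an operand stack and return the final value."""
    stack = []
    for opcode, arg in program:
        if opcode == "PUSH":
            stack.append(arg)
        elif opcode == "LOAD":
            stack.append(memory[arg])
        else:
            right = stack.pop()
            left = stack.pop()
            stack.append(ARITHMETIC[opcode](left, right))
    return stack.pop()

# (a - b) + (c * 2)
program = generate(("+", ("-", "a", "b"), ("*", "c", 2)))
value = run(program, {"a": 7, "b": 2, "c": 3})
```

The same tree could be translated for a different machine by replacing only `generate`, which is the point of keeping the intermediate form machine-independent.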
Basic Understanding: Compiler (Summary)
• Compiler front-end: lexical analysis, syntax analysis,
semantic analysis
– Tasks: understanding the source code, making sure
the source code is written correctly
• Compiler back-end: Intermediate code
generation/improvement, and Machine code
generation/improvement
– Tasks: translating the program into a semantically
equivalent program (in a different language) that can be
easily understood and executed by the machine.
Error Detection in Compilers
• A compiler should detect all errors in the
source code and report them to the user.
Error Detection: Types of Error
• Compilation errors
– Lexical errors
– Syntactic errors
– Semantic errors
• Execution errors
– Run-time errors
Errors during Lexical analysis
• Strange characters that are not allowed in the language (e.g., ñ, श, £)
• Long quoted strings
• Invalid numbers (e.g., 12231.545.23)
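The invalid-number case shows how a finite automaton rejects input. The sketch below is a hypothetical automaton for unsigned decimal literals only (digits, optionally followed by a dot and more digits); the state names and the literal syntax are assumptions, since real languages allow more forms.

```python
# States of a hypothetical DFA for literals such as 12 or 12231.545.
START, INT, DOT, FRAC, REJECT = "start", "int", "dot", "frac", "reject"
ACCEPTING = {INT, FRAC}

# Transition table; any (state, input class) pair not listed leads to REJECT.
TRANSITIONS = {
    (START, "digit"): INT,
    (INT, "digit"): INT,
    (INT, "dot"): DOT,
    (DOT, "digit"): FRAC,
    (FRAC, "digit"): FRAC,
}

def classify(ch):
    """Map a character to the input class used by the transition table."""
    if ch in "0123456789":
        return "digit"
    if ch == ".":
        return "dot"
    return "other"

def accepts(text):
    """Return True if the automaton ends in an accepting state."""
    state = START
    for ch in text:
        state = TRANSITIONS.get((state, classify(ch)), REJECT)
        if state == REJECT:
            return False
    return state in ACCEPTING
```

With this table, `accepts("12231.545")` is true, while `accepts("12231.545.23")` is false because no transition exists for a second dot.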
Errors during Syntax Analysis
• A syntax error is produced by the compiler when the
program does not meet the grammatical
specification.
• Example: under the CFG for arithmetic expressions given earlier, 1+*1 is a syntax error.
Errors during Semantic Analysis
• One of the most common errors reported during
semantic analysis is "identifier not declared"; either you
have omitted a declaration or you have misspelt an
identifier.
• Another error is assignment of incompatible types.
• Other possible sources of semantic errors are parameter
miscount and subscript miscount.
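Three of these errors can be sketched with a symbol table. The statement format below (tuples such as `("declare", name, type)`) and the built-in function `max` with two parameters are assumptions invented for this illustration; a real semantic analyzer would walk the syntax tree instead.

```python
def check(statements):
    """Return a list of semantic error messages for a tiny statement list."""
    variables = {}               # symbol table: variable name -> declared type
    functions = {"max": 2}       # hypothetical function table: name -> parameter count
    errors = []
    for stmt in statements:
        kind = stmt[0]
        if kind == "declare":            # ("declare", name, type)
            _, name, type_name = stmt
            variables[name] = type_name
        elif kind == "assign":           # ("assign", name, type of the assigned value)
            _, name, value_type = stmt
            if name not in variables:
                errors.append(f"identifier not declared: {name}")
            elif variables[name] != value_type:
                errors.append(
                    f"incompatible types: cannot assign {value_type} "
                    f"to {variables[name]} variable {name}"
                )
        elif kind == "call":             # ("call", function name, argument count)
            _, name, argument_count = stmt
            if name not in functions:
                errors.append(f"identifier not declared: {name}")
            elif functions[name] != argument_count:
                errors.append(
                    f"parameter miscount: {name} expects {functions[name]} "
                    f"argument(s), got {argument_count}"
                )
    return errors

errors = check([
    ("declare", "count", "int"),
    ("assign", "count", "int"),      # fine
    ("assign", "coutn", "int"),      # misspelt identifier
    ("assign", "count", "string"),   # incompatible types
    ("call", "max", 3),              # one argument too many
])
```

Every statement here is grammatically well formed, so only the semantic phase, with its record of declarations, can report the three problems.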