Overview
Compilers
Compilers translate from a source language (typically a high-level language) to a functionally equivalent target language (typically the machine code of a particular machine or of a machine-independent virtual machine). Compilers for high-level programming languages are among the larger and more complex pieces of software.
Original languages included Fortran and Cobol
Often multi-pass compilers (to facilitate memory reuse)
Compiler development helped in better programming language design
Early development focused on syntactic analysis and optimization
Commercially, compilers are developed by very large software groups
Current focus is on optimization and smart use of resources for modern RISC (reduced instruction set computer) architectures
token stream

token number    token value
1 (ident)       "val"
3 (assign)      -
2 (number)      10
4 (times)       -
1 (ident)       "val"
5 (plus)        -
1 (ident)       "i"

[Figure: compiler pipeline. Labels: statement, token stream, syntax tree, front end, machine code.]
Interpreter
Source code is translated into the code of a virtual machine (VM).
The VM interprets the code, simulating the physical machine.
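The idea of a VM interpreting translated code can be sketched with a tiny stack machine. This is only an illustration: the opcodes (PUSH, LOAD, STORE, ADD, MUL), the function name `run`, and the sample program are all hypothetical and do not correspond to any real VM. The program shown is one plausible translation of the statement val = 10 * val + i.

```python
def run(program, variables):
    """Interpret a list of (opcode, argument) pairs on a simulated stack machine."""
    stack = []
    for opcode, argument in program:
        if opcode == "PUSH":            # push a constant
            stack.append(argument)
        elif opcode == "LOAD":          # push the value of a variable
            stack.append(variables[argument])
        elif opcode == "STORE":         # pop the top of the stack into a variable
            variables[argument] = stack.pop()
        elif opcode in ("ADD", "MUL"):  # pop two operands, push the result
            right, left = stack.pop(), stack.pop()
            stack.append(left + right if opcode == "ADD" else left * right)
        else:
            raise ValueError(f"unknown opcode {opcode!r}")
    return variables


# Hypothetical VM code for: val = 10 * val + i
program = [
    ("PUSH", 10), ("LOAD", "val"), ("MUL", None),
    ("LOAD", "i"), ("ADD", None), ("STORE", "val"),
]
result = run(program, {"val": 3, "i": 4})
```

The interpreter loop plays the role of the physical machine's fetch-and-execute cycle, which is why the same VM code can run on any machine that has such an interpreter.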
The symbol table maintains information about declared names and types.
Lexical Analysis
The stream of characters is grouped into tokens.
Examples of tokens are identifiers, reserved words, integers, doubles or floats, delimiters, operators and special symbols.
int a; a = a + 2;
int    reserved word
a      identifier
;      special symbol
a      identifier
=      operator
a      identifier
+      operator
2      integer constant
;      special symbol
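The grouping of characters into tokens can be sketched with regular expressions. This is a minimal illustration, not how any particular compiler does it: the set of reserved words, the operator and special-symbol patterns, and the function name `tokenize` are assumptions made for the example.

```python
import re

# Assumed reserved words for a tiny C-like language (hypothetical, for illustration).
RESERVED = {"int", "if", "else", "while", "return"}

# Token classes tried in order; "skip" and "invalid" are handled specially below.
TOKEN_SPEC = [
    ("integer constant", r"\d+"),
    ("identifier",       r"[A-Za-z_]\w*"),
    ("operator",         r"<=|>=|==|!=|[=+\-*/<>]"),
    ("special symbol",   r"[;(){},]"),
    ("skip",             r"\s+"),
    ("invalid",          r"."),
]
# Group names cannot contain spaces, so each class is named g0, g1, ...
MASTER = re.compile(
    "|".join(f"(?P<g{i}>{pattern})" for i, (_, pattern) in enumerate(TOKEN_SPEC))
)


def tokenize(source):
    """Return a list of (token class, text) pairs for the source string."""
    tokens = []
    for match in MASTER.finditer(source):
        kind = TOKEN_SPEC[int(match.lastgroup[1:])][0]
        text = match.group()
        if kind == "skip":
            continue                      # whitespace separates tokens
        if kind == "invalid":
            raise ValueError(f"invalid character {text!r} at offset {match.start()}")
        if kind == "identifier" and text in RESERVED:
            kind = "reserved word"        # reserved words look like identifiers
        tokens.append((kind, text))
    return tokens


tokens = tokenize("int a; a = a + 2;")
```

Applied to the example statement, the sketch yields the same sequence of token classes as listed above.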
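[Figure: parse tree for a = a + 2. The root is the assignment operator =, with left child a and right child the + node, whose children are a and 2.]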
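How a token sequence becomes a tree can be sketched with a small recursive-descent parser. The grammar here is an assumption chosen for the example (one assignment whose right-hand side uses only + and *, with * binding tighter), and the function name `parse_assignment` and the nested-tuple tree format are hypothetical.

```python
def parse_assignment(tokens):
    """Parse the tokens of `name = expr ;` into a nested (operator, left, right) tuple."""
    position = 0

    def peek():
        return tokens[position] if position < len(tokens) else None

    def expect(text):
        nonlocal position
        if peek() != text:
            raise SyntaxError(f"expected {text!r}, found {peek()!r}")
        position += 1

    def operand():
        # An operand is an identifier or an integer constant.
        nonlocal position
        token = peek()
        if token is None or not token.replace("_", "a").isalnum():
            raise SyntaxError(f"expected an operand, found {token!r}")
        position += 1
        return token

    def term():
        # term := operand { "*" operand }
        node = operand()
        while peek() == "*":
            expect("*")
            node = ("*", node, operand())
        return node

    def expression():
        # expression := term { "+" term }
        node = term()
        while peek() == "+":
            expect("+")
            node = ("+", node, term())
        return node

    target = operand()
    expect("=")
    tree = ("=", target, expression())
    expect(";")
    return tree


tree = parse_assignment(["a", "=", "a", "+", "2", ";"])
# tree is ("=", "a", ("+", "a", "2"))
```

Each grammar rule becomes one function, so the shape of the call stack mirrors the shape of the tree being built.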
Semantic Analysis
The parse tree is checked for constructs that violate the semantic rules of the language
Semantic rules may be written with an attribute grammar
Examples:
Using undeclared variables
Function called with improper arguments (number and type of arguments)
Array variables used without array syntax
Type checking of operator arguments
Left-hand side of an assignment must be a variable (sometimes called an L-value)
...
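Three of the checks listed above (undeclared variables, operator argument types, and the L-value rule) can be sketched as a walk over a small tree. The tree format (nested `(operator, left, right)` tuples with string leaves), the type names, the error messages, and the function name `check_assignment` are assumptions made for this example.

```python
def check_assignment(tree, declared):
    """Return a list of semantic error messages for an ("=", target, value) tree.

    `declared` maps each declared variable name to its type name.
    """
    errors = []

    def type_of(node):
        if isinstance(node, tuple):            # operator node
            operator, left, right = node
            left_type, right_type = type_of(left), type_of(right)
            if left_type is None or right_type is None:
                return None                    # an error was already reported below
            if left_type != right_type:
                errors.append(f"operator {operator} applied to {left_type} and {right_type}")
                return None
            return left_type
        if node.isdigit():                     # integer constant
            return "int"
        if node not in declared:
            errors.append(f"undeclared variable {node}")
            return None
        return declared[node]

    _, target, value = tree
    if isinstance(target, tuple) or target.isdigit():
        errors.append("left-hand side of assignment is not a variable")
        type_of(value)                         # still check the right-hand side
    else:
        target_type, value_type = type_of(target), type_of(value)
        if target_type and value_type and target_type != value_type:
            errors.append(f"cannot assign {value_type} to {target_type} variable {target}")
    return errors


errors = check_assignment(("=", "a", ("+", "b", "2")), {"a": "int"})
```

Returning a list instead of stopping at the first problem lets the compiler report several semantic errors in one run.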
if (a <= b)
{ a = a c; }
c=b*c
Code Optimization
The compiler converts the intermediate representation into another one that aims to be smaller and faster. Typical optimizations:
Inhibit code generation for unreachable segments
Getting rid of unused variables
Eliminating multiplication by 1 and addition by 0
Loop optimization: e.g. moving statements whose results do not change inside the loop out of the loop
Common sub-expression elimination
...
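One of the optimizations listed above, eliminating multiplication by 1 and addition by 0, can be sketched as a rewrite over an expression tree. The tree format (nested `(operator, left, right)` tuples, with integers for constants and strings for variable names) and the function name `simplify` are assumptions made for this example.

```python
def simplify(node):
    """Remove `* 1` and `+ 0` from an expression tree, working bottom-up."""
    if not isinstance(node, tuple):
        return node                      # a constant or a variable name
    operator, left, right = node
    left, right = simplify(left), simplify(right)
    if operator == "*" and right == 1:   # x * 1  ->  x
        return left
    if operator == "*" and left == 1:    # 1 * x  ->  x
        return right
    if operator == "+" and right == 0:   # x + 0  ->  x
        return left
    if operator == "+" and left == 0:    # 0 + x  ->  x
        return right
    return (operator, left, right)


optimized = simplify(("+", ("*", "x", 1), 0))
# optimized is "x"
```

Simplifying the children first matters: removing an inner `* 1` or `+ 0` can expose another removable operation one level up, as in the example.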
JIT (Just-In-Time) compilation of intermediate code (e.g. Java bytecode) can discover more context-specific optimizations not available earlier.
Symbol Table
Symbol table management is a part of the compiler that interacts with several of the phases
Identifiers are found in lexical analysis and placed in the symbol table
During syntactic and semantic analysis, type and scope information is added
During code generation, type information is used to determine which instructions to use
During optimization, the results of liveness analysis may be kept in the symbol table
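A symbol table that records type and scope information can be sketched as a stack of dictionaries, one per open scope. The class name `SymbolTable`, its method names, and the choice to store a small dictionary per name are assumptions made for this example; real compilers store far more per entry.

```python
class SymbolTable:
    """A stack of scopes; each scope maps a declared name to its attributes."""

    def __init__(self):
        self.scopes = [{}]               # start with the global scope

    def enter_scope(self):
        self.scopes.append({})

    def exit_scope(self):
        self.scopes.pop()

    def declare(self, name, type_name):
        scope = self.scopes[-1]
        if name in scope:
            raise KeyError(f"{name} is already declared in this scope")
        scope[name] = {"type": type_name}

    def lookup(self, name):
        # Search from the innermost scope outwards; None means undeclared.
        for scope in reversed(self.scopes):
            if name in scope:
                return scope[name]
        return None


table = SymbolTable()
table.declare("a", "int")
table.enter_scope()
table.declare("a", "float")              # shadows the outer declaration
inner = table.lookup("a")
table.exit_scope()
outer = table.lookup("a")
```

Later phases can attach further attributes to the same entry, which is how one table ends up serving several phases.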
Error Handling
Error handling and reporting also occur across many phases
Lexical analyzer reports invalid character sequences
Syntactic analyzer reports invalid token sequences
Semantic analyzer reports type and scope errors, and the like
The compiler may be able to continue with some errors, but other errors may stop the process
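The difference between errors the compiler can continue past and errors that stop it can be sketched with a shared diagnostics collector. Everything here is hypothetical: the names `Diagnostics`, `CompileError`, and `scan`, the `fatal` flag, the error limit, and the set of allowed characters are assumptions made for the example.

```python
class CompileError(Exception):
    """Raised when compilation cannot continue."""


class Diagnostics:
    """Collects messages from all phases; fatal errors stop the compilation."""

    def __init__(self, max_errors=10):
        self.messages = []
        self.max_errors = max_errors

    def report(self, phase, message, fatal=False):
        self.messages.append(f"{phase}: {message}")
        if fatal or len(self.messages) >= self.max_errors:
            raise CompileError(f"stopping after {len(self.messages)} error(s)")


def scan(source, diagnostics):
    """Drop invalid characters, reporting each one, and keep scanning."""
    allowed_punctuation = set("=+-*/;(){}<> ")
    kept = []
    for offset, character in enumerate(source):
        if character.isalnum() or character == "_" or character in allowed_punctuation:
            kept.append(character)
        else:
            diagnostics.report("lexical", f"invalid character {character!r} at offset {offset}")
    return "".join(kept)


diagnostics = Diagnostics()
cleaned = scan("a = a @ 2 $;", diagnostics)
```

Here the lexical errors are recoverable, so the scan finishes and both problems are reported in a single run; a later phase could call `report` with `fatal=True` when it cannot make sense of the rest of the input.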