Chapter 1
Introduction to Compiling
CSE309N
Introduction to Compilers
As a Discipline, Involves Multiple CS&E Areas
Programming Languages and Algorithms
Theory of Computing & Software Engineering
Computer Architecture & Operating Systems
Has a Deceptively Simple Intent:
Source Program -> Compiler -> Target Program
The Compiler Also Reports Error Messages
Classifications of Compilers
Compilers Viewed from Many Perspectives
By Construction: Single Pass, Multiple Pass
By Function: Load & Go, Debugging, Optimizing
The Model
The TWO Fundamental Parts: Analysis and Synthesis
Important Notes
Today: There are many Software Tools that help with the Analysis Part. This Wasn’t the Case in the Early Days.
(Some) analysis is also important in:
Structure / Syntax-directed editors: Force “syntactically” correct code to be entered
Take input as a sequence of commands to build a source program.
Perform:
– Text creation
– Text modification
– Analysis of the source program
Important Notes (Continued)
Pretty Printers: Produce a standardized layout of the program structure (i.e., blank space, indenting, etc.)
Analyze the source program and print it in such a way that the structure of the program becomes clearly visible.
Examples
Comments may appear in a special font
Statements may appear with an amount of indentation proportional to the depth of their nesting in the hierarchical organization of the statements.
Static Checkers: A “quick” compilation to detect rudimentary errors
Examples
Detect parts of the program that can never be executed
Detect a variable that is used before it is defined
Interpreters: “Real-time” execution of code, one line at a time
Important Notes (Continued)
Compilation Is Not Limited to Programming Language Applications
Text Formatters
LaTeX & troff Are Languages Whose Commands Format Text (paragraphs, figures, mathematical structures, etc.)
Silicon Compilers
Textual / Graphical: Take Input and Generate a Circuit Design
Database Query Processors
Database Query Languages Are Also Programming Languages
Input Is Compiled Into a Set of Operations for Accessing the Database
The Many Phases of a Compiler
Source Program
1. Lexical Analyzer
2. Syntax Analyzer
3. Semantic Analyzer
4. Intermediate Code Generator
5. Code Optimizer
6. Code Generator
Target Program
Language-Processing System
Skeleton Source Program
-> Pre-Processor -> Source Program
-> Compiler -> Target Assembly Program
-> Assembler -> Relocatable Machine Code
-> Loader / Link-Editor (Also Takes Library and Relocatable Object Files) -> Executable
The Analysis Task For Compilation
Three Phases:
Linear / Lexical Analysis:
Left-to-Right Scan to Identify Tokens
token: sequence of characters having a collective meaning
Hierarchical Analysis:
Grouping of Tokens Into Meaningful Collections
Semantic Analysis:
Checking to Ensure Correctness of Components
Phase 1. Lexical Analysis
For Example:
position := initial + rate * 60 ;
Tokens: position, :=, initial, +, rate, *, 60, ;
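The left-to-right scan on this slide can be sketched as a small regular-expression scanner. This is a minimal illustration, not the course's implementation: the token class names (ID, NUMBER, ASSIGN, and so on) and the function name `tokenize` are assumptions made for this example.

```python
import re

# Token classes are hypothetical names chosen for this sketch.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("ID", r"[A-Za-z_]\w*"),
    ("ASSIGN", r":="),
    ("PLUS", r"\+"),
    ("TIMES", r"\*"),
    ("SEMI", r";"),
    ("SKIP", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))


def tokenize(text):
    """Scan left to right and return a list of (kind, lexeme) pairs."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at offset {pos}")
        if match.lastgroup != "SKIP":  # blanks separate tokens but are not tokens
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens
```

Calling `tokenize("position := initial + rate * 60 ;")` yields eight tokens, one per underlined group in the example.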
Phase 2. Hierarchical Analysis
Parsing or Syntax Analysis
For the previous example, we would have the Parse Tree:
assignment statement
  identifier: position
  :=
  expression
    expression
      identifier: initial
    +
    expression
      expression
        identifier: rate
      *
      expression
        number: 60
An expression is:
(expression), or
expression + expression, or
expression * expression, or
number, or
identifier, or ...
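The recursive rules above can be turned into a recursive-descent parser. The sketch below is one possible reading, not the method prescribed by the slides: the slide grammar is ambiguous, so the code assumes the usual convention that `*` binds tighter than `+` and that both associate to the left. The function name `parse_assignment` and the tuple-based tree shape are hypothetical.

```python
def parse_assignment(tokens):
    """Parse 'identifier := expression' from a list of lexemes into nested tuples."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        tok = peek()
        if tok is None:
            raise SyntaxError("unexpected end of input")
        pos += 1
        return tok

    def factor():
        # factor: ( expression ) | number | identifier
        tok = take()
        if tok == "(":
            node = expression()
            if take() != ")":
                raise SyntaxError("expected ')'")
            return node
        if tok.isdigit():
            return ("number", tok)
        if tok.isidentifier():
            return ("identifier", tok)
        raise SyntaxError(f"unexpected token {tok!r}")

    def term():
        # term: factor { * factor }   (assumed precedence: * above +)
        node = factor()
        while peek() == "*":
            take()
            node = ("*", node, factor())
        return node

    def expression():
        # expression: term { + term }
        node = term()
        while peek() == "+":
            take()
            node = ("+", node, term())
        return node

    target = take()
    if not target.isidentifier():
        raise SyntaxError("expected an identifier on the left of ':='")
    if take() != ":=":
        raise SyntaxError("expected ':='")
    tree = (":=", ("identifier", target), expression())
    if peek() is not None:
        raise SyntaxError(f"unexpected trailing token {peek()!r}")
    return tree
```

For the lexemes of `position := initial + rate * 60`, the result has the same shape as the parse tree on the slide, with `rate * 60` grouped below the `+`.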
Why Have We Divided Analysis in This Manner?
Lexical Analysis - Scans the Input; Its Linear Actions Are Not Recursive
Identifies Only the Individual “words” that Are the Tokens of the Language
Phase 3. Semantic Analysis
Find More Complicated Semantic Errors and Support Code Generation
Parse Tree Is Augmented With Semantic Actions
Compressed Tree:
:=
  position
  +
    initial
    *
      rate
      60
Conversion Action:
:=
  position
  +
    initial
    *
      rate
      inttoreal
        60
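The conversion action can be sketched as a type-checking walk over the compressed tree. This is a minimal sketch under stated assumptions: only the types `int` and `real` exist, the three identifiers are declared `real` (the slides do not give their declarations), and the function name `check` and the tuple tree shape are hypothetical.

```python
def check(node, types):
    """Return (typed_tree, type), inserting inttoreal where int meets real."""
    kind = node[0]
    if kind == "number":
        return node, ("real" if "." in node[1] else "int")
    if kind == "identifier":
        if node[1] not in types:
            raise NameError(f"undeclared identifier {node[1]!r}")
        return node, types[node[1]]

    # Remaining node kinds are binary: "+", "*" or ":=".
    left, ltype = check(node[1], types)
    right, rtype = check(node[2], types)

    if kind == ":=":
        if ltype == "real" and rtype == "int":
            right = ("inttoreal", right)      # widen the assigned value
        elif ltype != rtype:
            raise TypeError(f"cannot assign {rtype} to {ltype}")
        return (kind, left, right), ltype

    if ltype != rtype:                        # mixed int/real arithmetic
        if ltype == "int":
            left = ("inttoreal", left)
        else:
            right = ("inttoreal", right)
        return (kind, left, right), "real"
    return (kind, left, right), ltype
```

With `position`, `initial` and `rate` all assumed to be `real`, the only change to the tree is that the leaf `60` becomes `inttoreal(60)`, as in the Conversion Action tree.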
Phase 3. Semantic Analysis
Most Important Activity in This Phase:
Type Checking - Legality of Operands
Many Different Situations:
Analysis in Text Formatting
Language Commands Are Embedded in a Single Stream of Text, i.e., a FILE:
\begin{proof} ... \end{proof}    (begin / end)
\noindent                        (noindent)
\section{Introduction}           (section)
$A_i$
$A_{i_j}$
\ and $ serve as signals to LaTeX
The Many Phases of a Compiler
Source Program
1. Lexical Analyzer
2. Syntax Analyzer
3. Semantic Analyzer
4. Intermediate Code Generator
5. Code Optimizer
6. Code Generator
Target Program
The Synthesis Task For Compilation
Intermediate Code Generation
Abstract Machine Version of Code - Independent of
Architecture
Easy to Produce and
Easy to translate into target program
Code Optimization
Find More Efficient Ways to Execute Code
Replace Code With More Efficient Statements
Final Code Generation
Generate Relocatable Machine Dependent Code
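Intermediate code generation can be illustrated with a post-order walk that emits three-address statements. The sketch below is an assumption-laden example rather than the course's algorithm: the tuple tree shape, the `tempN` naming scheme and the function name `gen_three_address` are all hypothetical.

```python
def gen_three_address(tree):
    """Flatten an assignment tree into a list of three-address statements."""
    code = []
    counter = 0

    def new_temp():
        nonlocal counter
        counter += 1
        return f"temp{counter}"

    def walk(node):
        kind = node[0]
        if kind in ("identifier", "number"):
            return node[1]                    # leaves need no instruction
        if kind == "inttoreal":
            arg = walk(node[1])
            temp = new_temp()
            code.append(f"{temp} := inttoreal({arg})")
            return temp
        # Binary operator: evaluate both operands, then combine them.
        left = walk(node[1])
        right = walk(node[2])
        temp = new_temp()
        code.append(f"{temp} := {left} {kind} {right}")
        return temp

    if tree[0] != ":=":
        raise ValueError("expected an assignment at the root")
    value = walk(tree[2])
    code.append(f"{tree[1][1]} := {value}")
    return code
```

Each statement has at most one operator, and every intermediate result gets its own temporary, which keeps the code independent of any particular machine.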
Reviewing the Entire Process
Symbol Table: position ...., initial ...., rate ....  |  Errors
position := initial + rate * 60
lexical analyzer
id1 := id2 + id3 * 60
syntax analyzer
:=
  id1
  +
    id2
    *
      id3
      60
semantic analyzer
:=
  id1
  +
    id2
    *
      id3
      inttoreal
        60
intermediate code generator
Reviewing the Entire Process
Symbol Table: position ...., initial ...., rate ....  |  Errors
intermediate code generator
temp1 := inttoreal(60)
temp2 := id3 * temp1
temp3 := id2 + temp2
id1 := temp3
(3-address code)
code optimizer
temp1 := id3 * 60.0
id1 := id2 + temp1
final code generator
MOVF id3, R2
MULF #60.0, R2
MOVF id2, R1
ADDF R2, R1
MOVF R1, id1
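The code optimizer step above can be approximated with three small passes over the three-address code. This is a sketch under assumptions, not a general optimizer: instructions are hypothetical `(dest, op, arg1, arg2)` tuples, every temporary is assumed to be used exactly once, and the function name `optimize` is invented for this example.

```python
def optimize(code):
    """Shrink three-address code given as (dest, op, arg1, arg2) tuples."""
    # Pass 1: perform inttoreal on integer literals at compile time.
    consts = {}
    folded = []
    for dest, op, a, b in code:
        a = consts.get(a, a)
        b = consts.get(b, b)
        if op == "inttoreal" and a.isdigit():
            consts[dest] = f"{float(a)}"      # "60" becomes "60.0"
        else:
            folded.append((dest, op, a, b))

    # Pass 2: merge "x := t" into the instruction that computed temporary t.
    # Assumes t is not used anywhere else.
    merged = []
    for dest, op, a, b in folded:
        if op == "copy" and merged and merged[-1][0] == a and a.startswith("temp"):
            _, prev_op, prev_a, prev_b = merged.pop()
            merged.append((dest, prev_op, prev_a, prev_b))
        else:
            merged.append((dest, op, a, b))

    # Pass 3: renumber the surviving temporaries starting from temp1.
    names = {}

    def rename(x):
        if x is not None and x.startswith("temp"):
            return names.setdefault(x, f"temp{len(names) + 1}")
        return x

    return [(rename(d), op, rename(a), rename(b)) for d, op, a, b in merged]
```

Applied to the four statements produced by the intermediate code generator, the passes leave two statements that correspond to `temp1 := id3 * 60.0` and `id1 := id2 + temp1`.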
Assemblers
Assembly code: names are used for instructions, and names are used for memory addresses.
MOV a, R1
ADD #2, R1
MOV R1, b
Two-pass Assembly:
First Pass: all identifiers are assigned to memory addresses (0-offset)
e.g. substitute 0 for a, and 4 for b
Second Pass: produce relocatable machine code:
0001 01 00 00000000 *    Load
0011 01 10 00000010      Add
0010 01 00 00000100 *    Store
* = relocation bit
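The two passes can be sketched as two short functions. The field layout is inferred from the three machine words on this slide (4-bit operation code, 2-bit register, 2-bit tag, 8-bit address or operand) and is an assumption, as are the 4-byte word size and the names `first_pass` and `second_pass`. Only the instruction forms that appear on the slide are handled.

```python
WORD = 4                 # assumed word size in bytes ("0 for a, and 4 for b")
REGS = {"R1": "01"}      # only R1 appears in the slide's example


def first_pass(program):
    """Assign each memory identifier an address, starting at offset 0."""
    symbols = {}
    for _, src, dst in program:
        for operand in (src, dst):
            is_name = operand not in REGS and not operand.startswith("#")
            if is_name and operand not in symbols:
                symbols[operand] = WORD * len(symbols)
    return symbols


def second_pass(program, symbols):
    """Emit one relocatable machine word (as a bit string) per instruction."""
    words = []
    for op, src, dst in program:
        if op == "MOV" and dst in REGS:          # load: memory -> register
            words.append(f"0001 {REGS[dst]} 00 {symbols[src]:08b} *")
        elif op == "MOV" and src in REGS:        # store: register -> memory
            words.append(f"0010 {REGS[src]} 00 {symbols[dst]:08b} *")
        elif op == "ADD" and src.startswith("#"):  # add immediate constant
            words.append(f"0011 {REGS[dst]} 10 {int(src[1:]):08b}")
        else:
            raise ValueError(f"unsupported instruction: {op} {src}, {dst}")
    return words
```

The trailing `*` marks a word whose address field is relative to offset 0 and therefore has to be adjusted when the program is loaded; the immediate operand `#2` is a constant, so its word carries no mark.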
Loaders and Link-Editors
Loader: takes relocatable machine code, alters the relocatable addresses, and places the altered instructions into memory.
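The address-altering step can be sketched as follows. The word format (four space-separated bit fields, with a trailing `*` marking a relocatable address) is an assumption carried over from the assembler slide, and the function name `relocate` and the load address used in the usage note are illustrative only.

```python
def relocate(words, load_address):
    """Add load_address to the address field of every word marked with '*'."""
    loaded = []
    for word in words:
        fields = word.split()
        if fields[-1] == "*":                     # relocation bit is set
            address = int(fields[3], 2) + load_address
            # Sketch only: does not check that the sum still fits in 8 bits.
            fields = fields[:3] + [f"{address:08b}"]
        loaded.append(" ".join(fields))
    return loaded
```

If the program were loaded at address 15, for instance, an address field of `00000000` would become `00001111` and `00000100` would become `00010011`, while words without the mark would stay unchanged.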
Compiler Cousins: Preprocessors
Provide Input to Compilers
1. Macro Processing
#define X 3
#define Y A*B+C
#define Z getchar()
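Macro processing of this kind is textual substitution, which can be sketched in a few lines. This is a deliberately simplified illustration and not how a real C preprocessor works: it handles only parameterless `#define` lines, does not rescan expansions, and ignores string literals and comments. The function name `expand_macros` is hypothetical.

```python
import re


def expand_macros(source):
    """Record '#define NAME body' lines and substitute NAME in later lines."""
    macros = {}
    output = []
    for line in source.splitlines():
        definition = re.match(r"\s*#define\s+(\w+)\s+(.*)", line)
        if definition:
            macros[definition.group(1)] = definition.group(2).strip()
            continue                      # definitions produce no output
        # Replace whole identifiers only, never parts of longer names.
        expanded = re.sub(
            r"\b\w+\b",
            lambda m: macros.get(m.group(), m.group()),
            line,
        )
        output.append(expanded)
    return "\n".join(output)
```

Given the three definitions above followed by the line `c = Z;`, the sketch emits `c = getchar();`, so the compiler proper never sees the macro names.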
2. File Inclusion
The Contents of defs.h Are Inserted Into main.c Where defs.h Is Included
3. Rational Preprocessors
Augment “Old” Languages With Modern Constructs
Add Macros for If - Then, While, Etc.
#define Can Make C Code More Pascal-like
#define begin {
#define end }
4. Language Extensions for a Database System
An Embedded Database Query Is Preprocessed Into a Procedure Call:
ingres_system(“Retr…..Research’”,____,____);
The Grouping of Phases
Number of Passes:
A pass reads an input file and writes an output file, so each pass requires reading and writing intermediate files
Compiler Construction Tools
Parser Generators:
Produce Syntax Analyzers
Scanner Generators:
Produce Lexical Analyzers
Syntax-directed Translation Engines:
Generate Intermediate Code
Automatic Code Generators:
Generate Actual Code
Data-Flow Engines:
Support Optimization
The End