Overview of Compilation: Programming Language Principles

Overview of Compilation
Programming Language Principles Lecture 2
Prepared by
Manuel E. Bermúdez, Ph.D.
Associate Professor, University of Florida
Overview of Translation
Definition: A translator is an algorithm that converts source programs into equivalent target programs.
Source -> Translator -> Target
Definition: A compiler is a translator whose target language is at a lower level than its source language.
Overview of Translation (cont'd)

When is one language's level lower than another's?
Definition: An interpreter is an algorithm that simulates the execution of programs written in a given source language.
Source, input -> Interpreter -> output
Overview of Translation (cont'd)

Definition: An implementation of a programming language consists of a translator (or compiler) for that language, and an interpreter for the corresponding target language.
Source -> Compiler -> Target
Target, input -> Interpreter -> output
Translation
A source program may be translated an arbitrary number of times before the target program is generated.
Source -> Translator1 -> Translator2 -> . . . -> TranslatorN -> Target
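A chain of translators can be read as function composition. The sketch below assumes each translator is an ordinary Python function; `compose`, `split_words`, and `drop_comments` are hypothetical names chosen for illustration, not part of the lecture.

```python
from functools import reduce

def compose(*phases):
    """Chain phases left to right: each phase's output feeds the next phase."""
    def translator(source):
        return reduce(lambda program, phase: phase(program), phases, source)
    return translator

# Hypothetical toy phases standing in for Translator1 and Translator2.
def split_words(text):
    return text.split()

def drop_comments(words):
    return [word for word in words if not word.startswith("#")]

translate = compose(split_words, drop_comments)
result = translate("x := 5 #init")
print(result)  # ['x', ':=', '5']
```

Adding a further translation step means passing one more function to `compose`; the existing steps stay unchanged.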
Translation (cont'd)
Each of these translations is called a phase, not to be confused with a pass, i.e., a disk dump.
Q: How should a compiler be divided into phases?
A: So that each phase can be easily described by some formal model of computation, and so that each phase can be carried out efficiently.
Translation (cont'd)
Q: How is a compiler usually divided?
A: Two major phases, with many possibilities for subdivision.
Phase 1: Analysis (determine correctness).
Phase 2: Synthesis (produce target code).
Another criterion:
Phase 1: Syntax (form).
Phase 2: Semantics (meaning).
Typical Compiler Breakdown

Scanning (Lexical analysis).
Goal: Group sequences of characters that occur in the source into logical atomic units called tokens.
Examples of tokens: identifiers, keywords, integers, strings, punctuation marks, white space, end-of-line characters, comments, etc.
Source -> Scanner (Lexical analysis) -> Sequence of Tokens
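A scanner of this kind can be sketched with regular expressions. The token classes and patterns below are assumptions made for illustration (Pascal-style `{...}` comments and `:=`); a real language definition would supply its own rules.

```python
import re

# Assumed token classes; order matters because the first matching rule wins.
TOKEN_SPEC = [
    ("Comment",     r"\{[^}]*\}"),
    ("Integer",     r"\d+"),
    ("Identifier",  r"[A-Za-z_]\w*"),
    ("Punctuation", r":=|[-+*/<>=;(),.]"),
    ("EndOfLine",   r"\n"),
    ("WhiteSpace",  r"[ \t]+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def scan(source):
    """Return a list of (kind, text) tokens; raise if no rule matches."""
    tokens, pos = [], 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"illegal character {source[pos]!r} at offset {pos}")
        tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens

tokens = scan("program Ex; {demo}\nx := 5")
for kind, text in tokens:
    print(kind, repr(text))
```

At this stage `program` and `Ex` both come out as `Identifier`, and white space and comments are still present in the token sequence.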
Lexical Analysis
Must deal with end-of-line and end-of-file characters.
A preliminary classification of tokens is made. For example, both "program" and "Ex" are classified as Identifier.
Someone must give unambiguous rules for forming tokens.
Screening
Goals:
Remove unwanted tokens.
Classify keywords.
Merge/simplify tokens.
Sequence of Tokens -> Screener -> Sequence of Tokens
Screening
Keywords recognized.
White spaces (and comments) discarded.
The screener acts as an interface between the scanner and the next phase, the parser.
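The screening step can be sketched as a filter over (kind, text) token pairs. The keyword set and the token class names here are assumptions for illustration only.

```python
# Assumed keyword set and discard list, for illustration only.
KEYWORDS = {"program", "begin", "end", "while", "do", "if", "then", "else"}
DISCARD = {"WhiteSpace", "EndOfLine", "Comment"}

def screen(tokens):
    """Drop unwanted tokens and reclassify identifiers that are keywords."""
    screened = []
    for kind, text in tokens:
        if kind in DISCARD:
            continue
        if kind == "Identifier" and text in KEYWORDS:
            kind = "Keyword"
        screened.append((kind, text))
    return screened

raw = [("Identifier", "program"), ("WhiteSpace", " "), ("Identifier", "Ex"),
       ("Punctuation", ";"), ("Comment", "{demo}"), ("EndOfLine", "\n")]
screened = screen(raw)
print(screened)
```

Keeping keyword recognition out of the scanner means the scanner's rules stay small: it only needs one rule for anything shaped like a name.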
Parsing (Syntax Analysis)

Goals:
To group the tokens into the correct syntactic structures, if possible.
To determine whether the tokens appear in patterns that are syntactically correct.
Parsing (Syntax Analysis)

Syntactic structures: expressions, statements, procedures, functions, modules.
Methodology: Use rewrite rules (a.k.a. BNF).
String-To-Tree Transduction
Goal: To build a syntax tree from the sequence of rewrite rules. The tree will be the functional representation of the source.
Method: Build the tree bottom-up, as the rewrite rules are emitted. Use a stack of trees.
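The stack-of-trees method can be sketched as follows. `TreeBuilder`, `leaf`, and `build` are hypothetical names; the assumption is that the parser calls `build(label, arity)` each time it emits a rewrite rule whose right-hand side contributes `arity` subtrees.

```python
class TreeBuilder:
    """Builds a syntax tree bottom-up; a tree is a (label, children) pair."""

    def __init__(self):
        self.stack = []

    def leaf(self, label):
        """Push a tree with no children."""
        self.stack.append((label, []))

    def build(self, label, arity):
        """Pop `arity` subtrees and push a new node that has them as children."""
        if arity > len(self.stack):
            raise ValueError("not enough subtrees on the stack")
        start = len(self.stack) - arity
        children = self.stack[start:]
        del self.stack[start:]
        self.stack.append((label, children))

# Events a parser might emit for the statement  x := x + 1
builder = TreeBuilder()
builder.leaf("x")
builder.leaf("x")
builder.leaf("1")
builder.build("+", 2)
builder.build(":=", 2)
tree = builder.stack.pop()
print(tree)
```

When parsing succeeds, the stack ends with exactly one tree, the tree for the whole source.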
Contextual Constraint Analysis

Goal: To analyze static semantics, e.g.,
Are variables declared before they are used?
Is there assignment compatibility? e.g., a:=3
Is there operator type compatibility? e.g., a+3
Do actual and formal parameter types match? e.g., int f(int n, char c) {} ... f('x', 3);
Enforcement of scope rules.
Contextual Constraint Analysis

Method: Traverse the tree recursively, deducing type information at the bottom, and passing it up.
Make use of a DECLARATION TABLE, to record information about names.
Decorate the tree with reference information.
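The recursive traversal can be sketched as below. The tuple-based tree shape, the node kinds, and the sample program are all assumptions made for illustration; the sketch records errors but does not decorate the tree.

```python
def check(node, table, errors):
    """Return the type of `node` (None if unknown), recording errors as found."""
    kind = node[0]
    if kind == "int_literal":                 # ("int_literal", value)
        return "int"
    if kind == "var":                         # ("var", name)
        if node[1] not in table:
            errors.append(f"{node[1]} not declared")
            return None
        return table[node[1]]
    if kind == "dcln":                        # ("dcln", name, type)
        table[node[1]] = node[2]
        return None
    if kind in ("+", ">"):                    # (op, left, right)
        left = check(node[1], table, errors)
        right = check(node[2], table, errors)
        if None not in (left, right) and (left, right) != ("int", "int"):
            errors.append(f"operands of {kind} must be int")
        return "int" if kind == "+" else "boolean"
    if kind == "assign":                      # ("assign", name, expr)
        target = check(("var", node[1]), table, errors)
        value = check(node[2], table, errors)
        if None not in (target, value) and target != value:
            errors.append(f"cannot assign {value} to {node[1]}")
        return None
    if kind == "while":                       # ("while", condition, body)
        if check(node[1], table, errors) != "boolean":
            errors.append("while condition must be boolean")
        check(node[2], table, errors)
        return None
    if kind == "block":                       # ("block", statement, ...)
        for statement in node[1:]:
            check(statement, table, errors)
        return None
    raise ValueError(f"unknown node kind {kind!r}")

# Hypothetical program: int x; x := 5; while x2 > 10 do x := x + 1
program = ("block",
           ("dcln", "x", "int"),
           ("assign", "x", ("int_literal", 5)),
           ("while", (">", ("var", "x2"), ("int_literal", 10)),
                     ("assign", "x", ("+", ("var", "x"), ("int_literal", 1)))))
table, errors = {}, []
check(program, table, errors)
print(table, errors)
```

Types are deduced at the leaves and returned upward, so each operator only has to inspect the types its children report.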
Example
Chronologically,
1. Enter x into the DCLN table, with its type.
2. Check type compatibility for x=5.
3. X2 not declared!
4. Verify type of > is boolean.
5. Check type compatibility for +.
6. Check type compatibility between x and int, for assignment.
Code Generation
Goal: Convert syntax tree to target code.
Target code could be:
Machine language.
Assembly language.
Quadruples for a fictional machine: label, opcode, operands (1 or 2).
Code Generation
Example:
pc on UNIX generates assembly code.
pi on UNIX generates code for the p machine, which is interpreted by an interpreter.
pc: slow compilation, fast running code.
pi: fast compilation, slow running code.
Method: Traverse the tree again.
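A second traversal that emits stack-machine code might look like the sketch below. The tuple-based tree, the sample program, and the meaning given to COND (jump to the first label if the top of the stack is true, otherwise to the second) are assumptions; the opcode names are the ones used in this lecture.

```python
from itertools import count

def generate(tree):
    """Return (code, labels): code is a list of (opcode, operands) pairs and
    labels maps each label name to the index of the instruction it marks."""
    code, labels, fresh = [], {}, count(1)

    def walk(node):
        kind = node[0]
        if kind in ("int_literal", "var"):    # push a constant or a variable
            code.append(("LOAD", (node[1],)))
        elif kind in ("+", ">"):              # operands first, then the operator
            walk(node[1])
            walk(node[2])
            code.append(("BADD" if kind == "+" else "BGT", ()))
        elif kind == "assign":                # ("assign", name, expr)
            walk(node[2])
            code.append(("STORE", (node[1],)))
        elif kind == "while":                 # ("while", condition, body)
            body, done, top = (f"L{next(fresh)}" for _ in range(3))
            labels[top] = len(code)
            walk(node[1])
            code.append(("COND", (body, done)))
            labels[body] = len(code)
            walk(node[2])
            code.append(("GOTO", (top,)))
            labels[done] = len(code)
        elif kind == "block":                 # ("block", statement, ...)
            for statement in node[1:]:
                walk(statement)
        else:
            raise ValueError(f"unknown node kind {kind!r}")

    walk(tree)
    return code, labels

# Hypothetical program: X := 5; while X > 10 do X := X + 1
program = ("block",
           ("assign", "X", ("int_literal", 5)),
           ("while", (">", ("var", "X"), ("int_literal", 10)),
                     ("assign", "X", ("+", ("var", "X"), ("int_literal", 1)))))
code, labels = generate(program)
for index, (opcode, operands) in enumerate(code):
    print(index, opcode, *operands)
print(labels)
```

Labels are kept in a separate table rather than attached to instructions, so a label that falls just past the last instruction (the loop exit here) needs no placeholder instruction.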
Code (for a stack machine)

      LOAD   5
      STORE  X
L3    LOAD   X
      LOAD   10
      BGT
      COND   L1 L2
L1    LOAD   X
      LOAD   1
      BADD
      STORE  X
      GOTO   L3
L2    . . .
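To see how an interpreter would execute code of this kind, here is a small sketch. The instruction semantics, the placement of labels L3, L1, and L2, and the HALT opcode (standing in for the elided code at L2) are assumptions, since the slide does not define them.

```python
def run(program):
    """Interpret (label, opcode, operands) triples; return the variable store."""
    targets = {label: i for i, (label, _, _) in enumerate(program) if label}
    stack, store, pc = [], {}, 0
    while pc < len(program):
        _, opcode, operands = program[pc]
        pc += 1
        if opcode == "LOAD":      # push a constant, or the value of a variable
            value = operands[0]
            stack.append(value if isinstance(value, int) else store[value])
        elif opcode == "STORE":   # pop the top of the stack into a variable
            store[operands[0]] = stack.pop()
        elif opcode == "BADD":    # binary add
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif opcode == "BGT":     # binary greater-than
            right, left = stack.pop(), stack.pop()
            stack.append(left > right)
        elif opcode == "COND":    # assumed: first label if true, else second
            pc = targets[operands[0] if stack.pop() else operands[1]]
        elif opcode == "GOTO":
            pc = targets[operands[0]]
        elif opcode == "HALT":    # hypothetical opcode, not from the slide
            break
        else:
            raise ValueError(f"unknown opcode {opcode!r}")
    return store

program = [
    (None, "LOAD", [5]), (None, "STORE", ["X"]),
    ("L3", "LOAD", ["X"]), (None, "LOAD", [10]), (None, "BGT", []),
    (None, "COND", ["L1", "L2"]),
    ("L1", "LOAD", ["X"]), (None, "LOAD", [1]), (None, "BADD", []),
    (None, "STORE", ["X"]), (None, "GOTO", ["L3"]),
    ("L2", "HALT", []),
]
store = run(program)
print(store)  # {'X': 5}
```

Under these assumed semantics, X starts at 5, the test X > 10 fails at once, and control goes straight to L2 without running the loop body.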
Code Optimization
Goals:
Reduce the size of the target program.
Decrease the running time of the target.
Note: "Optimization" is a misnomer. "Code improvement" would be better.

Two types of optimization:
Peephole optimization (local).
Global optimization (improve loops, etc.).
Code Optimization (cont'd)

Example (from previous slide):
LOAD 5
STORE X
LOAD X
can be replaced with
LOAD 5
STND X
STND: store non-destructively, i.e., store in X, but do not destroy the value on top of the stack.
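This rewrite can be sketched as a pass over a two-instruction window. The representation (a list of (opcode, operand) pairs) and the `jump_targets` guard are assumptions added for illustration: the rewrite is only safe when no jump lands on the LOAD being removed.

```python
def peephole(code, jump_targets=frozenset()):
    """Replace `STORE v; LOAD v` with `STND v`, unless the LOAD is a jump target.

    `jump_targets` holds the indices of instructions that some jump refers to.
    """
    improved, i = [], 0
    while i < len(code):
        current = code[i]
        following = code[i + 1] if i + 1 < len(code) else None
        if (following is not None
                and current[0] == "STORE" and following[0] == "LOAD"
                and current[1] == following[1]
                and (i + 1) not in jump_targets):
            improved.append(("STND", current[1]))
            i += 2
        else:
            improved.append(current)
            i += 1
    return improved

before = [("LOAD", 5), ("STORE", "X"), ("LOAD", "X")]
after = peephole(before)
print(after)  # [('LOAD', 5), ('STND', 'X')]
```

The pass looks only at adjacent instructions, which is what makes it a local (peephole) improvement rather than a global one.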
Summary
Source -> Scanner -> Tokens -> Screener -> Tokens -> Parser -> Tree -> Constrainer -> Tree -> Code Generator -> Code (for an abstract machine)
Code, Input -> Interpreter -> Output
Shared support: Table Routines, Error Routines.