
CS010 702

COMPILER CONSTRUCTION

1. FOUNDATIONS
Mahalingam P. R, Assistant Professor, Computer Science and Engineering
Number of sessions: 2

1.1 Review of finite automata
1.2 Review of Context Free Grammar
1.3 Language translation from an NLP Perspective
1.4 Phases of a compiler
1.4.1 Analysis Phase
1.4.2 Synthesis Phase


Learning Outcomes

1. Understand the basic structure of a compiler


2. Apply the structure of a compiler to a basic NLP specification

Learning Activities

1. Lectures on automata and grammar
2. Tutorial on designing a translator from English to Malayalam
3. Analogies between NLP and compilers

References

1. A. V. Aho, Ravi Sethi and J. D. Ullman, Compilers: Principles, Techniques and Tools, Addison Wesley
2. Kenneth C. Louden, Compiler Construction: Principles and Practice, Cengage Learning, Indian Edition
3. Tremblay and Sorenson, The Theory and Practice of Compiler Writing, Tata McGraw Hill & Company

About Finite Automata

A finite automaton (FA) is a simple idealized machine used to recognize patterns within input taken from some character set (or alphabet). The job of an FA is to accept or reject an input depending on whether the pattern defined by the FA occurs in the input.
A finite automaton A = (Q, Σ, δ, q0, F) consists of
a finite set of N states (Q)
an input alphabet (Σ)
a set of transitions from one state to another, labeled with chars (δ)
a special start state (q0)
a set of final (or accepting) states (F)
We can represent an FA graphically, with nodes for states, and arcs for transitions.

Executing the Automaton

The machine is generated from Regular Expressions.

1. Begin in the start state
2. If the next input char matches the label on a transition from the current state to a new state, go to that new state
3. Continue making transitions on each input char
4. If no move is possible, then stop
5. If the input has been consumed and the machine is in an accepting state, then accept; otherwise reject

Modeling Finite Automata (figure)

Finite Automata accept Regular Languages

Varieties of Automata

Deterministic Finite Automaton
Non-deterministic Finite Automaton

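The execution steps above can be sketched in Python as a minimal deterministic simulator. The function name `run_dfa`, the state names `S` and `I`, and the tiny alphabet are assumptions made for illustration; they do not come from the course material. The example machine accepts strings that start with `a` or `b` and continue with any of `a`, `b`, `0`, `1`.

```python
from typing import Dict, Set, Tuple


def run_dfa(
    transitions: Dict[Tuple[str, str], str],
    start: str,
    accepting: Set[str],
    text: str,
) -> bool:
    """Simulate a DFA on `text` and report whether it accepts."""
    state = start                      # step 1: begin in the start state
    for ch in text:                    # step 3: one transition per input char
        key = (state, ch)
        if key not in transitions:     # step 4: no move is possible, so stop
            return False
        state = transitions[key]       # step 2: follow the matching transition
    return state in accepting          # step 5: accept only in an accepting state


# Hypothetical machine: S is the start state, I is the only accepting state.
TRANSITIONS: Dict[Tuple[str, str], str] = {("S", c): "I" for c in "ab"}
TRANSITIONS.update({("I", c): "I" for c in "ab01"})

if __name__ == "__main__":
    for word in ["a1b", "1a", ""]:
        print(repr(word), run_dfa(TRANSITIONS, "S", {"I"}, word))
    # 'a1b' True
    # '1a' False
    # '' False
```

Storing the transitions in a dictionary keyed by (state, char) keeps the machine deterministic by construction: each pair maps to at most one next state. A non-deterministic automaton would need a set of current states instead of a single one.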
About Context Free Grammar

A context-free grammar (CFG) is a set of recursive rewriting rules (or productions) used to generate patterns of strings.
A CFG G = (V, T, S, P) consists of
a set of terminal symbols (T), which are the characters of the alphabet that appear in the strings generated by the grammar.
a set of nonterminal symbols (V), which are placeholders for patterns of terminal symbols that can be generated by the nonterminal symbols.
a set of productions (P), which are rules for replacing (or rewriting) nonterminal symbols (on the left side of the production) in a string with other nonterminal or terminal symbols (on the right side of the production).
a start symbol (S), which is a special nonterminal symbol that appears in the initial string generated by the grammar.

Executing the Grammar

The grammar is executed using a Push Down Automaton

1. Begin with a string consisting of the start symbol
2. Apply one of the productions with the start symbol on the left hand side, replacing the start symbol with the right hand side of the production
3. Repeat the process of selecting nonterminal symbols in the string, and replacing them with the right hand side of some corresponding production, until all nonterminals have been replaced by terminal symbols.

Modeling Grammar as Trees (figure)

Forms of Context Free Grammar

Chomsky Normal Form
Greibach Normal Form

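The rewriting steps above can be sketched as a leftmost derivation in Python. The grammar used here (S -> a S b | empty string) and the names `PRODUCTIONS` and `leftmost_derivation` are illustrative assumptions, not part of the course material. The caller picks which production to apply at each step; the function always rewrites the leftmost nonterminal.

```python
from typing import Dict, List

# Hypothetical grammar: S -> a S b | (empty). It generates a^n b^n.
PRODUCTIONS: Dict[str, List[List[str]]] = {
    "S": [["a", "S", "b"], []],
}


def leftmost_derivation(
    start: str,
    productions: Dict[str, List[List[str]]],
    choices: List[int],
) -> List[List[str]]:
    """Return every sentential form produced by applying `choices` in order."""
    form = [start]                      # step 1: the string is the start symbol
    steps = [list(form)]
    for choice in choices:
        # Find the leftmost symbol that still has productions (a nonterminal).
        index = next(
            (i for i, sym in enumerate(form) if sym in productions), None
        )
        if index is None:
            raise ValueError("no nonterminal left to rewrite")
        rhs = productions[form[index]][choice]
        # Steps 2 and 3: replace the nonterminal with the chosen right hand side.
        form = form[:index] + rhs + form[index + 1:]
        steps.append(list(form))
    return steps


if __name__ == "__main__":
    for form in leftmost_derivation("S", PRODUCTIONS, [0, 0, 1]):
        print(" ".join(form) or "(empty)")
    # S
    # a S b
    # a a S b b
    # a a b b
```

The derivation ends once the form contains only terminal symbols. The nesting that this grammar produces (each `a` matched by a later `b`) is the kind of pattern a finite automaton cannot track and a push down automaton can.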
Language Translation as an NLP problem

NLP: Natural Language Processing

1. Identify different words in the source language
2. Understand the structure of the source language
3. Understand the meaning of the words in the context of the source language
4. Interpret the essential parts of the source sentence in a form that can be translated easily
5. Extract important meaning from the interpretation
6. Replace source words with words in the destination language
7. Form the sentence

TUTORIAL: translate "Rama killed Ravana" from English to Malayalam

In terms of computer language translation

1. Identify important words of the language
2. Process as per syntax of the language
3. Understand meaning of the input statement
4. Convert statement into intermediate form
5. Optimize the intermediate code
6. Generate target code

High level language -> COMPILER -> Assembly language



Compilers

A program that translates an executable program in one language into an executable program, usually in another language
The compiler should improve the program in some way
C and C++ are typically compiled

Interpreters

A program that reads an executable program and produces the results of executing that program
Python & Scheme are typically interpreted

Combination

Java is complicated
o compiled to bytecode (code for the Java VM), which is then interpreted
o or a hybrid strategy is used: Just-in-Time compilation

Why study compilers?

Compilers are interesting
o Large complicated software systems that must efficiently tackle hard algorithmic problems that apply theory to practice

Compilers are fundamental
o Primary responsibility for application performance (especially when processors become more complex)
o The alternative (assembly language) is much less attractive

Compilers (& interpreters) are everywhere
o Many applications have embedded languages (XML, HTML, macros, commands, Visual Basic in Excel, ...)
o Many applications have input formats that look like languages

Variety of abstractions
o Ranging from object orientation to hash maps to closures to ...
o Each of these abstractions has a price
o Understand the costs and make intelligent decisions about when to replace an abstraction with a more efficient & concrete implementation. It can be the difference between a fast system & a slow (or infeasible) one

In many applications, performance matters.
o Design at the appropriate level of abstraction
o Measure where the application spends time
o In those places, replace the abstract implementation with a semantically equivalent implementation that is faster and more concrete
o Repeat until you are happy with the results

Phases of a compiler

Analysis Phase

1. Lexical Analyzer
2. Syntax Analyzer
3. Semantic Analyzer
4. Intermediate Code Generator

Synthesis Phase

1. Intermediate Code Generator
2. Code Optimizer
3. Code Generator
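
To make the flow through the phases concrete, here is a minimal Python sketch of the analysis phases for a single assignment statement, ending in three-address intermediate code. Everything in it is an assumption for illustration: the toy language (identifiers, integers, `+`, `*`, parentheses, `=`), the names `lex`, `Parser`, `check`, `gen` and `compile_statement`, and the choice of three-address code as the intermediate form. Code optimization and target code generation are left out.

```python
import re

TOKEN_RE = re.compile(
    r"(?P<NUM>\d+)|(?P<ID>[A-Za-z_]\w*)|(?P<OP>[+*()=])|(?P<WS>\s+)"
)


def lex(source):
    """Lexical analyzer: split the source text into (kind, text) tokens."""
    tokens, pos = [], 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r}")
        if match.lastgroup != "WS":          # whitespace is skipped
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


class Parser:
    """Syntax analyzer: recursive descent, builds a tree of nested tuples."""

    def __init__(self, tokens):
        self.tokens, self.pos = tokens, 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else ("EOF", "")

    def take(self, kind, text=None):
        tok = self.peek()
        if tok[0] != kind or (text is not None and tok[1] != text):
            raise SyntaxError(f"expected {text or kind}, got {tok[1]!r}")
        self.pos += 1
        return tok

    def statement(self):                     # statement -> ID '=' expr
        target = self.take("ID")[1]
        self.take("OP", "=")
        value = self.expr()
        self.take("EOF")
        return ("assign", target, value)

    def expr(self):                          # expr -> term ('+' term)*
        node = self.term()
        while self.peek() == ("OP", "+"):
            self.pos += 1
            node = ("+", node, self.term())
        return node

    def term(self):                          # term -> factor ('*' factor)*
        node = self.factor()
        while self.peek() == ("OP", "*"):
            self.pos += 1
            node = ("*", node, self.factor())
        return node

    def factor(self):                        # factor -> NUM | ID | '(' expr ')'
        kind, text = self.peek()
        if kind in ("NUM", "ID"):
            self.pos += 1
            return (kind.lower(), text)
        self.take("OP", "(")
        node = self.expr()
        self.take("OP", ")")
        return node


def check(node, known_names):
    """Semantic analyzer: every identifier that is read must be known."""
    if node[0] == "id" and node[1] not in known_names:
        raise NameError(f"undefined name {node[1]!r}")
    if node[0] in ("+", "*"):
        check(node[1], known_names)
        check(node[2], known_names)


def gen(node, code):
    """Intermediate code generator: emit three-address code, return the result name."""
    if node[0] in ("num", "id"):
        return node[1]
    left = gen(node[1], code)
    right = gen(node[2], code)
    temp = f"t{len(code) + 1}"
    code.append(f"{temp} = {left} {node[0]} {right}")
    return temp


def compile_statement(source, known_names):
    tree = Parser(lex(source)).statement()
    check(tree[2], known_names)
    code = []
    code.append(f"{tree[1]} = {gen(tree[2], code)}")
    return code


if __name__ == "__main__":
    print("\n".join(compile_statement("x = a + b * 2", {"a", "b"})))
    # t1 = b * 2
    # t2 = a + t1
    # x = t2
```

Each function consumes only the output of the previous one (text, tokens, tree, checked tree, intermediate code), which mirrors the ordering of the analysis phases listed above.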
