
CSE340 - Principles of

Programming Languages
Lecture 01:
Course Presentation

Javier Gonzalez-Sanchez
javiergs@asu.edu
BYENG M1-21
Office Hours: By appointment

Definitions

CSE340 - Principles of
Programming Languages

Tell a computer what to do

Method of communication consisting of the use of signs or words in a structured and conventional way

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 2

Language Levels

C++
Java
Fortran

High-Level Language
Assembly Language
Machine Language
Hardware
Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 3

Machine Language

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 4

Assembly Language

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 5

High-Level Languages
compilation

execution

// source code
int x;

Lexer

int foo () {
read (x);
print (5);
}

Parser

main () {
foo ();
}

Semantic Analyzer

Virtual Machine
(interpreter)

Code Generation

X,E,G,O,O
#e1,I,I,0,7
@
OPR 19, AX
STO x, AX
LIT 5, AX
OPR 21, AX
LOD #e1,AX
CAL 1, AX
OPR 0, AX

01001010101000010
01010100101010010
10100100000011011
11010010110101111
00010010101010010
10101001010101011

Assembler
Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 6

Language Paradigms

Procedural

program = algorithms + data

Object-Oriented

program = objects + messages

Functional

program = functions ∘ functions

Logic Programming

program = facts + rules

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 7

Calendar

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 8

Calendar

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 9

Grading
Exams (2)                     20% + 20%               40%
Final Comprehensive Exam                              20%
Programming Assignments (4)   10% + 10% + 10% + 10%   40%
Total                                                 100%

Grade cutoffs:
97  A+     86  B+     74  C+
93         82         70
89  A-     78  B-

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 10

Text book
Chapter 1. Introduction
Chapter 6. Syntax
Chapter 7. Basic Semantics
Chapter 8. Data Types
Chapter 9. Expressions and Statements
Chapter 10. Procedures

Chapter 3. Functional Programming


Chapter 4. Logic Programming
Chapter 5. OO Programming
Chapter 12. Formal Semantics

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 11

Homework

Read the Syllabus of the course

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 12

CSE340 - Principles of Programming Languages


Javier Gonzalez-Sanchez
javiergs@asu.edu
Fall 2014
Disclaimer. These slides can only be used as study material for the class CSE340 at ASU. They cannot be distributed or used for another purpose.

CSE340 - Principles of
Programming Languages
Lecture 02:
Introduction

Javier Gonzalez-Sanchez
javiergs@asu.edu
BYENG M1-21
Office Hours: By appointment

Teaching Assistants
Bolun Li
bolunli@asu.edu
Graduate Student / Computer Science MS
Steven Lombardi
steven.lombardi@asu.edu
Undergraduate Student / Computer Science BS

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 2

Instructor
Javier Gonzalez-Sanchez
javiergs@asu.edu
Graduate Teaching Associate
www.javiergs.com

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 3

Keywords

Language
  Level
  Paradigm
  Translate or Execute

Analysis
  Lexical:  Input: Symbols.   Output: Words.
  Syntax:   Input: Words.     Output: Sentences.
  Semantic: Input: Sentences.
Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 4

Analysis
int x = 5;
float y = "hello;
String @z = "9.5";
int x = cse340;
if ( x > 14) while (5 == 5) if (int a) a = 1;
x = x; for ( ; ; );
y = 13.45.0;
int me = 99999000001111222000000111111222223443483045830948;
while { x != 9} ();
int {x} = 10;

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 5

High-Level Languages
compilation

execution

// source code
int x;

Lexer

int foo () {
read (x);
print (5);
}

Parser

main () {
foo ();
}

Semantic Analyzer

Virtual Machine
(interpreter)

Code Generation

X,E,G,O,O
#e1,I,I,0,7
@
OPR 19, AX
STO x, AX
LIT 5, AX
OPR 21, AX
LOD #e1,AX
CAL 1, AX
OPR 0, AX

01001010101000010
01010100101010010
10100100000011011
11010010110101111
00010010101010010
10101001010101011

Assembler
Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 6

Keywords

Lexical

Alphabet

Symbol

String

Word

Token

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 7

Lexical Analysis
int x = 5;
float y = "hello;
String @z = "9.5";
int x = cse340;
if ( x > 14) while (5 == 5) if (int a) a = 1;
x = x; for ( ; ; );
y = 13.45.0;
int me = 99999000001111222000000111111222223443483045830948;
while { x != 9} ();
int {x} = 10;

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 8

Lexical Analysis | Steps


a) Read a text FILE line by line.
b) For each LINE:
   Read it character by character.
   Create sets of consecutive characters (STRINGs). Try to
   group as many characters as possible.
   Start a new set whenever one is needed. Take care
   of whitespace, delimiters, operators, end of line, and
   other special characters.
c) For each STRING, verify whether it is a valid WORD.
d) Create a VECTOR and store the STRINGs and WORDs.
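
The steps above can be sketched in Java as follows. This is only an illustration under stated assumptions: the class name `LineSplitter` and the delimiter and operator sets are hypothetical, and quotation marks and multi-character operators such as `==` are deliberately not handled.

```java
import java.util.ArrayList;
import java.util.List;

final class LineSplitter {
    // Hypothetical character sets; the real ones depend on the language being defined.
    private static final String DELIMITERS = "(){}[];,";
    private static final String OPERATORS = "+-*/%<>=!";

    /** Splits one LINE into STRINGs, grouping as many characters as possible. */
    static List<String> split(String line) {
        List<String> strings = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (char c : line.toCharArray()) {
            boolean isSpace = Character.isWhitespace(c);
            boolean isSpecial = DELIMITERS.indexOf(c) >= 0 || OPERATORS.indexOf(c) >= 0;
            if (isSpace || isSpecial) {
                // A separator closes the STRING collected so far.
                if (current.length() > 0) {
                    strings.add(current.toString());
                    current.setLength(0);
                }
                // Delimiters and operators are STRINGs themselves; whitespace is dropped.
                if (isSpecial) {
                    strings.add(String.valueOf(c));
                }
            } else {
                current.append(c);
            }
        }
        if (current.length() > 0) {
            strings.add(current.toString()); // end of line closes the last STRING
        }
        return strings;
    }

    public static void main(String[] args) {
        System.out.println(split("int x = 5;"));
        System.out.println(split("for ( ; ; );y = 13.45.0;int me"));
    }
}
```

Step c), deciding whether each STRING is a valid WORD, is not covered here; it is the job of the rules introduced in the following lectures.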
Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 9

Lexical Analysis
int x = 5; float y = "hello;
String@z="9.5;intx=cse340;if(x>
14) while
(5 == 5) if (int a) a = 1; x = x;
for ( ; ; );y = 13.45.0;int me
=99999000001111222000000111111222
223443483045830948;while { x !=
9} ();int {x} = 10;

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 10

Lexical Analysis
Line                                   # of STRINGS
int x = 5; float y = "hello;           9
String@z="9.5;intx=cse340;if(x>        12
14) while                              3
(5 == 5) if (int a) a = 1; x = x;      18
for ( ; ; );y = 13.45.0;int me         12
=99999000001111222000000111111222      2
223443483045830948;while { x !=        6
9} ();int {x} = 10;                    12
hello "world" bye"                     3
Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 11

Keywords

Lexical

Alphabet

Symbol

String

Word
Regular
Expression

Token

Rules
Deterministic
Finite
Automata
Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 12

Homework

Review the following topics:


Regular Expressions (Text Book: Chapter 6)
and Deterministic Finite Automata

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 13


CSE340 - Principles of
Programming Languages
Lecture 03:
Lexical Analysis

Javier Gonzalez-Sanchez
javiergs@asu.edu
BYENG M1-21
Office Hours: By appointment

Keywords

Lexical

Alphabet

Symbol

String

Word
Regular
Expression

Token

Rules
Deterministic
Finite
Automata
Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 2

Regular expression
A rule that describes the finite combinations of symbols
(sequences) that are considered well-formed.
A regular expression has symbols and operators.
Symbols are defined in the alphabet.
The operators used in regular expressions are: * (0 or
more), + (1 or more), ? (0 or 1), | (or). Besides those,
we can use [ ] to enclose sets of symbols without
enumerating all of them, such as [0-9] or [A-Z]. We can
also use parentheses for grouping.
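
As a quick way to experiment with these operators, the sketch below checks a few words against a few rules using Java's `java.util.regex.Pattern`. The class name `RegexOperators` and the sample words are chosen only for illustration; `Pattern.matches` succeeds only when the whole word matches the rule.

```java
import java.util.regex.Pattern;

final class RegexOperators {
    /** Returns true when the whole word is well-formed according to the rule. */
    static boolean matches(String rule, String word) {
        return Pattern.matches(rule, word);
    }

    public static void main(String[] args) {
        System.out.println(matches("a*", ""));            // * accepts zero repetitions
        System.out.println(matches("a+", ""));            // + needs at least one symbol
        System.out.println(matches("ab?", "a"));          // ? makes the b optional
        System.out.println(matches("(a|b)(a|b)", "ba"));  // | chooses between symbols
        System.out.println(matches("[0-9]+", "1934"));    // [ ] encloses a set of symbols
    }
}
```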
Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 3

Regular expression | Examples


Token        Regular Expression (rule)   Examples (words)
foobarOne    (a | b)                     {a, b}
foobarTwo    (a | b)(a | b)              {aa, bb, ba, ab}
foobarThree  a*                          {ε, a, aa, aaa, aaaa, ...}
foobarFour   (a | b)*                    {ε, a, b, aa, bb, ..., abba, ...}
foobarFive   a+                          {a, aa, aaa, aaaa, ...}
foobarSix    [a-z]+                      {hello, world, etc, ...}
number       [0-9]+                      {1934, 0101, 33, 12321}

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 4

Regular expression | Examples


Token (1)      Regular Expression (rule)                               Example (word)
digit          0 | 1 | 2 | 3 | ... | 9
integer        digit+                                                  1945
fraction       .digit+                                                 .55
exponent       e(+|-)?digit+                                           e+210
floatDraftOne  integer(fraction?) (exponent?)                          340.08e-14
floatDraftTwo  {[-+]?([0-9]+\.?[0-9]*|\.[0-9]+)([eE][-+]?[0-9]+)?}
binary         0b(0|1)+                                                0b1010

(1) These definitions are NOT fully complete or correct. Their purpose is only to exemplify regular expressions. For
instance, 07 matches as an integer, which will NOT be the case for our language.
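
The way sub-rules compose into a larger rule can be tried out in Java. The sketch below builds floatDraftOne from the digit, integer, fraction, and exponent rules and compares it with floatDraftTwo. The class name `FloatRule` is hypothetical, and the dot in fraction is escaped on the assumption that a literal dot is intended.

```java
import java.util.regex.Pattern;

final class FloatRule {
    // Sub-rules written as Java regular expressions.
    static final String DIGIT = "[0-9]";
    static final String INTEGER = DIGIT + "+";
    static final String FRACTION = "\\." + DIGIT + "+";        // assumes a literal dot
    static final String EXPONENT = "e(\\+|-)?" + DIGIT + "+";

    // floatDraftOne = integer(fraction?)(exponent?)
    static final Pattern FLOAT_DRAFT_ONE =
            Pattern.compile(INTEGER + "(" + FRACTION + ")?(" + EXPONENT + ")?");

    // floatDraftTwo, copied from the table without the enclosing braces.
    static final Pattern FLOAT_DRAFT_TWO =
            Pattern.compile("[-+]?([0-9]+\\.?[0-9]*|\\.[0-9]+)([eE][-+]?[0-9]+)?");

    static boolean draftOne(String word) {
        return FLOAT_DRAFT_ONE.matcher(word).matches();
    }

    static boolean draftTwo(String word) {
        return FLOAT_DRAFT_TWO.matcher(word).matches();
    }

    public static void main(String[] args) {
        for (String word : new String[] {"340.08e-14", "07", ".55", "e+210"}) {
            System.out.println(word + "  draftOne=" + draftOne(word) + "  draftTwo=" + draftTwo(word));
        }
    }
}
```

Note that both drafts accept 07, which illustrates the footnote: the rules exemplify regular expressions rather than define the final language.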
Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 5

Regular expression | Operators

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 6

Deterministic Finite Automata


It is a finite state machine that accepts or rejects finite
strings of symbols and produces a unique result for
each input string.
In the example automaton, there are three states (denoted
graphically by circles) and transition arrows
connecting one state to another.
Upon reading a symbol, a DFA jumps
deterministically from one state to another by
following the transition arrow.

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 7

DFA | Examples
binary
0b(0|1)+

string
([a-z] | [A-Z] | [0-9])*

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 8

DFA | Examples
[Figure: DFA diagram. States: Start, Char, Operator, Delimiter, String, ID, Integer, Float. Transition labels: {+,-,*,/,%,<,>,=,!}; {(, ), {, }, [, ]}; {a-z}; {0-9}; {_}; {$}; {\.}; {$, _, 0-9, a-z}; {.}]

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 9

Additional Examples
Regular Expressions and Deterministic Finite Automata

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 10

Handwritten notes

Regular expression

Regular expression

Regular expression

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 11

Handwritten notes

Deterministic
Finite Automata

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 12

Handwritten notes
Regular expression
Regular expression

-9
Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 13

Handwritten notes

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 14

Handwritten notes

Is this correct?
Is this correct?
Is this correct?
Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 15

Handwritten notes

Error
Correct

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 16

Regular expressions - Examples


Token    Regular Expression (rule)   Examples (words)
foobar4  {"a", "b", "c"}*            {ε, "a", "b", "c", "aa", "ab", "ac", "ba", "bb", "bc", "ca", "cb", "cc", ...}
foobar5  {"ab", "c"}*                {ε, "ab", "c", "abab", "abc", "cab", "cc", "ababab", "ababc", "abcab", ...}

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 17

Regular expressions - Examples


Define a regular expression for each case
a) URLs
b) Email addresses
c) ZIP codes

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 18

DFA - Examples
Define a DFA for each case
a) URLs
b) Email addresses
c) ZIP codes

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 19

Our Tokens
Which tokens are needed for a programming language?
a) Reserved words
b) Special Symbols: Operators and delimiters
c) Identifiers

d) Literals or constants

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 20

Drafting a Lexer
Keywords =

Operator =

Delimiter =

ID =

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 21

Drafting a Lexer
Float =
Integer =
Hexadecimal =
Octal =
Binary =
String =
Char =
Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 22

Homework

Define the necessary lexical rules for a programming language


Express these rules using a DFA and Regular Expressions
Share them on Blackboard and discuss their correctness with your classmates.

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 23


CSE340 - Principles of
Programming Languages
Lecture 04:
Lexer Implementation 1

Javier Gonzalez-Sanchez
javiergs@asu.edu
BYENG M1-21
Office Hours: By appointment

Review
Given the following token definitions (using regular expressions):

t1 = aabb
t2 = aab
t3 = (a | b)*

1. Are the following strings correct?
   aaba
   a
   aab
2. What is the token for each of them?
3. Create a DFA that represents the previous rules.
4. Which symbols are in the alphabet?
Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 2

Review
1. How many words:
-5
-5.5e-5
5-5

2. What is the difference between these regular expressions?


[0-9]+.[0-9]+
[0-9]+\.[0-9]+
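
One way to explore the question is to run both rules against the same words. The sketch below does this with `java.util.regex.Pattern`; the class name `DotVersusEscapedDot` and the sample words are illustrative only.

```java
import java.util.regex.Pattern;

final class DotVersusEscapedDot {
    // Unescaped dot: matches any single character.
    static final Pattern ANY_CHAR = Pattern.compile("[0-9]+.[0-9]+");
    // Escaped dot: matches only the '.' character itself.
    static final Pattern LITERAL_DOT = Pattern.compile("[0-9]+\\.[0-9]+");

    static boolean anyChar(String word) {
        return ANY_CHAR.matcher(word).matches();
    }

    static boolean literalDot(String word) {
        return LITERAL_DOT.matcher(word).matches();
    }

    public static void main(String[] args) {
        for (String word : new String[] {"5.5", "5x5", "555"}) {
            System.out.println(word + "  unescaped=" + anyChar(word) + "  escaped=" + literalDot(word));
        }
    }
}
```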

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 3

Programming a Lexer

Regular
Expression

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 4

Programming a Lexer

Regular
Expression

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 5

Using IF-ELSE
It is not a good idea!

February 13th, 2008 by Rich Sharpe. Posted in Software Quality, Software Quality Metrics

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 6

Using a State Machine


1. Put the DFA in a Table

[Figure: DFA for binary numbers with states S0, S1, S2, S3]

State  b    0    1    ...  Delimiter, operator, whitespace, quotation mark
S0     SE   S1   SE   SE   Stop
S1     S2   SE   SE   SE   Stop
S2     SE   S3   S3   SE   Stop
S3     SE   S3   S3   SE   Stop
SE     SE   SE   SE   SE   Stop

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 7

Using a State Machine


2. Put the Table in Java

State  b    0    1    ...  Delimiter, operator, whitespace, quotation mark
S0     SE   S1   SE   SE   Stop
S1     S2   SE   SE   SE   Stop
S2     SE   S3   S3   SE   Stop
S3     SE   S3   S3   SE   Stop
SE     SE   SE   SE   SE   Stop

// constants
private static final int ZERO      = 1;
private static final int ONE       = 2;
private static final int B         = 0;
private static final int OTHER     = 3;
private static final int DELIMITER = 4;
private static final int ERROR     = 4;
private static final int STOP      = -2;

// table as a 2D array
private static int[][] stateTable = {
    {ERROR,     1, ERROR, ERROR, STOP},
    {    2, ERROR, ERROR, ERROR, STOP},
    {ERROR,     3,     3, ERROR, STOP},
    {ERROR,     3,     3, ERROR, STOP},
    {ERROR, ERROR, ERROR, ERROR, STOP}
};
Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 8

Using a State Machine


STEP 3. Algorithm

void splitLine (line) {
    state = S0;
    String string = "";
    do {
        l = line.readNextLetter();
        go = calculateNextState(state, l);
        if (go != STOP) {
            string = string + l;
            state = go;
        }
    } while (line.hasLetters() && go != STOP);

    if (state == S3)
        print("It is a BINARY number");
    else
        print("error");

    if (isDelimiter(l))
        print("Also, there is a DELIMITER");
    else if (isOperator(l))
        print("Also, there is an OPERATOR");

    // loop
    if (line.hasLetters())
        splitLine(line - string);   // continue with the rest of the line
}
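
A runnable version of the table-driven idea could look like the sketch below. It is an illustration only, not the course's `Lexer.java`: the class name `BinaryLexerSketch`, the method names, and the set of characters treated as "delimiter, operator, whitespace, quotation mark" are assumptions. It classifies only the first STRING of a line.

```java
final class BinaryLexerSketch {
    // Column indexes of the state table.
    private static final int B = 0, ZERO = 1, ONE = 2, OTHER = 3, DELIMITER = 4;
    // ERROR is also the row index of the error state SE; STOP ends the current STRING.
    private static final int ERROR = 4, STOP = -2;

    private static final int[][] STATE_TABLE = {
        {ERROR, 1,     ERROR, ERROR, STOP}, // S0
        {2,     ERROR, ERROR, ERROR, STOP}, // S1
        {ERROR, 3,     3,     ERROR, STOP}, // S2
        {ERROR, 3,     3,     ERROR, STOP}, // S3 (accepting: BINARY)
        {ERROR, ERROR, ERROR, ERROR, STOP}  // SE
    };

    /** Maps a character to a table column. The stop set is a hypothetical choice. */
    private static int column(char c) {
        if (c == 'b') return B;
        if (c == '0') return ZERO;
        if (c == '1') return ONE;
        boolean stops = Character.isWhitespace(c) || "(){}[];,+-*/%<>=!\"".indexOf(c) >= 0;
        return stops ? DELIMITER : OTHER;
    }

    /** Reads the first STRING of the line and reports BINARY or ERROR followed by the STRING. */
    static String classifyFirst(String line) {
        int state = 0; // S0
        int i = 0;
        while (i < line.length()) {
            int go = STATE_TABLE[state][column(line.charAt(i))];
            if (go == STOP) {
                break; // a stop character closes the STRING without being consumed
            }
            state = go;
            i++;
        }
        String string = line.substring(0, i);
        return (state == 3 ? "BINARY" : "ERROR") + " " + string;
    }

    public static void main(String[] args) {
        System.out.println(classifyFirst("0b1010;"));
        System.out.println(classifyFirst("0b"));
        System.out.println(classifyFirst("12 + 3"));
    }
}
```

Adding a new token then means adding rows and columns to the table rather than more branches in the code, which is the motivation for preferring the state machine over nested if-else statements.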

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 9

Programming Assignment #1
1. Read a file; split the lines using System.lineSeparator.
2. For each line, read character by character and use each
   character as an input for the state machine.
3. Concatenate the characters, creating the largest
   STRING possible. Stop when you find a delimiter, whitespace,
   operator, or quotation mark and the current state
   allows it. If there are more characters in the line, create
   a new line with those characters and go to step 2.
4. For each STRING and WORD, report its TOKEN or ERROR
   as appropriate.
Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 10

Homework

Define the necessary lexical rules for a programming language


Express these rules using a DFA and Regular Expressions
Share them on Blackboard and discuss their correctness with your classmates.

Remember: use a DETERMINISTIC Finite Automaton


Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 11


CSE340 - Principles of
Programming Languages
Lecture 05:
Lexer Implementation 2

Javier Gonzalez-Sanchez
javiergs@asu.edu
BYENG M1-21
Office Hours: By appointment

Review
1. How to exclude keywords from identifiers?
2. The dot:
   foobar1 = .
   foobar1 = .*
   foobar1 = .+
   foobar1 = \.
3. What is the problem here?
   string = " \.* "
Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 2

Review
4. Rules and Sub-rules
URL definition:
https?:// ([a-z]+ | [A-Z]+ | [0-9]+ | - | . | _ | ~ | %21
| %23 | %24 | %26 | %27 | %28 | %29 | %2A | %2B | %2C | %3A |
%3B | %3D | %3F | %40 | %5B | %5D )* %2F ([a-z]+ | [A-Z]+ |
[0-9]+ | - | . | _ | ~ | ! | # | $ | & | ' | ( | ) | * | +
| , | / | : | ; | = | ? | @ | [ | ] )+

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 3

Review
5. Making things shorter
ZIP definition:

[0-9][0-9][0-9][0-9][0-9](-[0-9][0-9][0-9][0-9])?
[0-9]{5}(-[0-9]{4})?
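
The `{n}` repetition can be checked against the written-out form with a few sample words. The sketch below assumes that every position accepts the digits 0 to 9; the class name `ZipRule` and the samples are illustrative only.

```java
import java.util.regex.Pattern;

final class ZipRule {
    // Written-out form: one character class per digit (assuming 0-9 in every position).
    static final Pattern LONG_FORM =
            Pattern.compile("[0-9][0-9][0-9][0-9][0-9](-[0-9][0-9][0-9][0-9])?");
    // Shorter form: {5} and {4} repeat the class a fixed number of times.
    static final Pattern SHORT_FORM = Pattern.compile("[0-9]{5}(-[0-9]{4})?");

    static boolean isZip(String word) {
        return SHORT_FORM.matcher(word).matches();
    }

    /** True when both forms give the same answer for the word. */
    static boolean formsAgree(String word) {
        return LONG_FORM.matcher(word).matches() == SHORT_FORM.matcher(word).matches();
    }

    public static void main(String[] args) {
        for (String word : new String[] {"85281", "85281-1234", "8528", "85281-12"}) {
            System.out.println(word + "  zip=" + isZip(word) + "  formsAgree=" + formsAgree(word));
        }
    }
}
```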

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 4

Programming Assignment #1

Only BINARY, DELIMITER, and OPERATOR are implemented. You will implement the rest of the
required tokens (rules).
Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 5

Programming Assignment #1

* Lexer.java is the only file that you are allowed to modify

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 6

Code | Token.java

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 7

Code | Gui.java

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 8

Code | Lexer.java

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 9

Code | Lexer.java

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 10

Code | Lexer.java

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 11

Code | Lexer.java

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 12

Code | Lexer.java

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 13

Code | Lexer.java

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 14

Code | input.txt
hello;world cse340 asu 2013/05/31 // end
boolean $xx= ((((((((23WE + 44 - 3 / 2 % 45 <=17) > 0xfffff.34.45;

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 15

Code | output.txt
IDENTIFIER  hello
DELIMITER   ;
IDENTIFIER  world
IDENTIFIER  cse340
IDENTIFIER  asu
INTEGER     2013
OPERATOR    /
OCTAL       05
OPERATOR    /
INTEGER     31
OPERATOR    /
OPERATOR    /
IDENTIFIER  end
KEYWORD     boolean
IDENTIFIER  $xx
OPERATOR    =
DELIMITER   (
DELIMITER   (
DELIMITER   (
DELIMITER   (
DELIMITER   (
DELIMITER   (
DELIMITER   (
DELIMITER   (
ERROR       23WE
OPERATOR    +
INTEGER     44
OPERATOR    -
INTEGER     3
OPERATOR    /
INTEGER     2
OPERATOR    %
INTEGER     45
OPERATOR    <
OPERATOR    =
INTEGER     17
DELIMITER   )
OPERATOR    >
ERROR       0xfffff.34.45
DELIMITER   ;

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 16

Homework

Programming Assignment #1
Develop a Lexical Analyzer by coding a DFA

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 17


CSE340 - Principles of
Programming Languages
Lecture 06:
Closing with Lexical Analysis

Javier Gonzalez-Sanchez
javiergs@asu.edu
BYENG M1-21
Office Hours: By appointment

Errata #1

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 2

Errata #2

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 3

Errata #3

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 4

Programming Assignment #1

?
70%
?
100%
Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 5

Programming Assignment #1

?
100%
Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 6

Programming Assignment #1

?
70%
Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 7

Programming Assignment #1

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 8

Programming Assignment #1

Firstname_Lastname_P1.zip
Compile and Run
Recognize BINARY
Recognize DELIMITER and OPERATOR

approx. 20%

Recognize INTEGER
Recognize OCTAL
Recognize HEXADECIMAL
Recognize IDENTIFIER

approx. 40%

Recognize STRING
Recognize CHAR
Recognize KEYWORD
Recognize FLOAT

approx. 40%

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 9

Review | Lexical Analysis


Are the following STRINGS correct or not? Why?
000000005
000000009
000000009.1
000000005
000000005.1
0x0000002
0123456789
Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 10

Review | Lexical Analysis


Are the following STRINGS correct or not? Why?
1.2e---3++
$50

float ________________ = 5;

double x = 000000.1;
'''a'
'\''b'
'\'b'
Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 11

Review | Lexical Analysis


Are the following STRINGS correct or not? Why?
" \\\\\\\\\\a"
"Hello""world"
abc"Hello"
''
'\x
\a'
\w
"\\\"
Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 12

State Transition Table


for our Lexer
(step by step)

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 13

BINARY
INTEGER
OCTAL
HEXADECIMAL
IDENTIFIER

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 14

Lexer Step by Step

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 15

Lexer Step by Step

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 16

Lexer Step by Step

states

1  = INTEGER
2  = INTEGER
3  = IDENTIFIER
5  = OCTAL
7  = INTEGER
8  = IDENTIFIER
9  = BINARY
10 = HEXADECIMAL

columns

0  $  _  [1]  [2-7]  [8-9]  [A]  B  [C-F]  [G-W]  X  [Y-Z]

[a-z] = [A] B [C-F] [G-W] X [Y-Z]
[a-f] = [A] B [C-F]
Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 17

Lexer Step by Step


(Last column: Delimiter, operator, whitespace, quotation mark)

State  0    $    _    [1]  [2-7]  [8-9]  [A]  B    [C-F]  [G-W]  X    [Y-Z]  ...  Delimiter
S0     S1   S3   S3   S2   S2     S2     S3   S3   S3     S3     S3   S3     SE   Stop
S1     S5   SE   SE   S5   S5     SE     SE   S4   SE     SE     S6   SE     SE   Stop
S2     S7   SE   SE   S7   S7     S7     SE   SE   SE     SE     SE   SE     SE   Stop
S3     S8   S8   S8   S8   S8     S8     S8   S8   S8     S8     S8   S8     SE   Stop
S4     S9   SE   SE   S9   SE     SE     SE   SE   SE     SE     SE   SE     SE   Stop
S5     S5   SE   SE   S5   S5     SE     SE   SE   SE     SE     SE   SE     SE   Stop
S6     S10  SE   SE   S10  S10    S10    S10  S10  S10    SE     SE   SE     SE   Stop
S7     S7   SE   SE   S7   S7     S7     SE   SE   SE     SE     SE   SE     SE   Stop
S8     S8   S8   S8   S8   S8     S8     S8   S8   S8     S8     S8   S8     SE   Stop
S9     S9   SE   SE   S9   SE     SE     SE   SE   SE     SE     SE   SE     SE   Stop
S10    S10  SE   SE   S10  S10    S10    S10  S10  S10    SE     SE   SE     SE   Stop
SE     SE   SE   SE   SE   SE     SE     SE   SE   SE     SE     SE   SE     SE   Stop
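
The transition table for these five tokens can be encoded directly as a 2D array. The sketch below is one possible encoding, not the assignment solution: the class name `NumberAndIdDfa`, the case-insensitive treatment of letters, and the characters mapped to the last column are assumptions.

```java
final class NumberAndIdDfa {
    private static final int E = 11;     // row index of the error state SE
    private static final int STOP = -1;  // marker: stop reading the current STRING

    // Columns: 0, $, _, 1, 2-7, 8-9, A, B, C-F, G-W, X, Y-Z, other, delimiter
    private static final int[][] TABLE = {
        /* S0  */ { 1,  3,  3,  2,  2,  2,  3,  3,  3,  3,  3,  3, E, STOP},
        /* S1  */ { 5,  E,  E,  5,  5,  E,  E,  4,  E,  E,  6,  E, E, STOP},
        /* S2  */ { 7,  E,  E,  7,  7,  7,  E,  E,  E,  E,  E,  E, E, STOP},
        /* S3  */ { 8,  8,  8,  8,  8,  8,  8,  8,  8,  8,  8,  8, E, STOP},
        /* S4  */ { 9,  E,  E,  9,  E,  E,  E,  E,  E,  E,  E,  E, E, STOP},
        /* S5  */ { 5,  E,  E,  5,  5,  E,  E,  E,  E,  E,  E,  E, E, STOP},
        /* S6  */ {10,  E,  E, 10, 10, 10, 10, 10, 10,  E,  E,  E, E, STOP},
        /* S7  */ { 7,  E,  E,  7,  7,  7,  E,  E,  E,  E,  E,  E, E, STOP},
        /* S8  */ { 8,  8,  8,  8,  8,  8,  8,  8,  8,  8,  8,  8, E, STOP},
        /* S9  */ { 9,  E,  E,  9,  E,  E,  E,  E,  E,  E,  E,  E, E, STOP},
        /* S10 */ {10,  E,  E, 10, 10, 10, 10, 10, 10,  E,  E,  E, E, STOP},
        /* SE  */ { E,  E,  E,  E,  E,  E,  E,  E,  E,  E,  E,  E, E, STOP}
    };

    // Token reported when the machine stops in each state (null means ERROR).
    private static final String[] TOKEN = {
        null, "INTEGER", "INTEGER", "IDENTIFIER", null, "OCTAL",
        null, "INTEGER", "IDENTIFIER", "BINARY", "HEXADECIMAL", null
    };

    private static int column(char ch) {
        // Assumption: lowercase ASCII letters share the columns of the uppercase ones.
        char c = (ch >= 'a' && ch <= 'z') ? (char) (ch - 32) : ch;
        if (c == '0') return 0;
        if (c == '$') return 1;
        if (c == '_') return 2;
        if (c == '1') return 3;
        if (c >= '2' && c <= '7') return 4;
        if (c == '8' || c == '9') return 5;
        if (c == 'A') return 6;
        if (c == 'B') return 7;
        if (c >= 'C' && c <= 'F') return 8;
        if (c >= 'G' && c <= 'W') return 9;
        if (c == 'X') return 10;
        if (c == 'Y' || c == 'Z') return 11;
        // Hypothetical stop set for the last column.
        if (Character.isWhitespace(ch) || "(){}[];,+-*/%<>=!\"'".indexOf(ch) >= 0) return 13;
        return 12;
    }

    /** Runs the DFA over the word and returns the token name, or ERROR. */
    static String classify(String word) {
        int state = 0;
        for (int i = 0; i < word.length(); i++) {
            int go = TABLE[state][column(word.charAt(i))];
            if (go == STOP) break;
            state = go;
        }
        return TOKEN[state] == null ? "ERROR" : TOKEN[state];
    }

    public static void main(String[] args) {
        for (String w : new String[] {"2013", "05", "0b101", "0xFF", "$xx", "23WE"}) {
            System.out.println(classify(w) + " " + w);
        }
    }
}
```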

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 18

Homework

Review Recursion
Solve the Problem Set #1 in preparation for your exam

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 19


CSE340 - Principles of
Programming Languages
Lecture 07:
Syntactic Analysis 1

Javier Gonzalez-Sanchez
javiergs@asu.edu
BYENG M1-21
Office Hours: By appointment

Next Step

Lexical Analysis

Syntactic Analysis

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 2

Question
For each case, indicate whether it is possible or not to
generate a regular expression or a DFA.

i. Detect the balance of N parentheses in a string
that has N nested parentheses and any characters
in between the parentheses.

ii. Detect binary strings with the same quantity of
0s and 1s (the order or sequence does not matter).
Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 3

Where are we now?


After lexical analysis, we have a series of tokens.
But we cannot:
I. define a regular expression matching all expressions with properly balanced parentheses.
II. i.e., define a regular expression matching all functions with properly nested block structure.
void a () { b (c); for (;;) {a=(-(1+2)+5); } }

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 4

Where are we now?


Now, we want to:
I. Review the structure described by that series of tokens
II. Report errors if those tokens do not properly encode a structure

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 5

High-Level Languages
compilation

execution

// source code
int x;

Lexer

int foo () {
read (x);
print (5);
}

Parser

main () {
foo ();
}

Semantic Analyzer

Virtual Machine
(interpreter)

Code Generation

X,E,G,O,O
#e1,I,I,0,7
@
OPR 19, AX
STO x, AX
LIT 5, AX
OPR 21, AX
LOD #e1,AX
CAL 1, AX
OPR 0, AX

01001010101000010
01010100101010010
10100100000011011
11010010110101111
00010010101010010
10101001010101011

Assembler
Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 6

Outline
Language
  Lexical Analysis (Lexer)
    Symbols, Rules, Token
    Tools: Regular Expression, DFA
  Syntactic Analysis (Parser)
    Grammar (Rules): Terminal, Non-terminal
    Tools: BNF (Backus-Naur Form), Syntax Diagrams

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 7

Grammar | Example
Describe all legal arithmetic expressions using
addition, subtraction, multiplication, and division with
integer values
E → E OP E
E → integer
OP → + | - | * | /
E → ( E )

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 8

Grammar | Definition
A Grammar is a collection of four elements:

Set of nonterminal symbols (uppercase)
Set of terminal symbols (lowercase). Terminals can be tokens or specific words
Set of production rules saying how each nonterminal can be replaced by a string of terminals and nonterminals
A start symbol

E → E OP E
E → integer
OP → + | - | * | /
E → ( E )

Javier Gonzalez-Sanchez | CSE340 | Summer 2013 | 9

Grammar | Derivation
Input:  5 / 20
Tokens: integer operator integer

E
E OP E
integer OP E
integer / E
integer / integer

E → E OP E
E → integer
OP → + | - | * | /
E → ( E )

Javier Gonzalez-Sanchez | CSE340 | Summer 2013 | 10

Grammar | Derivation
Input:  5 * ( 7 + 20 )
Tokens: integer operator delimiter integer operator integer delimiter

E
E OP E
integer OP E
integer * E
integer * (E)
integer * (E OP E)
integer * (integer OP E)
integer * (integer + E)
integer * (integer + integer)

E → E OP E
E → integer
OP → + | - | * | /
E → ( E )

Javier Gonzalez-Sanchez | CSE340 | Summer 2013 | 11

Grammar | Derivation
Input:  5 * ( 7 + 20 )
Tokens: integer operator delimiter integer operator integer delimiter

E
E OP E
E OP (E)
E OP (E OP E)
E OP (E OP integer)
E OP (E + integer)
E OP (integer + integer)
E * (integer + integer)
integer * (integer + integer)

E → E OP E
E → integer
OP → + | - | * | /
E → ( E )

Javier Gonzalez-Sanchez | CSE340 | Summer 2013 | 12

Derivations
A leftmost derivation is a derivation in which
each step expands the leftmost
nonterminal.
A rightmost derivation is a derivation in
which each step expands the rightmost
nonterminal.
Derivation will be very important when we
talk about parsing.

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 13

Grammar | Example
H2 O
C O2 (S O4)3
Na Cl
S O3

Notation 1:
Comp → Mix | Mix Num | Comp Comp
Mix → Elem | ( Comp )
Elem → H | O | C | S | Na | Cl | ...
Num → 1 | 2 | 3 | 4 | ...

Notation 2:
<Comp> → <Mix> | <Mix><Num> | <Comp><Comp>
<Mix> → <Elem> | ( <Comp> )
<Elem> → H | O | C | S | Na | Cl | ...
<Num> → 1 | 2 | 3 | 4 | ...

Javier Gonzalez-Sanchez | CSE340 | Summer 2013 | 14

Grammar | Derivation
C

Comp
Comp Comp
Term Comp
Elem Comp
C Comp
C Term Num
C Elem Num
CO Num
CO2

Comp → Term | Term Num | Comp Comp
Term → Elem | ( Comp )
Elem → H | O | C | S | Na | Cl | ...
Num → 1 | 2 | 3 | 4 | ...

Javier Gonzalez-Sanchez | CSE340 | Summer 2013 | 15

What about this?


BLOCK → STMT | { STMTS } | { }
STMTS → STMT | STMT STMTS
STMT → EXPR; |
       if (EXPR) BLOCK |
       while (EXPR) BLOCK |
       BLOCK |
       . . .
EXPR → EXPR + EXPR |
       EXPR - EXPR |
       EXPR * EXPR |
       identifier |
       integer |
       ...
Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 16

Homework

Using the rules in the previous slide, apply derivation to show that the following
expression is syntactically correct

while ( 5 ) { if ( 6 ) { } }

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 17

Homework

Review Recursion
Solve the Problem Set #1 in preparation for your exam

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 18


CSE340 - Principles of
Programming Languages
Lecture 08:
Syntactic Analysis II

Javier Gonzalez-Sanchez
javiergs@asu.edu
BYENG M1-21
Office Hours: By appointment

Outline

Language
  Syntactic Analysis (Parser)
    Grammar (Rules): Terminal, Non-terminal
    Derivation
    Parse Tree
    Tools: BNF, Syntax Diagrams

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 2

Grammar | Derivation
Input:  5 * ( 7 + 20 )
Tokens: integer operator delimiter integer operator integer delimiter

E
E OP E
integer OP E
integer * E
integer * (E)
integer * (E OP E)
integer * (integer OP E)
integer * (integer + E)
integer * (integer + integer)

E → E OP E
E → integer
OP → + | - | * | /
E → ( E )

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 3

Grammar | Derivation
Input:  5 * ( 7 + 20 )
Tokens: integer operator delimiter integer operator integer delimiter

E
E OP E
E OP (E)
E OP (E OP E)
E OP (E OP integer)
E OP (E + integer)
E OP (integer + integer)
E * (integer + integer)
integer * (integer + integer)

E → E OP E
E → integer
OP → + | - | * | /
E → ( E )

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 4

Parse Tree
A parse tree is a tree encoding the steps in a
derivation.
Internal nodes represent nonterminal symbols used
in the production.
Inorder walk of the leaves contains the generated
string.
Encodes what productions are used, not the order
in which those productions are applied.

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 5

Parse Tree

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 6

Parse Tree

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 7

Goal
Goal of syntax analysis:
Recover the structure described by a series of
tokens.
Recover a parse tree for the given input.

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 8

The problem
Input:  5 * 7 + 20
Tokens: integer operator integer operator integer

E → E OP E
E → integer
OP → + | - | * | /
E → ( E )

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 9

A serious problem

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 10

Ambiguity
A grammar is said to be ambiguous if there is at
least one string with two or more parse trees.
Note that ambiguity is a property of grammars, not
languages.

We will review this topic in the next lecture


Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 11

Our Tools
Backus-Naur Form (BNF)
Extended Backus-Naur Form (EBNF)
Syntax Diagrams

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 12

BNF (Backus-Naur Form)


Formal, mathematical way to specify grammars
All the previous examples, where we use:
→ or ::=        is defined as
|               or
<nonterminal>   nonterminal (alternatively, written in uppercase)
terminal        terminal (lowercase)

* John Backus and Peter Naur


Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 13

EBNF
Extended BNF includes notation to indicate:

0 or more occurrences {}
1 or more occurrences +
0 or 1 occurrences []
Use of parentheses for grouping ( )

* Niklaus Wirth
Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 14

Example
Grammar rule for calling a method:
draw(x, y, z);
print (a, b, c, d);
done();
foobar(one, two, three, four, five);
sqrt(x);

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 15

BNF vs EBNF
BNF
<call_method> → identifier(<identifiers>); | identifier();
<identifiers> → identifier | identifier,<identifiers>

EBNF
<call_method> → identifier ('('<identifiers>')' | '('')' ) ';'
<identifiers> → identifier | identifier,<identifiers>

EBNF
<call_method> → identifier'('[<identifiers>]')' ';'
<identifiers> → identifier | identifier,<identifiers>

EBNF
<call_method> → identifier'('[<identifiers>]')' ';'
<identifiers> → identifier { ,identifier }
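
The last EBNF version maps almost line by line onto a recursive-descent recognizer: the optional part [ ] becomes an if, and the repetition { } becomes a while. The sketch below is an illustration under stated assumptions: the class name `CallMethodParser` and its small tokenizer are hypothetical, and every identifier is reduced to the token name identifier.

```java
import java.util.ArrayList;
import java.util.List;

final class CallMethodParser {
    private final List<String> tokens;
    private int pos = 0;

    private CallMethodParser(List<String> tokens) {
        this.tokens = tokens;
    }

    /** Returns true when the source is exactly one method call. */
    static boolean accepts(String source) {
        CallMethodParser parser = new CallMethodParser(tokenize(source));
        return parser.callMethod() && parser.pos == parser.tokens.size();
    }

    // Hypothetical tokenizer: letter-led runs become "identifier"; any other
    // non-blank character is a one-character token.
    private static List<String> tokenize(String s) {
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < s.length()) {
            char c = s.charAt(i);
            if (Character.isWhitespace(c)) {
                i++;
            } else if (Character.isLetter(c)) {
                while (i < s.length() && (Character.isLetterOrDigit(s.charAt(i)) || s.charAt(i) == '_')) {
                    i++;
                }
                out.add("identifier");
            } else {
                out.add(String.valueOf(c));
                i++;
            }
        }
        return out;
    }

    private boolean peek(String expected) {
        return pos < tokens.size() && tokens.get(pos).equals(expected);
    }

    private boolean match(String expected) {
        if (peek(expected)) {
            pos++;
            return true;
        }
        return false;
    }

    // <call_method> -> identifier '(' [<identifiers>] ')' ';'
    private boolean callMethod() {
        if (!match("identifier") || !match("(")) return false;
        if (peek("identifier") && !identifiers()) return false; // optional part [ ]
        return match(")") && match(";");
    }

    // <identifiers> -> identifier { ',' identifier }
    private boolean identifiers() {
        if (!match("identifier")) return false;
        while (match(",")) {                                     // repetition { }
            if (!match("identifier")) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        for (String call : new String[] {"draw(x, y, z);", "done();", "sqrt(x);", "draw(x,);"}) {
            System.out.println(call + "  " + accepts(call));
        }
    }
}
```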

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 16

Syntax Diagrams
Call_method

Identifiers

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 17

Is it BNF or EBNF?
BLOCK → STMT | { STMTS } | { }
STMTS → STMT | STMT STMTS
STMT → EXPR; |
       if (EXPR) BLOCK |
       while (EXPR) BLOCK |
       BLOCK |
       . . .
EXPR → EXPR + EXPR |
       EXPR - EXPR |
       EXPR * EXPR |
       identifier |
       integer |
       ...
Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 18

Syntax Diagrams
BLOCK

STMTS

STMT

EXPR

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 19

Is it BNF or EBNF?
BLOCK → STMT | '{' { STMT } '}'
STMT → EXPR; |
       if '(' EXPR ')' BLOCK |
       while '(' EXPR ')' BLOCK |
       BLOCK |
       . . .
EXPR → EXPR '+' EXPR |
       EXPR '-' EXPR |
       EXPR '*' EXPR |
       identifier |
       integer |
       ...
Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 20

Syntax Diagrams
BLOCK

STMT

EXPR

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 21

Homework

Create a Parse Tree for the following expression.


Use the rules stated in the previous lecture

while ( 5 ) { if ( 6 ) { } }

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 22


CSE340 - Principles of
Programming Languages
Lecture 09:
Grammars 1

Javier Gonzalez-Sanchez
javiergs@asu.edu
BYENG M1-21
Office Hours: By appointment

Ambiguity
A grammar is said to be ambiguous if there is at
least one string with two or more parse trees.
Note that ambiguity is a property of grammars, not
languages.
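
To make the definition concrete, the sketch below counts the parse trees that the grammar E → E OP E | integer admits for a flat expression such as 5 * 7 + 20. It is only an illustration: the class name `AmbiguityCount` is hypothetical, parentheses are left out, and the input is assumed to be well-formed with whitespace between tokens.

```java
final class AmbiguityCount {
    /**
     * Counts parse trees for tokens[lo..hi] under E -> E OP E | integer.
     * Tokens are assumed to alternate: integer, operator, integer, ...
     */
    static long countTrees(String[] tokens, int lo, int hi) {
        if (lo == hi) {
            return 1; // a single integer: only E -> integer applies
        }
        long total = 0;
        // Every operator can be the OP of the top-level production E -> E OP E.
        for (int op = lo + 1; op < hi; op += 2) {
            total += countTrees(tokens, lo, op - 1) * countTrees(tokens, op + 1, hi);
        }
        return total;
    }

    static long countTrees(String expression) {
        String[] tokens = expression.trim().split("\\s+");
        return countTrees(tokens, 0, tokens.length - 1);
    }

    public static void main(String[] args) {
        System.out.println(countTrees("5 * 7 + 20"));
        System.out.println(countTrees("1 + 2 + 3 + 4"));
    }
}
```

Any result greater than 1 for some string is enough to show that the grammar is ambiguous.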

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 2

Ambiguity

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 3

Solution
If a grammar can be made unambiguous at all, it is
usually made unambiguous through layering.
Have exactly one way to build each piece of the
string.
Have exactly one way of combining pieces back
together.
Recursive constructions

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 4

Layering
Root
Start symbol
Rule 1
Rule 2
Rule 3

Layers

Rule 4
Rule 5
.....
Leaf (Terminals, i.e., Tokens)
Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 5

Layering
inputs:

1+2+3<4*5
1*2+3+4<5
1<2+3+4*5
1+2<3*4+5
1+2*3<4+5

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 6

Exercise
Provide a Grammar that is Not ambiguous
for arithmetic expressions

10 * 20 + 15
Precedence of operators and Associativity

(10 * 20) + 15
10 * (20 + 15)
Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 7

Exercise | Precedence
Original Grammar:
E → E OP E
E → integer
OP → + | - | * | /

New Grammar (blanks to be filled in):
<E> → <__> <__>
< > → <__> <__>
< > → <__> <__>
< > → <__> <__>
< > → - <__>

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 8

Exercise | Hand written notes

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 9

Exercise | Hand written notes

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 10

Exercise | Precedence

<E> → <A> | <A> {'+' <A>} | <A> {'-' <A>}
<A> → <B> | <B> {'*' <B>} | <B> {'/' <B>}
<B> → '-' <C> | <C>
<C> → integer

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 11

Exercise | Precedence

<E> → <A> {('+'|'-') <A>}
<A> → <B> {('*'|'/') <B>}
<B> → ['-'] <C>
<C> → integer
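
To see how the layers give * and / higher precedence than + and -, here is a minimal sketch of an evaluator with one method per rule of the grammar E, A, B, C. It is not the course parser: the class name PrecedenceDemo, the whitespace-separated input, and the evaluation itself are assumptions made for illustration.

```java
class PrecedenceDemo {
    private final String[] tokens; // input split on whitespace
    private int current = 0;       // index of the next unused token

    PrecedenceDemo(String input) {
        tokens = input.trim().split("\\s+");
    }

    private boolean at(String word) {
        return current < tokens.length && tokens[current].equals(word);
    }

    // <E> -> <A> {('+'|'-') <A>}   lowest precedence, left associative
    int ruleE() {
        int value = ruleA();
        while (at("+") || at("-")) {
            String op = tokens[current++];
            int right = ruleA();
            value = op.equals("+") ? value + right : value - right;
        }
        return value;
    }

    // <A> -> <B> {('*'|'/') <B>}   binds tighter than + and -
    int ruleA() {
        int value = ruleB();
        while (at("*") || at("/")) {
            String op = tokens[current++];
            int right = ruleB();
            value = op.equals("*") ? value * right : value / right;
        }
        return value;
    }

    // <B> -> ['-'] <C>   optional unary minus
    int ruleB() {
        if (at("-")) {
            current++;
            return -ruleC();
        }
        return ruleC();
    }

    // <C> -> integer
    int ruleC() {
        if (current >= tokens.length) {
            throw new IllegalStateException("expected integer");
        }
        return Integer.parseInt(tokens[current++]);
    }

    static int evaluate(String input) {
        PrecedenceDemo parser = new PrecedenceDemo(input);
        int value = parser.ruleE();
        if (parser.current != parser.tokens.length) {
            throw new IllegalStateException("unexpected token");
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(evaluate("10 * 20 + 15")); // grouped as (10 * 20) + 15
        System.out.println(evaluate("10 - 2 - 3"));   // grouped as (10 - 2) - 3
    }
}
```

Because ruleE only sees the results of complete ruleA calls, a product is always finished before it takes part in a sum; the while loops make each operator left associative.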

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 12

Syntax Diagrams
B

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 13

Exercise 1
Add rules for handling parentheses to the previous
grammar. The grammar should accept the
following expressions as correct:

10 * 20 + 15
(10 * 20) + 15
10 * (20 + 15)
(10) * (20) + (15)
(10 * 20 + 15)
10 * (20) + 15

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 14

Exercise 1 | Hand written notes

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 15

Syntax Diagrams
B

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 16

Exercise 2 | Hand written notes

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 17

Exercise 2 | Hand written notes

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 18

Exercise 2
Add rules to accept variable names (identifiers) in
expressions. The grammar should accept the
following expressions as correct:

A * 20 + time
(x * y) + 15
10 * (ASU + cse340)
(10) * (20) + (15)
(hello * world + Arizona)
10 * (counter) + 15

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 19

Syntax Diagrams
B

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 20

Exercise 3
Provide an unambiguous grammar that encodes
operator precedence and associativity
for this expression:

10 + 20 > 15 & -10 != 1 | 20 / ( 10 + 1) < 5

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 21

Exercise 3 | Note
Precedence of operators
|
&
!
< > == != <= >=
+ - * /
()

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 22

Exercise 3 | Hand written notes

...

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 23

Exercise 3 | Hand written notes



Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 24

Exercise 3

EXPRESSION

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 25

Exercise 3
B

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 26


CSE340 - Principles of
Programming Languages
Lecture 10:
Grammars II

Javier Gonzalez-Sanchez
javiergs@asu.edu
BYENG M1-21
Office Hours by appointment

Exam 1 | Review
Draw a DFA equivalent to the regular expression
email2 = character+ \. character+ @ character+ \. domain

Note:
a*

a+ = a a*
Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 2

Exam 1 | Review
Which tokens are being defined?

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 3

Exam 1 | Review
Create a grammar to validate well-written references:

Byron Lahey, Audrey Girouard, Winslow Burleson, and Roel


Vertegaal. 2011. Understanding the use of bend gestures in
mobile devices with flexible electronic paper displays. In
Proceedings of the SIGCHI Conference on Human Factors in
Computing Systems (CHI '11). ACM, New York, NY, USA,
1303-1312.

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 4

Exam 1 | Review
Create a grammar to validate well-written references:
Byron Lahey, Audrey Girouard, Winslow Burleson, and Roel Vertegaal.
2011.
Understanding the use of bend gestures in mobile devices with flexible electronic
paper displays.

In Proceedings of the SIGCHI Conference on Human Factors in


Computing Systems (CHI '11).

ACM,
New York, NY, USA,
1303-1312.
Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 5

Exam 1 | Review
Terminals
string
number
.
,
-

Non-terminals
<AUTHORS>
<YEAR>
<TITLE>
<CONFERENCE>
<PUBLISHER>
<ADDRESS>
<PAGES>

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 6

Exam 1 | Review
Rules
<REFERENCE>
<AUTHORS>
<YEAR>
<TITLE>
<CONFERENCE>
<PUBLISHER>
<ADDRESS>
<PAGES>

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 7

Review | Hand written notes

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 8

Review | Hand written notes

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 9

Review | Hand written notes

Syntax diagram

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 10

Review | Hand written notes


Parse tree
(it is incomplete)

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 11

Exam 1 | Review

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 12

Exam 1 | Review

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 13

Review | Hand written notes

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 14


CSE340 - Principles of
Programming Languages
Lecture 11:
Parser Implementation I

Javier Gonzalez-Sanchez
javiergs@asu.edu
BYENG M1-21
Office Hours: By appointment

Parser
A

Grammar
AA

BNF
EBNF

Parser

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 2

Parser | Grammar 1
<EXPRESSION> → <X> {'|' <X>}
<X> → <Y> {'&' <Y>}
<Y> → ['!'] <R>
<R> → <E> {('>'|'<'|'=='|'!=') <E>}
<E> → <A> {('+'|'-') <A>}
<A> → <B> {('*'|'/') <B>}
<B> → ['-'] <C>
<C> → integer | identifier | '(' <EXPRESSION> ')'

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 3

Parser | Grammar 2
<PROGRAM> → '{' <BODY> '}'
<BODY> → {<EXPRESSION> ';'}
<EXPRESSION> → <X> {'|' <X>}
<X> → <Y> {'&' <Y>}
<Y> → ['!'] <R>
<R> → <E> {('>'|'<'|'=='|'!=') <E>}
<E> → <A> {('+'|'-') <A>}
<A> → <B> {('*'|'/') <B>}
<B> → ['-'] <C>
<C> → integer | identifier | '(' <EXPRESSION> ')'

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 4

Parser | Input and Output


{
0;
1 + 2;
3 * (4 + hello);
}

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 5

Parser | Step by Step


For each rule in the grammar {
Step 1. left-hand side (new method)
Step 2. right-hand side (loops, ifs, call methods)
Step 3. identify errors (terminals)
Step 4. synchronize errors (first and follow sets)
}

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 6

Parser
import java.util.Vector;

public class Parser {
  private static Vector<Token> tokens;   // tokens produced by the lexer
  private static int currentToken;       // index of the token being examined
  // one method per grammar rule
  public static void RULE_PROGRAM () {}
  public static void RULE_BODY () {}
  public static void RULE_EXPRESSION () {}
  public static void RULE_X () {}
  public static void RULE_Y () {}
  public static void RULE_R () {}
  public static void RULE_E () {}
  public static void RULE_A () {}
  public static void RULE_B () {}
  public static void RULE_C () {}
}
Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 7

Parser
PROGRAM

public static void RULE_PROGRAM() {
  // <PROGRAM> → '{' <BODY> '}'
  if (tokens.get(currentToken).getWord().equals("{"))
    currentToken++;
  else
    error(1);
  RULE_BODY();
  if (tokens.get(currentToken).getWord().equals("}"))
    currentToken++;
  else
    error(2);
}

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 8

Parser
BODY

public static void RULE_BODY() {
  // <BODY> → {<EXPRESSION> ';'}
  while (!tokens.get(currentToken).getWord().equals("}")) {
    RULE_EXPRESSION();
    if (tokens.get(currentToken).getWord().equals(";"))
      currentToken++;
    else
      error(3);
  }
}
Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 9

Parser
EXPRESSION

public static void RULE_EXPRESSION() {
  // <EXPRESSION> → <X> {'|' <X>}
  RULE_X();
  while (tokens.get(currentToken).getWord().equals("|")) {
    currentToken++;
    RULE_X();
  }
}

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 10

Parser
X

public static void RULE_X() {
  // <X> → <Y> {'&' <Y>}
  RULE_Y();
  while (tokens.get(currentToken).getWord().equals("&")) {
    currentToken++;
    RULE_Y();
  }
}

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 11

Parser
Y

public static void RULE_Y() {
  // <Y> → ['!'] <R>
  if (tokens.get(currentToken).getWord().equals("!")) {
    currentToken++;
  }
  RULE_R();
}

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 12

Parser
R

public static void RULE_R() {
  // <R> → <E> {('>'|'<'|'=='|'!=') <E>}
  RULE_E();
  while (tokens.get(currentToken).getWord().equals("<")
      || tokens.get(currentToken).getWord().equals(">")
      || tokens.get(currentToken).getWord().equals("==")
      || tokens.get(currentToken).getWord().equals("!=")) {
    currentToken++;
    RULE_E();
  }
}

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 13

Parser
E

public static void RULE_E() {
  // <E> → <A> {('+'|'-') <A>}
  RULE_A();
  while (tokens.get(currentToken).getWord().equals("-")
      || tokens.get(currentToken).getWord().equals("+")) {
    currentToken++;
    RULE_A();
  }
}

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 14

Parser
A

public static void RULE_A() {
  // <A> → <B> {('*'|'/') <B>}
  RULE_B();
  while (tokens.get(currentToken).getWord().equals("/")
      || tokens.get(currentToken).getWord().equals("*")) {
    currentToken++;
    RULE_B();
  }
}

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 15

Parser
B

public static void RULE_B() {
  // <B> → ['-'] <C>
  if (tokens.get(currentToken).getWord().equals("-")) {
    currentToken++;
  }
  RULE_C();
}

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 16

Parser
C

public static void RULE_C() {
  // <C> → integer | identifier | '(' <EXPRESSION> ')'
  if (tokens.get(currentToken).getToken().equals("integer")) {
    currentToken++;
  } else if (tokens.get(currentToken).getToken().equals("identifier")) {
    currentToken++;
  } else if (tokens.get(currentToken).getWord().equals("(")) {
    currentToken++;
    RULE_EXPRESSION();
    if (tokens.get(currentToken).getWord().equals(")")) {
      currentToken++;
    } else {
      error(4);
    }
  } else {
    error(5);
  }
}
Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 17

Homework

Programming Assignment 2
Level 1

Review and understand the source code posted on Blackboard,
particularly the use of DefaultMutableTreeNode.

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 18

Homework

Programming Assignment 2
Level 2

Modify the Source Code


to include the rules PROGRAM and BODY, EXPRESSION, X, Y, R
(from Grammar 2)

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 22


CSE340 - Principles of
Programming Languages
Lecture 12:
Parser Implementation II

Javier Gonzalez-Sanchez
javiergs@asu.edu
BYENG M1-21
Office Hours: By appointment

Review

Programming Assignment 2
Level 1

Review and understand the source code posted on Blackboard,
particularly the use of DefaultMutableTreeNode.

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 2

Review

* Parser.java is the only file that you are allowed to modify


Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 3

Review

Programming Assignment 2
Level 2

Modify the Source Code


to include the rules PROGRAM and BODY, EXPRESSION, X, Y, R
(from Grammar 2)

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 4

Review
<PROGRAM> → '{' <BODY> '}'
<BODY> → {<EXPRESSION> ';'}
<EXPRESSION> → <X> {'|' <X>}
<X> → <Y> {'&' <Y>}
<Y> → ['!'] <R>
<R> → <E> {('>'|'<'|'=='|'!=') <E>}
<E> → <A> {('+'|'-') <A>}
<A> → <B> {('*'|'/') <B>}
<B> → ['-'] <C>
<C> → integer | identifier | '(' <EXPRESSION> ')'

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 5

Programming Assignment 2
Level 3

The complete grammar for our language

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 6

Assignment 2 | Language
Language
  Expressions (operators)
    ARITHMETIC OPERATORS { +, -, *, /, = }
    LOGIC OPERATORS { &, |, ! }
    RELATIONAL OPERATORS { <, >, ==, != }
  Actions
    Instructions: KEYWORD { return, print }
    Control Structures: KEYWORD { if, else, while, switch, case }
  Data
    KEYWORD { void, int, char, string, float, boolean }
    KEYWORD { true, false }
    BINARY, INTEGER, FLOAT, HEXADECIMAL, OCTAL, CHAR, STRING, IDENTIFIER
  Delimiter
    : ; , () {} []
Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 7

Assignment 2 | Grammar
<PROGRAM> → '{' <BODY> '}'
<BODY> → {<PRINT>';' | <ASSIGNMENT>';' | <VARIABLE>';' | <WHILE> | <IF> | <RETURN>';'}
<ASSIGNMENT> → identifier '=' <EXPRESSION>
<VARIABLE> → ('int'|'float'|'boolean'|'char'|'string'|'void') identifier
<WHILE> → 'while' '(' <EXPRESSION> ')' <PROGRAM>
<IF> → 'if' '(' <EXPRESSION> ')' <PROGRAM> ['else' <PROGRAM>]
<RETURN> → 'return'
<PRINT> → 'print' '(' <EXPRESSION> ')'
<EXPRESSION> → <X> {'|' <X>}
<X> → <Y> {'&' <Y>}
<Y> → ['!'] <R>
<R> → <E> {('>'|'<'|'=='|'!=') <E>}
<E> → <A> {('+'|'-') <A>}
<A> → <B> {('*'|'/') <B>}
<B> → ['-'] <C>
<C> → integer | octal | hexadecimal | binary | true | false |
      string | char | float | identifier | '(' <EXPRESSION> ')'

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 8

Assignment 2 | Input
Are there syntactical errors?
{
float a;
x = 0;
int x;
y = 1 + 1;
x = (0b11) +(05 0xFF34);
while (2 == "hi") {
a = 2 > (4 + Y);
if (true) { if( 2 + 2 ) {} else {} }
}
print ("hello" + "world");
}
Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 9

Assignment 2 | Input
Are there syntactical errors?
{
int x;
x = 5;
x = 05;
x = 0x5ff;
x = 5.55;
x = "five";
x = 5';
x = false;

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 10

Assignment 2 | Input
Are there syntactical errors?
{
int x;
float x;
string x;
char x;
void x;
boolean x;

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 11

Assignment 2 | Input
Are there syntactical errors?
{
x = "hello" + "world" 'w' * 5 / 3.4;
x = y hello & 0xffff | 05;
x = -7;
x = !y;
x = (cse340 + cse310) / cse101 ;

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 12

Assignment 2 | Code

------------ program(root);

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 13

Assignment 2 | Code
PROGRAM

public static void RULE_PROGRAM() {
  // <PROGRAM> → '{' <BODY> '}'
  if (tokens.get(currentToken).getWord().equals("{"))
    currentToken++;
  else
    error(1);
  RULE_BODY();
  if (tokens.get(currentToken).getWord().equals("}"))
    currentToken++;
  else
    error(2);
}

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 14

Assignment 2 | Code
BODY

public static void RULE_BODY() {
  // Choose the statement rule by looking at the current token only
  while (!tokens.get(currentToken).getWord().equals("}")) {
    if (tokens.get(currentToken).getToken().equals("identifier")) {
      RULE_ASSIGNMENT();
      if (tokens.get(currentToken).getWord().equals(";"))
        currentToken++;
      else
        error(3);
    } else if (tokens.get(currentToken).getWord().equals("int") || ...) {
      // type keywords are compared by word, not by token type
      RULE_VARIABLE();
      if (tokens.get(currentToken).getWord().equals(";"))
        currentToken++;
      else
        error(3);
    } else if (tokens.get(currentToken).getWord().equals("while")) {
      RULE_WHILE();
    } else if (tokens.get(currentToken).getWord().equals("if")) {
      RULE_IF();
    } else if (tokens.get(currentToken).getWord().equals("return")) {
      RULE_RETURN();
      if (tokens.get(currentToken).getWord().equals(";"))
        currentToken++;
      else
        error(3);
    } else {
      error(4);
    }
  }
}
Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 15

Assignment 2 | Code
ASSIGNMENT

public static void RULE_ASSIGNMENT() {

}
Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 16

Assignment 2 | Code
VARIABLE

public static void RULE_VARIABLE() {

}
Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 17

Assignment 2 | Code
WHILE

public static void RULE_WHILE() {

}
Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 18

Assignment 2 | Code
IF

public static void RULE_IF() {

}
Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 19

Assignment 2 | Code
RETURN

public static void RULE_RETURN() {

}
Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 20

Assignment 2 | Code
PRINT

public static void RULE_PRINT () {

}
Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 21

Assignment 2 | Code
C

public static void RULE_C() {

}
Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 22

PREDICTIVE RECURSIVE DESCENT PARSER
Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 23

Concepts
{
int a;
a = 0xFF + 0b111;
while (a != 05) {
if (true) {
a = 2.5e-1 / 7;
} else {
a = 'A;
while(true) {
}
}
}
print ("hello");
}

PREDICTIVE RECURSIVE DESCENT PARSER
Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 24

Homework

Programming Assignment #2
(Complete Levels 1 to 3)

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 25


CSE340 - Principles of
Programming Languages
Lecture 13:
Parsing Techniques I

Javier Gonzalez-Sanchez
javiergs@asu.edu
BYENG M1-38
Office Hours: By appointment

Assignment 2
1. Understand the provided source code (3 rules)
2. Program the rules PROGRAM, BODY, EXPRESSION, X, Y, R, E, C (11 rules)
3. Program the full set of rules in the grammar (16 rules)
4. Report syntactical errors (one error and stop): PANIC MODE
5. Implement error synchronization: ERROR RECOVERY
Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 2

Programming Assignment 2
Level 4

Handling Syntactical Errors (part 1):


Error messages

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 3

Assignment 2 | Grammar
<PROGRAM> → '{' <BODY> '}'
<BODY> → {<PRINT>';' | <ASSIGNMENT>';' | <VARIABLE>';' | <WHILE> | <IF> | <RETURN>';'}
<ASSIGNMENT> → identifier '=' <EXPRESSION>
<VARIABLE> → ('int'|'float'|'boolean'|'char'|'string'|'void') identifier
<WHILE> → 'while' '(' <EXPRESSION> ')' <PROGRAM>
<IF> → 'if' '(' <EXPRESSION> ')' <PROGRAM> ['else' <PROGRAM>]
<RETURN> → 'return'
<PRINT> → 'print' '(' <EXPRESSION> ')'
<EXPRESSION> → <X> {'|' <X>}
<X> → <Y> {'&' <Y>}
<Y> → ['!'] <R>
<R> → <E> {('>'|'<'|'=='|'!=') <E>}
<E> → <A> {('+'|'-') <A>}
<A> → <B> {('*'|'/') <B>}
<B> → ['-'] <C>
<C> → integer | octal | hexadecimal | binary | true | false |
      string | char | float | identifier | '(' <EXPRESSION> ')'

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 4

Error Synchronization
PROGRAM

RETURN

PRINT

BODY

EXPRESSION

ASSIGNMENT

VARIABLE

IF

WHILE

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 5

Assignment 2
Input:
{}

Output:
Build successful

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 6

Assignment 2
Input:
{
hello word
}

Output:
Line 2: expected =
Line 2: expected ;

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 7

Assignment 2
Input:
{
int x
int
int int x;
}

Output:
Line 2: expected ;
Line 3: expected identifier
Line 3: expected ;
Line 4: expected identifier

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 8

Assignment 2
Input:
{
x = a;
x = 0x36AW;
x = ((((((((((y))))))))));
x = (5+(4-(3+(5+5/(2+(3+(1+(77+(1-(y)))))))))) + hello + q;
if (a < b) {} else {}
if (a < b) {
if (a < b) {
} else {
}
}
}

Output:
Line 3: expected value, identifier or (

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 9

Parser | Error Points

Line N: expected {

PROGRAM

Line N: expected }

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 10

Parser | Error Points


Line N: expected ;
BODY

Line N: expected identifier or keyword


Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 11

Parser | Error Points

ASSIGNMENT

Line N: expected =
Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 12

Parser | Error Points

VARIABLE

Line N: expected identifier


Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 13

Parser | Error Points


Line N: expected )

WHILE

Line N: expected (
Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 14

Parser | Error Points


Line N: expected (

Line N: expected )

IF

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 15

Parser | Error Points

RETURN

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 16

Parser | Error Points


Line N: expected (

Line N: expected )

PRINT

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 17

Parser | Error Points

EXPRESSION

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 18

Parser | Error Points


C

Line N: expected value, identifier or (

Line N: expected )
Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 19

Assignment 2
public static void error(int err) {
  // Report the expected symbol at the line of the current token
  int n = tokens.get(currentToken).getLine();
  switch (err) {
    case 1: gui.writeConsole("Line " + n + ": expected {"); break;
    case 2: gui.writeConsole("Line " + n + ": expected }"); break;
    case 3: gui.writeConsole("Line " + n + ": expected ;"); break;
    case 4:
      gui.writeConsole("Line " + n + ": expected identifier or keyword");
      break;
    case 5:
      gui.writeConsole("Line " + n + ": expected ="); break;
    case 6:
      gui.writeConsole("Line " + n + ": expected identifier"); break;
    case 7:
      gui.writeConsole("Line " + n + ": expected )"); break;
    case 8:
      gui.writeConsole("Line " + n + ": expected ("); break;
    case 9:
      gui.writeConsole("Line " + n + ": expected value, identifier, (");
      break;
  }
}
Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 20

Updates
1. In Gui.java: make the method writeConsole public

2. In Gui.java: add this as a second parameter for the method Parser.run()

3. In Parser.java:
add the attribute gui (line 15)
add a second parameter to the method run (line 17)
initialize gui (line 18)

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 21

Programming Assignment 2
Level 5

Handling Syntactical Errors (part 2):


Error Recovery

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 22

Error Recovery
Input:
{
x = a;
x = 1 + (
x = (y;)
if (a < b + ) {} else {}
if (a < b) {
if (a < b) {
} else {
}
}

We will not allow multiline expressions; lines 3 and 4 should not be
considered as follows:
x = 1 + ( x = ( y;)

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 23

Error Synchronization
PROGRAM

RETURN

PRINT

BODY

EXPRESSION

ASSIGNMENT

VARIABLE

IF

WHILE

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 24

Error Recovery
Output:
Line 3: expected value, identifier, (
Line 3: expected )
Line 3: expected ;
// move to the next line
Line 4: expected )
Line 4: expected identifier or keyword
// infinite loop or end
Line 5: expected value, identifier, (
// simple
Line 12: expected }
// reported by program
Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 25

Error Recovery
At this point:
Errors do not increment currentToken.
currentToken increases only when the token is used
(added to the tree).
Error recovery is about ignoring tokens.
How do we know which tokens should be ignored?

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 26

Error Recovery

to be continued...

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 27

Homework

Programming Assignment #2

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 28


CSE340 - Principles of
Programming Languages
Lecture 14:
Parsing Techniques II

Javier Gonzalez-Sanchez
javiergs@asu.edu
BYENG M1-38
Office Hours: By appointment

Error Recovery
Input:
{
x = a;
x = 1 + (
x = (y;)
if (a < b + ) {} else {}
if (a < b) {
if (a < b) {
} else {
}
}

We will not allow multiline expressions; lines 3 and 4 should not be
considered as follows:
x = 1 + ( x = ( y;)

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 2

Error Recovery
PROGRAM

RETURN

PRINT

BODY

EXPRESSION

ASSIGNMENT

VARIABLE

IF

WHILE

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 3

Error Recovery
Output:
Line 3: expected value, identifier, (
Line 3: expected )
Line 3: expected ;
// move to the next line
Line 4: expected )
Line 4: expected identifier or keyword
// infinite loop or end
Line 5: expected value, identifier, (
// simple
Line 12: expected }
// reported by program
Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 4

Error Recovery
At this point:
Errors do not increment currentToken.
currentToken increases only when the token is used
(added to the tree).
Error recovery is about ignoring tokens.
How do we know which tokens should be ignored?

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 5

Parser | Error Recovery


Line N: expected {
currentToken++;
Searching for

FIRST(BODY) or }

PROGRAM

Line N: expected }

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 6

Parser | Error Recovery

Line N: expected ;

BODY

Line N: expected identifier or keyword

currentToken++;
Searching for

FIRST(BODY) or FOLLOW(BODY)
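
The "currentToken++; searching for ..." step can be sketched as a small helper that skips tokens until it reaches one that belongs to a synchronization set. This is only an illustration under stated assumptions: tokens are plain strings rather than the course's Token objects, and the names PanicModeDemo and synchronize are hypothetical.

```java
import java.util.List;
import java.util.Set;

class PanicModeDemo {
    // Starting at `current`, ignore tokens until one of them is in the
    // synchronization set (for example FIRST or FOLLOW of a rule) or the
    // input ends. Returns the index where parsing can resume.
    static int synchronize(List<String> tokens, int current, Set<String> syncSet) {
        while (current < tokens.size() && !syncSet.contains(tokens.get(current))) {
            current++; // this token is ignored
        }
        return current;
    }

    public static void main(String[] args) {
        // x = 1 + * * 2 ; y = 3 ;   (the first '*' at index 4 is unexpected)
        List<String> tokens = List.of(
                "x", "=", "1", "+", "*", "*", "2", ";", "y", "=", "3", ";");
        // Resume at a token that may follow an expression: ')' or ';'
        int resume = synchronize(tokens, 4, Set.of(")", ";"));
        System.out.println("resume at index " + resume + ": " + tokens.get(resume));
    }
}
```

The caller decides which set to pass in, so the same helper serves every error point; only the FIRST and FOLLOW sets differ from rule to rule.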

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 7

Parser | Error Recovery

ASSIGNMENT

Line N: expected =
currentToken++;
Searching for

FIRST(EXPRESSION) or FOLLOW(EXPRESSION)

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 8

Parser | Error Recovery

VARIABLE

Line N: expected identifier

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 9

Parser | Error Recovery


currentToken++;
Searching for

FIRST(PROGRAM) or FOLLOW(PROGRAM)

Line N: expected )

WHILE

Line N: expected (
currentToken++;
Searching for

FIRST(EXPRESSION) or )

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 10

Parser | Error Recovery


currentToken++;
Searching for

FIRST(EXPRESSION) or )

Line N: expected (

currentToken++;
Searching for

FIRST(PROGRAM) or
FOLLOW(PROGRAM)

Line N: expected )

IF

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 11

Parser | Error Recovery

RETURN

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 12

Parser | Error Recovery


currentToken++;
Searching for

FIRST(EXPRESSION) or )

Line N: expected (

Line N: expected )

PRINT

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 13

Parser | Error Recovery

EXPRESSION

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 14

Parser | Error Recovery


C

Line N: expected value, identifier or (

Line N: expected )
Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 15

Parser | Error Recovery


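Rule, with its FIRST set and FOLLOW set ("(blank)" marks a cell left empty on the slide)

PROGRAM
  FIRST:  (blank)
  FOLLOW: EOF
BODY
  FIRST:  FIRST(PRINT) U FIRST(ASSIGNMENT) U FIRST(VARIABLE) U FIRST(WHILE) U FIRST(IF) U FIRST(RETURN)
  FOLLOW: (blank)
PRINT
  FIRST:  print
  FOLLOW: (blank)
ASSIGNMENT
  FIRST:  identifier
  FOLLOW: (blank)
VARIABLE
  FIRST:  int, float, boolean, void, char, string
  FOLLOW: (blank)
WHILE
  FIRST:  while
  FOLLOW: } U FIRST(BODY)
IF
  FIRST:  if
  FOLLOW: } U FIRST(BODY)
RETURN
  FIRST:  return
  FOLLOW: (blank)
EXPRESSION
  FIRST:  FIRST(X)
  FOLLOW: ), ;
X
  FIRST:  FIRST(Y)
  FOLLOW: | U FOLLOW(EXPRESSION)
Y
  FIRST:  ! U FIRST(R)
  FOLLOW: & U FOLLOW(X)
R
  FIRST:  FIRST(E)
  FOLLOW: FOLLOW(Y)
E
  FIRST:  FIRST(A)
  FOLLOW: !=, ==, >, < U FOLLOW(R)
A
  FIRST:  FIRST(B)
  FOLLOW: -, + U FOLLOW(E)
B
  FIRST:  - U FIRST(C)
  FOLLOW: *, / U FOLLOW(A)
C
  FIRST:  integer, octal, hexadecimal, binary, true, false, string, char, float, identifier, (
  FOLLOW: FOLLOW(B)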

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 16

Parser | Error Recovery


if (tokens.get(currentToken).getLine() <
    tokens.get(currentToken + 1).getLine()) {
  // the next token is on a new line:
  // go back until reaching RULE_BODY()
}

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 17

Homework

Programming Assignment #2

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 18


CSE340 - Principles of
Programming Languages
Lecture 15:
Parsing Techniques III

Javier Gonzalez-Sanchez
javiergs@asu.edu
BYENG M1-38
Office Hours: By appointment

Parser | Error Recovery


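Rule, with its FIRST set and FOLLOW set ("(blank)" marks a cell left empty on the slide)

PROGRAM
  FIRST:  (blank)
  FOLLOW: EOF
BODY
  FIRST:  FIRST(PRINT) U FIRST(ASSIGNMENT) U FIRST(VARIABLE) U FIRST(WHILE) U FIRST(IF) U FIRST(RETURN)
  FOLLOW: (blank)
PRINT
  FIRST:  print
  FOLLOW: (blank)
ASSIGNMENT
  FIRST:  identifier
  FOLLOW: (blank)
VARIABLE
  FIRST:  int, float, boolean, void, char, string
  FOLLOW: (blank)
WHILE
  FIRST:  while
  FOLLOW: } U FIRST(BODY)
IF
  FIRST:  if
  FOLLOW: } U FIRST(BODY)
RETURN
  FIRST:  return
  FOLLOW: (blank)
EXPRESSION
  FIRST:  FIRST(X)
  FOLLOW: ), ;
X
  FIRST:  FIRST(Y)
  FOLLOW: | U FOLLOW(EXPRESSION)
Y
  FIRST:  ! U FIRST(R)
  FOLLOW: & U FOLLOW(X)
R
  FIRST:  FIRST(E)
  FOLLOW: FOLLOW(Y)
E
  FIRST:  FIRST(A)
  FOLLOW: !=, ==, >, < U FOLLOW(R)
A
  FIRST:  FIRST(B)
  FOLLOW: -, + U FOLLOW(E)
B
  FIRST:  - U FIRST(C)
  FOLLOW: *, / U FOLLOW(A)
C
  FIRST:  integer, octal, hexadecimal, binary, true, false, string, char, float, identifier, (
  FOLLOW: FOLLOW(B)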

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 2

Calculating the FIRST Set

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 3

FIRST set
Definition
FIRST (a) is the set of tokens that can begin the construction a.
Example
<E> → <A> {+ <A>}
<A> → <B> {* <B>}
<B> → -<C> | <C>
<C> → integer

FIRST(E) = {-, integer}
FIRST(A) = {-, integer}
FIRST(B) = {-, integer}
FIRST(C) = {integer}
Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 4

FIRST set
Define FIRST (BODY)
FIRST(BODY) =
FIRST (PRINT) U FIRST (ASSIGNMENT) U FIRST(VARIABLE) U FIRST(WHILE) U
FIRST(IF) U FIRST(RETURN)

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 5

FIRST set
Define FIRST (C)

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 6

FIRST set
Define FIRST (A)

Define FIRST (B)

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 7

FIRST set
<S> → <A><B><C>
<S> → <F>
<A> → <E><F>d
<A> → a
<B> → a<B>b
<B> → ε
<C> → c<C>
<C> → d
<E> → e<E>
<E> → <F>
<F> → <F>f
<F> → ε

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 8

Calculate the FIRST set


1. FIRST(X) = {X} if X is a terminal.
2. FIRST(ε) = {ε}.
   Note that this is not covered by the first rule because
   ε is not a terminal.
3. If A → Xα, add FIRST(X) − {ε} to FIRST(A).
4. If A → A1 A2 A3 ... Ai Ai+1 ... Ak and
   ε ∈ FIRST(A1) and ε ∈ FIRST(A2) and ... and ε ∈ FIRST(Ai),
   then add FIRST(Ai+1) − {ε} to FIRST(A).
5. If A → A1 A2 A3 ... Ak and
   ε ∈ FIRST(A1) and ε ∈ FIRST(A2) and ... and ε ∈ FIRST(Ak),
   then add ε to FIRST(A).
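
The five rules can be applied repeatedly until no FIRST set grows any further. The sketch below does that for the example grammar S, A, B, C, E, F, assuming that B and F each have an ε production (which matches the FIRST sets listed in these slides). The class name FirstSets, the helper rhs, and the map-based grammar encoding are illustrative choices, not part of the course code.

```java
import java.util.*;

class FirstSets {
    static final String EPS = "\u03B5"; // the epsilon symbol

    static List<String> rhs(String... symbols) { return Arrays.asList(symbols); }

    // Each nonterminal maps to its alternatives; rhs() with no symbols
    // stands for an epsilon production (assumed for B and F).
    static Map<String, List<List<String>>> exampleGrammar() {
        Map<String, List<List<String>>> g = new LinkedHashMap<>();
        g.put("S", List.of(rhs("A", "B", "C"), rhs("F")));
        g.put("A", List.of(rhs("E", "F", "d"), rhs("a")));
        g.put("B", List.of(rhs("a", "B", "b"), rhs()));
        g.put("C", List.of(rhs("c", "C"), rhs("d")));
        g.put("E", List.of(rhs("e", "E"), rhs("F")));
        g.put("F", List.of(rhs("F", "f"), rhs()));
        return g;
    }

    static Map<String, Set<String>> first(Map<String, List<List<String>>> g) {
        Map<String, Set<String>> first = new LinkedHashMap<>();
        for (String nt : g.keySet()) first.put(nt, new TreeSet<>());
        boolean changed = true;
        while (changed) { // loop until a full pass adds nothing
            changed = false;
            for (Map.Entry<String, List<List<String>>> entry : g.entrySet()) {
                Set<String> target = first.get(entry.getKey());
                for (List<String> alternative : entry.getValue()) {
                    boolean allNullable = true;
                    for (String symbol : alternative) {
                        if (!g.containsKey(symbol)) {       // rule 1: terminal
                            changed |= target.add(symbol);
                            allNullable = false;
                            break;
                        }
                        // rules 3 and 4: copy FIRST(symbol) without epsilon
                        for (String t : new ArrayList<>(first.get(symbol))) {
                            if (!t.equals(EPS)) changed |= target.add(t);
                        }
                        if (!first.get(symbol).contains(EPS)) {
                            allNullable = false;            // cannot look further right
                            break;
                        }
                    }
                    if (allNullable) changed |= target.add(EPS); // rules 2 and 5
                }
            }
        }
        return first;
    }

    public static void main(String[] args) {
        first(exampleGrammar()).forEach(
                (nt, set) -> System.out.println("FIRST(" + nt + ") = " + set));
    }
}
```

Several passes are needed because a set computed late in one pass (for example FIRST(F)) feeds sets computed earlier in the next pass (for example FIRST(S)).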
Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 9

Calculate the FIRST set

loop

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 10

FIRST set
S → ABC
S → F
A → EFd
A → a
B → aBb
B → ε
C → cC
C → d
E → eE
E → F
F → Ff
F → ε

FIRST set - evolution (one column per step, left to right)
rule
S    {a, ε}    {a, ε, e, f}    {a, ε, e, f, d}
A    {a}       {a, e}          {a, e, f, d}
B    {a, ε}
C    {c, d}
E    {e}       {e, ε}          {e, ε, f}
F    {ε}       {ε, f}

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 11

FIRST set | Exercise


<X> → <A> | <A> a
<A> → <B> | <B> b
<B> → <C><D><E> | c d e | <C> c <D> d <E> e
<C> → one
<D> → two
<E> → three

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 12

FIRST set | Exercise


<X> → <A>
<X> → <A> a
<A> → <B>
<A> → <B> b
<B> → <C><D><E>
<B> → c d e
<B> → <C> c <D> d <E> e
<C> → one
<D> → two
<E> → three

OPTION 1:
FIRST(X) = {c, one}
FIRST(A) = {c, one}
FIRST(B) = {c, one}
FIRST(C) = {one}
FIRST(D) = {two}
FIRST(E) = {three}

OPTION 2:
FIRST(X) = {c, one, ε}
FIRST(A) = {b, c, one}
FIRST(B) = {c, one}
FIRST(C) = {one}
FIRST(D) = {two}
FIRST(E) = {three}
Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 13

FIRST set | Exercise


<X> → <A>
<X> → <A> a
<A> → <B>
<A> → <B> b
<B> → <C><D><E>
<B> → c d e
<B> → <C> c <D> d <E> e
<C> → one
<C> → ε
<D> → two
<D> → ε
<E> → three

FIRST(X) = {c, one, three, two}
FIRST(A) = {c, one, three, two}
FIRST(B) = {c, one, three, two}
FIRST(C) = {one, ε}
FIRST(D) = {two, ε}
FIRST(E) = {three}

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 14

Calculating the FOLLOW Set

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 15

FOLLOW set
Definition
FOLLOW (a) is the set of tokens that can follow the construction a.
Example
<E> → <A> {+ <A>}
<A> → <B> {* <B>}
<B> → -<C> | <C>
<C> → integer

5 + 4 + -7 * 12 + 75
5 + 4 + ((-7) * 12) + 75
What follows <C> ?
What follows <A> ?
What follows <E> ?
Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 16

FOLLOW set
Definition
FOLLOW (a) is the set of tokens that can follow the construction a.
Example
<E> → <A> {+ <A>}
<A> → <B> {* <B>}
<B> → -<C> | <C>
<C> → integer

FOLLOW(E) = {$}        // $ represents end of input, i.e., EOF
FOLLOW(A) = {+, $}
FOLLOW(B) = {*, +, $}
FOLLOW(C) = {*, +, $}

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 17

FOLLOW set
Define FOLLOW (BODY)
FOLLOW(BODY) =

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 18

FOLLOW set
Define FOLLOW (C)

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 19

FOLLOW set
<S> → <A><B><C>
<S> → <F>
<A> → <E><F>d
<A> → a
<B> → a<B>b
<B> → ε
<C> → c<C>
<C> → d
<E> → e<E>
<E> → <F>
<F> → <F>f
<F> → ε

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 20

Calculate the FOLLOW set


1. First put $ (the end of input marker) in FOLLOW(S) (S is the start symbol).
2. If there is a production A → aBb (where a can be a whole string),
then everything in FIRST(b) except for ε is placed in FOLLOW(B)
(apply rule 4 of the FIRST set calculation).
3. If there is a production A → aB,
then add FOLLOW(A) to FOLLOW(B).
4. If there is a production A → aBb, where FIRST(b) contains ε,
then add FOLLOW(A) to FOLLOW(B).
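
The four rules can be applied repeatedly until no FOLLOW set changes. The sketch below is one possible way to do that in Java, not the course's reference implementation. The class name `FollowSets`, the helper `firstOf`, and the encoding of the grammar as lists of strings are assumptions made for illustration; the grammar itself is the S, A, B, C, E, F example used on these slides.

```java
import java.util.*;

public class FollowSets {
    static final String EPS = "ε";
    static final String EOF = "eof";

    // Grammar from the slides; an empty body stands for ε (hypothetical encoding).
    static final Map<String, List<List<String>>> GRAMMAR = new LinkedHashMap<>();
    static {
        GRAMMAR.put("S", List.of(List.of("A", "B", "C"), List.of("F")));
        GRAMMAR.put("A", List.of(List.of("E", "F", "d"), List.of("a")));
        GRAMMAR.put("B", List.of(List.of("a", "B", "b"), List.of()));
        GRAMMAR.put("C", List.of(List.of("c", "C"), List.of("d")));
        GRAMMAR.put("E", List.of(List.of("e", "E"), List.of("F")));
        GRAMMAR.put("F", List.of(List.of("F", "f"), List.of()));
    }

    static final Map<String, Set<String>> first = new HashMap<>();

    // FIRST of a sequence of symbols, using the FIRST sets computed so far.
    static Set<String> firstOf(List<String> symbols) {
        Set<String> result = new TreeSet<>();
        for (String s : symbols) {
            if (!GRAMMAR.containsKey(s)) { result.add(s); return result; } // terminal
            Set<String> f = first.get(s);
            for (String t : f) if (!t.equals(EPS)) result.add(t);
            if (!f.contains(EPS)) return result;
        }
        result.add(EPS); // every symbol was nullable, or the sequence is empty
        return result;
    }

    static void computeFirst() {
        for (String nt : GRAMMAR.keySet()) first.put(nt, new TreeSet<>());
        boolean changed = true;
        while (changed) {
            changed = false;
            for (Map.Entry<String, List<List<String>>> e : GRAMMAR.entrySet())
                for (List<String> body : e.getValue())
                    changed |= first.get(e.getKey()).addAll(firstOf(body));
        }
    }

    static Map<String, Set<String>> computeFollow(String start) {
        computeFirst();
        Map<String, Set<String>> follow = new HashMap<>();
        for (String nt : GRAMMAR.keySet()) follow.put(nt, new TreeSet<>());
        follow.get(start).add(EOF); // rule 1
        boolean changed = true;
        while (changed) {
            changed = false;
            for (Map.Entry<String, List<List<String>>> e : GRAMMAR.entrySet())
                for (List<String> body : e.getValue())
                    for (int i = 0; i < body.size(); i++) {
                        String b = body.get(i);
                        if (!GRAMMAR.containsKey(b)) continue; // only nonterminals have FOLLOW
                        Set<String> rest = firstOf(body.subList(i + 1, body.size()));
                        boolean restNullable = rest.remove(EPS);
                        changed |= follow.get(b).addAll(rest); // rule 2
                        if (restNullable) // rules 3 and 4
                            changed |= follow.get(b).addAll(follow.get(e.getKey()));
                    }
        }
        return follow;
    }

    public static void main(String[] args) {
        computeFollow("S").forEach((nt, set) ->
            System.out.println("FOLLOW(" + nt + ") = " + set));
    }
}
```

Iterating to a fixed point mirrors the "evolution" of the FOLLOW sets: each pass can only add tokens, so the loop stops once a full pass adds nothing.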
Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 21

Calculate the FOLLOW set

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 22

FOLLOW set
S → ABC
S → F
A → EFd
A → a
B → aBb
B → ε
C → cC
C → d
E → eE
E → F
F → Ff
F → ε

FIRST sets:
S = {a, ε, e, f, d}
A = {a, e, f, d}
B = {a, ε}
C = {c, d}
E = {e, ε, f}
F = {ε, f}

FOLLOW set - evolution
rule   FOLLOW set after each step
S      {eof}
A      {a}, then {a, c, d}
B      {c, d}, then {c, d, b}
C      {eof}
E      {f}, then {f, d}
F      {eof}, then {eof, d}, then {eof, d, f}

Another Example
<E> → <T> {+<T>}
<T> → <F> {*<F>}
<F> → (<E>) | integer

FIRST(E) = {(, integer}
FIRST(T) = {(, integer}
FIRST(F) = {(, integer}

FOLLOW(E) = {$, )}
FOLLOW(T) = {$, ), +}
FOLLOW(F) = {$, ), +, *}

Prediction Rules

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 25

Prediction Rules
Rule 1.
It should always be possible to choose among several
alternatives in a grammar rule.
FIRST(R1) ∩ FIRST(R2) ∩ FIRST(R3) ... ∩ FIRST(Rn) = ∅

[Figure: the BODY rule]

Prediction Rules
Rule 1.1
The FIRST sets of any two choices in one rule must not
have tokens in common in order to implement a single-symbol lookahead predictive parser.
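
Rule 1.1 can be checked mechanically once the FIRST set of each alternative is known. The following is a minimal sketch, assuming the FIRST sets are already available as Java sets; the class and method names (`PredictionCheck`, `pairwiseDisjoint`) are hypothetical.

```java
import java.util.*;

public class PredictionCheck {
    // True when no token belongs to the FIRST sets of two different alternatives.
    static boolean pairwiseDisjoint(List<Set<String>> firstSets) {
        Set<String> seen = new HashSet<>();
        for (Set<String> first : firstSets)
            for (String token : first)
                if (!seen.add(token)) return false; // token already starts another alternative
        return true;
    }

    public static void main(String[] args) {
        // <F> → (<E>) | integer : the two alternatives start with different tokens.
        System.out.println(pairwiseDisjoint(List.of(Set.of("("), Set.of("integer"))));
        // Two alternatives that both start with c or one cannot be told apart with one token.
        System.out.println(pairwiseDisjoint(List.of(Set.of("c", "one"), Set.of("c", "one"))));
    }
}
```

When the check fails, a parser that looks at a single token cannot decide which alternative to follow, so the rule has to be rewritten.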

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 27

Prediction Rules
Rule 2.
For any optional part, no token beginning the optional part
can also come after the optional part.
FIRST(RULE) != FOLLOW(RULE)

BODY

PROGRAM

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 28

Homework

Programming Assignment #2

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 29


CSE340 - Principles of
Programming Languages
Lecture 16:
Grammars III

Javier Gonzalez-Sanchez
javiergs@asu.edu
BYENG M1-38
Office Hours: By appointment

Chomsky Hierarchy

*Noam Chomsky
Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 2

Chomsky Hierarchy

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 3

Chomsky Hierarchy
Type | Name                   | Example Use                          | Recognizing Automata    | Parsing Complexity
0    | Recursively Enumerable |                                      | Turing Machine          | Undecidable
1    | Context Sensitive      |                                      | Linear Bounded Automata | NP Complete
2    | Context Free           | Arithmetic Expression x = a + b * 75 | Pushdown Automata       | O(n^3)
3    | Regular                | Identifier a110                      | Finite Automata         | O(n)

Chomsky Hierarchy
Type 0 - Recursively Enumerable
structure: α → β
where α and β are any strings of terminals and nonterminals

Type 1 - Context-sensitive
structure: αXβ → αγβ
where X is a nonterminal, and α, β, γ are any strings of terminals and nonterminals (γ
must be non-empty).

Type 2 - Context-free
structure: X → γ | ε
where X is a nonterminal and γ is any string of terminals and nonterminals (may be
empty). It is discouraged to use only one nonterminal as γ.

Type 3 - Regular
structure: X → αY | α | ε
where X, Y are single nonterminals, and α is a string of terminals.

Regular Grammar
Type 3 - Regular
structure: X → αY | α | ε
where X, Y are single nonterminals, and α is a string of terminals.
<DIGIT> → integer | ε | integer<DIGIT>
<Q0> → a<Q1> | b<Q0>
<Q1> → a<Q1> | b<Q0> | ε
<S> → a<S> | b<A>
<A> → c<A> | ε
<A> → ε
<A> → a
<A> → <B>a

Regular Grammar
Regular grammars are either left or right:

Right Regular Grammars:
Rules of the forms
<A> → ε
<A> → a
<A> → a<B>
A, B: nonterminals and a: terminal

Left Regular Grammars:
Rules of the forms
<A> → ε
<A> → a
<A> → <B>a
A, B: nonterminals and a: terminal

Regular Grammar
Key point: The derivation process is linear
Example of Derivation process:
<S> → a<S> | b<A>
<A> → c<A> | ε

The grammar is equivalent to the regular expression a*bc*
S ⇒ aS ⇒ aaS ⇒ aabA ⇒ aabcA ⇒ aabccA ⇒ aabcc
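
The linear derivation can be mirrored by a recognizer with one method per nonterminal, each consuming a single character before handing over to the next nonterminal. This is an illustrative sketch only; the class name `RegularGrammarDemo` and the methods `s` and `a` are made up here, and the regular expression check uses the standard `String.matches` call.

```java
public class RegularGrammarDemo {
    // <S> → a<S> | b<A>
    static boolean s(String w, int i) {
        if (i < w.length() && w.charAt(i) == 'a') return s(w, i + 1);
        if (i < w.length() && w.charAt(i) == 'b') return a(w, i + 1);
        return false;
    }

    // <A> → c<A> | ε
    static boolean a(String w, int i) {
        if (i == w.length()) return true; // ε: nothing left to read
        return w.charAt(i) == 'c' && a(w, i + 1);
    }

    public static void main(String[] args) {
        String word = "aabcc";
        System.out.println(s(word, 0));              // accepted by the grammar
        System.out.println(word.matches("a*bc*"));   // accepted by the regular expression
    }
}
```

Each call advances exactly one position and never needs to look back, which is the behavior a finite automaton provides.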

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 8

Context-Free Grammar
Type 2 - Context-free
structure: X → γ | ε
where X is a nonterminal and γ is any string of terminals and
nonterminals (may be empty). It is discouraged to use only one
nonterminal as γ.

<S> → <A><S> | ε
<A> → 0 <A> 1 | <A>1 | 0 1

<S> → <NP><VP>
<NP> → the <N>
<VP> → <V><NP>
<V> → sings | eats
<N> → cat | song | canary

Context-Sensitive Grammar
Type 1 - Context-sensitive
structure: αXβ → αγβ
where X is a nonterminal, and α, β, γ are any strings of terminals
and nonterminals (γ must be non-empty).
<S> → a<S><B><C>
<S> → a<B><C>
<C><B> → <H><B>
<H><B> → <H><C>
<H><C> → <B><C>
a<B> → a b
b<B> → b b
b<C> → b c
c<C> → c c
*The language generated by this grammar is {a^n b^n c^n | n ≥ 1}.


Exercise
G3 {

G1 {
<s>
<s>
<A>
<A>a
<A><B>
<B>b
<B>

<A>
<A><A><B>
aa
<A><B>a
<A><B><B>
<A><B>b
b

<B><A><B>
<A><B><A>
<A><B>
a<A>
ab
<B><A>
b

<s>
<A>
<A>
<B>
<B>
<A><B>

<A><B>
a<A>
a
<B>b
b
<B><A>

G4 {

G2 {
<s>
<s>
<s>
<A>
<A>
<A>
}

<s>
<s>
<A>
<A>
<A>
<B>
<B>

b<s>
a<A>
b
a<s>
b<A>
a
}

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 12

Examples

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 13

Types of Grammars
<E> → e<E>
<E> → <F>
<F> → <F>f
<F> → ε

Is this a regular grammar (X → αY | α | ε)?
Is this a context-free grammar (X → γ | ε)?
Is this a context-sensitive grammar (αXβ → αγβ)?

Types of Grammars
if <condition> then → if ( <condition> ) then
<condition> → <A> = <B>

Is this a regular grammar (X → αY | α | ε)?
Is this a context-free grammar (X → γ | ε)?
Is this a context-sensitive grammar (αXβ → αγβ)?

Types of Grammars
if ( <condition> ) then → if <condition> then
<condition> → <A> = <B>

Is this a regular grammar (X → αY | α | ε)?
Is this a context-free grammar (X → γ | ε)?
Is this a context-sensitive grammar (αXβ → αγβ)?

Types of Grammars
<A> → ε

Is this a regular grammar (X → αY | α | ε)?
Is this a context-free grammar (X → γ | ε)?
Is this a context-sensitive grammar (αXβ → αγβ)?



CSE340 - Principles of
Programming Languages
Lecture 17:
Semantic Analysis I

Javier Gonzalez-Sanchez
BYENG M1-38
Office Hours: By appointment

Semantic Analysis

Understanding the meaning


i.e.,
Interpreting expressions in their context

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 2

Semantic Analysis
1. Declaration and Unicity. Review for uniqueness and that the variable
has been declared before its usage.
2. Types. Review that the types of variables match the values assigned
to them.
3. Array indexes. Review that the indexes are integers.
4. Conditions. Review that all expressions in the conditions return a
boolean value.
5. Return type. Review that the value returned by a method matches the
type of the method.
6. Parameters. Review that the parameters in a method call match in type
and number with the declaration of the method.

Study Cases
Case 1:
int i;
char j; int m;
void method(int n, char c) {
int n; short l;
i = j;
i = m;
}
Case 2:
int i, j;
void method() {
int i = 5;
int j = i + i;
int i = i + i;
}
Case 3:
int i, m, k; boolean j;
void main() {
if (i>5) { ++i; }
while (i + 1) { ++i; }
do {++i; } while (i);
for (i = 0; m; ++i) {
k++;
}
}

Case 4:
int a;
int b;
int c, d;
char c1, c2;
int test1(int x, int y) {
return x+y;
}
void main() {
int i; i = a++;
i = test1(a, b);
i = test1(c1, c2);
i = test1(a, c1);
}
Case 5:
int i, m;
boolean j;
public void main() {
int m; int a[];
a = new int[j];
}

Case 6:
int i;
void main(int m) {
i++; return i;
}

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 4

1. Variable Declaration and Unicity

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 5

Symbol Table
int i;
char j; int m;
void method(int n, char c) {
int n; short l;
i = j;
i = m;
}
void method() {
int i = 5;
int j = i + i;
}
int k;
int method(int i) {
if (i>5) { ++i; }
while (i + 1) { ++i; }
do {++i; } while (i);
for (i = 0; m; ++i) {
k++;
}
}

name            | type  | scope           | value
i               | int   | global          |
j               | char  | global          |
m               | int   | global          |
method-int-char | void  | function        |
n               | int   | method-int-char |
l               | short | method-int-char |
method          | void  | function        |
i               | int   | method          |
j               | int   | method          |
k               | int   | global          |
method-int      | int   | function        |
i               | int   | method-int      |


Symbol Table
names           | bindings
i               | (int, method-int), (int, method), (int, global)
j               | (int, method), (char, global)
m               | (int, global)
method-int-char | (void, function)
n               | (int, method-int-char)
l               | (short, method-int-char)
method          | (void, function)
k               | (int, global)
method-int      | (int, function)

Exercise
Create the symbol table
int a, b; char c, d; float e,f;
void foo1(int a) {
// float a = 1;
float w = a;
}
void foo2(char b) {
int a = c + d;
}
int foo3() {
int i = a + b;
}

Programming Assignment 3
Level 1

Reviewing Declaration and Unicity

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 10

Symbol Table

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 11

Symbol Table

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 12

Grammar
<PROGRAM> → '{' <BODY> '}'
<BODY> → {<PRINT>';'|<ASSIGNMENT>';'|<VARIABLE>';'|<WHILE>|<IF>|<RETURN>';'}
<ASSIGNMENT> → identifier '=' <EXPRESSION>
<VARIABLE> → ('int'|'float'|'boolean'|'char'|'string'|'void') identifier
<WHILE> → 'while' '(' <EXPRESSION> ')' <PROGRAM>
<IF> → 'if' '(' <EXPRESSION> ')' <PROGRAM> ['else' <PROGRAM>]
<RETURN> → 'return'
<PRINT> → 'print' '(' <EXPRESSION> ')'
<EXPRESSION> → <X> {'|' <X>}
<X> → <Y> {'&' <Y>}
<Y> → ['!'] <R>
<R> → <E> {('>'|'<'|'=='|'!=') <E>}
<E> → <A> {('+'|'-') <A>}
<A> → <B> {('*'|'/') <B>}
<B> → ['-'] <C>
<C> → integer | octal | hexadecimal | binary | true | false |
      string | char | float | identifier | '(' <EXPRESSION> ')'

A3 :: Parser Update
VARIABLE

void rule_variable() {
  . . .
  if (tokens.get(currentToken).getType().equals("identifier")) {
    // the previous token is the type, the current token is the identifier
    SemanticAnalizer.CheckVariable(
      tokens.get(currentToken - 1).getWord(),
      tokens.get(currentToken).getWord()
    );
    currentToken++;
  } else {
    error(6);
  }
  . . .

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 14

A3 :: SemanticAnalyzer.java
public class SemanticAnalizer {
  // static so that the static method below can reach it
  private static Hashtable<String, Vector<SymbolTableItem>> symbolTable;
  public static void CheckVariable(String type, String id) {
    // A. search the id in the symbol table
    // B. if !exist then insert:
    //    type, scope=global, value={0, false, "", }
    // C. else error: variable id is already defined
  }
}
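
Steps A to C can be filled in along the following lines. This is a sketch under stated assumptions, not the assignment solution: the class `SymbolTableSketch`, the fields of `SymbolTableItem`, the extra `scope` parameter, the boolean return value, and the default values per type are all hypothetical choices made for the example.

```java
import java.util.Hashtable;
import java.util.Vector;

public class SymbolTableSketch {
    // One binding of a name: its type, the scope it lives in, and a default value.
    static class SymbolTableItem {
        final String type, scope, value;
        SymbolTableItem(String type, String scope, String value) {
            this.type = type; this.scope = scope; this.value = value;
        }
    }

    private static final Hashtable<String, Vector<SymbolTableItem>> symbolTable = new Hashtable<>();

    // Assumed defaults; the slide only hints at 0, false and "".
    static String defaultValue(String type) {
        switch (type) {
            case "int": case "float": return "0";
            case "boolean": return "false";
            default: return "";
        }
    }

    // Returns true if the variable was inserted, false if it is already defined in that scope.
    static boolean checkVariable(String type, String id, String scope) {
        // A. search the id in the symbol table
        Vector<SymbolTableItem> bindings = symbolTable.computeIfAbsent(id, k -> new Vector<>());
        for (SymbolTableItem item : bindings)
            if (item.scope.equals(scope)) return false; // C. already defined
        // B. not found in this scope: insert it
        bindings.add(new SymbolTableItem(type, scope, defaultValue(type)));
        return true;
    }

    public static void main(String[] args) {
        System.out.println(checkVariable("int", "i", "global"));  // first declaration
        System.out.println(checkVariable("char", "i", "global")); // duplicate in the same scope
        System.out.println(checkVariable("int", "i", "method"));  // same name, different scope
    }
}
```

Keeping a list of bindings per name is what lets the same identifier appear once per scope, as in the names and bindings view of the symbol table.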

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 15

A3 :: Review

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 16

Homework

Programming Assignment 3

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 17


CSE340 - Principles of
Programming Languages
Lecture 18:
Semantic Analysis II

Javier Gonzalez-Sanchez
BYENG M1-38
Office Hours: By appointment

Semantic Analysis
1. Declaration and Unicity. Review for uniqueness and that the variable
has been declared before its usage.
2. Type Matching. Review that the types of variables match the values
assigned to them.
3. Array indexes. Review that the indexes are integers.
4. Conditions. Review that all expressions in the conditions return a
boolean value.
5. Return type. Review that the value returned by a method matches the
type of the method.
6. Parameters. Review that the parameters in a method call match in type
and number with the declaration of the method.

2. Type Matching

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 3

Type matching | Example 1

int x, y, z;
char p, q, r;
float a, b, c;
boolean foo;

void method() {
x = a * c + p;
}

[Figure: expression tree for a * c + p, with + at the root and * below it]

Type matching | Cube
Fill one sheet of the cube for each operator in the language.

OPERATOR | int | float | char | string | boolean | void
int      |     |       |      |        |         |
float    |     |       |      |        |         |
char     |     |       |      |        |         |
string   |     |       |      |        |         |
boolean  |     |       |      |        |         |
void     |     |       |      |        |         |

Type matching | Cube

OPERATOR + | int    | float  | char   | string | boolean | void
int        | int    | float  |        | string |         |
float      | float  | float  |        | string |         |
char       |        |        |        | string |         |
string     | string | string | string | string | string  |
boolean    |        |        |        | string |         |
void       |        |        |        |        |         |

Type matching | Cube

OPERATOR & | int | float | char | string | boolean | void
int        |     |       |      |        |         |
float      |     |       |      |        |         |
char       |     |       |      |        |         |
string     |     |       |      |        |         |
boolean    |     |       |      |        | boolean |
void       |     |       |      |        |         |

Type matching | Cube

OPERATOR < | int     | float   | char | string | boolean | void
int        | boolean | boolean |      |        |         |
float      | boolean | boolean |      |        |         |
char       |         |         |      |        |         |
string     |         |         |      |        |         |
boolean    |         |         |      |        |         |
void       |         |         |      |        |         |

Type matching | Cube

OPERATOR = | int | float | char | string | boolean | void
int        | OK  |       |      |        |         |
float      | OK  | OK    |      |        |         |
char       |     |       | OK   |        |         |
string     |     |       |      | OK     |         |
boolean    |     |       |      |        | OK      |
void       |     |       |      |        |         | OK

Type matching | Cube

cube (- unary)
OPERATOR - | int | float | char | string | boolean | void
result     | int | float | X    | X      | X       | X

cube (- binary)
OPERATOR - | int    | float  | char   | string | boolean | void
int        | int    | float  |        | string |         |
float      | float  | float  |        | string |         |
char       |        |        |        | string |         |
string     | string | string | string | string | string  |
boolean    |        |        |        | string |         |
void       |        |        |        |        |         |

Type matching | Example 1

int x, y, z;
char p, q, r;
float a, b, c;
boolean foo;

void method() {
x = a * c + p;
}

[Figure: expression tree for a * c + p, with + at the root, a * c on the left, and p on the right]

Type matching | Example 2


symbol table

int a;
int c (int b) {
return b * 3 * 2 * 1 ;
}
void main () {
a = 1;
boolean a= c(14)/2 > 1;
}

cube for
matching types

stack

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 13

Programming Assignment 3
Level 2

Reviewing Type Matching

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 14


Assignment 2 | Code
C

1.
// except for the open parenthesis
SemanticAnalizer.pushStack(
tokens.get(currentToken).getToken()
);

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 16

Parser
1. Store in a flag (operatorWasUsed):
Did the operator '-' exist?

2.
if (operatorWasUsed) {
  String x = SemanticAnalizer.popStack();
  String result = SemanticAnalizer.checkCube(x, "-");
  SemanticAnalizer.pushStack(result);
}
Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 17

Parser
1. Store in a flag (twiceHere):
Did we pass this point twice?

2. Store the operator that creates the loop.

3.
if (twiceHere) {
  String x = SemanticAnalizer.popStack();
  String y = SemanticAnalizer.popStack();
  String result = SemanticAnalizer.checkCube(x, y, operator);
  SemanticAnalizer.pushStack(result);
}

twiceHere = false; // reset the flag


Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 18


Parser
1. Store in a flag (operatorWasUsed):
Did the operator '!' exist?

2.
if (operatorWasUsed) {
  String x = SemanticAnalizer.popStack();
  String result = SemanticAnalizer.checkCube(x, "!");
  SemanticAnalizer.pushStack(result);
}
Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 21

Parser
1. Store in a flag (twiceHere):
Did we pass this point twice?

2.
if (twiceHere) {
  String x = SemanticAnalizer.popStack();
  String y = SemanticAnalizer.popStack();
  String result = SemanticAnalizer.checkCube(x, y, "&");
  SemanticAnalizer.pushStack(result);
}

twiceHere = false; // reset the flag


Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 22


Assignment 2 | Code

String x = SemanticAnalizer.popStack();
String y = SemanticAnalizer.popStack();
String result = SemanticAnalizer.checkCube(x, y, "=");

if (!result.equals("OK")) {
  error(2);
}
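
The stack and the cube work together as sketched below. This is only an illustration: the class `TypeCubeSketch` and the methods `put`, `check`, `push` and `apply` are hypothetical names, the cube is stored as a map instead of a three-dimensional array, and only a few entries for "+", "<" and "=" are filled in. For "=", the sketch assumes the row is the variable being assigned and the column is the type of the expression.

```java
import java.util.*;

public class TypeCubeSketch {
    static final String ERROR = "error";
    private static final Map<String, String> cube = new HashMap<>();
    private static final Deque<String> stack = new ArrayDeque<>();

    private static void put(String op, String left, String right, String result) {
        cube.put(op + " " + left + " " + right, result);
    }

    static {
        put("+", "int", "int", "int");
        put("+", "int", "float", "float");
        put("+", "float", "int", "float");
        put("+", "float", "float", "float");
        for (String t : new String[] {"int", "float", "char", "string", "boolean"}) {
            put("+", "string", t, "string"); // string concatenation
            put("+", t, "string", "string");
        }
        for (String l : new String[] {"int", "float"})
            for (String r : new String[] {"int", "float"})
                put("<", l, r, "boolean");
        for (String t : new String[] {"int", "float", "char", "string", "boolean", "void"})
            put("=", t, t, "OK");
        put("=", "float", "int", "OK"); // assumption: a float variable accepts an int value
    }

    // Looks up one cell of the cube; anything not listed is a type error.
    static String check(String op, String left, String right) {
        return cube.getOrDefault(op + " " + left + " " + right, ERROR);
    }

    static void push(String type) { stack.push(type); }

    // Pops the two operand types, consults the cube, and pushes the resulting type.
    static String apply(String op) {
        String right = stack.pop();
        String left = stack.pop();
        String result = check(op, left, right);
        stack.push(result);
        return result;
    }

    public static void main(String[] args) {
        // int x; float a; int y;   x = a + y;
        push("int");    // x
        push("float");  // a
        push("int");    // y
        System.out.println(apply("+")); // type of a + y
        System.out.println(apply("=")); // can that type be assigned to x?
    }
}
```

The parser pushes a type every time it consumes an operand and calls the cube every time it finishes an operator, so the type of a whole expression is whatever is left on top of the stack.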
Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 24

Homework

Programming Assignment 3

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 25


CSE340 - Principles of
Programming Languages
Lecture 19:
Semantic Analysis III

Javier Gonzalez-Sanchez
BYENG M1-38
Office Hours: By appointment


3. Conditions

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 3

Example
symbol table

int a, b;
boolean c;
{
a = 4;
b = a + 1;
IF (a > b) {
print (a);
}
}

cube for
matching types

stack

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 4

Programming Assignment 3
Level 3

Reviewing Conditions

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 5


Conditions
WHILE

String x = SemanticAnalizer.popStack();

if (!x.equals("boolean")) {
error(3);
}

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 7

Conditions
IF

String x = SemanticAnalizer.popStack();

if (!x.equals("boolean")) {
error(3);
}

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 8


Workshop

SemanticAnalyzer.java

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 11

SemanticAnalyzer.java

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 12

SemanticAnalyzer.java

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 13

Homework

Programming Assignment 3

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 14


CSE340 - Principles of
Programming Languages
Lecture 20:
Intermediate Code I

Javier Gonzalez-Sanchez
BYENG M1-38
Office Hours: By appointment

A Compiler in Action

Analysis

Lexer

Words
Tokens

Parser

Sentences

Semantic
Analyzer

Compilation
Assembly

Code
Generation

Symbol table

Uniqueness
Type matching

Translation
Source Code → Intermediate Code
Intermediate Code → Machine or Binary Code

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 2

Source Code
{
int a;
int b;
int c;
int d;
if (a != 5) {
b = c + d;
}
print (a);
}

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 3

Intermediate (Object) Code

Symbol table:
a, int, global, 0
b, int, global, 0
c, int, global, 0
d, int, global, 0
#E1, int, label, 9
#P, int, label, 1
@
Instructions:
lod a, 0
lit 5, 0
opr 16, 0 ; not equal
jmc #e1, false
lod c, 0
lod d, 0
opr 2, 0
sto b, 0
lod a, 0
opr 21, 0
opr 1, 0

Translate Source to Object

{
int a;
int b;
int c;
int d;
if (a != 5) {
b = c + d;
}
print (a);
}

a, int, global, 0
b, int, global, 0
c, int, global, 0
d, int, global, 0
#E1, int, label, 9
#P, int, label, 1
@
lod a, 0
lit 5, 0
opr 16, 0 ; not equal
jmc #e1, false
lod c, 0
lod d, 0
opr 2, 0
sto b, 0
lod a, 0
opr 21, 0
opr 1, 0


A Simple Virtual Machine

[Figure: a virtual machine made of a Memory, which holds the program Code and the Symbol Table, and a CPU, which holds the ALU, a Register, and the program counter (PC)]

Code (program):
sto 0, s
sto 0, d
sto 0, c
sto 0, d
lod s, 0
lit s, 0
opr 14, 0
jmc #a1, false
lod b, 0

Instructions

instruction <first parameter>, <second parameter>

Instructions
LIT <value>, <register_id>
Put a constant value in a CPU register
Examples:
LIT 5, 0
LIT 5.5, 0
LIT 'a', 0
LIT "hello", 0
LIT true, 0

Instructions
LOD <variable>, <register_id>
Search for <variable> in the symbol table
Read its value
Put the value of <variable> in the CPU register
Examples:
LOD a, 0
LOD hello, 0
LOD cse340, 0

Instructions
STO <variable>, <register_id>
Read a value from the CPU register
Search for <variable> in the symbol table
Store the value into the variable
Examples:
STO a, 0
STO hello, 0
STO cse340, 0

Example
Source Code:
{
int a; int b;
a = 10;
b = a;
}

Symbol Table:
Type | Name | Scope  | Value
int  | a    | global | 0
int  | b    | global | 0

Intermediate (Object) Code:
LIT 10, 0
STO a, 0
LOD a, 0
STO b, 0

Instructions
OPR <operation>, <register_id>
Read one or two values from the CPU register
Do the operation <operation>
Put the result into the CPU register
Examples:
OPR 1, 0 ; return
OPR 2, 0 ; add
OPR 3, 0 ; subtract

Operations
Number | Action
0      | Exit program
1      | Return
2      | ADD: POP A, POP B, PUSH B+A
3      | SUBTRACT: POP A, POP B, PUSH B-A
4      | MULTIPLY: POP A, POP B, PUSH B*A
5      | DIVIDE: POP A, POP B, PUSH B/A
6      | MOD: POP A, POP B, PUSH (B mod A)
7      | POW: POP A, POP B, PUSH (A to the power B)
8      | OR: POP A, POP B, PUSH (B or A)
9      | AND: POP A, POP B, PUSH (B and A)
10     | NOT: POP A, PUSH (not A)
11     | TEST GREATER THAN: POP A, POP B, PUSH (B>A)
12     | TEST LESS THAN: POP A, POP B, PUSH (B<A)
13     | TEST GREATER THAN OR EQUAL: POP A, POP B, PUSH (B>=A)
14     | TEST LESS THAN OR EQUAL: POP A, POP B, PUSH (B<=A)
15     | TEST EQUAL: POP A, POP B, PUSH (B=A)
16     | TEST NOT EQUAL: POP A, POP B, PUSH (B<>A)
17     |
18     | clear screen
19     | read a value from the standard input
20     | print a value to the standard output
21     | print a value to the standard output and a newline character

Example
{
int a;
int b;
a = 10;
b = a;
a = a * 10;
b = 2 + 3 + 4;
}

Type | Name | Scope  | Value
int  | a    | global | 0
int  | b    | global | 0

LIT 10, 0
STO a, 0
LOD a, 0
STO b, 0
LOD a, 0
LIT 10, 0
OPR 4, 0 ; multiply
STO a, 0
LIT 2, 0
LIT 3, 0
OPR 2, 0 ; add
LIT 4, 0
OPR 2, 0 ; add
STO b, 0
OPR 1,0
OPR 0,0
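
To see how these instructions execute, the following sketch interprets LIT, LOD, STO and a handful of OPR codes over integers. It is a simplified stand-in, not the course's virtual machine: the class `MiniVm`, the `run` method, the use of a stack for the register, and the restriction to integer values are all assumptions made for the example.

```java
import java.util.*;

public class MiniVm {
    // Executes the instructions in order and returns the final values of the variables.
    static Map<String, Integer> run(String... program) {
        Map<String, Integer> variables = new HashMap<>();
        Deque<Integer> register = new ArrayDeque<>();
        for (String line : program) {
            String[] parts = line.trim().split("[ ,]+"); // e.g. "LIT 10, 0" -> LIT, 10, 0
            String instruction = parts[0].toUpperCase();
            String argument = parts[1];
            switch (instruction) {
                case "LIT": register.push(Integer.parseInt(argument)); break;
                case "LOD": register.push(variables.getOrDefault(argument, 0)); break;
                case "STO": variables.put(argument, register.pop()); break;
                case "OPR": {
                    int code = Integer.parseInt(argument);
                    if (code == 0 || code == 1) return variables; // exit program / return
                    int a = register.pop(); // POP A
                    int b = register.pop(); // POP B
                    switch (code) {
                        case 2: register.push(b + a); break;
                        case 3: register.push(b - a); break;
                        case 4: register.push(b * a); break;
                        case 5: register.push(b / a); break;
                        default: throw new IllegalArgumentException("unsupported OPR " + code);
                    }
                    break;
                }
                default: throw new IllegalArgumentException("unknown instruction " + instruction);
            }
        }
        return variables;
    }

    public static void main(String[] args) {
        Map<String, Integer> result = run(
            "LIT 10, 0", "STO a, 0", "LOD a, 0", "STO b, 0",
            "LOD a, 0", "LIT 10, 0", "OPR 4, 0", "STO a, 0",
            "LIT 2, 0", "LIT 3, 0", "OPR 2, 0", "LIT 4, 0", "OPR 2, 0", "STO b, 0",
            "OPR 1, 0", "OPR 0, 0");
        System.out.println(result); // final values of a and b
    }
}
```

Running the listing from the example leaves a = 100 and b = 9, which matches evaluating the source code by hand.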
Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 19

to be continued...

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 20


CSE340 - Principles of
Programming Languages
Lecture 21:
Intermediate Code II

Javier Gonzalez-Sanchez
BYENG M1-38
Office Hours: By appointment

Review
Source code:
{
int a;
int b;
a = 5;
b = 9;
a = a + b / 3;
print (a);
print ("hello");
}

Lexer → Parser → Semantic Analyzer → Code Generation

Intermediate code:
a, int, global, 0
b, int, global, 0
@
LIT 5, 0
STO a, 0
LIT 9, 0
STO b, 0
LOD a, 0
LOD b, 0
LIT 3, 0
OPR 5, 0 ; divide
OPR 2, 0 ; add
STO a, 0
LOD a, 0
OPR 21, 0
LIT "hello", 0
OPR 21, 0
OPR 0, 0

Virtual Machine (interpreter) output:
8
hello

Review
Source code:
{
int a;
int b;
boolean foo;
a = 10 + 20 + 30 + 40;
print (a);
foo = 340 > 126;
print (foo);
a = a / 2;
print ("total:" + a);
return;
}

Lexer → Parser → Semantic Analyzer → Code Generation

Virtual Machine (interpreter) output:
100
true
total: 50

Assignment #4

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 5

Assignment #4
Goal: to show proficiency building a descent parser.
Lexer: use assignment #1 or Lexer.jar.
Parser: covered in the programming workshops.
Semantic Analyzer: not required. Bonus points include Semantic Analysis from assignment #3.
Code Generation: covered in the following lectures.

Deadline: December 4

Assignment #4 | Grammar
<PROGRAM> → '{' <BODY> '}'
<BODY> → {<PRINT>';'|<ASSIGNMENT>';'|<VARIABLE>';'|
         <WHILE>|<IF>|<SWITCH>|<RETURN>';'}
<ASSIGNMENT> → identifier '=' <EXPRESSION>
<VARIABLE> → ('int'|'float'|'boolean'|'char'|'string'|'void') identifier
<WHILE> → 'while' '(' <EXPRESSION> ')' <PROGRAM>
<IF> → 'if' '(' <EXPRESSION> ')' <PROGRAM> ['else' <PROGRAM>]
<RETURN> → 'return'
<PRINT> → 'print' '(' <EXPRESSION> ')'
<EXPRESSION> → <X> {'|' <X>}
<X> → <Y> {'&' <Y>}
<Y> → ['!'] <R>
<R> → <E> {('>'|'<'|'=='|'!=') <E>}
<E> → <A> {('+'|'-') <A>}
<A> → <B> {('*'|'/') <B>}
<B> → ['-'] <C>
<C> → integer | octal | hexadecimal | binary | true | false |
      string | char | float | identifier | '(' <EXPRESSION> ')'
<SWITCH> → 'switch' '(' id ')' '{' <CASES> [<DEFAULT>] '}'
<CASES> → ('case' (integer|octal|hexadecimal|binary) ':' <PROGRAM>)+
<DEFAULT> → 'default' ':' <PROGRAM>

Assignment #4 | Compiler

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 8

Assignment #4 | Compiler
Bonus Points

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 9

Assignment #4 | VM
Use it to test your compiler.
No changes required

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 10

Assignment #4 | Code
Add these lines to your Parser.run()

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 11

Assignment #4 | Code
The CodeGenerator.java file

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 12

Assignment #4 | Grading
Bonus 10 %
Implement SWITCH statement: (a) parser; and (b)
code generation.

Bonus 10 %
Integration: (a) graphical user interface including
token table, parse tree, and symbol table; (b)
syntactic errors handling and recovery; and (c)
semantic analysis.

Javier Gonzalez-Sanchez | CSE340 | Fall 2014 | 13

Assignment #4 | Grading
assignment: 0-100%
bonus: 0-20%

Implementing
Intermediate Code Generation


Code Generation
1 * 2 > 3 + 4 * 5

LIT 1, 0
LIT 2, 0
OPR 4, 0 ; *
LIT 3, 0
LIT 4, 0
LIT 5, 0
OPR 4, 0 ; *
OPR 2, 0 ; +
OPR 11, 0 ; >

[Figure: expression tree for 1 * 2 > 3 + 4 * 5, with > at the root]
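The listing above is the postfix order of the expression: operands are pushed with LIT, and each operator is emitted after both of its operands. A minimal Java sketch of how a recursive-descent parser could emit it is shown here. It assumes whitespace-separated tokens, integer operands only, and just the three operators whose OPR numbers appear in the example (4 for *, 2 for +, 11 for >). The class and method names are hypothetical and are not the course's CodeGenerator.java.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: emits LIT/OPR instructions while parsing a tiny expression subset.
class ExprCodeGenSketch {
    private final String[] tokens;               // whitespace-separated tokens
    private int pos = 0;                         // index of the current token
    private final List<String> code = new ArrayList<>();

    private ExprCodeGenSketch(String source) {
        this.tokens = source.trim().split("\\s+");
    }

    // Translates one expression and returns the emitted instructions.
    static List<String> generate(String source) {
        ExprCodeGenSketch g = new ExprCodeGenSketch(source);
        g.ruleR();
        return g.code;
    }

    private boolean peek(String t) {
        return pos < tokens.length && tokens[pos].equals(t);
    }

    // <R> -> <E> {'>' <E>}: emit the operator after its second operand
    private void ruleR() {
        ruleE();
        while (peek(">")) {
            pos++;
            ruleE();
            code.add("OPR 11, 0");
        }
    }

    // <E> -> <A> {'+' <A>}
    private void ruleE() {
        ruleA();
        while (peek("+")) {
            pos++;
            ruleA();
            code.add("OPR 2, 0");
        }
    }

    // <A> -> <C> {'*' <C>} (unary minus omitted in this sketch)
    private void ruleA() {
        ruleC();
        while (peek("*")) {
            pos++;
            ruleC();
            code.add("OPR 4, 0");
        }
    }

    // <C> -> integer (the only operand kind handled here)
    private void ruleC() {
        code.add("LIT " + tokens[pos++] + ", 0");
    }
}
```

Calling ExprCodeGenSketch.generate("1 * 2 > 3 + 4 * 5") yields the nine instructions listed above, in the same order.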

Code
[Slides: parser code for PROGRAM, BODY, PRINT, ASSIGNMENT, VARIABLE, WHILE, IF, RETURN, EXPRESSION, and C]

Homework

Assignment #4
(LIT, LOD, STO, OPR)


CSE340 - Principles of Programming Languages


Javier Gonzalez-Sanchez
javiergs@asu.edu
Fall 2014
Disclaimer. These slides can only be used as study material for the class CSE340 at ASU. They cannot be distributed or used for another purpose.

CSE340 - Principles of
Programming Languages
Lecture 22:
Intermediate Code III

Javier Gonzalez-Sanchez
BYENG M1-38
Office Hours: By appointment

Review

Programming Assignment 4
Level 1

LIT, LOD, STO, OPR


Code

OPR 0, 0

PROGRAM


Code
BODY


Code

OPR 21, 0

PRINT


Code

STO identifier, 0

ASSIGNMENT


Code
VARIABLE

identifier, <type>


Code

OPR 1, 0

RETURN


Parser

EXPRESSION

if (twice) {
OPR 8, 0
}


Parser

if (twice) {
OPR 9, 0
}


Parser

if (operatorUsed) {
OPR 10, 0
}


Parser
R

if (twice) {
OPR <operatorNumber>, 0
}


Parser

if (twice) {
OPR <operatorNumber>, 0
}


Parser

if (twice) {
OPR <operatorNumber>, 0
}


Parser

LIT 0, 0

if (operatorUsed) {
OPR 3, 0
}
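The LIT 0 / OPR 3 pair above suggests that unary minus is compiled as 0 - operand. A minimal Java sketch of that idea follows. It assumes that LIT 0 is emitted only when a '-' is present and that OPR 3 is subtraction; neither is stated explicitly on the slide. The class and method names are hypothetical, and the operand is restricted to an integer literal.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of rule <B> -> ['-'] <C>, with <C> limited to an integer literal.
class UnaryMinusSketch {
    static List<String> ruleB(String[] tokens) {
        List<String> code = new ArrayList<>();
        int pos = 0;
        boolean operatorUsed = false;
        if (tokens[pos].equals("-")) {
            operatorUsed = true;
            code.add("LIT 0, 0");                 // left operand of 0 - x (assumed placement)
            pos++;
        }
        code.add("LIT " + tokens[pos] + ", 0");   // the operand itself
        if (operatorUsed) {
            code.add("OPR 3, 0");                 // assumed to be subtraction
        }
        return code;
    }
}
```

Under these assumptions, the tokens "-" and "5" produce LIT 0, LIT 5, OPR 3, while a lone "7" produces only LIT 7.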


Assignment 2 | Code
C

LIT <value>, 0

LOD identifier, 0


Review

Programming Assignment 4
Level 2

JMP, JMC


Instructions
JMP <line>, 0
Put the value <line> in the program counter;
thus, the next line to be executed will be <line>
Examples:
JMP 1, 0
JMP 14, 0
JMP 75, 0


Instructions
JMC <line>, <condition>
Read one value from the CPU register
If the value is equal to <condition>, put the value
<line> in the program counter; thus, the next line to
be executed will be <line>
Examples:
JMC 1, true
JMC 14, false
JMC 75, true
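The effect of JMP and JMC on the program counter can be sketched as a small interpreter loop. This Java example is an illustration under stated assumptions, not the course's VM: lines are numbered from 1, the CPU register is modelled as a stack of strings, OPR 0 is treated as the end of the program, and only LIT, JMP, JMC, and OPR are recognized. The class and method names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical sketch: traces which lines execute when JMP and JMC change the program counter.
class JumpSketch {
    // Returns the 1-based line numbers in the order they were executed.
    static List<Integer> trace(String[] program) {
        Deque<String> register = new ArrayDeque<>();   // assumed stack-like CPU register
        List<Integer> executed = new ArrayList<>();
        int pc = 1;                                    // program counter, 1-based (assumption)
        while (pc >= 1 && pc <= program.length) {
            executed.add(pc);
            String[] parts = program[pc - 1].trim().split("[ ,]+");
            String op = parts[0];
            String arg1 = parts[1];
            String arg2 = parts[2];
            switch (op) {
                case "LIT":
                    register.push(arg1);
                    pc++;
                    break;
                case "JMP":
                    pc = Integer.parseInt(arg1);       // unconditional jump
                    break;
                case "JMC":
                    String value = register.pop();     // read one value from the register
                    pc = value.equals(arg2) ? Integer.parseInt(arg1) : pc + 1;
                    break;
                case "OPR":
                    if (arg1.equals("0")) {
                        return executed;               // OPR 0 treated as halt (assumption)
                    }
                    pc++;
                    break;
                default:
                    throw new IllegalArgumentException("Unsupported instruction: " + op);
            }
        }
        return executed;
    }
}
```

For the six-line program LIT true, 0 / JMC 4, true / LIT 99, 0 / JMP 6, 0 / LIT 98, 0 / OPR 0, 0, the trace skips lines 3 and 5 because both jumps are taken.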


Example
{
int a; int b;
a = 10;
while (a>1){
print (a);
}
return;
}

int   a   global   0
int   b   global   0
@
LIT 10, 0
STO a, 0
LOD a, 0
LIT 1, 0
OPR 11, 0 ; >
JMC #e1, false
LOD a, 0
OPR 20, 0 ; print
JMP #e2, 0
OPR 1, 0
OPR 0, 0


Labels
The compiler creates variables (labels) and adds them to the
symbol table to remember positions in the code.
This is useful for loops and conditions.
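One way to picture how a label stored in the symbol table is used is to substitute each label name in a jump with the line number recorded for it. The Java sketch below does exactly that. Whether the course VM substitutes labels ahead of time or looks them up while running is not stated in the slides, so this is only an assumption for illustration; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: replaces label names such as #e1 with the line stored in the symbol table.
class LabelResolutionSketch {
    static List<String> resolve(List<String> code, Map<String, Integer> symbolTable) {
        List<String> resolved = new ArrayList<>();
        for (String line : code) {
            String out = line;
            for (Map.Entry<String, Integer> entry : symbolTable.entrySet()) {
                // Match the label together with the following comma so #e1 never matches #e10.
                out = out.replace(entry.getKey() + ",", entry.getValue() + ",");
            }
            resolved.add(out);
        }
        return resolved;
    }
}
```

With #e1 stored as 10 and #e2 stored as 3, "JMC #e1, false" becomes "JMC 10, false" and "JMP #e2, 0" becomes "JMP 3, 0"; instructions without labels are left unchanged.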


Labels | how while works

while (a < b) {
  //code
  //code
}

label#1
if (a < b) {
  //code
  //code
} else
  goto label#2
goto label#1
label#2
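The while pattern above can be sketched as a code-generation helper: one label marks the start of the condition, another marks the first line after the loop. This Java example assumes 1-based line numbers and label names of the form #e1, #e2 as used in the slides' examples; the class, its fields, and emitWhile are hypothetical names, and the condition and body are passed in as already-generated instruction lists.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: emits the JMC/JMP pair for a while loop and records label positions.
class WhileCodeGenSketch {
    final List<String> code = new ArrayList<>();
    final Map<String, Integer> labels = new LinkedHashMap<>();
    private int labelCount = 0;

    private String newLabel() {
        return "#e" + (++labelCount);
    }

    private int nextLine() {
        return code.size() + 1;                 // 1-based line of the next instruction
    }

    void emit(String instruction) {
        code.add(instruction);
    }

    void emitWhile(List<String> condition, List<String> body) {
        String exit = newLabel();               // first line after the loop
        String start = newLabel();              // first line of the condition
        labels.put(start, nextLine());
        code.addAll(condition);
        emit("JMC " + exit + ", false");        // leave the loop when the condition is false
        code.addAll(body);
        emit("JMP " + start + ", 0");           // go back and test the condition again
        labels.put(exit, nextLine());
    }
}
```

Emitting LIT 10, 0 and STO a, 0 first, then calling emitWhile with the condition LOD a / LIT 1 / OPR 11 and the body LOD a / OPR 20, records #e2 at line 3 and #e1 at line 10.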


Labels | how if works

if (a < b) {
  //code1
} else {
  //code2
}

if (!(a < b)) goto label#1
//code1
goto label#2
label#1
//code2
label#2
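The if/else pattern above can be sketched in the same style: a conditional jump skips to the else branch when the condition is false, and an unconditional jump at the end of the then branch skips over the else branch. This Java example assumes 1-based line numbers and #e-style label names; the class, its fields, and emitIf are hypothetical, and the condition and branches are passed in as already-generated instruction lists.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: emits the JMC/JMP pair for if/else and records label positions.
class IfCodeGenSketch {
    final List<String> code = new ArrayList<>();
    final Map<String, Integer> labels = new LinkedHashMap<>();
    private int labelCount = 0;

    private String newLabel() {
        return "#e" + (++labelCount);
    }

    private int nextLine() {
        return code.size() + 1;                   // 1-based line of the next instruction
    }

    void emit(String instruction) {
        code.add(instruction);
    }

    void emitIf(List<String> condition, List<String> thenBranch, List<String> elseBranch) {
        String elseLabel = newLabel();            // first line of the else branch
        String endLabel = newLabel();             // first line after the whole statement
        code.addAll(condition);
        emit("JMC " + elseLabel + ", false");     // condition false: skip the then branch
        code.addAll(thenBranch);
        emit("JMP " + endLabel + ", 0");          // then branch done: skip the else branch
        labels.put(elseLabel, nextLine());
        code.addAll(elseBranch);
        labels.put(endLabel, nextLine());
    }
}
```

Emitting LIT 10, 0 and STO a, 0 first, then calling emitIf with the condition LOD a / LIT 1 / OPR 11, the then branch LOD a / OPR 20, and the else branch LOD b / OPR 20, records #e1 at line 10 and #e2 at line 12.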


Example | while
{
int a; int b;
a = 10;
while (a>1){
print (a);
}
return;
}

int   a     global   0
int   b     global   0
int   #e1   global   10
int   #e2   global   3

LIT 10, 0
STO a, 0
LOD a, 0
LIT 1, 0
OPR 11, 0 ; >
JMC #e1, false
LOD a, 0
OPR 20, 0 ; print
JMP #e2, 0
OPR 1, 0
OPR 0, 0


Example | if
{
int a; int b;
a = 10;
if (a>1) {
print (a);
} else {
print (b);
}
return;
}

int   a     global   0
int   b     global   0
int   #e1   global   10
int   #e2   global   12

LIT 10, 0
STO a, 0
LOD a, 0
LIT 1, 0
OPR 11, 0 ; >
JMC #e1, false
LOD a, 0
OPR 20, 0 ; print
JMP #e2, 0
LOD b, 0
OPR 20, 0 ; print
OPR 1, 0
OPR 0, 0

Exercise
{
int a; int b;
a = 10;
while (a>1) {
if (a != 0) {
print (a);
} else {
print (b);
}
a = a -1;
}
}

Handwritten notes


Code

WHILE


Code

IF


What about a switch-case?


Handwritten notes
Add the symbol table here


Homework
Translate this source code to intermediate code:
int a, b;
switch (a) {
case 1: { b = 11; break;}
case 2: { b = 22; break;}
case 3: { b = 33; break;}
default:{ b = 99; break;}
}


Homework

Assignment #4

