System SW 4

CAP318 SYSTEM SOFTWARE
HOME WORK – IV
Submitted To: Lect. Mandeep Kaur
Submitted By: Surendra
MCA 4th SEM
D3804A15
10806601
Declaration:
I declare that this assignment is my individual work. I have not copied from any other
student’s work or from any other source except where due acknowledgment is made
explicitly in the text, nor has any part been written for me by another person.
Student’s Signature: surendra
Evaluator’s comments:
_____________________________________________________________________
Marks obtained: ___________ out of ______________________
Content of Homework should start from this page only:
Part A
Q1. Which kind of source program errors would be generated in lexical analysis?
ANSWER:
Lexical analysis is the process of converting a sequence of characters into a sequence of
tokens. A program or function which performs lexical analysis is called a lexical analyzer,
lexer or scanner. A lexer often exists as a single function which is called by a parser or
another function.
The lexer breaks the text down into the smallest useful atomic units, known as tokens, while
throwing away (or at least putting to one side) extraneous information such as white
space and comments. Parsing then operates on the tokens and groups them into useful
grammatical structures.
The lexical structure is specified using regular expressions.
Besides producing tokens, the lexer usually performs two housekeeping tasks:
(1) Getting rid of white space (e.g., \t, \n, space) and comments
(2) Keeping track of line numbers
Errors that might occur during scanning, called lexical errors, include:
• Encountering characters that are not in the language's alphabet
• Too many characters in a word or line (yes, such languages do exist!)
• An "unclosed" character or string literal
• An end of file within a comment
Example:
Consider this statement in the C programming language:
sum = 3 + 2;
It is tokenized as shown in the following table:
Lexeme    Token type
sum       Identifier
=         Assignment operator
3         Number
+         Addition operator
2         Number
;         End of statement
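The tokenization above, together with the first kind of lexical error (a character outside the language's alphabet), can be sketched in a few lines of Python. This is only an illustration: the token names, the regular expressions and the `LexicalError` class are assumptions made for this sketch, not part of any particular compiler.

```python
import re

# Token types follow the table above; the patterns themselves are assumptions.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENTIFIER", r"[A-Za-z_]\w*"),
    ("ASSIGN", r"="),
    ("PLUS", r"\+"),
    ("END", r";"),
    ("SKIP", r"\s+"),     # white space is thrown away
    ("MISMATCH", r"."),   # anything else is not in the alphabet
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


class LexicalError(Exception):
    """Raised when the scanner meets a character it cannot tokenize."""


def tokenize(source):
    """Return a list of (lexeme, token type) pairs for the source text."""
    tokens = []
    for match in MASTER.finditer(source):
        kind, lexeme = match.lastgroup, match.group()
        if kind == "SKIP":
            continue
        if kind == "MISMATCH":
            raise LexicalError(f"illegal character {lexeme!r} at position {match.start()}")
        tokens.append((lexeme, kind))
    return tokens
```

Calling `tokenize("sum = 3 + 2;")` yields the six lexemes of the table in order, while an input such as `sum = 3 @ 2;` raises `LexicalError` because `@` matches none of the token patterns.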
Q2. In which respect do a formal specification and a formal grammar differ?

ANSWER:
Formal Specification:
“Formal specifications are expressed in a mathematical notation with precisely
defined vocabulary, syntax and semantics.”
The specification of a programming language is intended to provide a definition that the
language users and the implementers can use to determine whether the behavior of a program
is correct, given its source code.
➢ Formal specification involves investing more effort in the early phases of
software development.
➢ This reduces requirements errors, as it forces a detailed analysis of the
requirements.
➢ Incompleteness and inconsistencies can be discovered and resolved.
➢ Hence, savings are made, as the amount of rework due to requirements problems
is reduced.
A programming language specification can take several forms, including the following:
➢ An explicit definition of the syntax, static semantics, and execution semantics of the
language. While syntax is commonly specified using a formal grammar, semantic
definitions may be written in natural language (e.g., as in the C language), or a formal
semantics (e.g., as in Standard ML[30] and Scheme[31] specifications).
➢ A description of the behavior of a translator for the language (e.g., C++). The
syntax and semantics of the language have to be inferred from this description, which
may be written in natural or a formal language.
➢ A reference or model implementation, sometimes written in the language being
specified (e.g., Prolog). The syntax and semantics of the language are explicit in the
behavior of the reference implementation.
Formal Grammar:
As noted above, the syntax of a language is commonly specified using a formal grammar.
A grammar therefore covers only the syntactic part of a specification; the static and
execution semantics have to be defined by other means.
A formal grammar (sometimes simply called a grammar) is a set of rules of a specific
kind for forming strings in a formal language. The rules describe how to form strings
from the language's alphabet that are valid according to the language's syntax. A
grammar does not describe the meaning of the strings or what can be done with them
in whatever context; it describes only their form. A formal grammar is a set of rules for
rewriting strings, along with a "start symbol" from which rewriting must start. Therefore, a
grammar is usually thought of as a language generator. However, it can also sometimes
be used as the basis for a "recognizer": a function in computing that determines whether
a given string belongs to the language or is grammatically incorrect. To describe such
recognizers, formal language theory uses separate formalisms, known as automata
theory. One of the interesting results of automata theory is that it is not possible to
design a recognizer for certain formal languages.
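The two views of a grammar, as a generator and as the basis for a recognizer, can be illustrated with a toy example. The grammar below (S → a S b | empty, with start symbol S) and the function names are assumptions chosen for this sketch; they do not come from the assignment.

```python
# Toy grammar (an assumption for illustration):
#   S -> a S b
#   S -> (empty string)
# It generates the strings "", "ab", "aabb", "aaabbb", ...

def generate(depth):
    """Generator view: rewrite the start symbol S, applying S -> a S b `depth` times."""
    sentential_form = "S"
    for _ in range(depth):
        sentential_form = sentential_form.replace("S", "aSb")
    return sentential_form.replace("S", "")   # finish with S -> empty


def recognize(string):
    """Recognizer view: decide whether `string` can be derived from S."""
    if string == "":
        return True                            # matches S -> empty
    if len(string) >= 2 and string[0] == "a" and string[-1] == "b":
        return recognize(string[1:-1])         # matches S -> a S b
    return False
```

Here `generate(2)` produces `aabb`, and `recognize` accepts exactly the strings that `generate` can produce. Neither function says anything about what such a string means, which is the point made in the paragraph above.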
Q3. Illustrate the kind of source program errors which would be detected in code
generation?
ANSWER:
Code generation is the process by which a compiler's code generator converts some
internal representation of source code into a form (e.g., machine code) that can be
readily executed by a machine (often a computer).
The input to the code generator typically consists of a parse tree or an abstract syntax
tree. The tree is converted into a linear sequence of instructions, usually in an
intermediate language such as three address code. Further stages of compilation may or
may not be referred to as "code generation", depending on whether they involve a
significant change in the representation of the program.
Tasks which are typically part of a sophisticated compiler's "code generation" phase
include:
■ Instruction selection: which instructions to use.
■ Instruction scheduling: in which order to put those instructions. Scheduling is a speed
optimization that can have a critical effect on pipelined machines.
■ Register allocation: the allocation of variables to processor registers.
Instruction selection is typically carried out by doing a recursive post-order traversal on
the abstract syntax tree, matching particular tree configurations against templates; for
example, the tree W := ADD(X, MUL(Y, Z)) might be transformed into a linear
sequence of instructions by recursively generating the sequences for t1 := X and
t2 := MUL(Y, Z), and then emitting the instruction ADD W, t1, t2.
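A minimal sketch of such a post-order traversal is given below. It makes several assumptions: the tree is represented as nested Python tuples, leaf variables are used directly instead of being copied into temporaries first, and the `MOV` mnemonic for the final assignment is hypothetical. The output therefore differs slightly in form from the instruction sequence described above.

```python
from itertools import count


def select(tree, code, temps):
    """Post-order traversal: emit code for both operands, then for the operator."""
    if isinstance(tree, str):            # leaf: a variable name, used as-is
        return tree
    op, left, right = tree
    left_place = select(left, code, temps)
    right_place = select(right, code, temps)
    result = f"t{next(temps)}"           # fresh temporary for this subtree
    code.append(f"{op} {result}, {left_place}, {right_place}")
    return result


def generate_code(target, tree):
    """Return a linear instruction list that computes `tree` into `target`."""
    code = []
    place = select(tree, code, count(1))
    code.append(f"MOV {target}, {place}")  # hypothetical move instruction
    return code
```

For the tree `("ADD", "X", ("MUL", "Y", "Z"))` and target `W`, the inner `MUL` is emitted before the outer `ADD`, because the traversal visits the children of a node before the node itself.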
In a compiler that uses an intermediate language, there may be two instruction selection
stages — one to convert the parse tree into intermediate code, and a second phase
much later to convert the intermediate code into instructions in the ISA of the target
machine. This second phase does not require a tree traversal; it can be done linearly,
and typically involves a simple replacement of intermediate-language operations with
their corresponding opcodes. However, if the compiler is actually a language translator
(for example, one that converts Eiffel to C), then the second code-generation phase
may involve building a tree from the linear intermediate code.
Part B
Q1. In which way does the symbol table of a compiler differ from the symbol table of an
assembler?
ANSWER:
➢ INTRODUCTION:
In assembly language, the programmer is responsible for implementing looping or recursive
behaviour using jumps. As a result, a simple assembler does not need to maintain a great deal of
internal information about stack frames or variable bindings, since variables can
normally be bound directly to addresses. In a higher-level language, after the scanning
phase of the compiler produces a token stream, it is usually possible to identify which
tokens will be potential variables.
➢ SYMBOL TABLE OF ASSEMBLER:
• Each entry in the symbol table includes not only the name and assembly-time value fields
but also a length field and a relative-location (relocation) indicator.
• The length field indicates the length (in bytes) of the instruction to which the symbol
is attached.
Symbol Table:
Symbol (8 bytes)   Value (4 bytes)   Length (1 byte)   Relocation
“JOHNbbbb”         0000              01                “R”
“FOURbbbb”         000C              04                “R”
“FIVEbbbb”         0010              04                “R”
“TEMPbbbb”         0014              04                “R”
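The entries in this table could be modelled as small records, as in the sketch below. The class and function names are hypothetical, and the sketch assumes that the trailing "b" characters in the table stand for blanks that pad each name to 8 bytes and that values are shown in hexadecimal.

```python
from dataclasses import dataclass


@dataclass
class Symbol:
    name: str         # symbol name, padded with blanks to 8 characters
    value: int        # assembly-time value (an address)
    length: int       # length in bytes of the item the symbol is attached to
    relocation: str   # "R" marks a relocatable symbol


def make_entry(name, value, length, relocation="R"):
    """Build one symbol-table entry, padding the name to 8 characters."""
    return Symbol(name.ljust(8), value, length, relocation)


def format_entry(symbol):
    """Render an entry in the style of the table, showing blanks as 'b'."""
    padded = symbol.name.replace(" ", "b")
    return f"{padded} {symbol.value:04X} {symbol.length:02d} {symbol.relocation}"


# The four rows of the table above, keyed by the unpadded name.
table = {
    name: make_entry(name, value, length)
    for name, value, length in [
        ("JOHN", 0x0000, 1),
        ("FOUR", 0x000C, 4),
        ("FIVE", 0x0010, 4),
        ("TEMP", 0x0014, 4),
    ]
}
```

Because an assembler binds each symbol directly to an address, a flat dictionary keyed by name is enough here; no scope information is needed.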
➢ SYMBOL TABLE OF COMPILER:
• It is created during lexical analysis, which represents the program as a string of tokens
rather than of individual characters.
• Spaces and comments in the source are not represented by symbols and are not
used by later phases.
• Attributes stored in a symbol table for each identifier:
   • type
   • size
   • scope/visibility information
   • base address
   • addresses of the locations of auxiliary symbol tables (in case of records,
     procedures, classes)
   • address of the location containing the string which actually names the
     identifier, and its length in the string pool
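In contrast to the flat table of an assembler, a compiler's symbol table has to track attributes such as type, size and scope. The following is a minimal sketch, assuming a stack of dictionaries with the innermost scope last; the class and method names are hypothetical, and only a few of the attributes listed above are stored.

```python
class SymbolTable:
    """Scoped symbol table: a stack of dictionaries, innermost scope last."""

    def __init__(self):
        self.scopes = [{}]                 # start with the global scope

    def enter_scope(self):
        self.scopes.append({})

    def exit_scope(self):
        self.scopes.pop()

    def declare(self, name, type_, size):
        """Record an identifier in the current (innermost) scope."""
        scope = self.scopes[-1]
        if name in scope:
            raise KeyError(f"{name!r} is already declared in this scope")
        scope[name] = {"type": type_, "size": size, "level": len(self.scopes) - 1}

    def lookup(self, name):
        """Search from the innermost scope outwards; None if undeclared."""
        for scope in reversed(self.scopes):
            if name in scope:
                return scope[name]
        return None
```

With this layout, an inner declaration of `x` hides an outer one until `exit_scope` is called, which is the kind of scope/visibility information an assembler's symbol table does not need.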
Table Index
Q2. Draw a block diagram of the phases of a compiler and indicate the main function
of each phase.
ANSWER:
➢ ABOUT COMPILER:
A compiler is a program that reads a program in one language, the source language, and translates it into an
equivalent program in another language, the target language.
➢ PHASES OF COMPILER:
1. Lexical analyzer:
It takes the source program as input and produces a long string of tokens.
2. Syntax analyzer:
It takes the output of the lexical analyzer and produces a parse tree.
3. Semantic analyzer:
It takes the output of the syntax analyzer and produces another tree.
4. Intermediate code generator:
It takes the tree produced by the semantic analyzer as input and produces intermediate
code.
The remaining items are compiler-construction tools that help build these phases:
5. Parser generators:
They produce syntax analyzers, normally from a specification based on a context-free
grammar.
6. Scanner generators:
They produce lexical analyzers from a specification based on regular expressions. The
organization of the generated analyzer is based on a finite automaton.
7. Syntax-directed translation:
It walks the parse tree and as a result generates intermediate code.
8. Automatic code generators:
They translate the intermediate language into machine language.
9. Data-flow engines:
They perform code optimization using data-flow analysis.
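How the output of one phase becomes the input of the next can be sketched for arithmetic expressions. The sketch below is an assumption-laden toy: it covers only lexical analysis, syntax analysis and intermediate code generation, handles only `+`, `*` and parentheses, skips semantic analysis, and does no error checking. All function names are hypothetical.

```python
import re


def lex(source):
    """Lexical analysis: characters -> list of tokens."""
    return re.findall(r"\d+|[A-Za-z_]\w*|[+*()]", source)


def parse(tokens):
    """Syntax analysis: tokens -> tree.
    Grammar: expr := term ('+' term)* ; term := factor ('*' factor)*
    """
    pos = 0

    def factor():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        if token == "(":
            node = expr()
            pos += 1                      # skip the closing ")"
            return node
        return token                      # a number or an identifier

    def term():
        nonlocal pos
        node = factor()
        while pos < len(tokens) and tokens[pos] == "*":
            pos += 1
            node = ("*", node, factor())
        return node

    def expr():
        nonlocal pos
        node = term()
        while pos < len(tokens) and tokens[pos] == "+":
            pos += 1
            node = ("+", node, term())
        return node

    return expr()


def to_intermediate(tree):
    """Intermediate code generation: tree -> three-address statements."""
    code = []

    def walk(node):
        if isinstance(node, str):
            return node
        op, left, right = node
        left_place, right_place = walk(left), walk(right)
        temp = f"t{len(code) + 1}"
        code.append(f"{temp} = {left_place} {op} {right_place}")
        return temp

    walk(tree)
    return code
```

Chaining the three functions, as in `to_intermediate(parse(lex("a + b * 2")))`, mirrors the block structure of a compiler front end: each phase consumes exactly what the previous one produced.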
Q3. Is it always worthwhile to optimize a program?
ANSWER:
Program optimization or software optimization is the process of modifying a software
system to make some aspect of it work more efficiently or use fewer resources. In
general, a computer program may be optimized so that it executes more rapidly, or is
capable of operating with less memory storage or other resources, or draws less power.
The optimized system will typically only be optimal in one application or for one
audience. One might reduce the amount of time that a program takes to perform some
task at the price of making it consume more memory. In an application where memory
space is at a premium, one might deliberately choose a slower algorithm in order to use
less memory. Often there is no “one size fits all” design which works well in all cases,
so engineers make trade-offs to optimize the attributes of greatest interest.
Additionally, the effort required to make a piece of software completely optimal—
incapable of any further improvement— is almost always more than is reasonable for
the benefits that would be accrued; so the process of optimization may be halted before
a completely optimal solution has been reached.
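The trade-off between running time and memory described above can be seen in a small sketch. The Fibonacci function is only a stand-in chosen for illustration, and the function names are hypothetical; `functools.lru_cache` is the standard-library decorator that stores previously computed results.

```python
from functools import lru_cache


def fib_slow(n):
    """Keeps no table of results, but recomputes the same subproblems many times."""
    return n if n < 2 else fib_slow(n - 1) + fib_slow(n - 2)


@lru_cache(maxsize=None)
def fib_fast(n):
    """Stores every result once computed: more memory, far fewer calls."""
    return n if n < 2 else fib_fast(n - 1) + fib_fast(n - 2)
```

Both functions return the same values; the second one is faster for large `n` at the price of keeping one cached entry per distinct argument. Whether that price is acceptable depends on the application, which is why optimization is not automatically worthwhile.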
"Levels" of optimization:
1. Design level
2. Source code level
3. Compile level
4. Assembly level
5. Runtime
Example: