
February 2010

Master of Computer Application (MCA) – Semester 3


MC0073 – System Programming
Assignment Set – 1

1. Describe the following with respect to Language Specification:
A) Programming Language Grammars
B) Classification of Grammars
C) Binding and Binding Times
Ans –

A) Programming Language Grammars

The lexical and syntactic features of a programming language are
specified by its grammar. This section discusses key concepts and
notions from formal language grammars. A language L can be
considered to be a collection of valid sentences. Each sentence can be
looked upon as a sequence of words, and each word as a sequence of
letters or graphic symbols acceptable in L. A language specified in this
manner is known as a formal language. A formal language grammar is a
set of rules which precisely specify the sentences of L. It is clear that
natural languages are not formal languages due to their rich vocabulary.
However, PLs are formal languages.

Terminal symbols, alphabet and strings

The alphabet of L, denoted by the Greek symbol Σ, is the
collection of symbols in its character set. We will use lower case letters a,
b, c, etc. to denote symbols in Σ. A symbol in the alphabet is known as a
terminal symbol (T) of L. The alphabet can be represented using the
mathematical notation of a set, e.g. Σ = {a, b, …, z, 0, 1, …, 9}

Here the symbols '{', ',' and '}' are part of the notation. We call them
metasymbols to differentiate them from terminal symbols. Throughout
this discussion we assume that metasymbols are distinct from the
terminal symbols. If this is not the case, i.e. if a terminal symbol and a
metasymbol are identical, we enclose the terminal symbol in quotes to
differentiate it from the metasymbol. For example, in defining the set of
punctuation symbols of English, ',' (in quotes) denotes the terminal
symbol 'comma'.

A string is a finite sequence of symbols. We will represent strings
by Greek symbols α, β, γ, etc. Thus α = axy is a string over Σ. The
length of a string is the number of symbols in it. Note that the absence of
any symbol is also a string, the null string ε. The concatenation operation
combines two strings into a single string. It is used to build larger strings
from existing strings. Thus, given two strings α and β, concatenation of α
with β yields a string which is formed by putting the sequence of symbols
forming α before the sequence of symbols forming β. For example, if α =
ab, β = axy, then concatenation of α and β, represented as α.β or simply
αβ, gives the string abaxy. The null string can also participate in a
concatenation, thus α.ε = ε.α = α.

Nonterminal symbols

A nonterminal symbol (NT) is the name of a syntax category of a


language, e.g. noun, verb, etc. An NT is written as a single capital letter,
or as a name enclosed between <…>, e.g. A or < Noun >. During
grammatical analysis, a nonterminal symbol represents an instance of
the category. Thus, < Noun > represents a noun.

Productions

A production, also called a rewriting rule, is a rule of the grammar.
A production has the form

<Nonterminal symbol> ::= String of Ts and NTs

and defines the fact that the NT on the LHS of the production can be
rewritten as the string of Ts and NTs appearing on the RHS. When an NT
can be rewritten as one of many different strings, the symbol '|' (standing
for 'or') is used to separate the strings on the RHS, e.g.

< Article > ::= a | an | the

The string on the RHS of a production can be a concatenation of
component strings, e.g. the production

< Noun Phrase > ::= < Article > < Noun >

expresses the fact that a noun phrase consists of an article followed by
a noun.

Each grammar G defines a language LG. G contains an NT called the
distinguished symbol or the start NT of G. Unless otherwise specified, we
use the symbol S as the distinguished symbol of G. A valid string α of LG is
obtained by using the following procedure:

1. Let α= ‘S’.

2. While α is not a string of terminal symbols


(a) Select an NT appearing in α, say X.

(b) Replace X by a string appearing on the RHS of a production of X.

Example

Grammar (1.1) defines a language consisting of noun phrases in English

< Noun Phrase > :: = < Article > < Noun >

< Article > ::= a | an | the

<Noun> ::= boy | apple

< Noun Phrase > is the distinguished symbol of the grammar; the boy
and an apple are some valid strings in the language.
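Because grammar (1.1) is finite, its entire language can be enumerated
mechanically. A minimal C sketch (illustrative only; the array names are
assumptions, not part of the original text) that prints every noun phrase
the grammar generates:

    #include <stdio.h>

    /* Terminal alternatives of grammar (1.1) */
    static const char *articles[] = { "a", "an", "the" };
    static const char *nouns[]    = { "boy", "apple" };

    int main(void) {
        /* < Noun Phrase > ::= < Article > < Noun >: every combination */
        for (int i = 0; i < 3; i++)
            for (int j = 0; j < 2; j++)
                printf("%s %s\n", articles[i], nouns[j]);
        return 0;
    }

The six lines printed include the boy and an apple, the valid strings
mentioned above.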

Definition (Grammar)

A grammar G of a language LG is a quadruple (Σ, SNT, S, P), where

Σ is the alphabet of LG, i.e. the set of Ts,

SNT is the set of NTs,

S is the distinguished symbol, and

P is the set of productions.

Derivation, reduction and parse trees

A grammar G is used for two purposes: to generate valid strings of
LG and to 'recognize' valid strings of LG. The derivation operation helps to
generate valid strings while the reduction operation helps to recognize
valid strings. A parse tree is used to depict the syntactic structure of a
valid string as it emerges during a sequence of derivations or reductions.

Derivation

Let production p1 of grammar G be of the form

A ::= α

and let β be a string such that β = γAθ; then replacement of A by α in
string β constitutes a derivation according to production p1. We use the
notation N ⇒ η to denote direct derivation of η from N and N ⇒* η to denote
transitive derivation of η (i.e. derivation in zero or more steps) from N,
respectively. Thus, A ⇒ α only if A ::= α is a production of G, and A ⇒* δ if
A ⇒ … ⇒ δ. We can use this notation to define a valid string according to a
grammar G as follows: δ is a valid string according to G only if S ⇒* δ,
where S is the distinguished symbol of G.

Example: Derivation of the string the boy according to the grammar can be
depicted as

< Noun Phrase > => < Article > < Noun >

=> the < Noun >

=> the boy

A string α such that S ⇒* α is a sentential form of LG. The string α is a
sentence of LG if it consists of only Ts.

Example: Consider the grammar G

< Sentence > ::= < Noun Phrase > < Verb Phrase >

< Noun Phrase > ::= < Article > < Noun >

< Verb Phrase > ::= < Verb > < Noun Phrase >

< Article > ::= a | an | the

< Noun > ::= boy | apple

< Verb > ::= ate

The following strings are sentential forms of LG

< Noun Phrase > < Verb Phrase >

the boy < Verb Phrase >

< Noun Phrase > ate < Noun Phrase >

the boy ate < Noun Phrase >

the boy ate an apple

However, only the boy ate an apple is a sentence.


Reduction: To determine the validity of the string

the boy ate an apple

according to the grammar, we perform the following reductions:

Step  String

      the boy ate an apple

1 < Article > boy ate an apple

2 < Article > < Noun > ate an apple

3 < Article > < Noun > < Verb > an apple

4 < Article > < Noun > < Verb > < Article > apple

5 < Article > < Noun > < Verb > < Article > < Noun >

6 < Noun Phrase > < Verb > < Article > < Noun >

7 < Noun Phrase > < Verb > < Noun Phrase >

8 < Noun Phrase > < Verb Phrase >

9 < Sentence >

The string is a sentence of LG since we are able to construct the reduction
sequence: the boy ate an apple → < Sentence >.
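Recognition of this kind can be sketched as a recursive-descent
recognizer in which each NT of the grammar becomes one function. The
following is a hedged C illustration (the token array and function names
are assumptions, not part of the original text):

    #include <stdio.h>
    #include <string.h>

    static const char *toks[16];   /* token stream being recognized */
    static int pos, ntoks;

    static int match(const char *t) {
        if (pos < ntoks && strcmp(toks[pos], t) == 0) { pos++; return 1; }
        return 0;
    }
    static int article(void)     { return match("a") || match("an") || match("the"); }
    static int noun(void)        { return match("boy") || match("apple"); }
    static int noun_phrase(void) { return article() && noun(); }
    static int verb(void)        { return match("ate"); }
    static int verb_phrase(void) { return verb() && noun_phrase(); }
    static int sentence(void)    { return noun_phrase() && verb_phrase(); }

    int main(void) {
        const char *input[] = { "the", "boy", "ate", "an", "apple" };
        ntoks = 5;
        for (int i = 0; i < ntoks; i++) toks[i] = input[i];
        puts(sentence() && pos == ntoks ? "a sentence" : "not a sentence");
        return 0;
    }

Each function mirrors one production, e.g. sentence() implements
< Sentence > ::= < Noun Phrase > < Verb Phrase >.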

Parse trees

A sequence of derivations or reductions reveals the syntactic structure of
a string with respect to G. We depict the syntactic structure in the form
of a parse tree. Derivation according to the production A ::= α gives rise
to an elemental parse tree with A as the root and the symbols of α as its
children.

B) Classification of Grammars

Grammars are classified on the basis of the nature of productions used in


them (Chomsky, 1963). Each grammar class has its own characteristics
and limitations.

Type – 0 Grammars

These grammars, known as phrase structure grammars, contain
productions of the form

α ::= β

where both α and β can be strings of Ts and NTs. Such productions
permit arbitrary substitution of strings during derivation or reduction,
hence they are not relevant to specification of programming languages.

Type – 1 grammars

These grammars are known as context sensitive grammars because their
productions specify that derivation or reduction of strings can take place
only in specific contexts. A Type-1 production has the form

αAβ ::= απβ

Thus, a string π in a sentential form can be replaced by 'A' (or
vice versa) only when it is enclosed by the strings α and β. These
grammars are also not particularly relevant for PL specification since
recognition of PL constructs is not context sensitive in nature.

Type – 2 grammars

These grammars impose no context requirements on derivations or
reductions. A typical Type-2 production is of the form

A ::= π

which can be applied independent of its context. These grammars
are therefore known as context free grammars (CFG). CFGs are ideally
suited for programming language specification.

Type – 3 grammars

Type-3 grammars are characterized by productions of the form

A ::= tB | t   or   A ::= Bt | t

Note that these productions also satisfy the requirements of Type-


2 grammars. The specific form of the RHS alternatives—namely a single
T or a string containing a single T and a single NT—gives some practical
advantages in scanning.

Type-3 grammars are also known as linear grammars or regular


grammars. These are further categorized into left-linear and right-linear
grammars depending on whether the NT in the RHS alternative appears
at the extreme left or extreme right.
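The scanning advantage can be made concrete: a Type-3 grammar can be
recognized by a finite automaton that consumes one terminal symbol per
step. As a hedged example (the grammar and code are illustrative, not
from the original text), identifiers described by the right-linear grammar
A ::= lB | l, B ::= lB | dB | l | d (l = letter, d = digit) can be
recognized by a simple loop in C:

    #include <ctype.h>
    #include <stdio.h>

    /* Recognizer for A ::= lB | l,  B ::= lB | dB | l | d,
     * i.e. a letter followed by any number of letters or digits. */
    static int is_identifier(const char *s) {
        if (!isalpha((unsigned char)*s)) return 0;  /* A ::= l ... */
        for (s++; *s; s++)                          /* B ::= lB | dB */
            if (!isalnum((unsigned char)*s)) return 0;
        return 1;                                   /* accepted */
    }

    int main(void) {
        printf("%d %d %d\n", is_identifier("count1"),
               is_identifier("1count"), is_identifier("x"));
        return 0;
    }

The loop never looks at more than one symbol at a time, which is exactly
why regular grammars suit lexical scanning.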

Operator grammars

Definition (Operator grammar (OG)) An operator grammar is a


grammar none of whose productions contain two or more consecutive
NTs in any RHS alternative.
Thus, nonterminals occurring in an RHS string are separated by
one or more terminal symbols. All terminal symbols occurring in the RHS
strings are called operators of the grammar.

C) Binding and Binding Times

Definition: Binding: A binding is the association of an attribute of a


program entity with a value.

Binding time is the time at which a binding is performed. Thus the
type attribute of a variable var is bound to a type when its declaration is
processed, and the size attribute of that type is bound to a value sometime
prior to this binding. We are interested in the following binding times:

1. Language definition time of L

2. Language implementation time of L

3. Compilation time of P

4. Execution init time of proc

5. Execution time of proc.

where L is a programming language, P is a program written in L


and proc is a procedure in P. Note that language implementation time is
the time when a language translator is designed. The preceding list of
binding times is not exhaustive; other binding times can be defined, viz.
binding at the linking time of P. The language definition of L specifies
binding times for the attributes of various entities of programs written in
L.

Binding of the keywords of Pascal to their meanings is performed
at language definition time. This is how keywords like program,
procedure, begin and end get their meanings. These bindings apply to
all programs written in Pascal. At language implementation time, the
compiler designer performs certain bindings. For example, the size of
type 'integer' is bound to n bytes, where n is a number determined by the
architecture of the target machine. Binding of the type attributes of
variables is performed at the compilation time of program P. The memory
addresses of local variables info and p of procedure proc are bound at
every execution init time of proc. The value attributes of
variables are bound (possibly more than once) during an execution of
proc. The memory address of p↑ is bound when the procedure call new(p)
is executed.

Static and dynamic bindings


Definition (Static binding) A static binding is a binding performed
before the execution of a program begins.

Definition (Dynamic binding) A dynamic binding is a binding


performed after the execution of a program has begun.
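These definitions can be illustrated in C (an illustrative sketch; the
original discussion uses Pascal): the type and size of a variable are bound
statically, while its value and a heap address are bound dynamically.

    #include <stdio.h>
    #include <stdlib.h>

    int main(void) {
        int i;                      /* type of i: bound at compilation time (static) */
        printf("%zu\n", sizeof i);  /* size: bound at language implementation /
                                       compilation time (static) */
        i = 42;                     /* value: bound at execution time (dynamic) */
        int *p = malloc(sizeof *p); /* address of *p: bound when malloc runs,
                                       like new(p) in the Pascal example */
        *p = i;
        printf("%p %d\n", (void *)p, *p);
        free(p);
        return 0;
    }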

2. Define the following:
A) Systems Software B) Application Software
C) System Programming D) Von Neumann Architecture
Ans –
A) System Software

System software is computer software designed to operate the


computer hardware and to provide and maintain a platform for running
application software.
The most important types of system software are:

· The computer BIOS and device firmware, which provide basic
functionality to operate and control the hardware connected to or built
into the computer.

· The operating system (prominent examples being Microsoft Windows,
Mac OS X and Linux), which allows the parts of a computer to work
together by performing tasks like transferring data between memory and
disks or rendering output onto a display device. It also provides a
platform to run high-level system software and application software.

· Utility software, which helps to analyze, configure, optimize and maintain
the computer.

In some publications, the term system software is also used to


designate software development tools (like a compiler, linker or
debugger).

System software is usually not what a user would buy a computer for;
instead, it can be seen as the basics of a computer which come built-in
or pre-installed. In contrast to system software, software that allows
users to do things like create text documents, play games, listen to
music, or surf the web is called application software.

B) Application Software

Application software, also known as applications or apps, is computer


software designed to help the user to perform singular or multiple
related specific tasks. Examples include Enterprise software, Accounting
software, Office suites, Graphics software and media players.
Application software is contrasted with system software and
middleware, which manage and integrate a computer's capabilities but
typically do not directly apply them in the performance of tasks that
benefit the user. A simple, if imperfect, analogy in the world of hardware
would be the relationship of an electric light bulb (an application) to an
electric power generation plant (a system). The power plant merely
generates electricity, which is not itself of any real use until harnessed to
an application like the electric light that performs a task useful to the user.

In computer science, an application is a computer program designed


to help people perform a certain type of work. An application thus differs
from an operating system (which runs a computer), a utility (which
performs maintenance or general-purpose chores), and a programming
language (with which computer programs are created). Depending on
the work for which it was designed, an application can manipulate text,
numbers, graphics, or a combination of these elements. Some
application packages offer considerable computing power by focusing on
a single task, such as word processing; others, called integrated
software, offer somewhat less power but include several applications.
User-written software tailors systems to meet the user's specific needs.
User-written software includes spreadsheet templates, word processor
macros, scientific simulations, and graphics and animation scripts. Even
email filters are a kind of user software. Users create this software
themselves and often overlook how important it is. The delineation
between system software such as operating systems and application
software is not exact, however, and is occasionally the object of
controversy.

C) System Programming

System programming (or systems programming) is the activity of


programming system software. The primary distinguishing characteristic
of systems programming when compared to application programming is
that application programming aims to produce software which provides
services to the user (e.g. word processor), whereas systems
programming aims to produce software which provides services to the
computer hardware (e.g. disk defragmenter). It requires a greater degree
of hardware awareness.

In system programming more specifically:

the programmer will make assumptions about the hardware and


other properties of the system that the program runs on, and will often
exploit those properties (for example by using an algorithm that is known
to be efficient when used with specific hardware)

usually a low-level programming language or programming language


dialect is used that:
• can operate in resource-constrained environments
• is very efficient and has little runtime overhead
• has a small runtime library, or none at all
• allows for direct and "raw" control over memory access and control
flow
• lets the programmer write parts of the program directly in
assembly language

debugging can be difficult if it is not possible to run the program in


a debugger due to resource constraints. Running the program in a
simulated environment can be used to reduce this problem.

Systems programming is sufficiently different from application


programming that programmers tend to specialize in one or the other.

In system programming, often limited programming facilities are


available. The use of automatic garbage collection is not common and
debugging is sometimes hard to do. The runtime library, if available at
all, is usually far less powerful, and does less error checking. Because of
those limitations, monitoring and logging are often used; operating
systems may have extremely elaborate logging subsystems.

Implementing certain parts of an operating system or of networking
software requires systems programming (for example, implementing
paging (virtual memory) or a device driver for an operating system).

D) The Von Neumann architecture


The von Neumann architecture is a design model for a stored-program
digital computer that uses a central processing unit (CPU) and a single
separate storage structure ("memory") to hold both instructions and
data. It is named after the mathematician and early computer scientist
John von Neumann. Such computers implement a universal Turing
machine and have a sequential architecture.

A stored-program digital computer is one that keeps its programmed


instructions, as well as its data, in read-write, random-access memory
(RAM). Stored-program computers were an advancement over the
program-controlled computers of the 1940s, such as the Colossus and
the ENIAC, which were programmed by setting switches and inserting
patch leads to route data and to control signals between various
functional units. In the vast majority of modern computers, the same
memory is used for both data and program instructions. The mechanisms
for transferring the data and instructions between the CPU and memory
are, however, considerably more complex than the original von
Neumann architecture.

3. Explain the following with respect to the design specifications of an
Assembler:
A) Data Structures B) pass1 & pass2 Assembler flow chart
Ans –

A) Data Structure

The second step in our design procedure is to establish the databases


that we have to work with.

Pass 1 Data Structures

1. Input source program

2. A Location Counter (LC), used to keep track of each instruction’s


location.

3. A table, the Machine-operation Table (MOT), that indicates the
symbolic mnemonic for each instruction and its length (two, four, or six
bytes).

4. A table, the Pseudo-Operation Table (POT) that indicates the symbolic


mnemonic and action to be taken for each pseudo-op in pass 1.

5. A table, the Symbol Table (ST), that is used to store each label and its
corresponding value.

6. A table, the Literal Table (LT), that is used to store each literal
encountered and its corresponding assignment location. (A sketch of
possible ST and LT entry layouts is given after this list.)

7. A copy of the input to be used by pass 2.
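As indicated above, the ST and LT can be pictured as arrays of fixed-size
entries. A hedged C sketch (field names and sizes are assumptions, not
the original design):

    #include <stdint.h>

    struct st_entry {               /* one Symbol Table (ST) entry */
        char     symbol[8];         /* label name, blank padded */
        uint16_t value;             /* address assigned during LC processing */
    };

    struct lt_entry {               /* one Literal Table (LT) entry */
        char     literal[8];        /* literal as written in the source */
        uint16_t address;           /* location assigned in the literal pool */
    };

    struct st_entry symtab[200];    /* filled by pass 1, read by pass 2 */
    struct lt_entry littab[100];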

Pass 2 Data Structures

1. Copy of source program input to pass1.

2. Location Counter (LC)

3. A table, the Machine-operation Table (MOT), that indicates, for each
instruction, the symbolic mnemonic, length (two, four, or six bytes),
binary machine opcode and instruction format.

4. A table, the Pseudo-Operation Table (POT), that indicates the symbolic


mnemonic and action to be taken for each pseudo-op in pass 2.

5. A table, the Symbol Table (ST), prepared by pass1, containing each


label and corresponding value.
6. A Table, the base table (BT), that indicates which registers are
currently specified as base registers by USING pseudo-ops and what the
specified contents of these registers are.

7. A work space INST that is used to hold each instruction as its various
parts are being assembled together.

8. A work space, PRINT LINE, used to produce a printed listing.

9. A work space, PUNCH CARD, used prior to actual outputting for


converting the assembled instructions into the format needed by the
loader.

10. An output deck of assembled instructions in the format needed by


the loader.

Format of Data Structures

The third step in our design procedure is to specify the format and
content of each of the data structures. Pass 2 requires a machine
operation table (MOT) containing the name, length, binary code and
format; pass 1 requires only the name and length. Instead of using two
different tables, we construct a single MOT. The machine operation table
(MOT) and pseudo-operation table (POT) are examples of fixed tables: the
contents of these tables are not filled in or altered during the assembly
process.

The following figure depicts the format of the machine-op table (MOT):

—————————————– 6 bytes per entry ———————————–

Mnemonic opcode   Binary opcode   Instruction        Instruction        Not used
(4 bytes)         (1 byte)        length             format             here
(characters)      (hexadecimal)   (2 bits, binary)   (3 bits, binary)   (3 bits)

"Abbb"            5A              10                 001
"AHbb"            4A              10                 001
"ALbb"            5E              10                 001
"ALRb"            1E              01                 000
…                 …               …                  …

'b' represents "blank"
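A hedged C sketch of an MOT entry and its lookup (names, values and the
lookup strategy are illustrative; a real MOT packs each entry into exactly
6 bytes, without the string terminator used here for convenience):

    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    struct mot_entry {
        char    mnemonic[5];    /* 4 characters plus terminator; ' ' = blank */
        uint8_t opcode;         /* binary machine opcode, e.g. 0x5A for A */
        uint8_t len_fmt;        /* llfffxxx: length (halfwords), format, unused */
    };

    static const struct mot_entry mot[] = {
        { "A   ", 0x5A, 0x88 }, /* bits 10 001 000 */
        { "AH  ", 0x4A, 0x88 },
        { "AL  ", 0x5E, 0x88 },
        { "ALR ", 0x1E, 0x40 }, /* bits 01 000 000 */
    };

    /* Linear scan for clarity; a real assembler would binary-search
     * the table, which is kept sorted by mnemonic. */
    static const struct mot_entry *mot_lookup(const char *m) {
        for (size_t i = 0; i < sizeof mot / sizeof mot[0]; i++)
            if (strcmp(mot[i].mnemonic, m) == 0) return &mot[i];
        return NULL;
    }

    int main(void) {
        const struct mot_entry *e = mot_lookup("AH  ");
        if (e) printf("opcode %02X, length %d halfwords\n",
                      e->opcode, e->len_fmt >> 6);
        return 0;
    }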

B) pass1 & pass2 Assembler flow chart

Pass Structure of Assemblers


We discuss two pass and single pass assembly schemes in this section.

Two pass translation

Two pass translation of an assembly language program can handle


forward references easily. LC processing is performed in the first pass
and symbols defined in the program are entered into the symbol table.
The second pass synthesizes the target form using the address
information found in the symbol table. In effect, the first pass performs
analysis of the source program while the second pass performs synthesis
of the target program. The first pass constructs an intermediate
representation (IR) of the source program for use by the second pass.
This representation consists of two main components–data structures,
e.g. the symbol table, and a processed form of the source program. The
latter component is called intermediate code (IC).

Single pass translation

LC processing and construction of the symbol table proceed as in two
pass translation. The problem of forward references is tackled using a
process called backpatching. The operand field of an instruction
containing a forward reference is left blank initially. The address of the
forward referenced symbol is put into this field when its definition is
encountered.

Look at the following instructions:

      START 101
      READ N              101) + 09 0 113
      MOVER BREG, ONE     102) + 04 2 115
      MOVEM BREG, TERM    103) + 05 2 116
AGAIN MULT BREG, TERM     104) + 03 2 116
      MOVER CREG, TERM    105) + 04 3 116
      ADD CREG, ONE       106) + 01 3 115
      MOVEM CREG, TERM    107) + 05 3 116
      COMP CREG, N        108) + 06 3 113
      BC LE, AGAIN        109) + 07 2 104
      MOVEM BREG, RESULT  110) + 05 2 114
      PRINT RESULT        111) + 10 0 114
      STOP                112) + 00 0 000
N     DS 1                113)
RESULT DS 1               114)
ONE   DC '1'              115) + 00 0 001
TERM  DS 1                116)
      END

In the above program, the instruction corresponding to the statement

MOVER BREG, ONE

can be only partially synthesized since ONE is a forward reference.
Hence the instruction opcode and address of BREG will be assembled to
reside in location 101. The need for inserting the second operand's
address at a later stage can be indicated by adding an entry to the Table
of Incomplete Instructions (TII). This entry is a pair (<instruction address>,
<symbol>), e.g. (101, ONE) in this case.

By the time the END statement is processed, the symbol table


would contain the addresses of all symbols defined in the source
program and TII would contain information describing all forward
references. The assembler can now process each entry in TII to complete
the concerned instruction. For example, the entry (101, ONE) would be
processed by obtaining the address of ONE from the symbol table and
inserting it in the operand address field of the instruction with assembled
address 101. Alternatively, entries in TII can be processed in an
incremental manner: when the definition of some symbol symb is
encountered, all forward references to symb can be processed.
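A hedged C sketch of the TII and of the backpatching performed when END
is processed (structure and helper names are assumptions, not part of the
original design):

    #include <stdint.h>
    #include <string.h>

    #define MAXTII 100

    struct tii_entry {             /* (<instruction address>, <symbol>) */
        uint16_t instr_addr;       /* e.g. 101 */
        char     symbol[8];        /* e.g. "ONE" */
    };

    static struct tii_entry tii[MAXTII];
    static int ntii = 0;

    /* Assumed helpers: symtab_lookup returns a symbol's address,
     * patch_operand stores it in an instruction's operand field. */
    extern uint16_t symtab_lookup(const char *symbol);
    extern void     patch_operand(uint16_t instr_addr, uint16_t operand);

    /* Called on a forward reference: leave the operand blank and
     * remember where it must be filled in later. */
    static void tii_add(uint16_t instr_addr, const char *symbol) {
        tii[ntii].instr_addr = instr_addr;
        strncpy(tii[ntii].symbol, symbol, sizeof tii[ntii].symbol - 1);
        ntii++;
    }

    /* Called when END is processed: complete every pending instruction. */
    static void backpatch_all(void) {
        for (int i = 0; i < ntii; i++)
            patch_operand(tii[i].instr_addr, symtab_lookup(tii[i].symbol));
    }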

Design of A Two Pass Assembler

Tasks performed by the passes of a two pass assembler are as follows:

Pass I:

1. Separate the symbol, mnemonic opcode and operand fields.

2. Build the symbol table.

3. Perform LC processing.

4. Construct intermediate representation.

Pass II: Synthesize the target program.

Pass I performs analysis of the source program and synthesis of


the intermediate representation while Pass II processes the intermediate
representation to synthesize the target program. The design details of
assembler passes are discussed after introducing advanced assembler
directives and their influence on LC processing.

4. Explain the following with respect to Macros and Macro Processors:
A) Macro Definition and Expansion
B) Conditional Macro Expansion
C) Macro Parameters
Ans –

A) Macro definition and Expansion

Definition : macro

A macro name is an abbreviation, which stands for some related


lines of code. Macros are useful for the following purposes:

· To simplify and reduce the amount of repetitive coding

· To reduce errors caused by repetitive coding

· To make an assembly program more readable.

A macro consists of a name, a set of formal parameters and a body of
code. A use of the macro name with a set of actual parameters is replaced
by the code generated from its body. This is called macro expansion.

Macros allow a programmer to define pseudo operations, typically
operations that are generally desirable, are not implemented as part of
the processor instruction set, and can be implemented as a sequence of
instructions. Each use of a macro generates new program instructions;
the macro thus has the effect of automating the writing of the program.

Macros can be defined and used in many programming languages, like
C, C++, etc. Macros are commonly used in C to define small snippets of
code. If the macro has parameters, they are substituted into the macro
body during expansion; thus, a C macro can mimic a C function. The usual
reason for doing this is to avoid the overhead of a function call in simple
cases, where the code is lightweight enough that function call overhead
has a significant impact on performance.

For instance,

#define max(a, b) a>b ? a : b

defines the macro max, taking two arguments a and b. This macro
may be called like any C function, using identical syntax. Therefore, after
preprocessing

z = max(x, y);

becomes z = x>y ? x : y;

While this use of macros is very important for C, for instance to


define type-safe generic data-types or debugging tools, it is also slow,
rather inefficient, and may lead to a number of pitfalls.

C macros are capable of mimicking functions, creating new syntax


within some limitations, as well as expanding into arbitrary text
(although the C compiler will require that text to be valid C source code,
or else comments), but they have some limitations as a programming
construct. Macros which mimic functions, for instance, can be called like
real functions, but a macro cannot be passed to another function using a
function pointer, since the macro itself has no address.
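For example, to hand max to code expecting a function pointer, the macro
can be wrapped in a real function (a small illustrative sketch):

    #define max(a, b) ((a) > (b) ? (a) : (b))

    /* A macro has no address, but this wrapper function does. */
    int max_fn(int a, int b) { return max(a, b); }

    /* apply() takes a function pointer; max itself could not be passed. */
    int apply(int (*f)(int, int), int x, int y) { return f(x, y); }

    int main(void) { return apply(max_fn, 4, 6) == 6 ? 0 : 1; }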

In programming languages such as C or assembly language, a macro
name defines a set of commands that are substituted for the macro
name wherever the name appears in a program (a process called macro
expansion) when the program is compiled or assembled. Macros are
similar to functions in that they can take arguments and in that they are
calls to lengthier sets of instructions. Unlike functions, macros are
replaced by the actual commands they represent when the program is
prepared for execution; function instructions, in contrast, are copied into
a program only once.

Macro Expansion.

A macro call leads to macro expansion. During macro expansion,
the macro statement is replaced by a sequence of assembly statements.

Macro expansion on a source program.

Example

In the above program, a macro call INITZ is shown in the middle of the
figure, and is expanded when the program is assembled. Every macro
begins with the MACRO keyword and ends with ENDM (end macro).
Whenever a macro is called, its entire body code is substituted into the
program at the point of call, so the result of the macro expansion is shown
on the rightmost side of the figure.

Macro calling in high level programming languages

(C programming)

#define max(a,b) a>b?a:b

int main() {

int x, y, z;

x = 4; y = 6;

z = max(x, y); }

The above program is written using C programming statements. The
#define line defines the macro max, taking two arguments a and b. This
macro may be called like any C function, using identical syntax. Therefore,
after preprocessing, z = max(x, y); becomes z = x>y ? x : y;

After macro expansion, the whole code would appear like this:

#define max(a,b) a>b?a:b

int main()

{ int x, y, z;

x = 4; y = 6; z = x>y?x:y; }

B) Conditional Macro Expansion

Conditional macro expansion means that some sections of the program
may be optional, either included or not in the final program, dependent
upon specified conditions. A reasonable use of conditional assembly would
be to combine two versions of a program: one that prints debugging
information during test executions for the developer, and another version
for production operation that displays only results of interest for the
average user. A typical program fragment would assemble the instructions
to print the AX register only if Debug is true. Note that true is any
non-zero value.

Here is a conditional statement in C programming; the following
statement tests the expression BUFSIZE == 1020, where BUFSIZE
must be a macro:

#if BUFSIZE == 1020

printf ("Large buffers!\n");

#endif /* BUFSIZE is large */

Note: In C programming, macros are defined above main().
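Along the same lines, the Debug-controlled printing described above can
be mirrored in C with a conditional macro (an illustrative sketch; the
macro name DEBUG is an assumption):

    #include <stdio.h>

    #define DEBUG 1   /* any non-zero value means 'true' */

    int main(void) {
        int ax = 42;
    #if DEBUG
        /* compiled into the program only when DEBUG is non-zero */
        printf("AX register = %d\n", ax);
    #endif
        return 0;
    }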

C) Macro Parameters

Macros may have any number of parameters, as long as they fit on


one line. Parameter names are local symbols, which are known within
the macro only. Outside the macro they have no meaning!

Syntax:

<macro name> MACRO <parameter 1>…….<parameter n>


<body line 1>
<body line 2>
.
.
<body line m>
ENDM

Valid macro arguments are

1. arbitrary sequences of printable characters, not containing blanks,


tabs, commas, or semicolons

2. quoted strings (in single or double quotes)

3. Single printable characters, preceded by ‘!’ as an escape character

4. Character sequences, enclosed in literal brackets < … >, which may
be arbitrary sequences of valid macro arguments, including blanks,
commas and semicolons

5. Arbitrary sequences of valid macro arguments

6. Expressions preceded by a ‘%’ character

During macro expansion, these actual arguments replace the


symbols of the corresponding formal parameters, wherever they are
recognized in the macro body. The first argument replaces the symbol of
the first parameter, the second argument replaces the symbol of the
second parameter, and so forth. This is called substitution.
Example 3

MY_SECOND MACRO CONSTANT, REGISTER

MOV A,#CONSTANT

ADD A,REGISTER

ENDM

MY_SECOND 42, R5

After calling the macro MY_SECOND, the body lines

MOV A,#42

ADD A,R5

are inserted into the program, and assembled. The parameter names
CONSTANT and REGISTER have been replaced by the macro arguments
"42" and "R5". The number of arguments, passed to a macro, can be less
(but not greater) than the number of its formal parameters. If an
argument is omitted, the corresponding formal parameter is replaced by
an empty string. If other arguments than the last ones are to be omitted,
they can be represented by commas.

Macro parameters support code reuse, allowing one macro definition to


implement multiple algorithms. In the following, the .DIV macro has a
single parameter N. When the macro is used in the program, the actual
parameter used is substituted for the formal parameter defined in the
macro prototype during the macro expansion. Now the same macro,
when expanded, can produce code to divide by any unsigned integer.

Fig. 3.0

Example 4

The macro OPTIONAL has eight formal parameters:


OPTIONAL MACRO P1,P2,P3,P4,P5,P6,P7,P8
.
.
<macro body>
.
.
ENDM

If it is called as follows,

OPTIONAL 1,2,,,5,6

the formal parameters P1, P2, P5 and P6 are replaced by the arguments
1, 2, 5 and 6 during substitution. The parameters P3, P4, P7 and P8 are
replaced by a zero length string.

5. Describe the process of Bootstrapping in the context of Linkers
Ans –

In computing, bootstrapping refers to a process where a simple


system activates another more complicated system that serves the same
purpose. It is a solution to the chicken-and-egg problem of starting a
certain system without the system already functioning. The term is most
often applied to the process of starting up a computer, in which a
mechanism is needed to execute the software program that is
responsible for executing software programs (the operating system).

Bootstrap loading

The discussions of loading up to this point have all presumed that


there’s already an operating system or at least a program loader
resident in the computer to load the program of interest. The chain of
programs being loaded by other programs has to start somewhere, so
the obvious question is how is the first program loaded into the
computer?

In modern computers, the first program the computer runs after a
hardware reset invariably is stored in a ROM known as the bootstrap ROM,
as in "pulling one's self up by the bootstraps." When the CPU is powered on
or reset, it sets its registers to a known state. On x86 systems, for
example, the reset sequence jumps to the address 16 bytes below the
top of the system's address space. The bootstrap ROM occupies the top
64K of the address space and ROM code then starts up the computer. On
IBM-compatible x86 systems, the boot ROM code reads the first block of
the floppy disk into memory, or if that fails the first block of the first hard
disk, into memory location zero and jumps to location zero. The program
in block zero in turn loads a slightly larger operating system boot
program from a known place on the disk into memory, and jumps to that
program which in turn loads in the operating system and starts it. (There
can be even more steps, e.g., a boot manager that decides from which
disk partition to read the operating system boot program, but the
sequence of increasingly capable loaders remains.)
Why not just load the operating system directly? Because you can't
fit an operating system loader into 512 bytes. The first level loader
typically is only able to load a single-segment program from a file with a
fixed name in the top-level directory of the boot disk. The operating
system loader contains more sophisticated code that can read and
interpret a configuration file, uncompress a compressed operating
system executable, and address large amounts of memory (on an x86 the
loader usually runs in real mode, which means that it's tricky to address
more than 1MB of memory). The full operating system can turn on the
virtual memory system, load the drivers it needs, and then proceed to
run user-level programs.

Many Unix systems use a similar bootstrap process to get user-


mode programs running. The kernel creates a process, then stuffs a tiny
little program, only a few dozen bytes long, into that process. The tiny
program executes a system call that runs /etc/init, the user mode
initialization program that in turn runs configuration files and starts the
daemons and login programs that a running system needs.
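That tiny in-kernel stub amounts to little more than a single system call.
A user-space C sketch of the same idea (illustrative only):

    #include <unistd.h>

    /* The few-dozen-byte program the kernel stuffs into the first
     * process does essentially this: replace itself with /etc/init. */
    int main(void) {
        char *argv[] = { "init", (char *)0 };
        execv("/etc/init", argv);
        return 1;  /* reached only if the exec failed */
    }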

None of this matters much to the application level programmer,


but it becomes more interesting if you want to write programs that run
on the bare hardware of the machine, since then you need to arrange to
intercept the bootstrap sequence somewhere and run your program
rather than the usual operating system. Some systems make this quite
easy (just stick the name of your program in AUTOEXEC.BAT and reboot
Windows 95, for example); others make it nearly impossible. It also
presents opportunities for customized systems. For example, a single-
application system could be built over a Unix kernel by naming the
application /etc/init.

Software Bootstrapping & Compiler Bootstrapping

Bootstrapping can also refer to the development of successively


more complex, faster programming environments. The simplest
environment will be, perhaps, a very basic text editor (e.g. ed) and an
assembler program. Using these tools, one can write a more complex
text editor, and a simple compiler for a higher-level language and so on,
until one can have a graphical IDE and an extremely high-level
programming language.

Compiler Bootstrapping

In compiler design, a bootstrap or bootstrapping compiler is a


compiler that is written in the target language, or a subset of the
language, that it compiles. Examples include gcc, GHC, OCaml, BASIC,
PL/I and more recently the Mono C# compiler.

6. Describe the procedure for design of a Linker.
Ans –

Design of a linker

Relocation and linking requirements in segmented addressing

The relocation requirements of a program are influenced by the


addressing structure of the computer system on which it is to execute.
Use of the segmented addressing structure reduces the relocation
requirements of a program.

Implementation Examples: A Linker for MS-DOS

Example: Consider a program written in the assembly language of the
Intel 8088. The ASSUME statement declares the segment registers CS
and DS to be available for memory addressing. Hence all memory
addressing is performed by using suitable displacements from their
contents. The translation time address of A is 0196. In statement 16, a
reference to A is assembled as a displacement of 0196 from the contents
of the CS register. This avoids the use of an absolute address, hence the
instruction is not address sensitive. Now no relocation is needed if
segment SAMPLE is to be loaded with address 2000 by a calling program
(or by the OS). The effective operand address would be calculated as
<CS> + 0196, which is the correct address 2196. A similar situation exists
with the reference to B in statement 17. The reference to B is assembled
as a displacement of 0002 from the contents of the DS register. Since
the DS register would be loaded with the execution time address of
DATA_HERE, the reference to B would be automatically relocated to the
correct address.

Though the use of segment registers reduces the relocation
requirements, it does not completely eliminate the need for relocation.
Consider statement 14:

MOV AX, DATA_HERE

which loads the segment base of DATA_HERE into the AX register
preparatory to its transfer into the DS register. Since the assembler
knows DATA_HERE to be a segment, it makes provision to load the
higher order 16 bits of the address of DATA_HERE into the AX register.
However, it does not know the link time address of DATA_HERE; hence it
assembles the MOV instruction in the immediate operand format and
puts zeroes in the operand field. It also makes an entry for this
instruction in RELOCTAB so that the linker would put the appropriate
address in the operand field. Inter-segment calls and jumps are handled
in a similar way.
Relocation is somewhat more involved in the case of intra-segment
jumps assembled in the FAR format. For example, consider the following
program :

FAR_LAB EQU THIS FAR ; FAR_LAB is a FAR label

JMP FAR_LAB ; A FAR jump

Here the displacement and the segment base of FAR_LAB are to be


put in the JMP instruction itself. The assembler puts the displacement of
FAR_LAB in the first two operand bytes of the instruction, and makes a
RELOCTAB entry for the third and fourth operand bytes which are to hold
the segment base address. A declaration like

ADDR_A DW OFFSET A

(which is an 'address constant') does not need any relocation since
the assembler can itself put the required offset in the bytes. In summary,
the only RELOCTAB entries that must exist for a program using
segmented memory addressing are for the bytes that contain a segment
base address.

For linking, however, both the segment base address and the offset of the
external symbol must be computed by the linker. Hence there is no
reduction in the linking requirements.
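A hedged C sketch of how a linker might apply RELOCTAB entries of this
kind, patching each recorded location with the load-time segment base
(names and layout are assumptions, not the original design):

    #include <stdint.h>

    struct reloctab_entry {
        uint16_t offset;    /* where in the object code the segment
                               base address must be stored */
    };

    /* Patch every recorded word with the segment's load-time base.
     * code[] holds the assembled bytes; the 8086 is little-endian. */
    void relocate(uint8_t *code,
                  const struct reloctab_entry *tab, int n,
                  uint16_t segment_base) {
        for (int i = 0; i < n; i++) {
            code[tab[i].offset]     = (uint8_t)(segment_base & 0xFF);
            code[tab[i].offset + 1] = (uint8_t)(segment_base >> 8);
        }
    }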
