Compiler Construction

Abdul Wahab
Lecturer
IIT, UST Bannu
abdul_bu@yahoo.com.

Cousins of the Compiler
• Preprocessors
• Assemblers
• Loaders and Link-Editors

Abdul Wahab, Lecturer in IT 2


Preprocessors
• Preprocessors produce the input to a compiler. They may perform
the following functions:
1. Macro processing
2. File inclusion
3. Rational preprocessing (augmenting older languages with
modern flow-of-control constructs)
4. Language extension (for example, Equel is a database query
language embedded in C)
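
The first two functions can be illustrated with a minimal sketch. It assumes C-style `#define NAME text` and `#include "file"` directives; the `preprocess` function and the in-memory `files` dictionary are hypothetical stand-ins for a real preprocessor and file system.

```python
import re

def preprocess(name, files, macros=None):
    """Expand #include and simple #define directives; return the source text."""
    macros = {} if macros is None else macros
    out = []
    for line in files[name].splitlines():
        parts = line.split(None, 2)
        if parts and parts[0] == "#include":
            # File inclusion: splice the expanded text of the named file in place.
            included = preprocess(parts[1].strip('"'), files, macros)
            if included:
                out.append(included)
        elif parts and parts[0] == "#define":
            # Macro processing: remember NAME -> replacement text.
            macros[parts[1]] = parts[2] if len(parts) > 2 else ""
        else:
            # Replace every whole identifier that names a macro.
            out.append(re.sub(r"[A-Za-z_]\w*",
                              lambda m: macros.get(m.group(0), m.group(0)),
                              line))
    return "\n".join(out)

files = {
    "limits.h": "#define MAX 100",
    "main.c": '#include "limits.h"\nint buf[MAX];',
}
result = preprocess("main.c", files)
print(result)  # int buf[100];
```

The compiler proper would then receive only the expanded text, with no directives left in it.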

Assembler
• Some compilers produce assembly code that is passed to an
assembler for further processing.

• Other compilers produce relocatable machine code that can be
passed directly to the loader/link-editor.

• Assembly code is a mnemonic version of machine code, in which
names are used instead of binary codes for operations, e.g.
MOV a, R1
ADD #2, R1
MOV R1, b

These three instructions compute b = a + 2.
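
To make the meaning of the mnemonics concrete, here is a small interpreter sketch for the three instructions. It assumes the operand order is "source, destination", that `#` marks an immediate constant, and that names of the form `R1` are registers; the `run` function is hypothetical.

```python
import re

def run(program, memory):
    """Interpret MOV and ADD, reading operands as 'source, destination'."""
    registers = {}

    def is_register(operand):
        return re.fullmatch(r"R\d+", operand) is not None

    def read(operand):
        if operand.startswith("#"):          # immediate constant such as #2
            return int(operand[1:])
        if is_register(operand):
            return registers[operand]
        return memory[operand]               # named storage location

    def write(operand, value):
        if is_register(operand):
            registers[operand] = value
        else:
            memory[operand] = value

    for line in program:
        operation, operands = line.split(None, 1)
        source, destination = [part.strip() for part in operands.split(",")]
        if operation == "MOV":
            write(destination, read(source))
        elif operation == "ADD":
            write(destination, read(destination) + read(source))
        else:
            raise ValueError(f"unknown operation: {operation}")
    return memory

memory = run(["MOV a, R1", "ADD #2, R1", "MOV R1, b"], {"a": 5, "b": 0})
print(memory["b"])  # 7, i.e. a + 2
```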

Two-Pass Assembly
• The assembler makes two passes over the input.

• A pass consists of reading the input file once.

• In the first pass, all identifiers that denote storage
locations are found and stored in a symbol table
(separate from that of the compiler).

• Identifiers are assigned storage locations as they are
encountered for the first time.

For Example
• The symbol table might contain the entries shown
below (assuming that a word consists of 4 bytes and that addresses
are assigned starting from byte 0):

IDENTIFIER   ADDRESS
a            0
b            4
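
The first pass can be sketched as follows. This is a minimal illustration, assuming one word per identifier, `#` for immediates, and `R<n>` for registers; the function name `first_pass` is hypothetical.

```python
import re

WORD_SIZE = 4  # bytes per word, as assumed in the example

def first_pass(lines):
    """Return {identifier: address}, assigning addresses in order of first use."""
    symbols = {}
    for line in lines:
        _, operands = line.split(None, 1)
        for operand in operands.split(","):
            operand = operand.strip()
            # Skip immediates (#2) and registers (R1); the rest name storage.
            if operand.startswith("#") or re.fullmatch(r"R\d+", operand):
                continue
            if operand not in symbols:
                symbols[operand] = len(symbols) * WORD_SIZE
    return symbols

symbols = first_pass(["MOV a, R1", "ADD #2, R1", "MOV R1, b"])
print(symbols)  # {'a': 0, 'b': 4}
```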

Two-Pass Assembly
• In the second pass, the assembler scans the input
again.

• This time it translates each operation code into the
sequence of bits representing that operation in
machine language. It also translates each identifier
representing a location into the address given for that
identifier in the symbol table.

Two-Pass Assembly

• The output of the second pass is usually relocatable
machine code, meaning that it can be loaded starting
at any location L in memory: if L is added to every
address in the code, all references remain correct.

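Example
• The following is hypothetical machine code for the
previous example:

0001 01 00 00000000 *
0011 01 10 00000010
0010 01 00 00000100 *

Fields, left to right: instruction code (4 bits), register (2 bits),
address mode (2 bits), memory address (8 bits). The * marks a
relocatable address.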
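
A second pass that produces this kind of output can be sketched as below. The field widths (4, 2, 2 and 8 bits), the opcode values, and the mode bits (00 for an ordinary address, 10 for an immediate) are assumptions read off the hypothetical machine code in the example; `encode` and the literal `SYMBOLS` table are illustrative names.

```python
OPCODES = {"load": "0001", "store": "0010", "add": "0011"}  # assumed encoding
SYMBOLS = {"a": 0, "b": 4}  # symbol table built by the first pass

def encode(line):
    """Translate one instruction into 'opcode register mode address [*]'."""
    operation, operands = line.split(None, 1)
    source, destination = [part.strip() for part in operands.split(",")]
    if operation == "MOV" and destination.startswith("R"):
        kind, register, operand = "load", destination, source    # memory -> register
    elif operation == "MOV":
        kind, register, operand = "store", source, destination   # register -> memory
    elif operation == "ADD":
        kind, register, operand = "add", destination, source
    else:
        raise ValueError(f"unknown operation: {operation}")
    register_bits = format(int(register[1:]), "02b")
    if operand.startswith("#"):
        # Immediate mode (10): the constant itself is stored; nothing to relocate.
        value_bits = format(int(operand[1:]), "08b")
        return f"{OPCODES[kind]} {register_bits} 10 {value_bits}"
    # Ordinary mode (00): look the address up and mark it relocatable with '*'.
    address_bits = format(SYMBOLS[operand], "08b")
    return f"{OPCODES[kind]} {register_bits} 00 {address_bits} *"

code = [encode(line) for line in ["MOV a, R1", "ADD #2, R1", "MOV R1, b"]]
for word in code:
    print(word)
```

Note that the address of b comes out as 00000100, matching the symbol-table entry b = 4.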

• If L = 00001111, i.e. 15, then a and b would be at
locations 15 and 19, respectively, and the instructions
would appear as absolute code:

0001 01 00 00001111
0011 01 10 00000010
0010 01 00 00010011
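
The relocation step is simple arithmetic on the marked addresses. The sketch below assumes relocatable instructions are written as four space-separated bit fields followed by `*` when the address is relocatable; `relocate` is a hypothetical helper.

```python
def relocate(instructions, load_address):
    """Add load_address to every address marked '*' and drop the marker."""
    absolute = []
    for instruction in instructions:
        fields = instruction.split()
        if fields[-1] == "*":
            fields.pop()                                  # remove the relocation mark
            address = int(fields[-1], 2) + load_address   # e.g. 4 + 15 = 19
            fields[-1] = format(address, "08b")
        absolute.append(" ".join(fields))
    return absolute

relocatable = [
    "0001 01 00 00000000 *",
    "0011 01 10 00000010",
    "0010 01 00 00000100 *",
]
absolute = relocate(relocatable, 0b00001111)  # L = 15
for word in absolute:
    print(word)
```

The immediate operand 00000010 is left unchanged because it is a constant, not an address.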

Loader and Link-Editor
• The loader performs two functions: loading and link-editing.
• Loading consists of taking relocatable machine
code, altering the relocatable addresses, and placing
the altered instructions and data in memory at the proper
locations.
• The link-editor makes a single program from several files of
relocatable machine code.
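
Both functions can be sketched with a deliberately simplified model. Here each module is assumed to be a list of words, each word a pair (address field, relocatable flag), with addresses counted in words from the start of the module; the names `link` and `load` are hypothetical, and external symbol resolution is left out.

```python
def link(modules):
    """Concatenate modules, shifting each module's relocatable addresses by its offset."""
    program, offset = [], 0
    for module in modules:
        for address, relocatable in module:
            program.append((address + offset if relocatable else address, relocatable))
        offset += len(module)
    return program

def load(program, base):
    """Place the program at `base`, turning relocatable addresses into absolute ones."""
    return [address + base if relocatable else address
            for address, relocatable in program]

module_a = [(1, True), (7, False)]   # word 0 refers to word 1 of module A
module_b = [(0, True)]               # word 0 refers to word 0 of module B
program = link([module_a, module_b])
image = load(program, 100)
print(image)  # [101, 7, 102]
```

After linking, module B's reference points at word 2 of the combined program, and loading at address 100 turns it into 102.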

The Grouping of Phases
• Activities from more than one phase can be grouped
together.

• Front and Back Ends

• Passes

Front and Back Ends
• The front end consists of those phases, or parts of phases, that
depend primarily on the source language and are largely independent
of the target machine. It normally includes:

1. Lexical analysis
2. Syntactic analysis
3. The creation of the symbol table
4. Semantic analysis
5. The generation of intermediate code
6. A certain amount of code optimization
7. Error handling

Front and Back Ends
• The back end includes those portions of the
compiler that depend on the target machine and are generally
independent of the source language. It includes:

1. Code optimization
2. Code generation
3. The necessary error handling and symbol-table
operations

Advantages of Front End and Back End
• If we want to write a compiler for a new source language on the
same machine, only the front end changes while
the back end remains the same.

• Similarly, if we want to write a compiler for the same language
on a new machine, the front end remains the
same and only the back end changes.

• It is also a good idea to compile several different
languages into the same intermediate language and use a
common back end for the different front ends.
Example:
• A C compiler may have more than one back
end, each for a different machine, e.g.

        C-Language Compiler
             Front End
            /         \
     Back End         Back End
        |                |
       IBM             Apple

• This means that a single front end is shared: one back end
targets the IBM machine and a second targets the Apple machine.

Example:
• Suppose we have a C compiler with a front end and a
back end. For a FORTRAN compiler we only need to write a new
front end for FORTRAN; the back
end remains the same.

C-Compiler      FORTRAN-Compiler
Front End          Front End
        \          /
       Target Language
          Back End
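
The two examples can be sketched together in code. Everything here is a toy: the intermediate form is a single tuple `("add", target, operand, constant)`, the two front ends accept only statements of the form `x = y + n`, and the two back ends emit made-up mnemonics for two hypothetical machines (not real IBM or Apple instruction sets).

```python
def c_front_end(source):
    """Parse 'x = y + n;' into the shared intermediate form."""
    target, expression = source.rstrip(";").split("=")
    left, right = expression.split("+")
    return ("add", target.strip(), left.strip(), int(right))

def fortran_front_end(source):
    """Parse 'X = Y + N' (upper case, no semicolon) into the same form."""
    target, expression = source.split("=")
    left, right = expression.split("+")
    return ("add", target.strip().lower(), left.strip().lower(), int(right))

def back_end_a(ir):
    """Code generator for hypothetical machine A."""
    _, target, left, constant = ir
    return [f"MOV {left}, R1", f"ADD #{constant}, R1", f"MOV R1, {target}"]

def back_end_b(ir):
    """Code generator for hypothetical machine B."""
    _, target, left, constant = ir
    return [f"LOAD R1, {left}", f"ADDI R1, {constant}", f"STORE R1, {target}"]

ir_c = c_front_end("b = a + 2;")
ir_f = fortran_front_end("B = A + 2")
print(ir_c == ir_f)          # True: both front ends produce the same form
print(back_end_a(ir_c))      # one front end, first back end
print(back_end_b(ir_f))      # other front end, second back end
```

Because both front ends meet at the same intermediate form, any front end can be paired with any back end: two of each give four compilers.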

Passes
• Several phases of compilation are usually implemented
in a single pass consisting of reading an input file and
writing an output file.

• The activities of the grouped phases are interleaved during the
pass. For example, lexical analysis, syntax analysis,
semantic analysis, and
intermediate code generation might be grouped
together into one pass.
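
Interleaving can be sketched with a generator: the parser asks the lexer for one token at a time and emits intermediate code as soon as it has enough input, so no token file is written between phases. This toy handles only sums such as `a + 2 + b` and omits semantic analysis; `tokens` and `translate` are hypothetical names.

```python
import re

def tokens(text):
    """Lexical analysis: yield one token at a time, on demand."""
    for match in re.finditer(r"[A-Za-z_]\w*|\d+|\+", text):
        yield match.group(0)

def translate(text):
    """Syntax analysis and intermediate code generation in the same pass."""
    stream = tokens(text)
    code, temp_count = [], 0
    result = next(stream)              # first operand
    for operator in stream:            # the parser asks the lexer for '+'
        operand = next(stream)         # ... and then for the next operand
        temp_count += 1
        temp = f"t{temp_count}"
        code.append(f"{temp} = {result} {operator} {operand}")
        result = temp
    return code

code = translate("a + 2 + b")
print(code)  # ['t1 = a + 2', 't2 = t1 + b']
```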
Passes
• It is desirable to have few passes, since it takes time to
read and write intermediate files.

• On the other hand, if we group several phases into one
pass, we may be forced to keep the entire program in
memory, because one phase may need information in
a different order than a previous phase produces it.
