[Figure: Source Program -> Assembler -> Object Code -> Linker -> Executable Code -> Loader]
2 Ravi Bhushan Th
Introduction to Assemblers
Fundamental functions
• Translating mnemonic operation codes to their machine language equivalents
• Assigning machine addresses to symbolic labels
Machine dependency
• Different machine instruction formats and codes
Assembler Directives
Pseudo-instructions
• Not translated into machine instructions
• Provide information to the assembler, to organize the program better
• May vary from assembler to assembler
Assembler’s functions
• Convert mnemonic operation codes to their machine language equivalents
• Convert symbolic operands to their equivalent machine addresses
• Build the machine instructions in the proper format
• Convert the data constants to internal machine representations
• Write the object program and the assembly listing
Data Structures
OPTAB (Operation Code Table)
• Used to look up mnemonic operation codes and translate them into their machine language equivalents
• Contains each mnemonic operation code and its machine language equivalent
• In more complex assemblers, also contains information such as instruction format and length
Content: mnemonic, machine code (instruction format, length), etc.
Characteristic: static table
Implementation: array or hash table, so that it is easy to search
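The lookup described above can be sketched in Python with a dictionary playing the role of the hash table. The mnemonics and opcodes are taken from the MOT table used later in these slides; the names `OpEntry` and `lookup_opcode` are hypothetical and introduced only for illustration.

```python
from typing import NamedTuple, Optional


class OpEntry(NamedTuple):
    opcode: int   # machine language equivalent of the mnemonic
    length: int   # instruction length in words


# Static table: filled in once and never modified during assembly.
OPTAB = {
    "MOVER": OpEntry(1, 1),
    "MOVEM": OpEntry(2, 1),
    "ADD":   OpEntry(3, 1),
    "SUB":   OpEntry(4, 1),
}


def lookup_opcode(mnemonic: str) -> Optional[OpEntry]:
    """Return the OPTAB entry for a mnemonic, or None if it is not a machine opcode."""
    return OPTAB.get(mnemonic.upper())
```

A miss (a `None` result) tells the assembler that the token is not a machine instruction, so it should be checked against the directive and pseudo-opcode tables next.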
OPTAB (Operation Code Table)
MOT Structure
SYMTAB (Symbol Table)
• Used to store the values (addresses) assigned to labels
• Includes the name and value of each label
• Holds flags that indicate error conditions, e.g. duplicate definition of a label
• May contain other information, such as the type or length of the data area or instruction labeled
Content: label name, value, flag, (type, length); entries may be variables, constants, procedures, etc.
Characteristic: dynamic table (insert, delete, search)
Implementation: hash table; the keys are non-random (e.g. LOOP1, LOOP2), so the hashing function must be chosen to spread similar names well
SYMTAB (Symbol Table)
SYMTAB is an essential data structure that the assembler uses to remember information about the identifiers appearing in the program being assembled.
The symbols used may vary from one assembler to another, even for the same assembly language.
Fundamental operations:
a. Insertion of a new symbol
b. Lookup (search)
c. Modification of information stored earlier in the table about a symbol

Name   Type        Location
X      variable    Offset of X
L1     label       Offset of L1
pqr    procedure   Offset of pqr
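The three fundamental operations, together with the duplicate-definition flag mentioned earlier, can be sketched as follows. This is a minimal illustration and not the slides' implementation; the class and field names (`Symbol`, `SymbolTable`, `kind`, `duplicate`) are assumptions.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Symbol:
    name: str
    kind: str                     # "variable", "label", "procedure", ...
    value: Optional[int] = None   # address or offset; None until it is known
    duplicate: bool = False       # error flag: symbol was defined more than once


class SymbolTable:
    def __init__(self) -> None:
        self._table: Dict[str, Symbol] = {}   # dict acts as the hash table

    def insert(self, name: str, kind: str, value: Optional[int] = None) -> Symbol:
        """Insert a new symbol; a repeated insert only sets the duplicate flag."""
        entry = self._table.get(name)
        if entry is None:
            entry = Symbol(name, kind, value)
            self._table[name] = entry
        else:
            entry.duplicate = True
        return entry

    def lookup(self, name: str) -> Optional[Symbol]:
        """Return the entry for a symbol, or None if it is not in the table."""
        return self._table.get(name)

    def modify(self, name: str, value: int) -> None:
        """Update the value stored earlier for a symbol."""
        self._table[name].value = value
```

For example, `insert("X", "variable")` followed later by `modify("X", 107)` records an address that was not known when the symbol was first seen.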
POT (Pseudo Opcode Table)
Used to store the pseudo opcodes supported by the assembler.
These are used to reserve memory space and possibly initialize it, e.g.
DB: Define Byte
DW: Define Word
DD: Define Double Word
DQ: Define Quad Word (8 bytes; can hold a double-precision float)
DT: Define Ten Bytes (can hold an extended-precision float)
RESB: Reserve Bytes
RESW: Reserve Words
RESD: Reserve Double Words
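A minimal sketch of how a POT might record, for each pseudo opcode, the size of one unit and whether the space is initialized. The byte sizes assume the NASM conventions for x86 that these mnemonics suggest (word = 2 bytes, double word = 4 bytes); the slides do not state the sizes, and `pseudo_op_length` is a hypothetical helper.

```python
# pseudo opcode -> (unit size in bytes, space is initialized?)
# Sizes are an assumption based on NASM conventions for x86.
POT = {
    "DB":   (1, True),
    "DW":   (2, True),
    "DD":   (4, True),
    "DQ":   (8, True),
    "DT":   (10, True),
    "RESB": (1, False),
    "RESW": (2, False),
    "RESD": (4, False),
}


def pseudo_op_length(op: str, count: int) -> int:
    """Bytes occupied by `count` units of the given pseudo opcode."""
    unit_size, _initialized = POT[op.upper()]
    return unit_size * count
```

The length returned here is what would be added to the location counter when the statement is processed.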
POT (Pseudo Opcode Table)
Structure
LOCCTR (Location Counter)
• Used to help in the assignment of addresses
• Initialized to the beginning address specified in the START statement
• After each source statement is processed, the length of the assembled instruction or data area to be generated is added to it
• When a label is encountered, the current value of LOCCTR gives the address of that label
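The behaviour of the location counter can be sketched in a few lines. The function name `assign_addresses` and the `(label, length)` statement representation are assumptions made for this illustration; a real assembler would derive each length from OPTAB or POT.

```python
def assign_addresses(start, statements):
    """Assign addresses to labels.

    start      -- beginning address from the START statement
    statements -- list of (label or None, length of the statement)
    Returns (symbol table, final value of the location counter).
    """
    locctr = start
    symtab = {}
    for label, length in statements:
        if label is not None:
            symtab[label] = locctr   # the current LOCCTR is the label's address
        locctr += length             # advance past the instruction or data area
    return symtab, locctr


symtab, end = assign_addresses(
    100, [(None, 1), ("LOOP", 1), (None, 1), ("A", 3), ("B", 1)]
)
```

Here `LOOP` is assigned 101, `A` is assigned 103, and `B` is assigned 106 because `A` occupies three words.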
A Simple Machine
Assume a simple hypothetical accumulator-based processor. Considerations:
• All memory load and store operations go through the accumulator register A.
• Arithmetic and logical operations mostly use A as the source and destination register.
• There are two more 32-bit GPRs, B and C, and one index register I (arithmetic is also permitted on I).
• The addressing modes are immediate and indirect through I.
• All memory addresses are 32 bits wide.
Manual Assembler
Machine Opcode table for a simple processor
Contd..
Program to add 10 nos.
Contd..
Symbol Table
Contd..
Code generated
Two Pass Assembler
The two-pass assembly process scans the input assembly language program twice.
These scans are known as Pass-1 and Pass-2.
Two Pass Assembler
[Figure: Source program -> Pass 1 -> Intermediate file -> Pass 2 -> Object codes. Each input line is read as LABEL, OPCODE, OPERAND, LITERALS.]
Pass-1 of Two Pass Assembler
A Simple Program
Pass-1 of Two Pass Assembler
(Section & Symbol Tables)
Pass-2 of Two Pass Assembler
This phase is responsible for generating the code.
It uses the tables created in Pass 1 and writes the generated code to the object file.
First, the object file offset and the location counter (lc) are initialized to 0.
The source program lines are then read one by one, and the corresponding object code is generated.
Each source line is parsed into its components, separating the mnemonics from the symbols.
For machine instructions, the addresses of the operands are found via the symbol table.
For pseudo opcodes like DB and DW, the corresponding initialization values are also written to the object file, but for RESB, RESW, etc., only the object file offset is updated, reserving the space in the object file.
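The different treatment of initializing and reserving pseudo opcodes in Pass 2 can be sketched as below. This is an illustration under stated assumptions: the object file is modelled as a list of `(offset, bytes)` chunks, values are stored little-endian, the unit sizes follow NASM conventions, and `emit_pseudo_op` is a hypothetical name.

```python
import struct

# Assumed encodings: little-endian, 1/2/4-byte units.
INIT_FORMATS = {"DB": "<B", "DW": "<H", "DD": "<I"}
RESERVE_SIZES = {"RESB": 1, "RESW": 2, "RESD": 4}


def emit_pseudo_op(chunks, offset, op, operands):
    """Process one pseudo opcode and return the new object file offset.

    chunks   -- list of (offset, bytes) pairs standing in for the object file
    operands -- initial values for DB/DW/DD, or [count] for RESB/RESW/RESD
    """
    op = op.upper()
    if op in INIT_FORMATS:
        # Initialization values are written to the object file.
        data = b"".join(struct.pack(INIT_FORMATS[op], v) for v in operands)
        chunks.append((offset, data))
        return offset + len(data)
    if op in RESERVE_SIZES:
        # Nothing is written; only the offset moves, which reserves the space.
        return offset + RESERVE_SIZES[op] * operands[0]
    raise ValueError(f"unknown pseudo opcode: {op}")
```

Keeping explicit offsets with each chunk means a reserved region simply shows up as a gap between two chunks.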
Pass-2 of Two Pass Assembler
Flowchart
Machine code after Pass-2
Explanation of Two Pass Assembler
with an example
In this example, there are three types of statements.
1. Imperative Statements: actions to be performed, e.g. machine opcodes
2. Declarative Statements: pseudo opcodes, e.g. DS, DC
3. Assembler Directives: START, ORIGIN, etc.
Let us assume that this program uses the following assembler directives:
START: gives the address at which the first word of the target program is placed (used in Pass 1), e.g. START 100 sets lc to 100.
END: marks the end of the source program.
ORIGIN: sets lc to some address value, e.g. ORIGIN 200, ORIGIN label+20, etc.
EQU: assigns the address of one symbol to another label, e.g. LABEL EQU LOOP.
LTORG: specifies where literals are placed. At each LTORG statement, and at END, the assembler allocates memory to the literals in the current literal pool and then clears the pool.
There is no intermediate code for ORIGIN and EQU.
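The effect of START, ORIGIN, and EQU on the location counter and the symbol table can be sketched as follows. The helper names `eval_address` and `apply_directive` are hypothetical, and the operand grammar (a number, a symbol, or symbol plus or minus a number) is an assumption based on the examples above.

```python
import re


def eval_address(expr, symtab):
    """Evaluate an operand such as '200', 'LOOP' or 'LOOP+20' to an address."""
    m = re.fullmatch(r"\s*([A-Za-z_]\w*|\d+)\s*(?:([+-])\s*(\d+))?\s*", expr)
    if m is None:
        raise ValueError(f"bad address expression: {expr!r}")
    base, sign, offset = m.groups()
    value = int(base) if base.isdigit() else symtab[base]
    if sign:
        value += int(offset) if sign == "+" else -int(offset)
    return value


def apply_directive(name, operand, label, lc, symtab):
    """Apply START, ORIGIN or EQU and return the new location counter."""
    if name in ("START", "ORIGIN"):
        return eval_address(operand, symtab)   # lc jumps to the given address
    if name == "EQU":
        symtab[label] = eval_address(operand, symtab)
        return lc                              # EQU leaves lc unchanged
    raise ValueError(f"unsupported directive: {name}")
```

With `LOOP` at 101, `ORIGIN LOOP+20` moves the location counter to 121, while `LABEL EQU LOOP` only adds `LABEL` to the symbol table with the value 101.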
MOT Table
Index   Mnemonic   Type   OP-Code   Length
1       MOVER      IS     01        1
2       MOVEM      IS     02        1
3       ADD        IS     03        1
4       SUB        IS     04        1
5       MULT       IS     05        1
6       DIV        IS     06        1
7       BC         IS     07        1
8       COMP       IS     08        1
9       PRINT      IS     09        1
10      READ       IS     10        1
ASSEMBLER DIRECTIVE Table
POT Table
Register Table
Apart from the three tables above, there is one more table, which lists the available registers of the system. With these tables as prerequisites, we can start Pass-1 of the two-pass assembler.

Register No   Name
01            AREG
02            BREG
03            CREG
04            DREG
Source Program
        START   100
        MOVER   AREG, A
LOOP:   PRINT   B
        ADD     BREG, ='9'
        SUB     BREG, D
        COMP    CREG, ='23'
        LTORG
A       DS      3
LABEL   EQU     LOOP
        ORIGIN  500
L1:     MULT    CREG, ='7'
        SUB     BREG, ='93'
        LTORG
B       DC      10
        MOVEM   CREG, ='7'
        PRINT   ='7'
D       DC      8
        END
Generating intermediate code
SOURCE PROGRAM               LC    INTERMEDIATE CODE
        START 100                  (AD, 01)  (C, 100)
        MOVER AREG, A        100   (IS, 01)  01  (S, 01)
LOOP:   PRINT B              101   (IS, 09)  --  (S, 03)
        ADD   BREG, ='9'     102   (IS, 03)  02  (L, 01)
        SUB   BREG, D        103   (IS, 04)  02  (S, 04)
        COMP  CREG, ='23'    104   (IS, 08)  03  (L, 02)
        LTORG                105   (AD, 05)  --  009
                             106   (AD, 05)  --  023
A       DS    3              107   (DL, 01)  --  03
LABEL   EQU   LOOP                 NO INTERMEDIATE CODE
        ORIGIN 500                 NO INTERMEDIATE CODE
L1:     MULT  CREG, ='7'     500   (IS, 05)  03  (L, 03)
        SUB   BREG, ='93'    501   (IS, 04)  02  (L, 04)
        LTORG                502   (AD, 05)  --  007
                             503   (AD, 05)  --  093
B       DC    10             504   (DL, 02)  --  010
        MOVEM CREG, ='7'     505   (IS, 02)  03  (L, 05)
        PRINT ='7'           506   (IS, 09)  --  (L, 05)
D       DC    8              507   (DL, 02)  --  008
        END                  508   (AD, 02)  --  007

SYMBOL TABLE
SYM_NO   SYMBOL   ADDRESS
1        A        107
2        LOOP     101
3        B        504
4        D        507
5        LABEL    101
6        L1       500

LITERAL TABLE
LIT_NO   LITERAL   ADDRESS
1        ='9'      105
2        ='23'     106
3        ='7'      502
4        ='93'     503
5        ='7'      508
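The Pass-1 bookkeeping behind the symbol and literal tables above can be sketched as follows. This is a simplified illustration, not the slides' algorithm: statements are assumed to be pre-split into `(label, opcode, operands)` tuples, every instruction and DC occupies one word, ORIGIN takes a plain number, EQU refers to an already defined symbol, and no intermediate code is emitted. The function name `pass1` is hypothetical.

```python
def pass1(program):
    lc = 0
    symtab = {}       # symbol -> address (defined symbols only)
    littab = []       # [literal, address] entries in order of appearance
    pool_start = 0    # index in littab of the first literal of the current pool
    for label, op, operands in program:
        if op in ("START", "ORIGIN"):
            lc = int(operands[0])
        elif op == "EQU":
            symtab[label] = symtab[operands[0]]
        elif op in ("LTORG", "END"):
            # Allocate memory to the literals of the current pool, then clear it.
            for entry in littab[pool_start:]:
                entry[1] = lc
                lc += 1
            pool_start = len(littab)
        else:
            if label:
                symtab[label] = lc
            if op == "DS":
                lc += int(operands[0])
            else:
                pool = [lit for lit, _ in littab[pool_start:]]
                for operand in operands:
                    if operand.startswith("=") and operand not in pool:
                        littab.append([operand, None])
                        pool.append(operand)
                lc += 1
    return symtab, littab


PROGRAM = [
    (None, "START", ["100"]),
    (None, "MOVER", ["AREG", "A"]),
    ("LOOP", "PRINT", ["B"]),
    (None, "ADD", ["BREG", "='9'"]),
    (None, "SUB", ["BREG", "D"]),
    (None, "COMP", ["CREG", "='23'"]),
    (None, "LTORG", []),
    ("A", "DS", ["3"]),
    ("LABEL", "EQU", ["LOOP"]),
    (None, "ORIGIN", ["500"]),
    ("L1", "MULT", ["CREG", "='7'"]),
    (None, "SUB", ["BREG", "='93'"]),
    (None, "LTORG", []),
    ("B", "DC", ["10"]),
    (None, "MOVEM", ["CREG", "='7'"]),
    (None, "PRINT", ["='7'"]),
    ("D", "DC", ["8"]),
    (None, "END", []),
]

symtab, littab = pass1(PROGRAM)
```

Note that `='7'` gets two literal table entries (502 and 508) because the pool is cleared at the second LTORG, which matches the literal table above. Unlike the slides, this sketch records symbols in order of definition rather than first appearance, so it does not reproduce the SYM_NO column.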
Pass -2 of Two Pass Assembler
After Pass-1 of the two-pass assembler, we are ready to move on to the second pass of the assembly process.
Using the data structures built in Pass-1 (the intermediate code, the symbol table, and the literal table), Pass-2 generates the final target code, or object code.
Object Code Generation
INTERMEDIATE CODE                  OBJECT CODE
      (AD, 01)  (C, 100)           01-100
100   (IS, 01)  01  (S, 01)        100  01  01  107
101   (IS, 09)  --  (S, 03)        101  09  --  504
102   (IS, 03)  02  (L, 01)        102  03  02  105
103   (IS, 04)  02  (S, 04)        103  04  02  507
104   (IS, 08)  03  (L, 02)        104  08  03  106
105   (AD, 05)  --  009            105  --  --  009
106   (AD, 05)  --  023            106  --  --  023
107   (DL, 01)  --  03             107  --  --  --
      NO INTERMEDIATE CODE
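The translation from an intermediate code line to an object code line shown above can be sketched as below. The tuple representation of the intermediate code and the function names `resolve` and `pass2_line` are assumptions for illustration; the table contents are the ones produced by Pass-1 in this example, and only the statement kinds that appear in the table above are handled.

```python
# SYM_NO -> address and LIT_NO -> address, as built in Pass-1 of the example.
SYMTAB = {1: 107, 2: 101, 3: 504, 4: 507, 5: 101, 6: 500}
LITTAB = {1: 105, 2: 106, 3: 502, 4: 503, 5: 508}


def resolve(operand, symtab, littab):
    """Turn an (S, n) or (L, n) operand into the address stored in the tables."""
    kind, index = operand
    return symtab[index] if kind == "S" else littab[index]


def pass2_line(lc, stmt, reg, operand, symtab=SYMTAB, littab=LITTAB):
    """Generate one object code line from one intermediate code line."""
    cls, code = stmt
    if cls == "IS":                  # imperative statement: opcode, register, address
        reg_field = f"{reg:02d}" if reg is not None else "--"
        address = resolve(operand, symtab, littab)
        return f"{lc} {code:02d} {reg_field} {address}"
    if cls == "AD" and code == 5:    # literal placed by LTORG: emit its value
        return f"{lc} -- -- {operand:03d}"
    if cls == "DL" and code == 1:    # DS: storage is reserved, no code generated
        return f"{lc} -- -- --"
    raise ValueError(f"unhandled statement: {stmt}")
```

For instance, `pass2_line(100, ("IS", 1), 1, ("S", 1))` yields the line `100 01 01 107`, because symbol 1 (A) has address 107.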
Source Program
        START   100
        MOVER   AREG, A
        PRINT   B
        ADD     BREG, ='9'
        SUB     BREG, D
        COMP    CREG, ='23'
        LTORG
A       DS      3
LABEL   EQU     A
        ORIGIN  500
L1:     MULT    CREG, ='7'
B       DC      10
        MOVEM   CREG, ='7'
D       DC      8
        END
One-Pass Assemblers
SOURCE PROGRAM               TARGET PROGRAM
        START 100            01-100
        MOVER AREG, A        100  01  01  107
        PRINT B              101  09  --  501
        ADD   BREG, ='9'     102  03  02  105
        SUB   BREG, D        103  04  02  503
        COMP  CREG, ='23'    104  08  03  106
        LTORG                105  --  --  9
                             106  --  --  23
A       DS    3              107  --  --
LABEL   EQU   A              NO CODE
        ORIGIN 500           NO CODE
L1:     MULT  CREG, ='7'     500  05  03  504
B       DC    10             501  --  --  10
        MOVEM CREG, ='7'     502  02  03  504
D       DC    8              503  --  --  8
        END                  504  --  --  7

SYMBOL TABLE
SYM_NO   SYMBOL   ADDRESS
1        A        107
2        B        501
3        D        503
4        LABEL    107
5        L1       500

LITERAL TABLE
LIT_NO   LITERAL   ADDRESS
1        ='9'      105
2        ='23'     106
3        ='7'      504

TII TABLE (incomplete instructions)
LC_NO   INCOMPLETE INS.
100     A
101     B
102     ='9'
103     D
104     ='23'
500     ='7'
502     ='7'
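The role of the TII table can be sketched as a backpatching loop: an instruction whose operand is not yet defined is emitted with an empty address field and recorded, and it is completed when the operand's address becomes known. The event representation and the name `assemble_one_pass` are assumptions for illustration; addresses are supplied directly instead of being computed, and literals are treated like symbols, which is only safe here because each literal belongs to a single pool in this example.

```python
def assemble_one_pass(events):
    """events: ("instr", lc, opcode, reg, operand) or ("define", name, address)."""
    code = {}      # lc -> [opcode, reg, address]
    symtab = {}    # name -> address
    tii = []       # (lc, name) for each incomplete instruction
    for event in events:
        if event[0] == "instr":
            _, lc, opcode, reg, name = event
            address = symtab.get(name)
            if address is None:
                tii.append((lc, name))     # forward reference: remember it
            code[lc] = [opcode, reg, address]
        else:
            _, name, address = event
            symtab[name] = address
            for lc, pending in tii:        # complete the waiting instructions
                if pending == name:
                    code[lc][2] = address
            tii = [(lc, n) for lc, n in tii if n != name]
    return code, symtab, tii


EVENTS = [
    ("instr", 100, 1, 1, "A"),
    ("instr", 101, 9, None, "B"),
    ("instr", 102, 3, 2, "='9'"),
    ("instr", 103, 4, 2, "D"),
    ("instr", 104, 8, 3, "='23'"),
    ("define", "='9'", 105),
    ("define", "='23'", 106),
    ("define", "A", 107),
    ("instr", 500, 5, 3, "='7'"),
    ("define", "B", 501),
    ("instr", 502, 2, 3, "='7'"),
    ("define", "D", 503),
    ("define", "='7'", 504),
]

code, symtab, tii = assemble_one_pass(EVENTS)
```

After the last event the TII list is empty and every address field matches the target program above; any entry left in the list at END would indicate an undefined symbol.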
One-Pass Assemblers
Main problem: forward references
• to data items
• to labels on instructions
Solution
• Data items: require all such areas to be defined before they are referenced
• Labels on instructions: no good solution