
NITTE MEENAKSHI INSTITUTE OF TECHNOLOGY (An Autonomous Institution under VTU, Belgum) Govindapura, Gollahalli, Yelahanka, Bangalore 560064

A REPORT ON ASSEMBLERS Submitted in the partial fulfilment of the requirements for the V semester in computer science and Engineering

SYSTEM SOFTWARE PROJECT BY ASHWIN M (1NT09CS110) DIXIT HEBBAR (1NT09CS109)

Department of CSE,NMIT

NITTE MEENAKSHI INSTITUTE OF TECHNOLOGY (An Autonomous Institution under VTU, Belgum) Govindapura, Gollahalli, Yelahanka, Bangalore 560 064 Department of Computer Science & Engineering

CERTIFICATE
This is to certify that the mini-project report titled ASSEMBLERS is carried out by ASHWIN.M (1NT09CS110) & DIXIT HEBBAR (1NT09CS109) , bonafide students of Nitte Meenakshi Institute of Technology, Bangalore in partial fulfilment of SYSTEM SOFTWARE during the academic year 2010-2011. It is certified that all corrections/suggestions indicated for Internal Assessment have been incorporated in the report. All the necessary requirements for the submission of the report have been satisfied.

Dr Nalini.N HOD & Prof. of CSE

Mr Nagraj Lecturer

Name of the Examiners

Signature with Date

1. ________________________

____________________

2. ________________________

____________________


ABSTRACT

In this project we design and implement a one-pass assembler. There are certain functions that any assembler has to perform, such as translating mnemonic operation codes to their machine equivalents and assigning machine addresses to the symbolic labels used by the programmer. The design of an assembler depends heavily on the source language it translates and the machine language it produces. As we will see later, there are also many subtler ways in which an assembler depends upon the machine architecture. On the other hand, some features of an assembler have no direct relation to the architecture; they reflect decisions made by the designers of the language. The basic assembler thus gives us a starting point from which to study more advanced assembler features. This framework can also be used to begin the design of an assembler for a completely unfamiliar machine. The assembler generates object code for a predefined set of machine instructions and supports two addressing modes.


ACKNOWLEDGEMENT

We have come to the completion of the project ASSEMBLER. This project has given us enormous pleasure and satisfaction. We would be failing in our duty if we did not acknowledge the help and support provided to us during the course of the project. In the first instance we would like to express our gratitude to our H.O.D., Dr Nalini, for providing invaluable suggestions for the completion of the project. We place on record the help rendered by Mr. Nagraj, who provided guidance at all stages. Our gratitude is also due for the encouragement and guidance we received throughout the work.

ASHWIN M DIXIT HEBBAR


SYNOPSIS

An assembler is system software that converts an assembly language program to its equivalent object code. The input to the assembler is source code written in assembly language (using mnemonics) and the output is the object code. The design of an assembler depends upon the machine architecture, because the mnemonics of the language correspond directly to the instructions of the machine.


One-Pass Assemblers:

Main problem: forward references to
- data items
- labels on instructions

Solution:
- Data items: require all such areas to be defined before they are referenced.
- Labels on instructions: no good solution.

Two types of one-pass assembler:
- Load-and-go: produces object code directly in memory for immediate execution.
- The other: produces the usual kind of object code for later execution.


CONTENTS:

1. Introduction to Assembler
   Basic Assembler functions
   Process translation, loading and execution
2. Outline of Pass1 and Pass2
3. Different types of Assemblers with their features
   Two-pass Assembler with overlay structure
   One-pass Assemblers
   Load-and-go Assemblers
4. Examples of multi pass Assemblers with features
   Microsoft MASM Assembler
   Sun Sparc Assembler
   AIX Assembler for PowerPC
5. Study case with SIC architecture
6. Design details
   Function prototypes
   Assembler algorithm for Pass1, Pass2


INTRODUCTION TO ASSEMBLER
When we think of an assembler, we first ask: what is an assembler, and how does it help the user to run programs? An assembler is system software that converts the statements of an assembly language program into executable machine code. Not every assembler works on every system. The choice of assembler for a particular system depends on the machine architecture on which the application programs are going to run. Many assemblers are available at present. Some of them are listed below.
1. Sparc assemblers for SunOS systems
2. MASM by Microsoft for the Intel x86 family
3. IBM AIX Assembler, etc.

BASIC FUNCTION OF ASSEMBLER:
The translation of a source program to object code requires us to accomplish the following functions:
1. Convert the mnemonic operation codes to their machine language equivalents.
2. Convert the symbolic operands to their equivalent machine addresses.
3. Build the machine instructions in the proper format.
4. Convert the data constants to their internal machine representations.
5. Write the object program and the assembly listing.
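The first of these functions, converting a mnemonic to its machine opcode, can be sketched as a small table lookup. This is only an illustration: the names optab_entry, OPTAB and optab_lookup are hypothetical, and the table lists just a few standard SIC opcodes rather than the full instruction set.

```c
#include <stddef.h>
#include <string.h>

/* One table entry: a mnemonic and its one-byte SIC opcode.
   The struct and function names here are hypothetical. */
struct optab_entry {
    const char *mnemonic;
    unsigned char opcode;
};

/* A few standard SIC opcodes; a complete table would list every instruction. */
static const struct optab_entry OPTAB[] = {
    {"LDA", 0x00}, {"LDX", 0x04}, {"STA", 0x0C},
    {"ADD", 0x18}, {"SUB", 0x1C}, {"COMP", 0x28},
};

/* Return the opcode for a mnemonic, or -1 if it is not in the table. */
int optab_lookup(const char *mnemonic)
{
    size_t n = sizeof OPTAB / sizeof OPTAB[0];
    for (size_t i = 0; i < n; i++) {
        if (strcmp(OPTAB[i].mnemonic, mnemonic) == 0)
            return OPTAB[i].opcode;
    }
    return -1;
}
```

A linear search is enough for a table this small; a real assembler would more likely use a hash table keyed on the mnemonic.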

An assembler can also be organised in several ways. Some of the options are listed below.
1. One-pass assemblers
2. Multi-pass assemblers
3. Two-pass assembler with overlay structure


Communication between Modules:


The diagram below indicates the communication pathways between the modules of an assembler. For each arrow in the diagram, the module at the tail of the arrow plays the role of a client and the module at the head of the arrow plays the role of a server. This means that the client calls functions provided by the server.

Two-Pass Assembler with overlay structure:

- Intended for machines with small memory.
- Pass 1 and Pass 2 are never required at the same time.
- Three segments:
    Root: driver program and shared tables and subroutines.
    Pass 1.
    Pass 2.
- Tree structure.
- Overlay program.


One-Pass Assemblers:

Main problem: forward references to
- data items
- labels on instructions

Solution:
- Data items: require all such areas to be defined before they are referenced.
- Labels on instructions: no good solution.

Two types of one-pass assembler:
- Load-and-go: produces object code directly in memory for immediate execution.
- The other: produces the usual kind of object code for later execution.

Load-and-go assembler:

Characteristics:
- Useful for program development and testing.
- Avoids the overhead of writing the object program out and reading it back.
- Both one-pass and two-pass assemblers can be designed as load-and-go. However, one-pass also avoids the overhead of an additional pass over the source program.
- For a load-and-go assembler, the actual address must be known at assembly time, so we can use an absolute program.

At the end of the program:
- Any SYMTAB entries that are still marked with * indicate undefined symbols.
- Search SYMTAB for the symbol named in the END statement and jump to this location to begin execution.
- The actual starting address must be specified at assembly time.

Forward reference in one-pass assembler:
For any symbol that has not yet been defined:
1. Omit the address translation.
2. Insert the symbol into SYMTAB and mark it as undefined.
3. Add the address of the operand field that refers to the undefined symbol to a list of forward references associated with the symbol table entry.
4. When the definition of the symbol is encountered, insert the proper address into every previously generated instruction on the forward reference list.
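The forward reference list described above can be sketched in C as follows. This is a minimal illustration under stated assumptions, not the code of this project: memory is modelled as an array of operand fields, and the names fwd_ref, symbol, use_symbol and define_symbol are hypothetical.

```c
#include <stdlib.h>

#define UNDEFINED (-1)

/* One pending forward reference: the address of an operand field to patch. */
struct fwd_ref {
    int operand_addr;
    struct fwd_ref *next;
};

/* A symbol table entry (hypothetical layout). */
struct symbol {
    const char *name;
    int value;              /* UNDEFINED until the label has been seen */
    struct fwd_ref *refs;   /* operand fields still waiting for the value */
};

/* The symbol is used as an operand at operand_addr. If it is already
   defined, store its value; otherwise leave the field 0 and remember
   the address. Returns 0 on success, -1 if memory allocation fails. */
int use_symbol(struct symbol *sym, int *memory, int operand_addr)
{
    if (sym->value != UNDEFINED) {
        memory[operand_addr] = sym->value;
        return 0;
    }
    struct fwd_ref *ref = malloc(sizeof *ref);
    if (ref == NULL)
        return -1;
    ref->operand_addr = operand_addr;
    ref->next = sym->refs;
    sym->refs = ref;
    memory[operand_addr] = 0;
    return 0;
}

/* The definition of the symbol is encountered: record the value, patch
   every queued operand field and release the list. */
void define_symbol(struct symbol *sym, int *memory, int value)
{
    struct fwd_ref *ref = sym->refs;
    sym->value = value;
    while (ref != NULL) {
        struct fwd_ref *next = ref->next;
        memory[ref->operand_addr] = value;
        free(ref);
        ref = next;
    }
    sym->refs = NULL;
}
```

Any symbol whose list is still non-empty at the end of assembly was used but never defined, which is how undefined symbols can be reported.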


Producing object code:
- This type of one-pass assembler is used when external working-storage devices (for the intermediate file between the two passes) are not available or are too slow.
- Solution: when the definition of a symbol is encountered, the assembler must generate another text record with the correct operand address.
- The loader is used to complete forward references that could not be handled by the assembler.
- The object program records must be kept in their original order when they are presented to the loader.

Multi-pass assembler:
- Restriction on EQU and ORG: no forward references, since the symbol's value cannot be defined during the first pass.
- Use a linked list to keep track of the symbols whose values depend on an undefined symbol.

Implementation examples:
1. Microsoft MASM assembler
2. Sun Sparc assembler
3. IBM AIX assembler


Microsoft MASM assembler:

SEGMENT
- A program is a collection of segments; each segment is defined as belonging to a particular class: CODE, DATA, CONST, STACK.
- Segment registers: CS (code), SS (stack), DS (data), ES, FS, GS.
- Similar to program blocks in SIC.

ASSUME
- E.g. ASSUME ES:DATASEG2
- E.g. MOV AX,DATASEG2
       MOV ES,AX
- Similar to BASE in SIC.

JUMP with forward reference
- Near jump: 2 or 3 bytes.
- Far jump: 5 bytes.
- E.g. JMP TARGET
- Warning: JMP FAR PTR TARGET
- Warning: JMP SHORT TARGET
- Pass 1 reserves 3 bytes for the jump instruction.
- Phase error.

PUBLIC, EXTRN
- Similar to EXTDEF, EXTREF in SIC.


SUN SPARC ASSEMBLER:

Sections
- .TEXT, .DATA, .RODATA, .BSS

Symbols
- Global vs. weak.
- Similar to the combination of EXTDEF and EXTREF in SIC.

Delayed branches
- Delay slots.
- Annulled branch instruction.

AIX assembler for PowerPC:

Similar to System/370.

Base relative addressing
- Saves instruction space; no absolute addresses.
- Base register table: general-purpose registers can be used as base registers.
- Easy program relocation: only data whose values are to be actual addresses need to be modified.
- E.g. USING LENGTH,1
       USING BUFFER,4
- Similar to BASE in SIC.
- DROP

Alignment
- Instruction (2)
- Data: halfword operand (2), fullword operand (4)
- Slack bytes

CSECT
- Control sections: PR (executable instructions), RO (read-only data), RW (read-write data), BS (uninitialized read/write data).
- Dummy section.

The SIMPLIFIED INSTRUCTIONAL COMPUTER (SIC)

SIC is a computer that has been carefully designed to include the hardware features most often found on real machines, while avoiding unusual or irrelevant complexities. Like many other products, SIC comes in two versions: the standard model and an XE version (XE stands for Extra Equipment). The two versions have been designed to be compatible; that is, a program for the standard SIC machine will also execute properly on a SIC/XE system.

SIC MACHINE ARCHITECTURE

MEMORY: Memory consists of 8-bit bytes. Any three consecutive bytes form a word (24 bits). All addresses on SIC are byte addresses. There are a total of 32,768 (2^15) bytes in the memory.

REGISTERS: Registers are very fast storage locations in the CPU that are used temporarily to store instructions, data or addresses. The SIC machine has five registers, all of which have special uses. Each register is 24 bits in length.


Mnemonic   Number   Special use
A          0        Accumulator; used for arithmetic operations.
X          1        Index register; used for addressing.
L          2        Linkage register; the jump to subroutine (JSUB) instruction stores the return address in this register.
PC         8        Program counter; contains the address of the next instruction to be fetched for execution.
SW         9        Status word; contains a variety of information, including a condition code (CC).

DATA FORMATS :
Integers are stored as 24-bit binary numbers. 2's complement representation is used for negative values. Characters are stored using their 8-bit ASCII codes. There is no floating-point hardware on the standard version of SIC.
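As an illustration of the 24-bit 2's complement integer format, the following sketch converts between a C integer and a SIC word. The function names sic_word_encode and sic_word_decode are hypothetical, and the input is assumed to lie in the range -2^23 to 2^23 - 1.

```c
#include <stdint.h>

/* Encode a value in the range -2^23 .. 2^23-1 as a 24-bit SIC word.
   Converting to unsigned and masking keeps the low 24 bits, which is
   exactly the 2's complement pattern. */
uint32_t sic_word_encode(int32_t value)
{
    return (uint32_t)value & 0xFFFFFFu;
}

/* Decode a 24-bit SIC word back to a signed integer.
   Bit 23 is the sign bit. */
int32_t sic_word_decode(uint32_t word)
{
    word &= 0xFFFFFFu;
    if (word & 0x800000u)
        return (int32_t)word - 0x1000000;
    return (int32_t)word;
}
```

For example, WORD 5 is stored as 000005 and the value -1 is stored as FFFFFF.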

INSTRUCTION FORMATS:
All machine instructions on the standard version of SIC have the following 24-bit format:

   Opcode (8 bits) | X (1 bit) | Address (15 bits)

The flag bit X is used to indicate indexed-addressing mode.
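The 8/1/15 layout can be illustrated by packing the three fields into one 24-bit value. This is a sketch only; sic_encode is a hypothetical helper and is not part of the project code.

```c
#include <stdint.h>

/* Pack opcode (8 bits), X flag (1 bit) and address (15 bits) into a
   24-bit SIC instruction, returned in the low 24 bits of the result. */
uint32_t sic_encode(uint8_t opcode, int indexed, uint16_t address)
{
    return ((uint32_t)opcode << 16)
         | ((indexed ? 1u : 0u) << 15)
         | ((uint32_t)address & 0x7FFFu);
}
```

With the opcode 18 for ADD and the address 1003, sic_encode(0x18, 0, 0x1003) gives 181003. Setting the X flag on STA (0C) at address 1006 gives 0C9006, because the flag occupies the top bit of the low 16 bits.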


ADDRESSING MODES:

There are two addressing modes available, indicated by setting of the X bit in the instruction. The following table describes how the target address is calculated from the address given in the instruction. Parentheses are used to indicate the contents of a register or a memory location. For example, (X) represents the contents of register X.

Mode       Indication   Target address calculation
Direct     X=0          TA = Address
Indexed    X=1          TA = Address + (X)
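The target address calculation in the table can be written out as a short function. This is a sketch under assumptions: sic_target_address is a hypothetical name, the instruction is passed as a 24-bit value in the 8/1/15 format, and no wrap-around of the result is modelled.

```c
#include <stdint.h>

/* Compute the target address of a 24-bit SIC instruction.
   x_contents is the current value of register X, written (X) in the table. */
uint32_t sic_target_address(uint32_t instruction, uint32_t x_contents)
{
    uint32_t address = instruction & 0x7FFFu;    /* low 15 bits */
    unsigned indexed = (instruction >> 15) & 1u; /* the X flag bit */

    return indexed ? address + x_contents : address;
}
```

For the instruction 009000 (LDA with X=1 and address 1000) and (X) = 5, the target address is 1005; with X=0 the same address field gives 1000 regardless of the contents of X.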

INSTRUCTION SET:

SIC provides a basic set of instructions that is sufficient for most simple tasks. These include instructions that load and store registers (LDA, LDX, STA, STX, etc.), as well as integer arithmetic operations (ADD, SUB, MUL, and DIV). All arithmetic operations involve register A and a word in memory, with the result being left in the register.

There is an instruction (COMP) that compares the value in register A with a word in memory; this instruction sets a condition code CC to indicate the result (<, =, or >). Conditional jump instructions (JLT, JEQ, and JGT) can test the setting of CC and jump accordingly. Two instructions are provided for subroutine linkage.

JSUB jumps to the subroutine, placing the return address in register L; RSUB returns by jumping to the address contained in register L.

ONE-PASS ASSEMBLER ALGORITHM

begin
  read first input line
  if OPCODE = 'START' then
    begin
      save #[OPERAND] as starting address
      initialize LOCCTR to starting address
      read next input line
    end {if START}
  else
    initialize LOCCTR to 0
  while OPCODE != 'END' do
    begin
      if this is not a comment line then
        begin
          if there is a symbol in the LABEL field then
            begin
              search SYMTAB for LABEL
              if found then
                begin
                  if symbol value is null then
                    begin
                      set symbol value to LOCCTR in SYMTAB
                      for each operand address in the symbol's linked list,
                        store the symbol value as the operand address
                      delete the linked list
                    end
                  else
                    set error flag (duplicate symbol)
                end
              else
                insert (LABEL, LOCCTR) into SYMTAB
            end {if symbol}
          search OPTAB for OPCODE
          if found then
            begin
              search SYMTAB for OPERAND
              if found then
                if symbol value is not null then
                  store symbol value as operand address
                else
                  insert a node with address LOCCTR at the end of the linked list
              else
                insert (symbol name, null) into SYMTAB and
                  start its linked list with a node with address LOCCTR
              add 3 to LOCCTR
            end
          else if OPCODE = 'WORD' then
            add 3 to LOCCTR and convert constant to object code
          else if OPCODE = 'RESW' then
            add 3 * #[OPERAND] to LOCCTR
          else if OPCODE = 'RESB' then
            add #[OPERAND] to LOCCTR
          else if OPCODE = 'BYTE' then
            begin
              find length of constant in bytes
              add length to LOCCTR
              convert constant to object code
            end
          if object code will not fit into current text record then
            begin
              write text record to object program
              initialize new text record
            end
          add object code to text record
        end {if not comment}
      write listing line
      read next input line
    end {while}
  write last text record to object program
  write end record to object program
  write last listing line
end {one-pass assembler}

PROGRAM CODE

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_LINES 20

int count[MAX_LINES];   /* global: count[i] = number of words on line i of INPUT.TXT */

void wordcount(void);

int main(void)
{
    FILE *f1, *f2, *f3, *f4;
    int linenum, f;
    unsigned int locctr;
    char lbl[10], mne[10], opd[10], menu1[10], sval[10];
    char sadr[10], slbl[10], op1[10];

    printf("Word count for input program:");
    wordcount();                        /* counts the number of words in each line */
    printf("\nOutput\n");
    printf("\nSourcecode \t Objectcode\n\n");

    f1 = fopen("INPUT.TXT", "r");       /* the source program */
    f2 = fopen("SYMTAB.TXT", "w+");     /* symbol table, written here and searched later */
    f4 = fopen("OPTAB.TXT", "w");       /* each source line with its location counter */
    if (f1 == NULL || f2 == NULL || f4 == NULL) {
        fprintf(stderr, "cannot open INPUT.TXT, SYMTAB.TXT or OPTAB.TXT\n");
        return EXIT_FAILURE;
    }

    /* the first line is "<name> START <address>", with the address in hexadecimal */
    if (fscanf(f1, "%9s %9s %x", lbl, mne, &locctr) != 3) {
        fprintf(stderr, "INPUT.TXT must begin with a START line\n");
        return EXIT_FAILURE;
    }
    linenum = 2;

    /* process every remaining line that wordcount() counted */
    while (linenum < MAX_LINES && count[linenum] > 0) {
        if (count[linenum] == 1) {      /* mnemonic only, e.g. END */
            fscanf(f1, "%9s", mne);
            fprintf(f4, "%x\t%s\n", locctr, mne);
        }
        if (count[linenum] == 2) {      /* mnemonic and operand */
            fscanf(f1, "%9s %9s", mne, opd);
            fprintf(f4, "%x \t %s \t %s \n", locctr, mne, opd);
            printf("%s\t%s\t", mne, opd);

            /* look up the mnemonic in OPCODE.TXT ("<mnemonic> <opcode>" pairs) */
            f3 = fopen("OPCODE.TXT", "r");
            if (f3 != NULL) {
                while (fscanf(f3, "%9s %9s", menu1, op1) == 2) {
                    if (strcmp(mne, menu1) == 0)
                        printf("%s\t", op1);
                }
                fclose(f3);
            }

            /* look up the operand in SYMTAB and print its address */
            f = 0;
            rewind(f2);
            while (fscanf(f2, "%9s %9s %9s", sadr, slbl, sval) == 3) {
                if (strcmp(opd, slbl) == 0) {
                    printf("%s\n\n", sadr);
                    f = 1;
                }
            }
            if (f == 0)
                printf("0000\n");       /* default address for an undefined symbol */
        }
        if (count[linenum] == 3) {      /* label, mnemonic and operand */
            fscanf(f1, "%9s %9s %9s", lbl, mne, opd);
            fprintf(f4, "%x\t%s\t%s\t%s\n", locctr, lbl, mne, opd);
            fseek(f2, 0L, SEEK_END);    /* reposition before writing after a read */
            fprintf(f2, "%x\t%s\t%s\n", locctr, lbl, opd);   /* update SYMTAB */
            if ((strcmp(mne, "RESW") == 0) || (strcmp(mne, "RESB") == 0))
                printf("%s\t%s\n\n", lbl, mne);              /* no object code */
            else
                printf("%s\t%s\t00\t000%s\n\n", lbl, mne, opd);
        }
        linenum += 1;

        /* advance the location counter */
        if (strcmp(mne, "WORD") == 0)
            locctr += 3;                /* one word is 3 bytes */
        else if (strcmp(mne, "BYTE") == 0)
            locctr += strlen(opd);      /* counts every character of the operand */
        else if (strcmp(mne, "RESW") == 0)
            locctr += 3 * atoi(opd);
        else if (strcmp(mne, "RESB") == 0)
            locctr += atoi(opd);
        else
            locctr += 3;                /* every SIC instruction is 3 bytes */
    }
    fclose(f1);     /* closing input file */
    fclose(f2);     /* closing symtab file */
    fclose(f4);     /* closing optab file */
    return 0;
}

/* Counts the words in each line of INPUT.TXT and stores them in count[] */
void wordcount(void)
{
    FILE *f3;
    int word = 0, i = 1;
    int c;          /* int, so that EOF can be told apart from any character */

    printf("\nWordcount");
    f3 = fopen("INPUT.TXT", "r");       /* opening the input file in read mode */
    if (f3 == NULL)
        return;
    while ((c = fgetc(f3)) != EOF) {
        if (c == ' ')
            word += 1;                  /* a space ends a word */
        if (c == '\n') {
            word += 1;                  /* the newline ends the last word */
            if (i < MAX_LINES)
                count[i] = word;        /* number of words in line i */
            printf("\nNo of words in line no %d:%d", i, word);
            i += 1;
            word = 0;
        }
    }
    fclose(f3);
}   /* end of wordcount function */


INPUT

WC START 1000
FIRST WORD 5
SECOND WORD 6
THIRD RESW 1
LDA FIRST
ADD SECOND
STA THIRD
END

Word count for input program:
Wordcount
No of words in line no 1:3
No of words in line no 2:3
No of words in line no 3:3
No of words in line no 4:3
No of words in line no 5:2
No of words in line no 6:2
No of words in line no 7:2
Output

Sourcecode        Objectcode

FIRST     WORD      00    0005
SECOND    WORD      00    0006
THIRD     RESW
LDA       FIRST     00    1000
ADD       SECOND    18    1003
STA       THIRD     0C    1006


CONCLUSION

The design of the one-pass assembler has been successfully completed. It has been designed keeping in mind only the basic functions of an assembler. Working on this project has given us sufficient knowledge and the confidence to handle other such projects in future.


BIBLIOGRAPHY

1. System Software: An Introduction to Systems Programming, 3rd Edition, Leland L. Beck
2. C Projects, Yashavant Kanetkar
3. Let Us C, 9th Edition, Yashavant Kanetkar

