
UNIT 2 INTRODUCTION TO ASSEMBLY LANGUAGE

Structure

Introduction
Objectives
Introduction to Assembly Language
    2.2.1 Why learn Assembly Language?
    2.2.2 Assembly Language Applications
    2.2.3 Machine Instructions
    2.2.4 Assembly Instructions
Assembly Language Fundamentals
    2.3.1 A Sample Program
    2.3.2 SEGMENT and ENDS Directive
    2.3.3 Data Definition Directives
    2.3.4 The ASSUME Directive
    2.3.5 Initializing Segment Registers
    2.3.6 END Directive
Input/Output Services
    2.4.1 Interrupts
    2.4.2 DOS Function Calls
Assembly Language Program Development Tools
    2.5.1 Editor
    2.5.2 Assembler
    2.5.3 Linker
    2.5.4 Loader
    2.5.5 Debugger
A Final Look at the Assembly Language Programs
    2.6.1 COM Programs
    2.6.2 EXE Programs
A Complete Example
Summary
Model Answers

2.0 INTRODUCTION

In the previous unit, we discussed the 8086 microprocessor: its register set, instruction set and addressing modes. In this unit and the two that follow, we will discuss the assembly language for the 8086/8088 microprocessor. Unit 1 is the basic building block that will help you understand the assembly language better. In this unit, we will discuss the importance of assembly language and the basic components of an assembly program, followed by a discussion of the program development tools available. We will then discuss what COM programs and EXE programs are. Finally, we will present a complete example. For all our discussions, we have used the Microsoft Assembler (MASM). However, the assembly language directives may differ from one assembler to another. Therefore, before running an assembly program you must consult the reference manual of the assembler you are using.

2.1 OBJECTIVES

At the end of the unit you should be able to:
- define the need and importance of an assembly program
- define the various directives used in an assembly program
- write a very simple assembly program with simple input-output services
- define COM and EXE programs
- differentiate between COM and EXE programs

2.2 INTRODUCTION TO ASSEMBLY LANGUAGE

Assembly language unlocks the secrets of your computer's hardware and software. It teaches you how the computer's hardware and operating system work together, and how application programs communicate with the operating system. Assembly language, unlike high level languages, is machine dependent. Each microprocessor has its own set of instructions that it can support. Here we will discuss only the IBM-PC assembly language. It consists of the Intel 8086/8088 instruction set. The instructions for the Intel 8088 may be used without modification on all its enhancements: 80186, 80286, 80386, 80486 and Pentium.

2.2.1 Why learn Assembly Language?


You should learn assembly language for several reasons:

1. It helps you understand the computer architecture and operating system.

2. Certain programs that require close interaction with the computer hardware are sometimes difficult or impossible to write in high level languages. Example: a telecommunication program for the IBM-PC.

3. High level languages, out of necessity, impose rules about what is allowed in a program. For example, Pascal does not allow a character value to be assigned to an integer variable. Assembly language, in contrast, has very few restrictions or rules; nearly everything is left to the discretion of the programmer. The price for such freedom is the need to handle many details that would otherwise be taken care of by the programming language itself.

4. One of the most important advantages of assembly language is that programs written in it are at least 30% denser than the same programs written in a high level language. The reason is that, as of today, compilers are still not intelligent enough to take advantage of some of the complex instructions of the machine. Example: if you write a high level program to compare two strings, the compiler will translate it using simple instructions like MOV, CMP, JMP etc., while the same thing can be written in assembly using REPE and CMPSB. Obviously the code is much smaller.
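The string comparison mentioned in point 4 might be sketched roughly as follows. This is only an illustration: the names STR1, STR2 and STR_LEN and the labels DIFFER and DONE are made up for this example, both strings are assumed to be 8 bytes long, and the directives used are the ones explained in Section 2.3.

```asm
DATA    SEGMENT
STR1    DB  'ASSEMBLY'          ;hypothetical first string
STR2    DB  'ASSEMBLE'          ;hypothetical second string
STR_LEN EQU 8                   ;assumed length of both strings
DATA    ENDS

CODE    SEGMENT
        ASSUME CS:CODE, DS:DATA, ES:DATA
START:  MOV AX, DATA
        MOV DS, AX              ;DS:SI will address STR1
        MOV ES, AX              ;ES:DI will address STR2
        MOV SI, OFFSET STR1
        MOV DI, OFFSET STR2
        MOV CX, STR_LEN         ;maximum number of bytes to compare
        CLD                     ;compare in the forward direction
        REPE CMPSB              ;repeat while the bytes are equal and CX > 0
        JNE DIFFER              ;ZF = 0 means a mismatch was found
        MOV AL, 00              ;strings are equal: return code 0
        JMP DONE
DIFFER: MOV AL, 01              ;strings differ: return code 1
DONE:   MOV AH, 4Ch             ;exit to DOS with the return code in AL
        INT 21h
CODE    ENDS
        END START
```

The whole comparison loop is the single line REPE CMPSB; the instructions around it only set up the registers that CMPSB uses implicitly.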

2.2.2 Assembly Language Applications


The assembly language programs presented in this unit are all trivial. The language requires a great deal of attention to detail. Most programmers do not write large application programs in assembly language; instead they write short, specific routines. Often we write short subroutines in assembly and call them from a high level language. You can take advantage of the strengths of high level languages by using them to write applications, and then write assembly language subroutines to handle operations that are not available in the high level language. For example, suppose you are writing a word processor in Pascal but find that this language performs badly when updating the screen display. You can write the routines that handle the screen in assembly. Similarly, you can write other critical parts of the program in assembly to speed up the program.

2.2.3 Machine Instructions

A machine instruction is a binary code that has a special meaning for a computer's CPU: it tells the CPU to perform a task. The task could be to move a number from one memory location to another, or to add two specific numbers. Each machine instruction is precisely defined when the CPU is constructed, and it is specific to that type of CPU. Following are some examples for the IBM-PC:

0000 0100    Add a number to the AL register
1000 0001    Add a number to a variable
1010 0011    Move the AX register to a memory location

The instruction set is the entire body of machine instructions available for a single CPU. A typical 2-byte IBM-PC machine instruction might be as follows: B0 05. The first byte is called the operation code, which identifies it as a MOV instruction. The second byte (05) is called the operand. The complete instruction thus becomes 'move the number 05 to the register AL'. Obviously, remembering the numbers for instructions is very difficult; hence the assembly language.

2.2.4 Assembly Instructions


Although it is possible to program directly in machine language using numbers, assembly language makes the task easier. The assembly language instruction to move the number 05 to AL would be:

        MOV AL,05

Assembly language is called a low-level language because it is close to the machine language in structure and function. The various parts of an assembly language instruction, viz. label, mnemonic, operand and comment, have already been discussed in the previous unit.
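To make the correspondence between mnemonics and machine code concrete, the short sketch below shows a few instructions with the bytes they assemble to in the comments. The choice of instructions is ours, made only for illustration; the segment name CODE and the label START are arbitrary.

```asm
CODE    SEGMENT
        ASSUME CS:CODE
START:  MOV AL, 05h     ;machine code B0 05 (opcode B0, operand 05)
        ADD AL, 05h     ;machine code 04 05 (opcode 0000 0100, operand 05)
        MOV AH, 4Ch     ;machine code B4 4C
        INT 21h         ;machine code CD 21
CODE    ENDS
        END START
```

Each line is one instruction for the CPU; the assembler's only job here is to replace the readable text by the bytes shown.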

2.3 ASSEMBLY LANGUAGE FUNDAMENTALS

The best way to learn to write assembly language programs is to first study a simple program written in assembly. We shall do just that in this section.

2.3.1 A Sample Program

;ABSTRACT   : This program adds two 8-bit numbers in the memory
;             locations called NUM1 and NUM2. The result is stored
;             in the memory location called RESULT. If there was a
;             carry from the addition it will be stored as
;             0000 0001 in the location CARRY.
;
;             get NUM1
;             add NUM2
;             put sum into memory at RESULT
;             position carry in LSB of byte
;             mask off upper seven bits
;             store the result in the CARRY location
;
;PORTS      : None used
;PROCEDURES : None used
;REGISTERS  : Uses CS, DS, AX

DATA    SEGMENT
NUM1    DB  15h                 ;First number stored here
NUM2    DB  20h                 ;Second number stored here
RESULT  DB  ?                   ;Put sum here
CARRY   DB  ?                   ;Put any carry here
DATA    ENDS

CODE    SEGMENT
        ASSUME CS:CODE, DS:DATA
START:  MOV AX, DATA            ;Initialize data segment
        MOV DS, AX              ;register
        MOV AL, NUM1            ;Get the first number
        ADD AL, NUM2            ;Add it to 2nd number
        MOV RESULT, AL          ;Store the result
        RCL AL, 01              ;Rotate carry into LSB
        AND AL, 00000001B       ;Mask out all but LSB
        MOV CARRY, AL           ;Store the carry result
        MOV AX, 4C00h
        INT 21h
CODE    ENDS
        END START

The program contains certain mnemonics in addition to the instructions you have studied so far. These are called assembler directives or pseudo-operations. They are directions for the assembler. Their meaning is valid only at assembly time, and no code is generated for them. We shall now study the complete program step by step.

2.3.2 SEGMENT and ENDS Directive

The SEGMENT and ENDS directives are used to identify a group of data items or a group of instructions, called a segment. These directives are used in the same way as parentheses are used in algebra, to group like items together. A group of data statements or instructions placed between the SEGMENT and ENDS directives is said to constitute a logical segment. This segment is given a name. In our example, CODE and DATA are the names given to the code and data segments respectively. Each segment must have a unique name, there can be no blanks within the segment name, and the segment name can be up to 31 characters long. A mnemonic or any other reserved word is not allowed as a segment name or label.

2.3.3 Data Definition Directives

In assembly language, we define storage for variables using data definition directives. Data definition directives create storage at assembly time, and can even initialize a variable to a starting value. The directives are summarized in the following table:

Directive   Description          Number of bytes   Attribute
DB          Define byte          1                 byte
DW          Define word          2                 word
DD          Define doubleword    4                 doubleword
DQ          Define quadword      8                 quadword
DT          Define 10 bytes      10                ten bytes

As we see from the above table, the variable being defined is given an attribute. The attribute refers to the basic unit of storage used when the variable was defined. These variables can be given a name as follows:

Example:

CHAR_VAR  DB  'A'           ;CHAR_VAR = 41h
WORD_VAR  DW  01234h        ;A hex number must begin with a digit,
                            ;so a leading zero is written
LIST      DB  1,2,3,4       ;list of 4 bytes initialized
                            ;by numbers 1,2,3,4
NUM       DW  4200
DEN       DB  20

The DUP directive is used to duplicate the basic data definition 'n' number of times.

Example:

ARRAY     DB  10 DUP (0)

defines an array ARRAY of 10 data bytes, each byte initialized to 0. The initial value can be anything acceptable to the basic data type.

The EQU directive is used to give a name to a constant.

Example:

CONS      EQU 20

will define a constant with value 20. Now, wherever you want to use 20 in your program, you can use the name instead. The advantage is this: suppose you later want to change the value of CONS to, say, 10. Instead of making changes everywhere in the program, you just have to change the EQU definition and assemble the program again. The change will be made automatically at all places.

The numbers used in data statements can be octal, binary, hexadecimal, decimal or ASCII. Following are examples of each type:

TEMP_MAX  DB  01101100B     ;Binary
OLD_VAL   DW  7341Q         ;Octal
DECIMAL   DB  49            ;Decimal
HEX_VAL   DW  03B2Ah        ;Hex
ASCII_VAL DB  'EXAMPLE'     ;ASCII

2.3.4 The ASSUME Directive

The 8086 has four types of segments, discussed in the previous unit. A program can define more than one code segment, data segment or extra segment. However, only one of each type can be active at a time. The ASSUME directive tells the assembler which segment is to be used as the active segment at any instant, and with respect to which segment it has to calculate the offsets of variables or instructions. It is usually placed immediately after the SEGMENT directive in the code segment, but you can have as many additional ASSUMEs as you like. Each time an ASSUME is encountered, the assembler starts to calculate offsets with respect to that segment. In the example above, CODE and DATA are the two segments defined, one each for code and data.

2.3.5 Initializing Segment Registers

ASSUME is only a directive, which is used to calculate the offsets of variables, instructions or stack elements with respect to a specific segment of its type. It does not initialize the segment registers. Initialization of the segment registers has to be done explicitly using MOV instructions, as follows:

        MOV AX,DATA
        MOV DS,AX

The above statements initialize the data segment register. A segment register cannot be loaded directly with an immediate value such as a segment name; therefore, the segment name is first moved into some general purpose register, which is then moved into the segment register. All segment registers can be initialized in the same manner. The code segment register is initialized automatically by the loader.
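A small sketch may help to show how ASSUME and the explicit initialization work together. The segment EXTRA and the variables SRC and DST below are hypothetical names introduced only for this illustration.

```asm
DATA    SEGMENT
SRC     DB  'A'                 ;hypothetical source byte
DATA    ENDS

EXTRA   SEGMENT
DST     DB  ?                   ;hypothetical destination byte
EXTRA   ENDS

CODE    SEGMENT
        ASSUME CS:CODE, DS:DATA, ES:EXTRA
START:  MOV AX, DATA            ;segment name goes through AX ...
        MOV DS, AX              ;... into the data segment register
        MOV AX, EXTRA
        MOV ES, AX              ;extra segment register set the same way
        MOV AL, SRC             ;SRC is addressed through DS
        MOV DST, AL             ;DST is addressed through ES, as ASSUME says
        MOV AX, 4C00h           ;return to DOS
        INT 21h
CODE    ENDS
        END START
```

ASSUME only tells the assembler which register to use for each variable; if the two MOV pairs at START were left out, the program would still assemble, but DS and ES would not point to DATA and EXTRA when it runs.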

2.3.6 END Directive


The END directive tells the assembler to stop reading and assembling the program from there on. Any statement after the END will be ignored by the assembler. There can be only one END in the program, which is the last statement of the program.

Check Your Progress 1


1. Why should you learn assembly language?

2. What is a segment? What should be the characteristics of segment names?

3. State true or false:

(a) The directive DT defines a quadword in the memory.  True / False

(b) DUP directive is used to indicate if a same memory location is used by two different variable names.  True / False

(c) EQU directive assigns a name to a constant value.  True / False

(d) ASC_VAL DB 'TEST 1' is a valid statement.  True / False

(e) The maximum number of active segments at a time in 8086 can be four.  True / False

(f) ASSUME directive specifies the physical address for the data values or instructions.  True / False

(g) A data segment register needs to be initialised by the program, however, a code segment register is initialised automatically by a loader.  True / False

(h) A statement after the END directive is ignored by the assembler.  True / False

2.4 INPUT/OUTPUT SERVICES

On machines based on the Intel 80xx series (8086, 8088 etc.) running DOS, input/output is carried out through services provided by the hardware and by the operating system. These are called ROM-BIOS services and DOS services respectively, and both are requested in the form of interrupts.

2.4.1 Interrupts

An interrupt occurs when the currently executing program is interrupted. Interrupts are generated for a variety of reasons, usually related to servicing the external devices connected to the machine, for example the keyboard, printer or monitor. There can be other reasons for interrupts, such as an error condition or a trap. We shall confine ourselves to the former kind. The 8086 recognizes two kinds of interrupts: hardware interrupts and software interrupts. Hardware interrupts are generated when a peripheral connected to the CPU requests some service. A software interrupt is a call to a subroutine located in the operating system, usually an input-output routine. As a programmer, you will be more interested in software interrupts. The INT (interrupt) instruction is used within application programs to request the services of DOS or ROM-BIOS. The INT instruction calls an operating system subroutine identified by a number in the range 0 - FFh. The syntax is

        INT number

The CPU processes an interrupt instruction using the interrupt vector table (IVT). It is situated in the first 1K bytes of memory, and has a total of 256 entries of 4 bytes each. The entry in the interrupt vector table is identified by the number given in the interrupt instruction, and in turn points to an operating system subroutine. The actual addresses stored in this table vary from machine to machine. Consider the following figure to see how interrupts are processed.

[Figure 1: Processing of an Interrupt. The calling program (mov ..., int 10h, add ...) executes INT 10h; entry 10h of the IVT holds the address F000:F065 of the ROM BIOS routine (sti, cld, ..., IRET); IRET returns to the calling program.]

The following steps explain the above figure.

1. The number following the INT mnemonic tells the CPU which entry to locate in the interrupt vector table. The address of the entry is found by multiplying the number by 4. For example, the entry for INT 10h is located at address 40h, while the entry for INT 3h is located at 0Ch.

2. The CPU jumps to the address stored in that entry of the interrupt vector table. The actual routine for servicing INT 10h is placed at this address.

3. To do so, the CPU loads the CS register and the IP register with the address found in the IVT and transfers control to that address, just like a far CALL (discussed in unit 4).

4. IRET (interrupt return) causes the program to resume execution at the next instruction in the calling program.
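The lookup in step 1 can be reproduced by a program. The sketch below reads the IVT entry for INT 10h from memory; the variable names VEC_OFF and VEC_SEG are made up for this illustration, and the values they receive depend on the machine.

```asm
DATA    SEGMENT
VEC_OFF DW  ?                   ;offset part of the INT 10h vector
VEC_SEG DW  ?                   ;segment part of the INT 10h vector
DATA    ENDS

CODE    SEGMENT
        ASSUME CS:CODE, DS:DATA
START:  MOV AX, DATA
        MOV DS, AX
        XOR AX, AX
        MOV ES, AX              ;ES = 0000h, the segment holding the IVT
        MOV BX, 10h * 4         ;entry for INT 10h starts at offset 0040h
        MOV AX, ES:[BX]         ;first word of the entry: offset (for IP)
        MOV VEC_OFF, AX
        MOV AX, ES:[BX+2]       ;second word of the entry: segment (for CS)
        MOV VEC_SEG, AX
        MOV AX, 4C00h           ;return to DOS
        INT 21h
CODE    ENDS
        END START
```

Each 4-byte entry holds the offset in its first word and the segment in its second word, which is why two 16-bit reads are needed.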

Though this seems an extraordinarily complex way of handling an interrupt, it works well for us because the CPU does most of the work. In the following section we shall study some of the more commonly used DOS service routines. BIOS routines will be taken up later, as the need arises.

2.4.2 DOS Function Calls

INT 21h is called a DOS function call. There are some 100 different functions supported by this interrupt, each identified by a function number placed in the AH register before the INT instruction is executed. We shall cover only the ones that are used most frequently. We shall explain the DOS routines in the following format:

Function number    Description
Call with
Returns
Example

01h  Reads a character from the keyboard, and echoes the character on the monitor. If no character is ready, waits until one is available.

Call with:  AH = 01
Returns:    AL = 8-bit data input

Example: Read one character from the keyboard into register AL, with echo, and store it in the variable VAR1.

        MOV AH, 01
        INT 21h
        MOV VAR1, AL

where VAR1 is defined in the data segment as

VAR1    DB  0

02h  Outputs a character on the monitor.

Call with:  AH = 02
            DL = 8-bit data (usually ASCII, if you want it printed on the screen)
Returns:    nothing

Example: Display the character '*' on the screen.

        MOV AH, 02
        MOV DL, '*'
        INT 21h

08h  Reads a character from the input device, without echoing it on the output device. Everything else is the same as for function 01.

09h  Outputs a string on the standard output device. The string must be terminated by a '$' character, which is not transmitted. Any other code, including control codes like newline, tab etc., can be embedded in the string.

Call with:  AH    = 09
            DS:DX = segment:offset of string
Returns:    nothing

Example:

CR      EQU 0Dh                 ; code for carriage return
LF      EQU 0Ah                 ; code for line feed

_DATA   SEGMENT
_STR    DB  'Hello world!',CR,LF,'$'
_DATA   ENDS

_CODE   SEGMENT
        MOV AH,09
        MOV DX,_DATA
        MOV DS,DX
        MOV DX,OFFSET _STR
        INT 21h
_CODE   ENDS
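Functions 08h and 02h can be combined, for instance, to hide what the user types. The following is a minimal sketch under that assumption; the labels START and FINISH are arbitrary, and the loop simply ends when ENTER is pressed.

```asm
CODE    SEGMENT
        ASSUME CS:CODE
START:  MOV AH, 08h             ;read a key without echo, character in AL
        INT 21h
        CMP AL, 0Dh             ;was it ENTER (carriage return)?
        JE  FINISH              ;yes: stop reading
        MOV AH, 02h             ;no: show a '*' in place of the character
        MOV DL, '*'
        INT 21h
        JMP START               ;go back for the next key
FINISH: MOV AX, 4C00h           ;return to DOS
        INT 21h
CODE    ENDS
        END START
```

Because function 08h does not echo, the only thing that reaches the screen is the '*' written by function 02h.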

0Ah  Reads a character string of up to 255 characters from the console and stores it in a buffer. The backspace key may be used to erase characters, and ENTER terminates the string. All the characters are echoed on the console. Before the function is called, a buffer must be set up whose length equals the maximum length of the string required plus two. The first byte of the buffer specifies the maximum number of bytes it can hold, including the carriage return (initialized by the user); the second byte is filled in by MS-DOS with the number of characters actually read, excluding the carriage return.

Call with:  AH    = 0Ah
            DS:DX = segment:offset of buffer
Returns:    nothing

Example:

_DATA   SEGMENT
BUFF    DB  81                  ; max length of string,
                                ; including CR, 81 characters
        DB  ?                   ; actual length of string used
        DB  81 DUP (0)          ; space for the characters read
_DATA   ENDS

_CODE   SEGMENT
        MOV AH,0Ah
        MOV DX,_DATA
        MOV DS,DX
        MOV DX,OFFSET BUFF
        INT 21h
_CODE   ENDS
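Functions 0Ah and 09h can be used together to read a line and print it back. The sketch below is one possible way of doing so; the names PROMPT, NEWLINE and TEXT are introduced here only for illustration, and the program assumes that the line typed does not itself contain a '$'.

```asm
CR      EQU 0Dh
LF      EQU 0Ah

_DATA   SEGMENT
PROMPT  DB  'Enter a line: $'
NEWLINE DB  CR, LF, '$'
BUFF    DB  81                  ;maximum number of bytes, including CR
        DB  ?                   ;filled in by DOS: characters actually read
TEXT    DB  82 DUP ('$')        ;the characters themselves, with room for '$'
_DATA   ENDS

_CODE   SEGMENT
        ASSUME CS:_CODE, DS:_DATA
START:  MOV AX, _DATA
        MOV DS, AX
        MOV AH, 09h             ;display the prompt
        MOV DX, OFFSET PROMPT
        INT 21h
        MOV AH, 0Ah             ;read a line into the buffer
        MOV DX, OFFSET BUFF
        INT 21h
        MOV BL, BUFF+1          ;BL = number of characters read
        MOV BH, 0
        MOV TEXT[BX], '$'       ;replace the CR with the '$' terminator
        MOV AH, 09h             ;move to a new line
        MOV DX, OFFSET NEWLINE
        INT 21h
        MOV AH, 09h             ;display what was typed
        MOV DX, OFFSET TEXT
        INT 21h
        MOV AX, 4C00h           ;return to DOS
        INT 21h
_CODE   ENDS
        END START
```

The count in the second byte of the buffer excludes the carriage return, so TEXT plus that count is exactly the position of the carriage return, which is where the '$' is written.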

4Ch  Terminate with a return code. Performs a final exit to DOS, passing back a return code. MS-DOS closes all the previously opened files and updates the directory.

Call with:  AH = 4Ch
            AL = return code
Returns:    nothing

Example: Return to DOS normally.

        MOV AH, 4Ch
        MOV AL, 00
        INT 21h

The same can be done by loading AH and AL with a single MOV:

        MOV AX, 4C00h
        INT 21h

For a more detailed discussion of the DOS routines, you can refer to any DOS programming book.

2.5 ASSEMBLY LANGUAGE PROGRAM DEVELOPMENT TOOLS

Now that you have some idea of how to write assembly language programs, you might want to write your own programs and try them out on the machine. To do that, some development tools are required. Let us study them now. The discussion is from the point of view of the end user, and not the system programmer.

2.5.1 Editor

An editor is a program which, when run on a system, lets you type in text and store it in a file. This text could also be your assembly language program. There are a number of editors available on the PC. Some of the more popular ones are EDLIN, WORDSTAR, TURBO etc. The editor helps you type the program in the required format. The correct format is as shown in the example in the previous section. This form of the program is called the source program. The physical position of each field is not important, but the relative position of the fields must not be changed. The number of blanks separating the fields is not fixed; there can be any number of blanks from 1 to 20. The editor gives you the flexibility to insert and delete lines, words and characters: in short, all the features that you can think of while writing text, and more. After the program is typed, it can be stored on secondary storage, such as a hard disk or floppy diskette, for permanent storage. More details can be found in the manual of the editor available on your system.

2.5.2 Assembler

An assembler program is used to translate assembly language mnemonics to the binary code for each instruction. After the complete program has been written with the help of an editor, it is assembled with the help of an assembler. An assembler works in two passes, i.e., it reads your source code two times. In the first pass, the assembler collects all the symbols defined in the program, along with their offsets, in a symbol table. On the second pass through the source program, it produces the binary code for each instruction of the program, and gives every symbol its offset with respect to the segment, taken from the symbol table. The assembler generates two files: the object file and the list file. The object file contains the binary code for each instruction in the program. It is created only when your program has been assembled successfully, with no errors. The errors detected by the assembler are called syntax errors. Examples are:

        MOVE AX,BX          ; undeclared identifier MOVE
        MOV  AX,BL          ; illegal operands

These are just two of the syntax errors that you can get when your program contains such mistakes. (The exact description of the errors differs from assembler to assembler.) In the first statement, the assembler reads the word MOVE and tries to match it against its set of mnemonics. As there is no mnemonic with this spelling, it assumes MOVE to be an identifier and looks for its entry in the symbol table. It does not find it there either, and therefore gives the error 'undeclared identifier'. In the second statement, the two operands are of different kinds. The 8086 expects both operands to be of the same kind, byte or word, but in this case one is a byte register while the other is a word register. An assembler does not detect logical errors in your programs; that is your responsibility. The list file is optional, and contains the source code, the binary equivalent of each instruction, and the offsets of the symbols in the program. This file is purely for documentation purposes. Some of the assemblers available on the PC are MASM (Microsoft Assembler), TURBO etc.
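A forward reference shows why the source is read twice. In the hypothetical fragment below (the labels AGAIN and FINISH are made up for this illustration), the label FINISH is used before the line that defines it.

```asm
CODE    SEGMENT
        ASSUME CS:CODE
START:  MOV CX, 3               ;loop counter
AGAIN:  DEC CX
        JZ  FINISH              ;FINISH is not yet defined when this line
                                ;is first read (forward reference)
        JMP AGAIN               ;AGAIN is already in the symbol table
FINISH: MOV AX, 4C00h           ;return to DOS
        INT 21h
CODE    ENDS
        END START
```

On the first pass the assembler can only note that FINISH is used; once the whole source has been read, the symbol table holds its offset and the jump can be encoded on the second pass.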


2.5.3 Linker

For modularity, it is better to break your programs into several subroutines. It is even better to put common routines, such as reading or writing a hexadecimal number, which could be used by many of your other programs, into a separate file. These files are assembled separately. After each has been assembled successfully, they can be linked together to form one large file, which constitutes your complete program. The file containing the common routines can be linked to your other programs as well. The program that links your programs is called the linker. The linker produces a link file, which contains the binary code for all the combined modules. The linker also produces a link map, which contains the address information about the linked files. The linker, however, does not assign absolute addresses to your program. It only assigns continuous relative addresses to all the modules linked, starting from zero. This form of the program is said to be relocatable, because it can be put anywhere in memory to be run. Code in this form can even be carried to other machines of the same kind, or compatible with the present machine, and run successfully. The linker available on your PC is LINK. TURBO has a built-in linker.
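As a rough illustration, a program split over two separately assembled files might look like the sketch below. The file names MAIN.ASM and LIB.ASM, the routine name PUTSTAR and the segment names are all hypothetical; the PUBLIC and EXTRN directives and the PROC/ENDP pair are MASM features that are not covered in this unit.

```asm
;----- file MAIN.ASM (hypothetical name) -----
        EXTRN PUTSTAR:FAR       ;PUTSTAR is defined in another module

STK     SEGMENT STACK           ;CALL needs a stack for the return address
        DW  64 DUP (0)
STK     ENDS

CODE    SEGMENT
        ASSUME CS:CODE, SS:STK
START:  CALL PUTSTAR            ;address is filled in by the linker
        MOV AX, 4C00h           ;return to DOS
        INT 21h
CODE    ENDS
        END START

;----- file LIB.ASM (hypothetical name) -----
        PUBLIC PUTSTAR          ;make the name visible to other modules

LIBCODE SEGMENT
        ASSUME CS:LIBCODE
PUTSTAR PROC FAR
        MOV AH, 02              ;DOS function 02h: output a character
        MOV DL, '*'
        INT 21h
        RET                     ;far return to the caller
PUTSTAR ENDP
LIBCODE ENDS
        END
```

Under these assumptions, each file would be assembled on its own (MASM MAIN; and MASM LIB;) and the two object files combined with LINK MAIN+LIB; to give one executable.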

2.5.4 Loader
Loader is a program, which assigns absolute addresses to the program. These addresses are generated, by adding to all the offsets, the address from where the program is loaded into the memory. Loader comes into action, when you execute your program. This program is brought from the secondary memory, like disk, or floppy diskette, into the main memory at a specific address. Let us assume the program was loaded at address 1000h, then 1000h is added to all the offsets to get the absolute address. Once the program has been loaded, it is now ready to run.

2.5.5 Debugger
If your program requires no external hardware, or requires only hardware directly accessible from your system, then you can use a debugger to debug your program. A debugger allows you to load your program into memory, just like a loader, and troubleshoot it. While debugging, you can run your program in single steps, set breakpoints, and view the contents of registers or memory locations. You can even change the contents of a register or memory location and run your program with the new value. This helps you to isolate the problems in your programs. The problems can be corrected with the help of an editor, and the whole procedure of assembling, linking and executing your program can be repeated. A debugger helps you detect the logical errors that could not be detected by the assembler.

Check Your Progress 2


State true or false:

1. Input-output on Intel 8086/8088 machines running DOS requires special routines to be written by the assembly programmer.  True / False

2. Intel 8086 processor recognises only the software interrupts.  True / False

3. INT instruction in effect calls a subroutine which is identified by a number.  True / False

4. Interrupt Vector Table (IVT) stores the interrupt handling programs.  True / False

5. INT 21h is a DOS function call.  True / False

6. INT 21h will output a character on the monitor if AH register contains 02.  True / False

7. String input and output can be achieved using INT 21h with function number 09h and 0Ah respectively.  True / False

8. To perform final exit to DOS we must use function 4Ch with the INT 21h.  True / False

9. Wordstar is an editor package.  True / False

10. Linking is required to link several segments of a single assembly program.  True / False

11. Relocatable addresses are actual physical addresses where program is expected to be loaded in the main memory.  True / False

12. Debugger helps in removing the syntax errors of a program.  True / False

2.6 A FINAL LOOK AT THE ASSEMBLY LANGUAGE PROGRAMS

Before you actually start writing assembly language programs, one final thing. Assembly language programs can be written in two ways: one in which all code and data are written as part of one segment, called COM programs, and the other in which there is more than one segment, called EXE programs. We shall study each of them in brief, looking at their advantages and disadvantages.

2.6.1 COM Programs


A COM (Command) program is simply a binary image of a machine language program. It is loaded in memory at the lowest available segment address. The program code begins at offset 100h, the first 256 bytes (100h) of the segment being occupied by the program segment prefix (PSP) that DOS builds for the program. All segment registers are set to the base segment address of the program.

A COM program keeps its code, data and stack within the same segment. Thus, its total size should not exceed 64K bytes. A sample COM program is shown below. The program's only segment (CSEG) must be declared explicitly using segment directives.

;TITLE ADD TWO NUMBERS AND STORE THE CARRY IN A THIRD
;      VARIABLE

CSEG    SEGMENT
        ASSUME CS:CSEG, DS:CSEG, SS:CSEG
        ORG 100h
START:  MOV AX, CS              ; Initialize data segment
        MOV DS, AX              ; register (same segment as the code;
                                ; a segment name cannot be used here,
                                ; since a COM file cannot be relocated)
        MOV AL, NUM1            ; Get the first number
        ADD AL, NUM2            ; Add it to 2nd number
        MOV RESULT, AL          ; Store the result
        RCL AL, 01              ; Rotate carry into LSB
        AND AL, 00000001B       ; Mask out all but LSB
        MOV CARRY, AL           ; Store the carry result
        MOV AX, 4C00h
        INT 21h
NUM1    DB  15h                 ; First number stored here
NUM2    DB  20h                 ; Second number stored here
RESULT  DB  ?                   ; Put sum here
CARRY   DB  ?                   ; Put any carry here
CSEG    ENDS
        END START

The ORG directive sets the location counter to offset 100h before any instruction is generated. A COM program takes up less space on disk than the equivalent EXE program. In spite of this, it is allocated all available RAM when loaded. COM programs require at least one full segment, because they automatically place their stack at the end of the segment.

2.6.2 EXE Programs


An EXE program is stored on disk with the extension EXE. EXE programs are longer than COM programs, because each EXE program has an EXE header followed by a load module containing the program itself. The EXE header is of fixed 256 bytes, and contains information used by DOS to correctly calculate the addresses of segments and other components. We will not go into the details of these. The load module consists of separate segments, which may be thought of as reserved areas for instructions, variables and stack. An EXE program may contain up to 64K segments, although at most four segments may be active at any time. The segments may be of variable size, with a maximum of 64K bytes each. In our discussion we shall confine ourselves to EXE programs, for the following reasons:

1. EXE programs are better suited to debugging.

2. EXE-format assembler programs are more easily converted into subroutines for high-level languages.

3. The third reason has to do with memory management. EXE programs are more easily relocatable, because there is no ORG statement forcing the program to be loaded from a specific address. Also, to fully use a multitasking operating system, programs must be able to share computer memory and resources. An EXE program is easily able to do this.

2.7 A COMPLETE EXAMPLE

Now that we have seen all the details of assembly language programming, we shall take up a complete example. Let us assume we want to add two 8-bit numbers and store the sum, and any carry, in 8-bit memory variables.

Preparation for writing the program

1. Write an algorithm for your program.

        get NUM1
        add NUM2
        put sum into memory at RESULT
        position carry in LSB of byte
        mask off upper seven bits
        store the result in the CARRY location

2. Specify the input and output required.

        Input required  - two 8-bit numbers
        Output required - an 8-bit sum and a carry in another 8-bit memory variable

3. Study the instruction set.

Study the instruction set carefully to identify the instructions available, along with their formats. We need to initialize the segment registers. We have already discussed that a segment register cannot be loaded directly with a segment name. Instead, we first move the segment address into a general purpose register, and then move the contents of that register to the segment register. To exit to DOS, we need interrupt routine 21h with function 4Ch placed in the AH register.

It is good practice to first code your program on paper, and to use comments liberally. This makes programming easier, and also helps you understand your program later. Please note that the number of comments does not affect the size of the program.

;ABSTRACT   : This program adds 2 8-bit numbers in the memory locations
;           : called NUM1 and NUM2. The result is stored in the
;           : memory location called RESULT. If there was a carry
;           : from the addition it will be stored as 0000 0001 in
;           : the location CARRY
;PORTS      : None used
;PROCEDURES : None used
;REGISTERS  : Uses CS, DS, AX

DATA    SEGMENT
NUM1    DB  15h                 ; First number stored here
NUM2    DB  20h                 ; Second number stored here
RESULT  DB  ?                   ; Put sum here
CARRY   DB  ?                   ; Put any carry here
DATA    ENDS

CODE    SEGMENT
        ASSUME CS:CODE, DS:DATA
START:  MOV AX, DATA            ; Initialize data segment
        MOV DS, AX              ; register
        MOV AL, NUM1            ; Get the first number
        ADD AL, NUM2            ; Add it to 2nd number
        MOV RESULT, AL          ; Store the result
        RCL AL, 01              ; Rotate carry into LSB
        AND AL, 00000001B       ; Mask out all but LSB
        MOV CARRY, AL           ; Store the carry result
        MOV AX, 4C00h
        INT 21h
CODE    ENDS
        END START

Program Development

After hand coding your program, enter it in your machine, using any editor available to you. Then assemble your program. Let us work with the MASM assembler, as that is the most common assembler available on PCs. Assemble your program using:

MASM FILE1;
If you want a listing file, with tables suppressed, you can use the options: MASM /L/N FILE1;


MASM displays a copyright message and begins to read the source program. At the end of the assembly, MASM displays statistics on the amount of available free space and the number of errors and warnings.

MASM /L/N FILE1;
Microsoft (R) Macro Assembler Version 5.10
Copyright (C) Microsoft Corp 1981-1985, 1987. All rights reserved.

  50592 + 271152 Bytes symbol space free

      0 Warning Errors
      0 Severe Errors


In this example, the program was assembled successfully and no error messages were displayed. If you look at your directory for FILE1.*, you will find that two more files, FILE1.OBJ and FILE1.LST, have been created.

View the .LST file. It looks something like this:

Microsoft (R) Macro Assembler Version 5.10                  1/16/94 10:23:18
                                                            Page     1-1

                                page ,132
                        ; A04-05B.ASM: 8086 program
                        ;ABSTRACT   : This program adds 2 8-bit numbers in the memory locations
                        ;           : called NUM1 and NUM2. The result is stored in the
                        ;           : memory location called RESULT. If there was a carry
                        ;           : from the addition it will be stored as 0000 0001 in
                        ;           : the location CARRY
                        ;ALGORITHM:
                        ;       get NUM1
                        ;       add NUM2
                        ;       put sum into memory at SUM
                        ;       position carry in LSB of byte
                        ;       mask off upper seven bits
                        ;       store the result in the CARRY location.
                        ;PORTS      : None used
                        ;PROCEDURES : None used
                        ;REGISTERS  : Uses CS, DS, AX

 0000                   DATA    SEGMENT
 0000  15               NUM1    DB  15h                 ; First number stored here
 0001  20               NUM2    DB  20h                 ; Second number stored here
 0002  00               RESULT  DB  ?                   ; Put sum here
 0003  00               CARRY   DB  ?                   ; Put any carry here
 0004                   DATA    ENDS
 0000                   CODE    SEGMENT
                                ASSUME CS:CODE, DS:DATA
 0000  B8 ---- R        START:  MOV AX, DATA            ; Initialize data segment
 0003  8E D8                    MOV DS, AX              ; register
 0005  A0 0000 R                MOV AL, NUM1            ; Get the first number
 0008  02 06 0001 R             ADD AL, NUM2            ; Add it to 2nd number
 000C  A2 0002 R                MOV RESULT, AL          ; Store the result
 000F  D0 D0                    RCL AL, 01              ; Rotate carry into LSB
 0011  24 01                    AND AL, 00000001B       ; Mask out all but LSB
 0013  A2 0003 R                MOV CARRY, AL           ; Store the carry result
 0016                   CODE    ENDS
                                END START

Microsoft (R) Macro Assembler Version 5.10                  1/16/94 10:23:18
                                                            Symbols-1

Segments and Groups:

        N a m e                 Length  Align   Combine

CODE . . . . . . . . . . . . .  0016    PARA    NONE
DATA . . . . . . . . . . . . .  0004    PARA    NONE

        N a m e                 Type    Value   Attr

CARRY  . . . . . . . . . . . .  L BYTE  0003    DATA
NUM1 . . . . . . . . . . . . .  L BYTE  0000    DATA
NUM2 . . . . . . . . . . . . .  L BYTE  0001    DATA
RESULT . . . . . . . . . . . .  L BYTE  0002    DATA
START  . . . . . . . . . . . .  L NEAR  0000    CODE

@CPU . . . . . . . . . . . . .  TEXT    0101h
@FILENAME  . . . . . . . . . .  TEXT
@VERSION . . . . . . . . . . .  TEXT

     37 Source  Lines
     37 Total   Lines
     12 Symbols

  47318 + 353189 Bytes symbol space free

      0 Warning Errors
      0 Severe Errors

The list file shows the machine code generated for the program file, along with the length of the various segments defined and the length, type and attributes of the variables defined. This is useful for a complete understanding of assembly language programs.

After the program has been assembled successfully, we need to LINK the program. In the LINK step, the linker program (LINK.EXE) reads the object file (.OBJ), creates an executable file, and optionally creates a map (.MAP) file.

LINK /M FILE1;
Microsoft (R) Overlay Linker Version 3.60
Copyright (C) Microsoft Corp 1983-1987. All rights reserved.

The map file lists the names of all the segments in the program; this becomes important only when writing larger assembly programs. The files created after linking are the executable file (.EXE) and, optionally, a map file (.MAP).

Sample listing of the .MAP file

LINK : warning L4021: no stack segment

 Start  Stop   Length Name                   Class
 00000H 00003H 00004H DATA
 00010H 00025H 00016H CODE

  Address         Publics by Name

  Address         Publics by Value

Program entry point at 0001:0000

The entry point 0001:0000 is segment 0001h, offset 0000h, which is byte 00010h of the load module: the start of the CODE segment shown in the map.

The last step is to run the program. The program may be run by typing its name. Optionally, a debugger may also be used.
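The warning L4021 in the map sample arises because the source declares only a DATA and a CODE segment. One way to remove it, sketched here under the assumption that a small stack is enough, is to add a segment with the STACK combine type. The name STACK_SEG and the size of 64 words are hypothetical choices, not part of the original program, and the code body is reduced to the DOS exit call to keep the sketch short:

```asm
; Hypothetical example: a program that declares its own stack segment.
STACK_SEG SEGMENT STACK         ; STACK combine type marks this as the program stack
        DW  64 DUP (?)          ; reserve 64 uninitialized words (128 bytes)
STACK_SEG ENDS

CODE    SEGMENT
        ASSUME CS:CODE, SS:STACK_SEG
START:  MOV AX, 4C00h           ; AH = 4Ch (terminate), AL = 00h (return code)
        INT 21h                 ; return to DOS
CODE    ENDS
        END START
```

With the STACK combine type, the linker records the initial SS:SP values in the EXE header, so the program does not need to load SS itself.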

Check Your Progress 3

State true or false:

1. COM program is loaded at the 0th location in the memory. True / False

2. The size of COM program should not exceed 64K. True / False

3. A COM program is longer than an EXE program. True / False

4. STACK of a COM program is kept at the end of the occupied segment by the program. True / False

5. EXE program contains a header module which is used by DOS for calculating segment addresses. True / False

6. EXE programs cannot be easily debugged in comparison to COM programs. True / False

7. EXE programs are more easily relocatable than COM programs. True / False

Activity: Assemble and run the program given in this section.
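Since the program prints nothing, its effect has to be checked by inspecting RESULT and CARRY in memory, for example with a debugger. With the values in the listing, 15h + 20h = 35h with no carry, so RESULT should hold 35h and CARRY 00h. The sketch below is a hand trace of a variant that does produce a carry. The value 0F0h for NUM1 is a hypothetical choice made for illustration, and the register values in the comments are worked out by hand rather than captured from a run:

```asm
; Variant of the example with a hypothetical NUM1 that forces a carry.
DATA    SEGMENT
NUM1    DB  0F0h                ; hypothetical value (the original uses 15h)
NUM2    DB  20h
RESULT  DB  ?
CARRY   DB  ?
DATA    ENDS

CODE    SEGMENT
        ASSUME CS:CODE, DS:DATA
START:  MOV AX, DATA
        MOV DS, AX
        MOV AL, NUM1            ; AL = F0h
        ADD AL, NUM2            ; F0h + 20h = 110h, so AL = 10h and CF = 1
        MOV RESULT, AL          ; RESULT = 10h; MOV leaves CF unchanged
        RCL AL, 01              ; AL = 21h: 10h shifted left, old CF in bit 0
        AND AL, 00000001B       ; AL = 01h: only the carry bit survives
        MOV CARRY, AL           ; CARRY = 01h
        MOV AX, 4C00h           ; AH = 4Ch (terminate), AL = 00h (return code)
        INT 21h
CODE    ENDS
        END START
```

The trace relies on MOV not altering the flags: the carry produced by ADD is still in CF when RCL executes, even though RESULT is stored in between.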

2.8 SUMMARY

We can summarize the complete discussion in the following flow chart.

Figure 2: Assembly language program development and execution


2.9 MODEL ANSWERS

Check Your Progress 1

1. (a) It helps in better understanding of computer architecture and operating systems.
   (b) Programs which have close interaction with computer hardware can be written efficiently in it.
   (c) Flexibility of use, as very few restrictions exist.
   (d) Smaller machine-level code, resulting in efficient execution of programs.

2. A segment identifies a group of instructions or data values. A segment name should have the following characteristics:

   * It should be unique
   * No blanks are allowed
   * Maximum length can be up to 31 characters
   * Reserved words should not be used as segment names or labels

3. (a) False (b) False (c) True (d) True (e) True (f) False (g) True (h) True

Check Your Progress 2

1. False 2. False 3. True 4. False 5. True 6. True 7. True 8. True 9. True 10. False 11. False 12. False

Check Your Progress 3

1. False 2. True 3. False 4. True 5. True 6. False 7. True