
ASSEMBLY LANGUAGE FOR X86 PROCESSORS

Introduction
Assembly Language for x86 Processors focuses on programming microprocessors compatible
with Intel (Integrated Electronics) and AMD (Advanced Micro Devices, Inc.) processors
running under 32-bit and 64-bit versions of Microsoft Windows.

MASM (Microsoft Macro Assembler), an industrial-strength assembler used by practicing
professionals, is well suited to learning assembly language. On Linux-based
systems, you can use NASM, whose syntax is the most similar to that of MASM.
Assembly language
Assembly language is the oldest programming language, and of all languages, bears the closest
resemblance to native machine language. It provides direct access to computer hardware,
requiring you to understand much about your computer’s architecture and operating system.
An assembler is a utility program that converts source code from assembly language
into machine language. A linker is a utility program that then combines the individual files created
by the assembler into a single executable program.

Assembly language consists of statements written with short mnemonics such as ADD, MOV,
SUB, and CALL. Assembly language has a one-to-one relationship with machine language:
Each assembly language instruction corresponds to a single machine-language instruction.
Machine language is a numeric language specifically understood by a computer’s processor
(the CPU). All x86 processors understand a common machine language.
High-level languages such as C++ and Java have a one-to-many relationship with assembly
language and machine language. A single statement in C++, for example, expands into multiple
assembly language or machine instructions. For example, the following C++ code carries out
two arithmetic operations and assigns the result to a variable.
Assume X and Y are integers:
int X;
int Y = (X + 4) * 3;

The following is the equivalent translation to assembly language. The translation requires multiple
statements because each assembly language statement corresponds to a single machine
instruction:
MOV EAX, X ; move X to the EAX register
ADD EAX, 4 ; add 4 to the EAX register
MOV EBX, 3 ; move 3 to the EBX register
IMUL EBX ; multiply EAX by EBX
MOV Y, EAX ; move EAX to Y
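
To make the one-to-many relationship concrete, the five instructions above can be traced one register update at a time. The following is only a rough Python sketch of that trace, not MASM code: the function name translate_steps and the 32-bit mask are assumptions introduced for the illustration.

```python
MASK32 = 0xFFFFFFFF  # registers hold 32 bits


def translate_steps(x):
    """Follow the five instructions, one register update per line."""
    eax = x & MASK32            # MOV EAX, X
    eax = (eax + 4) & MASK32    # ADD EAX, 4
    ebx = 3                     # MOV EBX, 3
    eax = (eax * ebx) & MASK32  # IMUL EBX (the low 32 bits of the product stay in EAX)
    y = eax                     # MOV Y, EAX
    return y


print(translate_steps(5))  # same value as (5 + 4) * 3
```

Each Python line corresponds to exactly one instruction, which is the point of the comparison: the single C++ statement needs five machine-level steps.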

Assembly language is not portable, because it is designed for a specific processor family. There
are a number of different assembly languages widely used today, each based on a processor
family.
The C and C++ languages have the unique quality of offering a compromise between high-level
structure and low-level details. Direct hardware access is possible but completely non-portable.
Most C and C++ compilers allow you to embed assembly language statements in their code,
providing access to hardware details.
The table below shows the comparison between high-level and assembly language.

INSTRUCTIONS OF 8086
To program an 8086-based system, you need to know its instruction set. The 8086 instruction set
has 117 basic instructions, which can be classified as
follows:
• Data transfer
• Arithmetic
• Bit Manipulation/Logical
• Program Transfer
• Process Control
• String Instructions

The instructions are meant for various operations in corresponding groups.

Instructions can have zero, one, two, or three operands (label and comment fields can also be
included). The structures of the instructions are:
• mnemonic
  e.g. STC ; set the Carry flag
       CLD ; clear the Direction flag
• mnemonic [destination]
  e.g. INC AX ; AX ← AX + 1
       POP BX ; pop the top of the stack into BX
• mnemonic [destination],[source]
  e.g. MOV AX, 100 ; AX ← 100
       ADD AX, BX ; AX ← AX + BX
• mnemonic [destination],[source-1],[source-2]
  e.g. IMUL BX, CX, 10 ; BX ← CX * 10
There are three basic types of operands:
• Immediate: uses a numeric literal expression
• Register: uses a named register in the CPU
• Memory: references a memory location
Table 1 describes the standard operand types. It uses a simple notation for operands (in 32-bit
mode).

Table 1: Instruction Operand Notation, 32-Bit Mode

Data transfer instructions


The instructions perform data transfer of the following types:
• General purpose byte or word
• Special Address
• Flag instructions
• Simple input/output

MOV instruction
The MOV instruction copies data from a source operand to a destination operand. Its basic format shows
that the first operand is the destination and the second operand is the source:
MOV destination, source
The destination operand’s contents change, but the source operand is unchanged.
In nearly all assembly language instructions, the left-hand operand is the destination and the right-hand
operand is the source. MOV is very flexible in its use of operands, as long as the following rules are
observed:
• Both operands must be the same size.
• Both operands cannot be memory operands.
• The instruction pointer register cannot be a destination operand.
MOV does not affect the flags.
Here is a list of the standard MOV instruction formats:
• MOV reg, reg e.g. MOV DS, CX ; copy word from CX register into data segment register
• MOV mem, reg e.g. MOV [40H], AX ; copy word from AX register into memory
; locations 64 and 65
• MOV reg, mem e.g. MOV AL, TOTAL ; copy byte stored at memory location labeled
; TOTAL into the AL register
• MOV mem, imm e.g. MOV WORD PTR [0F4H], 0F00H ; copy 3840 into memory locations 244 and
; 245
• MOV reg, imm e.g. MOV AL, 32 ; copy 32 into the AL register
NB. A single MOV instruction cannot be used to move data directly from one memory location to
another. Instead, the source operand’s value must be moved to a register before assigning its value to a
memory operand.

The LAHF and the SAHF Instructions


The LAHF (load status flags into AH) instruction copies the lower byte of the FLAGS register into AH.
The following flags are copied: Sign, Zero, Auxiliary Carry, Parity, and Carry. Using this instruction, one
can easily save a copy of the flags in a variable for safekeeping:
LAHF ; load flags into AH
MOV saveflags, AH ; save them in a variable

The SAHF (store AH into status flags) instruction copies AH into the low byte of the
FLAGS register. For example, you can retrieve the values of flags saved earlier in a variable:
MOV AH, saveflags ; load saved flags into AH
SAHF ; copy into Flags register
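
The save-and-restore pattern above can be pictured as plain bit manipulation on the low byte of FLAGS. The following Python sketch is a simplified model under stated assumptions: the helper names lahf and sahf are hypothetical, FLAGS is treated as a 16-bit integer, and the bit positions used are the standard x86 ones (Carry = bit 0, Parity = bit 2, Auxiliary Carry = bit 4, Zero = bit 6, Sign = bit 7).

```python
# Bit positions of the status flags in the low byte of FLAGS (and so in AH after LAHF).
SF, ZF, AF, PF, CF = 7, 6, 4, 2, 0


def lahf(flags):
    """Return the AH value LAHF would produce for the given 16-bit FLAGS value."""
    return flags & 0xFF


def sahf(flags, ah):
    """Return FLAGS after SAHF: only SF, ZF, AF, PF, and CF are taken from AH."""
    mask = (1 << SF) | (1 << ZF) | (1 << AF) | (1 << PF) | (1 << CF)  # 0xD5
    return (flags & ~mask & 0xFFFF) | (ah & mask)


flags = (1 << ZF) | (1 << CF) | (1 << 11)   # ZF, CF, and OF (bit 11) set
saveflags = lahf(flags)                      # LAHF / MOV saveflags, AH
restored = sahf(0, saveflags)                # MOV AH, saveflags / SAHF
print(f"{saveflags:08b} {restored:08b}")
```

Note that the Overflow flag lives in the upper byte of FLAGS, so in this model it is not carried through the LAHF/SAHF round trip.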
The PUSHF and the POPF Instructions
PUSHF pushes the flag register onto the stack. The instruction decrements the stack pointer by two
and copies the word in the flag register to the memory locations pointed to by the stack pointer.
POPF pops a word from the top of the stack into the flag register. The instruction copies a word from the
two memory locations at the top of the stack to the flag register and increments the stack pointer by
two.
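
The stack-pointer arithmetic just described can be sketched in Python. This is an illustrative model only: the function names pushf and popf, the bytearray standing in for the stack segment, and the sample values are assumptions, and segment registers are ignored.

```python
def pushf(memory, sp, flags):
    """Decrement SP by two, then store FLAGS at the new top of stack (low byte first)."""
    sp = (sp - 2) & 0xFFFF
    memory[sp] = flags & 0xFF
    memory[(sp + 1) & 0xFFFF] = (flags >> 8) & 0xFF
    return sp


def popf(memory, sp):
    """Read the word at the top of the stack, then increment SP by two."""
    flags = memory[sp] | (memory[(sp + 1) & 0xFFFF] << 8)
    return flags, (sp + 2) & 0xFFFF


stack = bytearray(0x10000)            # hypothetical 64 KB stack segment
sp = pushf(stack, 0x0100, 0x0246)
flags, sp_after = popf(stack, sp)
print(hex(sp), hex(flags), hex(sp_after))
```

After the push and the matching pop, the stack pointer is back at its starting value and the popped word equals the pushed one.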

The XCHG Instruction


The XCHG (exchange data) instruction exchanges the contents of two operands. There are three variants:
XCHG reg, reg e.g. XCHG BX, CX ; exchange word in BX with word in CX
XCHG reg, mem e.g. XCHG AL, SUM[BX] ; exchange byte in AL with byte
; in memory at SUM + [BX]
XCHG mem, reg e.g. XCHG [40H], BX ; exchange word in BX with bytes
; in memory locations 40H and 41H
The rules for operands in the XCHG instruction are the same as those for the MOV instruction, except
that XCHG does not accept immediate operands. In array sorting applications, XCHG provides a simple
way to exchange two array elements.
To exchange two memory operands, use a register as a temporary container and combine MOV with
XCHG:
MOV AX, VAL1 ; copy contents of variable 1 into
; register AX
XCHG AX, VAL2 ; exchange contents of variable 2 with
; those of AX register
MOV VAL1, AX ; save contents of AX register into
; variable 1
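
The three-step exchange can be followed in a small Python model. The helper xchg and the starting values are assumptions chosen for the illustration; each line mirrors one instruction of the sequence above.

```python
def xchg(a, b):
    """XCHG swaps its two operands; the model returns them in swapped order."""
    return b, a


val1, val2 = 0x1111, 0x2222   # hypothetical 16-bit variables
ax = val1                      # MOV AX, VAL1
ax, val2 = xchg(ax, val2)      # XCHG AX, VAL2
val1 = ax                      # MOV VAL1, AX
print(hex(val1), hex(val2))
```

The register AX plays the role of the temporary container, so no instruction ever has two memory operands.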
Simple Input and Output Port Transfer instructions
The IN instruction copies data from a port to the accumulator. If an 8-bit port is read, the data goes
to AL; if a 16-bit port is read, the data goes to AX. The port is addressed in either direct (8-bit port address) or indirect
(16-bit port address) mode. In direct mode the port address is part of the instruction; in indirect mode, the
port address is held in the DX register.
Direct mode can address only 256 ports (00H to 0FFH); indirect mode can address all 65,536 ports (0000H to 0FFFFH).
Example
Direct: IN AL, 0F8H
IN AX, 9FH
Indirect: MOV DX, 30F8H
IN AL, DX

The OUT instruction copies data from the accumulator to a specified port. A byte is taken from
the AL register and a 16-bit word from the AX register. As with IN, the port is addressed in either direct (8-bit port address) or indirect
(16-bit port address) mode.
Example
Direct: OUT 0F8H, AL
OUT 9FH, AX
Indirect: MOV DX, 30F8H
OUT DX, AL
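
The choice between direct and indirect port addressing depends only on whether the port number fits in 8 bits. The Python sketch below illustrates that decision by producing the instruction text for a byte read; the function name in_byte is hypothetical, and the generated strings are meant as illustrations rather than verified assembler input.

```python
def in_byte(port):
    """Return the instruction sequence that reads one byte from `port` into AL."""
    if not 0 <= port <= 0xFFFF:
        raise ValueError("8086 port addresses are 16 bits wide")
    if port <= 0xFF:
        return [f"IN AL, 0{port:02X}H"]             # direct: address encoded in the instruction
    return [f"MOV DX, 0{port:04X}H", "IN AL, DX"]   # indirect: address taken from DX


print(in_byte(0xF8))
print(in_byte(0x30F8))
```

The same decision applies to OUT, with the operands in the opposite order.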

Special Address Transfer Instructions

There are three instructions in this category: LEA, LDS, and LES.
The LEA instruction determines the offset of the variable or memory location named as the source
and loads this offset into the specified 16-bit register. The flags are not affected.
Example:
LEA CX, TOTAL ; load CX with offset of TOTAL in DS
LEA AX, [BX][DI] ; load AX with EA = BX + DI
The LDS instruction copies a word from two memory locations into the register specified in the instruction,
then copies a word from the next two memory locations into the DS register.

Example:
LDS CX, [391AH] ; copy contents of memory at displacements 391AH and 391BH to
; CX, then copy contents at displacements 391CH and 391DH into DS
The LES instruction copies a word from two memory locations into the register specified in the instruction,
then copies a word from the next two memory locations into the ES register.

Example:
LES CX, [3483H] ; copy contents of memory at displacements 3483H and 3484H to
; CX, then copy contents at displacements 3485H and 3486H into ES
Later x86 processors (80386 onward) provide similar instructions for other segment registers: LSS, LFS, and LGS.
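
What LDS and LES read from memory is a four-byte far pointer: the offset word first, then the segment word, each stored low byte first. The Python sketch below models that layout; the function name load_far_pointer and the sample pointer B800:1234 are assumptions made for the illustration.

```python
def load_far_pointer(memory, addr):
    """Model LDS/LES: return (offset, segment) read from four consecutive bytes."""
    offset = memory[addr] | (memory[addr + 1] << 8)         # first word -> general register
    segment = memory[addr + 2] | (memory[addr + 3] << 8)    # next word -> DS (LDS) or ES (LES)
    return offset, segment


mem = bytearray(0x10000)
mem[0x391A:0x391E] = bytes([0x34, 0x12, 0x00, 0xB8])   # hypothetical far pointer B800:1234
cx, ds = load_far_pointer(mem, 0x391A)                  # LDS CX, [391AH]
print(hex(cx), hex(ds))
```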

Arithmetic Instructions
INC and DEC instructions
The INC (increment) and DEC (decrement) instructions, respectively, add 1 and subtract 1 from a register
or memory operand. The syntax is
INC reg/mem
DEC reg/mem

Following are some examples:

.DATA
MYWORD WORD 1000H
.CODE
INC MYWORD ; MYWORD = 1001H
MOV BX, MYWORD
DEC BX ; BX = 1000H
The Overflow, Sign, Zero, Auxiliary Carry, and Parity flags are changed according to the value of the
destination operand. The INC and DEC instructions do not affect the Carry flag.

The ADD Instruction
The ADD instruction adds a source operand to a destination operand of the same size. ADC (add with carry) does the same and also adds the Carry flag. The syntax
is:
ADD dest, source
ADC dest, source
In ADC, the contents of the source and the Carry flag are added to the contents of the destination.
The source is unchanged by the operation, and the sum is stored in the destination operand. The set of
possible operands is the same as for the MOV instruction. Here is a short code example that adds two
32-bit integers:
.DATA
VAR1 DWORD 10000H
VAR2 DWORD 20000H
.CODE
MOV EAX, VAR1 ; EAX = 10000H
ADD EAX, VAR2 ; EAX = 30000H
The Carry, Zero, Sign, Overflow, Auxiliary Carry, and Parity flags are changed according to the value
that is placed in the destination operand.
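
A typical reason to pair ADD with ADC is to add values wider than one register: ADD handles the low part and ADC folds the resulting carry into the high part. The Python sketch below models this for a 64-bit sum built from 32-bit halves; the function names add32 and add64 are hypothetical, and the inputs are assumed to fit in 64 bits.

```python
MASK32 = 0xFFFFFFFF


def add32(dest, src, carry_in=0):
    """Model ADD (carry_in=0) or ADC (carry_in=CF): return (32-bit sum, carry out)."""
    total = dest + src + carry_in
    return total & MASK32, total >> 32


def add64(a, b):
    """Add two 64-bit values 32 bits at a time, as an ADD / ADC pair would."""
    low, cf = add32(a & MASK32, b & MASK32)         # ADD on the low doublewords
    high, cf = add32(a >> 32, b >> 32, cf)          # ADC on the high doublewords
    return (high << 32) | low, cf


print(add32(0x10000, 0x20000))
print(hex(add64(0x00000001FFFFFFFF, 0x0000000000000001)[0]))
```

The first call reproduces the VAR1 + VAR2 example above; the second shows a carry out of the low doubleword being picked up by the high one.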

The SUB instruction


The SUB instruction subtracts a source operand from a destination operand. The set of possible operands
is the same as for the ADD and MOV instructions.
The syntax is:
SUB dest, source
Here is a short code example that subtracts two 32-bit integers:
.DATA
VAR1 DWORD 30000H
VAR2 DWORD 10000H
.CODE
MOV EAX, VAR1 ; EAX = 30000H
SUB EAX, VAR2 ; EAX = 20000H
The Carry, Zero, Sign, Overflow, Auxiliary Carry, and Parity flags are changed according to the value
that is placed in the destination operand.
The NEG (negate) instruction
The NEG (negate) instruction reverses the sign of a number by converting the number to its two’s
complement. The following operands are permitted:
NEG reg
NEG mem

The Carry, Zero, Sign, Overflow, Auxiliary Carry, and Parity flags are changed according to the value
that is placed in the destination operand.
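
Two's complement negation amounts to inverting every bit and adding 1. The Python sketch below models NEG for an operand of a given width; the helper names neg and as_signed are assumptions, and the flag updates are left out.

```python
def neg(value, bits=8):
    """Two's complement negation: invert every bit, add 1, keep `bits` bits."""
    mask = (1 << bits) - 1
    return ((value ^ mask) + 1) & mask


def as_signed(value, bits=8):
    """Interpret a `bits`-wide pattern as a signed integer."""
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


print(hex(neg(0x05)), as_signed(neg(0x05)))
print(hex(neg(0x80)))   # -128 has no positive counterpart in 8 bits
```

The second call shows the one case where negation cannot change the sign: the most negative value maps back onto itself.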
Flags affected by Addition and Subtraction
When executing arithmetic instructions, we often want to know something about the result. Is it negative,
positive, or zero? Is it too large or too small to fit into the destination operand? Answers to such questions
can help us detect calculation errors that might otherwise cause erratic program behavior. We use the
values of CPU status flags to check the outcome of arithmetic operations.
We also use status flag values to activate conditional branching instructions, the basic tools of program
logic. Here’s a quick overview of the status flags.
• The Carry flag indicates unsigned integer overflow. For example, if an instruction
has an 8-bit destination operand but the instruction generates a result larger than
11111111 binary, the Carry flag is set.
• The Overflow flag indicates signed integer overflow. For example, if an instruction
has a 16-bit destination operand but it generates a negative result smaller than
-32,768 decimal, the Overflow flag is set.
• The Zero flag indicates that an operation produced zero. For example, if an operand is
subtracted from another of equal value, the Zero flag is set.
• The Sign flag indicates that an operation produced a negative result. If the most
significant bit (MSB) of the destination operand is set, the Sign flag is set.
• The Parity flag indicates whether or not an even number of 1 bits occurs in the least
significant byte of the destination operand, immediately after an arithmetic or
Boolean instruction has executed.
• The Auxiliary Carry flag is set when a 1 bit carries out of position 3 in the least
significant byte of the destination operand.

The Zero flag is set when the result of an arithmetic operation equals zero. The following examples show
the state of the destination register and Zero flag after executing the SUB, INC, and DEC instructions:
MOV ECX, 1
SUB ECX, 1 ; ECX = 0, ZF = 1
MOV EAX, 0FFFFFFFFH
INC EAX ; EAX = 0, ZF = 1
INC EAX ; EAX = 1, ZF = 0
DEC EAX ; EAX = 0, ZF = 1

When adding two unsigned integers, the Carry flag is a copy of the carry out of the most significant bit of
the destination operand.

A subtract operation sets the Carry flag when a larger unsigned integer is subtracted from a smaller one.
In that case the true result is negative and cannot be represented as an unsigned integer.

The Sign flag is set when the result of a signed arithmetic operation is negative.
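
The flag rules described in this section can be collected into a small model of 8-bit addition and subtraction. The Python sketch below is a simplified illustration, not a full emulation: the function names add8 and sub8 and the dictionary layout are assumptions, and sub8 reports only three of the flags.

```python
def add8(a, b):
    """Add two 8-bit values; return (result, flags) with the six status flags."""
    total = a + b
    result = total & 0xFF
    flags = {
        "CF": int(total > 0xFF),                               # carry out of bit 7
        "OF": int(((a ^ result) & (b ^ result) & 0x80) != 0),  # operands agree in sign, result differs
        "ZF": int(result == 0),
        "SF": result >> 7,                                     # copy of the MSB
        "PF": int(bin(result).count("1") % 2 == 0),            # even number of 1 bits
        "AF": int((a & 0xF) + (b & 0xF) > 0xF),                # carry out of bit 3
    }
    return result, flags


def sub8(a, b):
    """Subtract b from a; CF is set when an unsigned borrow is needed (a < b)."""
    result = (a - b) & 0xFF
    return result, {"CF": int(a < b), "ZF": int(result == 0), "SF": result >> 7}


print(add8(0xFF, 0x01))
print(add8(0x7F, 0x01)[1]["OF"], sub8(1, 2))
```

Adding 1 to FFh shows unsigned overflow (Carry set, result zero), adding 1 to 7Fh shows signed overflow (Overflow set), and subtracting 2 from 1 shows the borrow case.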

Transfer of Control Instructions


JMP (Unconditional Branching)
JMP is the unconditional branch instruction. It transfers program execution, regardless of
any condition, to a new location indicated by the operand. The operand may be a direct branch location or an
indirect branch location.
The two basic types of unconditional jumps are the intrasegment jump and the intersegment jump. An
intrasegment jump is a jump whose target lies within the current code segment. It is
achieved by modifying only the value of IP. An intersegment jump is a jump from one code segment to
another. For this jump to take effect, both the CS and IP values must be modified.
Jump instructions with Short-label, Near-label, Memptr16, or Regptr16 operands
are intrasegment jumps, while Far-label and Memptr32 operands produce intersegment jumps. Short-label
specifies the new value of IP with an 8-bit signed displacement. It can reach only targets within -128 to
+127 bytes of the instruction that follows the jump. Near-label specifies the new value of IP with a 16-bit
displacement. It can cover the complete range of the current code segment.
Both Memptr16 and Regptr16 permit a jump to any location (address) in the current code segment.
In these cases the contents of a memory location or a register, respectively, indirectly specify the address of the jump.
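
The range limit of a short jump follows from how its displacement is computed. The Python sketch below is an illustrative model, assuming the usual two-byte encoding of a short JMP (one opcode byte plus one displacement byte) and a displacement measured from the instruction that follows the jump; the function name and the sample addresses are hypothetical.

```python
def short_jump_displacement(jmp_address, target_address):
    """Return the 8-bit displacement for a two-byte short JMP, or None if out of range."""
    next_ip = jmp_address + 2                  # IP already points past the jump
    displacement = target_address - next_ip
    if -128 <= displacement <= 127:
        return displacement & 0xFF             # stored as a two's complement byte
    return None                                # a near jump (16-bit displacement) is needed


print(short_jump_displacement(0x0100, 0x0110))
print(short_jump_displacement(0x0100, 0x00F0))
print(short_jump_displacement(0x0100, 0x0200))
```

A forward target yields a small positive byte, a backward target yields a byte with the top bit set, and a target that is too far away cannot be encoded as a short jump at all.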
JCC (Conditional Branching)
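Jcc stands for the family of conditional branch instructions (for example JZ, JNZ, and JC). A conditional jump transfers program execution
to the location indicated by the operand only if the specified condition is fulfilled; otherwise execution continues with the next instruction.
On the 8086 the operand is a short label, so the target must lie within -128 to +127 bytes; conditional jumps have no indirect form.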

The LOOP instruction
The LOOP instruction repeats a block of statements a specific number of times. The ECX register is
automatically used as a counter and is decremented each time the loop repeats. Its syntax is:
LOOP destination

The loop destination must be within -128 to +127 bytes of the current location counter. The execution of
the LOOP instruction involves two steps: First, it subtracts 1 from ECX register, then it compares the
contents of ECX register to zero. If register ECX does not contain zero, a jump is taken to the label
identified by destination. Otherwise, if register ECX contains zero, no jump takes place, and control
passes to the instruction following the loop.
In the following example, we add 1 to AX each time the loop repeats. When the loop ends,
AX = 5 and ECX = 0:
MOV AX, 0
MOV ECX, 5
L1:
INC AX
LOOP L1

A common programming error is to inadvertently initialize ECX to zero before beginning a loop. If this
happens, the LOOP instruction decrements ECX to FFFFFFFFh, and the loop repeats 4,294,967,296
times! If register CX is the loop counter (in real-address mode), it repeats 65,536 times.
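
The decrement-then-test behaviour, including the wrap-around when the counter starts at zero, can be modelled in Python. This is a sketch under stated assumptions: the function name loop_iterations is hypothetical, and the zero case is run only with a 16-bit counter (CX), because simulating 4,294,967,296 iterations would take far too long.

```python
def loop_iterations(counter, bits=32):
    """Return how many times the loop body runs for a given starting ECX (or CX)."""
    mask = (1 << bits) - 1
    iterations = 0
    while True:
        iterations += 1                      # body of the loop (e.g. INC AX)
        counter = (counter - 1) & mask       # LOOP: subtract 1 from the counter ...
        if counter == 0:                     # ... and fall through when it reaches zero
            return iterations


print(loop_iterations(5))
print(loop_iterations(0, bits=16))   # CX = 0 wraps to FFFFh first
```

Because the counter is decremented before it is tested, a starting value of zero does not skip the loop; it produces the maximum possible number of repetitions.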
