
80X87 Architecture and Instruction Sets
Intel arithmetic coprocessors
• 8087, 80287, 80387SX, 80387DX
• 80487SX for the 80486SX
• The 80486DX contains an internal FPU
• The Pentium and Pentium 4 contain built-in coprocessors
• They multiply, add, divide, subtract, and compute square roots,
transcendental functions and logarithms
– on 16, 32 and 64 bit integers
– on 32, 64 and 80 bit floating point numbers
– on 18 digit BCD data
FPU data types
• The 80x87 FPU supports seven different data types:
– three integer types,
– a packed decimal type, and
– three floating point types.
• Since the 80x86 CPUs already support integer data
types, there are few reasons why you would want to use
the 80x87 integer types.
• The packed decimal type provides an 18 digit signed
decimal (BCD) integer.
• The three floating point types are the 32 bit, 64 bit, and 80 bit
floating point data types we've looked at so far. The
80x87 data types appear in the following figures:
Integer data types / Assembler
directives DW, DD, DQ
Integer data types
• In the context of FPU operations, integers are whole numbers, i.e. numbers
which do not contain any fractional part.
• All integers used in FPU instructions are also considered as signed integers, the
most significant bit being 0 for positive values or 1 for negative values.

• Negative integer values are represented by the 2's complement of the
positive value: invert each bit of the number (which gives the 1's complement)
and then add 1.

• As a refresher, the following example shows the decimal value 6235 in a
16-bit WORD:

  0001 1000 0101 1011   185Bh   +6235d
  1110 0111 1010 0100   E7A4h   1's complement (every bit inverted)
                   +1
  -------------------
  1110 0111 1010 0101   E7A5h   -6235d
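The invert-then-add-1 rule above can be checked with a few lines of Python. This is only an illustrative sketch; the function name `twos_complement` and its `bits` parameter are assumptions made for this example and do not correspond to any FPU instruction.

```python
def twos_complement(value: int, bits: int = 16) -> int:
    """Return the bit pattern that encodes -value in a signed integer of the given width."""
    mask = (1 << bits) - 1
    ones_complement = ~value & mask       # invert every bit
    return (ones_complement + 1) & mask   # then add 1, staying within the width

pattern = twos_complement(6235)
print(f"{pattern:04X}h = {pattern:016b}b")
```

Applying the function twice returns the original value, which is a quick way to confirm that negation is its own inverse in this representation.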


Integer data types
• Within the integer data types, three sizes of
integers may be used:
• the 16-bit WORD,
• the 32-bit DWORD,
• and the 64-bit QWORD, (the 8-bit byte cannot
be used with FPU instructions).
• The available range of values for each of those
sizes is as follows:
– WORD range ±(2^15 - 1) or ±32767
– DWORD range ±(2^31 - 1) or ±2147483647
– QWORD range ±(2^63 - 1) or ±9223372036854775807
Floating point numbers
• The floating point data types are simply binary numbers represented in a
manner similar to the scientific notation used for decimal values.
• For example:
• 211 = 2.11 x 10^2 (2.11E+0002)
• (The latter is the conventional syntax for decimal values in scientific notation
when superscripts are not allowed in a text. For instance, most
assemblers/compilers would not recognize superscripts.)

• If the above is divided by a power of 10 such as 100000, the only thing
which would change in the scientific notation would be the exponent:
• 211 ÷ 100000 = 0.00211 = 2.11 x 10^-3 (2.11E-0003)
• In binary, the 211 value could be expressed as:
• 11010011b = 1.1010011b x 2^7
• In this case, if the above is divided by a power of 2 (such as 8), again the
only thing which would change in the "binary scientific notation" would be
the exponent:
• 11010011b ÷ 2^3 = 1.1010011b x 2^4
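The "binary scientific notation" above can be reproduced with Python's standard `math.frexp`. The sketch below is an illustration only; `binary_scientific` is a helper name invented for this example, and it rescales the result of `frexp` so that the significand lies between 1 and 2 as in the slides.

```python
import math

def binary_scientific(value: float):
    """Split a positive value into (significand, exponent) with 1 <= significand < 2."""
    mantissa, exponent = math.frexp(value)   # value == mantissa * 2**exponent, 0.5 <= mantissa < 1
    return mantissa * 2, exponent - 1        # shift one binary place to get the 1.xxx form

significand, exponent = binary_scientific(211)
print(significand, exponent)
```

Dividing by 8 before calling the helper leaves the significand unchanged and lowers the exponent by 3, which is the point the slide makes.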
Floating point numbers
• As can be deduced, this allows for the representation of binary fractions, and of very large or very small
values.

• The formatting of this "binary scientific notation" was standardized for the original CPUs and is usually
called the IEEE (Institute of Electrical and Electronics Engineers) real number format.
• This real number format consists basically in dividing a binary numerical data into three fields:
– a sign field,
– an exponent field,
– and a number description (significand) field.

• The exponent field is biased to the middle of the available range such that negative exponents are
effectively smaller than positive exponents.
• And, as opposed to the negative integer system of 2's complements, the significand field is always that
of the positive number, negative numbers being distinguished strictly by the sign field.

• Within the floating point data types, three sizes of real numbers are available:
– the 32-bit REAL4 (also called short real or single precision),
– the 64-bit REAL8 (also called long real or double precision),
– the 80-bit REAL10 (also called temporary real or extended precision).

FP data types / Real 4, Real 8,
Real 10

80x87 data types appear in the following figures:


Real 4 numbers
• For REAL4 numbers, the 8-bit exponent field is biased by 7Fh (the low 7 bits all set).
– This means that if the true exponent is 0, the value of the exponent field would be 7Fh.
– When the exponent is negative (i.e. for absolute values lower than 1), the value in the exponent field would be lower
than 7Fh; for absolute values of 2 and higher it would be greater than 7Fh.

• The maximum value of FFh in the exponent field is reserved for a special category of numbers designated
as NAN (Not-A-Number).

• This category includes the special value of INFINITY and will be described later in more detail.

• The value of 0 in the exponent field is also reserved for a special category of numbers.

• When all bits in the significand field are also 0, the value of the REAL number would be equal to 0.

• If any of the bits in the significand field are set, the value is then called a "denormalized" REAL number.

• This will also be described later in more detail.

• Because a valid (normalized) number in real format must always start with a 1, that first bit is implied in the REAL4
format and the significand field only contains the fraction bits f1, f2, etc.

• A value of +1.0 would thus be represented in REAL4 format as:

• 0 01111111 00000000000000000000000b (or 3F800000h in hex notation)


Real 4 numbers
• The value of +2.0 (1.0 x 2^1) would be:
• 0 10000000 00000000000000000000000b (or 40000000h in hex notation), i.e. sign bit S, exponent field 7Fh+1, then the fraction bits
• And the value of -2.0 would be:
• 1 10000000 00000000000000000000000b (or C0000000h in hex notation)
• The result of dividing -211 by 8 would give -1.1010011b x 2^4 in binary scientific format and its REAL4 representation
would be:
• 1 10000011 10100110000000000000000b (or C1D30000h in hex notation), i.e. sign bit S, exponent field 7Fh+4, then the fraction bits

• As with all other numerical data, all REAL numbers are stored in memory with the least significant
bytes first.
• The value of +1.0 in REAL4 format would thus appear in consecutive bytes of memory as:
• 00 00 80 3F

• The largest number which can be represented properly within the REAL4 format is when the
exponent field contains FEh and the significand is almost equal to 2 (or almost 2^80h = 2^128d, or
approx. 3.40 x 10^38).
• The smallest normalized one would be when the exponent field contains 1 and the significand contains all 0s
(or 2^-7Eh = 2^-126d, or approx. 1.18 x 10^-38).

• The 24 bits describing the number (23 bits in the significand field + 1 implied bit) are approximately
equivalent to 7 decimal digits.
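The REAL4 examples above can be verified with Python's standard `struct` module, whose `"f"` format code packs a value as a 32-bit IEEE single. This is a small sketch for checking the slides; the helper names `real4_fields` and `real4_memory_bytes` are invented for this example.

```python
import struct

def real4_fields(value: float):
    """Return (sign, exponent field, fraction field) of value encoded as REAL4."""
    (bits,) = struct.unpack("<I", struct.pack("<f", value))
    return bits >> 31, (bits >> 23) & 0xFF, bits & 0x7FFFFF

def real4_memory_bytes(value: float) -> bytes:
    """Return the four bytes as they would sit in memory, least significant byte first."""
    return struct.pack("<f", value)

sign, exponent, fraction = real4_fields(-211 / 8)
print(sign, f"{exponent:02X}h", f"{fraction:023b}b")
print(real4_memory_bytes(1.0).hex(" "))
```

Packing with `">f"` instead gives the bytes most significant first, which matches the hex constants quoted on the slides (for example `C1D30000h`).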
Real 8 numbers
• For REAL8 numbers, the 11-bit exponent field is biased by 3FFh (the low 10 bits all set).
• The maximum value of 7FFh in the exponent field is reserved for NANs, and the value of
0 in that field has the same purpose as described for the REAL4 format.

• As with the REAL4 format, the first bit of the number is implied and the significand field
only contains the fraction bits f1, f2, etc.
• A value of +1.0 would thus be represented in REAL8 format as:
• 0 01111111111 0000000000000000000000000000000000000000000000000000b (or
3FF0000000000000h in hex notation).
• The largest number which can be represented properly within the REAL8 format is when
the exponent field contains 7FEh and the significand is almost equal to 2 (or almost 2^400h
= 2^1024d, or approx. 1.79 x 10^308).

• The smallest normalized one would be when the exponent field contains 1 and the significand
contains all 0s (or 2^-3FEh = 2^-1022d, or approx. 2.22 x 10^-308).

• The 53 bits describing the number (52 bits in the significand field + 1 implied bit) are
approximately equivalent to 15 decimal digits.
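On mainstream platforms a Python `float` is stored in this same 64-bit format, so the REAL8 figures above can be checked directly. The sketch assumes such a platform (that is the assumption to keep in mind); `real8_hex` is a helper name invented for this example.

```python
import struct
import sys

def real8_hex(value: float) -> str:
    """Return the REAL8 encoding of value as 16 hex digits, most significant first."""
    return struct.pack(">d", value).hex().upper()

print(real8_hex(1.0))
print(sys.float_info.mant_dig, "significand bits,", sys.float_info.dig, "decimal digits")
print(sys.float_info.min, sys.float_info.max)
```

`sys.float_info.min` is the smallest normalized value and `sys.float_info.max` the largest finite one, which correspond to the two limits described above.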

Real 10 numbers
• For REAL10 numbers, the 15-bit exponent field is biased by 3FFFh (the low 14 bits all set).

• The maximum value of 7FFFh in the exponent field is reserved for NANs, and the
value of 0 in that field has the same purpose as described for the REAL4 format.

• As opposed to the REAL4 and REAL8 formats, the first bit of the number is explicitly
included in the significand field and followed by the fraction bits f1, f2, etc.
• A value of +1.0 would thus be represented in REAL10 format as:
• 0 011111111111111 10000...........0b (or 3FFF8000000000000000h in hex notation).

• The largest number which can be represented properly within the REAL10 format is
when the exponent field contains 7FFEh and the significand is almost equal to 2 (or
almost 2^4000h = 2^16384d, or approx. 1.19 x 10^4932).
• The smallest normalized one would be when the exponent field contains 1 and the significand's
fraction bits contain all 0s (or 2^-3FFEh = 2^-16382d, or approx. 3.36 x 10^-4932).
• The 64 bits of the significand describing the number are approximately equivalent to 19
decimal digits.
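Python has no built-in 80-bit type, so the REAL10 layout has to be assembled by hand. The following is a minimal sketch under that assumption: `real10_hex` is a hypothetical helper that handles only finite, nonzero inputs and builds the sign, biased exponent and explicit-integer-bit significand described above.

```python
import math

def real10_hex(value: float) -> str:
    """Encode a finite, nonzero float as 20 hex digits in REAL10 layout."""
    sign = 1 if value < 0 else 0
    mantissa, exponent = math.frexp(abs(value))   # 0.5 <= mantissa < 1
    significand = int(mantissa * 2**64)           # the explicit leading 1 lands in bit 63
    exponent_field = (exponent - 1) + 0x3FFF      # true exponent plus the 3FFFh bias
    bits = (sign << 79) | (exponent_field << 64) | significand
    return f"{bits:020X}"

print(real10_hex(1.0))
print(real10_hex(-211 / 8))
```

Because the integer bit is stored explicitly, the significand of +1.0 is `8000000000000000h` rather than all zeros as in REAL4 and REAL8.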

NAN and Infinity
• NANs (Not-A-Number)
• Whenever all the bits are set to 1 in the exponent field of a real number format, the value is designated as a NAN.
• Two values in that category are generated by the FPU:
– INFINITY
– INDEFINITE.

INFINITY
• In addition to the exponent field bits being all set to 1, the value of INFINITY has the following special coding to
differentiate it from other NANs:
• All fraction bits of the significand field are 0 (the explicit 1 in bit 63 remains set for the REAL10 format). In
addition,
– when the sign bit is 0, that NAN is treated as +INFINITY
– when the sign bit is 1, that NAN is treated as -INFINITY

• Such values of INFINITY are generated by the FPU when
– attempting to divide a valid number by 0 (Zero divide exception detected)
– the result of a computation exceeds the maximum value allowable (Overflow exception detected)
– instructed to store a value larger than the upper limit of the destination format (Overflow exception detected).

• This INFINITY value can be used as an operand in FPU instructions. Depending on the instruction, the result can
vary and exceptions may or may not be detected.

Indefinite
• INDEFINITE
• In addition to the exponent field bits being all set to 1, the value of
INDEFINITE has the following special coding to differentiate it from
other NANs:
• The 1st fraction bit of the significand field (f1) is set to 1, all
other fraction bits being 0 (the explicit 1 in bit 63 remains set for the
REAL10 format), and the sign bit is set.

• Such a value of INDEFINITE is generated by the FPU whenever a
reasonable result is impossible for the given instruction. An Invalid
exception is detected in some cases. Examples are:
– using the value of INDEFINITE as an operand
– using an empty register as an operand
– subtracting two values of INFINITY
– extracting the square root of a negative number.
Other NAN
• Apart from the INFINITY and INDEFINITE values which can be generated
by the FPU, there is a very large number of other NANs with all the possible
permutations of fraction bits and sign bit being set to 1 when all the bits in
the exponent field are set to 1.
– For example, the short REAL4 format could have over 16 million of them (2^24 - 3
to be more exact).

• There are two general categories of other NANs, the QNANs (Quiet NAN)
and the SNANs (Signaling NAN).
– The difference between the two is that the first fraction bit is 1 for the QNAN (such
as for the special INDEFINITE NAN) and 0 for the SNAN (but with at least one
other fraction bit set to 1).

• Although NANs could be used as valid operands with some of the FPU
instructions, they are of no practical use for the average programmer.
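The coding rules for INFINITY, INDEFINITE, QNANs and SNANs can be summarized as a small classifier for REAL4 bit patterns. This is an illustrative sketch that follows the terminology of these slides; `classify_real4` and the returned labels are names chosen for this example only.

```python
def classify_real4(bits: int) -> str:
    """Classify a 32-bit REAL4 pattern using the categories described above."""
    sign = bits >> 31
    exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    if exponent != 0xFF:
        return "ordinary number"            # zero, denormalized and normalized values
    if fraction == 0:
        return "-INFINITY" if sign else "+INFINITY"
    if sign == 1 and fraction == 0x400000:  # only f1 set, sign bit set
        return "INDEFINITE"
    return "QNAN" if fraction & 0x400000 else "SNAN"

# 2 signs x 2^23 fraction patterns, minus the two INFINITY codes and INDEFINITE
OTHER_NAN_COUNT = 2 * 2**23 - 3

for pattern in (0x7F800000, 0xFFC00000, 0x7FC00000, 0x7F800001):
    print(f"{pattern:08X}h", classify_real4(pattern))
```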

BCD data type/ Assembler directive
DT
• The Packed BCD (Binary Coded Decimal) data type is considered by the FPU as a
signed integer and has the following 80-bit special packed decimal format.
• where:
S = sign bit in bit 79 (0=positive, 1=negative)
dn = 4-bit decimal digits d17 ... d0 in bits 71-0, d0 being the least significant
(bits 72-78 are not used and are ignored)

• For example, the decimal value 211 in this data type format would be:
• 00000000000000000211h in hex notation
• The decimal value of -65536 (-2^16) in this data type format would be:
• 80000000000000065536h in hex notation
• As with all other numerical data, the packed BCD format is stored in memory with the
least significant bytes first.
• The consecutive memory bytes (in hex notation) of the above number would thus be:
• 36 55 06 00 00 00 00 00 00 80
• As depicted, 18 decimal digits is the maximum which can be inserted in this format.
• The largest integer which could be represented in this format would thus be 18
consecutive 9s (or 10^18 - 1).
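The packed BCD layout is easy to reproduce because each decimal digit maps directly to one hex digit. The sketch below is illustrative only; `pack_bcd` is a hypothetical helper name, and it sets the sign byte to 80h or 00h with the unused bits left at 0.

```python
def pack_bcd(value: int) -> bytes:
    """Encode an integer of up to 18 decimal digits as 10 packed BCD bytes in memory order."""
    if abs(value) > 10**18 - 1:
        raise ValueError("packed BCD holds at most 18 decimal digits")
    digits = f"{abs(value):018d}"                    # 18 digits, most significant first
    sign_byte = "80" if value < 0 else "00"          # bit 79 is the sign
    return bytes.fromhex(sign_byte + digits)[::-1]   # reverse: least significant byte first

print(pack_bcd(-65536).hex(" "))
print(pack_bcd(211).hex(" "))
```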
Internal structure of 80X87
[Figure: block diagram of the 80x87. The Control Unit (CU) holds the control register, status register, instruction decoder, data buffer, operand queue, and the bus tracking and exception logic, with data, status and address connections. The Numeric Execution Unit (NEU) holds the exponent module, shifter, arithmetic module, temporary registers, the tag register, and the 80 bit wide register stack with entries (7) down to (0).]
80X87 Registers
• The 80x87 adds 13 registers to the 80386 and later processors:
– eight floating point data registers,
– control register,
– status register,
– a tag register,
– an instruction pointer, and
– a data pointer.
• The data registers are similar to the 80x86's general purpose register
set insofar as all floating point calculations take place in these registers.
• The control register contains bits that let you decide how the 80x87
handles certain degenerate cases, such as how inexact results are rounded
and to what precision computations are carried out.
• The status register is similar to the 80x86's flags register; it contains the
condition code bits and several other floating point flags that describe
the state of the 80x87 chip.
• The tag register contains several groups of bits that determine the state
of the value in each of the eight general purpose registers.
• The instruction and data pointer registers contain certain state
information about the last floating point instruction executed.
80X87 data Registers
• The 80x87 provides eight 80 bit data registers organized as a stack.

• This is a significant departure from the organization of the general
purpose registers on the 80x86 CPU, which form a standard general-purpose
register set.

• Intel refers to these registers as ST(0), ST(1), ..., ST(7). Most
assemblers will accept ST as an abbreviation for ST(0).

• The biggest difference between the FPU register set and the
80x86 register set is the stack organization.
80X87 data Registers
• On the 80x86 CPU, the ax register is always the ax register, no matter
what happens.

• On the 80x87, however, the register set is an eight element stack of 80
bit floating point values (see the figure).

• ST(0) refers to the item on the top of the stack, ST(1) refers to the next
item on the stack, and so on.

• Many floating point instructions push and pop items on the stack;
therefore, ST(1) will refer to the previous contents of ST(0) after you
push something onto the stack.
80X87 Control Register
requirement
• When Intel designed the 80x87 (and, essentially, the IEEE floating
point standard), there were no standards in floating point hardware.
• Different (mainframe and mini) computer manufacturers all had
different and incompatible floating point formats.
• Unfortunately, much application software had been written taking into
account the idiosyncrasies of these different floating point formats. Intel
wanted to design an FPU that could work with the majority of the
software out there. (Keep in mind that the IBM PC was three to four years
away when Intel began designing the 8087, so Intel couldn't rely on the
"mountain" of software available for the PC to make the chip popular.)
• Unfortunately, many of the features found in these older floating point
formats were mutually exclusive. For example, in some floating point
systems rounding would occur when there was insufficient precision; in
others, truncation would occur. Some applications would work with one
floating point system but not with the other.
• Intel wanted as many applications as possible to work with as few
changes as possible on their 80x87 FPUs, so they added a special
register, the FPU control register, that lets the user choose one of
several possible operating modes for the 80x87.
80X87 Control Register
• Bit 12 of the control register is only present on the 8087
and 80287 chips.
• It controls how the 80x87 responds to infinity.
• The 80387 and later chips always use a form of infinity
known as affine closure because this is the only form
supported by the IEEE 754/854 standards.
• As such, we will ignore any further use of this bit and
assume that it is always programmed with a one.


• Bits 10 and 11 provide rounding control according to
the following values:

– Bits 11 & 10   Function
– 00             To nearest or even
– 01             Round down
– 10             Round up
– 11             Truncate

• The "00" setting is the default.


• The 80x87 rounds values above one-half of the least
significant bit up. It rounds values below one-half of the
least significant bit down.
• If the value below the least significant bit is exactly
one-half the least significant bit, the 80x87 rounds the
value towards the value whose least significant bit is
zero. For long strings of computations, this provides a
reasonable, automatic way to maintain maximum
precision.
80X87 Control Register
• The round up and round down options are present for those computations
where it is important to keep track of the accuracy during a computation.
• By setting the rounding control to round down and performing the operation,
then repeating the operation with the rounding control set to round up, you
can determine the minimum and maximum values between which the true result
will fall.

• The truncate option forces all computations to truncate any excess bits
during the computation.
• You will rarely use this option if accuracy is important to you.
• However, if you are porting older software to the 80x87, you might use this
option to reproduce the behavior that software expects.
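The four rounding modes can be illustrated by rounding to an integer, which is a simplification: the FPU applies the same rules at the least significant bit of the significand. The sketch below relies on the fact that Python's built-in `round()` also resolves exact halves toward the even neighbor; `round_with_rc` is a hypothetical helper, not an FPU interface.

```python
import math

def round_with_rc(value: float, rc: int) -> int:
    """Round value to an integer the way the given rounding control setting would."""
    if rc == 0b00:
        return round(value)        # to nearest, exact halves go to the even neighbor
    if rc == 0b01:
        return math.floor(value)   # round down, toward -infinity
    if rc == 0b10:
        return math.ceil(value)    # round up, toward +infinity
    if rc == 0b11:
        return math.trunc(value)   # truncate, toward 0
    raise ValueError("rc must be a 2-bit value")

for rc in range(4):
    print(f"{rc:02b}", round_with_rc(2.5, rc), round_with_rc(-2.5, rc))
```

Running an operation once with round down and once with round up brackets the true result, which is the interval technique described above.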
80X87 Control Register (contd 1)
• Bits eight and nine of the control register
control the precision during computation.
• This capability is provided mainly to allow
compatibility with older software as
required by the IEEE 754 standard.
• The precision control bits use the following values:

Mantissa Precision Control Bits
– Bits 9 & 8   Precision Control
– 00           24 bits
– 01           Reserved
– 10           53 bits
– 11           64 bits

• For modern applications, the precision control bits should always be set
to "11" to obtain 64 bits of precision.
• This will produce the most accurate results during numerical computation.
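The effect of carrying fewer significand bits can be seen by rounding a 53-bit value to 24 bits. The sketch below does this by storing the value as a REAL4 and reading it back, which is only an approximation of what the precision control field does (the real setting limits the significand during computation without changing the exponent range); `round_to_real4` is a name invented for this example.

```python
import struct

def round_to_real4(value: float) -> float:
    """Round a value to the nearest number representable with 24 significand bits."""
    return struct.unpack("<f", struct.pack("<f", value))[0]

tenth = 0.1                                   # not exactly representable in binary
error_24_bits = abs(round_to_real4(tenth) - tenth)
print(error_24_bits)
print(round_to_real4(0.5) == 0.5)             # exactly representable, so nothing is lost
```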
80X87 Control Register (contd 2)
• Bits zero through five are the exception masks.
• These are similar to the interrupt enable bit in the 80x86's flags register.
• If one of these bits contains a one, the corresponding condition is ignored
by the 80x87 FPU.
• However, if any bit contains zero, and the corresponding condition occurs,
then the FPU immediately generates an interrupt so the program can handle the
degenerate condition.

• Bit zero corresponds to an invalid operation error.
• This generally occurs as the result of a programming error.
• Problems that raise the invalid operation exception include pushing
more than eight items onto the stack, attempting to pop an item off an empty
stack, taking the square root of a negative number, or loading a non-empty
register.

80X87 Control Register (contd 3 )
• Bit one masks the denormalized interrupt, which occurs whenever you try to
manipulate denormalized values.

• Denormalized values generally occur when you load arbitrary extended
precision values into the FPU or work with very small numbers just beyond the
range of the FPU's capabilities.
• Normally, you would probably not enable this exception.

• Bit two masks the zero divide exception.

• If this bit contains zero, the FPU will generate an interrupt if you attempt
to divide a nonzero value by zero.

• If you do not enable the zero division exception, the FPU will produce an
infinity of the appropriate sign whenever you perform a zero division.

80X87 Control Register (contd 4 )
• Bit three masks the overflow exception.

• The FPU will raise the overflow exception
– if a calculation overflows or
– if you attempt to store a value which is too large to fit into a destination
operand (e.g., storing a large extended precision value into a single
precision variable).

• Bit four, if set, masks the underflow exception.

• Underflow occurs when the result is too small to fit in the destination
operand.

• Like overflow, this exception can occur
– whenever you store a small extended precision value into a smaller variable
(single or double precision) or
– when the result of a computation is too small for extended precision.
80X87 Control Register (contd 5 )
• Bit five controls whether the
precision exception can occur.
• A precision exception occurs
whenever the FPU produces an
imprecise result, generally the
result of an internal rounding
operation.
• Although many operations will
produce an exact result, many
more will not.
• For example, dividing one by ten
will produce an inexact result.
Therefore, this bit is usually one
since inexact results are very
common.
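The relationship between the six mask bits and the conditions they cover can be sketched as a bit test: an interrupt is requested only for a condition whose flag is set while its mask bit is zero. The names below (`EXCEPTIONS`, `unmasked_exceptions`) are assumptions made for this illustration; the bit order follows the text above.

```python
# Bit 0 through bit 5, in the order given in the text.
EXCEPTIONS = ["invalid operation", "denormalized operand", "zero divide",
              "overflow", "underflow", "precision"]

def unmasked_exceptions(flags: int, control_word: int) -> list:
    """Names of detected conditions (flags) whose mask bit in the control word is 0."""
    pending = flags & ~control_word & 0x3F
    return [name for bit, name in enumerate(EXCEPTIONS) if pending & (1 << bit)]

all_masked = 0x3F
print(unmasked_exceptions(0b000100, all_masked))              # zero divide, but masked
print(unmasked_exceptions(0b000100, all_masked & ~0b000100))  # zero divide, unmasked
```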
80X87 Control Register (contd 6 )
• Bits six and thirteen through fifteen in the control register are currently
undefined and reserved for future use.
• Bit seven is the interrupt enable mask, but it is only active on the 8087 FPU;
a zero in this bit enables 8087 interrupts and a one disables them.
• The 80x87 provides two instructions, FLDCW (load control word) and FSTCW
(store control word), that let you load and store the contents of the control
register.

• The single operand to these instructions must be a 16 bit memory location.
• The FLDCW instruction loads the control register from the specified memory
location; FSTCW stores the control register into the specified memory
location.

80X87 Status register
• The FPU status register provides the status of the coprocessor at the
instant you read it.
• The FSTSW instruction stores the 16 bit floating point status register into
its operand, a 16 bit memory location or (on the 80287 and later) the AX
register.

• The status register is a 16 bit register with the following layout.

• Bits zero through five are the exception flags.
• These bits appear in the same order as the exception masks in the control
register.
• If the corresponding condition exists, then the bit is set.
• These bits are independent of the exception masks in the control register.
• The 80x87 sets and clears these bits regardless of the corresponding mask
setting.

80X87 Status register
• Bit six (active only on 80386 and later processors) indicates a stack fault.
• A stack fault occurs whenever there is a stack overflow or underflow.

• When this bit is set, the C1 condition code bit determines whether there was
a stack overflow (C1=1) or stack underflow (C1=0) condition.
• Bit seven of the status register is set if any error condition bit is set.

• It is the logical OR of bits zero through five.
• A program can test this bit to quickly determine if an error condition exists.
• Bits eight, nine, ten, and fourteen are the coprocessor condition code bits
(C0, C1, C2 and C3 respectively).
• Various instructions set the condition code bits as shown in the table that
follows.
80X87 Status register (contd)
• Bits 11-13 of the FPU status register provide the register number of the top
of stack. During computations, the 80x87 adds (modulo eight) the logical
register numbers supplied by the programmer to these three bits to determine
the physical register number at run time.

• Bit 15 of the status register is the busy bit. It is set whenever the FPU is
busy. Most programs will have little reason to access this bit.
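The status register layout described above can be summarized as a field decoder. This is a sketch for illustration: `decode_status_word` and `physical_register` are hypothetical helper names, and the input is assumed to be a 16-bit value such as one saved by FSTSW.

```python
def decode_status_word(sw: int) -> dict:
    """Split a 16-bit status word into the fields described in the text."""
    return {
        "exception_flags": sw & 0x3F,   # bits 0-5
        "stack_fault": (sw >> 6) & 1,   # bit 6
        "error_summary": (sw >> 7) & 1, # bit 7
        "c0": (sw >> 8) & 1,
        "c1": (sw >> 9) & 1,
        "c2": (sw >> 10) & 1,
        "top": (sw >> 11) & 0b111,      # bits 11-13, top of stack
        "c3": (sw >> 14) & 1,
        "busy": (sw >> 15) & 1,
    }

def physical_register(sw: int, i: int) -> int:
    """Physical register selected by ST(i): TOP plus i, modulo eight."""
    return (decode_status_word(sw)["top"] + i) % 8

print(decode_status_word(0x3800))
print(physical_register(0x3800, 1))
```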
Condition Code Bits
FPU Condition Code Bits

Instruction                           C3 C2 C1 C0   Condition
fcom, fcomp, fcompp, ficom, ficomp    0  0  X  0    ST > source
                                      0  0  X  1    ST < source
                                      1  0  X  0    ST = source
                                      1  1  X  1    ST or source undefined

ftst                                  0  0  X  0    ST is positive
                                      0  0  X  1    ST is negative
                                      1  0  X  0    ST is zero (+ or -)
                                      1  1  X  1    ST is uncomparable

fxam                                  0  0  0  0    +Unnormalized
                                      0  0  1  0    -Unnormalized
                                      0  1  0  0    +Normalized
                                      0  1  1  0    -Normalized
                                      1  0  0  0    +0
                                      1  0  1  0    -0
                                      1  1  0  0    +Denormalized
                                      1  1  1  0    -Denormalized
                                      0  0  0  1    +NaN
                                      0  0  1  1    -NaN
                                      0  1  0  1    +Infinity
                                      0  1  1  1    -Infinity
                                      1  X  X  1    Empty register

fucom, fucomp, fucompp                0  0  X  0    ST > source
                                      0  0  X  1    ST < source
                                      1  0  X  0    ST = source
                                      1  1  X  1    Unordered
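The comparison rows of the table can be turned into a lookup on C3, C2 and C0, with C1 ignored because it is a "don't care" bit for these instructions. The sketch below is illustrative; `compare_result` is a hypothetical helper operating on a saved 16-bit status word.

```python
COMPARE_TABLE = {
    (0, 0, 0): "ST > source",
    (0, 0, 1): "ST < source",
    (1, 0, 0): "ST = source",
    (1, 1, 1): "unordered",
}

def compare_result(sw: int) -> str:
    """Interpret the condition codes left in a status word by fcom/fucom-style instructions."""
    c0 = (sw >> 8) & 1
    c2 = (sw >> 10) & 1
    c3 = (sw >> 14) & 1
    return COMPARE_TABLE.get((c3, c2, c0), "not a comparison result")

for status in (0x0000, 0x0100, 0x4000, 0x4500):
    print(f"{status:04X}h", compare_result(status))
```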
Condition Code Interpretation

• fcom, fcomp, fcompp, ftst, fucom, fucomp, fucompp, ficom, ficomp
– C0: Result of comparison. See table above.
– C3: Result of comparison. See table above.
– C2: Operand is not comparable.
– C1: Result of comparison (see table above) or stack overflow/underflow (if stack exception bit is set).

• fxam
– C0: See previous table.
– C3: See previous table.
– C2: See previous table.
– C1: Sign of result, or stack overflow/underflow (if stack exception bit is set).

• fprem, fprem1
– C0: Bit 2 of the quotient.
– C3: Bit 1 of the quotient.
– C2: 0 - reduction done. 1 - reduction incomplete.
– C1: Bit 0 of the quotient, or stack overflow/underflow (if stack exception bit is set).

• fist, fbstp, frndint, fst, fstp, fadd, fmul, fdiv, fdivr, fsub, fsubr, fscale, fsqrt, fpatan, f2xm1, fyl2x, fyl2xp1
– C0: Undefined.
– C3: Undefined.
– C2: Undefined.
– C1: Round up occurred or stack overflow/underflow (if stack exception bit is set).

• fptan, fsin, fcos, fsincos
– C0: Undefined.
– C3: Undefined.
– C2: 0 - reduction done. 1 - reduction incomplete.
– C1: Round up occurred or stack overflow/underflow (if stack exception bit is set).

• fchs, fabs, fxch, fincstp, fdecstp, constant loads, fxtract, fld, fild, fbld, fstp (80 bit)
– C0: Undefined.
– C3: Undefined.
– C2: Undefined.
– C1: Zero result or stack overflow/underflow (if stack exception bit is set).

• fldenv, frstor
– C0, C3, C2, C1: Restored from memory operand.

• fldcw, fstenv, fstcw, fstsw, fclex
– C0, C3, C2, C1: Undefined.

• finit, fsave
– C0, C3, C2, C1: Cleared to zero.
Programming the FPU
• The 80-bit registers are generally designated in most literature as a stack
of eight registers. To better understand how these 80-bit registers function,
instead of imagining them as a stack, it will be easier to imagine them as a
revolver barrel with 8 compartments numbered clockwise from 0 to 7. When the
FPU is initialized, all the compartments are empty and Barrel Compartment #0
(BC0) would be at the 12 o'clock position (at the TOP), as depicted in Fig. 1.
• When the FPU is instructed to LOAD a value, it turns the
barrel clockwise by one notch and loads the specified value in the top
compartment.
• The first value loaded immediately after the FPU is initialized thus
goes into BC7 according to the FPU's internal numbering system.

• If the FPU is instructed to load another value while the first one is
still in BC7, it again turns the barrel clockwise by one notch and loads
the specified value in the top compartment, which is now BC6.

• Values can be loaded only into the TOP compartment of the FPU.

• This can continue until all the compartments contain a value.

• If, however, an attempt is made to load a value when all the compartments
have a value in them, the barrel still turns by one notch but the
attempted loading fails (just like trying to insert a bullet into a
compartment which already contains one).
• In addition, whatever valid value was in the compartment now at the TOP
is destroyed, leaving unusable trash in that register.
• Rule #1: An FPU 80-bit register compartment MUST be free (empty) in order
to load a value into it.

• Quite fortunately, these registers can be emptied with various FPU instructions.
The most common way is generally referred to as "popping a register".
• The "pop" mnemonic used for the CPU is not available for the FPU. Instead, it
can be included as a part of numerous FPU instructions; such instruction would be
carried out normally and then immediately followed by popping the register at the
TOP.

• When the FPU is instructed to POP a value, it would first remove it from
whichever compartment would currently be at the TOP and then turn the barrel
counter-clockwise by one notch.
• For example, if BC6 would be at the TOP and popped, BC7 would then become
the register compartment at the TOP.

• Values can be popped only from the TOP compartment of the FPU

• Those BC numbers are never used directly by the programmer.


• The FPU takes care of remembering where all the 80-bit values are located in its
internals and which of its compartments is at the TOP.
• However, the programmer must remain aware of this internal numbering system.
• For the programmer, while still using the revolving barrel image, the 80-bit
registers are ALWAYS numbered clockwise from 0 to 7 starting from the TOP.
• The numbers shown in Fig.1 above would therefore never change for referring to
register numbers in FPU instructions.
• The register at the TOP would always have the number 0.

• The designation ST with its number in parenthesis (such as ST(0), ST(1), etc.) is
used when reference to a given 80-bit register is required in an FPU instruction.
• (MASM also interprets ST without any explicit number as if ST(0) had been
specified.)

• Any value loaded to the FPU must initially be referred to as ST(0) because it can
only be loaded to the TOP compartment.
• If the FPU would be instructed to load another value while the first one is still
there, that second value would now be referred to as ST(0) because it has now
become the one at the TOP.
• As a consequence, the first value would now have to be referred to as ST(1). If
another value is loaded, the first value would then have to be referred to as ST(2).
• After popping the last loaded value, that same first value would revert back to
being referred to as ST(1).

• That is probably the most complex concept to understand by someone starting to
learn how to use the FPU. When compared to the CPU where a value in EAX
would always be referred to as EAX regardless of operations on the other
registers, a value in an FPU register must be referred to according to its position
relative to the register at the TOP.
• Rule #2: The programmer must constantly keep track of the
relative location of the existing register values while other
values may be loaded to or popped from the TOP register.

• A good programming practice is to insert a comment after each FPU
instruction which can affect the location of register values, indicating
the new ST number of each value.

• When a register is popped from the FPU, its current value can no
longer be used in any operation.
• If that value would need to be used later, it should be stored in
memory before popping it and reloaded when required. (Some
debuggers may still show the old value in the popped register but
that should only be considered as residual "gun powder".)
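The revolving barrel picture and the two rules can be modeled in a few lines. The class below is a toy model written for this explanation, not an emulation of the FPU: `FpuBarrel`, `TRASH` and the method names are all invented here, and a failed load is represented by a `TRASH` marker in the TOP compartment.

```python
TRASH = "trash"   # marker for the unusable value left by a failed load

class FpuBarrel:
    """Toy model of the eight compartments and the TOP pointer."""

    def __init__(self):
        self.top = 0                 # BC0 is at the TOP after initialization
        self.slots = [None] * 8      # None marks an empty compartment

    def load(self, value):
        self.top = (self.top - 1) % 8            # the barrel turns one notch
        if self.slots[self.top] is None:
            self.slots[self.top] = value
        else:
            self.slots[self.top] = TRASH         # Rule #1 violated: old value destroyed

    def pop(self):
        value = self.slots[self.top]
        self.slots[self.top] = None              # compartment becomes empty
        self.top = (self.top + 1) % 8            # the barrel turns back one notch
        return value

    def st(self, i):
        """Value currently referred to as ST(i)."""
        return self.slots[(self.top + i) % 8]

fpu = FpuBarrel()
fpu.load("first")     # first is ST(0), stored in BC7
fpu.load("second")    # second is ST(0), first is now ST(1)
print(fpu.top, fpu.st(0), fpu.st(1))
```

The comments after each `load` follow the practice recommended above of noting the new ST number of every value.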
Programming the control word
• The Control Word 16-bit register is used by the programmer to
select between the various modes of computation available from the
FPU, and to define which exceptions should be handled by the FPU
or by an exception handler written by the programmer.
• The Control Word is divided into several bit fields as depicted in the
following Fig.1.2.

• The IC field (bit 12) or Infinity Control allows for two types of infinity
arithmetic:
– 0 = Both -infinity and +infinity are treated as unsigned infinity
(initialized state)
– 1 = Respects both -infinity and +infinity
• This field has been retained for compatibility with the 287 and earlier
co-processors. In the more modern FPUs, this bit is disregarded
and both -infinity and +infinity are respected.
Programming the control word (1)
• The RC field (bits 11 and 10) or Rounding Control determines how the FPU
will round results in one of four ways:
– 00 = Round to nearest, or to even if equidistant (this is the initialized state)
– 01 = Round down (toward -infinity)
– 10 = Round up (toward +infinity)
– 11 = Truncate (toward 0)
• The PC field (bits 9 and 8) or Precision Control determines to what precision
the FPU rounds results after each arithmetic instruction in one of three
ways:
– 00 = 24 bits (REAL4)
– 01 = Not used
– 10 = 53 bits (REAL8)
– 11 = 64 bits (REAL10) (this is the initialized state)
• The IEM field (bit 7) or Interrupt Enable Mask determines whether FPU
interrupts are enabled (bit = 0) or all disabled regardless of the individual
masks (bit = 1). This bit field is set to 1 in the initialized state. (This field is
also for compatibility with early co-processors and is not used anymore.)
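The four RC settings can be illustrated by rounding a few values to integers, which is what an instruction such as FRNDINT does under the selected mode. The following is only a sketch in Python: standard-library functions stand in for the FPU's rounding rules, no FPU state is touched, and the name round_with_rc is hypothetical.

```python
import math

# Map each RC bit pattern to a Python function with the same rounding rule.
# These are stand-ins for illustration only.
RC_MODES = {
    0b00: round,       # nearest, ties go to the even integer
    0b01: math.floor,  # down, toward -infinity
    0b10: math.ceil,   # up, toward +infinity
    0b11: math.trunc,  # truncate, toward 0
}


def round_with_rc(value, rc):
    """Round value to an integer the way the given RC setting would."""
    return RC_MODES[rc](value)


if __name__ == "__main__":
    for rc in sorted(RC_MODES):
        results = [round_with_rc(v, rc) for v in (2.5, 3.5, -2.5)]
        print(f"RC={rc:02b}: {results}")
```

The values 2.5, 3.5 and -2.5 are chosen because they are equidistant from two integers, which is where the four modes differ most visibly.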
Programming the control word (2)
• Bits 5-0 are the interrupt masks. In the initialized state, they are all set to 1,
which lets the FPU handle all exceptions. When any one of them is set to 0,
it instructs the FPU to generate an interrupt whenever that particular
exception is detected, so that the program can take whatever action is
deemed necessary before returning control to the FPU.
• The various interrupt masks available are:
– PM (bit 5) or Precision Mask
– UM (bit 4) or Underflow Mask
– OM (bit 3) or Overflow Mask
– ZM (bit 2) or Zero divide Mask
– DM (bit 1) or Denormalized operand Mask
– IM (bit 0) or Invalid operation Mask
• (A more detailed description of the various exceptions and how the FPU
would normally handle them is given in the following section. This document
will not describe how interrupts are generated and transmitted nor how to
respond to such interrupts.)
• Bits 15-13 and 6 are reserved or unused.
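The Control Word layout described above can be summarized as a small decoder. This is a sketch in Python, not FPU code: the function names decode_control_word and set_rounding are hypothetical, and the sample value is assembled from the initialized-state settings given in the text (with reserved bit 6 left at 0) rather than read from hardware.

```python
# Exception mask names for bits 0-5, in bit order.
MASK_NAMES = ("IM", "DM", "ZM", "OM", "UM", "PM")
RC_NAMES = {0: "nearest", 1: "down", 2: "up", 3: "truncate"}
PC_BITS = {0: 24, 1: None, 2: 53, 3: 64}  # None marks the unused encoding


def decode_control_word(cw):
    """Split a 16-bit Control Word into the fields described in the text."""
    return {
        "IC": (cw >> 12) & 1,
        "RC": RC_NAMES[(cw >> 10) & 3],
        "PC": PC_BITS[(cw >> 8) & 3],
        "IEM": (cw >> 7) & 1,
        "masked": [n for bit, n in enumerate(MASK_NAMES) if (cw >> bit) & 1],
    }


def set_rounding(cw, rc):
    """Return cw with the RC field (bits 11-10) replaced by rc (0 to 3)."""
    return (cw & ~(3 << 10) & 0xFFFF) | ((rc & 3) << 10)


# Initialized state per the text: PC = 11, IEM = 1, all six masks = 1.
INIT_CW = (0b11 << 8) | (1 << 7) | 0x3F

if __name__ == "__main__":
    print(decode_control_word(INIT_CW))
    print(f"{set_rounding(INIT_CW, 0b11):04X}h")  # same word, RC = truncate
```

Changing only the RC field while preserving the other bits mirrors the usual practice of storing the Control Word to memory, modifying the field of interest, and loading it back.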
Status word programming
• The Status Word 16-bit register indicates the general
condition of the FPU. Its content may change after each
instruction is completed.
• Part of it cannot be changed directly by the programmer.
It can, however, be accessed indirectly at any time to
inspect its content.
• The Status Word is divided into several bit fields as
depicted in the following Fig.
• When the FPU is initialized, all the bits are reset to 0.
• The B field (bit 15) indicates if the FPU is busy (B=1)
while executing an instruction, or is idle (B=0).
Status word programming
• The C3 (bit 14) and C2 - C0 (bits 10-8) fields contain the
condition codes following the execution of some
instructions such as comparisons. These codes will be
explained in detail for each instruction affecting those
fields.
• The TOP field (bits 13-11) is where the FPU keeps track
of which of its 80-bit registers is at the TOP. The BC
numbers described previously for the FPU's internal
numbering system of the 80-bit registers would be
displayed in that field. When the programmer specifies
one of the FPU 80-bit registers ST(x) in an instruction,
the FPU adds (modulo 8) the ST number supplied to the
value in this TOP field to determine in which of its
registers the required data is located.
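The modulo-8 addressing just described can be sketched in a few lines of Python. The function names are hypothetical, and the model assumes, consistent with the first loaded value going into BC7, that a load decrements TOP and a pop increments it.

```python
def physical_register(top, st_index):
    """Return the internal (BC) register number that ST(st_index) refers to."""
    return (top + st_index) % 8


def top_after_load(top):
    """A load moves TOP down by one (modulo 8) before storing the value."""
    return (top - 1) % 8


def top_after_pop(top):
    """A pop frees the TOP register and moves TOP up by one (modulo 8)."""
    return (top + 1) % 8


if __name__ == "__main__":
    top = 0                    # TOP field after initialization
    top = top_after_load(top)  # first value lands in BC7
    top = top_after_load(top)  # second value lands in BC6
    for i in range(3):
        print(f"ST({i}) -> BC{physical_register(top, i)}")
```

With two values loaded, ST(0) and ST(1) map to BC6 and BC7, while ST(2) wraps around to BC0, which is still empty.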
Status word programming ( 1 )
• The IR field (bit 7) or Interrupt Request gets set to 1 by the FPU
while an exception is being handled and gets reset to 0 when the
exception handling is completed.
• When the interrupt is masked in the Control Word for the FPU to
handle the exception, this bit may never be seen to be set while
stepping through the instructions with a debugger.
• However, if the programmer handles the interrupt, that bit should
remain set until the interrupt handling routine is completed.
• Bits 6-0 are flags raised by the FPU whenever it detects an
exception.
• Those exception flags are cumulative in the sense that, once set
(bit=1), they are not reset (bit=0) by the result of a subsequent
instruction which, by itself, would not have raised that flag.
• Those flags can only be reset by either initializing the FPU (FINIT
instruction) or by explicitly clearing those flags (FCLEX instruction).
Status word programming (2)
• The SF field (bit 6) or Stack Fault exception is set whenever an
attempt is made to either load a value into a register which is not
free (the C1 bit would also get set to 1) or pop a value from a
register which is free (and the C1 bit would get reset to 0). (Such a
stack fault is also treated as an invalid operation, and the I field flag
in bit 0 would thus also be set by this exception; see below.)
• The P field (bit 5) or Precision exception is set whenever some
precision is lost by instructions which do exact arithmetic.
• For example, dividing 1 by 10 does not yield an exact value in
binary arithmetic and would set the P exception flag. Another
example which sets the P exception flag would be the conversion of
a REAL10 to a REAL4 when some of the least significant bits would
be lost.
• If the FPU handles this exception (when the PM bit is set in the
Control Word), it rounds the result according to the rounding mode
specified in the RC field of the Control Word.
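Both examples of precision loss can be checked numerically. The sketch below uses Python, whose floats are 64-bit (REAL8) values, so the narrowing shown is REAL8 to REAL4 rather than REAL10 to REAL4; that substitution is an assumption made for illustration, and to_real4 is a hypothetical helper name.

```python
import struct
from fractions import Fraction

# Python floats are 64-bit binary values (the REAL8 format), so the
# quotient 1/10 has already been rounded by the time it is stored.
quotient = 1 / 10
error = Fraction(quotient) - Fraction(1, 10)


def to_real4(value):
    """Round-trip a float through the 32-bit format (REAL4)."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


if __name__ == "__main__":
    print("1/10 stored exactly in binary?", error == 0)
    # Narrowing drops low-order significand bits unless they were all 0.
    print("0.1 unchanged by REAL8 -> REAL4?", to_real4(quotient) == quotient)
    print("0.5 unchanged by REAL8 -> REAL4?", to_real4(0.5) == 0.5)
```

A value such as 0.5 is a power of two and survives the narrowing unchanged, which corresponds to the case where no precision is lost and the P flag would not be raised.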
Status word programming (3)
• The U field (bit 4) or Underflow exception flag gets set
whenever a value is too small in magnitude (without being equal to 0)
to be represented properly.
• Each of the floating point formats has a different limit on
the smallest number which can be represented. The U
flag gets set if the magnitude of a result falls below that
limit. For example, dividing a valid very small number by
a large number could fall below the limit. A valid REAL10
small number may be much smaller than acceptable for
the REAL4 or REAL8 formats; in such cases, conversion
from the former to the latter would also set the U flag.
• If the FPU handles this exception (when the UM bit is set
in the Control Word), it denormalizes the value until
the exponent is in range or ultimately returns a 0.
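The masked response to underflow (denormalize, or ultimately return 0) can be observed when narrowing small values to the 32-bit format. This Python sketch narrows from REAL8 rather than REAL10, which is an assumption made because Python has no 80-bit type; the helper names are hypothetical.

```python
import struct

REAL4_MIN_NORMAL = 2.0 ** -126  # smallest normalized REAL4 magnitude


def to_real4(value):
    """Round-trip a float through the 32-bit format (REAL4)."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


def classify_real4(value):
    """Describe what happens to value once it is stored as a REAL4."""
    narrowed = to_real4(value)
    if narrowed == 0.0:
        return "zero"
    if abs(narrowed) < REAL4_MIN_NORMAL:
        return "denormal"
    return "normal"


if __name__ == "__main__":
    for value in (1e-30, 1e-40, 1e-50):
        print(value, "->", to_real4(value), classify_real4(value))
```

The first value fits the REAL4 range, the second only fits as a denormal, and the third is too small even for that and becomes 0.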
Status word programming (4)
• The O field (bit 3) or Overflow exception flag gets set whenever a value is
too large in magnitude to be represented properly.
• Again, each of the floating point formats has a different limit on the largest
number which can be represented. The O flag gets set if the result of an
operation exceeds that limit.
• For example, multiplying a valid very large number by another large number
could exceed the limit. A valid REAL10 large number may be much larger
than acceptable for the REAL4 or REAL8 formats; conversion from the
former to the latter would also set the O flag.
• If the FPU handles this exception (when the OM bit is set in the Control
Word), it would generate a properly signed INFINITY according to the IC
flag of the Control Word.
• The Z field (bit 2) or Zero divide exception flag gets set whenever the division
of a finite non-zero value by 0 is attempted.
• If the FPU handles this exception (when the ZM bit is set in the Control
Word), it would generate a properly signed INFINITY according to the XOR
of the operand signs and then according to the IC flag of the Control Word.
Status word programming (5)
• The D field (bit 1) or Denormalized exception flag gets set whenever
an instruction attempts to operate on a denormalized number.
• If the FPU handles this exception (when the DM bit is set in the
Control Word), it would simply continue with normal processing and
then check for other possible exceptions.

• The I field (bit 0) or Invalid operation exception flag gets set
whenever an operation is considered invalid by the FPU. Examples
of such operations are:
– Stack overflow or underflow
– Indeterminate arithmetic such as 0 divided by 0, or subtracting
infinity from infinity
– Using a Not-A-Number (NAN) as an operand with some
instructions
– Trying to extract the square root of a negative number
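The Status Word fields described in this section can be pulled apart with shifts and masks. The following Python sketch is illustrative only: decode_status_word is a hypothetical name, and the sample values are constructed by hand rather than captured from an FPU.

```python
# Exception flag names for bits 0-6, in bit order.
EXCEPTION_FLAGS = ("I", "D", "Z", "O", "U", "P", "SF")


def decode_status_word(sw):
    """Split a 16-bit Status Word into the fields described in the text."""
    return {
        "B": (sw >> 15) & 1,
        "C3": (sw >> 14) & 1,
        "TOP": (sw >> 11) & 7,
        "C2": (sw >> 10) & 1,
        "C1": (sw >> 9) & 1,
        "C0": (sw >> 8) & 1,
        "IR": (sw >> 7) & 1,
        "flags": [n for bit, n in enumerate(EXCEPTION_FLAGS) if (sw >> bit) & 1],
    }


if __name__ == "__main__":
    # Hand-built sample: TOP = 7 with the Z and P flags raised.
    print(decode_status_word((7 << 11) | (1 << 5) | (1 << 2)))
    # Hand-built sample: stack overflow sets SF, I, and C1 together.
    print(decode_status_word((1 << 9) | (1 << 6) | (1 << 0)))
```

The second sample reflects the stack-fault case described earlier, where loading into a register that is not free raises SF and I and sets C1 to 1.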
TAG Word
• The Tag Word 16-bit register is managed by the FPU to maintain
some information on the content of each of its 80-bit registers.
• The Tag Word is divided into 8 fields of 2 bits each as depicted in
the following Fig.1.4.

• The above Tag numbers correspond to the FPU's internal
numbering system for the 80-bit registers (the BC numbers). The
meaning of each pair of bits is as follows:
– 00 = The register contains a valid non-zero value
– 01 = The register contains a value equal to 0
– 10 = The register contains a special value (NAN, infinity, or denormal)
– 11 = The register is empty
TAG Word (1)
• When the FPU is initialized, all the 80-bit registers are empty and the Tag
Word would thus have an overall value of 1111111111111111b (FFFFh).
• If a valid non-zero value is then loaded, the Tag Word would change to
0011111111111111b (3FFFh). (Remember that the very first value loaded
goes into BC7.)
• If a second value equal to 0 was then loaded, the Tag Word would become
0001111111111111b (1FFFh). (And the second value loaded goes into
BC6.)
• Although this Tag Word may contain information which could also be useful
to the programmer, it cannot be accessed directly nor by itself. The only way
to gain access to it is to store the FPU's environment data in memory (see
the FSTENV instruction) and examine it there. However, the information
available in the Tag Word could also be obtained otherwise (such as with
the FXAM instruction for individual registers).
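The three Tag Word values quoted on this slide can be reproduced by packing the eight 2-bit tags, with the tag for BC7 in the two highest bits. This is a Python sketch; the constant and function names are hypothetical.

```python
VALID, ZERO, SPECIAL, EMPTY = 0b00, 0b01, 0b10, 0b11


def tag_word(tags):
    """Pack 8 two-bit tags, indexed by BC number, into a 16-bit Tag Word."""
    word = 0
    for bc, tag in enumerate(tags):
        word |= (tag & 3) << (2 * bc)
    return word


if __name__ == "__main__":
    tags = [EMPTY] * 8
    print(f"{tag_word(tags):04X}h")  # after initialization
    tags[7] = VALID                  # first load: non-zero value into BC7
    print(f"{tag_word(tags):04X}h")
    tags[6] = ZERO                   # second load: a value of 0 into BC6
    print(f"{tag_word(tags):04X}h")
```

Indexing the list by BC number keeps the model independent of the TOP field, since the tags follow the physical registers rather than the ST numbers.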

Internal Flag Register
• The FPU also has an internal exception flag register
which is not accessible to the programmer. All these
flags are cleared before each instruction and are set as
each exception is encountered. Those are the flags that
trigger a response from the FPU or an interrupt for the
programmer's exception handler. They are also OR'ed
with the exception flags of the Status Word to provide a
cumulative record for the programmer.
• It is possible that several of the flags could be set with a
single instruction. For example, using a denormal
number as an operand would set the Denormal flag. The
result of the operation with it could then set the
Underflow flag and the Precision flag.
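The relationship between the per-instruction internal flags and the cumulative Status Word flags can be modeled as a bitwise OR. The Python sketch below is a simplified model under that assumption: the class and method names are hypothetical, and only bits 6-0 of the Status Word are represented.

```python
# Bit values for the six exception flags in bits 0-5.
FLAG = {name: 1 << bit for bit, name in enumerate("IDZOUP")}


class StatusFlags:
    """Toy model of the cumulative exception flags in the Status Word."""

    def __init__(self):
        self.status = 0

    def execute(self, raised):
        """Record the flags raised by one instruction; older flags persist."""
        internal = raised         # internal register starts cleared each time
        self.status |= internal   # OR'ed into the cumulative record
        return internal

    def fclex(self):
        """Clear the exception flags, as the FCLEX instruction does."""
        self.status &= ~0x7F


if __name__ == "__main__":
    fpu = StatusFlags()
    fpu.execute(FLAG["D"] | FLAG["U"] | FLAG["P"])  # denormal operand example
    fpu.execute(0)                                  # a clean instruction
    print(f"{fpu.status:02X}h")                     # D, U and P still set
    fpu.fclex()
    print(f"{fpu.status:02X}h")
```

The first call corresponds to the example in the text, where a single instruction with a denormal operand raises the Denormal, Underflow, and Precision flags together.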
