, Raymond,
J. Organization
The Electrical Engineering Handbook
Ed. Richard C. Dorf
Boca Raton: CRC Press LLC, 2000
2000 by CRC Press LLC
86
Organization
86.1 Number Systems
Positional and Polynomial Representations • Unsigned Binary Number System • Unsigned Binary-Coded Decimal, Hexadecimal, and Octal Systems • Conversion between Number Systems • Signed Binary Numbers • Floating-Point Number Systems
86.2 Computer Arithmetic
Number Representation • Algorithms for Basic Arithmetic Operations • Implementation of Addition • Implementation of the Multiplication Algorithm • Floating-Point Representation
86.3 Architecture
Functional Units • Basic Operational Concepts • Performance • Multiprocessors
86.4 Microprogramming
Levels of Programming • Microinstruction Structure • Microprogram Development • High-Level Languages for Microprogramming • Emulation • Applications of Microprogramming
86.1 Number Systems
Richard F. Tinder
Number systems provide the basis for conveying and quantifying information. Weather data, stocks, pagination of books, weights and measures: these are just a few examples of the use of numbers that affect our daily lives. For this purpose we find the decimal (or arabic) number system to be reliable and easy to use. This system evolved presumably because early humans were equipped with a crude type of calculator, their ten fingers. A number system that is appropriate for humans, however, may be intractable for use by a machine such as a computer. Likewise, a number system appropriate for a machine may not be suitable for human use.
Before concentrating on those number systems that are useful in computers, it will be helpful to review the characteristics that are desirable in any number system. There are four important characteristics in all:
Distinguishability of symbols
Arithmetic operations capability
Error control capability
Tractability and speed
To one degree or another the decimal system of numbers satisfies these characteristics for hard-copy transfer of information between humans. Roman numerals and binary are examples of number systems that do not satisfy all four characteristics for human use. On the other hand, the binary number system is preferable for use in digital computers. The reason is simply put: current digital electronic machines recognize only two identifiable states, physically represented by a high voltage level and a low voltage level. These two physical states are logically interpreted as the binary symbols 1 and 0.
Richard F. Tinder
Washington State University
Vojin G. Oklobdzija
University of California, Davis
V. Carl Hamacher
Queen's University, Canada
Zvonko G. Vranesic
University of Toronto
Safwat G. Zaky
University of Toronto
Jacques Raymond
University of Ottawa
A fifth desirable characteristic of a number system to be used in a computer should be that it have a minimum number of easily identifiable states. The binary number system satisfies this condition. However, the digital computer must still interface with humankind. This is done by converting the binary data to a decimal and character-based form that can be readily understood by humans. A minimum number of identifiable characters (say 1 and 0, or true and false) is not practical or desirable for direct human use. If this is difficult to understand, imagine trying to complete a tax form in binary or in any number system other than decimal. On the other hand, use of a computer for this purpose would not only be practical but, in many cases, highly desirable.
Positional and Polynomial Representations
The positional form of a number is a set of side-by-side (juxtaposed) digits given generally in fixed-point form as
\[ N_r = (\underbrace{a_{n-1} \cdots a_3 a_2 a_1 a_0}_{\text{Integer}} \;.\; \underbrace{a_{-1} a_{-2} a_{-3} \cdots a_{-m}}_{\text{Fraction}})_r \tag{86.1} \]
where the radix (or base) $r$ is the total number of digits in the number system and $a$ is a digit in the set defined for radix $r$. Here, the radix point separates $n$ integer digits on the left from $m$ fraction digits on the right. Notice that $a_{n-1}$ is the most significant (highest-order) digit, called MSD, and that $a_{-m}$ is the least significant (lowest-order) digit, denoted by LSD.
The value of the number in Eq. (86.1) is given in polynomial form by
\[ N_r = \sum_{i=-m}^{n-1} a_i r^i = (a_{n-1} r^{n-1} + \cdots + a_2 r^2 + a_1 r^1 + a_0 r^0 + a_{-1} r^{-1} + a_{-2} r^{-2} + \cdots + a_{-m} r^{-m})_r \tag{86.2} \]
where $a_i$ is the digit in the $i$th position, weighted by $r^i$.
Application of Eqs. (86.1) and (86.2) follows directly. For the decimal system $r = 10$, indicating that there are 10 distinguishable characters recognized as decimal numerals 0, 1, 2, ..., $r - 1$ (= 9). Examples of the positional and polynomial representations for the decimal system are
\[ N_{10} = (d_3 d_2 d_1 d_0 \,.\, d_{-1} d_{-2} d_{-3})_{10} = 3017.528 \]
and
\[ N_{10} = \sum_{i=-3}^{3} d_i 10^i = 3 \times 10^3 + 0 \times 10^2 + 1 \times 10^1 + 7 \times 10^0 + 5 \times 10^{-1} + 2 \times 10^{-2} + 8 \times 10^{-3} = 3000 + 10 + 7 + 0.5 + 0.02 + 0.008 \]
where $d_i$ is the decimal digit in the $i$th position. Exclusive of possible leading and trailing zeros, the MSD and LSD for this number are 3 and 8, respectively. This number could have been written in a form such as $N_{10} = 03017.52800$ without altering its value but implying greater accuracy of the fraction portion.
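The polynomial form of Eq. (86.2) translates almost directly into code. The following is a minimal sketch, not part of the handbook: the function name `positional_value`, the helper `_digit`, and the digit alphabet `DIGITS` are illustrative assumptions, and exact rational arithmetic is used so that fraction digits are not disturbed by floating-point rounding.

```python
from fractions import Fraction

DIGITS = "0123456789ABCDEF"  # assumed digit symbols, valid for radices 2..16


def _digit(symbol, radix):
    """Map one digit symbol to its integer value, rejecting digits >= radix."""
    value = DIGITS.index(symbol)  # raises ValueError for an unknown symbol
    if value >= radix:
        raise ValueError(f"digit {symbol!r} is not allowed in radix {radix}")
    return value


def positional_value(text, radix):
    """Evaluate a positional digit string such as '3017.528' using Eq. (86.2)."""
    integer_part, _, fraction_part = text.upper().partition(".")
    value = Fraction(0)
    # Integer digits: the rightmost has weight r^0, the next r^1, and so on.
    for i, symbol in enumerate(reversed(integer_part)):
        value += _digit(symbol, radix) * Fraction(radix) ** i
    # Fraction digits: weights r^-1, r^-2, ... moving right from the radix point.
    for i, symbol in enumerate(fraction_part, start=1):
        value += _digit(symbol, radix) * Fraction(1, radix ** i)
    return value
```

Calling `positional_value("3017.528", 10)` reproduces the decimal example above as an exact fraction; the same function applies unchanged to any radix up to 16.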
Unsigned Binary Number System
Applying Eqs. (86.1) and (86.2) to the binary system requires that $r = 2$, indicating that there are two distinguishable characters, typically 0 and $(r - 1) = 1$, that are used. In positional representation these characters (numbers) are called binary digits or bits. Examples of the positional and polynomial notations for a binary number are
\[ N_2 = (b_{n-1} \cdots b_3 b_2 b_1 b_0 \,.\, b_{-1} b_{-2} b_{-3} \cdots b_{-m})_2 = 101101.101_2 \]
with the MSB at the extreme left and the LSB at the extreme right, and
\[ N_{10} = \sum_{i=-m}^{n-1} b_i 2^i = 1 \times 2^5 + 0 \times 2^4 + 1 \times 2^3 + 1 \times 2^2 + 0 \times 2^1 + 1 \times 2^0 + 1 \times 2^{-1} + 0 \times 2^{-2} + 1 \times 2^{-3} = 32 + 8 + 4 + 1 + 0.5 + 0.125 = 45.625_{10} \]
where $b_i$ is the bit in the $i$th position. Thus, the bit positions are weighted ..., 16, 8, 4, 2, 1, 1/2, 1/4, 1/8, ... for any number consisting of integer and fraction portions. Binary numbers so represented are sometimes referred to as natural binary. In positional representation the bits on the extreme left and extreme right are called the MSB (most significant bit) and LSB (least significant bit), respectively. Notice that by obtaining the value of a binary number a conversion from binary to decimal has been performed. The subject of radix (base) conversion will be dealt with more extensively later.
For reference purposes Table 86.1 provides the binary-to-decimal conversion for two-, three-, four-, five-, and six-bit binary. The six-bit binary column is only halfway completed for brevity.
TABLE 86.1 Binary-to-Decimal Conversion
Two-Bit  Decimal  Three-Bit  Decimal  Four-Bit  Decimal  Five-Bit  Decimal  Six-Bit  Decimal
Binary   Value    Binary     Value    Binary    Value    Binary    Value    Binary   Value
00       0        000        0        0000      0        10000     16       100000   32
01       1        001        1        0001      1        10001     17       100001   33
10       2        010        2        0010      2        10010     18       100010   34
11       3        011        3        0011      3        10011     19       100011   35
                  100        4        0100      4        10100     20       100100   36
                  101        5        0101      5        10101     21       100101   37
                  110        6        0110      6        10110     22       100110   38
                  111        7        0111      7        10111     23       100111   39
                                      1000      8        11000     24       101000   40
                                      1001      9        11001     25       101001   41
                                      1010      10       11010     26       101010   42
                                      1011      11       11011     27       101011   43
                                      1100      12       11100     28       101100   44
                                      1101      13       11101     29       101101   45
                                      1110      14       11110     30       101110   46
                                      1111      15       11111     31       101111   47
                                                                            ...      ...
In the natural binary system the number of bits in a unit of data is commonly assigned a name. Examples are:
4-data-bit unit: nibble (or half-byte)
8-data-bit unit: byte
16-data-bit unit: two bytes (or half-word)
32-data-bit unit: word (or four bytes)
64-data-bit unit: double-word
etc.
The word size for a computer is determined by the number of bits that can be manipulated and stored in registers. The foregoing list of names would be applicable to a 32-bit computer.
Unsigned Binary-Coded Decimal, Hexadecimal, and Octal Systems
While the binary system of numbers is most appropriate for use in computers, it has several disadvantages when used by humans who have become accustomed to the decimal system. For example, binary machine code is long, difficult to assimilate, and tedious to convert to decimal. However, there exist simpler ways to represent binary numbers for conversion to decimal representation. Three examples, commonly used, are natural binary-coded decimal (NBCD), binary-coded hexadecimal (BCH), and binary-coded octal (BCO). These number systems are useful in applications where a digital device, such as a computer, must interface with humans. The NBCD code representation is also useful in carrying out computer arithmetic.
The NBCD Representation
The BCD system as used here is actually an 8, 4, 2, 1 weighted code called natural BCD or NBCD. This system uses patterns of four bits to represent each decimal position of a number and is one of several such weighted BCD code systems. The NBCD code is converted to its decimal equivalent by polynomials of the form
\[ N_{10} = b_3 \times 2^3 + b_2 \times 2^2 + b_1 \times 2^1 + b_0 \times 2^0 = b_3 \times 8 + b_2 \times 4 + b_1 \times 2 + b_0 \times 1 \]
for any $b_3 b_2 b_1 b_0$ code integer. Thus, decimal 6 is represented as $(0 \times 8) + (1 \times 4) + (1 \times 2) + (0 \times 1)$, or 0110 in NBCD code. Like natural binary, NBCD code is also called "natural" because its bit positional weights are derived from integer powers of 2 ($2^n$). Table 86.2 shows the NBCD bit patterns for decimal integers 0 through 9.
The NBCD code is currently the most widely used of the BCD codes. There are many excellent sources of information on BCD codes. One, in particular, provides a fairly extensive coverage of both weighted and unweighted BCD codes [Tinder, 1991].
TABLE 86.2 NBCD Bit Patterns and Decimal Equivalent
NBCD                    NBCD
Bit Pattern  Decimal    Bit Pattern  Decimal
0000         0          1000         8
0001         1          1001         9
0010         2          1010         NA
0011         3          1011         NA
0100         4          1100         NA
0101         5          1101         NA
0110         6          1110         NA
0111         7          1111         NA
NA = not allowed.
Decimal numbers greater than 9 or less than 1 can be represented by the NBCD code if each digit is given in that code and if the results are combined. For example, the number 63.98 is represented by (or converted to) NBCD code as
               6    3   .  9    8
$63.98_{10}$ = (0110 0011 . 1001 1000)$_{NBCD}$ = $1100011.10011_{NBCD}$
Here, the code weights are 80, 40, 20, 10; 8, 4, 2, 1; 0.8, 0.4, 0.2, 0.1; and 0.08, 0.04, 0.02, 0.01 for the tens, units, tenths, and hundredths digits, respectively, representing four decades. Conversion between binary and NBCD requires conversion to decimal as an intermediate step. For example, to convert from NBCD to binary requires that groups of four bits be selected in both directions from the radix point to form the decimal number. If necessary, zeros are added to the leftmost or rightmost ends to complete the groups of four bits as in the above example. Negative NBCD numbers can be represented either in sign-magnitude notation or 1's or 2's complement notation as discussed later.
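The digit-by-digit nature of NBCD can be sketched in a few lines. This is an illustrative sketch only: the function names `to_nbcd` and `from_nbcd` and the space-separated group format are assumptions made here, not notation from the handbook.

```python
def to_nbcd(decimal_text):
    """Encode a decimal string such as '63.98' as space-separated 4-bit NBCD groups."""
    groups = []
    for ch in decimal_text:
        if ch == ".":
            groups.append(".")          # the radix point is carried through unchanged
        elif ch in "0123456789":
            groups.append(format(int(ch), "04b"))  # 8-4-2-1 weighted pattern
        else:
            raise ValueError(f"not a decimal digit: {ch!r}")
    return " ".join(groups)


def from_nbcd(nbcd_text):
    """Decode space-separated 4-bit NBCD groups back to a decimal string."""
    digits = []
    for group in nbcd_text.split():
        if group == ".":
            digits.append(".")
            continue
        value = int(group, 2)
        # Patterns 1010 through 1111 are "not allowed" (see Table 86.2).
        if len(group) != 4 or value > 9:
            raise ValueError(f"invalid NBCD group: {group!r}")
        digits.append(str(value))
    return "".join(digits)
```

With these definitions, `to_nbcd("63.98")` gives the grouped pattern shown in the example, and `from_nbcd` rejects the six unused bit patterns.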
Another BCD code that is used for number representation and manipulation is called excess 3 BCD (or XS3 NBCD, or simply XS3). XS3 is an example of a biased-weighted code (a bias of 3). This code is formed by adding $0011_2$ (= $3_{10}$) to the NBCD bit patterns in Table 86.2. Thus, to convert XS3 to NBCD code, 0011 must be subtracted from XS3 code. In four-bit quantities the XS3 code has the useful feature that when adding two numbers together in XS3 notation a carry will result and yield the correct value any time a carry results in decimal (i.e., when 9 is exceeded). This feature is not shared by either natural binary or NBCD addition.
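The carry property of XS3 can be checked exhaustively for single digits. The sketch below is an illustration under stated assumptions: the function names are hypothetical, and the final correction step (add 3 after a carry, subtract 3 otherwise) is the customary way of restoring the excess-3 bias and is not spelled out in the text above.

```python
def xs3(digit):
    """Return the excess-3 code (as an integer 3..12) of a decimal digit 0..9."""
    if not 0 <= digit <= 9:
        raise ValueError("digit must be in 0..9")
    return digit + 3


def xs3_add_digits(a, b):
    """Add two decimal digits in XS3; return (carry, XS3 code of the sum digit)."""
    raw = xs3(a) + xs3(b)      # equals a + b + 6
    carry = raw >> 4           # four-bit carry-out: 1 exactly when a + b > 9
    low = raw & 0xF            # four-bit result before bias correction
    # Assumed correction step: a carry removed 16 (bias left at -4+... so add 3);
    # without a carry the bias is doubled, so subtract 3.
    corrected = low + 3 if carry else low - 3
    return carry, corrected
```

For example, `xs3_add_digits(5, 7)` produces a carry of 1 and the XS3 code of 2, matching the decimal result 12.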
The Hexadecimal and Octal Systems
The hexadecimal number system requires that $r = 16$ in Eqs. (86.1) and (86.2), indicating that there are 16 distinguishable characters in the system. By convention, the permissible hexadecimal digits are 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, and F for decimals 0 through 15, respectively. Examples of the positional and polynomial representations for a hexadecimal number are
\[ N_{16} = (h_{n-1} \cdots h_3 h_2 h_1 h_0 \,.\, h_{-1} h_{-2} h_{-3} \cdots h_{-m})_{16} = (\mathrm{AF3.C8})_{16} \]
with a decimal value of
\[ N_{10} = \sum_{i=-m}^{n-1} h_i 16^i = 10 \times 16^2 + 15 \times 16^1 + 3 \times 16^0 + 12 \times 16^{-1} + 8 \times 16^{-2} = 2803.78125_{10} \]
Here, it is seen that a hexadecimal number has been converted to decimal by using Eq. (86.2).
The octal number system requires that $r = 8$ in Eqs. (86.1) and (86.2), indicating that there are eight distinguishable characters in this system. The permissible octal digits are 0, 1, 2, 3, 4, 5, 6, and 7, as one might expect. Examples of the application of Eqs. (86.1) and (86.2) are
\[ N_8 = (o_{n-1} \cdots o_3 o_2 o_1 o_0 \,.\, o_{-1} o_{-2} o_{-3} \cdots o_{-m})_8 = 501.74_8 \]
with a decimal value of
\[ N_{10} = \sum_{i=-m}^{n-1} o_i 8^i = 5 \times 8^2 + 0 \times 8^1 + 1 \times 8^0 + 7 \times 8^{-1} + 4 \times 8^{-2} = 321.9375_{10} \]
When the hexadecimal and octal number systems are used to represent bit patterns in binary, they are called binary-coded hexadecimal (BCH) and binary-coded octal (BCO), respectively. These two number systems are examples of binary-derived radices. Table 86.3 lists several selected examples showing the relationships between BCH, BCO, binary, and decimal.
TABLE 86.3 The BCH and BCO Number Systems
Binary  BCH  BCO  Decimal    Binary   BCH  BCO  Decimal
0000    0    0    0          1010     A    12   10
0001    1    1    1          1011     B    13   11
0010    2    2    2          1100     C    14   12
0011    3    3    3          1101     D    15   13
0100    4    4    4          1110     E    16   14
0101    5    5    5          1111     F    17   15
0110    6    6    6          10000    10   20   16
0111    7    7    7          11011    1B   33   27
1000    8    10   8          110001   31   61   49
1001    9    11   9          1001110  4E   116  78
What emerges on close inspection of Table 86.3 is that each hexadecimal digit corresponds to four binary digits and that each octal digit corresponds to three binary digits. The following example illustrates the relationships between these number systems:
                         5    B    F   .  D    8
$10110111111.11011_2$ = 0101 1011 1111 . 1101 1000 = $\mathrm{5BF.D8}_{16}$
                         2   6   7   7  .  6   6
                      = 010 110 111 111 . 110 110 = $2677.66_8$
                      = $1471.84375_{10}$
To separate the binary digits into groups of four (for BCH) or groups of three (for BCO), counting must begin from the radix point and continue outward in both directions. Then, where needed, zeros are added to the leading and trailing ends of the binary representation to complete the MSDs and LSDs for the BCH and BCO forms.
Conversion between Number Systems
It is not the intent of this section to cover all methods for radix (base) conversion. Rather, the plan is to provide general approaches, separately applicable to the integer and fraction portions, followed by specific examples.
Conversion of Integers
Since the polynomial form of Eq. (86.2) is a geometrical progression, the integer portion can be represented in nested radix form. In source radix $s$, the nested representation is
\[ N_s = (a_{n-1} s^{n-1} + a_{n-2} s^{n-2} + \cdots + a_1 s^1 + a_0 s^0)_s = (a_0 + s(a_1 + s(a_2 + \cdots + s\,a_{n-1})\cdots))_s \tag{86.3} \]
for digits $a_i$ having integer values from 0 to $s - 1$. The nested radix form not only suggests a conversion process but also forms the basis for computerized conversion.
Consider that the number in Eq. (86.3) is to be represented in nested radix $r$ form
\[ N_r = (b_0 + r(b_1 + r(b_2 + \cdots + r\,b_{m-1})\cdots))_r = \left( \sum_{i=0}^{m-1} b_i r^i \right)_r \tag{86.4} \]
where, in general, $m \ne n$. Then, if $N_s$ is divided by $r$, the results are of the form
\[ \frac{N_s}{r} = Q + \frac{R}{r} \tag{86.5} \]
where $Q$ is the integer quotient rearranged as $Q_0 = (b_1 + r(b_2 + \cdots + r\,b_{m-1})\cdots)_r$ and $R$ is the remainder $R_0 = b_0$. A second division by $r$ yields $Q_0/r = Q_1 + R_1/r$, where $Q_1$ is arranged as $Q_1 = (b_2 + r(b_3 + \cdots + r\,b_{m-1})\cdots)_r$ and $R_1 = b_1$. Thus, by repeated division of the integer result $Q_i$ by $r$, the successive remainders yield the digits $b_0, b_1, b_2, \ldots, b_{m-1}$ of the radix-$r$ representation, in that order.
The integer conversion methods of Table 86.4 can be illustrated by the following simple examples:
Example 1. $139_{10} \rightarrow N_2$
N/r      Q    R
139/2 =  69   1
69/2  =  34   1
34/2  =  17   0
17/2  =  8    1
8/2   =  4    0
4/2   =  2    0
2/2   =  1    0
1/2   =  0    1
$139_{10} = 10001011_2$
Example 2. $10001011_2 \rightarrow N_{10}$. By positional weights,
$N_{10} = 128 + 8 + 2 + 1 = 139_{10}$
Example 3. $139_{10} \rightarrow N_8$
N/r      Q    R
139/8 =  17   3
17/8  =  2    1
2/8   =  0    2
$139_{10} = 213_8$
Example 4. $10001011_2 \rightarrow N_{BCO}$
 2   1   3
010 001 011 = $213_{BCO}$
Example 5. $213_{BCO} \rightarrow N_{BCH}$
             2   1   3                       8    B
$213_{BCO}$ = 010 001 011 = $10001011_2$ = 1000 1011 = $\mathrm{8B}_{16}$
Example 6. $213_8 \rightarrow N_5$
$213_8 = 2 \times 8^2 + 1 \times 8^1 + 3 \times 8^0 = 139_{10}$
N/r      Q    R
139/5 =  27   4
27/5  =  5    2
5/5   =  1    0
1/5   =  0    1
$213_8 = 1024_5$
Check: $1 \times 5^3 + 0 \times 5^2 + 2 \times 5^1 + 4 \times 5^0 = 125 + 0 + 10 + 4 = 139_{10}$
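Examples 1 through 6 can be reproduced with a short sketch of the repeated-division procedure and of the bit-grouping shortcut for binary-derived radices. The names `radix_divide` and `regroup_bits` are illustrative assumptions; the arithmetic is carried out on ordinary Python integers, which corresponds to routing the conversion through decimal as in Example 6.

```python
DIGITS = "0123456789ABCDEF"  # assumed digit symbols for target radices 2..16


def radix_divide(n, radix):
    """Convert a non-negative integer to a digit string by repeated division."""
    if n < 0 or not 2 <= radix <= 16:
        raise ValueError("need n >= 0 and 2 <= radix <= 16")
    if n == 0:
        return "0"
    remainders = []
    while n > 0:
        n, r = divmod(n, radix)    # quotient Q and remainder R of N/r
        remainders.append(DIGITS[r])
    # Remainders appear least significant digit first, so reverse them.
    return "".join(reversed(remainders))


def regroup_bits(bits, group):
    """Convert an integer bit string to BCO (group=3) or BCH (group=4) digits."""
    width = -(-len(bits) // group) * group   # round length up to a multiple of group
    padded = bits.zfill(width)               # leading zeros complete the MSD group
    return "".join(
        DIGITS[int(padded[i:i + group], 2)] for i in range(0, width, group)
    )
```

Here `radix_divide(139, 2)`, `radix_divide(139, 8)`, and `radix_divide(139, 5)` correspond to Examples 1, 3, and 6, while `regroup_bits("10001011", 3)` and `regroup_bits("10001011", 4)` correspond to Examples 4 and 5.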
Conversion of Fractions
By extracting the fraction portion from Eq. (86.2) one can write
\[ .N_s = (a_{-1} s^{-1} + a_{-2} s^{-2} + \cdots + a_{-m} s^{-m})_s = s^{-1}(a_{-1} + s^{-1}(a_{-2} + \cdots + s^{-1} a_{-m})\cdots))_s \tag{86.6} \]
in radix $s$. This is called the nested inverse radix form that provides the basis for computerized conversion.
[Eqs. (86.7) through (86.9) are illegible in the source.]
\[ \epsilon_{\max} = r^{-n}(a_{-n} + 1) - r^{-n}(a_{-n} + a_{-(n+1)}/r) = r^{-n}(1 - a_{-(n+1)}/r) \]
Then, by rounding to $n$ digits, there results an error with bounds
\[ 0 < \epsilon_{10} \le r^{-n}(1 - a_{-(n+1)}/r) \tag{86.10} \]
in decimal. If $a_{-(n+1)} < r/2$ and the $(n + 1)$ digit is dropped, the maximum error is $r^{-n}$. Note that for $N_s \rightarrow N_{10} \rightarrow N_r$ type conversions, the bounds of errors aggregate.
The following examples illustrate the fraction conversion methods of Table 86.5.
Example 7. $0.654_{10} \rightarrow N_2$ rounded to eight bits
.N_s x r      F       I
0.654 x 2     0.308   1
0.308 x 2     0.616   0
0.616 x 2     0.232   1
0.232 x 2     0.464   0
0.464 x 2     0.928   0
0.928 x 2     0.856   1
0.856 x 2     0.712   1
0.712 x 2     0.424   1
0.424 x 2     0.848   0
$0.654_{10} = 0.10100111_2$, with $\epsilon_{\max} = 2^{-8}$
Example 8. $0.654_{10} \rightarrow N_8$ terminated at four digits
.N_s x r      F       I
0.654 x 8     0.232   5
0.232 x 8     0.856   1
0.856 x 8     0.848   6
0.848 x 8     0.784   6
$0.654_{10} = 0.5166_8$ with error bounds $0 < \epsilon_{10} \le 7 \times 8^{-4} = 1.71 \times 10^{-3}$
Example 9. Let $0.5166_8 \rightarrow N_2$ be rounded to eight bits and let $0.5166_8 \rightarrow N_{10}$ be rounded to four decimal places.
$0.5166_8 = 5 \times 8^{-1} + 1 \times 8^{-2} + 6 \times 8^{-3} + 6 \times 8^{-4} = 0.625000 + 0.015625 + 0.011718 + 0.001465 = 0.6538$ rounded to four decimal places; $\epsilon_{10} \le 10^{-4}$
.N_s x r      F        I
0.6538 x 2    0.3076   1
0.3076 x 2    0.6152   0
0.6152 x 2    0.2304   1
0.2304 x 2    0.4608   0
0.4608 x 2    0.9216   0
0.9216 x 2    0.8432   1
0.8432 x 2    0.6864   1
0.6864 x 2    0.3728   1
0.3728 x 2    0.7456   0
$0.5166_8 = 0.10100111_2$ (compare with Example 7), with $\epsilon_{10} \le 10^{-4} + 2^{-8} = 0.0040$
Example 10. $0.10100111_2 \rightarrow N_{BCH}$
                 A    7
$0.10100111_2$ = 0.1010 0111 = $\mathrm{0.A7}_{BCH}$
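The F and I columns of Examples 7 through 9 follow a repeated-multiplication pattern that is easy to sketch. The function name `radix_multiply` is an illustrative assumption, and the sketch truncates after the requested number of digits rather than rounding; for the examples above the next digit is 0 in each binary case, so truncation and rounding agree.

```python
from fractions import Fraction

DIGITS = "0123456789ABCDEF"  # assumed digit symbols for target radices 2..16


def radix_multiply(fraction_text, radix, digits):
    """Convert a decimal fraction string (0 <= f < 1) to `digits` radix-r digits."""
    f = Fraction(fraction_text)   # exact value, e.g. Fraction('0.654') == 327/500
    if not 0 <= f < 1:
        raise ValueError("fraction must satisfy 0 <= f < 1")
    out = []
    for _ in range(digits):
        f *= radix
        integer = int(f)          # product integer I becomes the next digit
        out.append(DIGITS[integer])
        f -= integer              # product fraction F feeds the next step
    return "0." + "".join(out)
```

For instance, `radix_multiply("0.654", 2, 8)` and `radix_multiply("0.654", 8, 4)` retrace the tables of Examples 7 and 8.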
Signed Binary Numbers
To this point only unsigned numbers (assumed to be positive) have been considered. However, both positive and negative numbers must be used in computers. Several schemes have been devised for dealing with negative numbers in computers, but only four are commonly used:
Signed-magnitude representation
Radix complement representation
Diminished radix complement representation
Excess (offset) code representation
Of these, the radix 2 complement representation, called 2's complement, is the most widely used system in computers.
Signed-Magnitude Representation
A signed-magnitude number consists of a magnitude together with a symbol indicating its sign (positive or negative). Such a number lies in the decimal range of $-(r^{n-1} - 1)$ through $+(r^{n-1} - 1)$ for $n$ integer digits in radix $r$. A fraction portion, if present, would consist of $m$ digits to the right of the radix point.
The most common examples of signed-magnitude numbers are those in the decimal and binary systems. The sign symbols for decimal (+ or -) are well known. In binary it is established practice to use 0 = plus and 1 = minus for the sign symbols and to place one of them in the MSB position for each number. Examples in eight-bit binary are
                 Sign bit  Magnitude
$+45.5_{10}$  =  0         $101101.1_2$
$+0_{10}$     =  0         $0000000_2$
$-123_{10}$   =  1         $1111011_2$
$-0_{10}$     =  1         $0000000_2$
Although the sign-magnitude system is used in computers, it has two drawbacks. There is no unique zero, as indicated by the examples, and addition and subtraction calculations require time-consuming decisions regarding operation and sign as, for example, (+7) minus (-4). Even so, the sign-magnitude representation is commonly used in floating-point number systems.
Radix Complement Representation
The radix complement of an $n$-digit number $N_r$ is obtained by subtracting it from $r^n$, that is, $r^n - N_r$. The operation $r^n - N_r$ is equivalent to complementing the number and adding 1 to the LSD. Thus, the radix complement is $\bar{N}_r + 1_{LSD}$, where $\bar{N}_r = r^n - 1 - N_r$ is the complement of a number in radix $r$. Therefore, one may write
\[ \text{Radix complement of } N_r = r^n - N_r = \bar{N}_r + 1 \tag{86.11} \]
The complements $\bar{N}_r$ for digits in three commonly used number systems are given in Table 86.6. Notice that the complement of a binary number is formed simply by replacing the 1's with 0's and 0's with 1's as required by $2^n - 1 - N_2$.
With reference to Table 86.6 and Eq. (86.11), the following examples of radix complement representation are offered.
Example 11. The 10's complement of 47.83 is
$\bar{N}_{10} + 1_{LSD} = 52.17$
Example 12. The 2's complement of 0101101.101 is
$\bar{N}_2 + 1_{LSB} = 1010010.011$
Example 13. The 16's complement of A3D is
$\bar{N}_{16} + 1_{LSD} = \mathrm{5C2} + 1 = \mathrm{5C3}$
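Examples 11 through 13 all apply the same two steps: complement every digit with respect to $r - 1$, then add 1 at the LSD. A minimal sketch follows; the function name `radix_complement` and the digit alphabet are assumptions made for illustration.

```python
DIGITS = "0123456789ABCDEF"  # assumed digit symbols for radices 2..16


def radix_complement(text, radix):
    """Return the radix complement of a digit string, keeping its radix point."""
    point = text.find(".")
    values = [DIGITS.index(c) for c in text.upper() if c != "."]
    if any(v >= radix for v in values):
        raise ValueError("digit not allowed in this radix")
    # Step 1: digit-wise complement with respect to r - 1.
    comp = [radix - 1 - v for v in values]
    # Step 2: add 1 at the least significant digit and propagate the carry.
    carry, i = 1, len(comp) - 1
    while carry and i >= 0:
        carry, comp[i] = divmod(comp[i] + carry, radix)
        i -= 1
    # A carry out of the MSD is discarded (arithmetic is modulo r^n).
    result = "".join(DIGITS[v] for v in comp)
    if point >= 0:
        result = result[:point] + "." + result[point:]
    return result
```

The calls `radix_complement("47.83", 10)`, `radix_complement("0101101.101", 2)`, and `radix_complement("A3D", 16)` correspond to Examples 11, 12, and 13.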
The decimal value of Eq. (86.11) can be found from the polynomial expression
\[ (N_{\text{radix compl.}})_{10} = -a_{n-1} r^{n-1} + \sum_{i=-m}^{n-2} a_i r^i \tag{86.12} \]
for any $n$-digit number of radix $r$. In Eqs. (86.11) and (86.12) the MSD is taken to be the position of the sign symbol.
2's Complement Representation. The radix complement for binary is the 2's complement representation. In 2's complement the MSB is the sign bit, 1 indicating a negative number or 0 if positive. The decimal range of representation for $n$ integer bits in 2's complement is from $-(2^{n-1})$ through $+(2^{n-1} - 1)$. From Eq. (86.11), the 2's complement is formed by
\[ (N_2)_{\text{2's compl.}} = 2^n - N_2 = \bar{N}_2 + 1 \tag{86.13} \]
A few examples in eight-bit binary are shown in Table 86.7. Notice that application of Eq. (86.13) changes the sign of the decimal value of a binary number (+ to -, and vice versa) and that only one zero representation exists.
Application of Eq. (86.12) gives the decimal value of any 2's complement number, including those containing a radix point. For example, the pattern $N_{\text{2's compl.}} = 11010010.011$ has a decimal value
\[ (N_{\text{2's compl.}})_{10} = -1 \times 2^7 + 1 \times 2^6 + 1 \times 2^4 + 1 \times 2^1 + 1 \times 2^{-2} + 1 \times 2^{-3} = -128 + 64 + 16 + 2 + 0.25 + 0.125 = -45.625_{10} \]
The same result could have easily been obtained by first applying Eq. (86.13) to $N_{\text{2's compl.}}$ followed by the use of positional weighting to obtain the decimal value. Thus, complementing 11010010.011 and adding 1 at the LSB gives $00101101.101_2 = 45.625_{10}$, so the original pattern represents $-45.625_{10}$.
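Equation (86.12) assigns a negative weight to the sign position and ordinary positional weights to every other bit. The sketch below evaluates a 2's complement bit string this way; the function name `twos_complement_value` is a hypothetical choice, and exact fractions are used to keep the fraction bits exact.

```python
from fractions import Fraction


def twos_complement_value(bits):
    """Evaluate a 2's complement bit string such as '11010010.011' by Eq. (86.12)."""
    integer_part, _, fraction_part = bits.partition(".")
    if not integer_part or set(integer_part + fraction_part) - {"0", "1"}:
        raise ValueError("expected a binary string with at least a sign bit")
    n = len(integer_part)
    # The MSB (sign position) carries the negative weight -2^(n-1).
    value = -int(integer_part[0]) * Fraction(2) ** (n - 1)
    # Remaining integer bits carry weights 2^(n-2) ... 2^0.
    for i, bit in enumerate(integer_part[1:], start=2):
        value += int(bit) * Fraction(2) ** (n - i)
    # Fraction bits carry weights 2^-1, 2^-2, ...
    for i, bit in enumerate(fraction_part, start=1):
        value += int(bit) * Fraction(1, 2 ** i)
    return value
```

Applied to the pattern 11010010.011 from the text, the function returns the exact fraction equal to the decimal value derived above; applied to 00101101.101 it returns the same magnitude with positive sign.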
86.2 Computer Arithmetic
Number Representation
[...] a string of bits $x = (x_{n-1}, \ldots, x_1, x_0)$
where
$x_i \in \{0, 1\}$
In this case the associated word (the string of bits) is $n$ bits long.
When for every value $X$ there exists one and only one corresponding bit string $x$, we define the number system as nonredundant. If, however, we could have more than one bit string $x$ that represents the same value $X$, the number system is redundant.
Most commonly we are using numbers represented in a weighted number system, where a numerical value is associated with the bit string $x$ according to the equation
\[ x = \sum_{i=0}^{n-1} x_i w_i \]
where
\[ w_0 = 1 \quad \text{and} \quad w_i = (w_{i-1})(r_{i-1}) \]
The value $r_i$ is an integer designated as the radix, and in a nonredundant number system it is an integer equal to the number of allowed values for $x_i$. In general $x_i$ [...] the radix $r_i$ is the same positive integer for all the digit positions $x_i$, with digit values from 0 through $r_i - 1$, for $0 \le i \le n - 1$.
An example of a weighted number system with a mixed radix would be the representation of time in weeks, days, hours, minutes, and seconds with a range for representing 100 weeks:
$r = 10, 10, 7, 24, 60, 60$
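The weight recurrence $w_0 = 1$, $w_i = w_{i-1} r_{i-1}$ can be applied to the mixed-radix time example. In the sketch below the radices are listed least significant position first (seconds, minutes, hours, days, weeks, tens of weeks); that ordering, and the function names `weights` and `mixed_radix_value`, are assumptions made for illustration.

```python
def weights(radices_lsd_first):
    """Return the positional weights w_0, w_1, ... for the given radices."""
    w = [1]                                  # w_0 = 1
    for r in radices_lsd_first[:-1]:
        w.append(w[-1] * r)                  # w_i = w_(i-1) * r_(i-1)
    return w


def mixed_radix_value(digits_lsd_first, radices_lsd_first):
    """Return the weighted sum of digits, checking each digit against its radix."""
    if len(digits_lsd_first) != len(radices_lsd_first):
        raise ValueError("one digit is required per radix")
    for x, r in zip(digits_lsd_first, radices_lsd_first):
        if not 0 <= x < r:
            raise ValueError(f"digit {x} out of range for radix {r}")
    return sum(x * w for x, w in zip(digits_lsd_first, weights(radices_lsd_first)))


# Seconds, minutes, hours, days, weeks, tens of weeks (least significant first).
TIME_RADICES = [60, 60, 24, 7, 10, 10]
```

With these radices the weights come out as seconds per second, minute, hour, day, week, and ten weeks, so `mixed_radix_value([5, 4, 3, 2, 1, 0], TIME_RADICES)` is the number of seconds in 1 week, 2 days, 3 hours, 4 minutes, and 5 seconds.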
In digital computers the radices encountered are 2, 4, 10, and 16, with 2 being the most commonly used one.
If the number of different values that $x_i$ can assume is $n_x \le r$, then we have a nonredundant digit set. Otherwise, if $n_x > r$, we have a redundant digit set. Use of the redundant digit set has its advantages in efficient implementation of algorithms (multiplication and division in particular).
Other number representations of interest are nonweighted number systems, where the relative position of the digit does not affect the weight so that the appropriate interchange of any two digits will not change the value $x$. The best example of such a number system is the residue number system (RNS).
We also define explicit value $x_e$ and implicit value $X_i$ of a number represented by a bit string $x$. The implicit value is the only value of interest to the user, while the explicit value provides the most direct interpretation of the bit string $x$. Mapping of the explicit value to the implicit value is obtained by an arithmetic function that defines the number representation used. It is a task of the arithmetic designer to devise algorithms that result in the correct implicit value of the result for the operations on the operand digits representing the explicit values. In other words, the arithmetic algorithm needs to satisfy the closure property.
The relationship between the implicit value and the explicit value is best illustrated by Table 86.8.
Representation of Signed Integers
The two most common representations of signed integers are sign and magnitude (SM) representation and true and complement (TC) representation. While SM representation might be easier to understand and convert to and from, it has its own problems. Therefore, we will find TC representation to be more commonly used.
Sign and Magnitude Representation (SM). In SM representation signed integer $X_i$ [...]
TABLE 86.8 Relationship between the Implicit Value and the Explicit Value ($x_e = 27$)
Implied attributes            Implicit value as a function of x_e        Value of X_i
Integer, magnitude            $X_i = x_e$                                27
Integer, two's complement     $X_i = -2^5 + x_e$                         -5
Integer, one's complement     $X_i = -(2^5 - 1) + x_e$                   -4
Fraction, magnitude           $X_i = 2^{-5} x_e$                         27/32
Fraction, two's complement    $X_i = 2^{-4}(-2^5 + x_e)$                 -5/16
Fraction, one's complement    $X_i = 2^{-4}(-2^5 + 1 + x_e)$             -4/16
Source: A. Avizienis, "Digital computer arithmetic: A unified algorithmic specification," in Symp. Computers and Automata, Polytechnic Institute of Brooklyn, April 13-15, 1971.
The illustration of TC mapping is given in Table 86.9. In this representation positive integers are represented in the true form, while negative integers are represented in the complement form.
With respect to how the complementation constant $C$ is chosen, we can further distinguish two representations within the TC system. If the complementation constant is chosen to be equal to the range of possible values taken by $x_e$, $C = r^n$ in a conventional number system where $0 \le x_e \le r^n - 1$, then we have defined the range complement (RC) system. If, on the other hand, the complementation constant is chosen to be $C = r^n - 1$, we have defined the diminished radix complement (DRC) (also known as the digit complement [DC]) number system. Representations of the RC and DRC number representation systems are shown in Table 86.10.
As can be seen from Table 86.10, the RC system provides for one unique representation of zero because the complementation constant $C = r^n$ falls outside the range. There are two representations of zero in the DRC system, $x_e = 0$ and $r^n - 1$. The RC representation is not symmetrical, and it is not a closed system under the change of sign operation. The range for RC is $[-\tfrac{1}{2} r^n, \tfrac{1}{2} r^n - 1]$. The DC is symmetrical and has the range of $[-(\tfrac{1}{2} r^n - 1), \tfrac{1}{2} r^n - 1]$.
For the radix $r = 2$, RC and DRC number representations are commonly known as two's complement and one's complement number representation systems, respectively. Those two representations are illustrated by an example in Table 86.11 for the range of values $(-4 \le X_i \le 3)$.
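The true-and-complement mapping can be sketched directly from the rule that explicit values below $C/2$ are read in true form and the rest are read as $x_e - C$. The function names below are illustrative assumptions, and the boundary case $x_e = C/2$ is assigned to the complement form, which agrees with the RC range quoted above.

```python
def tc_implicit(x_e, C):
    """Map an explicit value x_e (0 <= x_e <= C) to its implicit signed value."""
    if not 0 <= x_e <= C:
        raise ValueError("explicit value out of range")
    # True form below C/2; complement form (x_e - C) otherwise.
    return x_e if 2 * x_e < C else x_e - C


def tc_explicit(X, C):
    """Map an implicit signed value back to an explicit value (modulo C)."""
    return X % C   # Python's % always returns a non-negative result for C > 0


# Three-bit examples: two's complement uses C = 2**3, one's complement C = 2**3 - 1.
RC_VALUES = [tc_implicit(x, 8) for x in range(8)]
DRC_VALUES = [tc_implicit(x, 7) for x in range(8)]
```

`RC_VALUES` covers -4 through 3 with a single zero, while `DRC_VALUES` covers -3 through 3 with zero appearing twice (explicit values 0 and 7), in line with the discussion of the two systems.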
TABLE 86.9 True and Complement Mapping
\[ X_i = \begin{cases} x_e, & x_e < C/2 \\ x_e - C, & x_e > C/2 \end{cases} \]
x_e          X_i
0            0
1            1
2            2
...          ...
C/2 - 1      C/2 - 1
C/2 + 1      -(C/2 - 1)
...          ...
C - 2        -2
C - 1        -1
C            0
TABLE 86.10 Mapping of the Explicit Value $x_e$ into RC and DRC Number Representations
x_e              X_i (RC)         X_i (DRC)
0                0                0
1                1                1
2                2                2
...              ...              ...
(1/2)r^n - 1     (1/2)r^n - 1     (1/2)r^n - 1
(1/2)r^n         -(1/2)r^n        -((1/2)r^n - 1)
...              ...              ...
r^n - 2          -2               -1
r^n - 1          -1               -0
Algorithms for Basic Arithmetic Operations
The algorithms for the arithmetic operations are dependent on the number representation system used. Therefore, their implementation should be examined for each number representation system separately, given that the complexity of the algorithm, as well as its hardware implementation, is dependent on it.
Addition and Subtraction in Sign and Magnitude System
In the SM number system addition/subtraction is performed on pairs $(u_s, u_m)$ and $(w_s, w_m)$ resulting in a sum $(s_s, s_m)$, where $u_s$ and $w_s$ are sign bits and $u_m$ and $w_m$ are magnitudes. The algorithm is relatively complex because it requires comparisons of the signs and magnitudes as well. Extending the addition algorithm in order to perform subtraction is relatively easy because it only involves change of the sign of the operand being subtracted. Therefore, we will consider only the addition algorithm.
The algorithm can be described as
if $u_s = w_s$ (signs are equal) then
    $s_s = u_s$ and $s_m = u_m + w_m$ (operation includes checking for the overflow)
if $u_s \ne w_s$ then
    if $u_m > w_m$: $s_m = u_m - w_m$, $s_s = u_s$
    else: $s_m = w_m - u_m$, $s_s = w_s$
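The sign-and-magnitude addition algorithm can be written out as a small function. This is a sketch under stated assumptions: operands are (sign, magnitude) pairs with sign 0 for plus and 1 for minus, the name `sm_add` and the `magnitude_bits` parameter are hypothetical, and overflow is reported by raising `OverflowError`.

```python
def sm_add(u, w, magnitude_bits=7):
    """Add two sign-and-magnitude numbers given as (sign, magnitude) pairs."""
    (u_s, u_m), (w_s, w_m) = u, w
    limit = 1 << magnitude_bits          # magnitudes must stay below 2^bits
    if not (0 <= u_m < limit and 0 <= w_m < limit):
        raise ValueError("magnitude does not fit in the given width")
    if u_s == w_s:
        # Equal signs: keep the sign, add the magnitudes, check for overflow.
        s_m = u_m + w_m
        if s_m >= limit:
            raise OverflowError("magnitude overflow")
        return (u_s, s_m)
    # Different signs: subtract the smaller magnitude from the larger one
    # and take the sign of the operand with the larger magnitude.
    if u_m > w_m:
        return (u_s, u_m - w_m)
    return (w_s, w_m - u_m)
```

Subtraction follows by flipping the sign bit of the second operand first; for instance, (+7) minus (-4) becomes `sm_add((0, 7), (0, 4))`. Note that, as written, equal magnitudes with opposite signs yield a zero carrying the sign of `w`, which reflects the two-zero property of the SM system.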
Addition and Subtraction in True and Complement System
Addition in the TC system is relatively simple. It is sufficient to perform modulo addition of the explicit values; therefore,
\[ s_e = (u_e + w_e) \bmod C \]
Proof will be omitted.
In the RC number system this is equivalent to passing the operands through an adder and discarding the carry-out of the most significant position of the adder, which is equivalent to performing the modulo addition (given that $C = r^n$).
In the DRC number system the complementation constant is $C = r^n - 1$. Modulo addition in this case is performed by subtracting $r^n$ and adding 1. It turns out that this operation can be performed by simply passing the operands through an adder and feeding the carry-out from the most significant digit position into the carry-in at the least significant digit position. This is also called addition with end-around carry.
Subtracting two numbers is performed by simply changing the sign of the operand to be subtracted preceding the addition operation.
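Both addition rules can be sketched on explicit values. The function names `rc_add` and `drc_add` are illustrative assumptions; operands are explicit values in the range 0 through $r^n - 1$.

```python
def rc_add(u_e, w_e, r=2, n=3):
    """Range-complement addition: add and discard the carry-out (modulo r^n)."""
    return (u_e + w_e) % (r ** n)


def drc_add(u_e, w_e, r=2, n=3):
    """Diminished-radix-complement addition with end-around carry."""
    carry_out, low = divmod(u_e + w_e, r ** n)
    # Feeding the carry-out back into the least significant position
    # subtracts r^n and adds 1, i.e. reduces the sum modulo r^n - 1.
    return low + carry_out
```

In three-bit two's complement, `rc_add(7, 6)` adds -1 and -2 and returns 5, the explicit value of -3. In three-bit one's complement, `drc_add(6, 5)` adds -1 and -2 and returns 4, again the explicit value of -3. The end-around-carry result can be the all-ones pattern, which is the second representation of zero in the DRC system.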
Change of Sign Operation. The change of sign operation involves the following operation:
\[ W_i = -Z_i \]
\[ w_e = (-z_e) = (-z_e) \bmod C = C - Z_i \bmod C = C - z_e \]
which means that the change of sign operation consists of subtracting the operand $z_e$ from the complementation constant $C$.
In the DRC system complementation is performed by simply complementing each digit of the operand $Z_i$ with respect to $r - 1$. In the RC system, since $r^n - z_e = (r^n - 1 - z_e) + 1$, each digit is complemented with respect to $r - 1$ and 1 is then added to the result.
TABLE 86.11 Two's Complement and One's Complement Representation
        Two's Complement, C = 8          One's Complement, C = 7
X_i     x_e    X_i (2's complement)      x_e    X_i (1's complement)
 3      3      011                       3      011
 2      2      010                       2      010
 1      1      001                       1      001
 0      0      000                       0      000
-0      0      000                       7      111
-1      7      111                       6      110
-2      6      110                       5      101
-3      5      101                       4      100
-4      4      100                       3      -
Implementation of Addition
Carry Look-Ahead Adder (CLA)
The first significant speed improvement in the implementation of a parallel adder was a carry-look-ahead adder (CLA) developed by Weinberger and Smith [1958]. The CLA is one of the fastest schemes used for the addition of two numbers even today, given that the delay incurred to add two numbers is logarithmically dependent on the size of the operands (delay ≈ log[N]). The concept of CLA is illustrated in Fig. 86.2a and b.
For each bit position of the adder, a pair of signals $(p_i, g_i)$ is generated in parallel. It is possible to generate local carries using $(p_i, g_i)$ as seen in the equations. Those signals are designated as $p_i$, carry-propagate, and $g_i$, carry-generate, because they take part in the propagation and generation of the carry signal. However, each bit position requires an incoming carry signal $C_{i-1}$ in order to generate the outgoing carry $C_i$. This makes the addition slow because the carry signal has to ripple from stage to stage as shown in Fig. 86.2a. The adder can be divided into groups, and the carry-generate and carry-propagate signals can be calculated for the entire group (G, P). This will take an additional time equivalent to an AND-OR delay of the logic. However, now we can calculate each group's carry signals in an additional AND-OR delay. For the generation of the carry signal from the adder, only the incoming carry signal into the group is now required. Therefore, the rippling of the carry is limited only to the groups. In the next step we may calculate generate and propagate signals for the group of groups (G, P) and continue in that fashion until we have only one group left generating the $C_{out}$ signal from the adder. This process will terminate in a logarithmic number of steps, given that we are generating a tree structure for generation of carries. The computation of carries within the groups is done individually as illustrated in Fig. 86.2a, and this process requires only the incoming carry into the group.
The logarithmic dependence on the delay (delay ≈ log[N]) is only valid under the assumption that the gate delay is constant without depending on the fan-out and fan-in of the gate. In practice this is not true, and even when bipolar technology (which does not exhibit strong dependence on the fan-out) is used to implement the CLA structure, further expansion of the carry-block is not possible given the practical limitations on the fan-in of the gate.
In CMOS technology this situation is much different given that a CMOS gate has strong dependency not only on fan-in but on fan-out as well. This limitation takes away much of the advantage gained by using the CLA scheme. However, by clever optimization of the critical path and appropriate use of dynamic logic the CLA scheme can still be advantageous, especially for adders of a larger size.
FIGURE 86.2 Carry look-ahead adder structure. (a) Generation of carry generate and propagate signals and (b) generation of group signals G, P and intermediate carries.
Conditional-Sum Addition
Another of the fast schemes for the addition of two numbers that predates CLA is conditional-sum addition (CSA), proposed by Sklansky [1960]. The essence of the CSA scheme is the realization that we can add two numbers without waiting for the carry signal to be available. Simply, the numbers are added in two instances: one assuming $C_{in} = 0$ and the other assuming $C_{in} = 1$. The results, $Sum_0$, $Sum_1$ and $Carry_0$, $Carry_1$, are presented at the inputs of a multiplexer. The final values are selected at the time $C_{in}$ arrives at the "select" input of the multiplexer. As in CLA, the input bits are divided into groups, which are added "conditionally."
It is apparent that, starting from the least significant bit (LSB) position, the hardware complexity starts to grow rapidly. Therefore, in practice, the full-blown implementation of the CSA is not often seen.
However, the idea of adding the most significant bit (MSB) portion conditionally and selecting the results once the carry-in signal is computed in the LSB portion is attractive. Such a scheme (which is a subset of CSA) is known as a "carry-select adder." A 26-bit carry-select adder consisting of two 13-bit portions is shown in Fig. 86.3.
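A behavioral sketch of a two-portion carry-select adder is given below. It assumes the 13 + 13 bit split mentioned in the text as default widths; the function name carry_select_add and its parameters are illustrative only, and Python integer addition stands in for the adders of each portion.

```python
def carry_select_add(a, b, low_bits=13, high_bits=13):
    """Add two (low_bits + high_bits)-bit unsigned integers, carry-select style."""
    low_mask = (1 << low_bits) - 1
    high_mask = (1 << high_bits) - 1

    # LSB portion: added normally; it produces the real carry into the MSB portion.
    low_sum = (a & low_mask) + (b & low_mask)
    carry_low = low_sum >> low_bits

    # MSB portion: both candidate results are formed without waiting for the carry.
    a_hi = (a >> low_bits) & high_mask
    b_hi = (b >> low_bits) & high_mask
    sum0 = a_hi + b_hi        # assuming C_in = 0
    sum1 = a_hi + b_hi + 1    # assuming C_in = 1

    # Multiplexer: the carry from the LSB portion selects the final values.
    selected = sum1 if carry_low else sum0
    result = ((selected & high_mask) << low_bits) | (low_sum & low_mask)
    carry_out = selected >> high_bits
    return result, carry_out
```

For example, carry_select_add(5000, 9000) returns (14000, 0): the low portion produces no carry, so the result computed under the assumption $C_{in} = 0$ is selected.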
FIGURE 86.3 26-bit carry-select adder.
Multiplication Algorithm
The multiplication operation is performed in a variety of forms in hardware and software. In the early days of computer development any complex operation was usually programmed in software or coded in the microcode of the machine, with some limited hardware assistance. Today it is more likely to find a full hardware implementation of multiplication, for reasons of speed and the reduced cost of hardware. In all of these cases, however, multiplication shares the same basic algorithm, with some adaptations and modifications for the particular implementation and number system used. For simplicity we will describe a basic multiplication algorithm that operates on positive n-bit-long integers X and Y, resulting in the product P, which is 2n bits long:
$$P = XY = X \sum_{i=0}^{n-1} y_i r^i = \sum_{i=0}^{n-1} X y_i r^i$$
This expression indicates that the multiplication process is performed by summing n terms of a partial product, $X y_i r^i$. This product indicates that the i-th term is obtained by a simple arithmetic left shift of X by i positions and multiplication by the single digit $y_i$. For the binary radix, $y_i$ is 0 or 1, and multiplication by the digit $y_i$ is very simple to perform. The addition of n terms can be performed at once, by passing the partial products through a network of adders (which is the case of a full hardware multiplier), or sequentially, by passing the partial product through an adder n times. The algorithm to perform multiplication of X and Y can be described as [Ercegovac, 1985]
$$p^{(0)} = 0$$
$$p^{(j+1)} = \frac{1}{r}\left(p^{(j)} + r^{n} X y_j\right) \quad \text{for } j = 0, \ldots, n-1$$
It can easily be proved that this recurrence results in $p^{(n)} = XY$.
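The shift-and-add recurrence for multiplication, $p^{(j+1)} = (p^{(j)} + r^n X y_j)/r$ with $p^{(0)} = 0$, can be checked with a few lines of Python. This is a sketch only: the function name shift_add_multiply is illustrative, and Python's arbitrary-precision integers stand in for the 2n-digit product register. The integer division is exact because every term of $p^{(j)}$ still carries at least one factor of r while $j < n$.

```python
def shift_add_multiply(x, y, n, r=2):
    """Multiply non-negative integers x and y, where y has n digits in radix r."""
    p = 0                                # p^(0) = 0
    for j in range(n):
        y_j = (y // r**j) % r            # j-th digit of the multiplier
        p = (p + r**n * x * y_j) // r    # p^(j+1) = (p^(j) + r^n * X * y_j) / r
    return p                             # p^(n) = X * Y
```

With r = 2 and n = 4, shift_add_multiply(13, 11, 4) passes through the partial values 104, 156, 78, and 143, ending at 13 × 11 = 143.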
Various modifications of the multiplication algorithm exist; one of the most famous is the modified Booth recoding algorithm, described by Booth in 1951. This algorithm allows for a reduction in the number of partial products, thus speeding up the multiplication process. Generally speaking, the Booth algorithm is a case of using a redundant number system with a radix higher than 2.
Implementation of the Multiplication Algorithm
The speed of the multiply operation is of utmost importance in digital signal processors (DSPs) today, as well as in general-purpose processors. Therefore, research into building a fast parallel multiplier has been going on since such a paper was published by Wallace [1964]. In his historic paper, Wallace introduced a way of summing the partial product bits in parallel using a tree of carry-save adders, which became generally known as the "Wallace tree" (Fig. 86.4).
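A word-level sketch of carry-save reduction may help here. It is not Wallace's exact bit-level wiring: each carry_save_add call models a row of full adders (3:2 counters) operating on three whole operands, and the final sum stands in for the carry-propagate adder. The names carry_save_add and wallace_multiply are illustrative.

```python
def carry_save_add(a, b, c):
    """Row of 3:2 counters: returns (sum_word, carry_word) with a + b + c == sum + carry."""
    sum_word = a ^ b ^ c
    carry_word = ((a & b) | (a & c) | (b & c)) << 1
    return sum_word, carry_word


def wallace_multiply(x, y, n):
    """Multiply x by the n-bit multiplier y; returns (product, reduction levels)."""
    # One partial product per multiplier bit.
    partial = [(x << i) if (y >> i) & 1 else 0 for i in range(n)]
    levels = 0
    while len(partial) > 2:
        grouped = len(partial) - len(partial) % 3
        reduced = []
        for i in range(0, grouped, 3):
            # Every group of three operands is compressed to two, with no carry ripple.
            reduced.extend(carry_save_add(*partial[i:i + 3]))
        reduced.extend(partial[grouped:])  # leftover operands pass to the next level
        partial = reduced
        levels += 1
    # A single carry-propagate addition combines the last two operands.
    return sum(partial), levels
```

For an 8-bit multiplier the operand count shrinks 8, 6, 4, 3, 2, so four levels of carry-save addition precede the final carry-propagate add.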
A suggestion for speeding up this process of adding partial product bits in parallel followed in the paper published by Dadda [1965]. In that paper, Dadda introduced the notion of a counter structure that takes a number of bits in the same bit position (of the same "weight") and outputs a number q that represents the count of ones in the input. Dadda introduced a number of ways to compress the partial product bits using such a counter, which later became known as Dadda's counter.
The quest to make the parallel multiplier even faster continued for almost 30 years. The search for the "fastest counter" did not result in a general structure that yielded faster partial product summation than one using a full-adder (FA) cell, or 3:2 counter. Therefore, the Wallace tree was used almost universally in the implementation of parallel multipliers. In 1981, Weinberger disclosed a structure he called the "4-2 carry-save module." This structure contained a combination of FA cells in an intricate interconnection structure that yielded faster partial product compression than the use of 3:2 counters.
FIGURE 86.4 Wallace tree.
The 4:2 compressor (Fig. 86.5) actually compresses five partial product bits into three; however, it is connected in such a way that four of the inputs come from the same bit position of weight j, while one bit is fed from the neighboring position j − 1 (known as carry-in). The output of such a 4:2 module consists of one bit in position j and two bits in position j + 1. This structure does not represent a counter (though it became erroneously known as a "4:2 counter") but a "compressor," which compresses four partial product bits into two (while using one bit laterally connected between adjacent 4:2 compressors). The efficiency of such a structure is higher (it reduces the number of partial product bits by one half). The speed of the 4:2 compressor is determined by the delay of three XOR gates in series (in the redesigned version of the 4:2 compressor), making such a scheme more efficient than the one using 3:2 counters in a regular Wallace tree. The other equally important feature of the 4:2 compressor is that the interconnections between such cells follow more regular patterns than in the case of the Wallace tree.
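The arithmetic property of the 4:2 compressor can be sketched with two cascaded full-adder cells. This is the straightforward FA-based form, not the redesigned three-XOR version mentioned in the text; the function names and the output names (s, carry, c_out) are illustrative.

```python
def full_adder(a, b, c):
    """3:2 counter: returns (sum, carry) with a + b + c == sum + 2 * carry."""
    return a ^ b ^ c, (a & b) | (a & c) | (b & c)


def compressor_4_2(x1, x2, x3, x4, c_in):
    """Five input bits in, three output bits out.

    x1..x4 have weight j and c_in arrives from position j - 1.
    s has weight j; carry and c_out have weight j + 1.
    """
    s1, c_out = full_adder(x1, x2, x3)    # c_out does not depend on c_in
    s, carry = full_adder(s1, x4, c_in)
    return s, carry, c_out
```

Because c_out is independent of c_in, the lateral connection between adjacent compressors does not create a ripple chain along the row.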
FIGURE 86.5 4:2 compressor.
Booth Encoding
Booth's algorithm [Booth, 1951] is widely used in the implementation of hardware or software multipliers because its application makes it possible to reduce the number of partial products. It can be used for both sign-magnitude numbers and 2's complement numbers, with no need for a correction term or a correction step.
A modification of the Booth algorithm was proposed by MacSorley [1961], in which a triplet of bits is scanned instead of two bits. This technique has the advantage of reducing the number of partial products by one half regardless of the inputs. This is summarized in Table 86.12.
The recoding is performed in two steps: encoding and selection. The purpose of the encoding is to scan the triplet of bits of the multiplier and define the operation to be performed on the multiplicand, as shown in Table 86.12. This method is actually an application of a signed-digit representation in radix 4. The Booth-MacSorley algorithm, usually called the modified Booth algorithm or simply the Booth algorithm, can be generalized to any radix.
TABLE 86.12 Modified Booth Recoding
x_{i+2}  x_{i+1}  x_i    Add to Partial Product
   0        0      0     +0Y
   0        0      1     +1Y
   0        1      0     +1Y
   0        1      1     +2Y
   1        0      0     -2Y
   1        0      1     -1Y
   1        1      0     -1Y
   1        1      1     -0Y
Booth recoding necessitates the internal use of 2's complement representation in order to perform subtraction of the partial products, as well as addition, efficiently. However, the floating-point standard specifies sign-magnitude representation, which is also followed by most of the nonstandard floating-point formats in use today. The advantage of Booth recoding is that it generates only half of the partial products compared with a multiplier implementation that does not use Booth recoding. However, this benefit comes at the expense of increased hardware complexity. Indeed, this implementation requires hardware for the encoding and for the selection of the partial products (0, ±Y, ±2Y). An optimized encoding is shown in Fig. 86.6.
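The radix-4 recoding can be sketched in Python as follows. The sketch assumes an n-bit 2's complement multiplier with n even and an implicit 0 appended below the least significant bit; the signs in BOOTH_TABLE follow the standard modified Booth recoding. The names BOOTH_TABLE and booth_radix4_multiply are illustrative.

```python
# Triplet (x_{i+2}, x_{i+1}, x_i) -> multiple of the multiplicand Y to add.
BOOTH_TABLE = {
    0b000: 0, 0b001: 1, 0b010: 1, 0b011: 2,
    0b100: -2, 0b101: -1, 0b110: -1, 0b111: 0,
}


def booth_radix4_multiply(x, y, n):
    """Multiply the n-bit 2's complement multiplier x (n even) by the integer y.

    Returns (product, number of partial products).
    """
    if n % 2:
        raise ValueError("n must be even for radix-4 recoding")
    bits = x & ((1 << n) - 1)   # 2's complement bit pattern of the multiplier
    padded = bits << 1          # append the implicit 0 below the LSB
    product = 0
    for k in range(n // 2):
        triplet = (padded >> (2 * k)) & 0b111   # overlapping scan, two bits per step
        product += BOOTH_TABLE[triplet] * y * 4**k
    return product, n // 2
```

For the 4-bit multiplier 7 (0111) the two triplets are 110 and 011, giving −Y + 2Y·4 = 7Y with only two partial products instead of four.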
Division Algorithm
Division is a more complex process to implement because, unlike multiplication, it involves guessing the digits of the quotient. Here, we will consider an algorithm for the division of two positive integers, designated as dividend Y and divisor X, resulting in a quotient Q and an integer remainder Z according to the relation
$$Y = XQ + Z$$
In this case the dividend contains 2n digits and the divisor has n digits, in order to produce a quotient with n digits.
The algorithm for division is given by the following recurrence relationship [Ercegovac, 1985]:
$$z^{(0)} = Y$$
$$z^{(j+1)} = r z^{(j)} - X r^{n} Q_{n-1-j} \quad \text{for } j = 0, \ldots, n-1$$
This recurrence relation yields
$$z^{(n)} = r^{n} (Y - XQ)$$
$$Y = XQ + z^{(n)} r^{-n}$$
which defines the division process with remainder $Z = z^{(n)} r^{-n}$.
The selection of the quotient digit is done by satisfying $0 \le Z < X$ at each step in the division process. This selection is a crucial part of the algorithm, and the best known are the restoring and nonrestoring division algorithms. In the former, the value of the tentative partial remainder $z^{(j)}$ is restored after a wrong guess is made of the quotient digit $q_j$. In the latter, this correction is not done in a separate step but rather in the step following. The best-known division algorithm is the so-called SRT algorithm, independently developed by Sweeney, Robertson, and Tocher. Algorithms for a higher radix were further developed by Robertson and his students, most notably Ercegovac.
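A restoring-division sketch for radix 2 follows. It assumes the recurrence $z^{(j+1)} = r z^{(j)} - X r^n Q_{n-1-j}$ with $z^{(0)} = Y$, and it requires $Y < X \cdot 2^n$ so that the quotient fits in n bits. The function name restoring_divide is illustrative.

```python
def restoring_divide(y, x, n):
    """Divide dividend y by divisor x, producing an n-bit quotient (radix 2).

    Returns (quotient, remainder) with y == x * quotient + remainder.
    """
    if x <= 0 or y < 0 or y >= (x << n):
        raise ValueError("need x > 0 and 0 <= y < x * 2**n")
    z = y                                  # z^(0) = Y
    q = 0
    for _ in range(n):
        tentative = 2 * z - (x << n)       # guess that the quotient digit is 1
        if tentative >= 0:
            digit, z = 1, tentative        # guess was right: keep the result
        else:
            digit, z = 0, 2 * z            # wrong guess: restore, digit is 0
        q = (q << 1) | digit               # digits arrive most significant first
    return q, z >> n                       # Z = z^(n) * 2^(-n)
```

For y = 100, x = 7, n = 4 the quotient digits come out as 1, 1, 1, 0, so the result is (14, 2) and 7 × 14 + 2 = 100.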
FIGURE 86.6 Booth encoder.
Defining Terms
4:2 Compressor: A structure used in the partial product reduction tree of a parallel multiplier for achieving faster and more efficient reduction of the partial product bits.
Algorithm: Decomposition of the computation into subcomputations with an associated precedence relation that determines the order in which these subcomputations are performed [Ercegovac, 1985].
Booth-MacSorley algorithm: An algorithm used for recoding the multiplier such that the number of partial products is reduced by roughly a factor of two. It is a special case of the application of a redundant number system to represent the multiplier.
Carry look-ahead adder: An implementation technique for addition that accelerates the propagation of the carry signal, thus increasing the speed of the addition operation.
Dadda's counter: A generalized structure used to produce a number (count) representing the number of bits that are "one." It is used for efficient reduction of the partial products in a parallel multiplier.
Explicit value $x_e$: A value associated with the bit string according to the rule defined by the number representation system being used.
Implicit value $X_i$: The value obtained by applying the arithmetic function defined for the interpretation of the explicit value $x_e$.
Nonredundant number system: A system in which for each bit string there is one and only one corresponding numerical value $x_e$.
Number representation system: A defined rule that associates one numerical value $x_e$ with every valid bit string x.
Redundant number system: A system in which the numerical value $x_e$ can be represented by more than one bit string.
SRT algorithm: An algorithm for the division of binary numbers that uses redundant number representation.
Wallace tree: A technique for summing the partial product bits of a parallel multiplier in a carry-save fashion using full-adder cells.
Related Topic
86.1 Number Systems
References
A. Avizienis, "Digital computer arithmetic: A unified algorithmic specification," in Symposium on Computers and Automata, Polytechnic Institute of Brooklyn, April 13-15, 1971.
A. D. Booth, "A signed binary multiplication technique," Quarterly J. Mechan. Appl. Math., vol. IV, 1951.
L. Dadda, "Some schemes for parallel multipliers," Alta Frequenza, 34, 349-356, 1965.
M. Ercegovac, Digital Systems and Hardware/Firmware Algorithms, New York: Wiley, 1985, chap. 12.
O. L. MacSorley, "High speed arithmetic in binary computers," Proc. IRE, 49(1), 1961.
V. G. Oklobdzija and E. R. Barnes, "Some optimal schemes for ALU implementation in VLSI technology," Proceedings of 7th Symposium on Computer Arithmetic, Urbana, Ill.: University of Illinois, June 4-6, 1985.
J. Sklansky, "Conditional-sum addition logic," IRE Trans. Electron. Computers, EC-9, 226-231, 1960.
C. S. Wallace, "A suggestion for a fast multiplier," IEEE Trans. Electron. Computers, EC-13, 14-17, 1964.
S. Waser and M. Flynn, Introduction to Arithmetic for Digital Systems Designers, New York: Holt, 1982.
A. Weinberger and J. L. Smith, "A logic for high-speed addition," National Bureau of Standards, Circular 591, pp. 3-12, 1958.
Further Information
For more information about specific arithmetic algorithms and their implementation see K. Hwang, Computer Arithmetic: Principles, Architecture and Design, New York: Wiley, 1979, and also E. Swartzlander, Computer Arithmetic, vols. I and II, Los Alamitos, Calif.: IEEE Computer Society Press, 1980.
Publications in IEEE Transactions on Electronic Computers and Proceedings of the Computer Arithmetic Symposia by various authors are very good sources for detailed information on a particular algorithm or implementation.
86.3 Architecture¹
V. Carl Hamacher, Zvonko G. Vranesic, and Safwat G. Zaky
Computer architecture can be defined here to mean the functional operation of the individual hardware units in a computer system and the flow of information and control among them. This is a somewhat more general definition than is sometimes used. For example, some articles and books refer to instruction set architecture or the system bus architecture.
The main functional units of a single-processor system, a basic way to interconnect them, and features that are used to increase the speed with which the computer executes programs will be described. Following this, a brief introduction to systems that have more than one processor will be provided.
Functional Units
A digital computer, or simply a computer, accepts digitized input information, processes it according to a list of internally stored machine instructions, and produces the resultant output information. The list of instructions is called a program, and internal storage is called computer memory.
A computer has five functionally independent main parts: input, memory, arithmetic and logic, output, and control. The input unit accepts digitally encoded information from human operators, through electromechanical devices such as a keyboard, or from other computers over digital communication lines. The information received is usually stored in the memory and then operated on by the arithmetic and logic unit circuitry under the control of a program. Results are sent back to the outside world through the output unit. All these actions are coordinated by the control unit. The arithmetic and logic unit, in conjunction with the main control unit, is referred to as the processor.
Input and output equipment is usually combined under the term input-output unit (I/O unit). This is reasonable because some standard equipment provides both input and output functions. The simplest example of this is the video terminal, consisting of a keyboard for input and a cathode-ray tube for output. The control circuits of the computer recognize two distinct devices, even though the human operator may associate them as being part of the same physical unit.
The memory unit stores programs and data. There are two main classes of memory devices, called primary and secondary memory. Primary storage, or main memory, is an electronic storage device, constructed from integrated circuits that consist of millions of semiconductor storage cells, each capable of storing one bit of information. These cells are accessed in groups of fixed size called words. The main memory is organized so that the contents of one word can be stored or retrieved in one basic operation called a memory cycle.
To provide direct access to any word in the main memory in a short and fixed amount of time, a distinct address number is associated with each word location. A given word is accessed by specifying its address and issuing a control command that starts the storage or retrieval process. The number of bits in a word is called the word length of the computer. Word lengths vary from 16 to 64 bits. Small machines, such as personal computers or workstations, may have only a few million words in the main memory, while larger machines have tens of millions of words. The time required to access a word for reading or writing is less than 100 ns.
Although primary memory is essential, it tends to be expensive and volatile. Thus cheaper, more permanent, magnetic media secondary storage is used for files of information that contain programs and data. A wide selection of suitable devices is available, including magnetic disks, drums, diskettes, and tapes.
Execution of most operations within a computer takes place in the arithmetic and logic unit (ALU) of a processor. Consider a typical example. Suppose that two numbers located in the main memory are to be added, and the sum is to be stored back into the memory. Using a few instructions, each consisting of a few basic steps determined by the control unit, the operands are first fetched from the memory into the processor. They are then added in the ALU, and the result is stored back in memory. Processors contain a number of high-speed storage elements called registers, which are used for temporary storage of operands. Each register contains one word of data, and its access time is about 10 times faster than main memory access time. Large-scale microelectronic fabrication techniques allow whole processors to be implemented on a single semiconductor chip containing a few million transistors.
¹Adapted from V.C. Hamacher, Z.G. Vranesic, and S.G. Zaky, Computer Organization, 4th ed., New York: McGraw-Hill, 1996. With permission.
Basic Operational Concepts
To perform a given computational task, an appropriate program consisting of a set of machine instructions is stored in the main memory. Individual instructions are brought from the memory into the processor for execution. Data used as operands are also stored in the memory. A typical instruction may be
MOVE MEMLOC, Ri
This instruction loads a copy of the operand at memory location MEMLOC into the processor register Ri. The instruction requires a few basic steps to be performed. First, the instruction must be transferred from the memory into the processor, where it is decoded. Then the operand at location MEMLOC must be fetched into the processor. Finally, the operand is placed into register Ri. After operands are loaded into the processor registers in this way, instructions such as ADD Ri, Rj, Rk can be used to add the contents of registers Ri and Rj, and then place the result into register Rk.
Instruction set design has been intensively studied in recent years to determine the effectiveness of the various alternatives. See Patterson and Hennessy [1994] for a thorough discussion.
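The effect of the MOVE and ADD instructions on memory and registers can be mimicked with a toy model. The class TinyMachine, the memory location names A and B, and the stored values are all hypothetical; the instruction fetch and decode steps are not modeled, only the final data movement.

```python
class TinyMachine:
    """Toy register machine: a word-addressed memory plus named registers."""

    def __init__(self, memory):
        self.memory = dict(memory)   # memory location name -> stored word
        self.registers = {}          # register name -> current contents

    def move(self, memloc, reg):
        """MOVE MEMLOC, Ri: load a copy of the operand at MEMLOC into Ri."""
        self.registers[reg] = self.memory[memloc]

    def add(self, ri, rj, rk):
        """ADD Ri, Rj, Rk: add the contents of Ri and Rj, place the result in Rk."""
        self.registers[rk] = self.registers[ri] + self.registers[rj]


machine = TinyMachine({"A": 7, "B": 35})   # hypothetical operands in main memory
machine.move("A", "R1")
machine.move("B", "R2")
machine.add("R1", "R2", "R3")              # R3 now holds the sum of the two operands
```

The memory contents are left unchanged by MOVE, which matches the description of loading a copy of the operand.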
The connection between the main memory and the processor that allows for the transfer of instructions and operands is called the bus, as shown in Fig. 86.7. A bus consists of a set of address, data, and control lines. The bus also allows program and data files to be transferred from their long-term location on magnetic disk storage to the main memory. Long-distance digital communication with other computers is also enabled by transfers over the bus to the Communication Line Interface, as shown in the figure. The bus interconnects a number of devices, but only two devices (a sender and a receiver) can use it at any one time. Therefore, some control circuitry is needed to manage the orderly use of the bus when a number of devices wish to use it.
Normal execution of programs may sometimes be preempted if some I/O device requires urgent control action or servicing. For example, a monitoring device in a computer-controlled industrial process may detect a dangerous condition that requires the execution of a special service program dedicated to the device. To cause this service program to be executed, the device sends an interrupt signal to the processor. The processor temporarily suspends the program that is being executed and executes the special interrupt service routine. After providing the required service, the processor switches back to the interrupted program. To appreciate the complexity of the computer system software programs needed to control such switching from one program task to another and to manage the general movement of programs and data between primary and secondary storage, consult Tanenbaum [1990].
The need often arises during program loading and execution to transfer blocks of data between the main memory and a disk or other secondary storage I/O devices. Special control circuits are provided to manage these transfers without detailed control actions from the main processor. Such transfers are referred to as direct memory access (DMA). Assuming that accesses to the main memory from both I/O devices (such as disks) and the main processor can be appropriately interwoven over the bus, I/O-memory transfers and computation in the main processor can proceed in parallel, and performance of the overall system is improved.
THE FIRST DIGITAL COMPUTERS
Of all the new technologies to emerge from World War II, none was to have such profound and pervasive impacts as the digital computer. As early as the 1830s, the Englishman Charles Babbage conceived of an "Analytical Engine" that would perform mathematical operations using punched cards, hundreds of gears, and steam power. Babbage's idea was beyond the capabilities of 19th-century technology, but his vision represented a goal that many were to pursue in the next century and a half.
In the mid-1920s, MIT electrical engineer Vannevar Bush devised the "product integraph," a semiautomatic machine for solving problems in determining the characteristics of complex electrical systems. This was followed a few years later by the "differential analyzer," the first general equation-solving machine. These machines were mechanical, analog devices, but at the same time that they were being built and copied, the principles of electrical, digital machines were being laid out.
In 1937, Claude Shannon published in the Transactions of the AIEE the circuit principles for an "electric adder to the base of two," and George Stibitz of Bell Labs built such an adding device on his own kitchen table. In that same year, Howard Aiken, then a student at Harvard, proposed a gigantic calculating machine that could be used for everything from vacuum tube design to problems in relativistic physics. With support from Thomas J. Watson, president of IBM, Aiken was able to build his machine, the "Automatic Sequence Controlled Calculator," or "Mark I." When it was finished in 1944, the Mark I was quickly pressed into war service, calculating ballistics problems for the Navy.
In 1943, the government contracted with John W. Mauchly and J. Presper Eckert of the University of Pennsylvania to build the "Electronic Numerical Integrator and Computer," the first true electronic digital computer. When the ENIAC was finally dedicated in February 1946, it was both a marvel and a monster, weighing 30 tons, consuming 150 kW of power, and using 18,000 vacuum tubes. With all of this, it could perform 5,000 additions or 400 multiplications per second, which was about one thousand times faster than any other machine of the day. The ENIAC showed the immense possibilities of digital electronic computers.
These possibilities occupied engineers and mathematicians for the coming decades. For electrical engineers, the computer represented a challenge and responsibility for the most powerful new machine of the twentieth century. (Courtesy of the IEEE Center for the History of Electrical Engineering.)
Photo caption: The ENIAC was the first true electronic digital computer. Early programmers set up problems by plugging in cables and setting switches. ENIAC could perform calculations about one thousand times faster than any other machine of its day. (Photo courtesy of the IEEE Center for the History of Electrical Engineering.)
Performance
A major performance measure for computer systems is the time, T, that it takes to execute a complete program for some task. Suppose N machine instructions need to be executed to perform the task. A program is typically written in some high-level language, translated by a compiler program into machine language, and stored on a disk. An operating system software routine then loads the machine language program into the main memory, ready for execution. Assume that, on average, each machine language instruction requires S basic steps for its execution. If basic steps are executed at the rate of R steps per second, then the time to execute the program is
$$T = \frac{N \times S}{R}$$
The main goal in computer architecture is to develop features that minimize T.
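As a small worked example of the relation T = (N × S)/R, the sketch below evaluates the execution time for some assumed numbers. The function name execution_time and the sample values (one million instructions, four steps each, 100 million steps per second) are illustrative and are not taken from the text.

```python
def execution_time(n_instructions, steps_per_instruction, steps_per_second):
    """Program execution time T = (N * S) / R, in seconds."""
    if steps_per_second <= 0:
        raise ValueError("R must be positive")
    return n_instructions * steps_per_instruction / steps_per_second


# Assumed values: N = 1,000,000 instructions, S = 4 steps, R = 100 million steps/s.
t_baseline = execution_time(1_000_000, 4, 100e6)
# Halving S (for example, through better overlap of steps) halves T.
t_improved = execution_time(1_000_000, 2, 100e6)
```

With these assumed values the baseline works out to 0.04 s, and the relation makes it plain that T can be lowered by reducing N or S or by raising R.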
We will now give an outline of main memory and processor design features that help to achieve this goal. The first concept is that of a memory hierarchy. We have already noted that access to operands in processor registers is significantly faster than access to the main memory. Suppose that when instructions and data are first loaded into the processor, they are stored in a small, fast cache memory on the processor chip itself. If instructions and data in the cache are accessed repeatedly within a short period of time, as happens often with program loops, then program execution will be speeded up. The cache can hold only small parts of the executing program. When the cache is full, its contents are replaced by new instructions and data as they are fetched from the main memory. A variety of cache replacement algorithms are in use. The objective of these algorithms is to maximize the probability that the instructions and data needed for program execution are found in the cache. This probability is known as the cache hit ratio. A higher hit ratio means that a larger percentage of the instructions and data are found in the cache and do not require access to the slower main memory. This leads to a reduction in the memory access basic step time components of S, and hence to a smaller value of T.
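The influence of the hit ratio on memory access time can be illustrated with a common textbook model, which is an assumption added here and not a formula given in the text: the average access time is the hit ratio times the cache access time plus the miss ratio times the main memory access time. The function name and the sample times (10 ns cache, 100 ns main memory) are hypothetical.

```python
def average_access_time(hit_ratio, t_cache, t_main):
    """Average memory access time under a simple hit/miss model (assumed)."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must lie between 0 and 1")
    # Hits are served by the cache; misses are charged the main memory time.
    return hit_ratio * t_cache + (1.0 - hit_ratio) * t_main


# Hypothetical timings in nanoseconds.
t_low = average_access_time(0.80, 10, 100)
t_high = average_access_time(0.95, 10, 100)
```

Under these assumed timings, raising the hit ratio from 0.80 to 0.95 cuts the average access time from about 28 ns to about 14.5 ns.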
The basic idea of a cache can be applied at different points in a computer system, resulting in a hierarchy of storage units. A typical memory hierarchy is shown in Fig. 86.8. Some systems have two levels of cache to take the best advantage of size/speed/cost tradeoffs. The main memory is usually not large enough to contain all of the programs and their data. Therefore, the highest level in the memory hierarchy is usually magnetic disk storage. As the figure indicates, it has the largest capacity but the slowest access time. Segments of a program, often called pages, are transferred from the disk to the main memory for execution. As other pages are needed, they may replace the pages already in the main memory if the main memory is full. The orderly, automatic movement of large program and data segments between the main memory and the disk, as programs execute, is managed by a combination of operating system software and control hardware. This is referred to as memory management.
FIGURE 86.7 Interconnection of major components in a computer system.
FIGURE 86.8 Memory hierarchy.
FIGURE 86.9 Pipelining of instruction execution.
We have implicitly assumed that instructions are executed one after another. Most modern processors are designed to allow the execution of successive instructions to overlap, using a technique known as pipelining. In the example in Fig. 86.9, each instruction is broken down into four basic steps (fetch, decode, operate, and write), and a separate hardware unit is provided to perform each of these steps. As a result, the execution of successive instructions can be overlapped as shown, resulting in an instruction completion rate of one per basic time step. If the execution overlap pattern shown in the figure can be maintained for long periods of time, the effective value of S tends toward 1.
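The claim that the effective value of S tends toward 1 can be checked with a small calculation. The sketch assumes an ideal four-stage pipeline with no stalls, in which the first instruction needs all four steps and each later instruction completes one step after its predecessor. The function names are illustrative.

```python
def pipelined_steps(n_instructions, stages=4):
    """Basic time steps needed to complete n instructions in an ideal pipeline."""
    if n_instructions <= 0:
        return 0
    # The first instruction fills the pipeline; each later one finishes one step later.
    return stages + (n_instructions - 1)


def effective_steps_per_instruction(n_instructions, stages=4):
    """Effective S: total basic steps divided by the number of instructions."""
    return pipelined_steps(n_instructions, stages) / n_instructions
```

For 1000 instructions the ideal pipeline needs 1003 steps, an effective S of 1.003, compared with S = 4 when the same instructions are executed strictly one after another.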
When the execution of some instruction I depends on the results of a previous instruction, J, which is not yet completed, instruction I must be delayed. The pipeline is said to be stalled, waiting for the execution of instruction J to be completed. While it is not possible to eliminate such situations altogether, it is important to minimize the probability of their occurrence. This is a key consideration in the design of the instruction set of modern processors and the design of the compilers that translate high-level language programs into machine language.
Now, imagine that multiple functional units are provided in the processor so that more than one instruction can be in the operate stage. This parallel execution capability, when added to pipelining of the individual instructions, means that execution rates of more than one instruction completion per basic step time can be achieved. This mode of enhanced processor performance is called superscalar processing.
The rate, R, of performing basic steps in the processor is usually referred to as the processor clock rate; it is of the order of 100 to 200 million steps per second in current high-performance VLSI processors. This rate is determined by the technology used in fabricating the processors and is strongly related to the size or area occupied by individual transistors. This feature size, which is currently in the submicron range, has been steadily decreasing as fabrication techniques improve, allowing increases in R to be achieved.
Multiprocessors
Physical limits on electronic speeds prevent single processors from being speeded up indefinitely. A major design trend has seen the development of systems that consist of a large number of processors. Such multiprocessors can be used to speed up the execution of large programs by executing subtasks in parallel. The main difficulty in achieving this type of speedup is in being able to decompose a given task into its parallel subtasks and assign these subtasks to the individual processors in such a way that communication among the subtasks can be done efficiently. Fig. 86.10 shows a block diagram of a multiprocessor system, with the interconnection network needed for data sharing among the processors $P_i$. Parallel paths are needed in this network in order for parallel activity to proceed in the processors as they access the global memory space represented by the multiple memory units $M_i$.
FIGURE 86.10 A multiprocessor system.
Defining Terms
Arithmetic and logic unit: The logic gates and register storage elements used to perform the basic operations of addition, subtraction, multiplication, and division of numeric operands, and the comparison, shifting, and alignment operations on more general forms of numeric and nonnumeric data.
Bus: The collection of data, address, and control lines that enables exchange of information, usually in word-size quantities, among the various computer system units. In practice, a large number of units can be connected to a single bus. These units contend in an orderly way for the use of the bus for individual transfers.
Cache memory: A high-speed memory for temporary storage of copies of the sections of program and data from the main memory that are currently active during program execution.
Computer architecture: The functional operation of the individual hardware units in a computer system and the flow of information and control among them.
Control unit: The circuits required for sequencing the basic steps needed to execute machine instructions.
Input-output unit (I/O): The equipment and controls necessary for a computer to interact with a human operator, to access mass storage devices such as disks and tapes, or to communicate with other computer systems over communication networks.
Memory hierarchy: The collection of cache, primary, and secondary memory units that comprise the total storage capability in the computer system.
Memory management: The combination of operating system software and hardware controls that is needed to access and move program and data segments up and down the memory hierarchy during program execution.
Memory unit: The unit responsible for storing programs and data. There are two main types of units: primary memory, consisting of millions of bit storage cells fabricated from electronic semiconductor integrated circuits, used to hold programs and data during program execution; and secondary memory, based on magnetic disks, diskettes, and tapes, used to store permanent copies of programs and data.
Multiprocessor: A computer system comprising multiple processors and main memory unit modules, connected by a network that allows parallel activity to proceed efficiently among these units in executing program tasks that have been sectioned into subtasks and assigned to the processors.
Pipelining: The overlapped execution of the multiple steps of successive instructions of a machine language program, leading to a higher rate of instruction completion than can be attained by executing instructions strictly one after another.
Processor: The arithmetic and logic unit combined with the control unit needed to sequence the execution of instructions. Some cache memory is also included in the processor.
Superscalar processing: The ability to execute instructions at a completion rate that is faster than the normal pipelined rate, by providing multiple functional units in the pipeline to allow a small number of instructions to proceed through the pipeline process in parallel.
Related Topics
86.2 Computer Arithmetic • 86.4 Microprogramming
References
V.C. Hamacher, Z.G. Vranesic, and S.G. Zaky, Computer Organization, 4th ed., New York: McGraw-Hill, 1996.
D.A. Patterson and J.L. Hennessy, Computer Organization and Design: The Hardware/Software Interface, San
Mateo, Calif.: Morgan Kaufmann, 1994.
A.S. Tanenbaum, Structured Computer Organization, 3rd ed., Englewood Cliffs, N.J.: Prentice-Hall, 1990.
Further Information
The IEEE magazines Computer, Micro, and Software all have interesting articles on subjects related to computer
architecture, including software aspects. Also, articles on computer architecture occasionally appear in
Scientific American.
86.4 Microprogramming
Jacques Raymond
Since the 1950s, when Wilkes et al. [1958] defined the term and the concept, microprogramming has been used
as a clean and systematic way to define the instruction set of a computer. It has also been used to define a
virtual architecture out of a real hardwired one.
Levels of Programming
In Fig. 86.12, we see that a computer application is usually realized by programming a given algorithm in a
high-level language. A system offering a high-level language capability is implemented at the system level via
a compiler. The operating system is (usually) implemented in a lower-level language. The machine instruction
set can be hardwired (in a hardware implementation) or implemented via microprogramming (Fig. 86.11).
Therefore, microprogramming is simply an extra level in the general structure. Since it is used to define the
machine instruction set, it can be considered at the hardware level. Since this definition is done via a program
at a low level, but still eventually modifiable, it can also be considered to be at the software level. For these
reasons, the term firmware has been coined to name sets of microprograms. In short, microinstructions that
specify hardware functions (microoperations such as Open a path, Select operation) are used to form a more
complex instruction (Convert to binary, Add decimal). The machine instruction set is defined via a set of
microprogram routines and a microprogrammed instruction decoder.
In a microprogrammed machine, the hardware is designed in terms of its capabilities (ALU, data paths, I/O,
processing units) with little concern for how these capabilities will have to be accessed by the programmers.
The microoperations are decoded from microinstructions. The way programmers view the machine is defined
at the microprogramming level.
This approach differs from the hardwired approach in several ways. The advantages are that it is more
systematic in implementation, modifiable, economical on most designs, and easier to debug. The disadvantages
are that it is uneconomical on simple machines, slower, and needs support software. Like all programs,
microprograms reside in memory. The term "control memory" is commonly used for the memory that holds
microprograms.
Microinstruction Structure
On a given hardware, many processing functions are available. In general a subset O of these functions can be
performed in parallel, for example, carrying on an addition between two registers while copying a register on
an I/O bus. These functions are called microcommands.
Horizontal Microinstructions
Each of the fields f of a microinstruction specifies a microcommand. If the format of the microinstruction is
such that all possible microcommands can be specified, the instruction is called horizontal. Most of the time,
it is wasteful in memory as, in a microprogram, not every possible microcommand is specified in each
microinstruction. However, it permits the microprogrammer to fully take advantage of all possible parallelisms
and to build faster machines.
For example, the horizontal specification of an ALU operation,

    ALUOperation   SourcePathA   SourcePathB   ResultPath   CvtDecimal

specifies both operands, which register will contain the result, whether or not the result is to be converted to
decimal, and the operation to be performed. If this instruction is, for example, part of a microprogram defining
a 32-bit addition instruction, assuming a 16-bit path, it is wasteful to specify twice the source and result operands.

FIGURE 86.11 Levels of programming in a computer system.
In some cases, it is possible to design a microinstruction that specifies more microcommands than can be
executed in parallel. In that case, the execution of the microcommands is carried out in more than one clock
cycle. For this reason such microinstructions are called polyphase (as opposed to monophase).
Vertical Microinstructions
At the other extreme, if the microinstruction allows only the specification of a single microcommand at a time,
the instruction is then called vertical. In that case, only the necessary commands for a particular program are
specified, resulting in smaller control memory requirements. However, it is not possible to take advantage of
possible parallelism offered by the hardware, since only one microcommand is executed at a time. For example,
the vertical specification of an ALU operation is as follows:

    SourceA   Reg#   1st Operand
    SourceB   Reg#   2nd Operand
    Result    Reg#   Result
    ALU       Op     Operation

Diagonal Microinstructions
Most cases fit in between these two extremes (see Fig. 86.13). Some parallelism is possible; however,
microcommands pertaining to a given processing unit are regrouped. This results in shorter microprograms than in
the vertical case and may still allow some optimization. For example, a diagonal specification of an ALU
operation is as follows:

    SelectSources   RegA#  RegB#      Select operands
    SelectResult    Reg#   Dec/Bin    Result place and format
    Select ALU      Operation         Perform the operation

Optimization
Time and space optimization studies can be performed before designing the microinstruction format. The
reader is referred to Das et al. [1973] and Agerwala [1976] for details and more references.

FIGURE 86.12 A view of computer system levels.
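As a rough illustration of the trade-off between the horizontal and vertical formats, the following Python
sketch packs the ALU example into one wide horizontal control word and into a sequence of narrow vertical
words. The field widths (4-bit register and operation codes, a 1-bit decimal flag, a 2-bit vertical op code)
and all function names are assumptions made for this sketch; they do not describe any particular machine.

```python
# Hypothetical field layout for a horizontal microinstruction (widths are assumed).
HORIZONTAL_FIELDS = [
    ("alu_operation", 4),
    ("source_path_a", 4),
    ("source_path_b", 4),
    ("result_path", 4),
    ("cvt_decimal", 1),
]
HORIZONTAL_WIDTH = sum(width for _, width in HORIZONTAL_FIELDS)

# Hypothetical vertical format: a 2-bit op code selects one microcommand,
# followed by a single 4-bit operand.
VERTICAL_OPCODES = {"SourceA": 0, "SourceB": 1, "Result": 2, "ALU": 3}
VERTICAL_OPERAND_BITS = 4
VERTICAL_WIDTH = 2 + VERTICAL_OPERAND_BITS


def pack_horizontal(**values):
    """Pack all fields into one control word; omitted fields default to 0."""
    word = 0
    for name, width in HORIZONTAL_FIELDS:
        value = values.get(name, 0)
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        word = (word << width) | value
    return word


def unpack_horizontal(word):
    """Recover the individual microcommand fields from a control word."""
    fields = {}
    for name, width in reversed(HORIZONTAL_FIELDS):
        fields[name] = word & ((1 << width) - 1)
        word >>= width
    return fields


def pack_vertical(command, operand):
    """Pack one microcommand (op code plus operand) into one narrow word."""
    if not 0 <= operand < (1 << VERTICAL_OPERAND_BITS):
        raise ValueError(f"operand {operand} does not fit")
    return (VERTICAL_OPCODES[command] << VERTICAL_OPERAND_BITS) | operand


def vertical_alu_operation(op, src_a, src_b, result):
    """The same ALU operation expressed as four sequential vertical words."""
    return [
        pack_vertical("SourceA", src_a),
        pack_vertical("SourceB", src_b),
        pack_vertical("Result", result),
        pack_vertical("ALU", op),
    ]
```

Under these assumed widths, the horizontal form needs one 17-bit word and one step, while the vertical form
needs four 6-bit words executed one after another; a microprogram that reuses the same operands can omit the
repeated vertical words, which is where the control memory saving comes from.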
Microprogram Development
Microassemblers
The first level of specification of microinstructions is, just like its counterpart at the machine level, the assembler.
Although the process and philosophy are exactly the same, it is traditionally called a microassembler. A
microassembler is a software program (it is not relevant to know in which language it is written) whose function is to
translate a source program into the binary code equivalent. Obviously, to write a source program, a language
has to be designed. At assembly level, languages are usually very close to the hardware structure, and the objects
defined are microregisters, gate-level controls, and paths. Operations are the microoperations (sometimes
slightly more sophisticated with a microassembler with macro facilities).
This level provides an easily readable microprogram and does much to help avoid syntax errors. In binary,
only the programmer can catch a faulty 1 or 0; the microassembler can catch syntax errors or some faulty
register specifications. No microassembler exists that can catch all logic errors in the implementation of a given
instruction algorithm. It is still very easy to make mistakes. It should be noted that this level is a good
compromise between convenience and cost.
The following is a typical example of a microprogram in the microassembler (it implements a 16-bit add
on an 8-bit path and ALU):

    CLC              Clear carry
    Lod A            Get first part of first operand
    Add B            Add to first part of second operand
    Sto C            Give low byte of final result
    Lod a            Get second part of first operand
    Adc b            Add to second part of second operand and to carry bit
    Sto c            Give high byte of final result
    JCS Error        Jump to error routine if result > 65535
    Jmp FetchNext
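
To make the carry chaining in this microprogram easier to follow, here is a minimal Python sketch that mirrors
it step by step. The function name and the use of OverflowError for the error routine are assumptions made for
illustration; the variable names A, B, C (low bytes) and a, b, c (high bytes) follow the listing.

```python
def add16_on_8bit_path(x, y):
    """Add two 16-bit values using only 8-bit additions chained through a carry bit."""
    if not (0 <= x <= 0xFFFF and 0 <= y <= 0xFFFF):
        raise ValueError("operands must fit in 16 bits")

    A, a = x & 0xFF, x >> 8          # low and high bytes of the first operand
    B, b = y & 0xFF, y >> 8          # low and high bytes of the second operand

    carry = 0                        # CLC
    total = A + B + carry            # Lod A / Add B
    C = total & 0xFF                 # Sto C  (low byte of the result)
    carry = total >> 8               # carry out of the low-byte addition

    total = a + b + carry            # Lod a / Adc b
    c = total & 0xFF                 # Sto c  (high byte of the result)
    carry = total >> 8               # carry out of the high-byte addition

    if carry:                        # JCS Error: the true sum exceeds 65535
        raise OverflowError("16-bit result overflow")
    return (c << 8) | C              # Jmp FetchNext
```

For example, adding 0x12FF and 0x0001 produces a carry out of the low byte, which the second addition absorbs
to give 0x1300.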
FIGURE 86.13 Microinstruction fields versus microcommands.
High-Level Languages for Microprogramming
Many higher-level languages have been designed and implemented; see a discussion of some design philosophies
in Malik and Lewis [1978]. In principle, a high-level language program is oriented toward the application it
supports and is farther away from the hardware-detailed implementation of the machine it runs on. The
applications supported are mostly other machine definitions (emulators) and implementations of some algorithms.
The objects defined and manipulated by high-level languages for microprogramming are therefore the virtual
components of virtual machines. They are usually much the same as their real counterparts: registers, paths,
ALUs, microoperations, etc. Furthermore, writing a microprogram is usually defining a machine architecture.
It involves a lot of intricate and complicated details, but the algorithms implemented are mostly quite simple.
The advantages offered by high-level languages to write better algorithms without getting lost in implementation
details are therefore not exploited.
Firmware Implementation
Microprogramming usually requires the regular phases of design, coding, test, conformance acceptance,
documentation, etc. It differs from other programming activities when it comes to the deliverable. The usual final
product takes the form of hardware, namely, a control memory, PROM, ROM, or other media, containing the
bit patterns equivalent to the firmware. These implementation steps require special hardware and software
support. They include a linker, loader, PAL programmer, or ROM burner; a test program in a control memory
test bench is also necessary.
Supporting Software
It is advisable to test the microprogram before its actual hard implantation, if the implantation process is
irreversible or too costly to repeat. Software simulators have been implemented to allow thorough testing of
the newly developed microprogram. Needless to say, these tools are very specialized to a given environment
and therefore costly to develop, as their development cost cannot be distributed over many applications.
Emulation
Concept
In a microprogrammed environment, a computer architecture is softly (or firmly) defined by the microprogram
designed to implement its operations, its data paths, and its machine-level instructions. It is easy to see that if
one changes the microprogram for another one, then a new computer is defined. In this environment, the
desired operation is simulated by the execution of the firmware, instead of being the result of action on real
hardwired components.
Since the word simulation was already in use for simulation of some system by software, the word emulation
was chosen to mean simulation of an instruction set by firmware. Of course, "simulation" by hardware is not
a simulation but the real thing.
The general structure of an emulator consists of the following pseudocode algorithm:
BEGIN
Initialize Machine Components
Repeat
Fetch Instruction
Emulate Operation of the current instruction
Process interrupts
Update instruction counter
Until MachineIsOff
Perform shutdown procedure
END
Many variations exist, in particular to process interrupts within the emulation of a lengthy operation or to
optimize throughput, but the general principle and structure are fairly constant.
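
The emulator loop can be sketched in Python as follows. This is only a toy model under stated assumptions: the
three-instruction set (LOADI, ADDI, HALT), the 16-bit accumulator, and the way interrupts are represented as a
simple queue of labels are all invented here for illustration.

```python
def emulate(program, pending_interrupts=()):
    """Run a toy program given as a list of (opcode, operand) pairs."""
    # Initialize machine components.
    accumulator = 0
    instruction_counter = 0
    machine_is_off = False
    interrupts = list(pending_interrupts)
    serviced = []

    # Repeat ... Until MachineIsOff (the machine always starts switched on).
    while not machine_is_off:
        # Fetch instruction (a program without HALT would run off the end).
        opcode, operand = program[instruction_counter]

        # Emulate operation of the current instruction.
        if opcode == "LOADI":
            accumulator = operand & 0xFFFF
        elif opcode == "ADDI":
            accumulator = (accumulator + operand) & 0xFFFF
        elif opcode == "HALT":
            machine_is_off = True
        else:
            raise ValueError(f"unknown opcode {opcode!r}")

        # Process interrupts: service at most one pending interrupt per instruction.
        if interrupts:
            serviced.append(interrupts.pop(0))

        # Update instruction counter.
        instruction_counter += 1

    # Perform shutdown procedure: here, simply report the final state.
    return {
        "accumulator": accumulator,
        "instruction_counter": instruction_counter,
        "serviced": serviced,
    }
```

Changing the body of the "emulate operation" step while keeping the loop unchanged corresponds to swapping
one microprogram for another, which defines a different machine.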
Emulation of CPU Operation
One of the advantages of microprogramming is that the designer can implement his or her dream instructions
simply by emulating their operation. We have seen already the code for a typical 16-bit adder, but it is not difficult
to code a parity code generator, a cyclic redundancy check calculator, or an instruction that returns the
eigenvalues of an n × n matrix. This part is straight programming. One consideration is to make sure that the
machine is still listening to the outside world (interruptions) or actively monitoring it (I/O flags) in order not
to lose asynchronous data while looking for a particular pattern in a 1-megabyte string. Another consideration
is to optimize memory usage by combining common processes for different operations. For example, emulating
a 32-bit add instruction and emulating a 16-bit add instruction have common parts. This is, however, a
programming concern not specific to emulation.
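
For instance, a parity generator of the kind mentioned above reduces to a short shift-and-XOR loop. The sketch
below is written in Python rather than microcode; the function name and the default 8-bit width are
assumptions made for illustration.

```python
def parity_bit(value, width=8):
    """Return 1 if the low `width` bits of value contain an odd number of 1s, else 0."""
    parity = 0
    for _ in range(width):
        parity ^= value & 1   # fold the lowest bit into the running parity
        value >>= 1           # shift the next bit into the lowest position
    return parity
```

Appending this bit to the data word yields even parity: the total number of 1 bits, including the parity bit, is
then even.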
I/O System and Interrupts
Programming support for I/Os and interrupts is more complicated than for straight machine instructions. This
is due to the considerable speed differences between I/O devices and a CPU, the need for synchronization, the
need for not losing any external event, and the concerns for optimizing processing time. Microprogramming
offers considerable design flexibility, as these problems are more easily handled by programming than with
hardware components.
Applications of Microprogramming
The main application of microprogramming is the emulation of a virtual machine architecture on a different
host hardware machine. It is, however, easy to see that the concept of emulation can be broadened to other
functions than the traditional hardware operation.
It is mainly a matter of point of view. Emulation and simulation are essentially the same process but viewed
from different levels. Realizing a 64-bit addition and implementing a communication controller are qualitatively
the same type of task. Once this is considered, there are theoretically no limits to the uses of microprogramming.
From the programmer's point of view, programming is the activity of producing, in some language, an
implementation of some algorithm. If the language is at the very lowest level, as is the case with
microprogramming, and at the same time the algorithm is filled with intricate data structures and complex decisions,
the task might be enormous, but nothing says it cannot be done (except, maybe, experience). With this
perspective of the field, we now look at some existing applications of microprogramming.
Operating System Support
One of the first applications, besides emulation, was to support some operating system functions. Since
microprograms are closer to the hardware and programming directly in microcode removes the overhead of
decoding machine-level instructions, it was thought that directly coding operating system (OS) functions would
improve their performance. Success was achieved in some areas, such as virtual memory. In general, people
write most OS functions in assembly language, probably because the cost of microcoding is not offset by the
benefits, especially with rapidly changing OS versions. The problems raised by the human side of programming
have changed the question "Should it be in microcode or in assembler?" to the question "Should it be in
assembler or in C?" This parallels the CISC/RISC debate.
High-Level Languages Support
Early research was done also in the area of support for high-level languages. Support can be in the form of
microprogrammed implementations of some language primitive (for example, the trigonometric functions) or
support for the definition and processing of data structures (for example, trees and lists primitives). Many
interesting research projects have led to esoteric laboratory machines. More common examples include the
translate instructions, string searches and compares, or indexing multidimensional arrays.
Paging, Virtual Memory
An early and typical application of microprogramming is the implementation of the paging algorithm for a
virtual memory system. It is a typical application since it is a low-level function that must be time optimized
and is highly hardware dependent. Furthermore, the various maintenance functions which are required by the
paging algorithms and the disk I/Os can be done during the idle time of the processing of other functions or
during part of that processing in order to avoid I/O delays.
Diagnostics
Diagnostic functions have also been an early application of microprogramming. A firmware implementation
is ideally suited to test the various components of a computer system, since the gates, paths, and units can be
exercised in an isolated manner, therefore allowing one to precisely pinpoint the trouble area.
Controllers
Real-time controllers benefit from a microprogrammed implementation, due to the speed gained by
programming only the required functions, therefore avoiding the overhead of general-purpose instructions. Since the
microprogrammer can better make use of the available parallelism in the machine, long processes can still
support the asynchronous arrival of data by incorporating the interrupt polling at intervals in these processes.
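
The idea of polling at intervals inside a long process can be sketched as follows, using a pattern search as the
long process. The function name, the `poll` callback, and the default interval of 1024 positions are
hypothetical choices made for this Python illustration; a real controller would poll hardware flags instead.

```python
def find_pattern(data, pattern, poll, poll_interval=1024):
    """Return the index of the first occurrence of pattern in data, or -1.

    poll() is called once every poll_interval starting positions so that
    asynchronous events can be serviced while the search is in progress.
    """
    last_start = len(data) - len(pattern)
    for start in range(last_start + 1):
        if start % poll_interval == 0:
            poll()                                   # check for pending events
        if data[start:start + len(pattern)] == pattern:
            return start
    return -1
```

A shorter interval reduces the risk of missing an event at the cost of slowing the search, which is the kind
of timing trade-off the microprogrammer controls directly.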
High-Level Machines
Machines that directly implement the constructs of high-level languages can be easily implemented via
microprogramming. For example, Prolog machines and Lisp machines have been tried. It is also possible to conceive
an application directly microcoded. Although this could provide high-performance hardware, human errors
and software engineering practice seem to make such a machine more of a curiosity than a maintainable system.
Defining Terms
Control memory: A memory containing a set of microinstructions (a microprogram) that defines the
instruction set and operations of a CPU.
Emulator: The firmware that simulates a given machine architecture.
Firmware: Meant as an intermediate between software, which can be modified very easily, and hardware,
which is practically unchangeable (once built); the word firmware was coined to represent the
microprogram in control memory, i.e., the modifiable representation of the CPU instruction set.
High-level language for microprogramming: A high-level language more or less oriented toward the
description of a machine. Emulators can more easily be written in a high-level language; the source code is
compiled into the microinstructions for actual implementation.
Horizontal microinstruction: Theoretically, a completely horizontal microinstruction is made up of all the
possible microcommands available in a given CPU. In practice, some encoding is provided to reduce the
length of the instruction.
Microcommand: A small bit field indicating if a gate is open or closed, if a function is enabled or not, if a
control path is active or not, etc. A microcommand is therefore the specification of some action within
the control structure of a CPU.
Microinstruction: The set of microcommands to be executed or not, enabled or not. Each field of a
microinstruction is a microcommand. The instruction specifies the new state of the CPU.
Vertical microinstruction: A completely vertical microinstruction would contain one field and therefore
would specify one microcommand. An op code is used to specify which microcommand is specified. In
practice, microinstructions that typically contain three or four fields are called vertical.
Related Topic
86.3 Architecture
References
T. Agerwala, "Microprogram optimization: a survey," IEEE Trans. Comput., vol. C-25, no. 10, pp. 862-873, 1976.
J.D. Bagley, "Microprogrammable virtual machines," Computer, pp. 38-42, 1976.
D.K. Banerji and J. Raymond, Elements of Microprogramming, Englewood Cliffs, N.J.: Prentice-Hall, 1982.
G.F. Casaglia, "Nanoprogramming vs. microprogramming," Computer, pp. 54-58, 1976.
S.R. Das, D.K. Banerji, and A. Chattopadhyay, "On control memory minimization in microprogrammed digital
computers," IEEE Trans. Comput., vol. C-22, no. 9, pp. 845-848, 1973.
L.H. Jones, "An annotated bibliography on microprogramming," SIGMICRO Newsletter, vol. 6, no. 2, pp. 8-31,
1975.
L.H. Jones, "Instruction sequencing in microprogrammed computers," AFIPS Conf. Proc., vol. 44, pp. 91-98, 1975.
K. Malik and T.J. Lewis, "Design objectives for high level microprogramming languages," in Proceedings of the
11th Annual Microprogramming Workshop, Englewood Cliffs, N.J.: Prentice-Hall, 1978, pp. 154-160.
J. Raymond and D.K. Banerji, "Using a microprocessor in an intelligent graphics terminal," Computer, pp. 18-25,
1976.
M.V. Wilkes, W. Renwick, and D.J. Wheeler, "The design of the control unit of an electronic digital computer,"
Proc. IEE, pp. 121-128, 1958.
Further Information
J. Carter, Microprocessor Architecture and Microprogramming, a State Machine Approach, Englewood Cliffs, N.J.:
Prentice-Hall, 1996.
S. Habib, Ed., Microprogramming and Firmware Engineering Methods, New York: Van Nostrand Reinhold, 1988.
H. Okuno, N. Osato, and I. Takeuchi, "Firmware approach to fast Lisp interpreter," Twentieth Annual Workshop
on Microprogramming (MICRO-20), ACM, 1987.
A.J. Van der Hoeven, P. Van Prooijen, E.F. Deprettere, and P.M. Dewilde, "A hardware design system based
on object-oriented principles," IEEE, pp. 459-463, 1993.