Dr Shankar Balachandran
Indian Institute of Technology Madras
shankar@cse.iitm.ernet.in
14 October 2006
Two Major Blocks in a CPU
Datapath
Adders, multipliers, dividers
Shifters, Registers
Anything that changes or stores data
Control Unit
Controls the data
How is data stored?
Where is it stored?
When should data be available?
Control Unit
Correct sequencing of control signals
Much like the human brain controlling the various parts of the body
Sequence and timing are the key
Any aberration will result in wrong operation
A Simplified Control Unit
[Figure: the control unit sends the Fetch, Decode and Write Back signals to the Fetch Unit, Decode Unit and Write Back Unit.]
A Possible Implementation
[Figure: a Mod-3 counter driven by CLK feeds a 2-to-4 decoder.]
Timing Diagram
[Figure: timing diagram with waveforms for CLK, Fetch, Decode, Execute and Write Back.]
Let’s Sample The Signals
1 0 0 0
0 1 0 0
0 0 1 0
0 0 0 1
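The sampled pattern above can be reproduced with a counter feeding a 2-to-4 decoder. The following is a minimal sketch, assuming the counter steps through four states (0 to 3) so that all four decoder outputs are used; the function names are hypothetical and not taken from the slides.

```python
def decode_2_to_4(count):
    """2-to-4 decoder: exactly one of the four outputs is 1 for a 2-bit input."""
    return [1 if count == i else 0 for i in range(4)]


def control_sequence(num_clocks):
    """Yield the (Fetch, Decode, Execute, Write Back) signals once per clock."""
    count = 0
    for _ in range(num_clocks):
        yield decode_2_to_4(count)
        count = (count + 1) % 4  # assumption: wrap around after four states


signals = list(control_sequence(5))
for row in signals:
    print(*row)
```

Each printed row is one clock period; the fifth row repeats the first, which is the Fetch signal coming round again.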
Another Way to Generate Signals
1000
0100
0010
0001
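The slide does not name the circuit behind this second method. One common way to get the same four patterns without a decoder is a ring counter, that is, a shift register that circulates a single 1. The sketch below assumes that reading; `ring_counter` is a hypothetical name.

```python
def ring_counter(num_clocks, width=4):
    """Circulate a single 1 through a shift register, one position per clock."""
    state = [1] + [0] * (width - 1)
    history = []
    for _ in range(num_clocks):
        history.append("".join(str(bit) for bit in state))
        state = [state[-1]] + state[:-1]  # the last stage feeds the first
    return history


ring_states = ring_counter(5)
print(ring_states)
```

The register state is the set of control signals itself, so no decoding logic is needed between the flip-flops and the controlled units.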
Hardwired vs Microprogrammed
Hardwired
Use gates to generate signals
Squeeze out the juice for performance
Different logic styles possible
Microprogrammed
Store the control signals in a memory, in sequence
Just read from the memory every clock cycle
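To make the contrast concrete, the sketch below generates the same four one-hot signals twice: once with gate-style logic on a 2-bit state, and once by reading a small table every clock. The names `hardwired`, `CONTROL_ROM` and `microprogrammed` are hypothetical, and the four-entry table is an illustration rather than the control store of any real machine.

```python
def hardwired(q1, q0):
    """Gate-level decode of a 2-bit state (q1 q0) using AND and NOT only."""
    n1, n0 = 1 - q1, 1 - q0
    return (n1 & n0, n1 & q0, q1 & n0, q1 & q0)


# Microprogrammed style: the signals are simply stored, one word per state.
CONTROL_ROM = [
    (1, 0, 0, 0),
    (0, 1, 0, 0),
    (0, 0, 1, 0),
    (0, 0, 0, 1),
]


def microprogrammed(address):
    """Read the control word stored at the given address."""
    return CONTROL_ROM[address]


same = all(hardwired(a >> 1, a & 1) == microprogrammed(a) for a in range(4))
print(same)
```

Changing the hardwired version means redesigning the logic; changing the microprogrammed version means editing the table.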
A Model Computer
(Richard Eckert, SIGCSE Bulletin, Vol. 20, No. 3, September 1988)
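[Figure: block diagram of the model computer. A shared bus connects the PC (controls IP, LP, EP), MAR (LM), RAM (R, W), MDR (LD, ED), IR (LI, EI), Accumulator (LA, EA), ALU (A, S, EU) and Register B (LB). The figure also labels the bus and the control block. Path widths marked in the figure are 12, 8 and 4 bits.]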
More Details
L = Load
E = Copy to bus
A, S = Add and Subtract
IP = Increment PC
Sign bit goes to the control unit
[Figure: the model computer block diagram repeated, with the control signals IP, LP, EP, LM, R, W, LD, ED, LI, EI, LA, EA, A, S, EU and LB marked.]
Mnemonic  Opcode  Action                      Register Transfers  Active Controls
LDA       1       A←(Mem), Load Accumulator   1. MAR ←IR          EI, LM
                                              2. MDR ←M(MAR)      R
                                              3. A ←MDR           ED, LA
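The three LDA register transfers can be followed step by step in a small simulation. This is a sketch only: it assumes a 12-bit instruction whose low 8 bits are the operand address (one reading of the 12, 8 and 4 bit widths in the block diagram), and the dictionary of registers and the function name are hypothetical.

```python
def execute_lda(regs, ram):
    """Apply the three LDA register transfers; return the controls used per step."""
    steps = []
    regs["MAR"] = regs["IR"] & 0xFF   # 1. MAR <- IR (assumed: address = low 8 bits)
    steps.append(("EI", "LM"))
    regs["MDR"] = ram[regs["MAR"]]    # 2. MDR <- M(MAR)
    steps.append(("R",))
    regs["A"] = regs["MDR"]           # 3. A <- MDR
    steps.append(("ED", "LA"))
    return steps


ram = [0] * 256
ram[0x2A] = 0x123                      # made-up data word for the example
regs = {"IR": 0x12A, "MAR": 0, "MDR": 0, "A": 0}   # opcode 1 (LDA), address 2A
lda_steps = execute_lda(regs, ram)
print(regs["A"], lda_steps)
```

In each step one register is enabled onto the bus (an E signal) and another is loaded from it (an L signal), except for the memory read, which needs only R.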
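[Figure: the instruction decoder produces one line per instruction (LDA, STA, ADD, SUB, MBA, JMP, JN, Halt). These lines and the NF flag feed the control matrix, which produces the control signals.]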
Table with Sequencing
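        IP  LP     EP  LM  R   W   LD  ED  LI  EI     LA  EA  A   S   EU  LB
Fetch   T2  -      T0  T0  T1  -   -   T2  T2  -      -   -   -   -   -   -
LDA     -   -      -   T3  T4  -   -   T5  -   T3     T5  -   -   -   -   -
STA     -   -      -   T3  -   T5  T4  -   -   T3     -   T4  -   -   -   -
MBA     -   -      -   -   -   -   -   -   -   -      -   T3  -   -   -   T3
ADD     -   -      -   -   -   -   -   -   -   -      T3  -   T3  -   T3  -
SUB     -   -      -   -   -   -   -   -   -   -      T3  -   -   T3  T3  -
JMP     -   T3     -   -   -   -   -   -   -   T3     -   -   -   -   -   -
JN      -   T3*NF  -   -   -   -   -   -   -   T3*NF  -   -   -   -   -   -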
IP = T2
LP = T3*JMP + T3*JN*NF
EP = T0
LM = T0 + T3*LDA + T3*STA
R = T1 + T4*LDA
W = T5*STA
LD = T4*STA
ED = T2 + T5*LDA
LI = T2
A = T3*ADD
S = T3*SUB
…..
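The equations above are plain sum-of-products logic, so they translate directly into Boolean expressions. The sketch below covers only the equations the slide writes out; the signals hidden behind the trailing dots are left out rather than guessed. `control_signals` and its arguments are hypothetical names.

```python
INSTRUCTIONS = ("LDA", "STA", "ADD", "SUB", "MBA", "JMP", "JN")


def control_signals(t, instr, nf=0):
    """Return the set of active signals in T-state t for the decoded instruction."""
    T = [int(t == i) for i in range(6)]                      # one-hot T0..T5
    I = {name: int(instr == name) for name in INSTRUCTIONS}  # one-hot decoder lines
    equations = {
        "IP": T[2],
        "LP": (T[3] & I["JMP"]) | (T[3] & I["JN"] & nf),
        "EP": T[0],
        "LM": T[0] | (T[3] & I["LDA"]) | (T[3] & I["STA"]),
        "R": T[1] | (T[4] & I["LDA"]),
        "W": T[5] & I["STA"],
        "LD": T[4] & I["STA"],
        "ED": T[2] | (T[5] & I["LDA"]),
        "LI": T[2],
        "A": T[3] & I["ADD"],
        "S": T[3] & I["SUB"],
    }
    return {name for name, value in equations.items() if value}


print(sorted(control_signals(0, "LDA")))
print(sorted(control_signals(3, "JN", nf=1)))
```

Each `&` corresponds to an AND gate and each `|` to an OR gate in the control matrix, which is why a PLA is a natural fit.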
Control Matrix
Implement using discrete gates
Usually done using PLAs
Large control matrices are implemented hierarchically
For speed
A well-known process; design flows are widespread
An Alternate Implementation
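[Figure: the 4-bit opcode from IR goes to a Starting Address Generator. MAP, CD and NF select the next value of the uPC (clocked by CLK): the starting address, the jump address, or uPC + 1. The uPC addresses a 32 x 24 Control ROM (the Control Store), which loads the Microinstruction Register with its Control, HLT, CD, MAP and Jump Address fields.]
MAP  CD  Meaning
1    *   From IR
0    0   Unconditional branch within microprogram (to Jump Address)
0    1   Conditional branch: NF=0 => increment; NF=1 => branch to Jump Address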
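The next-address selection described by the MAP and CD bits above can be written as a small function. This is a sketch under the reading that a taken conditional branch goes to the jump address; the function and parameter names are hypothetical.

```python
def next_upc(upc, map_bit, cd, nf, jump_addr, start_addr):
    """Choose the next microprogram counter value from MAP, CD and NF."""
    if map_bit == 1:
        return start_addr      # MAP = 1, CD = *: starting address derived from IR
    if cd == 0:
        return jump_addr       # MAP = 0, CD = 0: unconditional branch
    # MAP = 0, CD = 1: conditional branch on the negative flag
    return jump_addr if nf else upc + 1


# Values for the JN microinstruction at address 0D with jump address 0F.
taken = next_upc(0x0D, map_bit=0, cd=1, nf=1, jump_addr=0x0F, start_addr=None)
not_taken = next_upc(0x0D, map_bit=0, cd=1, nf=0, jump_addr=0x0F, start_addr=None)
print(hex(taken), hex(not_taken))
```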
Control Store
Instruction  Op-Code  Address  Control Signals   CD  MAP  HLT  Addr. of Next uInstruction
Fetch        0        00       0011000000000000  0   0    0    01
                      01       0000100000000000  0   0    0    02
                      02       1000000110000000  0   1    0    XX
LDA          1        03       0001000001000000  0   0    0    04
                      04       0000100000000000  0   0    0    05
                      05       0000000100100000  0   0    0    00
STA          2        06       0001000001000000  0   0    0    07
                      07       0000001000010000  0   0    0    08
                      08       0000010000000000  0   0    0    00
ADD          3        09       0000000000101010  0   0    0    00
SUB          4        0A       0000000000100110  0   0    0    00
MBA          5        0B       0000000000010001  0   0    0    00
JMP          6        0C       0100000001000000  0   0    0    00
JN           7        0D       0000000000000000  1   0    0    0F
                      0E       0000000000000000  0   0    0    00
                      0F       0100000001000000  0   0    0    00
Expansion    8-E      10-1E
HLT          F        1F       0000000000000000  0   0    1    XX
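A 16-bit control word from the table above is easier to read once its bits are mapped back to signal names. The sketch below assumes the bits follow, left to right, the signal order used in the header of the sequencing table (IP first, LB last); `active_signals` is a hypothetical helper.

```python
SIGNALS = "IP LP EP LM R W LD ED LI EI LA EA A S EU LB".split()


def active_signals(word):
    """List the control signals whose bit is 1 in a 16-character control word."""
    return [name for name, bit in zip(SIGNALS, word) if bit == "1"]


fetch_t0 = active_signals("0011000000000000")   # address 00
fetch_t2 = active_signals("1000000110000000")   # address 02
add_word = active_signals("0000000000101010")   # address 09 (ADD)
mba_word = active_signals("0000000000010001")   # address 0B (MBA)
print(fetch_t0, fetch_t2, add_word, mba_word)
```

The decoded words agree with the hardwired equations: address 00 asserts EP and LM, exactly the signals that are active in T0.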
Example 1 – MBA followed by ADD
Control Word bit order: IP LP EP LM R W LD ED LI EI LA EA A S EU LB
[Figure: the control store table repeated, with the MAP step at address 02 marked as going to 0B (MBA) and then to 09 (ADD).]
Sequence for MBA,ADD
MOV B,A    1. MAR ←PC    0011000000000000
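The full microinstruction sequence for MBA followed by ADD can be traced by stepping the uPC through the relevant control store entries. This is a simplified sketch: the opcodes (5 for MBA, 3 for ADD) are passed in directly instead of being fetched from RAM, only the five entries needed here are copied from the control store table, and all names are hypothetical.

```python
SIGNALS = "IP LP EP LM R W LD ED LI EI LA EA A S EU LB".split()

# address -> (control word, MAP bit, address of next microinstruction)
STORE = {
    0x00: ("0011000000000000", 0, 0x01),
    0x01: ("0000100000000000", 0, 0x02),
    0x02: ("1000000110000000", 1, None),   # MAP = 1: next address comes from IR
    0x09: ("0000000000101010", 0, 0x00),   # ADD
    0x0B: ("0000000000010001", 0, 0x00),   # MBA
}
START = {3: 0x09, 5: 0x0B}                 # opcode -> starting address


def trace(opcodes):
    """Return (uPC, active signals) for every microinstruction executed."""
    rows = []
    for opcode in opcodes:
        upc = 0x00                          # every instruction starts with Fetch
        while True:
            word, map_bit, nxt = STORE[upc]
            active = [s for s, b in zip(SIGNALS, word) if b == "1"]
            rows.append((upc, active))
            upc = START[opcode] if map_bit else nxt
            if upc == 0x00:                 # back at Fetch: instruction finished
                break
    return rows


rows = trace([5, 3])
for upc, active in rows:
    print(f"{upc:02X}", " ".join(active))
```

Each instruction costs three fetch microinstructions plus one execute microinstruction, so the two instructions take eight clock cycles in this model.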
Caution :
Microinstructions can get very wide
Solution :
There is no free lunch.
Can we pipeline microfetch?
A neat idea :
Why wait till the current micro-op is over?
Branch field gives next operation
Get the next op
Caveat :
External inputs and status flags may change the order
What about interrupts?
They are going to follow you everywhere
Should have a mechanism that can invalidate microcode prefetch
Similar to pipeline flush for instructions
Commonly used
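The prefetch-and-invalidate idea can be illustrated with the JN microroutine. The sketch below assumes a very simple policy that is not stated in the slides: always prefetch the address in the branch field, and count an invalidation whenever the NF flag makes the real next address differ. The table keeps only the CD bit and branch field for addresses 0D to 0F, and the names are hypothetical.

```python
# address -> (CD bit, branch field), taken from the JN entries of the control store
STORE = {
    0x0D: (1, 0x0F),
    0x0E: (0, 0x00),
    0x0F: (0, 0x00),
}


def run(start, nf):
    """Execute until the uPC returns to 00; count invalidated prefetches."""
    executed, invalidated = [], 0
    upc = start
    while upc != 0x00:
        cd, branch = STORE[upc]
        executed.append(upc)
        prefetched = branch                      # guess made before the op finishes
        actual = branch if (cd == 0 or nf) else upc + 1
        if prefetched != actual:
            invalidated += 1                     # discard the prefetched micro-op
        upc = actual
    return executed, invalidated


taken_path = run(0x0D, nf=1)
fallthrough_path = run(0x0D, nf=0)
print(taken_path, fallthrough_path)
```

When NF is 1 the guess is right and no work is lost; when NF is 0 the prefetched microinstruction at 0F has to be thrown away, which is the microcode analogue of a pipeline flush.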
Historical Perspectives
Hardwired Logic
Popular before the 1960s
Only way people did it
Popular now
Speed Benefits
Microprogram
Popular in the 1970s
Memory was slower than CPU
No on-chip cache
Best way is to store the microcode
Now: depends on who you ask
Shades of gray :
Extremes of spectrum are harder to find nowadays
Tools for Design
Hardwired
Any state machine optimizer
Assigning states, minimizing transitions, races, hazards,……..
Microcoding
Small ones can be in binary
Large ones – Use microassembler
Very useful debug tool
Can use microassembler simultaneously with actual hardware development
Hardwired vs Microcoding
Hardwired units are faster and smaller
Emulation is easy with microcoding
Hardwired design is complex if large
Bugs in a hardwired design cannot be fixed in the field
Hardwired control is not suited for loops
Looping with microcode can be made as fast
Hardwired vs Microcode vs RISC
RISC
Simpler instruction set
Hardwired Implementation
Microcoding in P4
Two pointers control flow independently
Both processors share the ROM entries
Access is alternated between the CPUs
Thank You