Beruflich Dokumente
Kultur Dokumente
$2 = 10 before sub;
$2 = -20 after sub
CC 1
Value of
register $2: 10
CC 2
CC 3
CC 4
CC 5
CC 6
CC 7
CC 8
CC 9
10
10
10
10/ 20
20
20
20
20
DM
Reg
Program
execution
order
(in instructions)
sub $2, $1, $3
sub
and
or
add
sw
$2,
$12,
$13,
$14,
$15,
$1, $3
$2, $5
$6, $2
$2, $2
100($2)
or $13, $6, $2
sw $15, 100($2)
IM
Reg
IM
DM
Reg
IM
DM
Reg
IM
Reg
DM
Reg
IM
Reg
Reg
Reg
DM
Reg
Software Solution
sub
$2,
$1, $3
lw
slt
and
or
add
sw
$10, 40($3)
$5, $6, $7
$12, $2, $5
$13, $6, $2
$14, $2, $2
$15, 100($2)
or
sub
$2,
nop
nop
and
or
add
sw
$12,
$13,
$14,
$15,
$1, $3
$2, $5
$6, $2
$2, $2
100($2)
Hardware Solution:
Forwarding
Idea: use intermediate data, do not wait for result to be
finally written to the destination register. Two steps:
1.
2.
ID/EX
0
M
u
x
1
WB
Control
IF/ID
EX/ME M
WB
EX
ME M/WB
WB
Add
Add
Instruction
memory
Branch
Shift
left 2
ALUSrc
Read
register 1
Read
data 1
Read
register 2
Registers Read
W rite
data 2
register
W rite
data
MemtoReg
Address
Add
result
MemWrite
PC
Instruction
RegW rite
Zero
ALU
0
M
u
x
1
ALU
result
Read
data
Address
Data
memory
Write
data
Instruction
16
[15 0]
Control signals
emanate from
the control
portions of the
pipeline registers
Instruction
[20 16]
Instruction
[15 11]
Sign
extend
32
ALU
control
0
M
u
x
1
ALUOp
RegDst
M emRead
1
M
u
x
0
Hazard Detection
Hazard conditions:
Eg., in the earlier example, first hazard between sub $2, $1, $3 and
and $12, $2, $5 is detected when the and is in EX stage and the
sub is in MEM stage because
Data Forwarding
Plan:
allow inputs to the ALU not just from ID/EX, but also later
pipeline registers, and
use multiplexors and control signals to choose appropriate
inputs to ALU
Time (in clock cycles)
CC 1
Value of register $2 : 10
Value of EX/MEM : X
Value of MEM/WB : X
CC 2
CC 3
CC 4
CC 5
CC 6
CC 7
CC 8
CC 9
10
X
X
10
X
X
10
20
X
10/20
X
20
20
X
X
20
X
X
20
X
X
20
X
X
DM
Reg
Program
execution order
(in instructions)
sub $2, $1, $3
sub
and
or
add
sw
$2,
$12,
$13,
$14,
$15,
$1, $3
$2, $5
$6, $2
$2, $2
100($2)
or $13, $6, $2
sw $15, 100($2)
IM
Reg
IM
Reg
IM
DM
Reg
IM
Reg
DM
Reg
IM
Reg
DM
Reg
Reg
DM
Reg
ID/EX
R e g is te rs
EX/MEM
MEM/WB
ALU
D a ta
m e m o ry
a . N o fo rw a rd in g
M
u
x
E X /M E M
M E M /W B
M
u
x
R e g is te rs
F o r w a rd A
ALU
M
u
x
Rs
Rt
Rt
Rd
D a ta
m e m o ry
F o rw a rd B
M
u
x
E X /M E M .R e g iste rR d
F o rw a rd in g
u n it
b . W ith fo rw a rd in g
M E M /W B .R e g is te rR d
M
u
x
Forwarding Hardware:
Multiplexor Control
Mux control
Source
ForwardA = 00
ForwardA = 10
ForwardA = 01
ID/EX
EX/MEM
MEM/WB
ForwardB = 00
ForwardB = 10
ForwardB = 01
ID/EX
EX/MEM
MEM/WB
Explanation
The first ALU operand comes from the register file
The first ALU operand is forwarded from prior ALU result
The first ALU operand is forwarded from data memory
or an earlier ALU result
The second ALU operand comes from the register file
The second ALU operand is forwarded from prior ALU result
The second ALU operand is forwarded from data memory
or an earlier ALU result
1.
EX/MEM.RegWrite
// if there is a write
and ( EX/MEM.RegisterRd 0 )
// to a non-$0 register
and ( EX/MEM.RegisterRd = ID/EX.RegisterRt ) ) // which matches, then
ForwardB = 10
MEM hazard
if (
MEM/WB.RegWrite
and ( MEM/WB.RegisterRd 0 )
and ( EX/MEM.RegisterRd ID/EX.RegisterRs )
// if there is a write
// to a non-$0 register
// and not already a register match
// with earlier pipeline register
and ( MEM/WB.RegisterRd = ID/EX.RegisterRs ) ) // but match with later pipeline
register, then
ForwardA = 01
if (
MEM/WB.RegWrite
and ( MEM/WB.RegisterRd 0 )
and ( EX/MEM.RegisterRd ID/EX.RegisterRt )
// if there is a write
// to a non-$0 register
// and not already a register match
// with earlier pipeline register
and ( MEM/WB.RegisterRd = ID/EX.RegisterRt ) ) // but match with later pipeline
register, then
ForwardB = 01
This check is necessary, e.g., for sequences such as add $1, $1, $2; add $1, $1, $3; add $1, $1, $4;
(array summing?), where an earlier pipeline (EX/MEM) register has more recent data
PC
Instruction
memory
Instruction
IF/ID
EX/MEM
WB
EX
MEM/WB
WB
M
u
x
Registers
ALU
Data
memory
M
u
x
IF/ID.RegisterRs
Rs
IF/ID.RegisterRt
Rt
IF/ID.RegisterRt
Rt
IF/ID.RegisterRd
Rd
M
u
x
EX/MEM.RegisterRd
Forwarding
unit
MEM/WB.RegisterRd
M
u
x
Forwarding
Execution
example:
sub
and
or
add
$2,
$4,
$4,
$9,
$1,
$2,
$4,
$4,
$3
$5
$2
$2
or $4, $4, $2
and $4 , $2, $5
sub $ 2 , $ 1 , $ 3
b e fo re < 1 >
b e fo re < 2 >
ID/EX
$2,
$4,
$4,
$9,
$1,
$2,
$4,
$4,
$3
$5
$2
$2
10
Control
IF/ID
PC
Instruction
memory
10
WB
In stru ctio n
sub
and
or
add
$2
EX/MEM
WB
EX
WB
$1
M
u
x
5
Registers
ALU
$5
$3
M
u
x
M
u
x
Forwarding
unit
C lo c k 3
MEM/WB
Data
memory
M
u
x
a dd $9 , $ 4, $2
sub
and
or
add
$2,
$4,
$4,
$9,
$1,
$2,
$4,
$4,
or $ 4 , $ 4, $ 2
$3
$5
$2
$2
a nd $4 , $ 2, $5
10
memory
10
WB
EX/MEM
10
Control
I nstr uction
PC
$4
WB
EX
MEM/WB
WB
$2
M
u
x
6
Registers
Data
ALU
$2
memory
$5
M
u
x
M
u
x
Forwarding
unit
C lo c k 4
b e fo r e < 1 >
ID/EX
IF/ID
Instruction
sub $ 2, . . .
M
u
x
sub
and
or
add
$2,
$4,
$4,
$9,
$1,
$2,
$4,
$4,
or $4, $4, $2
$3
$5
$2
$2
a nd $ 4, . . .
sub $ 2 , . . .
ID/EX
10
10
WB
EX/MEM
10
Control
WB
EX
MEM/WB
1
IF/ID
PC
Instruction
memory
In s tru ctio n
$4
WB
$4
M
u
x
Registers
Data
ALU
$2
memory
$2
M
u
x
M
u
x
M
u
x
Forwarding
unit
C lo c k 5
a f te r < 1 >
a f te r < 2 >
sub
and
or
add
$2,
$4,
$4,
$9,
$1,
$2,
$4,
$4,
a dd $9 , $ 4, $2
$3
$5
$2
$2
or $4, . . .
and $4, . . .
ID/EX
10
WB
EX/MEM
10
Control
WB
EX
MEM/WB
1
IF/ID
WB
PC
Instruction
memory
In stru ctio n
$4
M
u
x
Registers
Data
ALU
memory
$2
M
u
x
M
u
x
4
2
M
u
x
Forwarding
unit
C lo c k 6
lw
and
or
add
Slt
$2,
$4,
$8,
$9,
$1,
20($1)
$2, $5
$2, $6
$4, $2
$6, $7
As even a pipeline
dependency goes
backward in time
forwarding will not
solve the hazard
Program
CC 1
execution
order
(in instructions)
lw $2, 20($1)
or $8, $2, $6
IM
CC 2
CC 3
R eg
IM
CC 4
CC 5
DM
Reg
Reg
IM
DM
Reg
IM
CC 6
CC 8
CC 9
R eg
DM
Reg
IM
CC 7
Reg
DM
Reg
Reg
DM
Reg
ID/EX
0
M
u
x
1
WB
Control
IF/ID
EX/MEM
WB
EX
MEM/WB
WB
Add
ALUSrc
Read
register 1
Read
data 1
Read
register 2
Registers Read
Write
data 2
register
Write
data
Zero
ALU ALU
result
0
M
u
x
1
MemtoReg
Instruction
memory
Branch
Shift
left 2
MemWrite
Address
Instruction
PC
Add
Add result
RegWrite
Address
Data
memory
Read
data
Write
data
Instruction 16
[150]
Control signals
emanate from
the control
portions of the
pipeline registers
Instruction
[2016]
Instruction
[1511]
Sign
extend
32
ALU
control
0
M
u
x
1
ALUOp
RegDst
MemRead
1
M
u
x
0
if ( ID/EX.MemRead
// if the instruction in the EX stage is a load
and ( ( ID/EX.RegisterRt = IF/ID.RegisterRs )
// and the destination register
or ( ID/EX.RegisterRt = IF/ID.RegisterRt ) ) ) // matches either source register
// of the instruction in the ID stage, then
stall the pipeline
Mechanics of Stalling
does not let the IF/ID register change (disable write!) this will
cause the instruction in the ID stage to repeat, i.e., stall
therefore, the instruction, just behind, in the IF stage must be
stalled as well so hardware does not let the PC change
(disable write!) this will cause the instruction in the IF stage to
repeat, i.e., stall
changes all the EX, MEM and WB control fields in the ID/EX
pipeline register to 0, so effectively the instruction just behind
the load becomes a nop a bubble is said to have been inserted
into the pipeline
note that we cannot turn that instruction into an nop by 0ing all the
bits in the instruction itself recall nop = 000 (32 bits) because
it has already been decoded and control signals generated
IF/IDWrite
Hazard
detection
unit
ID/EX
WB
Control
0
M
u
x
PC
Instruction
memory
Instruction
PCWrite
IF/ID
EX/MEM
WB
EX
MEM/WB
WB
M
u
x
Registers
ALU
Data
memory
M
u
x
IF/ID.RegisterRs
IF/ID.RegisterRt
IF/ID.RegisterRt
Rt
IF/ID.RegisterRd
Rd
ID/EX.RegisterRt
Rs
Rt
M
u
x
EX/MEM.RegisterRd
Forwarding
unit
MEM/WB.RegisterRd
M
u
x
lw
and
or
add
Slt
$2,
$4,
$8,
$9,
$1,
20($1)
$2, $5
$2, $6
$4, $2
$6, $7
and$4, $2, $5
or $8, $2, $6
IM
CC 3
Reg
IM
CC 4
CC 5
DM
Reg
Reg
Reg
IM
IM
CC 6
CC 7
DM
Reg
Reg
DM
CC 8
CC 9
CC 10
Reg
bubble
add$9, $4, $2
IM
DM
Reg
IM
Reg
Reg
DM
Reg
Stall inserted
here
Or, more
accurately
Stalling
Execution
example:
lw
and
or
add
$2,
$4,
$4,
$9,
20($1)
$2, $5
$4, $2
$4, $2
a nd $ 4 , $ 2 , $ 5
lw $ 2 , 2 0 ( $ 1 )
b e fo re < 1 >
Hazard
b e fo re < 3 >
ID/EX.MemRead
detection
unit
b e fo re < 2 >
ID/EX
X
11
IF /ID W rite
WB
M
u
x
Control
EX/MEM
WB
EX
MEM/WB
0
IF/ID
Instruction
memory
PC
Instructio n
P C W rite
$1
M
u
x
X
Registers
ALU
$X
M
u
x
lw
and
or
add
$2,
$4,
$4,
$9,
20($1)
$2, $5
$4, $2
$4, $2
C lo c k 2
WB
X
2
ID/EX.RegisterRt
M
u
x
Forwarding
unit
Data
memory
M
u
x
or $4, $4, $2
lw $ 2 , 2 0 ( $ 1 )
a nd $4 , $ 2, $5
b e fo re < 1 >
b e fo r e < 2 >
Hazard
ID/EX.MemRead
detection
2
unit
ID/EX
5
11
00
I F /ID W rite
WB
M
u
x
Control
EX/MEM
WB
EX
MEM/WB
P C W rite
IF/ID
$2
PC
Instruction
I n structio n
$1
M
u
x
5
Registers
ALU
memory
$5
$X
M
u
x
lw
and
or
add
$2,
$4,
$4,
$9,
C lo c k 3
20($1)
$2, $5
$4, $2
$4, $2
WB
X
2
4
ID/EX.RegisterRt
M
u
x
Forwarding
unit
Data
memory
M
u
x
or $4 , $ 4, $2
a nd $ 4, $2 , $ 5
b u b b le
Hazard
b e fo r e < 1 >
ID/EX.MemRead
detection
lw $ 2 , . . .
unit
ID/EX
5
10
00
IF /ID W r ite
WB
M
u
x
Control
EX/MEM
WB
EX
11
MEM/WB
0
IF/ID
PC
Instruction
I ns tru ctio n
P C W r ite
$2
WB
$2
M
u
x
5
Registers
Data
ALU
memory
$5
memory
$5
M
u
x
lw
and
or
add
$2,
$4,
$4,
$9,
C lo c k 4
20($1)
$2, $5
$4, $2
$4, $2
ID/EX.RegisterRt
M
u
x
Forwarding
unit
M
u
x
or $4, $4, $2
Hazard
lw $ 2 , . . .
ID/EX.MemRead
detection
unit
b u b b le
ID/EX
2
I F / ID W rite
10
10
WB
M
u
x
Control
EX/MEM
0
WB
EX
MEM/WB
0
11
P C W rit e
IF/ID
PC
Instruction
Ins truction
$4
WB
$2
M
u
x
Registers
ALU
memory
$2
$5
Data
memory
M
u
x
M
u
x
lw
and
or
add
$2,
$4,
$4,
$9,
C lo c k 5
20($1)
$2, $5
$4, $2
$4, $2
ID/EX.RegisterRt
M
u
x
Forwarding
unit
a f te r < 1 >
Hazard
unit
ID/EX
10
10
WB
IF / ID W rite
b u b b le
ID/EX.MemRead
detection
4
2
a nd $4 , . . .
or $4, $4, $2
M
u
x
Control
EX/MEM
10
M
WB
EX
MEM/WB
P CW rit e
IF/ID
Instruction
PC
I ns truc t io n
$4
WB
$4
M
u
x
2
Registers
Data
ALU
memory
$2
memory
$2
M
u
x
lw
and
or
add
$2,
$4,
$4,
$9,
C lo c k 6
20($1)
$2, $5
$4, $2
$4, $2
ID/EX.RegisterRt
M
u
x
Forwarding
unit
M
u
x
a f te r < 2 >
a ft e r < 1 >
add $9 , $4, $2
Hazard
or $4, . . .
a nd $ 4 , . . .
ID/EX.MemRead
detection
unit
ID/EX
10
10
I F/ I DW r ite
WB
M
u
x
Control
EX/MEM
10
WB
EX
MEM/WB
0
1
IF/ID
WB
PC
Instruction
Ins tr uct io n
PC W rite
$4
M
u
x
4
Registers
Data
ALU
memory
memory
$2
M
u
x
M
u
x
4
2
lw
and
or
add
$2,
$4,
$4,
$9,
C lo c k 7
20($1)
$2, $5
$4, $2
$4, $2
9
ID/EX.RegisterRt
M
u
x
Forwarding
unit
sequential instructions
Predicting Branch-not-taken:
Misprediction delay
Time (in clock cycles)
Program
execution
CC 1
CC 2
order
(in instructions)
40 beq $1, $3, 7
48 or $13, $6, $2
72 lw $4, 50($7)
IM
CC 3
Reg
IM
CC 4
CC 5
DM
Reg
Reg
IM
DM
Reg
IM
CC 6
CC 8
CC 9
Reg
DM
Reg
IM
CC 7
Reg
DM
Reg
Reg
DM
Reg
Move the branch decision from the MEM stage (as in our
current pipeline) earlier to the ID stage
with the more efficient equality test we can put it in the ID stage
without significantly lengthening this stage remember an objective
of pipeline design is to keep pipeline stages balanced
Flushing on Misprediction
M
u
x
WB
Control
M
u
x
IF/ID
EX/MEM
WB
EX
MEM/WB
WB
Shift
left 2
Registers
PC
M
u
x
Instruction
memory
ALU
M
u
x
Data
memory
M
u
x
Sign
extend
M
u
x
Forwarding
unit
Branch decision is moved from the MEM stage to the ID stage simplified drawing
not showing enhancements to the forwarding and hazard detection units
sub
beq
and
or
add
slt
...
lw
$10,
$1,
$12,
$13,
$14,
$15,
$4,
$3,
$2,
$2,
$4,
$6,
$8
7
$5
$6
$2
$7
$4, 50($7)
IF
ID
EX
MEM
WB
IF
ID
EX
MEM
WB
IF
ID
EX
MEM
WB
IF
ID
EX
MEM
WB
lw
IF
ID
EX
MEM
WB
IF
ID
EX
MEM
WB
IF
ID
ID
EX
MEM
WB
lw
IF
beq stalled
beq stalled
beq $1, $0, target
ID
EX
IF
ID
MEM
WB
ID
ID
EX
MEM
WB