Hazards PDF

Revisiting Hazards
So far our datapath and control have ignored hazards

We shall revisit data hazards and control hazards and
enhance our datapath and control to handle them in
hardware
Data Hazards and Forwarding

Problem with starting an instruction before the previous ones have finished:
data dependencies that go backward in time, called data hazards

[Figure: pipeline diagram, time in clock cycles CC 1 to CC 9, for the sequence below]
$2 = 10 before sub; $2 = -20 after sub

Value of register $2 per clock cycle:
CC 1   CC 2   CC 3   CC 4   CC 5     CC 6   CC 7   CC 8   CC 9
10     10     10     10     10/-20   -20    -20    -20    -20

Program execution order (in instructions):
sub $2, $1, $3
and $12, $2, $5
or  $13, $6, $2
add $14, $2, $2
sw  $15, 100($2)
Software Solution
Have the compiler guarantee that there are never any data hazards:
by rearranging instructions to insert independent instructions
between instructions that would otherwise have a data hazard
between them,
or, if such rearrangement is not possible, by inserting nops
sub $2, $1, $3
lw  $10, 40($3)
slt $5, $6, $7
and $12, $2, $5
or  $13, $6, $2
add $14, $2, $2
sw  $15, 100($2)

or

sub $2, $1, $3
nop
nop
and $12, $2, $5
or  $13, $6, $2
add $14, $2, $2
sw  $15, 100($2)
Such compiler solutions may not always be possible, and nops
slow the machine down
MIPS: nop = no operation = 00...0 (32 bits) = sll $0, $0, 0
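
The nop-insertion fallback can be sketched in a few lines of Python. This is only an illustration, not a real compiler pass: the tuple representation `(opcode, destination, sources)` and the function name `insert_nops` are hypothetical, and the sketch assumes no forwarding, with a register written in the first half of a cycle readable in the second half, so two instructions must separate a write from a dependent read.

```python
# Hypothetical instruction format: (opcode, destination register or None, source registers).
NOP = ("nop", None, ())

def insert_nops(program, gap=2):
    """Pad with nops so that at least `gap` instructions separate a register
    write from a later read of the same register (assumes no forwarding)."""
    out = []
    for op, dest, srcs in program:
        needed = 0
        # Inspect the last `gap` emitted instructions, nearest first.
        for back, (_, prev_dest, _) in enumerate(reversed(out[-gap:]), start=1):
            if prev_dest is not None and prev_dest != 0 and prev_dest in srcs:
                needed = max(needed, gap - back + 1)
        out.extend([NOP] * needed)
        out.append((op, dest, srcs))
    return out

program = [
    ("sub", 2, (1, 3)),     # sub $2, $1, $3
    ("and", 12, (2, 5)),    # and $12, $2, $5
    ("or", 13, (6, 2)),     # or  $13, $6, $2
    ("add", 14, (2, 2)),    # add $14, $2, $2
    ("sw", None, (15, 2)),  # sw  $15, 100($2)
]
padded = insert_nops(program)
print([op for op, _, _ in padded])
```

For the example sequence this yields two nops directly after the sub, which agrees with the second listing above. A real compiler would first try to move independent instructions into those slots.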
Hardware Solution: Forwarding
Idea: use intermediate data; do not wait for the result to be
finally written to the destination register. Two steps:
1. Detect data hazard
2. Forward intermediate data to resolve hazard
Pipelined Datapath with Control II (as before)

[Figure: pipelined datapath with control, showing the IF/ID, ID/EX, EX/MEM and MEM/WB pipeline registers with their WB, M and EX control fields]
Control signals emanate from the control portions of the pipeline registers
Hazard Detection
Hazard conditions:
1a. EX/MEM.RegisterRd = ID/EX.RegisterRs
1b. EX/MEM.RegisterRd = ID/EX.RegisterRt
2a. MEM/WB.RegisterRd = ID/EX.RegisterRs
2b. MEM/WB.RegisterRd = ID/EX.RegisterRt
E.g., in the earlier example, the first hazard, between sub $2, $1, $3 and
and $12, $2, $5, is detected when the and is in the EX stage and the
sub is in the MEM stage, because
EX/MEM.RegisterRd = ID/EX.RegisterRs = $2 (1a)
Whether to forward also depends on:
whether the instruction in the later pipeline stage (the earlier one in program order) is going to write a register: if not, there is no need to
forward, even if there is a register number match as in the conditions above
whether the destination register of that instruction is $0, in which case
there is no need to forward the value ($0 is always 0 and never overwritten)
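
As a rough illustration of the four register-number comparisons, the following Python sketch reports which of the conditions 1a to 2b hold for a given pipeline state. The function name `matching_conditions` and the use of `None` for an empty stage are assumptions made for this example; the RegWrite and $0 qualifications are deliberately left out here.

```python
def matching_conditions(ex_mem_rd, mem_wb_rd, id_ex_rs, id_ex_rt):
    """Return the labels of the hazard conditions whose register numbers match.
    Arguments are register numbers; None stands for 'no instruction in that stage'."""
    checks = {
        "1a": ex_mem_rd == id_ex_rs,
        "1b": ex_mem_rd == id_ex_rt,
        "2a": mem_wb_rd == id_ex_rs,
        "2b": mem_wb_rd == id_ex_rt,
    }
    return [label for label, hit in checks.items() if hit]

# and $12, $2, $5 in EX while sub $2, $1, $3 is in MEM
first = matching_conditions(ex_mem_rd=2, mem_wb_rd=None, id_ex_rs=2, id_ex_rt=5)
# or $13, $6, $2 in EX while and is in MEM and sub is in WB
second = matching_conditions(ex_mem_rd=12, mem_wb_rd=2, id_ex_rs=6, id_ex_rt=2)
print(first, second)
```

The first call reproduces the 1a match described above; the second shows the same sub result being picked up one cycle later through the MEM/WB register (2b).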
Data Forwarding
Plan:
allow inputs to the ALU not just from ID/EX, but also later
pipeline registers, and
use multiplexors and control signals to choose appropriate
inputs to ALU
[Figure: pipeline diagram, clock cycles CC 1 to CC 9, for the sequence below, with forwarding paths drawn between pipeline registers]

                            CC 1  CC 2  CC 3  CC 4  CC 5    CC 6  CC 7  CC 8  CC 9
Value of register $2:       10    10    10    10    10/-20  -20   -20   -20   -20
Value of EX/MEM:            X     X     X     -20   X       X     X     X     X
Value of MEM/WB:            X     X     X     X     -20     X     X     X     X

Program execution order (in instructions):
sub $2, $1, $3
and $12, $2, $5
or  $13, $6, $2
add $14, $2, $2
sw  $15, 100($2)

Dependencies between pipeline registers move forward in time
[Figure a, no forwarding: datapath before adding forwarding hardware (ID/EX, Registers, ALU, EX/MEM, Data memory, MEM/WB)]
[Figure b, with forwarding: datapath after adding forwarding hardware. The forwarding unit takes Rs and Rt from ID/EX together with EX/MEM.RegisterRd and MEM/WB.RegisterRd, and drives the ForwardA and ForwardB multiplexors at the ALU inputs]
Forwarding Hardware: Multiplexor Control

Mux control     Source   Explanation
ForwardA = 00   ID/EX    The first ALU operand comes from the register file
ForwardA = 10   EX/MEM   The first ALU operand is forwarded from the prior ALU result
ForwardA = 01   MEM/WB   The first ALU operand is forwarded from data memory or an earlier ALU result (*)
ForwardB = 00   ID/EX    The second ALU operand comes from the register file
ForwardB = 10   EX/MEM   The second ALU operand is forwarded from the prior ALU result
ForwardB = 01   MEM/WB   The second ALU operand is forwarded from data memory or an earlier ALU result (*)

(*) Depending on the selection in the rightmost multiplexor
(see datapath with control diagram)
Data Hazard: Detection and Forwarding
The forwarding unit determines multiplexor control according to the
following rules:

1. EX hazard
if ( EX/MEM.RegWrite                                   // if there is a write
     and ( EX/MEM.RegisterRd ≠ 0 )                     // to a non-$0 register
     and ( EX/MEM.RegisterRd = ID/EX.RegisterRs ) )    // which matches, then
    ForwardA = 10
if ( EX/MEM.RegWrite
     and ( EX/MEM.RegisterRd ≠ 0 )
     and ( EX/MEM.RegisterRd = ID/EX.RegisterRt ) )    // which matches, then
    ForwardB = 10

2. MEM hazard
if ( MEM/WB.RegWrite
     and ( MEM/WB.RegisterRd ≠ 0 )
     and ( EX/MEM.RegisterRd ≠ ID/EX.RegisterRs )      // and not already a register match
                                                       // with earlier pipeline register
     and ( MEM/WB.RegisterRd = ID/EX.RegisterRs ) )    // but match with later pipeline register, then
    ForwardA = 01
if ( MEM/WB.RegWrite
     and ( MEM/WB.RegisterRd ≠ 0 )
     and ( EX/MEM.RegisterRd ≠ ID/EX.RegisterRt )      // and not already a register match
                                                       // with earlier pipeline register
     and ( MEM/WB.RegisterRd = ID/EX.RegisterRt ) )    // but match with later pipeline register, then
    ForwardB = 01

The check against EX/MEM in the MEM hazard rules is necessary, e.g., for sequences such as add $1, $1, $2; add $1, $1, $3; add $1, $1, $4;
(array summing?), where the earlier pipeline register (EX/MEM) has the more recent data
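
The rules above translate almost line for line into software. The Python sketch below is a behavioural model only, not hardware; the function name `forwarding_unit` and its argument names are chosen for this example, and the mux codes follow the ForwardA/ForwardB encoding used in these slides.

```python
def forwarding_unit(ex_mem_reg_write, ex_mem_rd,
                    mem_wb_reg_write, mem_wb_rd,
                    id_ex_rs, id_ex_rt):
    """Return (ForwardA, ForwardB) as 2-bit mux selects: 0b00, 0b10 or 0b01."""
    forward_a = forward_b = 0b00  # default: operands come from the register file

    # EX hazard: result of the immediately preceding instruction (in EX/MEM)
    if ex_mem_reg_write and ex_mem_rd != 0 and ex_mem_rd == id_ex_rs:
        forward_a = 0b10
    if ex_mem_reg_write and ex_mem_rd != 0 and ex_mem_rd == id_ex_rt:
        forward_b = 0b10

    # MEM hazard: only if EX/MEM does not already hold a matching register
    if (mem_wb_reg_write and mem_wb_rd != 0
            and ex_mem_rd != id_ex_rs and mem_wb_rd == id_ex_rs):
        forward_a = 0b01
    if (mem_wb_reg_write and mem_wb_rd != 0
            and ex_mem_rd != id_ex_rt and mem_wb_rd == id_ex_rt):
        forward_b = 0b01
    return forward_a, forward_b

# Third add of: add $1,$1,$2; add $1,$1,$3; add $1,$1,$4
# Both older adds write $1; the EX/MEM copy is the more recent one.
print(forwarding_unit(True, 1, True, 1, id_ex_rs=1, id_ex_rt=4))
```

In the array-summing case the model selects the EX/MEM value for the first operand, which is the behaviour the extra check in the MEM hazard rule is meant to guarantee.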
Forwarding Hardware with Control

Called forwarding unit, not hazard detection unit,
because once data is forwarded there is no hazard!
[Figure: datapath with forwarding hardware and control wires. The forwarding unit receives Rs and Rt (carried from IF/ID.RegisterRs and IF/ID.RegisterRt through ID/EX), EX/MEM.RegisterRd and MEM/WB.RegisterRd. Certain details, e.g., branching hardware, are omitted to simplify the drawing]
Note: so far we have only handled forwarding to R-type instructions!
Forwarding: Execution Example

sub $2, $1, $3
and $4, $2, $5
or  $4, $4, $2
add $9, $4, $2

[Figures: datapath snapshots with the forwarding unit at Clock 3, Clock 4, Clock 5 and Clock 6. Stage occupancy in each snapshot:]

Clock   IF               ID               EX               MEM          WB
3       or $4, $4, $2    and $4, $2, $5   sub $2, $1, $3   before<1>    before<2>
4       add $9, $4, $2   or $4, $4, $2    and $4, $2, $5   sub $2, ...  before<1>
5       after<1>         add $9, $4, $2   or $4, $4, $2    and $4, ...  sub $2, ...
6       after<2>         after<1>         add $9, $4, $2   or $4, ...   and $4, ...
Data Hazards and Stalls
Load word can still cause a hazard:
an instruction tries to read a register following a load instruction that
writes to the same register

lw  $2, 20($1)
and $4, $2, $5
or  $8, $2, $6
add $9, $4, $2
slt $1, $6, $7

[Figure: pipeline diagram, clock cycles CC 1 to CC 9, for the sequence above]
Because here even a dependency between pipeline registers goes backward in time
(the loaded value leaves data memory only at the end of the MEM stage, but the and
needs it at the start of its EX stage), forwarding will not solve the hazard

Therefore, we need a hazard detection unit to stall the pipeline after
the load instruction
Pipelined Datapath with Control II (as before)

[Figure: pipelined datapath with control, repeated from the earlier slide]
Control signals emanate from the control portions of the pipeline registers
Hazard Detection Logic to Stall
The hazard detection unit implements the following check to decide whether to stall:
if ( ID/EX.MemRead                                         // if the instruction in the EX stage is a load
     and ( ( ID/EX.RegisterRt = IF/ID.RegisterRs )         // and the destination register
           or ( ID/EX.RegisterRt = IF/ID.RegisterRt ) ) )  // matches either source register
                                                           // of the instruction in the ID stage, then
    stall the pipeline
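
A minimal Python rendering of this check is given here as an illustration. The function name `must_stall` is hypothetical, and the sketch models only the comparison, not the resulting control signals.

```python
def must_stall(id_ex_mem_read, id_ex_rt, if_id_rs, if_id_rt):
    """True when the instruction in EX is a load whose destination (Rt)
    is a source register of the instruction currently in ID."""
    return bool(id_ex_mem_read and
                (id_ex_rt == if_id_rs or id_ex_rt == if_id_rt))

# lw $2, 20($1) in EX, and $4, $2, $5 in ID: load-use hazard
print(must_stall(True, 2, if_id_rs=2, if_id_rt=5))
# sub $2, $1, $3 in EX (not a load): the forwarding unit handles it, no stall
print(must_stall(False, 3, if_id_rs=2, if_id_rt=5))
```

Note that the check uses ID/EX.RegisterRt because the destination of a MIPS load is encoded in the Rt field.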
Mechanics of Stalling
If the stall check succeeds, then the pipeline needs to stall
for only 1 clock cycle after the load, because after that the forwarding
unit can resolve the dependency
What the hardware does to stall the pipeline for 1 cycle:
it does not let the IF/ID register change (disable write!): this
causes the instruction in the ID stage to repeat, i.e., stall
therefore, the instruction just behind, in the IF stage, must be
stalled as well, so the hardware does not let the PC change
(disable write!): this causes the instruction in the IF stage to
repeat, i.e., stall
it changes all the EX, MEM and WB control fields in the ID/EX
pipeline register to 0, so effectively the instruction just behind
the load becomes a nop: a bubble is said to have been inserted
into the pipeline
note that we cannot turn that instruction into a nop by zeroing all the
bits in the instruction itself (recall nop = 00...0, 32 bits) because
it has already been decoded and its control signals generated
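
The three actions can be pictured as one clock edge of a small software model. The sketch below is an illustration under assumed names: `clock_front_end`, the dictionary keys, and the particular control-signal names are hypothetical and cover only the front of the pipeline.

```python
def clock_front_end(state, fetched_instr, decoded_control, stall):
    """Compute the next PC, IF/ID contents and ID/EX control fields.
    `state` holds the current 'pc' and 'if_id' values."""
    if stall:
        return {
            "pc": state["pc"],        # PCWrite disabled: IF repeats
            "if_id": state["if_id"],  # IF/IDWrite disabled: ID repeats
            # EX, MEM and WB control fields forced to 0: a bubble enters EX
            "id_ex_control": {name: 0 for name in decoded_control},
        }
    return {
        "pc": state["pc"] + 4,
        "if_id": fetched_instr,
        "id_ex_control": dict(decoded_control),
    }

state = {"pc": 8, "if_id": "and $4, $2, $5"}
control = {"RegWrite": 1, "MemRead": 0, "MemWrite": 0, "ALUOp": 2}
stalled = clock_front_end(state, "or $4, $4, $2", control, stall=True)
print(stalled)
```

With `stall=True` the and stays in IF/ID and the PC is unchanged, so the same two instructions are decoded and fetched again in the next cycle, while an all-zero control word travels down the pipeline in their place.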
Hazard Detection Unit

[Figure: datapath with forwarding hardware, the hazard detection unit and
control wires. The hazard detection unit reads ID/EX.MemRead, ID/EX.RegisterRt, IF/ID.RegisterRs and IF/ID.RegisterRt, and drives PCWrite, IF/IDWrite and the multiplexor that zeros the control fields entering ID/EX. Certain details, e.g., branching hardware, are omitted
to simplify the drawing]
Stalling Resolves a Hazard
Same instruction sequence as before, for which forwarding by
itself could not resolve the hazard:

lw  $2, 20($1)
and $4, $2, $5
or  $8, $2, $6
add $9, $4, $2
slt $1, $6, $7

[Figure: pipeline diagram, time in clock cycles CC 1 to CC 10, program execution order in instructions; a bubble is inserted after the lw, delaying the and and all following instructions by one cycle]

The hazard detection unit inserts a 1-cycle bubble in the pipeline, after
which all pipeline register dependencies go forward, so the
forwarding unit can handle them and there are no more hazards
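
As a small sanity check on the cycle count in the diagram, the sketch below counts load-use stalls and total cycles for a straight-line sequence. It assumes a 5-stage pipeline with full forwarding and no branches; the tuple format and both function names are made up for this example.

```python
def count_load_use_stalls(program):
    """One stall for every load whose destination is read by the very next instruction.
    Instructions are (opcode, destination register, source registers)."""
    stalls = 0
    for (prev_op, prev_dest, _), (_, _, cur_srcs) in zip(program, program[1:]):
        if prev_op == "lw" and prev_dest in cur_srcs:
            stalls += 1
    return stalls

def total_cycles(program, stages=5):
    """Cycles until the last instruction leaves the pipeline."""
    return len(program) + (stages - 1) + count_load_use_stalls(program)

program = [
    ("lw", 2, (1,)),     # lw  $2, 20($1)
    ("and", 4, (2, 5)),  # and $4, $2, $5
    ("or", 8, (2, 6)),   # or  $8, $2, $6
    ("add", 9, (4, 2)),  # add $9, $4, $2
    ("slt", 1, (6, 7)),  # slt $1, $6, $7
]
print(count_load_use_stalls(program), total_cycles(program))
```

The single bubble accounts for the diagram running to CC 10 rather than CC 9.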
Stall/Bubble in the Pipeline
[Figure: pipeline diagram marking where the stall is inserted]
[Figure: the same diagram redrawn more accurately, with the stalled instructions repeating their stages]
Stalling: Execution Example

lw  $2, 20($1)
and $4, $2, $5
or  $4, $4, $2
add $9, $4, $2

[Figures: datapath snapshots with the hazard detection unit and forwarding unit at Clock 2 through Clock 7. Stage occupancy in each snapshot:]

Clock   IF               ID               EX               MEM          WB
2       and $4, $2, $5   lw $2, 20($1)    before<1>        before<2>    before<3>
3       or $4, $4, $2    and $4, $2, $5   lw $2, 20($1)    before<1>    before<2>
4       or $4, $4, $2    and $4, $2, $5   bubble           lw $2, ...   before<1>
5       add $9, $4, $2   or $4, $4, $2    and $4, $2, $5   bubble       lw $2, ...
6       after<1>         add $9, $4, $2   or $4, $4, $2    and $4, ...  bubble
7       after<2>         after<1>         add $9, $4, $2   or $4, ...   and $4, ...
Control (or Branch) Hazards
The problem with branches in the pipeline we have so far is that the
branch decision is not made until the MEM stage, so what
instructions, if any, should we insert into the pipeline following
the branch instruction?
Possible solution: stall the pipeline until the branch decision is known
not efficient: slows the pipeline significantly!
Another solution: predict the branch outcome
e.g., always predict branch-not-taken: continue with the next
sequential instructions
if the prediction is wrong, the pipeline behind the branch has to be
flushed: discard instructions already fetched or decoded and
continue execution at the branch target
Predicting Branch-not-taken: Misprediction Delay

[Figure: pipeline diagram, clock cycles CC 1 to CC 9, program execution order in instructions]
40 beq $1, $3, 7
44 and $12, $2, $5
48 or  $13, $6, $2
52 add $14, $2, $2
72 lw  $4, 50($7)

The outcome of branch taken (prediction wrong) is decided only when
beq is in the MEM stage, so the following three sequential instructions
already in the pipeline have to be flushed and execution resumes at lw
Optimizing the Pipeline to Reduce Branch Delay
Move the branch decision from the MEM stage (as in our
current pipeline) earlier, to the ID stage
calculating the branch target address involves moving the
branch adder from the MEM stage to the ID stage: the inputs to
this adder, the PC value and the immediate field, are already
available in the IF/ID pipeline register
calculating the branch decision is efficiently done, e.g., for the
equality test, by XORing respective bits and then ORing all the
results and inverting, rather than using the ALU to subtract and
then test for zero (which incurs a carry delay)
with the more efficient equality test we can put it in the ID stage
without significantly lengthening this stage: remember, an objective
of pipeline design is to keep pipeline stages balanced
we must correspondingly make additions to the forwarding and
hazard detection units to forward to, or stall, the branch at the ID
stage in case the branch decision depends on an earlier result
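
The XOR, OR, invert equality test can be mimicked bit by bit in Python. This is a software illustration of the logic only; the function name `registers_equal` and the 32-bit default width are assumptions of the sketch.

```python
def registers_equal(a, b, width=32):
    """Equality test without subtraction: XOR the operands bitwise,
    OR all the difference bits together, then invert the result."""
    mask = (1 << width) - 1
    diff = (a ^ b) & mask           # a 1 bit wherever the operands differ
    any_diff = 0
    for i in range(width):          # OR-reduce the XOR outputs
        any_diff |= (diff >> i) & 1
    return any_diff ^ 1             # 1 means equal, 0 means not equal

print(registers_equal(10, 10), registers_equal(10, -20))
```

In hardware the OR reduction is a shallow tree of gates with no carry chain, which is why it fits in the ID stage more comfortably than an ALU subtraction would.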
Flushing on Misprediction
Same strategy as for stalling on a load-use data hazard:
zero out all the control values (or the instruction itself) in the
pipeline registers for the instructions following the branch that
are already in the pipeline, effectively turning them into nops,
so they are flushed
in the optimized pipeline, with the branch decision made in the ID
stage, we have to flush only one instruction, the one in the IF stage: the
branch delay penalty is then only one clock cycle
Optimized Datapath for Branch

[Figure: optimized datapath for branch. Simplified drawing,
not showing enhancements to the forwarding and hazard detection units]
IF.Flush control zeros out the instruction in the IF/ID
pipeline register (which follows the branch)
The branch decision is moved from the MEM stage to the ID stage
Reducing Branch Delay
Move hardware to determine outcome to ID stage
Target address adder
Register comparator
Example: branch taken

36: sub $10, $4, $8
40: beq $1, $3, 7
44: and $12, $2, $5
48: or  $13, $2, $6
52: add $14, $4, $2
56: slt $15, $6, $7
    ...
72: lw  $4, 50($7)

Example: Branch Taken
[Figures: two datapath snapshots for the branch-taken example above]
Data Hazards for Branches
If a comparison register is a destination
of 2nd or 3rd preceding ALU instruction

add $1, $2, $3        IF  ID  EX  MEM WB
add $4, $5, $6            IF  ID  EX  MEM WB
...                           IF  ID  EX  MEM WB
beq $1, $4, target                IF  ID  EX  MEM WB

Can resolve using forwarding

If a comparison register is a destination
of preceding ALU instruction or 2nd
preceding load instruction

lw  $1, addr          IF  ID  EX  MEM WB
add $4, $5, $6            IF  ID  EX  MEM WB
beq stalled                   IF  ID
beq $1, $4, target                ID  EX  MEM WB

Need 1 stall cycle

If a comparison register is a destination
of immediately preceding load
instruction

lw  $1, addr          IF  ID  EX  MEM WB
beq stalled               IF  ID
beq stalled                   ID
beq $1, $0, target                ID  EX  MEM WB

Need 2 stall cycles
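
The three cases can be summarised by a small rule of thumb, sketched here in Python. The function `branch_stall_cycles` and its parameters are hypothetical; the sketch assumes the branch is resolved in ID with forwarding into the comparator, as in the cases above.

```python
def branch_stall_cycles(producer_is_load, distance):
    """Stall cycles for a branch resolved in ID that compares a register
    produced `distance` instructions earlier (1 = immediately preceding)."""
    # An ALU result is forwardable to ID from the 2nd preceding instruction on,
    # a loaded value only from the 3rd preceding instruction on.
    earliest_ok = 3 if producer_is_load else 2
    return max(0, earliest_ok - distance)

for is_load in (False, True):
    for distance in (1, 2, 3):
        kind = "load" if is_load else "ALU"
        print(kind, distance, branch_stall_cycles(is_load, distance))
```

The values it prints agree with the diagrams: no stall for the 2nd or 3rd preceding ALU instruction, one stall for the preceding ALU instruction or the 2nd preceding load, and two stalls for the immediately preceding load.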
Simple Example: Comparing Performance
Compare performance for single-cycle, multicycle, and pipelined
datapaths using the gcc instruction mix
assume 2 ns for memory access, 2 ns for ALU operation, 1 ns
for register read or write
assume gcc instruction mix: 23% loads, 13% stores, 19%
branches, 2% jumps, 43% ALU
for pipelined execution assume
50% of the loads are followed immediately by an instruction that
uses the result of the load
25% of branches are mispredicted
branch delay on misprediction is 1 clock cycle
jumps always incur 1 clock cycle delay, so their average time is 2
clock cycles
Simple Example: Comparing Performance
Single-cycle: average instruction time 8 ns
Multicycle: average instruction time 8.04 ns
Pipelined:
loads use 1 cc (clock cycle) when there is no load-use dependency and 2 cc
when there is a dependency: given that 50% of loads are followed by a
dependent instruction, the average cc per load is 1.5
stores use 1 cc each
branches use 1 cc when predicted correctly and 2 cc when not:
given 25% misprediction, the average cc per branch is 1.25
jumps use 2 cc each
ALU instructions use 1 cc each
therefore, average CPI is
1.5 × 23% + 1 × 13% + 1.25 × 19% + 2 × 2% + 1 × 43% = 1.18
therefore, average instruction time is 1.18 × 2 ns = 2.36 ns
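
The pipelined figures can be rechecked with a few lines of Python. The variable names are invented for this sketch; the 2 ns clock period is taken from the final multiplication above.

```python
# gcc instruction mix from the slide
mix = {"load": 0.23, "store": 0.13, "branch": 0.19, "jump": 0.02, "alu": 0.43}

# average clock cycles per instruction class in the pipelined datapath
cycles = {
    "load": 0.5 * 1 + 0.5 * 2,      # 50% of loads have a load-use stall
    "store": 1,
    "branch": 0.75 * 1 + 0.25 * 2,  # 25% of branches are mispredicted
    "jump": 2,
    "alu": 1,
}

cpi = sum(mix[k] * cycles[k] for k in mix)
clock_ns = 2
print(f"CPI = {cpi:.4f}, about {cpi:.2f}")
print(f"average instruction time = {round(cpi, 2) * clock_ns:.2f} ns")
```

The exact weighted sum is 1.1825; the slide rounds it to 1.18 before multiplying by the clock period, which gives 2.36 ns.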
