ASSIGNMENT #3
Given on: 27th March 2015
Due: 3rd April 2015
Lecturer: Dr. G. Chege (gchege@usiu.ac.ke;
Answer all four Questions.
Question 1
In this question we examine how latencies of individual components of the datapath affect the clock cycle
time of the entire datapath, and how these components are utilized by instructions. For three problems in
this exercise, assume the following latencies for logic blocks in the datapath:
a) What is the clock cycle time if the only types of instructions we need to support are ALU
instructions (ADD, AND, etc.)?
i) 200 + 90 + 20 + 90 + 20 = 420 ps
ii) 750 + 300 + 50 + 250 + 50 = 1400 ps
b) What is the clock cycle time if we only have to support LW instructions?
i) 200 + 90 + 20 + 90 + 250 + 20 = 670 ps
ii) 750 + 300 + 50 + 250 + 500 + 50 = 1900 ps
c) What is the clock cycle time if we must support ADD, BEQ, LW, and SW instructions?
The answer is the same as in (b) above: 670 ps for (i) and 1900 ps for (ii). The critical path of the LW
instruction is the longest because LW, unlike SW, must write the value retrieved from memory back to a
register. SW only writes to memory and returns no value to the register file, so its path is shorter by one
Mux. The longest paths for ADD and BEQ are shorter by the D-Mem latency.
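The sums in (a) to (c) can be checked with a short script. This is only a sketch under stated assumptions: the component names attached to each number (I-Mem, Regs, Mux, ALU, D-Mem) are inferred from the order of the terms in the sums and from the usual single-cycle datapath, and `LATENCIES`, `PATHS`, `path_latency` and `clock_cycle` are illustrative names.

```python
# Latencies in ps for the two cases (i) and (ii).
# ASSUMPTION: the component labels are inferred from the sums in the answers.
LATENCIES = {
    "i": {"I-Mem": 200, "Regs": 90, "Mux": 20, "ALU": 90, "D-Mem": 250},
    "ii": {"I-Mem": 750, "Regs": 300, "Mux": 50, "ALU": 250, "D-Mem": 500},
}

# Components on the longest path of each instruction, as described in the answers:
# SW is shorter than LW by one Mux, ADD and BEQ are shorter than LW by D-Mem.
PATHS = {
    "ADD": ["I-Mem", "Regs", "Mux", "ALU", "Mux"],
    "BEQ": ["I-Mem", "Regs", "Mux", "ALU", "Mux"],
    "LW": ["I-Mem", "Regs", "Mux", "ALU", "D-Mem", "Mux"],
    "SW": ["I-Mem", "Regs", "Mux", "ALU", "D-Mem"],
}


def path_latency(latencies, instruction):
    """Sum the latencies of every component on the instruction's path."""
    return sum(latencies[component] for component in PATHS[instruction])


def clock_cycle(latencies, instructions):
    """The clock cycle must cover the slowest supported instruction."""
    return max(path_latency(latencies, name) for name in instructions)


if __name__ == "__main__":
    for case, latencies in LATENCIES.items():
        print(case,
              clock_cycle(latencies, ["ADD"]),
              clock_cycle(latencies, ["LW"]),
              clock_cycle(latencies, ["ADD", "BEQ", "LW", "SW"]))
```

Taking the maximum over the supported instructions is what makes the answer to (c) equal to the answer to (b): LW dominates every other path.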
Question 2
The Problems in Question 2 and Question 3 refer to the following instruction sequences:

Sequence (a):
I1: ADD R1,R2,R1
I2: LW R2,0(R1)
I3: LW R1,4(R1)
I4: OR R3,R1,R2

Sequence (b):
I1: LW R1,0(R1)
I2: AND R1,R1,R2
I3: LW R2,0(R1)
I4: LW R1,0(R3)

a) Find all data dependences in each of the two instruction sequences (a) & (b).

Sequence (a)
RAW: (R1) I1 to I2, I3; (R2) I2 to I4; (R1) I3 to I4
WAR: (R2) I1 to I2; (R1) I1, I2 to I3
WAW: (R1) I1 to I3

Sequence (b)
RAW: (R1) I1 to I2; (R1) I2 to I3
WAR: (R1) I1 to I2; (R2) I2 to I3; (R1) I3 to I4
WAW: (R1) I1 to I2; (R1) I2 to I4
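One way to enumerate these dependences mechanically is sketched below. The encoding of each instruction as (label, destination, sources) and the name `find_dependences` are illustrative assumptions, and so is the convention used: a RAW goes from the nearest earlier writer to the reader, and WAR/WAW entries are collected by scanning backwards from a writer until the previous writer of the same register.

```python
# Each instruction: (label, destination register, source registers).
SEQ_A = [
    ("I1", "R1", ["R2", "R1"]),  # ADD R1,R2,R1
    ("I2", "R2", ["R1"]),        # LW  R2,0(R1)
    ("I3", "R1", ["R1"]),        # LW  R1,4(R1)
    ("I4", "R3", ["R1", "R2"]),  # OR  R3,R1,R2
]
SEQ_B = [
    ("I1", "R1", ["R1"]),        # LW  R1,0(R1)
    ("I2", "R1", ["R1", "R2"]),  # AND R1,R1,R2
    ("I3", "R2", ["R1"]),        # LW  R2,0(R1)
    ("I4", "R1", ["R3"]),        # LW  R1,0(R3)
]


def find_dependences(seq):
    """Return {"RAW": [...], "WAR": [...], "WAW": [...]} as (reg, from, to) tuples."""
    deps = {"RAW": [], "WAR": [], "WAW": []}
    for j, (later, dest_j, srcs_j) in enumerate(seq):
        # RAW: for each register read, find the nearest earlier writer.
        for reg in dict.fromkeys(srcs_j):
            for earlier, dest_i, _ in reversed(seq[:j]):
                if dest_i == reg:
                    deps["RAW"].append((reg, earlier, later))
                    break
        # WAR / WAW: scan backwards until the previous writer of dest_j.
        for earlier, dest_i, srcs_i in reversed(seq[:j]):
            if dest_j in srcs_i:
                deps["WAR"].append((dest_j, earlier, later))
            if dest_i == dest_j:
                deps["WAW"].append((dest_j, earlier, later))
                break
    return deps


if __name__ == "__main__":
    for name, seq in (("a", SEQ_A), ("b", SEQ_B)):
        print(name, find_dependences(seq))
```

For sequence (a) this reproduces the RAW, WAR and WAW entries listed above. For sequence (b) the RAW and WAW entries agree as well, while this particular convention also reports a WAR on R1 from I2 to I4.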
b) Find all hazards in each of the instruction sequences (a) & (b) for a 5-stage pipeline
with and then without forwarding.
Only RAW dependences can become data hazards. With forwarding, only RAW dependences from a load to the very next
instruction become hazards. Without forwarding, any RAW dependence from an instruction to one of the
following 3 instructions becomes a hazard:

Sequence (a)
I1: ADD R1,R2,R1
I2: LW R2,0(R1)
I3: LW R1,4(R1)
I4: OR R3,R1,R2
With forwarding: (R1) I3 to I4
Without forwarding: (R1) I1 to I2, I3; (R2) I2 to I4; (R1) I3 to I4

Sequence (b)
I1: LW R1,0(R1)
I2: AND R1,R1,R2
I3: LW R2,0(R1)
I4: LW R1,0(R3)
With forwarding: (R1) I1 to I2
Without forwarding: (R1) I1 to I2; (R1) I2 to I3
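The two rules stated above (load-to-next-instruction with forwarding, any RAW within the following 3 instructions without forwarding) can be applied mechanically. The sketch below is illustrative only; the tuple encoding and the names `raw_dependences` and `hazards` are assumptions, not part of the assignment.

```python
# Each instruction: (label, opcode, destination register, source registers).
SEQ_A = [
    ("I1", "ADD", "R1", ["R2", "R1"]),
    ("I2", "LW", "R2", ["R1"]),
    ("I3", "LW", "R1", ["R1"]),
    ("I4", "OR", "R3", ["R1", "R2"]),
]
SEQ_B = [
    ("I1", "LW", "R1", ["R1"]),
    ("I2", "AND", "R1", ["R1", "R2"]),
    ("I3", "LW", "R2", ["R1"]),
    ("I4", "LW", "R1", ["R3"]),
]


def raw_dependences(seq):
    """Return (register, producer index, consumer index) for every RAW dependence."""
    deps = []
    for j, (_, _, _, srcs) in enumerate(seq):
        for reg in dict.fromkeys(srcs):
            # The producer is the nearest earlier instruction that writes reg.
            for i in range(j - 1, -1, -1):
                if seq[i][2] == reg:
                    deps.append((reg, i, j))
                    break
    return deps


def hazards(seq, forwarding):
    """Apply the rule stated in the answer to each RAW dependence."""
    found = []
    for reg, i, j in raw_dependences(seq):
        distance = j - i
        if forwarding:
            # Only a load followed immediately by its consumer is a hazard.
            is_hazard = seq[i][1] == "LW" and distance == 1
        else:
            # Any consumer among the following 3 instructions is a hazard.
            is_hazard = distance <= 3
        if is_hazard:
            found.append((reg, seq[i][0], seq[j][0]))
    return found


if __name__ == "__main__":
    for name, seq in (("a", SEQ_A), ("b", SEQ_B)):
        print(name, "with:", hazards(seq, True), "without:", hazards(seq, False))
```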
Question 3
Assume that, before any of the two sequences in Question 2 above is executed, all values in data memory
are zeroes and that registers R0 through R3 have the following initial values:
a) Which value is the first one to be forwarded and what is the value it overrides?

Instruction sequence:
I1: ADD R1,R2,R1
I2: LW R2,0(R1)
I3: LW R1,4(R1)
I4: OR R3,R1,R2
RAW: (R1) I1 to I2 (30 overrides 1)
b) If we assume forwarding will be implemented when we design the hazard detection unit, but then we
forget to actually implement forwarding, what are the final register values after this instruction
sequence?
A register modification becomes visible to the EX stage of the following instructions only
two cycles after the instruction that produces the register value leaves the EX stage. Our
forwarding-assuming hazard detection unit only adds a one-cycle stall if the instruction
that immediately follows a load is dependent on the load. We have:

Instruction sequence with forwarding stalls:
I1: ADD R1,R2,R1
I2: LW R2,0(R1)
I3: LW R1,4(R1)
Stall
I4: OR R3,R1,R2

Execution without forwarding:
R1 = 30 (Stall and after)
R2 = 0 (I4 and after)
R1 = 0 (after I4)
R3 = 30 (after I4)

Final register values:
R0 = 0
R1 = 0
R2 = 0
R3 = 30
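The reasoning above can be replayed with a small simulation. This is a sketch under loudly stated assumptions: the initial register values used here are hypothetical (only R1 = 1 and R1 + R2 = 30 are implied by "30 overrides 1"; R3 is arbitrary), and the visibility rule is modelled as "a result can be read by an instruction whose EX stage is at least 3 cycles after the producer's EX stage". All names (`INITIAL_REGS`, `PROGRAM`, `run_without_forwarding`) are illustrative.

```python
from collections import defaultdict

# HYPOTHETICAL initial values, chosen to be consistent with "30 overrides 1".
INITIAL_REGS = {"R0": 0, "R1": 1, "R2": 29, "R3": 0}

# (opcode, destination, first operand, second operand).
# For LW the first operand is the offset and the second is the base register.
PROGRAM = [
    ("ADD", "R1", "R2", "R1"),
    ("LW", "R2", 0, "R1"),
    ("LW", "R1", 4, "R1"),
    ("OR", "R3", "R1", "R2"),
]

# A value written by an instruction whose EX is in cycle c is readable in EX
# from cycle c + 3 on (two cycles after the producer leaves EX).
VISIBLE_AFTER = 3


def run_without_forwarding(program, initial_regs):
    memory = defaultdict(int)   # all data memory is zero
    writes = []                 # (EX cycle, register, value), in program order
    ex_cycle = 2                # so that the first instruction is in EX in cycle 3
    previous = None             # (opcode, destination) of the previous instruction

    def read(reg, cycle):
        # Use the newest write that is already visible, else the initial value.
        for w_cycle, w_reg, value in reversed(writes):
            if w_reg == reg and cycle - w_cycle >= VISIBLE_AFTER:
                return value
        return initial_regs[reg]

    for op, dest, a, b in program:
        ex_cycle += 1
        sources = [x for x in (a, b) if isinstance(x, str)]
        # The hazard unit assumes forwarding: it only stalls one cycle
        # when an instruction directly follows a load it depends on.
        if previous and previous[0] == "LW" and previous[1] in sources:
            ex_cycle += 1
        if op == "ADD":
            value = read(a, ex_cycle) + read(b, ex_cycle)
        elif op == "OR":
            value = read(a, ex_cycle) | read(b, ex_cycle)
        elif op == "LW":
            value = memory[a + read(b, ex_cycle)]
        else:
            raise ValueError(f"unsupported opcode: {op}")
        writes.append((ex_cycle, dest, value))
        previous = (op, dest)

    final = dict(initial_regs)
    for _, reg, value in writes:  # every write eventually reaches the register file
        final[reg] = value
    return final


if __name__ == "__main__":
    print(run_without_forwarding(PROGRAM, INITIAL_REGS))
```

Under these assumptions the OR in I4 reads R1 = 30 (from I1, because the load in I3 is not yet visible) and R2 = 0, which gives R3 = 30, while the later write of I3 leaves R1 = 0.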
Question 4
This question is intended to help you understand the relationship between delay slots, control hazards, and
branch execution in a pipelined processor. In this exercise, we assume that the following MIPS code is
executed on a pipelined processor with a 5-stage pipeline, full forwarding, and a predict-taken branch
predictor:
a) Draw the pipeline execution diagram for this code, assuming there are no delay slots and that branches
execute in the EX stage.

Executed instructions and their pipeline stages (*** marks a stall cycle):

Sequence (a)
LW R2,0(R2)               IF ID EX MEM WB
BEQ R2,R0,Label (T)       IF ID *** EX MEM WB
LW R2,0(R2)               IF *** ID EX MEM WB
BEQ R2,R0,Label (NT)      IF ID *** EX MEM WB
OR R2,R2,R3               IF ID EX MEM WB
SW R2,0(R5)               IF ID EX MEM WB

Sequence (b)
LW R2,0(R1)               IF ID EX MEM WB
BEQ R2,R0,Label2 (NT)     IF ID *** EX MEM WB
LW R3,0(R2)               IF ID EX MEM WB
BEQ R3,R0,Label1 (T)      IF ID *** EX MEM WB
BEQ R2,R0,Label2 (T)      IF *** ID EX MEM WB
SW R1,0(R2)               IF ID EX MEM WB
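To see how a stall shifts the later stages, the rows can be laid out on a cycle grid. The sketch below renders only the first three rows of sequence (a) from part (a); the start cycles 1, 2 and 3 are an assumption (one instruction fetched per cycle), and `render` and `cycle_of` are illustrative helper names.

```python
def cycle_of(start_cycle, stages, stage):
    """Cycle in which the given stage entry occurs for a row starting at start_cycle."""
    return start_cycle + stages.index(stage)


def render(rows):
    """Lay out (label, start cycle, stages) rows on a numbered cycle grid."""
    last = max(start + len(stages) - 1 for _, start, stages in rows)
    width = max(len(label) for label, _, _ in rows)
    lines = [" " * width + "".join(f"{c:>5}" for c in range(1, last + 1))]
    for label, start, stages in rows:
        cells = [""] * (start - 1) + list(stages)
        lines.append(label.ljust(width) + "".join(f"{s:>5}" for s in cells))
    return "\n".join(lines)


# ASSUMPTION: consecutive fetches in cycles 1, 2 and 3.
ROWS = [
    ("LW R2,0(R2)", 1, ["IF", "ID", "EX", "MEM", "WB"]),
    ("BEQ R2,R0,Label", 2, ["IF", "ID", "***", "EX", "MEM", "WB"]),
    ("LW R2,0(R2)", 3, ["IF", "***", "ID", "EX", "MEM", "WB"]),
]

if __name__ == "__main__":
    print(render(ROWS))
```

With these start cycles the branch reaches EX in cycle 5, the cycle after the load's MEM stage, and the instruction behind it is held in IF for one extra cycle.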
b) Repeat 4(a) above, but assume that delay slots are used. In the given code, the instruction that follows
the branch is now the delay slot instruction for that branch.

Executed instructions and their pipeline stages (*** marks a stall cycle):

Sequence (a)
LW R2,0(R2)               IF ID EX MEM WB
BEQ R2,R0,Label (T)       IF ID *** EX MEM WB
OR R2,R2,R3               IF *** ID EX MEM WB
LW R2,0(R2)               IF ID *** EX MEM WB
BEQ R2,R0,Label (NT)      IF *** ID EX MEM WB
OR R2,R2,R3               IF ID EX MEM WB
SW R2,0(R5)               IF ID EX MEM WB

Sequence (b)
Executed instructions:
LW R2,0(R1)
BEQ R2,R0,Label2 (NT)
LW R3,0(R2)
BEQ R3,R0,Label1 (T)
ADD R1,R3,R1
Pipeline stages, row by row:
IF ID EX MEM WB
IF ID *** EX MEM WB
IF *** ID EX MEM WB
IF ID EX MEM WB
IF ID EX MEM WB
IF ID EX MEM WB
IF ID EX MEM WB