RISC-V CPU:
Datapath and Control Unit
§4.3 Building a Datapath
Building a Datapath
• Datapath
– Elements that process data and addresses in the CPU
• Registers, ALUs, muxes, memories, …
Recall: Memory is Byte-Addressed
• What was the smallest data type we saw in C?
– A char, which was a byte (8 bits)
– Everything in multiples of 8 bits (e.g. 1 word = 4 bytes)
• Memory addresses are indexed by bytes, not words
• Word addresses are 4 bytes apart
– Word addr is same as left-most byte
– Addrs must be multiples of 4 to be “word-aligned”
• Pointer arithmetic not done for you in assembly
– Must take data size into account yourself
[Figure: memory drawn as rows of 4-byte words: word 0 = bytes 0..3, word 1 = bytes 4..7, word 2 = bytes 8..11, word 3 = bytes 12..15. Assume here addr of lowest byte in word is addr of word.]
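The byte-addressing rules above can be sketched in a few lines. This is only an illustration in Python; the helper names (`word_address`, `is_word_aligned`, `bytes_of_word`) are hypothetical and not part of the lecture.

```python
WORD_BYTES = 4  # RV32: 1 word = 4 bytes


def word_address(index, base=0):
    """Byte address of the index-th word of an array that starts at `base`.

    Assembly does no pointer arithmetic for you, so the index must be
    scaled by the data size by hand.
    """
    return base + index * WORD_BYTES


def is_word_aligned(addr):
    """A word-aligned address is a multiple of 4."""
    return addr % WORD_BYTES == 0


def bytes_of_word(addr):
    """Byte addresses covered by the word stored at a word-aligned address."""
    if not is_word_aligned(addr):
        raise ValueError("address is not word-aligned")
    return [addr + i for i in range(WORD_BYTES)]
```

For example, word 3 of an array at base 0 lives at byte address 12 and covers bytes 12 to 15, which matches the memory picture on the slide.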
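Instruction Fetch
[Figure: the PC register is incremented by 4 to address the next 32-bit instruction]
Basic Phases of Instruction Execution
[Figure: PC, IMEM, Reg[] (rs1, rs2, rd), ALU (with imm) and DMEM, together with a +4 adder and a mux; Clock shown against time]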
State Required by RV32I ISA
Each instruction reads and updates this state during execution:
• Registers (x0..x31)
− Register file (or regfile) Reg holds 32 registers x 32 bits/register: Reg[0].. Reg[31]
− First register read specified by rs1 field in instruction
− Second register read specified by rs2 field in instruction
− Write register (destination) specified by rd field in instruction
− x0 is always 0 (writes to Reg[0] are ignored)
• Program Counter (PC)
− Holds address of current instruction
• Memory (MEM)
− Holds both instructions & data, in one 32-bit byte-addressed memory space
− We’ll use separate memories for instructions (IMEM) and data (DMEM)
▪ Later we’ll replace these with instruction and data caches
− Instructions are read (fetched) from instruction memory (assume IMEM read-only)
− Load/store instructions access data memory
Agenda
• Datapath Overview
• Assembling the Datapath Part 1
• Processor Design Process
• Assembling the Datapath Part 2
R-Format Instructions
• Read two register operands
• Perform arithmetic/logical operation
• Write register result
Implementing the add instruction
Datapath Walkthroughs (1/3)
• add x3,x1,x2 # R[3] = R[1] + R[2]
1) IF: fetch this instruction, increment PC
2) ID: decode as add, then read R[1] and R[2]
3) EX: add the two values retrieved in ID
4) MEM: idle (not using memory)
5) WB: write result of EX into R[3]
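The five phases listed above can be traced in software. The following is a rough sketch, not the lecture's implementation: `execute_add` is a hypothetical helper, instruction memory is modeled as a Python dict keyed by byte address, and 0x002081B3 is the RV32I encoding of `add x3,x1,x2`.

```python
def execute_add(regs, pc, imem):
    """Run one R-format add through IF, ID, EX, MEM, WB; return the next PC."""
    # IF: fetch the 32-bit instruction and compute pc + 4
    inst = imem[pc]
    next_pc = pc + 4

    # ID: pull out the register numbers and read the two source registers
    rd = (inst >> 7) & 0x1F    # inst[11:7]
    rs1 = (inst >> 15) & 0x1F  # inst[19:15]
    rs2 = (inst >> 20) & 0x1F  # inst[24:20]
    a, b = regs[rs1], regs[rs2]

    # EX: add the two values, wrapping to 32 bits
    result = (a + b) & 0xFFFFFFFF

    # MEM: idle, add does not touch data memory

    # WB: write the result into rd (writes to x0 are ignored)
    if rd != 0:
        regs[rd] = result
    return next_pc


regs = [0] * 32
regs[1], regs[2] = 5, 7
imem = {1000: 0x002081B3}  # add x3,x1,x2
new_pc = execute_add(regs, 1000, imem)
```

After the call, `regs[3]` holds 12 and `new_pc` is 1004.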
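Example: add Instruction
add x3,x1,x2
[Figure: PC addresses instruction memory; register numbers 1 and 2 select R[1] and R[2] from the registers; the ALU computes R[1] + R[2], which is written to register 3; imm, +4, MUX and Data memory are also drawn]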
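Datapath for add
[Figure: pc addresses IMEM, which outputs inst[31:0]; inst[11:7], inst[19:15] and inst[24:20] drive AddrD, AddrA and AddrB of Reg[]; DataA (Reg[rs1]) and DataB (Reg[rs2]) feed an adder whose output alu returns to DataD; a +4 adder produces pc+4; Control Logic takes inst[31:0] and drives RegWriteEnable (RegWEn)]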
Timing Diagram for add
[Figure: the add datapath shown with a timing diagram of Clock and PC against time; PC changes from 1000 to 1004]
Datapath for add/sub
[Figure: the add datapath with an ALU in place of the adder: pc, IMEM, Reg[] (AddrD = inst[11:7], AddrA = inst[19:15], AddrB = inst[24:20]), ALU output alu fed back to DataD, a +4 adder producing pc+4, and Control Logic]
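One way to picture the add/sub ALU is the sketch below. It relies on a fact of the RV32I encoding that the extracted slide text does not spell out: add and sub share opcode and funct3 and differ only in inst[30]. The function name `alu_add_sub` is hypothetical.

```python
MASK32 = 0xFFFFFFFF


def alu_add_sub(a, b, inst):
    """Return a + b for add or a - b for sub, as a 32-bit value.

    Assumes `inst` is already known to be an R-format add or sub.
    """
    is_sub = (inst >> 30) & 0x1  # inst[30]: 0 for add, 1 for sub
    if is_sub:
        return (a - b) & MASK32
    return (a + b) & MASK32


ADD_X3_X1_X2 = 0x002081B3  # add x3,x1,x2
SUB_X3_X1_X2 = 0x402081B3  # sub x3,x1,x2
```

In hardware the control logic would turn inst[30] into an ALU select signal; here the bit is simply read inside the function to keep the sketch short.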
Implementing other R-Format instructions
Implementing the addi instruction
• RISC-V Assembly Instruction:
addi x15,x1,-50
Adding addi to datapath
[Figure: the add/sub datapath extended with an immediate generator (Imm. Gen) that turns inst[31:20] into imm[31:0], and a mux on ALU input B that selects Reg[rs2] (0) or imm[31:0] (1); Control Logic drives the selects]
I-Format immediates
• High 12 bits of instruction (inst[31:20]) copied to low 12 bits of immediate (imm[11:0])
• Immediate is sign-extended by copying value of inst[31] to fill the upper 20 bits of the immediate value (imm[31:12])
[Figure: Imm. Gen with ImmSel=I maps inst[31:20] to imm[31:0]; the upper bits of imm[31:0] are inst[31] replicated (sign-extension) and the low 11 bits are inst[30:20]]
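The two bullets above amount to a shift, a mask and a sign fix-up. A minimal sketch, with the hypothetical helper name `imm_i`, returning the immediate as a signed Python integer; 0xFCE08793 is the RV32I encoding of the earlier example `addi x15,x1,-50`.

```python
def imm_i(inst):
    """Sign-extended I-format immediate of a 32-bit instruction word."""
    imm = (inst >> 20) & 0xFFF  # inst[31:20] -> imm[11:0]
    if inst & 0x80000000:       # inst[31] is the sign bit
        imm -= 0x1000           # same value as filling imm[31:12] with ones
    return imm


ADDI_X15_X1_M50 = 0xFCE08793  # addi x15,x1,-50
value = imm_i(ADDI_X15_X1_M50)
as_32_bits = value & 0xFFFFFFFF  # the bit pattern the ALU would see
```

Here `value` is -50 and `as_32_bits` is 0xFFFFFFCE, i.e. the upper 20 bits are all copies of inst[31].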
Load Instruction Datapath
Load Instructions are also I-Type
Bit 31 → bit 0: imm[11:0] | rs1 | func3 | rd | opcode
All RV32 Load Instructions
Store Instruction Datapath
So Far: Adding addi to datapath
[Figure: datapath built so far: pc, IMEM, Reg[], Imm. Gen (inst[31:20] → imm[31:0]), a mux on ALU input B selecting Reg[rs2] (0) or imm[31:0] (1), ALU output alu fed back to DataD, and Control Logic]
Adding lw to datapath
[Figure: the addi datapath extended with DMEM: the ALU output alu drives DMEM Addr, DMEM DataR produces mem, and a write-back mux selects mem (0) or alu (1) as wb, which feeds DataD of Reg[]]
S-Format Used for Stores
• Store needs to read two registers, rs1 for the base memory address and rs2 for the data to be stored, and also needs an immediate offset!
• Can’t have both rs2 and immediate in same place as other instructions!
• Note: stores don’t write a value to the register file, so there is no rd!
• RISC-V design decision is to move the low 5 bits of the immediate to where the rd field was in other instructions, keeping the rs1/rs2 fields in the same place
• Register names are more critical than immediate bits in hardware design
Bit 31 → bit 0: imm[11:5] | rs2 | rs1 | func3 | imm[4:0] | opcode
S-Format Example
sw x14, 8(x2)
Bit 31 → bit 0: imm[11:5] | rs2 | rs1 | func3 | imm[4:0] | opcode
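To make the split immediate concrete, the sketch below packs and unpacks the S-format fields for the example `sw x14, 8(x2)`. The helper names `encode_s` and `imm_s` are hypothetical; the opcode 0100011 and func3 = 010 for sw come from the RV32I encoding, not from the extracted slide text.

```python
def encode_s(imm, rs2, rs1, funct3, opcode=0b0100011):
    """Pack S-format fields: imm[11:5] | rs2 | rs1 | func3 | imm[4:0] | opcode."""
    imm &= 0xFFF  # keep the 12-bit two's complement immediate
    return (((imm >> 5) & 0x7F) << 25   # imm[11:5] -> inst[31:25]
            | rs2 << 20
            | rs1 << 15
            | funct3 << 12
            | (imm & 0x1F) << 7          # imm[4:0] -> inst[11:7], where rd usually sits
            | opcode)


def imm_s(inst):
    """Reassemble and sign-extend the S-format immediate."""
    imm = ((inst >> 25) & 0x7F) << 5 | ((inst >> 7) & 0x1F)
    return imm - 0x1000 if imm & 0x800 else imm


SW_X14_8_X2 = encode_s(imm=8, rs2=14, rs1=2, funct3=0b010)
```

`SW_X14_8_X2` comes out as 0x00E12423, and `imm_s` recovers the offset 8 from it.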
All RV32 Store Instructions
Implementing Store Word instruction
• RISC-V Assembly Instruction:
sw x14, 8(x2)
[Figure: the datapath as built for lw, with DMEM (Addr, DataR) and the write-back mux selecting mem (0) or alu (1)]
Adding sw to datapath
[Figure: the lw datapath with Reg[rs2] (DataB) also wired to the DMEM write-data input DataW; the ALU output still drives DMEM Addr]
Adding sw to datapath
[Figure: the same sw datapath annotated with control-signal settings, where * = “Don’t Care”]
Recommended Reading
• Computer Organization and Design: The Hardware/Software Interface, RISC-V Edition - David A. Patterson, John L. Hennessy:
− Chapter 4:
▪ 4.1 - 4.5
Acknowledgements
• The slides used in this lecture contain/adapt materials/illustrations developed by:
− Prof. David A. Patterson and Prof. John L. Hennessy [UC Berkeley]
− Steven Ho and Nick Riasanovsky [UC Berkeley]
I & S Immediate Generator
[Figure: inst[31:0] with field boundaries at bits 31, 25|24, 20|19, 15|14, 12|11, 7|6, 0. imm[4:0] is selected by a 5-bit mux from inst[24:20] (I) or inst[11:7] (S); imm[10:5] comes from inst[30:25] (6 bits) and the sign bit from inst[31] (1 bit) in both formats]
Implementing Branches
[Figure callouts: “Just re-routes wires”; “Sign-bit wire replicated”]
Full Datapath
Alternatively:
Adding branches to datapath
[Figure: the sw datapath extended with a Branch Comp. on Reg[rs1] and Reg[rs2], a mux on ALU input A selecting Reg[rs1] (0) or pc (1), and a mux on the pc input selecting pc+4 (0) or alu (1)]
Control Logic signals: PCSel, inst[31:0], ImmSel, RegWEn, BrUn, BrEq, BrLT, BSel, ASel, ALUSel, MemRW, WBSel
Adding branches to datapath
[Figure: the branch datapath annotated with the control settings for a branch instruction]
PCSel=taken/not-taken, ImmSel=B, RegWEn=0, BSel=1, ASel=1, ALUSel=Add, MemRW=Read, WBSel=* (don't care); BrUn, BrEq and BrLT connect the Control Logic and the Branch Comp.
Branch Comparator
• BrEq = 1, if A=B
• BrLT = 1, if A < B
• BrUn = 1 selects unsigned comparison for BrLT, 0 = signed
[Figure: Branch Comp. block with inputs A and B]
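The comparator rules above can be modeled directly. A minimal sketch, assuming A and B arrive as 32-bit register values; the names `branch_comp` and `to_signed32` are hypothetical.

```python
def to_signed32(x):
    """Interpret a 32-bit pattern as a two's complement signed integer."""
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x & 0x80000000 else x


def branch_comp(a, b, br_un):
    """Return (BrEq, BrLT) for 32-bit values a and b.

    br_un = 1 compares a and b as unsigned numbers, br_un = 0 as signed.
    """
    a &= 0xFFFFFFFF
    b &= 0xFFFFFFFF
    br_eq = int(a == b)
    if br_un:
        br_lt = int(a < b)
    else:
        br_lt = int(to_signed32(a) < to_signed32(b))
    return br_eq, br_lt
```

The pattern 0xFFFFFFFF shows why BrUn matters: it is -1 when signed, so it is less than 1, but it is the largest value when unsigned, so it is not.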
Multiply Branch Immediates by Shift?
• 12-bit immediate encodes PC-relative offset of -4096 to +4094 bytes in multiples of 2 bytes
• Standard approach: treat immediate as in range -2048..+2047, then shift left by 1 bit to multiply by 2 for branches
• Each instruction immediate bit can appear in one of two places in the output immediate value, so this needs one 2-way mux per bit
RISC-V Branch Immediates
• 12-bit immediate encodes PC-relative offset of -4096 to +4094 bytes in multiples of 2 bytes
• RISC-V approach: keep 11 immediate bits in fixed position in output value, and reuse the instruction bit that holds the LSB (imm[0]) in S-format, inst[7], to hold imm[11] in B-format
• Only one bit changes position between S and B, so only need a single-bit 2-way mux
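As a sketch of how the B-format immediate is reassembled from the instruction bits, assuming the RV32I B-format layout (inst[31] = imm[12], inst[30:25] = imm[10:5], inst[11:8] = imm[4:1], inst[7] = imm[11], imm[0] always 0). The helper name `imm_b` is hypothetical; 0x00A98863 encodes `beq x19,x10` with an offset of 16 bytes.

```python
def imm_b(inst):
    """Sign-extended B-format branch offset in bytes (always even)."""
    imm = ((inst >> 31) & 0x1) << 12    # inst[31]    -> imm[12] (sign bit)
    imm |= ((inst >> 7) & 0x1) << 11    # inst[7]     -> imm[11]
    imm |= ((inst >> 25) & 0x3F) << 5   # inst[30:25] -> imm[10:5]
    imm |= ((inst >> 8) & 0xF) << 1     # inst[11:8]  -> imm[4:1]
    # imm[0] is implicitly 0, which is the multiply-by-2
    if imm & 0x1000:
        imm -= 0x2000                   # sign-extend from bit 12
    return imm


BEQ_FORWARD_16 = 0x00A98863   # beq x19,x10 with offset +16
BEQ_BACKWARD_4 = 0xFE000EE3   # beq x0,x0 with offset -4
```

Compared with the S-format, only inst[7] feeds a different output bit, which is why a single-bit mux is enough.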
RISC-V Immediate Encoding
Instruction Encodings, inst[31:0]
Processor Design Process
• Five steps to design a processor:
1. Analyze instruction set → datapath requirements
2. Select set of datapath components & establish clock methodology
3. Assemble datapath components to meet the requirements
4. Analyze implementation of each instruction to determine setting of control points that affect the register transfer
5. Assemble the control logic
• Formulate Logic Equations
• Design Circuits
[Figure: Processor (Control and Datapath) connected to Memory, Input and Output; a “Now” marker highlights the current step]
Step 1: Requirements of the Instruction Set
• Memory (MEM)
– Instructions & data (separate: in reality just caches)
– Load from and store to
• Registers (32 32-bit regs)
– Read rs1 and rs2
– Write rd
• PC
– Add 4 (+ maybe extended immediate)
• Add/Sub/OR unit for operation on register(s) or extended immediate
– Compare if registers equal?
Storage Element: Idealized Memory
• Memory (idealized)
– One input bus: Data In
– One output bus: Data Out
• Memory access:
– Read: Write Enable = 0, data at Address is placed on Data Out
– Write: Write Enable = 1, Data In written to Address
• Clock input (CLK)
– CLK input is a factor ONLY during write operation
– During read, behaves as a combinational logic block: Address valid → Data Out valid after “access time”
[Figure: memory block with inputs Write Enable, Address, Data In (32 bits) and CLK, and output Data Out (32 bits)]
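The read/write behavior described above can be mimicked with a small class. This is a sketch under simplifying assumptions: `IdealMemory` and its method names are hypothetical, each address holds one 32-bit value, unwritten locations read as 0, and the access time is ignored.

```python
class IdealMemory:
    """Idealized memory: combinational read, clocked write."""

    def __init__(self):
        # address -> 32-bit value; unwritten addresses read as 0 (assumption)
        self.cells = {}

    def read(self, address):
        """Data Out follows Address; no clock edge is involved."""
        return self.cells.get(address, 0)

    def clock_edge(self, write_enable, address, data_in):
        """On a clock edge, store Data In at Address only if Write Enable is 1."""
        if write_enable:
            self.cells[address] = data_in & 0xFFFFFFFF


mem = IdealMemory()
mem.clock_edge(write_enable=0, address=8, data_in=123)  # ignored
before = mem.read(8)
mem.clock_edge(write_enable=1, address=8, data_in=123)  # stored
after = mem.read(8)
```

Separating `read` from `clock_edge` mirrors the slide: the clock only matters for writes.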
Storage Element: Register
• Similar to D flip-flop except:
– N-bit input and output buses
• Write Enable:
– De-asserted (0): Data Out will not change
– Asserted (1): Data In value placed onto Data Out after CLK trigger
[Figure: register block with inputs Write Enable, Data In and CLK, and output Data Out]
Storage Element: Register File
• Register File consists of 32 registers:
– Output buses busA and busB
– Input bus busW
• Register selection
– Place data of register RA (number) onto busA
– Place data of register RB (number) onto busB
– Store data on busW into register RW (number) when Write Enable is 1
• Clock input (CLK)
– CLK input is a factor ONLY during write operation
– During read, behaves as a combinational logic block: RA or RB valid → busA or busB valid after “access time”
[Figure: 32 x 32-bit register file with 5-bit selects RW, RA and RB, inputs Write Enable, Clk and busW (32 bits), and outputs busA and busB (32 bits each)]
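The register file ports above map naturally onto a small class. A minimal sketch with hypothetical names (`RegFile`, `read`, `clock_edge`); the rule that writes to register 0 are ignored is taken from the RV32I state slide earlier in the deck.

```python
class RegFile:
    """32 x 32-bit register file: two read ports, one clocked write port."""

    def __init__(self):
        self.regs = [0] * 32

    def read(self, ra, rb):
        """Combinational read: returns (busA, busB) for register numbers RA, RB."""
        return self.regs[ra], self.regs[rb]

    def clock_edge(self, write_enable, rw, bus_w):
        """On a clock edge, store busW into register RW if Write Enable is 1."""
        if write_enable and rw != 0:  # x0 is always 0
            self.regs[rw] = bus_w & 0xFFFFFFFF


rf = RegFile()
rf.clock_edge(write_enable=1, rw=1, bus_w=5)
rf.clock_edge(write_enable=1, rw=2, bus_w=7)
rf.clock_edge(write_enable=0, rw=3, bus_w=99)  # Write Enable = 0: no change
bus_a, bus_b = rf.read(1, 2)
```

After these calls `bus_a` is 5, `bus_b` is 7, and register 3 still reads as 0.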
Step 2: CPU Clocking
• For each instruction, how do we control the flow of information through the datapath?
• Single Cycle CPU: All stages of an instruction completed within one long clock cycle
– Clock cycle sufficiently long to allow each instruction to complete all stages without interruption within one cycle
Step 3: Assembling the Datapath
• Assemble datapath to meet ISA requirements
– Exact requirements will change based on ISA
– Here we must examine each instruction of RISC-V
• The datapath is all of the hardware components and wiring necessary to carry out ALL of the different instructions
– Make sure all components (e.g. RegFile, ALU) have access to all necessary signals and buses
– Control will make sure instructions are properly executed (the decision making)
Summary (1/2)
• Five steps to design a processor:
1) Analyze instruction set → datapath requirements
2) Select set of datapath components & establish clock methodology
3) Assemble datapath meeting the requirements
4) Analyze implementation of each instruction to determine setting of control points that affect the register transfer
5) Assemble the control logic
• Formulate Logic Equations
• Design Circuits
[Figure: Processor (Control and Datapath) connected to Memory, Input and Output]
Summary (2/2)
• Determining control signals
– Any time a datapath element has an input that changes behavior, it requires a control signal (e.g. ALU operation, read/write)
– Any time you need to pass a different input based on the instruction, add a MUX with a control signal as the selector (e.g. next PC, ALU input, register to write to)
• Your control signals will change based on your exact datapath
• Your datapath will change based on your ISA
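The "MUX with a control signal as the selector" rule can be sketched as follows. The helper names are hypothetical, and the select encodings (BSel: 0 = Reg[rs2], 1 = imm; WBSel: 0 = mem, 1 = alu) are read off the mux labels in the datapath figures, so treat them as an assumption about this particular datapath.

```python
def mux(select, *inputs):
    """Generic multiplexer: the control signal picks one of the inputs."""
    return inputs[select]


def alu_operand_b(b_sel, reg_rs2, imm):
    """ALU input B: Reg[rs2] for R-format, the immediate for addi/lw/sw."""
    return mux(b_sel, reg_rs2, imm)


def write_back(wb_sel, mem, alu):
    """Value written to the register file: loaded data or the ALU result."""
    return mux(wb_sel, mem, alu)


# add uses Reg[rs2] and writes back the ALU result
add_b = alu_operand_b(b_sel=0, reg_rs2=7, imm=-50)
add_wb = write_back(wb_sel=1, mem=111, alu=12)
# lw uses the immediate and writes back the data read from memory
lw_b = alu_operand_b(b_sel=1, reg_rs2=7, imm=8)
lw_wb = write_back(wb_sel=0, mem=111, alu=12)
```

The datapath hardware is the same for both instructions; only the control signals differ.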
And in Conclusion, …
• Universal datapath
− Capable of executing all RISC-V instructions in one cycle each
− Not all units (hardware) used by all instructions
• 5 Phases of execution
− IF, ID, EX, MEM, WB
− Not all instructions are active in all phases
• Controller specifies how to execute instructions
− What new instructions can be added with just more control?
Recommended Reading
• Computer Organization and Design: The Hardware/Software Interface, RISC-V Edition - David A. Patterson, John L. Hennessy:
– Chapter 4:
▪ 4.5
▪ 4.6 -> Graphically representing pipeline