Beruflich Dokumente
Kultur Dokumente
Microarchitecture
http://www.syssec.ethz.ch/education/Digitaltechnik_17
1
Adapted from Digital Design and Computer Architecture, David Money Harris & Sarah L. Harris ©2007 Elsevier
Carnegie Mellon
In This Lecture
Overview of the MIPS Microprocessor
Single-Cycle Processor
Performance Analysis
2
Carnegie Mellon
Introduction Application
Software
programs
Operating
Microarchitecture: Systems
device drivers
Micro- datapaths
Processor: architecture controllers
▪ Datapath: functional blocks
adders
▪ Control: control signals Logic
memories
Analog amplifiers
Circuits filters
transistors
Devices
diodes
Physics electrons
3
Carnegie Mellon
Microarchitecture
Multiple implementations for a single architecture:
4
Carnegie Mellon
5
Carnegie Mellon
6
Carnegie Mellon
7
Carnegie Mellon
8
Carnegie Mellon
9
Carnegie Mellon
Main Memory
10
Carnegie Mellon
// […]
11
Carnegie Mellon
12
Carnegie Mellon
Other fields:
▪ op: the operation code or opcode (0 for R-type instructions)
▪ funct: the function together, the opcode and function tell the
computer what operation to perform
▪ shamt: the shift amount for shift instructions, otherwise it’s 0
R-Type
op rs rt rd shamt funct
6 bits 5 bits 5 bits 5 bits 5 bits 6 bits
13
Carnegie Mellon
R-Type Examples
Assembly Code Field Values
op rs rt rd shamt funct
Machine Code
op rs rt rd shamt funct
15
Carnegie Mellon
Register File
input [4:0] a_rs, a_rt, a_rd;
input [31:0] di_rd;
input we_rd;
output [31:0] do_rs, do_rt;
// Circuit description
assign do_rs = R_arr[a_rs]; // Read RS
16
Carnegie Mellon
Register File
input [4:0] a_rs, a_rt, a_rd;
input [31:0] di_rd;
input we_rd;
output [31:0] do_rs, do_rt;
17
Carnegie Mellon
18
Carnegie Mellon
19
Carnegie Mellon
F2
N 010 A+B
011 not used
Cout + 100 A & ~B
[N-1] S
101 A | ~B
Extend
Zero
N N N N 110 A-B
1
0
3
F1:0
2
111 SLT
N
Y
20
Carnegie Mellon
23
Carnegie Mellon
So far so good
We have most of the components for MIPS
▪ We still do not have data memory
24
Carnegie Mellon
So far so good
We have most of the components for MIPS
▪ We still do not have data memory
25
Carnegie Mellon
So far so good
We have most of the components for MIPS
▪ We still do not have data memory
26
Carnegie Mellon
Data Memory
Will be used to store the bulk of data
// Circuit description
assign do = M_arr[addr]; // Read memory
27
Carnegie Mellon
28
Carnegie Mellon
I-Type
Immediate-type: 3 operands:
▪ rs, rt: register operands
▪ imm: 16-bit two’s complement immediate
Other fields:
▪ op: the opcode
▪ Simplicity favors regularity: all instructions have opcode
▪ Operation is completely determined by the opcode
I-Type
op rs rt imm
6 bits 5 bits 5 bits 16 bits
29
Carnegie Mellon
Changes to ALU
▪ Used to calculate memory address, add function needed during
non R-type operations
▪ Result is also used as address for the data memory
▪ One input is no longer RT, but the immediate value
30
Carnegie Mellon
31
Carnegie Mellon
32
Carnegie Mellon
34
Carnegie Mellon
35
Carnegie Mellon
36
Carnegie Mellon
37
Carnegie Mellon
38
Carnegie Mellon
39
Carnegie Mellon
40
Carnegie Mellon
41
Carnegie Mellon
43
Carnegie Mellon
44
Carnegie Mellon
JTA 0000 0000 0100 0000 0000 0000 1010 0000 (0x004000A0)
26-bit addr 0000 0000 0100 0000 0000 0000 1010 0000 (0x0100028)
0 1 0 0 0 2 8
45
Carnegie Mellon
Jump: J
From Appendix B:
Opcode: 000010
PC= JTA
JTA = Jump Target Address
= {(PC+4)[31:28], addr,2’b0}
46
Carnegie Mellon
47
Carnegie Mellon
48
Carnegie Mellon
49
Carnegie Mellon
50
Carnegie Mellon
32-bit register
▪ Instruction memory:
Takes input 32-bit address A and reads the 32-bit data (i.e.,
instruction) from that address to the read data output RD.
▪ Register file:
The 32-element, 32-bit register file has 2 read ports and 1 write port
▪ Data memory:
Has a single read/write port. If the write enable, WE, is 1, it writes
data WD into address A on the rising edge of the clock. If the write
enable is 0, it reads address A onto RD.
51
Carnegie Mellon
15:0 SignImm
Sign Extend
ALU
ALUResult
A RD
Instruction
A2 RD2 SrcB Data
Memory
A3 Memory
Register
WD3 WD
File
SignImm
15:0
Sign Extend
ALU
ALUResult ReadData
A RD
Instruction
A2 RD2 SrcB Data
Memory 20:16
A3 Memory
Register
WD3 WD
File
SignImm
15:0
Sign Extend
ALU
ALUResult ReadData
A RD
Instruction
A2 RD2 SrcB Data
Memory 20:16
A3 Memory
Register
WD3 WD
File
PCPlus4
+
SignImm
4 15:0
Sign Extend
Result
Single-Cycle Datapath: sw
Write data in rt to memory
RegWrite ALUControl2:0 MemWrite
0 010 1
CLK CLK
CLK
25:21
WE3 SrcA Zero WE
PC' PC Instr A1 RD1
A RD
ALU
ALUResult ReadData
20:16 A RD
Instruction
A2 RD2 SrcB Data
Memory 20:16
A3 Memory
Register WriteData
WD3 WD
File
PCPlus4
+
SignImm
4 15:0
Sign Extend
Result
ALU
ALUResult ReadData
A RD 1
Instruction 20:16
A2 RD2 0 SrcB Data
Memory
A3 1 Memory
Register WriteData
WD3 WD
File
20:16
0
15:11
1
WriteReg4:0
PCPlus4
+
SignImm
4 15:0
Sign Extend
Result
add t, b, c # t = b + c
R-Type
op rs rt rd shamt funct
6 bits 5 bits 5 bits 5 bits 5 bits 6 bits 59
Carnegie Mellon
ALU
1 ALUResult ReadData
A RD 1
Instruction 20:16
A2 RD2 0 SrcB Data
Memory
A3 1 Memory
Register WriteData
WD3 WD
File
20:16
0
15:11
1
WriteReg4:0
PCPlus4
+
SignImm
4 15:0
<<2
Sign Extend PCBranch
+
Result
CLK CLK
CLK
25:21 WE3 SrcA Zero WE
0 PC' PC Instr A1 RD1 0
A RD
ALU
1 ALUResult ReadData
A RD 1
Instruction 20:16
A2 RD2 0 SrcB Data
Memory
A3 1 Memory
Register WriteData
WD3 WD
File
20:16
0
15:11
1
WriteReg4:0
PCPlus4
+
SignImm
4 15:0
<<2
Sign Extend PCBranch
+
Result
61
Carnegie Mellon
Control Unit
Control
Unit MemtoReg
MemWrite
Branch
Opcode5:0 Main
ALUSrc
Decoder
RegDst
RegWrite
ALUOp1:0
ALU
Funct5:0 ALUControl 2:0
Decoder
62
Carnegie Mellon
63
Carnegie Mellon
Review: ALU
A B
N N F2:0 Function
000 A&B
N
001 A|B
1
F2
N 010 A+B
011 not used
Cout + 100 A & ~B
[N-1] S
101 A | ~B
Extend
Zero
N N N N 110 A-B
1
0
3
R-type 000000 1 1 0 0 0 0 10
lw 100011 1 0 1 0 0 1 00
sw 101011 0 X 1 0 1 X 00
beq 000100 0 X 0 1 0 X 01
MemtoReg
Control
MemWrite
Unit
Branch
ALUControl2:0 PCSrc
31:26
Op ALUSrc
5:0
Funct RegDst
RegWrite
CLK CLK
CLK
25:21 WE3 SrcA Zero WE
0 PC' PC Instr A1 RD1 0
A RD
1 ALU ALUResult
A RD
ReadData
1
Instruction 20:16
A2 RD2 0 SrcB Data
Memory
A3 1 Memory
Register WriteData
WD3 WD
File
20:16
0
15:11
1
WriteReg4:0
PCPlus4
+
SignImm
4 15:0
<<2
Sign Extend PCBranch
+
Result
66
Carnegie Mellon
MemtoReg
Control
MemWrite
Unit
Branch 0
ALUControl2:0 PCSrc
31:26
Op ALUSrc
5:0
Funct RegDst
RegWrite
CLK CLK
CLK 1 0
0 001 0
25:21
WE3 SrcA Zero WE
0 PC' PC Instr A1 RD1 0
A RD
ALU
1 ALUResult ReadData
0 A RD 1
Instruction 20:16
A2 RD2 0 SrcB Data
Memory
A3 1 Memory
Register WriteData
WD3 WD
File
1
20:16
0
15:11
1
WriteReg4:0
PCPlus4
+
SignImm
4 15:0 <<2
Sign Extend PCBranch
+
Result
67
Carnegie Mellon
CLK CLK
CLK
25:21 WE3 SrcA Zero WE
0 PC' PC Instr A1 RD1 0
A RD
ALU
1 ALUResult ReadData
A RD 1
Instruction 20:16
A2 RD2 0 SrcB Data
Memory
A3 1 Memory
Register WriteData
WD3 WD
File
20:16
0
15:11
1
WriteReg4:0
PCPlus4
+
SignImm
4 15:0
<<2
Sign Extend PCBranch
+
Result
No change to datapath
68
Carnegie Mellon
R-type 000000 1 1 0 0 0 0 10
lw 100011 1 0 1 0 0 1 00
sw 101011 0 X 1 0 1 X 00
beq 000100 0 X 0 1 0 X 01
addi 001000 1 0 1 0 0 0 00
69
Carnegie Mellon
Extended Functionality: j
Jump MemtoReg
Control
MemWrite
Unit
Branch
ALUControl2:0 PCSrc
31:26
Op ALUSrc
5:0
Funct RegDst
RegWrite
CLK CLK
CLK
0 PC' 25:21
WE3 SrcA Zero WE
0 PC Instr A1 RD1 0 Result
1 A RD
ALU
1 ALUResult ReadData
A RD 1
Instruction 20:16
A2 RD2 0 SrcB Data
Memory
A3 1 Memory
Register WriteData
WD3 WD
File
20:16
0
PCJump 15:11
1
WriteReg4:0
PCPlus4
+
SignImm
4 15:0
<<2
Sign Extend PCBranch
+
27:0 31:28
25:0
<<2
70
Carnegie Mellon
R-type 000000 1 1 0 0 0 0 10 0
lw 100011 1 0 1 0 0 1 00 0
sw 101011 0 X 1 0 1 X 00 0
beq 000100 0 X 0 1 0 X 01 0
j 000100 0 X X X 0 X XX 1
71
Carnegie Mellon
Processor Performance
How fast is my program?
▪ Every program consists of a series of instructions
▪ Each instruction needs to be executed.
72
Carnegie Mellon
Processor Performance
How fast is my program?
▪ Every program consists of a series of instructions
▪ Each instruction needs to be executed.
73
Carnegie Mellon
Processor Performance
How fast is my program?
▪ Every program consists of a series of instructions
▪ Each instruction needs to be executed.
74
Carnegie Mellon
Processor Performance
Now as a general formula
▪ Our program consists of executing N instructions.
▪ Our processor needs CPI cycles for each instruction.
▪ The maximum clock speed of the processor is f,
and the clock period is therefore T=1/f
75
Carnegie Mellon
Processor Performance
Now as a general formula
▪ Our program consists of executing N instructions.
▪ Our processor needs CPI cycles for each instruction.
▪ The maximum clock speed of the processor is f,
and the clock period is therefore T=1/f
76
Carnegie Mellon
77
Carnegie Mellon
78
Carnegie Mellon
79
Carnegie Mellon
Single-Cycle Performance
TC is limited by the critical path (lw)
MemtoReg
Control
MemWrite
Unit
Branch 0 0
ALUControl 2:0 PCSrc
31:26
Op ALUSrc
5:0
Funct RegDst
RegWrite
CLK CLK
CLK 1 0
010 1
25:21
WE3 SrcA Zero WE
0 PC' PC Instr A1 RD1 0
A RD
ALU
1 ALUResult ReadData
1 A RD 1
Instruction 20:16
A2 RD2 0 SrcB Data
Memory
A3 1 Memory
Register WriteData
WD3 WD
File
0
20:16
0
15:11
1
WriteReg4:0
PCPlus4
+
SignImm
4 15:0 <<2
Sign Extend PCBranch
+
Result
81
Carnegie Mellon
Single-Cycle Performance
Single-cycle critical path:
▪ Tc = tpcq_PC + tmem + max(tRFread, tsext + tmux) + tALU + tmem + tmux + tRFsetup
CLK CLK
CLK 1 0
010 1
25:21
WE3 SrcA Zero WE
0 PC' PC Instr A1 RD1 0
A RD
ALU
1 ALUResult ReadData
1 A RD 1
Instruction 20:16
A2 RD2 0 SrcB Data
Memory
A3 1 Memory
Register WriteData
WD3 WD
File
0
20:16
0
15:11
1
WriteReg4:0
PCPlus4
+
SignImm
4 15:0 <<2
Sign Extend PCBranch
+
Result
82
Carnegie Mellon
Tc =
83
Carnegie Mellon
85
Carnegie Mellon
86
Carnegie Mellon