
2/9/17

Intro to Microarchitecture: Single-Cycle

CS 3330

Samira Khan
University of Virginia
Feb 9, 2017

AGENDA
• Review from last lecture
• ISA tradeoffs
• Single-cycle Microarchitecture

Review: ISA vs. Microarchitecture
• ISA (Instruction Set Architecture)
• Agreed upon interface between software and hardware
• SW/compiler assumes, HW promises
• What the software writer needs to know to write and debug system/user programs
• Microarchitecture
• Specific implementation of an ISA
• Not visible to the software
• Microprocessor
• ISA, uarch, circuits
• “Architecture” = ISA + microarchitecture
[Figure: levels of the computing stack, top to bottom: Problem, Algorithm, Program, ISA, Microarchitecture, Circuits, Transistors]

Review: ISA
• Instructions
• Opcodes, Addressing Modes, Data Types
• Instruction Types and Formats
• Registers, Condition Codes
• Memory
• Address space, Addressability, Alignment
• Virtual memory management
• Call, Interrupt/Exception Handling
• Access Control, Priority/Privilege
• I/O: memory-mapped vs. instr.
• Task/thread Management
• Power and Thermal Management
• Multi-threading support, Multiprocessor support


Microarchitecture
• Implementation of the ISA under specific design constraints and goals
• Anything done in hardware without exposure to software
• Pipelining (will see later)
• Clock gating
• Caching? Levels, size, associativity, replacement policy
• Prefetching?
• Voltage/frequency scaling?
• Error correction?

Property of ISA vs. Uarch?
• ADD instruction’s opcode
• Number of general purpose registers
• Number of ports to the register file
• Number of cycles to execute the MUL instruction
• Whether or not the machine employs pipelined instruction execution
• Remember
• Microarchitecture: Implementation of the ISA under specific design constraints and goals

Design Point
• A set of design considerations and their importance
• leads to tradeoffs in both ISA and uarch
• Considerations
• Cost
• Performance
• Maximum power consumption
• Energy consumption (battery life)
• Availability
• Reliability and Correctness
• Time to Market
• Design point determined by the “Problem” space (application space), the intended users/market
Look Forward & Up


ROLE OF THE (COMPUTER) ARCHITECT
[Figure: the role of the architect, from Yale Patt’s lecture notes]
• Look backward (to the past)
• Understand tradeoffs and designs, upsides/downsides, past workloads. Analyze and evaluate the past
• Look forward (to the future)
• Be the dreamer and create new designs. Listen to dreamers
• Push the state of the art. Evaluate new design choices
• Look up (towards problems in the computing stack)
• Understand important problems and their nature
• Develop architectures and ideas to solve important problems
• Look down (towards device/circuit technology)
• Understand the capabilities of the underlying technology
• Predict and adapt to the future of technology (you are designing for N years ahead). Enable the future technology

Application Space
• Dream, and they will appear…

Tradeoffs: Soul of Computer Architecture
• ISA-level tradeoffs
• Microarchitecture-level tradeoffs
• System and Task-level tradeoffs
• How to divide the labor between hardware and software
• Computer architecture is the science and art of making the appropriate trade-offs to meet a design point
• Why art?


ISA Principles and Tradeoffs

Many Different ISAs Over Decades
• x86
• PDP-x: Programmed Data Processor (PDP-11)
• VAX
• IBM 360
• CDC 6600
• SIMD ISAs: CRAY-1, Connection Machine
• VLIW ISAs: Multiflow, Cydrome, IA-64 (EPIC)
• PowerPC, POWER
• RISC ISAs: Alpha, MIPS, SPARC, ARM
• What are the fundamental differences?
• E.g., how instructions are specified and what they do
• E.g., how complex are the instructions

MIPS
R-type: 0 (6-bit) | rs (5-bit) | rt (5-bit) | rd (5-bit) | shamt (5-bit) | funct (6-bit)
I-type: opcode (6-bit) | rs (5-bit) | rt (5-bit) | immediate (16-bit)
J-type: opcode (6-bit) | immediate (26-bit)

ARM
[Figure not recovered]
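The three MIPS formats above share the opcode position, which is what makes field extraction a matter of fixed shifts and masks. The sketch below is illustrative only: the function name `decode_mips` is hypothetical, and treating every opcode other than 0, 2 (`j`) and 3 (`jal`) as I-type is a simplifying assumption that ignores coprocessor and other special encodings.

```python
def decode_mips(word: int) -> dict:
    """Split a 32-bit MIPS instruction word into the fields of its format."""
    opcode = (word >> 26) & 0x3F           # top 6 bits, same place in every format
    if opcode == 0:                         # R-type: opcode is 0, funct selects the operation
        return {
            "type": "R",
            "rs": (word >> 21) & 0x1F,
            "rt": (word >> 16) & 0x1F,
            "rd": (word >> 11) & 0x1F,
            "shamt": (word >> 6) & 0x1F,
            "funct": word & 0x3F,
        }
    if opcode in (2, 3):                    # J-type: j and jal carry a 26-bit immediate
        return {"type": "J", "opcode": opcode, "target": word & 0x3FFFFFF}
    return {                                # assumption: everything else is I-type
        "type": "I",
        "opcode": opcode,
        "rs": (word >> 21) & 0x1F,
        "rt": (word >> 16) & 0x1F,
        "imm": word & 0xFFFF,
    }


if __name__ == "__main__":
    # 0x012A4020 encodes add $t0, $t1, $t2
    print(decode_mips(0x012A4020))
```

Because every field sits at a fixed bit position, the register numbers can be extracted before the opcode has been fully interpreted, which is the hardware benefit the uniform-decode slides discuss.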


What Are the Elements of An ISA?
• Instructions
• Opcode
• Operand specifiers (addressing modes)
• How to obtain the operand? Why are there different addressing modes?
• Data types
• Definition: Representation of information for which there are instructions that operate on the representation
• Integer, floating point, character, binary, decimal, BCD
• Doubly linked list, queue, string, bit vector, stack
• VAX: INSQUEUE and REMQUEUE instructions on a doubly linked list or queue; FINDFIRST
• Digital Equipment Corp., “VAX11 780 Architecture Handbook,” 1977.
• X86: SCAN opcode operates on character strings; PUSH/POP

Data Type Tradeoffs
• What is the benefit of having more or high-level data types in the ISA?
• What is the disadvantage?
• Think compiler/programmer vs. microarchitect
• Concept of semantic gap
• Data types coupled tightly to the semantic level, or complexity of instructions
• Example: Early RISC architectures vs. Intel 432
• Early RISC: Only integer data type
• Intel 432: Object data type, capability based machine

Complex vs. Simple Instructions
• Complex instruction: An instruction does a lot of work, e.g. many operations
• Insert in a doubly linked list
• Compute FFT
• String copy
• Simple instruction: An instruction does a small amount of work; it is a primitive from which complex operations can be built
• Add
• XOR
• Multiply

Complex vs. Simple Instructions
• Advantages of Complex Instructions
+ Denser encoding → smaller code size → better memory utilization, saves off-chip bandwidth, better cache hit rate (better packing of instructions)
+ Simpler compiler: no need to optimize small instructions as much
• Disadvantages of Complex Instructions
- Larger chunks of work → compiler has less opportunity to optimize (limited in fine-grained optimizations it can do)
- More complex hardware → translation from a high level to control signals and optimization needs to be done by hardware


ISA-level Tradeoffs: Semantic Gap
• Where to place the ISA? Semantic gap
• Closer to high-level language (HLL) → Small semantic gap, complex instructions
• Closer to hardware control signals? → Large semantic gap, simple instructions
• RISC vs. CISC machines
• RISC: Reduced instruction set computer
• CISC: Complex instruction set computer
• FFT, QUICKSORT, POLY, FP instructions?
• VAX INDEX instruction (array access with bounds checking)

ISA-level Tradeoffs: Semantic Gap
• Some tradeoffs (for you to think about)
• Simple compiler, complex hardware vs. complex compiler, simple hardware
• Burden of backward compatibility
• Performance? Energy Consumption?
• Optimization opportunity: Example of VAX INDEX instruction: who (compiler vs. hardware) puts more effort into optimization?
• Instruction size, code size

Small versus Large Semantic Gap
• CISC vs. RISC
• Complex instruction set computer → complex instructions
• Initially motivated by “not good enough” code generation
• Reduced instruction set computer → simple instructions
• John Cocke, mid 1970s, IBM 801
• Goal: enable better compiler control and optimization
• RISC motivated by
• Memory stalls (no work done in a complex instruction when there is a memory stall?)
• When is this correct?
• Simplifying the hardware → lower cost, higher frequency
• Enabling the compiler to optimize the code better
• Find fine-grained parallelism to reduce stalls

ISA-level Tradeoffs: Instruction Length
• Fixed length: Length of all instructions the same
+ Easier to decode single instruction in hardware
+ Easier to decode multiple instructions concurrently
-- Wasted bits in instructions (Why is this bad?)
-- Harder-to-extend ISA (how to add new instructions?)
• Variable length: Length of instructions different (determined by opcode and sub-opcode)
+ Compact encoding (Why is this good?)
Intel 432: 6 to 321 bit instructions.
-- More logic to decode a single instruction
-- Harder to decode multiple instructions concurrently
• Tradeoffs
• Code size (memory space, bandwidth, latency) vs. hardware complexity
• ISA extensibility and expressiveness vs. hardware complexity
• Performance? Energy? Smaller code vs. ease of decode


ISA-level Tradeoffs: Uniform Decode
• Uniform decode: Same bits in each instruction correspond to the same meaning
• Opcode is always in the same location
• Ditto operand specifiers, immediate values, …
• Many “RISC” ISAs: Alpha, MIPS, SPARC
+ Easier decode, simpler hardware
+ Enables parallelism: generate target address before knowing the instruction is a branch
-- Restricts instruction format (fewer instructions?) or wastes space
• Non-uniform decode
• E.g., opcode can be the 1st-7th byte in x86
+ More compact and powerful instruction format
-- More complex decode logic

ISA-level Tradeoffs: Number of Registers
• Affects:
• Number of bits used for encoding register address
• Number of values kept in fast storage (register file)
• (uarch) Size, access time, power consumption of register file
• Large number of registers:
+ Enables better register allocation (and optimizations) by compiler → fewer saves/restores
-- Larger instruction size
-- Larger register file size

ISA-level Tradeoffs: Addressing Modes
• Addressing mode specifies how to obtain an operand of an instruction
• Register
• Immediate
• Memory (displacement, register indirect, indexed, absolute, memory indirect, autoincrement, autodecrement, …)
• More modes:
+ help better support programming constructs (arrays, pointer-based accesses)
-- make it harder for the architect to design
-- too many choices for the compiler?
• Many ways to do the same thing complicates compiler design
• Wulf, “Compilers and Computer Architecture,” IEEE Computer 1981

A Note on RISC vs. CISC
• Usually, …
• RISC
• Simple instructions
• Fixed length
• Uniform decode
• Few addressing modes
• CISC
• Complex instructions
• Variable length
• Non-uniform decode
• Many addressing modes


Food for Thought for You
• How would you design a new ISA?
• Where would you place it?
• What design choices would you make in terms of ISA properties?
• What would be the first question you ask in this process?
• “What is my design point?”
Look Forward & Up

Y86-64 Instruction Set #1

Instruction          Byte 0   Byte 1   Bytes 2-9
halt                 0 0
nop                  1 0
cmovXX rA, rB        2 fn     rA rB
irmovq V, rB         3 0      F rB     V
rmmovq rA, D(rB)     4 0      rA rB    D
mrmovq D(rB), rA     5 0      rA rB    D
OPq rA, rB           6 fn     rA rB
jXX Dest             7 fn     Dest (bytes 1-8)
call Dest            8 0      Dest (bytes 1-8)
ret                  9 0
pushq rA             A 0      rA F
popq rA              B 0      rA F
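Y86-64 is a variable-length encoding, and the table above is enough to work out each instruction's length from its first byte. The following is a minimal sketch of that lookup; the names `Y86_FORMAT` and `instruction_length` are hypothetical, and the table simply restates the slide (a register byte adds 1 byte, an 8-byte constant V, D, or Dest adds 8).

```python
# icode -> (mnemonic, has register byte, has 8-byte constant), transcribed from the slide
Y86_FORMAT = {
    0x0: ("halt", False, False),
    0x1: ("nop", False, False),
    0x2: ("cmovXX", True, False),
    0x3: ("irmovq", True, True),
    0x4: ("rmmovq", True, True),
    0x5: ("mrmovq", True, True),
    0x6: ("OPq", True, False),
    0x7: ("jXX", False, True),
    0x8: ("call", False, True),
    0x9: ("ret", False, False),
    0xA: ("pushq", True, False),
    0xB: ("popq", True, False),
}


def instruction_length(first_byte: int) -> int:
    """Return the length in bytes of the instruction whose first byte is given."""
    icode = first_byte >> 4                 # high nibble of byte 0 is the instruction code
    _, has_regs, has_const = Y86_FORMAT[icode]
    return 1 + (1 if has_regs else 0) + (8 if has_const else 0)


if __name__ == "__main__":
    for byte0 in (0x00, 0x60, 0x30, 0x70):
        print(hex(byte0), instruction_length(byte0))
```

This is the same information the fetch stage needs to compute the next PC (PC+2 for OPq, PC+10 for the move instructions with a constant).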

Now That We Have an ISA
• How do we implement it?
• i.e., how do we design a system that obeys the hardware/software interface?

Implementing the ISA: Microarchitecture Basics


How Does a Machine Process Instructions?
• What does processing an instruction mean?
• Remember the von Neumann model

AS = Architectural (programmer visible) state before an instruction is processed
↓
Process instruction
↓
AS’ = Architectural (programmer visible) state after an instruction is processed

• Processing an instruction: Transforming AS to AS’ according to the ISA specification of the instruction

The “Process instruction” Step
• ISA specifies abstractly what AS’ should be, given an instruction and AS
• It defines an abstract finite state machine where
• State = programmer-visible state
• Next-state logic = instruction execution specification
• From ISA point of view, there are no “intermediate states” between AS and AS’ during instruction execution
• One state transition per instruction
• Microarchitecture implements how AS is transformed to AS’
• There are many choices in implementation
• We can have programmer-invisible state to optimize the speed of instruction execution: multiple state transitions per instruction
• Choice 1: AS → AS’ (transform AS to AS’ in a single clock cycle)
• Choice 2: AS → AS+MS1 → AS+MS2 → AS+MS3 → AS’ (take multiple clock cycles to transform AS to AS’)
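The view of an ISA as an abstract finite state machine can be made concrete with a pure next-state function. The sketch below is a toy under stated assumptions: `ArchState`, `run`, and the one-instruction machine `toy_increment` are hypothetical names invented for illustration, not part of Y86-64.

```python
from dataclasses import dataclass, replace
from typing import Callable, Tuple


@dataclass(frozen=True)
class ArchState:
    """Programmer-visible state (AS): a PC and a small register file."""
    pc: int
    regs: Tuple[int, ...]


def toy_increment(state: ArchState) -> ArchState:
    """Hypothetical one-instruction ISA: r0 <- r0 + 1, PC <- PC + 1."""
    regs = (state.regs[0] + 1,) + state.regs[1:]
    return replace(state, pc=state.pc + 1, regs=regs)


def run(state: ArchState, next_state: Callable[[ArchState], ArchState], n: int) -> ArchState:
    """Apply the ISA's next-state function n times: one AS -> AS' transition per instruction."""
    for _ in range(n):
        state = next_state(state)
    return state


if __name__ == "__main__":
    print(run(ArchState(pc=0, regs=(0, 0)), toy_increment, 3))
```

The function `toy_increment` plays the role of the ISA specification; whether hardware realizes it in one clock cycle (Choice 1) or several (Choice 2) is invisible at this level.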

A Very Basic Instruction Processing Engine
• Each instruction takes a single clock cycle to execute
• Only combinational logic is used to implement instruction execution
• No intermediate, programmer-invisible state updates

AS = Architectural (programmer visible) state at the beginning of a clock cycle
↓
Process instruction in one clock cycle
↓
AS’ = Architectural (programmer visible) state at the end of a clock cycle

A Very Basic Instruction Processing Engine
• Single-cycle machine
[Figure: AS (State) feeds Combinational Logic, which produces AS’ and writes it back into the state]
• What is the clock cycle time determined by?
• What is the critical path of the combinational logic determined by?


Assembly/Machine Code View
[Figure: CPU (PC, Registers, Condition Codes) connected to Memory (Code, Data, Stack) by Addresses, Data, and Instructions]

Programmer-Visible State
• PC: Program counter
• Address of next instruction
• Called “RIP” (x86-64)
• Register file
• Heavily used program data
• Condition codes
• Store status information about most recent arithmetic or logical operation
• Used for conditional branching
• Memory
• Byte addressable array
• Code and user data
• Stack to support procedures

Instructions (and programs) specify how to transform the values of programmer visible state

Single-cycle vs. Multi-cycle Machines
• Single-cycle machines
• Each instruction takes a single clock cycle
• All state updates made at the end of an instruction’s execution
• Big disadvantage: The slowest instruction determines cycle time → long clock cycle time
• Multi-cycle machines
• Instruction processing broken into multiple cycles/stages
• State updates can be made during an instruction’s execution
• Architectural state updates made only at the end of an instruction’s execution
• Advantage over single-cycle: The slowest “stage” determines cycle time
• Both single-cycle and multi-cycle machines literally follow the von Neumann model at the microarchitecture level
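The claim that the slowest instruction sets the single-cycle clock, while the slowest stage sets the multi-cycle clock, can be checked with a few lines of arithmetic. The latencies below are made-up numbers chosen only for illustration, and the names `STAGE_LATENCY_PS`, `single_cycle_time`, and `multi_cycle_time` are hypothetical.

```python
# Hypothetical stage latencies in picoseconds, per instruction class.
STAGE_LATENCY_PS = {
    "OPq":    {"fetch": 200, "decode": 100, "execute": 150, "writeback": 100},
    "mrmovq": {"fetch": 200, "decode": 100, "execute": 150, "memory": 250, "writeback": 100},
    "rmmovq": {"fetch": 200, "decode": 100, "execute": 150, "memory": 250},
}


def single_cycle_time(latencies: dict) -> int:
    """Every instruction must finish in one cycle, so the slowest instruction sets the clock."""
    return max(sum(stages.values()) for stages in latencies.values())


def multi_cycle_time(latencies: dict) -> int:
    """One stage per cycle, so the slowest single stage sets the clock."""
    return max(max(stages.values()) for stages in latencies.values())


if __name__ == "__main__":
    print("single-cycle clock (ps):", single_cycle_time(STAGE_LATENCY_PS))
    print("multi-cycle clock (ps):", multi_cycle_time(STAGE_LATENCY_PS))
```

With these assumed numbers the load instruction dominates the single-cycle clock, while the memory stage alone dominates the multi-cycle clock.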

Instruction Processing “Stage”
• Instructions are processed under the direction of a “control unit” step by step.
• Instruction stage: Sequence of steps to process an instruction
• Fundamentally, there are five phases:
• Fetch
• Decode
• Evaluate Address/Fetch Operands
• Execute
• Store Result
• Not all instructions require all stages

Instruction Processing “Cycle” vs. Machine Clock Cycle
• Single-cycle machine:
• All phases of the instruction processing cycle take a single machine clock cycle to complete
• Multi-cycle machine:
• All phases of the instruction processing cycle can take multiple machine clock cycles to complete
• In fact, each phase can take multiple clock cycles to complete


Instruction Processing Viewed Another Way
• Instructions transform Data (AS) to Data’ (AS’)
• This transformation is done by functional units
• Units that “operate” on data
• These units need to be told what to do to the data
• An instruction processing engine consists of two components
• Datapath: Consists of hardware elements that deal with and transform data signals
• functional units that operate on data
• hardware structures (e.g. wires and muxes) that enable the flow of data into the functional units and registers
• storage units that store data (e.g., registers)
• Control logic: Consists of hardware elements that determine control signals, i.e., signals that specify what the datapath elements should do to the data

Single-cycle vs. Multi-cycle: Control & Data
• Single-cycle machine:
• Control signals are generated in the same clock cycle as the one during which data signals are operated on
• Everything related to an instruction happens in one clock cycle (serialized processing)
• Multi-cycle machine:
• Control signals needed in the next cycle can be generated in the current cycle
• Latency of control processing can be overlapped with latency of datapath operation (more parallelism)

Many Ways of Datapath and Control Design
• There are many ways of designing the data path and control logic
• Single-cycle, multi-cycle, pipelined datapath and control
• Hardwired/combinational vs. microcoded/microprogrammed control
• Control signals generated by combinational logic versus
• Control signals stored in a memory structure

Flash-Forward: Performance Analysis
• Execution time of an instruction
• {CPI} x {clock cycle time}
• Execution time of a program
• Sum over all instructions [{CPI} x {clock cycle time}]
• {# of instructions} x {Average CPI} x {clock cycle time}
• Single cycle microarchitecture performance
• CPI = 1
• Clock cycle time = long
• Multi-cycle microarchitecture performance
• CPI = different for each instruction
• Average CPI → hopefully small
• Clock cycle time = short
• Now, we have two degrees of freedom to optimize independently
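The execution-time formulas above can be exercised on a small, made-up workload. Everything numeric here is an assumption for illustration (the instruction counts, the cycles per instruction class, and the 800 ps and 250 ps clocks), and the function name `program_time_ps` is hypothetical.

```python
def program_time_ps(counts: dict, cycles: dict, cycle_time_ps: int) -> int:
    """Sum over all instructions of CPI x clock cycle time, grouped by instruction class."""
    total_cycles = sum(counts[name] * cycles[name] for name in counts)
    return total_cycles * cycle_time_ps


# Hypothetical dynamic instruction counts for a 1000-instruction program.
COUNTS = {"OPq": 500, "mrmovq": 300, "rmmovq": 200}

# Single-cycle: CPI = 1 for every instruction, long clock (assumed 800 ps).
SINGLE_CYCLE_CPI = {"OPq": 1, "mrmovq": 1, "rmmovq": 1}

# Multi-cycle: CPI differs per instruction, short clock (assumed 250 ps).
MULTI_CYCLE_CPI = {"OPq": 4, "mrmovq": 5, "rmmovq": 4}

if __name__ == "__main__":
    print("single-cycle:", program_time_ps(COUNTS, SINGLE_CYCLE_CPI, 800), "ps")
    print("multi-cycle: ", program_time_ps(COUNTS, MULTI_CYCLE_CPI, 250), "ps")
```

With these particular assumptions the multi-cycle design is not faster, because its average CPI of 4.3 outweighs the shorter clock. That is why the slide says the average CPI is only "hopefully" small: both factors have to be tuned together.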


A Single-Cycle Microarchitecture: A Closer Look

Remember…
• Single-cycle machine
[Figure: AS (State) feeds Combinational Logic, which produces AS’ and writes it back into the state]

Let’s Start with the State Elements
• Data and control inputs
[Figure: datapath elements and their ports: PC; Register file (srcA, srcB, dstW, valW, valA, valB, Reg Write); Instruction Mem (Instr Addr, Instruction); Data Mem (Address, Write Data, Read Data, Mem Read, Mem Write); ALU (A, B, Operation); MUX (0, 1, Select)]

For Now, We Will Assume
• “Magic” memory and register file
• Synchronous write
• the selected register is updated on the positive edge clock transition when write enable is asserted
• Cannot affect read output in between clock edges


Instruction Processing
• 6 (5) generic steps
• Instruction fetch (IF)
• Instruction decode and register operand fetch (ID/RF)
• Execute/Evaluate memory address (EX/AG)
• Memory operand fetch (MEM)
• Store/writeback result (WB)
• PC Update
[Figure, repeated on slides 49-52: datapath divided into IF, ID/RF, EX/AG, MEM, WB: PC and new PC, Instruction Mem (Instr Addr, Instruction), Register file (rA, rB, valA, valB, DestE, ValE), ALU, Data Mem (Address, Read Data, Write Data)]


Single-Cycle Datapath for Arithmetic and Logical Instructions

Executing Arith./Logical Operation
OPq rA, rB encoding: 6 fn | rA rB
• Fetch
• Read 2 bytes
• Decode
• Read operand registers
• Execute
• Perform operation
• Set condition codes
• Memory
• Do nothing
• Write back
• Update register
• PC Update
• Increment PC by 2

Stage Computation: Arith/Log. Ops

OPq rA, rB
Stage        Computation              Comment
Fetch        icode:ifun ← M1[PC]      Read instruction byte
             rA:rB ← M1[PC+1]         Read register byte
             valP ← PC+2              Compute next PC
Decode       valA ← R[rA]             Read operand A
             valB ← R[rB]             Read operand B
Execute      valE ← valB OP valA      Perform ALU operation
             Set CC                   Set condition code register
Memory
Write back   R[rB] ← valE             Write back result
PC update    PC ← valP                Update PC

• Formulate instruction execution as sequence of simple steps
• Use same general form for all instructions

ALU Datapath
[Figure: ALU datapath across IF, ID, EX, MEM, WB, PC: PC, Instruction Mem, Register file (Reg Write), ALU (ALU OP), Data Mem; the PC adder adds 2; labeled “Combinational state update logic”. **Based on original figure from [P&H CO&D, COPYRIGHT 2004 Elsevier. ALL RIGHTS RESERVED.]]
if MEM[PC] == OPq rA, rB
R[rB] ← R[rB] op R[rA]
PC ← PC + 2
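The stage computation for OPq can be walked through in software, one stage per comment. This is a sketch under stated assumptions, not the course's simulator: `step_opq` and `ALU_OPS` are hypothetical names, the fn codes (0 addq, 1 subq, 2 andq, 3 xorq) follow the usual Y86-64 definition rather than anything shown on the slide, and only the ZF and SF condition codes are modeled (overflow is omitted).

```python
MASK64 = (1 << 64) - 1  # registers hold 64-bit values

# Assumed fn codes; the slide only shows "fn".
ALU_OPS = {
    0x0: lambda b, a: b + a,   # addq
    0x1: lambda b, a: b - a,   # subq
    0x2: lambda b, a: b & a,   # andq
    0x3: lambda b, a: b ^ a,   # xorq
}


def step_opq(pc, regs, mem):
    """Process one OPq instruction at mem[pc]; mutates regs, returns (new PC, condition codes)."""
    # Fetch: instruction byte, register byte, next PC
    icode, ifun = mem[pc] >> 4, mem[pc] & 0xF
    rA, rB = mem[pc + 1] >> 4, mem[pc + 1] & 0xF
    valP = pc + 2
    if icode != 0x6:
        raise ValueError("not an OPq instruction")
    # Decode: read both operand registers
    valA, valB = regs[rA], regs[rB]
    # Execute: valE <- valB OP valA, then set condition codes
    valE = ALU_OPS[ifun](valB, valA) & MASK64
    cc = {"ZF": valE == 0, "SF": bool(valE >> 63)}
    # Memory: nothing to do
    # Write back: R[rB] <- valE
    regs[rB] = valE
    # PC update
    return valP, cc


if __name__ == "__main__":
    regs = [0] * 15
    regs[2], regs[3] = 3, 10
    print(step_opq(0, regs, bytes([0x61, 0x23])), regs[3])  # subq with rA=2, rB=3
```

In the single-cycle machine all of these steps are combinational logic evaluated within one clock cycle; the sequential Python is only a way to name the intermediate signals.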


We did not cover these slides in the class
• Will learn about these in the next class
• They are here for your benefit
[Figure: the ALU datapath from the previous slide, repeated on slide 57]

Single-Cycle Datapath for Data Movement Instructions

Executing mrmovq (Load from Mem to Reg)
mrmovq D(rB), rA encoding: 5 0 | rA rB | D
• Fetch
• Read 10 bytes
• Decode
• Read operand registers
• Execute
• Compute effective address
• Memory
• Read from memory
• Write back
• Write to Register
• PC Update
• Increment PC by 10


Stage Computation: mrmovq

mrmovq D(rB), rA
Stage        Computation              Comment
Fetch        icode:ifun ← M1[PC]      Read instruction byte
             rA:rB ← M1[PC+1]         Read register byte
             valC ← M8[PC+2]          Read displacement D
             valP ← PC+10             Compute next PC
Decode       valB ← R[rB]             Read operand B
Execute      valE ← valB + valC       Compute effective address
Memory       valM ← M8[valE]          Read value from memory
Write back   R[rA] ← valM             Write loaded value to register
PC update    PC ← valP                Update PC

• Use ALU for address computation

Ld Datapath
[Figure: Ld datapath across IF, ID, EX, MEM, WB, PC: a MUX selects “From ALU” or “From Mem” as the register write data; the PC adder adds 10; labeled “Combinational state update logic”]
if MEM[PC] == mrmovq Disp(rB), rA
EA = Disp + R[rB]
R[rA] ← MEM[EA]
PC ← PC + 10
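The mrmovq stages can be traced the same way in software. This is a minimal sketch, assuming a flat `bytearray` for memory and little-endian 8-byte constants (the Y86-64 convention); the name `step_mrmovq` is hypothetical.

```python
MASK64 = (1 << 64) - 1


def step_mrmovq(pc, regs, mem):
    """Process one mrmovq D(rB), rA at mem[pc]; mutates regs, returns the new PC."""
    # Fetch: instruction byte, register byte, 8-byte displacement, next PC
    icode = mem[pc] >> 4
    rA, rB = mem[pc + 1] >> 4, mem[pc + 1] & 0xF
    valC = int.from_bytes(mem[pc + 2:pc + 10], "little")
    valP = pc + 10
    if icode != 0x5:
        raise ValueError("not an mrmovq instruction")
    # Decode: only the base register is read
    valB = regs[rB]
    # Execute: the ALU computes the effective address
    valE = (valB + valC) & MASK64
    # Memory: read 8 bytes from the effective address
    valM = int.from_bytes(mem[valE:valE + 8], "little")
    # Write back: R[rA] <- valM
    regs[rA] = valM
    # PC update
    return valP


if __name__ == "__main__":
    mem = bytearray(64)
    mem[0], mem[1] = 0x50, 0x03                  # mrmovq 8(r3), r0
    mem[2:10] = (8).to_bytes(8, "little")
    mem[32:40] = (0x1122).to_bytes(8, "little")  # value stored at address 32
    regs = [0] * 15
    regs[3] = 24
    print(step_mrmovq(0, regs, mem), hex(regs[0]))
```

Note that the value written to the register comes from memory rather than from the ALU, which is why the load datapath needs the “From ALU” / “From Mem” MUX.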

[Figure: the Ld datapath for mrmovq, repeated on slides 63-66]

Executing rmmovq (St from reg to Memory)
rmmovq rA, D(rB) encoding: 4 0 | rA rB | D
• Fetch
• Read 10 bytes
• Decode
• Read operand registers
• Execute
• Compute effective address
• Memory
• Write to memory
• Write back
• Do nothing
• PC Update
• Increment PC by 10

Stage Computation: rmmovq

rmmovq rA, D(rB)
Stage        Computation              Comment
Fetch        icode:ifun ← M1[PC]      Read instruction byte
             rA:rB ← M1[PC+1]         Read register byte
             valC ← M8[PC+2]          Read displacement D
             valP ← PC+10             Compute next PC
Decode       valA ← R[rA]             Read operand A
             valB ← R[rB]             Read operand B
Execute      valE ← valB + valC       Compute effective address
Memory       M8[valE] ← valA          Write value to memory
Write back
PC update    PC ← valP                Update PC

• Use ALU for address computation
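A store follows the same address computation as a load but writes memory instead of a register. The sketch below again assumes a flat `bytearray` memory and little-endian 8-byte values; `step_rmmovq` is a hypothetical name.

```python
MASK64 = (1 << 64) - 1


def step_rmmovq(pc, regs, mem):
    """Process one rmmovq rA, D(rB) at mem[pc]; mutates mem, returns the new PC."""
    # Fetch: instruction byte, register byte, 8-byte displacement, next PC
    icode = mem[pc] >> 4
    rA, rB = mem[pc + 1] >> 4, mem[pc + 1] & 0xF
    valC = int.from_bytes(mem[pc + 2:pc + 10], "little")
    valP = pc + 10
    if icode != 0x4:
        raise ValueError("not an rmmovq instruction")
    # Decode: read the value to store and the base register
    valA, valB = regs[rA], regs[rB]
    # Execute: the ALU computes the effective address
    valE = (valB + valC) & MASK64
    # Memory: M8[valE] <- valA
    mem[valE:valE + 8] = valA.to_bytes(8, "little")
    # Write back: nothing to do
    # PC update
    return valP


if __name__ == "__main__":
    mem = bytearray(64)
    mem[0], mem[1] = 0x40, 0x03                  # rmmovq r0, 8(r3)
    mem[2:10] = (8).to_bytes(8, "little")
    regs = [0] * 15
    regs[0], regs[3] = 0xABCD, 24
    print(step_rmmovq(0, regs, mem), mem[32:40].hex())
```

The register file is only read here, so the datapath must keep Reg Write deasserted and assert Mem Write instead.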


St Datapath
[Figure, repeated on slides 69-72: St datapath across IF, ID, EX, MEM, WB, PC, with Reg Write and Mem Write control signals; the PC adder adds 10; labeled “Combinational state update logic”]
if MEM[PC] == rmmovq rA, Disp(rB)
EA = Disp + R[rB]
MEM[EA] ← R[rA]
PC ← PC + 10


Executing irmovq (Move imm to Reg)
irmovq V, rB encoding: 3 0 | F rB | V
• Fetch
• Read 10 bytes
• Decode
• No register operands to read
• Execute
• Add 0 to V
• Memory
• Do nothing
• Write back
• Write V to rB
• PC Update
• Increment PC by 10

Stage Computation: irmovq

irmovq V, rB
Stage        Computation              Comment
Fetch        icode:ifun ← M1[PC]      Read instruction byte
             rA:rB ← M1[PC+1]         Read register byte
             valC ← M8[PC+2]          Read immediate V
             valP ← PC+10             Compute next PC
Decode
Execute      valE ← 0 + valC          Pass V through the ALU
Memory
Write back   R[rB] ← valE             Write V to register
PC update    PC ← valP                Update PC

• Use ALU to pass the immediate through (0 + V)
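For irmovq, the ALU is used only to pass the immediate through by adding 0, so that the result reaches the register file on the same path as any other ALU result. A minimal sketch, assuming little-endian constants and the hypothetical name `step_irmovq`:

```python
MASK64 = (1 << 64) - 1


def step_irmovq(pc, regs, mem):
    """Process one irmovq V, rB at mem[pc]; mutates regs, returns the new PC."""
    # Fetch: instruction byte, register byte, 8-byte immediate, next PC
    icode = mem[pc] >> 4
    rB = mem[pc + 1] & 0xF          # the rA nibble is F (no register)
    valC = int.from_bytes(mem[pc + 2:pc + 10], "little")
    valP = pc + 10
    if icode != 0x3:
        raise ValueError("not an irmovq instruction")
    # Decode: no register operands
    # Execute: valE <- 0 + valC, so V travels through the ALU
    valE = (0 + valC) & MASK64
    # Memory: nothing to do
    # Write back: R[rB] <- valE
    regs[rB] = valE
    # PC update
    return valP


if __name__ == "__main__":
    mem = bytearray(16)
    mem[0], mem[1] = 0x30, 0xF2                  # irmovq $0x100, r2
    mem[2:10] = (0x100).to_bytes(8, "little")
    regs = [0] * 15
    print(step_irmovq(0, regs, mem), hex(regs[2]))
```

Routing V through the ALU costs a MUX on the ALU input but reuses the existing write-back path; writing V to the register file directly would need a different path instead.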

IRMov Datapath: Option 1
[Figure, repeated on slides 75-79: irmovq datapath across IF, ID, EX, MEM, WB, PC, with a MUX that feeds 0 into the ALU; the PC adder adds 10; labeled “Combinational state update logic”]
if MEM[PC] == irmovq V, rB
R[rB] ← V + 0
PC ← PC + 10


IRMov Datapath: Option 2
[Figure, slides 80-81: irmovq datapath across IF, ID, EX, MEM, WB, PC, without the 0 input MUX on the ALU; the PC adder adds 10; labeled “Combinational state update logic”]
if MEM[PC] == irmovq V, rB
R[rB] ← V
PC ← PC + 10

• Tradeoffs between option 1 and option 2?

Executing rrmovq (Move from Reg to Reg)
rrmovq rA, rB encoding: 2 0 | rA rB
• Fetch
• Read 2 bytes
• Decode
• Read operand register rA
• Execute
• Add 0 to val rA
• Memory
• Do nothing
• Write back
• Write val rA to rB
• PC Update
• Increment PC by 2

Stage Computation: rrmovq

rrmovq rA, rB
Stage        Computation              Comment
Fetch        icode:ifun ← M1[PC]      Read instruction byte
             rA:rB ← M1[PC+1]         Read register byte
             valP ← PC+2              Compute next PC
Decode       valA ← R[rA]             Read operand A
Execute      valE ← 0 + valA          Pass valA through the ALU
Memory
Write back   R[rB] ← valE             Write result to register
PC update    PC ← valP                Update PC

• Use ALU to pass valA through (0 + valA)
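The rrmovq stages mirror irmovq, except that the value passed through the ALU comes from a register instead of the instruction bytes. A minimal sketch with the hypothetical name `step_rrmovq`; it handles only the unconditional move (fn = 0) and rejects the conditional cmovXX variants.

```python
MASK64 = (1 << 64) - 1


def step_rrmovq(pc, regs, mem):
    """Process one rrmovq rA, rB at mem[pc]; mutates regs, returns the new PC."""
    # Fetch: instruction byte, register byte, next PC
    icode, ifun = mem[pc] >> 4, mem[pc] & 0xF
    rA, rB = mem[pc + 1] >> 4, mem[pc + 1] & 0xF
    valP = pc + 2
    if icode != 0x2 or ifun != 0x0:
        raise ValueError("not an unconditional rrmovq instruction")
    # Decode: read the source register
    valA = regs[rA]
    # Execute: valE <- 0 + valA, so the value travels through the ALU
    valE = (0 + valA) & MASK64
    # Memory: nothing to do
    # Write back: R[rB] <- valE
    regs[rB] = valE
    # PC update
    return valP


if __name__ == "__main__":
    regs = [0] * 15
    regs[1] = 42
    print(step_rrmovq(0, regs, bytes([0x20, 0x12])), regs[2])  # rrmovq r1, r2
```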


rrmov Datapath: Option 1
[Figure, repeated on slides 85-88: rrmovq datapath across IF, ID, EX, MEM, WB, PC, with a MUX that feeds 0 into the ALU; the PC adder adds 2; labeled “Combinational state update logic”]
if MEM[PC] == rrmovq rA, rB
R[rB] ← R[rA]
PC ← PC + 2


rrmov Datapath: Option 2
[Figure: rrmovq datapath across IF, ID, EX, MEM, WB, PC, with a “?” over the datapath; labeled “Combinational state update logic”]
if MEM[PC] == rrmovq rA, rB
R[rB] ← R[rA]
PC ← PC + 2
