Processor Organization & Instruction Cycle

CHAPTER 5
Processor Organization & Instruction Cycle
Instruction Sets Review
Q. Consider the following assembly code

read:
(a) MOV R1, 200        (R1 ← 200)
(b) MOV R2, [R1]
(c) MOV R3, [R1+1]
(d) JMP calculate
calculate:
(e) ADD R3, R2         (R3 ← R3 + R2)
(f) MOV [300], R3

Memory (address: contents)
199: 235
200: 420
201: 330
202: 0
300: 15

CPU registers (initial values)
R1: 0
R2: 0
R3: 0
Instruction Sets Review
1. What types of instructions are used in the program?

2. What addressing modes are used in the program?
3. What will be the values of R1, R2 and R3 after the
execution of the program?
4. Assume a processor has 12 registers (16 bits each) and
an instruction set with 30 instructions. Show possible
instruction formats for the following instructions (how
many bits are required for each instruction field?)
a. MOV R2,[R1]
b. ADD R3,R2,R1
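
One way to check answers to questions 2 to 4 is to trace the program by hand or with a small script. The sketch below is only an illustration: it hard-codes instructions (a) to (f) rather than parsing assembly, the addressing-mode names in the comments follow the usual textbook taxonomy, and `run_program` and `field_bits` are hypothetical helper names, not part of the slides.

```python
def run_program(memory):
    """Execute instructions (a)-(f) from the question on a copy of memory."""
    mem = dict(memory)
    reg = {"R1": 0, "R2": 0, "R3": 0}
    reg["R1"] = 200                      # (a) MOV R1, 200     immediate operand
    reg["R2"] = mem[reg["R1"]]           # (b) MOV R2, [R1]    register indirect
    reg["R3"] = mem[reg["R1"] + 1]       # (c) MOV R3, [R1+1]  displacement (base + offset)
    # (d) JMP calculate: the label follows immediately, so execution continues here
    reg["R3"] = reg["R3"] + reg["R2"]    # (e) ADD R3, R2      register operands
    mem[300] = reg["R3"]                 # (f) MOV [300], R3   direct memory address
    return reg, mem


def field_bits(count):
    """Minimum number of bits needed to encode `count` distinct values."""
    return (count - 1).bit_length()


memory = {199: 235, 200: 420, 201: 330, 202: 0, 300: 15}
registers, final_memory = run_program(memory)
print(registers)                        # {'R1': 200, 'R2': 420, 'R3': 750}
print(final_memory[300])                # 750
print(field_bits(30), field_bits(12))   # 5 4
```

Under these assumptions an opcode field needs at least 5 bits (30 instructions) and each register field at least 4 bits (12 registers); how the fields are arranged in the instruction word is left open by the question.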
Processor Organization
What is a processor (CPU) required to do?
Fetch and execute instructions

Step                            Components used    Source / destination
Fetch Instruction               PC, IR             From memory
Interpret (decode) Instruction  Decoding circuit
[Fetch Data]                    MAR, MBR           From memory, I/O
[Process Data]                  ALU
[Write Data]                    MAR, MBR           To memory, I/O

Processor Organization (cont'd)
CPU contains:
Registers
Internal processor memory
ALU
performs arithmetic and logic operations (processes data)
Operates only on data in registers
The ALU together with its inputs and outputs is termed the data path
Control Unit
Decodes instructions, generates control signals to control
the processor
Internal Bus
Interconnects CPU parts
Register Organization
Types of registers
User-visible registers
They can be directly accessed (read or written to) by
programmers (instructions)
Used to minimize memory reference
Control registers
Used by control unit to control operation of the processor
Status (flag) registers
Indicate the current state (status) of the processor
There is no clean separation of registers into these categories
(it depends on the processor)
User-visible registers
General purpose registers

Can be used for a variety of functions
(hold data, used for addressing)
Data registers
Hold only data
e.g. Accumulator (working) register used to store
intermediate ALU results
Address registers
Only used for addressing
e.g. Segment registers (SS, DS, CS and ES in x86)
Index registers (SI, DI in x86)
Stack pointer
Control registers
Program Counter (PC): Contains the address of the next instruction
to be fetched
Instruction Register (IR): Temporarily holds the most recently
fetched instruction
Memory Address Register (MAR): Specifies the address in
memory of the word to be written from or read into the MBR
Memory Buffer Register (MBR): Contains a word to be stored
in memory or is used to receive a word from memory
Status registers
e.g. Flag register (x86), CPSR(ARM)
Flags: Indicate the occurrence of an event in the CPU
Carry flag (CF), Zero flag (ZF), Sign flag (SF), Interrupt
flag (IF), Overflow flag (OF)
Used by branch (jump) instructions and interrupts
(The CPU checks the appropriate flags when a conditional branch
instruction is encountered or when interrupts are enabled)
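Instruction Cycle
e.g. MOV R1, [200]
Memory: address 100 holds the instruction MOV R1, [200]; address 200 holds the value 10

Fetch cycle
1. CPU places PC (100) on the address bus; memory selects address 100
2. Memory returns MOV R1, [200] over the data bus; CPU loads it into IR

3. The decoder interprets the instruction and loads MAR with 200

Execute cycle (fetch operand)
4. CPU places MAR (200) on the address bus; memory selects address 200
5. Memory returns 10 over the data bus; CPU loads it into MBR
6. CPU copies MBR (10) into R1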
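
The register transfers in this example can be traced with a few lines of code. This is a minimal sketch, not a model of any real processor: the tuple encoding `("MOV", "R1", 200)` for the instruction and the function name `run_instruction_cycle` are made up for illustration, and incrementing the PC by one after the fetch is an assumption the slide does not show.

```python
def run_instruction_cycle(memory, pc):
    """Fetch, decode and execute one MOV Rn, [address] instruction."""
    regs = {"PC": pc, "IR": None, "MAR": None, "MBR": None, "R1": 0}

    # Fetch cycle: PC goes out on the address bus, the instruction
    # comes back over the data bus and is loaded into IR.
    address_bus = regs["PC"]
    regs["IR"] = memory[address_bus]
    regs["PC"] += 1  # assumption: one memory word per instruction

    # Decode: the (hypothetical) tuple encoding is unpacked and the
    # operand address is placed in MAR.
    opcode, destination, operand_address = regs["IR"]
    if opcode != "MOV":
        raise ValueError("this sketch only handles MOV Rn, [address]")
    regs["MAR"] = operand_address

    # Execute cycle (fetch operand): MAR goes out on the address bus,
    # the operand comes back into MBR and is copied to the register.
    address_bus = regs["MAR"]
    regs["MBR"] = memory[address_bus]
    regs[destination] = regs["MBR"]
    return regs


memory = {100: ("MOV", "R1", 200), 200: 10}
final = run_instruction_cycle(memory, pc=100)
print(final["MAR"], final["MBR"], final["R1"])  # 200 10 10
```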
Instruction Cycle with Interrupt
Main cycle:
Fetch Instruction
Interpret (decode) Instruction
[Fetch Data]
[Process Data]
[Write Data]
Interrupt?
No: fetch the next instruction
Yes: Process Interrupt

Process Interrupt:
Store PC in memory (stack)
Load the address of the ISR into the PC
Execute interrupt routine (ISR)
Restore PC from memory (stack)
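
The interrupt handling steps can be sketched as a loop that checks for an interrupt at the end of every instruction cycle. Everything concrete here is an assumption made for illustration: instructions are plain strings, `"RETI"` is a made-up marker for the end of the ISR, and the interrupt is raised while the instruction at address `interrupt_during` executes.

```python
def run_with_interrupt(program, isr, interrupt_during):
    """Return the instructions in the order they are executed."""
    memory = {**program, **isr}
    isr_address = min(isr)
    stack = []
    executed = []
    pc = min(program)
    while pc in memory:
        current = pc
        instruction = memory[current]    # fetch
        pc += 1                          # PC now holds the next address
        if instruction == "RETI":
            pc = stack.pop()             # restore PC from memory (stack)
            continue
        executed.append(instruction)     # stand-in for decode and execute
        if current == interrupt_during:  # interrupt check at end of cycle
            stack.append(pc)             # store PC in memory (stack)
            pc = isr_address             # load address of the ISR into PC
    return executed


program = {100: "I1", 101: "I2", 102: "I3"}
isr = {900: "ISR1", 901: "ISR2", 902: "RETI"}
order = run_with_interrupt(program, isr, interrupt_during=101)
print(order)  # ['I1', 'I2', 'ISR1', 'ISR2', 'I3']
```

The interrupted program resumes at I3 because the saved PC already pointed at the next instruction when it was pushed.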
Instruction Pipelining
In this lecture:
Pipelining
Pipelining hazards
Resource hazards
Data hazards
Control hazards
Review
$\text{Execution time} = \text{Instruction count} \times \text{CPI} \times \text{Clock period}$
CPI: Average clock cycles per instruction
e.g. Suppose a program has 10 instructions with the following
relationship between instructions and the clock cycles required
to execute each instruction

No. of instructions   Clock cycles
4                     1
3                     2
3                     3

The CPI for this program is given by:
$\text{CPI} = \frac{4 \times 1 + 3 \times 2 + 3 \times 3}{10} = 1.9$
(10 instructions with 19 clock cycles)
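
The same CPI calculation can be written as a weighted average. In this short sketch the list-of-pairs representation and the name `average_cpi` are assumptions chosen for illustration.

```python
def average_cpi(instruction_mix):
    """instruction_mix: list of (number_of_instructions, cycles_each) pairs."""
    total_instructions = sum(count for count, _ in instruction_mix)
    total_cycles = sum(count * cycles for count, cycles in instruction_mix)
    return total_cycles / total_instructions


mix = [(4, 1), (3, 2), (3, 3)]  # the table from the example
print(average_cpi(mix))  # 1.9
```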
Review (cont'd)
To reduce execution time:
Reduce the clock period (increase the clock frequency)
(Improves response time)
Reduce CPI (execute more instructions with the
same number of clock cycles)
(Improves throughput)
One approach to reducing CPI is to overlap the execution of
instructions (pipelining)
Pipelining
The instruction cycle has several stages (fetch, decode,
execute)
Let instructions execute one after the other
(assume one clock cycle per stage, i.e. 3 clock cycles per instruction)

Clk            1       2       3       4       5       6       7       8       9
Instruction 1  Fetch   Decode  Execute
Instruction 2                          Fetch   Decode  Execute
Instruction 3                                                  Fetch   Decode  Execute

9 clock cycles for 3 instructions, 3n clock cycles for n instructions
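Pipelining (cont'd)
Let the instruction stages overlap
While instruction 1 is being decoded, instruction 2
is fetched, and so on

Clk            1       2       3       4       5
Instruction 1  Fetch   Decode  Execute
Instruction 2          Fetch   Decode  Execute
Instruction 3                  Fetch   Decode  Execute

5 clock cycles for 3 instructions (CPI is reduced)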
Pipelining (cont'd)
Additional hardware is required for a pipelined
processor (pipeline registers between the stages)

PC -> Fetch (FI) -> [FI/DI registers] -> Decode (DI) -> [DI/EI registers] -> Execute (EI)
More stages
In practice the three stages may take different times (clock
cycles): execution may take more time than decoding. This
would reduce the effectiveness of the pipeline

Stage      Fetch   Decode  Execute
Duration   10 ns   10 ns   30 ns

The instruction currently being decoded has to wait until the previous
instruction has been executed
Throughput is limited by the slowest stage
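
The effect of an unbalanced stage can be estimated with a small calculation. The sketch below assumes the pipeline is clocked at the duration of its slowest stage and looks only at steady state (pipeline already full, no hazards); the function name is hypothetical.

```python
def steady_state_speedup(stage_times_ns):
    """Speed-up of a full pipeline over executing the stages back to back."""
    time_per_instruction_sequential = sum(stage_times_ns)
    # Assumption: one instruction completes every slowest-stage time.
    time_per_instruction_pipelined = max(stage_times_ns)
    return time_per_instruction_sequential / time_per_instruction_pipelined


print(round(steady_state_speedup([10, 10, 30]), 2))  # 1.67
print(steady_state_speedup([10, 10, 10, 10, 10]))    # 5.0
```

With the 10/10/30 ns split the three-stage pipeline gains well under the ideal factor of 3, whereas five equal 10 ns stages would approach a factor of 5.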
More stages (cont'd)
If we have more stages:
The stages will be of more nearly equal duration
Program execution time is reduced further
e.g. 5-stage pipeline

Stage      Fetch Instr. (FI)  Decode Instr. (DI)  Fetch Operands (FO)  Execute Instr. (EI)  Write Operand (WO)
Duration   10 ns              10 ns               10 ns                10 ns                10 ns

Operands can be fetched from memory or from registers
The operand (result) can be written to memory or to registers
5-stage Pipeline
Assume:
All instructions require all five stages
Equal duration for each stage

Time  1   2   3   4   5   6   7
I1    FI  DI  FO  EI  WO
I2        FI  DI  FO  EI  WO
I3            FI  DI  FO  EI  WO

Assuming one clock cycle per stage, 3 instructions
would require 7 clock cycles
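
A timing diagram like this can be generated for any number of instructions. The sketch below assumes an ideal pipeline (every instruction uses all five stages, one cycle per stage, no stalls); `pipeline_diagram` is a hypothetical helper name.

```python
STAGES = ["FI", "DI", "FO", "EI", "WO"]


def pipeline_diagram(n_instructions, stages=STAGES):
    """Return one row per instruction; row[c] is the stage active in cycle c+1."""
    total_cycles = len(stages) + n_instructions - 1
    rows = []
    for i in range(n_instructions):
        cells = [""] * total_cycles
        for s, name in enumerate(stages):
            cells[i + s] = name  # instruction i enters stage s in cycle i+s+1
        rows.append(cells)
    return rows


rows = pipeline_diagram(3)
for number, cells in enumerate(rows, start=1):
    print(f"I{number:<3}" + "".join(f"{cell:<4}" for cell in cells).rstrip())
print(len(rows[0]))  # 7
```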
Pipeline Performance
Assume an instruction goes through k stages and each stage has
a duration of $\tau$
Without pipelining, execution time for n instructions (T) will be:
$T = k \tau n$
With pipelining:
$T_{k,n} = (k + n - 1)\,\tau$
e.g. For $\tau = 1$, k = 5, n = 10
$T = 5 \times 10 = 50$
$T_{k,n} = 5 + 10 - 1 = 14$
Speed-up factor of $\frac{50}{14} = 3.57$
With pipelining the program is executed 3.57 times faster than
without pipelining
Pipeline Performance (cont'd)
Speed-up factor $S = \frac{T}{T_{k,n}} = \frac{k n}{k + n - 1}$
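
The two execution-time expressions, k·n·τ without pipelining and (k + n − 1)·τ with pipelining, are easy to check numerically. This is a small sketch under the slide's assumptions (k equal stages of duration τ, no hazards); the function names are made up for illustration.

```python
def unpipelined_time(k, n, tau=1):
    """Each of the n instructions takes all k stages in turn."""
    return k * n * tau


def pipelined_time(k, n, tau=1):
    """k cycles to fill the pipeline, then one instruction per cycle."""
    return (k + n - 1) * tau


def speedup(k, n):
    """Ratio of the two times; tau cancels out."""
    return (k * n) / (k + n - 1)


print(unpipelined_time(5, 10), pipelined_time(5, 10))  # 50 14
print(round(speedup(5, 10), 2))                        # 3.57
print(round(speedup(5, 1_000_000), 2))                 # 5.0
```

The last line illustrates that for a long instruction stream the speed-up approaches the number of stages k.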
Pipeline Hazards
Some things can go wrong in real pipelined
execution
A pipeline hazard occurs when the pipeline, or some
portion of the pipeline, must stall (be idle) because
conditions do not permit continued execution
Pipeline hazards:
Resource (structural) hazards
Data hazards
Control hazards
Resource Hazards
Occur when two or more instructions that are already in
the pipeline need the same resource
e.g. Memory access
Consider a 5-stage pipeline (each stage takes one cycle)
[Figure: CPU connected to Memory through shared Address and Data buses]

Time  1   2   3   4   5   6   7
I1    FI  DI  FO  EI  WO
I2        FI  DI  FO  EI  WO
I3            FI  DI  FO  EI  WO

If an operand is to be fetched from memory at stage 3 (FO) of the first instruction, a
resource hazard occurs while the processor tries to fetch the third instruction
(both operations need to use the same bus)
Resource Hazards (cont'd)
Therefore the fetch instruction stage of the pipeline must stall (be
idle) for one cycle (one more clock cycle is required to execute the 3
instructions)

Time  1   2   3     4   5   6   7   8
I1    FI  DI  FO    EI  WO
I2        FI  DI    FO  EI  WO
I3            Idle  FI  DI  FO  EI  WO
(Assume all other operands are in registers)

Another solution for resource hazards is to increase the available
resources (e.g. have separate data and instruction memories
with separate buses)
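
The one-cycle stall can be reproduced with a simple scheduling sketch. It assumes a single shared memory port, a 5-stage pipeline in which FO happens two cycles after FI, results written to registers (so WO never uses memory), and no other hazards; `fetch_cycles` is a hypothetical name.

```python
def fetch_cycles(reads_memory_operand, fo_offset=2):
    """Return the cycle in which each instruction performs FI.

    reads_memory_operand[i] is True if instruction i fetches its operand
    from memory in the FO stage, which occupies the shared memory port.
    """
    busy = set()       # cycles in which an FO stage is using the memory port
    cycles = []
    next_free = 1
    for uses_memory in reads_memory_operand:
        cycle = next_free
        while cycle in busy:
            cycle += 1                 # fetch stage stalls (idle) for a cycle
        cycles.append(cycle)
        if uses_memory:
            busy.add(cycle + fo_offset)
        next_free = cycle + 1
    return cycles


fi = fetch_cycles([True, False, False])  # only I1 reads an operand from memory
print(fi)            # [1, 2, 4]
print(fi[-1] + 4)    # 8  (cycle in which the last instruction finishes WO)
```

With separate instruction and data memories the `busy` set would never block a fetch, which corresponds to the second solution mentioned above.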
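Data Hazards
Occur when one instruction depends on a data value
produced by a preceding instruction
e.g. Initially R1 = 0, R2 = 1, R3 = 2
ADD R1,R2   (R1=1)
ADD R3,R1   (R3=3)

Time        1   2   3   4   5   6   7
ADD R1,R2   FI  DI  FO  EI  WO
ADD R3,R1       FI  DI  FO  EI  WO

R1 = 0 until ADD R1,R2 writes its result in WO (cycle 5), after which R1 = 1
ADD R3,R1 fetches its operands in cycle 4 and reads R1 = 0: the wrong value of R1 is read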
Data Hazards (cont'd)
Such a hazard is termed a read after write (RAW) hazard, since
the current instruction must wait to read the data until after a previous
instruction has written the correct data
The hazard occurs if the read takes place before the write operation is
complete
Other types of data hazards:
Write after read (WAR)
Write after write (WAW)
Approaches for handling data hazards:
Avoid hazard
Detect and stall
Detect and forward
Data Hazards (cont'd)
Write after Read (WAR) hazard
The hazard occurs if a write takes place before a read operation is complete
The next instruction modifies (writes) an operand before the current instruction uses
(reads) it, so the current instruction reads the wrong value
e.g. Add R4,R1,R3 (R4=R1+R3)
     Add R3,R1,R2 (R3=R1+R2)   If this happens first, a WAR hazard occurs
Write after Write (WAW) hazard
The next instruction modifies (writes) an operand before the current instruction
writes the same operand, so the writes complete in the wrong order and the
operand is left holding the older value
These hazards occur with multiple pipelines (superscalar processors)
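
The three kinds of data dependence can be detected by comparing the registers each instruction reads and writes. The sketch below only reports potential hazards between two instructions in program order; whether one actually occurs depends on the pipeline timing. The `(destination, sources)` tuple encoding and the WAW example registers are assumptions made for illustration.

```python
def classify_hazards(first, second):
    """first, second: (destination, sources) tuples in program order."""
    first_dest, first_sources = first
    second_dest, second_sources = second
    hazards = []
    if first_dest in second_sources:
        hazards.append("RAW")  # second reads what first writes
    if second_dest in first_sources:
        hazards.append("WAR")  # second writes what first still has to read
    if second_dest == first_dest:
        hazards.append("WAW")  # both write the same register
    return hazards


# ADD R1,R2 (R1 = R1 + R2) followed by ADD R3,R1 (R3 = R3 + R1)
print(classify_hazards(("R1", {"R1", "R2"}), ("R3", {"R3", "R1"})))  # ['RAW']
# Add R4,R1,R3 followed by Add R3,R1,R2
print(classify_hazards(("R4", {"R1", "R3"}), ("R3", {"R1", "R2"})))  # ['WAR']
# Hypothetical pair writing the same register: Add R5,R1,R2 then Add R5,R3,R4
print(classify_hazards(("R5", {"R1", "R2"}), ("R5", {"R3", "R4"})))  # ['WAW']
```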
Data Hazards (cont'd)
Avoid hazard
Make sure there are no hazards in the code
Put no-operation instructions between dependent instructions
(programmer or compiler)
ADD R1,R2
NOP (no operation)
ADD R3,R1
Detect and stall (wait until the write operation is over)

Time          1     2     3     4     5     6     7
ADD R1,R2     FI    DI    FO    EI    WO
ADD R3,R1           FI    DI    idle  idle  FO    EI
(next instr.)                         FI    DI    FO

R1 = 0 until ADD R1,R2 writes its result in WO (cycle 5), after which R1 = 1
ADD R3,R1 fetches its operands in cycle 6 and reads R1 = 1
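
The number of idle cycles needed for detect-and-stall can be worked out from the stage positions. This sketch assumes the 5-stage pipeline used in these slides, that operands are read in FO, and that a value written in WO is readable only from the following cycle (as in the diagram above); `stall_cycles` is a hypothetical name.

```python
STAGES = ["FI", "DI", "FO", "EI", "WO"]


def stall_cycles(distance, read_stage="FO", write_stage="WO"):
    """Idle cycles for a consumer issued `distance` instructions after its producer."""
    read_index = STAGES.index(read_stage)
    write_index = STAGES.index(write_stage)
    # The producer fetched in cycle 1 writes in cycle 1 + write_index; the
    # consumer would read in cycle 1 + distance + read_index and must read
    # strictly after the write has completed.
    return max(0, write_index - read_index - distance + 1)


print(stall_cycles(1))  # 2  (ADD R3,R1 directly after ADD R1,R2)
print(stall_cycles(2))  # 1  (one unrelated instruction or NOP in between)
print(stall_cycles(3))  # 0
```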
Control Hazards
Arise from the need to make a decision based on the
results of one instruction while others are executing
Occur with branch instructions
e.g.
100: JMP 200
     Add R1,R2

200: SUB R1,R2

Time ->
JMP 200     FI  DI  FO  EI  WO    (sets PC = 200)
Add R1,R2       FI  DI  FO  EI    (wrong instruction is fetched)
Control Hazards (cont'd)
Approaches for handling control hazards
Detect and stall
Delayed branch
Branch prediction
Zelalem Birhanu, AAiT
