
Introduction & Performance Metrics

Afshan Jamil

Text Book:
Computer Organization and Design by David A. Patterson &
John L. Hennessy (4th edition)
Reference Book:
Logic and Computer Design Fundamentals by M. Morris
Mano & Charles Kime (4th edition)

Contact:
Office # 14
afshan.jamil@uettaxila.edu.pk

Outline

Computer Architecture Concepts


Basic Operation Cycle
Register Set
ISA Design Issues
Classifying ISA
Addressing Modes
Performance Metrics

Organization vs. Architecture


Computer organization

Encompasses all physical aspects of computer systems.


E.g., circuit design, control signals, memory types.
How does a computer work?

Computer architecture

Logical aspects of system implementation as seen by the programmer.


E.g., instruction sets, instruction formats, data types, addressing
modes.
How do I design a computer?
E.g., building a house

Computer Architecture Concepts

[Figure: hierarchy diagram with boxes labeled Computer, Architecture, Implementation, Organization, Hardware]

Instruction Set Architecture


Serves as an interface between software and hardware.
Provides a mechanism by which the software tells the hardware
what should be done
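[Figure: layers from the high-level language program, through the ISA level, down to the hardware; the ISA level sits at the boundary between software and hardware]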

Basic Operation Cycle


The cycle repeats in this order:

Fetch instruction
Decode instruction
Locate & fetch operands
Execute
Store results
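The five steps of the cycle can be sketched as a small interpreter loop. This is only an illustration on a toy accumulator machine: the opcodes (LOAD, ADD, STORE, HALT), the (opcode, address) instruction format, and the function name `run` are assumptions made for the example, not part of any real ISA.

```python
# Minimal sketch of the basic operation cycle on a toy accumulator machine.
# The opcodes and the (opcode, address) instruction format are hypothetical.

def run(program, memory):
    pc = 0    # program counter: index of the next instruction
    acc = 0   # accumulator
    while True:
        instruction = program[pc]        # 1. fetch instruction
        pc += 1
        opcode, address = instruction    # 2. decode instruction
        if opcode == "HALT":
            return acc
        operand = None
        if opcode in ("LOAD", "ADD"):
            operand = memory[address]    # 3. locate & fetch operand
        if opcode == "LOAD":             # 4. execute
            acc = operand
        elif opcode == "ADD":
            acc = acc + operand
        elif opcode == "STORE":
            memory[address] = acc        # 5. store result
        else:
            raise ValueError(f"unknown opcode: {opcode}")


# Example data (hypothetical): compute memory[2] = memory[0] + memory[1].
memory = {0: 5, 1: 7, 2: 0}
program = [("LOAD", 0), ("ADD", 1), ("STORE", 2), ("HALT", None)]
result = run(program, memory)
print(result, memory[2])
```

Each pass through the loop is one trip around the cycle; the program counter is advanced right after the fetch so that the next pass picks up the following instruction.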

Register Set

Programmer accessible portion of register file.

ISA Design Issues


Where are operands stored?

registers, memory, stack, accumulator

How many explicit operands are there?

0, 1, 2, or 3

How is the operand location specified?

register, immediate, indirect, . . .

What type & size of operands are supported?

byte, int, float, double, string, vector. . .

What operations are supported?

add, sub, mul, move, compare . . .

Classifying ISA

In a stack architecture, operands are implicitly taken from the stack.

A stack cannot be accessed randomly.

In an accumulator architecture, one operand of a binary operation is implicitly in the accumulator.

One operand is in memory, creating lots of bus traffic.

In a general purpose register (GPR) architecture, registers can be used instead of memory.

Faster than accumulator architecture.
Efficient implementation for compilers.
Results in longer instructions.

Classifying ISA (contd.)

Most systems today are GPR systems.

There are three types:

Memory-memory, where two or three operands may be in memory.
Register-memory, where at least one operand must be in a register.
Load-store, where no operands may be in memory.

The number of operands and the number of available registers have a direct effect on instruction length.
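As a rough illustration of how the ISA classes differ, the sketch below evaluates C = A + B three ways: stack, accumulator, and load-store GPR. The mnemonics in the comments (PUSH, POP, LOAD, ADD, STORE) and the function names are hypothetical and only indicate which instruction each line stands for.

```python
# Sketch: C = A + B under three ISA classes. Mnemonics in the comments are
# hypothetical; `mem` is a dict standing in for main memory.

def stack_machine(mem):
    stack = []
    stack.append(mem["A"])                    # PUSH A
    stack.append(mem["B"])                    # PUSH B
    stack.append(stack.pop() + stack.pop())   # ADD   (both operands implicit: top of stack)
    mem["C"] = stack.pop()                    # POP C
    return mem["C"]

def accumulator_machine(mem):
    acc = mem["A"]                            # LOAD A
    acc = acc + mem["B"]                      # ADD B (one operand implicit in ACC, one in memory)
    mem["C"] = acc                            # STORE C
    return mem["C"]

def load_store_machine(mem):
    reg = {}
    reg["R1"] = mem["A"]                      # LOAD R1, A
    reg["R2"] = mem["B"]                      # LOAD R2, B
    reg["R3"] = reg["R1"] + reg["R2"]         # ADD R3, R1, R2 (no operands in memory)
    mem["C"] = reg["R3"]                      # STORE C, R3
    return mem["C"]


for machine in (stack_machine, accumulator_machine, load_store_machine):
    print(machine.__name__, machine({"A": 2, "B": 3, "C": 0}))
```

The load-store version names three registers in its ADD, which hints at why more explicit operands and more registers lengthen the instruction.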

Instruction Formats
Instruction set architectures are measured according to:

Main memory space occupied by a program.

Instruction complexity.

Instruction length (in bits).

Total number of instructions in the instruction set

Addressing Modes

Implied Mode

Needs no address field at all.
The operand is specified implicitly in the definition of the opcode.

Examples

An instruction that uses an accumulator without a second operand is an implied-mode instruction, e.g., Complement accumulator.
Data-manipulation instructions in a stack computer, e.g., ADD (the operands are implied to be on top of the stack).

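Addressing Modes

Immediate Mode:
[Figure: instruction word with fields opcode | R1 | R2 | 600]

Direct Mode:
[Figure: instruction word opcode | R1 | R2 | 600, and Memory]

Indirect Mode:
[Figure: instruction word opcode | R1 | R2 | 600, and Memory]

Register Direct Mode:
[Figure: instruction word opcode | R1 | R2 | 600, and Processor Registers]

Register Indirect Mode:
[Figure: instruction word opcode | R1 | R2 | 600, Processor Registers, and Memory]

Relative Addressing Mode:
[Figure: instruction word opcode | R1 | R2 | 600, PC, and Memory]

Indexed Addressing Mode:
[Figure: instruction word opcode | R1 | R2 | 600, IR, and Memory]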

Addressing Modes (Summary)


Direct, ACC <- 800
Immediate, ACC <- 500
Indirect, ACC <- 300
Relative, ACC <- 600
Index, ACC <- 200 (assuming R1 as an index register)
Register, ACC <- 400 (assuming R1 register holds operand)
Register Indirect, ACC <- 700 (assuming R1 register holds the effective address)
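How each addressing mode locates its operand can be sketched in a few lines. The function name `load_operand`, the memory contents, the register value, and the PC value below are all hypothetical. They are unrelated to the figures on the slides, and the relative mode simply assumes the effective address is PC plus the address field.

```python
# Sketch of operand lookup per addressing mode. All names and values here
# (load_operand, the memory contents, R1, pc) are hypothetical.

def load_operand(mode, field, memory, registers, pc=0, index_register="R1"):
    """Return the value that would be loaded into ACC for the given mode."""
    if mode == "immediate":
        return field                                       # operand is the field itself
    if mode == "direct":
        return memory[field]                               # field is the operand's address
    if mode == "indirect":
        return memory[memory[field]]                       # field points to the address
    if mode == "register":
        return registers[field]                            # field names a register
    if mode == "register_indirect":
        return memory[registers[field]]                    # register holds the address
    if mode == "relative":
        return memory[pc + field]                          # address = PC + field
    if mode == "indexed":
        return memory[registers[index_register] + field]   # address = index register + field
    raise ValueError(f"unknown addressing mode: {mode}")


memory = {50: 13, 100: 300, 110: 7, 150: 9, 300: 42}
registers = {"R1": 50}

for mode, field in [("immediate", 100), ("direct", 100), ("indirect", 100),
                    ("register", "R1"), ("register_indirect", "R1"),
                    ("relative", 100), ("indexed", 100)]:
    print(mode, load_operand(mode, field, memory, registers, pc=10))
```

The same address field (100) yields a different operand in each memory-based mode, which is the point the summary slide makes with its own values.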

Class Task

For the instruction shown, what value is loaded into the accumulator for each addressing mode?

IR: 200

Memory

Address   Contents
800       50
900       1000
1000      1400
1100      120
1200      570
1300      30
1400      60

Mode        Value loaded in ACC
Immediate
Direct
Indexed
Indirect

Understanding Performance

Algorithm

Determines number of operations executed

Programming language, compiler, architecture

Determine number of machine instructions executed per operation

Processor and memory system

Determine how fast instructions are executed

I/O system (including OS)

Determines how fast I/O operations are executed


Chapter 1 Computer
Abstractions and

Response Time and Throughput

Response time

How long it takes to do a task

Throughput

Total work done per unit time
e.g., tasks or transactions per hour

How are response time and throughput affected by

Replacing the processor with a faster version?
Adding more processors?

We'll focus on response time for now



Relative Performance

Define Performance = 1/Execution Time

"X is n times faster than Y"

$\dfrac{\text{Performance}_X}{\text{Performance}_Y} = \dfrac{\text{Execution time}_Y}{\text{Execution time}_X} = n$

Example: If computer A runs a program in 10 seconds and computer B runs the same program in 15 seconds, how much faster is A than B?
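The example can be checked directly from the definition: n is the ratio of the slower execution time to the faster one. The helper name `times_faster` is made up for this sketch.

```python
# Sketch: relative performance as a ratio of execution times.
# `times_faster` is a hypothetical helper name.

def times_faster(exec_time_x, exec_time_y):
    """Return n such that X is n times faster than Y."""
    return exec_time_y / exec_time_x


n = times_faster(exec_time_x=10, exec_time_y=15)  # A takes 10 s, B takes 15 s
print(f"A is {n} times faster than B")
```

With 10 s for A and 15 s for B, the ratio is 15/10 = 1.5, so A is 1.5 times faster than B.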


Measuring Execution Time

Elapsed time

Total response time, including all aspects

Processing, I/O, OS overhead, idle time

Determines system performance

CPU time

Time spent processing a given job

Discounts I/O time, other jobs' shares

Comprises user CPU time and system CPU time


Different programs are affected differently by CPU
and system performance


CPU Clocking

Operation of digital hardware is governed by a constant-rate clock

[Figure: clock timing diagram showing the clock period, clock cycles, data transfer and computation, and state update]

Clock period: duration of a clock cycle

e.g., 250ps = 0.25ns = $250 \times 10^{-12}$ s

Clock frequency (rate): cycles per second

e.g., 4.0GHz = 4000MHz = $4.0 \times 10^{9}$ Hz
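Clock period and clock rate are reciprocals of each other, so the two examples describe the same clock: a 4.0GHz rate corresponds to a 250ps period. A minimal sketch, with hypothetical helper names:

```python
# Sketch: clock period and clock rate are reciprocals.
# Helper names are hypothetical.

def clock_period_s(clock_rate_hz):
    return 1.0 / clock_rate_hz

def clock_rate_hz(clock_period_seconds):
    return 1.0 / clock_period_seconds


period = clock_period_s(4.0e9)          # 4.0 GHz
print(f"{period * 1e12:.0f} ps")        # period expressed in picoseconds
rate = clock_rate_hz(250e-12)           # 250 ps
print(f"{rate / 1e9:.1f} GHz")          # rate expressed in gigahertz
```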

CPU Time
$\text{CPU Time} = \text{CPU Clock Cycles} \times \text{Clock Cycle Time} = \dfrac{\text{CPU Clock Cycles}}{\text{Clock Rate}}$

Performance improved by

Reducing number of clock cycles


Increasing clock rate
Hardware designer must often trade off clock rate
against cycle count


CPU Time Example

Computer A: 2GHz clock, 10s CPU time


Designing Computer B

Aim for 6s CPU time


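Can do faster clock, but causes 1.2 × clock cycles

How fast must Computer B clock be?

$\text{Clock Rate}_B = \dfrac{\text{Clock Cycles}_B}{\text{CPU Time}_B} = \dfrac{1.2 \times \text{Clock Cycles}_A}{6\,\text{s}}$

$\text{Clock Cycles}_A = \text{CPU Time}_A \times \text{Clock Rate}_A = 10\,\text{s} \times 2\,\text{GHz} = 20 \times 10^{9}$

$\text{Clock Rate}_B = \dfrac{1.2 \times 20 \times 10^{9}}{6\,\text{s}} = \dfrac{24 \times 10^{9}}{6\,\text{s}} = 4\,\text{GHz}$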
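The same calculation can be written as a short script, which makes it easy to try other targets. The variable names are chosen for this sketch only; the numbers are the ones from the example (2GHz, 10s, 6s, 1.2 × clock cycles).

```python
# Sketch: required clock rate for Computer B, using the example's numbers.
# Variable names are chosen for illustration only.

clock_rate_a = 2e9        # Hz (2 GHz)
cpu_time_a = 10.0         # seconds
cpu_time_b = 6.0          # seconds (design target)
cycle_factor = 1.2        # B needs 1.2 x the clock cycles of A

clock_cycles_a = cpu_time_a * clock_rate_a       # CPU Time x Clock Rate
clock_cycles_b = cycle_factor * clock_cycles_a
clock_rate_b = clock_cycles_b / cpu_time_b       # Clock Cycles / CPU Time

print(f"Clock cycles A: {clock_cycles_a:.2e}")
print(f"Clock rate B:   {clock_rate_b / 1e9:.1f} GHz")
```

The script makes the trade-off concrete: B doubles the clock rate but only gets from 10s to 6s, because it also spends 20% more cycles.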

Instruction Count and CPI


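$\text{Clock Cycles} = \text{Instruction Count} \times \text{Cycles per Instruction}$

$\text{CPU Time} = \text{Instruction Count} \times \text{CPI} \times \text{Clock Cycle Time} = \dfrac{\text{Instruction Count} \times \text{CPI}}{\text{Clock Rate}}$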

Instruction Count for a program

Determined by program, ISA and compiler

Average cycles per instruction

Determined by CPU hardware


If different instructions have different CPI

Average CPI affected by instruction mix



CPI Example
Computer A: Cycle Time = 250ps, CPI = 2.0
Computer B: Cycle Time = 500ps, CPI = 1.2
Same ISA
Which is faster, and by how much?

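$\text{CPU Time}_A = \text{Instruction Count} \times \text{CPI}_A \times \text{Cycle Time}_A = I \times 2.0 \times 250\,\text{ps} = I \times 500\,\text{ps}$

$\text{CPU Time}_B = \text{Instruction Count} \times \text{CPI}_B \times \text{Cycle Time}_B = I \times 1.2 \times 500\,\text{ps} = I \times 600\,\text{ps}$

A is faster, by this much:

$\dfrac{\text{CPU Time}_B}{\text{CPU Time}_A} = \dfrac{I \times 600\,\text{ps}}{I \times 500\,\text{ps}} = 1.2$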
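A short script confirms that the instruction count cancels out of the comparison. The function name `cpu_time_ps` and the instruction count of one million are arbitrary choices for this sketch; any positive count gives the same ratio because both computers run the same ISA.

```python
# Sketch: CPU Time = Instruction Count x CPI x Cycle Time for computers A and B.
# The instruction count is an arbitrary, hypothetical value; it cancels in the ratio.

def cpu_time_ps(instruction_count, cpi, cycle_time_ps):
    return instruction_count * cpi * cycle_time_ps


instruction_count = 1_000_000
time_a = cpu_time_ps(instruction_count, cpi=2.0, cycle_time_ps=250)
time_b = cpu_time_ps(instruction_count, cpi=1.2, cycle_time_ps=500)

faster = "A" if time_a < time_b else "B"
ratio = max(time_a, time_b) / min(time_a, time_b)
print(f"{faster} is faster by a factor of {ratio:.2f}")
```

A has the higher CPI but still wins, because its cycle time is half that of B.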

CPI Example

Alternative compiled code sequences using


instructions in classes A, B, C
Class               A    B    C
CPI for class       1    2    3
IC in sequence 1    2    1    2
IC in sequence 2    4    1    1

Which code sequence executes the most instructions? Which is faster? What is the CPI for each sequence?

Sequence 1: IC = 5

Clock Cycles = 2×1 + 1×2 + 2×3 = 10

Sequence 2: IC = 6

Clock Cycles = 4×1 + 1×2 + 1×3 = 9
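The remaining question, the average CPI of each sequence, follows from clock cycles divided by instruction count. The sketch below takes the per-class values from the clock-cycle sums on this slide (CPI of 1, 2, 3 for classes A, B, C; instruction counts 2, 1, 2 for sequence 1 and 4, 1, 1 for sequence 2). The names `cpi_per_class` and `summarize` are made up for the example.

```python
# Sketch: instruction count, clock cycles, and average CPI per code sequence.
# Per-class values are read off the clock-cycle sums in the example.

cpi_per_class = {"A": 1, "B": 2, "C": 3}
sequences = {
    1: {"A": 2, "B": 1, "C": 2},
    2: {"A": 4, "B": 1, "C": 1},
}

def summarize(ic_per_class):
    """Return (instruction count, clock cycles, average CPI) for one sequence."""
    instruction_count = sum(ic_per_class.values())
    clock_cycles = sum(count * cpi_per_class[cls]
                       for cls, count in ic_per_class.items())
    return instruction_count, clock_cycles, clock_cycles / instruction_count


for number, ic_per_class in sequences.items():
    ic, cycles, avg_cpi = summarize(ic_per_class)
    print(f"Sequence {number}: IC = {ic}, cycles = {cycles}, avg CPI = {avg_cpi}")
```

Sequence 2 executes more instructions (6 versus 5) but needs fewer clock cycles (9 versus 10), so it is the faster one; the average CPIs come out as 10/5 = 2.0 and 9/6 = 1.5.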

Performance Summary
The BIG Picture

$\text{CPU Time} = \dfrac{\text{Instructions}}{\text{Program}} \times \dfrac{\text{Clock cycles}}{\text{Instruction}} \times \dfrac{\text{Seconds}}{\text{Clock cycle}}$

Performance depends on

Algorithm: affects IC, possibly CPI


Programming language: affects IC, CPI
Compiler: affects IC, CPI
Instruction set architecture: affects IC, CPI, $T_c$


Home Task 1

Perform exercises 1.3 and 1.4 (chapter 1 of Computer Organization and Design).
Perform questions 5, 8, and 9 (chapter 9 of Logic and Computer Design Fundamentals).
Submit in the next class.