CS M151B / EE M116C

Computer Systems Architecture

Prof. Lei He
Lhe@ee.ucla.edu

1-1

What is Computer Architecture?

Computer Architecture =
Instruction Set Architecture + Machine Organization

Hardware Designer
thinks about circuits, components, timing, functionality, ease of debugging
(analogy: construction engineer)

Computer Architect
thinks about high-level components, how they fit together, and how they work
together to deliver performance
(analogy: building architect)

1-2

Why Computer Architecture?

Industry is rapidly changing


new problems
new opportunities
different tradeoffs

Race for high performance, low power/area


But what will it do for me?
you want to call yourself a computer scientist
you want to build high performance software
you need to make a purchasing decision
you may decide to go into this field!
1-3

How do we classify CA?

Levels of abstraction, from top to bottom:

Application
Operating System
Compiler, Firmware
Instruction Set Architecture (Instr. Set Proc., I/O system)
Datapath & Control
Digital Design
Circuit Design
Layout

Coordination of many levels of abstraction


Under a rapidly changing set of forces
Design, Measurement, and Evaluation
1-4

Forces on Computer Architecture

Computer Architecture is shaped by:

Technology
Programming Languages
Applications
Operating Systems
Cleverness
History
1-5

Instruction Set Architecture

... the attributes of a [computing] system as seen by the programmer, i.e., the
conceptual structure and functional behavior, as distinct from the organization of
the data flows and controls, the logic design, and the physical implementation.
Amdahl, Blaauw, and Brooks, 1964

Instruction Set Architecture (ISA):


Anything a programmer needs to know to make an
assembly-language program work correctly.

Instruction formats
What the instructions do
number and types of registers
addressing modes, exceptional conditions, ...

Interface between hardware and low-level software


Standardizes instructions, machine language bit patterns
Different implementations of the same architecture
Can prevent using new innovations

1-6

ISA Examples

ISA        Versions                                          Years
Alpha      (v1, v3)                                          1992-97
PA-RISC    (v1.1, v2.0)                                      1986-96
Sparc      (v8, v9)                                          1987-95
MIPS       (MIPS I, II, III, IV, V)                          1986-96
x86        (8086, 80286, 80386, 80486, Pentium, MMX, ...)    1978-00
IA64       Itanium                                           2002

1-7

MIPS R3000 ISA

Instruction Categories

Load/Store
Computational
Jump and Branch
Floating Point (coprocessor)
Memory Management
Special

Registers

R0 - R31
PC
HI
LO

3 Instruction Formats: all 32 bits wide

OP | rs | rt | rd | sa | funct
OP | rs | rt | immediate
OP | jump target

1-8
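To make the three 32-bit formats concrete, here is a minimal Python sketch that splits an instruction word into the fields named on the slide (OP, rs, rt, rd, sa, funct, immediate, jump target). The slide gives only the field names; the bit widths used below (6/5/5/5/5/6, a 16-bit immediate, a 26-bit jump target) and the opcode 35 for `lw` are the standard MIPS I values, stated here as assumptions. The function names are illustrative and follow the common R-, I-, J-type naming.

```python
# Sketch: decode a 32-bit MIPS word into the fields named on the slide.
# Assumed field widths (standard MIPS I layout, not given on the slide):
#   OP=6, rs=5, rt=5, rd=5, sa=5, funct=6, immediate=16, jump target=26.

def decode_r(word):
    """Format 1: OP | rs | rt | rd | sa | funct."""
    return {
        "op": (word >> 26) & 0x3F,
        "rs": (word >> 21) & 0x1F,
        "rt": (word >> 16) & 0x1F,
        "rd": (word >> 11) & 0x1F,
        "sa": (word >> 6) & 0x1F,
        "funct": word & 0x3F,
    }

def decode_i(word):
    """Format 2: OP | rs | rt | immediate."""
    return {
        "op": (word >> 26) & 0x3F,
        "rs": (word >> 21) & 0x1F,
        "rt": (word >> 16) & 0x1F,
        "immediate": word & 0xFFFF,
    }

def decode_j(word):
    """Format 3: OP | jump target."""
    return {"op": (word >> 26) & 0x3F, "target": word & 0x3FFFFFF}

def encode_i(op, rs, rt, immediate):
    """Pack an immediate-format instruction into one 32-bit word."""
    return (op << 26) | (rs << 21) | (rt << 16) | (immediate & 0xFFFF)

# lw $15, 0($2): opcode 35 (lw), base register rs=2, destination rt=15.
lw_word = encode_i(35, 2, 15, 0)
lw_fields = decode_i(lw_word)
print(hex(lw_word), lw_fields)
```

Because every format puts OP in the same top six bits, a decoder can read the opcode first and only then decide how to interpret the remaining 26 bits.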

Organization

Design your hardware to implement the ISA


Capabilities & performance characteristics of
principal functional blocks
(e.g., Registers, ALU, Shifters, Logic Units, ...)

Interconnections of various blocks


Control between blocks
We can have many different implementations of
a given ISA
scaling trends
new performance enhancing techniques
1-9

Example Organization - PIII

1-10

Example Organization 2 - P4

1-11

High Level View of a Computer

Processor
  Control
  Datapath
Memory
Input
Output

1-12

Performance

[Figure: performance versus year, 1987-1997. Y-axis: Performance (0 to 1200). X-axis: Year. Labeled systems: SUN-4/260, MIPS M/120, MIPS M2000, IBM RS6000, HP 9000/750, DEC AXP/500, IBM POWER 100, DEC Alpha 4/266, DEC Alpha 5/300, DEC Alpha 5/500, DEC Alpha 21264/600.]

1-13

Performance

source: Intel
1-14

What is power?

Power

Energy is measured in Joules


Power is rate of energy consumption
Joules per second (Watts)
Power Density - power/area

Why do we care about this?


California's energy crisis?
Power is dissipated as heat
Heat is hard to get rid of!
Workstation processor might use 70 Watts
Limits how densely components can be packaged

Battery power is limited!
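The relationship between energy and power above can be checked with a little arithmetic. The sketch below uses the 70 Watt workstation figure from the slide; the 50 Wh battery is a hypothetical value chosen only for illustration, and the function names are made up for this example.

```python
# Sketch: power is the rate of energy use, so energy = power * time.

def energy_joules(power_watts, seconds):
    """Energy consumed at a constant power draw."""
    return power_watts * seconds

def runtime_seconds(battery_joules, power_watts):
    """How long a fixed energy budget lasts at a constant power draw."""
    return battery_joules / power_watts

WORKSTATION_WATTS = 70        # figure quoted on the slide
HYPOTHETICAL_BATTERY_WH = 50  # assumption, not from the slide

# One watt-hour is 3600 joules (1 J/s sustained for 3600 s).
battery_joules = HYPOTHETICAL_BATTERY_WH * 3600

one_hour_energy = energy_joules(WORKSTATION_WATTS, 3600)
battery_minutes = runtime_seconds(battery_joules, WORKSTATION_WATTS) / 60
print(one_hour_energy, "J in one hour;", round(battery_minutes, 1), "minutes on battery")
```

Under these assumptions a 70 Watt processor would drain the battery in well under an hour, which is why battery-powered designs need far lower power.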

1-15

Power Density

source: Fred Pollack - Keynote MICRO32


P4 Willamette - 75 Watts, 217 mm² die, 0.18 µm, 1.75 V, 1.3-2.0 GHz
P4 Northwood - 62-68 Watts, 146 mm² die, 0.13 µm, 1.5 V, 1.4-3.6 GHz
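Power density (power/area) can be worked out directly from the two P4 data points above. This is a rough sketch that assumes the quoted wattage is spread evenly over the whole die, which real chips do not do (hot spots are worse); the function name is illustrative.

```python
# Sketch: power density = power / area, using the P4 figures quoted above.
# Assumes uniform dissipation over the die, a deliberate simplification.

def power_density_w_per_cm2(power_watts, die_area_mm2):
    """Convert mm^2 to cm^2 (100 mm^2 = 1 cm^2) and divide."""
    return power_watts / (die_area_mm2 / 100.0)

willamette = power_density_w_per_cm2(75, 217)
northwood_low = power_density_w_per_cm2(62, 146)
northwood_high = power_density_w_per_cm2(68, 146)

print("Willamette: %.1f W/cm^2" % willamette)
print("Northwood:  %.1f to %.1f W/cm^2" % (northwood_low, northwood_high))
```

Northwood draws less total power than Willamette, but its die shrank faster than its power did, so its power density is higher.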
1-16

Area

"doubling of transistor density on a manufactured die every year"

source: Intel Website


1-17

Pentium III Die Photo

1st Pentium III, Katmai: 9.5 M transistors, 12.3 × 10.4 mm in a 0.25-micron process with 5 layers of aluminum

source: www.tomshardware.com

EBL/BBL - Bus logic, Front, Back


MOB - Memory Order Buffer
Packed FPU - MMX Fl. Pt. (SSE)
IEU - Integer Execution Unit
FAU - Fl. Pt. Arithmetic Unit
MIU - Memory Interface Unit
DCU - Data Cache Unit
PMH - Page Miss Handler
DTLB - Data TLB
BAC - Branch Address Calculator
RAT - Register Alias Table
SIMD - Packed Fl. Pt.
RS - Reservation Station
BTB - Branch Target Buffer
IFU - Instruction Fetch Unit (+I$)
ID - Instruction Decode
ROB - Reorder Buffer
MS - Micro-instruction Sequencer
1-18

Die Photo of P4

1-19

Price/Performance Pyramid

Super          $Millions
Mainframe      $100s Ks
Server         $10s Ks
Workstation    $1000s
Personal       $100s
Embedded       $10s

Differences in scale, not in substance

Figure 3.4 Classifying computers by computational power and price range.

Slide from Prof. B Parhami at UCSB

1-20

Automotive Embedded Computers

Impact sensors
Brakes
Airbags
Engine
Central controller
Navigation & entertainment

Figure 3.5 Embedded computers are ubiquitous, yet invisible. They are found in our automobiles, appliances, and many other places.

Slide from Prof. B Parhami at UCSB


1-21

Generations of Progress

Table 3.2 The 5 generations of digital computers, and their ancestors.


Generation (begun)  Processor technology  Memory innovations  I/O devices introduced         Dominant look & feel

0 (1600s)           (Electro-)mechanical  Wheel, card         Lever, dial, punched card      Factory equipment
1 (1950s)           Vacuum tube           Magnetic drum       Paper tape, magnetic tape      Hall-size cabinet
2 (1960s)           Transistor            Magnetic core       Drum, printer, text terminal   Room-size mainframe
3 (1970s)           SSI/MSI               RAM/ROM chip        Disk, keyboard, video monitor  Desk-size mini
4 (1980s)           LSI/VLSI              SRAM/DRAM           Network, CD, mouse, sound      Desktop/laptop micro
5 (1990s)           ULSI/GSI/WSI, SOC     SDRAM, flash        Sensor/actuator, point/click   Invisible, embedded

Slide from Prof. B Parhami at UCSB


1-22

What you will learn

Rapidly changing field: doubling every 1.5 years:

memory capacity
processor throughput
(due to advances in technology and organization)
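To get a feel for what "doubling every 1.5 years" means, the sketch below converts a doubling time into an equivalent yearly growth factor and a total growth over a span of years. It assumes smooth exponential growth; the function names are illustrative.

```python
# Sketch: doubling every T years means growth by a factor 2**(t / T) after t years.

def annual_growth_factor(doubling_time_years):
    """Equivalent per-year multiplier for a given doubling time."""
    return 2 ** (1.0 / doubling_time_years)

def growth_over(years, doubling_time_years):
    """Total multiplier after the given number of years."""
    return 2 ** (years / doubling_time_years)

yearly = annual_growth_factor(1.5)
decade_and_a_half = growth_over(15, 1.5)
print("per year: x%.3f, over 15 years: x%.0f" % (yearly, decade_and_a_half))
```

A 1.5-year doubling time works out to roughly 59% growth per year, and about a thousandfold over 15 years.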

Things you'll be learning:


how computers work, a basic foundation
how to analyze their performance (or how not to!)
issues affecting modern processors (caches, pipelines)

1-23

Technology Trends

Memory Gap (Wall)


Processor speed - 60% / year
Memory (DRAM) speed - 7% / year
but capacity doubles every 1.5 years!
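The two growth rates above compound against each other, which is what makes the gap a "wall". As a rough sketch, assuming both rates stay constant, the processor-to-DRAM speed ratio after a number of years can be estimated as follows (the function name and default arguments are illustrative; the rates are the ones on the slide):

```python
# Sketch: how the processor/DRAM speed ratio grows if both trends hold steady.

def gap_after(years, cpu_rate=0.60, dram_rate=0.07):
    """Ratio of processor speedup to DRAM speedup after the given years."""
    return ((1 + cpu_rate) / (1 + dram_rate)) ** years

one_year = gap_after(1)
ten_years = gap_after(10)
print("gap grows x%.2f per year, x%.0f over ten years" % (one_year, ten_years))
```

At these rates the gap widens by about half again every year, so a decade of the trend leaves memory tens of times slower relative to the processor. This is the motivation for caches.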

Interconnect Scaling Bottleneck (deep submicron effect)


Interconnect not scaling with transistors
Size of future structures
Bypassing results between pipeline stages

Clock scaling
Deeper pipelines
Cost of latches and bypass logic
1-24

Memory Wall

From: A Case for Intelligent RAM: IRAM
Patterson et al., IEEE Micro, 1997
1-25

IA-32 History

source: Intel PIII Manual

1-26

IA-32 History (2)

source: Intel PIII Manual

1-27

Levels of Representation

High Level Language Program

temp = v[k];
v[k] = v[k+1];
v[k+1] = temp;

Compiler

Assembly Language Program

lw $15, 0($2)
lw $16, 4($2)
sw $16, 0($2)
sw $15, 4($2)

Assembler

Machine Language Program

0000 1001 1100 0110 1010 1111 0101 1000
1010 1111 0101 1000 0000 1001 1100 0110
1100 0110 1010 1111 0101 1000 0000 1001
0101 1000 0000 1001 1100 0110 1010 1111

Note: some compilers translate directly to machine language

Machine Interpretation

Control Signal Specification

ALUOP[0:3] <= InstReg[9:11] & MASK
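To see that the four assembly instructions really perform the swap written in the high-level code, here is a small Python sketch that interprets just `lw` and `sw` on a toy memory. It assumes register $2 holds the byte address of v[k] and that each element is a 4-byte word; the address 1000 and the values 11 and 22 are hypothetical.

```python
# Sketch: a toy interpreter for the lw/sw sequence on the slide.
# Assumptions: $2 holds the byte address of v[k]; elements are 4-byte words.

def run(program, regs, mem):
    """Execute (op, rt, offset, base) tuples against registers and memory."""
    for op, rt, offset, base in program:
        addr = regs[base] + offset
        if op == "lw":      # load word: register <- memory
            regs[rt] = mem[addr]
        elif op == "sw":    # store word: memory <- register
            mem[addr] = regs[rt]
        else:
            raise ValueError("unsupported instruction: " + op)

program = [
    ("lw", 15, 0, 2),  # lw $15, 0($2)   temp1 = v[k]
    ("lw", 16, 4, 2),  # lw $16, 4($2)   temp2 = v[k+1]
    ("sw", 16, 0, 2),  # sw $16, 0($2)   v[k]   = temp2
    ("sw", 15, 4, 2),  # sw $15, 4($2)   v[k+1] = temp1
]

V_K_ADDR = 1000                           # hypothetical address of v[k]
mem = {V_K_ADDR: 11, V_K_ADDR + 4: 22}    # v[k] = 11, v[k+1] = 22
regs = {2: V_K_ADDR}

run(program, regs, mem)
print(mem)
```

Note that the compiled code uses two registers where the source uses one `temp`: on a load/store machine both values must be brought into registers before either can be written back.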

1-28

Key Points

All computers consist of five components


Processor: (1) datapath and (2) control
(3) Memory
(4) Input devices
(5) Output devices

ISA defines how software can use the hardware

Organization defines how the ISA is implemented
Heavily influenced by scaling trends

Need to design against constraints of performance, power, area and cost
Challenge: What to do with future silicon real estate?
1-29