
Outline

t Computer: A historical perspective


t Abstractions
t Technology
l Performance
n Definition
n CPU performance
l Power trends: multi-processing
l Measuring and evaluating performance
l Cost

Computer Abstractions and Technology-1 Computer Architecture





What Is a Computer?
t A device that computes, especially a programmable
electronic machine that performs high-speed
mathematical or logical operations or that
assembles, stores, correlates, or otherwise
processes information
-- The American Heritage Dictionary of the English
Language, 4th Edition, 2000



t Special-purpose versus general-purpose
t Non-programmable versus programmable
t Scientific versus office data processing
t Mechanical, electromechanical, electronic, …

Difference Engine (C. Babbage, 1822)
Tabulating machine (H. Hollerith, 1889)
Harvard Mark I (IBM, H. Aiken, 1944)
t ENIAC (Electronic Numerical Integrator
and Computer)
t Work started in 1943 in Moore School of Electrical
Engineering at the University of Pennsylvania, by
John Mauchly and J. Presper Eckert
t Completed in 1946
t About 25 m long and 2.5 m high
t 20 10-digit registers, each 2 feet long
t 18,000 vacuum tubes
(electronic switches, invented in 1906)
t 1900 additions per second
t Programming manually by
plugging cables and setting
switches



ENIAC



t The transistor: invented by W. Shockley, J.
Bardeen, and W. Brattain of
Bell Labs in 1947
l Much more reliable
than vacuum tubes
l Electronic switches
in “solids”



UNIVAC (Remington-Rand, 1951)

IBM 701 (IBM, 1952)



t Ex.: IBM 1401 (IBM, 1959)

This is why IBM is called “Big Blue”!
Integrated Circuit (IC)
t 1958: Jack Kilby integrated a
transistor with resistors and capacitors on a single
semiconductor chip, producing a monolithic IC



IC ...
t 1971 Intel 4004
l 108 kHz, 0.06 MIPS
l 2300 transistors (10 microns)
l Bus width: 4 bits
l Memory addr.: 640 bytes
l For Busicom calculator
(original commission was
12 chips)



...
t 1977 Apple II: Steve Jobs, Steve Wozniak
MOS Technology 6502 CPU, 48 KB RAM



? PC
t 1981 IBM PC: Intel 8088, 4.77 MHz, 16 KB RAM,
two 160 KB floppy disks



t 1973: Researchers at
Xerox PARC developed
an experimental PC: Alto
l Mouse, Ethernet,
bit-mapped graphics, icons,
menus, WYSIWYG editing
t Hosted the invention of:
l Local-area networking
l Laser printing
l All of modern client / server
distributed computing



PC --
t 1979: 1st electronic spreadsheet (VisiCalc for
Apple II) by Dan Bricklin and Bob Frankston
l “The killer app for early PCs”
l Followed by dBASE II, ...



The 1980s: IC Technology Enters the VLSI Era
t New processor architecture was introduced:
RISC (Reduced Instruction Set Computer)
l IBM: John Cocke
l UC Berkeley: David Patterson
l Stanford: John Hennessy
t Commercial RISC processors around 1985
l MIPS: MIPS
l Sun: Sparc
l IBM: Power RISC
l HP: PA-RISC
l DEC: Alpha
t They compete with CISC (complex instruction set
computer) processors, mainly Intel x86 processors,
for the next 20 years

PC
(Embedded Computer)



1.1 Introduction
The Computer Revolution
t Progress in computer technology
l Underpinned by Moore’s Law
t Makes novel applications feasible
l Computers in automobiles
l Cell phones
l Human genome project
l World Wide Web
l Search Engines
t Computers are pervasive



Line Width/Feature Size

Technology Trends:
Microprocessor Capacity
2X transistors/chip every 1.5 years, called “Moore’s Law”



Classes of Computers
t Desktop computers
l General purpose, variety of software
l Subject to cost/performance tradeoff
t Server computers
l Network based
l High capacity, performance, reliability
l Range from small servers to building sized
t Embedded computers
l Hidden as components of systems
l Stringent power/performance/cost constraints



Computer Progress Supported/Driven
by Market and Usage
t Applications drive machine “balance”
l Numerical simulations: floating-point, memory BW
l Transaction processing: I/O, INT performance
l Media processing: low-precision ‘pixel’ arithmetic
t Applications drive machine performance
l What if my computer runs all my software very fast?
l Programs use increasing amounts of memory:
n Doubling every 1.5-2 years, i.e., 0.5-1 more address bit per year
l High-level programming languages replace assembly
languages => compilers important
n Compiler and architecture work together
t Effects of compatibility and ease of use
t Effects of market demands and market share
l Can investment in R&D, production be paid off?
Computer Usage: General Purpose (PC and Server)
t Uses: commercial (int.), scientific (FP, graphics),
home (int., audio, video, graphics)
l Software compatibility is the most important factor
l Short product life; higher price and profit margin
l OS issue: the OS provides another interface above the architecture
n Effects of OS developments on architecture
n RISC-based Unix workstation vs x86-based PC: (1)
unit sales are only 1% of PC sales, (2) more emphasis on
performance than on price
t Future:
l Use increased transistors for performance, human
interface (multimedia), bandwidth, monitoring



Computer Usage: Embedded
t A computer inside another device used for running
one predetermined application
t Uses: control (traffic, printer, disk); consumer
electronics (video game, CD player, PDA); cell
phone

Lego Mindstorms Robotic Command Explorer:
A “Programmable Brick”,
Hitachi H8 CPU (8-bit), 32 KB RAM,
LCD, batteries,
infrared transmitter/receiver,
4 control buttons, 6 connectors

Embedded Computers
t Typically w/o FP or MMU, but integrating various
peripheral functions, e.g., DSP
l Large variety in ISA, performance, on-chip
peripherals
l Compatibility is a non-issue, a new ISA can enter easily,
low power becomes important
t More architectures, and they survive longer:
4- or 8-bit microprocessors still in use
(8-bit for cost-sensitive, 32-bit for performance)
t Large sales volume (billions) at low price ($40-$5)
t Use of microprocessor:
l 1995 #1: x86; #2: 6800; #3: Hitachi SuperH (Sega)
l 2002 #1: ARM #2: x86; #3: Motorola 6800
t Trend: lower cost, more functionality
l system-on-chip, µP core on ASIC
The Processor Market



1.2 Below Your Program
t Application software
l Written in high-level language
t System software
l Compiler: translates HLL code to
machine code
l Operating System: service code
n Handling input/output
n Managing memory and storage
n Scheduling tasks & sharing
resources
t Hardware
l Processor, memory, I/O controllers



Levels of Program Code
t High-level language
l Level of abstraction closer
to problem domain
l Provides for productivity
and portability
t Assembly language
l Textual representation of
instructions
t Hardware representation
l Binary digits (bits)
l Encoded instructions and
data



1.3 Under the Covers
Components of a Computer
The BIG Picture
t Same components for all kinds of computer
l Desktop, server,
embedded
t Input/output includes
l User-interface devices
n Display, keyboard, mouse
l Storage devices
n Hard disk, CD/DVD, flash
l Network adapters
n For communicating with
other computers



Anatomy of a Computer

[Figure: a desktop computer labeled with its output device, network cable, and two input devices]



Anatomy of a Mouse
t Optical mouse (a camera inside takes many pictures per second, and image analysis determines the movement)
l LED illuminates desktop
l Small low-res camera
l Basic image processor
n Looks for x, y
movement
l Buttons & wheel
t Supersedes roller-ball
mechanical mouse



Through the Looking Glass
t LCD screen: picture elements (pixels)
l Mirrors content of frame buffer memory
l Bit map: a matrix of pixels
l Resolution in 2008: 640 x 480 to 2560 x 1600 pixels



Opening the Box



Inside the Processor (CPU)
t Datapath: performs operations on data
t Control: sequences datapath, memory, ...
t Cache memory
l Small fast SRAM memory for immediate access to
data



Inside the Processor
t AMD Barcelona: 4 processor cores



A Safe Place for Data
t Volatile main memory
l Loses instructions and data when power off
t Non-volatile secondary memory
l Magnetic disk
l Flash memory
l Optical disk (CDROM, DVD)



Networks
t Communication and resource sharing
t Local area network (LAN): Ethernet
l Within a building
t Wide area network (WAN): the Internet
t Wireless network: WiFi, Bluetooth



Abstractions
The BIG Picture
t Abstraction helps us deal with complexity
l Hide lower-level detail
t Instruction set architecture (ISA)
l The hardware/software interface
l Hardware only has to implement the ISA; software only has to use the ISA
t Application binary interface
l The ISA plus system software interface
l Software development needs only the ISA and the OS system calls
t Implementation
l The details underlying an interface



Technology Trends
t Electronics
technology continues
to evolve
l Increased capacity
and performance
l Reduced cost
[Figure: DRAM capacity]

Year   Technology                   Relative performance/cost
1951   Vacuum tube                  1
1965   Transistor                   35
1975   Integrated circuit (IC)      900
1995   Very large scale IC (VLSI)   2,400,000
2005   Ultra large scale IC         6,200,000,000



Concorde:
• Capacity: 132 persons
• Range: 4000 miles
• Cruising speed: 1350 mph

747-400:
• Capacity: 470 persons
• Range: 4150 miles
• Cruising speed: 610 mph



1.4 Performance
Defining Performance
t Which airplane has the best performance?

[Figure: four bar charts comparing the Boeing 777, Boeing 747, BAC/Sud Concorde, and Douglas DC-8-50 by passenger capacity, cruising range (miles), cruising speed (mph), and passengers x mph]



Response Time and Throughput
t Response time
l How long it takes to do a task
t Throughput
l Total work done per unit time
n e.g., tasks/transactions/… per hour
t How are response time and throughput affected by
l Replacing the processor with a faster version?
l Adding more processors?
t We’ll focus on response time for now…



Measuring Execution Time
t Elapsed time
l Total response time, including all aspects
n Processing, I/O, OS overhead, idle time
l Determines system performance
t CPU time
l Time spent processing a given job
n Discounts I/O time, other jobs’ shares
l Comprises user CPU time and system CPU time
t Different programs are affected differently by CPU
and system performance
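
The difference between elapsed time and CPU time can be observed directly. The sketch below is only an illustration: `measure` and `mixed_task` are hypothetical names, and the `time.sleep` call stands in for I/O or idle time.

```python
import time

def measure(task):
    """Return (elapsed_seconds, cpu_seconds) for one call of task()."""
    wall_start = time.perf_counter()   # wall-clock (elapsed) time
    cpu_start = time.process_time()    # user + system CPU time of this process
    task()
    cpu = time.process_time() - cpu_start
    elapsed = time.perf_counter() - wall_start
    return elapsed, cpu

def mixed_task():
    sum(i * i for i in range(200_000))  # CPU-bound part
    time.sleep(0.05)                    # waiting part: counts as elapsed time only

elapsed, cpu = measure(mixed_task)
print(f"elapsed = {elapsed:.3f} s, CPU = {cpu:.3f} s")
```

The elapsed time includes the wait, while the CPU time counts only the time the processor spent on this process.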



Relative Performance
t Define Performance = 1/Execution Time
t “X is n times faster than Y” means

\[ \frac{\text{Performance}_X}{\text{Performance}_Y} = \frac{\text{Execution time}_Y}{\text{Execution time}_X} = n \]

t Example: time taken to run a program
l 10 s on A, 15 s on B
l \( \text{Execution time}_B / \text{Execution time}_A = 15\,\text{s} / 10\,\text{s} = 1.5 \)
l So A is 1.5 times faster than B
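
As a quick check of the definition, the ratio can be computed in a few lines. This is a minimal sketch; `relative_performance` is a hypothetical helper name, and the times are the ones from the example.

```python
def relative_performance(time_x, time_y):
    """Return n such that X is n times faster than Y (n = time_y / time_x)."""
    return time_y / time_x

# A takes 10 s, B takes 15 s
n = relative_performance(time_x=10.0, time_y=15.0)
print(n)  # 1.5
```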



CPU Clocking
t Operation of digital hardware governed by a
constant-rate clock

[Figure: clock waveform marking the clock period; data transfer and computation take place within each clock cycle, followed by a state update]

t Clock period: duration of a clock cycle


l e.g., 250 ps = 0.25 ns = \(250 \times 10^{-12}\) s
t Clock frequency (rate): cycles per second
l e.g., 4.0 GHz = 4000 MHz = \(4.0 \times 10^{9}\) Hz
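
Clock period and clock rate are reciprocals of each other, so the two examples describe the same clock. A small sketch (the function names are illustrative only):

```python
def clock_rate_hz(period_s):
    """Clock rate in Hz for a given clock period in seconds."""
    return 1.0 / period_s

def clock_period_s(rate_hz):
    """Clock period in seconds for a given clock rate in Hz."""
    return 1.0 / rate_hz

rate = clock_rate_hz(250e-12)      # 250 ps period
period = clock_period_s(4.0e9)     # 4.0 GHz rate
print(f"{rate / 1e9:.1f} GHz")     # 4.0 GHz
print(f"{period * 1e12:.0f} ps")   # 250 ps
```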



CPU Time

\[ \text{CPU Time} = \text{CPU Clock Cycles} \times \text{Clock Cycle Time} = \frac{\text{CPU Clock Cycles}}{\text{Clock Rate}} \]
t Performance improved by
l Reducing number of clock cycles
l Increasing clock rate
l Hardware designer must often trade off clock rate
against cycle count



CPU Time Example
t Computer A: 2GHz clock, 10s CPU time
t Designing Computer B
l Aim for 6s CPU time
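l Can do faster clock, but causes 1.2 × clock cycles
t How fast must Computer B clock be?

\[ \text{Clock Rate}_B = \frac{\text{Clock Cycles}_B}{\text{CPU Time}_B} = \frac{1.2 \times \text{Clock Cycles}_A}{6\,\text{s}} \]

\[ \text{Clock Cycles}_A = \text{CPU Time}_A \times \text{Clock Rate}_A = 10\,\text{s} \times 2\,\text{GHz} = 20 \times 10^{9} \]

\[ \text{Clock Rate}_B = \frac{1.2 \times 20 \times 10^{9}}{6\,\text{s}} = \frac{24 \times 10^{9}}{6\,\text{s}} = 4\,\text{GHz} \]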
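
The same calculation can be written as a short script. This is a sketch of the arithmetic above, with hypothetical helper names:

```python
def clock_cycles(cpu_time_s, clock_rate_hz):
    """Clock cycles = CPU time x clock rate."""
    return cpu_time_s * clock_rate_hz

def required_clock_rate(cycles, target_time_s):
    """Clock rate needed to run `cycles` cycles in `target_time_s` seconds."""
    return cycles / target_time_s

cycles_a = clock_cycles(10.0, 2e9)           # Computer A: 10 s at 2 GHz
cycles_b = 1.2 * cycles_a                    # B needs 1.2x as many cycles
rate_b = required_clock_rate(cycles_b, 6.0)  # B must finish in 6 s
print(f"{rate_b / 1e9:.1f} GHz")             # 4.0 GHz
```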
Instruction Count and CPI

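\[ \text{Clock Cycles} = \text{Instruction Count} \times \text{Cycles per Instruction} \]

\[ \text{CPU Time} = \text{Instruction Count} \times \text{CPI} \times \text{Clock Cycle Time} = \frac{\text{Instruction Count} \times \text{CPI}}{\text{Clock Rate}} \]

t CPI: clock cycles per instruction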
t Instruction Count for a program
l Determined by program, ISA and compiler
t Average cycles per instruction
l Determined by CPU hardware
l If different instructions have different CPI
n Average CPI affected by instruction mix



CPI Example
t Computer A: Cycle Time = 250ps, CPI = 2.0
t Computer B: Cycle Time = 500ps, CPI = 1.2
t Same ISA
t Which is faster, and by how much?

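\[ \text{CPU Time}_A = \text{Instruction Count} \times \text{CPI}_A \times \text{Cycle Time}_A = I \times 2.0 \times 250\,\text{ps} = I \times 500\,\text{ps} \]

A is faster…

\[ \text{CPU Time}_B = \text{Instruction Count} \times \text{CPI}_B \times \text{Cycle Time}_B = I \times 1.2 \times 500\,\text{ps} = I \times 600\,\text{ps} \]

\[ \frac{\text{CPU Time}_B}{\text{CPU Time}_A} = \frac{I \times 600\,\text{ps}}{I \times 500\,\text{ps}} = 1.2 \]

…by this much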
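
The comparison can be reproduced numerically. In this sketch the instruction count is an arbitrary placeholder (an assumption for illustration); it is the same for both machines because they share an ISA, so it cancels in the ratio.

```python
def cpu_time(instruction_count, cpi, cycle_time_s):
    """CPU time = instruction count x CPI x clock cycle time."""
    return instruction_count * cpi * cycle_time_s

instruction_count = 1_000_000  # hypothetical value; cancels in the ratio

time_a = cpu_time(instruction_count, cpi=2.0, cycle_time_s=250e-12)
time_b = cpu_time(instruction_count, cpi=1.2, cycle_time_s=500e-12)

ratio = time_b / time_a
print(f"A is {ratio:.1f} times faster than B")  # A is 1.2 times faster than B
```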
CPI in More Detail
t If different instruction classes take different
numbers of cycles

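\[ \text{Clock Cycles} = \sum_{i=1}^{n} \left( \text{CPI}_i \times \text{Instruction Count}_i \right) \]

t Weighted average CPI

\[ \text{CPI} = \frac{\text{Clock Cycles}}{\text{Instruction Count}} = \sum_{i=1}^{n} \left( \text{CPI}_i \times \frac{\text{Instruction Count}_i}{\text{Instruction Count}} \right) \]

where \( \text{Instruction Count}_i / \text{Instruction Count} \) is the relative frequency of class i.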



CPI Example
t Alternative compiled code sequences using
instructions in classes A, B, C

Class              A   B   C
CPI for class      1   2   3
IC in sequence 1   2   1   2
IC in sequence 2   4   1   1

t Sequence 1: IC = 5
l Clock Cycles = 2×1 + 1×2 + 2×3 = 10
l Avg. CPI = 10/5 = 2.0
t Sequence 2: IC = 6
l Clock Cycles = 4×1 + 1×2 + 1×3 = 9
l Avg. CPI = 9/6 = 1.5
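
The two code sequences can be compared with a short script that applies the weighted-CPI formula. The dictionary layout and function names are illustrative choices, not part of the original slides.

```python
# Cycles per instruction for each instruction class (from the table)
CPI_BY_CLASS = {"A": 1, "B": 2, "C": 3}

def total_cycles(counts):
    """Sum of CPI_i x instruction count_i over all classes."""
    return sum(CPI_BY_CLASS[cls] * n for cls, n in counts.items())

def average_cpi(counts):
    """Weighted average CPI = total clock cycles / total instruction count."""
    return total_cycles(counts) / sum(counts.values())

seq1 = {"A": 2, "B": 1, "C": 2}
seq2 = {"A": 4, "B": 1, "C": 1}

print(total_cycles(seq1), average_cpi(seq1))  # 10 2.0
print(total_cycles(seq2), average_cpi(seq2))  # 9 1.5
```

Sequence 2 executes more instructions but needs fewer clock cycles, which is why instruction count alone does not determine performance.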



Performance Summary
The BIG Picture

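\[ \text{CPU Time} = \frac{\text{Instructions}}{\text{Program}} \times \frac{\text{Clock cycles}}{\text{Instruction}} \times \frac{\text{Seconds}}{\text{Clock cycle}} \]

t Performance depends on
[Table: rows Program, Compiler, Instruction Set, Organization, Technology; columns Instruction Count, CPI, Clock Rate]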


1.5 The Power Wall
Power Trends

t In CMOS IC technology

\[ \text{Power} = \text{Capacitive load} \times \text{Voltage}^2 \times \text{Frequency} \]

[Figure annotations: power ×30, voltage 5 V → 1 V, frequency ×1000]



Reducing Power
t Suppose a new CPU has
l 85% of capacitive load of old CPU
l 15% voltage and 15% frequency reduction

\[ \frac{P_{new}}{P_{old}} = \frac{C_{old} \times 0.85 \times (V_{old} \times 0.85)^2 \times F_{old} \times 0.85}{C_{old} \times V_{old}^2 \times F_{old}} = 0.85^4 = 0.52 \]
t The power wall
l We can’t reduce voltage further
l We can’t remove more heat
t How else can we improve performance?
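
The power ratio for the new CPU can be verified with the CMOS power relation. This is a minimal sketch; the old CPU is normalized to 1 for load, voltage, and frequency, which is an assumption that does not affect the ratio.

```python
def dynamic_power(capacitive_load, voltage, frequency):
    """Dynamic power in CMOS: capacitive load x voltage^2 x frequency."""
    return capacitive_load * voltage ** 2 * frequency

p_old = dynamic_power(1.0, 1.0, 1.0)      # old CPU, normalized units
p_new = dynamic_power(0.85, 0.85, 0.85)   # 85% load, 15% lower voltage and frequency

power_ratio = p_new / p_old
print(f"{power_ratio:.2f}")  # 0.52
```

Because voltage enters squared, lowering it was the most effective lever, which is why running out of voltage headroom creates the power wall.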



1.6 The Sea Change: The Switch to Multiprocessors
Uniprocessor Performance

Constrained by power, instruction-level parallelism, memory latency



Multiprocessors
t Multicore microprocessors
l More than one processor per chip
t Requires explicitly parallel programming
l Compare with instruction level parallelism
n Hardware executes multiple instructions at once
n Hidden from the programmer
l Hard to do
n Programming for performance
n Load balancing
n Optimizing communication and synchronization



What Programs for Comparison?
t What’s wrong with this program as a workload?
int A[100][100], B[100][100], C[100][100];
int i, j, k;
for (i = 0; i < 100; i++)
    for (j = 0; j < 100; j++)
        for (k = 0; k < 100; k++)
            C[i][j] = C[i][j] + A[i][k] * B[k][j];

t What is measured? What is not measured? What is it good for?
t Ideally, run typical programs with typical input
before purchasing, or even before building, a machine
l Called a “workload”; For example:
l Engineer uses compiler, spreadsheet
l Author uses word processor, drawing program,
compression software
Benchmarks
t Obviously, apparent speed of processor depends on
code used to test it
t Need industry standards so that different
processors can be fairly compared => benchmark
programs
t Companies exist that create these benchmarks:
“typical” code used to evaluate systems
t Tricks in benchmarking:
l different system configurations
l compiler and libraries optimized (perhaps manually)
for benchmarks
l test specification biased towards one machine
l very small benchmarks used
t Need to be changed every 2 or 3 years since
designers could target these standard benchmarks
Example Standardized Workload
Benchmarks
t Standard Performance Evaluation Corporation
(SPEC) : supported by a number of computer
vendors to create standard set of benchmarks
t Began in 1989, focusing on benchmarking
workstations and servers using CPU-intensive
benchmarks
t The latest release: SPEC2006 benchmarks
l CPU performance (CINT 2006, CFP 2006)
l High-performance computing
l Client-server models
l Mail systems
l File systems
l Web-servers …



SPEC CPU Benchmark
t SPEC CPU2006
l Elapsed time to execute a selection of programs
n Negligible I/O, so focuses on CPU performance
l Normalize relative to reference machine
l Summarize as geometric mean of performance ratios
n CINT2006 (integer)

\[ \sqrt[n]{\prod_{i=1}^{n} \text{Execution time ratio}_i} \]



CINT2006 for Opteron X4 2356
Name        Description                    IC ×10^9  CPI    Tc (ns)  Exec time  Ref time  SPECratio
perl        Interpreted string processing  2,118     0.75   0.40     637        9,777     15.3
bzip2       Block-sorting compression      2,389     0.85   0.40     817        9,650     11.8
gcc         GNU C Compiler                 1,050     1.72   0.40     724        8,050     11.1
mcf         Combinatorial optimization     336       10.00  0.40     1,345      9,120     6.8
go          Go game (AI)                   1,658     1.09   0.40     721        10,490    14.6
hmmer       Search gene sequence           2,783     0.80   0.40     890        9,330     10.5
sjeng       Chess game (AI)                2,176     0.96   0.40     837        12,100    14.5
libquantum  Quantum computer simulation    1,623     1.61   0.40     1,047      20,720    19.8
h264avc     Video compression              3,102     0.80   0.40     993        22,130    22.3
omnetpp     Discrete event simulation      587       2.94   0.40     690        6,250     9.1
astar       Games/path finding             1,082     1.79   0.40     773        7,020     9.1
xalancbmk   XML parsing                    1,058     2.70   0.40     1,143      6,900     6.0
Geometric mean                                                                            11.7

High cache miss rates
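
The summary figure can be recomputed from the SPECratio column. The sketch below assumes each SPECratio is the reference time divided by the measured execution time, and takes the geometric mean through logarithms; the helper names are illustrative.

```python
import math

def spec_ratio(ref_time, exec_time):
    """Performance ratio relative to the reference machine."""
    return ref_time / exec_time

def geometric_mean(ratios):
    """n-th root of the product of n ratios, computed via logarithms."""
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))

# SPECratio column of the CINT2006 table for the Opteron X4 2356
spec_ratios = [15.3, 11.8, 11.1, 6.8, 14.6, 10.5,
               14.5, 19.8, 22.3, 9.1, 9.1, 6.0]

print(round(spec_ratio(9777, 637), 1))        # perl: 15.3
print(round(geometric_mean(spec_ratios), 1))  # 11.7
```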



SPEC Power Benchmark
t Power consumption of server at different workload
levels (10% increase each run, average them)
l Performance: ssj_ops/sec
l Power: Watts (Joules/sec)

\[ \text{Overall ssj\_ops per Watt} = \left( \sum_{i=0}^{10} \text{ssj\_ops}_i \right) \Big/ \left( \sum_{i=0}^{10} \text{power}_i \right) \]



SPECpower_ssj2008 for X4

Target Load %        Performance (ssj_ops/sec)   Average Power (Watts)
100%                 231,867                     295
90%                  211,282                     286
80%                  185,803                     275
70%                  163,427                     265
60%                  140,160                     256
50%                  118,324                     246
40%                  92,035                      233
30%                  70,500                      222
20%                  47,126                      206
10%                  23,066                      180
0%                   0                           141
Overall sum          1,283,590                   2,605
∑ssj_ops / ∑power                                493
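
The overall metric can be recomputed from the table rows. This sketch reads the 40% entry as 92,035, the value consistent with the stated overall sum; the list and function names are illustrative.

```python
# Rows of the SPECpower_ssj2008 table, from 100% load down to 0% load
ssj_ops = [231_867, 211_282, 185_803, 163_427, 140_160, 118_324,
           92_035, 70_500, 47_126, 23_066, 0]
watts = [295, 286, 275, 265, 256, 246, 233, 222, 206, 180, 141]

def overall_ssj_ops_per_watt(ops, power):
    """Sum of ssj_ops over all load levels divided by the sum of average power."""
    return sum(ops) / sum(power)

overall = overall_ssj_ops_per_watt(ssj_ops, watts)
print(sum(ssj_ops), sum(watts), round(overall))  # 1283590 2605 493
```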



1.7 Real Stuff: The AMD Opteron X4
Manufacturing ICs

t Yield: proportion of working dies per wafer



AMD Opteron X2 Wafer

t X2: 300mm wafer, 117 chips, 90nm technology


t X4: 45nm technology



Integrated Circuit Cost

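\[ \text{Cost per die} = \frac{\text{Cost per wafer}}{\text{Dies per wafer} \times \text{Yield}} \]

\[ \text{Dies per wafer} \approx \frac{\text{Wafer area}}{\text{Die area}} \]

\[ \text{Yield} = \frac{\text{\# of good dies}}{\text{\# of total dies}} = \frac{1}{\left(1 + (\text{Defects per area} \times \text{Die area}/2)\right)^2} \]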

t Nonlinear relation to area and defect rate


l Wafer cost and area are fixed
l Defect rate determined by manufacturing process
l Die area determined by architecture and circuit
design
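
The nonlinear effect of die area can be seen by evaluating the three formulas. The wafer cost, die areas, and defect density below are made-up illustrative values, not data from the slides; only the 300 mm wafer diameter appears in the deck.

```python
import math

def dies_per_wafer(wafer_area, die_area):
    """Approximate die count: wafer area / die area (ignores edge losses)."""
    return wafer_area / die_area

def die_yield(defects_per_area, die_area):
    """Yield = 1 / (1 + defects per area x die area / 2)^2."""
    return 1.0 / (1.0 + defects_per_area * die_area / 2.0) ** 2

def cost_per_die(wafer_cost, wafer_area, die_area, defects_per_area):
    """Cost per die = wafer cost / (dies per wafer x yield)."""
    good_dies = dies_per_wafer(wafer_area, die_area) * die_yield(defects_per_area, die_area)
    return wafer_cost / good_dies

# Hypothetical inputs: $5000 wafer, 0.5 defects per cm^2
wafer_area_cm2 = math.pi * (30.0 / 2) ** 2   # 300 mm wafer
small = cost_per_die(5000.0, wafer_area_cm2, die_area=1.0, defects_per_area=0.5)
large = cost_per_die(5000.0, wafer_area_cm2, die_area=2.0, defects_per_area=0.5)

print(f"1 cm^2 die: ${small:.2f}, 2 cm^2 die: ${large:.2f}")
print(f"cost ratio: {large / small:.2f}")
```

Under these assumptions, doubling the die area more than doubles the cost per die, because fewer dies fit on the wafer and a smaller fraction of them work.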



Cost of a Chip Includes ...
t Die cost: affected by wafer cost, number of dies
per wafer, and die yield (#good dies/#total dies)
t Testing cost
t Packaging cost: depends on pins, heat dissipation,
...






t enhance : 0.5 + 0.5 = 1
t ?enhance 4
t , ?enhance 1

t \( \text{speedup} = \frac{4 + 1}{1 + 1} = 2.5 \)




1.8 Fallacies and Pitfalls


Pitfall: Amdahl’s Law
t Improving an aspect of a computer and
expecting a proportional improvement in overall
performance

\[ T_{improved} = \frac{T_{affected}}{\text{improvement factor}} + T_{unaffected} \]

t Example: multiply accounts for 80s/100s
l How much improvement in multiply performance to
get 5× overall?

\[ 20 = \frac{80}{n} + 20 \]

l Can’t be done!
t Corollary: make the common case fast
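
Amdahl's Law is easy to explore numerically. The sketch below uses hypothetical helper names; `required_factor` returns `None` when no improvement factor, however large, can reach the target.

```python
def improved_time(t_affected, t_unaffected, factor):
    """Execution time after speeding up the affected part by `factor`."""
    return t_affected / factor + t_unaffected

def required_factor(t_affected, t_unaffected, target_speedup):
    """Improvement factor needed for the target overall speedup, or None if unreachable."""
    target_time = (t_affected + t_unaffected) / target_speedup
    budget = target_time - t_unaffected   # time left for the affected part
    if budget <= 0:
        return None
    return t_affected / budget

# Multiply accounts for 80 s of a 100 s program
print(improved_time(80, 20, factor=4))            # 40.0
print(required_factor(80, 20, target_speedup=2))  # about 2.67
print(required_factor(80, 20, target_speedup=5))  # None: can't be done
```

The unaffected 20 s alone already equals the 20 s target, so a 5× overall speedup is out of reach.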
Fallacy: Low Power at Idle
(Note: it is wrong to assume that power consumption is low at idle.)


t Look back at X4 power benchmark
l At 100% load: 295W
l At 50% load: 246W (83%)
l At 10% load: 180W (61%)
t Google data center
l Mostly operates at 10% – 50% load
l At 100% load less than 1% of the time
t Consider designing processors to make power
proportional to load



Pitfall: MIPS as a Performance Metric
t MIPS: Millions of Instructions Per Second
l Doesn’t account for
n Differences in ISAs between computers
n Differences in complexity between instructions
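(Note: in the end, what matters is the actual execution time.)

\[ \text{MIPS} = \frac{\text{Instruction count}}{\text{Execution time} \times 10^6} = \frac{\text{Instruction count}}{\dfrac{\text{Instruction count} \times \text{CPI}}{\text{Clock rate}} \times 10^6} = \frac{\text{Clock rate}}{\text{CPI} \times 10^6} \]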
l CPI varies between programs on a given CPU
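
A small example shows why one CPU has no single MIPS rating. The clock rate, CPI values, and instruction count below are hypothetical, chosen only to illustrate the two equivalent forms of the formula.

```python
def mips_from_rate(clock_rate_hz, cpi):
    """MIPS = clock rate / (CPI x 10^6)."""
    return clock_rate_hz / (cpi * 1e6)

def mips_from_count(instruction_count, execution_time_s):
    """MIPS = instruction count / (execution time x 10^6)."""
    return instruction_count / (execution_time_s * 1e6)

clock_rate = 4.0e9   # hypothetical 4 GHz CPU

# Two hypothetical programs on the same CPU with different average CPI
print(mips_from_rate(clock_rate, cpi=1.0))   # 4000.0
print(mips_from_rate(clock_rate, cpi=2.0))   # 2000.0

# Same result from instruction count and execution time (CPI = 2.0)
instruction_count = 1e9
execution_time = instruction_count * 2.0 / clock_rate   # 0.5 s
print(mips_from_count(instruction_count, execution_time))  # 2000.0
```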



1.9 Concluding Remarks
t Cost/performance is improving
l Due to underlying technology development
t Hierarchical layers of abstraction
l In both hardware and software
t Instruction set architecture
l The hardware/software interface
t Execution time: the best performance measure
t Power is a limiting factor
l Use parallelism to improve performance
