Sie sind auf Seite 1von 37

1

Chapter 1
Parallel Computer Models

2
In this chapter…

• THE STATEOFCOMPUTING
• MULTIPROCESSORS AND MULTICOMPUTERS
• MULTIVECTOR AND SIMDCOMPUTERS
• PRAM AND VLSI MODELS
• ARCHITECTURAL DEVELOPMENT TRACKS

3
THE STATE OF COMPUTING
Computer Development Milestones
• How it all started…
o 500 BC:Abacus (China) – The earliest mechanical computer/calculating device.
• Operated to perform decimal arithmetic with carry propagation digit by digit
o 1642: Mechanical Adder/Subtractor (Blaise Pascal)
o 1827: Difference Engine (Charles Babbage)
o 1941: First binary mechanical computer (Konrad Zuse; Germany)
o 1944: Harvard Mark I (IBM)
• The very first electromechanical decimal computer as proposed by Howard Aiken
• Computer Generations
o 1st 2nd 3rd 4th 5th
o Division into generations marked primarily by changes in hardware and software
technologies

4
THE STATE OF COMPUTING
Computer Development Milestones
• First Generation (1945 – 54)
o Technology & Architecture:
• Vacuum Tubes
• Relay Memories
• CPUdriven by PCand accumulator
• Fixed Point Arithmetic
o Software and Applications:
• Machine/Assembly Languages
• Single user
• No subroutine linkage
• Programmed I/O using CPU
o Representative Systems: ENIAC, Princeton IAS, IBM701

5
THE STATE OF COMPUTING
Computer Development Milestones
• Second Generation (1955 – 64)
o Technology & Architecture:
• Discrete Transistors
• Core Memories
• Floating Point Arithmetic
• I/O Processors
• Multiplexed memory access
o Software and Applications:
• High level languages used with compilers
• Subroutine libraries
• Processing Monitor
o Representative Systems: IBM 7090, CDC1604, Univac LARC

6
THE STATE OF COMPUTING
Computer Development Milestones
• Third Generation (1965 – 74)
o Technology & Architecture:
• IC Chips (SSI/MSI)
• Microprogramming
• Pipelining
• Cache
• Look-ahead processors
o Software and Applications:
• Multiprogramming and Timesharing OS
• Multiuser applications
o Representative Systems: IBM 360/370, CDC6600, T1-ASC, PDP-8

7
THE STATE OF COMPUTING
Computer Development Milestones
• Fourth Generation (1975 –90)
o Technology & Architecture:
• LSI/VLSI
• Semiconductor memories
• Multiprocessors
• Multi-computers
• Vector supercomputers
o Software and Applications:
• Multiprocessor OS
• Languages, Compilers and environment for parallel processing
o Representative Systems: VAX9000, Cray X-MP, IBM 3090

8
THE STATE OF COMPUTING
Computer Development Milestones
• Fifth Generation (1991 onwards)
o Technology & Architecture:
• Advanced VLSI processors
• ScalableArchitectures
• Superscalar processors
o Software andApplications:
• Systems on a chip
• Massively parallel processing
• Grand challenge applications
• Heterogeneous processing
o Representative Systems: S-81, IBM ES/9000, Intel Paragon, nCUBE 6480, MPP, VPP500

9
THE STATE OF COMPUTING
Elements of Modern Computers
• Computing Problems
• Algorithms and Data Structures
• Hardware Resources
• Operating System
• System Software Support
• Compiler Support

10
THE STATEOF COMPUTING
Evolution of Computer Architecture
• The study of computer architecture involves both thefollowing:
o Hardware organization
o Programming/software requirements
• The evolution of computer architecture is believed to havestarted
with von Neumann architecture
o Built as a sequential machine
o Executing scalar data
• Major leaps in this context cameas…
o Look-ahead, parallelism and pipelining
o Flynn’s classification
o Parallel/Vector Computers
o Development Layers

11
THESTATEOFCOMPUTING
Evolution of Computer Architecture

12
Sumit Mittu, Assistant Professor, C SE/IT, Lovely Professional University 13
13
THE STATEOF COMPUTING
Evolution of Computer Architecture

14
THESTATEOFCOMPUTING
Evolution of Computer Architecture

15
Sumit Mittu, Assistant Professor, C SE/IT, Lovely Professional University 16
THE STATE OF COMPUTING
System Attributes to Performance
• Machine Capability and Program Behaviour
• Peak Performance
• Turnaround time
• Cycle Time, Clock Rate and Cycles Per Instruction (CPI)
• Performance Factors
o Instruction Count, Average CPI, Cycle Time, Memory Cycle Time and No. of memorycycles
• System Attributes
o Instruction Set Architecture, Compiler Technology, Processor Implementation andcontrol,
Cache and Memory Hierarchy
• MIPS Rate, FLOPSand Throughput Rate
• Programming Environments – Implicit and Explicit Parallelism

16
THE STATEOF COMPUTING
System Attributes to Performance
• Cycle Time (processor) 𝜏
• Clock Rate 𝑓= 1/𝜏
• Average no. of cycles per instruction 𝐶𝑃𝐼
• No. of instructions in program 𝐼𝑐
• CPUTime 𝑇= 𝐼𝑐× 𝐶𝑃𝐼×𝜏
• Memory CycleTime 𝑘
𝜏
• No. of Processor Cyclesneeded 𝑝
• No. of Memory Cyclesneeded 𝑚
• Effective CPUTime 𝑇= 𝐼𝑐× (𝑝+ 𝑚× 𝑘)×𝜏

17
THE STATE OF COMPUTING
System Attributes to Performance
• MIPS Rate
𝐼𝑐 𝑓 𝑓×𝐼
• 𝜏= 𝑇×106
= 𝐶𝑃𝐼×106 = 𝐶×10
𝑐
6

• Throughput Rate
6
𝑀𝐼𝑃𝑆×10 𝑓
• 𝑊𝑝= =
𝐼 𝑐 𝐶𝑃𝐼×𝐼𝑐

18
THE STATE OF COMPUTING
System Attributes to Performance
• A benchmark program contains 450000 arithmetic
instructions, 320000 data transfer instructions and 230000
control transfer instructions. Each arithmetic instruction takes
1 clock cycle to execute whereas each data transfer and
control transfer instruction takes 2 clock cycles to execute. On
a 400 MHzprocessors, determine:
o Effective no. of cycles per instruction (CPI)
o Instruction execution rate (MIPS rate)
o Execution time for this program

19
THE STATEOF COMPUTING
System Attributes to Performance
Performance Factors
Instruction Average Cycles per Instruction (CPI) Processor Cycle
Count (Ic) Time (𝜏)
Processor Memory Memory Access
Cycles per References per Latency (k)
System Instruction Instruction (m)
Attribute (CPI and p)
s
Instruction-set
Architecture
Compiler
Technology
Processor
Implementation
and Control
Cache and
Memory
Hierarchy 20
Multiprocessors and Multicomputers
• Shared Memory Multiprocessors
o The UMA Model
o The NUMAModel
o The COMAModel
o The CC-NUMAModel
• Distributed-Memory Multicomputers
o The NORMAMachines
o Message Passing multicomputers
• Taxonomy of MIMD Computers
• Representative Systems
o Multiprocessors: BBN TC-200, MPP, S-81, IBM ES/9000 Model 900/VF,
o Multicomputers: Intel Paragon XP/S, nCUBE/2 6480, SuperNode 1000, CM5, KSR-1

21
Multiprocessors and Multicomputers

22
Multiprocessors and Multicomputers

23
Multiprocessors and Multicomputers

24
Multiprocessors and Multicomputers

25
Multiprocessors and Multicomputers

26
Multivector and SIMDComputers

• Vector Processors
o Vector Processor Variants
• Vector Supercomputers
• Attached Processors
o Vector Processor Models/Architectures
• Register-to-register architecture
• Memory-to-memory architecture
o Representative Systems:
• Cray-I
• Cray Y-MP (2,4, or 8 processors with 16Gflops peakperformance)
• Convex C1, C2, C3 series (C3800 family with 8 processors, 4 GBmain memory, 2
Gflops peak performance)
• DEC VAX 9000 (pipeline chaining support)

27
Multivector and SIMDComputers

28
Sumit Mittu, Assistant Professor, C SE/IT, Lovely Professional University 29
Multivector and SIMDComputers

• SIMDSupercomputers
o SIMD Machine Model
• S = < N, C, I, M, R >
• N: No. of PEsin themachine
• C: Set of instructions (scalar/program flow) directly executed by control unit
• I: Set of instructions broadcast by CUto all PEsfor parallelexecution
• M: Set of masking schemes
• R: Set of data routing functions
o Representative Systems:
• MasPar MP-1 (1024 to 16384PEs)
• CM-2 (65536 PEs)
• DAP600 Family (up to 4096PEs)
• Illiac-IV (64 PEs)

29
Multivector and SIMDComputers

30
PRAM and VLSI Models

• Parallel Random Access Machines


o Time and SpaceComplexities
• Time complexity
• Space complexity
• Serial and Parallel complexity
• Deterministic and Non-deterministic algorithm
o PRAM
• Developed by Fortune and Wyllie (1978)
• Objective:
o Modelling idealized parallel computers with zero synchronization or memory
access overhead
• An n-processor PRAM has a globally addressable Memory

31
PRAM and VLSI Models

32
PRAM and VLSI Models

• Parallel Random Access Machines


o PRAM Models
o PRAM Variants
• EREW-PRAM Model
• CREW-PRAMModel
• ERCW-PRAMModel
• CRCW-PRAM Model
o Discrepancy with Physical Models
• Most popular variants: EREWandCRCW
• SIMD machine with shared architecture is closest architecture modelled by PRAM
• PRAM Allows different instructions to be executed on different processors
simultaneously. Thus, PRAM really operates in synchronized MIMD mode with
shared memory

33
PRAM and VLSI Models

• VLSIComplexity
Model
o The 𝑨𝑻𝟐Model
• Memory Bound
on ChipArea
• I/O Bound on
Volume 𝑨𝑻
• Bisection
Communication
Bound (Cross-
section area)
𝑨𝑻
• Square of this area used aslower
bound

34
Architectural Development Tracks

35
Sumit Mittu, Assistant Professor, C SE/IT, Lovely Professional University 36
Architectural Development Tracks

36
Sumit Mittu, Assistant Professor, C SE/IT, Lovely Professional University 37
Architectural Development Tracks

37
Sumit Mittu, Assistant Professor, C SE/IT, Lovely Professional University 38