Module 1 - Advanced Computer Architecture

INTRODUCTION TO COMPUTER ARCHITECTURE
BASIC CONCEPTS OF COMPUTER ARCHITECTURE
Computer Architecture is the design of computers,
including their instruction sets, hardware
components, and system organization.
It refers to the understanding of the components
that make up the computer and the way they are
interconnected.
More specifically, architecture refers to the
attributes of the system that are visible to the
programmer: those attributes that have a direct
impact on the execution of a program.
- Instruction Sets
- Data Representation
- Addressing
- I/O
On the other hand, Computer Organization is
the underlying implementation of the architecture
which is transparent to the programmer. An
architecture can have a number of organizational
implementations:
- Control Signals
- Technologies
- Device Implementations
Most computers follow the Von Neumann
Architecture. It is also known as the Stored
Program Architecture or the Fetch-Decode-Execute
Architecture.
A computer follows the Von Neumann Architecture
if it meets the following criteria:
1. It has three basic hardware subsystems:
a CPU, a main memory system, and an
I/O system.
2. It is a stored-program computer.
Programs (together with data) are stored
in main memory during execution.
3. It carries out instructions sequentially.
4. It has, or at least appears to have, a
single path between the main memory
and the control unit of the CPU.
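The criteria above can be illustrated with a minimal sketch of a fetch-decode-execute loop. The toy machine below is hypothetical: the opcodes (LOAD, ADD, STORE, HALT), the single accumulator, and the tuple encoding of instructions are assumptions made for illustration, not part of any real instruction set.

```python
def run(memory):
    """Run a stored program held in `memory`, a list shared by code and data."""
    pc = 0    # program counter: address of the next instruction
    acc = 0   # accumulator register (hypothetical single-register CPU)
    while True:
        opcode, operand = memory[pc]   # fetch over the single path to main memory
        pc += 1                        # instructions are carried out sequentially
        if opcode == "LOAD":           # decode, then execute
            acc = memory[operand]
        elif opcode == "ADD":
            acc += memory[operand]
        elif opcode == "STORE":
            memory[operand] = acc
        elif opcode == "HALT":
            return memory
        else:
            raise ValueError(f"unknown opcode: {opcode}")


# Program and data live in the same main memory (stored-program computer).
program = [
    ("LOAD", 4),    # acc = memory[4]
    ("ADD", 5),     # acc = acc + memory[5]
    ("STORE", 6),   # memory[6] = acc
    ("HALT", 0),
    7, 35, 0,       # data at addresses 4, 5, and 6
]
result = run(program)
print(result[6])    # 42
```

Keeping instructions and data in one list is the point of the sketch: the same memory is read for both, one access at a time.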
ARCHITECTURAL CLASSIFICATION SCHEMES
Flynn's Classification of Computers (in terms of
the multiplicity of instruction and data streams)
is the most widely accepted method of classifying
computers.
Definitions of Terms:
1. Instruction Stream (IS): a sequence of
instructions as executed by a machine.
2. Data Stream (DS): a sequence of data,
including input, partial, or temporary
results, called for by the instruction
stream.
Both instructions and data are fetched from the
memory units (MU). Instructions are decoded by
the control unit (CU), which sends the decoded
instruction stream to the processor unit (PU) for
execution.
Any computer can be placed in one of four broad
categories:
1. SISD (Single Instruction Stream over a
Single Data Stream)
2. SIMD (Single Instruction Stream over
Multiple Data Streams)
3. MIMD (Multiple Instruction Streams over
Multiple Data Streams)
4. MISD (Multiple Instruction Streams over
a Single Data Stream)
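As a small sketch, the four categories can be derived from the number of instruction streams and data streams alone. The function name `flynn_class` and its parameters are hypothetical, chosen only for this illustration.

```python
def flynn_class(instruction_streams: int, data_streams: int) -> str:
    """Return the Flynn category for the given stream counts."""
    if instruction_streams < 1 or data_streams < 1:
        raise ValueError("a machine needs at least one stream of each kind")
    i = "S" if instruction_streams == 1 else "M"   # single or multiple IS
    d = "S" if data_streams == 1 else "M"          # single or multiple DS
    return f"{i}I{d}D"


print(flynn_class(1, 1))   # SISD
print(flynn_class(1, 8))   # SIMD
print(flynn_class(4, 4))   # MIMD
print(flynn_class(4, 1))   # MISD
```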
SISD (Single Instruction Stream over a Single
Data stream)
An SISD machine is a conventional sequential
machine (Von Neumann). A program executed by
the processor constitutes the single instruction
stream, and the sequence of data items that it
operates on constitutes the single data stream.
[Figure: SISD organization, showing the CU, PU,
MU, and I/O connected by the instruction stream
(IS) and the data stream (DS).]
Instructions are executed sequentially but may be
overlapped in their execution stages (pipelining).
Most SISD uniprocessor systems are pipelined.
SIMD (Single Instruction Stream over Multiple
Data Streams)
A single stream of instructions is broadcast to a
number of processors. Each processor operates
on its own data. The multiple data streams are
the sequences of data items accessed by the
individual processors in their own memories.
In other words, an SIMD computer has several
processors running the same program in lockstep,
each operating on a different set of data. This
type of processing is also called array
processing.
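The lockstep behavior described above can be sketched in a few lines. This is a sequential simulation, not real parallel hardware: the instruction stream (two lambda operations) and the three local memories are made-up values for illustration.

```python
def run_simd(instruction_stream, local_memories):
    """Broadcast each instruction to every processor, one step at a time."""
    for instruction in instruction_stream:
        # In one step, every PU applies the same instruction to its own
        # local memory; no PU moves ahead of the others (lockstep).
        local_memories = [[instruction(x) for x in lm] for lm in local_memories]
    return local_memories


instruction_stream = [lambda x: x * 2, lambda x: x + 1]   # single IS
local_memories = [[1, 2], [10, 20], [100, 200]]           # LM1, LM2, LM3

print(run_simd(instruction_stream, local_memories))
# [[3, 5], [21, 41], [201, 401]]
```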
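[Figure: SIMD organization. One CU broadcasts the
IS to processing units PU1 ... PUn, each of which
exchanges a DS with its local memory LM1 ... LMn.
The program and the data sets are loaded from the
host.]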
MIMD (Multiple Instruction Streams over
Multiple Data Streams)
These are the parallel computers (multiprocessor
and multicomputer systems). They involve a
number of independent processors, each
executing a different program and accessing its
own sequence of data items (or executing the same
program on the same data, but not in lockstep as
in SIMD machines).
[Figure: MIMD organization. Control units
CU1 ... CUn each send an IS to their processing
units PU1 ... PUn, which exchange DS with a
shared memory; I/O is attached.]
MISD (Multiple Instruction Streams over a Single
Data Stream)
A common data structure is manipulated by
separate processors, each executing a different
program.
This organization is also associated with
systolic arrays for pipelined execution of
specific algorithms.
This form of computation does not arise often in
practice.
[Figure: MISD organization. Control units
CU1 ... CUn fetch instruction streams from memory
(program and data) and drive PU1 ... PUn, through
which a single DS passes in sequence; I/O is
attached.]
SYSTEM ATTRIBUTES TO PERFORMANCE
The ideal performance of a computer system
demands a perfect match between machine
capability and program behavior.
Machine capability can be enhanced with better
hardware technology, innovative architectural
features, and efficient resource management.
Program behavior is affected by algorithm design,
data structures, language efficiency, programmer
skill, and compiler technology.
The simplest measure of program performance is
the turnaround time: the interval from the time
of submission to the time of completion. It is the
sum of the periods spent on disk and memory
accesses, I/O activities, compilation, OS
overhead, and CPU time. To reduce turnaround
time, one must reduce all of these time factors.
In a multiprogrammed computer, the I/O and
system overheads of a given program may overlap
with the CPU times of other programs. Therefore,
it is fair to compare just the total CPU time
needed for program execution.
The CPU of today's digital computer is driven by
a clock with a constant clock rate or clock
frequency ($f$, usually quoted in megahertz). The
inverse of the clock rate is the period or cycle
time ($\tau = 1/f$, in seconds when $f$ is in hertz).
The size of the program is determined by its
Instruction Count ($I_c$), the number of machine
instructions to be executed in the program.
Different machine instructions may require
different numbers of clock cycles to execute.
Example:
For the Intel microprocessors, the MOV
instruction (register to register) takes 2 cycles
to execute, the MOV instruction (memory to
register) takes 8 cycles, and the SHR
instruction takes 4 cycles.
Therefore, the cycles per instruction (CPI)
becomes an important parameter for measuring
the time needed to execute each instruction.
For a given instruction set, the average CPI over
all instruction types can be computed.
The CPU Time (T, in seconds/program) needed to
execute the program is estimated as the product
of three contributing factors: the instruction
count $I_c$, the CPI, and the cycle time $\tau$:
$$T = I_c \times \mathrm{CPI} \times \tau$$
Example 1:
A 40-MHz processor was used to execute a
program with 50,000 instructions. The average
CPI is estimated to be 3.5 cycles/instruction.
Calculate the total execution time.
Solution:
$$\tau = \frac{1}{f} = \frac{1}{40 \times 10^{6}} = 25\ \text{ns}$$
$$T = I_c \times \mathrm{CPI} \times \tau = 50000 \times 3.5 \times 25 \times 10^{-9} = 4.375\ \text{ms}$$
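The arithmetic in Example 1 can be checked with a short sketch. The function name `cpu_time` and its parameter names are assumptions for this illustration; the clock rate is passed in hertz so that the cycle time comes out in seconds.

```python
def cpu_time(instruction_count, cpi, clock_hz):
    """Return CPU time in seconds: T = Ic * CPI * tau, with tau = 1 / f."""
    tau = 1.0 / clock_hz              # cycle time in seconds
    return instruction_count * cpi * tau


t = cpu_time(50_000, 3.5, 40e6)       # values from Example 1
print(f"{t * 1e3:.3f} ms")            # 4.375 ms
```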
Example 2:
A 40-MHz processor was used to execute a
benchmark program with the following
instruction mix and clock cycle counts:
Instruction Type      Instruction Count   Clock Cycle Count
Integer Arithmetic    45,000              1
Data Transfer         32,000              2
Floating Point        15,000              2
Control Transfer      8,000               2
Determine the effective CPI and execution time for
this program.
Solution:
$$\tau = \frac{1}{f} = \frac{1}{40 \times 10^{6}} = 25\ \text{ns}$$
$$\text{Total Cycles} = (45000 \times 1) + (32000 \times 2) + (15000 \times 2) + (8000 \times 2) = 155{,}000\ \text{cycles}$$
$$\mathrm{CPI} = \frac{\text{Total Number of Cycles}}{\text{Total Number of Instructions}} = \frac{155000}{45000 + 32000 + 15000 + 8000} = \frac{155000}{100000} = 1.55\ \text{cycles/instruction}$$
$$T = I_c \times \mathrm{CPI} \times \tau = 100000 \times 1.55 \times 25 \times 10^{-9} = 3.875\ \text{ms}$$
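The effective CPI in Example 2 is a weighted average over the instruction mix, which a short sketch makes explicit. The dictionary layout and the function name `effective_cpi` are assumptions for this illustration; the counts and cycle values are the ones given in the example.

```python
# Instruction mix from Example 2: type -> (instruction count, cycles each)
mix = {
    "integer arithmetic": (45_000, 1),
    "data transfer": (32_000, 2),
    "floating point": (15_000, 2),
    "control transfer": (8_000, 2),
}


def effective_cpi(mix):
    """Return (CPI, total cycles, total instructions) for an instruction mix."""
    total_cycles = sum(count * cycles for count, cycles in mix.values())
    total_instructions = sum(count for count, _ in mix.values())
    return total_cycles / total_instructions, total_cycles, total_instructions


cpi, total_cycles, total_instructions = effective_cpi(mix)
exec_time = total_cycles / 40e6       # T = C / f for the 40-MHz clock

print(total_cycles)                   # 155000
print(cpi)                            # 1.55
print(f"{exec_time * 1e3:.3f} ms")    # 3.875 ms
```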
The execution of an instruction requires going
through a cycle of events involving instruction
fetch, decode, operand(s) fetch, execution, and
storing of results.
Only the instruction decode and execution phases
are carried out in the CPU. The remaining three
operations may require access to memory.
A memory cycle is defined as the time needed to
complete one memory reference (read or write).
Usually, a memory cycle is k times the processor
cycle $\tau$. The value of k depends on the speed of
the memory technology and the processor-memory
interconnection scheme used.
The CPI of an instruction can be divided into two
component terms corresponding to the total
processor cycles and memory cycles needed to
complete the execution of the instruction.
$$T = I_c \times (p + m \times k) \times \tau$$
where:
p is the number of processor cycles needed for
the instruction decode and execute
m is the number of memory references needed
k is the ratio between memory cycle and
processor cycle
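A minimal sketch of this refined formula follows. The function name and the sample values (p = 2, m = 1, k = 4) are hypothetical and do not come from the module; they only show how the memory term inflates the effective CPI.

```python
def cpu_time_with_memory(instruction_count, p, m, k, clock_hz):
    """Return CPU time in seconds: T = Ic * (p + m * k) * tau."""
    tau = 1.0 / clock_hz               # processor cycle time in seconds
    cpi = p + m * k                    # processor cycles plus memory cycles
    return instruction_count * cpi * tau


# Hypothetical workload: 100,000 instructions on a 40-MHz processor, each
# needing 2 processor cycles and 1 memory reference, with memory 4x slower.
t = cpu_time_with_memory(100_000, p=2, m=1, k=4, clock_hz=40e6)
print(f"{t * 1e3:.3f} ms")             # 15.000 ms
```

With these assumed values the effective CPI is 2 + 1 x 4 = 6, so two thirds of the execution time is spent waiting on memory.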
MIPS Rate
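The processor speed is often measured in terms
of million instructions per second (MIPS).
Let C be the total number of clock pulses or
cycles needed to execute a given program.
$$C = I_c \times \mathrm{CPI}$$
$$T = I_c \times \mathrm{CPI} \times \tau = C \times \tau = \frac{C}{f}$$
The equation for the MIPS rate is:
$$\mathrm{MIPS} = \frac{I_c}{T \times 10^{6}}$$
Since $T = I_c \times \mathrm{CPI} \times \tau$ and $\tau = 1/f$, a second
equation for the MIPS rate can be derived as:
$$\mathrm{MIPS} = \frac{f}{\mathrm{CPI} \times 10^{6}}$$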
Since $\mathrm{CPI} = C / I_c$, a third equation for the
MIPS rate can be derived as:
$$\mathrm{MIPS} = \frac{f \times I_c}{C \times 10^{6}}$$
Example 3:
A 40-MHz processor was used to execute a
benchmark program with the following
instruction mix and clock cycle counts:
Instruction Type      Instruction Count   Clock Cycle Count
Integer Arithmetic    45,000              1
Data Transfer         32,000              2
Floating Point        15,000              2
Control Transfer      8,000               2
Determine the MIPS rate of the system.
Solution:
From the previous example:
$I_c$ = 100,000 instructions
T = 3.875 ms
$$\mathrm{MIPS} = \frac{I_c}{T \times 10^{6}} = \frac{100000}{3.875 \times 10^{-3} \times 10^{6}} = 25.81\ \text{MIPS}$$
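As a cross-check, the three MIPS equations should agree for the same benchmark. The sketch below evaluates each one with the values from the examples above; the function names are assumptions for this illustration.

```python
def mips_from_time(instruction_count, cpu_time_s):
    """MIPS = Ic / (T * 10^6)."""
    return instruction_count / (cpu_time_s * 1e6)


def mips_from_cpi(clock_hz, cpi):
    """MIPS = f / (CPI * 10^6)."""
    return clock_hz / (cpi * 1e6)


def mips_from_cycles(clock_hz, instruction_count, total_cycles):
    """MIPS = (f * Ic) / (C * 10^6)."""
    return clock_hz * instruction_count / (total_cycles * 1e6)


# Values from the benchmark: f = 40 MHz, Ic = 100,000, C = 155,000,
# CPI = 1.55, T = 3.875 ms.
a = mips_from_time(100_000, 3.875e-3)
b = mips_from_cpi(40e6, 1.55)
c = mips_from_cycles(40e6, 100_000, 155_000)

print(f"{a:.2f} {b:.2f} {c:.2f}")   # 25.81 25.81 25.81
```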