Computer Organization

TABLE OF CONTENTS
Module No. Contents
1. Basic Structure of Computers
2. Instruction Formats
3. Central Processing Unit
4. Types of Memories
5. Input/Output Organization
6. Types of Computers and IBM’s AS 400
7. Bus Architectures
© Wipro Limited, 2011. All Rights Reserved. All the information in this course material is internal and
restricted. Participants shall refrain from copying, distributing, misusing and/or disclosing the content
to any third parties under any circumstances whatsoever.
Basic Structure of Computers
Module 1
Objectives
At the end of this module, you will be able to:
• Identify the user’s, software developer’s and computer designer’s views
• List different components of computer
• Discuss layered model of computer
• Describe CPU configuration registers, buses
• Review electronic switch, register, clocked circuit
Computer Architecture
User view:
A computer is a machine to help the user in a variety of tasks.
– Computational
– Non- computational e.g. email
– Edutainment – Games, educational software
Software Developer’s view:
A platform on which a software package can be developed and tested.
– General purpose, e.g., text editor, compiler, etc.
– Special purpose, i.e., targeted to specific applications
Computer Designer’s view:
A programmable digital system, consisting of:
– System hardware able to support the software to be run.
– Software packages providing the necessary support in terms of utilities and programming languages.
User’s view: The average user does not care how the computer does the job, so the computer must present a very simple
user interface, for example a GUI.
Software developer’s view: Presents the applications and tools that are available for the user. This includes instruction
sets, assemblers, compilers and assorted applications.
Computer designer’s view: Presents the interfaces, peripherals, timing information and signal-level
information, among other hardware-related information.
A typical computer
The evolution of computers in terms of features, performance and cost has been a win-win process.
Continuous advances in technology have led to:
better performance at lower cost;
lower levels of energy consumption.
Examples of current new computer systems:
A layered model of a computer
In the layered model of the computer, the CPU and related hardware form the innermost layer.
The second layer is the firmware or BIOS, which is ROM resident. Whenever a computer is booted, it starts
executing the program in the ROM. This code initializes some of the peripherals and loads the disk
content into RAM.
The third layer is the system software resident on the (hard) disk. The ROM code reads the OS information from the boot sector
(track 0/sector 0), which holds a pointer to the location where the OS actually resides. The information
from the boot sector is loaded into RAM and executed. This brings up the OS.
The disk may contain applications such as an office suite, browsers, media players, compilers, simulators, etc. They
reside at the outermost layer and are loaded and executed whenever a user requests them.
Logically, applications developed by the user also reside in the same layer.
Configuration of Registers and Buses
List of Registers used in a basic computer:
Register symbol  Number of bits  Register name         Function
DR               16              Data register         Holds memory operand
AR               12              Address register      Holds address for memory
AC               16              Accumulator           Processor register (general purpose)
IR               16              Instruction register  Holds instruction code
PC               12              Program counter       Holds address of the (next) instruction
TR               16              Temporary register    Holds temporary data
INPR             8               Input register        Holds input character
OUTR             8               Output register       Holds output character
CPU consists of
• Combinational logic
– Adders, MUX, DEMUX, encoders, decoders, etc.
• Sequential logic
– Counters, registers, shifters, etc.
• All the logic devices’ outputs have tri-state logic
Discuss logic states (0, 1), tri-state logic, the concept of an electronic switch, inverter, tri-state inverter, logic gates,
multiplexer, flip-flop, register and counting logic, only at a functional level (not in greater detail). A brief discussion
of number systems is recommended. The facilitator can explain the binary, octal and hexadecimal number systems and
their conversion from one to another. This will build appreciation for data representation.
Tri-state logic (also called the high-impedance or Z state) makes the output neither high nor low. In this
state an output is neither sourcing current (high) nor sinking current (low). Effectively, the gate is
disconnected from the circuit. In such a case we can connect two or more outputs together; that is, more than
one output can share the same electrical line, provided only one of them is enabled at a time. This is the basic principle of the bus.
In the figure above, the left side shows normal logic: when the switch is thrown up, the output is high; otherwise
the output is low. The right side shows tri-state logic, which has an extra position in the middle.
Since that position is not connected (N/C), the output floats. This is the tri-state output.
A tri-state device has an extra control input, normally called Chip Enable (CE) or Chip Select (CS).
Whenever the chip is enabled, it behaves the way it is supposed to behave. For example, when a MUX is
enabled, it behaves as a MUX. When not selected (disabled), all its outputs are in the tri-state.
In almost all devices, CE is active low. Unless otherwise specified, all the components in a computer
system are tri-state devices.
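The bus-sharing idea can be sketched in a few lines of Python. This is only a functional model under stated assumptions: `Z`, `tri_state_buffer` and `resolve_bus` are hypothetical names introduced here, CE is taken as active low, and electrical behaviour is reduced to "exactly one driver wins".

```python
from typing import Optional, Sequence

Z = None  # high-impedance state: the output drives nothing


def tri_state_buffer(data: int, chip_enable_n: int) -> Optional[int]:
    """Return data when CE (active low) is 0, otherwise high impedance."""
    return data if chip_enable_n == 0 else Z


def resolve_bus(outputs: Sequence[Optional[int]]) -> Optional[int]:
    """Combine several outputs wired to one bus line."""
    drivers = [v for v in outputs if v is not Z]
    if not drivers:
        return Z  # nobody drives the line: it floats
    if len(drivers) > 1:
        raise ValueError("bus contention: more than one device enabled")
    return drivers[0]


# Three devices share one line; only the second one is enabled (CE = 0).
line = resolve_bus([
    tri_state_buffer(1, chip_enable_n=1),
    tri_state_buffer(0, chip_enable_n=0),
    tri_state_buffer(1, chip_enable_n=1),
])
print(line)  # 0
```

Enabling two devices at once raises an error in this model, which mirrors the rule that only one output may drive a shared line at a time.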
Summary
In this module, we discussed:
• User’s, software developer’s and computer designer’s views
• Typical computer parts – CPU(ALU,CU), IO, Memories (main and secondary)
• Layered architecture: innermost- hardware, outermost-applications
• CPU configuration registers, buses
• CPU logic: Combinational, sequential, tri-state logic
Instruction Formats
Module 2
Objectives
At the end of this module, you will be able to:
• Describe Instruction formats
• Explain control unit of a computer
Instruction Format
• Program consists of a sequence of instructions.
• Instructions are binary codes that specify some action.
• Many instructions contain or specify data/address used by them.
• An instruction is specified by one or more fields:
– Opcode (operation code)- specifies the operation
– Operand- specifies the data or address of the data
The solution algorithm for any problem consists of a series of steps that must be carried out in a specific
sequence. To implement such an algorithm on a computer, these steps are broken down into smaller steps,
each of which represents one machine instruction. The resulting sequence of instruction is a machine
language program representing the algorithm. The same general approach is used to enable the computer
to perform the functions specified by individual machine instructions; that is, each of these instructions is
executed by carrying out a sequence of more rudimentary operations.
(a) Zero-operand instruction
Opcode
(b) One-operand instruction
Opcode Address
(c) Two-operand instruction
Opcode addr1 addr2
Instructions in a Simple Computer

Consider a basic computer having three instruction formats, each 16 bits long, with a 3-bit opcode.
1. Memory-reference instruction: uses 12 bits to specify an address and one bit, I, for the addressing mode. I is 0 for
direct and 1 for indirect addressing.
2. Register-reference instruction: recognized by 111 in the opcode with 0 in the leftmost bit. The remaining
12 bits specify an operation or test on AC.
3. Input-output instruction: does not need to refer to memory and is recognized by 1 in the leftmost bit and 111 in
the opcode.
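The three formats can be told apart by looking only at bit 15 (I) and bits 12-14 (opcode). The following Python sketch shows one way to do this; the function name `decode` and the sample words are assumptions chosen for illustration, not part of the course material.

```python
def decode(instruction: int) -> dict:
    """Split a 16-bit word into I (bit 15), opcode (bits 12-14), address (bits 0-11)."""
    if not 0 <= instruction <= 0xFFFF:
        raise ValueError("instruction must fit in 16 bits")
    i_bit = (instruction >> 15) & 0x1
    opcode = (instruction >> 12) & 0x7
    address = instruction & 0xFFF
    if opcode != 0b111:
        kind = "memory-reference (indirect)" if i_bit else "memory-reference (direct)"
    elif i_bit == 0:
        kind = "register-reference"
    else:
        kind = "input-output"
    return {"I": i_bit, "opcode": opcode, "address": address, "kind": kind}


# Arbitrary sample words, one per case.
print(decode(0x1234)["kind"])  # memory-reference (direct)
print(decode(0x9234)["kind"])  # memory-reference (indirect)
print(decode(0x7800)["kind"])  # register-reference
print(decode(0xF400)["kind"])  # input-output
```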
• Basic computer instruction format:
Instructions categories:
Register-memory: This instruction allows memory words to be fetched into registers which can be used as
ALU inputs in subsequent instructions.
Register-register: This instruction fetches two operands from registers, brings them to the ALU input
registers to perform some operations and store the results back in the register.
Memory-memory: This instruction fetches operands from a memory into the ALU input registers, to perform
an operation, and then writes the results back to the memory
• Instruction register bits are connected in the following way:
Bit 15 is transferred to flip flop represented by I.
Bits 0-11 are applied to control logic gates.
Opcode bits 12-14 are sent to a decoder (3:8).
• A 4-bit sequence counter (SC) in the control unit counts in binary from 0 to 15; its outputs are decoded into 16
timing signals, T0 through T15.
• The SC can be incremented or cleared synchronously.
Design criteria for instruction Formats
Short instructions are better than long ones: n instructions of 16 bits each take only half the memory
needed by n instructions of 32 bits each.
Processor speed depends on the length of the instruction.
Let the transfer rate of the memory be t bps
and the length of an instruction be r bits.
Then the memory delivers t/r instructions/sec,
so the instruction execution time depends on r.
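A quick numerical check of the t/r relation, as a Python sketch. The transfer rate used here is an assumed figure chosen only to make the arithmetic easy to follow.

```python
def instructions_per_second(transfer_rate_bps: float, instruction_bits: int) -> float:
    """Memory delivers t / r instructions per second."""
    return transfer_rate_bps / instruction_bits


t = 64_000_000  # assumed memory transfer rate in bits per second
for r in (16, 32):
    print(r, instructions_per_second(t, r))
# 16 4000000.0
# 32 2000000.0
```

Halving the instruction length doubles the number of instructions the same memory can deliver per second.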
Control Unit of a Simple Computer
Bits 12-14 of the instruction register are decoded by a 3-to-8 decoder, which enables the appropriate
controller in the control logic unit. The address/register field supplies the address field for the control
logic. The control logic circuit also takes the sequence counter as input, which enables each of the low-level
operations in turn. The sequence counter can be incremented to execute the next low-level operation
or cleared to start a new instruction.
Summary
• Instruction formats
• Zero-, one- and two-operand instructions
• Simple control unit of a computer
Central Processing Unit

Module 3
Objectives
• Describe building blocks of CPU registers (MAR,PC,IR etc)
• Define Register Transfer Language
• Illustrate bus and memory data transfer
• Explain micro-operations – arithmetic and logic
• Distinguish hardwired and microprogrammed control
The Central Processing Unit (CPU)

• Building blocks of a CPU
• Register Transfer
• Bus and Memory Transfer
• Arithmetic, Logic and Shift Micro-operations
• Arithmetic, Logic Shift unit
• Control Unit
In its simplest form, a computer system has one unit that executes program instructions. This unit
communicates with and often controls the operation of other subsystems within the computer. Because of
its central role, this unit is known as the Central Processing Unit, or CPU. Often, a subsystem of the
computer, such as an input unit or a mass storage device, may incorporate a processing unit of its own. Such
a processing unit, although central to its subsystem, is clearly not central to the whole computer system.
The principles involved in the design and operation of a CPU are independent of its position in a computer
system. In this unit, we explore the organization of the hardware that enables a CPU to perform its main
function – to fetch and execute instructions.
Basic building blocks of a CPU

• Memory Address Register (MAR) - This register holds the address of the memory location to be accessed.
The desired address is loaded into MAR through a common Bus.
• Program Counter (PC) - This register keeps track of the program flow. Its count is incremented after every
fetch of Op Code or Operand from the Program Memory.
• Stack Pointer (SP) - This is an up-down counter which holds the address of the top of the Stack. Its count is
incremented or decremented after every Push or Pop operation of the Stack.
• Instruction Register (IR) - The Op Code fetched from the program memory is loaded here, and its output is
fed to the Instruction Decoder for generating the Control Signals.
• Temporary Registers - These registers are accessible only to the system, and cannot be accessed by the user.
One of the ALU inputs comes from the common bus, and the other is loaded into Register Y before ALU
operation. The ALU output is always latched into Register Z. Typically these two registers viz., Y and Z are
transparent to the user; and are not accessible.
The Program Counter is also called the Instruction Pointer (IP).
• Programmable ALU - The arithmetic and logic functions of the ALU are selectable by a multi-bit control input
(SAL).
• Register array - This consists of general-purpose registers selectable by a multi-bit control input (SRG ).
• Instruction Decoder - This block generates the sequence of control signals necessary for making each of the
functional blocks, described above, to work according to the Op Code held in IR.
• The internal hardware organization of a digital computer is defined by the following:
– The set of registers it contains & their functions.
– The sequence of micro-operations performed on the binary information stored in the registers.
– The control that initiates the sequence of micro-operations.
Modular Approach: Digital System design uses modular approach.
Modules are constructed from the digital components like registers, decoders, arithmetic elements, control
logic etc.
The various modules are interconnected by a common data & control path to form a digital system.
Register Transfer
Descriptive explanation of the micro operations becomes lengthy and ambiguous.
Use Register transfer language (RTL):
Symbolic notation used to describe the micro-operations on the contents of the registers.
Ex: MAR-register holds the address of a memory location.
PC - Program counter, IR - Instruction register.
Advantages:
• Helps to express in symbolic form - Concise, Precise.
• Tool for describing the internal organization.
• Facilitate design process.
RTL is similar to assembly language programming. It specifies what operation is to be done at every
clock pulse. There may be multiple operations in each clock, as long as the operations do not use the same
resources. For example, we can transfer the content of a register to the accumulator and, at the same time,
read data from memory into a temporary register. This is possible since the first operation uses the internal
bus and the second operation uses the external bus, so the two operations do not conflict.
Control functions:
• Transfer/manipulation of data occurs only under a predetermined condition which is generated using
control signals.
Ex: If (P=1) then (R2 <= R1)
where P is the control signal generated in the control section.
The above statement can also be represented as:
P: R2 <= R1, where P is the control function (0 or 1).
Ex: T: R2 <= R1, R1 <= R2 (both transfers occur during the same clock pulse, exchanging the contents of R1 and R2).
• Every register transfer implies a hardware construction for implementing it.
• Consider the transfer from R1 to R2 when P=1, i.e. (P: R2 <= R1).
Bus and Memory Transfer

• Common bus: Efficient scheme for transferring information between multiple registers.
Bus <= C, R1 <= Bus (equivalent to R1 <= C)
A common bus system can be constructed using multiplexers.
• Memory read: The read operation can be depicted as:
Read: DR <= M[AR]
AR is the address register, DR is the data register and M[AR] is the memory word addressed by AR.
• Memory write: The write operation can be depicted as:
Write: M[AR] <= R1
Three-state bus buffer: A bus can also be constructed from three-state gates.
The figure shows a bus line with three-state buffers.
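The register-transfer statements in this section can be imitated with a small Python model. It is a sketch under assumptions: the register values, the memory size of 4096 words (matching a 12-bit AR) and the helper name `clock` are all introduced here for illustration.

```python
registers = {"R1": 0x00AA, "R2": 0x0055, "AR": 0x003, "DR": 0x0000}
memory = [0] * 4096          # 4096 words, addressed by the 12-bit AR
memory[0x003] = 0x1234


def clock(control: bool, transfers: dict) -> None:
    """Apply 'control: dst <= src, ...' on one clock edge.

    All sources are sampled before any destination is written, so
    'T: R2 <= R1, R1 <= R2' swaps the two registers.
    """
    if not control:
        return
    sampled = {dst: registers[src] for dst, src in transfers.items()}
    registers.update(sampled)


clock(True, {"R2": "R1", "R1": "R2"})       # T: R2 <= R1, R1 <= R2
registers["DR"] = memory[registers["AR"]]   # Read:  DR <= M[AR]
memory[registers["AR"]] = registers["R1"]   # Write: M[AR] <= R1

print(hex(registers["R1"]), hex(registers["R2"]))  # 0x55 0xaa
print(hex(registers["DR"]), hex(memory[0x003]))    # 0x1234 0x55
```

Sampling every source before writing any destination is what lets two transfers in the same clock pulse exchange register contents.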
Micro-operations
• Micro-operations are elementary operations performed within defined clock cycles, on the data stored in
the registers present in the various modules.
Ex: Operations such as movement of data within the registers of the CPU
(shift, clear, load, etc.),
arithmetic operations,
and logic operations like AND, OR, NOT, XOR, etc.
• Each instruction given to a computer consists of a sequence of these micro-operations within the CPU.
• Micro-operations are controlled by the control unit, which generates the control signals for execution of
the operations.
Micro-operations are the basic operations within the Central Processing Unit. For example, the instruction add r1,r2
involves the following micro-operations:
• Move register r1 to ALU and latch the data
• Move contents of register r2 to temporary register (like Y)
• Add the content of temp reg. Y to the content of ALU; store the result in another temp. register Z
• Move the content of temporary register Z to r1.
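The four micro-operations for add r1,r2 can be traced step by step in Python. The register width of 16 bits and the starting values are assumptions made for this sketch.

```python
MASK = 0xFFFF  # assumed 16-bit registers

regs = {"r1": 7, "r2": 5, "Y": 0, "Z": 0}
alu_latch = 0

# Step 1: move r1 to the ALU and latch it.
alu_latch = regs["r1"]
# Step 2: move r2 to temporary register Y.
regs["Y"] = regs["r2"]
# Step 3: add Y to the latched ALU input; the result goes to Z.
regs["Z"] = (alu_latch + regs["Y"]) & MASK
# Step 4: move Z back to r1.
regs["r1"] = regs["Z"]

print(regs["r1"])  # 12
```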
Types of Micro-operations
The micro-operations are classified into four categories:
1. Register transfer micro-operations: transfer binary information from one register to another. (No change
of information)
2. Arithmetic micro-operations: perform arithmetic operations on numeric data stored in registers.
3. Logic micro-operations: perform bit manipulation operations on non-numeric data stored in registers.
4. Shift micro-operations: perform shift operations on data stored in registers.
Arithmetic Micro-operations
The arithmetic micro-operations are addition, subtraction, increment, decrement & shift.
Symbolic designation   Description
R3 <= R1 + R2          Contents of R1 plus R2 transferred to R3
R3 <= R1 - R2          Contents of R1 minus R2 transferred to R3
R2 <= R2’              Complement the contents of R2 (1’s complement)
R2 <= R2’ + 1          2’s complement the contents of R2 (negate)
R3 <= R1 + R2’ + 1     R1 plus the 2’s complement of R2 (subtraction)
R1 <= R1 + 1           Increment the content of R1 by one
R1 <= R1 - 1           Decrement the content of R1 by one
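Subtraction by adding the 2’s complement can be checked with a short Python sketch. The 8-bit register width and the function names are assumptions for illustration; the identity itself, R1 plus the complement of R2 plus 1, holds for any width.

```python
BITS = 8                    # assumed register width
MASK = (1 << BITS) - 1


def complement(r: int) -> int:
    """1's complement of an n-bit register value."""
    return ~r & MASK


def negate(r: int) -> int:
    """2's complement: 1's complement plus one."""
    return (complement(r) + 1) & MASK


def subtract(r1: int, r2: int) -> int:
    """R1 plus the 2's complement of R2, i.e. R1 - R2 modulo 2**BITS."""
    return (r1 + complement(r2) + 1) & MASK


r1, r2 = 0b0010_1101, 0b0000_1010     # 45 and 10
print(subtract(r1, r2))               # 35
print(format(complement(r2), "08b"))  # 11110101
print(format(negate(r2), "08b"))      # 11110110
```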
Logic micro-operations specify binary operation for strings of bits stored in registers.
These operations consider each bit of the register separately & treat them as binary variables.
Ex: The exclusive-OR micro-operation on the contents
of two registers R1 & R2 is given as
P: R1 <= R1 XOR R2
1010 Content of R1
1100 Content of R2
0110 Content of R1 after P=1
There are 16 possible logic operations on two binary variables, but most computers implement them using only four gates: AND, OR, XOR &
inverter.
Logic Micro-operations
• One stage of logic circuit: Implementation requires the logic gates be inserted for each bit or pair of bits in
the registers to perform the required logic function.
Applications
Selective-set: This operation sets to 1 the bits in register A
where there are corresponding 1’s in register B.
Ex: 1010 A before
    1100 B (logic operand) (OR)
    1110 A after (A <= A OR B)
Selective-complement: This operation complements the bits in A where there are corresponding 1’s in B.
Ex: 1010 A before
    1100 B (logic operand) (Ex-OR)
    0110 A after (A <= A XOR B)
Selective-clear: This operation clears to 0 the bits in A only where there are corresponding 1’s in B.
Ex: 1010 A before
    1100 B (logic operand)
    0010 A after (A <= A AND B’)
Mask: This operation is similar to selective-clear, except that the bits of A are cleared only where there are
corresponding 0’s in B.
Ex: 1010 A before
    1100 B (logic operand)
    1000 A after masking (A <= A AND B)
Insert: a mask operation followed by an OR with the new value. Clear: an Ex-OR of A and B, which gives all 0’s when A and B are equal.
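The four bit-manipulation applications map directly onto Python’s bitwise operators. The sketch below reuses the 4-bit operands from the examples; the function names are introduced here for illustration. Note that selective-clear uses the complement of B, while mask uses B itself.

```python
WIDTH = 4
ALL_ONES = (1 << WIDTH) - 1


def selective_set(a: int, b: int) -> int:
    return a | b                   # A <= A OR B


def selective_complement(a: int, b: int) -> int:
    return a ^ b                   # A <= A XOR B


def selective_clear(a: int, b: int) -> int:
    return a & (~b & ALL_ONES)     # A <= A AND (complement of B)


def mask(a: int, b: int) -> int:
    return a & b                   # A <= A AND B


a, b = 0b1010, 0b1100
for op in (selective_set, selective_complement, selective_clear, mask):
    print(op.__name__, format(op(a, b), "04b"))
# selective_set 1110
# selective_complement 0110
# selective_clear 0010
# mask 1000
```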
Arithmetic Logic Shift Unit
Shift micro-operations are used for serial transfer of data.
They are also used in conjunction with arithmetic, logic & other data processing operations.
Logic shift: 1-bit shift to the left or right, with 0 shifted into the vacated bit.
Ex: R1 <= shl R1, R2 <= shr R2
Circular shift (or rotate operation): 1-bit shift without
loss of information. Ex: R <= cil R, R <= cir R (circular shift left/right of register R)
Arithmetic shift: Shifts a signed binary number to the left or right.
Ex: R <= ashl R, R <= ashr R (arithmetic shift left/right of R)
Ref. Fig.: Arithmetic shift right.
ashl => multiply by 2; ashr => divide by 2
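The shift types differ only in what enters the vacated bit. A Python sketch, assuming 8-bit registers and 2’s complement signed values (the helper `to_signed` is introduced here for illustration):

```python
BITS = 8                      # assumed register width
MASK = (1 << BITS) - 1
SIGN = 1 << (BITS - 1)


def shl(r: int) -> int:
    """Logical shift left: 0 enters at the right."""
    return (r << 1) & MASK


def shr(r: int) -> int:
    """Logical shift right: 0 enters at the left."""
    return r >> 1


def cil(r: int) -> int:
    """Circular shift left: the bit leaving the left re-enters at the right."""
    return ((r << 1) | (r >> (BITS - 1))) & MASK


def cir(r: int) -> int:
    """Circular shift right: the bit leaving the right re-enters at the left."""
    return (r >> 1) | ((r & 1) << (BITS - 1))


def ashr(r: int) -> int:
    """Arithmetic shift right: the sign bit is kept."""
    return (r >> 1) | (r & SIGN)


def to_signed(r: int) -> int:
    """Interpret an n-bit pattern as a 2's complement number."""
    return r - (1 << BITS) if r & SIGN else r


r = 0b1001_0011
print(format(shl(r), "08b"))   # 00100110
print(format(shr(r), "08b"))   # 01001001
print(format(cil(r), "08b"))   # 00100111
print(format(cir(r), "08b"))   # 11001001
print(to_signed(r), to_signed(ashr(r)))  # -109 -55
```

Arithmetic shift left uses the same bit movement as `shl`; it multiplies by 2 only as long as the sign bit does not change, otherwise the result has overflowed.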
A simple arithmetic-logic unit. Select lines S0-S1 will select either arithmetic or logic operation. The
selected block will perform the operation. Each of the blocks can do FOUR operations (like AND, OR, NAND,
NOR) on input A and B.
CPU Organization
• Central role in the computer system and does the bulk of data processing operations.
• Executes programs (instructions) stored in the main memory by fetching the instructions, examining &
executing the instructions one by one.
• Supervises other system components, peripherals
• Has three parts: Register set, ALU, CU.
As the name suggests, the CPU is responsible for all activities and their synchronization. It fetches instructions
from memory and executes them in order. It also initiates data transfers with other system components and
peripherals. The CPU achieves synchronization by providing the proper signals to each of the components. It
is composed of on-chip register sets for temporary data storage, an Arithmetic and Logic Unit to carry out the
instructions and a Control Unit for synchronization.
The figure above shows the three units of the CPU: ALU, CU and register set. The external devices are the
(main) memory, disks (secondary memory) and peripherals (e.g. a printer). The external devices communicate
with the CPU over a common (shared) bus.
Memory may be further decomposed into RAM, cache, ROM, flash, etc. Similarly, the disk may be a floppy disk, hard
disk, CD drive, etc.
Instruction execution:
CPU steps in executing the instruction:
1. Fetch the next instruction from the memory into the instruction register (IR).
2. Update the Program Counter (PC).
3. Determine the type of instruction just fetched (decode).
4. If the instruction uses data in memory, determine their location
5. Fetch the data, if any, into the internal CPU registers.
6. Execute the instruction.
7. Store the result(s) in the proper place.
8. Go to step 1 to begin executing the next instruction.
The above sequence is referred to as the fetch-decode-execute cycle.
Instruction execution starts when the CPU reads the instruction stored in the memory location pointed to by the
program counter (PC), or instruction pointer (IP), into the instruction register. It updates the PC to point to the next
instruction, determines the kind of instruction, fetches the data from memory or registers, and then
operates on these data. The result may be stored back into memory or the appropriate registers. All
these steps are implemented by microcode and can be written in Register Transfer Language (RTL).
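The cycle can be illustrated with a toy accumulator machine in Python. Everything specific in this sketch is an assumption made for illustration: the opcode values for `LDA`, `ADD` and `STA`, the `HLT` word, the 16-bit word with a 12-bit address field, and the restriction to direct addressing.

```python
LDA, ADD, STA = 0b010, 0b001, 0b011   # assumed opcode values
HLT = 0x7001                          # assumed "halt" instruction word


def run(memory: list, pc: int = 0) -> int:
    """Run until HLT; return the final accumulator value."""
    ac = 0
    while True:
        ir = memory[pc]                   # 1. fetch into IR
        pc = (pc + 1) & 0xFFF             # 2. update PC
        if ir == HLT:                     # 3. decode
            return ac
        opcode = (ir >> 12) & 0x7
        address = ir & 0xFFF              # 4. locate the operand
        if opcode == LDA:
            ac = memory[address]          # 5-6. fetch data and execute
        elif opcode == ADD:
            ac = (ac + memory[address]) & 0xFFFF
        elif opcode == STA:
            memory[address] = ac          # 7. store the result
        else:
            raise ValueError(f"unsupported opcode {opcode:03b}")
        # 8. loop back to fetch the next instruction


program = [0] * 16
program[0] = (LDA << 12) | 10   # AC <= M[10]
program[1] = (ADD << 12) | 11   # AC <= AC + M[11]
program[2] = (STA << 12) | 12   # M[12] <= AC
program[3] = HLT
program[10], program[11] = 20, 22

print(run(program), program[12])  # 42 42
```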
Primary function of CPU: Main function of a processor is to execute sequences of instructions stored in a
memory called the main memory.
The sequence of operations involved in processing an instruction constitutes an instruction cycle, which can
be divided as two phases:
1. Fetch cycle: In the fetch cycle the instruction is obtained from the main memory
2. Execution cycle: The execution cycle includes decoding the instruction, fetching any required
operands, & performing the operation specified by the instruction’s opcode.
Determine the type of instruction:
The micro-operation for the indirect address condition is
AR <= M[AR]
Decoder output D7, together with flip-flop I, gives the mode of operation (refer to the figure on slide No. 14).
This can be symbolized as follows:
D7’ I T3: AR <= M[AR]
D7’ I’ T3: Nothing
D7 I’ T3: Execute a register-reference instruction
D7 I T3: Execute an input-output instruction
Secondary function of CPU:
In addition to executing programs, the CPU supervises the other system components.
For example, the CPU directly or indirectly controls IO operations such as data transfer between IO devices
& main memory.
These operations require the CPU’s attention only infrequently; such an IO request to the CPU is called an interrupt.
When an IO device (peripheral) wants the attention of the CPU, it raises an interrupt. In response to the
interrupt, the CPU performs the following tasks (if that interrupt is enabled and can be serviced):
• Notes the current location of instruction execution
• Services the interrupt, giving the proper acknowledgement
• Continues execution from where it had stopped.
Basic CPU organization: All CPU designs have been based on the following two premises.
1. The CPU should be as fast (measured by its cycle time tCPU) as the available
technology permits. Since cost increases with speed, the number of components in the CPU must be
small.
2. A main memory of large capacity is needed to store the programs and data required by the CPU. It is built from
slower technology (cycle time tM) than the CPU, so that tCPU < tM. Its cost is proportional to its size.
3. The CPU contains a minimal set of registers for temporary storage of instructions and operands.
4. Instructions whose operands are in CPU registers can be executed more quickly than instructions whose operands
are in the main memory.
A simple Accumulator - based CPU.
Here the CPU register called the Accumulator (AC) plays the central role, being used to store an input or
output operand (result) in the execution of most instructions; hence the name.
Refer Fig a. A simple accumulator based CPU.
Fig b. Operation of the CPU.
(On the next two pages)
The CPU can be either a simple accumulator-based CPU or a general register organization CPU.
In an accumulator-based CPU, all operations are done with respect to the accumulator. Data is fetched
and stored in a temporary register.
In a general register organization, operands can be stored in general-purpose registers. In some cases, the
registers can be used to do some arithmetic and/or logic operations.
The accumulator-based CPU has an ALU and a register called the accumulator, and normally does not
have general-purpose registers. In their absence, the CPU uses load and store
instructions extensively. Thus, such designs are sometimes called LOAD/STORE architectures.
• In the general register organization there is a register set (16 registers).
• Operands to ALU come from this register set and the results are also stored in the same set.
Control Unit
Data processing unit is logically reconfigured by the control unit to perform certain sets of (micro) operations.
Hardwired control:
Advantage: Produces fast operation.
Disadvantage: Modification is tedious.
Microprogrammed control has the advantages that it can be modified easily, is easy to implement and allows the
control logic to be simulated. However, it is slower and occupies a large amount of silicon area.
A typical hardwired control unit:
(1) The block diagram consists of two decoders, a sequence counter & a number of control logic gates. Ref. Fig.
in the notes page.
(2) Timing signals: The sequence counter (SC) can be incremented to provide a sequence of timing signals.
Ex: At time T4, SC is cleared to 0 if decoder output D3 is active.
This can be given by the statement
D3T4: SC <= 0
Micro-programmed control:
• Control memory: - control word.
-microinstruction.
-microprogram.
-control memory.
-control address register.
-sequencer.
-pipeline register & hardwired control.
Address Sequencing: This is the method of sequencing (fetching) the next instruction. Depending on the type of
the instruction that is currently being executed, sequencing can be of the type:
-routine.
-mapping.
-conditional branching.
-mapping of instruction.
-subroutines & register.
A Full adder can be taken as an example to illustrate working of a simple computer. Depending on the
different combinations of inputs A, B and Carry-in Cin, it can perform different tasks.
Here we illustrate the different operations possible with a full adder. We neglect the carry-out bit in this
example. In each case, "B input" is the value applied to the adder’s second input.
1. With B input = B, Cin = 0: Y = A + B. It acts as an adder.
2. With B input = B, Cin = 1: Y = A + B + 1. This is the add-with-carry operation.
3. With B input = 0, Cin = 0: Y = A. The full adder acts as a buffer.
4. With B input = 0, Cin = 1: Y = A + 1. This is the increment operation.
5. With B input = all 1’s, Cin = 0: Y = A – 1. This is the decrement operation. Note that adding all 1’s to a number decrements
its value by one (neglecting the carry).
6. With B input = all 1’s, Cin = 1: Y = A. It is a buffer. The previous case (5) decremented A by
1; with Cin = 1 the result is A – 1 + 1, so the value of A is restored, giving the buffer
operation.
7. With B input = B’ (1’s complement of B), Cin = 0: Y = A + B’ (1’s complement subtraction; read as a 2’s complement number, the result is A – B – 1).
8. With B input = B’, Cin = 1: Y = A + B’ + 1 = A – B. This is subtraction using the 2’s complement.
We may need some extra circuitry at the B input to generate these values. For example, we can pass B through XOR gates
to get either B or B’. Similarly, we can pass B through OR gates to generate B or all 1’s, and through AND gates to generate B
or 0.
In practice, we need a combination of all these to generate all the required values at the B input.
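The eight cases can be reproduced with a single n-bit adder whose second input and carry-in are chosen by a selector. This Python sketch assumes a 4-bit adder and sample operands A = 6 and B = 3; "B = 1" in the list is taken to mean that every bit of the second input is 1. With the complemented input and Cin = 0 the modular result is A - B - 1, which is why Cin = 1 is needed to obtain A - B.

```python
BITS = 4                      # assumed adder width
MASK = (1 << BITS) - 1
ONES = MASK                   # second input with every bit set to 1


def adder(a: int, b_in: int, cin: int) -> int:
    """n-bit parallel adder; the carry out of the top bit is dropped."""
    return (a + b_in + cin) & MASK


a, b = 0b0110, 0b0011          # 6 and 3
b_not = ~b & MASK              # 1's complement of B

cases = [
    ("A + B",      b,     0),
    ("A + B + 1",  b,     1),
    ("A",          0,     0),
    ("A + 1",      0,     1),
    ("A - 1",      ONES,  0),
    ("A",          ONES,  1),
    ("A - B - 1",  b_not, 0),
    ("A - B",      b_not, 1),
]
results = [adder(a, b_in, cin) for _, b_in, cin in cases]
print(results)  # [9, 10, 6, 7, 5, 6, 2, 3]
```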
Consider a simple computer that can perform arithmetic operations such as add, subtract, add with carry, etc.
on data available in memory locations (or CPU registers). We need to generate the signals for the B input (B,
B’) along with the carry (C = 0 or 1). Moreover, we need to select the two memory locations where the data is
stored. The figure above shows the schematic for doing these tasks.
The control unit generates the memory address for the first variable A, which is latched (not shown in
the figure). It then selects the second variable B and sends the control signals to the selector circuit for the appropriate
operation (add, subtract, increment, etc.). The selector passes the signal B along with C in the required
form. The adder takes the two data values A and B along with C to carry out the required operation.
Summary
– Building blocks of CPU registers (MAR,PC,IR etc)
– Register Transfer Language
– Bus and Memory transfer
– Micro-operations – Arithmetic and logic
– CPU organization
– Control unit – Hardwired and microprogrammed
– Simple Control Unit
Memory Organization
Module 4
Objectives
• List memory types
• Discuss magnetic, optical (CD-ROM) and semiconductor memories, ROMs and RAMs
• Know cache memories and flash memories
Memory Organization
1. Internal processor memory: This consists of a small set of high-speed registers used as a working memory
for temporary storage of instructions & data.
2. Cache memory: It is a very high speed memory. It allows the CPU to access information quickly.
3. Main memory: The memory unit directly communicates with CPU. This is a relatively large fast memory
used for program & data storage during computer operation.
4. Auxiliary memory: Devices that provide backup storage are called auxiliary memory.
Ex: Hard disks, CD-ROMs, magnetic tapes.
Programs and the data are stored in the main memory of the computer during execution. The execution
speed of programs is highly dependent on the speed with which instructions and data can be transferred
between the CPU and the main memory. It is also important to have a large memory, to facilitate execution
of programs that are large and deal with huge amounts of data.
Ideally, the memory should be fast, large and inexpensive. Unfortunately, these goals conflict with one another
and, as such, it is impossible to meet all three of these requirements simultaneously. Increased speed and
size are achieved at increased cost. To solve this problem, much work has gone into developing clever
structures that improve the apparent speed and size of the memory, yet keep the cost reasonable.
As you go up in the hierarchy, the cost and speed increase and the size decreases.
A very fast memory can be achieved if SRAM chips are used. But these chips are expensive because their
basic cells have six transistors. Thus for cost reasons, it is impractical to build a large memory using SRAM
chips. The only alternative is to use DRAM chips, which have much simpler basic cells and thus are much
less expensive. But such memories are significantly slower.
Although DRAMs allow main memories in the range of tens of megabytes to be implemented at a reasonable
cost, the affordable size is still small compared to the demands of large programs with voluminous data. A
solution is provided by using secondary storage, mainly magnetic disks, to implement large memory spaces.
Very large disks are available at a reasonable price, and they are used extensively in computer systems.
However, they are much slower than the main memory unit. So we conclude the following:
Cost-effective storage can be provided by magnetic disks. A large, yet affordable, main memory can be built
with DRAM technology. This leaves SRAMs to be used in smaller units where speed is of the essence, such
as in cache memories.
All of these different types of memory unit are employed effectively in a computer. The entire computer
memory can be viewed as the hierarchy depicted in figure above.
The figure in the slide shows the physical arrangement of the memory devices.
Cost: We define the cost c of the memory as follows:
c = C/S dollars/bit, where C is the price in dollars and S is
the storage capacity in bits.
Thus c is the cost of each bit of memory.
Access time (tA): The average time required to read a
fixed amount of information, e.g., one
word, from memory.
The access rate of the memory is defined as 1/tA & is measured in
words per second.
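Both definitions are simple ratios, as the following Python sketch shows. The price, capacity and access time are assumed figures for illustration only, not data about any real device.

```python
def cost_per_bit(price_dollars: float, capacity_bits: int) -> float:
    """c = C / S, in dollars per bit."""
    return price_dollars / capacity_bits


def access_rate(access_time_s: float) -> float:
    """1 / tA, in words per second."""
    return 1.0 / access_time_s


# Assumed figures, for illustration only.
C = 64.0                      # price in dollars
S = 8 * 2**30                 # capacity in bits (1 GiB)
t_a = 50e-9                   # average access time: 50 ns

print(cost_per_bit(C, S))     # dollars per bit
print(access_rate(t_a))       # words per second, about 20 million
```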
Access mode: An important property of a memory device is the order or sequence in which information can be
accessed.
– If locations can be accessed in any order & the access time is independent of the location being accessed,
then the memory is called a random-access memory (RAM).
Ex: Ferrite-core & semiconductor memories.
– Memories where storage location can be accessed only in certain predetermined sequences are called
serial-access memories.
Ex: Magnetic-tape, Magnetic-bubble, & optical memories employ serial methods.
In the case of magnetic tapes, it may be necessary to move the tape to the specified location before the
data can be accessed. Hence the time to access the data will vary depending upon where the read head is
at present and the location where the data is to be accessed. Time to access the same data may be
different at different instants of time. On the other hand the semiconductor memories (typically) take the
same amount of time to access any location at any instant of time.
Storage mechanism: The physical processes involved in storage are sometimes inherently unstable, so the
stored information will be lost unless proper action is taken.
• Three important memory characteristics that can destroy the information.
i) Destructive readout
ii) dynamic storage
iii) volatility.
• Destructive Readout (DRO): Memories in which the method of reading a memory location destroys the
stored information.
• Non-Destructive Readout (NDRO): Memories in which reading does not affect the stored data.
• Static Memory: Memories that do not require periodic refreshing.
• Dynamic Memory: Memories that require periodic refreshing.
– Refreshing: In some memories, the stored information (in the form of charge) tends to leak away
(discharge) over a period of time, unless the charge is restored by periodic refreshing.
The restore can be done automatically by using a buffer register. In DRO memories, each read operation is
followed by a write operation that restores the original state of the memory.
• tA is the time between the receipt of a read request by the memory & the delivery of the requested
information to its external output terminals.
• Cycle time (tM) is the time needed to complete any read or write operation in the memory.
• In DRO & dynamic memories the memory state must be restored or refreshed after an access, so tM can be
longer than tA.
• Data-transfer rate: The maximum amount of information that can be transferred to or from the memory
every second is 1/tM; this quantity is called the data-transfer rate or bandwidth (bM).
Reliability: Measured by the mean time to failure (MTTF).
In general, memories with no moving parts have much higher reliability than memories such as magnetic
disks, which involve mechanical motion. Failure occurs due to wear and tear of the rotating devices.
• Volatile memory: Memories in which the stored information remains valid only as long as power is applied
to the unit are called volatile memories.
Ex: RAMs.
• Non-volatile memory: Memories that retain the stored information even when the applied power is removed
are called non-volatile memories.
Ex: ROMs, magnetic tapes, magnetic disks, optical disks.
Most semiconductor memories are volatile & most of the magnetic memories are nonvolatile.
RAM Organization: A general approach to reducing the access circuitry cost in RAM is by using matrix, or array,
organization.
• One-dimensional or 2-dimensional memory organization is used.
• The semiconductor & ferrite core can be used for memory array organization.
• Fig. below shows a RAM unit with input-outputs:
Normally the data outputs of the memory chip are not enabled, so the data lines are in the tri-state
(high-impedance) condition.
During a read operation, the CPU puts the address on the address bus; this address selects the chip (CS*),
and the CPU asserts (energizes) the read signal. The read signal enables the output of the chip (OE*), and
the data is placed on the bus. The CPU then removes the read signal and the address.
Similarly, for a write operation, the CPU puts the address and the data on the bus. The CPU also asserts the
WR* signal so that the data on the bus is latched into the memory location.
The above fig is a basic RAM unit. It has an address bus, a data bus and a control bus. Whenever data is to
be read from the RAM, the address of the data is placed on the address bus and the read line in the control
bus is activated. The address is decoded from the address bus, and the data in the location selected by the
address decoder is placed on the data bus by enabling the read driver through the timing and control circuit.
Similarly, when data is to be stored in the RAM unit, the address of the location where the data is to be
stored is placed on the address bus and the data on the data bus. The write line on the control bus is then
activated, so that the write driver is enabled. The address decoder enables the particular location in which
the data on the data bus is to be stored.
Communication between a memory and its environment is achieved through data input and output lines,
address selection lines, and control lines (Read and Write inputs) as shown in the figure. The n data input
lines provide the information to be stored in memory, and the n data output lines supply the information
coming out of the particular word chosen among the 2^k words available inside the memory, where k is the
number of address lines. The two control inputs, viz. Read and Write, specify whether the data is written
into the memory or read from the memory.
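The read and write sequences described above can be sketched as a small Python model. This is only an illustration under simple assumptions: the class name RamUnit and its methods are hypothetical, and timing, tri-state behaviour and chip select are not modelled.

```python
class RamUnit:
    """Toy model of a RAM with k address lines and n-bit words."""

    def __init__(self, k, n):
        self.k = k
        self.n = n
        self.words = [0] * (2 ** k)   # 2^k addressable words

    def _decode(self, address):
        # Address decoder: only addresses 0 .. 2^k - 1 select a word.
        if not 0 <= address < 2 ** self.k:
            raise ValueError("address out of range")
        return address

    def write(self, address, data):
        # Write line active: the data input lines are stored in the selected word.
        if not 0 <= data < 2 ** self.n:
            raise ValueError("data does not fit in n bits")
        self.words[self._decode(address)] = data

    def read(self, address):
        # Read line active: the selected word is placed on the data output lines.
        return self.words[self._decode(address)]


ram = RamUnit(k=4, n=8)        # 16 words of 8 bits each
ram.write(0b1010, 0x5A)
print(hex(ram.read(0b1010)))   # the value written above
```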
The 2D scheme is used to expand the memory for the same given number of address lines. Here the X and Y
addresses are of the same length.
Consider reading the data at location C(3,0). The read operation consists of first sending the X address,
which is 11 (in binary), on the address bus. Row 3 is now activated. To activate the first column, 00 is sent
on the address bus for the Y address decoder. Thus location C(3,0) is enabled for reading. Note that a read
signal is also sent to enable the read driver, which is not shown in the figure.
The same steps are used for write operations.
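The C(3,0) example can be traced with a short Python sketch. It assumes, purely for illustration, that a flat address carries the X (row) bits in its upper half and the Y (column) bits in its lower half; the function names are hypothetical.

```python
def split_address(address, x_bits, y_bits):
    """Split a flat address into (row, column) for the X and Y decoders."""
    if not 0 <= address < (1 << (x_bits + y_bits)):
        raise ValueError("address out of range")
    row = address >> y_bits                 # upper bits go to the X decoder
    col = address & ((1 << y_bits) - 1)     # lower bits go to the Y decoder
    return row, col


def one_hot(value, width):
    """Decoder output: exactly one of the 2^width select lines is active."""
    return [1 if i == value else 0 for i in range(2 ** width)]


# Cell C(3,0) in a 4 x 4 array: X address 11, Y address 00.
row, col = split_address(0b1100, x_bits=2, y_bits=2)
print(row, col)           # 3 0
print(one_hot(row, 2))    # [0, 0, 0, 1] -> row 3 active
print(one_hot(col, 2))    # [1, 0, 0, 0] -> column 0 active
```

With two 2-bit decoders, only 4 + 4 select lines are needed to pick one of 16 cells, which is the saving in access circuitry that the array organization aims for.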
Random Access: Access time is same for all locations, i.e., access can be made randomly irrespective of the
locations.
• SRAM: Static Random Access Memory
– Low density, high power, expensive, fast
– Static: contents last “forever” (until power down)
• DRAM: Dynamic Random Access Memory
– High density, low power, cheap, slow
– Dynamic: needs to be “refreshed” regularly
The technology used to build the memory hierarchy can be divided into two categories: Random Access and
Not-so-Random Access.
Unlike other aspects of life, where the word random is usually associated with bad things, random, when
associated with memory access, is, for lack of a better word, good!
Random access means you can access any random location at any time, and the access time will be the same
as for any other random location.
This is NOT the case for disks or tape, where the access time for a given location at a given time can be
quite different from that for some other random location at some other time.
As far as Random Access technology is concerned, we will concentrate on two specific technologies:
Dynamic RAM and Static RAM.
The advantages of Dynamic RAMs are high density, low cost, and low power, so we can have a lot of them
without burning a hole in our budget or our desktop.
The disadvantage of DRAMs is that they are slow. Also, they will forget what you tell them if you don’t
remind them constantly (Refresh).
SRAMs have one main redeeming feature: they are fast. Other than that, they have low density, are expensive,
and burn a lot of power. SRAMs do have another redeeming feature: they will not forget what you tell them.
They keep whatever you write to them until it is overwritten, as long as the power supply is available.
ROM: Memories whose contents cannot be altered are called read-only memories.
• A ROM is a non-erasable storage device.
• ROMs are widely used for storing control programs such as micro-programs.
PROMs: ROMs whose contents can be changed (usually off-line & with some difficulty) are called
Programmable Read-Only Memories (PROMs).
Memory organization ROMs
ROMs can be realized by simple combinational logic, or simply by a grid of lines with a link across the
grid wherever a logic 1 is needed and the link cut wherever a logic 0 is needed.
For example, in the figure above, suppose row 2 is activated by putting logic 1 on the line. This enables
the memory elements wherever the connection is closed and passes Vcc (logic 1) to the output. Wherever
the connection is open, the memory element is disabled and the output is low, or logic 0.
Memory organization PROMs

• One-time user-Programmable ROM
• Suitable for lower volume applications
• Internal structure similar to ROM
• Has a fusible link for each memory element that can be permanently blown, depending on the bit to be
stored in the element
• Can be loaded with user data using special equipment called a PROM programmer.
• Lacks re-programmability
PROM programmers can be connected to a PC. A program can be developed and tested on the PC, then
downloaded to the PROM programmer to program the PROM.
Memory organization DRAM

• MOS Capacitor based RAMs
– Data stored as a charge on the capacitor
– Hence has the tendency to discharge and lose the data
– Needs periodic refreshing of the charge
• Density is 4 times that of SRAM due to the simple Cell structure
– Consume less power
– Cost is considerably lower than SRAM
• Address inputs need to be handled in a more complex manner than in SRAM
– Requires complex timing control in order to take care of the refreshing
– Slower than SRAMs in access speed
Used in the next level of memory after cache in computers. In the current scenario, the RAM of a computer
is built from DRAMs (Dynamic Random Access Memories). The reason is low cost and high density.
Digital data stored on a capacitor in a DRAM tends to discharge. Hence we need to refresh DRAMs from
time to time. This increases the effective access time of DRAMs.
DRAM consists of a capacitor, buffer and switches. We need minimum of two switches S1 and S3 for write
and read respectively. Switches S2 and S4 ensure the read operation also refreshes the capacitor.
A typical DRAM has n address inputs and can address 2^(2n) locations. This is possible by multiplexing the
address as row and column. The row and column addresses are selected by two special lines, Row Address
Select (RAS) and Column Address Select (CAS). Separating the address into row and column addresses also
eases DRAM refreshing. We can refresh an entire row (or column) at a time.
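The address multiplexing described above can be sketched in Python as follows. The choice of which half of the address is sent first as the row is an assumption made for the example, and the function names are hypothetical.

```python
def dram_capacity(n_address_pins):
    """With row and column sharing n pins, 2^(2n) cells are addressable."""
    return 2 ** (2 * n_address_pins)


def multiplex(address, n_address_pins):
    """Return the two values placed on the pins: row (with RAS), then column (with CAS)."""
    mask = (1 << n_address_pins) - 1
    row = (address >> n_address_pins) & mask
    col = address & mask
    return row, col


n = 10
print(dram_capacity(n))                          # 1048576 cells from 10 pins
print(multiplex(0b1111100000_0000011111, n))     # (992, 31)

# Refreshing row by row needs 2^n refresh operations, not one per cell.
rows_to_refresh = 2 ** n
print(rows_to_refresh)                           # 1024
```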
• Refreshing is an essential event in DRAM
– Needs to be performed at regular intervals in order to ensure Data integrity
• DRAM Controller on board/on chip is an essential part of the system
– Controls refresh operations of DRAM
• One of the first things the init code must do is:
– Initialize the DRAM controller
• Consult board/chip designer to determine correct initialization sequence.
One of the first things your software must do is initialize the DRAM controller. If you do not have any other
RAM in the system, you must do this before creating the stack or heap. As a result, this initialization code
is usually written in assembly language and placed within the hardware initialization module.
Almost all DRAM controllers require a short initialization sequence that consists of one or more setup
commands. The setup commands tell the controller about the hardware interface to the DRAM and how
frequently the data there must be refreshed. To determine the initialization sequence for your particular
system, consult the designer of the board or read the data books that describe the DRAM and DRAM
controller. If the DRAM in your system does not appear to be working properly, it could be that DRAM
controller either is not initialized or has been initialized incorrectly.
Memory organization (Contd.).

Auxiliary memory:
Magnetic Disks: A magnetic disk is a circular plate constructed of metal or plastic coated with magnetized
material.
– Bits are stored on the magnetized surface in concentric circles called tracks.
– Tracks are divided into sectors.
• Disks may have multiple read/write heads.
• Disks that are permanently attached are called hard disks.
• Removable disks are called floppy disks.
In magnetic memories, data is stored as a north pole or a south pole on the magnetic device to represent
logic 1 and 0. Magnetic memories are slower than semiconductor memories, but have a higher density
and lower cost. Moreover, the data stored in a magnetic device is not lost when the power is
removed from the device.
Magnetic memories form the secondary memories in the system. Hard disks, floppy disks, tapes etc. are
examples of magnetic memories.
A magnetic device is divided into a number of tracks. Each track is further divided into sectors. Each sector
holds a block of data whose size is a power of two (2^n bytes). The figure above shows a sector size of 4096
(4K) bytes. This is sandwiched between preamble bits and post-amble bits. When we write data that is more
than 4K, we require a second sector. The preamble and post-amble bytes act as the link between the first
and the subsequent sectors.
The data written occupies one full sector even when it is less than 4K. Thus partitioning the disk into
sectors may seem to waste memory space, but it speeds up data access. In the absence of sectors, locating
and retrieving the data would be complicated.
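The space cost of sectoring can be worked out with a few lines of Python. The sketch below uses the 4096-byte sector size quoted above; the file sizes are made-up values for illustration.

```python
SECTOR_BYTES = 4096   # sector size from the figure described above


def sectors_needed(data_bytes):
    """Whole sectors occupied by data of the given size (integer ceiling)."""
    return (data_bytes + SECTOR_BYTES - 1) // SECTOR_BYTES


def wasted_bytes(data_bytes):
    """Unused space in the last, partly filled sector."""
    return sectors_needed(data_bytes) * SECTOR_BYTES - data_bytes


for size in (100, 4096, 5000):
    print(size, sectors_needed(size), wasted_bytes(size))
# 100  -> 1 sector,  3996 bytes unused
# 4096 -> 1 sector,  0 bytes unused
# 5000 -> 2 sectors, 3192 bytes unused
```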
We can have more than one disk on the spindle. Each disk can store data on both of its sides, which
increases the capacity and speeds up operations. The figure above shows four disks and eight heads. Data
can be read from only one of the heads at a time.
Photo of Disk Head, Arm, Actuator
State of the Art: Seagate Cheetah 36
36.4 GB, 3.5 inch disk
12 platters, 24 surfaces
10,000 RPM
18.3 to 28 MB/s internal media transfer rate
9772 cylinders (tracks), (71,132,960 sectors total)
Avg. seek: read 5.2 ms, write 6.0 ms (Max. seek: 12/13)
$2100 or 17MB/$ (6¢/MB)
0.15 ms controller time
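A few derived figures can be estimated from the data above. The Python sketch below is a back-of-the-envelope calculation, not a manufacturer specification: it assumes 512-byte sectors (the sector size is not stated in the list) and models the average read access as seek time plus half a revolution plus controller time, ignoring the transfer itself.

```python
RPM = 10_000
AVG_SEEK_READ_MS = 5.2
CONTROLLER_MS = 0.15
TOTAL_SECTORS = 71_132_960
SECTOR_BYTES = 512          # assumption: not stated in the list above

# One revolution takes 60/RPM seconds; on average the head waits half a turn.
rotation_ms = 60_000 / RPM
avg_rotational_latency_ms = rotation_ms / 2

# Average time to reach the data for a read, ignoring the transfer itself.
avg_access_ms = AVG_SEEK_READ_MS + avg_rotational_latency_ms + CONTROLLER_MS

capacity_gb = TOTAL_SECTORS * SECTOR_BYTES / 1e9

print(f"rotational latency: {avg_rotational_latency_ms:.2f} ms")
print(f"average read access: {avg_access_ms:.2f} ms")
print(f"capacity: {capacity_gb:.1f} GB")
```

Under the 512-byte assumption the computed capacity agrees with the 36.4 GB quoted for the drive, and the access time of several milliseconds shows why disks sit so far below main memory in the hierarchy.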
Memory Organization - CD ROM
CD-ROM
• Disks can only be written once (re-writable devices are available now)
• Data is encoded and read optically with a laser
• Storage capacity: 650MB
• Least expensive to produce
• (~$5 per disk, i.e. less than 1¢/MB)
• 30 years safe storage
• Drives come in various speeds, and with different disk-loading mechanisms
• Digital data is represented as a series of Pits and Lands.
Pit = a little depression, forming a lower level in the track
Land = the flat part between pits, or the upper level in the track
Reading a CD is done by shining a laser at the disc and detecting changing
reflection patterns.
1 = change in height (land to pit or pit to land)
0 = a “fixed” amount of time between 1’s
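The pit/land rule above can be sketched in Python. This is a simplified model that samples the track once per bit period; the additional channel coding used on real discs is not modelled, and the sample track is invented for the example.

```python
def decode_track(levels):
    """Decode a track sampled once per bit period ('P' = pit, 'L' = land).

    A change of level between two consecutive samples reads as 1,
    no change reads as 0.
    """
    bits = []
    for previous, current in zip(levels, levels[1:]):
        bits.append(1 if previous != current else 0)
    return bits


track = "LLLPPPPLLP"
print(decode_track(track))   # [0, 0, 1, 0, 0, 0, 1, 0, 1]
```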
Memory Organization - Magnetic Tape

2. Magnetic tape: The tape is a strip of plastic coated with a magnetic recording medium. Bits are recorded as
magnetic spots on the tape.
Fuji film DDS-4 tapes
Magnetic tape uses a method similar to that of VCR tape for storing data. The recording medium is a thin
tape with a coating of fine magnetic material, used for recording analog or digital data. Data is
stored in frames across the width of the tape. The frames are grouped into blocks or records, which are
separated from other blocks by gaps. Magnetic tape is a serial access medium, similar to an audio cassette,
and so data (like the songs on a music tape) cannot be located quickly.
However, large amounts of information can be stored on magnetic tape. This characteristic has prompted
its use in the regular backing up of hard disks.
Cache Memory
Definitions
• Dictionary meaning: a. A hiding place used especially for storing provisions. b. A place for concealment and
safekeeping, as of valuables. c. A store of goods or valuables concealed in a hiding place
• Computer science meaning: A fast storage buffer in the central processing unit of a computer.
• Cache: High-speed memory, logically placed between CPU and main memory.
• Works on the locality of reference
• At a given instant of time, memory is accessed within a small neighborhood.
• This block of memory can be stored in a small high-speed memory rather than in normal RAM
• This high-speed memory is the cache memory
When a CPU fetches an instruction, the next instruction will normally be in the neighborhood of this
instruction, except when a branch is taken. This is known as locality of reference. Memory references to
data also show locality of reference, though to a lesser extent than instructions (code). Examples are
lookup tables, matrices, iterative procedures etc. Thus, during a given span of time, memory access
is localized. This portion of memory can be placed in a high-speed memory to speed up the operation of the
program.
Cache memory access time is less than the access time of main memory by a factor of 5 to 10. It is the
fastest component in the memory hierarchy, second only to the CPU registers.
Cache Operation
• When the CPU needs to access memory, it checks the cache first.
• If the data is found in cache (cache hit ), it is read.
• Else (cache miss) data is accessed from main memory.
– A block of data from the required location is then transferred to the cache memory.
• Cache performance is measured in hit ratio.
When the data needed is available in the cache, we say a hit has occurred; otherwise a miss has occurred.
The hit ratio is the ratio of the number of hits to the total number of memory accesses. Hit ratios of 90%
and above are reported.
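The hit ratio definition can be turned into a small Python calculation. The average access time formula below is a common simplified model (hits cost the cache time, misses cost the main memory time) and is an assumption added here, as are the example counts and timings.

```python
def hit_ratio(hits, total_accesses):
    """Hit ratio = number of hits / total number of memory accesses."""
    return hits / total_accesses


def average_access_time(h, t_cache_ns, t_main_ns):
    """Simple model: a hit costs t_cache, a miss costs t_main."""
    return h * t_cache_ns + (1 - h) * t_main_ns


h = hit_ratio(hits=900, total_accesses=1000)

# Hypothetical timings: cache 10 ns, main memory 100 ns (a factor of 10).
t_avg = average_access_time(h, 10, 100)
print(f"hit ratio = {h:.2f}, average access time = {t_avg:.1f} ns")
```

With a 90% hit ratio the average access time in this model is much closer to the cache time than to the main memory time, which is why a small cache pays off.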
Cache swapping
• Whenever cache miss occurs, a block of data must be transferred from main memory to the cache.
• To make room, a block of data already residing in the cache has to be written back to main memory
• It requires a mapping algorithm to implement the above tasks.
• Moreover we need a search algorithm to check if the data is in cache
– We use associative memory to search a cache memory.
Associative memories are special memory devices that access the information associated with the data.
Cache Memory Mapping

Mapping: The transformation of data from main memory to cache is referred to as the mapping process.
Three types of mapping:
1. Associative mapping.
2. Direct mapping.
3. Set-associative mapping.
A small portion of the main memory resides in the cache memory. Same portion in cache may be occupied
by different sets of data at different instants of time. So there must be mapping technique to map the
address from the main memory to the cache memory. Three different techniques are proposed to accomplish
mapping.
• Example of cache memory. The cache memory is logically closer to CPU - implies faster access.
Cache Memory - Mapping

• Hardware part: Block diagram of associative memory used in Associative mapping
In the regular memory we provide an address and the data at that location is accessed. In an associative
memory, we provide data and the information associated with that data is accessed if that data is
available.
The block diagram of an associative memory is shown in the Fig. above. It has a memory array and logic for
m words with n bits per word. The argument register A and key register K each have n bits, one for each bit
of a word. The match register M has m bits, one for each memory word. Each word in memory is compared
in parallel with the content of the argument register. The words that match the bits of the argument
register set a corresponding bit in the match register. After the matching process, those bits in the match
register that have been set indicate the fact that their corresponding words have been matched. Reading
is accomplished by a sequential access to memory for those words whose corresponding bits in the match
register have been set.
The key register provides a mask for choosing a particular field or key in the argument word. The entire
argument is compared with each memory word if the key register contains all 1’s. Otherwise, only those bits
in the argument that have 1’s in the corresponding positions of the key register are compared. Thus the key
provides a mask or identifying piece of information which specifies how the reference to memory is made.
Ex:
A        101 111100
K        111 000000
Word 1   100 111100   no match
Word 2   101 000001   match
Word 2 matches the unmasked argument field because the three leftmost bits of the argument and the word
are equal.
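The masked comparison in this example can be reproduced in Python. The sketch below is a software illustration of what the associative memory hardware does in parallel; the function name matches is hypothetical.

```python
def matches(argument, key, word):
    """True if word equals argument in every bit position where key has a 1."""
    return ((argument ^ word) & key) == 0


A = 0b101_111100          # argument register
K = 0b111_000000          # key register: compare only the three leftmost bits
words = [
    0b100_111100,         # Word 1
    0b101_000001,         # Word 2
]

# Match register M: one bit per memory word.
M = [1 if matches(A, K, w) else 0 for w in words]
print(M)   # [0, 1] -> only Word 2 matches
```

The XOR marks the bit positions where the argument and the word differ, and the AND with the key discards differences in masked positions.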
Replacement algorithm
• Whenever a cache miss occurs and the cache is full, an existing block has to be replaced.
• We have two popular algorithms
• Least Recently Used (LRU)
• Least Frequently Used (LFU)
The LRU algorithm assumes that the block of data which has not been used in the recent past is not
needed, and replaces that block.
LFU assumes that a block of data that is not frequently used may not be needed for further accesses. This
block is replaced.
Both algorithms need extra hardware (not shown in our diagrams) to keep track of when, or how often, the
data was accessed.
Other algorithms are First-In First-Out (FIFO) and random replacement.
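A minimal Python sketch of LRU replacement is given below. It models a fully associative cache of a few blocks in software using an ordered dictionary; real caches keep this bookkeeping in hardware, and the reference string is invented for the example.

```python
from collections import OrderedDict


def simulate_lru(block_references, capacity):
    """Count hits and misses for a cache that holds `capacity` blocks."""
    cache = OrderedDict()            # oldest (least recently used) entry first
    hits = misses = 0
    for block in block_references:
        if block in cache:
            hits += 1
            cache.move_to_end(block)         # mark as most recently used
        else:
            misses += 1
            if len(cache) >= capacity:
                cache.popitem(last=False)    # evict the least recently used block
            cache[block] = True
    return hits, misses


refs = [1, 2, 3, 1, 2, 4, 1, 2]
print(simulate_lru(refs, capacity=3))   # (4, 4)
```

In this trace block 3 is the one evicted when block 4 arrives, because blocks 1 and 2 were both used more recently.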
Writing to the Cache

• When data in the cache is written, main memory must eventually be updated as well.
• Two popular techniques are used
• Write through: Whenever there is a write operation, main memory is also updated
• Write back: Main memory is written only when the block is swapped out.
– This requires a special bit in the cache to indicate that the block has been changed
– This special bit is called the dirty bit
Write through is simple to implement. But writing to main memory every time the data changes slows down
the operation. The problem is worse when the same memory location is altered frequently, for example a
count in a loop.
Write-back writes to main memory only during swapping. This makes the operation faster. The limitation is
that the entire block is written to main memory even if only a single location has changed.
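The trade-off between the two policies can be illustrated by counting main memory writes in Python. This is a deliberately small model under stated assumptions: the cache holds a single block, all dirty data is flushed at the end, and the block size of 4 words is made up.

```python
def write_through_traffic(writes):
    """Every write to the cache is also sent to main memory."""
    return len(writes)


def write_back_traffic(writes, block_size):
    """Words written to main memory when only dirty blocks are swapped out."""
    memory_word_writes = 0
    current_block = None
    dirty = False
    for address in writes:
        block = address // block_size
        if block != current_block and dirty:
            memory_word_writes += block_size   # swap out the whole dirty block
        current_block = block
        dirty = True                           # the write sets the dirty bit
    if dirty:
        memory_word_writes += block_size       # final flush
    return memory_word_writes


loop_counter_writes = [100] * 1000             # same location updated in a loop
print(write_through_traffic(loop_counter_writes))    # 1000
print(write_back_traffic(loop_counter_writes, 4))    # 4

single_write = [100]
print(write_through_traffic(single_write), write_back_traffic(single_write, 4))   # 1 4
```

The loop counter case favours write back, while the single write shows its limitation: a whole block goes to main memory for one changed word.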
Flash Memory
• Latest of the re-programmable ROM family
– Named so because of its rapid erase and write times
– Typical erase and write times – 10ms
• Possesses both bulk erase and sector erase features
– Doesn’t have the Byte-by-Byte re-programming feature
• Compromise between
– EPROMs’ low cost and high densities
– EEPROMs’ high re-programming speeds
• Comes in two flavors
– NOR flash : Similar in read - operation to SRAM
– NAND flash: Read block wise
• NOR more popular
• Consists of segments
– Segments need not be of uniform size
Flash memory primarily comes in two flavors, NOR and NAND. Reading NOR flash is essentially like reading
SRAM. You can read values from random addresses, with the whole of its address space visible. You can
execute code directly from NOR Flash, since it looks like SRAM (this is often referred to as Execute-In-Place
or XIP).
Flash – To be noted
Erase before write – otherwise ?
Nice about Flash
– Non Volatile
– Faster reads
– Lock
• Protection
– High density
Current Memory Hierarchy
Who Cares About the Memory Hierarchy?
[Figure: performance (Y-axis) plotted against time (X-axis).]
Latency cliché: note that x86 didn’t have cache on chip until 1989.
Summary
• Memory types and the memory pyramid
• Random and sequential memories
• Magnetic, optical (CD-ROM) and semiconductor memories
• ROMs and RAMs
• DRAMs and refreshing DRAMs
• Cache memories
– Operations
– Hit, miss, hit ratio
– Mapping and associative memory
• Flash memories
Input Output Organization
Module 5
Objectives
• Discuss Peripherals, memory and IO mapping
• List Modes of Data transfer
• Know IO Processors
I/O Organization
Peripheral devices:
• I/O: The input-output subsystem of a computer, termed I/O, provides an efficient mode of communication
between the central system & the outside environment.
• Peripherals: I/O devices attached to the computer are also called peripherals. Common peripherals are
keyboards, display units & printers.
One of the basic features of a computer is its ability to exchange data with other devices. This communication
capability allows a user, for example, to enter a program and its data via the keyboard of a video terminal
and receive results on a display or a printer. A computer may be required to communicate with a variety of
equipment, such as video terminals, printers, and plotters, as well as magnetic disk and magnetic tape
drives. In addition to these standard I/O devices, a computer may be connected to other types of equipment.
For example, in industrial control applications, input to a computer may be the digital output of a voltmeter,
a temperature sensor, or a fire alarm. Similarly, the output of a computer may be a digitally coded
command to change the speed of a motor, open a valve, or cause a robot to move in a specified manner. A
general purpose computer should have the ability to deal with a wide range of device characteristics in
varying environments.
• I/O Interface: Due to differences in signal levels, speed, mode & word format between peripherals & the
CPU, I/O interfaces are required.
• I/O Bus & Interface Modules:
• I/O command: The processor selects an interface and issues a function code to it through the control
lines. The selected interface responds to the function code and executes it. These codes are called I/O
commands.
• Four types of I/O commands:
1. Control command: It is issued to activate or initialize the I/O device before any data transfer.
Ex: To rewind the tape, to start the tape moving in the forward direction.
2. Status: The command is used to test the various status conditions in the interface & peripheral.
Ex: The OS may check the status of the peripheral before a transfer is initiated; errors detected by the
interface are reported through its status register.
3. Output data: This command causes the interface to respond by transferring data from the bus into one
of its registers.
Ex: While sending data to tape, the processor checks the correct position of the tape with a status command
& then issues a data output command.
4. Input data: It is the opposite of the data output command. The interface receives data from the
peripheral and makes it available to the processor over the bus.
Memory & I/O Mapping

• Memory Mapping
– Assignment of addresses to memory registers in various memory chips in a system
– To ensure only one memory device is activated during each cycle
• What happens if more than one location is selected for the same address?
– To transfer data to/from the device, uP does a memory write/read
– Devices need direct access to the memory bus
Memory mapping is defined as the assignment of addresses to memory registers in various memory chips
in a system. The essence of memory mapping is to enable the microprocessor and the hardware device it
controls to share access to a specific range of memory addresses. To send data to the device, the
microprocessor simply moves the information into the memory locations exactly as if it were storing
something for later recall. The hardware device can then read those same locations to obtain the data.
Memory-mapped devices, of course, need direct access to your PC’s memory bus. Through this connection,
they can gain speed and operate as fast as the memory system and its bus connection allows.
I/O Mapping
• The way IO is connected to the CPU
• Memory Mapped IO
– Memory and IO Ports share the same set of addresses
– CPU Communicates with IO device the same way as it does with external memory
• IO Mapped IO
– IO has a separate address
– CPU employs IO Instructions
The CPU communicates with I/O devices in much the same way as it communicates with external memory.
The I/O devices are associated with addressable registers called I/O ports, to which the CPU can store a
word (an output operation) or from which it can load a word (an input operation). When all I/O data
transfers are implemented by memory-referencing instructions, the approach is called memory-mapped I/O.
This approach requires that memory locations and I/O ports share the same set of addresses, so an address
bit pattern that is assigned to memory cannot be assigned to an I/O port, and vice versa. Some CPUs employ
I/O instructions that are distinct from memory-referencing instructions. These instructions produce control
signals to I/O ports but not to memory locations. This approach is called I/O-mapped I/O.
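Memory-mapped I/O can be sketched in Python as a single store operation whose target depends only on the address. The address map below (256 words of RAM followed by one output port) and the class name Bus are hypothetical choices for the example.

```python
class Bus:
    """Toy system with one address space shared by RAM and an I/O port."""

    RAM_SIZE = 0x100          # hypothetical: addresses 0x00 .. 0xFF are RAM
    PORT_ADDRESS = 0x100      # hypothetical: one output port right above RAM

    def __init__(self):
        self.ram = [0] * self.RAM_SIZE
        self.port_output = []   # words the device has received

    def store(self, address, word):
        # The same store operation reaches RAM or the port; the address decides.
        if 0 <= address < self.RAM_SIZE:
            self.ram[address] = word
        elif address == self.PORT_ADDRESS:
            self.port_output.append(word)
        else:
            raise ValueError("no device at this address")


bus = Bus()
bus.store(0x10, 65)      # ordinary memory write
bus.store(0x100, 66)     # same operation, but the address selects the I/O port
print(bus.ram[0x10], bus.port_output)   # 65 [66]
```

In an I/O-mapped design the port would instead be reached through a separate operation with its own address space, so address 0x100 could be used by memory as well.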
Modes of Data Transfer

Data transfer to & from peripherals may be handled in one of the three possible modes.
1.Programmed I/O.
2.Interrupt-initiated I/O.
3.Direct Memory Access (DMA).
1. Programmed I/O:
– Programmed I/O operations are the result of I/O instructions written in the computer program.
– Each data item transfer is initiated by an instruction in the program.
– Here the CPU stays in a program loop until the I/O unit indicates that it is ready for data transfer.
– This is a time-consuming process since the processor is kept busy needlessly.
2. Interrupt- initiated I/O:
– This uses the interrupt facility and special commands to tell the interface to issue an interrupt
request signal when data are available from the device.
– Meanwhile, the CPU can proceed to execute another program while the interface monitors the
device.
Interrupts
• A signal informing a program that an event has occurred
• Normal Program execution temporarily suspended by
– Some external signal
– Special instruction in the program
• Calls a procedure which services the interrupt
• Execution returned to the interrupted program
Many peripheral devices such as serial interfaces, keyboards and real-time clocks need to be serviced
periodically. For example, incoming characters or keystrokes have to be read from the peripheral or the
current time value needs to be updated from a periodic clock source.
The two common ways of servicing devices are by polling and by using interrupts. Polling means that a
status bit on the interface is periodically checked to see whether some additional operation needs to be
performed, for example whether the device has data ready to be read. A device can also be designed to
generate an interrupt when it requires service. This interrupt interrupts normal flow of control and causes
an interrupt service routine (ISR) to be executed to service the device.
In general, it is advantageous to use interrupts when the overhead required by polling would consume a
large percentage of the CPU time or would complicate the design of the software
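The difference in overhead between polling and interrupts can be illustrated with a small Python simulation. Everything in it is a hypothetical model: the Device class, the tick-based clock and the period are invented for the example, and the interrupt is modelled simply as a call made when the device becomes ready.

```python
class Device:
    """Hypothetical device that has data ready every `period` ticks."""

    def __init__(self, period):
        self.period = period
        self.ready = False          # status bit on the interface

    def tick(self, t):
        if t % self.period == 0:
            self.ready = True

    def read(self):
        self.ready = False
        return "data"


def run_polling(ticks, period):
    dev = Device(period)
    status_checks = items = 0
    for t in range(1, ticks + 1):
        dev.tick(t)
        status_checks += 1          # CPU checks the status bit on every tick
        if dev.ready:
            dev.read()
            items += 1
    return status_checks, items


def run_interrupts(ticks, period):
    dev = Device(period)
    isr_calls = 0

    def isr():                      # interrupt service routine
        nonlocal isr_calls
        dev.read()
        isr_calls += 1

    for t in range(1, ticks + 1):
        dev.tick(t)
        if dev.ready:               # models the device raising its interrupt line
            isr()
        # otherwise the CPU is free to run other code
    return isr_calls


print(run_polling(1000, 100))       # (1000, 10): 1000 status checks for 10 items
print(run_interrupts(1000, 100))    # 10: CPU is involved only when data is ready
```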
Interrupt Sources
• Hardware Interrupts
• Non Maskable Interrupt NMI input pin
• Interrupt INTR input pin
• Edge triggered/level triggered interrupts
• Vectored/non-vectored interrupts
• Software Interrupts
• Execution of the interrupt instruction
• Conditional Interrupts e.g. Divide by Zero Interrupt.
• Maskable interrupts can be disabled by software; non-maskable interrupts cannot be disabled by
software
• Edge-triggered interrupts are recognized at either the rising or the falling edge of the interrupt signal;
level-triggered interrupts are recognized at the logic level (low or high) of the interrupt signal
• Vectored interrupts have a specific location for the ISR (Interrupt Service Routine); for non-vectored
interrupts, the requesting device should provide the address of the ISR
• Software interrupts such as divide by zero or unspecified instruction are also called exceptions.
Modes of Data Transfer

3. Direct Memory Access (DMA)
• The transfer of data between slow storage devices such as magnetic disks & main memory is often
limited by the access time of these devices.
• Removing the CPU from the path & allowing the peripheral device to manage the memory buses directly
improves the speed of transfer and saves CPU time.
This transfer technique is called Direct Memory Access (DMA).
During DMA transfer,
• The CPU is idle & has no control of the memory buses.
• The DMA controller manages the transfer directly between the I/O device & memory.
DMA Operations
When a DMA controller wants to access the bus, and hence memory, it raises a request to the CPU using the
HOLD request pin. Typically the CPU puts all its bus lines into tri-state and issues a Hold Acknowledge
(HLDA) signal to the DMA controller. The DMA controller now takes control of the bus and initiates the data
transfer. The CPU is idle during this time. After finishing the memory access, the controller puts its own
bus lines into tri-state and removes the hold signal to the CPU. The CPU removes the HLDA signal and takes
control of the bus itself.
• Two signals in the CPU that facilitate DMA transfer:
1. Bus request: The bus request (BR) input is used by the DMA controller to request the CPU to relinquish
control of the buses.
2. Bus grant: The CPU activates the bus grant (BG) output to inform the external DMA controller that the
buses are in the high-impedance state. The DMA controller can then take control of the memory buses without
intervention of the CPU.
• Types of DMA Transfer:
1. Burst transfer: A block of several words is transferred in a continuous burst while the DMA controller
is master of the memory buses. Burst transfer is best suited for fast devices like disks.
2. Cycle stealing: The DMA controller is allowed to transfer one data word at a time, after which it must
return control of the buses to the CPU.
DMA (Contd.).
• DMA Controller: Needs the usual circuits of an interface to communicate with the CPU and the I/O device.
In addition, it needs an address register, a word count register and a set of address lines.
• The address register and address lines are used for direct communication with the memory.
• The word count register specifies the number of words that must be transferred.
The CPU initializes the DMA controller by sending the following information:
1. The starting address of the memory block where data are available (for read) or where data are to be
stored (for write).
2. The word count, which is the number of words in the memory block.
3. A control to specify the mode of transfer, such as read or write.
4. A control to start the DMA transfer.
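The role of the address register and the word count register, and the difference between burst transfer and
cycle stealing, can be sketched in a few lines. This is a simplified model under stated assumptions: memory
is a Python list, only the device-to-memory direction is modelled, and the names `DMAController`, `program`
and `run` are invented for the example.

```python
class DMAController:
    """Toy model of a DMA controller moving words from a device into memory."""

    def __init__(self, memory):
        self.memory = memory   # main memory, modelled as a list of words
        self.address = 0       # address register
        self.word_count = 0    # word count register

    def program(self, start_address, word_count):
        # Corresponds to the CPU sending the starting address and the word count.
        self.address = start_address
        self.word_count = word_count

    def run(self, device_words, burst):
        """Transfer the programmed block; return how many times the bus was requested."""
        words = iter(device_words)
        bus_requests = 0
        while self.word_count > 0:
            bus_requests += 1  # assert BR and wait for BG from the CPU
            # Burst: move the whole block per grant. Cycle stealing: one word per grant.
            per_grant = self.word_count if burst else 1
            for _ in range(per_grant):
                self.memory[self.address] = next(words)
                self.address += 1
                self.word_count -= 1
            # BR is released here and the CPU regains the buses.
        return bus_requests


memory = [0] * 8
dma = DMAController(memory)
dma.program(start_address=2, word_count=4)
burst_requests = dma.run([10, 11, 12, 13], burst=True)
dma.program(start_address=0, word_count=2)
stolen_requests = dma.run([7, 8], burst=False)
```

In burst mode the bus is requested once for the whole block; with cycle stealing it is requested once per
word, which leaves the CPU on the bus between words.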
Input -Output Processor

• The computer may incorporate one or more external processors and assign them the task of communicating
directly with I/O devices, instead of having each interface communicate with the CPU.
• Processors that communicate with remote terminals over telephone and other media are called data
communication processors (DCP).
The overhead of communicating with peripherals is large in a computer system. This is due to the wide variety
of peripherals demanding different kinds of signals. Moreover, the peripherals are slow devices. Assigning
the task of communicating with peripherals to an I/O processor (IOP) frees the CPU to do more useful tasks.
• Block diagram of a computer with IOP.
• Commands :
– Instructions that are read from memory by an IOP are called commands, to distinguish them from
instructions that are read by the CPU.
• CPU-IOP Communication:
Memory unit acts as a message center where each processor leaves information for the other.
Summary
• Peripherals, memory and IO mapping
• Modes of Data transfer
– Programmed IO
– Interrupt driven
• Types of interrupts
– DMA
• IO Processors
Types of Computers
Module 6
Objectives
• List types of Computer
• Discuss IBM’s AS-400
Types of Computers
Computers can be generally classified by size and power as follows (though there is considerable overlap):
• Personal computer: A small, single-user computer based on a microprocessor.
• Workstation: A powerful, single/multi user computer. A workstation is like a personal computer, but it has
a more powerful microprocessor and, in general, a higher-quality monitor.
• Minicomputer: A multi-user computer capable of supporting up to hundreds of users simultaneously.
• Mainframe: A powerful multi-user computer capable of supporting many hundreds or thousands of users
simultaneously.
• Supercomputer: An extremely fast computer that can perform hundreds of millions of instructions per
second.
The boundaries between these classes are thin. For example, whatever workstations could do a few years
ago, personal computers can do now. Personal computers can also be configured as minicomputers to
support a multi-user environment (with slightly reduced performance). For example, by loading an OS like
Windows NT or Linux on a PC, more than one user can connect to it in a multi-user environment.
Supercomputers are typically application specific and are useful for heavy number-crunching (calculation-
intensive) applications like weather forecasting, flight simulation, etc.
Supercomputer
• Supercomputer, a broad term referring to the fastest computers currently available.
• They are very expensive.
• Employed for specialized applications requiring immense amounts of mathematical calculations (number
crunching).
Ex: Weather forecasting, scientific simulations, (animated) graphics, fluid dynamic calculations, nuclear
energy research, electronic design, and analysis of geological data (e.g. in petrochemical prospecting).
• In India, C-DAC and SERC at IISc, Bangalore, have developed many versions of supercomputers, the latest
being the PARAM 10000.
PARAM 10000 Supercomputer from C-DAC, India:
Mainframe
• Mainframe was a term originally referring to the cabinet containing the central processor unit or “main
frame” of a room-filling Stone Age batch machine.
• Now-a-days a Mainframe is a very large and expensive computer capable of supporting hundreds, or even
thousands, of users simultaneously.
• The difference between a supercomputer and a mainframe is:
– A supercomputer channels all its power into executing a few programs as fast as possible.
– A mainframe uses its power to execute many programs concurrently. Speed is not the major criterion.
IBM Mainframe. Model: z900
In some ways, mainframes are more powerful than supercomputers because they support more simultaneous
programs. But supercomputers can execute a single program faster than a mainframe
Minicomputer
• It is a midsize computer.
• A minicomputer is a multiprocessing system capable of supporting up to 200 users simultaneously.
• The distinction between large minicomputers and small mainframes has blurred, however, as has the
distinction between small minicomputers and workstations
Workstations
• A type of computer used for applications that require a moderate amount of computing power and relatively
high quality graphics capabilities.
• Generally comes with a large, high-resolution graphics screen, large amount of RAM, built-in network
support, and a graphical user interface.
• Most have a mass storage device such as a disk drive, but a special type of workstation, called a diskless
workstation, comes without a disk drive.
• Commonly used OSs are UNIX (or UNIX flavors) and Windows NT.
• Workstations are typically linked together to form a local-area network, although they can also be used as
stand-alone systems.
IBM, HP, Sun and Silicon Graphics are the major manufacturers of workstations. All of them use their own
flavor of the Unix operating system and their own graphics libraries. IBM uses AIX, HP uses HP-UX, Sun uses
Solaris and SGI uses IRIX as the OS. Examples of workstation models are the SPARC series from Sun and the
O2 from SGI.
Personal Computer
• A small, relatively inexpensive computer designed for an individual user.
• All are based on the microprocessor technology that enables manufacturers to put an entire CPU in one
chip.
• In businesses they are mainly used for word processing, accounting, desktop publishing, and for running
spreadsheet and database management applications.
• At home, the most popular use for personal computers is for playing games and for surfing the Internet.
Generally classified by size and chassis / case:
Desktop model
- A computer designed to fit comfortably on top of a desk.
- Generally limited to three internal mass storage devices.
- Desktop models designed to be very small are sometimes referred to as slimline models.
Notebook Computer
• An extremely lightweight personal computer. Notebook computers typically weigh less than 6 pounds and are
small enough to fit easily in a briefcase.
• Uses a variety of techniques, known as flat-panel technologies, to produce a lightweight and non-bulky
display screen.
• Modern notebook computers are nearly equivalent to desktop computers in computing power. They have the
same CPUs, memory capacity, and disk drives.
• They are expensive. Notebook computers cost about twice as much as equivalent regular-sized computers.
• Notebook computers come with battery packs that enable you to run them without plugging them in.
Batteries need to be recharged every few hours.
Laptop:
A small, portable computer — small enough that it can sit on your lap.
Nowadays, laptop computers are more frequently called notebook computers
Palmtop:
A small computer that literally fits in the palm.
Palmtops are severely limited in size, but they are practical for certain functions such as phone books and
calendars.
Due to small size, most palmtop computers do not include disk drives.
Many contain PCMCIA slots to insert disk drives, modems, memory, and other devices.
PDA
• Short for Personal Digital Assistant, a handheld device that combines computing, telephone/fax, and
networking features.
– A typical PDA can function as a cellular phone, fax sender, and personal organizer.
• Most PDAs are pen-based, using a stylus rather than a keyboard for input, incorporating handwriting
recognition features.
– Some also react to voice input by using voice recognition technologies.
• The field of PDAs was pioneered by Apple Computer, which introduced its first PDA, the Newton MessagePad,
in 1993.
– PDAs have had only modest success in the marketplace, due to their high price and limited applications.
However, they may eventually become common gadgets.
– PDAs are also called palmtops, hand-held computers and pocket computers.
Sharp’s Zaurus PDA: Apple’s Laptop with DVD drive:
IBM AS-400
• http://en.wikipedia.org/wiki/IBM_System_i
Summary
• Types of Computer:
• PC, Workstation, Mini-computer, main frame, Super computer
• Note book computers, PDAs
• IBM AS-400
Bus Architectures
Module 7
Objectives
• Discuss Bus - address, data and control buses
• List different Bus architecture
Bus Architectures
Bus
• Definition — Signals and protocol
• Use — Co-exist
• Example — ISA, PCI, VME, RS-232, USB, Firewire, Ethernet
• Bus considerations
• Basic Signals
A bus is a set of lines (wires) designed to transfer all bits of a word from a specified source to a specified
destination. A bus can be unidirectional or bi-directional. A bus requires logic circuits, in the form of a
bus controller, to control access to it. Data transfer can occur between the processor and memory and
between I/O devices and the processor.
Buses are classified in two ways, according to the nature of the connections.
Parallel bus: n bits of information are transferred at a time. Examples:
PCI: Peripheral Component Interconnect
PCMCIA: Personal Computer Memory Card International Association
SCSI: Small Computer System Interface
VME: Versa Module Europa
ISA: Industry Standard Architecture
An example of a parallel bus, the PCI bus, is explained in subsection 2.4.1.
Serial bus: data is transferred one bit at a time. Examples:
USB: Universal Serial Bus
RS-232: Recommended Standard serial bus of the EIA
Ethernet
• Bus is a set of (common) wires that interconnect components in a computer system.
• All the devices share the bus.
• Address bus carries address information from the CPU to the device
• Data bus carries data between CPU and the device
• Control bus carries control information between CPU and device.
• Address and Data bus are standardized. Variations occur in Control bus.
• Control bus should have the signals like
– Read, write, Select
– Request and grant
– Interrupts
– Timing synchronization signals.
– Clock information.
• Data transfer over a bus can be either synchronous or asynchronous.
Synchronous data transfer is carried out with a reference clock signal. The clock dictates at what clock
cycle the control signals are to be activated and for how many clock cycles each signal should remain active.
Asynchronous data transfer does not involve clock cycles. It is carried out using handshake signals. Typical
handshake signals are REQuest, GRant, and ACKnowledge. A device that wants to use the bus raises a
request. A bus arbiter resolves the requests and grants the bus to one device. The requesting device then
acknowledges the grant and uses the bus. When the data transfer is complete, the requesting device releases
the request signal.
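The request/grant/acknowledge sequence can be written out as an event trace. The sketch below is only an
illustration: it assumes a fixed-priority rule (lowest device number wins), which the text does not specify,
and the function name and tuple format are made up for the example.

```python
def handshake_transfers(requesting_devices, data_by_device):
    """Return the ordered list of bus events for a set of simultaneous requests.

    Assumption for this sketch: the lowest device number has the highest priority.
    """
    trace = [("REQ", dev) for dev in requesting_devices]    # devices raise their requests
    for dev in sorted(requesting_devices):                   # arbiter resolves the requests
        trace.append(("GR", dev))                            # bus granted to one device
        trace.append(("ACK", dev))                           # device acknowledges the grant
        trace.append(("DATA", dev, data_by_device[dev]))     # device uses the bus
        trace.append(("REQ released", dev))                  # transfer complete
    return trace


trace = handshake_transfers([2, 1], {1: 0xAA, 2: 0x55})
```

No clock appears anywhere in the trace: each step is triggered by the previous signal, which is what makes
the transfer asynchronous.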
Centronics
Used for parallel printers
– 8 data bits
– 3 control signals
– BUSY indicates the printer is busy and can’t accept further data
– STROBE indicates the CPU has put a byte on the data bus
– ACKnowledge indicates printer has accepted the byte on the data bus.
The Centronics bus is a simple asynchronous bus used for parallel printers. It uses three handshake wires
for communication along with 8 data lines. The printer indicates whether or not it is busy on the busy line
and acknowledges the receipt of data on the ack line. The host computer sends one byte at a time and strobes
the data to indicate to the printer that data is available.
It is asynchronous because a clock is not part of the bus.
Centronics Bus
The figure shows the waveforms of the Centronics printer interface. The printer provides the busy and
acknowledge signals and the computer provides the stb (strobe) and data signals. The lines marked data are
a set of 8 lines, that is, a bus. The operations are:
• The computer checks if the printer is busy by sampling the busy line. If the printer is busy, the computer
waits for the busy line to go low.
• When busy is low, the computer puts the data onto the data bus (shown by the data lines crossing over).
• The computer indicates that it has put data on the data bus by making STB* low.
• The printer accepts this request and drives the busy line high; it reads the data and stores it in a buffer
and/or prints it.
• The computer removes the strobe signal.
• Just before the busy line goes low again, the printer pulses the ACK* signal. (This can be used to interrupt
the computer so that it can send the next byte instead of polling as in the first step.)
• The printer makes the busy line go low. The next data transmission can start.
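The sequence of operations above can be replayed as a short simulation that records each signal change in
order. This is a sketch, not a driver: the `Printer` class, its method names and the trace strings are
invented for illustration, and the printer's internal processing time is not modelled.

```python
class Printer:
    """Toy printer side of the Centronics handshake."""

    def __init__(self):
        self.busy = False
        self.buffer = []

    def on_strobe(self, byte, trace):
        self.busy = True               # printer accepts the request
        trace.append("BUSY high")
        self.buffer.append(byte)       # read the data and store it

    def finish(self, trace):
        trace.append("ACK* pulse")     # pulsed just before BUSY goes low
        self.busy = False
        trace.append("BUSY low")


def send_byte(printer, byte, trace):
    if printer.busy:
        raise RuntimeError("printer busy: host must wait for BUSY to go low")
    trace.append(f"DATA = {byte:#04x}")  # host puts the byte on the 8 data lines
    trace.append("STB* low")             # host signals that the data is valid
    printer.on_strobe(byte, trace)
    trace.append("STB* high")            # host removes the strobe
    printer.finish(trace)


printer = Printer()
trace = []
send_byte(printer, 0x41, trace)
```

After the call, `trace` lists the signal changes in the same order as the bullet points, and the byte sits
in the printer's buffer.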
Serial Data Communication: RS-232

• RS-232C is a standard interface
– for serial transmission of data between computers and other devices
– Describes both the physical interface and the transmission protocol.
• Standards and interfaces usually restrict RS-232 to <= 20 kbps and line lengths of <= 15 m.
– However, in practice, RS-232 is far more robust than the traditionally specified limits of 20 kbps over a
15 m line would imply.
• Successors to RS-232 : RS-422 and RS-423.
– backward compatible so that RS-232 devices can connect to an RS-422 port.
The RS-232 standard supports two types of connectors: a 25-pin D-type connector (DB-25) and a 9-pin D-type
connector (DB-9). The type of serial communications used by PCs requires only 9 pins, so either type of
connector will work equally well.
Most 56kbps DSUs are supplied with both V.35 and RS-232 ports because RS-232 is perfectly adequate at
speeds up to 200kbps. Mainframes and midrange computers are capable of far higher speeds than their
rated 19.2kbps. Usually these “low speed” ports will run error free at 56kbps and above.
The 15m limitation for cable length can be stretched to about 30m for ordinary cable, if well screened and
grounded, and about 100m if the cable is low capacitance as well.
USB
• Up to 127 devices
• Data transfer speed 12Mbps
• Cable length 5 meters.
• Bi-phase NRZI (Non-Return to Zero Inverted) data
• Bit-stuffing sync
In NRZI encoding as used by USB, a logic 1 is sent by keeping the line level unchanged and a logic 0 is sent
by toggling the line level.
If 6 consecutive logic 1’s are sent as data, an extra 0 is added (bit stuffing) to force a transition so that
the receiver can stay in sync.
USB supports both plug and play and hot plugging. That is, a device can be dynamically connected and
removed.
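Bit stuffing and NRZI encoding can be sketched as two small functions. The sketch assumes the USB convention
that a logic 1 leaves the line level unchanged and a logic 0 toggles it; the function names and the choice of
an initial line level of 1 are assumptions made for the example.

```python
def bit_stuff(bits):
    """Insert a 0 after every run of six consecutive 1s."""
    out = []
    ones_in_a_row = 0
    for bit in bits:
        out.append(bit)
        if bit == 1:
            ones_in_a_row += 1
            if ones_in_a_row == 6:
                out.append(0)        # stuffed bit forces a transition on the line
                ones_in_a_row = 0
        else:
            ones_in_a_row = 0
    return out


def nrzi_encode(bits, initial_level=1):
    """Encode bits as line levels: 1 = no change, 0 = toggle."""
    level = initial_level
    levels = []
    for bit in bits:
        if bit == 0:
            level ^= 1               # a 0 toggles the line level
        levels.append(level)
    return levels


stuffed = bit_stuff([1] * 7)
line_levels = nrzi_encode([0, 1, 0, 0, 1])
```

Because a long run of 1s would otherwise produce no transitions at all, the stuffed 0 guarantees the line
toggles at least once every seven bit times.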
Why USB ?
– Low Cost
• Low-cost cables and connectors
• Optimized for integration in device and host hardware
• Low-cost sub channel at 1.5 Mb/Sec
– Ease of use
• Single model for cable and connectors
• Hot attach and Detach support
• Dynamically attachable and configurable peripherals
USB Device View
There is an “A” end and a “B” end. The “A” end is the flat end and is referred to as “upstream”, going
toward the computer; the “B” end is the more square end and is referred to as “downstream”, connected
to the individual peripherals.
USB Architecture
USB Device Types :
• Host
– The master device in a USB system
– Initiates all data transfer
– Manages device attach / disconnect
• Device or Function
– The slave device in a USB system
– Respond only to request from the host
– Cannot initiate data transfers
• Hub
– Intermediary between Host and Device
There is only one host in a USB system. All the other devices are clients. The host (or master) initiates
the data transfers. Clients (slaves) cannot initiate data transfer requests.
USB Host
• The host PC is in charge of the bus.
• The host has to know what devices are on the bus and the capabilities of each.
• On power-up, hubs make the host aware of all the devices attached to the system.
• The host assigns a unique USB address to each device and then determines whether the newly attached USB
device is a hub or a function.
• When a USB device has been removed from one of the hub's ports, the hub disables the port and provides an
indication of device removal to the host.
• The host manages the flow of data on the bus.
• Multiple peripherals may want to transmit data at the same time.
• The host controller handles this by dividing the data path into 1 millisecond frames and giving each
transmission a portion of the frame.
• The host also has error-checking duties and adds error-checking bits during data transfer.
USB Protocol
USB is a polled bus.
• For each transaction, the host controller sends a USB packet on a scheduled basis. This packet, which
describes the type and direction of the transaction, is called the token packet. The addressed USB device
selects itself by decoding its address.
• In a given transaction, data is transferred either from the host to a device or from a device to the host.
• The source of the transaction then sends a data packet or indicates that it has no data to transfer. The
destination responds with a handshake.
– The USB data transfer model between a source and a destination (on the host and an endpoint) is referred
to as a pipe.
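The token, data and handshake phases of a polled transaction can be laid out as a list of packets. This is a
heavily simplified sketch: real USB packets carry more fields (endpoint number, CRC and so on), and the
function name, the tuple layout and the device queue are assumptions made for the example. ACK and NAK are
used here as the names of the positive and "no data" responses.

```python
def usb_transaction(devices, address, direction, host_data=None):
    """Return the packets exchanged in one host-initiated transaction.

    devices maps a device address to a list of bytes queued on that device.
    """
    packets = [("TOKEN", direction, address)]    # host always starts with a token
    queue = devices[address]                     # device selects itself by address
    if direction == "OUT":                       # host is the source
        packets.append(("DATA", host_data))
        queue.append(host_data)
        packets.append(("HANDSHAKE", "ACK"))     # destination (device) responds
    elif queue:                                  # "IN": device is the source and has data
        packets.append(("DATA", queue.pop(0)))
        packets.append(("HANDSHAKE", "ACK"))     # destination (host) responds
    else:                                        # "IN" but nothing to send
        packets.append(("HANDSHAKE", "NAK"))
    return packets


devices = {5: [0x10]}                            # hypothetical device at address 5
first_in = usb_transaction(devices, 5, "IN")
second_in = usb_transaction(devices, 5, "IN")
out = usb_transaction(devices, 5, "OUT", host_data=0x22)
```

Every exchange begins with a token from the host, which is why a device can never start a transfer on its
own.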
SCSI
• Small Computer System Interface.
• A common parallel bus for a number of devices.
• Each device is addressed.
• Most of the SCSI devices are plug and play
• Data width up to 16 bits and data transfer rates up to 160 MBps.
Each SCSI device, like each USB device, has a unique address. The CPU communicates with a SCSI device by
sending the address.
PCI
• PCI was introduced as a successor to the earlier ISA and EISA buses
• Supports Plug and Play (PnP)
• Can support multiple buses and protocol conversion
• Supports bus masters (switches)
PCI can act as either an independent bus or as an intermediate bus to connect the CPU bus to other buses.
On a single PCI bus we can connect 32 devices. More devices can be connected to a secondary PCI bus. A
bridge needs to be used to interconnect a primary and a secondary bus.
PCI Bus
• Basic terminology
– Peripheral Component Interconnect
– Bussed signals / Point-to-Point signals
– Signal types
– Turn-around cycle
– Decoding
• Positive
• Subtractive
• Master / Slave
– Arbitration
• Bridges
– Host to PCI
– PCI to PCI
PCI Connector Mother board & add-on card
PCI Address Space

• Memory Space
• I/O Space
• Configuration Space
– Pre-defined Header Region
– Device Dependent Region
Configuration Space Header
PCI Commands
• Memory Read
• Memory Read Line
• Memory Read Multiple
• Memory Write
• Memory Write and Invalidate
• Dual Address Cycle
• I/O Read
• I/O Write
• Configuration Read
• Configuration Write
• Special Cycle
• Interrupt Acknowledge
Bus Arbitration
• More than one device may request for the bus – bus contention
• A master has to resolve these requests and grant the bus to one device at a time.
• This is known as Bus Arbitration.
Different forms of arbitration have been proposed, based on the priority of a device or its closeness to the
system. A priority-based arbiter may further be modified to rotate the priority. That is, when a device (of
highest priority) is serviced, it is made the lowest priority, thereby enabling other devices to have higher
priority. PCI follows this approach: whenever a request is serviced, the requesting device is assigned the
lowest priority.
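Rotating priority is easy to see in code. The sketch below is a minimal illustration of the idea only; the
class name, the use of a Python list as the priority order and the device labels are assumptions, and it does
not describe the arbiter of any specific chipset.

```python
class RotatingPriorityArbiter:
    """Grant the bus to the highest-priority requester, then demote it to lowest."""

    def __init__(self, device_ids):
        self.order = list(device_ids)   # index 0 holds the highest priority

    def grant(self, requesting):
        for dev in self.order:
            if dev in requesting:
                self.order.remove(dev)  # serviced device drops to the
                self.order.append(dev)  # lowest priority position
                return dev
        return None                     # nobody is requesting the bus


arbiter = RotatingPriorityArbiter(["A", "B", "C"])
grants = [arbiter.grant({"A", "B", "C"}) for _ in range(4)]
```

When all three devices keep requesting, the grants cycle through them in turn, so no device can hold the
others off indefinitely.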
JTAG
• Acronym for “Joint Test Action Group”
• A methodology for testing ICs
• Methodology specified by IEEE standard 1149.1
Pseudonyms:
IEEE Std 1149.1, JTAG, Scan, B-Scan and many more.
TDI — Test Data In
Serial stream of data going into IC
Daisy-chained from previous TDO pin
TDO — Test Data Out
Serial stream of data coming out of IC
Daisy-chained to next TDI pin
TCK — Test Clock
Provides edge sensitive clock to test circuitry
TMS — Test Mode Select
Value at rising edge of TCK sets next TAP controller state
TRST — Test Reset (optional)
Active low asynchronous reset. Resets TAP controller
JTAG Signals/ Pins

Four or five signals dedicated to test
– Data signals daisy chained (TDI/TDO)
– Control Signals are shared (TMS/TCK/TRST)
A set of dedicated test pins, Test Data In (TDI), Test Mode Select (TMS), Test Clock (TCK) and Test Data Out
(TDO), plus one optional test pin, Test Reset (TRST*). These pins are collectively referred to as the Test
Access Port (TAP). This serial standard follows a logic called “Boundary Scan” and is standardized by the
Joint Test Action Group, hence the name JTAG. Boundary scan is a methodology allowing complete
controllability and observability of the boundary pins of a JTAG-compatible device via software control. This
capability enables in-circuit testing without the need for bed-of-nails in-circuit test equipment.
How Does it Work
Each IC contains special hardware. ICs connect to a tester or boundary scan controller via a serial stream.
Test data is delivered by scanning (shifting serially). Patterns are latched/applied to the assembly in
parallel after the serial shift is completed.
The response to the test data is scanned back into the tester/boundary scan controller. This can be used
simultaneously with traditional bed-of-nails testing or stand-alone.
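The serial shifting of test data through daisy-chained scan cells can be modelled as a shift register. This
is a sketch of the shifting idea only: it ignores the TAP controller state machine and TMS entirely, and the
function name and the list representation of the chain are assumptions made for the example.

```python
def shift_chain(chain, tdi_bits):
    """Shift bits in at TDI; return (new chain contents, bits seen at TDO).

    chain[0] is the cell next to TDI and chain[-1] is the cell driving TDO.
    """
    cells = list(chain)
    tdo_bits = []
    for bit in tdi_bits:                # one iteration per TCK edge
        tdo_bits.append(cells[-1])      # the last cell's value appears on TDO
        cells = [bit] + cells[:-1]      # every cell takes its predecessor's value
    return cells, tdo_bits


# Shift a 4-bit test pattern into an empty chain of four cells ...
loaded, _ = shift_chain([0, 0, 0, 0], [1, 0, 1, 1])
# ... then shift four more bits in to scan the captured values back out.
emptied, scanned_out = shift_chain(loaded, [0, 0, 0, 0])
```

The bits come out of TDO in the same order they went into TDI, which is how the tester reads back the
response after applying a pattern.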
PCI Signals
Summary
• Bus - address, data and control buses
• Centronics bus and timing diagram
• Serial bus – RS232; parallel bus – SCSI
• USB
• Host, client and hub
• PCI bus
• Configuration space address
- Programmability and PnP
• Commands, protocol fundamentals
• Arbitration
• JTAG