
Embedded Systems Purushotam Shrestha

Chapter 3: Memory
The purpose of memory is to store data, either for the processor to work on or for archival purposes. Storage may be long term or short term, and this is the main parameter by which memory devices are categorized. In general, non-volatile memory systems hold data for a very long time, if not permanently, while volatile memory systems hold data only for a short period of time.
In embedded systems, memory is required for storing the program and/or the data being processed. Sometimes it is also needed for archival purposes, but that need is normally served by a separate system.
Our focus is on semiconductor memories because these are the memories that a processor interacts with most.
The adjoining figure shows a basic memory system. It consists of the actual storage locations, to which address lines, data lines and control lines are connected. A location is selected by the address bits on the address lines. If read is selected, the data contained in the location appears on the data lines. If data is to be written, the data to be stored in the selected location is put on the data lines and write is selected.

Figure: A basic memory system (a memory device with address lines, data lines, and read/write and other control lines)

A memory system is specified as m X n, where

m is the number of address locations (words)
n is the bit-width: the number of bits that each location can store.
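
As a rough illustration of the m X n notation, the sketch below derives the number of address lines, the number of data lines and the total capacity of a memory. The function name memory_parameters is hypothetical, and the sketch assumes plain binary addressing, where a address lines can select up to 2^a locations.

```python
def memory_parameters(m: int, n: int) -> dict:
    """Basic figures for an m x n memory (m words, n bits per word)."""
    if m <= 0 or n <= 0:
        raise ValueError("m and n must be positive")
    # Smallest a such that 2**a >= m (assumes binary addressing).
    address_lines = (m - 1).bit_length()
    return {
        "address_lines": address_lines,
        "data_lines": n,          # one data line per bit of the word
        "total_bits": m * n,      # capacity of the whole memory
    }


if __name__ == "__main__":
    # A 1024 x 8 memory: 10 address lines, 8 data lines, 8192 bits.
    print(memory_parameters(1024, 8))
```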

3.1 Memory Write Ability and Storage Permanence


Memory Write Ability
Memory write ability refers to the process of putting bits into a specific location of the memory, and to the ease and speed with which that process can be completed. Writing may be time consuming, as in ROMs, or fast, as in registers.
A basic write involves providing an address on the address lines and data on the data lines, and selecting the write function. There are, however, different methods of actually writing the data into the memory.

A ROM can be built from combinational logic whose inputs act as address lines and whose outputs act as data lines. The circuit is designed so that each combination of inputs gives out the data stored at the address given by those inputs. Once implemented in hardware, the stored values cannot be changed; another circuit has to be designed for another set of values. The write time is therefore incomparably large.
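
The idea of a ROM as a fixed input-to-output function can be sketched in software as a read-only lookup table. The contents below (a table of squares in an 8 X 6 ROM) and the names ROM_CONTENTS, rom_read and data_line are assumptions chosen only for illustration; a real ROM would be realized in gates, not code.

```python
ADDRESS_BITS = 3   # 8 locations (illustrative size)
DATA_BITS = 6      # each location holds a 6-bit word

# Contents are fixed when the "circuit" is built; a tuple cannot be modified.
ROM_CONTENTS = tuple(a * a for a in range(2 ** ADDRESS_BITS))


def rom_read(address: int) -> int:
    """Return the word stored at the given address."""
    if not 0 <= address < len(ROM_CONTENTS):
        raise ValueError("address out of range")
    return ROM_CONTENTS[address]


def data_line(address: int, bit: int) -> int:
    """Value of one output (data) line: a pure function of the address inputs."""
    return (rom_read(address) >> bit) & 1


if __name__ == "__main__":
    for addr in range(2 ** ADDRESS_BITS):
        print(f"{addr:03b} -> {rom_read(addr):06b}")
```

Storing a different table means building a different ROM_CONTENTS, which mirrors the need to design another circuit for another set of values.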

A register is made up of flip-flops. Writing is easily accomplished: put the data on the data lines and enable load. Similarly, RAMs, which are built around basic transistor storage cells, can be written quickly. They are also re-writable.

On the basis of writing ability, the memory can be:


High End: Memory with high end write ability is the easiest to write data into. The processor can write into it directly and quickly. Examples are registers and RAM, which are also the memories closest to the processor and used by it during its computations.
Middle Range: Memories in this range are a little more difficult to write and are slower than the high end ones. The processor can still write to them, but at a lower speed. These memories are not written frequently and are used for storing data for a longer period of time. Examples are FLASH and EEPROM. Middle range memories can be used in the design and testing phases of an embedded system.


Lower Range: A special programmer device is required to write data into this type of memory, and writing is slow. Examples are the UV EPROM, in which data is written using voltages higher than those of normal operation, and the one time programmable (OTP) ROM, in which data is written by blowing connections that represent bit values. The OTP ROM can be programmed only once.
Low End: In low end memory devices, data is written only during manufacturing; the device is manufactured with the data in it. The writing process starts with the design of the chip, its locations and the data each location is to hold, and completes with the actual manufacturing of the chip. Once manufactured, the data cannot be re-written. The mask programmed ROM is an example. In embedded systems, such memories can hold the program or data that is used very frequently, but the values to be stored must be final.

The write ability of memory devices is compared on the basis of access time: the period between the moment the address is provided and the moment the data is made available (read) or stored (write).
Another measure is cycle time, which is the minimum time between the start of two successive read or write operations.

Storage Permanence
Memory systems may retain the stored bits almost permanently, or may start losing them as soon as the data are written into the memory cells. The period for which a memory system can retain the stored bits varies with the type of memory.

The memory based upon the combinational logic circuit discussed above stores data almost permanently, while RAM cannot hold it even for a short period of time without power. Based upon storage permanence, memory devices can be:

High End: Once the data are written, the high end memory devices can hold data almost permanently.
Examples are mask programmed ROM, OTP ROM. The program of an embedded system can be put into this
type of memory.

Middle Range: Memory with middle range storage permanence can store data for a certain, limited period of time. Examples are NVRAM, a battery-backed RAM, and Flash memory.

Lower Range: Lower range memory devices can hold data as long as power is available. The data are lost once the power supply is switched off. These memories are used by the processor during its operations; the data are not written for storage purposes. An example is SRAM, in which bits are stored in transistors/flip-flops that can hold data as long as power is available.

Low End: From the data retention point of view, the low end memory devices are the worst ones. They lose data even when power is available. The stored data are represented by charge stored in a capacitor which, in practice, loses its charge if not refreshed periodically. An example is DRAM. These memories are used because their packing density is high, so they occupy less space.

The graph below shows different types of memories on the basis of their write ability and storage permanence. The ideal memory would retain data forever and offer the fastest writing speed. For practical memories, however, the ones with the fastest writing speed can retain data only for a short period of time, while the ones with the slowest writing speed retain data for the longest period of time.

3.2 Memory Types


Memories are built upon a wide variety of technologies, and different technologies give rise to memories with different characteristics. Some of the main classification parameters are
Read-Write Ability
Data retention
Access Methods
The first two are related to write ability and storage permanence.


Figure: Graph showing write ability and storage permanence

Read-Write Ability:
Read Only: The writing process is very difficult and takes a lot of time. Once written, data is retained and can be read easily. Rewriting is nearly impossible; to store different data, a different chip might be required.
These memories are used to hold boot code, MAC addresses and the program memory of an embedded system.
Eg Mask Programmed ROM, OTP (One Time Programmable) ROM

Read and Write: In memory with read and write ability, writing is very easy and takes very little time. Read and write access times are short and almost equal, so these memories are fast as well. Re-writing is simple compared to the Read Only Memory described earlier. For this reason, such memory can be used as the working memory of a processor.
Eg RAM.
Magnetic tapes and disks can also be considered Read-Write memory, but they are very slow compared to semiconductor memories.

Data Retention:
Volatile: Volatile memories cannot retain stored data without a power supply. Generally, volatile memories have faster access times and are used by processors during their computational tasks.
Eg S-RAMs, which are built using flip-flops that hold bits only while power is available. D-RAMs are also volatile memories; they use a transistor-capacitor combination to store data as charge in the capacitor.

Non-Volatile: Non-volatile memories can hold stored data even when the power supply is removed. The writing process is not as easy as that for volatile memory. Non-volatile memories use permanent or programmable connections to represent bits. They are used in applications where data is to be retained for a long time, like holding the MAC address of a communication device or long term data storage.
Eg ROM, EPROM, FLASH.

Access Methods:
Sequential Access: Data is accessed sequentially. A location is not accessible without going through the previous locations, so access times differ from location to location. These are slower memories and are cheaper too. Sequential access memories are used for storing high volume data that the processor does not read and write directly and frequently.
Eg Hard Disk, Compact Disk, Magnetic Tape Drive.

Random Access: Any location can be accessed directly. Access times are the same for all locations and are shorter than those of other types of memory. From the access point of view, this type of memory is best suited to the processor.
Eg RAM; this is how RAM got its name, Random Access Memory.
Semiconductor ROM is also randomly accessible; the difficulty in writing/rewriting is a different matter.

Memories may be bit addressed, in which each bit can be accessed individually, or byte/word addressed, in which a group of bits, a word, is addressed.
The location of memory relative to the processor can also be a parameter for classifying memories: internal memory, which the processor can access directly, and external memory, which is accessible to the processor via I/O controllers.
It should be noted that a single memory may fall into several categories: a ROM, for example, is a non-volatile, randomly accessible, read only memory. It may be bit or byte addressable, and internal or external, depending upon the application.

ROM
ROMs are non-volatile. A processor can read data from ROMs but cannot write to them directly. Writing and re-writing are very difficult or impossible.
A ROM may be implemented using combinational circuits. The inputs are treated as address bits and the corresponding outputs as the data bits stored at that address. The circuit is designed from a truth table whose input side represents the addresses and whose output side represents the stored data, and is built using basic gates. Storing different data requires an entirely new circuit.

Figure: ROM implementation using truth table

A decoder can also be used. The input lines carry the address values, and only one of the output lines is high for any given input. Data lines that need a 1 at a given address are connected to the corresponding decoder output line, and those that do not are left unconnected. Active devices may be used for the connections.
ROMs are used to store permanent data like the MAC address of a switch, the IMEI number of mobile equipment, or the contents of game cartridges. In embedded systems, the ROM may be used to store the program, or any data that is required frequently but takes a lot of resources to compute.

Figure: ROM using decoder


Mask Programmed ROM


It is a read only memory. The data are written during the manufacturing process and re-writing is not possible thereafter. The word mask comes from IC fabrication technology: the chip is designed according to the contents to be stored in the memory, and a mask is prepared according to the layout and used in photolithography. Once the process is completed, the data retention is very good.
The data must be checked rigorously; there cannot be any errors. Once fabricated, a chip with incorrect data can only be discarded.
The read access times are fast enough. The main advantage is a very low cost per bit.

One Time Programmable ROM


The OTP ROM is programmable by the user, but only once. The chips are manufactured with programmable connections, which are later blown using a special ROM programmer device. Each programmable connection is basically a fuse; an intact connection may represent 1 and a blown fuse 0. The user prepares a file containing the addresses and the data to be stored and loads it into the programmer, which then blows the fuses as required and so stores the data. If an error occurs, the chip is most probably useless.

EPROMs
These are erasable programmable ROMs and can be reprogrammed/re-written. First the memory is erased and then data is written. Re-writing is a time consuming and difficult process and requires a special programmer device. The number of re-writes is limited, depending on the technology.

The programmable connection is provided by a MOS transistor with a floating gate surrounded by an insulator. Negative charges form a channel between source and drain, storing a logic 1. A large positive voltage at the gate causes negative charges to move out of the channel and get trapped in the floating gate, storing a logic 0. Erasing causes the negative charges to return from the floating gate to the channel, restoring the logic 1. Writing stores a 0 where required and leaves the other connections as they are where a 1 is required.

Figure: EPROM technology

Erasing can be done by shining UV rays on the gate, as in the UV-EPROM, or electrically, using higher voltages, as in the EEPROM.
FLASH memory is a special kind of EEPROM. The erase process is made faster by erasing large blocks of memory at a time, unlike the individual words erased in the EEPROM. Its write ability and storage permanence are similar to those of EEPROMs.

RAMs

These are volatile memories with easy and fast read and write capability. The read and write access times are short and almost equal. This feature allows these memories to be used directly by the processor.
Each bit is stored by a memory cell. The read-write ability is provided by these memory cells, which have read and write control lines. An enable function selects or disables the cell.

Figure: A memory cell (with read/write control, select line and data bit)


A word is composed of a group of such cells with common select and control lines. An 8-bit word would contain 8 memory cells. A word is selected by an output of a decoder which takes in the address bits. The read and write control lines are common to all the words.

The individual memory cell may be built using different hardware. S-RAM employs flip-flops for its cells. Flip-flops are bi-stable circuits which store a bit as one of their two states. The use of flip-flops requires a larger number of transistors, which makes S-RAM less compact, but it is faster. D-RAM uses a transistor and a capacitor for each memory cell. The bits are stored in the form of presence or absence of charge in the capacitor. The capacitors have a natural tendency to discharge, so a refresh circuit is required to hold the state of the capacitor. D-RAM is slower than S-RAM but more compact.

Figure: Memory cells organized into words (address lines feed a 2X4 decoder whose selection lines pick a word; control lines and data lines are common)

3.3 Memory Composing

A memory is specified as m X n, where m is the number of words (address locations) contained in the memory chip and n is the bit width of each word.
Each chip has address lines that select word locations, control lines that select the read or write function or enable or disable the chip, and data lines that give out or take in data. With a address lines, 2^a = m words can be addressed.

Figure: An m X n memory (a address lines with 2^a = m, R/W and Enable control lines, m word locations of n bits each, n-bit data lines)

Memory chips are available in standard sizes, but the need often differs from the sizes of these readily available standard memories.

When the available memory is larger, simply ignore the unneeded high-order address bits and higher data lines.

When the available memory is smaller, compose several smaller memory chips into one larger memory:
Connect side-by-side to increase the width of words. All the available address lines go to all the chips. The control lines are also shared by all the chips. The data lines are simply put together, maintaining their order.
Connect top to bottom to increase the number of words. The low-order address lines and the control lines are shared by all the chips. Use a decoder of suitable size that takes in the high-order address lines and generates the selection signals used to select individual chips.


Use both of the above approaches to increase number of location/words and the width of the words.
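
The composition rules above can be sketched as a small planning calculation. The functions compose and locate are hypothetical helpers, and the sketch assumes the chip word count is a power of two so that the high-order address bits form the chip-row number fed to the decoder.

```python
def compose(chip_words: int, chip_width: int,
            need_words: int, need_width: int) -> dict:
    """Plan a larger memory built from identical smaller chips."""
    columns = -(-need_width // chip_width)   # chips side by side (ceiling division)
    rows = -(-need_words // chip_words)      # chips top to bottom (ceiling division)
    return {
        "rows": rows,
        "columns": columns,
        "chips": rows * columns,
        # High-order address lines taken in by the chip-select decoder.
        "decoder_inputs": (rows - 1).bit_length(),
    }


def locate(address: int, chip_words: int) -> tuple:
    """Split a system address into (chip row, address inside the chip)."""
    return divmod(address, chip_words)


if __name__ == "__main__":
    # Build a 4K x 16 memory from 1K x 8 chips.
    print(compose(1024, 8, 4096, 16))
    # Address 0x0C05 falls in chip row 3, at location 5 inside the chip.
    print(locate(0x0C05, 1024))
```

In this example the two high-order address bits drive a 2X4 decoder that selects one of four chip rows, while the ten low-order bits go to every chip.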

3.4 Memory Hierarchy and cache


Memory Hierarchy
The purpose of all memory systems is storage and retrieval of data. But a wide variety of memories exist with contrasting performance on different characteristics. Some of the fundamental characteristics are read/write access times, storage permanence, storage capacity and cost per bit. Ideally, a designer would want to use faster memory but at a cheaper price. In reality, that is not the case.


Some have faster access times but are expensive (SRAM), some are compact but more complex (DRAM), and some can store a lot of data at a lower cost but are slower (disk systems).
The design of a memory system involves trade-offs among various characteristics. A computer system cannot be built with only one type of memory. The memory that deals directly with the processor must be faster than the others, while mass storage systems can accept slower access times. The memory part of a system may therefore comprise different types of memory put together to fit the requirements at different levels.
The memory hierarchy summarizes the memory types and their performance using a pyramidal structure. The figure below shows the pyramid containing the different memories. At the top are the registers, which are closest to the processor and are the fastest as well as the most expensive. Going down the hierarchy, one can see the following trends:
Increasing distance from the processor, or decreasing frequency of access by the processor
Increasing access times, or decreasing speed
Increasing capacity
Decreasing cost per bit.

The designer can follow the memory hierarchy: the memories at the top should be used for computational purposes, but care should be taken to use them sparingly as they are expensive. The ones at the bottom can be used for storage of infrequently used data.

Cache
The main memory is slower than the processor. A designer would like to build a main memory as fast as the processor, but that would significantly increase the cost. Direct and frequent access to the main memory would slow down the processor. To address this problem, a faster memory, whose speed is close to or the same as that of the processor, is employed. This memory usually resides within the processor chip and contains copies of blocks of main memory so that the processor can access their contents faster.
Cache memory is built from SRAM, which is more expensive but gives faster access times.

Cache operation: Cache memory works on the principle of locality: subsequent data accesses tend to come from the same locality. When the processor needs some data, either to read from or to write into the memory, it makes a memory access request. First the cache is checked for the requested data.
If the required data is in the cache, it is called a cache hit. The data is fed to the processor quickly.
If the required data is not in the cache, it is called a cache miss. A block of data containing the requested location is then transferred from the main memory to the cache.
The data transfer between the processor and the cache is at word level, while that between the cache and the main memory is at block level.

Cache Structure: The cache is designed to contain blocks of main memory. A cache is organized into m = 2^r lines, each of which can contain the w words of a single block of main memory. Suppose that the main memory contains 2^n words addressable by n bits. The words are grouped into 2^n / w blocks, each containing w words. A cache line comprises w words plus tag and control bits. The length of a line excluding the tag and control bits is known as the line size. The total number of words contained by the cache is called the cache size.
There are more blocks in the main memory than lines in the cache, so mapping techniques are required to identify a particular block of memory in the cache. In other words, mapping functions translate main memory addresses into cache addresses, which are fewer than the former. At any time only some (not all) main memory blocks are held in cache lines. Tag bits are used to identify the block currently stored in a cache line; they are usually a portion of the main memory address bits. The organization of a cache depends upon the mapping function employed.

Mapping
Direct Mapping
The mapping function is implemented as
i = j modulo m
where
i = cache line number
j = main memory block number
m = number of lines in cache

Figure: Cache Structure

Each block of memory has a fixed line in the cache. The main memory address is interpreted as tag bits, line bits and word bits. The required data is accessed by selecting the line using the line bits, comparing the tag stored in that line with the tag bits of the address, and, on a match, selecting the required word within the line using the word bits.
It is simple and inexpensive to implement.
It suffers from a problem called thrashing. Because a block in main memory has a fixed cache line, when data from two blocks that map to the same line are accessed frequently, the blocks are swapped repeatedly, reducing the hit rate.
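
A minimal simulation can make the tag/line/word split and the thrashing problem concrete. The class DirectMappedCache and the function split_address are hypothetical names; the sketch tracks only tags (no data) and assumes the number of lines and the block size are powers of two, so that "i = j modulo m" reduces to taking the low-order bits of the block number.

```python
def split_address(address: int, line_bits: int, word_bits: int) -> tuple:
    """Interpret a main memory address as (tag, line, word)."""
    word = address & ((1 << word_bits) - 1)
    block = address >> word_bits               # main memory block number j
    line = block & ((1 << line_bits) - 1)      # i = j modulo m, with m = 2**line_bits
    tag = block >> line_bits
    return tag, line, word


class DirectMappedCache:
    """Tag-only model of a direct-mapped cache (illustrative sketch)."""

    def __init__(self, line_bits: int, word_bits: int):
        self.line_bits = line_bits
        self.word_bits = word_bits
        self.tags = [None] * (1 << line_bits)  # None marks an empty line
        self.hits = 0
        self.misses = 0

    def access(self, address: int) -> bool:
        """Return True on a hit; on a miss, load the block into its fixed line."""
        tag, line, _ = split_address(address, self.line_bits, self.word_bits)
        if self.tags[line] == tag:
            self.hits += 1
            return True
        self.tags[line] = tag                  # replace whatever was there
        self.misses += 1
        return False


if __name__ == "__main__":
    # 4 lines of 4 words each. Addresses 0x00 and 0x40 both map to line 0.
    thrash = DirectMappedCache(line_bits=2, word_bits=2)
    for addr in [0x00, 0x40] * 4:
        thrash.access(addr)
    print("conflicting blocks:", thrash.hits, "hits,", thrash.misses, "misses")

    # Addresses 0x00 and 0x04 map to different lines and coexist.
    friendly = DirectMappedCache(line_bits=2, word_bits=2)
    for addr in [0x00, 0x04] * 4:
        friendly.access(addr)
    print("separate lines:", friendly.hits, "hits,", friendly.misses, "misses")
```

In the first run every access is a miss because the two blocks keep evicting each other from the same line, even though the other three lines stay empty.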

Associative Mapping
The addresses are mapped such that any block can occupy any line in the cache. The main memory address is interpreted as tag bits and word bits. When a data access is required, the tag bits are compared with the tags of all lines simultaneously to find the required line in the cache, and the word bits are used to select the data word.

It solves the problem of thrashing but requires complex hardware for the simultaneous comparison.

Set-Associative
It uses the features of both direct and associative mapping. The cache is divided into sets, each with an equal number of lines. The main memory address bits are interpreted as tag bits, set bits and word bits. The set is selected using the direct mapping technique. The mapping is
i = j modulo v
where
i = set number,
j = main memory block number,
v = number of sets in the cache.

The total number of lines in the cache, m, is expressed as
m = v X k, where k is the number of lines in a set.


So a block is stored in a particular set, but within the set the block can occupy any line, using the associative technique. Once a set is selected, the tag bits are used to identify the required line and the word bits are used to find the particular word.

A larger number of sets makes the set-associative mapping more direct, and a larger number of lines within a set makes it more associative. For v = m and k = 1, set-associative mapping reduces to direct mapping, and for v = 1 and k = m, it reduces to associative mapping. A common configuration is v = m/2 and k = 2.
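
The two limiting cases can be checked with a short sketch of the mapping i = j modulo v. The function candidate_lines is a hypothetical helper, and numbering the lines of set s as s*k to s*k + k - 1 is an assumption made only so that the result can be printed; real hardware need not lay out its sets this way.

```python
def candidate_lines(block: int, m: int, k: int) -> list:
    """Cache lines that main memory block j may occupy.

    m = total lines in the cache, k = lines per set, v = m / k sets.
    """
    if k <= 0 or m % k != 0:
        raise ValueError("m must be a positive multiple of k")
    v = m // k                      # number of sets
    set_number = block % v          # i = j modulo v
    first = set_number * k          # assumed layout: sets are contiguous
    return list(range(first, first + k))


if __name__ == "__main__":
    m = 8
    print(candidate_lines(13, m, k=1))  # direct mapping: exactly one line
    print(candidate_lines(13, m, k=2))  # two-way set-associative
    print(candidate_lines(13, m, k=m))  # fully associative: any line
```

With k = 1 the block has a single fixed line, with k = m every line is a candidate, and intermediate values of k trade comparison hardware against freedom of placement.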

Levels of Cache: There may be a single cache or multiple levels named L1, L2, L3 and so on. L1 is the fastest and closest to the processor; as the level increases, the speed decreases and the distance from the processor increases.

Replacement Policies:
Direct mapping does not require any policy, because each block has only one possible line. Associative and set-associative mapping require some kind of policy to choose which block to replace:

1. Random: A random block is replaced.
2. LRU: The least recently used block is replaced. Bits stored with each line track how recently it was used, and the line that has gone unused for the longest time is replaced.
3. FIFO: First in first out; the block that was transferred first is replaced.
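
The difference between LRU and FIFO can be seen by counting misses for the same access sequence. This is a minimal sketch of a fully associative cache that stores only block numbers; the function count_misses and the access trace are assumptions for illustration, and the random policy is left out so that the result is deterministic.

```python
from collections import OrderedDict


def count_misses(blocks, lines: int, policy: str) -> int:
    """Count misses in a fully associative cache with the given policy."""
    if policy not in ("LRU", "FIFO"):
        raise ValueError("policy must be 'LRU' or 'FIFO'")
    cache = OrderedDict()   # keys: resident blocks, oldest entry first
    misses = 0
    for block in blocks:
        if block in cache:
            if policy == "LRU":
                cache.move_to_end(block)   # a hit refreshes the block's age
            continue                       # FIFO ignores hits entirely
        misses += 1
        if len(cache) == lines:
            cache.popitem(last=False)      # evict the entry at the front
        cache[block] = True
    return misses


if __name__ == "__main__":
    trace = [1, 2, 3, 1, 4, 1, 5, 1]
    print("FIFO misses:", count_misses(trace, 3, "FIFO"))
    print("LRU misses:", count_misses(trace, 3, "LRU"))
```

For this trace, block 1 is reused often. FIFO still evicts it because it arrived first, while LRU keeps it, so LRU sees one miss fewer.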

Write Techniques: Main memory must be updated when the contents of the cache are changed by the processor. The writing can be done as follows:

Write through: Write to main memory whenever the cache is written.
Main memory is always valid.
Main memory is accessed a lot, hence slower.
Write back: Write to main memory only when the block is to be replaced in the cache. A bit called the dirty bit or use bit records whether the content has been changed.
Main memory accesses are fewer, hence faster.
Main memory is not always valid; other devices accessing the content may get invalid data.
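
The trade-off between the two techniques can be sketched by counting main memory writes for a sequence of processor writes. The function main_memory_writes is hypothetical; the sketch assumes a direct-mapped cache that tracks only block numbers, and it flushes all dirty lines at the end so that both policies leave main memory valid. Note that a write-through write transfers a word while a write-back write transfers a whole block, so the counts are not directly comparable in bytes.

```python
def main_memory_writes(written_blocks, lines: int, policy: str) -> int:
    """Count writes to main memory for a sequence of processor writes."""
    if policy not in ("write-through", "write-back"):
        raise ValueError("unknown policy")
    resident = [None] * lines      # block currently held by each line
    dirty = [False] * lines        # dirty bit per line (write-back only)
    writes = 0
    for block in written_blocks:
        line = block % lines       # direct mapping
        if resident[line] != block:
            # Miss: a dirty block must be written back before replacement.
            if dirty[line]:
                writes += 1
            resident[line] = block
            dirty[line] = False
        if policy == "write-through":
            writes += 1            # every cache write also goes to memory
        else:
            dirty[line] = True     # defer the memory write
    writes += sum(dirty)           # final flush of remaining dirty lines
    return writes


if __name__ == "__main__":
    trace = [0, 0, 0, 1, 1, 4, 0]
    print("write-through:", main_memory_writes(trace, 4, "write-through"))
    print("write-back:", main_memory_writes(trace, 4, "write-back"))
```

Repeated writes to the same block cost one memory write each under write through, but only one write per eviction under write back.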

Impact of cache on system performance:


Cache Size: A larger cache size may increase the hit rate.
Line size: Larger lines may hold data that is not required frequently.
Degree of associativity: When the set-associative method is used, the degree of direct mapping and that of associative mapping must be balanced.

There is always an optimum size; deviating from it leads to decreased system performance.
