
November 2001

WHITE PAPER

TRENDS IN PC SYSTEM MEMORY

PC technologies continue to evolve as performance demands grow, and system memory technology is no exception. Throughout the history of the PC, memory technology has steadily progressed in capacity and performance to meet the increasing requirements of other PC hardware subsystems and software. In the past, there have been relatively clear industry transitions from one memory technology to its successor. However, today there are multiple choices—PC133, RDRAM, and Double Data Rate (DDR)—and more choices may exist in the future as providers of DRAM system memory accommodate a growing variety of platforms and form factors. These range from small handheld devices to high-end servers, each with different power, space, speed, and capacity requirements for system memory.

This white paper focuses on PC system memory issues and trends. It begins by reviewing the role of memory in the system and the key memory parameters that affect system performance. The paper then presents today's alternatives and explores industry trends.
Role of Memory in the System
The primary role of memory is to store code and data for the processor. Although caching and other processor architecture features have reduced its dependency on memory performance, the processor still requires most of the memory bandwidth. Figure 1 shows the major consumers of memory bandwidth: the processor, graphics subsystem, PCI devices (such as high-speed communications devices), and hard drives. Other lower-bandwidth interfaces such as the USB and parallel ports must also be accommodated. The memory hub provides an interface to system memory for all of the high-bandwidth devices. The I/O hub schedules requests from other devices into the memory hub.

Figure 1. Major Consumers of Memory Bandwidth

Memory plays a key role in the efficient operation of I/O devices such as graphics adapters and disk drives. In a typical system, most data transfers move through system memory. For example, when transferring a file from the network to a local disk, data from the network is transferred by the PCI host adapter to memory. This is commonly referred to as direct memory access (DMA), as opposed to programmed I/O (PIO), in which the processor is directly involved in all data transfers. The processor, after performing any required formatting operations, initiates a transfer from memory to local disk storage. Once initiated, the data is transferred directly from memory to disk without any further processor involvement.
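To make the contrast concrete, here is a minimal Python sketch (not from the paper; all names are illustrative) that counts how many times the processor must touch the data under each transfer style:

```python
# Illustrative contrast between PIO and DMA; real transfers are
# programmed through device registers, not Python lists.

def pio_transfer(source, memory):
    """Programmed I/O: the processor moves every word itself."""
    cpu_operations = 0
    for word in source:
        memory.append(word)      # each word passes through the processor
        cpu_operations += 1
    return cpu_operations

def dma_transfer(source, memory):
    """DMA: the processor only sets up the transfer."""
    cpu_operations = 1           # one setup step: program the DMA controller
    memory.extend(source)        # the adapter moves the data on its own
    return cpu_operations

packet = list(range(4096))
ram = []
print("PIO processor operations:", pio_transfer(packet, ram))  # 4096
ram.clear()
print("DMA processor operations:", dma_transfer(packet, ram))  # 1
```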
In summary, the system memory functions as the primary storage component for processor code and data, and as a centralized transfer point for most data movement in today's systems.

Performance Factors

Memory parameters that impact system performance are capacity, bandwidth, and latency.

Capacity

How does capacity impact system performance? The first step in answering this question is to describe the memory hierarchy. Table 1 shows the capacities and speeds of various storage mechanisms found in a typical mainstream desktop computer in service today.

These storage mechanisms range from the very fast, but low-capacity, Level 1 (L1) cache memory to the much slower, but higher-capacity, disk drive.

                  Typical Capacity   Current Speed
Level 1 cache     32 KB              4 ns
Level 2 cache     256 KB             20 ns
System memory     128 MB             100 ns
Hard drive        4 GB*              8 ms

*Although many disk drives contain much more storage capacity, 4 GB is the memory-addressing limitation in most PC systems.

Table 1. Memory Storage Mechanisms

Ideally, a computer would use the fastest available storage mechanism—in this case L1 cache—for all data. However, the laws of physics (which dictate that higher-capacity storage mechanisms are slower) and cost considerations prevent this. Instead, PCs use a mechanism called "virtual memory," which makes use of the L1 and L2 caches, main system memory, and the hard drive.

The virtual memory mechanism allows a programmer to use more memory than is physically available in the system, and to keep the most frequently and recently used data in the fastest storage. When more memory is needed than is available in system memory, some data or code must be stored on disk. When the processor accesses data not available in memory, information that has not been accessed recently is saved to the hard drive. The system then uses the vacated memory space to complete the processor's request. However, disk access is comparatively slow, and system performance is significantly impacted if the processor must frequently wait for disk access. Adding system memory reduces this probability. The amount of capacity required to reduce disk activity to an acceptable level depends on the operating system and the type and number of active applications, including background tasks.
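The figures in Table 1 make the cost of frequent disk access concrete. The sketch below computes an effective access time for the hierarchy; the hit-rate split is an illustrative assumption, not a measurement from this paper:

```python
# Effective access time across the Table 1 hierarchy.
ACCESS_NS = {"L1": 4, "L2": 20, "RAM": 100, "disk": 8_000_000}  # 8 ms = 8e6 ns

def average_access_ns(l1, l2, ram):
    """l1, l2, ram: fractions of all accesses served at each level.
    Whatever remains falls through to the hard drive."""
    disk = 1.0 - l1 - l2 - ram
    return (l1 * ACCESS_NS["L1"] + l2 * ACCESS_NS["L2"]
            + ram * ACCESS_NS["RAM"] + disk * ACCESS_NS["disk"])

print(average_access_ns(0.90, 0.08, 0.02))     # no disk traffic: ~7.2 ns
print(average_access_ns(0.90, 0.08, 0.01999))  # 1 in 100,000 hits disk: ~87 ns
```

A disk rate of just one access in 100,000 is enough to dominate the average, which is why adding system memory pays off.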
Semiconductor technology has provided capacity improvements consistent with Moore's law—greater than a 1.4x compound annual growth rate—and the outlook is for similar increases over the next few years. These increases have exceeded the requirements of mainstream desktop PCs and, as a result, the number of memory slots is being reduced from three to two in many of today's client platforms. However, servers and high-end workstations continue to take advantage of the capacity increases.
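As a rough illustration of that growth rate (the 128-Mb starting point matches the DDR volume-production figure cited later in this paper):

```python
# Component capacity at a 1.4x compound annual growth rate.
def projected_capacity_mbit(start_mbit, years, cagr=1.4):
    return start_mbit * cagr ** years

for years in range(6):
    print(2001 + years, round(projected_capacity_mbit(128, years)), "Mb")
# Five years of 1.4x growth is about 5.4x: 128 Mb grows to ~690 Mb,
# broadly in line with the 1-Gb generation mentioned later.
```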
New Applications Drive Memory Performance Increases

Adequate memory bandwidth is essential for the following cutting-edge applications and for Single Memory Architecture (SMA) implementations:

• Data Streaming—Encryption/decryption for secure data communication, and encoding and decoding audio and video for real-time multimedia applications.

• 3D Imaging—Data visualization, computer-aided design (CAD), game applications, and future 3D user interfaces.

• Pattern Matching—Speech recognition and data-mining applications (such as Internet search engines) tend to use large amounts of data, access data that is not immediately required, and employ data structures too large to fit in cache. Word patterns, used in speech recognition programs to identify a spoken word, consume a great deal of memory. Future complex audio user interfaces may require that grammars stored on the hard drive be cached in memory.

• Concurrency and Multithreading—Improved multithreading support in the Microsoft® Windows® 2000 operating system has led to more multithreaded applications, as well as concurrent operations such as system management and antivirus scanning.

• SMA—In a SMA, the graphics controller and memory hub are combined into a single device. System memory is used for graphics operations, instead of the dedicated high-performance DRAM available on a graphics add-in card. Designed for target markets that are not as sensitive to graphics performance, SMA lowers system performance. For this reason, higher system memory performance benefits PCs using SMA.


Bandwidth

Memory bandwidth is a measure of the rate at which data can be transferred to and from memory, typically expressed in megabytes per second (MB/sec). Peak bandwidth is the theoretical maximum transfer rate between any device and memory. In practice, peak bandwidth is reduced by interference from other devices and by the "lead-off" time required for a device to receive the first bit of data after initiating a memory request.

There should be adequate memory bandwidth to support the actual data rates of the highest-speed devices and to provide enough headroom to prevent significant interference between devices. In many systems, memory and I/O hubs are designed to accommodate peak requirements by buffering transfers and scheduling conflicting memory requests.

Table 2 shows the data rates of various system components over the last 4 years. Although the need for memory bandwidth is not directly proportional to these data rates, the upward trend is obvious. Memory systems have done a fairly good job of keeping up with system requirements over this period of time, moving from 533 MB/sec to 2133 MB/sec. Dual-memory interfaces using Rambus or DDR memory boost bandwidth to 3200 MB/sec.

Peak Data Rates Over Time (MB/sec)

I/O, Processor, and Video Devices   1997   1998   1999   2000
Storage                               33     33     66    100
Network                             12.5   12.5   12.5   12.5
Multimedia (1394)                      0      0      0     50
I/O (USB)                            1.5    1.5    1.5    1.5
I/O Total                             47     47     80    164
Processor                            533    800   1066   3200
Graphics                             266    533   1066   1066
Total peak data rates                846   1380   2212   4430

Table 2. Data Rates of I/O, Processor, and Video Devices
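The effect of lead-off time on peak bandwidth can be estimated with a short calculation. The 60-ns lead-off figure below is an assumption for illustration; the paper does not quote one:

```python
# Effective bandwidth of a burst once lead-off time is included.
def effective_mb_per_s(peak_mb_per_s, burst_bytes, lead_off_ns):
    transfer_ns = burst_bytes / peak_mb_per_s * 1000  # time to move the burst
    return peak_mb_per_s * transfer_ns / (lead_off_ns + transfer_ns)

# A 64-byte cache-line fill on a 1066 MB/sec (PC133) interface:
print(round(effective_mb_per_s(1066, 64, 60)))    # ~533 MB/sec, half of peak
# A long 4-KB I/O burst amortizes the lead-off almost completely:
print(round(effective_mb_per_s(1066, 4096, 60)))  # ~1050 MB/sec
```

This is also why processors, with their short transfers, are more sensitive to latency than I/O devices with their long transfers, as the next section discusses.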
Latency

Latency is a measure of the delay from the data request until the data is returned. It is a function of peak bandwidth, lead-off time, and interference between devices.

In general, processors are more sensitive to latency than bandwidth because they work with smaller blocks of data and can waste a significant number of clocks waiting for critical data. In contrast, I/O data transfers are relatively long, and bandwidth is a more important consideration than latency.

Data transfers moving to and from system memory must pass through the memory hub and, in many cases, the I/O hub. These components are collectively referred to as the chip set or core logic, and are major contributors to the latency from a device to memory. They can restrict or exploit system memory capabilities and must be designed to provide a balance of bandwidth, buffering, and scheduling of data transfers for optimum memory performance.

Over the last 4 years, memory bandwidth has kept up with system needs, but latency improvements have lagged. Current Rambus and DDR technologies double the memory bandwidth over 100-MHz SDRAM, but do not reduce latency. Ideally, latency should be reduced in proportion to processor clock rate increases. New processor architecture features—improved caches, more support for out-of-order execution, and prefetch instructions for applications that are sensitive to processor latency—have helped to offset this lag in latency improvements.

Current Memory Technologies

Over time, PC system memory has evolved to increase capacity and bandwidth, and to reduce latency. There has been progress at both the component and the interface level. While the basic component core architecture (see the "DRAM Core Architecture" sidebar) has not changed significantly, the interfaces to the component cores have evolved and continue to evolve. Speed and capacity advances in the core have been driven by advances in semiconductor technology. Interfaces have also benefited from semiconductor advances, but with notable architectural changes. PC memory interfaces have transitioned through conventional DRAM, page-mode DRAM (or fast page-mode), and extended data out (EDO) DRAM. Today, three main memory interfaces are used:

• Synchronous DRAM (SDRAM)
• DDR SDRAM
• Rambus

DRAM Core Architecture

Storage in RAM chips is arranged in a rectangular matrix of cells, similar to the cells in a spreadsheet. Each cell consists of a single transistor and capacitor and is capable of holding an electric charge that represents one bit of data. Each memory cell is uniquely identified by a horizontal row address and a vertical column address. Data can be read from or written to any cell by supplying a row and column address and the necessary control signals.

To save manufacturing costs, most memory chips use a common set of address lines for both row and column addresses. A row address strobe (RAS) identifies the signals on the address lines as a row address; a column address strobe (CAS) identifies these signals as a column address.

CAS Latency

There is a delay for both row and column accesses. In general, there is no separate specification for row accesses. However, column access is specified in terms of CAS latency. This is typically two or three clocks for SDRAM systems, and DDR system specifications allow half-clock increments from 1.5 to three clocks. This parameter directly impacts the interface latency and actual bandwidth.
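Converting the sidebar's CAS latency figures from clocks to nanoseconds shows how clock rate and CAS latency interact (the pairings below are typical examples, not quoted specifications):

```python
# CAS latency in nanoseconds = clocks x clock period.
def cas_latency_ns(cas_clocks, clock_mhz):
    return cas_clocks / clock_mhz * 1000

print(round(cas_latency_ns(2, 100), 1))      # CL2 at 100 MHz: 20.0 ns
print(round(cas_latency_ns(3, 133.33), 1))   # CL3 at 133 MHz: 22.5 ns
print(round(cas_latency_ns(2.5, 133.33), 1)) # CL2.5 DDR: 18.8 ns
```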
SDRAM

SDRAM is a 64-bit-wide interface (or 72 bits in implementations that include error correction capability). The interface clock rate began at 66 MHz, evolved to 100 MHz, and is now capable of operating at 133 MHz in system memory implementations. (100- and 133-MHz implementations are often referred to as PC100 and PC133.) The bandwidth in MB/sec is equal to the clock multiplied by 8. (A 64-bit-wide interface transfers 8 bytes per clock.) This yields 1066 MB/sec of bandwidth for current 133-MHz SDRAM. Both latency and bandwidth have improved in proportion to the clock speed increases.
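The clock-times-8 rule from the paragraph above, applied to the three SDRAM generations:

```python
# MB/sec = clock (MHz) x 8 bytes per clock for a 64-bit interface.
def sdram_bandwidth_mb_s(clock_mhz):
    return clock_mhz * 8

for mhz in (66.67, 100, 133.33):
    print(f"{mhz:g} MHz SDRAM: ~{round(sdram_bandwidth_mb_s(mhz))} MB/sec")
# ~533, 800, and ~1067 MB/sec (the last quoted as 1066 in the text)
```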


A single system memory semiconductor component may provide 4, 8, or 16 bits on each data transfer. Regardless of the bits per transfer, each component contains the same amount of storage (in bits) for a given DRAM component generation. Multiple components are mounted on a memory module (or DIMM). The system's memory hub interfaces with one or more of these memory modules.

An SDRAM module combines the components to provide 64 bits on each data transfer. Desktop and portable systems use 8- and 16-bit component interfaces to provide 64-bit interfaces in lower capacities. For example, a typical desktop or portable SDRAM module might consist of four 16-bit components, each with a 128-megabit (Mb) capacity, for a total of 64 MB of memory. In contrast, 4- and 8-bit components are used for the higher capacities needed in servers. An SDRAM module used in a server might contain sixteen 4-bit components, each with 128-Mb capacity, for a total capacity of 256 MB of memory. Component capacities have increased over the life of SDRAM in proportion to the advances in the semiconductor industry.
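The two module examples above follow directly from the interface width: enough components must be ganged to reach 64 bits, and their capacities add. A small sketch of that arithmetic:

```python
# Components per module and module capacity for a 64-bit interface.
def sdram_module(component_bits, component_mbit, interface_bits=64):
    components = interface_bits // component_bits
    capacity_mb = components * component_mbit // 8   # megabits -> megabytes
    return components, capacity_mb

print(sdram_module(16, 128))  # (4, 64): four 16-bit parts -> 64 MB desktop DIMM
print(sdram_module(4, 128))   # (16, 256): sixteen 4-bit parts -> 256 MB server DIMM
```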
Registered Memory Modules

As the number of memory components increases, so does the load on the address lines between the memory hub and the memory modules. The memory hub shown in Figure 1 can only address a limited number of components without encountering errors or reducing its clock rate. Registered memory modules (as opposed to unbuffered modules) address this issue by including a clocked register that buffers the signals carried on the address lines. Introduced with SDRAM and also implemented in DDR memory, registered memory modules are used mainly on servers. Registered memory modules introduce an extra clock of latency. However, overall system performance is not impacted because the increased capacity more than compensates for the increased latency.

Current SDRAM Implementations

133-MHz SDRAM is used in today's value and mainstream computers. Its performance is adequate for the application mix in the target markets, but it is expected to be replaced rapidly as the memory performance demands of new system components, operating systems, and applications increase. Dual SDRAM interfaces are used in servers for added bandwidth and capacity.

In system memory implementations, SDRAM frequencies are not expected to increase beyond 133 MHz. Instead, DDR will be used beyond the 133-MHz performance point. (SDRAM frequencies beyond 133 MHz are available, but are used in specialized applications such as graphics controllers.)

DDR SDRAM

DDR SDRAM (see the sidebar, "DDR Naming Conventions") is similar to standard SDRAM, but adds strobes that operate at the same frequency as the clock. These strobes travel with the data from the memory hub to the memory components on writes, and to the memory hub from the memory components on reads. Data is transferred on both the rising and falling edges of the strobe. This architectural change, coupled with lower signal levels and advances in semiconductor processes, doubles the data rates and the bandwidth. However, DDR does not reduce latency. In general, the PC1600 implementations have greater latency than PC133. Despite this, PC1600 provides a performance advantage for most systems and applications in which it is implemented.

DDR Naming Conventions

Naming conventions for current implementations of DDR memory are summarized in the following table:

Original Name         Name Based on    Name Based on
                      Clock Rate       Bandwidth
DDR SDRAM (100 MHz)   DDR200           PC1600
DDR SDRAM (133 MHz)   DDR266           PC2100

Like SDRAM, PC system memory implementations of DDR have 64-bit data paths (or 72 bits for error correction capability). Bandwidths for DDR are 1600 MB/sec for PC1600 and 2133 MB/sec for PC2100. Component capacities for volume production DDR began at 128 Mb and will be available at least through the 1-Gb generation.
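The sidebar's two naming schemes are both derived from the clock, as this sketch shows:

```python
# DDRxxx names the data rate (2 transfers per clock, in megatransfers/sec);
# PCxxxx names the module bandwidth (8 bytes per transfer, in MB/sec).
def ddr_names(clock_mhz):
    transfers = clock_mhz * 2
    bandwidth = transfers * 8
    return f"DDR{round(transfers)}", f"PC{round(bandwidth)}"

print(ddr_names(100))     # ('DDR200', 'PC1600')
print(ddr_names(133.33))  # ('DDR267', 'PC2133') -- marketed as DDR266/PC2100
```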


One of the goals of DDR was a smooth technology transition. Many memory hubs are designed to operate with either SDRAM or DDR SDRAM modules. However, practical space and cost considerations, driven in part by different module connectors, have prevented any practical system-level application of this capability.

DDR has minor incremental system- and component-level cost penalties over standard SDRAM, but these costs will decline with volume and advances in semiconductor processes. DDR volumes will increase in notebook, desktop, and server platforms over the next year.
Rambus

The Rambus architecture is significantly different from previous system memory interfaces. Both commands and data are communicated on packet buses from the memory hub to the modules. There are two command buses—a RAS bus and a CAS bus—and one data bus. Eight transfers on both edges of a 400-MHz clock are used for all packets. The separate command and data buses, along with identical packet lengths, allow memory hubs to improve the scheduling of memory transfers to achieve higher memory data bus utilization. The result is higher actual bandwidth.

The Rambus data path is 16 bits wide (18 bits for error correction), rather than 64 bits wide. With SDRAM or DDR, 4-, 8-, or 16-bit devices are used to create the 64-bit interfaces. In contrast, with RDRAM, a single component can provide the full memory interface function. This allows RDRAM to be implemented in a capacity that better matches the needs of a particular system. However, each memory component must be capable of providing the full bandwidth needed by the system. This requirement, coupled with other internal architecture specifications, has resulted in a cost premium for RDRAM components.

The Rambus data path provides 16 bits per transfer at 800 million transfers per second (megatransfers/sec). This results in a bandwidth of 1600 MB/sec. Rambus latencies are slightly greater than SDRAM, because information must be converted to and from packets. As with DDR200, the disadvantages of increased latency are generally outweighed by the increased bandwidth.

Rambus components have followed the semiconductor technology curve in terms of capacity; however, the maximum memory allowed on a Rambus interface is limited to 32 components. (With registered DDR or SDRAM, 64 components are allowed, and with unbuffered SDRAM, 48 components are allowed.) Thus, the maximum memory possible on a Rambus channel is 256 MB with 64-Mb chips, 512 MB with 128-Mb chips, and 1.024 gigabytes (GB) with 256-Mb chips.
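The Rambus figures above reduce to two small calculations, sketched here:

```python
# Channel bandwidth: 2 bytes per transfer at 800 megatransfers/sec.
def rdram_bandwidth_mb_s(megatransfers=800, bytes_per_transfer=2):
    return megatransfers * bytes_per_transfer

# Maximum channel capacity: at most 32 components per channel.
def max_channel_capacity_mb(chip_mbit, max_components=32):
    return max_components * chip_mbit // 8           # megabits -> megabytes

print(rdram_bandwidth_mb_s())                        # 1600 MB/sec
for chip_mbit in (64, 128, 256):
    print(chip_mbit, "Mb chips:", max_channel_capacity_mb(chip_mbit), "MB")
# 256 MB, 512 MB, and 1024 MB (the 1.024 GB quoted above)
```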
RDRAM components. The following DDR transition—DDR II—will begin at a
The Rambus data path provides 16 bits per transfer at clock frequency of 200 MHz and a data rate of 400
800 million transfers per second (megatransfers/sec). megatransfers/sec for a bandwidth of 3.2 GB/sec. The
This results in a bandwidth of 1600 MB/sec. Rambus la- net effect of frequency increases and other interface
tencies are slightly greater than SDRAM, because changes on latency is not clear. The signaling levels of
information must be converted to and from packets. As DDR II will be reduced and data transfer timing will be
with DDR200, the disadvantages of increased latency based on differential strobes. Both the reduction in sig-
are generally outweighed by the increased bandwidth. naling levels and the new strobes will allow more cost-
effective implementations in PC platforms. Although


Although prototype DDR II components have already been produced, use in high-volume PC production is not expected until 2004. The initial size of the DDR II component is expected to be 512 Mb, but 256-Mb components may be needed for small-capacity, dual-memory interface systems.
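The bandwidth figures for both upcoming DDR steps follow from the same 8-bytes-per-transfer rule used earlier:

```python
# DDR bandwidth: 2 transfers per clock, 8 bytes per transfer.
def ddr_bandwidth_gb_s(clock_mhz):
    return clock_mhz * 2 * 8 / 1000

print(round(ddr_bandwidth_gb_s(166), 1))  # ~2.7 GB/sec (PC2700)
print(round(ddr_bandwidth_gb_s(200), 1))  # 3.2 GB/sec (DDR II, 400 MT/sec)
```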
ADT

ADT is an emerging memory interface technology targeted for introduction slightly after DDR II. Details of this interface have not been publicly disclosed; however, all major DRAM component manufacturers are involved in defining both the DDR and ADT interfaces. It is expected that ADT will initially target desktop applications. There is a strong possibility that only one of these standards will be adopted by the PC industry, and that the emerging standard will incorporate features from both technologies.

Rambus

Rambus has plans for a new high-speed memory interface called Yellowstone. This next-generation interface is intended to provide 3.2 Gb/sec per differential signal pair between DRAM components and the memory interface. A 200-millivolt differential signal is planned to support this data rate. Yellowstone is expected to be within mainstream PC memory cost budgets.
set suppliers to ensure proper product development pri-
Multiple Memory Interfaces

Although memory technology has kept pace with increasing capacity requirements, system memory bandwidth is not keeping pace with the demands of the rest of the system. The ratio of bandwidth to capacity for a given memory module implementation is falling over time. A dual interface helps to alleviate the resulting performance issues. Multiple memory interfaces in high-end systems are not new, but their use in mainstream desktop PCs is expected to begin soon.

Future Transitions

Improvements in memory interface performance require advances in semiconductor technology and in the techniques used to maintain signal integrity and timing across the interface. There are challenges on the component side to providing capacity and speed increases at the pace of Moore's law; it is difficult to make components bigger and faster at the same time. Containing cost, reducing size, controlling emissions, and maintaining the reliability of the system memory interface become more difficult as performance increases. Future interface transitions must address all of these issues.

Conclusion

Dell recognizes the important role that PC manufacturers play in driving technology advances in PC systems. Based on customer requirements, Dell drives improvements in memory system reliability and performance, configurations, and form factors. Dell development teams work with all major memory component and chip set suppliers to ensure proper product development priorities. Dell is a member of the Joint Electron Device Engineering Council (JEDEC) and meets on a regular basis with other industry entities such as the ADT consortium and Rambus to ensure the availability of future technology that can benefit customers.

Information in this document is subject to change without notice.


© 2001 Dell Computer Corporation. All rights reserved.

Trademarks used in this text: The DELL logo is a trademark of Dell Computer Corporation; Microsoft and Windows are registered trademarks of Microsoft Corporation. Other trademarks and trade names may be used in this document to refer to either the entities claiming the marks and names or their products. Dell Computer Corporation disclaims any proprietary interest in trademarks and trade names other than its own.
