Design of a High-Speed DDR3 SDRAM (Double Data Rate 3

Synchronous Dynamic RAM) Controller


ABSTRACT
In computing, DDR3 SDRAM, or double data rate 3 synchronous dynamic
random access memory, is a random access memory interface technology used for high
bandwidth storage of the working data of a computer or other digital electronic devices.
DDR3 is part of the SDRAM family of technologies and is one of the many DRAM
(dynamic random access memory) implementations. DDR3 SDRAM is the third
generation of DDR memories, featuring higher performance and lower power
consumption. In comparison with the earlier generations, DDR1/2 SDRAM, DDR3
SDRAM is a higher density device and achieves higher bandwidth due to a
further increase of the clock rate and a reduction in power consumption.
In this work, a DDR3 SDRAM controller is designed that can interface with a
lookup-table based Hash-CAM circuit. Content-addressable memory (CAM) is a
special type of computer memory used in certain very high speed searching applications.
Because a CAM is designed to search its entire memory in a single operation, it is much
faster than RAM in virtually all search applications. The architecture of the DDR3 SDRAM
controller consists of an Initialization FSM, Command FSM, data path, bank control, clock
counter, refresh counter, Address FIFO, Command FIFO, Wdata FIFO and R_data register.
In this paper, an advanced DDR3 SDRAM controller architecture was designed
which can interface with a high-performance Hash-CAM based lookup circuit. The
controller's normal write, read and fast read operations are verified by
simulation, and the DDR3 SDRAM controller is synthesized.
1. INTRODUCTION
1.1.1 DDR3 SDRAM:
In electronic engineering, DDR3 SDRAM, or double data rate 3 synchronous
dynamic random access memory, is a random access memory technology used for high
bandwidth storage of the working data of a computer or other digital electronic
devices. DDR3 is part of the SDRAM family of technologies and is one of the many
DRAM (dynamic random access memory) implementations. DDR3 SDRAM is an
improvement over its predecessor, DDR2 SDRAM.
The primary benefit of DDR3 is the ability to transfer I/O data at eight times the
data rate of the memory cells it contains, thus enabling higher bus rates and higher peak
rates than earlier memory technologies. However, there is no corresponding reduction in
latency, which is therefore proportionally higher. In addition, the DDR3 standard allows
for chip capacities of 512 megabits to 8 gigabits, effectively enabling a maximum
memory module size of 16 gigabytes.
The DDR3 SDRAM is not very different from the previous generation of
DDR memory in terms of its design and working principles; in fact, DDR3
SDRAM is essentially the third incarnation of the DDR SDRAM principles. We therefore
have every right to compare DDR3 and DDR2 SDRAM side by side here, and this
comparison will hardly take much time.
The frequencies of DDR3 memory could be raised beyond those of DDR2 due to
a doubling of the data prefetch that is moved from the information storage array to the
input/output buffer. While DDR2 SDRAM uses 4-bit samples, DDR3 SDRAM uses an 8-bit
prefetch, also known as 8n-prefetch. In other words, DDR3 SDRAM technology implies a
doubling of the internal bus width between the actual DRAM core and the input/output
buffer. As a result, the increase in the effective data transfer rate provided by DDR3
SDRAM doesn't require faster operation of the memory core; only the external buffers
work faster. The core frequency of the memory chips is 8 times lower
than that of the external memory bus and DDR3 buffers (for DDR2 this frequency was
4 times lower than that of the external bus). So, DDR3 memory can almost immediately
hit higher actual frequencies than DDR2 SDRAM, without any modifications or
improvements of the semiconductor manufacturing process. However, the above-described
technique also has another side to it: unfortunately, it increases not only the
memory bandwidth, but also the memory latencies. As a result, we shouldn't always expect
DDR3 SDRAM to work faster than DDR2 SDRAM, even if it operates at higher
frequencies than DDR2.
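The prefetch arithmetic above can be sanity-checked in a few lines. This is a behavioral sketch in Python; the part speeds are illustrative examples, not figures from this design:

```python
# Relationship between the memory-core clock and the external data rate
# implied by the n-bit prefetch architecture described above.

def core_clock_mhz(data_rate_mt_s, prefetch_bits):
    """The memory core runs at the external data rate divided by the prefetch depth."""
    return data_rate_mt_s / prefetch_bits

# DDR3-1600 with 8n prefetch: core runs at 1600 / 8 = 200 MHz
print(core_clock_mhz(1600, 8))   # 200.0
# DDR2-800 with 4n prefetch: core also runs at 800 / 4 = 200 MHz
print(core_clock_mhz(800, 4))    # 200.0
```

The two results being equal illustrates the point of the paragraph: DDR3 doubles the external rate without requiring a faster memory core.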
The final DDR3 SDRAM specification recently released by JEDEC describes a
few modifications of this memory with frequencies from 800 to 1600 MHz. The table
below shows the major specifications of the memory modifications listed in the spec:
Considering that the latency of the widespread DDR2-800 SDRAM with 4-4-4
timings equals 10 ns, we can really question the efficiency of DDR3 SDRAM at this time.
It turns out that the new DDR3 can only win due to higher bandwidth that should make
up for the worse latency values. Unfortunately, the transition to DDR3 SDRAM is a forced
measure to some extent: DDR2 has already exhausted its frequency potential completely.
Although we can still push it to 1066 MHz with some allowances, further frequency
increases lower the production yields dramatically, thus increasing the price of DDR2
SDRAM modules. That is why JEDEC didn't standardize DDR2 SDRAM with working
frequencies exceeding 800 MHz, supporting instead the transition to DDR3 technology.
However, DDR3 SDRAM offers a few other useful improvements that will
encourage not only the manufacturers but also the end users to make up their minds in
favor of the new technology. Among these advantages, first of all there is the
lower voltage of the DDR3 SDRAM modules, which dropped down to 1.5 V. It is about 20%
lower than the voltage of DDR2 SDRAM modules, which eventually results in an almost 30%
reduction in power consumption compared with DDR2 memory working at the same
clock speeds. More advanced memory chip manufacturing technologies also contribute
to this positive effect.
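The power figure can be roughly cross-checked, since CMOS dynamic power scales approximately with the square of the supply voltage. This is a back-of-the-envelope sketch, not a measurement:

```python
# Rough estimate of the power saving from dropping the supply from 1.8 V
# (DDR2) to 1.5 V (DDR3), using the V^2 scaling of CMOS dynamic power.

def dynamic_power_ratio(v_new, v_old):
    """CMOS dynamic power scales roughly with V^2 at a fixed clock."""
    return (v_new / v_old) ** 2

ratio = dynamic_power_ratio(1.5, 1.8)
print(f"{(1 - ratio) * 100:.0f}% reduction")  # about a 31% reduction
```

The result is in line with the almost 30% reduction quoted in the text (other effects, such as process improvements, contribute as well).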
The BGA chip packaging also underwent a few modifications, and now it features more
contact pins. This simplifies the chip mounting procedure and increases mechanical stability.
1.2 DDR3 Based Lookup Circuit for High-Performance Network
Processing
With the development of network systems, packet processing techniques are
becoming more important to deal with the massive high-throughput packets of the
internet. Accordingly, advances in memory architectures are required to meet the
emerging bandwidth demands. Content Addressable Memory (CAM) based techniques
are widely used in network equipment for fast table look up. However, in comparison to
Random Access Memory (RAM) technology, CAM technology is restricted in terms of
memory density, hardware cost and power dissipation. Recently, a Hash-CAM circuit,
which combines the merits of the hash algorithm and the CAM function, was proposed to
replace pure CAM based lookup circuits with comparable performance, higher memory
density and lower cost. Most importantly, off-chip high-density, low-cost DDR memory
technology has now become an attractive alternative for the proposed Hash-CAM based
lookup circuit. However, DDR technology is optimized for burst access for cached
processor platforms. As such, efficient DDR Bandwidth utilization is a major challenge
for lookup functions that exhibit short and random memory access patterns. The extreme
low-cost and high memory density features of the DDR technology allow a trade-off
between memory utilization and memory-bandwidth utilization by customizing the
memory access. This, however, requires a custom purpose DDR memory controller that
is optimized to achieve the best read efficiency and highest memory bandwidth. The
objective of this work was to investigate advanced DDR3 SDRAM controller
architectures and derive a customized architecture for the abovementioned problem.
DDR3 SDRAM is the 3rd generation of DDR memories, featuring higher performance and lower
power consumption. In comparison with the earlier generations, DDR1/2 SDRAM, DDR3
SDRAM is a higher density device and achieves higher bandwidth due to the further
increase of the clock rate and the reduction in power consumption, benefiting from a 1.5 V
power supply at 90 nm fabrication technology. With 8 individual banks, DDR3 memory
can be accessed more flexibly, with fewer bank conflicts.
The proposed Hash-CAM based lookup circuit is shown in Figure 1.
The original data and reference address information are stored in the DDR3 SDRAM. A
lookup request (data input) for a given content is pipelined and processed by the Hash
circuit to generate an address. This address value is forwarded to the DDR3 SDRAM
interface, where it is translated into instructions and addresses that are recognized by the
DDR3 memory as an access.
The stored data and addresses in the memory are read back to the Hash-CAM circuit in
order to validate the match. In the case of a match, the corresponding reference
address is returned.
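The lookup flow just described can be sketched as a small behavioral model. Python is used here purely as pseudocode for the hardware; the class and function names are illustrative, not taken from the actual design:

```python
# Behavioral sketch of the Hash-CAM lookup flow: hash the content to an
# address, read (data, reference address) back from the DDR3 SDRAM model,
# validate the match, and return the reference address on a hit.

def simple_hash(key, table_size):
    # stand-in hash function mapping content to a DDR3 address
    return hash(key) % table_size

class HashCamLookup:
    def __init__(self, size=1024):
        self.size = size
        self.memory = {}          # models the DDR3 SDRAM: addr -> (data, ref_addr)

    def insert(self, data, ref_addr):
        self.memory[simple_hash(data, self.size)] = (data, ref_addr)

    def lookup(self, data):
        stored = self.memory.get(simple_hash(data, self.size))
        if stored is not None and stored[0] == data:   # validate the match
            return stored[1]                           # return the reference address
        return None                                    # miss

cam = HashCamLookup()
cam.insert("flow-A", 0x2A)
print(hex(cam.lookup("flow-A")))   # 0x2a
print(cam.lookup("flow-B"))        # None
```

A real Hash-CAM must also handle hash collisions (two contents mapping to the same address); this sketch simply overwrites, which is one of the trade-offs the customized controller is meant to manage.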
1.3 DDR3 Advantages
Lower power
Higher speed
Master reset
More performance
Larger densities
Modules for all applications
CHAPTER 2
Literature survey
2.1 Types of Memory controllers
2.1.1 Double Data Rate-Synchronous dynamic random access Memory (DDR1
SDRAM) controller
Double Data Rate-SDRAM, or simply DDR1, was designed to replace SDRAM.
DDR1 was originally referred to as DDR-SDRAM or simply DDR. When DDR2 was
introduced, DDR became referred to as DDR1. Names of components constantly change
as newer technologies are introduced, especially when the newer technology is based on a
previous one. The principle applied in DDR is exactly as the name implies: double data
rate. DDR actually doubles the rate at which data is transferred by using both the rising and
falling edges of a typical digital pulse. Earlier memory technology such as SDRAM
transferred data after one complete digital pulse; DDR transfers data twice as fast by
transferring data on both the rising and falling edges of the digital pulse, as shown in
the figure below.
2.1.1.1 DDR Digital Pulse
As shown in the above figure, DDR can transfer twice the amount of data per single
digital pulse by using both the rising edge and the falling edge of the digital signal. DDR
can transfer twice as much data as SDRAM.
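The doubling can be expressed numerically. This is a minimal sketch; the 133 MHz clock is just an example figure:

```python
# Transfers per second for single-data-rate vs double-data-rate signaling.

def transfers_per_second(clock_hz, edges_per_cycle):
    """SDR uses one edge per clock cycle; DDR uses both rising and falling edges."""
    return clock_hz * edges_per_cycle

sdr = transfers_per_second(133_000_000, 1)   # SDRAM: one transfer per clock
ddr = transfers_per_second(133_000_000, 2)   # DDR: two transfers per clock
print(ddr // sdr)   # 2
```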
2.2 Double Data Rate-Synchronous dynamic random access Memory (DDR2
SDRAM) controller
DDR2 is the next generation of memory developed after DDR. DDR2 increased
the data transfer rate referred to as bandwidth by increasing the operational frequency to
match the high FSB frequencies and by doubling the prefetch buffer data rate. There will
be more about the memory prefetch buffer data rate later in this section.
DDR2 is a 240-pin DIMM design that operates at 1.8 volts; the lower voltage
counters the heating effect of the higher-frequency data transfer. DDR operates at 2.5 volts
and is a 184-pin DIMM design. DDR2 uses a different motherboard socket than DDR
and is not compatible with motherboards designed for DDR: the DDR2 DIMM key will
not align with the DDR DIMM key. If a DDR2 module is forced into a DDR socket, it will
damage the socket and the memory will be exposed to a high voltage level.
2.3 Double Data Rate-Synchronous dynamic random access Memory (DDR3
SDRAM) controller
DDR3 was the next-generation memory, introduced in the summer of 2007 as the
natural successor to DDR2. DDR3 increased the prefetch buffer size to 8 bits and
increased the operating frequency once again, resulting in higher data transfer rates than its
predecessor DDR2. In addition to the increased data transfer rate, the memory chip voltage
level was lowered to 1.5 V to counter the heating effects of the higher frequency.
By now you can see the trend in memory: increase the prefetch buffer size and
chip operating frequency, and lower the operational voltage level to counter heat. The
physical DDR3 module is also designed with 240 pins, but the notched key is in a different
position to prevent insertion into a motherboard RAM socket designed for DDR2.
DDR3 is both electrically and physically incompatible with previous versions of
RAM. In addition to the higher frequency and lower applied voltage level, DDR3 has a
memory reset option which DDR2 and DDR1 do not. The memory reset allows the
memory to be cleared by a software reset action. Other memory types do not have this
feature, which means the memory state is uncertain after a system reboot. The memory
reset feature ensures that the memory will be clean or empty after a system reboot,
resulting in a more stable memory system. DDR3 uses the same 240-pin design
as DDR2, but the memory module key notch is at a different location.
2.4 COMPARISON OF DDR1, DDR2 AND DDR3
2.5 Other Memory Types
2.5.1 Video Random Access Memory
VRAM is a video version of FPM and is most often used in video accelerator
cards. Because it has two ports, it provides the extra benefit over DRAM of being able to
execute read and write operations simultaneously.
One channel is used to refresh the screen and the other manages image changes. VRAM
tends to be more expensive. Video RAM, also known as multi port dynamic random
access memory (MPDRAM), is a type of RAM used specifically for video adapters or 3-
D accelerators. The "multi port" part comes from the fact that VRAM normally has two
independent access ports instead of one, allowing the CPU and graphics processor to
access the RAM simultaneously. VRAM is located on the graphics card and comes in a
variety of formats, many of which are proprietary.
The amount of VRAM is a determining factor in the resolution and color depth of
the display. VRAM is also used to hold graphics-specific information such as 3-D
geometry data and texture maps. True multi port VRAM tends to be expensive, so today,
many graphics cards use SGRAM (synchronous graphics RAM) instead. Performance is
nearly the same, but SGRAM is cheaper.
2.5.2 Flash Memory
This is a solid-state, nonvolatile, re-writable memory that functions like RAM and a hard
disk combined. If power is lost, all data remains in memory. Because of its high speed,
durability, and low voltage requirements, it is ideal for digital cameras, cell phones,
printers, handheld computers, pagers and audio recorders.
2.5.3 Shadow Random Access Memory
When your computer starts up (boots), minimal instructions for performing the startup
procedures and video controls are stored in ROM (Read Only Memory) in what is
commonly called BIOS. ROM executes slowly. Shadow RAM allows for the capability
of moving selected parts of the BIOS code from ROM to the faster RAM memory.
2.5.4 Static Random Access Memory
Static RAM uses a completely different technology. In static RAM, a form of flip-flop
holds each bit of memory. A flip-flop for a memory cell takes four or six transistors along
with some wiring, but never has to be refreshed. This makes static RAM significantly
faster than dynamic RAM. However, because it has more parts, a static memory cell
takes up a lot more space on a chip than a dynamic memory cell. Therefore, you get less
memory per chip.
Static random access memory uses multiple transistors, typically four to six, for each
memory cell but doesn't have a capacitor in each cell. It is used primarily for cache.
Static RAM is fast and expensive, while dynamic RAM is less expensive and slower, so
static RAM is used to create the CPU's speed-sensitive cache, while dynamic RAM forms
the larger system RAM space.
2.5.5 Dynamic Random Access Memory
Dynamic random access memory has memory cells with a paired transistor and
capacitor requiring constant refreshing. DRAM works by sending a charge through the
appropriate column (CAS) to activate the transistor at each bit in the column. When
writing, the row lines contain the state the capacitor should take on. When reading, the
sense-amplifier determines the level of charge in the capacitor. If it is more than
50 percent, it reads it as a 1; otherwise it reads it as a 0. The counter tracks the refresh
sequence based on which rows have been accessed in what order. The length of time
necessary to do all this is so short that it is expressed in nanoseconds.
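The read-and-refresh behavior described above can be modeled with a toy simulation. The 50 percent sense threshold follows the text; the leak rate and timings are invented, purely illustrative figures:

```python
# Toy model of a DRAM cell read: the sense amplifier compares the remaining
# capacitor charge against a 50% threshold, and a stored '1' leaks toward 0
# until the next refresh restores it.

def sense(charge_fraction):
    """Return the bit value the sense amplifier would resolve."""
    return 1 if charge_fraction > 0.5 else 0

def leak(charge_fraction, leak_per_ms, elapsed_ms):
    """Charge lost to leakage over elapsed_ms (illustrative linear model)."""
    return max(0.0, charge_fraction - leak_per_ms * elapsed_ms)

cell = 1.0                    # freshly written '1'
print(sense(cell))            # 1
cell = leak(cell, 0.01, 40)   # 40 ms without refresh
print(sense(cell))            # 1  (still above the 50% threshold)
cell = leak(cell, 0.01, 40)   # another 40 ms without refresh
print(sense(cell))            # 0  (charge decayed below 50%: data lost)
```

This is exactly why the refresh counter in the controller must walk over every row within the retention period.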
A memory chip rating of 70 ns means that it takes 70 nanoseconds to completely
read and recharge each cell. DRAM is one of the most common types of computer memory
(RAM). It can only hold data for a short period of time and must be refreshed
periodically. DRAMs are measured by storage capability and access time. Storage is
rated in megabytes (8 MB, 16 MB, etc.). Access time is rated in nanoseconds (60 ns, 70 ns,
80 ns, etc.) and represents the amount of time to save or return information. With a 60 ns
DRAM, it would require 60 billionths of a second to save or return information: the
lower the nano speed, the faster the memory operates. DRAM chips require two CPU
wait states for each execution and can only execute either a read or a write operation at one
time. The capacitor in a dynamic RAM memory cell is like a leaky bucket: it needs to be
refreshed periodically or it will discharge to 0. This refresh operation is where dynamic
RAM gets its name. Dynamic RAM has to be dynamically refreshed all of the time or it
forgets what it is holding. The downside of all of this refreshing is that it takes time and
slows down the memory.
Memory cells are etched onto a silicon wafer in an array of columns (bit lines)
and rows (word lines). The intersection of a bit line and word line constitutes the address
of the memory cell. Memory cells alone would be worthless without some way to get
information in and out of them, so the memory cells have a whole support infrastructure
of other specialized circuits. These circuits perform functions such as:
identifying each row and column (row address select and column address select)
keeping track of the refresh sequence (counter)
reading and restoring the signal from a cell (sense amplifier)
telling a cell whether it should take a charge or not (write enable)
Other functions of the memory controller include a series of tasks such as identifying
the type, speed and amount of memory and checking for errors. The traditional RAM
type is DRAM (dynamic RAM). The other type is SRAM (static RAM). SRAM
continues to remember its content, while DRAM must be refreshed every few
milliseconds.
DRAM consists of micro capacitors, while SRAM consists of off/on switches.
Therefore, SRAM can respond much faster than DRAM. SRAM can be made with a rise
time as short as 4 ns. DRAM is by far the cheapest to build. Newer and faster DRAM
types are developed continuously. Currently, there are at least four types:
FPM (Fast Page Mode)
ECC (Error Correcting Code)
EDO (Extended Data Output)
SDRAM (Synchronous Dynamic RAM)
2.5.6 Cache Memory
Cache Memory is fast memory that serves as a buffer between the processor and
main memory. The cache holds data that was recently used by the processor and saves a
trip all the way back to slower main memory. The memory structure of PCs is often
thought of as just main memory, but it's really a five or six level structure: The first two
levels of memory are contained in the processor itself, consisting of the processor's small
internal memory, or registers, and L1 cache, which is the first level of cache, usually
contained in the processor. The third level of memory is the L2 cache, usually contained
on the motherboard. However, the Celeron chip from Intel actually contains 128K of L2
cache within the form factor of the chip. More and more chip makers are planning to put
this cache on board the processor itself. The benefit is that it will then run at the same
speed as the processor, and cost less to put on the chip than to set up a bus and logic
externally from the processor. The fourth level is being referred to as L3 cache. This
cache used to be the L2 cache on the motherboard, but now that some processors include
L1 and L2 cache on the chip, it becomes L3 cache. Usually, it runs slower than the
processor, but faster than main memory. The fifth level (or fourth if you have no "L3
cache") of memory is the main memory itself. The sixth level is a piece of the hard disk
used by the Operating System, usually called virtual memory. Most operating systems
use this when they run out of main memory, but some use it in other ways as well.
This six-tiered structure is designed to efficiently speed data to the processor
when it needs it, and also to allow the operating system to function when levels of main
memory are low. If there were one type of super-fast, super-cheap memory, it could
theoretically satisfy the needs of this entire memory architecture. This will probably
never happen since you don't need very much cache memory to drastically improve
performance, and there will always be a faster, more expensive alternative to the current
form of main memory.
2.5.7 Content-Addressable Memory (CAM)
Content-addressable memory (CAM) is a special type of computer memory used
in certain very high speed searching applications. It is also known as associative memory,
associative storage, or associative array, although the last term is more often used for a
programming data structure.
Hardware associative array:
Unlike standard computer memory (random access memory or RAM) in which
the user supplies a memory address and the RAM returns the data word stored at that
address, a CAM is designed such that the user supplies a data word and the CAM
searches its entire memory to see if that data word is stored anywhere in it. If the data
word is found, the CAM returns a list of one or more storage addresses where the word
was found (and in some architectures, it also returns the data word, or other associated
pieces of data).
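The search behavior just described can be modeled in a few lines. This is a sequential software stand-in for what the hardware does in a single cycle; the names are illustrative:

```python
# Minimal behavioral model of a CAM: supply a data word, get back every
# address where that word is stored.

class Cam:
    def __init__(self, words):
        self.words = list(words)   # stored contents, indexed by address

    def search(self, word):
        # a real CAM compares the word against every location in parallel,
        # in one operation; this model expresses the same result sequentially
        return [addr for addr, w in enumerate(self.words) if w == word]

cam = Cam([0xDEAD, 0xBEEF, 0xDEAD, 0xCAFE])
print(cam.search(0xDEAD))   # [0, 2]
print(cam.search(0x1234))   # []
```

The inversion of the usual address-in/data-out interface is what makes CAM attractive for table lookup, at the cost of density and power noted above.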
CHAPTER 3
Design of DDR3 SDRAM CONTROLLER
3.1 Introduction
The DDR3 SDRAM uses double data rate architecture to achieve high-speed
operation. The double data rate architecture is 8n-prefetch architecture with an interface
designed to transfer two data words per clock cycle at the I/O pins. A single read or write
access for the DDR3 SDRAM consists of a single 8n-bit-wide, one-clock-cycle data
transfer at the internal DRAM core and eight corresponding n-bit-wide, one-half-clock-cycle
data transfers at the I/O pins. The differential data strobe (DQS, DQS#) is
transmitted externally, along with data, for use in data capture at the DDR3 SDRAM
input receiver. DQS is center-aligned with data for WRITEs. The read data is transmitted
by the DDR3 SDRAM and edge-aligned to the data strobes.
The DDR3 SDRAM operates from a differential clock (CK and CK#). The
crossing of CK going HIGH and CK# going LOW is referred to as the positive edge of
CK. Control, command, and address signals are registered at every positive edge of CK.
Input data is registered on the first rising edge of DQS after the WRITE preamble, and
output data is referenced on the first rising edge of DQS after the READ preamble. Read
and write accesses to the DDR3 SDRAM are burst-oriented. Accesses start at a selected
location and continue for a programmed number of locations in a programmed sequence.
Accesses begin with the registration of an ACTIVATE command, which is then followed
by a READ or WRITE command. The address bits registered coincident with the
ACTIVATE command are used to select the bank and row to be accessed. The address
bits registered coincident with the READ or WRITE commands are used to select the
bank and the starting column location for the burst access. DDR3 SDRAM uses READ
and WRITE burst lengths of BL8 and BC4. An auto precharge function may be enabled to provide a self-timed
row precharge that is initiated at the end of the burst access.
As with standard DDR SDRAM, the pipelined, multibank architecture of DDR3
SDRAM allows for concurrent operation, thereby providing high bandwidth by hiding
row precharge and activation time. A self refresh mode is provided, along with a power-
saving, power-down mode.
3.2 Functional Block Diagram
Fig. 3.1: Functional Block Diagram.
The functional block diagram of the DDR3 controller is shown in Figure 3.1. The
architecture of the DDR3 SDRAM controller consists of an Initialization FSM, Command
FSM, data path, bank control, clock counter, refresh counter, Address FIFO, Command
FIFO, Wdata FIFO and R_data register.
The Initialization FSM generates the proper i_State to initialize the modules in the design.
The Command FSM generates the c_State to perform the normal write, read and fast write/read
operations. The data path module performs the latching and dispatching of the data
between the Hash-CAM unit and the DDR3 SDRAM banks. The Address FIFO gives the address
to the Command FSM so the bank control unit can open a particular bank and address
location in that bank. The Wdata FIFO provides the data to the data path module in
normal and fast write operations. The R_data register gets the data from the data path module
in normal and fast read operations.
In this project, the designed DDR3 controller provides an interface between the Hash-CAM
circuit and the DDR memory banks. If the data word is found, the CAM returns a
list of one or more storage addresses where the word was found (and in some
architectures, it also returns the data word, or other associated pieces of data). Because a
CAM is designed to search its entire memory in a single operation, it is much faster than
RAM in virtually all search applications.
The DDR3 controller gets the address, data and control from the Hash-CAM
circuit into the Address FIFO, Write data FIFO and Control FIFO respectively.
3.2.1 Address fifo
The DDR3 SDRAM controller gets the address from the Address FIFO so that the
controller can perform a read from, or a write to, the memory address
location specified by the Address FIFO. Here the Address FIFO width is 13 bits and the stack
depth is 8.
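A software model of such a FIFO, using the 13-bit width and depth of 8 stated above, can look as follows. The interface names are illustrative; the real design is hardware, and a real producer would stall on a full flag rather than drop entries:

```python
# Behavioral model of the Address FIFO: 13-bit entries, depth 8.
from collections import deque

class AddrFifo:
    WIDTH_MASK = (1 << 13) - 1   # 13-bit address entries
    DEPTH = 8                    # stack depth of 8

    def __init__(self):
        self.q = deque()

    def push(self, addr):
        if len(self.q) == self.DEPTH:
            return False                     # full: producer must wait
        self.q.append(addr & self.WIDTH_MASK)
        return True

    def pop(self):
        return self.q.popleft() if self.q else None   # None models "empty"

fifo = AddrFifo()
for a in range(10):
    fifo.push(a)            # the two pushes beyond depth 8 are refused
print(len(fifo.q))          # 8
print(fifo.pop())           # 0  (first-in, first-out)
```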
3.2.2 Write data fifo
The DDR3 SDRAM controller gets the data from the Write data FIFO during a write operation
into the memory address location specified by the Address FIFO. Here the Write data FIFO
width is 64 bits and the stack depth is 8.
3.2.3 Control fifo
The DDR3 SDRAM controller gets the command from the Control FIFO so that the controller can
perform a read from, or a write to, the memory address location specified
by the Address FIFO. Here the Control FIFO width is 2 bits and the stack depth is 8. If the Control
FIFO gives 01, the DDR3 controller performs the normal write operation. If the control is
10, the DDR3 controller performs the normal read operation, and if the control is 11, the DDR3
controller performs the fast read operation.
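The 2-bit command decode can be sketched as a small lookup. The exact encoding is inferred from the three operations this report lists (normal write, normal read, fast read), so treat the mapping as an assumption:

```python
# Assumed decode of the 2-bit Control FIFO command word.

def decode_command(ctrl):
    return {
        0b01: "normal write",
        0b10: "normal read",
        0b11: "fast read",
    }.get(ctrl, "no operation")   # 00 treated as no request

print(decode_command(0b10))   # normal read
print(decode_command(0b11))   # fast read
print(decode_command(0b00))   # no operation
```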
3.2.4 Read data register
When the DDR3 controller performs a normal read or fast read operation, the Read data
register gets the data and sends it to the Hash-CAM circuit.
3.2.5 Initialization Finite State Machine
Fig. 3.2: Initial FSM State Diagram.
Before normal memory accesses can be performed, DDR3 needs to be
initialized by a sequence of commands. The INIT_FSM state machine handles this
initialization. Figure 3.2 shows the state diagram of the INIT_FSM state machine.
During reset, the INIT_FSM is forced to the i_IDLE state. After reset, the
sys_dly_200US signal is sampled to determine if the 200 µs power/clock
stabilization delay has completed. After the power/clock stabilization is complete,
the DDR initialization sequence begins and the INIT_FSM switches from
i_IDLE to the i_NOP state and, in the next clock, to i_PRE.
The initialization starts with the PRECHARGE ALL command. Next, a
LOAD MODE REGISTER command is applied to the extended mode
register to enable the DLL inside the DDR, followed by another LOAD MODE
REGISTER command to the mode register to reset the DLL. Then a PRECHARGE
command is applied to bring all banks in the device to the idle state. Then two
AUTO REFRESH commands are issued, and then the LOAD MODE REGISTER command
to configure the DDR for a specific mode of operation. After the LOAD
MODE REGISTER command is issued and the tMRD timing delay is satisfied, INIT_FSM
goes to the i_ready state and remains there for the normal memory access cycles
unless reset is asserted. Also, signal sys_init_done is set to high to indicate the
DDR initialization is completed. The i_PRE, i_AR1, i_AR2, i_EMRS and i_MRS
states are used for issuing DDR commands. The LOAD MODE REGISTER
command configures the DDR by loading data into the mode register through the
address bus. The data present on the address bus (ddr_add) during the LOAD
MODE REGISTER command is loaded to the mode register. The mode register
contents specify the burst length, burst type, CAS latency, etc. A
PRECHARGE/AUTO PRECHARGE command moves all banks to idle state. As
long as all banks of the DDR are in idle state, mode register can be reloaded with
different value thereby changing the mode of operation. However, in most
applications the mode register value will not be changed after initialization. This
design assumes the mode register stays the same after initialization.
As mentioned above, certain timing delays (tRP, tRFC, tMRD) need to be
satisfied before another non-NOP command can be issued. These SDRAM delays
vary from speed grade to speed grade and sometimes from vendor to vendor. To
accommodate this without sacrificing performance, the designer needs to modify
the HDL code for the specific delays and clock period (tCK). According to these
timing values, the number of clocks the state machine will stay at i_tRP, i_tRFC1,
i_tRFC2, i_tMRD states will be determined after the code is synthesized. In cases
where tCK is larger than the timing delay, the state machine doesn't need to
switch to the timing delay states and can go directly to the command states. The
dashed lines in Figure 3.3 show the possible state switching paths.
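The command sequence above can be walked through in software. The state names follow the text; the timing-delay states (i_tRP, i_tRFC1/2, i_tMRD) are collapsed for brevity, and this is a sketch of the ordering only, not of the RTL:

```python
# Ordered INIT_FSM walk-through of the DDR initialization sequence.

INIT_SEQUENCE = [
    "i_IDLE", "i_NOP",
    "i_PRE",            # PRECHARGE ALL
    "i_EMRS",           # LOAD MODE REGISTER (extended): enable the DLL
    "i_MRS",            # LOAD MODE REGISTER: reset the DLL
    "i_PRE",            # PRECHARGE: bring all banks to idle
    "i_AR1", "i_AR2",   # two AUTO REFRESH commands
    "i_MRS",            # LOAD MODE REGISTER: configure the operating mode
    "i_ready",          # sys_init_done goes high; normal accesses may begin
]

def run_init(power_stable):
    """Return the final state reached by the initialization sequence."""
    if not power_stable:
        return "i_IDLE"          # still waiting for the 200 us stabilization delay
    state = "i_IDLE"
    for state in INIT_SEQUENCE[1:]:
        pass                     # real hardware also waits tRP/tRFC/tMRD here
    return state

print(run_init(False))   # i_IDLE
print(run_init(True))    # i_ready
```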
3.6 Different states of Initial FSM:
3.6.1 Idle:
When reset is applied, the initial FSM is forced to the IDLE state irrespective of
which state it is actually in. While the system is idle, it remains in the IDLE state
without performing any operations.
3.6.2 No Operation (NOP):
The NO OPERATION (NOP) command is used to instruct the selected
DDR SDRAM to perform a NOP (CS# is LOW while RAS#, CAS#, and WE# are
HIGH). This prevents unwanted commands from being registered during idle or
wait states. Operations already in progress are not affected.
3.6.3 Precharge (PRE):
The PRECHARGE command is used to deactivate the open row in a
particular bank or the open row in all banks as shown in Figure 3.3. The value on
the BA0, BA1 inputs selects the bank, and the A10 input selects whether a single
bank is precharged or whether all banks are precharged.
Fig. 3.3: Precharge Command.
3.6.4 Auto Refresh (AR):
AUTO REFRESH is used during normal operation of the DDR SDRAM
and is analogous to CAS#-before-RAS# (CBR) refresh in DRAMs. This command
is nonpersistent, so it must be issued each time a refresh is required. All banks
must be idle before an AUTO REFRESH command is issued.
3.6.5 Load Mode Register (LMR):
The mode registers are loaded via inputs A0-An. The LOAD MODE
REGISTER command can only be issued when all banks are idle, and a
subsequent executable command cannot be issued until tMRD is met.
3.6.6 Read/Write Cycle:
Figure 4.1 shows the state diagram of CMD_FSM, which handles the
read, write and refresh of the SDRAM. The CMD_FSM state machine is
initialized to c_idle during reset. After reset, CMD_FSM stays in c_idle as long as
sys_INIT_DONE is low, which indicates the SDRAM initialization sequence is not
yet completed.
Once the initialization is done, sys_ADSn and sys_REF_REQ will be
sampled at the rising edge of every clock cycle. A logic high sampled on
sys_REF_REQ will start a SDRAM refresh cycle. This is described in the
following section. If logic low is sampled on both sys_REF_REQ and sys_ADSn,
a system read cycle or system write cycle will begin. These system cycles are
made up of a sequence of SDRAM commands.
3.2.6 Command FSM State Diagram:
Fig. 4.3: Command FSM State Diagram for Normal Write and Read.
Figure 4.1 shows the state diagram of CMD_FSM, which handles the read,
write and refresh of the DDR. The CMD_FSM state machine is initialized to
c_idle during reset. After reset, CMD_FSM stays in c_idle as long as
sys_init_done is low, which indicates the DDR initialization sequence is not yet
completed. From this state, a READA/WRITEA/REFRESH cycle starts depending
upon the sys_adsn/rd_wr_req/ref_req signals as shown in the state diagram.
All rows are in the closed status after the DDR initialization. The rows
need to be opened before they can be accessed. However, only one row in the
same bank can be opened at a time. Since there are four banks, there can be at
most four rows opened at the same time. If a row in one bank is currently opened,
it needs to be closed before another row in the same bank can be opened. ACTIVE
command is used to open the rows and PRECHARGE (or the AUTO
PRECHARGE hidden in the WRITE and READ commands as used in this design)
is used to close the rows. When issuing the commands for opening or closing the
rows, both row address and bank address need to be provided.
In this design, the ACTIVE command is issued for each read or write
access to open the row. After the tRCD delay is satisfied, a READA or WRITEA
command is issued with ddr_add[10] high to enable AUTO PRECHARGE,
which closes the row after the access. Therefore, the number of clocks required
for a read/write cycle is fixed and the access can be random over the full address
range. Read or write is determined by the sys_r_wn status sampled at the rising
edge of the clock before the tRCD delay is satisfied. If a logic high is sampled,
the state machine switches to c_READA; if a logic low is sampled, it switches to
c_WRITEA.
For read cycles, the state machine switches from c_READA to c_cl for
the CAS latency, then switches to c_rdata for transferring data from the DDR to
the processor. The burst length determines the number of clocks the state machine
stays in the c_rdata state. After the data is transferred, it switches back to c_idle.
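A minimal sketch of this read path, with illustrative state encodings and a shared counter for the CAS-latency wait and the burst beats:

```verilog
module read_path_sketch #(parameter CL = 2, BL = 4)(
  input            clk, rst_n, start_read,
  output reg [1:0] cstate
);
  localparam [1:0] C_IDLE = 2'd0, C_READA = 2'd1, C_CL = 2'd2, C_RDATA = 2'd3;
  reg [3:0] cnt;
  always @(posedge clk or negedge rst_n) begin
    if (!rst_n) begin cstate <= C_IDLE; cnt <= 0; end
    else case (cstate)
      C_IDLE:  if (start_read) cstate <= C_READA;
      C_READA: begin cstate <= C_CL; cnt <= 0; end  // READA issued here
      C_CL:    if (cnt == CL-1) begin cstate <= C_RDATA; cnt <= 0; end
               else cnt <= cnt + 1;                 // wait out CAS latency
      C_RDATA: if (cnt == BL-1) cstate <= C_IDLE;   // one clock per burst beat
               else cnt <= cnt + 1;
    endcase
  end
endmodule
```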
For write cycles, the state machine switches from c_WRITEA to c_wdata
for transferring data from the bus master to the DDR, then switches to c_tDAL.
As with reads, the number of clocks the state machine stays in the c_wdata state
is determined by the burst length. The time delay tDAL is the sum of the WRITE
recovery time tWR and the AUTO PRECHARGE timing delay tRP. After the
rising clock edge of the last data in the burst sequence, no command other than
NOP can be issued to the DDR until tDAL is satisfied.
The dashed lines indicate possible state switching paths when the tCK
period is larger than the timing delay specification.
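The tDAL wait can be turned into a whole number of clocks by rounding up, for example (the timing values below are illustrative, not device data):

```verilog
// tDAL clocks = ceil((tWR + tRP) / tCK); times are scaled to picoseconds
// so the arithmetic stays in integers. Example values only.
module tdal_sketch;
  localparam integer tCK_PS = 7500, tWR_PS = 15000, tRP_PS = 15000;
  localparam integer TDAL_CLKS = (tWR_PS + tRP_PS + tCK_PS - 1) / tCK_PS;
  // With these numbers TDAL_CLKS = 4: NOPs must fill four clocks after
  // the last write data before any other command is issued.
endmodule
```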
Command FSM with fast read operation
Fast read can be achieved by switching banks. Bank control logic is used to issue
desired bank addresses at each cycle when a bank active command or read command is
issued. The state machine for this method is given in Figure 4(b). The proposed controller
provides the control interface for switching between normal write/read mode and fast
read mode. Unlike other data processing techniques, the distinct characteristic of the
random data lookup is the uncertainty of the incoming data. In this work, address FIFOs
are applied to buffer the row/column addresses separately for each read request. The
empty flag of the row address FIFO (addr_fifo_empty) is checked in order to evaluate
whether the next command is active (ACT) or read (RDA).
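A sketch of that decision point (names illustrative): the empty flag steers the FSM between opening a row for the next queued request and simply draining reads.

```verilog
// Hypothetical next-command select for fast-read mode. While the row
// address FIFO holds pending requests, ACT to the next bank is issued
// alongside the RDA stream; when it runs empty, only RDA remains.
module fastread_decide_sketch(
  input            clk, fast_rd_mode, addr_fifo_empty,
  output reg [1:0] next_cmd
);
  localparam [1:0] CMD_NOP = 2'd0, CMD_ACT = 2'd1, CMD_RDA = 2'd2;
  always @(posedge clk) begin
    if (fast_rd_mode)
      next_cmd <= addr_fifo_empty ? CMD_RDA : CMD_ACT;
    else
      next_cmd <= CMD_NOP;  // normal mode handled elsewhere
  end
endmodule
```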
Command FSM Fast read with auto precharge
4.2 Different states of Command FSM:
4.2.1 Refresh Cycle:
DDR memory needs a periodic refresh to hold its data. This periodic
refresh is done using the AUTO REFRESH command. All banks must be idle before
an AUTO REFRESH command is issued. In this design all banks will be in the idle
state, as every read/write operation uses auto precharge.
4.2.2 Active (ACT):
The ACTIVE command is used to open (or activate) a row in a particular
bank for a subsequent access, like a read or a write, as shown in Figure 4.2. The
value on the BA0, BA1 inputs selects the bank, and the address provided on inputs
A0-An selects the row.
Fig. 4.2: Activating a Specific Row in a Specific Bank.
4.2.3 Read:
The READ command is used to initiate a burst read access to an active row,
as shown in Figure 4.3. The value on the BA0, BA1 inputs selects the bank, and
the address provided on inputs A0-Ai (where Ai is the most significant column
address bit for a given density and configuration) selects the starting column
location.
Fig. 4.3: Read Command
4.2.4 Write:
The WRITE command is used to initiate a burst write access to an active row as
shown in Figure 4.4. The value on the BA0, BA1 inputs selects the bank, and the address
provided on inputs A0-Ai (where Ai is the most significant column address bit for a
given density and configuration) selects the starting column location.
Fig. 4.4: Write Command
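The READ and WRITE encodings, and the A10 auto-precharge variant used in this design, can be summarized as constants (a sketch; the bit order {CS#, RAS#, CAS#, WE#} is a local convention, not a device requirement):

```verilog
// Command encodings: READ = CS#0 RAS#1 CAS#0 WE#1, WRITE = the same
// with WE#0. Driving A10 high with either turns it into READA/WRITEA.
module rw_cmd_sketch;
  localparam [3:0] CMD_READ  = 4'b0101,  // {cs_n, ras_n, cas_n, we_n}
                   CMD_WRITE = 4'b0100;
  localparam       A10_AUTO_PRECHARGE = 1'b1;
endmodule
```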
Similar to the FP and EDO DRAM, row address and column address are
required to pinpoint the memory cell location of the SDRAM access. Since
SDRAM is composed of four banks, bank address needs to be provided as well.
For sequential access applications and those with page memory
management, proper address assignments and use of the SDRAM pipeline
feature deliver the highest-performance SDRAM controller. However, this type of
controller design is closely tied to the bus master cycle specification and does
not fit general applications. Therefore, this SDRAM controller design does not
implement these custom features.
4.2.5 Refresh Cycle:
As with other DRAMs, memory refresh is required. An SDRAM
refresh request is generated by asserting the sdr_REF_REQ signal of the controller.
The sdr_REF_ACK signal acknowledges recognition of sdr_REF_REQ
and remains active throughout the whole refresh cycle. The sdr_REF_REQ signal
must be held until sdr_REF_ACK goes active in order to be recognized
as a refresh cycle. Note that no system read/write access cycles are allowed when
sdr_REF_ACK is active; all system interface cycles are ignored during this
period. The sdr_REF_REQ assertion must be removed upon receipt of the
sdr_REF_ACK acknowledge, otherwise another refresh cycle will be
performed.
Upon receipt of sdr_REF_REQ assertion, the state machine CMD_FSM
enters the c_AR state to issue an AUTO REFRESH command to the SDRAM.
After tRFC time delay is satisfied, CMD_FSM returns to c_idle.
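The request/acknowledge handshake described above could look like this in outline (signal names follow the text; trfc_done and in_c_idle are assumed ticks from a tRFC counter and the state decode):

```verilog
module ref_ack_sketch(
  input      clk, rst_n, sdr_ref_req, trfc_done, in_c_idle,
  output reg sdr_ref_ack
);
  // Ack rises when the request is accepted from c_idle and stays high
  // for the whole refresh cycle; it drops once tRFC is satisfied.
  always @(posedge clk or negedge rst_n) begin
    if (!rst_n)                        sdr_ref_ack <= 1'b0;
    else if (sdr_ref_req && in_c_idle) sdr_ref_ack <= 1'b1;
    else if (trfc_done)                sdr_ref_ack <= 1'b0;
  end
endmodule
```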
4.2.6 Data Path Control:
The data path module performs data latching and dispatching based on
the command FSM states. It provides the interface between the read data register
and the memory banks.
4.2.7 Bank Control:
The bank control module controls all eight banks, depending on the
istate and cstate signals, by sending the required control signals.
4.2.8 Timing Diagrams:
Figures 4.5 and 4.6 are the read-cycle and write-cycle timing diagrams
of the reference design with a CAS latency of two cycles and a burst length of
four. The timing diagrams may differ depending on the values of the timing delays
tMRD/tRP/tRFC/tRCD/tWR, the clock period tCK, the CAS latency, and the
burst length; the total number of clocks for read and write cycles is decided by
these factors. In the example shown in the figures, the read cycle takes 10 clocks
and the write cycle takes 9 clocks.
The state variable c_State of CMD_FSM is also shown in these figures.
Note that the ACTIVE, READ, WRITE commands are asserted one clock after the
c_ACTIVE, c_READA, c_WRITEA states respectively.
The regions filled with slashes in the system interface input
signals of these figures are don't-care values. For example, signal sys_R_Wn needs
to be valid only at the clock before CMD_FSM switches to the c_READA or
c_WRITEA states. Depending on the values of tRCD and tCK, this means the
signal sys_R_Wn needs to be valid at state c_ACTIVE or at the last clock of state
c_tRCD.
Fig. 4.5: Read Cycle Timing Diagram.
Fig. 4.6: Write Cycle Timing Diagram.
CHAPTER 4
Results
4.1 Simulation results
4.1.1 Address fifo
4.1.2 Control fifo
4.1.3 Write data fifo
4.1.4 initialization fsm
4.1.5 Command fsm
4.1.5.1 Normal Write operation
4.1.5.2 Normal Read operation
4.1.5.3 Fast Read operation
4.1.6 Data path control
4.1.7 Clock counter
4.1.8 Refresh Counter
4.1.9 Bank Control
4.1.10 Top module
4.1.10.1 Normal Write operation (a)
4.1.10.1 Normal Write operation (b)
4.1.10.2 Normal Read operation (a)
4.1.10.2 Normal Read operation (b)
4.1.10.3 Fast Read operation (a)
4.1.10.3 Fast Read operation (b)
4.2 Synthesis results
4.2.1 Block Level Schematic
4.2.2 Register Transfer Level Schematic
4.2.3 Technology Schematic
4.2.4 Advanced HDL Synthesis Report
Macro Statistics
# ROMs : 1
8x3-bit ROM : 1
# Adders/Subtractors : 14
3-bit adder : 6
3-bit subtractor : 4
4-bit addsub : 3
4-bit subtractor : 1
# Counters : 13
3-bit down counter : 5
3-bit up counter : 6
4-bit down counter : 1
5-bit up counter : 1
# Registers : 1013
Flip-Flops : 1013
# Multiplexers : 5
13-bit 8-to-1 multiplexer : 1
2-bit 8-to-1 multiplexer : 1
5-bit 4-to-1 multiplexer : 2
64-bit 8-to-1 multiplexer : 1
4.2.5 Final Report
Final Results
RTL Top Level Output File Name : DDR3_top.ngr
Top Level Output File Name : DDR3_top
Output Format : NGC
Optimization Goal : Speed
Keep Hierarchy : NO
Design Statistics
# IOs : 262
Cell Usage :
# BELS : 474
# GND : 1
# INV : 14
# LUT2 : 25
# LUT2_D : 3
# LUT2_L : 3
# LUT3 : 114
# LUT3_D : 7
# LUT3_L : 8
# LUT4 : 204
# LUT4_D : 8
# LUT4_L : 22
# MUXF5 : 49
# MUXF6 : 15
# VCC : 1
# FlipFlops/Latches : 406
# FD : 44
# FDCE : 12
# FDE : 206
# FDR : 87
# FDRE : 31
# FDS : 26
# Shift Registers : 2
# SRL16 : 2
# Clock Buffers : 2
# BUFGP : 2
# IO Buffers : 195
# IBUF : 83
# OBUF : 112
4.2.6 Device utilization summary:
Selected Device: 3s500efg320-5
Number of Slices: 267 out of 4656 5%
Number of Slice Flip Flops: 298 out of 9312 3%
Number of 4 input LUTs: 410 out of 9312 4%
Number used as logic: 408
Number used as Shift registers: 2
Number of IOs: 262
Number of bonded IOBs: 197 out of 232 84%
IOB Flip Flops: 108
Number of GCLKs: 2 out of 24 8%
4.2.7 Timing Summary:
Speed Grade: -5
Minimum period: 5.952ns (Maximum Frequency: 168.010MHz)
Minimum input arrival time before clock: 5.546ns
Maximum output required time after clock: 4.040ns
4.3 ADVANTAGES
1. Higher bandwidth performance, effectively up to 2400 MHz.
2. Performance increase at low power (longer battery life in laptops).
3. Enhanced low power features with improved thermal design (cooler).
4. Compared with DDR SDRAM, the supply voltage of DDR3 SDRAM was lowered
from 2.5V to 1.5V. This improves power consumption and heat generation, as well
as enabling denser memory configurations for higher capacities.
5. DDR memories achieve nearly twice the bandwidth of single data rate
SDRAM by double pumping (transferring data on both the rising and falling edges
of the clock signal) without increasing the clock frequency.
6. DDR SDRAM is a particularly expensive alternative to DDR3 SDRAM,
and most manufacturers have dropped its support from their chipsets.
7. CAS latency is lower compared to DDR SDRAM.
8. SDR SDRAM can accept one command and transfer one word of data per clock
cycle, at typical clock frequencies of 100 and 133 MHz; DDR transfers two data
words per clock cycle and supports clock frequencies up to 200 MHz.
9. Low power consumption.
10. Low manufacturing cost.
11. Low-voltage 1.5V operation and a reduced chip count provide significant
power savings.
CHAPTER 5
FUTURE SCOPE
1. DDR4 SDRAM is the 4th generation of DDR SDRAM.
2. DDR3 SDRAM improves on DDR SDRAM by using differential signaling
and lower voltages, giving it significant performance advantages over DDR
SDRAM.
3. DDR3 SDRAM standards are still being developed and improved.
DDR SDRAM Standard    Frequency (MHz)    Voltage (V)
DDR                   400-533            2.5
DDR2                  667-800            1.8
DDR3                  1066 to ...        1.5
4. Higher frequencies enable higher rates of data transfer.
5. DDR3 SDRAM (Double Data Rate Three Synchronous Dynamic Random
Access Memory) is the third generation of DDR SDRAM.
6. Reduced power consumption due to 90 nm fabrication technology.
7. Pre-fetch buffer is doubled to 8 bits to further increase performance.
Disadvantages:
Commonly higher CAS latency, though this is compensated by the higher
bandwidth, which increases overall performance in specific applications. DDR3
also generally costs much more than equivalent DDR2 memory.
Conclusion
In this project we have designed a high-speed DDR3 SDRAM controller
with 64-bit data transfer, which synchronizes the transfer of data between the DDR
RAM and external peripheral devices such as host computers and laptops. The
advantage of this controller over SDR SDRAM, DDR1 SDRAM, and DDR2
SDRAM controllers is that it synchronizes the data transfer, the data transfer is
twice as fast as the previous generation, and the production cost is very low.
The design was implemented in Verilog HDL, simulated using ModelSim,
and synthesized using the Xilinx tool.
CHAPTER 6
REFERENCES
1. A. J. McAuley, et al., "Fast Routing Table Lookup Using CAMs," Proceedings of the
12th Annual Joint Conference of the IEEE Computer and Communications Societies
(INFOCOM), Vol. 3, March 1993, pp. 1382-1391.
2. X. Yang, et al., "High Performance IP Lookup Circuit Using DDR SDRAM," IEEE
International SOC Conference (SOCC), Sept. 2008, pp. 371-374.
3. G. Allan, "The Love/Hate Relationship with DDR SDRAM Controllers," MOSAID
Technologies Whitepaper, 2006.
4. H. Kim, et al., "High-Performance and Low-Power Memory-Interface Architecture for
Video Processing Applications," IEEE Transactions on Circuits and Systems for Video
Technology, Vol. 11, Nov. 2001, pp. 1160-1170.
5. E. G. T. Jaspers, et al., "Bandwidth Reduction for Video Processing in Consumer
Systems," IEEE Transactions on Consumer Electronics, Vol. 47, No. 4, Nov. 2001, pp.
885-894.
6. N. Zhang, et al., "High Performance and High Efficiency Memory Management System
for H.264/AVC Application in the Dual-Core Platform," ICASE, Oct. 2006, pp. 5719-5722.
7. J. Zhu, et al., "High Performance Synchronous DRAMs Controller in H.264 HDTV
Decoder," Proceedings of the International Conference on Solid-State and Integrated
Circuit Technology, Vol. 3, Oct. 2004, pp. 1621-1624.
8. "High-Performance DDR3 SDRAM Interface in Virtex-5 Devices," Xilinx, XAPP867
(v1.0), Sept. 24, 2007.
9. T. Mladenov, "Bandwidth, Area Efficient and Target Device Independent DDR
SDRAM Controller," Proceedings of World Academy of Science, Engineering and
Technology, Vol. 18, Dec. 2006, pp. 102-106.
10. "DDR3 SDRAM Specification (JESD79-3A)," JEDEC Standard, JEDEC Solid State
Technology Association, Sept. 2007.
11. www.altera.com/literature/ug/ug_altmemphy.pdf, "External DDR Memory PHY
Interface Megafunction User Guide (ALTMEMPHY)," accessed 23 Feb. 2009.
APPENDIX
VERILOG HDL
Overview
Hardware description languages, such as Verilog, differ from software
programming languages because they include ways of describing the propagation of time
and signal dependencies (sensitivity). There are two assignment operators, a blocking
assignment (=), and a non-blocking (<=) assignment. The non-blocking assignment
allows designers to describe a state-machine update without needing to declare and use
temporary storage variables. Since these concepts are part of the Verilog language
semantics, designers could quickly write descriptions of large circuits, in a relatively
compact and concise form. At the time of Verilog introduction (1984), Verilog
represented a tremendous productivity improvement for circuit designers who were
already using graphical schematic-capture, and specially-written software programs to
document and simulate electronic circuits.
The designers of Verilog wanted a language with syntax similar to the C
programming language, which was already widely used in engineering software
development. Verilog is case-sensitive, has a basic preprocessor (though less
sophisticated than ANSI C/C++), and equivalent control flow keywords (if/else, for,
while, case, etc.), and compatible language operators precedence. Syntactic differences
include variable declaration (Verilog requires bit-widths on net/reg types), demarcation
of procedural-blocks (begin/end instead of curly braces {}), and many other minor
differences.
A Verilog design consists of a hierarchy of modules. Modules encapsulate design
hierarchy, and communicate with other modules through a set of declared input, output,
and bidirectional ports. Internally, a module can contain any combination of the
following: net/variable declarations concurrent and sequential statement blocks and
instances of other modules. Sequential statements are placed inside a begin/end and
executed in sequential order within the block. But the blocks themselves are executed
concurrently, qualifying Verilog as a Dataflow language.
Verilog's concept of a 'wire' consists of both signal values (4-state: 1, 0, floating,
undefined) and strengths (strong, weak, etc.). This system allows abstract modeling of
shared signal lines, where multiple sources drive a common net. When a wire has
multiple drivers, the wire's (readable) value is resolved by a function of the source drivers
and their strengths.
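A small sketch of that resolution: two continuous assignments drive one net, and the simulator resolves the readable value from the drivers and their strengths (both released gives z; a conflict of equal strengths gives x):

```verilog
module wired_bus_demo(
  input  a_en, b_en, a_val, b_val,
  output bus
);
  // Two drivers on the same net; each releases the bus (z) when its
  // enable is low, so the other driver's value wins the resolution.
  assign bus = a_en ? a_val : 1'bz;
  assign bus = b_en ? b_val : 1'bz;
endmodule
```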
A subset of statements in the Verilog language is synthesizable. Verilog modules
that conform to a synthesizable coding-style, known as RTL (register transfer level), can
be physically realized by synthesis software. Synthesis-software algorithmically
transforms the (abstract) Verilog source into a net list, a logically-equivalent description
consisting only of elementary logic primitives (AND, OR, NOT, flip-flops, etc.) that are
available in a specific VLSI technology. Further manipulations to the net list ultimately
lead to a circuit fabrication blueprint (such as a photo mask-set for an ASIC, or a bit
stream-file for an FPGA).
History, Beginning
Verilog was invented by Phil Moorby and Prabhu Goel during the winter of
1983/1984 at Automated Integrated Design Systems (later renamed to Gateway Design
Automation in 1985) as a hardware modeling language. Gateway Design Automation was
later purchased by Cadence Design Systems in 1990. Cadence now has full proprietary
rights to Gateway's Verilog and the Verilog-XL logic simulator.
Verilog-95
With the increasing success of VHDL at the time, Cadence decided to make the
language available for open standardization. Cadence transferred Verilog into the public
domain under the Open Verilog International (OVI) (now known as Accellera)
organization. Verilog was later submitted to IEEE and became IEEE Standard 1364-
1995, commonly referred to as Verilog-95.
In the same time frame Cadence initiated the creation of Verilog-A to put
standards support behind its analog simulator Spectre. Verilog-A was never intended to
be a standalone language and is a subset of Verilog-AMS which encompassed Verilog-
95.
Verilog 2001
Extensions to Verilog-95 were submitted back to IEEE to cover the deficiencies
that users had found in the original Verilog standard. These extensions became IEEE
Standard 1364-2001 known as Verilog-2001.
Verilog-2001 is a significant upgrade from Verilog-95. First, it adds explicit
support for (2's complement) signed nets and variables. Previously, code authors had to
perform signed-operations using awkward bit-level manipulations (for example, the
carry-out bit of a simple 8-bit addition required an explicit description of the boolean-
algebra to determine its correct value.) The same function under Verilog-2001 can be
more succinctly described by one of the built-in operators: +, -, /, *, >>>. A generate/end
generate construct (similar to VHDL's generate/end generate) allows Verilog-2001 to
control instance and statement instantiation through normal decision-operators
(case/if/else). Using generate/end generate, Verilog-2001 can instantiate an array of
instances, with control over the connectivity of the individual instances. File I/O has been
improved by several new system-tasks. And finally, a few syntax additions were
introduced to improve code readability (e.g., always @*, named-parameter override, C-
style function/task/module header declaration.) Verilog-2001 is the dominant flavor of
Verilog supported by the majority of commercial EDA software packages.
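Two of the Verilog-2001 additions mentioned above, signed declarations and the generate construct, in one small sketch:

```verilog
// An N-stage delay line built with a generate-for loop over signed
// registers; the generate-if distinguishes the input stage.
module v2001_demo #(parameter N = 4)(
  input               clk,
  input  signed [7:0] din,   // 2's-complement type, new in 2001
  output signed [7:0] dout
);
  reg signed [7:0] stage [0:N-1];
  assign dout = stage[N-1];
  genvar i;
  generate
    for (i = 0; i < N; i = i + 1) begin : dly
      if (i == 0) begin
        always @(posedge clk) stage[0] <= din;
      end else begin
        always @(posedge clk) stage[i] <= stage[i-1];
      end
    end
  endgenerate
endmodule
```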
Verilog 2005
Not to be confused with SystemVerilog, Verilog 2005 (IEEE Standard 1364-
2005) consists of minor corrections, spec clarifications, and a few new language features
(such as the uwire keyword.) A separate part of the Verilog standard, Verilog-AMS,
attempts to integrate analog and mixed signal modelling with traditional Verilog.
Design Styles
Verilog, like any other hardware description language, permits a design in either
Bottom-up or Top-down methodology.
Bottom-Up Design
The traditional method of electronic design is bottom-up. Each design is
performed at the gate-level using the standard gates (refer to the Digital Section for more
details). With the increasing complexity of new designs this approach is nearly
impossible to maintain. New systems consist of ASICs or microprocessors with a
complexity of thousands of transistors.
These traditional bottom-up designs have to give way to new structural,
hierarchical design methods. Without these new practices it would be impossible to
handle the new complexity.
Top-Down Design
The desired design-style of all designers is the top-down one. A real top-down
design allows early testing, easy change of different technologies, a structured system
design and offers many other advantages. But it is very difficult to follow a pure top-
down design. Due to this fact most designs are a mix of both methods, implementing
some key elements of both design styles.
Figure shows a Top-Down design approach.
Verilog Abstraction Levels
Verilog supports designing at many different levels of abstraction. Three of them
are very important:
Behavioral level
Register-Transfer Level
Gate Level
Behavioral level
This level describes a system by concurrent algorithms (Behavioral). Each
algorithm itself is sequential, that means it consists of a set of instructions that are
executed one after the other. Functions, Tasks and Always blocks are the main elements.
There is no regard to the structural realization of the design.
Register-Transfer Level
Designs using the Register-Transfer Level specify the characteristics of a circuit
by operations and the transfer of data between the registers. An explicit clock is used.
RTL design contains exact timing bounds: operations are scheduled to occur at certain
times. Modern RTL code definition is "Any code that is synthesizable is called RTL
code".
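A minimal RTL example in this spirit, a clocked register transfer with an explicit clock, synthesizable as written:

```verilog
module accum_rtl(
  input             clk, rst_n,
  input       [7:0] din,
  output reg [15:0] acc
);
  // The transfer acc <= acc + din is scheduled at every clock edge.
  always @(posedge clk or negedge rst_n) begin
    if (!rst_n) acc <= 16'd0;
    else        acc <= acc + din;
  end
endmodule
```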
Gate Level
Within the logic level the characteristics of a system are described by logical links
and their timing properties. All signals are discrete signals. They can only have definite
logical values (`0', `1', `X', `Z`). The usable operations are predefined logic primitives
(AND, OR, NOT, etc. gates). Hand-writing gate-level models is rarely practical for
design entry; gate-level code is instead generated by tools such as synthesis tools, and
the resulting netlist is used for gate-level simulation and for the backend flow.
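For contrast, a gate-level netlist of a 2-to-1 multiplexer using only the predefined primitives named above:

```verilog
module mux2_gates(input a, b, sel, output y);
  wire nsel, t0, t1;
  not g0(nsel, sel);      // y = (a & ~sel) | (b & sel)
  and g1(t0, a, nsel);
  and g2(t1, b, sel);
  or  g3(y, t0, t1);
endmodule
```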
About Verilog HDL
Digital systems are highly complex. At their most detail level, they may consist of
millions of elements like transistors or logic gates. Therefore, for large digital systems,
gate level design is dead. To avoid that, Verilog HDL was introduced.
Verilog HDL is a Hardware Description Language (HDL). It is a language
used to describe a digital system, which may be a computer or a component of a
computer. One may describe a digital system at several levels. For example, an HDL
might describe the layout of the wires, resistors and transistors on an Integrated
Circuit (IC) chip, i.e. the switch level. Or, it might describe the logical gates and flip-
flops in a digital system, i.e. the gate level. An even higher level describes the
registers and the transfers of vectors of information between registers. This is called
the Register Transfer Level (RTL). Verilog supports all of these levels. It is very
much like the C language.
Salient Features of Verilog:
Primitive logic gates such as AND, OR and NAND are built into this
language.
Flexibility of creating a user-defined primitive (UDP). Such a primitive could
either be a combinational logic primitive or a sequential logic primitive.
Switch-level modeling primitives, such as PMOS and NMOS, are also
built into this language.
Explicit language constructs are provided for specifying pin-to-pin delays,
path delays and timing checks.
A design can be modeled in three different styles or in a mixed style. These
styles are: behavioral style, modeled using procedural constructs; dataflow
style, modeled using continuous assignments; and structural style, modeled
using gate and module instantiations.
There are two data types in Verilog HDL: the net data type and the register
data type. A net represents a physical connection between structural
elements, while a register represents an abstract data storage element.
A design can be of arbitrary size; the language does not impose a limit.
Verilog HDL is non-proprietary and is an IEEE standard.
It is both human and machine readable. Thus it can be used as an exchange
language between tools and designers.
The capability of the Verilog HDL language can be further extended by using
the programming language interface (PLI) mechanism. PLI is a collection of
routines that allow foreign functions to access information within a Verilog
module and allow for designer interaction with the simulator.
At the behavioral level, Verilog HDL can be used to describe a design not
only at the RTL level, but also at the architectural level and in terms of its
algorithmic behavior.
At the structural-level, gate and module instantiations can be used.
Verilog HDL also has built-in logic functions such as & (bitwise AND) and
| (bitwise OR).
Notion of concurrency and time can be explicitly modeled.
Powerful file read and write capabilities are provided.
Verilog HDL can be used to perform response monitoring of the design under test,
that is, the values of a design under test can be monitored and displayed. These
values can also be compared with expected values, and in case of a mismatch, a
report message can be printed.
The language is non-deterministic under certain situations, that is, a model may
produce different results on different simulators; for example, the ordering of
events on an event queue is not defined by the standard.
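The net-versus-register distinction listed among the salient features above can be sketched as:

```verilog
module types_demo(input clk, a, b, output reg q);
  wire y;            // net: a physical connection, continuously driven
  assign y = a & b;
  always @(posedge clk)
    q <= y;          // register: abstract storage, holds its value
endmodule
```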