
2013 IEEE Eighth International Conference on Networking, Architecture and Storage

An FPGA-based Random Functional Verification Method for Cache

Li Tiejun, Zhang Jianmin, Li Sikun
School of Computer Science, National University of Defense Technology, Changsha, China
tjli@nudt.edu.cn, jmzhang@nudt.edu.cn, sikunli@nudt.edu.cn

Abstract-Functional verification is the most difficult and time-consuming step in the VLSI design flow, owing to the rapidly increasing complexity and scale of chips. The key problem of VLSI functional verification is improving efficiency and coverage. For the Cache, an important component of microprocessors, an FPGA-based pseudo-random functional verification method is proposed in this paper. The testbench of this method is synthesizable, and field programmable gate array (FPGA) emulation is integrated to improve the efficiency of verification. Functional verification coverage is increased by automatically generating constraint-directed pseudo-random test stimuli. The method is applied to real chips and compared with the pseudo-random software simulation method. The results show that our method is faster by about three orders of magnitude and finds more bugs in the designs.

Keywords-Cache; functional verification; pseudo-random; field programmable gate array

I. INTRODUCTION

Functional verification remains one of the largest challenges in modern VLSI design flows, taking up to 70% [1] of the total design time, owing to the dramatically increasing complexity and number of transistors on a single silicon chip. Functional verification checks that the register transfer level (RTL) code or gate level netlist conforms to the specification of the VLSI chip. Its aim is to find and locate the bugs in the circuits. In recent years, new methods have been developed to cope with the verification challenge. Software simulation and hardware emulation are the most popular functional verification techniques. The main advantage of software simulation is that errors in the design are easy to localize. However, software simulation is slow: with typical design sizes exceeding the half-million synthesized gates mark, its runtime is too long and its functional coverage is generally not very high. The most popular hardware emulation technique is FPGA-based emulation. It is generally a few orders of magnitude faster than software simulation, and it can generally reach a high coverage rate. However, its debug process is difficult and its testbench must be synthesizable. Therefore, hardware emulation is generally applied in the functional verification flow of chips when the efficiency and coverage rate of functional verification must be improved.

The Cache is a very important component of microprocessors. Its main function is to bridge the gap between the main memory and the processor cores. According to the type of stimulus, functional verification methods for Cache can be divided into two categories: directed verification and random verification.

Broadly speaking, functional verification for Cache faces three problems. Firstly, the huge verification space is the main bottleneck. In general, the functional complexity and design scale of a Cache are very large. With the directed verification method, it is very difficult for the verification team to write so many testbenches. Secondly, automatic correctness checking of verification results is necessary. The complex function of the Cache and the large scale of the testbench make it hard to check the correctness of the results, so manual checking is difficult, time-consuming and error-prone. Thirdly, high efficiency and a high coverage rate are the main goals of functional verification for Cache. When the directed verification method is adopted, the verification team has to write a testbench for each functional point of the Cache, which consumes much time in the verification flow. Furthermore, some unpredictable combinational conditions and corner cases are very difficult to cover with directed testbenches.

The random verification method can address these three problems. It generates testbenches of flexible scale and easily covers many unexpected functional corner cases, so it can reach a higher verification coverage rate. However, completely random test stimuli often cover the same function points more than once, and these unnecessary repeated tests reduce the efficiency of verification.

To address the above problems in the verification flow of Cache, a synthesizable pseudo-random functional verification method is proposed in this paper. Firstly, to improve the functional coverage rate, a pseudo-random stimulus generation technique is adopted. Secondly, the whole testbench is synthesizable, so the FPGA-based hardware emulation approach can be applied to achieve a substantial improvement in verification efficiency. Thirdly, a constraint-directed random test generation technique is introduced to reduce unnecessary repeated tests. The

978-0-7695-5034-3/13 $26.00 © 2013 IEEE
DOI 10.1109/NAS.2013.44
pseudo-random stimuli are generated under constraints derived from the functional specification. These constraints exclude, for example, unreachable states of an FSM and impossible assignments of input signals. The synthesizable pseudo-random functional verification method is applied to real chips and compared with random verification based on software simulation. The results show that our method is faster by about three orders of magnitude and finds more deep bugs in the Cache designs.

The paper is organized as follows. The next section surveys the relevant existing research work on random verification techniques. Section 3 introduces the synthesizable pseudo-random functional verification method for Cache. Section 4 shows and analyzes the comparison results on real chips. Finally, Section 5 concludes the paper and outlines future research work.

II. RELATED WORK

There have been many contributions to research on random test generation in the last few years, owing to its increasing importance in numerous practical applications of the VLSI design flow. Patrick Girard et al. [2] compared random and pseudo-random test generation for BIST of delay, stuck-at and bridging faults. In [3], an efficient dynamic instruction and stimulus generator and its associated methodology were proposed. The generator was used to generate random stimuli in one of five random modes. It was applied in the functional verification of a high performance low-power embedded processor, and was shown to significantly reduce the verification cycle and improve test coverage.

Mike Bartley et al. [4] compared three verification techniques (directed testing, pseudo-random testing and property checking) by verifying two versions of a bridge between two on-chip buses. The results show that property checking found 18 bugs, directed testing found 14 bugs, and pseudo-random testing found 22 bugs. In [5], the authors proposed an efficient test generation technique that can be used to achieve full state and transition coverage in simulation based verification for a wide variety of cache coherence protocols. Based on an effective analysis of the state space structure, this method can generate more efficient test sequences (50% shorter) than tests generated by breadth first search.

The above verification methods adopt the pseudo-random test technique, and all of them are based on software simulation. The main advantage of software simulation is that errors in the designs are easy to localize. However, software simulation is very slow, and its functional coverage and efficiency are generally quite limited. Therefore, how to improve coverage and efficiency is the main bottleneck for functional verification of Cache.

III. RANDOM VERIFICATION METHOD FOR CACHE

First of all, an overview of the pseudo-random functional verification method is given in this section. The architecture of the proposed FPGA-based pseudo-random functional verification method is sketched in Fig. 1. The approach consists of four components: the data mirror image, the constraint-directed test stimuli generator, the automatic error checker, and the pseudo-random number generator. Each of the four modules is described in detail below.

[Figure 1: block diagram of the four components: Pseudo-random Number Generator, Data Mirror Image, Constraints Directed Pseudo-random Test Stimuli Generator, Automatic Error Checker]
Figure 1. Architecture of the synthesizable pseudo-random functional verification method

The data mirror image stores the write data sent to Cache. For each write operation to Cache, the data are simultaneously stored at the same address of the mirror image memory to keep the data consistent. When data are read back from Cache, the corresponding data are read from the same address of the mirror image memory to automatically check the correctness of the read operation. The data mirror image is composed of memories and their read and write logic. The capacity of the memories is limited by the memory resources of the FPGA chip. After reset, all the memories in the data mirror image are initialized as follows. The write enable signal is held at 1, the write data signals are set to 0, and the write address is incremented by 1 from 0 to the maximum address of the memories. After all entries have been initialized to 0, the write enable signal is set to 0, which indicates that initialization is finished. In the next step, the testbench generator starts to produce the pseudo-random test stimuli.

The constraint-directed pseudo-random test stimuli generator mainly consists of a finite state machine (FSM) that generates the access signals and sends them to Cache. While generating the testbench, this module excludes combinational conditions that are impossible for the Cache logic, such as unreachable states of an FSM or impossible assignments of input signals, according to the functional specification of the Cache. These constraints reduce the verification state space and improve efficiency.
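The data mirror image described above behaves like a reference memory that shadows every write and is consulted on every read. The following is a minimal behavioural sketch in Python, written only to illustrate that rule; it is not the authors' synthesizable RTL. The class name MirrorImage, its method names, and the depth of 8 entries are hypothetical choices made for illustration.

```python
class MirrorImage:
    """Behavioural model of the data mirror image (illustrative, not RTL)."""

    def __init__(self, depth):
        # After reset, every entry is initialized to 0, one address at a time.
        self.mem = [0] * depth

    def write(self, addr, data):
        # Each write sent to the Cache is stored at the same mirror address.
        self.mem[addr] = data

    def check_read(self, addr, cache_data):
        # True when the data returned by the Cache matches the mirror copy.
        return self.mem[addr] == cache_data


mirror = MirrorImage(depth=8)      # depth is an assumed example value
mirror.write(3, 0xABCD)
print(mirror.check_read(3, 0xABCD))  # read returns what was written
print(mirror.check_read(3, 0x1234))  # read returns corrupted data
print(mirror.check_read(5, 0))       # address never written since reset
```

In hardware the depth is bounded by the FPGA memory resources, so only a window of the Cache address space can be mirrored at a time.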

As shown in Fig. 2, the FSM contains five states: IDLE, BUILD, IDGEN, SEND and UPDATE. The workflow and the transitions between these states are described as follows.

1) IDLE: The initial state of the FSM is IDLE. If the initialization of the mirror image is finished, the enable signal of the pseudo-random number generator is set to 1 and sent to the generator, and the next state is BUILD. Otherwise, the FSM stays in IDLE.

2) BUILD: In this state, the constraint-directed pseudo-random signals, such as the address and the write data, are built. When the pseudo-random number is received, this module extracts the lowest n bits as the write data to be sent to Cache. The next m bits of the random number are treated as the access address. The value of m is determined by the depth d of the mirror image memory, more precisely m = ⌈log2 d⌉. The highest bit denotes the type of operation, indicating whether the current operation to Cache is a read or a write. For a read operation the next state is IDGEN; otherwise the next state is SEND.

3) IDGEN: In this state, the module checks whether any idle read ID exists. If the set of idle IDs is empty, the FSM stays in the current state. Otherwise, the module selects an idle read ID, adds it to the used ID set and removes it from the idle ID set. A matching list between used IDs and addresses is updated at the same time. Then the next state is SEND.

4) SEND: The main function of this state is to send control and data signals to Cache. The send valid signal is set to 1. At the same time, all other signals, such as the generated address, write data, operation type and read ID, are sent to Cache. Then the next state is UPDATE.

5) UPDATE: In this state, the data written to Cache are copied to the mirror image memory. If the current operation is a write, the write data are simultaneously stored at the same address of the mirror image memory. Then the FSM returns to IDLE.

[Figure 2: state diagram of the Constraints Directed Pseudo-random Test Stimuli Generator with states IDLE, BUILD, IDGEN, SEND, UPDATE and transition labels Initialization Over, Read Op, Write Op, ID Generation, Send Packets, Update]
Figure 2. The finite state machine

The automatic error checker is composed of an error judgment module and a timeout reporting module. The error judgment module collects and judges three types of errors. The first type is the read data error. When the checker receives read data, it compares them with the data at the same address in the mirror image memory. If they are unequal, there is a fault in the data read from Cache. The second type is the read ID error. In general, double data rate (DDR) SDRAM is employed as the main memory of the CPU, and the read data returned from DDR SDRAM may be out of order. Therefore, the test stimuli generator always allocates a read ID together with the address for each read operation. As mentioned above, it also maintains a used ID set. When read data and an ID are returned from Cache, the error checker determines whether this ID belongs to the used ID set; if it does not, a read ID error is reported. The third type is the error correcting code (ECC) check error. ECC is now generally applied in DDR SDRAM and Cache. When read data are received from Cache, the error checker computes the ECC parity and judges whether there is a one-bit fault or a two-bit fault.

The timeout reporting module judges whether the read data are returned from Cache within a time limit after the read operation is started. In this module, a 64-bit counter is added for each read ID. When an idle read ID is put into use, the corresponding counter starts counting up from 0. When that read ID and its data are received from Cache, the counter stops and is reset to 0. Otherwise, the counter keeps counting until it reaches the preset timeout value. If a timeout occurs, this module reports an error signal to the system.

    module prbs_gen_64bit (
        input  wire        clk,
        input  wire        rst_n,      // active low; loads the seed
        input  wire        enable,
        input  wire [64:1] seed_data,  // an all-zero seed keeps lfsr_q at 0
        output wire [64:1] prbs_o
    );
        reg [64:1] lfsr_q;

        always @(posedge clk) begin
            if (!rst_n) begin
                lfsr_q <= seed_data;
            end
            else if (enable) begin
                // shift by one position; bit 64 feeds back into bits 64, 62, 61 and 1
                lfsr_q[64]   <= lfsr_q[64] ^ lfsr_q[63];
                lfsr_q[63]   <= lfsr_q[62];
                lfsr_q[62]   <= lfsr_q[64] ^ lfsr_q[61];
                lfsr_q[61]   <= lfsr_q[64] ^ lfsr_q[60];
                lfsr_q[60:2] <= lfsr_q[59:1];
                lfsr_q[1]    <= lfsr_q[64];
            end
        end

        assign prbs_o = lfsr_q[64:1];
    endmodule

Figure 3. 64-bit PRBS pseudo-random Number Generating Algorithm

The pseudo-random number generator produces the random numbers. It adopts a typical 64-bit PRBS pseudo-random number generating algorithm, illustrated in Fig. 3. If one 64-bit random number is not enough for all the signals, the generator uses different seeds to build multiple random numbers.
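Once a 64-bit number is available, the BUILD state slices it into fields: the lowest n bits as write data, the next m bits as address with m = ⌈log2 d⌉, and the highest bit as operation type. The Python sketch below illustrates this slicing under stated assumptions; it is not the authors' implementation. The function names, the choice n = 32, and the encoding "highest bit 1 means read" are hypothetical, since the paper does not give them.

```python
def address_width(depth):
    # m = ceil(log2(depth)) for depth >= 2, computed with integer arithmetic.
    return max(1, (depth - 1).bit_length())


def build_fields(prbs, n, depth):
    """Slice a 64-bit pseudo-random word into (data, addr, is_read).

    Assumes n + m <= 63 so that the fields do not overlap the top bit.
    """
    m = address_width(depth)
    data = prbs & ((1 << n) - 1)          # lowest n bits: write data
    addr = (prbs >> n) & ((1 << m) - 1)   # next m bits: access address
    is_read = bool((prbs >> 63) & 1)      # highest bit: assumed 1 = read
    return data, addr, is_read


word = (1 << 63) | (0x2A5 << 32) | 0xDEADBEEF   # hand-made example word
data, addr, is_read = build_fields(word, n=32, depth=1024)
print(hex(data), hex(addr), is_read)
```

If the mirror depth is not a power of two, an m-bit address can exceed the depth, so a real generator would need an extra constraint on the address field.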

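The register update of Fig. 3 can also be mirrored in software, for example to predict the stimulus sequence for a given seed when debugging. The following Python function is a sketch of one enabled clock cycle of that update; the name prbs_step and the bit numbering (RTL bit k stored as Python bit k-1) are choices made here for illustration.

```python
MASK64 = (1 << 64) - 1


def prbs_step(state):
    """Next value of the 64-bit register after one enabled clock cycle."""
    msb = (state >> 63) & 1          # lfsr_q[64] of the previous cycle
    nxt = (state << 1) & MASK64      # lfsr_q[k] <= lfsr_q[k-1]
    if msb:
        # lfsr_q[64], lfsr_q[62] and lfsr_q[61] are XORed with the old
        # lfsr_q[64], and lfsr_q[1] receives the old lfsr_q[64].
        nxt ^= (1 << 63) | (1 << 61) | (1 << 60) | 1
    return nxt


state = 0x1                          # example seed; must not be all zeros
for _ in range(4):
    state = prbs_step(state)
    print(f"{state:016x}")
```

An all-zero seed maps to itself under this update, so a usable seed needs at least one bit set.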
The input signals of the generator in Fig. 3 include the clock signal clk, the reset signal rst_n, the enable signal enable, and the 64-bit seed signal seed_data. The output signal is a 64-bit pseudo-random number called prbs_o.

In Fig. 3, while the reset signal is asserted (rst_n low), the 64-bit register lfsr_q is loaded with the seed data. When the reset is released, the generator waits for the enable signal. If the enable signal is 1, a linear feedback shift is performed to generate the pseudo-random number. In other words, the value of lfsr_q in the current clock cycle is obtained by a circular shift of its value in the previous cycle, except that some bits are modified. The 64th bit is the exclusive or of the 63rd and 64th bits of the previous lfsr_q. Similarly, the 62nd bit is the exclusive or of the 61st and 64th bits, and the 61st bit is the exclusive or of the 60th and 64th bits. This algorithm has two advantages. First, all of its code is synthesizable and can be used in FPGA verification. Second, it can generate different pseudo-random sequences by changing the seed data, which yields more stimuli and improves the test coverage.

[Figure 4: flowchart. Mirror image initialization, then two branches. Stimulus branch: Pseudo-random number generation, Build the control and data signal, Allocate read ID, Send signals to Cache, Update mirror image, Timeout report. Receive branch: Receive reading data, Obtain address by read ID, Read data and compare, Errors check]
Figure 4. Workflow of pseudo-random verification method

The workflow of the FPGA-based pseudo-random verification method is depicted in Fig. 4. First of all, the data mirror image is initialized, and then the method branches. One branch is controlled by the FSM and generates the constraint-directed pseudo-random test stimuli. After initialization, the first step is to produce a 64-bit pseudo-random number. Then all the signals to be sent to Cache, such as the address, the operation type and the write data, are built. If the current operation is a read, an idle read ID is allocated and sent to Cache together with the other control and data signals. Next the mirror image is updated by storing the data at the same address of the memory. Simultaneously, the timeout counter is started and runs until the read data are returned or the counter reaches the preset value. The other branch handles the data received from Cache. When read data and an ID arrive, the address is obtained from the matching list between used IDs and addresses, according to the ID. Then the data are read from the corresponding address of the mirror image memory and compared with the read data from Cache. At the same time, the read ID error and the ECC parity error are checked and reported.

IV. EVALUATION RESULTS AND ANALYSIS

To evaluate the effectiveness and efficiency of the method, Caches of three capacities (128KB, 256KB and 512KB) are implemented. Pseudo-random software simulation is a very popular verification technique for Cache, so the proposed FPGA-based pseudo-random verification method is compared with it. The designs under test (DUT) are the 128KB Cache, the 256KB Cache and the 512KB Cache.

The software simulation environment is the Cadence NC-Verilog simulator. The experiments were conducted on a 2.9 GHz Intel Xeon machine with 64 GB of memory running the Linux operating system. The platform of the synthesizable pseudo-random verification method is an FPGA board based on a Xilinx Virtex-6 565T FPGA device. The synthesizable pseudo-random verification method is implemented in RTL Verilog. The ISE tool is used to perform synthesis, layout and routing. Finally, the bit stream of the DUT and testbench is generated and downloaded to the FPGA verification board. The FPGA device runs at 60 MHz.

The evaluation results of the software simulation method and the pseudo-random verification method on FPGA are listed in Table 1. For the pseudo-random software simulation method, Table 1 shows the number of requests sent by the testbench within 10 seconds (Reqs no.) and the number of bugs found (Bugs no.). The last two columns provide the same figures for the pseudo-random verification method on FPGA. The Reqs no. columns are used to evaluate the efficiency of the two verification methods.

TABLE I. VERIFICATION RESULTS OF TWO METHODS ON CACHE

DUT      Software simulation       FPGA-based method
         Reqs no.   Bugs no.       Reqs no.    Bugs no.
128KB    22147      0              18134321    3
256KB    21320      2              16165842    5
512KB    20531      2              15271273    5

From Table 1, we may observe the following. The FPGA-based pseudo-random verification method strongly outperforms the popular software simulation method, by about three orders of magnitude (roughly 740 to 820 times more requests within 10 seconds across the three designs). The 128KB Cache had already been fully tested by the designer, so the software simulation method could not find any fault in this design. However, the FPGA-based pseudo-random verification method found 3 bugs. One of the three bugs is the write-after-read error of the LRU array. The cycle interval error between fill buffer requests

and miss buffer requests leads to another bug. The last one is the address conflict problem between input queue and miss buffer requests. These three bugs lie deep in the design and are very difficult to dig out.

The 256KB and 512KB Caches are derived from the 128KB Cache by modification. Software simulation therefore found 2 bugs introduced by the modification. The two bugs of the 256KB Cache are identical to the two bugs of the 512KB Cache. However, the FPGA-based pseudo-random verification method found 5 bugs: two are the same as the faults found by the software simulation method, and the other three are similar to the bugs found in the 128KB Cache. In summary, the FPGA-based pseudo-random verification method achieves a substantial improvement in coverage and can cover more corner cases and combinational conditions. This method can help to find more faults in the designs, improve the efficiency of verification, and shorten the time to market of chips.

V. CONCLUSIONS

For the functional verification of Cache, we propose an FPGA-based pseudo-random verification method. This method has two features. First, the testbench is synthesizable, and FPGA emulation is introduced to improve the efficiency of verification. Second, the pseudo-random stimuli are generated automatically, which increases the functional verification coverage. The method is compared with the pseudo-random software simulation method. The results show that our method is faster by about three orders of magnitude and finds more bugs in the designs. One direction for future work is to explore more aggressive techniques to improve the efficiency of locating the bugs. Another is to apply our method to more designs.

ACKNOWLEDGMENT

The authors would like to thank all peer reviewers for their valuable comments and suggestions. This work is supported by the National Natural Science Foundation of China under grant No. 61103083 and 61133007, and the National High Technology Research and Development Program of China (863 Program) under grant No. 2012AA01A301.

REFERENCES

[1] P. Bose, D.H. Albonesi, and D. Marculescu, “Guest editors’ introduction: power and complexity aware design,” IEEE Micro, vol. 23, pp. 8-11, 2003.
[2] Patrick G, Christian L, Serge P, and Arnaud V, “Comparison between random and pseudo-random generation for BIST of delay, stuck-at and bridging faults,” Proc. 6th IEEE International On-Line Testing Workshop, pp. 121-126, 2000.
[3] Liang Z, Yan X, Wang J, and Xu Z, “A dynamic random instruction and stimulus generation for functional verification of embedded processor,” Proc. 5th International Conference on ASIC, pp. 459-462, 2003.
[4] Mike B, Darren G, and Tim B, “A comparison of three verification techniques: directed testing, pseudo-random testing and property checking,” Proc. 39th Design Automation Conference, pp. 819-823, 2002.
[5] Qin X, Mishra P, “Automated generation of directed tests for transition coverage in cache coherence protocols,” Proc. 2012 Design, Automation and Test in Europe, pp. 3-8, 2012.

