
IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 29, NO. 3, MARCH 2010

Verification of Datapath and Controller Generation Phase in High-Level Synthesis of Digital Circuits
Chandan Karfa, Student Member, IEEE, Dipankar Sarkar, and Chittaranjan Mandal

Abstract—A formal verification method of the datapath and controller generation phase of a high-level synthesis (HLS) process is described in this paper. The goal is achieved in two steps. In the first step, the datapath interconnection and the controller finite state machine description generated by a high-level synthesis process are analyzed to obtain the register transfer-operations executed in the datapath for a given control assertion pattern in each control step. In the second step, an equivalence checking method is deployed to establish equivalence between the input and the output behaviors of this phase. A rewriting method has been developed for the first step. Unlike many other reported techniques, our method is capable of validating pipelined and multicycle operations, if any, spanning over several states. The correctness and complexity of the presented method have been treated formally. The method is implemented and integrated with an existing HLS tool, called structured architecture synthesis tool. The experimental results on several HLS benchmarks indicate the effectiveness of the presented method.

Index Terms—Controller, datapath, equivalence checking, formal verification, FSM, FSMD models, high-level synthesis, register transfer level.

Manuscript received January 14, 2009; revised July 2, 2009. Current version published February 24, 2010. This work was supported by Microsoft Corporation and Microsoft Research India under Microsoft Research India Ph.D. Fellowship Award, and in part by the Ministry of Communications and Information Technology (SMDP-II project), Government of India and the Department of Science and Technology, New Delhi, India, Grant SR/S3/EECE/053/2008. This paper was recommended by Associate Editor V. Bertacco.
The authors are with the Department of Computer Science and Engineering, Indian Institute of Technology, Kharagpur 721302, India (e-mail: ckarfa@yahoo.co.in; ds@cse.iitkgp.ernet.in; chitta@iitkgp.ac.in).
Digital Object Identifier 10.1109/TCAD.2009.2035542
0278-0070/$26.00 © 2010 IEEE

I. Introduction

High-level synthesis (HLS) is the process of translating a behavioral description into a register transfer level (RTL) description containing a datapath and a controller [9]. The synthesis process consists of several subtasks carried out in sequence, such as scheduling, allocation and binding, and datapath and controller generation [9]. The operations in the behavioral description are assigned time steps in the scheduling phase. The allocation and binding process binds the variables to a set of registers and the operations to a set of functional units (FUs) in each control step. In the phase of datapath and controller generation, the first task is to generate the datapath by providing a proper interconnection path from the source register(s) to the destination register for every register transfer (RT)-operation. The objective of this step is to maximize sharing of interconnection units among RT-operations ensuring conflict-free data transfers among the concurrent RT-operations. The second task is to generate the controller finite state machine (FSM) by identifying the control signals required in each state. Such a synthesis flow is depicted in Fig. 1.

The use of high-level synthesis systems becomes crucial to deal with the increasing complexity of today's very-large-scale integration designs and shortened design cycle. Continuous evolution in the HLS process, however, has made the synthesis steps so intricate that the synthesis procedures cannot be assumed to be correct by construction. Designs synthesized by HLS may contain errors due to bugs in the tool. For instance, the research reported in [18] identified two bugs in a widely used HLS tool SPARK [10] recently. We have an indigenous synthesis tool, structured architecture synthesis tool (SAST), in which flaws resulting from bugs in the tool have been uncovered in the course of formally verifying its output. This underlines the need for efficient verification mechanisms for HLS. The research reported in [7], [8], [19], [23] tried to formally establish end-to-end equivalence between the input behavioral description and the design synthesized by HLS. However, the input specification of HLS is given at a high abstraction level compared to the level of abstraction of the output. Also, extensive optimizations are carried out at various phases. An end-to-end verification technique falls short of meeting all the challenges posed by phase-specific verification tasks. The above techniques have to make some simplifying assumptions regarding the synthesis flow. In [19], for example, it is assumed that code motion techniques [11] have not been applied during scheduling which, however, is quite contrary to what happens in most of the modern day HLS tools such as SPARK [10]. Therefore, a phase-wise verification technique that can handle the difficulties of each synthesis subtask separately is desirable for HLS verification. A verification flow which works hand-in-hand with HLS is depicted in Fig. 1. Several studies have been reported in the literature on verification of each phase of HLS. Verification of the phase of scheduling is addressed in [5], [6], [14], [16]–[18], [21], that of allocation and binding in [15], [20], and that of datapath and controller generation in [1], [3].

The objective of this paper is to ensure the correctness of the datapath and controller generation phase assuming that the scheduling phase and the allocation and binding phase have already been verified. Although this phase does not bring about any change in the control flow, the verification of this phase still has many challenges. First and foremost, the input RTL behavior transforms to an output consisting of

a datapath, which is merely a structural description, and a controller FSM. So, the verification task involves identification of the RT-operations executed in a controller state from the control signal assertions in that state. The nontriviality of this task is due to the following reasons. First, it is not possible to obtain an RT-operation from a given control signal assertion pattern by examining the control signal values individually in isolation. This is because an RT-operation may involve a micro-operation which is accomplished by a set of control signals rather than an individual control signal. Secondly, each RT-operation is associated with a spatial sequence of micro-operations depicting the data flow from the source register(s) to a destination register. The analysis mechanism has to reveal this spatial sequence.

Fig. 1. Hand-in-hand synthesis and verification framework.

The second challenge in verification of this phase lies in handling multicycle or pipelined RT-operations which require more than one FSM state. For example, suppose an RT-operation involves a k-cycle FU in a state q of the input behavior; then in the output behavior, all the states in any path of length k leading to q should realize datapaths from the operand register(s) to the FU inputs while only state q should realize the datapath from the operand register(s) to the destination register. On the other hand, if the FU is a pipelined one, then only the (k − 1)th predecessor state (and not the remaining ones) in any path leading to q should realize the operand datapaths. Accordingly, the control assertions in these states should reflect setting up of such partial datapaths. Thus, there may not be a one-to-one correspondence between the control assertion pattern in an FSM state and the RT-operations in the corresponding state of the input behavior. We have not come across work on equivalence checking of pipelined or multicycle operations in the literature.

The allocation and binding phase and the datapath and controller generation phase have been verified together in [3] using the finite state machine with datapath (FSMD) model [9]. Demonstrating the equivalence of the FSMD obtained by functional composition of two parts of the implementation, the control part and the operative part, with the scheduled FSMD accomplishes the functional verification. In [1], the synthesis of datapath interconnect and control is verified as follows. The operations in each state of the scheduled behavior are converted to an equivalent structured graph (SG) having the hardware components as vertices and connectivity among the components as edges. It then shows the equivalence between the scheduler SG and the RTL SG by symbolic simulation to ensure that the datapath supports the RT-operations of that state. However, the state-wise interaction of the controller and the datapath through the status and control lines has not been verified; also, multicycle and pipelined operations have not been considered.

In this paper, verification of the datapath and controller generation phase is accomplished in two steps as shown in Fig. 2. The input of this phase is modeled as an FSMD which is used in many of the HLS verification works [3], [14], [16], [17]. In step 1, an FSMD M2 is constructed from the datapath interconnection information and the controller FSM. In the next step, equivalence between the FSMD M1 representing the behavior after the allocation and binding phase, and the FSMD M2 is established. The first step involves identification of the set of all the concurrent RT-operations realized by the given control signal values. An elegant rewriting method is presented in this paper to accomplish the tasks in step 1. Several inconsistencies, both in the datapath and in the controller, if present, are revealed during construction of the FSMD M2. The rewriting method is strong enough to handle pipelined and multicycle operations. The second step is much simpler compared to the first one; it is only required to show that the RT-operations in each state of M1 are available in the corresponding state of M2. However, this step of equivalence checking is required to account for the often applied algebraic transformation techniques for interconnection optimization. A preliminary version of this paper has been published in [13].

Fig. 2. Steps of datapath and controller verification.

The contributions of this paper are as follows.
1) A rewriting based method for constructing an FSMD from a datapath and a controller description. The method is versatile enough to handle pipelined, multicycle, and chained operations.
2) Rigorous treatments of soundness, completeness, and complexity of the rewriting method.
3) Handling some algebraic transformations for interconnection optimization that occur during datapath synthesis using a normalization technique of arithmetic expressions.
4) We have provided extensive experimental results to show the effectiveness of the presented method.

The paper is organized as follows. A brief description of the FSMD model is given in Section II. The basic issues involved in construction of the FSMD M2 from the datapath and the controller FSM are discussed in Section III. The overall FSMD construction framework is given in Section IV. Various flaws in the datapath and controller descriptions which get detected during the rewriting process are also discussed here. In Section V, the correctness and the complexity of the rewriting method are given. The equivalence checking method is given in Section VI. Experimental results on several HLS benchmarks are given in Section VII. The paper is concluded in Section VIII.

Fig. 3. Scheduling of a relational operation. (a) Original input of HLS. (b) Scheduled behavior.

II. FSMD Models

We briefly introduce the FSMD model in this section. A more elaborate treatment is available in [14]. Formally, an FSMD is defined as an ordered tuple ⟨Q, q0, I, V, O, f, h⟩, where Q = {q0, q1, q2, ..., qn} is the finite set of control states, q0 ∈ Q is the reset state, I is the set of primary input signals, V is the set of storage variables, O is the set of primary output signals, f is the state transition function and h is the update function of the output and the storage variables. Let B ⊆ V be the set of Boolean variables in the FSMD. Let S be a set of Boolean literals of the form b or ¬b, where b ∈ B is a Boolean variable. Let U be a set of storage or output assignments of the form {x ⇐ e | x ∈ O ∪ V and e is an arithmetic predicate or expression over I ∪ (V − B)}. The transition function f and the update function h are defined as f : Q × 2^S → Q and h : Q × 2^S → U, respectively. The functions f and h are defined only for such subsets of S which do not contain both b and ¬b. To capture the computation of the condition of state transition in the initial behavior, the scheduler introduces Boolean variables, one for each relational operation, to store the result of that operation as depicted in Fig. 3 (where le is a Boolean variable). Since variables are bound to registers in the allocation and binding phase, the variables in the FSMDs M1 and M2 of Fig. 2 are datapath (DP) registers.

III. Construction of FSMD M2

Let us now examine how the FSMD M2 can be constructed from the datapath interconnection description and the controller FSM whose transitions are labeled with the subsets of S and the control assertion values. Construction of M2 essentially consists in replacing the members of subsets of S with the corresponding relational expressions over the DP registers and the control assertion values with the corresponding RT-operations. We shall describe the second task first and then discuss how the same method accomplishes the first task.

A. Representation of the Datapath Description

The following two pieces of information have to be extracted from the datapath description in order to find the RT-operations in each state of the FSMD M2.

1) The set of all possible micro-operations in the datapath: Let this set be denoted as ℳ. A data movement from an input y of a datapath component to its output x is encoded by the micro-operation x ⇐ y. The datapath components essentially are the storage elements (registers), the functional units, the interconnection components (buses, muxes, de-muxes, switches, etc.) or the signal lines.
2) The control signal assertion pattern for every micro-operation in ℳ: Let there be n control signals. A control signal assertion pattern needed for any micro-operation is represented as an ordered n-tuple of the form ⟨u1, u2, ..., un⟩, where each ui, 1 ≤ i ≤ n, represents the value of the control signal ci from the domain {0, 1, X}; ui = X implies that the control signal ci is not required (relevant) for a particular micro-operation. Let 𝒜 be the set of all possible control assertion patterns.

So, a function f_mc : ℳ → 𝒜 is constructed to capture the datapath structure, in its entirety; the DP interconnection is conveyed by common signal naming. Note that the function f_mc is not necessarily onto.

Example 1: Let us consider the datapath shown in Fig. 4. In this figure, r1, r2, and r3 are registers, M1, M2, and M3 are multiplexers, FU1 and FU2 are functional units and r1_out, r2_out, r3_out, f1Lin, f1Rin, f2Rin, f1Out, f2Out are interconnection wires. The control signal names start with CS. ∎

Fig. 4. Datapath with control signals.

Let the ordering of the control signals in a control signal assertion pattern be CS_M11 ≺ CS_M10 ≺ CS_M2 ≺ CS_M3 ≺ CS_FU1 ≺ CS_FU2 ≺ CS_r1Ld ≺ CS_r2Ld ≺ CS_r3Ld. The function f_mc : ℳ → 𝒜 is given in the first two columns of Table I with the first column designating ℳ and the second one designating 𝒜. Ignore the third column of the table for the time being. This function can be obtained from the output of any HLS tool containing the RTL behavior of each component used in the datapath.

B. Method of Obtaining the Micro-Operations for a Control Assertion Pattern

The next task is to obtain the set of micro-operations M_A (⊆ ℳ) which are activated by a given control assertion pattern A. The following definition is in order.

Definition 1 (Superposition of assertion patterns): Let A1 and A2 be two arbitrary control signal assertion patterns. Let π_i(A) denote the ith projection of an assertion pattern A which is the asserted value ui of the control signal ci. The assertion pattern, A1 θ A2, obtained by superposition θ of A1 and A2, satisfies the following conditions. For all i

\[
\pi_i(A_1\,\theta\,A_2) =
\begin{cases}
\pi_i(A_1), & \text{for } \pi_i(A_1) = \pi_i(A_2),\\
\pi_i(A_1), & \text{for } \pi_i(A_1) \neq \pi_i(A_2) \wedge \pi_i(A_1) = X,\\
U\ (\text{undefined}), & \text{for } \pi_i(A_1) \neq \pi_i(A_2) \wedge \pi_i(A_1) \neq X.
\end{cases}
\]

We define the set M_A as M_A = {µ | µ ∈ ℳ and f_mc(µ) θ A = f_mc(µ)}. The superposition of the control assertion pattern of each micro-operation and the pattern A is checked one by one to decide whether to include that particular micro-operation in M_A or not. Let us consider the superposition of the assertion pattern for a micro-operation µ and a given control assertion pattern A. It may be noted that bits in A being outputs of the controller circuit cannot contain 'X'. If π_i(f_mc(µ)) = X, then π_i(f_mc(µ) θ A) is also X. Now, consider some j such that π_j(f_mc(µ)) = 0 or 1. If µ is executed by A, then π_j(f_mc(µ)) = π_j(A). So, f_mc(µ) θ A becomes f_mc(µ) if µ is performed by the assertion pattern A. Since π_i(f_mc(µ)) ≠ U for any µ and i, and π_i(f_mc(µ) θ A) = U when π_i(f_mc(µ)) ≠ X and π_i(f_mc(µ)) ≠ π_i(A), µ ∉ M_A iff π_i(f_mc(µ) θ A) = U, for some i.

TABLE I
Construction of the Set M_A from the Function f_mc for the Control Assertion Pattern A = ⟨1, 0, 1, 0, 1, 0, 1, 1, 0⟩

  Micro-Operation (µ)        Control Assertion Pattern of µ (f_mc(µ))   f_mc(µ) θ A
* r1_out ⇐ r1                ⟨X, X, X, X, X, X, X, X, X⟩                ⟨X, X, X, X, X, X, X, X, X⟩
* r2_out ⇐ r2                ⟨X, X, X, X, X, X, X, X, X⟩                ⟨X, X, X, X, X, X, X, X, X⟩
* r3_out ⇐ r3                ⟨X, X, X, X, X, X, X, X, X⟩                ⟨X, X, X, X, X, X, X, X, X⟩
  f1Lin ⇐ r1_out             ⟨0, 0, X, X, X, X, X, X, X⟩                ⟨U, 0, X, X, X, X, X, X, X⟩
  f1Lin ⇐ r2_out             ⟨0, 1, X, X, X, X, X, X, X⟩                ⟨U, U, X, X, X, X, X, X, X⟩
* f1Lin ⇐ r3_out             ⟨1, X, X, X, X, X, X, X, X⟩                ⟨1, X, X, X, X, X, X, X, X⟩
* f1Rin ⇐ r2_out             ⟨X, X, 1, X, X, X, X, X, X⟩                ⟨X, X, 1, X, X, X, X, X, X⟩
  f1Rin ⇐ r3_out             ⟨X, X, 0, X, X, X, X, X, X⟩                ⟨X, X, U, X, X, X, X, X, X⟩
* f2Rin ⇐ r2_out             ⟨X, X, X, 0, X, X, X, X, X⟩                ⟨X, X, X, 0, X, X, X, X, X⟩
  f2Rin ⇐ r1_out             ⟨X, X, X, 1, X, X, X, X, X⟩                ⟨X, X, X, U, X, X, X, X, X⟩
  f1Out ⇐ f1Lin + f1Rin      ⟨X, X, X, X, 0, X, X, X, X⟩                ⟨X, X, X, X, U, X, X, X, X⟩
* f1Out ⇐ f1Lin − f1Rin      ⟨X, X, X, X, 1, X, X, X, X⟩                ⟨X, X, X, X, 1, X, X, X, X⟩
* f2Out ⇐ r3_out × f2Rin     ⟨X, X, X, X, X, 0, X, X, X⟩                ⟨X, X, X, X, X, 0, X, X, X⟩
  f2Out ⇐ r3_out / f2Rin     ⟨X, X, X, X, X, 1, X, X, X⟩                ⟨X, X, X, X, X, U, X, X, X⟩
* r1 ⇐ f1Out                 ⟨X, X, X, X, X, X, 1, X, X⟩                ⟨X, X, X, X, X, X, 1, X, X⟩
  r3 ⇐ f1Out                 ⟨X, X, X, X, X, X, X, X, 1⟩                ⟨X, X, X, X, X, X, X, X, U⟩
* r2 ⇐ f2Out                 ⟨X, X, X, X, X, X, X, 1, X⟩                ⟨X, X, X, X, X, X, X, 1, X⟩

Example 2: For the datapath given in Fig. 4, let the control assertion pattern in a particular FSM state be A = ⟨1, 0, 1, 0, 1, 0, 1, 1, 0⟩. The selection process is tabulated in Table I (column 3). M_A comprises those micro-operations which are marked with an asterisk (*) in the table. It may be noted that µ ∈ ℳ − M_A if it contains at least one U in some component, otherwise it is in M_A. ∎

It may be noted that the construction of M_A cannot be achieved by examining each individual control signal value in A in isolation because a micro-operation may be accomplished by a set of control signals rather than an individual control signal. There is no information available in an assertion pattern to group the control signals so that each group defines a micro-operation around a datapath component.

C. Identification of RT-Operations Realized by a Set of Micro-Operations

Each RT-operation is accomplished by a set of concurrent micro-operations. For example, let us assume that an FU performs addition operation on its two input data f1Lin and f1Rin. An RT-operation r3 ⇐ r1 + r2 may be accomplished over a datapath by the concurrent micro-operations r1_out ⇐ r1, r2_out ⇐ r2, f1Lin ⇐ r1_out, f1Rin ⇐ r2_out, f1Out ⇐ f1Lin + f1Rin, r3 ⇐ f1Out. So, in order to find the concurrent RT-operations accomplished by a control assertion pattern A, it is necessary to find the RT-operations realized by the set M_A of concurrent micro-operations.

Finding an RT-operation from a given set of micro-operations is also not trivial because of two reasons. First, there may be more than one RT-operation realized in that particular state of the FSM. Secondly, there is a spatial sequence of concurrent micro-operations needed to accomplish an RT-operation but these are available in an unordered manner in M_A.

The concurrent RT-operations accomplished by the set M_A of micro-operations are identified using a rewriting method. The method also reveals the spatial sequence of data flow needed for an RT-operation in a reverse order (from the destination register back to the source registers). The basic method consists in rewriting terms one after another in an expression. Let M′_A be the subset of M_A that contains all the micro-operations whose right hand side (RHS) expressions are to be rewritten. How the subset M′_A is chosen from M_A will be discussed shortly. For present discussion, it is sufficient to note that the set M′_A contains micro-operations of the form r ⇐ r_in, where r is a register and r_in is its input terminal. Next, the RHS expression "r_in" is rewritten by looking for a micro-operation in M_A of the form "r_in ⇐ s" or "r_in ⇐ s1 op s2." So, after rewriting "r_in," we have the RHS expression, either of the form "s" or of the form "s1 op s2." In the next step, s (or s1 and s2 for the latter case) are rewritten provided they are not registers. When the expression in hand is of the form "s1 op s2" (and s1, s2 are not registers), rewriting takes place from left to right in a depth-first manner. Thus, at any point of time, the expression in hand can be of the form "((s1 op1 s2) op2 s3) op3 ↑ ...," where the pointer ↑ indicates the signal to be rewritten next and the signals s1, s2, and s3 occurring at its left are all registers. The process terminates successfully when all si's in the expression in hand are registers.

Example 3: We illustrate the rewriting process for the datapath given in Fig. 4. Let us consider the control assertion pattern A = ⟨1, 0, 1, 0, 1, 0, 1, 1, 0⟩. Recall that the corresponding set M_A of micro-operations has been derived in Example 2 as {r1_out ⇐ r1, r2_out ⇐ r2, r3_out ⇐ r3, f1Lin ⇐ r3_out, f1Rin ⇐ r2_out, f2Rin ⇐ r2_out, f1Out ⇐ f1Lin − f1Rin, f2Out ⇐ r3_out × f2Rin, r1 ⇐ f1Out, r2 ⇐ f2Out}. The micro-operations in which a register occurs in the left hand side (LHS) are r1 ⇐ f1Out and r2 ⇐ f2Out which form the set M′_A. The sequence of rewriting steps for the micro-operation r1 ⇐ f1Out is as follows:

r1 ⇐ f1Out
   ⇐ f1Lin − f1Rin     [by f1Out ⇐ f1Lin − f1Rin]   (step 1)
   ⇐ r3_out − f1Rin    [by f1Lin ⇐ r3_out]           (step 2)
   ⇐ r3 − f1Rin        [by r3_out ⇐ r3]              (step 3)
   ⇐ r3 − r2_out       [by f1Rin ⇐ r2_out]           (step 4)
   ⇐ r3 − r2           [by r2_out ⇐ r2]              (step 5)

Similarly, the RT-operation r2 ⇐ r3 × r2 can be obtained starting from the other micro-operation r2 ⇐ f2Out in M′_A. So, the RT-operations r1 ⇐ r3 − r2 and r2 ⇐ r3 × r2 are executed by the given control assertion pattern A in a transition of the FSM. The forward spatial sequence of the micro-operations for an RT-operation is the reverse order in which they are used in the above rewriting steps; more specifically, therefore, the forward sequence of r1 ⇐ r3 − r2 is r2_out ⇐ r2, f1Rin ⇐ r2_out, r3_out ⇐ r3, f1Lin ⇐ r3_out, f1Out ⇐ f1Lin − f1Rin, r1 ⇐ f1Out. ∎

Fig. 5. (a) Schedule with a 2-cycle multiplier. (b) Schedule with a 3-stage pipelined multiplier. (c) Schedule with a chained adder and subtracter. (d) Input and output timing of a k-cycle multiplier. (e) Input and output timing of a k-stage pipelined multiplier. (f) Input and output timing for a chained adder and subtracter.
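The selection of M_A in Example 2 and the rewriting of Example 3 can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the implementation used in the paper: the names `pat`, `superpose`, `select`, `expand`, `expand_rhs`, and `rt_operations` are hypothetical, patterns are stored as tuples of the strings "0", "1", "X", and "U", operators are written in ASCII, and the table `F_MC` is transcribed by hand from Table I with the control signal ordering given in Section III-A. Only single-source and two-operand micro-operations are handled.

```python
from typing import Dict, FrozenSet, List, Tuple

Pattern = Tuple[str, ...]              # entries: "0", "1", "X" (not required), "U" (undefined)
MicroOp = Tuple[str, Tuple[str, ...]]  # (LHS signal, RHS tokens), e.g. ("f1Out", ("f1Lin", "-", "f1Rin"))

REGISTERS = {"r1", "r2", "r3"}         # registers of the Fig. 4 datapath (assumed naming)


def pat(text: str) -> Pattern:
    """Build an assertion pattern from a space-separated string."""
    return tuple(text.split())


def superpose(a1: Pattern, a2: Pattern) -> Pattern:
    """Component-wise superposition of Definition 1."""
    return tuple(u1 if (u1 == u2 or u1 == "X") else "U" for u1, u2 in zip(a1, a2))


# f_mc transcribed from Table I. Signal order:
# CS_M11, CS_M10, CS_M2, CS_M3, CS_FU1, CS_FU2, CS_r1Ld, CS_r2Ld, CS_r3Ld
F_MC: Dict[MicroOp, Pattern] = {
    ("r1_out", ("r1",)): pat("X X X X X X X X X"),
    ("r2_out", ("r2",)): pat("X X X X X X X X X"),
    ("r3_out", ("r3",)): pat("X X X X X X X X X"),
    ("f1Lin", ("r1_out",)): pat("0 0 X X X X X X X"),
    ("f1Lin", ("r2_out",)): pat("0 1 X X X X X X X"),
    ("f1Lin", ("r3_out",)): pat("1 X X X X X X X X"),
    ("f1Rin", ("r2_out",)): pat("X X 1 X X X X X X"),
    ("f1Rin", ("r3_out",)): pat("X X 0 X X X X X X"),
    ("f2Rin", ("r2_out",)): pat("X X X 0 X X X X X"),
    ("f2Rin", ("r1_out",)): pat("X X X 1 X X X X X"),
    ("f1Out", ("f1Lin", "+", "f1Rin")): pat("X X X X 0 X X X X"),
    ("f1Out", ("f1Lin", "-", "f1Rin")): pat("X X X X 1 X X X X"),
    ("f2Out", ("r3_out", "*", "f2Rin")): pat("X X X X X 0 X X X"),
    ("f2Out", ("r3_out", "/", "f2Rin")): pat("X X X X X 1 X X X"),
    ("r1", ("f1Out",)): pat("X X X X X X 1 X X"),
    ("r3", ("f1Out",)): pat("X X X X X X X X 1"),
    ("r2", ("f2Out",)): pat("X X X X X X X 1 X"),
}


def select(f_mc: Dict[MicroOp, Pattern], a: Pattern) -> List[MicroOp]:
    """M_A: micro-operations whose pattern is unchanged by superposition with A."""
    return [mu for mu, p in f_mc.items() if superpose(p, a) == p]


def expand(signal: str, m_a: List[MicroOp], seen: FrozenSet[str]) -> str:
    """Rewrite one signal until only registers remain (depth-first)."""
    if signal in REGISTERS:
        return signal
    if signal in seen:
        raise ValueError(f"data flow loop through {signal}")
    drivers = [rhs for lhs, rhs in m_a if lhs == signal]
    if len(drivers) != 1:
        raise ValueError(f"{signal} has {len(drivers)} active drivers, expected 1")
    return expand_rhs(drivers[0], m_a, seen | {signal})


def expand_rhs(rhs: Tuple[str, ...], m_a: List[MicroOp], seen: FrozenSet[str]) -> str:
    """Rewrite an RHS of the form (s,) or (s1, op, s2), left operand first."""
    if len(rhs) == 1:
        return expand(rhs[0], m_a, seen)
    left, op, right = rhs
    parts = []
    for s in (left, right):
        e = expand(s, m_a, seen)
        parts.append(f"({e})" if " " in e else e)  # parenthesise compound operands
    return f"{parts[0]} {op} {parts[1]}"


def rt_operations(m_a: List[MicroOp]) -> List[str]:
    """Start from every micro-operation with a register on its LHS (the set M'_A)."""
    return [f"{lhs} <= {expand_rhs(rhs, m_a, frozenset())}"
            for lhs, rhs in m_a if lhs in REGISTERS]


if __name__ == "__main__":
    A = pat("1 0 1 0 1 0 1 1 0")
    for op in rt_operations(select(F_MC, A)):
        print(op)
```

For the pattern of Example 2, `select` keeps the ten micro-operations marked in Table I, and `rt_operations` yields `r1 <= r3 - r2` and `r2 <= r3 * r2`, matching Example 3. The `seen` set plays a role similar to the set of already rewritten signals that the paper uses to detect a data flow loop; the error raised for multiple active drivers is an added assumption of this sketch.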
Let us now examine how the condition of execution associated with a controller FSM transition is made to correspond to an arithmetic predicate over registers. For this purpose, let us recall Fig. 3 given in Section II. The Boolean variable le may be bound to a register during allocation and binding phase (in the case of Mealy machine implementation of the controller FSM). In such a case, an arithmetic predicate (relational operation) will be realized in the same way as an arithmetic operation. However, if it is scheduled in the state itself (in the case of Moore machine implementation of the controller FSM), the Boolean variable le need not be stored in a register; instead it may be made available only as a status signal line output from the datapath feeding to the controller. We account for this case by simply including in the set M′_A the micro-operations containing status signals in their LHS in addition to the micro-operations having registers in the LHS.

D. Multicycle, Pipelined, and Chained Operations

Functional units have different propagation delays depending upon the functions they are designed to perform. As a result, concurrently activated units with delays shorter than a clock cycle remain un-utilized in most part of the clock cycle. To circumvent this problem, three well known techniques, namely multicycle execution, pipelined execution, and operation chaining [9] are used. In multicycle execution, the clock cycle is shortened to allow the fast operations to execute in one clock cycle, and the slower operations are permitted to take multiple clock cycles to complete execution as shown in Fig. 5(a). The corresponding Fig. 5(d) depicts the time steps for a k-cycle operation. Pipelining several sets of operands over the cycles of a multicycle execution of an operation p allows concurrent execution of p on each of those sets of operands [Figs. 5(b) and (e)]. Operation chaining allows two or more faster operations to be performed serially within one step [Figs. 5(c) and (f)]. The datapath may have all such variations of FUs. Also, the controller needs to assert proper values to the control signals to execute an operation in a multicycle or pipelined way over multiple clocks or to execute more than one operation in one clock in a chained manner. Since operation chaining is restricted to a single control state, the rewriting mechanism described in the previous section can be applied straightway. In the following, verification issues of other two cases are discussed.

To execute an operation in a multicycle FU, the data have to be held constant on the FU inputs over all the control steps it takes to execute that operation. In Fig. 5(d), for example, the inputs x1 and y1 are held on the FU inputs for k steps, where the FU is a k-cycle multiplier. If the operation starts execution at the ith control step, then the result x × y is available only at the (i + k − 1)th control step. Therefore, in order to ensure that the datapaths are set properly, we need to verify the following points.
1) From i to (i + k − 1) steps, the left and the right input expressions to the FU are the same.
2) The registers feeding the inputs to the FU are not updated (by any other operations) in the ith to the (i + k − 1)th time steps.

For a k-stage pipelined FU operation starting at the ith step, the output of the operation is available at (i + k − 1)th step. Fig. 5(e) reflects the scenario. The FU, however, may take a new set of inputs in each of the (i + 1)th step to the (i + k − 1)th step as shown in the figure. So, the datapaths from the operand registers to the inputs of the FU are set in the ith step, whereas the datapath from the output of that FU to the destination register where the result needs to be stored is set only at the (i + k − 1)th step.

Based on the above discussion, in Table I corresponding to f_mc, we need to record two additional pieces of information namely, the type and the cycle information of each micro-operation to handle multicycle and pipelined operations. Let there be an FU performing a multicycle (pipelined) operation op needing k cycles (stages). The corresponding micro-operation is p ⇐ L op R, where p, L, and R are respectively the output, the left input and the right input of the FU. For such a micro-operation, we designate 'M' ('P') as type and k as cycle. For all other micro-operations, the type will be 'N' and the cycle value will be one. They are all normal single cycle operations. So, f_mc is now of the form f_mc : ℳ → {⟨type, cycle, A⟩}, where type ∈ {'M', 'N', 'P'}, cycle ∈ N, the set of natural numbers and A ∈ 𝒜, the set of control assertion patterns.

IV. The Overall Construction Framework of FSMD M2

Fig. 6 depicts the schematic of the method presented for construction of FSMD M2. The module RTLV-0 is the central module which takes the datapath description in the form of function f_mc and the controller FSM and constructs the FSMD M2. For each transition of the controller FSM, RTLV-0 obtains the set M_A of micro-operations activated by the control assertion A associated with the transition; the method used for this purpose is discussed in Section III-B. It then invokes RTLV-1 with the parameters M_A and the variable signal, the latter having values {0, 1, 2}. The assignment signal = 0 is used to obtain the RT-operations involving only single cycle operations or an RT-operation at the last step of a multicycle or a pipelined operation; signal = 1 is used to obtain an RT-operation with the output of a multicycle FU at the LHS (corresponding to all the cycles of a multicycle operation except the last one); signal = 2 is used to obtain an RT-operation with the output of a pipelined FU at the LHS (corresponding to the first cycle of a pipelined operation). Note that for a k-stage pipelined operation, only the first [...] the signal value and then calls the function findRewriteSeq for each member of M′_A. Initially, RTLV-0 invokes RTLV-1 for each transition of the controller FSM with signal equal to zero. The function findRewriteSeq, in turn, identifies the RT-operation starting from that micro-operation based on the procedure discussed in Section III-C. The modules RTLV-1 and the function findRewriteSeq are given as Algorithm 1 and Algorithm 2, respectively. It may be noted that the function findRewriteSeq is capable of identifying the RT-operations that involve registers and outputs of pipelined units at the LHS. The module RTLV-0 uses the module multicycle and pipelined to handle the multicycle and pipelined operations, respectively, the details of which are discussed in the subsequent sections.

Fig. 6. Schematic of our FSMD construction framework from the datapath description and the controller FSM.

Algorithm 1 RTLV-1
/* Finds the set of RT-operations accomplished by a given set of micro-operations.
Input: The set M_A of micro-operations for a given control assertion pattern A and signal value.
Output: The set RT_A of RT-operations accomplished by M_A. */
Method:
1: Let RT_A be φ;
2: if signal = 0 then
3:    M′_A = {µ | µ ∈ M_A and µ has a register or a status line in its LHS term};
4: else if signal = 1 then
5:    M′_A = {µ | µ ∈ M_A and µ has an output signal of any multicycle FU in its LHS term};
6: else if signal = 2 then
7:    M′_A = {µ | µ ∈ M_A and µ has an output signal of any pipelined FU in its LHS term};
8: end if
9: if more than one micro-operation in M′_A has the same register name in its LHS then
10:   Report ("Same register is updated by more than one micro-operation");
11: else
12:   for each µ in M′_A do
13:      replaced = φ;
14:      Seq[0] ⇐ µ;
15:      µ ← findRewriteSeq (µ, M_A, replaced, Seq, 1);
         /* "µ" - initially a micro-operation which is finally transformed to an RT-operation by the function.
         "replaced" - a set of signals rewritten already - used by the function to detect if a data flow loop is set up by the control assertion.
         "Seq" contains the final sequence of micro-operations used in rewriting - depicts the data flow in reverse, which obtains the RT-operation.
         The last parameter contains the number of micro-operations currently in Seq */
16:      RT_A = RT_A ∪ {µ};
17:   end for
18: end if

A. Handling of Multicycle Operations

It may be recalled that for any RT-operation r ⇐ r1 op r2, where "op" is a multicycle operation, there is one transition, τ say, in which the RT expressions r ⇐ p, p ⇐ L op R, L ⇐ r1, and R ⇐ r2 are all realized; furthermore, each of the sequences of transitions of length k − 1 leading to the transition τ will realize p ⇐ r1 op r2 by having p ⇐ L op R, L ⇐ r1, and R ⇐ r2 in each member of the sequence. In addition, these transition sequences do not realize any other RT-operation which has "r1" or "r2" as LHS terms. We verify these facts by the following steps. The call graph of the same
stage (where signal = 2) and the last step (where signal = 0) is shown pictorially in Fig. 6.
are relevant for the operation. The module RTLV-1 computes 1) Let τ be a transition from the state q1 . Let RTLV-0 iden-
the set M ′A of micro-operations to be rewritten based on tify by using RTLV-1 the RT-operation r ⇐ r1opr2

where op is a multicycle operation in τ. RTLV-0 passes q1 to the routine Multicycle. The latter carries out a backward BFS traversal over the controller FSM from the state q1 (with depth = 1) up to a depth of k to identify all the sequences of transitions of length k − 1 which terminate in q1. Each transition occurring in these sequences is to be checked subsequently by RTLV-0 for containing the RT-operation p ⇐ r1 op r2. So the routine Multicycle returns the set, T1 say, of all these transitions.
2) On obtaining the set T1, the module RTLV-0 selects transitions from T1 one by one for finding the RT-operations realized in them using RTLV-1 with signal = 1. RTLV-1 puts in M′_A those micro-operations which have at their LHS the outputs of multicycle FUs and subsequently ensures that the members of T1 contain the operation p ⇐ r1 op r2.
3) The module RTLV-0 now checks for a micro-operation which has “r1” or “r2” as its LHS term in each transition of the set T1. If such a micro-operation is found by RTLV-0, it reports an error message indicating that an operand is disturbed during a multicycle operation.

Algorithm 2 findRewriteSeq(µ, M_A, replaced, Seq, i)
/* replaces (rewrites) the leftmost nonregister signal(s) in the RHS expression of µ, if possible, using some micro-operation m; accordingly, puts s in replaced, m in Seq (as the ith entry) and invokes itself recursively; finally returns the rewritten µ having only register signals at its RHS */
 1: if the RHS of µ contains only register signals or output signals of some pipelined FU then
 2:   Report (“the RT-operation found is µ”); return µ;
      /* terminates successfully */
 3: else
 4:   Let s be the leftmost nonregister signal in the RHS expression of µ which is neither a register nor an output of a pipelined FU.
 5:   if s ∈ replaced then
 6:     Report (“loop set up in the datapath by the control assertion”); return empty RT-operation;
 7:   else
 8:     Let M_s ⊂ M_A be the set of micro-operations s.t. each member of M_s has s as its LHS signal.
 9:     if M_s == φ then
10:       Report (“Inadequate set of micro-operations”); return empty RT-operation;
          /* No micro-operation found in M_A which has s as its LHS signal */
11:     else if M_s contains more than one micro-operation then
12:       Report (“data conflict”); return empty RT-operation
          /* more than one driver activated for a signal */
13:     else
14:       /* M_s contains a single micro-operation. */
          Let M_s be {m};
15:       Seq[i] = m;
16:       Let m be of the form s ⇐ e; Let µ be of the form t ⇐ ((e_1)s(e_2));
17:       replace all the occurrences of s in the RHS expression of µ with the RHS expression of m; thus µ becomes t ⇐ ((e_1)(e)(e_2));
18:       replaced = replaced ∪ {s};
19:       return findRewriteSeq(µ, M_A, replaced, Seq, i + 1);
20:     end if
21:   end if
22: end if

Example 4: Let us consider the controller FSM given in Fig. 7. The control assertion pattern associated with each of the FSM’s transitions is not shown explicitly for clarity. Let us assume that RTLV-0 identifies by using RTLV-1 the RT-operation r1 ⇐ r2 × r3 which is realizable by the assertion pattern associated with the transition ⟨q1, q2⟩. Let ‘×’ be a three cycle multiplier. We have to now ensure that the RT-operation fuOut ⇐ r2 × r3 is realizable by the assertion patterns associated with the transitions belonging to the sequences of transitions of length 2 (= 3 − 1) which terminate in q1. The module Multicycle finds this set of transitions T1. For this example, the transition set is T1 = {⟨qi, qj⟩, ⟨qj, q1⟩, ⟨q2, qj⟩}. Also, we have to ensure that the registers r2 and r3 are not updated in any of the transitions in T1. Tasks 1 and 2 are done in steps 2 and 3, respectively, as described above. ∎

Fig. 7. Working with multicycle and pipelined operations.

B. Handling of Pipelined Operations

For any RT-operation r ⇐ r1 op r2, where “op” is a k-stage pipelined operation, there is one transition in which the RT expression r ⇐ p is realized and the first member of each of the sequences of transitions of length (k − 1) leading to this transition will realize L ⇐ r1, R ⇐ r2, and p ⇐ L op R. In other words, for an RT-operation r ⇐ r1 op r2 identified in the (i + k − 1)th step, where op is a pipelined operation, the ith step should contain the RT-operation p ⇐ r1 op r2 and the (i + k − 1)th step should contain the RT-operation r ⇐ p. It may be noted that although p does not contain the value of the expression r1 op r2 in the ith step, for convenience, we resort to such encoding to indicate that at the ith step, the FU is activated to act on the operands “r1” and “r2.” We ensure that the RT-operations are indeed obtained in the above manner by the following steps. The call graph of this sequence of steps is also shown pictorially in Fig. 6.

1) Let τ be a transition from the state q1. Let RTLV-0 identify by means of RTLV-1 that the transition τ is one in which the RT-operation r ⇐ p is realized, where p is the output of a pipelined FU. Now, RTLV-0 passes q1 to the routine Pipelined. The latter carries out a backward BFS traversal up to depth k − 1 over the controller FSM from the state q1 to identify all the sequences of transitions of length k − 1 which terminate in q1. The first member of each such transition sequence has to be checked subsequently by RTLV-0 for containing the RT-operation p ⇐ r1 op r2. So, the routine Pipelined returns the set, T2 say, of all the first transitions in these sequences.
2) On obtaining the set T2, the module RTLV-0 selects transitions from T2 one by one for finding the RT-operations realized in them using RTLV-1 with the parameter signal = 2. For signal = 2, RTLV-1 puts in M′_A only those micro-operations of M_A which have outputs of the pipelined FUs at their LHS. As RTLV-1 invokes findRewriteSeq with µ ∈ M′_A, the latter returns RT-operations of the form p ⇐ r1 op r2. Thus,
RTLV-0 can ascertain that the members of T2 indeed contain the desired RT-operation.
3) If it is found by RTLV-0 that all the transitions in T2 contain the operation p ⇐ r1 op r2, then it rewrites the RHS of the RT-operation r ⇐ p (i.e., p) in the transition τ with the RHS expression of p ⇐ r1 op r2. So, finally the RT-operation r ⇐ r1 op r2 is associated with τ.

Example 5: Let us again consider the controller FSM given in Fig. 7. Let us assume that RTLV-0 identifies by using RTLV-1 the RT-operation r1 ⇐ fuOut in transition ⟨q1, q2⟩. Let FU be a three stage pipelined multiplier. We have to now ensure that the RT-operation fuOut ⇐ r2 × r3 is realizable by the assertion pattern associated with the first member of each of the sequences of transitions of length 2 (= 3 − 1) which terminate in q1. The module Pipelined finds this set of transitions T2. For this example, the transition set is T2 = {⟨qi, qj⟩, ⟨q2, qj⟩}. The above mentioned task is done by step 2. If step 2 is successful, then the actual RT-operation r1 ⇐ r2 × r3 is obtained by rewriting fuOut of r1 ⇐ fuOut by the RHS expression of fuOut ⇐ r2 × r3 in the transition ⟨q1, q2⟩ in step 3. ∎

C. Handling Chained Operations

The operation chaining scenario is depicted in Fig. 5(f) where two single cycle functional units are chained in the datapath. The results of both the FUs are available in the same time step as shown in the figure. As all the operations are performed in a single cycle and there exists a spatial sequence among the operations that are in the chain, our rewriting method can handle this variation of the datapath. However, chaining of pipelined FUs, multicycle FUs with pipelined FUs, multicycle FUs with single cycle FUs and pipelined FUs with single cycle FUs are also possible in the datapath. Among them, the first two cases usually do not occur in practical circuits. The module RTLV-0 can handle chaining of multicycle/pipelined FUs with single cycle FUs. The detailed implementation of RTLV-0 is not given in this paper for brevity. It is also possible to extend our rewriting method to handle the first two scenarios of chaining.

D. Verification During Construction of FSMD M2

Several inconsistencies that can be detected while constructing FSMD M2 are as follows.

Loops set up in the datapath by the controller (steps 5 and 6 of the function findRewriteSeq): One nonregister datapath signal line can be assigned only one value in a particular control step. If a nonregister term is attempted to be rewritten twice during one invocation of findRewriteSeq by RTLV-1, then it implies an improper control assertion pattern setting up a loop in the datapath without having any register.

Inadequate set of micro-operations performed by a control assertion pattern (steps 9 and 10 of the function findRewriteSeq): This situation arises due to either of the following two reasons: a) interconnection between two datapath components is not actually set by the control pattern but is required to complete an RT-operation or b) the control signals are asserted in a wrong manner which leads to a situation where the required data transfer is not possible in the datapath.

Data conflict (steps 11 and 12 of the function findRewriteSeq): It means that data from more than one component try to pass through a single data line due to wrong control assertion.

Race condition (steps 9 and 10 of RTLV-1): It means that one register is attempted to be updated by two values in the same time step due to wrong control assertion pattern.

Error in the datapath for a multicycle operation: It occurs when the input paths for a k-cycle FU are not set in any of the first k − 1 steps of execution of an operation in that FU. This occurs again due to wrong control assertion for the step in question and gets detected in RTLV-0.

Input operand is disturbed during execution of a multicycle operation: It means that an input register of a multicycle operation is updated by an RT-operation midway during execution of that multicycle operation. This situation arises due to either of the following two reasons: a) if the RT-operation is also present in the FSMD M1, then it is an error in the scheduling policy of the high-level synthesis, or b) if this RT-operation is not present in the FSMD M1, then it occurs due to wrong control assertion which causes an erroneous RT-operation in the datapath. Such flaws get detected in RTLV-0.

Error in pipelining: It means that the datapaths corresponding to the inputs of a pipelined unit are not properly set due to wrong control assertion pattern at the state where the execution of that pipelined operation begins. This class of errors gets detected in RTLV-0.

V. Correctness and Complexity of the Algorithm

A. Correctness of the Modules of the Algorithm

The correctness of the modules RTLV-0 and RTLV-1 depends directly on that of the module findRewriteSeq. On the other hand, the modules Multicycle and Pipelined deploy a backward BFS traversal which can be assumed to be correct. The details of correctness of these four modules are omitted here for brevity. Instead, the proofs of termination, soundness, and completeness of the function findRewriteSeq are given below.

Theorem 1 (Termination): The function findRewriteSeq always terminates.

Proof: If a recursive invocation does not detect one of the error situations depicted in steps 6, 10, 12 (and hence terminate), then it must replace a signal in the RHS expression of µ and enhance the set “replaced.” The same signal, once replaced, is never replaced in subsequent invocations. There is only a finite number of signals in the datapath. Hence, the function cannot invoke itself more times than the number of signals in the datapath.

Definition 2 (Forward rewriting by a micro-operation): An expression e is said to be obtained from an expression e⁻ by forward rewriting by a micro-operation s ⇐ e_r, if e can be obtained by replacing one or more occurrences of e_r in e⁻ by s.

In findRewriteSeq, the rewriting of an expression e_1 at hand by a micro-operation s ⇐ e_2 is carried out by replacing all the occurrences of s in e_1 by e_2. In contrast, the forward rewriting does the opposite, in keeping with the direction of data flow represented by the micro-operation (hence the name).

Lemma 1 (Realizability of an RT-operation): An RT-operation t ⇐ e is realizable over a datapath if there exists a
sequence σ of micro-operations (over the datapath) such that the expression “t” is obtainable from the expression e by forward rewriting of e by the members of, and according to, the sequence σ.

Proof: By induction on the length |σ| of σ. The details are omitted for brevity.

Theorem 2 (Soundness): Let the function findRewriteSeq be invoked with a micro-operation µ of the form t ⇐ e_0 and the set M_A of micro-operations corresponding to the control assertion pattern A. Let it return an RT-operation p of the form t ⇐ e, where e comprises registers or the output signal of some pipelined FU. The RT-operation p is realizable over the datapath.

Proof: Let the function terminate successfully (in step 1), obtain “Seq” as ⟨µ_0, µ_1, . . . , µ_k⟩, and return an RT-operation t ⇐ e. The LHS signal of the argument micro-operation t ⇐ e_0 is never disturbed by the function. Let us consider the reverse of “Seq” ⟨µ_k, . . . , µ_1, µ_0⟩ = σ, say. Thus, the first member µ_0 in “Seq” (that is, the last member in σ), is of the form t ⇐ e_0. Let the (RHS) expression obtained after application of µ_i in “Seq” be e_i. Clearly, e_k = e and e contains registers or the output signal of pipelined FU(s). The fact that the expression “t” is obtainable from e_i by forward rewriting by the sequence ⟨µ_i, . . . , µ_0⟩, 0 ≤ i ≤ k, can be proved by induction on i. The details are omitted here due to brevity.

In order to demonstrate the completeness of the rewrite procedure, we introduce the notion of parse tree corresponding to a register transfer operation t ⇐ e as realized by a given set M_A of micro-operations. The parse tree of t ⇐ e is the parse tree of the expression e parenthesized in accordance with its realization by M_A with its root node labeled as t. For example, the RT-operation t ⇐ r1 + r2 + r3, realized over the datapath shown in Fig. 8(a), will have the parse tree corresponding to the expression (r1 + r3) + r2 [Fig. 8(b)] and not the one shown in Fig. 8(c) (corresponding to the expression (r1 + r2) + r3). From now onward, we will leave the phrase “as realized by M_A” understood following the term “parse tree of an RT-operation.” We denote the parse tree of an RT-operation p as T(p), and its depth as d(T(p)); (the root is assumed to have depth 1). A forward rewrite sequence realizing an RT-operation t_i ⇐ e_i can be presented as either the post-order traversal of its parse tree T_i or a minor variation of this order whereupon the orders of traversals of the subtrees are exchanged between themselves. We refer to this variant as right-first post-order traversal because the right subtree is traversed before the left subtree. The presence of noncommutative binary operations like ‘/’, ‘%’, etc., do not impair this fact. It may be noted that the pre-order traversal is the reverse of the right-first post-order traversal.

Fig. 8. (a) A datapath for the RT-operation t ⇐ (r1 + r3) + r2. (b) Parse tree corresponding to the expression (r1 + r3) + r2. (c) Parse tree corresponding to the expression (r1 + r2) + r3.

Definition 3 (Linear chain over the datapath): A linear chain over the datapath is an RT-operation of the form d_l ⇐ d_r, where d_l and d_r are any datapath signals.

For a linear chain which is realizable using a set M_A of micro-operations, there exists a micro-operation sequence σ = ⟨µ_0, µ_1, . . . , µ_k⟩, where µ_i ∈ M_A and is of the form d_{i+1} ⇐ d_i, 0 ≤ i ≤ k, d_0 = d_r, d_{k+1} = d_l and the d_i’s are the datapath signals such that d_l is obtained by forward rewriting by the members of σ. It might be noted that when k = 0, a linear chain is essentially a micro-operation. The parse tree of such a linear chain comprises only the root and is of depth one.

The micro-operation sequence realizing an operation t ⇐ r1 op r2 may be viewed, in general, as ⟨µ_0, . . . , µ_{j−1}, µ_j, . . . , µ_{i−1}, µ_i, µ_{i+1}, . . . , µ_k⟩, where the subsequence ⟨µ_0, . . . , µ_{j−1}⟩ realizes a linear chain depicting data movement from r2 to the right input of the FU (typically of the form fRin ⇐ r2), the subsequence ⟨µ_j, . . . , µ_{i−1}⟩ realizes a linear chain depicting data movement from r1 to the left input of the FU (typically of the form fLin ⇐ r1), µ_i is a micro-operation corresponding to the FU operation (typically of the form fOut ⇐ fLin op fRin) and the subsequence ⟨µ_{i+1}, . . . , µ_k⟩ realizes a linear chain of data movement from a functional unit (FU) output to the destination signal t.

Lemma 2: For a realizable linear chain, the function findRewriteSeq returns the reverse of the micro-operation sequence that realizes the linear chain.

Proof: Let d_l ⇐ d_r be a linear chain. Let the micro-operation sequence over the set M_A realizing the linear chain be σ = ⟨µ_0, µ_1, . . . , µ_k⟩. More specifically, let the corresponding forward rewriting sequence be d_r ⇒_{µ_0} d_{r+1} ⇒_{µ_1} d_{r+2} ⇒_{µ_2} . . . ⇒_{µ_{k−1}} d_{r+k} ⇒_{µ_k} d_{r+k+1} = d_l. It may be noted that d_l ⇐ d_{r+k−i}, 0 ≤ i ≤ k, are all realizable linear chains realized by the forward rewriting micro-operation sequence ⟨µ_{k−i}, . . . , µ_k⟩. It can be proved that for the realizable chain d_l ⇐ d_{r+k−i}, for any i, 0 ≤ i ≤ k, the function findRewriteSeq obtains the sequence ⟨µ_k, µ_{k−1}, . . . , µ_{k−j}⟩, 0 ≤ j ≤ i, by induction on i. The details are omitted due to brevity.

Theorem 3 (Completeness): If there is an RT-operation p of the form t ⇐ e which is realizable using the micro-operations in M_A, then the function findRewriteSeq, if invoked with a micro-operation of the form t ⇐ e_0, returns the sequence of micro-operations corresponding to the pre-order traversal of the parse tree of p, parenthesized according to its realization using M_A.

Proof: Let the sequence of micro-operations corresponding to the pre-order traversal of a tree of p be σ_pre(p), that corresponding to the right-first-post-order traversal of p be σ_post(p), and any sequence realizing p be σ(p). Since t ⇐ e is realizable using members of M_A, there exists a sequence σ_post(t ⇐ e) of the form ⟨µ_0, µ_1, . . . , µ_k⟩. Since σ is a spatial sequence and not a temporal one, without loss of generality, the suffix “post” can be used. We now prove that the function findRewriteSeq returns the micro-operation sequence σ_pre(p) = ⟨µ_k, µ_{k−1}, . . . , µ_0⟩ = reverse(σ_post). We accomplish this proof by induction on d(T(p)) = i, say.
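The relation between the returned sequence and the realizing sequence can be checked on a small instance. The following Python sketch is an illustrative reading of Algorithm 2, not the authors' implementation: the tuple encoding of expressions, the signal names (f1Lin, f2Out, and so on), the `terminals` parameter (registers plus pipelined-FU outputs) and the exception type are all assumptions made for the example. It rewrites t ⇐ f2Out over a datapath in the spirit of Fig. 8(a) and compares the collected Seq with the reverse of a right-first post-order realizing sequence.

```python
class RewriteError(Exception):
    """Raised for the error cases of steps 6, 10 and 12 of Algorithm 2."""


def leftmost_nonterminal(expr, terminals):
    """Return the leftmost leaf signal of expr that is not a terminal, or None."""
    if isinstance(expr, str):
        return None if expr in terminals else expr
    _, left, right = expr
    return (leftmost_nonterminal(left, terminals)
            or leftmost_nonterminal(right, terminals))


def substitute(expr, s, by):
    """Replace every occurrence of signal s in expr with the expression `by`."""
    if isinstance(expr, str):
        return by if expr == s else expr
    op, left, right = expr
    return (op, substitute(left, s, by), substitute(right, s, by))


def find_rewrite_seq(mu, m_a, terminals, replaced, seq):
    """Backward-rewrite mu = (lhs, rhs) until its RHS holds only terminals."""
    lhs, rhs = mu
    s = leftmost_nonterminal(rhs, terminals)
    if s is None:                                  # step 1: done
        return mu
    if s in replaced:                              # steps 5-6
        raise RewriteError("loop set up in the datapath by the control assertion")
    drivers = [m for m in m_a if m[0] == s]        # step 8: the set M_s
    if not drivers:                                # steps 9-10
        raise RewriteError("inadequate set of micro-operations")
    if len(drivers) > 1:                           # steps 11-12
        raise RewriteError("data conflict")
    m = drivers[0]
    seq.append(m)                                  # step 15
    replaced.add(s)                                # step 18
    return find_rewrite_seq((lhs, substitute(rhs, s, m[1])),
                            m_a, terminals, replaced, seq)


# Hypothetical datapath realizing t <= (r1 + r3) + r2, with the micro-operations
# listed in a right-first post-order (forward, data-flow) realizing order.
REGISTERS = {"r1", "r2", "r3", "t"}
M_A = [
    ("f2Rin", "r2"),
    ("f1Rin", "r3"),
    ("f1Lin", "r1"),
    ("f1Out", ("+", "f1Lin", "f1Rin")),
    ("f2Lin", "f1Out"),
    ("f2Out", ("+", "f2Lin", "f2Rin")),
    ("t", "f2Out"),
]

start = M_A[-1]          # the micro-operation with a register at its LHS
seq = [start]            # Seq[0] holds the starting micro-operation
rt = find_rewrite_seq(start, M_A, REGISTERS, set(), seq)

print(rt)                              # ('t', ('+', ('+', 'r1', 'r3'), 'r2'))
print(seq == list(reversed(M_A)))      # True
```

Listing M_A in forward order only makes the final comparison a one-liner; the function itself does not depend on that order, because it looks up the driver of each signal by its LHS.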

(Basis i = 1): The parse tree of p comprises the root labeled with t and e is another register or an output signal of a pipelined FU; thus, p is a realizable linear chain. By Lemma 2, the micro-operation sequence ⟨µ_k, µ_{k−1}, . . . , µ_0⟩ is obtainable by findRewriteSeq.

(Induction step): Suppose that the function can find the sequence σ_pre, when d(T(p′)) ≤ i, where p′ is realizable and of the form t ⇐ e. Now, consider any realizable RT-operation p with d(T(p)) = i + 1. Therefore, p must be of the form t ⇐ e_1 op e_2, where d(T(t_1 ⇐ e_1)), d(T(t_2 ⇐ e_2)) ≤ i. Let σ_post(t_1 ⇐ e_1) and σ_post(t_2 ⇐ e_2) be σ_1 = ⟨µ_{1,0}, µ_{1,1}, . . . , µ_{1,l}⟩ and σ_2 = ⟨µ_{2,0}, µ_{2,1}, . . . , µ_{2,r}⟩, respectively. The sequence σ_post(p) is, therefore, ⟨σ_2, σ_1, σ_t⟩, where σ_t = σ(t ⇐ t_1 op t_2). So, the sequence σ_pre(p) = reverse(⟨σ_2, σ_1, σ_t⟩). Since, d(T(t_1 ⇐ e_1)), d(T(t_2 ⇐ e_2)) ≤ i, by induction hypothesis, the function can construct the sequence reverse(σ_1) and reverse(σ_2). So, it remains to be proved that the function constructs (i) reverse(σ_t) corresponding to t ⇐ t_1 op t_2 and (ii) the sequence ⟨reverse(σ_t), reverse(σ_1), reverse(σ_2)⟩ = reverse(⟨σ_2, σ_1, σ_t⟩) corresponding to p.

(i) The proof that the function constructs reverse(σ_t) corresponding to t ⇐ t_1 op t_2 is as follows: Let t_l, t_r and t_o respectively be the left input, the right input and the output of the FU which performs the operation “op.” So, in order to realize the RT-operation t ⇐ t_1 op t_2, it is necessary to realize the sequence of RT-operations ⟨t_r ⇐ t_2, t_l ⇐ t_1, t_o ⇐ t_l op t_r, t ⇐ t_o⟩, according to right-first postorder traversal of the parse tree of t ⇐ t_1 op t_2. In other words, the realizing sequence σ_t of micro-operations can be split as follows: σ_t = ⟨µ_0, µ_1, . . . , µ_m⟩ = ⟨σ_{t2}, σ_{t1}, σ_{t3}, σ_{to}⟩, where σ_{t2} = ⟨µ_0, µ_1, . . . , µ_{n1}⟩ corresponds to (the parse tree of) the linear chain t_r ⇐ t_2, σ_{t1} = ⟨µ_{n1+1}, µ_{n1+2}, . . . , µ_{n1+n2}⟩ corresponds to the linear chain t_l ⇐ t_1, σ_{t3} is ⟨µ_{n1+n2+1} = t_o ⇐ t_l op t_r⟩ and σ_{to} = ⟨µ_{n1+n2+2}, µ_{n1+n2+3}, . . . , µ_m⟩ corresponds to the linear chain t ⇐ t_o.

Now, the last micro-operation µ_m in the forward rewrite sequence must be of the form t ⇐ t_i, where t_i is the input signal name of the component whose output is t. Let the function findRewriteSeq be invoked with µ_m as the argument. The function, in turn, selects its right hand side t_i for rewriting. Since t ⇐ t_o is a linear chain realized by σ_{to}, by Lemma 2, the function findRewriteSeq constructs reverse(σ_{to}). In the process, the expression t_i changes to t_o. Therefore, findRewriteSeq selects µ_{n1+n2+1} as the next micro-operation in the sequence and obtains the RT-operation as t ⇐ t_l op t_r. The function next rewrites t_l to t_1, corresponding to the linear chain t_l ⇐ t_1; by Lemma 2, therefore, the function constructs reverse(σ_{t1}). By similar argument, the function then constructs reverse(σ_{t2}) to rewrite t_r to t_2. Thus, the function constructs the sequence ⟨reverse(σ_{to}), µ_{n1+n2+1}, reverse(σ_{t1}), reverse(σ_{t2})⟩ = reverse(⟨σ_{t2}, σ_{t1}, µ_{n1+n2+1}, σ_{to}⟩) = reverse(σ_t).

(ii) The proof that the function findRewriteSeq constructs the sequence ⟨reverse(σ_t), reverse(σ_1), reverse(σ_2)⟩ = reverse(⟨σ_2, σ_1, σ_t⟩) for the RT-operation p is as follows: When findRewriteSeq is invoked with t ⇐ t_i in M′_A, it returns the RT-operation t ⇐ t_1 op t_2 by constructing the sequence reverse(σ_t). The function findRewriteSeq next takes up t_1 for rewriting. Since, p is realizable, so is the RT-operation t_1 ⇐ e_1 and, by assumption, σ_1 = σ_post(t_1 ⇐ e_1). Since d(T(t_1 ⇐ e_1)) ≤ i, by induction hypothesis, the function can construct reverse(σ_1) to rewrite t_1 as e_1. Because of the strategy of replacing the leftmost nonregister signal first, the function does rewrite t_1 as e_1 before taking up rewriting of t_2. Hence, the function obtains t ⇐ e_1 op t_2 from t ⇐ t_i (= µ_m ∈ M′_A) constructing, in the process, the sequence ⟨reverse(σ_t), reverse(σ_1)⟩. It then takes up rewriting of t_2 to e_2; by similar argument as used above for rewriting of t_1 to e_1, it can be seen that the function constructs the sequence reverse(σ_2) in course of this rewriting. Hence, it does construct ⟨reverse(σ_t), reverse(σ_1), reverse(σ_2)⟩ in rewriting t ⇐ t_i to t ⇐ e_1 op e_2 = p.

B. Complexity Analysis of the Modules

1) Complexity of findRewriteSeq: Let the number of FUs be f, the number of registers be r and the number of wires be w. Let the number of interconnect components (like muxes, demuxes, switches, etc.) be c and the maximum of number of inputs of an interconnect component be k. A micro-operation in the datapath involves either two wires through some interconnect unit or a register or an FU. So, the maximum number of micro-operations possible in the datapath is O(kc + r + f). In each invocation of the function findRewriteSeq, one term (wire) is rewritten and no term is rewritten more than once. Hence, the number of invocations of the (recursive) function is O(w). The maximum number of terms that can be present in an RHS expression is O(r + w). So, the complexity of ensuring that no nonregister signal is present in an RHS expression (i.e., the step 1 of findRewriteSeq) is O(r + w). Similarly, the complexity of finding the leftmost nonregister signal in an RHS expression (i.e., the step 4 of findRewriteSeq) and that of determining whether s ∈ replaced (i.e., step 5 of findRewriteSeq) is O(r + w). The maximum number of micro-operations in M_A is O(c + r + f). Therefore, the complexity of finding a subset M_s of M_A (i.e., the step 8 of findRewriteSeq) is O(c + r + f). So, the complexity of the function findRewriteSeq is O(w ∗ ((r + w) + (r + w) + (r + w) + (c + r + f))) = O(w² + wr + wf + wc).

2) Complexity of module RTLV-1: The maximum number of micro-operations possible in M′_A is O(r + f). Therefore, the complexity of the module RTLV-1 is |M′_A| ∗ complexity of findRewriteSeq = O((r + f) ∗ (w² + wr + wf + wc)) = O(w²r + w²f + wr² + f²w + wrf + wcr + wcf).

3) Complexity of the modules Multicycle and Pipelined: Both the modules have the same complexity O(e), where e is the number of edges in the controller FSM.

4) Complexity of the module RTLV-0: Let |A|, i.e., the number of control signals over the entire datapath, be n. So, the complexity of obtaining the set M_A of micro-operations for a control assertion pattern A using the mechanism described in Section III-B is O(n ∗ (kc + r + f)). In the next step, the module RTLV-0 invokes RTLV-1 to find the RT-operations realized by M_A. So, the complexity of this step is O(w²r + w²f + r²w + f²w + wrf + wcr + wcf). If any of the RT-operations returned by the module RTLV-1 contains in its RHS some multicycle operation or the output signal of a pipelined unit, then it invokes multicycle for the former case
to obtain the set T1 and pipelined for the latter case to obtain the set T2. The complexity of both these modules is O(e). The module RTLV-0 now performs step 1 and step 2 as discussed above for each member of the set T1 or T2. Their cardinalities, |T1| and |T2|, are O(e). So, the complexity of this step is O(e ∗ ((knc + nr + nf) + (w²r + w²f + r²w + f²w + wrf + wcr + wcf))). The module searches for the RT-operation p ⇐ e_1 op e_2 in the set T1 for multicycle operations and p ⇐ e on the transitions in T2 for pipelined operations. The number of RT-operations, which have FU output signals in their LHS performed by the micro-operations for a control assertion pattern, is O(f). So, the complexity of this step of module RTLV-0 is O(f ∗ e). Finally, it searches in the micro-operations activated by the control signal assertion patterns in the transitions of T1 for a micro-operation which contains a register term in its LHS belonging to the expressions e_1 or e_2. The complexity of this step is O(e ∗ (r ∗ (c + r + f))). The module RTLV-0 performs the above tasks for each transition of the controller FSM. Hence, the overall complexity of the module RTLV-0 is O(e ∗ (w²re + w²fe + r²we + f²we + enkc + ewrf + ewcr + enr + enf)). So, the complexity of RTLV-0 is quadratic in the size of the controller and cubic in the size of the datapath. It may be noted that datapath size is independent of the size of the controller.

TABLE II
Characteristics of the HLS Benchmarks Used in Our Experiment

Benchmark   #BBs   #Branching Blocks   #Operations   #Variables
DIFFEQ         4                   1            19           12
EWF            1                   0            53           37
DCT            1                   0            58           53
GCD            7                   5             9            3
TLC           17                   6            28            5
MODN           7                   4            12            9
IIR FIL        3                   0            27           20
BARCODE       28                  25            52           17
IEEE754       72                  28           115           37
LRU           42                  17            71           19
DHRC          35                  14           131           72

5) Space Complexities of the Modules: It may be recalled that the datapath structure is entirely captured by the function fmc. The maximum number of micro-operations possible in the datapath is kc + r + f. The number of control signals is n. So, the space required to store fmc is O((kc + r + f) ∗ n). The modules findRewriteSeq and RTLV-1 work on fmc. So, the space complexity of findRewriteSeq and RTLV-1 is O((kc + r + f) ∗ n). Let the number of states in the controller FSM be s. The space required to store the controller FSM is O(sn). The modules multicycle and pipelined work on the controller FSM. Hence, the space complexity of both of the

The number of states and the control structure of the behavior are not modified in this phase. Hence, there is a one-to-one correspondence between the states of the input FSMD M1 and the constructed FSMD M2. Let the mapping between the states of M1 and those of M2 be represented by a function f12 : Q1 ↔ Q2. The state q_{2i} (∈ Q2) of the FSMD M2 is said to be the corresponding state of q_{1i} (∈ Q1) if f12(q_{1i}) = q_{2i}. A transition q_{2k} →_c q_{2l} of the FSMD M2 is said to correspond to a transition q_{1k} →_{c′} q_{1l} of the FSMD M1 if f12(q_{1k}) = q_{2k}, f12(q_{1l}) = q_{2l} and the condition c is equivalent to the condition c′. A set of RT-operations is formed for each state transition of the FSMD M2 from the corresponding control assertion pattern. Now, the question is whether all the RT-operations corresponding to each state transition of the FSMD M1 are captured by the controller or not. It may be noted that because of minimization of the controller output functions, some spurious RT-operations may get activated. So, the verification task consists in showing that all the RT-operations in each state transition in FSMD M1 are also present in the corresponding state transition in FSMD M2 and no extra RT-operation occurs in that transition of the FSMD M2.

It may be noted that algebraic transformation techniques based on commutativity, associativity, and distributivity of arithmetic operations are often used during datapath synthesis to improve interconnection cost [4], [25]. Hence, the RTL operations in one transition of FSMD M1 and in the corresponding transition of FSMD M2 may not be syntactically identical. Specification of digital systems implementing algorithmic computations involves the whole of integer arithmetic for which a canonical form does not exist. Instead, we use a normal form adapted from [24] during equivalence checking. The normalization process reduces many computationally equivalent formulas to a syntactically identical form. We have also added several simplification rules on normalized expressions, the details of which may be found in [14].

It has been assumed that the control flow graph of the behavior is not changed subsequent to the scheduling phase. However, controller FSMs can be minimized [2]. Under such a situation, our method, in its present form, will not be able to establish equivalence as the bijections no longer hold. If, however, the FSM minimization is a distinct phase following the datapath and controller generation phase, as is usually the case [2], then the verification of the FSM minimization can be separately addressed as the equivalence checking problem of two FSMs. Another way to upgrade the present verifier is to use the path-based equivalence checking method as reported in [14]; this approach, however, would have an exponential upper bound.
VII. Experimental Results
modules is O(sn). The module RTLV-0 works on both fmc and
the controller FSM. Hence, space complexity of this module The verification method described in this paper has been
is O(sn2 ∗ (kc + r + f )). implemented in C and integrated with an existing high-level
synthesis tool SAST [12]. It has been run on a 2.0 GHz Intel
Core 2 Duo machine with 2 GB RAM on the outputs generated
VI. Verification by Equivalence Checking by SAST for eleven HLS benchmarks [22]. Some of the
In the datapath and controller generation phase, the behavior benchmarks such as, differential equation solver (DIFFEQ),
represented by the input FSMD M1 is mapped to hardware. elliptic wave filter (EWF), IIR filter (IIR FIL) and discrete


cosine transformation (DCT), are data intensive, some are control intensive, such as greatest common divisor (GCD), traffic light controller (TLC), (a ∗ b) modulo n (MODN) and barcode reader (BARCODE), whereas some are both data and control intensive, such as IEEE floating point unit (IEEE754), least recently used cache controller (LRU) and differential heat release computation (DHRC) [22]. The number of basic blocks, branching blocks, three-address operations and variables for each benchmark are tabulated in Table II. Before presenting the details of the experiments, a brief introduction to SAST is in order. The datapath produced by SAST may be viewed as a set of architectural blocks (A-blocks) connected by a set of global buses. Each A-block has a local functional unit (FU), local storage and local buses. The datapath is characterized by the number of A-blocks, the number of global buses interconnecting the A-blocks and the number of access links or access width connecting an A-block to the global buses. SAST takes these architectural parameters along with the high-level behavior as inputs. The tool produces a different schedule of operations, a different binding of variables and operators and, hence, a different datapath interconnection and controller from a given high-level behavior for different architectural parameters. The designs synthesized by the SAST tool under the above-mentioned architectural parameters are well suited to test this verifier as the designed datapaths have complex interconnections and complex data transfers.

In our first experiment, we assume that all operations in the benchmark examples are single cycle. Two different sets of architectural parameters comprising the number of A-blocks, the number of global buses and the number of access links are considered for each benchmark depending on the size of the benchmark. The results for all the HLS benchmarks are shown in Table III. The number of registers, functional units, interconnection wires and switches (which is the only interconnection component here), the number of micro-operations possible in the datapath, the number of states in the controller FSM, the number of control signals used to control the micro-operations, the average time of construction of the FSMD from the datapath and controller and the average time of verification by equivalence checking for each benchmark program for both sets of architectural parameters are shown in columns 3–11 (under the designation “correct design”) of this table. It may be noted that the number of control signals for each benchmark is less than the number of micro-operations in the datapath because SAST optimizes the number of control signals required to control the micro-operations in the datapath and some of the micro-operations depend on more than one control signal. Our method can successfully find the RT-operations for this case. Furthermore, the number of RT-operations in each benchmark (Table II, column 4) is much higher than the number of states (Table III, column 8) in the controller FSMs. It indicates that more than one RT-operation is executed in the datapath for a control assertion pattern. Again, our method successfully finds all the RT-operations from a given control assertion pattern. The FSMD construction time is seen to increase linearly with the design size. For instance, the datapath of the GCD example for architectural parameters 2, 2, 1 consists of 14 registers, 2 FUs, 24 interconnection wires, 15 switches and 29 micro-operations; the corresponding figures for the IEEE754 example for architectural parameters 3, 2, 1 are 52, 3, 112, 110, and 235. The controller for the GCD example has 8 states and 21 control signals. The corresponding figures for the IEEE754 example are 120 and 147. The construction time of the FSMD of IEEE754 is only ten times higher than that of GCD. The average FSMD construction time is higher than the verification time. The construction time, however, is not very high and is less than two seconds in all the cases.

In our second experiment, we consider two data intensive benchmarks, i.e., the DIFFEQ and DCT examples, which consist of 6 and 16 multiplications, respectively. These behaviors have been synthesized in SAST with two different sets of architectural parameters. In addition to that, the same observations have been repeated twice; in the first step, all the multiplication operations of the behavior are taken as 2-cycle ones; in the second step, these are taken as two-stage pipelined. The same set of observations as in the first part of Table III are carried out and recorded in Table IV for this experiment. Our rewriting method successfully constructed the RT-operations from the control assertion patterns for all these cases. The FSMD construction time and the verification time are not particularly high here either.

In the third experiment, we consider erroneous designs. For example, in the RTL design of the IEEE754 floating point example generated by SAST with architectural parameters 3, 2, 1, we modify the 70th state of the FSM. The correct RT-operations in this state are var6 ⇐ exponent1 + exponent2 and mmult ⇐ mantissa1 × mantissa2 and the corresponding control assertion pattern is 0x1180400000020000402008000800118004080. Now, we inject different faults in the RTL design by manually changing some bits of the control assertion pattern and check whether our method can successfully find the corresponding bugs. The set of micro-operations in this state is rendered inadequate by changing the 19th hex digit of the assertion pattern from 2 to 0 (i.e., 0x1180400000020000400008000800118004080). As a result, the micro-operation alu1Lin ⇐ exponent1Out is not realized in this state. The findRewriteSeq function finds this inconsistency as it fails to find the replacement of alu1Lin during rewriting. The function reports “inadequate set of micro-operations” and returns the partially computed micro-operation sequence from the destination register var6 to alu1Lin. Similarly, a data conflict is next introduced in this state by changing the assertion pattern to 0x1180400000220000400008000800118004080. As a result, the data from both registers r223 and exponent1 try to pass to alu1Lin. The function findRewriteSeq finds two replacements, i.e., alu1Lin ⇐ r223Out and alu1Lin ⇐ exponent1Out, for alu1Lin and then reports “data conflict.” In the same way, other faults are also injected and successfully found by our FSMD construction method. In the next step, we set the control assertion pattern to 0x1180400000020000402000800800118004004. As a result, the RT-operation mantissa ⇐ mantissa1 × mantissa2, instead of the original RT-operation mmult ⇐ mantissa1 × mantissa2, is executed in the datapath. The RTLV-0 constructs the RT-operation mantissa ⇐ mantissa1 × mantissa2. Subse-


TABLE III
Results for Several High-Level Synthesis Benchmarks

                  Correct Design                                              Erroneous Design
                                                            Time (ms)                 M2 Construct. Eq. Check.
Benchmark Params. #Reg #FU #Wire #Switch #µ-opn #State #Sig Construct. Verif. #Change #Errors  Time #Errors  Time
DIFFEQ    3, 2, 1   14   3    49      42     82     13   60        188      7       6       4   165       –     –
          4, 3, 2   12   4    54      48     88     10   67        170      7       2       0   183       1     8
EWF       3, 2, 1   17   3    63      83    135     34  109        504      3      10       6   443       –     –
          4, 3, 2   19   4    75     108    165     23  138        458      3      20       9   536       –     –
DCT       3, 2, 1   25   3    70     190    154     29  117        656     13       7       6   576       –     –
          4, 3, 2   31   4    96     130    218     23  171        597     14       5       1   554       –     –
GCD       2, 2, 1    4   2    24      15     29      8   21        141     21       2       0   126       1    20
          2, 1, 1    4   2    21      13     26      8   19        140     21       7       2   104       –     –
TLC       2, 2, 1   14   2    32      21     46     22   28        229     35       3       1   212       –     –
          3, 2, 1   16   3    45      23     51     22   37        205     31      10       4   260       –     –
MODN      2, 2, 1   15   2    43      33     71     15   46        144     18       8       5   162       –     –
          3, 2, 1   15   3    58      34     78     15   49        192     19      11       7   195       –     –
IIR FIL   3, 2, 1   18   3    57      50     97     21   71        242     12       7       4   245       –     –
          4, 3, 2   21   4    81      74    131     16   99        231     10       5       0   252       2    12
IEEE754   3, 2, 1   52   3   112     110    235    120  147       1420     55      26       7  1306       –     –
          4, 3, 2   55   4   133     150    292    107  197       1260     60      16       4  1529       –     –
BARCODE   3, 2, 1   30   3    63      57    118     76   82        565     68      17       7   487       –     –
          4, 3, 2   34   4    80      81    157     58  116        510     60       3       0   592       0    55
LRU       3, 2, 1   30   3    82     105    190     72  140       1020     48      32      11   885       –     –
          4, 3, 2   33   4   102     112    217     69  152        980     46      11       5   805       –     –
DHRC      3, 2, 1   49   3   115     136    252     97  177       1445     67      44      11  1372       –     –
          4, 3, 2   57   4   139     148    289     95  198       1295     59      12       2  1387       –     –
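
Several observations made in the text about the "correct design" columns of Table III can be checked mechanically. The following Python sketch is illustrative only: it transcribes the #µ-opn, #Sig, construction-time and verification-time columns of Table III, and every name in it (`TABLE_III`, `fewer_signals_than_micro_ops`, and so on) is an assumption introduced for this illustration, not part of the verifier or of SAST.

```python
# Correct-design columns transcribed from Table III:
# (benchmark, arch. params, #micro-ops, #control signals,
#  FSMD construction time in ms, verification time in ms)
TABLE_III = [
    ("DIFFEQ", "3,2,1", 82, 60, 188, 7),
    ("DIFFEQ", "4,3,2", 88, 67, 170, 7),
    ("EWF", "3,2,1", 135, 109, 504, 3),
    ("EWF", "4,3,2", 165, 138, 458, 3),
    ("DCT", "3,2,1", 154, 117, 656, 13),
    ("DCT", "4,3,2", 218, 171, 597, 14),
    ("GCD", "2,2,1", 29, 21, 141, 21),
    ("GCD", "2,1,1", 26, 19, 140, 21),
    ("TLC", "2,2,1", 46, 28, 229, 35),
    ("TLC", "3,2,1", 51, 37, 205, 31),
    ("MODN", "2,2,1", 71, 46, 144, 18),
    ("MODN", "3,2,1", 78, 49, 192, 19),
    ("IIR FIL", "3,2,1", 97, 71, 242, 12),
    ("IIR FIL", "4,3,2", 131, 99, 231, 10),
    ("IEEE754", "3,2,1", 235, 147, 1420, 55),
    ("IEEE754", "4,3,2", 292, 197, 1260, 60),
    ("BARCODE", "3,2,1", 118, 82, 565, 68),
    ("BARCODE", "4,3,2", 157, 116, 510, 60),
    ("LRU", "3,2,1", 190, 140, 1020, 48),
    ("LRU", "4,3,2", 217, 152, 980, 46),
    ("DHRC", "3,2,1", 252, 177, 1445, 67),
    ("DHRC", "4,3,2", 289, 198, 1295, 59),
]


def fewer_signals_than_micro_ops(rows):
    """True if every design has fewer control signals than micro-operations."""
    return all(sig < uop for _, _, uop, sig, _, _ in rows)


def construction_exceeds_verification(rows):
    """True if FSMD construction takes longer than verification in every row."""
    return all(con > ver for *_, con, ver in rows)


def slowest_construction_ms(rows):
    """Largest FSMD construction time (ms) over all rows."""
    return max(con for *_, con, _ in rows)


def construction_time(rows, bench, params):
    """Construction time (ms) of one benchmark/parameter combination."""
    for b, p, _, _, con, _ in rows:
        if (b, p) == (bench, params):
            return con
    raise KeyError((bench, params))


if __name__ == "__main__":
    ratio = (construction_time(TABLE_III, "IEEE754", "3,2,1")
             / construction_time(TABLE_III, "GCD", "2,2,1"))
    print("signals < micro-ops everywhere:", fewer_signals_than_micro_ops(TABLE_III))
    print("construction > verification everywhere:", construction_exceeds_verification(TABLE_III))
    print("slowest construction (ms):", slowest_construction_ms(TABLE_III))
    print("IEEE754 / GCD construction ratio: %.2f" % ratio)
```

On the transcribed rows, the IEEE754-to-GCD construction ratio is 1420/141, i.e., roughly ten, and the largest construction time is 1445 ms, in line with the statements that the ratio is about ten and that construction takes less than two seconds in all cases.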

TABLE IV
Results for Multicycle and Pipelined Datapath for Two High-Level Synthesis Benchmarks

                             Correct Design                                                                       Erroneous Design
                             Datapath info.                          Controller info.   Time (ms)                 M2 construct.
Benchmark Operator   Params. #Regs #FUs #Wires #Switches #Micro-opns #States #Ctrl sigs FSMD constr. Equiv. check #Bits changed #Errors  Time
DIFFEQ    Multicycle 3, 2, 1    14    3     52        51          91      13         69          211            8             3       1   200
                     4, 3, 2    15    4     60        57          99      11         75          194            8             4       3   214
          Pipelined  3, 2, 1    11    3     44        42          75      12         57          184            7             1       1   205
                     4, 3, 2    13    4     58        54          99      10         75          178            7             2       2   195
DCT       Multicycle 3, 2, 1    30    3     77        92         166      42        124          723           15             6       4   730
                     4, 3, 2    28    4     93       126         209      27        196          726           14             3       2   722
          Pipelined  3, 2, 1    30    3     76        91         165      36        123          629           14             4       3   642
                     4, 3, 2    30    4     91       115         199      26        154          580           13             2       2   567
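
The manual fault injection described for state 70 of the IEEE754 example amounts to overwriting individual hex digits of the control assertion pattern. A minimal Python sketch follows, assuming that the pattern is handled as a hexadecimal string and that digits are numbered from 1 at the left (which is consistent with the "19th hex digit" mentioned in the text). The helper names `set_hex_digit`, `differing_digit_positions` and `differing_bits` are hypothetical and are not part of the verifier or of SAST; the position 11 used for the data-conflict pattern is only an observation obtained by comparing the quoted strings.

```python
# Control assertion pattern of state 70 quoted in the text (correct design).
CORRECT = "0x1180400000020000402008000800118004080"

HEX_DIGITS = "0123456789abcdefABCDEF"


def set_hex_digit(pattern: str, position: int, digit: str) -> str:
    """Return `pattern` with its `position`-th hex digit (1-based, from the
    left, not counting the '0x' prefix) replaced by `digit`."""
    if not pattern.startswith("0x"):
        raise ValueError("pattern must start with '0x'")
    body = pattern[2:]
    if not 1 <= position <= len(body):
        raise ValueError("position out of range")
    if len(digit) != 1 or digit not in HEX_DIGITS:
        raise ValueError("digit must be a single hex digit")
    return "0x" + body[:position - 1] + digit + body[position:]


def differing_digit_positions(a: str, b: str) -> list:
    """1-based positions at which two equally long patterns differ."""
    if len(a) != len(b):
        raise ValueError("patterns must have the same length")
    return [i + 1 for i, (x, y) in enumerate(zip(a[2:], b[2:])) if x != y]


def differing_bits(a: str, b: str) -> int:
    """Number of control bits that differ between two patterns."""
    return bin(int(a, 16) ^ int(b, 16)).count("1")


# "Inadequate set of micro-operations": 19th hex digit changed from 2 to 0.
inadequate = set_hex_digit(CORRECT, 19, "0")

# The data-conflict pattern quoted in the text differs from the previous
# faulty pattern in the 11th hex digit (observation from the strings).
conflict = set_hex_digit(inadequate, 11, "2")

if __name__ == "__main__":
    print(inadequate)
    print(conflict)
    print(differing_bits(CORRECT, inadequate), "bit(s) changed")
```

The same helpers can be used to compare any faulty pattern against the correct one, for instance to list which hex digits were altered before the pattern is handed to the FSMD construction step.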

quently, the equivalence checking step finds this mismatch of RT-operations in the FSMD transition.

We further introduce faults by randomly changing control bits from 1 to 0 or vice versa in the controller of all the benchmark examples. The number of control bits changed, the number of errors detected during FSMD construction and equivalence checking and their respective times for this experiment are tabulated in columns 12–16 (under the designation “erroneous design”) of Table III. It may be noted that the equivalence checking is not needed if some errors are detected during FSMD construction. In most of the cases, errors are found correctly during construction of the FSMD. In some cases, incorrect or redundant RT-operations are constructed but they are found subsequently during equivalence checking. Random modification of the control bits, in some cases, resulted in (benign) partial data flows in the datapath which did not realize any RT-operation. For such modified designs, no errors were reported because the original behavior is still realized. Our rewriting method does not report such partial data flows; this does not compromise the correctness of the method. Furthermore, for pipelined and multicycle operations, we have modified the control bits so that they are performed inconsistently. All these errors have also been detected. The number of control bits changed, the number of errors identified and the corresponding verification time are given in columns 13–15 of Table IV. It may be noted that the execution times of the method for the erroneous design are comparable with those for the correct design. In the next part of this experiment, we introduce faults in the datapath by altering the connections of some interconnection switches without changing their control assertions (the controller circuit stays unaltered). These errors are also successfully identified by our verifier; the time taken is comparable with other experiments.

VIII. Conclusion

High-level synthesis flow of complex circuits comprises various phases where each phase performs some specific tasks algorithmically providing for ingenious interventions of experts. The gap between the original behavior and the finally synthesized circuit is too wide to be analyzed by any monolithic reasoning mechanism. The validation tasks, therefore, must be planned to go hand in hand with each phase of synthesis. The present paper concerns itself with the validation of the datapath and controller generation phase of high-level synthesis.


The verification task is performed in two steps. In the first step, a novel and formally proven rewriting method is presented for finding the RT-operations performed in the datapath by a given control assertion pattern. Several inconsistencies in both the datapath and the controller may be revealed during construction of the FSMD (M2). Unlike many other reported techniques, this paper provides a completely automated verification procedure of pipelined, multicycle, and chained datapaths produced through high-level synthesis. Its correctness and complexity analysis are also given. In the second step, a state-based equivalence checking methodology is used to verify the correctness of the controller behavior. Experimental results on several HLS benchmarks demonstrate the effectiveness of our method.

Acknowledgment

The authors would like to thank the anonymous reviewers for their comments and suggestions which helped the authors immensely to improve this paper.

References

[1] P. Ashar, S. Bhattacharya, A. Raghunathan, and A. Mukaiyama, “Verification of RTL generated from scheduled behavior in a high-level synthesis flow,” in Proc. IEEE/Assoc. Comput. Machinery Int. Conf. Comput.-Aided Design, 1998, pp. 517–524.
[2] R. A. Bergamaschi, D. Lobo, and A. Kuehlmann, “Control optimization in high-level synthesis using behavioral do not cares,” in Proc. Design Automat. Conf., 1992, pp. 657–661.
[3] D. Borrione, J. Dushina, and L. Pierre, “A compositional model for the functional verification of high-level synthesis results,” IEEE Trans. Very-Large-Scale Integration Syst., vol. 8, no. 5, pp. 526–530, Oct. 2000.
[4] A. P. Chandrakasan, M. Potkonjak, R. Mehra, J. Rabaey, and R. W. Brodersen, “Optimizing power using transformations,” IEEE Trans. Comput.-Aided Design Integrated Circuits Syst., vol. 14, no. 1, pp. 12–31, Jan. 1995.
[5] T.-H. Chiang and L.-R. Dung, “Verification method of dataflow algorithms in high-level synthesis,” J. Syst. Softw., vol. 80, no. 8, pp. 1256–1270, 2007.
[6] H. Eveking, H. Hinrichsen, and G. Ritter, “Automatic verification of scheduling results in high-level synthesis,” in Proc. Design, Automat. Test Eur., Mar. 1999, pp. 59–64.
[7] X. Feng and A. J. Hu, “Early cutpoint insertion for high-level software versus RTL formal combinational equivalence verification,” in Proc. Design Automat. Conf., 2006, pp. 1063–1068.
[8] M. Fujita, “Equivalence checking between behavioral and RTL descriptions with virtual controllers and datapaths,” Assoc. Comput. Machinery Trans. Des. Autom. Electron. Syst., vol. 10, no. 4, pp. 610–626, 2005.
[9] D. D. Gajski, N. D. Dutt, A. C.-H. Wu, and S. Y.-L. Lin, “Architectural models in synthesis,” in High-Level Synthesis: Introduction to Chip and System Design. Boston, MA: Kluwer, 1992, pp. 27–61.
[10] S. Gupta, N. Dutt, R. Gupta, and A. Nicolau, “SPARK: A high-level synthesis framework for applying parallelizing compiler transformations,” in Proc. Int. Conf. Very-Large-Scale Integration Design, Jan. 2003, pp. 461–466.
[11] S. Gupta, N. Dutt, R. Gupta, and A. Nicolau, “Using global code motions to improve the quality of results for high-level synthesis,” IEEE Trans. Comput.-Aided Design Integrated Circuits Syst., vol. 23, no. 2, pp. 302–312, Feb. 2004.
[12] C. Karfa, J. S. Reddy, C. R. Mandal, D. Sarkar, and S. Biswas, “SAST: An interconnection aware high-level synthesis tool,” in Proc. 9th Very-Large-Scale Integration Design Test Symp., Bangalore, India, Aug. 2005, pp. 285–292.
[13] C. Karfa, D. Sarkar, and C. Mandal, “Verification of data-path and controller generation phase of high-level synthesis,” in Proc. Int. Conf. Adv. Comput. Commun., 2007, pp. 315–320.
[14] C. Karfa, D. Sarkar, C. Mandal, and P. Kumar, “An equivalence-checking method for scheduling verification in high-level synthesis,” IEEE Trans. Comput.-Aided Design Integrated Circuits Syst., vol. 27, no. 3, pp. 556–569, Mar. 2008.
[15] C. Karfa, D. Sarkar, C. Mandal, and C. Reade, “Register sharing verification during data-path synthesis,” in Proc. Int. Conf. Comput. Theory Applicat., 2007, pp. 135–140.
[16] Y. Kim, S. Kopuri, and N. Mansouri, “Automated formal verification of scheduling process using finite state machine with datapath (FSMD),” in Proc. Int. Symp. Quality Electron. Design, 2004, pp. 110–115.
[17] Y. Kim and N. Mansouri, “Automated formal verification of scheduling with speculative code motions,” in Proc. Great Lakes Symp. Very-Large-Scale Integration (GLSVLSI), 2008, pp. 95–100.
[18] S. Kundu, S. Lerner, and R. Gupta, “Validating high-level synthesis,” in Proc. Comput. Aided Verif., New York: ACM, 2008, pp. 459–472.
[19] N. Mansouri and R. Vemuri, “A methodology for automated verification of synthesized RTL designs and its integration with a high-level synthesis tool,” in Proc. Formal Methods Comput.-Aided Design, 1998, pp. 204–221.
[20] N. Mansouri and R. Vemuri, “Accounting for various register allocation schemes during post-synthesis verification of RTL designs,” in Proc. Design, Automat. Test Eur., Mar. 1999, pp. 223–230.
[21] N. Narasimhan, E. Teica, R. Radhakrishnan, S. Govindarajan, and R. Vemuri, “Theorem proving guided development of formal assertions in a resource-constrained scheduler for high-level synthesis,” in Proc. Int. Conf. Comput. Design, 1998, pp. 392–399.
[22] P. R. Panda and N. D. Dutt, “1995 high level synthesis design repository,” in Proc. Int. Symp. Syst. Synthesis, 1995, pp. 170–174.
[23] R. Radhakrishnan, E. Teica, and R. Vemuri, “An approach to high-level synthesis system validation using formally verified transformations,” in Proc. IEEE High Level Des. Validat. Test Workshop, 2000, pp. 80–85.
[24] D. Sarkar and S. C. De Sarkar, “Some inference rules for integer arithmetic for verification of flowchart programs on integers,” IEEE Trans. Softw. Eng., vol. 15, no. 1, pp. 1–9, Jan. 1989.
[25] J. Zory and F. Coelho, “Using algebraic transformations to optimize expression evaluation in scientific code,” in Proc. Parallel Architect. Compilation Tech., 1998, pp. 376–384.

Chandan Karfa received the B.Tech. degree in information technology from the University of Kalyani, Kalyani, India, in 2004, and the M.S. degree in computer science and engineering from the Indian Institute of Technology (IIT), Kharagpur, India, in 2007. Currently, he is working toward the Ph.D. degree at the Department of Computer Science and Engineering, IIT Kharagpur. His research interests include formal verification of high-level synthesis and embedded systems.

Dipankar Sarkar received the B.Tech. and M.Tech. degrees in electronics and electrical communication engineering, and the Ph.D. degree in engineering from the Indian Institute of Technology (IIT), Kharagpur, India. He has been a faculty member at the Department of Computer Science and Engineering, IIT, for the past 27 years. His research interests include formal verification of circuits and systems.

Chittaranjan Mandal received the Ph.D. degree from the IIT, Kharagpur, India, in 1997. He is currently an Associate Professor at the Department of Computer Science and Engineering, IIT, Kharagpur. Prior to joining IIT, he was with Jadavpur University, Calcutta, India, as a Reader. His research interests include formal modeling, high-level designs, and web technologies. Dr. Mandal has been an Industrial Fellow of Kingston University, London, U.K., since 2000. He was a recipient of a Royal Society Fellowship.
