
A Circuit for All Seasons

Behzad Razavi

TSPC Logic

Digital Object Identifier 10.1109/MSSC.2016.2603228
Date of publication: 14 November 2016

Since its introduction in the 1980s, true single-phase clock (TSPC) logic [1] has found widespread use in digital design. Originally proposed as a high-speed topology, the TSPC structure also consumes less power and occupies less area than other methods. In this article, we study the properties of this logic family.

Background
In the early 1980s, the design of high-speed digital CMOS circuits faced some interesting challenges. One general issue was related to clock distribution in complex chips; heavy capacitive loading and long interconnects caused both slow transitions and skew, making it especially difficult to distribute multiple, high-speed clock phases. On the other hand, it had already been recognized that dynamic logic afforded simpler, faster circuits that also occupied less area. For example, clocked CMOS (C²MOS) logic, introduced in 1973 [2] and illustrated in Figure 1(a), replaced more complex latches with a four-transistor dynamic implementation. This approach, however, required two nonoverlapping clock phases so as to avoid transparency during (slow) clock transitions. That is, clock generation and distribution had to deal with not only skews but the loss of timing due to nonoverlap intervals, making single-phase clocking more attractive.

Figure 1(b) depicts a single-phase approach. Merged with the dynamic latches, the logic is realized by NMOS or PMOS devices in alternate stages (NMOS and PMOS blocks, respectively). Here, when the clock (CK) is low, node X is precharged to VDD, and when CK goes high, the N block is enabled and, according to the inputs, keeps the ONE or discharges it to ZERO. The principal issue here is that the second stage begins to evaluate while the first precharges X, a race condition that can lead to a partially charged level at Y and hence an indeterminate logical value. This issue can be resolved by delaying the second stage's clock or by placing an inverter at the output of each stage. Shown in Figure 2, the latter method is called Domino logic [3]. Note that this family performs operations using NMOS transistors, with p-type switches acting as only reset devices. Domino, however, is a noninverting circuit, prohibiting some logical functions [4].

The inverters in Domino logic consume power while realizing no particularly useful function. We then consider including dynamic logic within the inverters. Shown in Figure 3, the result is called NORA logic [5], and the cost is two clock phases.

Both Domino and NORA circuits suffer from charge sharing; for example, when CK goes high in Figure 2, CX loses charge to CP if M1 is on and if VX must remain nominally high.


Figure 1: (a) C²MOS logic and (b) an example of single-phase clocking.


The resulting degradation is less serious in Domino due to the restoration provided by the inverter. Nevertheless, the inverter does draw a static current in such a case. By contrast, C²MOS logic is free from charge sharing (why?).
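
The size of the charge-sharing droop can be estimated from charge conservation. The sketch below is only an illustration under stated assumptions: CX is precharged to VDD, CP starts fully discharged, the switch between them is ideal, and the capacitor values and the function name `shared_voltage` are hypothetical rather than taken from any figure.

```python
def shared_voltage(c_x, c_p, v_x0, v_p0=0.0):
    """Voltage on two capacitors after an ideal switch connects them.

    Total charge is conserved: (c_x + c_p) * v_final = c_x * v_x0 + c_p * v_p0.
    """
    return (c_x * v_x0 + c_p * v_p0) / (c_x + c_p)


# Hypothetical values, chosen only for illustration.
VDD = 1.0      # supply voltage in volts
C_X = 10e-15   # capacitance at the precharged node X, in farads
C_P = 5e-15    # parasitic capacitance inside the logic block, in farads

v_x_after = shared_voltage(C_X, C_P, VDD)
droop = VDD - v_x_after
print(f"VX after sharing: {v_x_after:.3f} V (droop {droop:.3f} V)")
```

With these assumed values, VX settles at two thirds of VDD, which shows why a following stage or inverter with a threshold near VDD/2 has little margin left.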
The single-phase clocking of CMOS latches can be traced back to 1973, when Oguey and Vittoz reported the scheme shown in Figure 4 for a divide-by-two circuit [6]. Compared to C²MOS, this configuration employs fewer devices per branch. In 1974, Piguet filed a patent for the latch topology depicted in Figure 4(b) [7], where the clocked device in the first stage is tied to its output node.

Figure 2: Domino logic.

Figure 3: NORA logic.

Figure 4: (a) A single-phase frequency divider reported by Oguey and Vittoz in 1973 and (b) a latch filed for patent by Piguet in 1974.

In 1986, Christer Svensson of
Linköping University, Sweden, having read the NORA paper [5] and been intrigued by its properties, asked his Ph.D. student Ingemar Karlsson to investigate methods of improving its performance [8]. Karlsson came up with a different idea and ran some SPICE simulations that looked promising. Svensson then assigned the task to his other Ph.D. student, Jiren Yuan. Yuan modified Karlsson's topology and, in July 1986, reported his findings to Svensson. Figure 5 shows the TSPC topology drawn by Yuan in that memo [8].

Figure 5: Yuan's original drawing of a TSPC circuit.

TSPC gradually found its way into digital design. In 1992, Digital Equipment Corporation reported the use of TSPC in its Alpha microprocessor [9]. In 1993, Lu et al. exploited the idea in the design of a 700-MHz 24-b accumulator [10], and Rogenmoser et al. demonstrated its potential in a 1.16-GHz prescaler [11].

TSPC Principles
Let us return to the C²MOS topology of Figure 1(a) and, in the spirit of Piguet's circuit, remove one of the clocked transistors [Figure 6(a)]. When CK is high, the latch reduces to an inverter and operates properly. When CK is low, the circuit is in the store mode and retains the output state if A does not change or has only a low-to-high transition. If we precede this structure with a Domino stage that incorporates N-type logic [Figure 6(b)], we guarantee that, when CK goes low to precharge the first stage, the second stage's output remains intact. In summary, when CK is high, the first stage evaluates while the second senses, and when CK is low, the first stage is reset while the second stores.

Figure 6: (a) A dynamic latch with a single clocked device, (b) cascaded TSPC stages, and (c) a three-stage master-slave flip-flop operating as a frequency divider.

As an application example, TSPC can be used in a divide-by-two circuit. Since the cascade shown in Figure 6(b) does not invert, we precede it with a third TSPC stage using a clocked PMOS transistor [Figure 6(c)] and tie the output to the input [1]. Note that this arrangement exhibits no charge sharing.

It is possible to further reduce the number of clocked transistors through the use of split outputs [1]. Beginning with the structure of Figure 6(a) and recognizing that the drain and source voltages of M3 are roughly equal when CK is high, we follow the stage with an inverter but split the signal paths [Figure 7(a)] [1]. This latch passes A to X if CK is high and freezes X if CK is low and A has no high-to-low transitions. Since the high level at node B2 is equal to VDD − VTH3, transistor M5 receives less overdrive and suffers from some speed degradation.

Figure 7: (a) A TSPC stage with split outputs, (b) a complete latch with split paths, and (c) a split-output topology incorporating random logic.

To arrive at a master-slave flip-flop, we precede the foregoing cascade with another clocked branch with split outputs, as shown in Figure 7(b). This realization incorporates only two clocked devices, serving as an attractive candidate for large register files. It is interesting to note that, even though the first stage is sensitive to input transitions when CK is high, the overall
cascade is not. Specifically, suppose the second stage must store a ONE (so that X = ZERO). For this state to be overwritten when CK is high, A2 must rise, which is not possible because M6 is off. Similarly, if the second stage is storing a ZERO, only a fall in A1 can overwrite it, which cannot occur because M6 is off.

The TSPC latches described here can also employ random logic. For example, M1 and M2 in Figure 7(a) can be replaced with dual logic blocks [Figure 7(c)] [1]. In addition, a weak bleeder, transistor Mb, can be added [9] so as to improve immunity to noise and leakage.

In addition to less hardware and power, TSPC logic also affords designs having lower phase noise. With fewer transistors and faster transitions in the signal path, TSPC techniques lead to less phase noise in circuits such as frequency dividers and phase/frequency detectors (PFDs). For example, [12] reports 6 dB of phase noise reduction if a PFD design incorporates TSPC logic rather than static CMOS gates.

As with basic static CMOS gates, the TSPC implementations studied previously are unratioed, i.e., their NMOS and PMOS device widths need not satisfy certain ratios for the circuits to operate properly. Both classes also exhibit zero static power dissipation (except for that due to leakage). To improve the speed, we can allow some static current and construct the master-slave flip-flop shown in Figure 8 [13]. Here, a TSPC stage serves as the master and the last two stages as the slave. When CK is high, B = Ā, C = ZERO, and X stores a logical value. After CK goes low, C = B̄ and X = C̄ = B. We observe that this operation requires that M5 and M7 be strong enough to impress a logical ZERO at their drain nodes when their corresponding PMOS device is on. The circuit draws a static current through M5 when CK is high and through M7 when it is low. In a typical design, we choose all of the transistor widths to be roughly the same, except for W5, which should be two to three times greater so as to maximize the speed.

Figure 8: A ratioed TSPC flip-flop.

Figure 9: The problem of race in TSPC stages.
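
The divide-by-two action of the three-stage flip-flop of Figure 6(c) can be checked with a zero-delay, switch-level model. The sketch below is an illustration only. It assumes the usual arrangement implied by the text: a first stage with a clocked PMOS device, a precharged N-type stage, and the single-clocked-device latch of Figure 6(a), with the output tied back to the input. The node names `n1`, `n2`, and `qb` and the function names are hypothetical, and the model ignores delays, glitches, leakage, and charge sharing.

```python
def settle(ck, n1, n2, qb):
    """Relax the three dynamic nodes to a fixed point for one clock level."""
    for _ in range(8):
        d = qb  # divider connection: output fed back to the input
        # Stage 1 (clocked PMOS): pulls low if d is high, pulls high only
        # when both d and ck are low, otherwise the node floats and holds.
        new_n1 = 0 if d == 1 else (1 if ck == 0 else n1)
        # Stage 2 (precharged N-type stage): precharges high when ck is low,
        # discharges when ck and n1 are both high, otherwise holds.
        new_n2 = 1 if ck == 0 else (0 if new_n1 == 1 else n2)
        # Stage 3 (latch with one clocked NMOS): pulls high if n2 is low,
        # pulls low when n2 and ck are both high, otherwise holds.
        new_qb = 1 if new_n2 == 0 else (0 if ck == 1 else qb)
        if (new_n1, new_n2, new_qb) == (n1, n2, qb):
            break
        n1, n2, qb = new_n1, new_n2, new_qb
    return n1, n2, qb


def simulate_divider(clock_levels):
    """Return the output level after each clock level in the sequence."""
    state = settle(0, 0, 0, 0)  # start from a settled state with ck low
    outputs = []
    for ck in clock_levels:
        state = settle(ck, *state)
        outputs.append(state[2])
    return outputs


clock = [0, 1, 0, 1, 0, 1, 0, 1]
out = simulate_divider(clock)
print("CK :", clock)
print("out:", out)
```

In this idealized model the output changes only when CK rises, so it completes one cycle for every two clock cycles. A transistor-level simulation is still needed to see the glitches and low-frequency failures discussed under Design Considerations.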


Design Considerations
As with other dynamic logic families, TSPC circuits fail at sufficiently low clock frequencies. Transistor leakages arising from subthreshold conduction and source and drain junctions corrupt the stored states if the clock period is excessively long. This issue typically becomes more serious at high temperatures, demanding careful simulations. As a rule of thumb, we consider these effects for clock rates below 100 MHz.

The use of a single clock phase can, in fact, create a race condition, thereby producing glitches at some nodes in TSPC circuits [11]. Consider, for example, the flip-flop shown in Figure 6(c), whose last two stages are shown in Figure 9 with the assumption that B is high and so is the state stored at X. Now suppose CK goes high, activating both M4 and M3. Since M3 turns on while A is still high (around t = t1), X begins to fall until A has dropped enough to turn on M1. Consequently, X experiences a potentially large glitch that may be misinterpreted by subsequent stages. This issue can be ameliorated by making M4 and M5 stronger and M2 and M3 weaker.

Questions for the Reader
1) Can the third stage in the frequency divider of Figure 6(c) be a simple, unclocked inverter?
2) Can the frequency divider of Figure 6(c) generate an output with a 50% duty cycle?

Answers to Last Issue's Questions
1) Brokaw's bandgap, shown in Figure 10, contains both positive and negative feedback. Prove that the negative-feedback loop is stronger.

The negative- and positive-feedback loops consist of amplifier A and transistors Q1 and Q2, respectively. Carrying equal currents, the two transistors have equal transconductances, but Q2 is degenerated by R2. As a result, the positive feedback is weaker.

Figure 10: Brokaw's bandgap circuit.

2) Is the op amp offset also scaled down in the circuit of Figure 11?

If the op amp has an input-referred offset of VOS, then the output voltage assumes the form

$$V_{out} = \frac{R_5}{R_5 + R_4}\left[\frac{R_4}{R_3}\left(V_T \ln n - V_{OS}\right) + V_{BE3}\right]. \qquad (1)$$

We note that the offset is also scaled down.

Figure 11: Low-voltage reference.

Erratum
In Figure 1 of last issue's column [14], the base-emitter voltage of Q3 should read VBE3.

References
[1] J. Yuan and C. Svensson, "High-speed CMOS circuit technique," IEEE J. Solid-State Circuits, vol. 24, pp. 62–70, Feb. 1989.
[2] Y. Suzuki, K. Odagawa, and T. Abe, "Clocked CMOS calculator circuitry," IEEE J. Solid-State Circuits, vol. 8, pp. 462–469, Dec. 1973.
[3] R. H. Krambeck, C. M. Lee, and H. S. Law, "High-speed compact circuits with CMOS," IEEE J. Solid-State Circuits, vol. 17, pp. 614–619, June 1982.
[4] M. Shoji, CMOS Digital Circuit Technology. Englewood Cliffs, NJ: Prentice Hall.
[5] N. P. Goncalves and H. J. de Man, "NORA: A racefree dynamic CMOS technique for pipelined logic structures," IEEE J. Solid-State Circuits, vol. 18, pp. 261–268, June 1983.
[6] H. Oguey and E. Vittoz, "CODYMOS frequency dividers achieve low power consumption and high frequency," Electron. Lett., vol. 9, pp. 386–387, Aug. 1973.
[7] C. Piguet, "Logic circuit for bistable D-dynamic flip-flops," U.S. Patent 4 057 741, Nov. 8, 1977.
[8] C. Svensson, private communication, Aug. 2016.
[9] D. Dobberpuhl, R. Witek, R. Allmon, R. Anglin, D. Bertucci, S. Hassoun, G. Hoeppner, K. Kuchler, D. Meyer, J. Montanaro, D. Priore, V. Rajagopalan, S. Samudrala, and S. Santhanam, "A 200-MHz 64-b dual-issue CMOS microprocessor," IEEE J. Solid-State Circuits, vol. 27, pp. 1555–1567, Nov. 1992.
[10] F. Lu, H. Samueli, J. Yuan, and C. Svensson, "A 700-MHz 24-b pipelined accumulator in 1.2-μm CMOS for application as a numerically-controlled oscillator," IEEE J. Solid-State Circuits, vol. 28, pp. 878–886, Aug. 1993.
[11] R. Rogenmoser, N. Felber, Q. Huang, and W. Fichtner, "1.16 GHz dual-modulus 1.2 μm CMOS prescaler," in Proc. IEEE Custom Integrated Circuits Conf., 1993, pp. 27.6.1–27.6.4.
[12] A. Homayoun and B. Razavi, "Analysis of phase noise in phase/frequency detectors," IEEE Trans. Circuits Syst. I, vol. 60, pp. 529–539, Mar. 2013.
[13] B. Chang, J. Park, and W. Kim, "A 1.2-GHz CMOS dual-modulus prescaler using new dynamic D-type flip-flops," IEEE J. Solid-State Circuits, vol. 31, pp. 749–754, May 1996.
[14] B. Razavi, "The bandgap reference," IEEE Solid-State Circuits Mag., vol. 8, pp. 9–12, Summer 2016.