
ELECTRONICS LETTERS 3rd May 2018 Vol. 54 No. 9 pp. 556–558

Precharge free dynamic content addressable memory

T.V. Mahendra, S.W. Hussain, S. Mishra and A. Dandapat✉

A precharge free dynamic content addressable memory (DCAM) is introduced for low-power and high-speed search applications. Eliminating the precharge prior to each search allows the hardware engine to perform more searches within the stipulated time. The proposed DCAM cell not only removes the precharge of the matchline (ML) but also decouples the bitline and searchline, so that unwanted capacitive coupling is minimised at the charge storage nodes. A 512 bit array of the proposed scheme is implemented in 45 nm CMOS technology, and its efficacy is verified through 1000-point Monte-Carlo sampling of the ML voltage as well as multi-search dissipation analysis.

Introduction: The significance of high-density content addressable memory (CAM) is rapidly increasing with the growing demand for hardware search engines in modern computing and security systems. Software algorithms are limited by low search speed [1], while a hardware engine such as CAM fits well in network applications. CAM performs a search operation within a single clock cycle [1] because the search key is compared in parallel with all stored words in the memory array. Static CAM uses a static random access memory cell for storage and hence dissipates unwanted leakage, besides putting constraints on the cell area. To resolve these issues, associative memory designs were extended to several dynamic memories. Mundy [2] introduced the first dynamic CAM (DCAM) cell, which was later improved in [3–5] with more efficient schemes. Typically, traditional CAM operation is divided into a write phase and an evaluation phase: during write, data is written into the storage nodes, and evaluation consists of a precharge of the MLs followed by a search for the desired word. It is worth noting that the repetitive precharge prior to every search adds ML power and an extra cycle.

Fig. 1 DCAM cells and timing diagram of their operational phases
a 4T DCAM [4]
b 5T DCAM [5]
c Proposed PF-DCAM
d Precharge-type DCAM operational phases
e Proposed PF-DCAM operational phases

Owing to technology scaling, design components also have to shrink. Existing DCAM works accomplished part of this by increasing density in the cell and, more widely, in the array, yet their applicability is limited by low speed. This has nevertheless attracted researchers to improve DCAM designs for performance efficiency. Previous DCAM works, shown in Fig. 1a [4] and Fig. 1b [5], are based on ML precharge, which increases evaluation time and power consumption as the number of searches grows, owing to undesired switching from frequent precharging. An approach that reduces precharge power while maintaining an effective evaluation time is presented in this Letter by introducing a precharge free DCAM (PF-DCAM) that eliminates the precharge prior to every search operation. Removing the precharge reduces evaluation to the search phase only, and therefore simplifies CAM operation to a single write phase followed by multiple search phases. In other words, the DCAM accommodates more searches, and the power dissipation of the PF-DCAM design is lower than that of other DCAMs because the precharge power is removed and the switching from precharge to search and vice versa is eliminated. To the best of our knowledge, this Letter proposes the first ever PF-DCAM cell.

Proposed PF-DCAM cell: The PF-DCAM cell shown in Fig. 1c works in the absence of precharge, which allows designers to perform a search easily without affecting CAM operation. The elimination of the extra (precharge) cycle thus makes the cell useful for a wide range of low-power and high-speed applications. Such a precharge free design also reduces the time needed for evaluation. The advantages of PF-DCAM over other existing designs are most pronounced in repetitive search applications. Operation of the proposed PF-DCAM cell is as follows:

Write phase: The word line (WL) is kept HIGH and data is passed into the storage nodes (N1, N2) through the complementary bitlines (BL and $\overline{\mathrm{BL}}$). During this phase, the searchlines (SL and $\overline{\mathrm{SL}}$) are held LOW to avoid a false state.

Search phase: No ML precharge is required in the proposed cell for any evaluation. The search bit is fed onto the searchlines while WL is held LOW during this phase. If the search content matches the stored data, a LOW logic level passes to node 'S' through T3 or T4 and turns T5 ON while T6 is OFF; ML charges to a HIGH value through path P2 as in Fig. 1c, since the MLI of the first cell is connected to the supply. If a mismatch occurs, T3 or T4 passes a HIGH logic level to node 'S', turning T6 ON; ML discharges to ground (GND) through path P1 while T5 is kept OFF. A HIGH value on ML indicates that the data is found (match), whereas a LOW value implies that the data is not found (mismatch). The operation is also summarised in the timing diagram of Fig. 1e.

The precharge free design allows the CAM to perform more searches within the evaluation time and results in lower dissipation per search at high speed. Eliminating the precharge phase improves evaluation speed by 50% over existing DCAMs, as half of the evaluation duration is removed in the proposed design. Equations (1) and (2) give the evaluation duration required to perform 'N' searches.

In a precharge-based CAM design, the evaluation time to perform 'N' searches is:

$$\text{evaluation time} = (N \times T) + \text{ML delay} \tag{1}$$

where N is the number of searches and T is the precharge cycle time.

The proposed design requirement for the evaluation of 'N' searches is:

$$\text{evaluation time} = \text{ML delay} \tag{2}$$

Results: CAM arrays of 32 × 16 bit have been implemented at the 45 nm technology node for the proposed PF-DCAM, the 4T DCAM [4] and the 5T DCAM [5]. The existing DCAMs and the PF-DCAM have been implemented with stacks of NAND-type ML structure. First, data have been written into the storage nodes of the cells, and repetitive searches of 25, 50, 75 and 100 keys have then been performed according to Figs. 1d and e to validate the importance of the precharge free design in search-intensive applications. Post-layout simulations of the proposed and compared DCAM architectures are carried out in SPECTRE at the typical process corner under a 1 V supply. To take into account the effect of precharge in existing DCAMs, time 'T' is considered as the extra cycle time of ML precharge before every search in the CAM evaluation phase. For a single search, the 4T DCAM takes (T + 6.23 ns) to detect the match/miss condition and the 5T DCAM takes (T + 5.20 ns), whereas the proposed design takes only 3.8 ns. The advantage of precharge elimination becomes more significant as the number of searches increases. Of the two phases of traditional CAM operation, a major contribution to power consumption comes from the evaluation phase, which comprises precharge and search. To limit power dissipation in the evaluation phase, PF-DCAM eliminates the precharge and keeps only the search during evaluation, so that the switching of ML between precharge and search is removed. Fig. 2 shows the proposed architecture, with the layout of a single cell included to show the interconnects. In the 5T DCAM and the proposed PF-DCAM, power dissipation decreases as the number of searches increases, but in the 4T DCAM dissipation increases for the same variation of searches owing to the coupled bitline and searchline depicted in Fig. 1a. The use of decoupled bitlines and searchlines in the cell structure lowers the average power; it results in an improvement of 55% and 78.85% compared with the 5T DCAM and the 4T DCAM, respectively, at 75 repetitive searches.

For a varied number of searches, a cumulative analysis based on post-layout simulation is shown in Fig. 3 to verify the merits and improvements of the proposed design. Further, the feasibility of the proposed scheme is checked through mismatch analysis of 1000 random searches using Monte-Carlo simulation. The results of the analysis are depicted in Fig. 4, which shows the variation of ML current and ML voltage swing for mismatched search keys. During the search cycle starting from 100 ns, the ML voltages of the different search keys corresponding to mismatch conditions show negligible variation in level, as they settle to the LOW level within a short search time; thus the proposed PF-DCAM operates well under possible mismatch variations. The different components of power consumption and the energy required for 100 searches are summarised in Table 1 for a supply of 1 V at room temperature.
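As a rough illustration of equations (1) and (2), the sketch below evaluates both expressions for the search counts used in the simulations. The function names are hypothetical, and the value of the precharge cycle time T is an assumption made purely for illustration, since the Letter does not give a number for T. The ML delays are the single-search figures quoted in the Results (6.23 ns, 5.20 ns and 3.8 ns), reused here as the ML delay term.

```python
"""Evaluation-time sketch for equations (1) and (2) (illustrative only)."""

# Single-search ML delays quoted in the Results, in ns.
ML_DELAY_NS = {"4T DCAM [4]": 6.23, "5T DCAM [5]": 5.20, "PF-DCAM": 3.8}

# ASSUMPTION: the Letter gives no numeric value for T; 5 ns is a placeholder.
ASSUMED_PRECHARGE_CYCLE_NS = 5.0


def precharge_eval_time_ns(n_searches, precharge_cycle_ns, ml_delay_ns):
    """Equation (1): evaluation time = (N * T) + ML delay."""
    if n_searches < 0:
        raise ValueError("number of searches must be non-negative")
    return n_searches * precharge_cycle_ns + ml_delay_ns


def pf_eval_time_ns(ml_delay_ns):
    """Equation (2): evaluation time = ML delay (no precharge term)."""
    return ml_delay_ns


def precharge_overhead_ns(n_searches, precharge_cycle_ns):
    """The N * T term that appears in equation (1) but not in equation (2)."""
    return n_searches * precharge_cycle_ns


if __name__ == "__main__":
    t = ASSUMED_PRECHARGE_CYCLE_NS
    for n in (25, 50, 75, 100):
        eq1_4t = precharge_eval_time_ns(n, t, ML_DELAY_NS["4T DCAM [4]"])
        eq1_5t = precharge_eval_time_ns(n, t, ML_DELAY_NS["5T DCAM [5]"])
        eq2_pf = pf_eval_time_ns(ML_DELAY_NS["PF-DCAM"])
        print(f"N={n:3d}  4T: {eq1_4t:7.2f} ns  5T: {eq1_5t:7.2f} ns  "
              f"PF: {eq2_pf:5.2f} ns  precharge term: "
              f"{precharge_overhead_ns(n, t):6.1f} ns")
```

With any positive T, the N × T term grows linearly with the number of searches, which is the part of the evaluation time that the precharge free scheme is intended to remove.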

SL Table 1: Power performance comparison summary


SL
Parameter 4 T DCAM [4] 5 T DCAM [5] Proposed
technology 45 nm 45 nm 45 nm
supply voltage (V) 1.0 1.0 1.0
average power (nW) 1540 464.6 195.2
peak power (mW) 3.29 4.41 1.57
MLI VDD GND ML static power (nW) 56.5 0.321 1.92
energy (pJ) 30.98 9.33 1.97
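The relative savings implied by Table 1 can be worked out directly from its rows. The sketch below is only arithmetic on the tabulated values; the helper names are hypothetical. The "implied duration" additionally assumes that energy equals average power multiplied by the total operation time, which the Letter does not state explicitly.

```python
"""Derived figures from Table 1 (100 searches, 1 V supply); arithmetic only."""

TABLE_1 = {
    "4T DCAM [4]": {"average_power_nW": 1540.0, "energy_pJ": 30.98},
    "5T DCAM [5]": {"average_power_nW": 464.6, "energy_pJ": 9.33},
    "Proposed": {"average_power_nW": 195.2, "energy_pJ": 1.97},
}


def percent_reduction(baseline, proposed):
    """Reduction of `proposed` relative to `baseline`, in percent."""
    return 100.0 * (baseline - proposed) / baseline


def implied_duration_us(energy_pJ, average_power_nW):
    """ASSUMPTION: energy = average power * duration.

    pJ / nW = 1e-12 J / 1e-9 W = 1e-3 s, so multiply by 1e3 to get microseconds.
    """
    return energy_pJ / average_power_nW * 1e3


if __name__ == "__main__":
    proposed = TABLE_1["Proposed"]
    for name, row in TABLE_1.items():
        duration = implied_duration_us(row["energy_pJ"], row["average_power_nW"])
        line = f"{name:12s} implied duration: {duration:6.2f} us"
        if name != "Proposed":
            power_cut = percent_reduction(row["average_power_nW"],
                                          proposed["average_power_nW"])
            energy_cut = percent_reduction(row["energy_pJ"],
                                           proposed["energy_pJ"])
            line += (f"  power saving: {power_cut:5.2f}%"
                     f"  energy saving: {energy_cut:5.2f}%")
        print(line)
```

Under that assumption, the implied duration for the proposed design comes out at roughly half that of the two precharge-based designs, which is in line with the 50% evaluation-time reduction claimed in the text.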
Fig. 2 Proposed PF-DCAM array, showing the cell layout
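The search-phase behaviour of the cell and the NAND-type ML structure of the array can be sketched at the logic level. This is a behavioural abstraction under stated assumptions, not the authors' circuit: node 'S' is modelled as the XOR of the stored and search bits, each cell's ML is assumed to drive the MLI of the next cell in the word, and dynamic charge storage, timing and all electrical effects are ignored. All function names are hypothetical.

```python
"""Logic-level sketch of a precharge free NAND-type matchline (illustrative)."""


def cell_ml_out(stored_bit, search_bit, mli):
    """One cell: pass MLI to ML on a match, pull ML LOW on a mismatch."""
    s_node = stored_bit ^ search_bit  # ASSUMED model: LOW on match, HIGH on mismatch
    if s_node == 0:
        return mli  # T5 ON, T6 OFF: ML follows MLI through path P2
    return 0        # T6 ON, T5 OFF: ML discharges to GND through path P1


def word_match(stored_word, search_key):
    """Chain the cells of one word; MLI of the first cell is tied to supply."""
    if len(stored_word) != len(search_key):
        raise ValueError("stored word and search key must have equal length")
    ml = 1  # HIGH: MLI of the first cell is connected to the supply
    for stored_bit, search_bit in zip(stored_word, search_key):
        ml = cell_ml_out(stored_bit, search_bit, ml)
    return ml  # HIGH (1) = match, LOW (0) = mismatch


def search_array(stored_words, search_key):
    """Compare the key against every stored word; no precharge step between calls."""
    return [word_match(word, search_key) for word in stored_words]


if __name__ == "__main__":
    array = [[1, 0, 1, 1], [0, 0, 1, 1], [1, 1, 1, 0]]
    for key in ([1, 0, 1, 1], [1, 1, 1, 0], [0, 1, 0, 1]):
        print(key, "->", search_array(array, key))
```

Because the final ML level depends only on the current key and the stored data, consecutive searches in this model need no intermediate reset, which mirrors the single-write, multi-search operation described for the proposed cell.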
Fig. 3 Average power at different repetitive number of searches
(plot: average power, nW, versus number of searches (25, 50, 75, 100) for the proposed 6T PF-DCAM, 5T DCAM [5] and 4T DCAM [4])

Conclusion: This Letter introduces the first PF-DCAM. The proposed PF-DCAM performs a higher number of searches at minimum power dissipation and with high evaluation speed compared to existing precharge-type DCAMs. PF-DCAM also reduces evaluation time by 50% relative to the precharge-based designs, as it accommodates an extra search in place of the precharge during evaluation. This precharge free scheme would be suited to high-density storage and to high-performance, low-dissipation search in various applications.

Acknowledgment: This research was supported in part by the Science and Engineering Research Board under project YSS/2015/001198, and the Ministry of Electronics and Information Technology (MeitY) under project SMDP-C2SD 9(1)/2014-MDD, Government of India.

© The Institution of Engineering and Technology 2018
Submitted: 16 February 2018 E-first: 27 March 2018
doi: 10.1049/el.2018.0592
One or more of the Figures in this Letter are available in colour online.

T.V. Mahendra, S.W. Hussain, S. Mishra and A. Dandapat (Department of Electronics and Communication Engineering, National Institute of Technology Meghalaya, Shillong 793003, India)
✉ E-mail: anup.dandapat@nitm.ac.in

Fig. 4 Matchline current and voltage at Monte-Carlo simulation of 1000 runs
(plots: ML current I, µA, versus time, ns; ML voltage V, V, versus time, ns; write and search phases marked)

References

1 Mishra, S., Mahendra, T.V., and Dandapat, A.: 'A 9-T 833-MHz 1.72-fJ/Bit/search quasi static ternary fully associative cache tag with selective matchline evaluation for wire speed applications', Trans. Circuits Syst. I Reg. Papers, 2016, 63, (11), pp. 1910–1920
2 Mundy, J.L.: 'High density four transistor MOS content addressable memory', U.S. Patent 3,701,980, issued October 1972
3 Wade, J.P., and Sodini, C.G.: 'Dynamic cross-coupled bit-line content addressable memory cell for high-density arrays', J. Solid-State Circuits, 1987, 22, (1), pp. 119–121
4 Chae, M., Lee, J.-W., and Hong, S.H.: 'A decoupled 4T dynamic CAM suitable for high density storage', Electron. Lett., 2011, 47, (7), pp. 434–436
5 Vinogradov, V., Ha, J., Lee, C., et al.: 'Dynamic ternary CAM for hardware search engine', Electron. Lett., 2014, 50, (4), pp. 256–258
