
Power-Efficient Explicit-Pulsed Dual-Edge Triggered Sense-Amplifier Flip-Flops

This paper appears in: Very Large Scale Integration (VLSI) Systems, IEEE Transactions on Issue Date: Jan. 2011 Volume: 19 Issue:1 On page(s): 1 - 9 ISSN: 1063-8210 References Cited: 15 INSPEC Accession Number: 11704471 Digital Object Identifier: 10.1109/TVLSI.2009.2029116 Date of Publication: 15 December 2009 Date of Current Version: 23 December 2010 Sponsored by: IEEE Circuits and Systems Society IEEE Computer Society IEEE Solid-State Circuits Society

ABSTRACT A novel explicit-pulsed dual-edge triggered sense-amplifier flip-flop (DET-SAFF) for low-power and high-performance applications is presented in this paper. By incorporating the dual-edge triggering mechanism in the new fast latch and employing conditional precharging, the DET-SAFF achieves low power consumption with a small delay. To further reduce the power consumption at low switching activities, a clock-gated sense-amplifier flip-flop (CG-SAFF) is employed. Extensive post-layout simulations show that the proposed DET-SAFF exhibits both low-power and high-speed properties, with delay and power reductions of up to 43.3% and 33.5%, respectively, relative to the prior art. When the switching activity is less than 0.5, the proposed CG-SAFF demonstrates its superiority in terms of power reduction. At zero input switching activity, the CG-SAFF can realize up to 86% power saving. Lastly, a modification to the proposed circuit has led to a DET-SAFF with an improved common-mode rejection ratio (CMRR).

SRAM Write-Ability Improvement With Transient Negative Bit-Line Voltage


This paper appears in: Very Large Scale Integration (VLSI) Systems, IEEE Transactions on Issue Date: Jan. 2011 Volume: 19 Issue:1 On page(s): 24 - 32 ISSN: 1063-8210 References Cited: 17 INSPEC Accession Number: 11704480 Digital Object Identifier: 10.1109/TVLSI.2009.2029114 Date of Publication: 30 October 2009 Date of Current Version: 23 December 2010 Sponsored by: IEEE Circuits and Systems Society IEEE Computer Society IEEE Solid-State Circuits Society

ABSTRACT Increasing variations in device parameters significantly degrade the write-ability of SRAM cells in deep sub-100 nm CMOS technology. In this paper, a transient negative bit-line voltage technique is presented to improve the write-ability of SRAM cells. Capacitive coupling is used to generate a transient negative voltage at the low-going bit-line during the Write operation without using any on-chip or off-chip negative voltage source. Statistical simulations in a 45-nm PD/SOI technology show a 10^3X reduction in the Write-failure probability with the proposed method.
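
The capacitive-coupling idea can be illustrated with a first-order charge-sharing estimate. This is a minimal sketch, not the circuit from the paper: it assumes an ideal boost capacitor whose far plate falls from VDD to 0 V while the bit-line floats, and the function name, parameter names, and capacitance values are all hypothetical.

```python
def coupled_bitline_voltage(vdd, c_boost, c_bitline, v_initial=0.0):
    """Estimate the voltage on a floating bit-line after the far plate of an
    ideal boost capacitor is driven from vdd down to 0 V.

    First-order capacitive divider only: leakage, device parasitics and the
    finite edge rate of the driver are ignored (assumptions of this sketch).
    """
    if c_boost < 0 or c_bitline <= 0:
        raise ValueError("capacitances must be positive")
    coupling_ratio = c_boost / (c_boost + c_bitline)
    return v_initial - vdd * coupling_ratio


# Hypothetical values: 1.0 V supply, 20 fF boost capacitor, 80 fF bit-line.
v_neg = coupled_bitline_voltage(vdd=1.0, c_boost=20e-15, c_bitline=80e-15)
print(f"transient bit-line voltage: {v_neg:.3f} V")
```

Under these assumptions a larger boost capacitor relative to the bit-line capacitance gives a deeper negative excursion, with no dedicated negative supply involved.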

Design and Optimization of Power-Gated Circuits With Autonomous Data Retention


This paper appears in: Very Large Scale Integration (VLSI) Systems, IEEE Transactions on Issue Date: Feb. 2011 Volume: 19 Issue:2 On page(s): 227 - 236 ISSN: 1063-8210 References Cited: 24 INSPEC Accession Number: 11762132 Digital Object Identifier: 10.1109/TVLSI.2009.2033356 Date of Publication: 17 November 2009 Date of Current Version: 20 January 2011 Sponsored by: IEEE Circuits and Systems Society IEEE Computer Society IEEE Solid-State Circuits Society

ABSTRACT Power gating has been widely employed to reduce subthreshold leakage. Data retention elements (flip-flops and isolation circuits) are used to preserve circuit states during standby mode if the states are needed again after wake-up. These elements must be controlled by an external power management unit, which requires a network of control signals implemented with extra wires and buffers. A power-gated circuit with autonomous data retention (APG) is proposed to remove the overhead involved in control signals. Retention elements in APG derive their control by detecting the rising potential of virtual ground rails when power gating starts, i.e., they control themselves without explicit control signals. The design of retention elements for APG is addressed to facilitate safe capturing of circuit states. Experiments with 65-nm technology demonstrate that, compared to standard power gating, total wirelength and average wiring congestion are reduced by 8.6% and 4.1% on average, respectively, at a cost of a 6.8% area increase. In order to charge virtual ground rails quickly, a pMOS switch driven by a short pulse is employed to directly provide charge to virtual ground. This helps retention elements avoid short-circuit current while making the transition to standby mode. The optimization procedure for sizing the pMOS switch and deciding the pulse width is addressed, and assessed with 65-nm technology. Experiments show that, compared to standard power gating, APG reduces the delay to enter and exit the standby mode by 65.6% and 28.9%, respectively, with the corresponding energy dissipation during the period cut by 46.1% and 36.5%. Standby mode leakage power consumption is also reduced by 15.8% on average.

Runtime Resonance Noise Reduction with Current Prediction Enabled Frequency Actuator
This paper appears in: Very Large Scale Integration (VLSI) Systems, IEEE Transactions on Issue Date: March 2011 Volume: 19 Issue:3 On page(s): 508 - 512 ISSN: 1063-8210 References Cited: 13 INSPEC Accession Number: 11834064 Digital Object Identifier: 10.1109/TVLSI.2009.2036266 Date of Publication: 11 December 2009 Date of Current Version: 22 February 2011 Sponsored by: IEEE Circuits and Systems Society IEEE Computer Society IEEE Solid-State Circuits Society

ABSTRACT A power delivery network (PDN) is a distributed resistance-inductance-capacitance (RLC) network with its dominant resonance frequency in the low-to-middle frequency range. Though the working frequencies of high-performance chips are in general much higher than this resonance frequency, the chip runtime loading frequency is not. When a chip executes a chunk of instructions repeatedly, the induced current load may have harmonic components close to this resonance frequency, causing excessive power integrity degradation. Existing PDN design solutions, however, mainly target high-frequency noise and are not effective at suppressing such resonance noise. In this work, we propose a novel approach to proactively suppress this type of noise. A method based on the high-dimension generalized Markov process is developed to predict current load variation. Based on this prediction, a clock frequency actuator design is proposed to proactively select an optimal clock frequency to suppress the resonance. To the best of our knowledge, this is the first in-depth study on proactively reducing instruction-loop-induced PDN resonance noise at runtime.

Leakage Power and Circuit Aging Cooptimization by Gate Replacement Techniques


This paper appears in: Very Large Scale Integration (VLSI) Systems, IEEE Transactions on Issue Date: April 2011 Volume: 19 Issue:4 On page(s): 615 - 628 ISSN: 1063-8210 References Cited: 49 INSPEC Accession Number: 11883065 Digital Object Identifier: 10.1109/TVLSI.2009.2037637 Date of Publication: 12 January 2010 Date of Current Version: 22 March 2011 Sponsored by: IEEE Circuits and Systems Society IEEE Computer Society IEEE Solid-State Circuits Society

ABSTRACT As technology scales, the aging effect caused by negative bias temperature instability (NBTI) has become a major reliability concern. In the meantime, reducing leakage power remains one of the key design goals. Because both NBTI-induced circuit degradation and standby leakage power have a strong dependency on the input vectors, the input vector control (IVC) technique could be adopted to reduce the leakage power and mitigate NBTI-induced degradation. The IVC technique, however, is ineffective for larger circuits. Consequently, in this paper, we propose two gate replacement algorithms [the direct gate replacement (DGR) algorithm and the divide and conquer-based gate replacement (DCBGR) algorithm], together with optimal input vector selection, to simultaneously reduce the leakage power and mitigate NBTI-induced degradation. Our experimental results on 23 benchmark circuits reveal the following. 1) Both the DGR and DCBGR algorithms outperform the pure IVC technique by 15%-30% with 5% delay relaxation for three different design goals: leakage power reduction only, NBTI mitigation only, and leakage/NBTI cooptimization. 2) The DCBGR algorithm leads to better optimization results and saves on average more than 10X runtime compared to the DGR algorithm. 3) The area overhead for leakage reduction is much more than that for NBTI mitigation.
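
The input-vector dependence that IVC exploits can be shown on a toy circuit. The sketch below is not the DGR or DCBGR algorithm from the paper: it is a plain exhaustive search over a hypothetical two-gate netlist (a NAND2 feeding a NOR2), and the per-state leakage numbers are made-up illustrative values in arbitrary units.

```python
from itertools import product

# Hypothetical per-input-state leakage tables (arbitrary units, illustrative only).
NAND2_LEAK = {(0, 0): 1.0, (0, 1): 2.5, (1, 0): 2.0, (1, 1): 4.0}
NOR2_LEAK = {(0, 0): 3.5, (0, 1): 2.0, (1, 0): 2.2, (1, 1): 1.0}


def total_leakage(a, b, c):
    """Standby leakage of the toy netlist n1 = NAND(a, b), out = NOR(n1, c)."""
    n1 = 1 - (a & b)  # NAND2 output drives the first NOR2 input
    return NAND2_LEAK[(a, b)] + NOR2_LEAK[(n1, c)]


def best_input_vector():
    """Exhaustively search all primary-input vectors for minimum leakage."""
    return min(product((0, 1), repeat=3), key=lambda v: total_leakage(*v))


vector = best_input_vector()
print("lowest-leakage input vector:", vector, "leakage:", total_leakage(*vector))
```

In this toy case the primary inputs directly set the state of both gates. In a deeper circuit the states of internal gates are fixed by upstream logic, which is the situation gate replacement is meant to address.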

Switched-capacitor DC-DC converters for low-power on-chip applications


This paper appears in: Power Electronics Specialists Conference, 1999. PESC 99. 30th Annual IEEE Issue Date : Aug 1999 Volume : 1 On page(s): 54 - 59 vol.1 Meeting Date : 27 Jun 1999-01 Jul 1999 ISSN : 0275-9306 Print ISBN: 0-7803-5421-4 Cited by : 11 INSPEC Accession Number: 6460651 Digital Object Identifier : 10.1109/PESC.1999.788980 Date of Current Version : 06 August 2002

ABSTRACT The paper describes switched-capacitor DC-DC power converters (charge pumps) suitable for on-chip, low-power applications. The proposed configurations are based on connecting two identical but opposite-phase SC converters in parallel, thus eliminating the need for separate bootstrap gate drivers. The authors focus on emerging very low-power VLSI applications such as battery-powered or self-powered signal processors where high power conversion efficiency is important and where power levels are in the milliwatt range. Conduction and switching losses are considered to allow design optimization in terms of switching frequency and component sizes. Open-loop and closed-loop operation of an experimental, fully integrated, 10 MHz voltage doubler is described. The doubler has 2 V or 3 V input and generates 3.3 V or 5 V output at up to 5 mW load. The converter circuit fabricated in a standard 1.2 µm CMOS technology takes 0.7 mm^2 of the chip area.

Dynamic Context Compression for Low-Power Coarse-Grained Reconfigurable Architecture


This paper appears in: Very Large Scale Integration (VLSI) Systems, IEEE Transactions on Issue Date: Jan. 2010 Volume: 18 Issue:1 On page(s): 15 - 28 ISSN: 1063-8210 References Cited: 28 INSPEC Accession Number: 11037023 Digital Object Identifier: 10.1109/TVLSI.2008.2006846 Date of Publication: 24 March 2009 Date of Current Version: 22 December 2009 Sponsored by: IEEE Circuits and Systems Society IEEE Computer Society IEEE Solid-State Circuits Society
+Yoonjin Kim+Ma

ABSTRACT Most coarse-grained reconfigurable architectures (CGRAs) are composed of reconfigurable ALU arrays and a configuration cache (or context memory) to achieve high performance and flexibility. In particular, the configuration cache is the main component in a CGRA that provides the distinct feature of dynamic reconfiguration in every cycle. However, frequent memory-read operations for dynamic reconfiguration consume considerable power. Thus, reducing power in the configuration cache has become critical for CGRAs to be more competitive and reliable for use in embedded systems. In this paper, we propose a dynamically compressible context architecture for power saving in the configuration cache. This power-efficient design of the context architecture works without degrading the performance and flexibility of the CGRA. Experimental results show that the proposed approach saves up to 39.72% power in the configuration cache with negligible area overhead (2.16%).

Built-In Sensor for Signal Integrity Faults in Digital Interconnect Signals


This paper appears in: Very Large Scale Integration (VLSI) Systems, IEEE Transactions on Issue Date: Feb. 2010 Volume: 18 Issue:2 On page(s): 256 - 269 ISSN: 1063-8210 References Cited: 35 INSPEC Accession Number: 11071371 Digital Object Identifier: 10.1109/TVLSI.2008.2010398 Date of Publication: 14 April 2009 Date of Current Version: 19 January 2010 Sponsored by: IEEE Circuits and Systems Society IEEE Computer Society IEEE Solid-State Circuits Society

ABSTRACT Testing of signal integrity (SI) in current high-speed ICs requires automatic test equipment resources in the multigigahertz range, which are normally not available. Furthermore, for most internal nets of state-of-the-art ICs, external speed testing is not possible for the newest technologies. In this paper, on-chip testing for SI faults in digital interconnect signals, using built-in high-speed monitors, is proposed. A coherent sampling scheme is used to capture the signal information. Two monitors to test SI violations are proposed: one for undershoots at the high logic level and the other for overshoots at the low logic level. The monitors are capable of detecting small noise pulses and have been extended to test more than one signal sequentially. The cost of the proposed strategy is analyzed in terms of area, delay penalty, and test time. The effects of clock jitter and process variations are analyzed. Experimental results obtained from designed and fabricated circuits show the feasibility of the proposed testing strategy. There is good agreement between the theoretical analysis, simulation results, and the experimental measurements.

Reducing SRAM Power Using Fine-Grained Wordline Pulsewidth Control


This paper appears in: Very Large Scale Integration (VLSI) Systems, IEEE Transactions on Issue Date: March 2010 Volume: 18 Issue:3 On page(s): 356 - 364 ISSN: 1063-8210 References Cited: 35 INSPEC Accession Number: 11142348 Digital Object Identifier: 10.1109/TVLSI.2009.2012511 Date of Publication: 10 June 2009 Date of Current Version: 22 February 2010 Sponsored by: IEEE Circuits and Systems Society IEEE Computer Society IEEE Solid-State Circuits Society

ABSTRACT Embedded SRAM dominates modern SoCs, and there is a strong demand for SRAM with lower power consumption that still achieves high performance and high density. However, the large increase in process variations in advanced CMOS technologies is considered one of the biggest challenges for SRAM designers. In the presence of large process variations, SRAMs are expected to consume more power to ensure correct read operations and meet yield targets. In this paper, we propose a new architecture that significantly reduces the array switching power for SRAM. The proposed architecture combines built-in self-test and digitally controlled delay elements to reduce the wordline pulsewidth for memories while ensuring correct read operations, hence reducing the switching power. Monte Carlo simulations using a 1-Mb SRAM macro in an industrial 45-nm technology are used to verify the power saving for the proposed architecture. For a 48-Mb memory density, a 27% reduction in array switching power can be achieved for a read access yield target of 95%. In addition, the proposed system can provide larger power saving as process variations increase, which makes it an attractive solution for 45-nm-and-below technologies.

Statistical Leakage Estimation Based on Sequential Addition of Cell Leakage Currents


This paper appears in: Very Large Scale Integration (VLSI) Systems, IEEE Transactions on Issue Date: April 2010 Volume: 18 Issue:4 On page(s): 602 - 615 ISSN: 1063-8210 References Cited: 34 INSPEC Accession Number: 11204168 Digital Object Identifier: 10.1109/TVLSI.2009.2013956 Date of Publication: 16 October 2009 Date of Current Version: 22 March 2010 Sponsored by: IEEE Circuits and Systems Society IEEE Computer Society IEEE Solid-State Circuits Society

ABSTRACT This paper presents a novel method for full-chip statistical leakage estimation that considers the impact of process variation. For an accurate analysis, the proposed method considers the correlations among leakage currents in a chip and the state dependence of the leakage current of a cell. For an efficient addition of the cell leakage currents, we propose the virtual-cell approximation (VCA), which sums cell leakage currents sequentially by approximating their sum as the leakage current of a single virtual cell while preserving the correlations among leakage currents. By the use of the VCA, the proposed method efficiently calculates a full-chip leakage current. Experimental results using ISCAS benchmarks at various process variation levels show that the proposed method is accurate, with average leakage mean and standard deviation errors of 3.12% and 2.22%, respectively, when compared with the results of a Monte Carlo (MC) simulation-based leakage estimation. In terms of efficiency, the proposed method is also shown to be 5000 times faster than MC simulation-based leakage estimation and 9000 times faster than Wilkinson's method-based leakage estimation.
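
For context, the Wilkinson baseline named in the abstract can be sketched as a moment-matching approximation. This is not the paper's VCA: the sketch assumes each cell leakage is modeled as a lognormal variable exp(Y_i) with Y_i Gaussian (a modeling assumption made here), and the function and parameter names are hypothetical.

```python
import math


def wilkinson_lognormal_sum(mus, sigmas, corr):
    """Approximate a sum of correlated lognormals exp(Y_i) by one lognormal.

    mus, sigmas: mean and standard deviation of each Gaussian exponent Y_i.
    corr: correlation matrix of the Y_i (list of lists, corr[i][i] == 1).
    Returns (mu_s, sigma_s) of the Gaussian exponent of the matched lognormal,
    obtained by matching the first two moments of the sum.
    """
    n = len(mus)
    # First moment: sum of the individual lognormal means.
    m1 = sum(math.exp(mus[i] + 0.5 * sigmas[i] ** 2) for i in range(n))
    # Second moment: all pairwise terms E[exp(Y_i + Y_j)].
    m2 = 0.0
    for i in range(n):
        for j in range(n):
            cov = corr[i][j] * sigmas[i] * sigmas[j]
            m2 += math.exp(
                mus[i] + mus[j] + 0.5 * (sigmas[i] ** 2 + sigmas[j] ** 2) + cov
            )
    sigma_s_sq = max(0.0, math.log(m2 / (m1 * m1)))
    mu_s = math.log(m1) - 0.5 * sigma_s_sq
    return mu_s, math.sqrt(sigma_s_sq)


# Two identical, fully correlated cells: the sum is exactly 2 * exp(Y).
mu_s, sigma_s = wilkinson_lognormal_sum([0.0, 0.0], [0.5, 0.5], [[1.0, 1.0], [1.0, 1.0]])
print(mu_s, sigma_s)
```

The double loop over cell pairs makes the cost grow quadratically with the number of cells, which gives some intuition for why a full-chip sum computed this way can be slow.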
