

EECS598 Non-Volatile Storage

Jerry Kao
jckao@umich.edu

Electrical Engineering & Computer Science Department The University of Michigan, Ann Arbor

University of Michigan

A SURVEY OF CIRCUIT INNOVATIONS IN FERROELECTRIC RANDOM-ACCESS MEMORIES


Ali Sheikholeslami and P. Glenn Gulak


FRAM Structure

Research is done in the following three areas: material processing, modeling, and circuit design.

Motives for FRAM: short programming time and low power consumption. Easy integration into an SoC.

FRAM Comparison

FRAM is superior in terms of write-access time and overall power consumption.

Target applications: contactless smart cards and digital cameras. There is also hope of entering the mobile device market.
This paper focuses on six innovative circuit techniques.

Ferromagnetic Cores Background


Main technology prior to the 1950s. Currents on the x-access and y-access wires magnetize a core in a 0 or 1 direction.

A read access consists of a write access followed by sensing. Writing data that differs from the stored value induces a large current. The data stored in the sense amp is written back to the cell after the write access.


Ferroelectric Capacitors Background


The name was adopted to convey the similarity of the hysteresis loop to that of ferromagnetic materials. Key concept: spontaneous polarization, a displacement that is inherent to the crystal structure and does not disappear in the absence of an electric field.

A popular material is lead zirconate titanate (PZT), a perovskite. At 0 V, the cell has two possible states.


Techniques to Reduce Voltage Disturbance


A novel material process makes the hysteresis loop more square-like.
An access transistor is added to each cell (1T-1C).
Access transistor OFF: the FE cap is disconnected from the bit line (BL).
Access transistor ON: the FE cap is connected to BL and can be read or written from the plate line (PL).

A boosted VDD is applied to WL.


Step-Sensing Approach Timing Diagram



Step PL before sensing.
Precharge BL to 0 V.
Turn on WL, resulting in a capacitor divider consisting of CFE and CBL between PL and ground.
Raise PL to VDD.
Sense the voltage on BL, Vx.
The sense amp restores the original data in the cell.
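The capacitor divider in the step-sensing sequence can be put into numbers with a short sketch. It treats CFE as a linear capacitor with one effective value per stored state (C0 for a 0, C1 for a 1), which is a simplifying assumption. The capacitance values, the supply voltage, and the helper name `bitline_voltage` are hypothetical and chosen only for illustration.

```python
def bitline_voltage(c_fe: float, c_bl: float, vdd: float) -> float:
    """Voltage Vx on BL after PL is raised to VDD (ideal capacitor divider)."""
    return vdd * c_fe / (c_fe + c_bl)


VDD = 3.3        # supply voltage in volts (assumed)
C_BL = 500e-15   # bitline capacitance in farads (assumed)
C0 = 100e-15     # effective CFE when the cell stores 0 (assumed)
C1 = 300e-15     # effective CFE when the cell stores 1 (assumed)

v0 = bitline_voltage(C0, C_BL, VDD)  # BL voltage for a stored 0
v1 = bitline_voltage(C1, C_BL, VDD)  # BL voltage for a stored 1

print(f"V0 = {v0:.4f} V, V1 = {v1:.4f} V, window = {v1 - v0:.4f} V")
```

Under these assumptions a larger CBL shrinks both V0 and V1, and with them the window the sense amp has to resolve.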


Pulse-Sensing Approach
Pulse PL before the sense amp is activated. This results in a smaller common-mode voltage.

The step-sensing approach is preferred due to its higher common-mode voltage.


Reference Voltage Generation


A reference voltage between V0 and V1 is needed to do the comparison. V0 and V1 are not exact and are process and time dependent.
Two types of ferroelectric imperfections:
Relaxation: a partial loss of remanent charge on a microsecond time scale if the cap is not accessed for a period of time. Affects V1 or V0.
Imprint: the tendency of a cell to prefer one state over the other if it stays in that state for a long period of time. Causes a shift in V1, V0, and VREF.

A variable reference is needed to track the process variation.


One Oversized Reference Capacitor per Column


Two additional cells in each column (1C/BL).
CREF is sized larger than CFE so that VREF is midway between V0 and V1.
WL0 and RWL0, or WL1 and RWL1, are turned on at the same time, and the sense amp amplifies the difference between BL and /BL.
Reset transistors are added to reduce voltage build-up in CREF.
VREF tuning is achieved using an adjustable CREF, an adjustable RPL, or an adjustable voltage reference generator.

Two Half-Sized Reference Cap per Column


Also called (2×0.5C/BL).
Intended to generate VREF = (V0+V1)/2.
CREF1 and CREF0 are half the size of CFE. In this case, VREF is going to be slightly larger than (V0+V1)/2.
CREF1 and CREF0 fatigue faster than CFE.
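One way to see why VREF can land slightly above (V0+V1)/2 is a linear capacitor-divider sketch. It assumes that each state of CFE behaves as a fixed effective capacitance (C0 or C1), that the two half-sized reference capacitors contribute C1/2 and C0/2 in parallel on the reference bitline, and that both bitlines have the same CBL. All values and names are hypothetical, and a real ferroelectric capacitor is nonlinear.

```python
def divider(c: float, c_bl: float, vdd: float) -> float:
    """Bitline voltage of an ideal capacitor divider between PL and ground."""
    return vdd * c / (c + c_bl)


VDD = 3.3        # volts (assumed)
C_BL = 500e-15   # farads (assumed)
C0 = 100e-15     # effective CFE for a stored 0 (assumed)
C1 = 300e-15     # effective CFE for a stored 1 (assumed)

v0 = divider(C0, C_BL, VDD)
v1 = divider(C1, C_BL, VDD)
midpoint = (v0 + v1) / 2

# Two half-sized reference caps, one in each state, act in parallel.
c_ref = C1 / 2 + C0 / 2
v_ref = divider(c_ref, C_BL, VDD)

print(f"midpoint = {midpoint:.4f} V, VREF = {v_ref:.4f} V")
```

In this model the divider voltage is a concave function of the capacitance, so the voltage produced by the average capacitance exceeds the average of the two voltages.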


Two Full-Sized Reference Cap per Two Columns


Also called (2C/2BL).
CREF1 = CREF0 = CFE.
BL1 has V1 and BL2 has V0 before EQ turns ON.
After EQ turns ON, VBL1 = VBL2 = (V0+V1)/2.
At the end, a 0 and a 1 must be restored in CREF0 and CREF1 by pulsing RPL through the transistor driven by RP.
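The equalization step can be checked with a charge-conservation sketch. It assumes the two bitlines behave as ideal capacitors while EQ is on, which is a simplification. The function name and all numeric values are hypothetical.

```python
def equalize(v_a: float, c_a: float, v_b: float, c_b: float) -> float:
    """Common voltage after shorting two charged capacitors (charge is conserved)."""
    return (c_a * v_a + c_b * v_b) / (c_a + c_b)


V1 = 1.2375      # voltage on BL1 before EQ turns ON (assumed)
V0 = 0.55        # voltage on BL2 before EQ turns ON (assumed)
C_BL = 500e-15   # capacitance of each bitline in farads (assumed)

# Matched bitlines: the result is exactly the midpoint of V0 and V1.
v_ref = equalize(V1, C_BL, V0, C_BL)

# A 10% heavier BL1 pulls the reference toward V1.
v_ref_mismatch = equalize(V1, 1.1 * C_BL, V0, C_BL)

print(f"matched: {v_ref:.4f} V, mismatched: {v_ref_mismatch:.4f} V")
```

The mismatched case suggests why bitline capacitance balance matters for reference schemes that rely on shorting two lines.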


Adding Reference Cells to Rows


Also called (2C/WL).
Fatigues the reference voltage circuit less.
The reference is generated by shorting RBL and /RBL.
Cext must be added to balance the capacitance due to RBL.


A Self-Reference Fully Differential Arch.



Also called (2T-2C).
Two CFEs store opposite values.
Twice the voltage difference between BL and /BL.
Only used in lower-density memory.


Summary


2T-2C is the most robust, but has a density issue.
Among the 1T-1C schemes, 2C/2BL and 2C/WL are superior in sensing complexity and fatigue immunity, respectively.


Ferroelectric Memory Architecture


A folded bitline architecture is adopted to reduce the bitline mismatch.
A constant-PL architecture is desired since PL is slow to move. Two disadvantages:
A refresh is required.
The voltage range across CFE is smaller.


Wordline-Parallel Plateline
Also called (WL//PL).
PL is parallel to WL.
A row of cells is accessed at the same time.
If PL is shared between two rows, the un-accessed row can be disturbed.
When disturbed, a 0 is reinforced, and a 1 might be flipped.


Bitline-Parallel Plateline
Also called (BL//PL).
Only a single cell can be selected.
Absorbs the y-decoder and reduces the power significantly.
PL activation can disturb all the cells in the column.


Segmented Plateline
Also called (Segmented PL).
Breaks the PL into local segments.
Faster PL than WL//PL.
No disturbance to non-selected cells, compared to BL//PL.


Merged Wordline/Plateline (ML) Architecture


Since WL and PL are parallel, people thought of ways to merge them.
Works as either two 1T-1C cells or one 2T-2C cell.
Example: write 0 into C1 and 1 into C2.
Four-phase operation:
BLn = 0 V and BLn+1 = VDD.
ML1 and ML2 are set to VDD, forcing 0 into C1.
ML1 is pulled down to ground, leaving 0 in C1 and forcing 1 into C2.
ML1 is pulled to VDD and ML2 is pulled to ground, forcing 1 into C1 if BLn were at VDD.

Faster read access time.
Same read/write time.
Higher density.

[Figure panels: write access, read access]

Nondriven Plateline Architecture


Also called Nondriven Plateline (NDP).
A constant voltage on PL reduces read/write access time.
PL = VDD/2.
Read operation:
BL1 = BL2 = 0 V.
Activate WL.
VDD/2 is used to switch the cap storing 1. Good for SrBi2Ta2O9.
The sense amp restores the value by holding BL1=BL2.

The write operation is similar to the read operation except that BL is held at VDD or 0 V.


Bitline-Driven Architecture
PL = 0 V.
Full VDD is available when reading, and there is no refresh as with PL at VDD/2.
The shaded circuit precharges BL and /BL to VDD or 0 V before activating the WL.
PL is only pulsed after sensing. This reduces the read access time, but not the read cycle time.
Performance can be improved if combined with a segmented PL.


Dual-Mode Ferroelectric Memories


Limits the switching of CFE to the power-down and power-up modes to reduce the fatigue problem.

During power shutdown:
STO is turned on.
PL is pulsed, writing data to CFE.
STO is pulled to ground, ready for power off.
During power-on sequence:


Transpolarizer-Based Architectures
Two CFEs connected in opposite directions.
Simpler reference voltage since (V1+V0)/2 is always equal to VDD/2.
Although it is a 1T-2C structure, the C is smaller than in 1T-1C to get a small signal level on BL.
Read operation with t4 and t5 doing the write-back.


Cross-Point Array of Ferroelectric Gain Cells



Memory architecture without PL and without destructive read.
Consists of an array of gain cells.
Two caps form a capacitor divider, and the transistor amplifies the result.
In standby, WL = BL = VDD.
In read, precharge BL to VDD and lower WL slightly.
A BL with a cell storing 0 has a larger current than a BL with a cell storing 1.


Chain FRAM (NAND Architecture)


Similar to NAND flash.
Organized in units of cell blocks. A cell block is terminated by a BL and a PL, one on each end.
In standby, all WL = VDD.
In active operation, WLx is set to 0 V and Block-Select (BS) is raised. The other WLs remain high, allowing the BL voltage and PL voltage to reach the selected cell.
Increasing the number of cells in a cell block increases density but also increases readout delay.
1024 cells per bit line and 16 cells per cell block reduces area by 63%.


Architecture Summary


Future Trends
Progress in density, access time, and SoC integration can be assumed.

62kb and 256kb have been achieved, with 1Mb expected.
Access time hasn't improved, but can be through circuit innovation.
It is easier to integrate FRAM into an SoC compared to EEPROM.


ULTRALOW POWER DATA STORAGE FOR SENSOR NETWORKS


Gaurav Mathur, Peter Desnoyers, Deepak Ganesan, Prashant Shenoy


Motivation
What is the most energy-efficient storage platform for sensor networks, and what is the implication for sensor network design?

Results: Parallel NAND flash is 100X more energy-efficient than other flash memories and the radio on MicaZ.


Background


NOR flash is less dense than NAND and uses more energy for erase and programming, but provides a random read access time of less than 100 ns.
NAND flash has significantly higher starting latency, but can stream subsequently read bytes at high speed since it is always page-oriented.
Writes are one-way. An erase is needed before the next write.
A microcontroller is used to translate disk-like operations to the NAND interface, which also increases power consumption. It takes care of erasure, page remapping, ECC, and wear leveling.


Flash Energy Consumption


Measured on a Mica mote with a 10 Ω resistor and a 3.3 V supply.
Toshiba NAND is 21X more efficient than Telos NOR.
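A measurement of this kind is commonly turned into energy as sketched below: the voltage drop across the series resistor gives the current, and current times device voltage times duration gives energy. The sketch assumes the resistor sits in series between the supply and the flash device and that samples are taken at a fixed interval. The sample values, the interval, and the function name are made up for illustration and are not data from the paper.

```python
def energy_joules(v_drop_samples, r_sense, v_supply, dt):
    """Sum device energy over samples of the sense-resistor voltage drop."""
    total = 0.0
    for v_drop in v_drop_samples:
        current = v_drop / r_sense        # Ohm's law on the sense resistor
        v_device = v_supply - v_drop      # voltage left across the device
        total += current * v_device * dt  # energy in this sampling interval
    return total


R_SENSE = 10.0    # ohms, series resistor
V_SUPPLY = 3.3    # volts
DT = 1e-6         # seconds between samples (assumed)
samples = [0.10, 0.12, 0.12, 0.10]  # volts across the resistor (made up)

e = energy_joules(samples, R_SENSE, V_SUPPLY, DT)
print(f"energy = {e * 1e9:.2f} nJ")
```

Dividing such an energy figure by the number of bytes transferred gives an energy-per-byte number that can be compared across devices.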


Effect of Data Size on Energy Consumption


A read operation has a smaller energy overhead than a write operation.
Having a write buffer amortizes the fixed cost over a larger number of data bytes.
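The amortization argument can be illustrated with a simple cost model: each operation pays a fixed overhead plus a per-byte cost, so the energy per byte falls as more bytes share one operation. The cost values and the function name below are hypothetical and only show the shape of the trade-off.

```python
def energy_per_byte(n_bytes: int, fixed_cost: float, per_byte_cost: float) -> float:
    """Average energy per byte when n_bytes share one fixed overhead."""
    return (fixed_cost + n_bytes * per_byte_cost) / n_bytes


FIXED = 20.0      # fixed cost per write operation, in microjoules (assumed)
PER_BYTE = 0.05   # incremental cost per byte, in microjoules (assumed)

for n in (1, 64, 512):
    print(f"{n:4d} bytes: {energy_per_byte(n, FIXED, PER_BYTE):.4f} uJ/byte")
```

In this model a buffer that collects 512 bytes before writing brings the per-byte cost close to the incremental cost, while byte-at-a-time writes are dominated by the fixed overhead.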


Idle Current
NOR and NAND devices draw between 2 µA and 5 µA, which is smaller than the mote CPU's 5 µA to 15 µA or the 10 µA self-discharge current of an AA battery.

NOR and NAND devices have an idle current that is 17X smaller than MMC.


Summary
Parallel NAND flash is the most energy-efficient storage for sensor networks.
A desired device would have the performance of a parallel NAND and the pin count of a serial NAND flash.
ECC is better handled by the microcontroller during idle cycles.


Implication on Sensor Systems


Compare the energy consumption of flash to that of CPU and radio.
Writing a byte to flash is 11X more expensive than computation.
Radio transmission of a byte costs 200X a write access and 500X a read access.
Suggests that storage energy should be part of the trade-off.
Applications that benefit:
In-network query processing
Use of history
Network-level compression
Custody transfer
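The ratios on this slide can be combined into one relative scale. The sketch below normalizes the cost of computation to 1 and derives the other costs from the stated factors. Treating all of the figures as per-byte costs on a common basis is an assumption made here for illustration, and the variable names are hypothetical.

```python
COMPUTE = 1.0                    # reference cost of computation (normalized)

flash_write = 11 * COMPUTE       # writing a byte: 11X computation
radio_tx = 200 * flash_write     # radio transmission: 200X a write access
flash_read = radio_tx / 500      # radio transmission is 500X a read access

for name, cost in [("compute", COMPUTE), ("flash read", flash_read),
                   ("flash write", flash_write), ("radio tx", radio_tx)]:
    print(f"{name:12s} {cost:8.1f}")
```

On this scale, storing and later reading a byte locally costs far less than transmitting it once, which is the basis for including storage in the energy trade-off.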


Re-thinking Sensor Net Design


Sensor network services involve three operations: computation, storage, and communication.
These operations are characterized by two parameters: frequency and magnitude.
Modeled using a sensor service emulator.


Impact on Communication Service


NAND flash provides significant energy gain for batch sizes greater than 128 bytes.
At a 1% duty cycle, it achieves 3.8 times less energy/byte with a batch size of 512 bytes and a 58 times improvement for a batch size of 65 kbytes.
The 7.5% duty cycle has a smaller preamble, resulting in less fixed energy cost per packet.


Impact on Data Aggregation


Effect of compression on energy consumption.
Three types of compression: lossless encoding, lossy encoding, and feature extraction.
Uses a benchmark wavelet compression scheme optimized for operation without floating point, with a computation complexity of 60N.
Concludes that data aggregation yields a 10X saving in energy consumption.


Conclusion
Parallel NAND flash is 100-fold more energy efficient than serial NOR flash.
This observation has implications for sensor network design.
Data shows that communication and data aggregation achieve at least an order of magnitude energy reduction.


THE MISSING MEMRISTOR FOUND

Dmitri B. Strukov, Gregory S. Snider, Duncan R. Stewart & R. Stanley Williams


The four fundamental two terminal circuit elements



Operation
