
Lecture 24

Cache Memory Design Options

The Principle of Locality


1) Temporal Locality: when a memory item is referenced, it is likely to be referenced again in the near future.
2) Spatial Locality: when a memory location is accessed, nearby locations are likely to be accessed soon after.
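As an illustration (a minimal sketch, not from the slides; the function is hypothetical), the loop below exhibits both kinds of locality:

```c
/* Sum an array: a tiny example of both kinds of locality. */
double sum_array(const double *a, int n) {
    double sum = 0.0;    /* 'sum' is reused every iteration: temporal locality */
    for (int i = 0; i < n; i++)
        sum += a[i];     /* consecutive elements are adjacent: spatial locality */
    return sum;
}
```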

What's Up With These 1-Word Blocks?


If, every time an address is referenced, more than just that word is brought into the cache, we'll see a clear performance increase. Why? The keyword here is Spatial Locality.

The Direct-Mapped Cache (a cache entry is now a block):
cache entry = (block address) % (# of cache entries)
block address = (word address) DIV (block size)

The Direct-mapped Cache


The mapping:

cache index = (block address) % (size of cache in blocks)
block address = (word address) DIV (block size)

Memory address (32 bits) => word address (30 bits), split into Tag | Index | Block offset
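As a sketch of this decomposition in C (the function and parameter names are illustrative, not from the lecture; field widths are passed in as parameters):

```c
#include <stdint.h>

/* Split a 32-bit byte address into tag / index / block offset for a
 * direct-mapped cache. block_bits = log2(words per block),
 * index_bits = log2(# of cache entries). */
void split_address(uint32_t addr, int block_bits, int index_bits,
                   uint32_t *tag, uint32_t *index, uint32_t *offset) {
    uint32_t word_addr = addr >> 2;                   /* drop the 2-bit byte offset */
    *offset = word_addr & ((1u << block_bits) - 1);   /* word within the block */
    uint32_t block_addr = word_addr >> block_bits;    /* (word address) DIV (block size) */
    *index  = block_addr & ((1u << index_bits) - 1);  /* (block address) % (# entries) */
    *tag    = block_addr >> index_bits;               /* remaining bits identify the block */
}
```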

[Figure: a direct-mapped cache with 8 entries (indices 000-111) mapped onto a larger memory. Memory addresses 00001, 01001, 10001, 11001 all map to cache entry 001; addresses 00101, 01101, 10101, 11101 all map to entry 101. The cache entry is simply the low-order 3 bits of the address.]

Address Mapping in a Multiword Cache Block


Consider a cache with 64 blocks, each holding 4 words (16 bytes). Given the byte address 1200, which block does it map to?

block address = (byte address) DIV (block size in bytes)

With 16 bytes per block: 1200 / 16 = 75 (block address). Now, to what cache slot does this block address map? 75 % 64 = 11.
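The same arithmetic, checked in C (a minimal sketch; the constants come straight from the example):

```c
#include <stdio.h>

int main(void) {
    unsigned byte_addr  = 1200;
    unsigned block_size = 16;    /* 4 words x 4 bytes */
    unsigned num_blocks = 64;
    unsigned block_addr = byte_addr / block_size;   /* 1200 / 16 = 75 */
    unsigned slot       = block_addr % num_blocks;  /* 75 % 64 = 11  */
    printf("block address = %u, cache slot = %u\n", block_addr, slot);
    return 0;
}
```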

A 64-KB Cache with Four-word Blocks


[Figure: a 64-KB cache with four-word blocks. The 32-bit address (bits 31-0) splits into a 16-bit tag (bits 31-16), a 12-bit index (bits 15-4), a 2-bit block offset (bits 3-2), and a 2-bit byte offset (bits 1-0). The index selects one of 4K entries, each holding a valid bit, a 16-bit tag, and 128 bits (four 32-bit words) of data. The block offset drives a 4-to-1 multiplexor (Mux) that picks one 32-bit word, and the tag comparison produces the Hit signal.]
What Do We Gain From Larger Block Sizes?


[Figure: miss rate (0%-40%) vs. block size (4, 16, 64, 256 bytes), one curve per cache size (1 KB, 8 KB, 16 KB, 64 KB, 256 KB).]

Do we know the Miss Rate? Notice that it depends on the cache size as well as the block size.
The drawback of larger blocks: Miss Penalty gets larger too. Miss Penalty is determined by memory latency and transfer rate.
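To see the effect, here is a sketch in C with assumed timing parameters (the latency and transfer-rate numbers are illustrative, not values from the lecture):

```c
#include <stdio.h>

/* Sketch: miss penalty = memory latency + (words per block) x (cycles per word). */
int main(void) {
    double latency_cycles  = 30.0;  /* assumed: cycles before the first word arrives */
    double cycles_per_word = 4.0;   /* assumed: transfer rate of one word per 4 cycles */
    for (int words = 1; words <= 16; words *= 2) {
        double miss_penalty = latency_cycles + words * cycles_per_word;
        printf("block = %2d words -> miss penalty = %.0f cycles\n", words, miss_penalty);
    }
    return 0;
}
```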

Reducing Cache Misses Through Different Design


[Figure: placing memory block 12 in three organizations of an 8-block cache.]

Direct mapped (Block # 0-7): block 12 can go only in block 12 % 8 = 4; one tag to check.
(2-way) Set associative (Set # 0-3): block 12 can go in either way of set 12 % 4 = 0 (*).
Fully associative: just put it anywhere (*).

Set associative mapping: (block number) % (number of sets in the cache)

(*) the searches for the tag are done in parallel!
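The three placements of block 12 can be checked with a few lines of C (a sketch, assuming the 8-block cache of the figure):

```c
#include <stdio.h>

int main(void) {
    int block = 12, cache_blocks = 8;
    /* Direct mapped: exactly one possible location. */
    printf("direct mapped: block %d\n", block % cache_blocks);          /* 12 %% 8 = 4 */
    /* 2-way set associative: 8 blocks / 2 ways = 4 sets. */
    printf("2-way: set %d (either way)\n", block % (cache_blocks / 2)); /* 12 %% 4 = 0 */
    /* Fully associative: any block; all tags are searched in parallel. */
    printf("fully associative: any block\n");
    return 0;
}
```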

[Figure: spectrum of design options for cache memory]

Direct mapped -> 2-way Set Associative -> 4-way Set Associative -> 8-way Set Associative -> Fully Associative

Moving toward higher associativity, the miss rate decreases but the hit time increases.

Write Access to a Memory System with Cache


Two policies:

Write Through: every time something is written, it is written directly to main memory as well as to the cache.
Advantage: ease of control, no extra bits in the cache. Disadvantage: access time for writes is much longer than for reads.

Write Back: if the write goes to a block that is in the cache, we write only to the cache. If the write goes to a block that is not in the cache, we first generate a cache miss, bring a copy of the block into the cache, and then write on that copy. A modified (dirty) block is copied back to main memory only when it is evicted.
Advantage: write access time is a lot shorter. Disadvantage: extra bits to mark dirty entries, more complex control.
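A minimal sketch of the two policies in C (the `line_t` structure and function names are illustrative assumptions, not the lecture's design):

```c
#include <stdint.h>
#include <stdbool.h>

typedef struct {            /* hypothetical cache line, for illustration */
    bool     valid, dirty;  /* the dirty bit is only needed for write back */
    uint32_t tag;
    uint32_t data;
} line_t;

/* Write through: update the cache line and main memory on every store. */
void write_through(line_t *line, uint32_t value, uint32_t *mem_word) {
    line->data = value;
    *mem_word  = value;     /* memory is always up to date: simple control */
}

/* Write back: update only the cache line and mark it dirty; the block is
 * copied back to memory later, when it is evicted. */
void write_back(line_t *line, uint32_t value) {
    line->data  = value;
    line->dirty = true;     /* this line no longer matches main memory */
}
```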


The Set Associative Cache


[Figure: an 8-block cache at four associativities. One-way set associative (direct mapped): blocks 0-7, one tag/data pair per block. Two-way set associative: sets 0-3, two tag/data pairs per set. Four-way set associative: sets 0-1, four tag/data pairs per set. Eight-way set associative (fully associative): a single set of eight tag/data pairs.]

Cache Size = 8 blocks


Set Associative Performance with Varying Associativity


[Figure: miss rate (0%-15%) vs. associativity (one-way, two-way, four-way, eight-way), one curve per cache size (1 KB, 2 KB, 4 KB, 8 KB, 16 KB, 32 KB, 64 KB, 128 KB).]

Notice how relative improvements with higher associativity diminish fast.


An Example with a Small Cache


Consider a small cache with 4 one-word blocks, and the sequence of block accesses: 0, 8, 0, 6, 8.
Mapping: 0 % 4 = 0, 6 % 4 = 2, 8 % 4 = 0.

Direct-mapped cache
Block address   Hit/Miss   Block 0   Block 1   Block 2   Block 3
0               Miss       Mem[0]
8               Miss       Mem[8]
0               Miss       Mem[0]
6               Miss       Mem[0]               Mem[6]
8               Miss       Mem[8]               Mem[6]

Number of misses = 5


An Example with a Small Cache


Consider the same cache with 4 one-word blocks, now organized as 2 sets of 2 blocks, and the same sequence of block accesses: 0, 8, 0, 6, 8.
Mapping: 0 % 2 = 0, 6 % 2 = 0, 8 % 2 = 0 (all three blocks map to set 0).

Replacement is LRU.

2-way set-associative cache (2 elements per set):

Block address   Hit/Miss   Set 0, way 0   Set 0, way 1   Set 1, way 0   Set 1, way 1
0               Miss       Mem[0]
8               Miss       Mem[0]         Mem[8]
0               Hit        Mem[0]         Mem[8]
6               Miss       Mem[0]         Mem[6]
8               Miss       Mem[8]         Mem[6]

Number of misses = 4


An Example with a Small Cache


Consider a small cache with 4 one-word blocks, and the sequence of block accesses: 0, 8, 0, 6, 8.
Mapping: any block can be placed in any cache slot.

Replacement is LRU.

Fully Associative Cache


Block address   Hit/Miss   Block 0   Block 1   Block 2   Block 3
0               Miss       Mem[0]
8               Miss       Mem[0]    Mem[8]
0               Hit        Mem[0]    Mem[8]
6               Miss       Mem[0]    Mem[8]    Mem[6]
8               Hit        Mem[0]    Mem[8]    Mem[6]

Number of misses = 3
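All three examples can be reproduced with one small LRU cache simulator (a sketch, not code from the lecture): the same 4-block cache is run as 4 sets x 1 way (direct mapped), 2 sets x 2 ways, and 1 set x 4 ways (fully associative).

```c
#include <stdio.h>

#define BLOCKS 4   /* total cache capacity: 4 one-word blocks */

/* Simulate a BLOCKS-block cache with the given associativity under LRU
 * replacement; return the number of misses for the access trace. */
int simulate(int ways, const int *trace, int n) {
    int sets = BLOCKS / ways;
    int tag[BLOCKS], age[BLOCKS];  /* tag = block number stored, -1 = empty */
    for (int i = 0; i < BLOCKS; i++) { tag[i] = -1; age[i] = 0; }

    int misses = 0;
    for (int t = 0; t < n; t++) {
        int set  = trace[t] % sets;   /* (block number) % (number of sets) */
        int base = set * ways;
        int hit  = -1;
        for (int w = 0; w < ways; w++) {
            age[base + w]++;          /* every line in the set gets older */
            if (tag[base + w] == trace[t]) hit = base + w;
        }
        if (hit >= 0) { age[hit] = 0; continue; }  /* hit: mark most recently used */

        misses++;                     /* miss: fill an empty way, else evict the LRU way */
        int victim = base;
        for (int w = 0; w < ways; w++) {
            int i = base + w;
            if (tag[i] == -1) { victim = i; break; }
            if (age[i] > age[victim]) victim = i;
        }
        tag[victim] = trace[t];
        age[victim] = 0;
    }
    return misses;
}

int main(void) {
    int trace[] = {0, 8, 0, 6, 8};
    printf("direct mapped (4x1): %d misses\n", simulate(1, trace, 5)); /* 5 */
    printf("2-way LRU     (2x2): %d misses\n", simulate(2, trace, 5)); /* 4 */
    printf("fully assoc.  (1x4): %d misses\n", simulate(4, trace, 5)); /* 3 */
    return 0;
}
```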


Question: Why does higher associativity result in fewer misses?


Implementing a 4-way Set Associative Cache


[Figure: implementing a 4-way set associative cache. The 32-bit address (bits 31-0) splits into a 22-bit tag (bits 31-10), an 8-bit index (bits 9-2), and a 2-bit byte offset (bits 1-0). The index selects one of 256 sets (0-255); the four 22-bit tags in the set are compared in parallel, and a 4-to-1 multiplexor selects the 32-bit data word from the matching way, producing the Hit and Data outputs.]

Set size = 4 (elements per set)
Number of sets = 256
Block size = 1 word
Cache size = 2^10 words
Index: 8 bits, Byte offset: 2 bits
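A C sketch of the lookup path (structure and function names are assumptions; in hardware the four tag comparators and the 4-to-1 multiplexor operate simultaneously, which the loop below only models sequentially):

```c
#include <stdint.h>
#include <stdbool.h>

#define WAYS 4
#define SETS 256

typedef struct { bool valid; uint32_t tag; uint32_t data; } way_t;
static way_t cache[SETS][WAYS];   /* hypothetical storage, for illustration */

/* Decode the address as in the figure: 22-bit tag, 8-bit index, 2-bit
 * byte offset. The matching way plays the role of the multiplexor's
 * select input. */
bool lookup(uint32_t addr, uint32_t *data_out) {
    uint32_t index = (addr >> 2) & 0xFF;   /* bits 9-2   */
    uint32_t tag   = addr >> 10;           /* bits 31-10 */
    for (int w = 0; w < WAYS; w++) {
        if (cache[index][w].valid && cache[index][w].tag == tag) {
            *data_out = cache[index][w].data;  /* mux selects the hit way */
            return true;                       /* Hit */
        }
    }
    return false;                              /* miss */
}
```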


Size of Tags vs. Set Associativity


A cache with 4K (2^12) blocks and a 32-bit address:

Direct-mapped (4K sets):         Tag = 32 - 12 - 2 = 18 bits | Index = 12 bits | Block offset = 2 bits
2-way set associative (2K sets): Tag = 32 - 11 - 2 = 19 bits | Index = 11 bits | Block offset = 2 bits
4-way set associative (1K sets): Tag = 32 - 10 - 2 = 20 bits | Index = 10 bits | Block offset = 2 bits

Question: what happens to the number of bits we need to store tags as we increase set associativity?
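A small C sketch that answers the question numerically (4K blocks, 32-bit address, 2-bit block offset, as above): each doubling of associativity halves the number of sets, removing one index bit and adding one tag bit, so the total tag storage grows.

```c
#include <stdio.h>

int main(void) {
    int blocks = 4096, addr_bits = 32, offset_bits = 2;
    for (int ways = 1; ways <= 4; ways *= 2) {
        int sets = blocks / ways;
        int index_bits = 0;
        while ((1 << index_bits) < sets) index_bits++;      /* log2(sets) */
        int tag_bits = addr_bits - index_bits - offset_bits;
        printf("%d-way: %2d index bits, %2d tag bits, %d total tag bits\n",
               ways, index_bits, tag_bits, tag_bits * blocks);
    }
    return 0;
}
```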


Replacement Policies

LRU: the most commonly used policy. Question: what do you need to keep in order to figure out which item in a set was the least recently used?
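One common answer, sketched in C under the assumption of a 2-way set (not the lecture's design): a single bit per set recording the most recently used way is enough, because the other way is then the LRU victim. Higher associativity needs per-way age counters or an encoding of the full use ordering.

```c
/* For a 2-way set, one bit per set suffices for exact LRU. */
typedef struct {
    unsigned mru_way : 1;   /* 0 or 1: the way touched most recently */
} lru2_t;

/* On a hit or a fill in way w, remember it ... */
static inline void lru2_touch(lru2_t *s, unsigned w) { s->mru_way = w; }

/* ... and on a miss, evict the other way. */
static inline unsigned lru2_victim(const lru2_t *s) { return s->mru_way ^ 1u; }
```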

