
Cache Memory Replacement Policy, Virtual Memory

Prof. Sin-Min Lee, Department of Computer Science

Associative Cache
Access order
A0 B0 C2 A0 D1 B0 E4 F5 A0 C2 D1 B0 G3 C2 H7 I6 A0 B0
Tc = 10 ns, Tp = 60 ns, FIFO replacement

h = 0.389
TM = 40.56 ns
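These figures follow the effective access time formula the slides appear to use (assuming a hit costs Tc and a miss costs Tp): TM = h x Tc + (1 - h) x Tp = 0.389 x 10 ns + 0.611 x 60 ns ≈ 40.56 ns.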

FIFO
FIFO is similar to LRU except that, because FIFO does not move the most recently used block up in the queue, it can end up discarding a block that was used recently instead of the least recently used one. In this example, when the memory is full, A is discarded even though it was recently used.
[Figure: step-by-step contents of the 8-line cache for the access sequence A B C A D B E F A C D B G C H I A B under FIFO replacement.]

Hit ratio: 7/18

A is removed from memory, although it was not the least recently used.
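The walkthrough above can be checked with a short simulation. A minimal Python sketch (the function name and trace encoding are mine, not from the slides):

```python
from collections import deque

def fifo_hit_ratio(accesses, capacity):
    """Fully associative cache with FIFO replacement."""
    queue = deque()           # leftmost entry = oldest resident block
    hits = 0
    for block in accesses:
        if block in queue:
            hits += 1         # a hit does NOT reorder the FIFO queue
        else:
            if len(queue) == capacity:
                queue.popleft()     # evict the block loaded earliest
            queue.append(block)
    return hits / len(accesses)

trace = list("ABCADBEFACDBGCHIAB")
print(fifo_hit_ratio(trace, 8))     # 7/18 ≈ 0.389, matching the slide
```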

LRU
LRU (Least Recently Used) is a method that keeps track of how recently each data block was used, so the block that has gone unused the longest can be discarded first.
[Figure: step-by-step contents of the 8-line cache for the same access sequence A B C A D B E F A C D B G C H I A B under LRU replacement.]

Hit ratio: 9/18

The bottom block is the least recently used; it is the one replaced (discarded) in favor of the new block when the memory is full.

E is LRU and is removed from memory

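The only change needed to turn the FIFO sketch into LRU is that a hit must refresh the block's position. Again a hypothetical sketch, not code from the slides:

```python
def lru_hit_ratio(accesses, capacity):
    """Fully associative cache with LRU replacement."""
    stack = []                # index 0 = least recently used block
    hits = 0
    for block in accesses:
        if block in stack:
            hits += 1
            stack.remove(block)   # refresh: re-append as most recent
        elif len(stack) == capacity:
            stack.pop(0)          # evict the least recently used block
        stack.append(block)
    return hits / len(accesses)

trace = list("ABCADBEFACDBGCHIAB")
print(lru_hit_ratio(trace, 8))      # 9/18 = 0.5, matching the slide
```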

Direct-Mapped Cache
Access order
A0 B0 C2 A0 D1 B0 E4 F5 A0 C2 D1 B0 G3 C2 H7 I6 A0 B0
Tc = 10 ns, Tp = 60 ns
h = 0.167

TM = 51.67 ns

2-Way Set Associative Cache


Access order
A0 B0 C2 A0 D1 B0 E4 F5 A0 C2 D1 B0 G3 C2 H7 I6 A0 B0
Tc = 10 ns, Tp = 60 ns, LRU replacement

h = 0.389
TM = 40.56 ns
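All three organizations can be driven by one simulator if the set count and associativity are parameters. A sketch under the slides' conventions (the digit attached to each block is its cache index; names are mine):

```python
def set_assoc_hit_ratio(accesses, num_sets, ways):
    """Set-associative cache, LRU replacement within each set.

    ways=1 gives a direct-mapped cache; num_sets=1 gives a
    fully associative cache.
    """
    sets = [[] for _ in range(num_sets)]   # each list: LRU ... MRU
    hits = 0
    for block, index in accesses:
        lru = sets[index % num_sets]
        if block in lru:
            hits += 1
            lru.remove(block)     # refresh on hit
        elif len(lru) == ways:
            lru.pop(0)            # evict this set's LRU way
        lru.append(block)
    return hits / len(accesses)

trace = [('A',0),('B',0),('C',2),('A',0),('D',1),('B',0),
         ('E',4),('F',5),('A',0),('C',2),('D',1),('B',0),
         ('G',3),('C',2),('H',7),('I',6),('A',0),('B',0)]
print(set_assoc_hit_ratio(trace, 8, 1))   # direct-mapped: 3/18 ≈ 0.167
print(set_assoc_hit_ratio(trace, 4, 2))   # 2-way LRU:     7/18 ≈ 0.389
```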

Associative Cache (FIFO Replacement Policy)


A0 B0 C2 A0 D1 B0 E4 F5 A0 C2 D1 B0 G3 C2 H7 I6 A0 B0

[Table: cache contents after each access. The first eight distinct blocks A-H fill the eight lines; the closing accesses I6, A0, B0 all miss and evict the oldest residents A, B, C in arrival order.]

Hit ratio = 7/18

Two-way set associative cache (LRU Replacement Policy)


A0 B0 C2 A0 D1 B0 E4 F5 A0 C2 D1 B0 G3 C2 H7 I6 A0 B0

[Table: contents of the four 2-way sets after each access; the digit after each block is its LRU counter within the set (0 = most recently used). The final state holds A and B in set 0, D and F in set 1, C and I in set 2, and G and H in set 3.]

Hit ratio = 7/18

Associative Cache with 2 byte line size (FIFO Replacement Policy)


A0 B0 C2 A0 D1 B0 E4 F5 A0 C2 D1 B0 G3 C2 H7 I6 A0 B0
Line pairs: A and J; B and D; C and G; E and F; I and H

[Table: contents of the four 2-byte lines after each access. Because each miss loads a whole line, fetching A also brings in J, fetching B brings in D, and so on, so many later accesses hit.]

Hit ratio = 11/18
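The effect of the larger line size is easy to reproduce: translate each byte access into the line it belongs to, then run a plain FIFO simulation over lines. A sketch using the pairings given above (4 lines of 2 bytes keeps the total cache size at 8 bytes, as the table suggests):

```python
from collections import deque

# Bytes sharing a 2-byte line, per the slide: A/J, B/D, C/G, E/F, I/H.
LINE_OF = {'A': 'AJ', 'J': 'AJ', 'B': 'BD', 'D': 'BD', 'C': 'CG',
           'G': 'CG', 'E': 'EF', 'F': 'EF', 'I': 'IH', 'H': 'IH'}

queue, hits = deque(), 0
for byte in "ABCADBEFACDBGCHIAB":
    line = LINE_OF[byte]        # a miss loads the whole 2-byte line
    if line in queue:
        hits += 1
    else:
        if len(queue) == 4:     # 4 lines x 2 bytes = same 8-byte cache
            queue.popleft()     # FIFO eviction
        queue.append(line)
print(hits, "hits / 18")        # 11 hits -> hit ratio 11/18 ≈ 0.61
```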

Direct-mapped Cache with line size of 2 bytes


A0 B0 C2 A0 D1 B0 E4 F5 A0 C2 D1 B0 G3 C2 H7 I6 A0 B0
Line pairs: A and J; B and D; C and G; E and F; I and H

[Table: contents of the direct-mapped cache's 2-byte lines after each access. Lines A/J and B/D compete for the same cache line, so A and B repeatedly evict each other, while C/G, E/F, and I/H each settle into their own line.]

Hit ratio = 7/18

Two-way set Associative Cache with line size of 2 bytes


A0 B0 C2 A0 D1 B0 E4 F5 A0 C2 D1 B0 G3 C2 H7 I6 A0 B0
Line pairs: A and J; B and D; C and G; E and F; I and H

[Table: contents of the two-way sets of 2-byte lines after each access, with an LRU counter per line (0 = most recently used). Loading whole lines plus associativity yields the best hit ratio of the four configurations.]

Hit ratio = 12/18

Page Replacement - FIFO


FIFO is simple to implement:
When a page comes in, place its id at the end of the list; evict the page at the head of the list.

Might be good? The page to be evicted has been in memory the longest time. But maybe it is still being used; we just don't know.

FIFO suffers from Belady's Anomaly: the fault rate may increase when there is more physical memory!
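The anomaly is easy to demonstrate with the classic reference string 1 2 3 4 1 2 5 1 2 3 4 5 (this example is the textbook one, not from the slides):

```python
from collections import deque

def fifo_faults(refs, frames):
    """Count page faults under FIFO replacement with `frames` frames."""
    queue, faults = deque(), 0
    for page in refs:
        if page not in queue:
            faults += 1
            if len(queue) == frames:
                queue.popleft()   # evict the page resident the longest
            queue.append(page)
    return faults

refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
print(fifo_faults(refs, 3))   # 9 faults
print(fifo_faults(refs, 4))   # 10 faults: more memory, more faults!
```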

Parkinson's law: "Programs expand to fill the memory available to hold them." Idea: manage the available storage efficiently between the available programs.

Before VM
Programmers tried to shrink programs to fit tiny memories. Result:
Small, inefficient algorithms

Solution to Memory Constraints


Use a secondary memory such as disk; divide the disk into pieces that fit memory (RAM).
This is called Virtual Memory.

Implementations of VM
Paging
Disk broken up into regular-sized pages

Segmentation
Disk broken up into variable-sized segments

Memory Issues
Idea: Separate the concepts of
address space (on disk) and memory locations (RAM)

Example:
Address field = 2^16 = 65,536 memory cells; memory size = 4,096 memory cells. How can we fit the address space into main memory?

Paging
Break memories into Pages

NOTE: normally main memory has thousands of pages

[Figure: both memories divided into pages; 1 page = 4096 bytes]

New issue: how to manage addressing?

Address Mapping
Mapping secondary-memory addresses to main-memory addresses

[Figure: a page's virtual address in secondary memory maps to a physical address in main memory; 1 page = 4096 bytes]

Address Mapping
Mapping secondary-memory (program/virtual) addresses to main-memory (physical) addresses

[Figure: the virtual page at addresses 4096-8191, used by the program, maps onto the physical page at addresses 0-4095, used by the hardware; 1 page = 4096 bytes]

Paging
Illusion that main memory is large, contiguous, and linear
Size(MM) = Size(secondary memory)
Transparent to the programmer

[Figure: virtual addresses 0-8191 mapped page by page onto physical addresses 0-4095]

Paging Implementation
Virtual address space (program) and physical address space (MM)
Broken up into equal-size pages (just like cache and MM!)

Page size: always a power of 2
Common sizes: 512 bytes to 64 KB

Paging Implementation
Page frames
Page tables
Programs use virtual addresses

Memory Mapping
Page frame: home of VM pages in MM
Page table: home of mappings for VM pages (page # -> page frame #)

Note: secondary memory = 64K; main memory = 32K
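With the 4 KB pages used throughout these slides, that works out to 64K / 4K = 16 virtual pages mapping onto 32K / 4K = 8 page frames.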

Memory Mapping
Memory Management Unit (MMU): device that performs the virtual-to-physical mapping

32-bit virtual address -> MMU -> 15-bit physical address

Memory Management Unit


The MMU breaks the 32-bit virtual address into 2 portions:

20-bit virtual page #
12-bit offset within the page (since our pages are 4 KB)

How to determine if a page is in MM? A present/absent bit in its page table entry.
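As a sketch of the mechanism (the page table contents below are made up for illustration): split off the low 12 bits as the offset, use the upper bits as the virtual page number, check the present bit, and glue the frame number back onto the offset.

```python
PAGE_BITS = 12                        # 4 KB pages -> 12-bit offset

# Hypothetical page table: virtual page # -> (present bit, frame #).
# With 32 KB of main memory there are 8 frames of 4 KB each.
page_table = {0: (True, 2), 1: (True, 5), 2: (False, None)}

def translate(vaddr):
    """Map a virtual address to a physical address via the page table."""
    vpn = vaddr >> PAGE_BITS                  # upper bits: virtual page #
    offset = vaddr & ((1 << PAGE_BITS) - 1)   # low 12 bits pass through
    present, frame = page_table.get(vpn, (False, None))
    if not present:
        raise LookupError(f"page fault on virtual page {vpn}")
    return (frame << PAGE_BITS) | offset      # 3-bit frame + 12-bit offset

print(hex(translate(0x1ABC)))   # page 1 -> frame 5, giving 0x5ABC
```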

Demand Paging
Page fault: the requested page is not in MM

Demand paging: a page is loaded into MM only when it is demanded by the running program

[Figure: possible mappings of pages]

But what should be brought in for a program on start-up?

Working Set
The set of pages used by a process. Each process has a unique memory map, which matters for a multi-tasking OS.
At time t, there is a set of all pages used in the k most recent references. References tend to cluster on a small number of pages.

Put this set to work: store and load it during process switching!

Page Replacement Policy


Working Set:
Set of pages used actively & heavily Kept in memory to reduce Page Faults

Set is found/maintained dynamically by OS Replacement: OS tries to predict which page would have least impact on the running program
Common Replacement Schemes:
Least Recently Used (LRU) First-In-First-Out (FIFO)

Replacement Policy
Which page is replaced? The page removed should be the one least likely to be referenced in the near future. Most policies predict future behavior on the basis of past behavior.

Basic Replacement Algorithms


Least Recently Used (LRU)
Replaces the page that has not been referenced for the longest time By the principle of locality, this should be the page least likely to be referenced in the near future Each page could be tagged with the time of last reference. This would require a great deal of overhead.
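In practice the per-page timestamp can be avoided: keeping pages in a recency-ordered structure makes both lookup and victim selection cheap. A hypothetical sketch using Python's OrderedDict:

```python
from collections import OrderedDict

class LRUCache:
    """LRU bookkeeping without per-page timestamps: an ordered map
    keeps entries in recency order, so finding the victim is O(1)
    instead of scanning every entry for the oldest time tag."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()        # first key = least recent

    def get(self, key):
        if key not in self.data:
            return None                  # miss: caller handles the fault
        self.data.move_to_end(key)       # refresh recency on every hit
        return self.data[key]

    def put(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        elif len(self.data) == self.capacity:
            self.data.popitem(last=False)    # evict least recently used
        self.data[key] = value
```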

SRAM
SRAM is faster than DRAM.
SRAMs don't require external refresh circuitry or other work in order for them to keep their data intact.
SRAMs are made from four to six transistors (flip-flops) per bit.

DRAM
DRAMs use only one transistor, plus a capacitor, per bit.
DRAMs are therefore smaller and less expensive.

It has been discovered that for about 90% of the time that our programs execute, only 10% of our code is used! This is known as the Locality Principle.
Temporal Locality
When a program asks for a location in memory, it will likely ask for that same location again very soon thereafter.

Spatial Locality
When a program asks for a memory location at some address (let's say 1000), it will likely soon need a nearby location (1001, 1002, 1003, 1004, etc.).
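Both kinds of locality show up in even the simplest loop; a small illustrative sketch:

```python
data = list(range(1000))

total = 0
for i in range(len(data)):
    # Temporal locality: `total` and `i` are referenced every iteration.
    # Spatial locality: data[i] walks consecutive addresses, so one
    # cached line (or page) serves several of the following accesses.
    total += data[i]
```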

For a 1 GHz CPU, a 50 ns wait means 50 wasted clock cycles.

Registers: fastest possible access (usually 1 CPU cycle), <1 ns
Level 1 (SRAM) cache: 2-8 ns; often accessed in just a few cycles; usually tens to hundreds of kilobytes; ~$80/MB
Level 2 (SRAM) cache: 5-12 ns; 2x to 10x higher latency than L1; now multi-MB; ~$80/MB
Main memory (DRAM): 10-60 ns; may take hundreds of cycles; can be multiple gigabytes (e.g. 2 GB for $11, $0.0055/MB)
Disk storage: 3,000,000-10,000,000 ns; millions of cycles of latency, but very large (e.g. 1 TB for $139, $0.000139/MB)
Tertiary storage: really slow; several seconds of latency, but can be huge

Main memory and disk estimates: Fry's ad, 10/16/2008

We established that the Locality Principle states that only a small amount of memory is needed for most of a program's lifetime. We now have a memory hierarchy that places very fast yet expensive RAM near the CPU and larger, slower, cheaper RAM further away.

The trick is to keep the data that the CPU wants in the small, expensive, fast memory close to the CPU. And how do we do that?

Hardware and the operating system are responsible for moving data throughout the memory hierarchy when the CPU needs it. Modern programming languages mainly assume two levels of memory: main memory and disk storage. Programmers are responsible for moving data between disk and memory through file I/O. Optimizing compilers are responsible for generating code that, when executed, will cause the hardware to use caches and registers efficiently.

A cache algorithm is a computer program or a hardware-maintained structure designed to manage a cache of information. When the smaller cache is full, the algorithm must choose which items to discard to make room for the new data. The "hit rate" of a cache describes how often a searched-for item is actually found in the cache. The "latency" of a cache describes how long after requesting a desired item the cache can return that item.

Each replacement strategy is a compromise between hit rate and latency.

Direct Mapped Cache


The direct mapped cache is the simplest form of cache and the easiest to check for a hit. Unfortunately, the direct mapped cache also has the worst hit ratio, because there is only one place that any given address can be stored.

Fully Associative Cache


The fully associative cache has the best hit ratio, because any line in the cache can hold any address that needs to be cached. However, this cache suffers from problems involving searching the cache: a replacement algorithm is needed, usually some form of LRU ("least recently used").

N-Way Set Associative Cache


The set associative cache is a good compromise between the direct mapped and fully associative caches.

Virtual memory is basically the extension of physical main memory (RAM) into a lower-cost portion of our memory hierarchy (let's say hard disk). A form of the overlay approach, managed by the OS and called paging, is used to swap pages of memory back and forth between the disk and physical RAM. Hard disks are huge, but do you remember how slow they are? Millions of times slower than the other memories in our pyramid.
