
Cache Memory Replacement Policy, Virtual Memory

Prof. Sin-Min Lee, Department of Computer Science

Associative Cache
Access order
A0 B0 C2 A0 D1 B0 E4 F5 A0 C2 D1 B0 G3 C2 H7 I6 A0 B0
Tc = 10 ns, Tp = 60 ns, FIFO replacement

h = 0.389
TM = 40.56 ns
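These figures follow the effective access time formula the slides appear to use (assuming a hit costs Tc and a miss costs Tp): TM = h x Tc + (1 - h) x Tp = 0.389 x 10 ns + 0.611 x 60 ns ≈ 40.56 ns.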

FIFO
FIFO is similar to LRU except that, because FIFO does not move the most recently used block up in the queue, it can end up discarding a block that was used recently instead of the least recently used one. In this example, when the memory is full, A is discarded even though it was recently used.
[Figure: step-by-step contents of the 8-line cache for the access sequence A B C A D B E F A C D B G C H I A B under FIFO replacement.]

Hit ratio: 7/18

A is removed from memory, although it was not the least recently used.
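The walkthrough above can be checked with a short simulation. A minimal Python sketch (the function name and trace encoding are mine, not from the slides):

```python
from collections import deque

def fifo_hit_ratio(accesses, capacity):
    """Fully associative cache with FIFO replacement."""
    queue = deque()           # leftmost entry = oldest resident block
    hits = 0
    for block in accesses:
        if block in queue:
            hits += 1         # a hit does NOT reorder the FIFO queue
        else:
            if len(queue) == capacity:
                queue.popleft()     # evict the block loaded earliest
            queue.append(block)
    return hits / len(accesses)

trace = list("ABCADBEFACDBGCHIAB")
print(fifo_hit_ratio(trace, 8))     # 7/18 ≈ 0.389, matching the slide
```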

LRU
LRU (Least Recently Used) is a method that keeps track of how recently each data block was used, so the block that has gone unused the longest can be discarded first.
[Figure: step-by-step contents of the 8-line cache for the same access sequence A B C A D B E F A C D B G C H I A B under LRU replacement.]

Hit ratio: 9/18

The bottom block is the least recently used; it is the one replaced (discarded) in favor of the new block when the memory is full.

E is LRU and is removed from memory

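The only change needed to turn the FIFO sketch into LRU is that a hit must refresh the block's position. Again a hypothetical sketch, not code from the slides:

```python
def lru_hit_ratio(accesses, capacity):
    """Fully associative cache with LRU replacement."""
    stack = []                # index 0 = least recently used block
    hits = 0
    for block in accesses:
        if block in stack:
            hits += 1
            stack.remove(block)   # refresh: re-append as most recent
        elif len(stack) == capacity:
            stack.pop(0)          # evict the least recently used block
        stack.append(block)
    return hits / len(accesses)

trace = list("ABCADBEFACDBGCHIAB")
print(lru_hit_ratio(trace, 8))      # 9/18 = 0.5, matching the slide
```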

Direct-Mapped Cache
Access order
A0 B0 C2 A0 D1 B0 E4 F5 A0 C2 D1 B0 G3 C2 H7 I6 A0 B0
Tc = 10 ns, Tp = 60 ns
h = 0.167

TM = 51.67 ns

2-Way Set Associative Cache


Access order
A0 B0 C2 A0 D1 B0 E4 F5 A0 C2 D1 B0 G3 C2 H7 I6 A0 B0
Tc = 10 ns, Tp = 60 ns, LRU replacement

h = 0.389
TM = 40.56 ns
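All three organizations can be driven by one simulator if the set count and associativity are parameters. A sketch under the slides' conventions (the digit attached to each block is its cache index; names are mine):

```python
def set_assoc_hit_ratio(accesses, num_sets, ways):
    """Set-associative cache, LRU replacement within each set.

    ways=1 gives a direct-mapped cache; num_sets=1 gives a
    fully associative cache.
    """
    sets = [[] for _ in range(num_sets)]   # each list: LRU ... MRU
    hits = 0
    for block, index in accesses:
        lru = sets[index % num_sets]
        if block in lru:
            hits += 1
            lru.remove(block)     # refresh on hit
        elif len(lru) == ways:
            lru.pop(0)            # evict this set's LRU way
        lru.append(block)
    return hits / len(accesses)

trace = [('A',0),('B',0),('C',2),('A',0),('D',1),('B',0),
         ('E',4),('F',5),('A',0),('C',2),('D',1),('B',0),
         ('G',3),('C',2),('H',7),('I',6),('A',0),('B',0)]
print(set_assoc_hit_ratio(trace, 8, 1))   # direct-mapped: 3/18 ≈ 0.167
print(set_assoc_hit_ratio(trace, 4, 2))   # 2-way LRU:     7/18 ≈ 0.389
```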

Associative Cache (FIFO Replacement Policy)


A0 B0 C2 A0 D1 B0 E4 F5 A0 C2 D1 B0 G3 C2 H7 I6 A0 B0

[Table: cache contents after each access. The first eight distinct blocks A-H fill the eight lines; the closing accesses I6, A0, B0 all miss and evict the oldest residents A, B, C in arrival order.]

Hit ratio = 7/18

Two-way set associative cache (LRU Replacement Policy)


A0 B0 C2 A0 D1 B0 E4 F5 A0 C2 D1 B0 G3 C2 H7 I6 A0 B0

[Table: contents of the four 2-way sets after each access; the digit after each block is its LRU counter within the set (0 = most recently used). The final state holds A and B in set 0, D and F in set 1, C and I in set 2, and G and H in set 3.]

Hit ratio = 7/18

Associative Cache with 2 byte line size (FIFO Replacement Policy)


A0 B0 C2 A0 D1 B0 E4 F5 A0 C2 D1 B0 G3 C2 H7 I6 A0 B0
Line pairs: A and J; B and D; C and G; E and F; I and H

[Table: contents of the four 2-byte lines after each access. Because each miss loads a whole line, fetching A also brings in J, fetching B brings in D, and so on, so many later accesses hit.]

Hit ratio = 11/18
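The effect of the larger line size is easy to reproduce: translate each byte access into the line it belongs to, then run a plain FIFO simulation over lines. A sketch using the pairings given above (4 lines of 2 bytes keeps the total cache size at 8 bytes, as the table suggests):

```python
from collections import deque

# Bytes sharing a 2-byte line, per the slide: A/J, B/D, C/G, E/F, I/H.
LINE_OF = {'A': 'AJ', 'J': 'AJ', 'B': 'BD', 'D': 'BD', 'C': 'CG',
           'G': 'CG', 'E': 'EF', 'F': 'EF', 'I': 'IH', 'H': 'IH'}

queue, hits = deque(), 0
for byte in "ABCADBEFACDBGCHIAB":
    line = LINE_OF[byte]        # a miss loads the whole 2-byte line
    if line in queue:
        hits += 1
    else:
        if len(queue) == 4:     # 4 lines x 2 bytes = same 8-byte cache
            queue.popleft()     # FIFO eviction
        queue.append(line)
print(hits, "hits / 18")        # 11 hits -> hit ratio 11/18 ≈ 0.61
```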

Direct-mapped Cache with line size of 2 bytes


A0 B0 C2 A0 D1 B0 E4 F5 A0 C2 D1 B0 G3 C2 H7 I6 A0 B0
Line pairs: A and J; B and D; C and G; E and F; I and H

[Table: contents of the direct-mapped cache's 2-byte lines after each access. Lines A/J and B/D compete for the same cache line, so A and B repeatedly evict each other, while C/G, E/F, and I/H each settle into their own line.]

Hit ratio = 7/18

Two-way set Associative Cache with line size of 2 bytes


A0 B0 C2 A0 D1 B0 E4 F5 A0 C2 D1 B0 G3 C2 H7 I6 A0 B0
Line pairs: A and J; B and D; C and G; E and F; I and H

[Table: contents of the two-way sets of 2-byte lines after each access, with an LRU counter per line (0 = most recently used). Loading whole lines plus associativity yields the best hit ratio of the four configurations.]

Hit ratio = 12/18

Page Replacement - FIFO


FIFO is simple to implement:
When a page comes in, place its id at the end of the list; evict the page at the head of the list.

Might be good? The page to be evicted has been in memory the longest time. But maybe it is still being used; we just don't know.

FIFO suffers from Belady's Anomaly: the fault rate may increase when there is more physical memory!
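The anomaly is easy to demonstrate with the classic reference string 1 2 3 4 1 2 5 1 2 3 4 5 (this example is the textbook one, not from the slides):

```python
from collections import deque

def fifo_faults(refs, frames):
    """Count page faults under FIFO replacement with `frames` frames."""
    queue, faults = deque(), 0
    for page in refs:
        if page not in queue:
            faults += 1
            if len(queue) == frames:
                queue.popleft()   # evict the page resident the longest
            queue.append(page)
    return faults

refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
print(fifo_faults(refs, 3))   # 9 faults
print(fifo_faults(refs, 4))   # 10 faults: more memory, more faults!
```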

Parkinson's law: "Programs expand to fill the memory available to hold them." Idea: manage the available storage efficiently between the available programs.

Before VM
Programmers tried to shrink programs to fit tiny memories. Result:
Small, inefficient algorithms

Solution to Memory Constraints


Use a secondary memory such as disk; divide the disk into pieces that fit memory (RAM).
This is called Virtual Memory.

Implementations of VM
Paging
Disk broken up into regular-sized pages

Segmentation
Disk broken up into variable-sized segments

Memory Issues
Idea: Separate the concepts of
address space (on disk) and memory locations (RAM)

Example:
Address field = 2^16 = 65,536 memory cells; memory size = 4,096 memory cells. How can we fit the address space into main memory?

Paging
Break memories into Pages

NOTE: normally main memory has thousands of pages

[Figure: both memories divided into pages; 1 page = 4096 bytes]

New issue: how to manage addressing?

Address Mapping
Mapping secondary-memory addresses to main-memory addresses

[Figure: a page's virtual address in secondary memory maps to a physical address in main memory; 1 page = 4096 bytes]

Address Mapping
Mapping secondary-memory (program/virtual) addresses to main-memory (physical) addresses

[Figure: the virtual page at addresses 4096-8191, used by the program, maps onto the physical page at addresses 0-4095, used by the hardware; 1 page = 4096 bytes]

Paging
Illusion that main memory is large, contiguous, and linear
Size(MM) = Size(secondary memory)
Transparent to the programmer

[Figure: virtual addresses 0-8191 mapped page by page onto physical addresses 0-4095]

Paging Implementation
Virtual address space (program) and physical address space (MM)
Broken up into equal-size pages (just like cache and MM!)

Page size: always a power of 2
Common sizes: 512 bytes to 64 KB

Paging Implementation
Page frames
Page tables
Programs use virtual addresses

Memory Mapping
Page frame: home of VM pages in MM
Page table: home of mappings for VM pages (page # -> page frame #)

Note: secondary memory = 64K; main memory = 32K
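With the 4 KB pages used throughout these slides, that works out to 64K / 4K = 16 virtual pages mapping onto 32K / 4K = 8 page frames.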

Memory Mapping
Memory Management Unit (MMU): device that performs the virtual-to-physical mapping

32-bit virtual address -> MMU -> 15-bit physical address

Memory Management Unit


The MMU breaks the 32-bit virtual address into 2 portions:

20-bit virtual page #
12-bit offset within the page (since our pages are 4 KB)

How to determine if a page is in MM? A present/absent bit in its page table entry.
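As a sketch of the mechanism (the page table contents below are made up for illustration): split off the low 12 bits as the offset, use the upper bits as the virtual page number, check the present bit, and glue the frame number back onto the offset.

```python
PAGE_BITS = 12                        # 4 KB pages -> 12-bit offset

# Hypothetical page table: virtual page # -> (present bit, frame #).
# With 32 KB of main memory there are 8 frames of 4 KB each.
page_table = {0: (True, 2), 1: (True, 5), 2: (False, None)}

def translate(vaddr):
    """Map a virtual address to a physical address via the page table."""
    vpn = vaddr >> PAGE_BITS                  # upper bits: virtual page #
    offset = vaddr & ((1 << PAGE_BITS) - 1)   # low 12 bits pass through
    present, frame = page_table.get(vpn, (False, None))
    if not present:
        raise LookupError(f"page fault on virtual page {vpn}")
    return (frame << PAGE_BITS) | offset      # 3-bit frame + 12-bit offset

print(hex(translate(0x1ABC)))   # page 1 -> frame 5, giving 0x5ABC
```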

Demand Paging
Page fault: the requested page is not in MM

Demand paging: a page is loaded into MM only when it is demanded by the running program

[Figure: possible mappings of pages]

But what should be brought in for a program on start-up?

Working Set
The set of pages used by a process. Each process has a unique memory map, which matters for a multi-tasking OS.
At time t, there is a set of all pages used in the k most recent references. References tend to cluster on a small number of pages.

Put this set to work: store and load it during process switching!

Page Replacement Policy


Working Set:
Set of pages used actively & heavily Kept in memory to reduce Page Faults

Set is found/maintained dynamically by OS Replacement: OS tries to predict which page would have least impact on the running program
Common Replacement Schemes:
Least Recently Used (LRU) First-In-First-Out (FIFO)

Replacement Policy
Which page is replaced? The page removed should be the one least likely to be referenced in the near future. Most policies predict future behavior on the basis of past behavior.

Basic Replacement Algorithms


Least Recently Used (LRU)
Replaces the page that has not been referenced for the longest time By the principle of locality, this should be the page least likely to be referenced in the near future Each page could be tagged with the time of last reference. This would require a great deal of overhead.
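In practice the per-page timestamp can be avoided: keeping pages in a recency-ordered structure makes both lookup and victim selection cheap. A hypothetical sketch using Python's OrderedDict:

```python
from collections import OrderedDict

class LRUCache:
    """LRU bookkeeping without per-page timestamps: an ordered map
    keeps entries in recency order, so finding the victim is O(1)
    instead of scanning every entry for the oldest time tag."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()        # first key = least recent

    def get(self, key):
        if key not in self.data:
            return None                  # miss: caller handles the fault
        self.data.move_to_end(key)       # refresh recency on every hit
        return self.data[key]

    def put(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        elif len(self.data) == self.capacity:
            self.data.popitem(last=False)    # evict least recently used
        self.data[key] = value
```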

SRAM
SRAM is faster than DRAM.
SRAMs don't require external refresh circuitry or other work in order for them to keep their data intact.
SRAMs are made from four to six transistors (flip-flops) per bit.

DRAM
DRAMs use only one transistor, plus a capacitor, per bit.
DRAMs are therefore smaller and less expensive.

It has been discovered that for about 90% of the time that our programs execute, only 10% of our code is used! This is known as the Locality Principle.
Temporal Locality
When a program asks for a location in memory, it will likely ask for that same location again very soon thereafter.

Spatial Locality
When a program asks for a memory location at some address (let's say 1000), it will likely soon need a nearby location (1001, 1002, 1003, 1004, etc.).
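Both kinds of locality show up in even the simplest loop; a small illustrative sketch:

```python
data = list(range(1000))

total = 0
for i in range(len(data)):
    # Temporal locality: `total` and `i` are referenced every iteration.
    # Spatial locality: data[i] walks consecutive addresses, so one
    # cached line (or page) serves several of the following accesses.
    total += data[i]
```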

For a 1 GHz CPU, a 50 ns wait means 50 wasted clock cycles.

Registers: fastest possible access (usually 1 CPU cycle), <1 ns
Level 1 (SRAM) cache: 2-8 ns; often accessed in just a few cycles; usually tens to hundreds of kilobytes; ~$80/MB
Level 2 (SRAM) cache: 5-12 ns; 2x to 10x higher latency than L1; now multi-MB; ~$80/MB
Main memory (DRAM): 10-60 ns; may take hundreds of cycles; can be multiple gigabytes (e.g. 2 GB for $11, $0.0055/MB)
Disk storage: 3,000,000-10,000,000 ns; millions of cycles of latency, but very large (e.g. 1 TB for $139, $0.000139/MB)
Tertiary storage: really slow; several seconds of latency, but can be huge

Main memory and disk estimates: Fry's ad, 10/16/2008

We established that the Locality Principle states that only a small amount of memory is needed for most of a program's lifetime. We now have a memory hierarchy that places very fast yet expensive RAM near the CPU and larger, slower, cheaper RAM further away.

The trick is to keep the data that the CPU wants in the small, expensive, fast memory close to the CPU. And how do we do that?

Hardware and the operating system are responsible for moving data throughout the memory hierarchy when the CPU needs it. Modern programming languages mainly assume two levels of memory: main memory and disk storage. Programmers are responsible for moving data between disk and memory through file I/O. Optimizing compilers are responsible for generating code that, when executed, will cause the hardware to use caches and registers efficiently.

A cache algorithm is a computer program or a hardware-maintained structure designed to manage a cache of information. When the smaller cache is full, the algorithm must choose which items to discard to make room for the new data. The "hit rate" of a cache describes how often a searched-for item is actually found in the cache. The "latency" of a cache describes how long after requesting a desired item the cache can return that item.

Each replacement strategy is a compromise between hit rate and latency.

Direct Mapped Cache


The direct mapped cache is the simplest form of cache and the easiest to check for a hit. Unfortunately, the direct mapped cache also has the worst hit ratio, because there is only one place that any given address can be stored.

Fully Associative Cache


The fully associative cache has the best hit ratio, because any line in the cache can hold any address that needs to be cached. However, this cache suffers from problems involving searching the cache: a replacement algorithm is needed, usually some form of LRU ("least recently used").

N-Way Set Associative Cache


The set associative cache is a good compromise between the direct mapped and fully associative caches.

Virtual memory is basically the extension of physical main memory (RAM) into a lower-cost portion of our memory hierarchy (let's say hard disk). A form of the overlay approach, managed by the OS and called paging, is used to swap pages of memory back and forth between the disk and physical RAM. Hard disks are huge, but do you remember how slow they are? Millions of times slower than the other memories in our pyramid.
