Associative Cache
Access order
A0 B0 C2 A0 D1 B0 E4 F5 A0 C2 D1 B0 G3 C2 H7 I6 A0 B0
Tc = 10 ns, Tp = 60 ns, FIFO replacement
h = 0.389
TM = 40.56 ns
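These numbers follow from the hit ratio, assuming the mean access time is the hit-weighted average of the cache time and the penalty time (the formula that matches the figures given on these slides):

TM = h · Tc + (1 − h) · Tp = (7/18)(10 ns) + (11/18)(60 ns) ≈ 40.56 ns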
FIFO
FIFO is similar to LRU, except that because the FIFO method doesn't move the most recently used block up in the queue, it ends up discarding a block that was used recently instead of the least recently used one. In this example, when the memory is full, A is discarded even though it was recently used.
[Trace: FIFO replacement in a fully associative memory with 8 block slots (0-7)]

Access: A  B  C  A  D  B  E  F  A  C  D  B  G  C  H  I  A  B
Hit?:   -  -  -  H  -  H  -  -  H  H  H  H  -  H  -  -  -  -

Hit ratio: 7/18
A is removed from memory, although it was not the least recently used.
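As a cross-check, here is a minimal Python sketch (not from the original slides) that replays the access order above through an 8-block fully associative memory under FIFO replacement and reproduces the 7/18 hit ratio:

```python
# Minimal FIFO cache simulation for the trace above (8-block associative memory).
from collections import deque

def fifo_hits(accesses, capacity=8):
    queue = deque()          # arrival order; leftmost = oldest resident block
    hits = 0
    for block in accesses:
        if block in queue:
            hits += 1        # FIFO does NOT reorder blocks on a hit
        else:
            if len(queue) == capacity:
                queue.popleft()   # evict the oldest block, even if recently used
            queue.append(block)
    return hits

accesses = list("ABCADBEFACDBGCHIAB")
print(f"hits: {fifo_hits(accesses)}/{len(accesses)}")  # hits: 7/18, so h = 0.389
```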
LRU
LRU (Least Recently Used) is a method that keeps track of how recently each data block has been used.
[Trace: LRU replacement in a fully associative memory with 8 block slots (0-7)]

Access: A  B  C  A  D  B  E  F  A  C  D  B  G  C  H  I  A  B
Hit?:   -  -  -  H  -  H  -  -  H  H  H  H  -  H  -  -  H  H

The bottom block is the Least Recently Used, which gets replaced (discarded) in favor of the new one when the memory is full.
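A corresponding sketch for LRU, again assuming an 8-block fully associative memory; an OrderedDict keeps the blocks ordered from least to most recently used:

```python
# Minimal LRU cache simulation for the same trace (8-block associative memory).
from collections import OrderedDict

def lru_hits(accesses, capacity=8):
    cache = OrderedDict()    # first key = least recently used block
    hits = 0
    for block in accesses:
        if block in cache:
            hits += 1
            cache.move_to_end(block)      # a hit makes the block most recent
        else:
            if len(cache) == capacity:
                cache.popitem(last=False) # evict the least recently used block
            cache[block] = True
    return hits

accesses = list("ABCADBEFACDBGCHIAB")
print(f"hits: {lru_hits(accesses)}/{len(accesses)}")  # 9/18 under LRU
```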
Direct-Mapped Cache
Access order
A0 B0 C2 A0 D1 B0 E4 F5 A0 C2 D1 B0 G3 C2 H7 I6 A0 B0
Tc = 10 ns, Tp = 60 ns
h = 0.167
TM = 51.67 ns
(For comparison, the associative cache with FIFO gave h = 0.389 and TM = 40.56 ns.)
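Using the same hit-weighted formula as before, the direct-mapped figures work out to:

TM = h · Tc + (1 − h) · Tp = (3/18)(10 ns) + (15/18)(60 ns) ≈ 51.67 ns

The direct-mapped cache does worse on this access order because A0 and B0 both map to line 0 and repeatedly evict each other.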
[Trace: direct-mapped cache with 8 lines; each block's number gives the line it maps to, so A0 and B0 keep evicting each other from line 0]

Access: A0 B0 C2 A0 D1 B0 E4 F5 A0 C2 D1 B0 G3 C2 H7 I6 A0 B0
Hit?:   -  -  -  -  -  -  -  -  -  H  H  -  -  H  -  -  -  -

Hit ratio: 3/18

[Diagrams: the same trace shown step by step with per-line tag bits, and further worked traces for 2-way set-associative organizations (4 sets x 2 ways) on a second access sequence]
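A minimal Python sketch (not from the original slides) of the direct-mapped trace: each reference carries the line its block maps to, and a reference hits only if that line currently holds the same block:

```python
# Minimal direct-mapped cache simulation with 8 lines.
def direct_mapped_hits(accesses, num_lines=8):
    lines = [None] * num_lines
    hits = 0
    for block, line in accesses:
        if lines[line] == block:
            hits += 1
        else:
            lines[line] = block   # conflict miss: new block overwrites the line
    return hits

accesses = [("A", 0), ("B", 0), ("C", 2), ("A", 0), ("D", 1), ("B", 0),
            ("E", 4), ("F", 5), ("A", 0), ("C", 2), ("D", 1), ("B", 0),
            ("G", 3), ("C", 2), ("H", 7), ("I", 6), ("A", 0), ("B", 0)]
print(f"hits: {direct_mapped_hits(accesses)}/{len(accesses)}")  # 3/18, h = 0.167
```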
Might FIFO be good? The page to be evicted has been in memory the longest time.
But maybe it is still being used. We just don't know.
FIFO suffers from Belady's Anomaly: the fault rate may increase when there is more physical memory!
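This can be demonstrated with the classic reference string 1 2 3 4 1 2 5 1 2 3 4 5 (an illustrative example, not from the original slides): FIFO takes 9 faults with 3 frames but 10 faults with 4 frames.

```python
# Demonstration of Belady's anomaly under FIFO page replacement.
from collections import deque

def fifo_faults(refs, frames):
    resident = deque()
    faults = 0
    for page in refs:
        if page not in resident:
            faults += 1
            if len(resident) == frames:
                resident.popleft()   # evict the page resident the longest
            resident.append(page)
    return faults

refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
print(fifo_faults(refs, 3))  # 9 faults
print(fifo_faults(refs, 4))  # 10 faults: more memory, more faults!
```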
Parkinson's Law: "Programs expand to fill the memory available to hold them."
Idea: manage the available storage efficiently among the available programs.
Before VM
Programmers tried to shrink programs to fit tiny memories.
Result: small, inefficient algorithms.
Implementations of VM
Paging
Disk broken up into regular-sized (fixed-size) pages
Segmentation
Disk broken up into variable-sized segments
Memory Issues
Idea: Separate the concepts of
the address space (the addresses a program can generate)
memory locations (physical RAM and disk)
Example:
Address field = 2^16 = 65,536 memory cells
Memory size = 4096 memory cells
How can we fit the Address Space into Main Memory?
Paging
Break memories into Pages
Address Mapping
Mapping 2ndary Memory (program/virtual) addresses to Main Memory (physical) addresses
[Diagram: virtual addresses mapped to physical addresses; 1 page = 4096 bytes, so virtual addresses 0-4095 form one page and 4096-8191 the next]
Paging
The illusion that Main Memory is large, contiguous, and linear, with Size(MM) = Size(2ndary M). Transparent to the programmer.
Paging Implementation
The Virtual Address Space (Program) & the Physical Address Space (MM) are broken up into equal-sized pages (just like cache & MM!)
Paging Implementation
Page Frames
Page Tables
Programs use Virtual Addresses
Memory Mapping
Page Frame: home of VM pages in MM
Page Table: home of mappings for VM pages; each entry maps a Page # to a Page Frame #
Memory Mapping
Memory Management Unit (MMU): Device that performs virtual-to-physical mapping
[Diagram: a 32-bit VM address enters the MMU, which outputs the corresponding physical address]
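A minimal sketch of what the MMU does, assuming 4096-byte pages; the page table contents here are hypothetical, for illustration only:

```python
# Sketch of MMU address translation: split a virtual address into a page
# number and an offset, look up the frame, and rebuild the physical address.
PAGE_SIZE = 4096                      # 2**12 bytes, so a 12-bit offset

# Hypothetical page table: virtual page number -> physical page frame number
page_table = {0: 2, 1: 5, 2: 0}

def translate(virtual_addr):
    page_number = virtual_addr // PAGE_SIZE
    offset = virtual_addr % PAGE_SIZE
    if page_number not in page_table:
        raise RuntimeError(f"page fault: page {page_number} not in main memory")
    frame = page_table[page_number]
    return frame * PAGE_SIZE + offset

print(hex(translate(4100)))   # page 1, offset 4 -> frame 5 -> 0x5004
```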
Demand Paging
Page Fault: the requested page is not in MM.
Demand Paging: a page is loaded into MM only when it is demanded by the program.
But what should be brought in for a program on start-up?
[Diagram: possible mappings of pages]
Working Set
The set of pages currently used by a process.
Each process has a unique memory map, which matters for a multi-tasked OS.
At time t, there is a set of all pages used by the k most recent references.
This set is found and maintained dynamically by the OS.
Replacement: the OS tries to predict which page would have the least impact on the running program.
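A minimal sketch (the reference string and window size k are illustrative assumptions, not from the slides): the working set at time t, taken as the set of distinct pages touched by the k most recent references:

```python
# Working set W(t, k): the distinct pages among the last k references at time t.
def working_set(reference_string, t, k):
    window = reference_string[max(0, t - k + 1) : t + 1]
    return set(window)

refs = ["A", "B", "A", "C", "A", "B", "D", "D", "D", "E"]
print(working_set(refs, t=5, k=4))  # {'A', 'B', 'C'}: pages in recent use
print(working_set(refs, t=9, k=4))  # {'D', 'E'}: the set shifts over time
```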
Common Replacement Schemes:
Least Recently Used (LRU)
First-In-First-Out (FIFO)
Replacement Policy
Which page is replaced? The page removed should be the page least likely to be referenced in the near future. Most policies predict future behavior on the basis of past behavior.
SRAM
SRAM is faster than DRAM
DRAM
DRAMs are smaller and less expensive because SRAMs are made from four to six transistors (flip-flops) per bit, while a DRAM cell needs only one transistor and a capacitor.
SRAMs don't require external refresh circuitry or other work in order for them to keep their data intact.
It has been discovered that our programs spend about 90% of their execution time in only 10% of their code! This is known as the Locality Principle.
Temporal Locality
When a program asks for a location in memory, it will likely ask for that same location again very soon thereafter.
Spatial Locality
When a program asks for a memory location at a given address (let's say 1000), it will likely soon need a nearby location: 1001, 1002, 1003, 1004, etc.
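A small illustrative sketch (the trace, window size, and distance threshold are assumptions, not from the slides) that classifies each access in a toy address trace as temporal locality, spatial locality, or neither:

```python
# Classify each access in a hypothetical address trace by the kind of
# locality it shows relative to the most recent accesses.
trace = [1000, 1001, 1002, 1003, 1000, 1001, 2000, 1002, 1003]

WINDOW = 4   # how many recent accesses we remember (assumption)
NEARBY = 4   # how close an address must be to count as "spatial" (assumption)

recent = []
for addr in trace:
    if addr in recent:
        kind = "temporal locality (same address re-used)"
    elif any(abs(addr - r) <= NEARBY for r in recent):
        kind = "spatial locality (near a recent address)"
    else:
        kind = "no recent locality"
    print(f"access {addr}: {kind}")
    recent = (recent + [addr])[-WINDOW:]
```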
Registers: fastest possible access (usually 1 CPU cycle), <1 ns
Level 1 (SRAM) cache: 2-8 ns, often accessed in just a few cycles; usually tens to hundreds of kilobytes, ~$80/MB
Level 2 (SRAM) cache: higher latency than L1 by 2x to 10x; now multi-MB, ~$80/MB
Main memory (DRAM): may take hundreds of cycles, but can be multiple gigabytes, e.g. 2 GB for $11 ($0.0055/MB)
Disk storage: millions of cycles latency, but very large, e.g. 1 TB for $139 ($0.000139/MB)
We established that the Locality Principle states that only a small amount of memory is needed for most of a program's lifetime. We now have a Memory Hierarchy that places very fast yet expensive RAM near the CPU, and larger, slower, cheaper RAM further away.
The trick is to keep the data that the CPU wants in the small, expensive, fast memory close to the CPU. How do we do that?
Hardware and the Operating System are responsible for moving data throughout the Memory Hierarchy when the CPU needs it. Modern programming languages mainly assume two levels of memory, main memory and disk storage. Programmers are responsible for moving data between disk and memory through file I/O. Optimizing compilers are responsible for generating code that, when executed, will cause the hardware to use caches and registers efficiently.
A cache algorithm is a computer program or a hardware-maintained structure that is designed to manage a cache of information. When the smaller cache is full, the algorithm must choose which items to discard to make room for the new data. The "hit rate" of a cache describes how often a searched-for item is actually found in the cache. The "latency" of a cache describes how long after requesting a desired item the cache can return that item.
Virtual Memory is basically the extension of physical main memory (RAM) into a lower-cost portion of our Memory Hierarchy (let's say the hard disk). A form of the overlay approach, managed by the OS and called Paging, is used to swap pages of memory back and forth between the disk and physical RAM. Hard disks are huge, but do you remember how slow they are? Millions of times slower than the other memories in our pyramid.