
Paging

Both unequal fixed-size and variable-size partitioning are inefficient in their use of memory; it has been observed that both schemes lead to memory wastage. There is another scheme for managing memory, known as paging.
In this scheme, memory is partitioned into equal fixed-size chunks that are relatively small. These chunks of memory are known as frames or page frames.

Each process is also divided into small fixed-size chunks of the same size. The chunks of a program are known as pages. A page of a program can be assigned to any available page frame. In this scheme, the wasted space in memory for a process is only a fraction of a page frame, corresponding to the last page of the program. At any given point in time, some of the frames in memory are in use and some are free. The list of free frames is maintained by the operating system.

Suppose process A, stored on disk, consists of six pages. At the time of execution of process A, the operating system finds six free frames and loads the six pages of process A into them. These six frames need not be contiguous in main memory. The operating system maintains a page table for each process. Within the program, each logical address consists of a page number and a relative address within the page.

In simple partitioning, a logical address is the location of a word relative to the beginning of the program; the processor translates that into a physical address. With paging, a logical address is the location of a word relative to the beginning of a page of the program, because the whole program is divided into pages of equal length, and the length of a page is the same as the length of a page frame.

Given a logical address consisting of a page number and a relative address within the page, the processor uses the page table to produce the physical address, which consists of a frame number and a relative address within the frame.

Figure 3.22 shows the allocation of frames to a new process in main memory. A page table is maintained for each process. This page table helps us find the physical address in a frame that corresponds to a logical address within a process.
Figure 3.22: Allocation of free frames

The conversion of a logical address to a physical address for process A is shown in the figure.

Figure 3.23: Translation of Logical Address to Physical Address

This approach solves the problems mentioned earlier. Main memory is divided into many small, equal-size frames. Each process is divided into frame-size pages. A smaller process requires fewer pages; a larger process requires more. When a process is brought in, its pages are loaded into available frames and a page table is set up.

The translation of logical addresses to physical addresses is shown in Figure 3.23.
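The page-number/offset translation described above can be sketched in Python. This is a minimal illustration: the page size, the frame numbers, and the page table contents are all made up for the example.

```python
# Sketch of paging address translation, assuming a 1 KB page size and a
# hypothetical page table for one process (frame numbers are made up).
PAGE_SIZE = 1024

# page_table[page_number] -> frame_number
page_table = {0: 5, 1: 2, 2: 7}

def translate(logical_address):
    """Split a logical address into (page, offset) and map page -> frame."""
    page = logical_address // PAGE_SIZE
    offset = logical_address % PAGE_SIZE
    frame = page_table[page]          # raises KeyError if page not resident
    return frame * PAGE_SIZE + offset

# Logical address 2100 lies in page 2 (offset 52); page 2 is in frame 7.
print(translate(2100))   # 7 * 1024 + 52 = 7220
```

Note that the frames used (5, 2, 7) are not contiguous, yet the process sees a single contiguous logical address space.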

Virtual Memory

The concept of paging helps us to develop truly effective multiprogramming systems.

Since a process need not be loaded into contiguous memory locations, a page of a process can be put into any free page frame. On the other hand, it is not necessary to load the whole process into main memory, because execution may be confined to a small section of the program (e.g., a subroutine).

It would clearly be wasteful to load in many pages for a process when only a few pages will be used before
the program is suspended.

Instead of loading all the pages of a process, each page is brought in only when it is needed, i.e., on demand. This scheme is known as demand paging.

Demand paging also allows us to accommodate more processes in main memory, since we do not load the whole process into main memory; pages are brought in as and when they are required.

With demand paging, it is not necessary to load an entire process into main memory.

This concept leads us to an important consequence: it is possible for a process to be larger than the size of main memory. So, while developing a new process, it is not necessary to consider how much main memory is available in the machine, because the process will be divided into pages and the pages will be brought into memory on demand.

Because a process executes only in main memory, main memory is referred to as real memory or physical memory. A programmer or user perceives a much larger memory that is allocated on the disk. This memory is referred to as virtual memory. The programmer enjoys a huge virtual memory space in which to develop his or her program or software. The execution of a program is the job of the operating system and the underlying hardware. To improve performance, special hardware is added to the system. This hardware unit is known as the Memory Management Unit (MMU). In a paging system, we make a page table for each process. The page table helps us find the physical address corresponding to a virtual address.

The virtual address space is used to develop a process. The special hardware unit, called the Memory Management Unit (MMU), translates virtual addresses to physical addresses. When the desired data is in main memory, the CPU can work with it. If the data is not in main memory, the MMU causes the operating system to bring it into memory from the disk.

A typical virtual memory organization is shown in the Figure 3.24.

Figure 3.24: Virtual Memory Organization
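The demand-paging behaviour described above can be sketched as follows. This is an illustrative model only: the frame numbers, the free-frame list, and the "disk" contents are invented for the example.

```python
# Minimal demand-paging sketch: a page is loaded into a free frame only on
# first access (a page fault); later accesses find it in the page table.
free_frames = [3, 0, 6]            # free-frame list kept by the OS
page_table = {}                    # page -> frame, filled in on demand
disk = {0: "page0-data", 1: "page1-data", 2: "page2-data"}

def access(page):
    if page not in page_table:     # page fault: bring the page in from disk
        frame = free_frames.pop(0)
        _data = disk[page]         # (in reality, a slow disk transfer)
        page_table[page] = frame
        print(f"page fault: loaded page {page} into frame {frame}")
    return page_table[page]

access(1)    # first access: page fault, loaded into frame 3
access(1)    # second access: hit, no fault
```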

Address Translation

The basic mechanism for reading a word from memory involves the translation of a virtual or logical address, consisting of a page number and an offset, into a physical address, consisting of a frame number and an offset, using a page table.

There is one page table for each process, and each process can occupy a huge amount of virtual memory. However, the virtual memory of a process cannot go beyond a certain limit, which is restricted by the underlying hardware of the MMU. One such component is the size of the virtual address register.

Pages are relatively small, so the size of the page table grows as the size of the process increases. Therefore, the size of the page table could be unacceptably high.

To overcome this problem, most virtual memory schemes store the page table in virtual memory rather than in real
memory.
This means that the page table is subject to paging just as other pages are.

When a process is running, at least a part of its page table must be in main memory, including the page table
entry of the currently executing page.

A virtual address translation scheme by using page table is shown in the Figure 3.25.

Figure 3.25: Virtual Address Translation Method

Each virtual address generated by the processor is interpreted as a virtual page number (high-order bits) followed by an offset (low-order bits) that specifies the location of a particular word within a page. Information about the main memory location of each page is kept in a page table.

Some processors make use of a two-level scheme to organize large page tables.

In this scheme, there is a page directory, in which each entry points to a page table.

Thus, if the length of the page directory is X, and the maximum length of a page table is Y, then a process can consist of up to X * Y pages.

Typically, the maximum length of a page table is restricted to the size of one page frame.
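The two-level lookup can be sketched as follows. The sizes here are deliberately tiny, and the directory and table contents are invented; the point is only to show the high bits selecting a directory entry and the low bits indexing the second-level table.

```python
# Two-level page table sketch: a page directory whose entries point to
# second-level page tables. With X directory entries and up to Y entries
# per table, a process can have up to X * Y pages.
X, Y = 4, 4                       # directory length, page-table length

# directory[d] -> page table; page_table[t] -> frame number (made-up values)
directory = {0: {0: 9, 1: 4}, 1: {2: 7}}

def lookup(page_number):
    d, t = divmod(page_number, Y) # high part picks the directory entry,
    return directory[d][t]        # low part indexes the second-level table

print(lookup(1))   # directory 0, table entry 1 -> frame 4
print(lookup(6))   # directory 1, table entry 2 -> frame 7
```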

Inverted page table structures


In an inverted page table scheme, the virtual page number is hashed into a hash table. There is one entry in the hash table and the inverted page table for each real memory page, rather than one per virtual page.

Thus a fixed portion of real memory is required for the page table, regardless of the number of processes or virtual pages supported.

Because more than one virtual address may map into the same hash table entry, a chaining technique is used for managing the overflow.

The hashing technique results in chains that are typically short: one or two entries.

The inverted page table structure for address translation is shown in the Figure 3.26.

Figure 3.26: Inverted Page table structure

Translation Lookaside Buffer (TLB)

Every virtual memory reference can cause two physical memory accesses: one to fetch the appropriate page table entry, and one to fetch the desired data. Thus a straightforward virtual memory scheme would have the effect of doubling the memory access time. To overcome this problem, most virtual memory schemes make use of a special cache for page table entries, usually called a Translation Lookaside Buffer (TLB).

This cache functions in the same way as a memory cache and contains those page table entries that have been
most recently used. In addition to the information that constitutes a page table entry, the TLB must also
include the virtual address of the entry.
Figure 3.27 shows a possible organization of a TLB where the associative mapping technique is used.

Figure 3.27: Use of an associative mapped TLB

Set-associative mapped TLBs are also found in commercial products. An essential requirement is that the contents of the TLB be coherent with the contents of the page table in main memory. When the operating system changes the contents of the page table, it must simultaneously invalidate the corresponding entries in the TLB. One of the control bits in the TLB is provided for this purpose.

Address Translation proceeds as follows:

Given a virtual address, the MMU looks in the TLB for the referenced page.

If the page table entry for this page is found in the TLB, the physical address is obtained immediately.

If there is a miss in the TLB, then the required entry is obtained from the page table in the main
memory and the TLB is updated.

When a program generates an access request to a page that is not in the main memory, a page fault is
said to have occurred.

The whole page must be brought from the disk into the memory before access can proceed.

When it detects a page fault, the MMU asks the operating system to intervene by raising an exception (interrupt).
Processing of active task is interrupted, and control is transferred to the operating system.

The operating system then copies the requested page from the disk into main memory and returns control to the interrupted task. Because a long delay occurs while the page transfer takes place, the operating system may suspend execution of the task that caused the page fault and begin execution of another task whose pages are in main memory.
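The lookup sequence above can be sketched as follows. The page and frame numbers are invented; the point is the order of checks: TLB first, then the page table, and a page fault only when neither holds the page.

```python
# Sketch of the TLB lookup sequence: check the TLB first, fall back to the
# page table on a TLB miss (updating the TLB), and report a page fault when
# the page is not in main memory at all.
tlb = {}                            # virtual page -> frame (small cache)
page_table = {0: 8, 1: 3}           # pages currently resident in memory

def translate_page(vpage):
    if vpage in tlb:                # TLB hit: physical frame immediately
        return tlb[vpage]
    if vpage in page_table:         # TLB miss: consult page table, update TLB
        tlb[vpage] = page_table[vpage]
        return tlb[vpage]
    raise LookupError("page fault") # OS must bring the page in from disk

print(translate_page(1))  # TLB miss -> 3, and the TLB is updated
print(translate_page(1))  # now a TLB hit -> 3
```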

Q 14: A block set-associative cache consists of a total of 128 blocks, divided into sets of 4 blocks each. The main memory contains 8192 blocks, and each block contains 256 words.

a. How many bits are there in a main memory address?


b. How many bits are there in each of the TAG, SET and WORD fields?
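One way to sketch the bit-field arithmetic behind Q 14 (assuming word-addressable memory, as the question's block sizes suggest):

```python
# Q 14 field sizes: 128 blocks / 4-way sets -> 32 sets; 256 words per block;
# 8192 blocks * 256 words of main memory.
from math import log2

blocks_in_cache, ways = 128, 4
mem_blocks, words_per_block = 8192, 256

word_bits = int(log2(words_per_block))               # 256 words  -> 8 bits
set_bits = int(log2(blocks_in_cache // ways))        # 32 sets    -> 5 bits
addr_bits = int(log2(mem_blocks * words_per_block))  # 2^21 words -> 21 bits
tag_bits = addr_bits - set_bits - word_bits          # 21 - 5 - 8 -> 8 bits

print(addr_bits, tag_bits, set_bits, word_bits)      # 21 8 5 8
```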

Q 16: Consider a computer with the following characteristics: a total of 1 Mbyte of main memory, a word size of 1 byte, a block size of 16 bytes and a cache size of 64 Kbytes.

a. For the main memory addresses D0010, 12345, CDABF and F00FF, give the corresponding tag, cache line address and word offsets for a direct-mapped cache.
b. Give any two main memory addresses with different tags that map to the same cache slot for a direct-mapped cache.
c. For the main memory addresses F00FF and CDABF, give the corresponding tag, cache set and offset values for a four-way set-associative cache.
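The direct-mapped field split for Q 16 can be sketched as follows: a 1-Mbyte byte-addressable memory gives a 20-bit address, 16-byte blocks give a 4-bit offset, and 64 Kbytes / 16 bytes = 4096 lines give a 12-bit line number, leaving 4 tag bits.

```python
# Q 16(a): split a 20-bit address into 4-bit tag, 12-bit line, 4-bit offset
# for a direct-mapped cache (64 KB cache, 16-byte blocks).
def fields(addr):
    tag = addr >> 16              # top 4 bits of the 20-bit address
    line = (addr >> 4) & 0xFFF    # 4096 lines -> 12 bits
    offset = addr & 0xF           # 16-byte blocks -> 4 bits
    return tag, line, offset

for a in (0xD0010, 0x12345, 0xCDABF, 0xF00FF):
    t, l, o = fields(a)
    print(f"{a:05X}: tag={t:X} line={l:03X} offset={o:X}")
```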

Q 25: Consider a paged logical address space (composed of 64 pages of 4 Kbytes each) mapped into a 2-Mbyte physical memory space.

a. What is the format of the processor's logical address?


b. What is the length of the page table?
c. What is the length of the inverted page table?
d. What is the effect on the page table if the physical memory space is reduced by half?
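The arithmetic behind Q 25 can be sketched as follows: 64 pages give 6 page-number bits, 4K pages give a 12-bit offset, and the inverted table has one entry per physical frame.

```python
# Q 25: logical address format, page table length, inverted table length.
from math import log2

pages, page_size, phys_mem = 64, 4 * 1024, 2 * 1024 * 1024

page_bits = int(log2(pages))              # 64 pages  -> 6 bits
offset_bits = int(log2(page_size))        # 4K pages  -> 12 bits
logical_bits = page_bits + offset_bits    # 18-bit logical address

page_table_len = pages                    # one entry per virtual page: 64
inverted_len = phys_mem // page_size      # one entry per frame: 512

# (d) Halving physical memory halves the inverted table to 256 entries;
# the forward page table still has one entry per virtual page: 64.
inverted_half = (phys_mem // 2) // page_size

print(logical_bits, page_table_len, inverted_len, inverted_half)
```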

Introduction

A cache stores a subset of the address space of RAM. An address space is the set of valid addresses. Thus, for each address in cache, there is a corresponding address in RAM. This subset of addresses (and the corresponding copy of data) changes over time, based on the behavior of your program.
Cache is used to keep the most commonly used sections of RAM where they can be accessed quickly. This is necessary because CPU speeds increase much faster than memory access speeds. If we could access RAM at 3 GHz, there wouldn't be any need for cache, because RAM could keep up. Because it can't keep up, we use cache.

What if we wanted more RAM than we had available? For example, we might have 1 M of RAM; what if we wanted 10 M? How could we manage?

One way to extend the amount of memory accessible by a program is to use disk. Thus, we can use 10 Megs
of disk space. At any time, only 1 Meg resides in RAM.

In effect, RAM acts like a cache for disk. This idea of extending memory is called virtual memory. It's called "virtual" only because it's not RAM; it doesn't mean it's fake. The real problem with disk is that it's really, really slow to access. If registers can be accessed in 1 nanosecond, cache in 5 ns and RAM in about 100 ns, then disk is accessed in fractions of seconds. It can be a million times slower to access disk than a register. The advantage of disk is that it's easy to get lots of disk space for a small cost. Still, because disk is so slow to access, we want to avoid accessing disk unnecessarily.

Uses of Virtual Memory

Virtual memory is an old concept. Before computers had cache, they had virtual memory. For a long time, virtual memory appeared only on mainframes. Personal computers in the 1980s did not use virtual memory. In fact, many good ideas that were in common use in UNIX operating systems (pre-emptive multitasking and virtual memory) didn't appear in personal computer operating systems until the mid-1990s.

Initially, virtual memory meant the idea of using disk to extend RAM. Programs wouldn't have to care
whether the memory was "real" memory (i.e., RAM) or disk. The operating system and hardware would
figure that out.

Later on, virtual memory was used as a means of memory protection. Every program uses a range of addresses called the address space.

The assumption of operating system developers is that no user program can be trusted. User programs will try to destroy themselves, other user programs, and the operating system itself. That seems like a negative view; however, it's how operating systems are designed. Programs don't have to be deliberately malicious. They can be accidentally malicious (say, by modifying data through a pointer pointing to garbage memory). Virtual memory can help there too: it can help prevent programs from interfering with other programs. Occasionally, you want programs to cooperate and share memory. Virtual memory can also help in that respect.

How Virtual Memory Works

When a computer is running, many programs are simultaneously sharing the CPU. Each running program, plus the data structures needed to manage it, is called a process.

Each process is allocated an address space. This is a set of valid addresses that can be used. This address space
can be changed dynamically. For example, the program might request additional memory (from dynamic
memory allocation) from the operating system.

If a process tries to access an address that is not part of its address space, an error occurs, and the operating
system takes over, usually killing the process (core dumps, etc).

How does virtual memory play a role? As you run a program, it generates addresses. Addresses are generated
(for RISC machines) in one of three ways:

A load instruction

A store instruction

Fetching an instruction

Load/store instructions create data addresses, while fetching an instruction creates instruction addresses. Of course, RAM doesn't distinguish between the two kinds of addresses; it just sees an address. Each address generated by a program is considered virtual and must be translated to a real physical address. Thus, address translation is occurring all the time. As you might imagine, this must be handled in hardware if it's to be done efficiently.

You might think translating each address from virtual to physical is a crazy idea, because of how slow it is.
However, you get memory protection from address translation, so it's worth the hardware needed to get
memory protection.

Paging

In a cache, we fetched quantities called data blocks or cache lines. Those are typically somewhere between,
say, 4 and 64 bytes. There is a corresponding terminology in virtual memory to a cache line. It's called a page.
A page is a sequence of N bytes where N is a power of 2. These days, page sizes are at least 4K in size and
maybe as large as 64 K or more. Let's assume that we have 1 M of RAM. RAM is also called physical memory. We can subdivide the RAM into 4K pages. Thus 1M / 4K = 256 pages, so our RAM has 256 physical pages, each holding 4K. Let's assume we have 10 M of disk. Thus, we have 2560 disk pages.

In principle, each program may have up to 4 G of address space. Thus, it can, in principle, access 2^20 virtual pages. In reality, many of those pages are considered invalid pages.

Page Tables

How is an address translated from virtual to physical? First, like the cache, we split up a 32 bit virtual address
into a virtual page (which is like a tag) and a page offset.

If this looks a lot like a fully-associative cache whose offset is much, much larger, that's because it basically is one. We must convert the virtual page number to a physical page number. In our example, the virtual page number consists of 20 bits. A page table is a data structure which consists of 2^20 page table entries (PTEs). Think of the page table as an array of page table entries, indexed by the virtual page number.

The page table's index starts at 0 and ends at 2^20 - 1. Here's how it looks:

Suppose your program generates a virtual address. You'd extract bits B31 through B12 to get the virtual page number. Use that as an index into the above page table to access the page table entry (PTE).
Each PTE consists of a valid bit and a 20-bit physical page (it's 20 bits because we assume we have 1 M of RAM, and 1 M of RAM requires 20 bits to address each byte). If the valid bit is 1, then the virtual page is in RAM, and you can get the physical page from the PTE. This is called a page hit, and is basically the same as a cache hit.

If the valid bit is 0, the page is not in RAM, and the 20-bit physical page is meaningless. This means we must get the disk page corresponding to the virtual page from disk and place it into a page in RAM. This is called a page fault. Because disk access is slow, slow, slow, we want to minimize the number of page faults. In general, this is done by making RAM fully associative; that is, any disk page can go into any RAM page (disk, RAM, and virtual pages all have the same size). In practice, some pages in RAM are reserved for the operating system, to make the OS run efficiently.

Translation

Suppose your program generated the virtual address F0F0F0F0 (hex), which is 1111 0000 1111 0000 1111 0000 1111 0000 in binary. How would you translate this to a physical address? First, you would split the address into a virtual page and a page offset (see below).

Then, you'd see if the virtual page had a corresponding physical page in RAM using the page table. If the
valid bit of the PTE is 1, then you'd translate the virtual page to a physical page, and append the page offset.
That would give you a physical address in RAM.
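The split-and-translate steps above can be sketched for this exact address. The PTE contents (the physical page value 0x00042) are made up for the example.

```python
# Translate virtual address F0F0F0F0 (hex), assuming 4K pages: the low
# 12 bits are the page offset, the high 20 bits are the virtual page.
PAGE_OFFSET_BITS = 12

addr = 0xF0F0F0F0
vpage = addr >> PAGE_OFFSET_BITS         # virtual page: 0xF0F0F
offset = addr & 0xFFF                    # page offset:  0x0F0

page_table = {0xF0F0F: (1, 0x00042)}     # vpage -> (valid bit, phys page)
valid, ppage = page_table[vpage]
if valid:                                # page hit: append the offset
    phys_addr = (ppage << PAGE_OFFSET_BITS) | offset
    print(hex(phys_addr))                # 0x420f0
```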

Huge Page Tables

Page tables can be very large. If every virtual page were valid, our page table would be 2^20 x 21 bits. This is about 3 Megs just for one program's page table. If there are many programs, there are many tables, each occupying a lot of memory.
What's worse, the page tables we've been talking about are incomplete. If we have a page fault, we need to find the page on disk. Where is it located? That information is kept in another table, which is indexed by the virtual page (just like the page table we talked about) and tells you where on disk to find the page. Then, we have to copy that page to RAM and update the first page table. Thus, we need two page tables!

These page tables are basically just data. Thus, they occupy memory as any data does. When we switch from one process to another, we need its page table in RAM for easy access. It's useful to keep it located in certain parts of RAM for just such a purpose. If RAM is suitably large, we can have several processes' page tables in RAM at the same time.

A page table register can hold the physical address of the page table that's currently active to get quick access.
Still, these are large, and we may want to find ways to speed things up.

Inverted Page Tables

There are many schemes to reduce the size of a page table. One way is to use a hierarchy. Thus, we might have two layers of pages: bits B31 through B22 might select the first layer, while bits B21 through B12 select the second layer.

Another idea is to use a kind of closed hash table. The hash table's size is based on the number of physical
pages. The number of physical pages is usually a lot smaller than the number of all virtual pages put together.

A hash function takes a virtual page number as input, and produces an index into the hash table as the result.
Each entry of the hash table consists of a virtual page number and a physical page number. You check to see if
the virtual page number matched, and if so, then you use the physical page.

If it missed, then you must resolve the collision based on the hash table. In practice, you may need the number
of entries of the hash table to be a few times larger than the number of physical pages, to avoid excessive
collisions.
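A hashed inverted page table with chaining can be sketched as follows. The table size and the hash function (a simple modulo) are illustrative choices, not a real MMU design.

```python
# Sketch of a hashed inverted page table: the table is sized by the number
# of physical pages, and virtual pages that hash to the same slot are
# chained together.
NUM_PHYS_PAGES = 8
table = [[] for _ in range(NUM_PHYS_PAGES)]     # slot -> chain of entries

def insert(vpage, ppage):
    table[vpage % NUM_PHYS_PAGES].append((vpage, ppage))

def lookup(vpage):
    for v, p in table[vpage % NUM_PHYS_PAGES]:  # walk the (short) chain
        if v == vpage:
            return p
    return None                                 # not resident in RAM

insert(3, 5)
insert(11, 2)        # 11 % 8 == 3: collides with vpage 3, so it is chained
print(lookup(11))    # 2
```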

An inverted page table takes longer to access because you may have collisions, but it takes up a lot less
memory. It helps with page hits. However, if you have a page fault, you still need a page table that maps
virtual pages to disk pages, and that will be large.

Translation Lookaside Buffer (TLB)

What's the cost of address translation? For each virtual address, we must access the page table to find the PTE corresponding to the virtual page. We look up the physical page from the PTE and construct a physical address. Then, we access RAM at the physical address. That's two memory accesses: one to access the PTE, one more to access the data in RAM. Cache helps us cut down the time to access memory, but only if we have cache hits. The idea of a TLB is to create a special cache for translations. Here's one example of a TLB.

Each row in the TLB is like one slot of a cache. Assume we have 64 rows. When you have a virtual address,
you can split it into a virtual page and an offset.

In parallel, compare the virtual page to all of the entries of the TLB (say, 64). There should be, at most, one
match. Just like a fully associative cache, you want to check if the TLB entry is valid.

If a TLB hit occurs, replace the virtual page with a physical page to create a physical address.

If there's a TLB miss, then it's still possible that the virtual page resides in RAM. You must now look up the
PTE (page table entry) to see if this is the case. If the PTE says the virtual page is in RAM, then you can
update the TLB, so that it has a correct virtual to physical page translation.

The TLB is designed to only store a limited subset of virtual to physical page translation. It is really just a
cache for the page table, storing only the most frequently used translations.

The TLB can be kept small enough that it can be fully associative. However, some CPU designers make larger
TLBs that are direct mapped or set associative.
