
CS37 Summer 2002, Homework 7

Due on Friday, August 16th, at the beginning of class.


PROBLEM 1 - Cache (10 points) (Source: [STAL2002, 4.7])
Consider a machine for which words are 32 bits long and which has a byte-addressable main memory of 2^16 bytes. Assume that a direct-mapped cache consisting of 32 lines, with a block size of two words, is used with this machine.

a) How is a 16-bit memory address divided into tag, index and byte offset? (Draw a diagram indicating the bit size and position of each field.)
b) Into what cache line would bytes with each of the following addresses be stored?
   i.   0001 0001 0001 1011
   ii.  1100 0011 0011 0100
   iii. 1101 0000 0001 1101
   iv.  1010 1010 1010 1010
c) Suppose that the byte with address 0001 1010 0001 1010 is stored in the cache. What are the other addresses that are stored along with it (i.e., in the same cache line)?
d) How many total bytes of data memory can be stored in the cache?
e) Assuming that we need to store one valid bit for every line in the cache, figure out the total size of the cache in bits (include all bits for valid, tag and data).
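As a reminder of the general mechanics (not a solution to this problem): the field widths follow from the cache geometry, and the fields themselves can be pulled out of an address with shifts and masks. The sketch below uses made-up parameters (16 lines, 4-byte blocks) that are deliberately different from the ones in the problem; the helper name and values are illustrative only.

    /* Sketch only: relating address-field widths to cache geometry for a
     * generic direct-mapped cache.  Parameters are hypothetical, not the
     * ones used in this problem. */
    #include <stdio.h>

    static unsigned log2u(unsigned x) {   /* assumes x is a power of two */
        unsigned bits = 0;
        while (x > 1) { x >>= 1; bits++; }
        return bits;
    }

    int main(void) {
        unsigned addr_bits   = 16;        /* address width          */
        unsigned num_lines   = 16;        /* example geometry only  */
        unsigned block_bytes = 4;         /* example geometry only  */

        unsigned offset_bits = log2u(block_bytes);
        unsigned index_bits  = log2u(num_lines);
        unsigned tag_bits    = addr_bits - index_bits - offset_bits;
        printf("tag=%u, index=%u, offset=%u bits\n",
               tag_bits, index_bits, offset_bits);

        unsigned addr   = 0x1234;                    /* arbitrary example address */
        unsigned offset = addr & (block_bytes - 1);
        unsigned line   = (addr >> offset_bits) & (num_lines - 1);
        unsigned tag    = addr >> (offset_bits + index_bits);
        printf("0x%04X -> tag=0x%X, line=%u, offset=%u\n", addr, tag, line, offset);
        return 0;
    }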

PROBLEM 2 - Cache (15 points)


a) Exercise 7.8 in PH2.
b) Exercise 7.20 in PH2.
c) Exercise 7.21 in PH2.

PROBLEM 3 - Memory Hierarchy (5 points) (Source: [STAL2002, 4.16])


A computer has a cache, main memory, and a disk used for virtual memory. If a referenced word is in the cache, 20ns (nanoseconds) are required to access it. If it is in the main memory but not in the cache, 60ns are needed to load it into the cache, and then the reference is started again. If the word is not in the main memory, 12ms (milliseconds) are required to fetch the word from disk, followed by 60ns to copy it to the cache, and then the reference is started again. For a certain program, the cache hit ratio is 0.9 and the main memory hit ratio is 0.6. Show how you can compute the average time (in ns) required to access a referenced word on this system.
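As a hint only: the average is a probability-weighted sum of the three possible cases (cache hit; cache miss but main memory hit; miss in both). The sketch below assumes one common reading of the problem, namely that the 0.6 main-memory hit ratio is conditional on a cache miss and that each restarted reference then hits in the cache; check those assumptions against your own interpretation before using this structure.

    /* Sketch only: average access time as an expected value.  Assumes the
     * main-memory hit ratio is conditional on a cache miss, and that a
     * restarted reference then hits in the cache. */
    #include <stdio.h>

    int main(void) {
        double t_cache = 20.0;   /* ns: cache hit                            */
        double t_mem   = 60.0;   /* ns: load from main memory into the cache */
        double t_disk  = 12e6;   /* ns: fetch from disk (12 ms)              */
        double h_cache = 0.9;    /* cache hit ratio                          */
        double h_mem   = 0.6;    /* main-memory hit ratio, given a cache miss */

        double avg = h_cache * t_cache
                   + (1.0 - h_cache) * h_mem * (t_mem + t_cache)
                   + (1.0 - h_cache) * (1.0 - h_mem) * (t_disk + t_mem + t_cache);

        printf("average access time = %.1f ns\n", avg);
        return 0;
    }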

PROBLEM 4 - Memory Hierarchy (10 points)


a) Exercise 7.32 in PH2.
b) Exercises 7.33 and 7.34 in PH2. (These questions each ask you to draw a diagram; integrate the two diagrams into one big picture that provides the answers to both problems. You will be describing the complete memory system, including cache and virtual memory.)
c) Exercise 8.22 in PH2.

PROBLEM 5 - Cache (50 points)


In C/C++/Java, write a program that simulates a cache memory according to the specifications of the items below. Refer to Exercise 7.39 in PH2 for a starting point. When we simulate a cache, the value of each data access does not matter; we only need to be concerned with the order in which addresses are referenced (for our purposes in this problem, it doesn't even matter whether each access is a read or a write). Your cache simulator will work with a memory trace file, that is, a file that contains the addresses some program has accessed, in the order in which those addresses were referenced. You will use the following sequence of addresses (in base 10) for your trace:

10 25 32 1 2 3 4 5 6 7 8 9 10 10 25 32 49 49 20 4 43 76 75 25 5 89 76 75 74 34 23 65 66 67 68 69 10 25 32 55 43 23 76 33 25 34 4 5 6 7 10 65 66 67 68 69 10 25 88 76 4 5 6 7 8 9 10 66 67 67 69 89 25 49 49 10 65 10 66 10 67 50 51 52 52 53 54 52 53 54 55 56 57 58 59 65 66 67 68 69 10 25 70 71 72 73 74 75 76 10 25 30 31 32 33 34 35 35 10 25 4 5 6 7 8 9 10 10 65 10 66 10 67 50 51 52 52 53 54 52 53 54 55 56 10 25 30 31 32 33 34 35 36 37 10 25 50 51 52 53 54 55 56 10 25 80 81 82 83 84 85 86 87 88 89 10 25 76 43 32 10 65 66 67 68 69 10 25 30 31 32 33 34 35 36

Assume that the cache is initially empty, that memory is byte-addressable, and that addresses are 8 bits long. For each of the items below, you must submit your source code and also a printout indicating the number of cache hits, the number of cache misses, and the state of the cache after the trace has been processed (that is, you must show which addresses are stored in each location in the cache at the end of the trace).

a) Simulate a direct-mapped cache that has 32 lines, where each line can hold one byte only.
b) Simulate a direct-mapped cache that has 8 lines, where each line holds two blocks of 4 bytes each.
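For reference, a minimal skeleton for part a) might look like the sketch below. The structure and names are only a suggestion, the trace array is abbreviated, and reading the trace from an actual file is left out.

    /* Minimal sketch of a direct-mapped cache simulator with 32 lines of
     * one byte each.  Names are placeholders; the trace is abbreviated. */
    #include <stdio.h>

    #define NUM_LINES 32

    int main(void) {
        int trace[] = { 10, 25, 32, 1, 2, 3 };     /* abbreviated: use the full trace */
        int n = sizeof(trace) / sizeof(trace[0]);

        int valid[NUM_LINES] = { 0 };              /* cache starts empty            */
        int tag[NUM_LINES];                        /* tag = address / NUM_LINES     */
        int hits = 0, misses = 0;

        for (int i = 0; i < n; i++) {
            int addr = trace[i];
            int line = addr % NUM_LINES;           /* index bits of the address     */
            int t    = addr / NUM_LINES;           /* remaining bits form the tag   */
            if (valid[line] && tag[line] == t) {
                hits++;
            } else {
                misses++;
                valid[line] = 1;
                tag[line] = t;
            }
        }

        printf("hits=%d misses=%d\n", hits, misses);
        for (int line = 0; line < NUM_LINES; line++)   /* final cache contents      */
            if (valid[line])
                printf("line %2d: address %d\n", line, tag[line] * NUM_LINES + line);
        return 0;
    }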

Extra-Credit (15 points)

c) Simulate a fully-associative cache that has 32 lines, where each line can hold one byte only. Use the second-chance algorithm, which approximates LRU replacement.

The second-chance algorithm is actually a FIFO replacement algorithm with a small modification that causes it to approximate LRU. FIFO alone means first-in-first-out, that is, the cache entry that was loaded first will be the first one to be replaced. To implement a FIFO scheme you need to keep track of the order in which entries were loaded into the cache. If you've taken CS15 or if you know the data structures called "lists", they are the obvious tool for implementing a FIFO scheme (actually, a circular list is the way to go). If you don't know lists, you can still work on this problem using a little hack:

- Reserve a global integer variable called RefCount that initially has value 0.
- Associate an integer value called "stamp" with every cache entry. When an entry is loaded into the cache, set its "stamp" to RefCount, and then do RefCount++.

If you look through all entries, you can spot the one with the minimum stamp; that is the one that was loaded first, that is, the first one to be replaced according to the FIFO scheme. The flaw in this is that eventually, for many more references than you will see in this homework problem, RefCount would overflow, but we don't need to worry about that. Searching for the minimum every time you look for a replacement candidate will make your simulator a bit inefficient, but the hack will still work.

You can now implement second-chance on top of this FIFO scheme. Assume that every cache entry has an associated R bit, which indicates whether the entry has been referenced recently. When an entry is first loaded into the cache, we set its R=1. Whenever an entry is referenced, even if no replacement search is going on, we set its R=1. When an entry is selected for replacement according to the FIFO order, we check its R bit. (Note that this entry will be the one in the cache with the lowest "stamp" value.) If R=1, the entry was referenced recently, so we clear its R bit (giving it the promised second chance), move it to the end of the FIFO, and look for another candidate to replace (the one now at the head of the FIFO). If you're using the hack above, moving an entry to the end is equivalent to changing its "stamp" to RefCount and doing RefCount++. Also, searching for the next candidate in line for replacement according to our FIFO hack means looking for the entry for which "stamp" = "previous stamp" + 1. If that one also has R=1, we move on to the next, and the next, and the next... until we find an entry with R=0. That is the one to replace. If we happen to search through the entire cache and find R=1 for all entries, we go back and replace the oldest entry in the cache.
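For illustration only, the RefCount/stamp hack and the second-chance search described above might look roughly like this in C. The names, layout, and abbreviated trace are placeholders, not a required design, and printing the final cache contents is left out.

    /* Sketch of the "RefCount / stamp" second-chance replacement described
     * above, for a fully associative cache of 32 one-byte lines. */
    #include <stdio.h>

    #define NUM_LINES 32

    struct line {
        int valid;
        int addr;    /* one byte per line, so the stored tag is just the address */
        int stamp;   /* FIFO order: lower stamp = loaded earlier                  */
        int r;       /* "referenced recently" bit                                 */
    };

    int main(void) {
        int trace[] = { 10, 25, 32, 1, 2, 3 };     /* abbreviated: use the full trace */
        int n = sizeof(trace) / sizeof(trace[0]);

        struct line cache[NUM_LINES] = {{0}};
        int ref_count = 0, hits = 0, misses = 0;

        for (int i = 0; i < n; i++) {
            int addr = trace[i];
            int hit = -1, empty = -1;

            for (int j = 0; j < NUM_LINES; j++) {
                if (cache[j].valid && cache[j].addr == addr) hit = j;
                if (!cache[j].valid && empty == -1) empty = j;
            }

            if (hit >= 0) {                        /* hit: just set the R bit       */
                hits++;
                cache[hit].r = 1;
                continue;
            }

            misses++;
            int victim = empty;                    /* fill an empty line if any     */
            while (victim == -1) {                 /* otherwise: second-chance scan */
                int oldest = -1;
                for (int j = 0; j < NUM_LINES; j++)    /* FIFO head = minimum stamp */
                    if (oldest == -1 || cache[j].stamp < cache[oldest].stamp)
                        oldest = j;
                if (cache[oldest].r) {             /* give it its second chance     */
                    cache[oldest].r = 0;
                    cache[oldest].stamp = ref_count++;
                } else {
                    victim = oldest;
                }
            }
            cache[victim].valid = 1;
            cache[victim].addr  = addr;
            cache[victim].stamp = ref_count++;
            cache[victim].r     = 1;
        }

        printf("hits=%d misses=%d\n", hits, misses);
        return 0;
    }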

PROBLEM 6 - Disk (10 points) (Source: Anna Poplawski, CS37-X01)


Consider a disk unit where each sector is 512 bytes. Consider also that all the data we'll be accessing on this disk lies in the same band, where each track has 190 sectors (it's not often reasonable to assume that all data accessed is in the same band, but this will make things easier for this problem). The average seek time within this band is 4ms and the seek time between adjacent cylinders is 1.15ms. The average rotational delay is half a disk rotation. The transfer rate can be computed from the rotation speed, which is 7200 RPM. The time it takes to switch heads (between any two tracks in the same cylinder) is 0.85ms, and there are 16 surfaces.

a) Say we run a program which reads 100 different files of 512KB each. Each individual file is stored contiguously on the disk, but since our program doesn't know the disk addresses of the files, we can't do anything smart concerning the order in which the files are read. What's the total time the disk is busy during this program?
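Whatever accounting you choose for parts a) and b), a few basic quantities recur. This small sketch only derives them from the numbers given above; it is not the answer to either part.

    /* Sketch only: basic per-rotation / per-sector quantities implied by
     * the problem statement. */
    #include <stdio.h>

    int main(void) {
        double rpm          = 7200.0;
        double rotation_ms  = 60.0 * 1000.0 / rpm;    /* one full rotation           */
        double avg_rot_ms   = rotation_ms / 2.0;      /* average rotational delay    */
        double sector_ms    = rotation_ms / 190.0;    /* one sector passing the head */
        double track_kbytes = 190.0 * 512.0 / 1024.0; /* data stored per track       */

        printf("rotation time        = %.3f ms\n", rotation_ms);
        printf("avg rotational delay = %.3f ms\n", avg_rot_ms);
        printf("time per sector      = %.4f ms\n", sector_ms);
        printf("data per track       = %.1f KB\n", track_kbytes);
        return 0;
    }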

b) It is fair to assume that if you read a sector on this disk, you're likely to read the next one as part of the same read operation. "Skew" is defined to be the number of sectors between the last sector of one track and the first sector of the next, as illustrated in the accompanying diagrams (N = sectors per track, s = skew). Disks often have different amounts of skew: one value used when switching tracks within the same cylinder and another when switching cylinders. Compute the optimal values of skew for a cylinder switch and for a track switch.
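One hedged way to set up this part (an assumption about the intent, not a given): the skew has to be large enough that, by the time the switch finishes, the first sector of the next track has not yet passed under the head. With t_switch the head- or cylinder-switch time and t_sector the time for one sector to pass the head (both derivable from the figures in the problem statement), that suggests choosing the smallest integer s with

    s * t_sector >= t_switch

using the appropriate t_switch for a track switch versus a cylinder switch.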
