Sol

Part 1.
Memory analysis
Question 1 (40 pt). You are given designs of 3 caches for a 16-bit address
machine:
D1:
Direct-mapped cache.
Each cache line is 1 byte.
10-bit index, 6-bit tag.
1 cycle hit time.
D2:
2-way set associative cache.
Each cache line is 1 word (4 bytes).
7-bit index, 7-bit tag.
2 cycle hit time.
D3:
Fully associative cache with 256 cache lines.
Each cache line is 1 word.
14-bit tag.
5 cycle hit time.
Answer the following set of questions:
a) What is the size of each cache?
D1 (direct mapped): 2^index * line size = 2^10 * 1 B = 1 KB.
D2 (2-way set associative): 2^index * line size * ways = 2^7 * 4 B * 2 ways = 1 KB.
D3 (fully associative): number of cache lines * line size = 256 * 4 B = 1 KB.
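The size arithmetic above can be checked with a minimal sketch. The helper name cache_size_bytes is hypothetical (not from the assignment); it assumes data capacity is sets * ways * line size and ignores tag and valid bits.

```python
def cache_size_bytes(num_sets: int, ways: int, line_bytes: int) -> int:
    """Data capacity of a cache in bytes (tags and valid bits not counted)."""
    return num_sets * ways * line_bytes


# D1: direct mapped, 10-bit index, 1-byte lines
d1 = cache_size_bytes(num_sets=2**10, ways=1, line_bytes=1)
# D2: 2-way set associative, 7-bit index, 4-byte lines
d2 = cache_size_bytes(num_sets=2**7, ways=2, line_bytes=4)
# D3: fully associative, modeled as a single set holding all 256 lines
d3 = cache_size_bytes(num_sets=1, ways=256, line_bytes=4)

print(d1, d2, d3)  # 1024 1024 1024
```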
b) How much space does each cache need to store tags?
D1 (direct mapped): 1024 lines * 6-bit tag = 6144 bits (6 Kb).
D2 (2-way set associative): 256 lines * 7-bit tag = 1792 bits.
D3 (fully associative): 256 lines * 14-bit tag = 3584 bits.
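As a quick check of the tag storage, here is a small sketch. The function name tag_storage_bits is an assumption for illustration; it counts one tag per cache line and nothing else (no valid or dirty bits).

```python
def tag_storage_bits(num_lines: int, tag_bits: int) -> int:
    """Total tag storage in bits, assuming exactly one tag per cache line."""
    return num_lines * tag_bits


d1_tags = tag_storage_bits(num_lines=2**10, tag_bits=6)      # direct mapped
d2_tags = tag_storage_bits(num_lines=2**7 * 2, tag_bits=7)   # 128 sets * 2 ways
d3_tags = tag_storage_bits(num_lines=256, tag_bits=14)       # fully associative

print(d1_tags, d2_tags, d3_tags)  # 6144 1792 3584
```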
c) Which cache design has the most conflict misses? Which has the least?
The direct-mapped cache (D1) likely has the most conflict misses, because each address can be
placed in exactly one cache line. The fully associative cache (D3) has the fewest: a block can
be placed in any line, so it can never have conflict misses.
d) You are told that the hit rates for the 3 caches are 50%, 70% and 90%, but not
which hit rate corresponds to which cache. Which cache would you guess
corresponds to which hit rate? Why?
All three caches are the same size, and, as argued in the previous answer, the direct-mapped
cache has the most conflict misses while the fully associative cache has the fewest. So the
direct-mapped cache gets 50%, the 2-way set associative cache 70%, and the fully associative
cache 90%.
e) Assuming the miss time for each is 20 cycles, what is the average service time for
each? (Service Time = (hit rate) *(hit time) + (miss rate) *(miss time)).
We are given the hit rates, and miss rate = 1 - hit rate. The miss time is 20 cycles for each
cache, and the hit times are 1, 2 and 5 cycles for the direct-mapped, 2-way set associative
and fully associative caches respectively.
Direct mapped: 0.5 * 1 + 0.5 * 20 = 10.5 cycles.
2-way set associative: 0.7 * 2 + 0.3 * 20 = 7.4 cycles.
Fully associative: 0.9 * 5 + 0.1 * 20 = 6.5 cycles.
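The service-time formula from the question can be evaluated with a short sketch. The function name service_time and the dictionary layout are assumptions for illustration; the hit rates are the ones guessed in part d).

```python
def service_time(hit_rate: float, hit_time: float, miss_time: float = 20.0) -> float:
    """Service time = (hit rate) * (hit time) + (miss rate) * (miss time)."""
    miss_rate = 1.0 - hit_rate
    return hit_rate * hit_time + miss_rate * miss_time


# (hit rate, hit time in cycles) per design, using the assignment from part d)
designs = {"D1": (0.5, 1), "D2": (0.7, 2), "D3": (0.9, 5)}

for name, (hit_rate, hit_time) in designs.items():
    print(f"{name}: {service_time(hit_rate, hit_time):.1f} cycles")
# D1: 10.5 cycles
# D2: 7.4 cycles
# D3: 6.5 cycles
```

Under these numbers the slowest cache on a hit (D3) still has the lowest average service time, because its higher hit rate avoids more 20-cycle misses.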
Question 2 (30 pt). Assume we have a computer where the CPI is 1.0 when all
memory accesses (including data and instruction accesses) hit in the cache. The cache is
a unified (data + instruction) cache of size 256 KB, 4-way set associative, with a block
size of 64 bytes. The data accesses (loads and stores) constitute 50% of the instructions.
The unified cache has a miss penalty of 25 clock cycles and a miss rate of 2%. Assume
32-bit instruction and data addresses. Now, answer the following questions:
a) What is the tag size for the cache?
Number of bits used for block offset = log2(64) = 6
Number of sets in the cache = 256 KB / (64 B * 4 ways) = 1K = 1024
Number of bits for index = log2(1K) = 10
Number of bits for tag = 32 - (10 + 6) = 16 bits
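The address breakdown can be reproduced with a small sketch. The helper name address_fields is hypothetical, and the code assumes that the cache size, block size and associativity are all powers of two, as they are here.

```python
def address_fields(cache_bytes: int, block_bytes: int, ways: int, addr_bits: int):
    """Return (offset_bits, index_bits, tag_bits) for a set-associative cache.

    Assumes block_bytes and the resulting number of sets are powers of two.
    """
    num_sets = cache_bytes // (block_bytes * ways)
    offset_bits = block_bytes.bit_length() - 1  # log2 of a power of two
    index_bits = num_sets.bit_length() - 1
    tag_bits = addr_bits - index_bits - offset_bits
    return offset_bits, index_bits, tag_bits


# 256 KB unified cache, 64-byte blocks, 4-way, 32-bit addresses
print(address_fields(256 * 1024, 64, 4, 32))  # (6, 10, 16)
```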
b) How much faster would the computer be if all memory accesses were cache hits?
CPI = CPI_execution + StallCyclesPerInstruction
For a computer that always hits there are no stall cycles, i.e. CPI = 1.
Now compute StallCyclesPerInstruction for the computer with a non-zero miss rate:
StallCyclesPerInstruction = (memory accesses per instruction) * miss rate * miss penalty
Memory accesses per instruction = 1 + 0.5 = 1.5 (1 instruction access + 0.5 data accesses)
StallCyclesPerInstruction = 1.5 * 0.02 * 25 = 0.75
Therefore, CPI = 1 + 0.75 = 1.75.
Hence the computer with no cache misses is 1.75 / 1 = 1.75 times faster.
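The CPI calculation can be sketched as follows. The function name cpi_with_misses is an assumption; it models stalls as (memory accesses per instruction) * miss rate * miss penalty, with one instruction fetch per instruction plus the data-access fraction.

```python
def cpi_with_misses(base_cpi: float, data_access_frac: float,
                    miss_rate: float, miss_penalty: float) -> float:
    """CPI including memory stall cycles for a unified cache."""
    accesses_per_instr = 1.0 + data_access_frac  # 1 fetch + data accesses
    stall_cycles = accesses_per_instr * miss_rate * miss_penalty
    return base_cpi + stall_cycles


cpi = cpi_with_misses(base_cpi=1.0, data_access_frac=0.5,
                      miss_rate=0.02, miss_penalty=25)
speedup = cpi / 1.0  # relative to the all-hits machine with CPI = 1

print(f"CPI = {cpi:.2f}, speedup = {speedup:.2f}")  # CPI = 1.75, speedup = 1.75
```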
Part 2.
Handling Cache Miss

Question 3 (30 pt). You purchased a computer with the following features:
• 95% of all memory accesses are found in the cache.
• Each cache block is two words, and the whole block is read on any miss.
• The processor sends references to its cache at the rate of 10^9 words per second.
• 25% of those references are writes.
• Assume that the memory system can support 10^9 words per second, reads or writes.
• The bus reads or writes a single word at a time (the memory system cannot read or
write two words at once).
• Assume at any one time, 30% of the blocks in the cache have been modified.
• The cache uses write allocate on a write miss.
• You are considering adding a peripheral to the system, and you want to know how
much of the memory system bandwidth is already used.
Calculate the percentage of memory system bandwidth used on the average in the two
cases below. Be sure to state your assumptions.
a. Case 1: The cache is write through.
Bus bandwidth used per access type, with reasoning:
• Read hit: 0.
  A hit means the reference is found in the cache, so no bus bandwidth is used.
• Read miss: 10^9 * 0.05 * 0.75 * 2.
  Miss ratio = 1 - hit ratio = 1 - 0.95 = 0.05. Reads are 75% of the total number of
  references. Block size = 2 words.
• Write hit: 10^9 * 0.95 * 0.25 * 1.
  Because of the write-through policy we have to write to main memory on every hit, but
  only 1 word. Writes are 25% of the total number of references. Hit ratio = 0.95.
• Write miss: 10^9 * 0.05 * 0.25 * (2 + 1).
  On every write miss we have to load a block (2 words) into the cache because of the
  write-allocate policy, and write 1 word (the word from the CPU) to memory because of the
  write-through policy. Writes are 25% of the total number of references. Miss ratio = 0.05.
Total bandwidth used = BW used on read hit + BW used on read miss + BW used on write hit
+ BW used on write miss
= 0 + 10^9 * 0.05 * 0.75 * 2 + 10^9 * 0.95 * 0.25 * 1 + 10^9 * 0.05 * 0.25 * (2 + 1)
= 10^9 * (0.075 + 0.2375 + 0.0375)
= 10^9 * 0.35
Fraction of bandwidth used = total bandwidth used / total bandwidth
= (10^9 * 0.35) / 10^9 = 0.35, i.e. 35%.
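The write-through case can be checked term by term with a minimal sketch. The function name write_through_fraction is hypothetical. Because the processor issues references at the same rate the memory system can serve words (10^9 per second), the code works in bus words per reference, which equals the fraction of bandwidth used. Evaluating the three non-zero terms gives 0.075 + 0.2375 + 0.0375 = 0.35.

```python
def write_through_fraction(hit_rate: float = 0.95, write_frac: float = 0.25,
                           block_words: int = 2) -> float:
    """Bus words per CPU reference for a write-through, write-allocate cache."""
    miss_rate = 1.0 - hit_rate
    read_frac = 1.0 - write_frac
    read_hit = 0.0                                           # served by the cache
    read_miss = miss_rate * read_frac * block_words          # fetch the block
    write_hit = hit_rate * write_frac * 1                    # write 1 word through
    write_miss = miss_rate * write_frac * (block_words + 1)  # fetch block + write 1 word
    return read_hit + read_miss + write_hit + write_miss


print(f"{write_through_fraction():.2%}")  # 35.00%
```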
b. Case 2: The cache is write back.
Bus bandwidth used per access type, with reasoning:
• Read hit: 0.
  A hit means the reference is found in the cache, so no bus bandwidth is used.
• Read miss: 10^9 * 0.05 * 0.75 * [2 * 0.3 + 2].
  Miss ratio = 1 - hit ratio = 1 - 0.95 = 0.05. Reads are 75% of the total number of
  references. Block size = 2 words. The term 2 * 0.3 accounts for replacing a dirty block
  (0.3 is the probability that the replaced block is dirty): we write back the dirty block
  (2 words) and read the needed block (another 2 words).
• Write hit: 0.
  A write hit does not generate any traffic on the bus; it just marks the block in the
  cache as dirty.
• Write miss: 10^9 * 0.05 * 0.25 * [2 * 0.3 + 2].
  Writes are 25% of the total number of references. On a write miss we have to load a block
  (2 words) into the cache because of the write-allocate policy, and write a dirty block
  back to main memory with probability 0.3 (2 * 0.3).
Total bandwidth used = BW used on read hit + BW used on read miss + BW used on write hit
+ BW used on write miss
= 0 + 10^9 * 0.05 * 0.75 * [2 * 0.3 + 2] + 0 + 10^9 * 0.05 * 0.25 * [2 * 0.3 + 2]
= 10^9 * (0.0975 + 0.0325)
= 10^9 * 0.13
Fraction of bandwidth used = total bandwidth used / total bandwidth
= (10^9 * 0.13) / 10^9 = 0.13, i.e. 13%.
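The write-back case can be checked the same way. The function name write_back_fraction is hypothetical, and the sketch assumes, as the solution does, that a replaced block is dirty with probability 0.3 on both read and write misses. It again works in bus words per reference, which equals the fraction of bandwidth used because the reference rate and the memory bandwidth are both 10^9 words per second. The two non-zero terms evaluate to 0.0975 + 0.0325 = 0.13.

```python
def write_back_fraction(hit_rate: float = 0.95, write_frac: float = 0.25,
                        dirty_frac: float = 0.30, block_words: int = 2) -> float:
    """Bus words per CPU reference for a write-back, write-allocate cache."""
    miss_rate = 1.0 - hit_rate
    read_frac = 1.0 - write_frac
    # Every miss fetches a block and, with probability dirty_frac,
    # first writes the evicted dirty block back to memory.
    words_per_miss = block_words + dirty_frac * block_words
    read_miss = miss_rate * read_frac * words_per_miss
    write_miss = miss_rate * write_frac * words_per_miss
    return read_miss + write_miss  # hits generate no bus traffic


print(f"{write_back_fraction():.2%}")  # 13.00%
```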
