Outline Cache writes DRAM configurations Performance Associative caches Multi-level caches.

Presentation on theme: "Outline Cache writes DRAM configurations Performance Associative caches Multi-level caches."— Presentation transcript:

1 Outline Cache writes DRAM configurations Performance Associative caches Multi-level caches

2 Direct-mapped Cache, block size = 4 words, word size = 4 bytes. Address fields: Tag | Index | Block Offset | Byte Offset. Reference Stream (Hit/Miss): 0b01001000, 0b00010100, 0b00111000, 0b00010000. [diagram: 4-entry cache with Valid, Tag, and Data columns, indices 00-11]

8 Initial cache contents: M[64-79] (tag 01) at index 00, M[208-223] (tag 11) at index 01, M[32-47] (tag 00), and one entry Not Valid. Reference Stream (Hit/Miss): 0b01001000, 0b00010100, 0b00111000, 0b00010000.

16 Final state. Reference Stream results: 0b01001000 H, 0b00010100 M, 0b00111000 M, 0b00010000 H. The cache now holds M[64-79], M[16-31], M[32-47], and M[48-63], all valid.
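The walkthrough above can be replayed with a short simulator sketch. This is illustrative code, not from the slides; the function and variable names are my own, and the initial state is seeded to match the valid entries shown on the slide.

```python
# Hypothetical sketch: replay the reference stream through a direct-mapped
# cache with 4 sets and 16-byte (4-word) blocks, seeded with the initial
# contents shown on the slide.
BLOCK_BYTES = 16          # 4 words x 4 bytes
NUM_SETS = 4

def simulate(refs, cache):
    """cache maps index -> tag for valid entries; returns 'H'/'M' per reference."""
    results = []
    for addr in refs:
        block = addr // BLOCK_BYTES
        index = block % NUM_SETS
        tag = block // NUM_SETS
        if cache.get(index) == tag:
            results.append('H')
        else:
            results.append('M')
            cache[index] = tag    # fill the set on a miss
    return results

# Initial state: M[64-79] (tag 1) at index 0, M[208-223] (tag 3) at index 1;
# the remaining entries are treated as not valid.
initial = {0: 1, 1: 3}
refs = [0b01001000, 0b00010100, 0b00111000, 0b00010000]
print(simulate(refs, initial))   # ['H', 'M', 'M', 'H'], matching the slides
```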

17 Cache Writes There are multiple copies of the data lying around – L1 cache, L2 cache, DRAM. Do we write to all of them? Do we wait for the write to complete before the processor can proceed?

20 Do we write to all of them? Write-through – write to all levels of the hierarchy. Write-back – write to the lower level only when the cache line gets evicted from the cache. –Write-back creates inconsistent data – different values for the same item in the cache and in DRAM. –Inconsistent data in the highest cache level is referred to as dirty. –If all the copies match, they are clean. –The old data in the lower level is stale.

21 Write-Through [diagram: CPU → L1 → L2 Cache → DRAM] sw $3, 0($5) – the store is written to every level.

22 Write-Back [diagram: CPU → L1 → L2 Cache → DRAM] sw $3, 0($5) – the store is written to L1 only; lower levels are updated on eviction.

26 Write-through vs Write-back Which performs the write faster? –Write-back – it only writes the L1 cache. Which has faster evictions from a cache? –Write-through – no write involved, just overwrite the tag. Which causes more bus traffic? –Write-through: DRAM is written on every store. Write-back only writes on eviction.

28 Does processor wait for write? Write buffer - intermediate queue for pending writes –Any loads must check write buffer in parallel with cache access. –Buffer values are more recent than cache values.
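The write-buffer behavior described above can be sketched in a few lines. This is an illustrative model, not from the slides; the class and method names are invented for the example.

```python
# Hypothetical sketch of a write buffer: stores are queued instead of
# stalling the CPU, and loads check the buffer in parallel with the cache,
# because buffered values are more recent than cached/memory values.
from collections import deque

class WriteBuffer:
    def __init__(self):
        self.pending = deque()               # (address, value) writes not yet retired

    def store(self, addr, value):
        self.pending.append((addr, value))   # CPU proceeds without waiting

    def load_check(self, addr):
        """Return the newest buffered value for addr, or None if absent."""
        for a, v in reversed(self.pending):
            if a == addr:
                return v
        return None

    def drain_one(self, memory):
        if self.pending:
            a, v = self.pending.popleft()
            memory[a] = v                    # write retires to the next level

memory = {0x100: 7}
wb = WriteBuffer()
wb.store(0x100, 42)
# A load must see 42 (from the buffer) even though memory still holds 7.
value = wb.load_check(0x100)
value = value if value is not None else memory[0x100]
print(value)             # 42
wb.drain_one(memory)
print(memory[0x100])     # 42
```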

29 Outline Cache writes DRAM configurations Performance Associative caches

31 Challenge DRAM is designed for density, not speed DRAM is slower than the bus We are allowed to change the width, the number of DRAMs, and the bus protocol, but the access latency stays slow. Widening anything increases the cost by quite a bit.

33 Narrow Configuration [diagram: CPU – Cache – Bus – DRAM] Given: –1 clock cycle request –15 cycles / word DRAM latency –1 cycle / word bus latency If a cache block is 8 words, what is the miss penalty of an L2 cache miss? 1 cycle + (15 cycles/word × 8 words) + (1 cycle/word × 8 words) = 1 + 120 + 8 = 129 cycles

35 Wide Configuration [diagram: CPU – Cache – Bus – DRAM, two words wide] Given: –1 clock cycle request –15 cycles / 2 words DRAM latency –1 cycle / 2 words bus latency If a cache block is 8 words, what is the miss penalty of an L2 cache miss? 1 cycle + (15 cycles/2 words × 8 words) + (1 cycle/2 words × 8 words) = 1 + 60 + 4 = 65 cycles

37 Interleaved Configuration [diagram: CPU – Cache – Bus – two DRAM banks] Given: –1 clock cycle request –15 cycles / word DRAM latency –1 cycle / word bus latency If a cache block is 8 words, what is the miss penalty of an L2 cache miss? The two banks overlap their accesses: 1 cycle + (15 cycles/2 words × 8 words) + (1 cycle/word × 8 words) = 1 + 60 + 8 = 69 cycles
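The three miss-penalty calculations can be reproduced with one small sketch. The function name and parameters are illustrative (not from the slides); the latencies are the slides' given values.

```python
# Sketch of the miss-penalty model used above: one request cycle, then DRAM
# transfers, then bus transfers. Widths are in words moved per latency unit.
def miss_penalty(block_words, request=1, dram_latency=15, dram_width=1,
                 bus_latency=1, bus_width=1):
    """Cycles to fetch one cache block from DRAM."""
    dram = dram_latency * block_words // dram_width
    bus = bus_latency * block_words // bus_width
    return request + dram + bus

narrow = miss_penalty(8)                                   # 1 + 120 + 8
wide = miss_penalty(8, dram_width=2, bus_width=2)          # 1 + 60 + 4
# Two interleaved banks overlap DRAM accesses (halving effective DRAM
# latency per word) but still share a one-word-wide bus.
interleaved = miss_penalty(8, dram_width=2, bus_width=1)   # 1 + 60 + 8
print(narrow, wide, interleaved)                           # 129 65 69
```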

38 Recent DRAM trends Fewer, bigger DRAMs New bus protocols (RAMBUS) Small DRAM caches (page mode) SDRAM (synchronous DRAM) –one request & burst length nets several consecutive responses.

39 Outline Cache writes DRAM configurations Performance Associative caches

40 Performance Execution Time = (CPU cycles + Memory-stall cycles) × clock cycle time
Memory-stall cycles
= accesses/program × misses/access × cycles/miss
= memory accesses/program × miss rate × miss penalty
= instructions/program × misses/instruction × cycles/miss
= instructions/program × misses/instruction × miss penalty

41 Example 1 instruction cache miss rate: 2% data cache miss rate: 3% miss penalty: 50 cycles ld/st instructions are 25% of instructions CPI with perfect cache is 2.3 How much faster is the computer with a perfect cache?

47 Example 1
misses/instr = I accesses/instr × I miss rate + D accesses/instr × D miss rate = 1 × .02 + .25 × .03 = .02 + .0075 = .0275
Memory cycles = I × .0275 × 50 = 1.375 I
ExecT = (CPU CPI × I + Memory cycles) × Clk = (2.3 I + 1.375 I) × C = 3.675 I C
speedup = 3.675 I C / 2.3 I C = 1.6
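Example 1 reduces to a few lines of arithmetic; this sketch uses the slide's numbers with illustrative variable names.

```python
# Example 1 arithmetic (rates and penalty from the slide).
imr, dmr = 0.02, 0.03          # instruction / data cache miss rates
ldst_frac = 0.25               # data accesses per instruction (25% ld/st)
miss_penalty = 50
base_cpi = 2.3                 # CPI with a perfect cache

misses_per_instr = 1 * imr + ldst_frac * dmr               # 0.0275
stall_cycles_per_instr = misses_per_instr * miss_penalty   # 1.375
real_cpi = base_cpi + stall_cycles_per_instr               # 3.675
speedup = real_cpi / base_cpi                              # perfect-cache speedup
print(round(misses_per_instr, 4), round(real_cpi, 3), round(speedup, 1))
# 0.0275 3.675 1.6
```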

52 Example 2 Double the clock rate from Example 1. What is the ideal speedup when taking into account the memory system? How long is the miss penalty now? 100 cycles – memory is no faster, so the same penalty now spans twice as many (half-length) cycles.
Memory cycles = I × .0275 × 100 = 2.75 I
Exec = (2.3 I + 2.75 I) × clk = 5.05 I (C/2)
speedup = old/new = 3.675 I C / (5.05 I C/2) = 3.675 / 2.525 = 1.5
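The same calculation in code makes the "less than 2×" conclusion easy to check; a sketch with illustrative names, using the slide's numbers:

```python
# Example 2: doubling the clock doubles the miss penalty in cycles
# (memory speed is unchanged), so the net speedup is well under 2x.
base_cpi = 2.3
misses_per_instr = 0.0275
old_cpi = base_cpi + misses_per_instr * 50    # 3.675, at clock period C
new_cpi = base_cpi + misses_per_instr * 100   # 5.05, at clock period C/2
speedup = (old_cpi * 1.0) / (new_cpi * 0.5)   # time_old / time_new
print(round(new_cpi, 2), round(speedup, 2))   # 5.05 1.46
```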

53 Outline Cache writes DRAM configurations Performance Associative caches

54 Direct-mapped Cache, block size = 2 words, word size = 4 bytes. Address fields: Tag | Index | Block Offset | Byte Offset. Reference Stream (Hit/Miss): 0b00111000, 0b00011100, 0b00111000, 0b00011000. Initial contents: M[160-167] (tag 101) at index 00, M[72-79] (tag 010) at index 01, M[16-23] (tag 000) at index 10, and index 11 Not Valid.

61 Final state. Reference Stream results: 0b00111000 M, 0b00011100 M, 0b00111000 M, 0b00011000 M – every reference misses. The cache ends holding M[160-167], M[72-79], M[16-23], and M[56-63] (tag 001) at index 11.
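Replaying this stream in a sketch shows why every reference misses: all four addresses map to the same set. Illustrative code, seeded with the slide's valid entries:

```python
# Hypothetical sketch: direct-mapped cache with 4 sets and 8-byte
# (2-word) blocks. All four references collide in index 3.
BLOCK_BYTES = 8
NUM_SETS = 4

def simulate(refs, cache):
    results = []
    for addr in refs:
        block = addr // BLOCK_BYTES
        index, tag = block % NUM_SETS, block // NUM_SETS
        if cache.get(index) == tag:
            results.append('H')
        else:
            results.append('M')
            cache[index] = tag     # each miss evicts the previous block
    return results

# Initial state from the slide: tags 101, 010, 000 at indices 0-2.
initial = {0: 0b101, 1: 0b010, 2: 0b000}
refs = [0b00111000, 0b00011100, 0b00111000, 0b00011000]
# The stream alternates between tags 1 and 0 at index 3, so each
# reference evicts the other: four conflict misses.
print(simulate(refs, initial))   # ['M', 'M', 'M', 'M']
```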

62 Problem Conflicting addresses cause high miss rates

63 Solution Relax the direct-mapping Allow each address to be mapped into 2 or 4 locations (a set)

66 Cache Configurations [diagram] Direct-Mapped – each address maps to exactly one block (indices 00-11). 2-way Associative – each set has two blocks. Fully Associative – all addresses map to the same set. Each Valid/Tag/Data entry is a Block; each row is a Set.

67 2-way Set Associative Cache, block size = 2 words, word size = 4 bytes. Address fields: Tag | Index | Block Offset | Byte Offset. Reference Stream (Hit/Miss): 0b00111000, 0b00011100, 0b00111000, 0b00011000. Initial contents (all valid): set 0 holds tags 1001 and 0000; set 1 holds tags 0010 and 0001.

75 Final state. Reference Stream results: 0b00111000 M, 0b00011100 H, 0b00111000 H, 0b00011000 H – only the first reference misses. Set 1 now holds tags 0011 and 0001; associativity removed the conflict misses.
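The same stream through a 2-way set-associative sketch confirms the M, H, H, H result. Illustrative code; it assumes the slide's starting contents, with the way holding tag 0010 treated as least recently used (that is the way the slide replaces).

```python
# Hypothetical sketch: 2 sets x 2 ways, 8-byte (2-word) blocks, LRU
# replacement. Each set is a list of tags ordered least- to most-recent.
BLOCK_BYTES = 8
NUM_SETS = 2

def simulate(refs, sets):
    results = []
    for addr in refs:
        block = addr // BLOCK_BYTES
        index, tag = block % NUM_SETS, block // NUM_SETS
        ways = sets[index]
        if tag in ways:
            results.append('H')
            ways.remove(tag)          # re-append below to mark most recent
        else:
            results.append('M')
            if len(ways) == 2:        # set full: evict the LRU way
                ways.pop(0)
        ways.append(tag)
    return results

# Initial state from the slide (assumed LRU order: first element oldest).
initial = {0: [0b1001, 0b0000], 1: [0b0010, 0b0001]}
refs = [0b00111000, 0b00011100, 0b00111000, 0b00011000]
print(simulate(refs, initial))   # ['M', 'H', 'H', 'H']
```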

76 Implementation [diagram] Byte Address 0x100100100 is split into Tag | Index | Block Offset | Byte Offset. The Index selects a set; each way's Tag is compared (=) with the address tag and qualified by its Valid bit; the comparison results produce Hit? and drive a MUX that selects the matching way's Data, while a second MUX uses the Block offset to pick the requested word.

80 Performance Implications Increasing associativity increases hit rate Increasing associativity increases access time Increasing associativity has no effect on miss penalty

81 Example: direct-mapped vs 2-way associative. Reference Stream (Hit/Miss): 0b1001000 M, 0b0011100, 0b1001000, 0b0111000. Miss Rate: ___ [diagram: Direct-Mapped Cache, indices 0-1, Valid/Tag/Data, all entries invalid] Address fields: Tag | Index | Block Offset | Byte Offset.

85 [diagram: cache state partway through the stream, with tags 100 and 001 loaded] Direct-Mapped Cache, Example 2-way associative. Reference Stream: 0b1001000, 0b0011100, 0b1001000, 0b0111000. Tag | Index | Block Offset | Byte Offset.

90 Which block to replace? 0b1001000 – it entered the cache first. –FIFO – First In, First Out 0b0011100 – it has gone longer since it was last used. –LRU – Least Recently Used Random – pick a victim at random

91 Replacement Algorithms LRU & FIFO are conceptually simple, but implementation is difficult at high associativity. LRU & FIFO must be approximated with high associativity. Random is sometimes better than approximated LRU/FIFO. Tradeoff between accuracy and implementation cost.
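The three policies named above can be sketched for a single set; this is illustrative code (class and method names are my own), showing how LRU and FIFO share the same eviction mechanism and differ only in whether a hit refreshes recency.

```python
# Hypothetical sketch of victim selection within one cache set.
import random
from collections import OrderedDict

class CacheSet:
    def __init__(self, ways, policy='lru'):
        self.ways, self.policy = ways, policy
        self.blocks = OrderedDict()           # tag -> data, oldest first

    def access(self, tag):
        if tag in self.blocks:
            if self.policy == 'lru':          # a hit refreshes recency (LRU only)
                self.blocks.move_to_end(tag)
            return 'H'
        if len(self.blocks) == self.ways:     # set full: choose a victim
            if self.policy == 'random':
                victim = random.choice(list(self.blocks))
            else:                             # LRU and FIFO both evict the
                victim = next(iter(self.blocks))  # oldest entry in the order
            del self.blocks[victim]
        self.blocks[tag] = None
        return 'M'

lru = CacheSet(ways=2, policy='lru')
print([lru.access(t) for t in [1, 2, 1, 3, 2]])   # ['M', 'M', 'H', 'M', 'M']
fifo = CacheSet(ways=2, policy='fifo')
print([fifo.access(t) for t in [1, 2, 1, 3, 2]])  # ['M', 'M', 'H', 'M', 'H']
```

On this tiny trace, FIFO happens to beat LRU, which echoes the slide's point that the policies trade off differently and that approximations (or even Random) can be competitive.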

92 [diagram: L1 → L2 Cache → DRAM Memory] From the L1 cache's perspective: L1's miss penalty contains the access of L2, and possibly the access of DRAM!

93 Multi-level Caches Base CPI 1.0, 500 MHz clock. Main memory – 100 cycles; L2 – 10 cycles. L1 miss rate per instruction – 5%; with L2, 2% of instructions go to DRAM. What is the speedup with the L2 cache? There is a typo in the book for this example!

99 Multi-level Caches CPI = 1 + memory stalls/instruction
CPI old = 1 + 5% miss/instr × 100 cycles/miss = 1 + 5 = 6 cycles/instr
CPI new = 1 + L2% × L2 penalty + Mem% × Mem penalty = 1 + 5% × 10 + 2% × 100 = 3.5
(equivalently: 1 + (5-2)% × 10 + 2% × (10+100) = 3.5)
Speedup = 6 / 3.5 = 1.7
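The multi-level example, including the equivalent bookkeeping, checks out in a few lines; a sketch with illustrative variable names, using the slide's rates and latencies:

```python
# Multi-level cache CPI calculation (numbers from the slide).
base_cpi = 1.0
l1_miss_per_instr = 0.05      # 5% of instructions miss in L1
dram_per_instr = 0.02         # 2% of instructions go all the way to DRAM
l2_penalty, mem_penalty = 10, 100

cpi_old = base_cpi + l1_miss_per_instr * mem_penalty      # no L2: 6.0
cpi_new = base_cpi + l1_miss_per_instr * l2_penalty \
          + dram_per_instr * mem_penalty                  # with L2: 3.5
# Equivalent bookkeeping: misses caught by L2 pay 10; the rest pay 10 + 100.
cpi_alt = base_cpi + (l1_miss_per_instr - dram_per_instr) * l2_penalty \
          + dram_per_instr * (l2_penalty + mem_penalty)
speedup = cpi_old / cpi_new
print(cpi_old, cpi_new, round(cpi_alt, 2), round(speedup, 1))
# 6.0 3.5 3.5 1.7
```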

100 DO GROUPWORK NOW

103 Summary Direct-mapped –simple –fast access time –marginal hit rate Variable block size –still simple –fast access time –higher hit rate by exploiting spatial locality

106 Summary Associative caches –increase the access time –increase the hit rate –associativity above 8 has little to no gain Multi-level caches –increase the worst-case miss penalty (because you waste time accessing another cache) –reduce the average miss penalty (because so many misses are caught and handled quickly)

