
1 ELEC Microprocessors and Interfacing
Lecture 32: Cache Memory II, May 2006, Saeid Nooshabadi. Some of the slides are adopted from David Patterson (UCB).

2 Outline Direct-Mapped Cache; Types of Cache Misses; A (long) detailed example; Peer-to-peer education example; Block Size Tradeoff

3 Review: Memory Hierarchy
[Figure: the memory hierarchy. Processor (control, datapath, registers), L1 SRAM cache, L2 SRAM cache, DRAM main memory, EEPROM/ROM, hard disk. Moving down the hierarchy: speed from fastest to slowest, size from smallest to biggest, cost from highest to lowest.]

4 Review: Direct-Mapped Cache
In a direct-mapped cache, each memory address is associated with one possible block within the cache Therefore, we only need to look in a single location in the cache for the data if it exists in the cache Block is the unit of transfer between cache and memory

5 Review: Direct-Mapped Cache 1 Word Block
[Figure: a 16-byte memory (addresses 0-F) mapped onto an 8-byte direct-mapped cache with two 4-byte blocks.] Block size = 4 bytes. Cache location 0 can be occupied by data from memory locations 0-3, 8-B, ...: in general, any group of 4 memory locations starting at address 8*n (n = 0, 1, 2, ...). Cache location 1 can be occupied by data from memory locations 4-7, C-F, ...: in general, any group of 4 memory locations starting at address 8*n + 4 (n = 0, 1, 2, ...).
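The mapping rule above can be sketched in a few lines of Python (a toy model, not from the slides; the names are illustrative):

```python
BLOCK_SIZE = 4   # bytes per block
NUM_BLOCKS = 2   # 8-byte cache / 4-byte blocks

def cache_index(addr):
    """Return which cache block a memory address maps to."""
    return (addr // BLOCK_SIZE) % NUM_BLOCKS

# Locations 0-3 and 8-B (addresses 8*n .. 8*n+3) land in cache block 0;
# locations 4-7 and C-F (addresses 8*n+4 ..) land in cache block 1.
print([cache_index(a) for a in (0x0, 0x3, 0x8, 0xB)])   # -> [0, 0, 0, 0]
print([cache_index(a) for a in (0x4, 0x7, 0xC, 0xF)])   # -> [1, 1, 1, 1]
```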

6 Direct-Mapped with 1 word Blocks Example
[Figure: a memory address divided into tag, block index, and byte offset fields.]

7 Review: Direct-Mapped Cache Terminology
All fields are read as unsigned integers. Index: specifies the cache index (which “row” or “line” of the cache we should look in) Offset: once we’ve found correct block, specifies which byte within the block we want Tag: the remaining bits after offset and index are determined; these are used to distinguish between all the memory addresses that map to the same location
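The field extraction can be sketched with a hypothetical helper (assuming a 32-bit address; the widths are parameters):

```python
def split_address(addr, offset_bits, index_bits):
    """Decompose a 32-bit address into unsigned (tag, index, offset) fields."""
    offset = addr & ((1 << offset_bits) - 1)                  # byte within block
    index = (addr >> offset_bits) & ((1 << index_bits) - 1)   # cache row
    tag = addr >> (offset_bits + index_bits)                  # remaining upper bits
    return tag, index, offset

# The 8-byte cache with 4-byte blocks from the earlier slide:
# 2 offset bits, 1 index bit, and the rest is tag.
print(split_address(0xC, offset_bits=2, index_bits=1))   # -> (1, 1, 0)
```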

8 Reading Material Steve Furber: ARM System-on-Chip Architecture, 2nd Ed., Addison-Wesley, 2000, Chapter 10.

9 Accessing Data in a Direct Mapped Cache (#1/3)
Ex.: 16 KB of data, direct-mapped, 4-word blocks. Read 4 addresses: 0x00000014, 0x0000001C, 0x00000034, 0x00008014. Only one cache/memory level of hierarchy.
Memory contents (address: value of word): 0x00000010-0x0000001C: a, b, c, d; ... 0x00000030-0x0000003C: e, f, g, h; ... 0x00008010-0x0000801C: i, j, k, l

10 Accessing Data in a Direct Mapped Cache (#2/3)
4 Addresses: 0x00000014, 0x0000001C, 0x00000034, 0x00008014. Addresses divided (for convenience) into Tag, Index, Byte Offset fields: tttttttttttttttttt iiiiiiiiii oooo. The tag checks whether we have the correct block, the index selects the block, and the byte offset selects the byte within the block.
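For this cache the field widths work out to 4 offset bits (16-byte blocks), 10 index bits (1024 blocks), and 18 tag bits. A quick check of the four reads (address values as reconstructed from the binary breakdowns on the later slides):

```python
def fields(addr):
    """tag/index/offset for a 16 KB direct-mapped cache with 16-byte blocks."""
    return addr >> 14, (addr >> 4) & 0x3FF, addr & 0xF

for addr in (0x00000014, 0x0000001C, 0x00000034, 0x00008014):
    tag, index, offset = fields(addr)
    print(f"{addr:#010x}: tag={tag} index={index} offset={offset:#x}")
# 0x00000014: tag=0 index=1 offset=0x4
# 0x0000001c: tag=0 index=1 offset=0xc
# 0x00000034: tag=0 index=3 offset=0x4
# 0x00008014: tag=2 index=1 offset=0x4
```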

11 Accessing Data in a Direct Mapped Cache (#3/3)
So let's go through accessing some data in this cache: 16 KB data, direct-mapped, 4-word blocks. We will see 3 types of events: cache miss: nothing in cache in appropriate block, so fetch from memory; cache hit: cache block is valid and contains proper address, so read desired word; cache miss, block replacement: wrong data is in cache at appropriate block, so discard it and fetch desired data from memory.
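The three event types can be demonstrated with a toy simulator (an illustrative sketch, not the lecture's code; only valid bits and tags are modeled):

```python
class DirectMappedCache:
    """Toy 16 KB direct-mapped cache: 1024 blocks of 16 bytes, tags only."""
    OFFSET_BITS, INDEX_BITS = 4, 10

    def __init__(self):
        n = 1 << self.INDEX_BITS
        self.valid = [False] * n
        self.tags = [0] * n

    def access(self, addr):
        index = (addr >> self.OFFSET_BITS) & ((1 << self.INDEX_BITS) - 1)
        tag = addr >> (self.OFFSET_BITS + self.INDEX_BITS)
        if self.valid[index] and self.tags[index] == tag:
            return "hit"                                  # valid block, tag matches
        event = "miss" if not self.valid[index] else "miss, block replacement"
        self.valid[index], self.tags[index] = True, tag   # fetch from memory
        return event

cache = DirectMappedCache()
for addr in (0x00000014, 0x0000001C, 0x00000034, 0x00008014):
    print(f"{addr:#010x}: {cache.access(addr)}")
# 0x00000014: miss
# 0x0000001c: hit
# 0x00000034: miss
# 0x00008014: miss, block replacement
```

The four accesses reproduce the walkthrough on the following slides: a compulsory miss, a hit in the same block, another miss, and finally a replacement when a second address with the same index but a different tag arrives.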

12 16 KB Direct Mapped Cache, 16B blocks
Valid bit: determines whether anything is stored in that row (when the computer is initially turned on, all entries are invalid). [Figure: cache table with Valid, Tag, and data columns 0x0-3, 0x4-7, 0x8-b, 0xc-f for block indices 0 through 1023.]

13 Read 0x00000014 = 0…00 0000000001 0100
Tag field = 000000000000000000, Index field = 0000000001, Offset = 0100. [Cache diagram: all entries still invalid.]

14 So we read block 1 (index field = 0000000001). [Cache diagram: all entries still invalid.]

15 No valid data at index 1 (000000000000000000 0000000001 0100). [Cache diagram: entry at index 1 has valid = 0.]

16 So load that data into cache, setting tag, valid
[Cache diagram: index 1 now has valid = 1, tag = 0, data a b c d.]

17 Read from cache at offset, return word b
[Cache diagram: index 1 valid, tag 0, data a b c d; offset 0100 selects word b.]

18 Read 0x0000001C = 0…00 0000000001 1100
Tag field = 0, Index field = 1, Offset = 1100. [Cache diagram: index 1 valid, tag 0, data a b c d.]

19 Data valid, tag OK, so read offset 0xC, return word d
[Cache diagram: index 1 valid, tag 0, data a b c d.]

20 Read 0x00000034 = 0…00 0000000011 0100
Tag field = 0, Index field = 3, Offset = 0100. [Cache diagram: index 1 valid, tag 0, data a b c d.]

21 So read block 3 (index field = 0000000011). [Cache diagram: index 1 valid (tag 0, data a b c d); index 3 invalid.]

22 No valid data at index 3 (000000000000000000 0000000011 0100). [Cache diagram: entry at index 3 has valid = 0.]

23 Load that cache block, return word f
[Cache diagram: index 1: valid, tag 0, data a b c d; index 3: valid, tag 0, data e f g h; offset 0100 selects word f.]

24 Read 0x00008014 = 0…10 0000000001 0100
Tag field = 2, Index field = 1, Offset = 0100. [Cache diagram: index 1: tag 0, a b c d; index 3: tag 0, e f g h.]

25 So read Cache Block 1, Data is Valid
[Cache diagram: index 1: valid, tag 0, data a b c d; index 3: valid, tag 0, data e f g h.]

26 Cache Block 1 Tag does not match (0 != 2)
[Cache diagram: index 1: valid, tag 0, data a b c d; index 3: valid, tag 0, data e f g h.]

27 Miss, so replace block 1 with new data & tag
[Cache diagram: index 1 now holds tag 2, data i j k l; index 3: tag 0, data e f g h.]

28 And return word j (000000000000000010 0000000001 0100)
Tag field = 2, Index field = 1, Offset = 0100. [Cache diagram: index 1: tag 2, data i j k l; index 3: tag 0, data e f g h.]

29 Do an example yourself. What happens?
Choose from: Cache: Hit, Miss, Miss w. replace. Values returned: a, b, c, d, e, ..., k, l. Read address 0x00000030? Read address 0x0000001c? [Cache diagram: index 1: valid, tag 2, data i j k l; index 3: valid, tag 0, data e f g h; all other entries invalid.]

30 Answers
0x00000030: a hit. Index = 3, tag matches, offset = 0, value = e.
0x0000001c: a miss with replacement. Index = 1, tag mismatch, so replace from memory; offset = 0xc, value = d.
The values read from the cache must equal the memory values whether or not they were cached: 0x00000030 = e, 0x0000001c = d.
Memory contents (address: value of word): 0x00000010-0x0000001C: a, b, c, d; ... 0x00000030-0x0000003C: e, f, g, h; ... 0x00008010-0x0000801C: i, j, k, l
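The answers can be double-checked in a few lines of Python (the cache state is hard-coded from the walkthrough; an illustrative sketch only):

```python
# Cache state after the walkthrough: index 1 holds tag 2 (words i j k l),
# index 3 holds tag 0 (words e f g h); all other entries are invalid.
valid = {1: True, 3: True}
tags = {1: 2, 3: 0}

def lookup(addr):
    index = (addr >> 4) & 0x3FF   # 10-bit index, 4-bit offset
    tag = addr >> 14
    if not valid.get(index, False):
        return "miss"
    return "hit" if tags[index] == tag else "miss with replacement"

print(lookup(0x00000030))   # -> hit (index 3, stored tag 0 matches)
print(lookup(0x0000001C))   # -> miss with replacement (index 1 holds tag 2, not 0)
```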

31 Block Size Tradeoff (#1/3)
Benefits of Larger Block Size: Spatial Locality: if we access a given word, we're likely to access other nearby words soon (Another Big Idea). Very applicable with the Stored-Program Concept: if we execute a given instruction, it's likely that we'll execute the next few as well. Works nicely in sequential array accesses too.
As I said earlier, block size is a tradeoff. In general, a larger block size will reduce the miss rate because it takes advantage of spatial locality. But remember, miss rate is NOT the only cache performance metric; you also have to worry about miss penalty. As you increase the block size, your miss penalty will go up, because the larger the block, the longer it takes to fill it.
Even if you look at miss rate by itself (which you should NOT), a bigger block size does not always win. As you increase the block size, keeping the cache size constant, the miss rate drops off rapidly at the beginning due to spatial locality. However, once you pass a certain point, the miss rate actually goes up.
As a result of these two curves, the Average Access Time (point to equation), which is really the more important performance metric than the miss rate, will go down initially, because the miss rate is dropping much faster than the miss penalty is increasing. But eventually, as you keep increasing the block size, the average access time can go up rapidly: not only is the miss penalty increasing, the miss rate is increasing as well. Let me show you, with another extreme example, why your miss rate may go up as you increase the block size.

32 Block Size Tradeoff (#2/3)
Drawbacks of Larger Block Size: larger block size means larger miss penalty; on a miss, it takes longer to load a new block from the next level. If the block size is too big relative to the cache size, then there are too few blocks. Result: miss rate goes up. In general, minimize Average Access Time = Hit Time x Hit Rate + Miss Penalty x Miss Rate
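Plugging numbers into the formula above makes the tradeoff concrete (the cycle counts below are assumed for illustration, not from the slides):

```python
def average_access_time(hit_time, miss_penalty, miss_rate):
    """Average Access Time = Hit Time x Hit Rate + Miss Penalty x Miss Rate."""
    return hit_time * (1.0 - miss_rate) + miss_penalty * miss_rate

# Growing the block size may lower the miss rate but raise the miss penalty:
print(average_access_time(hit_time=1, miss_penalty=50, miss_rate=0.05))   # -> 3.45
print(average_access_time(hit_time=1, miss_penalty=100, miss_rate=0.04))  # -> 4.96
```

Despite the lower miss rate in the second case, the doubled miss penalty makes the average access time worse, which is exactly the effect the slide warns about.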

33 Block Size Tradeoff (#3/3)
Hit Time = time to find and retrieve data from current level cache Miss Penalty = average time to retrieve data on a current level miss (includes the possibility of misses on successive levels of memory hierarchy) Hit Rate = % of requests that are found in current level cache Miss Rate = 1 - Hit Rate

34 Extreme Example: One Big Block
[Figure: a cache with a single entry: one valid bit, one tag, and one 4-byte block B0-B3.] Cache Size = 4 bytes, Block Size = 4 bytes: only ONE entry in the cache! If an item is accessed, it is likely to be accessed again soon, but unlikely to be accessed again immediately, so the next access will likely be a miss again. We continually load data into the cache but discard it (force it out) before we use it again. Nightmare for the cache designer: the Ping Pong Effect.
Let's go back to our 4-byte direct-mapped cache and increase its block size to 4 bytes. Now we end up with one cache entry instead of 4. What do you think this will do to the miss rate? Well, the miss rate will probably go to hell. It is true that if an item is accessed, it is likely to be accessed again soon, but probably NOT as soon as the very next access, so the next access will cause a miss again. What we end up doing is loading data into the cache and having it forced out by another cache miss before we have a chance to use it again. This is called the ping pong effect: the data acts like a ping pong ball, bouncing in and out of the cache. It is one of the nightmare scenarios a cache designer hopes never happens. We also have a term for this type of cache miss, a miss caused by different memory locations mapping to the same cache index: it is called a Conflict miss.

35 Block Size Tradeoff Conclusions
[Figures: Miss Rate vs. Block Size falls at first (exploits spatial locality), then rises as fewer blocks compromise temporal locality; Miss Penalty vs. Block Size rises steadily; Average Access Time vs. Block Size falls, then rises due to the increased miss penalty and miss rate.]

36 Things to Remember Cache Access involves 3 types of events: cache miss: nothing in cache in appropriate block, so fetch from memory cache hit: cache block is valid and contains proper address, so read desired word cache miss, block replacement: wrong data is in cache at appropriate block, so discard it and fetch desired data from memory

