An Introduction to Cache View this presentation in slideshow mode.

1 An Introduction to Cache

View this presentation in slideshow mode.

2 Cache Viewed as a Parking Lot at ERAU

Let's consider the parking lot behind King as our cache. Suppose we numbered each parking spot; although ERAU does not do this, many big parking structures do. Further suppose, just for illustrative purposes, that we had exactly 16 parking places in the King lot, numbered in hex with 0 through F (hey, King is an engineering building, people here are supposed to know hex ;-)

Parking lot: 0 1 2 3 4 5 6 7 8 9 A B C D E F

3 Parking the Car Legally

Now suppose ERAU's parking regulations stated that a faculty member could only use the slot whose number matched the last digit on his or her license plate.

Parking lot: 0 1 2 3 4 5 6 7 8 9 A B C D E F

Suppose I ask you to go see if my car is in the parking lot and all you know is my license plate number. Since my plate ends in 5, you proceed directly (direct mapped!) to slot number 5. But just because there's a car there doesn't mean it's mine; lots of cars have license plate numbers that end in 5. So you'll have to check the license plate of the car in slot #5 to see if it's mine.

Now suppose that the cost of checking the digits on the license plate grows non-linearly with the number of digits (the analogy is getting a bit strained, but it will have to do ;-). Well, you don't need to check all 7 digits; all you have to do is check the first 6 digits (ABC234). You don't need to check the 5, the last digit of the license plate, since the car couldn't legally be in slot #5 unless the last digit of its license plate were a 5.
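The parking rule can be sketched in a few lines of Python. This is only an illustration of the analogy, not anything from the original slides: the names `slot_for`, `is_my_car_parked`, and `lot` are hypothetical, the second plate `9F01D35` is made up, and the full plate `ABC2345` is assumed from the six leading digits (ABC234) and the final digit 5 mentioned above.

```python
def slot_for(plate):
    """Parking rule: the slot number is the last hex digit of the plate."""
    return int(plate[-1], 16)


def is_my_car_parked(lot, plate):
    """Go directly to the one legal slot and compare only the leading digits."""
    parked = lot[slot_for(plate)]   # leading digits of the car in that slot, or None
    return parked == plate[:-1]     # the last digit is already implied by the slot


lot = [None] * 16                   # 16 slots, numbered 0 through F
lot[slot_for("ABC2345")] = "ABC234" # assumed plate: it parks in slot 5

print(is_my_car_parked(lot, "ABC2345"))  # True: leading digits match
print(is_my_car_parked(lot, "9F01D35"))  # False: same slot, different car
```

Only one slot is ever inspected, and only six of the seven digits are compared; that is the whole point of the direct mapped scheme.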

4 The Parking Lot as a Cache

Cache: 0 1 2 3 4 5 6 7 8 9 A B C D E F

The parking lot is a direct mapped cache. The parking spaces are block frames; the cars are blocks. Each block frame can hold exactly one block.

The license plate is a memory address. "New York" and "Empire State" are irrelevant to finding my car in the parking lot, and parts of a memory address will similarly be irrelevant to how the cache works.

The last digit of the plate is the block frame # that this block can occupy in our (direct mapped) cache.

The remaining leading digits are the only information (called the tag) from the license plate that we have to check to see whether our block is the one in the block frame or whether some other block is parked there.

I chose this license plate picture from the web since it rather fortuitously had only hex digits in it. In a real cache, of course, we'll be looking at binary bits pulled from a physical memory address. The bits may or may not line up perfectly on 4-bit nibble boundaries.

5 Interpreting the Physical Address

For example, here's a 32-bit physical address shown in hex: 0x12345678

Here's what that would be in binary: 00010010001101000101011001111000

Here's how a cache might interpret these bits: tag 0x2468a, block frame # 0x33.

The block frame number (0x33, in this example) is our parking slot number. Just as we ignored "New York" in our license plate and parking lot example, some of the bits will be ignored by the cache (they are used by the alignment network, however). Everything else is the tag: what you checked when you went to the correct block frame in the parking lot and wanted to see if it was my car that was parked there.

Only if all the fields were multiples of 4 bits wide would everything line up neatly in hex digits so that, for example, the hex for the tag could be seen in the hex for the overall address as easily as it was in our original license plate example. But the size of each field is dictated by cache and memory design parameters and so is often not a multiple of 4 bits.

In reality, it's the bit patterns that matter, not their hex names; but if we want to talk about these things, hex is a lot simpler to rattle off out loud. My point here is that the hex representation for a tag, for example, may not be easily discernible from the hex of the original physical address; we have to look at the bit patterns in isolation, independent of their alignment in the physical address itself. E.g., we can see the binary bit pattern for the tag in the binary bit pattern for the address, but we don't see 0x2468a in 0x12345678.

E.g., if the cache had more block frames, we'd need more bits to hold the block frame number, and then the tag would change as well, since some of its rightmost (least significant) bits would be confiscated to make room for the enlarged block frame #. With a wider block frame # field, the same address would be interpreted as: tag 0x091a2, block frame # 0xb3.
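The field extraction can be reproduced with shifts and masks. The sketch below rests on assumptions: the transcript does not state the field widths, so the 5-bit offset and the 6-bit and 8-bit block frame # fields are simply the widths that reproduce the slide's numbers, and `split_address` is a hypothetical helper, not part of the original presentation.

```python
def split_address(addr, frame_bits, offset_bits):
    """Split a physical address into (tag, block frame #, offset)."""
    offset = addr & ((1 << offset_bits) - 1)                  # lowest bits
    frame = (addr >> offset_bits) & ((1 << frame_bits) - 1)   # middle bits
    tag = addr >> (offset_bits + frame_bits)                  # everything else
    return tag, frame, offset


addr = 0x12345678
print(format(addr, "032b"))  # 00010010001101000101011001111000

# Assumed widths: 6-bit block frame #, 5-bit offset.
tag, frame, offset = split_address(addr, frame_bits=6, offset_bits=5)
print(hex(tag), hex(frame), hex(offset))  # 0x2468a 0x33 0x18

# More block frames (assumed 8-bit block frame #): the tag loses its low bits.
tag, frame, offset = split_address(addr, frame_bits=8, offset_bits=5)
print(hex(tag), hex(frame), hex(offset))  # 0x91a2 0xb3 0x18
```

Neither 0x2468a nor 0x91a2 appears as a substring of 0x12345678, because the field boundaries do not fall on 4-bit nibble boundaries.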

6 Direct Mapped Cache in More Detail

[Figure: main memory drawn as a column of blocks, block # 0 through 1a, with memory width = block size; a cache of block frames, each holding a tag and content (a block); and a physical address split into block # and offset, with the block # further split into tag and block frame #.]

The physical address of a requested item in memory controls the operation of the memory hierarchy. An address is interpreted differently by main memory and cache.

Main memory is organized as a set of sequential blocks. A block (a.k.a. a cache line or cache grain) is the quantum of transfer between main memory and cache. Even if the CPU wants just a single byte from a byte-addressable memory, main memory will transfer up an entire block. It's the alignment network that later pulls out and aligns the part that the CPU actually wants.

Main memory uses the block number to find the block in memory.

The width of the main memory (i.e., block size) determines the number of bits needed for the offset; e.g., for a block size of 16 bytes, we'd need 4 bits (16 = 2^4) to specify the starting position of the bytes the alignment network must extract and align for a CPU register. An instruction's opcode (e.g., LB for load byte, LW for load word) specifies the number of bytes required. Only the alignment network uses the byte offset field of a physical address; it's not used by either the main memory or the cache.

The cache (our parking lot) is a set of block frames, each of which is analogous to a numbered slot in our parking lot. Each block frame can contain a single block of memory (analogous to our car) and the tag of that block (the leading digits of a license plate).

The cache breaks up the bits of the block number into two fields: the tag and the block frame #. The size of the cache in block frames determines the number of bits needed for the block frame #. E.g., if the cache contains 8 block frames, 3 bits (8 = 2^3) will be needed to uniquely specify a block frame #. All the remaining bits of the block number form the tag used by the cache.

When the cache gets a request for a block not currently in the cache (we'll see how this decision is made in just a minute), memory is told to send up the requested block, which is then placed in the designated block frame (parking slot). The cache extracts the tag from the address and places it in the tag portion of the block frame.

Presented with a physical address, the cache determines whether the requested block is in cache by going to the block frame and comparing the tag of the requested block with the tag of the resident block (if any).

Cache hit: If they match, the block is sent to the alignment network, which uses the offset to extract the requested bytes from the block and align them properly for the destination CPU register.

Cache miss: If the tags don't match, the cache tells main memory to send up the requested block and then places it in its block frame, overwriting any block that used to be there, and placing the new block's tag alongside it in the frame.
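The hit and miss behaviour described on this slide can be sketched as a toy simulator. This is a minimal illustration under stated assumptions, not a model of any real hardware: the class name `DirectMappedCache` and its method `read` are hypothetical, main memory is modeled as a Python `bytes` object, sizes are assumed to be powers of two, and the defaults (8 block frames, 16-byte blocks) are taken from the slide's examples.

```python
class DirectMappedCache:
    """Toy direct mapped cache: each block frame holds one (tag, block) pair."""

    def __init__(self, memory, num_frames=8, block_size=16):
        self.memory = memory
        self.block_size = block_size
        self.offset_bits = block_size.bit_length() - 1  # 16 bytes -> 4 bits
        self.frame_bits = num_frames.bit_length() - 1   # 8 frames -> 3 bits
        self.frames = [None] * num_frames               # None = empty frame

    def read(self, addr, nbytes=1):
        """Return (hit, data) for nbytes starting at physical address addr."""
        offset = addr & (self.block_size - 1)
        block_number = addr >> self.offset_bits
        frame = block_number & ((1 << self.frame_bits) - 1)
        tag = block_number >> self.frame_bits

        resident = self.frames[frame]
        hit = resident is not None and resident[0] == tag
        if not hit:
            # Miss: main memory sends up the whole block, which overwrites
            # whatever was in the frame; the new tag is stored alongside it.
            start = block_number * self.block_size
            block = self.memory[start:start + self.block_size]
            self.frames[frame] = (tag, block)
        block = self.frames[frame][1]
        # Alignment network: pull the requested bytes out of the block.
        return hit, block[offset:offset + nbytes]


memory = bytes(range(256)) * 2   # 512 bytes = 32 blocks of 16 bytes
cache = DirectMappedCache(memory)
print(cache.read(0x012))  # (False, b'\x12'): miss, block 1 loaded into frame 1
print(cache.read(0x013))  # (True, b'\x13'): hit, same block
print(cache.read(0x092))  # (False, b'\x92'): block 9 also maps to frame 1
print(cache.read(0x012))  # (False, b'\x12'): block 1 was evicted, miss again
```

Blocks 1 and 9 share block frame 1 because their block numbers agree in the low 3 bits, so they keep evicting each other, just as two cars whose plates end in the same digit compete for one slot.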

