
Slide 1: ELEC 5200-002/6200-002 Computer Architecture and Design, Fall 2006. Lecture 13 (Dec. 1 and 4): Memory Organization (Chapter 7). Vishwani D. Agrawal, James J. Danaher Professor, Department of Electrical and Computer Engineering, Auburn University, Auburn, AL 36849. http://www.eng.auburn.edu/~vagrawal, vagrawal@eng.auburn.edu

Slide 2: Types of Computer Memories. From the cover of: A. S. Tanenbaum, Structured Computer Organization, Fifth Edition, Upper Saddle River, New Jersey: Pearson Prentice Hall, 2006.

Slide 3: Electronic Memory Devices.
Memory technology | Typical access time | Clock rate | Cost per GB in 2004
SRAM | 0.5-5 ns | 0.2-2.0 GHz | $4k-$10k
DRAM | 50-70 ns | 15-20 MHz | $100-$200
Magnetic disk | 5-20 ms | 50-200 Hz | $0.5-$2

Slide 4: Random Access Memory (RAM). Figure: a memory cell array served by an address decoder and read/write circuits; address bits select the cells, data bits move in and out.

Slide 5: Static RAM (SRAM) Cell. Figure: a bit cell connected to a word line and to complementary bit lines (bit and its inverse).

Slide 6: Dynamic RAM (DRAM) Cell. Figure: a one-transistor cell connected to a word line and a bit line.

Slide 7: Cost of 40 GB Memory.
Type of memory | Cost | Clock rate
SRAM | $160k | 0.2-2.0 GHz
DRAM | $4k | 15-20 MHz
Disk | $20 | 50-200 Hz

Slide 8: Trying to Buy a Laptop Computer? Two Years Ago. IBM ThinkPad X40 23717GU: 1.20 GHz Low Voltage Intel Pentium M, 1 MB L2 cache (~$5); Microsoft Windows XP Professional; 512 MB DRAM (~$100); 40 GB hard drive (~$40); 2.71 lbs; 12.1" XGA (1024x768); IBM Embedded Security Subsystem 2.0; Intel PRO/Wireless 802.11b, Gigabit Ethernet; integrated Intel Extreme Graphics 2; no CD/DVD drive. Availability: within 2 weeks. $2,149.00 IBM web price; $1,741.65 sale price.

Slide 9: 2006. Lenovo 3000 V Series, customize and buy. From $999.00; sale price $949.00. Processor: Intel Core 2 Duo T5500 (1.66 GHz, 2 MB L2, 667 MHz FSB). Total memory: 512 MB PC2-5300 DDR2 SDRAM. Hard drive: 80 GB, 5400 rpm Serial ATA. Weight: 4.0 lbs.

Slide 10: Cache. The processor does all memory operations through the cache. Miss: if the requested word is not in the cache, a block of words containing it is brought into the cache, and then the processor request is completed. Hit: if the requested word is in the cache, the read or write is performed directly in the cache, without accessing main memory. Block: the minimum amount of data transferred between cache and main memory. Figure: processor, cache (small, fast memory), and main memory (large, inexpensive, slow); words move between processor and cache, blocks between cache and main memory.

Slide 11: Invention of Cache. M. V. Wilkes, "Slave Memories and Dynamic Storage Allocation," IEEE Transactions on Electronic Computers, vol. EC-14, no. 2, pp. 270-271, April 1965.

Slide 12: Cache Performance. Average access time = T1 × h + Tm × (1 − h) = Tm − (Tm − T1) × h, where T1 = cache access time (small), Tm = memory access time (large), and h = hit rate (0 ≤ h ≤ 1). Hit rate is also known as hit ratio; miss rate = 1 − hit rate. Figure: processor, cache (small, fast memory; access time T1), and main memory (large, inexpensive, slow; access time Tm).
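The single-level formula above can be sketched directly (the example values are illustrative, not from the slides):

```python
def avg_access_time(t1, tm, h):
    """Average access time for one cache level:
    T1*h + Tm*(1 - h), which equals Tm - (Tm - T1)*h."""
    return t1 * h + tm * (1 - h)

# Example: 1 ns cache, 60 ns memory, 95% hit rate -> 3.95 ns.
print(avg_access_time(1, 60, 0.95))
```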

Slide 13: Average Access Time. Figure: access time Tm − (Tm − T1) × h plotted against miss rate (1 − h); the line runs from T1 at h = 1 to Tm at h = 0. Desirable miss rate < 5%; acceptable miss rate < 10%.

Slide 14: Comparing Performance. Processor without cache, CPI = 1: assume memory access time of 10 cycles, and that 30% of instructions require a data memory access. Processor with cache: assume hit rate 0.95 for instructions, 0.90 for data, and a miss penalty (time to read memory into cache and from it) of 17 cycles. Comparing times for 100 instructions:
Time without cache / Time with cache = (100×10 + 30×10) / [100(0.95×1 + 0.05×17) + 30(0.9×1 + 0.1×17)] = 1300/258 ≈ 5.04
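The slide's arithmetic can be checked in a few lines:

```python
# Slide 14's comparison for 100 instructions, 30 of which access data.
mem = 10          # memory access time in cycles (no cache)
penalty = 17      # miss penalty in cycles
t_no_cache = 100 * mem + 30 * mem                      # 1300 cycles
t_cache = (100 * (0.95 * 1 + 0.05 * penalty)
           + 30 * (0.90 * 1 + 0.10 * penalty))         # 258 cycles
print(round(t_no_cache / t_cache, 2))  # 5.04
```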

Slide 15: Controlling Miss Rate. Increase cache size: more blocks can be kept in the cache, so the chance of a miss is reduced; but a larger cache is slower. Increase block size: more data is available per block, reducing the chance of a miss; but fewer blocks fit in the cache, which increases the chance of a miss, and larger blocks need more time to swap. Figure: blocks of a large memory mapped into a cache.

Slide 16: Increasing Hit Rate. Hit rate increases with cache size and depends only mildly on block size. Figure: miss rate (0% to 10%) plotted against block size (16B to 256B) for cache sizes 4KB, 16KB, and 64KB; larger blocks improve the chances of capturing localized data but decrease the chances of capturing fragmented data.

Slide 17: The Locality Principle. A program tends to access data that form a physical cluster in memory, so multiple accesses may be made within the same block. Physical localities are temporal and may shift over longer periods of time: data not used for some time is less likely to be used in the future. Upon a miss, the least recently used (LRU) block can be overwritten by a new block. P. J. Denning, "The Locality Principle," Communications of the ACM, vol. 48, no. 7, pp. 19-24, July 2005.

Slide 18: Data Locality, Cache, Blocks. Increase the block size to match the locality size; increase the cache size to hold most of the data a program needs. Figure: the data needed by a program spans blocks 1 and 2 of memory, both held in the cache.

Slide 19: Types of Caches. Direct-mapped cache: the memory is divided into partitions the size of the cache, and each partition is subdivided into blocks; a block can reside in only one cache location. Set-associative cache: a block may reside in any one of a set of cache locations.

Slide 20: Direct-Mapped Cache. Figure: the data needed by a program lies in blocks 1 and 2 of memory; when a needed block maps to an occupied cache location, the resident block is swapped out and the needed block swapped in, even if the LRU block sits elsewhere.

Slide 21: Set-Associative Cache. Figure: the same data and memory, but a needed block may be swapped into any location of its set, so the LRU block can be the one swapped out.

Slide 22: Direct-Mapped Cache. Example: 32-word word-addressable main memory, cache of 8 blocks, block size = 1 word. A 5-bit memory address such as 11 101 splits into a 2-bit tag (11) and a 3-bit index (101): the index is the block's local address in the cache, and the tag identifies which memory partition the block came from.

Slide 23: Direct-Mapped Cache. Example: 32-word word-addressable main memory, cache of 4 blocks, block size = 2 words. A 5-bit memory address such as 11 10 1 splits into a 2-bit tag (11), a 2-bit index (10), and a 1-bit block offset (1).

Slide 24: Number of Tag and Index Bits. Let the main memory hold W words and the cache hold w words. Each word in the cache has a unique index (local address), so the number of index bits = log2(w); index bits are shared with the block offset when a block contains more than one word. Viewing the main memory as W/w partitions of w words each, a tag identifies the partition, so the number of tag bits = log2(W/w).
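A sketch of the bit-width calculation, assuming W and w are powers of two:

```python
from math import log2

def cache_bits(W, w):
    """Index and tag widths for a direct-mapped cache of w words
    inside a W-word memory (block size = 1 word)."""
    index_bits = int(log2(w))
    tag_bits = int(log2(W // w))
    return index_bits, tag_bits

print(cache_bits(32, 8))  # (3, 2): slide 22's example
```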

Slide 25: Direct-Mapped Cache (Byte Address). Example: 32-word byte-addressable main memory, cache of 8 blocks, block size = 1 word. A 7-bit byte address such as 11 101 00 splits into a 2-bit tag (11), a 3-bit index (101), and a 2-bit byte offset (00).

Slide 26: Finding a Word in Cache. Cache size 8 words, block size = 1 word, 32-word byte-addressed memory. Each of the 8 indexed entries (000-111) holds a valid bit, a 2-bit tag, and the data word. From the memory address b6 b5 b4 b3 b2 b1 b0, the index bits (b4 b3 b2) select an entry, the byte offset (b1 b0) is dropped, and the stored tag is compared with the address tag (b6 b5); a valid match signals a hit (1 = hit, 0 = miss) and delivers the data.

Slide 27: How Many Bits Does the Cache Have? Consider a main memory of 32 words; the byte address is 7 bits wide: b6 b5 b4 b3 b2 b1 b0. Each word is 32 bits wide. Assume the cache block size is 1 word (32 bits of data) and the cache contains 8 blocks. The cache requires, for each word, a 2-bit tag and one valid bit. Total storage needed in cache = #blocks in cache × (data bits/block + tag bits + valid bit) = 8 × (32 + 2 + 1) = 280 bits.

Slide 28: A More Realistic Cache. Consider a 4 GB byte-addressable main memory (1 G words); the byte address is 32 bits wide: b31…b16 b15…b2 b1 b0. Each word is 32 bits wide. Assume the cache block size is 1 word (32 bits of data) and the cache holds 64 KB of data, i.e., 16K words = 16K blocks. Number of cache index bits = 14, because 16K = 2^14. Tag size = 32 − byte offset − #index bits = 32 − 2 − 14 = 16 bits. The cache requires, for each word, a 16-bit tag and one valid bit. Total storage needed in cache = #blocks in cache × (data bits/block + tag bits + valid bit) = 2^14 × (32 + 16 + 1) = 16 × 2^10 × 49 = 784 × 2^10 bits = 784 Kb = 98 KB. Physical storage / data storage = 98/64 = 1.53.

Slide 29: Cache Bits for a 4-Word Block. Consider the same 4 GB byte-addressable main memory (1 G words); the byte address is 32 bits wide: b31…b16 b15…b2 b1 b0. Each word is 32 bits wide. Assume the cache block size is 4 words (128 bits of data) and the cache holds 64 KB of data, i.e., 16K words = 4K blocks. Number of cache index bits = 12, because 4K = 2^12. Tag size = 32 − byte offset − #block offset bits − #index bits = 32 − 2 − 2 − 12 = 16 bits. The cache requires, for each block, a 16-bit tag and one valid bit. Total storage needed in cache = #blocks in cache × (data bits/block + tag bits + valid bit) = 2^12 × (4×32 + 16 + 1) = 4 × 2^10 × 145 = 580 × 2^10 bits = 580 Kb = 72.5 KB. Physical storage / data storage = 72.5/64 = 1.13.
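The storage totals of slides 27-29 follow from one formula; a quick check, assuming 32-bit words as in the slides:

```python
def cache_storage_bits(num_blocks, words_per_block, tag_bits):
    """Total cache storage: blocks x (data bits + tag bits + valid bit),
    with 32-bit words."""
    return num_blocks * (32 * words_per_block + tag_bits + 1)

print(cache_storage_bits(8, 1, 2))        # 280 bits (slide 27)
print(cache_storage_bits(2**14, 1, 16))   # 802816 bits = 784 Kb (slide 28)
print(cache_storage_bits(2**12, 4, 16))   # 593920 bits = 580 Kb (slide 29)
```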

Slide 30: Using a Larger Cache Block (4 Words). Cache size 16K words, block size = 4 words, 4 GB byte-addressed memory (1 G words). Each of the 4K indexed entries (0000 0000 0000 to 1111 1111 1111) holds a valid bit, a 16-bit tag, and 4 words (128 bits) of data. From the address b31 … b4 b3 b2 b1 b0, the 12-bit index selects an entry, the stored tag is compared with the 16-bit address tag (1 = hit, 0 = miss), and a 2-bit block offset drives a multiplexer that selects one of the 4 words.

Slide 31: Interleaved Memory. Interleaving reduces the miss penalty: the memory is designed to read the words of a block simultaneously in one read operation. Example: cache block size = 4 words; interleaved memory with 4 banks; memory access ~15 cycles. Miss penalty = 1 cycle to send the address + 15 cycles to read a block + 4 cycles to send the data to the cache = 20 cycles. Without interleaving, miss penalty = 65 cycles. Figure: processor, cache, and a main memory of four banks (bank 0 through bank 3).
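A sketch of the miss-penalty arithmetic, assuming 1 cycle per word to transfer data to the cache and reads fully overlapped across banks:

```python
def miss_penalty(block_words, mem_cycles, banks):
    """1 cycle to send the address, memory reads (overlapped across
    banks), then 1 cycle per word to send data to the cache."""
    reads = -(-block_words // banks) * mem_cycles  # ceiling division
    return 1 + reads + block_words

print(miss_penalty(4, 15, 4))  # 20 cycles, 4-way interleaved
print(miss_penalty(4, 15, 1))  # 65 cycles, one word at a time
```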

Slide 32: Handling a Miss. A miss occurs when the data at the required memory address is not found in the cache. Controller actions: stall the pipeline; freeze the contents of all registers; activate a separate cache controller. If the cache is full, select the least recently used (LRU) block in the cache for overwriting; if the selected block has inconsistent data, take proper action. Copy the block containing the requested address from memory, then restart the instruction.

Slide 33: Miss During Instruction Fetch. Send the original PC value (PC − 4) to the memory. Instruct main memory to perform a read and wait for the memory to complete the access. Write the cache entry. Restart the instruction whose fetch failed.

Slide 34: Writing to Memory. Cache and memory become inconsistent when data is written into the cache but not to memory (the cache coherence problem). Strategies to handle inconsistent data: Write-through: always write to memory and cache simultaneously; a write to memory is ~100 times slower than a write to cache. Write buffer: write to the cache and to a buffer that queues the write to memory; if the buffer is full, the processor must wait.

Slide 35: Writing to Memory: Write-Back. Write-back (or copy-back) writes only to the cache but sets a "dirty bit" in the block where the write is performed. When a block with its dirty bit on is about to be overwritten in the cache, it is first written back to memory.
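A minimal sketch of write-back bookkeeping for a single block; the class and all names are illustrative, not from the slides:

```python
class WriteBackBlock:
    """One cache block with a dirty bit (write-back policy sketch)."""
    def __init__(self):
        self.tag = None
        self.data = None
        self.dirty = False

    def write(self, data):
        self.data = data
        self.dirty = True          # cache updated; memory is now stale

    def replace(self, new_tag, new_data, memory):
        if self.dirty:             # write the old block back first
            memory[self.tag] = self.data
        self.tag, self.data, self.dirty = new_tag, new_data, False

memory = {0: "old"}
blk = WriteBackBlock()
blk.replace(0, "old", memory)      # bring block 0 into the cache
blk.write("new")                   # memory[0] is still "old"
blk.replace(1, "other", memory)    # eviction writes "new" back first
print(memory[0])                   # new
```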

Slide 36: AMD Opteron Microprocessor. L1: split instruction and data caches, 64 KB each, 64-byte blocks, write-back. L2: 1 MB, 64-byte blocks, write-back.

Slide 37: Cache Hierarchy. Average access time = h1 × T1 + (1 − h1) × [h2 × T2 + (1 − h2) × Tm], where T1 = L1 cache access time (smallest), T2 = L2 cache access time (small), Tm = memory access time (large), and h1, h2 = hit rates (0 ≤ h1, h2 ≤ 1). Average access time is reduced by adding a cache level. Figure: processor, L1 cache (SRAM, access time T1), L2 cache (DRAM, access time T2), and main memory (large, inexpensive, slow; access time Tm).
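The two-level formula can be expressed directly (the example T1, T2, Tm values are assumptions, chosen to match the cycle counts used later on slides 39-41):

```python
def avg_access_time_2level(t1, t2, tm, h1, h2):
    """Two-level hierarchy: h1*T1 + (1 - h1)*(h2*T2 + (1 - h2)*Tm)."""
    return h1 * t1 + (1 - h1) * (h2 * t2 + (1 - h2) * tm)

# T1 = 1, T2 = 25, Tm = 500 cycles; h1 = 0.95, h2 = 0.90 -> 4.575
print(avg_access_time_2level(1, 25, 500, 0.95, 0.90))
```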

Slide 38: Average Access Time. Figure: h1 × T1 + (1 − h1) × [h2 × T2 + (1 − h2) × Tm] plotted against L1 miss rate (1 − h1), with T1 < T2 < Tm. All curves start at T1 when h1 = 1; at h1 = 0 the access time reaches Tm for h2 = 0, (T2 + Tm)/2 for h2 = 0.5, and T2 for h2 = 1.

Slide 39: Processor Performance Without Cache. A 5 GHz processor has cycle time 0.2 ns. Memory access time = 100 ns = 500 cycles. Ignoring memory access, CPI = 1. Considering memory access: CPI = 1 + #stall cycles = 1 + 500 = 501.

Slide 40: Performance with a 1-Level Cache. Assume hit rate h1 = 0.95; L1 access time = 0.2 ns = 1 cycle. CPI = 1 + #stall cycles = 1 + 0.95×0 + 0.05×500 = 26. Processor speed increase due to cache = 501/26 = 19.3.

Slide 41: Performance with 2-Level Caches. Assume: L1 hit rate h1 = 0.95; L2 hit rate h2 = 0.90; L2 access time = 5 ns = 25 cycles. CPI = 1 + #stall cycles = 1 + 0.95×0 + 0.05 × (0.90×25 + 0.10×525) = 1 + 1.125 + 2.625 = 4.75. Processor speed increase due to caches = 501/4.75 = 105.5. Speed increase due to the L2 cache = 26/4.75 = 5.47.
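The CPI figures of slides 39-41 can be recomputed in a few lines:

```python
# CPI without cache, with an L1 cache, and with L1 + L2 caches.
cpi_no_cache = 1 + 500                                     # 501
cpi_l1 = 1 + 0.95 * 0 + 0.05 * 500                         # 26.0
cpi_l2 = 1 + 0.95 * 0 + 0.05 * (0.90 * 25 + 0.10 * 525)   # 4.75
print(cpi_no_cache, cpi_l1, cpi_l2)
print(round(cpi_no_cache / cpi_l2, 1))  # 105.5
```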

Slide 42: Miss Rate of Direct-Mapped Cache. Same setup as slide 25: 32-word byte-addressable memory, cache of 8 blocks, block size = 1 word; an address such as 11 101 00 splits into tag, index, and byte offset. Figure: the needed block maps to an index that is already occupied, so in a direct-mapped cache it must displace that resident block even though the least recently used (LRU) block sits at a different index.

Slide 43: Miss Rate of Direct-Mapped Cache. Memory references to word addresses 0, 8, 0, 6, 8, 16 produce misses on all six accesses: addresses 0, 8, and 16 all map to index 000, so each access evicts the block the next reference needs, while the first access to 0 and the access to 6 (index 110) miss because the cache starts empty.
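The six misses can be reproduced with a toy direct-mapped simulator (word addresses, block size = 1 word; the function name is illustrative):

```python
def direct_mapped_misses(refs, num_blocks):
    """Count misses in a direct-mapped cache, block size = 1 word."""
    cache = {}                      # index -> tag currently resident
    misses = 0
    for addr in refs:
        index, tag = addr % num_blocks, addr // num_blocks
        if cache.get(index) != tag:
            misses += 1
            cache[index] = tag      # evict whatever was there
    return misses

print(direct_mapped_misses([0, 8, 0, 6, 8, 16], 8))  # 6
```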

Slide 44: Fully-Associative Cache (8-Way Set-Associative). Same 32-word memory and 8-block cache, block size = 1 word, but the entire 5-bit word address (e.g., 11101) serves as the tag and there is no index: a block may occupy any of the 8 cache locations, so the LRU block anywhere in the cache can be chosen for replacement.

Slide 45: Miss Rate: Fully-Associative Cache. Memory references to word addresses 0, 8, 0, 6, 8, 16: 1. miss, 2. miss, 3. hit, 4. miss, 5. hit, 6. miss. With the whole address as tag and LRU replacement, blocks 0 and 8 remain in the cache and their repeat accesses hit: four misses and two hits, versus six misses for the direct-mapped cache.

Slide 46: Finding a Word in Associative Cache. Cache size 8 words, block size = 1 word, 32-word byte-addressed memory. Each entry holds a valid bit, a 5-bit tag, and the data word. There is no index, so the 5-bit address tag (b6 b5 b4 b3 b2) must be compared with all tags in the cache; a valid match signals a hit (1 = hit, 0 = miss) and delivers the data.

Slide 47: Eight-Way Set-Associative Cache. Cache size 8 words, block size = 1 word, 32-word byte-addressed memory. The 5-bit address tag is compared in parallel with all 8 stored (valid, tag, data) entries, and an 8-to-1 multiplexer selects the data from the matching entry (1 = hit, 0 = miss).

Slide 48: Two-Way Set-Associative Cache. 32-word memory, cache of 8 blocks organized as 4 sets of 2 blocks each. A memory address such as 111 01 00 splits into a 3-bit tag (111), a 2-bit index (01) that selects the set, and a byte offset (00). The needed block may go into either way of its set, replacing the set's LRU block.

Slide 49: Miss Rate: Two-Way Set-Associative Cache. Memory references to word addresses 0, 8, 0, 6, 8, 16: 1. miss, 2. miss, 3. hit, 4. miss, 5. hit, 6. miss. Addresses 0 and 8 share set 00 but occupy its two ways, so their repeat accesses hit: four misses and two hits, matching the fully-associative cache on this reference string.
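A small LRU set-associative simulator (names illustrative) reproduces both this slide's two-way result and slide 45's fully-associative result:

```python
from collections import OrderedDict

def set_assoc_misses(refs, num_sets, ways):
    """Count misses in an LRU set-associative cache, block size = 1 word."""
    sets = [OrderedDict() for _ in range(num_sets)]
    misses = 0
    for addr in refs:
        s, tag = sets[addr % num_sets], addr // num_sets
        if tag in s:
            s.move_to_end(tag)            # hit: refresh LRU order
        else:
            misses += 1
            if len(s) == ways:
                s.popitem(last=False)     # evict the LRU way
            s[tag] = True
    return misses

refs = [0, 8, 0, 6, 8, 16]
print(set_assoc_misses(refs, 4, 2))  # 4 (two-way, slide 49)
print(set_assoc_misses(refs, 1, 8))  # 4 (fully associative, slide 45)
```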

Slide 50: Two-Way Set-Associative Cache. Cache size 8 words, block size = 1 word, 32-word byte-addressed memory. The 2-bit index selects one of 4 sets (00-11); the 3-bit address tag is compared in parallel with both stored tags of that set, and a 2-to-1 multiplexer delivers the data from the matching way (1 = hit, 0 = miss).

Slide 51: Virtual Memory System. Figure: the processor issues virtual (logical) addresses; the MMU (memory management unit) translates them into physical addresses for the cache and main memory; data moves between disk and main memory by DMA (direct memory access).

Slide 52: Virtual vs. Physical Address. The processor assumes a certain memory addressing scheme: a block of data is called a virtual page, and an address is called a virtual (or logical) address. Main memory may have a different addressing scheme: a memory address is called a physical address. The MMU translates virtual addresses to physical addresses. The complete address translation table is large and is kept in main memory; the MMU contains a TLB (translation lookaside buffer), a small cache of the address translation table, so address translation can produce its own hits and misses.
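A toy sketch of TLB-backed translation, assuming 4 KB pages and dictionary-based tables (all names and the page-table contents are illustrative; a real TLB is a small fixed-size hardware cache):

```python
PAGE = 4096                         # 4 KB pages, as on the next slide
page_table = {0: 7, 1: 3, 2: 9}     # virtual page -> physical frame
tlb = {}                            # small cache of the table

def translate(vaddr):
    vpage, offset = divmod(vaddr, PAGE)
    if vpage not in tlb:            # TLB miss: consult the full table
        tlb[vpage] = page_table[vpage]
    return tlb[vpage] * PAGE + offset

print(hex(translate(0x1234)))  # virtual page 1 -> frame 3: 0x3234
```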

Slide 53: Page Fault. The disk holds all data, organized in pages (~4 KB) accessed by physical addresses; main memory holds the cached pages and the page table (write-back, same as in cache). Page fault: a required page is not found in main memory. Cache miss: a required block is not found in the cache. TLB miss: a required virtual address is not found in the TLB. A "page fault" in virtual memory is similar to a "miss" in cache.

