
1  4/6/2005 ECE 232  Motivation for Memory Hierarchy
What we want from memory: fast, large, cheap.
There are different kinds of memory technologies: register files, SRAM, DRAM, MRAM, disk...

              Register     Cache         Memory       Disk
  size:       32 B         32 KB-4 MB    1024 MB      300 GB
  speed:      0.3 ns       1 ns          30 ns        8 x 10^6 ns
  $/Mbyte:                 $60/MB        $0.10/MB     $0.001/MB
  line size:  8 B          32 B          4 KB

Moving right: larger, slower, cheaper.

2  Need for Speed
Assume the CPU runs at 3 GHz. Every instruction requires 4 B of instruction fetch and at least one memory access (4 B of data): 3 GHz x 8 B = 24 GB/s.
The figures below are peak performance for sequential burst transfers (performance for random access is much, much slower due to latency).

  Interface                                  Width        Frequency     Bytes/sec
  4-way interleaved PC1600 (DDR200) SDRAM    4 x 64 bits  100 MHz DDR   6.4 GB/s
  Opteron HyperTransport memory bus          128 bits     200 MHz DDR   6.4 GB/s
  Pentium 4 "800 MHz" FSB                    64 bits      200 MHz QDR   6.4 GB/s
  PC2 6400 (DDR-II 800) SDRAM                64 bits      400 MHz DDR   6.4 GB/s
  PC2 5300 (DDR-II 667) SDRAM                64 bits      333 MHz DDR   5.3 GB/s
  Pentium 4 "533 MHz" FSB                    64 bits      133 MHz QDR   4.3 GB/s

3  Need for Large Memory
Small memories are fast, so just write small programs?
"640 K of memory should be enough for anybody." -- Bill Gates, 1981
Real programs require large memories:
- PowerPoint 2003: 25 megabytes
- Database applications may require gigabytes of memory

4  Levels in the Memory Hierarchy
A hierarchy makes memory appear faster, larger, and cheaper by exploiting locality of reference:
- Temporal locality: recently accessed data is likely to be accessed again
- Spatial locality: data near a recent access is likely to be accessed soon
Memory has two costs:
- Latency (remember the pipeline?), paid on each random access
- Bandwidth, for moving blocks of memory
Strategy: provide a small, fast memory that holds a subset of main memory. It is both low latency (smaller address space) and high bandwidth (wider data path).

5  Basic Philosophy
- Move data into the 'smaller, faster' memory (bandwidth)
- Operate on it there (latency)
- Move it back to the 'larger, cheaper' memory (bandwidth)
How do we keep track of whether it changed?
What if we run out of space in the 'smaller, faster' memory?

6  Typical Hierarchy
CPU regs <-8 B-> Cache <-32 B-> Memory <-4 KB-> disk
(the regs/cache/memory levels are managed as a cache; the memory/disk level is virtual memory)
Notice that the data width changes at each level. Why?
Bandwidth (transfer rate between the levels):
- CPU-Cache: 24 GB/s
- Cache-Main: 0.5-6.4 GB/s
- Main-Disk: 187 MB/s (Serial ATA/1500)

7  Bandwidth Issue
Fetch large blocks at a time (bandwidth); this supports spatial locality.

    for (i = 0; i < length; i++)
        sum += array[i];

- array has spatial locality
- sum has temporal locality

8  Figure of Merit
Why are we building the cache? To minimize the average memory access time, which means maximizing the number of accesses found in the cache.
"Hit rate": the percentage of memory accesses found in the cache.
Assumptions:
- Every instruction requires exactly 1 memory access
- Every instruction requires 1 clock cycle to complete
- Cache access time is the same as the clock cycle
- Main memory access time is 20 cycles
CPI (cycles/instruction) = hitRate * clocksCacheHit + (1 - hitRate) * clocksCacheMiss

9  CPI
CPI is highly sensitive to hit rate:
- 90% hit rate: .90 * 1 + .10 * 20 = 2.90 CPI
- 95% hit rate: .95 * 1 + .05 * 20 = 1.95 CPI
- 99% hit rate: .99 * 1 + .01 * 20 = 1.19 CPI
Hit rate matters: larger caches and multi-level caches improve hit rate.

10  How Is the Cache Implemented?
Basic concept:
- Traditional memory: given an address, provide some data
- Associative memory: given data, provide an address (AKA "content-addressable memory")
In a cache, the "data" being matched is the memory address, and the "address" returned is which cache line holds it.

11  Cache Implementation
Fully associative (read the text for set associative). Two structures: an associative memory mapping memory addresses to cache lines, and the cache lines themselves.

  Associative memory              Cache data
  (# of cache lines entries)      (each line is one cache-line-width block)
  Memory Addr    Cache Line       Cache Line    Memory Contents
  0x400800XX     1                1             ...
  0x204500XX     4                2             ...
  0x143300XX     2                3             ...
  0x542300XX     3                4             ...
  ...            ...              ...           ...
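A fully associative lookup can be sketched in software. The structure, line count, and 256 B line size (so the "XX" offset byte in the slide's addresses falls within one line) are illustrative assumptions; real hardware compares all tags in parallel, which this loop only models.

```c
#include <stdbool.h>
#include <stdint.h>

#define NUM_LINES   4
#define OFFSET_BITS 8   /* 256 B lines: low 8 bits select a byte in the line */

struct cache_line {
    bool     valid;
    uint32_t tag;                       /* address with offset bits stripped */
    uint8_t  data[1 << OFFSET_BITS];
};

static struct cache_line cache[NUM_LINES];

/* Compare the incoming tag against every line; return the matching line
 * index, or -1 on a miss. */
int lookup(uint32_t addr) {
    uint32_t tag = addr >> OFFSET_BITS;
    for (int i = 0; i < NUM_LINES; i++)
        if (cache[i].valid && cache[i].tag == tag)
            return i;
    return -1;
}
```
For example, loading the slide's address 0x204500XX into line 4 (index 3) means storing tag 0x204500 there; any address of the form 0x204500XX then hits that line.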

12  The Issues
How is the cache organized?
- Size: line size and number of lines
- Write policy
- Replacement strategy

13  Cache Size
Need to choose the size of lines:
- Bigger lines exploit more spatial locality
- Diminishing returns for larger and larger lines
- Tends to be around 128 B
And the number of lines:
- More lines == higher hit rate, but slower memory
- Use as many as practical

14  Writing to the Cache
Need to keep the cache consistent with memory:
- "Write-through": write to cache and memory simultaneously
- "Write-back" (refinement): write to the cache and mark the line 'dirty'; the line will need to be copied back to main memory eventually
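The write-back policy above can be sketched with a dirty bit: a store updates only the cached copy, and memory is updated only when a dirty line is evicted. The single-line "cache" and byte-array "memory" here are illustrative, not from the slide.

```c
#include <stdbool.h>
#include <stdint.h>

#define LINE_SIZE 32

struct line {
    bool    valid, dirty;
    uint8_t data[LINE_SIZE];
};

static uint8_t backing[LINE_SIZE];          /* main memory for this line */
static struct line cline = { .valid = true };

/* Write-back store: touch only the cache, remember the line changed. */
void write_byte(int offset, uint8_t value) {
    cline.data[offset] = value;
    cline.dirty = true;
}

/* On eviction, copy the line back to memory only if it was modified. */
void evict(void) {
    if (cline.dirty)
        for (int i = 0; i < LINE_SIZE; i++)
            backing[i] = cline.data[i];
    cline.valid = false;
    cline.dirty = false;
}
```
A write-through cache would instead update `backing` inside `write_byte` itself, trading extra memory traffic for never needing the dirty bit.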

15  Replacement Strategies
Problem: we need to make space in the cache for a new entry. Which line should be 'evicted'?
- Ideal: the line with the longest time until its next access (not knowable in advance)
- Least-recently used: complicated
- Random selection: simple, and the effect on hit rate is relatively small
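Least-recently-used eviction can be sketched with per-line timestamps: record the "time" of each line's last access and evict the line with the oldest stamp. The names and 4-line size are illustrative assumptions.

```c
#include <stdint.h>

#define NUM_LINES 4

static uint64_t last_used[NUM_LINES];  /* logical time of last access */
static uint64_t now;

/* Call on every access to a line: stamp it with the current time. */
void touch(int line) { last_used[line] = ++now; }

/* Victim = the line whose last access is furthest in the past. */
int lru_victim(void) {
    int victim = 0;
    for (int i = 1; i < NUM_LINES; i++)
        if (last_used[i] < last_used[victim])
            victim = i;
    return victim;
}
```
Real hardware avoids full timestamps (they cost storage and a comparison tree) and uses approximations such as pseudo-LRU bits, which is part of why the slide calls LRU "complicated".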

16  Processor-DRAM Gap (Latency)
[Chart: performance vs. time on a log scale (1 to 1000), 1980-2000. CPU performance ("Moore's Law") improves ~60%/yr; DRAM improves ~7%/yr. The processor-memory performance gap grows ~50% per year. Patterson, 1998.]

17  Will Do Almost Anything to Improve Hit Rate
Lots of techniques; most important: make the cache big.
- An improvement of 1% is very worthwhile
- Avoid the worst case whenever possible
- Multilevel caching

