Presentation is loading. Please wait.

Presentation is loading. Please wait.

Prof. John Nestor ECE Department Lafayette College Easton, Pennsylvania 18042 ECE 313 - Computer Organization Lecture 20 - Memory.

Similar presentations


Presentation on theme: "Prof. John Nestor ECE Department Lafayette College Easton, Pennsylvania 18042 ECE 313 - Computer Organization Lecture 20 - Memory."— Presentation transcript:

1

2 Prof. John Nestor ECE Department Lafayette College Easton, Pennsylvania 18042 nestorj@lafayette.edu ECE 313 - Computer Organization Lecture 20 - Memory Hierarchy 1 Fall 2004 Reading: 7.1-7.3 Portions of these slides are derived from: Textbook figures © 1998 Morgan Kaufmann Publishers all rights reserved Tod Amon's COD2e Slides © 1998 Morgan Kaufmann Publishers all rights reserved Dave Patterson’s CS 152 Slides - Fall 1997 © UCB Rob Rutenbar’s 18-347 Slides - Fall 1999 CMU other sources as noted

3 ECE 313 Fall 2004Lecture 20 - Memory2 Roadmap for the term: major topics  Overview / Abstractions and Technology  Instruction sets  Logic & arithmetic  Performance  Processor Implementation  Single-cycle implemenatation  Multicycle implementation  Pipelined Implementation  Memory systems   Input/Output

4 ECE 313 Fall 2004Lecture 20 - Memory3 Outline - Memory Systems  Overview   Motivation  General Structure and Terminology  Memory Technology  Static RAM  Dynamic RAM  Disks  Cache Memory  Virtual Memory

5 ECE 313 Fall 2004Lecture 20 - Memory4 Memory Systems - the Big Picture  Memory provides processor with  Instructions  Data  Problem: memory is too slow and too small Control Datapath Memory Processor Input Output Instructions Data “Five Classics Components” Picture

6 ECE 313 Fall 2004Lecture 20 - Memory5 Memory Hierarchy - the Big Picture  Problem: memory is too slow and too small  Solution: memory hierarchy Fastest Slowest Smallest Biggest Highest Lowest Speed: Size: Cost: Control Datapath Secondary Storage (Disk) Processor Registers L2 Off-Chip Cache Main Memory (DRAM) L1 On-Chip Cache

7 ECE 313 Fall 2004Lecture 20 - Memory6 Why Hierarchy Works  The principle of locality  Programs access a relatively small portion of the address space at any instant of time.  Temporal locality: recently accessed data is likely to be used again  Spatial locality: data near recently accessed data is likely to be used soon  Result: the illusion of large, fast memory Address Space 02 n - 1 Probability of reference

8 ECE 313 Fall 2004Lecture 20 - Memory7 Memory Hierarchy - Speed vs. Size Control Datapath Secondary Storage (Disk) Processor Registers L2 Off-Chip Cache Main Memory (DRAM) L1 On-Chip Cache 0.5-25 5,000,000 (5ms)Speed (ns):80-250 <1K Size (bytes):>100G <16G<16M 0.25-0.5

9 ECE 313 Fall 2004Lecture 20 - Memory8 Memory Hierarchy - Terminology Processor Blocks of Data Hit: Data in Upper Level Miss: Data not in Upper Level

10 ECE 313 Fall 2004Lecture 20 - Memory9 Memory Hierarchy Terminology (cont’d)  Hit: data appears in some block in the upper level (green block)  Hit Rate: the fraction of memory accesses that “hit”  Hit Time: time to access the upper level (time to determine hit/miss + access time)  Miss: data must be retrieved from block in lower level (orange block)  Miss Rate = 1 - (Hit Rate)  Miss Penalty: Time to replace block in upper level + Time to deliver data to the processor  Hit Time > Miss Rate

11 ECE 313 Fall 2004Lecture 20 - Memory10 Typical Memory Hierarchy - Details  Registers - Small, fastest on-chip storage  Managed by compiler and run-time system  Cache - Small, fast on-chip storage  Associative lookup - managed by hardware  Memory - Slower, Larger off-chip storage  Limited size <16Gb - managed by hardware, OS  Disk - Slowest, Largest off-chip storage  Virtual memory - simulate a large memory using disk, hardware, and operating system  File storage - store data files using operating system

12 ECE 313 Fall 2004Lecture 20 - Memory11 Outline - Memory Systems  Overview  Motivation  General Structure and Terminology  Memory Technology   Static RAM  Dynamic RAM  Cache Memory  Virtual Memory

13 ECE 313 Fall 2004Lecture 20 - Memory12 Memory Types  Static RAM  Storage using latch circuits  Values saved while power on  Dynamic RAM  Storage using capacitors  Values must be refreshed bit word / row select 10 01 bit C

14 ECE 313 Fall 2004Lecture 20 - Memory13 Tradeoffs - Static vs. Dynamic RAM  Static RAM (SRAM) - used for L1, L2 cache  Fast - 0.5-25ns access time (less for on-chip)  Larger, More Expensive  Higher power consumption  Dynamic RAM (DRAM) - used for PC main memory  Slower - 80-250ns access time*  Smaller, Cheaper  Lower power consumption

15 ECE 313 Fall 2004Lecture 20 - Memory14 DRAM Organization Row Decoder Column Selector / Latch / IO Row Address Column Address /RAS /CAS DATA Row Select Line Bit (data) Line

16 ECE 313 Fall 2004Lecture 20 - Memory15 0 0 010 011 DRAM Read Operation Row Decoder Column Selector / Latch / IO Row Address Column Address /RAS /CAS DATA

17 ECE 313 Fall 2004Lecture 20 - Memory16 DRAM Trends  RAM size: 4X every 3 years  RAM speed: 2X every 10 years DRAM YearSizeCycle Time 198064 Kb250 ns 1983256 Kb220 ns 19861 Mb190 ns 19894 Mb165 ns 199216 Mb145 ns 199564 Mb120 ns 1997?128 Mb ?? ns 1999?256 Mb?? ns 1980-1995 Size change: 1000:1! 1980-1995 Speed change: 2:1!

18 ECE 313 Fall 2004Lecture 20 - Memory17 The Processor/Memory Speed Gap DRAM 9%/yr. (2X/10 yrs) 1 10 100 1000 19801981198319841985198619871988 1989 19901991199219931994199519961997199819992000 DRAM CPU 1982 Processor-Memory Performance Gap: (grows 50% / year) Performance Time “Moore’s Law”

19 ECE 313 Fall 2004Lecture 20 - Memory18 Addressing the Speed Gap  Latency depends on physical limitations  Bandwidth can be increased using:  Parallelism - transfer more bits / word  Burst transfers - transfer successive words on each cycle  So... use bandwidth to support memory hierarchy!  Use cache to support locality of reference  Design hierarchy to transfer large blocks of memory

20 ECE 313 Fall 2004Lecture 20 - Memory19 Current DRAM Parts  Synchronous DRAM (SDRAM) - clocked transfer of bursts of data starting at a specific address  Double-Data Rate SDRAM - transfer two bits/clock cycle  Quad-Data Rate SDRAM - transfer four bits / clock cycle  Rambus RDRAM - High-speed interface for fast transfers  Current PCs use some form of SDRAM/RDRAM  SDRAM w/ PC100 or PC133 memory bus  RDRAM w/ PC800 memory bus

21 ECE 313 Fall 2004Lecture 20 - Memory20 Memory Configuration in Current PCs Processor System Controller L1 Cache Main Memory (DRAM) L2/L3 Cache (SRAM) (I/O Bus)

22 ECE 313 Fall 2004Lecture 20 - Memory21 Outline - Memory Systems  Overview  Motivation  General Structure and Terminology  Memory Technology  Static RAM  Dynamic RAM  Cache Memory   Virtual Memory

23 ECE 313 Fall 2004Lecture 20 - Memory22 CPU Hit: Data in Cache (no penalty) Miss: Data not in Cache (miss penalty) Cache Memory DRAM Memory Processor addrdata addrdata Cache Operation  Insert between CPU, Main Mem.  Implement with fast Static RAM  Holds some of a program’s  data  instructions  Operation:

24 ECE 313 Fall 2004Lecture 20 - Memory23 Four Key Cache Questions: 1.Where can block be placed in cache? (block placement) 2.How can block be found in cache? (block identification) 3.Which block should be replaced on a miss? (block replacement) 4.What happens on a write? (write strategy)

25 ECE 313 Fall 2004Lecture 20 - Memory24 Basic Cache Design  Organized into blocks or lines  Block Contents  tag - extra bits to identify block (part of block address)  data - data or instruction words- contiguous memory locations  Our example:  One-word (4 byte) block size  30-bit tag  Two blocks in cache CPU tag 0data 0 CPU tag 1data 1 0x00000000 0x00000004 0x00000008 0x0000000C 0x00000000 b0 b1 Cache Main Memory

26 ECE 313 Fall 2004Lecture 20 - Memory25 Cache Example (2)  Assume:  r1==0, r2==1, r4==2  1 cycle for cache access  5 cycles for main. mem. access  1 cycle for instr. execution  At cycle 1 - PC=0x00  Fetch instruction from memory look in cache MISS - fetch from main mem (5 cycle penalty) CPU (empty) CPU (empty) L: add r1,r1,r2 0x00000000 0x00000004 0x00000008 0x0000000C 0x00000000 b0 b1 Cache Main Memory bne r4,r1,L sub r1,r1,r1 L: j L MISSMISS

27 ECE 313 Fall 2004Lecture 20 - Memory26 Cache Example (3)  At cycle 6  Execute instr. add r1,r1,r2 CPU (empty) CPU (empty) L: add r1,r1,r2 0x00000000 0x00000004 0x00000008 0x0000000C 0x00000000 b0 b1 Cache Main Memory bne r4,r1,L sub r1,r1,r1 L: j L CycleAddressOp/Instr. r1 1-5 FETCH 0x…000 60x…0 add r1,r1,r2 1 L: add r1,r1,r2 0x…0

28 ECE 313 Fall 2004Lecture 20 - Memory27 Cache Example (4)  At cycle 6 - PC=0x04  Fetch instruction from memory look in cache MISS - fetch from main mem (5 cycle penalty) CPU (empty) CPU (empty) L: add r1,r1,r2 0x00000000 0x00000004 0x00000008 0x0000000C 0x00000000 b0 b1 Cache Main Memory bne r4,r1,L sub r1,r1,r1 L: j L CycleAddressOp/Instr. r1 1-5 FETCH 0x…0 60x…0 add r1,r1,r2 1 L: add r1,r1,r2 0x…0 MISSMISS 6-10 FETCH 0x…4

29 ECE 313 Fall 2004Lecture 20 - Memory28 Cache Example (5)  At cycle 11  Execute instr. bne r4,r1,L CPU (empty) CPU (empty) L: add r1,r1,r2 0x00000000 0x00000004 0x00000008 0x0000000C 0x00000000 b0 b1 Cache Main Memory bne r4,r1,L sub r1,r1,r1 L: j L CycleAddressOp/Instr. r1 1-5 FETCH 0x…000 60x…0 add r1,r1,r2 1 L: add r1,r1,r2 0x…0 6-10 FETCH 0x…004 bne r4,r1,L 0x…1 110x…4 bne r4,r1,L 1

30 ECE 313 Fall 2004Lecture 20 - Memory29 Cache Example (6)  At cycle 11 - PC=0x00  Fetch instruction from memory  HIT - instruction in cache CPU (empty) CPU (empty) L: add r1,r1,r2 0x00000000 0x00000004 0x00000008 0x0000000C 0x00000000 b0 b1 Cache Main Memory bne r4,r1,L sub r1,r1,r1 L: j L CycleAddressOp/Instr. r1 1-5 FETCH 0x…0 60x…0 add r1,r1,r2 1 L: add r1,r1,r2 0x…0 6-10 FETCH 0x…4 bne r4,r1,L 0x…1 HITHIT 110x…4 bne r4,r1,L 1 11 FETCH 0x…0 1

31 ECE 313 Fall 2004Lecture 20 - Memory30 Cache Example (7)  At cycle 12  Execute add r1, r1, 2 CPU (empty) CPU (empty) L: add r1,r1,r2 0x00000000 0x00000004 0x00000008 0x0000000C 0x00000000 b0 b1 Cache Main Memory bne r4,r1,L sub r1,r1,r1 L: j L CycleAddressOp/Instr. r1 1-5 FETCH 0x…0 60x…0 add r1,r1,r2 1 L: add r1,r1,r2 0x…0 6-10 FETCH 0x…4 bne r1,r2,L 0x…1 110x…4 bne r4,r1,L 1 12 FETCH 0x…0 1 12 add r1,r1,2 2

32 ECE 313 Fall 2004Lecture 20 - Memory31 Cache Example (8)  At cycle 12 - PC=0x04  Fetch instruction from memory  HIT - instruction in cache CPU (empty) CPU (empty) L: add r1,r1,r2 0x00000000 0x00000004 0x00000008 0x0000000C 0x00000000 b0 b1 Cache Main Memory bne r4,r1,L sub r1,r1,r1 L: j L CycleAddressOp/Instr. r1 1-5 FETCH 0x…0 60x…0 add r1,r1,r2 1 L: add r1,r1,r2 0x…0 6-10 FETCH 0x…4 bne r4,r1,L 0x…1 110x…4 bne r4,r1,L 1 12 FETCH 0x…0 1 12 add r1,r1,2 2 12 FETCH 0x04 HITHIT

33 ECE 313 Fall 2004Lecture 20 - Memory32 Cache Example (9)  At cycle 13  Execute instr. bne r4, r1, L  Branch not taken CPU (empty) CPU (empty) L: add r1,r1,r2 0x00000000 0x00000004 0x00000008 0x0000000C 0x00000000 b0 b1 Cache Main Memory bne r4,r1,L sub r1,r1,r1 L: j L CycleAddressOp/Instr. r1 1-5 FETCH 0x…0 60x…0 add r1,r1,r2 1 L: add r1,r1,r2 0x…0 6-10 FETCH 0x…4 bne r4,r1,L 0x…1 110x…4 bne r4,r1,L 1 12 FETCH 0x…0 1 12 add r1,r1,2 2 12 FETCH 0x04 13 bne r4, r1, L

34 ECE 313 Fall 2004Lecture 20 - Memory33 Cache Example (10)  At cycle 13 - PC=0x08  Fetch Instruction from Memory  MISS - not in cache CPU (empty) CPU (empty) L: add r1,r1,r2 0x00000000 0x00000004 0x00000008 0x0000000C 0x00000000 b0 b1 Cache Main Memory bne r4,r1,L sub r1,r1,r1 L: j L CycleAddressOp/Instr. r1 1-5 FETCH 0x…0 60x…0 add r1,r1,r2 1 L: add r1,r1,r2 0x…0 6-10 FETCH 0x…4 bne r4,r1,L 0x…1 110x…4 bne r4,r1,L 1 12 FETCH 0x…0 1 12 add r1,r1,2 2 12 FETCH 0x04 13 bne r4, r1, L 13 FETCH 0x08 MISSMISS

35 ECE 313 Fall 2004Lecture 20 - Memory34 Cache Example (11)  At cycle 17 - PC=0x08  Put instruction into cache  Replace existing instruction CPU (empty) CPU (empty) L: add r1,r1,r2 0x00000000 0x00000004 0x00000008 0x0000000C 0x00000000 b0 b1 Cache Main Memory bne r4,r1,L sub r1,r1,r1 L: j L CycleAddressOp/Instr. r1 1-5 FETCH 0x…0 60x…0 add r1,r1,r2 1 L: add r1,r1,r2 0x…0 6-10 FETCH 0x…4 bne r4,r1,L 0x…1 110x…4 bne r4,r1,L 1 12 FETCH 0x…0 1 12 add r1,r1,2 2 12 FETCH 0x04 13 bne r4, r1, L 13-17 FETCH 0x08 sub r1,r1,r1 0x…2

36 ECE 313 Fall 2004Lecture 20 - Memory35 Cache Example (12)  At cycle 18  Execute sub r1, r1, r1 CPU (empty) L: add r1,r1,r2 0x00000000 0x00000004 0x00000008 0x0000000C 0x00000000 b0 b1 Cache Main Memory bne r4,r1,L sub r1,r1,r1 L: j L CycleAddressOp/Instr. r1 1-5 FETCH 0x…0 60x…0 add r1,r1,r2 1 6-10 FETCH 0x…4 bne r4,r1,L 0x…1 110x…4 bne r4,r1,L 1 12 FETCH 0x…0 1 12 add r1,r1,2 2 12 FETCH 0x04 2 13 bne r4, r1, L 2 13-17 FETCH 0x08 2 18 sub r1, r1, r1 0 sub r1,r1,r1 0x…2

37 ECE 313 Fall 2004Lecture 20 - Memory36 Cache Example (13)  At cycle 18  Fetch instruction from memory  MISS - not in cache CPU (empty) CPU (empty) L: add r1,r1,r2 0x00000000 0x00000004 0x00000008 0x0000000C 0x00000000 b0 b1 Cache Main Memory bne r4,r1,L sub r1,r1,r1 L: j L CycleAddressOp/Instr. r1 1-5 FETCH 0x…0 60x…0 add r1,r1,r2 1 L: add r1,r1,r2 0x…0 6-10 FETCH 0x…4 bne r4,r1,L 0x…1 110x…4 bne r4,r1,L 1 12 FETCH 0x…0 1 12 add r1,r1,2 2 12 FETCH 0x04 2 13 bne r4, r1, L 2 13-17 FETCH 0x08 2 sub r1,r1,r1 18 sub r1, r1, r1 0 18 FETCH 0x0C MISSMISS

38 ECE 313 Fall 2004Lecture 20 - Memory37 Cache Example (14)  At cycle 22  Put instruction into cache  Replace existing instruction CPU (empty) CPU (empty) L: add r1,r1,r2 0x00000000 0x00000004 0x00000008 0x0000000C 0x00000000 b0 b1 Cache Main Memory bne r4,r1,L sub r1,r1,r1 L: j L CycleAddressOp/Instr. r1 1-5 FETCH 0x…0 60x…0 add r1,r1,r2 1 L: add r1,r1,r2 0x…0 6-10 FETCH 0x…4 bne r1,r2,L 0x…1 110x…4 bne r4,r1,L 1 12 FETCH 0x…0 1 12 add r1,r1,2 2 12 FETCH 0x04 2 13 bne r4, r1, L 2 13-17 FETCH 0x08 2 18 sub r1, r1, r1 0 18-22 FETCH 0x0C j L 0x…3 sub r1,r1,r1 0x…2

39 ECE 313 Fall 2004Lecture 20 - Memory38 Cache Example (15) CycleAddressOp/Instr. r1 1-5 FETCH 0x…0 60x…0 add r1,r1,r2 1 6-10 FETCH 0x…4 110x…4 bne r3,r1,L 11 FETCH 0x…0 120x…8 add r1,r1,2 2 12 FETCH 0x…4 130x…4 bne r4,r1,L 13-17 FETCH 0x…8 180x…8 sub r1,r1,r1 0 18-22 FETCH 0x..C 230x…8 j L CPU (empty) CPU (empty) L: add r1,r1,r2 0x00000000 0x00000004 0x00000008 0x0000000C 0x00000000 b0 b1 Cache Main Memory bne r4,r1,L sub r1,r1,r1 L: j L  At cycle 23  Execute j L j L 0x…3 sub r1,r1,r1 0x…2

40 ECE 313 Fall 2004Lecture 20 - Memory39 Compare No-cache vs. Cache CycleAddressOp/Instr. 1-5 FETCH 0x…0 60x…0 add r1,r1,r 6-10 FETCH 0x…4 110x…4 bne r3,r1,L 11-15 FETCH 0x…0 160x…0 add r1,r1,2 16-20 FETCH 0x…4 210x…4 bne r3,r1,L 21-25 FETCH 0x…8 260x…8 sub r1,r1,r1 26-30 FETCH 0x..C 310x…C j L CycleAddressOp/Instr. 1-5 FETCH 0x…0 60x…0 add r1,r1,r 6-10 FETCH 0x…4 110x…4 bne r3,r1,L 11 FETCH 0x…0 120x…0 add r1,r1,2 12 FETCH 0x…4 130x…4 bne r3,r1,L 13-17 FETCH 0x…8 180x…8 sub r1,r1,r1 18-22 FETCH 0x..C 230x…C j L NO CACHE CACHE M M H H M M

41 ECE 313 Fall 2004Lecture 20 - Memory40 Cache Miss and the MIPS Pipeline Compare in Cycle 1 Fetch Completes (Pipeline Restarts) Miss Detected in Cycle 2  Instruction Fetch Clock Cycle 1 Clock Cycle 2+N Clock Cycle 3+N Clock Cycle 4+N Clock Cycle 5+N Clock Cycle 6+N

42 ECE 313 Fall 2004Lecture 20 - Memory41 Cache Miss and the MIPS Pipeline Compare in Cycle 4 Miss Detected in Cycle 5 Load Completes (Pipeline Restarts)  Load Instruction Clock Cycle 1 Clock Cycle 2 Clock Cycle 3 Clock Cycle 4 Clock Cycle 5 Clock Cycle 5+N Clock Cycle 6+N

43 ECE 313 Fall 2004Lecture 20 - Memory42 Coming Up: Four Key Cache Questions: 1.Where can block be placed in cache? (block placement) 2.How can block be found in cache? …using a tag (block identification) 3.Which block should be replaced on a miss? (block replacement) 4.What happens on a write? (write strategy)


Download ppt "Prof. John Nestor ECE Department Lafayette College Easton, Pennsylvania 18042 ECE 313 - Computer Organization Lecture 20 - Memory."

Similar presentations


Ads by Google