
Slide 1: ECE 406 - Design of Complex Digital Systems
Lecture 19: Cache Operation & Design
Spring 2009
W. Rhett Davis, NC State University
with significant material from Paul Franzon, Bill Allen, & Xun Liu

Slide 2: Announcements
- HW#8 due Thursday
- Proj#2 due in 16 days (start early!)

Slide 3: Summary of Last Lecture
- How can you tell if an interface has flow control?
- What can you do to reduce the complexity of the state transition diagram for an interface with flow control?

Slide 4: Today's Lecture
- Cache Introduction
- Cache Examples
- Project #2 Introduction

Slide 5: Cache Memory Fundamentals
- A cache memory is another memory block in the system that works closely with the "main" memory block to improve the performance of memory accesses
- Cache memory:
  - is faster than main memory
  - is usually physically closer to the decode and execution units
  - is smaller in capacity than the main memory
  - holds the frequently accessed data and/or instructions

Slide 6: Cache Memory Fundamentals
- Programmers want large amounts of fast memory, for both function and performance
- Large main memories are usually slow
- Programs do not access all code/data uniformly; smaller portions of the total data and code (instructions) are accessed more frequently than the rest
- Programs exhibit:
  - "Spatial Locality": a high probability that an instruction physically close in memory to the one just accessed will be accessed
  - "Temporal Locality": a high probability that a recently accessed instruction will be accessed again

Slide 7: Multi-Level Cache Hierarchy
- [Diagram: Main Memory → Level 2 Cache → Level 1 Cache, holding frequently accessed blocks of memory (data and/or instructions)]

Slide 8: Elements of a Cache
- Size: in relation to the main memory
- Mapping: direct, set associative, fully associative, etc.
- Replacement algorithm (for a cache "miss"): LRU, FIFO, LFU, Random, etc.
- Write policy: write back, write through, write once, write allocate, etc.
- Line size (block size)
- Cache levels: number of caches, "memory hierarchy"
- Cache coherency: across multiple processors with caches
- Type of accesses: unified (both instruction & data) or split (separate instruction & data caches)

Slide 9: Basic Cache Operation
- The cache controller receives the address of the data or instruction to be accessed from the CPU
- Is the data/instruction in the cache?
  - Yes ("cache hit"): forward the data/instruction to the CPU; done
  - No ("cache miss"):
    - The cache controller accesses main memory to get the requested data/instruction
    - Allocate/replace the lines in the cache for the requested data/instruction (replacement policy)
    - Load the data/instruction and its associated block into the cache (read/write policy)
    - Forward the data/instruction to the CPU
- The order and sequence depend on the replacement and read/write policies
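The hit/miss flow above can be sketched in a few lines of Python. This is a minimal model, not the project's hardware: the cache is just a dictionary from address to data, and the function name `cache_access` is illustrative.

```python
# Minimal sketch of the Slide 9 flow: check the cache, and on a miss
# fetch from main memory and load the cache before forwarding to the CPU.

def cache_access(cache, main_memory, address):
    """Return the requested word, filling the cache on a miss."""
    if address in cache:            # "cache hit"
        return cache[address]       # forward to the CPU; done
    # "cache miss": access main memory, then allocate/replace
    data = main_memory[address]
    cache[address] = data           # load into the cache
    return data                     # forward to the CPU

main_memory = {0x3000: 0x5020, 0x3001: 0x1027}
cache = {}
cache_access(cache, main_memory, 0x3000)   # miss: fills the cache
cache_access(cache, main_memory, 0x3000)   # hit: served from the cache
```

A real controller also decides *which* line to replace and *when* memory is updated; those policy choices are the subject of the next two slides.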

Slide 10: Cache Policies
- Replacement policy
  - Miss: decide which location(s) and what contents in the cache to replace with the requested data and its associated "block"
  - Policy options: LRU, FIFO, LFU, Random
  - We will use a direct-mapped cache, which means that only one cache location is mapped to each main-memory location; a miss will always require replacement
- Read policy options
  - Hit: (1) forward the requested data to the CPU
  - Miss:
    - (1) "Load Through": forward to the CPU as the cache is filled from main memory
    - (2) Fill the cache first from main memory, then forward to the CPU
    - We will use option (2)

Slide 11: Cache Policies
- Write policy options
  - Hit:
    - (1) "Write Through": write to both the cache and main memory
    - (2) "Write Back": write to the cache; update main memory upon a cache "flush"
    - We will use option (1)
  - Miss:
    - (1) "Write Allocate": write to main memory and then fill the cache
    - (2) "Write No-Allocate": write to main memory only
    - We will use option (1)
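The policy choices on Slides 10 and 11 (direct-mapped, fill-first on a read miss, write-through on a hit, write-allocate on a miss) can be put together in one behavioral sketch. This models the behavior only, not the project's RTL; the class and field names are illustrative, and main memory is modeled as a plain Python list.

```python
# Behavioral sketch of the chosen policies: direct-mapped cache,
# read miss fills the line first, write-through + write-allocate.

class DirectMappedCache:
    def __init__(self, memory, offset_bits=2, index_bits=4):
        self.memory = memory                  # backing main memory (a list)
        self.offset_bits = offset_bits
        self.index_bits = index_bits
        self.block_words = 1 << offset_bits
        n_lines = 1 << index_bits
        self.valid = [False] * n_lines
        self.tags = [0] * n_lines
        self.data = [[0] * self.block_words for _ in range(n_lines)]

    def _split(self, addr):
        offset = addr & (self.block_words - 1)
        index = (addr >> self.offset_bits) & ((1 << self.index_bits) - 1)
        tag = addr >> (self.offset_bits + self.index_bits)
        return tag, index, offset

    def _fill(self, tag, index, addr):
        """Load the whole block containing addr into the selected line."""
        base = addr & ~(self.block_words - 1)
        self.data[index] = list(self.memory[base:base + self.block_words])
        self.tags[index] = tag
        self.valid[index] = True

    def read(self, addr):
        tag, index, offset = self._split(addr)
        hit = self.valid[index] and self.tags[index] == tag
        if not hit:
            self._fill(tag, index, addr)   # option (2): fill first, then forward
        return self.data[index][offset], hit

    def write(self, addr, word):
        tag, index, offset = self._split(addr)
        self.memory[addr] = word           # write-through: memory always updated
        if self.valid[index] and self.tags[index] == tag:
            self.data[index][offset] = word   # hit: update the cache copy too
        else:
            self._fill(tag, index, addr)   # write-allocate: fill after the write

mem = list(range(64))
cache = DirectMappedCache(mem)
word, hit = cache.read(0x10)   # miss: block 0x10-0x13 is filled first
```

Because the mapping is direct, there is no replacement decision to make: the index bits select exactly one line, so a miss always overwrites that line.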

Slide 12: Today's Lecture
- Cache Introduction
- Cache Examples
- Project #2 Introduction

Slide 13: Direct-Mapped Caches
- Each main-memory address is divided into three fields: Tag | Index | Offset
- Example 1:
  - 32 main-memory locations (5 address bits)
  - 16 bits per word
  - 0 offset bits (1 word per block)
  - 3 index bits (8 blocks in cache)
  - 2 tag bits
  - Cache RAM will be 8 words x 18 bits
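The Example 1 field split is just bit slicing, and can be checked with a few shifts and masks. The function name `split_address` is illustrative; the defaults match Example 1 (0 offset bits, 3 index bits, 2 tag bits).

```python
# Check the Example 1 field split: for a 5-bit address, the low 3 bits
# are the index and the high 2 bits are the tag (no offset bits).

def split_address(addr, offset_bits=0, index_bits=3):
    offset = addr & ((1 << offset_bits) - 1)
    index = (addr >> offset_bits) & ((1 << index_bits) - 1)
    tag = addr >> (offset_bits + index_bits)
    return tag, index, offset

# Address 10101 (binary): tag = 10, index = 101, offset field is empty
tag, index, offset = split_address(0b10101)
print(format(tag, "02b"), format(index, "03b"))   # → 10 101
```

The 18-bit cache RAM width follows the same arithmetic: 16 data bits plus the 2 tag bits stored alongside each word.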

Slide 14: Basic Cache Architecture
- [Diagram: main memory with 32 locations (addresses 00000 to 11111) mapped onto a cache with 8 "blocks" (indices 000 to 111); example address 10101 is highlighted]

Slide 15: Basic Cache Architecture
- [Same diagram as Slide 14]
- Memory locations are mapped to cache locations: the cache holds copies of what is in main memory

Slide 16: Basic Cache Architecture
- [Same diagram as Slide 14]
- The lower-order bits are used as the cache "index"

Slide 17: Basic Cache Architecture
- [Same diagram as Slide 14]
- Multiple memory locations map to the same cache index (conflicting mappings)

Slide 18: Basic Cache Architecture
- [Same diagram as Slide 14]
- The higher-order bits are used as the cache "tag"

Slide 19: Basic Cache Architecture
- [Same diagram as Slide 14, with a cache tag column added]
- The higher-order bits are used as the cache "tag" to determine which particular memory line is in the cache, that is, which "index" is in the cache

Slide 20: Basic Cache Architecture
- [Same diagram as Slide 19, with the tag entries unknown]
- Which particular memory line is in the cache? That is, which "index" is in the cache?

Slide 21: Basic Cache Architecture
- [Same diagram as Slide 19; for example address 10101, the tag bits 10 stored at index 101 identify which memory line is cached]
- Compare the "tag" to determine which particular memory line is in the cache, that is, which "index" is in the cache

Slide 22: Basic Cache Architecture
- [Same diagram as Slide 21, with "compare" and "decode" logic shown in the cache controller]
- Therefore, a comparator (for the tag) and a decoder (for the index) are needed in the cache controller

Slide 23: Basic Cache Architecture
- [Same diagram as Slide 22, with a one-bit valid array alongside the tag column]
- The valid array indicates whether a cache block and tag have been loaded; an invalid entry should always result in a "miss"
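The decode/compare/valid logic of Slides 20-23 reduces to one boolean expression per access. This sketch models it for the Example 1 geometry (3 index bits); the function name and array layout are illustrative, not the project's hardware.

```python
# Hit check: decode the index to select one line, compare the stored
# tag against the address's high-order bits, and qualify with the
# valid bit so an unloaded line always misses.

def is_hit(valid, tags, addr, index_bits=3):
    index = addr & ((1 << index_bits) - 1)       # "decode": select one line
    tag = addr >> index_bits                     # higher-order bits
    return valid[index] and tags[index] == tag   # "compare", gated by valid

valid = [False] * 8
tags = [0] * 8
valid[0b101] = True
tags[0b101] = 0b10     # line 101 holds the block for address 10101
```

In hardware this is exactly the comparator (tag), decoder (index), and an AND gate with the valid bit that Slide 22 calls for.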

Slide 24: Another Direct-Mapped Example
- Example 2 (used on HW#8 & Proj#2):
  - 2^16 main-memory locations (16 address bits)
  - 16 bits per word
  - 2 offset bits: how many words per block?
  - 4 index bits: how many blocks in cache?
  - How many tag bits?
  - How big will the cache RAM be?
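One way to check the arithmetic behind these questions: the field widths follow directly from the stated parameters (16 address bits, 2 offset bits, 4 index bits), since the three fields must partition the address.

```python
# Field arithmetic for Example 2: words per block and blocks in the
# cache are powers of two of the field widths, and the tag gets
# whatever address bits are left over.

address_bits, offset_bits, index_bits = 16, 2, 4
words_per_block = 2 ** offset_bits
blocks_in_cache = 2 ** index_bits
tag_bits = address_bits - index_bits - offset_bits
print(words_per_block, blocks_in_cache, tag_bits)   # → 4 16 10
```

The cache RAM sizing question is left for the exercise, since it depends on how the tag and valid bits are organized alongside the data.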

Slide 25: Example Program

Address  Data  Assembly Language
3000     5020  // AND R0, R0, #0
3001     1027  // ADD R0, R0, #7
3002     5260  // AND R1, R1, #0
3003     1265  // loop1: ADD R1, R1, #5
3004     103F  // ADD R0, R0, #-1
3005     03FD  // BRP loop1
3006     3202  // ST R1, var1
3007     EC04  // LEA R6, dest
3008     C180  // JMP R6
3009     0000  // var1: NOP
300A     0000  // var2: NOP
300B     0000  // var3: NOP
300C     25FC  // dest: LD R2, var1
300D     14A1  // ADD R2, R2, #1
300E     75BE  // STR R2, R6, #-2
300F     7DBF  // STR R6, R6, #-1
3010     A7FA  // LDI R3, var3
3011     B5F9  // STI R2, var3
3012     0FFF  // last: BRNZP last

Slide 26: Exercise
- For the first 7 instructions, find the following:
  - tag, index, and offset for each memory access
  - type of cache operation (e.g., read hit, read miss, write hit, or write miss)
  - the contents of the cache RAM

Slide 27: Exercise
- 1st instruction: fetch from location 3000
  - offset:
  - index:
  - tag:
  - Operation:
- Cache RAM contents: [empty table with columns Index, Valid, Tag, Data (words 0-3) and rows 0, 1, 2, ..., F]

Slide 28: Exercise
- 2nd instruction: fetch from location 3001
  - offset:
  - index:
  - tag:
  - Operation:
- Cache RAM contents: [empty table with columns Index, Valid, Tag, Data (words 0-3) and rows 0, 1, 2, ..., F]

Slide 29: Exercise
- 5th instruction: fetch from location 3004
  - offset:
  - index:
  - tag:
  - Operation:
- Cache RAM contents: [empty table with columns Index, Valid, Tag, Data (words 0-3) and rows 0, 1, 2, ..., F]

Slide 30: Exercise
- 7th instruction: fetch from location 3006, then write 0023 to location 3009
  - offset:
  - index:
  - tag:
- After writing to main memory, the block is loaded
- Cache RAM contents: [empty table with columns Index, Valid, Tag, Data (words 0-3) and rows 0, 1, 2, ..., F]

Slide 31: Exercise
- What if the next instruction were a read from location 1105?
  - offset:
  - index:
  - tag:
- What if the next instruction were a write to location 300A?
  - offset:
  - index:
  - tag:

Slide 32: Today's Lecture
- Cache Introduction
- Cache Examples
- Project #2 Introduction

Slide 33: Project #1 System
- Synchronous memory with separate din/dout/address lines

Slide 34: Project #2 Changes
- Asynchronous off-chip memory with shared din/dout/address lines
- The cache sits between the processor and the memory
- LC3 unchanged except for the "macc" signal
  - High when the state is Fetch, Read Memory, Write Memory, or Read Indirect Address
- SimpleLC3 and Memory blocks will be provided

Slide 35: Data Transfer Interface
- Cache to off-chip memory signals: read request (rrqst), data/address (data), read ready (rrdy), read data ready (rdrdy), read data accept (rdacpt), write request (wrqst), write accept (wacpt)
- LC-3 to cache signals: addr, din, rd, dout, complete, memory access (macc), plus clock and reset

Slide 36: Protocol for Read Miss
- [Timing diagram: read request is asserted along with the address; read ready acknowledges it; then four words (data0-data3) are transferred on the shared data/address lines, each paired with read data ready and read data accept]
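The key property of this protocol is that one read request returns a whole block of four words, each gated by a ready/accept pair. This is a behavioral sketch only, with no timing: the function name is illustrative and memory is a plain list, not the asynchronous off-chip interface.

```python
# Behavioral sketch of the read-miss transfer: assert the request with
# the block address, wait for ready, then collect four words with a
# ready/accept handshake per word.

def read_block(memory, base_addr, block_words=4):
    """Collect one block; each word is gated by a ready/accept pair."""
    words = []
    rrqst = True                        # cache: read request + block address
    rrdy = rrqst                        # memory: acknowledges with read ready
    while rrdy and len(words) < block_words:
        rdrdy = True                    # memory: next word is ready
        if rdrdy:
            words.append(memory[base_addr + len(words)])
            rdacpt = True               # cache: read data accept for this word
    return words
```

In the real interface the ready/accept signals toggle over multiple cycles for each word; here they collapse to one loop iteration per word.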

Slide 37: Protocol for Write Hit
- [Timing diagram: write request is asserted along with the address and data; write accept acknowledges the transfer]

Slide 38: Protocol for Write Miss
- [Timing diagram: the write itself (write request/write accept with address and data) is followed by a block read (read request, then data0-data3 with read data ready/read data accept) to fill the cache line]

Slide 39: Cache System Block Diagram
- [Block diagram]

Slide 40: UnifiedCache Schematic
- [Schematic]

Slide 41: CacheController Block
- Takes the handshaking signals from the LC-3 CPU and off-chip memory as inputs
- Takes the miss indicator from CacheData as an input
- Maintains the state of the cache and the interfaces
- Maintains a 2-bit counter that specifies the word offset to be loaded into the cache
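The 2-bit counter's job is simply to step through word offsets 0 to 3 as each word of a missed block arrives, wrapping back to 0 after the fourth word. A behavioral sketch, with illustrative names (the project's RTL counter is not shown here):

```python
# Sketch of the 2-bit word-offset counter: one step per word loaded
# into the cache line, with wrap-around after the fourth word.

class OffsetCounter:
    def __init__(self):
        self.count = 0

    def step(self):
        """Advance after a word is loaded; the 2-bit mask wraps 3 -> 0."""
        self.count = (self.count + 1) & 0b11
        return self.count

ctr = OffsetCounter()
offsets = [ctr.count] + [ctr.step() for _ in range(3)]
print(offsets)   # → [0, 1, 2, 3]
```

Because the counter is exactly 2 bits wide, "counter wrapped back to 0" doubles as the controller's signal that the whole block has been read.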

Slide 42: Controller State Machine
- [State diagram with states 0-8; transition labels include: reset; macc=0 (stay idle); Read-hit; Read-miss; rrdy=1; rdrdy=1; Read-incomplete; Read-complete; Write; wacpt=1; wacpt=0, hit; wacpt=0, miss; always]

Slide 43: Use Counter to Read Four Words
- [Same state diagram as Slide 42, with the read-data transitions annotated by the 2-bit counter: the rdrdy=1 transition repeats once per word, and rdrdy=0 waits between words, until all four words of the block have been read]

