
1 in1210/01-PDS 1 TU-Delft The Memory System

2 Organization
Word Address   Byte Address
0              0 1 2 3
1              4 5 6 7
2              8 9 ...
3              ...

3 Connection Memory-CPU
[Figure: the CPU is connected to the Memory through the MAR (driving the Address lines) and the MDR (driving the Data lines); the control signals are Read/Write and MFC.]

4 Memory
• Addressable number of bits
• Different orderings
• Speed up techniques
  - Cache memories
  - Memory interleaving
• Enlargement
  - Virtual memory

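5 Organisation(1)
[Figure: 16x8 memory chip. The address lines A0-A3 feed an address decoder that selects one of the word lines W0 ... W15; each cell is a flip-flop (FF); sense/write circuits connect the columns to the input/output lines b7 ... b1, b0; the control inputs are R/W and CS.]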

6 Pinning
Total pins required for 16x8 memory: 16
• 4 address lines
• 8 data lines
• 2 control lines
• 2 power lines

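7 1K by 1 memory
[Figure: 32 by 32 memory array with word lines W0 ... W31. Of the 10-bit address lines, five bits drive a 5-bit decoder that selects a word line; the remaining five bits drive two 32-to-1 multiplexors, one for the input and one for the output.]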

8 Pinning
Total number of pins required: 16
• 10 address lines
• 2 data lines (in/out)
• 2 control lines
• 2 power lines
For 128 by 8 memory: 19 pins (7+8+2+2)
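The pin counts on the two Pinning slides can be recomputed from the chip organization. This is a minimal sketch, assuming the number of words is a power of two and that every chip has the 2 control and 2 power lines listed on the slides; `pin_count` and its parameter names are hypothetical.

```python
def pin_count(words, data_lines, control_lines=2, power_lines=2):
    """Total pins = address lines needed to select one of `words` locations
    plus data, control and power lines."""
    # Bits needed to address `words` locations (assumes a power of two).
    address_lines = (words - 1).bit_length()
    return address_lines + data_lines + control_lines + power_lines

print(pin_count(16, 8))      # 16 x 8 memory
print(pin_count(1024, 2))    # 1K x 1 memory, separate in and out data lines
print(pin_count(128, 8))     # 128 x 8 memory
```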

9 Multiple Modules(1)
Block-wise organization
[Figure: the MM address is split into k high-order "Module" bits and m low-order "Address in Module" bits. The module bits drive the chip select (CS) of Module 0 ... Module i ... Module n-1; the m address bits go to every module.]

10 Multiple Modules(2)
Interleaving organization
[Figure: the MM address is split into m high-order "Address in Module" bits and k low-order "Module" bits. The module bits drive the chip select (CS) of Module 0 ... Module i ... Module 2**k-1; the m address bits go to every module.]
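The two ways of splitting the MM address can be compared with a small sketch. It assumes the usual convention that the block-wise organization takes the module number from the k high-order bits and the interleaved organization takes it from the k low-order bits; the function names are hypothetical.

```python
def blockwise(address, k, m):
    """Return (module, address in module): module = k high-order bits."""
    module = (address >> m) & ((1 << k) - 1)
    return module, address & ((1 << m) - 1)

def interleaved(address, k, m):
    """Return (module, address in module): module = k low-order bits."""
    module = address & ((1 << k) - 1)
    return module, (address >> k) & ((1 << m) - 1)

# Four consecutive addresses with 4 modules (k=2) of 16 words each (m=4).
for a in range(4):
    print(a, blockwise(a, k=2, m=4), interleaved(a, k=2, m=4))
```

With these parameters the consecutive addresses 0 to 3 all fall in module 0 for the block-wise split, and in modules 0, 1, 2, 3 for the interleaved split.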

11 Question?
• What is the advantage of the interleaved organization?
• What is the disadvantage?

12 Memory Hierarchy
[Figure: levels from top to bottom: CPU, Primary cache, Secondary cache, Main Memory, Disks. Size increases towards the Disks; speed and cost increase towards the CPU.]

13 Caches(1)
• Problem: Main Memory is slower than CPU registers (factor of 5-10)
• Solution: Fast and small memory between CPU and Main Memory
• Contains: recently referenced memory locations
[Figure: CPU, Cache and Main Memory connected in sequence]

14 Caches(2)
• Works because of locality principle
• Profit:
  - Cache hit ratio: h
  - Access time cache: c
  - Cache miss ratio: 1-h
  - Access time main memory: m
  - Mean access time: h·c + (1-h)·m
• Cache is transparent to programmer
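The mean access time formula can be evaluated directly. The numbers below (hit ratio 0.95, cache access 1 time unit, main memory access 10 time units) are illustrative assumptions, not values from the slides.

```python
def mean_access_time(h, c, m):
    """Mean access time for hit ratio h, cache time c, main memory time m."""
    return h * c + (1 - h) * m

# Hypothetical example: 95% hits, main memory ten times slower than the cache.
print(mean_access_time(0.95, 1, 10))
```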

15 Caches(3)
• At READ operation
  - If not in cache, get the block into the cache and read it from the cache (possibly read-through)
  - If in cache, read it from the cache
• At WRITE operation
  - If not in cache, write in main memory
  - If in cache, write in cache, and either:
    » write in main memory (store through)
    » set modified (dirty) bit

16 Caches(3a)
Borrow books from library, store according to first letter of first author name in 26 locations
• Direct mapped: separate location for a single book for each letter
• Associative: any book can go to any of the 26 locations
• Set-associative: 2 locations for letters A-B, C-D, E-F, etc

17 Caches(4)
• Suppose
  - Main Memory is N = 2^n bytes
  - Divided in blocks of b = 2^k bytes
  - Cache: 128 blocks
  - e.g. n=16, k=4, b=16
• Every block in cache has valid bit (is reset when memory is modified)
• At context switch: invalidate cache

18 Direct Mapped Cache(1)
• A block in memory (j) can only be at one place in cache (j mod #cache blocks)
• Place determined by block number
• Main memory address: tag (5 bits) | block (7 bits) | word (4 bits)
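The 5/7/4 split of the 16-bit main memory address can be written as shifts and masks. A minimal sketch; `split_direct` is a hypothetical helper name.

```python
TAG_BITS, BLOCK_BITS, WORD_BITS = 5, 7, 4   # field widths from the slide

def split_direct(address):
    """Split a 16-bit address into (tag, cache block, word within block)."""
    word = address & ((1 << WORD_BITS) - 1)
    block = (address >> WORD_BITS) & ((1 << BLOCK_BITS) - 1)
    tag = address >> (WORD_BITS + BLOCK_BITS)
    return tag, block, word

# Memory blocks 0, 128 and 256 start at byte addresses 0, 128*16 and 256*16.
for j in (0, 128, 256):
    print(j, split_direct(j * 16))
```

All three memory blocks map to cache block 0 (j mod 128) and differ only in their tag.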

19-20 Direct Mapped Cache(1)
[Figure, repeated on two slides: main memory BLOCK 0, BLOCK 1, ... BLOCK 127, BLOCK 128, BLOCK 129, ... BLOCK 255, BLOCK 256, ... mapped onto the CACHE blocks BLOCK 0, BLOCK 1, BLOCK 2, ...; each cache block stores a 5-bit tag.]

21 Associative(1)
• Each block can be at any place in cache
• On a cache access: parallel (associative) match of the tag in the address with the tags in all cache entries
• Associative: slower, more expensive, higher hit ratio
• Main memory address: tag (12 bits) | word (4 bits)

22 Associative(2)
[Figure: main memory BLOCK 0, BLOCK 1, ... BLOCK 127, BLOCK 128, BLOCK 129, ... BLOCK 255, BLOCK 256, ... can each go to any of the 128 cache blocks (BLOCK 0, BLOCK 1, BLOCK 2, BLOCK 3, BLOCK 4, ...); each cache block stores a 12-bit tag.]

23 Set-Associative(1)
• Combination of direct mapped and associative
• Cache consists of sets
• Each set is associative
• One block can only be placed in one set; determined by set number
• Main memory address: tag (6 bits) | set (6 bits) | word (4 bits)
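A lookup in a set-associative cache first selects the set from the address and then compares the tag with every block in that set. The sketch below assumes the 6/6/4 split above, i.e. 64 sets of 2 blocks; the data layout (a list of tag lists, `None` for an empty block) and the names are hypothetical.

```python
TAG_BITS, SET_BITS, WORD_BITS = 6, 6, 4   # field widths from the slide

def split_set_associative(address):
    """Split a 16-bit address into (tag, set number, word within block)."""
    word = address & ((1 << WORD_BITS) - 1)
    set_number = (address >> WORD_BITS) & ((1 << SET_BITS) - 1)
    tag = address >> (WORD_BITS + SET_BITS)
    return tag, set_number, word

def lookup(cache, address):
    """Return the position within the set on a hit, or None on a miss."""
    tag, set_number, _ = split_set_associative(address)
    for position, stored_tag in enumerate(cache[set_number]):
        if stored_tag == tag:       # hardware compares all tags in parallel
            return position
    return None

cache = [[None, None] for _ in range(64)]   # 64 sets x 2 blocks = 128 blocks
cache[0] = [0, 1]                           # set 0 holds memory blocks 0 and 64
print(lookup(cache, 64 * 16))               # memory block 64: hit
print(lookup(cache, 128 * 16))              # memory block 128: same set, miss
```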

24-25 Set-Associative(2)
[Figure, repeated on two slides: main memory BLOCK 0, BLOCK 1, ... BLOCK 127, BLOCK 128, BLOCK 129, ... BLOCK 255, BLOCK 256, ... mapped onto the 128 cache blocks, which are grouped into sets (set 0, set 1, ...); each cache block stores a 6-bit tag.]

26 Question?
• Main memory: 4 GByte
• Cache: 512 blocks of 64 byte
• Cache: 8-way set-associative
• How many bits is the:
  - byte address within a block
  - set number
  - tag

27 Answer!
• Main memory: 4 GByte, so 32-bits address
• Blocks of 64 byte, so 6-bits byte address
• 8-way set-associative cache with 512 blocks, so 512/8=64 sets, so 6-bits set number
• So, 32-6-6=20-bits tag
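The same arithmetic works for any cache whose sizes are powers of two. A short sketch, with `cache_fields` as a hypothetical helper; a direct mapped cache is the 1-way case, and an associative cache is the case where the number of ways equals the number of blocks.

```python
def cache_fields(memory_bytes, block_bytes, num_blocks, ways):
    """Return (tag bits, set bits, byte-offset bits); sizes are powers of two."""
    address_bits = (memory_bytes - 1).bit_length()
    offset_bits = (block_bytes - 1).bit_length()
    set_bits = (num_blocks // ways - 1).bit_length()
    return address_bits - set_bits - offset_bits, set_bits, offset_bits

print(cache_fields(4 * 2**30, 64, 512, 8))   # the question above
print(cache_fields(2**16, 16, 128, 1))       # direct mapped, n=16, k=4
print(cache_fields(2**16, 16, 128, 2))       # 2-way set-associative
print(cache_fields(2**16, 16, 128, 128))     # associative
```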

28 Replacement(1)
(Set) associative replacement algorithms:
• Least Recently Used (LRU)
  - At 2^k blocks per set, implement with a k-bit counter per block
  - Hit: increase the counters that are lower than the referenced block's counter by 1, set the referenced block's counter to 0
  - Miss and set not full: place the new block in an empty position, set its counter to 0, increase the rest by 1
  - Miss and set full: replace the block whose counter has value 2^k - 1, set the new block's counter to 0, increase the rest by 1

29 Example (k=2): HIT
Counters before: 01 00 10 11
Counters after:  10 01 00 11

30 Example (k=2): MISS AND SET NOT FULL
Counters before: 11 00 10 01 (one block position is EMPTY)
Counters after:  00 01 11 10

31 Example (k=2): MISS AND SET FULL
Counters before: 01 00 10 11
Counters after:  10 01 11 00
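The counter scheme for LRU can be sketched for a single set as follows. This is an illustration under stated assumptions rather than a description of any particular hardware: the class `LRUSet`, the use of `None` for an empty position, and the string tags are all hypothetical.

```python
class LRUSet:
    """One cache set with 2**k blocks and a k-bit LRU counter per block."""

    def __init__(self, k):
        self.size = 2 ** k
        self.tags = [None] * self.size    # None marks an empty position
        self.counters = [0] * self.size

    def access(self, tag):
        if tag in self.tags:                          # hit
            i = self.tags.index(tag)
            referenced = self.counters[i]
            for j in range(self.size):
                if self.tags[j] is not None and self.counters[j] < referenced:
                    self.counters[j] += 1             # only lower counters
            self.counters[i] = 0
            return "hit"
        if None in self.tags:                         # miss, set not full
            i = self.tags.index(None)
        else:                                         # miss, set full
            i = self.counters.index(self.size - 1)    # least recently used
        for j in range(self.size):
            if j != i and self.tags[j] is not None:
                self.counters[j] += 1
        self.tags[i] = tag
        self.counters[i] = 0
        return "miss"

s = LRUSet(k=2)
s.tags, s.counters = ["A", "B", "C", "D"], [0b01, 0b00, 0b10, 0b11]
print(s.access("C"), s.counters)   # hit on the block with counter 10
```

Starting from the counters 01 00 10 11, a hit on the third block and a miss on a full set give the "after" values of the HIT and MISS AND SET FULL examples.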

32 Replacement(2)
• Replace oldest block
• Random replacement

33 Program example
Normalize elements of first row of A

    int SUM = 0;
    for (j = 0; j < 10; j++) {
        SUM = SUM + A[0,j];
    }
    AVE = SUM/10;
    for (i = 9; i > -1; i--) {
        A[0,i] = A[0,i]/AVE;
    }

34 Example cache
CACHE with 8 blocks, each block 1 word, LRU replacement
Main memory address (16 bits):
• direct: tag (13 bits) | block (3 bits)
• associative: tag (16 bits)
• set associative: tag (15 bits) | set (1 bit)
[Figure: cache BLOCK 0 ... BLOCK 7, each with a tag; for the set-associative case the blocks are divided into Set 0 and Set 1.]

35 Examples(2)
4x10 array stored in column order, starting at memory address 7A00

Memory address         Contents
0111101000000 0 0 0    a(0,0)
0111101000000 0 0 1    a(1,0)
0111101000000 0 1 0    a(2,0)
0111101000000 0 1 1    a(3,0)
...                    ...
0111101000100 1 0 0    a(0,9)
0111101000100 1 0 1    a(1,9)
0111101000100 1 1 0    a(2,9)
0111101000100 1 1 1    a(3,9)

Tag direct: first 13 bits. Tag set-associative: first 15 bits. Tag associative: all 16 bits.

36 Direct mapped
Contents of cache after pass:

block pos.  j=1     j=3     j=5     j=7     j=9     i=6     i=4     i=2     i=0
0           a[0,0]  a[0,2]  a[0,4]  a[0,6]  a[0,8]  a[0,6]  a[0,4]  a[0,2]  a[0,0]
4           a[0,1]  a[0,3]  a[0,5]  a[0,7]  a[0,9]  a[0,7]  a[0,5]  a[0,3]  a[0,1]

Block positions 1-3 and 5-7 are not used. The slide marks each entry as a miss or a hit.

37 Associative
Contents of cache after pass (an empty cell means the block is unchanged):

block pos.  j=7     j=8     j=9     i=1     i=0
0           a[0,0]  a[0,8]                  a[0,0]
1           a[0,1]          a[0,9]  a[0,1]
2           a[0,2]
3           a[0,3]
4           a[0,4]
5           a[0,5]
6           a[0,6]
7           a[0,7]

38 Set-associative
Contents of cache after pass (an empty cell means the block is unchanged); only set 0, block positions 0-3, is used:

block pos.  j=3     j=7     j=9     i=4     i=2     i=0
0           a[0,0]  a[0,4]  a[0,8]  a[0,4]          a[0,0]
1           a[0,1]  a[0,5]  a[0,9]  a[0,5]          a[0,1]
2           a[0,2]  a[0,6]                  a[0,2]  a[0,2]
3           a[0,3]  a[0,7]                  a[0,3]  a[0,3]
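The three cache tables for the program example can be reproduced with a small simulation. This sketch assumes the example cache (8 blocks of 1 word, LRU replacement), the column-order layout starting at 7A00, and one memory reference per loop iteration (the write of A[0,i] right after its read is not counted separately). The function names are hypothetical.

```python
def simulate(addresses, num_sets, ways):
    """Return (hit count, final contents per set) for 1-word blocks and LRU."""
    sets = [[None] * ways for _ in range(num_sets)]      # stored word addresses
    last_used = [[-1] * ways for _ in range(num_sets)]   # time of last reference
    hits = 0
    for time, address in enumerate(addresses):
        s = address % num_sets                # low-order bits select the set
        if address in sets[s]:
            way = sets[s].index(address)
            hits += 1
        elif None in sets[s]:
            way = sets[s].index(None)         # fill an empty position first
        else:
            way = last_used[s].index(min(last_used[s]))   # evict the LRU block
        sets[s][way] = address
        last_used[s][way] = time
    return hits, sets

def row0_address(j):
    """Word address of a(0,j) in the 4x10 column-order array at 7A00."""
    return 0x7A00 + 4 * j

# First loop: j = 0..9, second loop: i = 9..0.
trace = [row0_address(j) for j in range(10)] + \
        [row0_address(i) for i in range(9, -1, -1)]

for name, num_sets, ways in [("direct mapped", 8, 1),
                             ("associative", 1, 8),
                             ("set-associative", 2, 4)]:
    hits, _ = simulate(trace, num_sets, ways)
    print(name, "hits:", hits, "of", len(trace))
```

Direct mapped is modelled as 8 sets of 1 block, associative as 1 set of 8 blocks, and set-associative as 2 sets of 4 blocks, so one routine covers all three tables.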

39 PowerPC
• PowerPC 604
• Data and Instruction cache
• Caches are 16 K bytes
• Four-way set associative
• 128 sets, each with 4 blocks, each block 8 words of 32 bits

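40 Example
[Figure: the address 003F4008 = 0000 0000 0011 1111 0100 | 0000 000 | 0 1000 is split into a 20-bit tag (003F4), a 7-bit set number (Set 0) and a 5-bit byte address within the block. The tag is compared (=?) with the tags stored in Block 0 ... Block 3 of Set 0, each of which also has an "st" field: Block 0 holds tag 00BA2 (no), Block 3 holds tag 003F4 (yes).]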
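The field extraction for this example can be checked with a few shifts. The sketch assumes a 32-bit address and the cache geometry from the PowerPC slide (128 sets, blocks of 8 words of 32 bits, so 32 bytes per block); `split_address` is a hypothetical name.

```python
OFFSET_BITS = 5   # 8 words x 4 bytes = 32 bytes per block
SET_BITS = 7      # 128 sets

def split_address(address):
    """Split a 32-bit address into (tag, set number, byte offset in block)."""
    offset = address & ((1 << OFFSET_BITS) - 1)
    set_number = (address >> OFFSET_BITS) & ((1 << SET_BITS) - 1)
    tag = address >> (OFFSET_BITS + SET_BITS)
    return tag, set_number, offset

tag, set_number, offset = split_address(0x003F4008)
print(f"tag={tag:05X} set={set_number} offset={offset}")
```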

41 Virtual Memory(1)
• Problem: a compiled program may not fit into memory
• Solution: Virtual memory, where the logical address space is larger than the physical address space
• Logical address space: Addresses referable by instructions
• Physical address space: Addresses referable in real machine

42 Virtual Memory(2)
• For realizing virtual memory, we need an address conversion: a_m = f(a_v)
• a_m is physical address (machine address)
• a_v is virtual address
• This is generally done by hardware

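43 Organization
[Figure: the Processor sends the virtual address a_v to the MMU; the MMU sends the physical address a_m to the Cache and to Main Memory; data flows between Processor, Cache and Main Memory; Main Memory and Disk Storage are connected by DMA transfer.]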

44 Address translation
• Basic approach is to partition both physical address space and virtual address space in equally sized blocks called pages
• A virtual address is composed of a page number and a word within a page, called the offset

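45 Page tables
[Figure: the virtual address from the processor consists of a virtual page number and an offset. The page table base register is added (+) to the virtual page number to form the page table address; the selected page table entry holds control bits and the page frame. The physical address consists of the page frame and the unchanged offset.]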
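The page table lookup can be sketched in a few lines. The sketch assumes 4 KByte pages (a 12-bit offset) and models the page table as a dictionary from virtual page number to an entry with a valid bit and a page frame; in hardware the entry would be read from memory at page table base + virtual page number. All names and the table contents are hypothetical.

```python
PAGE_BITS = 12   # assumed page size: 4 KByte

def translate(virtual_address, page_table):
    """Return the physical address, or raise LookupError on a page fault."""
    page = virtual_address >> PAGE_BITS
    offset = virtual_address & ((1 << PAGE_BITS) - 1)
    entry = page_table.get(page)
    if entry is None or not entry["valid"]:
        raise LookupError(f"page fault on virtual page {page}")
    # The offset is copied unchanged; only the page number is replaced.
    return (entry["frame"] << PAGE_BITS) | offset

page_table = {
    0: {"valid": True, "frame": 5},    # virtual page 0 is in page frame 5
    1: {"valid": False, "frame": 0},   # virtual page 1 is not in main memory
}
print(hex(translate(0x0ABC, page_table)))
```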

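46 Associative TLB
[Figure: the virtual page number of the virtual address from the processor is compared (=?) with the virtual page field of every TLB entry; each entry holds a virtual page, control bits and the real page. On a Hit, the real page becomes the page frame of the physical address and the offset is copied unchanged. On a Miss, the page table has to be consulted.]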
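A TLB behaves like a small associative cache of page table entries. The sketch below is an assumption-laden illustration: the slides do not state a TLB replacement policy, so LRU is chosen here, and the class name, capacity and page table contents are hypothetical.

```python
from collections import OrderedDict

class TLB:
    """Small cache of virtual page -> page frame translations (LRU assumed)."""

    def __init__(self, capacity, page_table):
        self.capacity = capacity
        self.page_table = page_table     # full mapping, consulted on a miss
        self.entries = OrderedDict()     # least recently used entry first
        self.hits = self.misses = 0

    def frame(self, page):
        if page in self.entries:
            self.hits += 1
            self.entries.move_to_end(page)          # mark as most recent
        else:
            self.misses += 1
            self.entries[page] = self.page_table[page]
            if len(self.entries) > self.capacity:
                self.entries.popitem(last=False)    # drop least recent entry
        return self.entries[page]

tlb = TLB(capacity=2, page_table={0: 7, 1: 3, 2: 9})
print([tlb.frame(p) for p in (0, 1, 0, 2, 1)], tlb.hits, tlb.misses)
```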

47 Policies
• Number of pages in main memory: resident set
• Mechanism works because of principle of locality
• Acceleration: recent address translations in separate cache

48 Replacement
• Page replacement algorithms
• Protection possible through page table register
• Sharing possible through page table
• Hardware support: Memory Management Unit (MMU)

49 Question?
• Main memory: 256 MByte
• Maximal virtual-address space: 4 GByte
• Page size: 4 KByte
• How many bits is the
  - offset within a page
  - virtual page frame number
  - (physical) page frame number

50 Answer!
• Physical address: 8+20=28 bits
• Virtual address: 32 bits
• Offset in a page: 12 bits
• Virtual page frame number: 32-12=20 bits
• Physical page frame number: 28-12=16 bits
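The answer can be recomputed for other memory and page sizes, provided all sizes are powers of two. A minimal sketch; `paging_fields` is a hypothetical helper name.

```python
def paging_fields(physical_bytes, virtual_bytes, page_bytes):
    """Return (offset bits, virtual page number bits, page frame number bits)."""
    offset_bits = (page_bytes - 1).bit_length()
    virtual_bits = (virtual_bytes - 1).bit_length()
    physical_bits = (physical_bytes - 1).bit_length()
    return offset_bits, virtual_bits - offset_bits, physical_bits - offset_bits

MBYTE, GBYTE, KBYTE = 2**20, 2**30, 2**10
print(paging_fields(256 * MBYTE, 4 * GBYTE, 4 * KBYTE))
```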

