1 CS 704 Advanced Computer Architecture
Lecture 32: Memory Hierarchy Design (Main and Virtual Memories). Prof. Dr. M. Ashraf Chughtai. Welcome to the 32nd lecture in the series of lectures on Advanced Computer Architecture.

2 Today's Topics
Recap: Memory Hierarchy and Cache Performance; Main Memory Performance; Virtual Memory Performance; Summary MAC/VU-Advanced Computer Architecture Lec. 32 Memory Hierarchy Design (8)

3 Recap: Memory Hierarchy
Design goal of the memory system: cost as low as that of the cheapest memory; speed as fast as that of the fastest memory

4 Recap: Memory Hierarchy
The fastest, smallest and most costly memories; the slowest, biggest and cheapest memories

5 Recap: Memory Hierarchy
Average access speed; cost; cheapest technology; semiconductor memories; static and dynamic RAMs; upper levels in the memory hierarchy

6 Recap: Cache Design
Caches use Static Random Access Memory (SRAM); main memory is Dynamic Random Access Memory (DRAM), which must be refreshed (~8 ms, <5% time); magnetic, optical or other media hold the virtual memory

7 Recap: Cache Design
Cache and main memory are organized in equal-sized blocks; word transfer; block transfer; the CPU requests contents of main memory; word transfer is fast

8 Recap: Cache Performance
If the access misses: miss penalty; cache design and performance; techniques; miss rate; hit time

9 Main Memory Organization
Organizations of main memory; source for caches; destination for virtual memory

10 DRAM logical organization (4 M Bit)
[Figure: DRAM logical organization. Address lines A0…A10 (11 bits) enter the address buffer; the row decoder selects a word line and the column decoder, with sense amps & I/O, selects bit lines in the memory array (2,048 x 2,048) of storage cells; data in (D) and data out (Q).]
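The 2,048 x 2,048 array and the 11 address lines A0…A10 on this slide suggest that a 22-bit cell address is presented in two 11-bit halves. A minimal Python sketch of that split follows; treating the upper half as the row and the lower half as the column is an assumption for illustration, and `split_dram_address` is a hypothetical helper, not something named in the lecture.

```python
ROWS = 2048          # word lines in the array (from the slide)
COLS = 2048          # bit lines in the array (from the slide)
ADDRESS_PINS = 11    # A0..A10 (from the slide)


def split_dram_address(bit_address):
    """Split a cell address into (row, column) halves of 11 bits each.

    Assumption: upper 11 bits select the row, lower 11 bits the column.
    """
    if not 0 <= bit_address < ROWS * COLS:
        raise ValueError("address outside the 4 Mbit array")
    row = bit_address >> ADDRESS_PINS
    column = bit_address & (COLS - 1)
    return row, column


# Cell 7 of row 5 under the assumed layout.
row, column = split_dram_address(5 * COLS + 7)
```

Because 2^11 = 2,048, the same 11 pins can carry the row half and then the column half, which is why a 4 Mbit part needs only 11 address lines.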

11 Main Memory Performance
Performance of DRAM: 1: Fast page mode DRAM; 2: Synchronous DRAM; 3: Double Data Rate DRAM

12 Main Memory Performance
Fast page mode: optimizes sequential access; Synchronous DRAM (SDRAM): avoids handshaking; Double Data Rate (DDR) DRAM: transmits data on both edges of the clock

13 Main Memory Performance
Latency: average memory access time; bandwidth: number of bytes read or written per unit time; access time; cycle time

14 Main Memory Performance
Inputs/outputs and multiprocessors; low-latency memory; multiprocessors demand higher bandwidth; 2nd-level caches with larger block size

15 Improving Main Memory Performance
The most commonly used techniques are: wider main memory; simple interleaved memory; independent memory banks

16 1: Wider Main Memory

17 1: Wider Main Memory
[Figure: main memory; L1 cache; wider L2 cache]

18 1: Wider Main Memory: Example
4-word (i.e. 32-byte) block; time to send address = 4 clock cycles; time to send a data word = 4 clock cycles; access time per word = 56 clock cycles; Miss Penalty = no. of words x [time to send address + time to access word + time to send data word]

19 1: Wider Main Memory
1: For the 1-word organization: Miss Penalty = 4 x (4 + 56 + 4) = 4 x (64) = 256 clock cycles; memory bandwidth = bytes/clock cycle = 32/256 = 1/8 byte/cycle. 2: For the 4-word organization: Miss Penalty = 1 x (4 + 56 + 4) = 64 clock cycles; memory bandwidth = 32/64 = 1/2 byte/cycle
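The arithmetic for the two organizations can be checked with a short Python sketch. The timing values (4 cycles to send the address, 56 to access a word, 4 to send a data word, 32-byte block of 4 words) come from the example slide; the function and constant names are illustrative only.

```python
# Timing model from the example, all in clock cycles.
SEND_ADDRESS = 4
ACCESS_WORD = 56
SEND_DATA = 4
BLOCK_WORDS = 4
BLOCK_BYTES = 32


def miss_penalty(memory_width_words):
    """Cycles to fetch one block when each access returns `memory_width_words`."""
    accesses = BLOCK_WORDS // memory_width_words
    return accesses * (SEND_ADDRESS + ACCESS_WORD + SEND_DATA)


def bandwidth(penalty_cycles):
    """Bytes delivered per clock cycle during the miss."""
    return BLOCK_BYTES / penalty_cycles


one_word_penalty = miss_penalty(1)    # 4 x 64 cycles
four_word_penalty = miss_penalty(4)   # 1 x 64 cycles
```

Widening the memory from one word to four cuts the number of full address/access/transfer rounds from four to one, which is where the fourfold bandwidth gain comes from.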

20 1: Wider Main Memory: Demerits
[Figure: L1 cache; wider L2 cache]

21 2: Interleaved Memory

22 2: Interleaved Memory

23 2: Interleaved Memory
Bank 0 has all words whose address MOD 4 = 0; bank 1 has all words whose address MOD 4 = 1; bank 2 has all words whose address MOD 4 = 2; bank 3 has all words whose address MOD 4 = 3
Word addresses by bank:
Bank 0   Bank 1   Bank 2   Bank 3
0        1        2        3
4        5        6        7
8        9        10       11
12       13       14       15
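The Address MOD 4 rule can be expressed directly in code. The sketch below, in Python, rebuilds the word-address layout for four banks; `row_in_bank` (the position of a word inside its bank) is an added assumption that follows from the same rule but is not stated on the slide.

```python
NUM_BANKS = 4


def bank_of(word_address):
    """Bank that holds a word: Address MOD number of banks."""
    return word_address % NUM_BANKS


def row_in_bank(word_address):
    """Assumed position of the word inside its bank."""
    return word_address // NUM_BANKS


# Group word addresses 0..15 by the bank that holds them.
layout = {
    bank: [addr for addr in range(16) if bank_of(addr) == bank]
    for bank in range(NUM_BANKS)
}
```

Consecutive word addresses land in consecutive banks, so a sequential block fetch touches each bank once.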

24 2: Interleaved Memory: Example
Bandwidth calculation: bandwidth of a 4-word interleaved memory, using the same time model as for the wider memory. The miss penalty for the 4-word interleaved memory is: time to send address + time to access + number of banks x time to send data = 4 + 56 + 4 x 4 = 76 clock cycles. Bandwidth = 32/76 = 0.4 byte per clock, compared with 32/256 = 1/8 = 0.125 byte per clock for the one-word organization
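A small Python sketch of the interleaved miss-penalty formula, using the same timing model as the wider-memory example (4 cycles to send the address, 56 to access, 4 per data word, 32-byte block). The function name and default arguments are illustrative.

```python
BLOCK_BYTES = 32


def interleaved_miss_penalty(num_banks, send_address=4, access=56, send_data=4):
    """All banks are accessed in parallel; only the data transfers are serialized."""
    return send_address + access + num_banks * send_data


penalty = interleaved_miss_penalty(4)
bytes_per_clock = BLOCK_BYTES / penalty
```

The address is sent once and the 56-cycle access overlaps across the four banks, so only the four 4-cycle word transfers are paid one after another.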

25 3: Independent Memory Banks
Memory banks offer independent accesses; multiprocessors; I/O; CPU with hit under n misses; non-blocking caches

26 3: Independent Memory Banks
[Figure: superbank and bank organization]

27 3: Independent Memory Banks
An input device may use one controller and one bank; the cache read may use another; and the cache write still another

28 Summary: Main Memory Bandwidth
Using memory banks; making memory and its bus wider; doing both. How many banks should there be?

29 Summary: Main Memory Bandwidth Enhancement
This decision is essential to ensure that if memory is being accessed sequentially (e.g. when processing an array), then by the time you try to read a second word from a bank, the first access has finished. Otherwise, the access sequence returns to the original bank before it has the next word ready

30 Summary: Main Memory Bandwidth Enhancement
8 banks, each 64 bits wide; access time of 10 clock cycles; starting at clock cycle 1, bank 0 delivers its word after 10 clock cycles; after 10 clock cycles, bank 0 would fetch the next desired word while the other 7 banks deliver their words sequentially until the 18th clock cycle

31 Summary: Main Memory Bandwidth Enhancement
At the 18th clock cycle the sequence is back at bank 0, but the CPU cannot start fetching until clock cycle 20, when bank 0 has taken its 10 clock cycles again. Rule: number of banks ≥ number of clock cycles to access a word in a bank
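The rule can be illustrated with a toy Python simulation. It assumes a simplified model that is not the slide's exact timeline: one request is issued per clock cycle to successive banks in round-robin order, clocks are counted from 0, and a bank stays busy for `access_time` cycles after it accepts a request. `stall_cycles` is a hypothetical helper.

```python
def stall_cycles(num_banks, access_time, num_words):
    """Count cycles spent waiting because the target bank is still busy."""
    bank_free_at = [0] * num_banks   # clock at which each bank can accept work
    clock = 0
    stalls = 0
    for word in range(num_words):
        bank = word % num_banks      # sequential words map to successive banks
        if bank_free_at[bank] > clock:
            stalls += bank_free_at[bank] - clock
            clock = bank_free_at[bank]
        bank_free_at[bank] = clock + access_time
        clock += 1                   # next request goes out one cycle later
    return stalls


too_few_banks = stall_cycles(num_banks=8, access_time=10, num_words=16)
enough_banks = stall_cycles(num_banks=10, access_time=10, num_words=16)
```

With 8 banks and a 10-cycle access the sequence returns to bank 0 two cycles too early, matching the 18-versus-20 gap on the slide; with at least as many banks as access cycles the wait disappears in this model.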

32 Virtual Memory
Multiple processes; single process; exceed physical memory available

33 Virtual Memory System
Increasing gap; high cost of main memory; physical DRAM as a cache for the disk; single-level store

34 Virtual Memory system … Cont’d
Single-level storage; the virtual memory system manages two levels of the memory hierarchy: main memory and secondary storage; segments, each called a page

35 Virtual Memory system … Cont’d
Page; block; contiguous pages

36 Virtual Memory System … Cont’d
[Figure: the CPU's virtual address space (virtual addresses 4k to 40k in 4k steps) with pages A, B, C and D mapped to physical main memory or disk]

37 Virtual Memory: Attributes
Protection; relocation

38 Virtual Memory: Attributes … Cont’d
Protection: processes operate in different address spaces; different permissions; cannot access privileged information

39 Protection
[Figure: per-process page tables with Read?, Write? and Physical Addr columns (VP: virtual page, PP: physical page). Process i maps VP 0, VP 1 and VP 2 to PP 9, PP 4 and an invalid entry (XXXXXXX); process j maps VP 0, VP 1 and VP 2 to PP 6, PP 9 and an invalid entry (XXXXXXX); physical memory pages are numbered 0 to N-1.]
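A minimal Python sketch of how per-process page tables with Read?/Write? bits can enforce protection. The physical page numbers (PP 9, PP 4, PP 6) come from the slide; the individual permission values are illustrative assumptions, since the slide's table did not survive intact, and `PageTableEntry`, `ProtectionFault` and `check_access` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class PageTableEntry:
    physical_page: Optional[int]  # None marks an invalid (unmapped) entry
    can_read: bool
    can_write: bool


class ProtectionFault(Exception):
    """Raised when an access is not permitted by the page table."""


def check_access(page_table: Dict[int, PageTableEntry],
                 virtual_page: int, write: bool) -> int:
    """Return the physical page if the access is allowed, else raise."""
    entry = page_table.get(virtual_page)
    if entry is None or entry.physical_page is None:
        raise ProtectionFault(f"VP {virtual_page} is not mapped")
    if write and not entry.can_write:
        raise ProtectionFault(f"VP {virtual_page} is not writable")
    if not write and not entry.can_read:
        raise ProtectionFault(f"VP {virtual_page} is not readable")
    return entry.physical_page


# Permission bits below are assumed for illustration.
process_i = {
    0: PageTableEntry(9, can_read=True, can_write=False),
    1: PageTableEntry(4, can_read=True, can_write=True),
    2: PageTableEntry(None, can_read=False, can_write=False),
}
process_j = {
    0: PageTableEntry(6, can_read=True, can_write=True),
    1: PageTableEntry(9, can_read=True, can_write=False),
    2: PageTableEntry(None, can_read=False, can_write=False),
}
```

Both processes can map the same physical page (PP 9 here) through their own tables, yet neither can reach a page that its table does not list.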

40 Virtual Memory: Attributes … Cont’d
Relocation: simplifies loading of a program; allows a program to be placed anywhere; hardware; software

41 Cache versus Virtual Memory
Page or segment is used for block; page fault or address fault is used for miss; the CPU produces a virtual address; virtual addresses are translated to main memory (physical) addresses

42 Cache versus Virtual Memory
Address translation: mapping of the virtual address to the physical address; page table; physical address of the segment or the page

43 Cache versus Virtual Memory
[Figure: the virtual address is split into a virtual page number and a page offset; the page table maps the virtual page number to a physical address in main memory]
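The translation path on this slide (virtual page number plus page offset, looked up through the page table) can be sketched in Python as follows. The 4 KB page size is an assumption suggested by the 4k steps in the earlier address-space figure, and the dictionary-based page table and `translate` helper are illustrative only.

```python
PAGE_SIZE = 4096        # assumed 4 KB pages
OFFSET_BITS = 12        # log2(PAGE_SIZE)


class PageFault(Exception):
    """Raised when the virtual page has no entry in the page table."""


def translate(virtual_address, page_table):
    """Map a virtual address to a physical address via the page table."""
    virtual_page = virtual_address >> OFFSET_BITS
    offset = virtual_address & (PAGE_SIZE - 1)
    if virtual_page not in page_table:
        raise PageFault(f"virtual page {virtual_page} not in memory")
    physical_page = page_table[virtual_page]
    # The offset passes through unchanged; only the page number is replaced.
    return (physical_page << OFFSET_BITS) | offset


page_table = {1: 5}                       # virtual page 1 -> physical page 5
physical = translate(0x1ABC, page_table)  # offset 0xABC is preserved
```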

44 Cache versus Virtual Memory
Replacement on cache miss; page fault; the size of the processor address determines the size of virtual memory; cache size is independent of the processor address size

45 Cache versus Virtual Memory
Secondary storage; lower-level backing store for main memory; the file system also occupies space on secondary storage

46 Issues of Virtual Memory Design
Line size: large, since disk is better at transferring large blocks; associativity: high (fully associative), to minimize miss rate; write strategy: write through or write back

47 Issues of Virtual Memory Design
Miss rate: extremely low (<< 1%); hit time: must match cache/main memory performance; miss latency: very high (~20 ms); tag storage overhead: low, relative to block size
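To see why the miss rate must be far below 1%, the Python sketch below weighs the ~20 ms miss latency from this slide against a main memory access time. The 100 ns memory access time is an assumed figure for illustration, not a value from the lecture, and the weighted-average formula is the usual effective-access-time estimate.

```python
MEMORY_ACCESS_NS = 100.0   # assumed main memory access time
PAGE_FAULT_NS = 20e6       # ~20 ms miss latency from the slide


def effective_access_ns(fault_rate):
    """Average access time when a fraction `fault_rate` of accesses page-fault."""
    return (1 - fault_rate) * MEMORY_ACCESS_NS + fault_rate * PAGE_FAULT_NS


at_one_percent = effective_access_ns(0.01)
at_one_in_a_million = effective_access_ns(1e-6)
```

Under these assumptions a 1% fault rate makes the average access roughly two thousand times slower than memory alone, while one fault per million accesses adds only about 20 ns.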

48 Typical System with Virtual Memory
[Figure: the CPU issues virtual addresses (0 to N-1); the page table maps them to physical addresses in memory (0 to P-1) or to locations on disk]

49 Typical System with Virtual Memory
The CPU generates the virtual address; the operating system manages a lookup table; location of the page or segment; virtual addresses to physical addresses

50 Page Faults (like “Cache Misses”)
Indicates virtual address not in memory; OS exception handler invoked; current process suspends; OS has full control over placement

51 Page Faults (like “Cache Misses”)
[Figure: CPU, page table, memory and disk shown before the fault and after the fault]

52 Servicing a Page Fault: 3 steps
[Figure: processor (registers, cache), memory and I/O controller with disks on the memory-I/O bus] (1) Initiate block read

53 Servicing a Page Fault: 3 steps
[Figure: processor (registers, cache), memory and I/O controller with disks on the memory-I/O bus] (1) Initiate block read; (2) DMA transfer

54 Servicing a Page Fault: 3 Steps
[Figure: processor (registers, cache), memory and I/O controller with disks on the memory-I/O bus] (1) Initiate block read; (2) DMA transfer; (3) Read done
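The three steps can be traced with a toy Python model. This is a sketch under simplifying assumptions: dictionaries stand in for the disk, memory and page table, a free frame is always available (no replacement policy), and `service_page_fault` and `read` are hypothetical names.

```python
def service_page_fault(virtual_page, page_table, memory, disk, free_frames):
    """Bring one page from disk into a free frame, recording the three steps."""
    steps = []
    # (1) The processor asks the I/O controller to read the block.
    steps.append("initiate block read")
    frame = free_frames.pop()
    # (2) The controller copies the page into memory (DMA transfer).
    memory[frame] = disk[virtual_page]
    steps.append("DMA transfer")
    # (3) On completion the OS records the new mapping.
    page_table[virtual_page] = frame
    steps.append("read done")
    return frame, steps


def read(virtual_page, page_table, memory, disk, free_frames):
    """Read a page, servicing a page fault first if it is not in memory."""
    if virtual_page not in page_table:
        service_page_fault(virtual_page, page_table, memory, disk, free_frames)
    return memory[page_table[virtual_page]]


disk = {0: "A", 1: "B"}
memory, page_table, free_frames = {}, {}, [0, 1]
value = read(1, page_table, memory, disk, free_frames)
```

After the fault is serviced the page table holds the mapping, so a second read of the same page goes straight to memory.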

55 Summary
Main memory design; methods to improve the bandwidth of main memory; concept of virtual memory; servicing a page fault in virtual memory

56 Allah Hafiz

