CS 704 Advanced Computer Architecture

Presentation transcript:

CS 704 Advanced Computer Architecture. Lecture 32: Memory Hierarchy Design (Main and Virtual Memories). Prof. Dr. M. Ashraf Chughtai. Welcome to the 32nd lecture in the series of lectures on Advanced Computer Architecture.

Today’s Topics: Recap: Memory Hierarchy and Cache Performance; Main Memory Performance; Virtual Memory Performance; Summary. MAC/VU-Advanced Computer Architecture Lec. 32 Memory Hierarchy Design (8)

Recap: Memory Hierarchy. Design goal of the memory system: cost as low as the cheapest memory, speed as fast as the fastest memory.

Recap: Memory Hierarchy. The fastest, smallest, and most costly memories; the slowest, biggest, and cheapest memories.

Recap: Memory Hierarchy. Average access speed; cost; cheapest technology; semiconductor memories (static and dynamic RAMs); upper levels in the memory hierarchy.

Recap: Cache Design. Caches use Static Random Access Memory (SRAM); main memory is Dynamic Random Access Memory (DRAM) (~8 ms, <5% time); magnetic, optical, or other media; virtual memory.

Recap: Cache Design. Cache and main memory are organized in equal-sized blocks; word transfer; block transfer; the CPU requests contents of main memory; word transfer is fast.

Recap: Cache Performance. If misses: miss penalty; cache design and the performance; techniques; miss rate; hit time.

Main Memory Organization. Organizations of main memory; source for caches; destination for virtual memory.

DRAM logical organization (4 Mbit). [Figure: address buffer (A0…A10, 11 lines), row decoder, column decoder, sense amps & I/O, memory array (2,048 x 2,048), word line, bit line, storage cell, D, Q, data in, data out.]

Main Memory Performance. Performance of DRAM: 1: Fast page mode DRAM; 2: Synchronous DRAM; 3: Double Data Rate DRAM.

Main Memory Performance. Fast page mode: optimizes sequential access. Synchronous DRAM (SDRAM): avoids handshaking. Double Data Rate (DDR) DRAM: transmits data.

Main Memory Performance. Latency: average memory access time. Bandwidth: number of bytes read or written per unit time. Access time; cycle time.

Main Memory Performance. Inputs/outputs and multiprocessors; low-latency memory; multiprocessors demand higher bandwidth; 2nd-level caches with larger block size.

Improving Main Memory Performance. The most commonly used techniques are: wider main memory; simple interleaved memory; independent memory banks.

1: Wider Main Memory

1: Wider Main Memory. Main memory; L1 cache; wider L2 cache.

1: Wider Main Memory: Example. 4-word (i.e. 32-byte) block; time to send address = 4 clock cycles; time to send the data word = 4 clock cycles; access time per word = 56 clock cycles. Miss penalty = number of words x [time to send address + time to send data word + time to access word].

1: Wider Main Memory.
1: For the 1-word organization: miss penalty = 4 x (4 + 4 + 56) = 4 x 64 = 256 clock cycles; memory bandwidth = 32/256 = 1/8 byte per clock cycle.
2: For the 4-word organization: miss penalty = 1 x (4 + 4 + 56) = 64 clock cycles; memory bandwidth = 32/64 = 1/2 byte per clock cycle.
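The arithmetic on this slide can be checked with a minimal sketch. The function and parameter names below are illustrative, not from the lecture; the timing values (4, 4, 56 clock cycles, 32-byte block of 4 words) are the ones given in the example.

```python
# Sketch of the miss-penalty model from the wider-memory example.
# Names are hypothetical; timing values come from the slide.

def miss_penalty(block_words, memory_width_words,
                 t_send_address=4, t_access=56, t_send_data=4):
    """Clock cycles to fetch one block when memory delivers
    memory_width_words words per access."""
    # Number of separate memory accesses needed (ceiling division).
    accesses = -(-block_words // memory_width_words)
    return accesses * (t_send_address + t_access + t_send_data)

def bandwidth(block_bytes, penalty_cycles):
    """Bytes transferred per clock cycle during a miss."""
    return block_bytes / penalty_cycles

BLOCK_WORDS = 4
BLOCK_BYTES = 32

one_word_wide = miss_penalty(BLOCK_WORDS, memory_width_words=1)
four_words_wide = miss_penalty(BLOCK_WORDS, memory_width_words=4)

print(one_word_wide, bandwidth(BLOCK_BYTES, one_word_wide))      # 256 0.125
print(four_words_wide, bandwidth(BLOCK_BYTES, four_words_wide))  # 64 0.5
```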

1: Wider Main Memory: Demerits. L1 cache; wider L2 cache.

2: Interleaved Memory

2: Interleaved Memory. Word addresses held by each bank:
Bank 0 has all words whose address MOD 4 = 0: 0, 4, 8, 12
Bank 1 has all words whose address MOD 4 = 1: 1, 5, 9, 13
Bank 2 has all words whose address MOD 4 = 2: 2, 6, 10, 14
Bank 3 has all words whose address MOD 4 = 3: 3, 7, 11, 15
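The bank assignment above is simply the word address modulo the number of banks. A short sketch follows; the helper names are hypothetical, and the row index within a bank (address divided by the number of banks) is an assumption about how the remaining address bits would be used.

```python
# Sketch of 4-way word interleaving: consecutive word addresses
# rotate through the banks. Helper names are illustrative.

NUM_BANKS = 4

def bank_of(word_address):
    """Bank that holds this word (address MOD number of banks)."""
    return word_address % NUM_BANKS

def row_in_bank(word_address):
    """Assumed position of the word inside its bank."""
    return word_address // NUM_BANKS

# Rebuild the bank layout for word addresses 0..15.
layout = {bank: [a for a in range(16) if bank_of(a) == bank]
          for bank in range(NUM_BANKS)}

for bank, addresses in layout.items():
    print(f"Bank {bank}: {addresses}")
```

Because sequential addresses land in different banks, a block of four consecutive words can be fetched with all four banks working in parallel.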

2: Interleaved Memory: Example. Bandwidth calculation for a 4-word interleaved memory, using the same timing model as for the wider memory. Miss penalty = time to send address + time to access + number of banks x time to send data = 4 + 56 + 4 x 4 = 76 clock cycles. Bandwidth = 32/76 ≈ 0.42 byte per clock cycle, compared with 32/256 = 1/8 = 0.125 byte per clock cycle for the one-word-wide organization.
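As a quick check of this calculation, the sketch below computes the interleaved miss penalty under the stated timing model: the address is sent once, the banks are accessed in parallel, and each bank's word then takes one transfer slot on the bus. The function name is illustrative.

```python
# Sketch of the interleaved-memory miss penalty from the example.
# The address is sent once, all banks access in parallel, and the
# words are then transferred one at a time over the bus.

def interleaved_miss_penalty(num_banks, t_send_address=4,
                             t_access=56, t_send_data=4):
    return t_send_address + t_access + num_banks * t_send_data

BLOCK_BYTES = 32

penalty = interleaved_miss_penalty(num_banks=4)
interleaved_bandwidth = BLOCK_BYTES / penalty

print(penalty)                          # 76
print(round(interleaved_bandwidth, 2))  # 0.42
```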

3: Independent Memory Banks. Memory banks offer independent accesses: multiprocessors; I/O; CPU with hit under n misses; non-blocking caches.

3: Independent Memory Banks. [Figure labels: superbank, bank.]

3: Independent Memory Banks. An input device may use one controller and one bank, the cache read may use another, and the cache write still another.

Summary: Main Memory Bandwidth. Bandwidth can be improved by using memory banks, by making memory and its bus wider, or by doing both. How many banks should there be?

Summary: Main Memory Bandwidth Enhancement. This decision is essential to ensure that, if memory is being accessed sequentially (e.g. when processing an array), then by the time you try to read a second word from a bank, the first access has finished. Otherwise the access sequence will return to the original bank before it has the next word ready.

Summary: Main Memory Bandwidth Enhancement. Example: 8 banks, each 64 bits wide, with an access time of 10 clock cycles. Starting at clock cycle 1, bank 0 delivers its word after 10 clock cycles. After those 10 clock cycles, bank 0 would fetch the next desired word while the words from the other 7 banks are taken sequentially, until the 18th clock cycle.

Summary: Main Memory Bandwidth Enhancement. At the 18th clock cycle the access sequence is back at bank 0, but the CPU cannot start fetching: the word is not ready until clock cycle 20, once bank 0 has spent 10 clock cycles again. Hence: number of banks ≥ number of clock cycles to access a word in a bank.
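The rule can be illustrated with a small simulation. This is a simplified sketch, not the lecture's exact timing: it assumes one request is issued per clock (starting at clock 0), requests go to the banks round-robin, and a bank is busy for the full access time. The function name is hypothetical.

```python
# Simplified simulation of sequential accesses over interleaved banks.
# Assumptions (not from the slides): zero-based clock, one request
# issued per clock, a bank stays busy for access_time clocks.

def total_stall_cycles(num_banks, access_time, num_words):
    bank_free_at = [0] * num_banks   # clock at which each bank is idle again
    clock = 0
    stalls = 0
    for word in range(num_words):
        bank = word % num_banks
        start = max(clock, bank_free_at[bank])  # wait if the bank is busy
        stalls += start - clock
        bank_free_at[bank] = start + access_time
        clock = start + 1                       # next request, next clock
    return stalls

print(total_stall_cycles(num_banks=8, access_time=10, num_words=16))   # 2
print(total_stall_cycles(num_banks=10, access_time=10, num_words=40))  # 0
```

With 8 banks and a 10-cycle access, the sequence returns to bank 0 two cycles too early and has to wait; with at least as many banks as access cycles, no stall occurs in this model.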

Virtual Memory. Multiple processes; single process; exceed physical memory available.

Virtual Memory System. Increasing gap; high cost of main memory; physical DRAM as a cache for the disk; single-level store.

Virtual Memory System … Cont’d. Single-level storage; the virtual memory system manages two levels of the memory hierarchy: main memory and secondary storage; segments, named pages.

Virtual Memory System … Cont’d. Page; block; contiguous pages.

Virtual Memory System … Cont’d. [Figure: CPU; virtual memory address space with pages A, B, C, D at virtual addresses in 4k steps (4k, 8k, … 40k); physical main memory; disk.]

Virtual Memory: Attributes. Protection; relocation.

Virtual Memory: Attributes … Cont’d. Protection: processes operate in different address spaces; different permissions; cannot access privileged information.

Protection. [Figure: page tables for process i and process j. Each entry (VP 0, VP 1, VP 2, …; VP: virtual page) holds Read? and Write? permission flags and a physical address (PP 9, PP 4 for process i; PP 6, PP 9 for process j; XXXXXXX where there is no mapping) pointing into memory locations 0 … N-1.]
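A per-process permission check of this kind could be sketched as follows. The table contents are hypothetical values chosen for illustration (the transcript does not preserve the exact permission bits of the figure), and the names `PAGE_TABLES`, `ProtectionFault`, and `check_access` are assumptions.

```python
# Sketch of permission checking with per-process page tables.
# Entries are (physical_page, can_read, can_write); None means the
# virtual page has no valid mapping. All values are illustrative.

PAGE_TABLES = {
    "process_i": {0: (9, True, False), 1: (4, True, True), 2: None},
    "process_j": {0: (6, True, True), 1: (9, True, False), 2: None},
}

class ProtectionFault(Exception):
    """Raised when an access is not permitted for the process."""

def check_access(process, virtual_page, write):
    """Return the physical page if the access is allowed."""
    entry = PAGE_TABLES[process].get(virtual_page)
    if entry is None:
        raise ProtectionFault(f"{process}: VP {virtual_page} not mapped")
    physical_page, can_read, can_write = entry
    allowed = can_write if write else can_read
    if not allowed:
        kind = "write" if write else "read"
        raise ProtectionFault(f"{process}: {kind} to VP {virtual_page} denied")
    return physical_page

print(check_access("process_i", 0, write=False))  # 9
print(check_access("process_j", 0, write=True))   # 6
```

Two processes may map the same physical page with different permissions, which is how a page can be shared read-only.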

Virtual Memory: Attributes … Cont’d. Relocation: simplifies loading of a program; allows a program to be placed anywhere; hardware; software.

Cache versus Virtual Memory. Page or segment is used for block; page fault or address fault is used for miss. The CPU produces virtual addresses, which are translated to main memory (physical) addresses.

Cache versus Virtual Memory. Address translation: mapping of the virtual address to the physical address; page table; physical address of the segment or the page.

Cache versus Virtual Memory. [Figure: a virtual address (virtual page no., page offset) is mapped through the page table to a physical address in main memory.]
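The translation shown in the figure can be sketched in a few lines. The 4 KB page size is an assumption (the lecture does not fix one here), the page table is a plain dictionary with made-up entries, and `translate` and `PageFault` are illustrative names.

```python
# Sketch of virtual-to-physical address translation with a page table.
# Assumed: 4 KB pages, so the low 12 bits are the page offset.

OFFSET_BITS = 12
PAGE_SIZE = 1 << OFFSET_BITS  # 4096 bytes

class PageFault(Exception):
    """Raised when the virtual page is not resident in main memory."""

def translate(virtual_address, page_table):
    virtual_page = virtual_address >> OFFSET_BITS    # upper bits
    page_offset = virtual_address & (PAGE_SIZE - 1)  # lower bits, unchanged
    if virtual_page not in page_table:
        raise PageFault(virtual_page)
    physical_page = page_table[virtual_page]
    return (physical_page << OFFSET_BITS) | page_offset

# Hypothetical mapping: virtual page -> physical page.
page_table = {0: 5, 1: 7}

print(hex(translate(0x1234, page_table)))  # 0x7234
```

Only the page number is translated; the offset within the page passes through unchanged.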

Cache versus Virtual Memory. Replacement on cache miss; page fault; the size of the processor address; cache size is independent of the processor address.

Cache versus Virtual Memory. Secondary storage: lower-level backing store for main memory; the file system occupies space on secondary storage.

Issues of Virtual Memory Design. Line size: large, since disks are better at transferring large blocks. Associativity: high (fully associative), to minimize miss rate. Write strategy: write through or write back.

Issues of Virtual Memory Design. Miss rate: extremely low, << 1%. Hit time: must match cache/main memory performance. Miss latency: very high, ~20 ms. Tag storage overhead: low, relative to block size.
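To see why the miss rate has to be so far below 1%, one can plug the ~20 ms miss latency into the usual average-access-time formula (hit time + miss rate x miss latency). The 100 ns hit time and the sample miss rates below are assumptions for illustration only.

```python
# Sketch: average access time when main memory acts as a cache for disk.
# Miss latency (~20 ms) is from the slide; the 100 ns hit time and the
# sample miss rates are assumed values.

def effective_access_time_ns(hit_time_ns, miss_rate, miss_latency_ns):
    return hit_time_ns + miss_rate * miss_latency_ns

HIT_TIME_NS = 100
MISS_LATENCY_NS = 20_000_000  # 20 ms expressed in nanoseconds

for miss_rate in (1e-2, 1e-5, 1e-7):
    t = effective_access_time_ns(HIT_TIME_NS, miss_rate, MISS_LATENCY_NS)
    print(f"miss rate {miss_rate:g}: about {t:,.0f} ns per access")
```

Under these assumptions, a 1% page-fault rate would make the average access thousands of times slower than a hit, while a rate of one in ten million adds only a couple of nanoseconds.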

Typical System with Virtual Memory. [Figure: the CPU issues virtual addresses (0 … N-1); the page table maps them to physical addresses (0 … P-1) in memory, or to disk.]

Typical System with Virtual Memory. The CPU generates the virtual address; the operating system manages a lookup table giving the location of the page or segment and mapping virtual addresses to physical addresses.

Page Faults (like “Cache Misses”). Indicates that the virtual address is not in memory; OS exception handler invoked; current process suspends; OS has full control over placement.

Page Faults (like “Cache Misses”). [Figure: CPU, page table, memory, and disk, with virtual and physical addresses, shown before the fault and after the fault.]

Servicing a Page Fault: 3 steps. [Figure: processor (Reg, cache), memory-I/O bus, memory, I/O controller, disks.] (1) Initiate block read; (2) DMA transfer; (3) read done.
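The three steps can be mimicked in a toy model. This is only a sketch under simplifying assumptions: the disk, memory, and page table are plain Python dictionaries, a free frame is always available (no replacement), and the function and variable names are hypothetical.

```python
# Toy model of servicing a page fault in three steps.
# Assumptions: dictionaries stand in for disk, memory, and the page
# table; a free physical frame is always available.

def service_page_fault(virtual_page, page_table, memory, free_frames, disk):
    steps = []

    # (1) The processor asks the I/O controller to read the block.
    frame = free_frames.pop(0)
    steps.append("(1) initiate block read")

    # (2) The block is copied from disk into memory (DMA transfer).
    memory[frame] = disk[virtual_page]
    steps.append("(2) DMA transfer")

    # (3) The read completes; the page table now maps the page,
    #     so the suspended process can be resumed.
    page_table[virtual_page] = frame
    steps.append("(3) read done")
    return steps

disk = {3: "contents of virtual page 3"}
page_table, memory = {}, {}

for step in service_page_fault(3, page_table, memory, [0, 1], disk):
    print(step)
print(page_table)  # {3: 0}
```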

Summary. Main memory design; methods to improve the bandwidth of main memory; concept of virtual memory; servicing the page fault in virtual memory.

Allah Hafiz