Virtual Memory Adapted from lecture notes of Dr. Patterson and Dr. Kubiatowicz of UC Berkeley and Rabi Mahapatra & Hank Walker.

Slides:



Advertisements
Similar presentations
Virtual Memory In this lecture, slides from lecture 16 from the course Computer Architecture ECE 201 by Professor Mike Schulte are used with permission.
Advertisements

Virtual Memory. The Limits of Physical Addressing CPU Memory A0-A31 D0-D31 “Physical addresses” of memory locations Data All programs share one address.
16.317: Microprocessor System Design I
CSIE30300 Computer Architecture Unit 10: Virtual Memory Hsin-Chou Chi [Adapted from material by and
Virtual Memory Hardware Support
Cs 325 virtualmemory.1 Accessing Caches in Virtual Memory Environment.
COMP 3221: Microprocessors and Embedded Systems Lectures 27: Virtual Memory - III Lecturer: Hui Wu Session 2, 2005 Modified.
Virtual Memory Adapted from lecture notes of Dr. Patterson and Dr. Kubiatowicz of UC Berkeley.
CS61C L35 VM I (1) Garcia © UCB Lecturer PSOE Dan Garcia inst.eecs.berkeley.edu/~cs61c CS61C : Machine Structures Lecture.
CSCE 212 Chapter 7 Memory Hierarchy Instructor: Jason D. Bakos.
CS61C L34 Virtual Memory I (1) Garcia, Spring 2007 © UCB 3D chips from IBM  IBM claims to have found a way to build 3D chips by stacking them together.
Chap. 7.4: Virtual Memory. CS61C L35 VM I (2) Garcia © UCB Review: Caches Cache design choices: size of cache: speed v. capacity direct-mapped v. associative.
CS 61C L24 VM II (1) Garcia, Spring 2004 © UCB Lecturer PSOE Dan Garcia inst.eecs.berkeley.edu/~cs61c CS61C : Machine Structures.
Inst.eecs.berkeley.edu/~cs61c UCB CS61C : Machine Structures Lecture 34 – Virtual Memory II Jim Cicconi of AT&T says that the surge of video.
Virtual Memory.
S.1 Review: The Memory Hierarchy Increasing distance from the processor in access time L1$ L2$ Main Memory Secondary Memory Processor (Relative) size of.
COMP3221 lec36-vm-I.1 Saeid Nooshabadi COMP 3221 Microprocessors and Embedded Systems Lectures 13: Virtual Memory - I
Recap. The Memory Hierarchy Increasing distance from the processor in access time L1$ L2$ Main Memory Secondary Memory Processor (Relative) size of the.
Memory Management and Paging CSCI 3753 Operating Systems Spring 2005 Prof. Rick Han.
CS 61C L24 VM II (1) A Carle, Summer 2005 © UCB inst.eecs.berkeley.edu/~cs61c/su05 CS61C : Machine Structures Lecture #24: VM II Andy Carle.
CS61C L25 Virtual Memory I (1) Beamer, Summer 2007 © UCB Scott Beamer, Instructor inst.eecs.berkeley.edu/~cs61c CS61C : Machine Structures Lecture #25.
COMP 3221: Microprocessors and Embedded Systems Lectures 27: Virtual Memory - II Lecturer: Hui Wu Session 2, 2005 Modified.
CS61C L19 VM © UC Regents 1 CS61C - Machine Structures Lecture 19 - Virtual Memory November 3, 2000 David Patterson
Inst.eecs.berkeley.edu/~cs61c UCB CS61C : Machine Structures Lecture 35 – Virtual Memory III Researchers at three locations have recently added.
ECE 232 L27.Virtual.1 Adapted from Patterson 97 ©UCBCopyright 1998 Morgan Kaufmann Publishers ECE 232 Hardware Organization and Design Lecture 27 Virtual.
CS61C L34 Virtual Memory II (1) Garcia, Fall 2006 © UCB Lecturer SOE Dan Garcia inst.eecs.berkeley.edu/~cs61c UC Berkeley.
CS61C L36 VM II (1) Garcia, Fall 2004 © UCB Lecturer PSOE Dan Garcia inst.eecs.berkeley.edu/~cs61c CS61C : Machine Structures.
ENEE350 Ankur Srivastava University of Maryland, College Park Based on Slides from Mary Jane Irwin ( )
CS61C L33 Virtual Memory I (1) Garcia, Fall 2006 © UCB Lecturer SOE Dan Garcia inst.eecs.berkeley.edu/~cs61c UC Berkeley.
©UCB CS 162 Ch 7: Virtual Memory LECTURE 13 Instructor: L.N. Bhuyan
CS 430 – Computer Architecture 1 CS 430 – Computer Architecture Virtual Memory William J. Taffe using slides of David Patterson.
Cs 61C L18 Cache2.1 Patterson Spring 99 ©UCB CS61C Improving Cache Memory and Virtual Memory Introduction Lecture 18 April 2, 1999 Dave Patterson (http.cs.berkeley.edu/~patterson)
Inst.eecs.berkeley.edu/~cs61c UCB CS61C : Machine Structures Lecture 33 – Virtual Memory I OCZ has released a 1 TB solid state drive (the biggest.
Vm Computer Architecture Lecture 16: Virtual Memory.
Inst.eecs.berkeley.edu/~cs61c UCB CS61C : Machine Structures Lecture 34 – Virtual Memory II Researchers at Stanford have developed “nanoscale.
©UCB CS 161 Ch 7: Memory Hierarchy LECTURE 24 Instructor: L.N. Bhuyan
COMP3221 lec37-vm-II.1 Saeid Nooshabadi COMP 3221 Microprocessors and Embedded Systems Lectures 13: Virtual Memory - II
CS61C L37 VM III (1)Garcia, Fall 2004 © UCB Lecturer PSOE Dan Garcia inst.eecs.berkeley.edu/~cs61c CS61C : Machine Structures.
Lecture 21 Last lecture Today’s lecture Cache Memory Virtual memory
CS 61C L7.1.2 VM II (1) K. Meinz, Summer 2004 © UCB CS61C : Machine Structures Lecture VM II Kurt Meinz inst.eecs.berkeley.edu/~cs61c.
CSE431 L22 TLBs.1Irwin, PSU, 2005 CSE 431 Computer Architecture Fall 2005 Lecture 22. Virtual Memory Hardware Support Mary Jane Irwin (
Lecture 19: Virtual Memory
CSIE30300 Computer Architecture Unit 08: Cache Hsin-Chou Chi [Adapted from material by and
1  1998 Morgan Kaufmann Publishers Recap: Memory Hierarchy of a Modern Computer System By taking advantage of the principle of locality: –Present the.
Inst.eecs.berkeley.edu/~cs61c UCB CS61C : Machine Structures Lecture 35 – Virtual Memory III ”The severity of the decline in the market is further evidence.
Virtual Memory. Virtual Memory: Topics Why virtual memory? Virtual to physical address translation Page Table Translation Lookaside Buffer (TLB)
Review (1/2) °Caches are NOT mandatory: Processor performs arithmetic Memory stores data Caches simply make data transfers go faster °Each level of memory.
Virtual Memory. Virtual Memory: Topics Why virtual memory? Virtual to physical address translation Page Table Translation Lookaside Buffer (TLB)
Review °Apply Principle of Locality Recursively °Manage memory to disk? Treat as cache Included protection as bonus, now critical Use Page Table of mappings.
Multilevel Caches Microprocessors are getting faster and including a small high speed cache on the same chip.
Virtual Memory.  Next in memory hierarchy  Motivations:  to remove programming burdens of a small, limited amount of main memory  to allow efficient.
CS.305 Computer Architecture Memory: Caches Adapted from Computer Organization and Design, Patterson & Hennessy, © 2005, and from slides kindly made available.
CS.305 Computer Architecture Memory: Virtual Adapted from Computer Organization and Design, Patterson & Hennessy, © 2005, and from slides kindly made available.
LECTURE 12 Virtual Memory. VIRTUAL MEMORY Just as a cache can provide fast, easy access to recently-used code and data, main memory acts as a “cache”
Summary of caches: The Principle of Locality: –Program likely to access a relatively small portion of the address space at any instant of time. Temporal.
CS203 – Advanced Computer Architecture Virtual Memory.
CS161 – Design and Architecture of Computer
Lecturer PSOE Dan Garcia
ECE232: Hardware Organization and Design
CS161 – Design and Architecture of Computer
Lecture 12 Virtual Memory.
Virtual Memory Use main memory as a “cache” for secondary (disk) storage Managed jointly by CPU hardware and the operating system (OS) Programs share main.
Review (1/2) Caches are NOT mandatory:
Csci 211 Computer System Architecture – Review on Virtual Memory
Some of the slides are adopted from David Patterson (UCB)
Morgan Kaufmann Publishers Memory Hierarchy: Virtual Memory
CSE 451: Operating Systems Autumn 2003 Lecture 10 Paging & TLBs
CSE 451: Operating Systems Autumn 2003 Lecture 10 Paging & TLBs
Albert Chae, Instructor
COMP3221: Microprocessors and Embedded Systems
Presentation transcript:

Virtual Memory Adapted from lecture notes of Dr. Patterson and Dr. Kubiatowicz of UC Berkeley and Rabi Mahapatra & Hank Walker

View of Memory Hierarchies Regs L2 Cache Memory Disk Instr. Operands Blocks Pages Upper Level Lower Level Faster Larger Cache Blocks Thus far { { Next: Virtual Memory

Memory Hierarchy: Some Facts CPU Registers 100s Bytes <10s ns Cache K Bytes ns $ /bit Main Memory M Bytes 100ns-1us $ Disk G Bytes ms cents Capacity Access Time Cost Registers Cache Memory Disk Instr. Operands Blocks Pages Staging Xfer Unit prog./compiler 1-8 bytes cache cntl bytes OS 512-4K bytes Upper Level Lower Level faster Larger

Virtual Memory: Motivation If Principle of Locality allows caches to offer (usually) speed of cache memory with size of DRAM memory, then recursively why not use at next level to give speed of DRAM memory, size of Disk memory? Treat Memory as “cache” for Disk !!!

Advantages of Virtual Memory Translation: –Program can be given consistent view of memory, even though physical memory is scrambled –Makes multithreading reasonable (now used a lot!) –Only the most important part of program (“Working Set”) must be in physical memory. –Contiguous structures (like stacks) use only as much physical memory as necessary yet still grow later. Protection: –Different threads (or processes) protected from each other. –Different pages can be given special behavior (Read Only, Invisible to user programs, etc). –Kernel data protected from User programs –Very important for protection from malicious programs => Far more “viruses” under Microsoft Windows Sharing: –Can map same physical page to multiple users (“Shared memory”)

Virtual Memory Mapping Virtual Address: page no.offset Page Table Base Reg Page Table located in physical memory (actually, concatenation) index into page table + Physical Memory Address Page Table Val -id Access Rights Physical Page Address. V A.R. P. P. A....

Issues in VM Design What is the size of information blocks that are transferred from secondary to main storage (M)?  page size (Contrast with physical block size on disk, I.e. sector size) Which region of M is to hold the new block  placement policy How do we find a page when we look for it?  block identification Block of information brought into M, and M is full, then some region of M must be released to make room for the new block  replacement policy What do we do on a write?  write policy Missing item fetched from secondary memory only on the occurrence of a fault  demand load policy pages reg cache mem disk frame

Virtual to Physical Address Translation Each program operates in its own virtual address space; ~only program running Each is protected from the other OS can decide where each goes in memory Hardware (HW) provides virtual -> physical mapping virtual address (inst. fetch load, store) Program operates in its virtual address space HW mapping physical address (inst. fetch load, store) Physical memory (incl. caches)

Mapping Virtual Memory to Physical Memory 0 Physical Memory  CodeStatic Heap Stack 64 MB Divide into equal sized chunks (about 4KB) 0 Any chunk of Virtual Memory assigned to any chuck of Physical Memory (“page”)

Paging Organization (eg: 1KB Page) Addr Trans MAP Page is unit of mapping Page also unit of transfer from disk to physical memory page 0 1K Virtual Memory Virtual Address page 1 page 31 1K 2048 page 2... page Physical Address Physical Memory 1K page 1 page 7...

Virtual Memory Problem # 1 Map every address  1 extra memory access for every memory access Observation: since locality in pages of data, must be locality in virtual addresses of those pages Why not use a cache of virtual to physical address translations to make translation fast? (small is fast) For historical reasons, cache is called a Translation Lookaside Buffer, or TLB

Virtual Memory Problem # 2 Processor Trans- lation Cache Main Memory VA PA miss hit data Cache typically operates on physical addresses Page Table access is another memory access for each program memory access! Need to fix this!

Virtual Memory Problem # 2 Map every address  1 extra cache (TLB) access for every memory access Why not just have cache use virtual addresses? Problem: Different processes use same virtual addresses Could invalidate cache on context switch, but results in lots of compulsory misses Prefix addresses with process ID Other issues with virtual caches…

Virtual Memory Problem #2 Note: Only high order bits of address are translated, low order bits are not translated Feed low order index bits to cache while also doing translation Get translated tag and match with cache TLB is usually smaller and faster, so has translation ready before tag is ready Why it is called a translation LOOKASIDE buffer

Lookaside Works if page bits >= index+offset size Example: –4 KB page => 12 bit page index –256 lines of 16-byte blocks => 8 index, 4 offset bits –But only 256*16=4 KB cache size! Solution: Use larger set size –16-way cache would be 64 KB Solution: Use larger page size –But causes internal fragmentation – unused part of page Virtual Page # VA Page Index TLB Physical Page #Page Index PA TagIndexOffset TLB Tag = Hit VPNPI Data To CPU

Typical TLB Format VirtualPhysicalDirtyRef Valid Access Address AddressRights TLB just a cache on the page table mappings TLB access time comparable to cache (much less than main memory access time) Ref: Used to help calculate LRU on replacement Dirty: since use write back, need to know whether or not to write page to disk when replaced

What if not in TLB Option 1: Hardware checks page table and loads new Page Table Entry into TLB Option 2: Hardware traps to OS, up to OS to decide what to do MIPS follows Option 2: Hardware knows nothing about page table format

TLB Miss If the address is not in the TLB, MIPS traps to the operating system The operating system knows which program caused the TLB fault, page fault, and knows what the virtual address desired was requested 291 valid virtual physical

TLB Miss: If data is in Memory We simply add the entry to the TLB, evicting an old entry from the TLB valid virtual physical

What if data is on disk ? We load the page off the disk into a free block of memory, using a DMA transfer –Meantime we switch to some other process waiting to be run When the DMA is complete, we get an interrupt and update the process's page table –So when we switch back to the task, the desired data will be in memory

What if the memory is full ? We load the page off the disk into a free block of memory, using a DMA transfer –Meantime we switch to some other process waiting to be run When the DMA is complete, we get an interrupt and update the process's page table –So when we switch back to the task, the desired data will be in memory

Memory Organization with TLB TLBs usually small, typically entries Like any other cache, the TLB can be fully associative, set associative, or direct mapped Processor TLB Lookup Cache Main Memory VA PA miss hit data Trans- lation hit miss

Virtual Memory Problem # 3 Page Table too big! –4GB Virtual Memory ÷ 4 KB page  ~ 1 million Page Table Entries  4 MB just for Page Table for 1 process, 25 processes  100 MB for Page Tables! Variety of solutions to tradeoff memory size of mapping function for slower when miss TLB –Make TLB large enough, highly associative so rarely miss on address translation

Two Level Page Tables 0 Physical Memory 64 MB Virtual Memory  Code Static Heap Stack nd Level Page Tables Super Page Table

Summary Apply Principle of Locality Recursively Reduce Miss Penalty? add a (L2) cache Manage memory to disk? Treat as cache –Included protection as bonus, now critical –Use Page Table of mappings vs. tag/data in cache Virtual memory to Physical Memory Translation too slow? –Add a cache of Virtual to Physical Address Translations, called a TLB

Summary Virtual Memory allows protected sharing of memory between processes with less swapping to disk, less fragmentation than always swap or base/bound Spatial Locality means Working Set of Pages is all that must be in memory for process to run fairly well TLB to reduce performance cost of VM Need more compact representation to reduce memory size cost of simple 1-level page table (especially 32-  64-bit address)