CPEG3231 Virtual Memory

CPEG3232 Review: The memory hierarchy
Increasing distance from the processor means increasing access time: Processor, L1$, L2$, Main Memory, Secondary Memory, with the (relative) size of the memory growing at each level.
Typical transfer sizes: 4-8 bytes (word) between processor and L1$; 8-32 bytes (block) between L1$ and L2$; 1 to 4 blocks between L2$ and main memory; 1,024+ bytes (disk sector = page) between main memory and secondary memory.
Inclusive: what is in L1$ is a subset of what is in L2$, which is a subset of what is in main memory, which is a subset of what is in secondary memory.
 Take advantage of the principle of locality to present the user with as much memory as is available in the cheapest technology, at the speed offered by the fastest technology

CPEG3233 Virtual memory
 Use main memory as a “cache” for secondary memory
 - Allows efficient and safe sharing of memory among multiple programs
 - Provides the ability to easily run programs larger than the size of physical memory
 - Automatically manages the memory hierarchy (presenting it as “one level”)
 What makes it work? Again, the Principle of Locality
 - A program is likely to access a relatively small portion of its address space during any period of time
 Each program is compiled into its own address space, a “virtual” address space
 - At run time each virtual address must be translated to a physical address (an address in main memory)

CPEG3234 IBM System/360 Model 67

CPEG3235 VM simplifies loading and sharing
 Simplifies loading a program for execution by avoiding code relocation
 Address mapping allows programs to be loaded at any location in physical memory
 Simplifies shared libraries, since all sharing programs can use the same virtual addresses
 Relocation does not need special OS + hardware support as in the past

CPEG3236 Virtual memory motivation
“Historically, there were two major motivations for virtual memory: to allow efficient and safe sharing of memory among multiple programs, and to remove the programming burden of a small, limited amount of main memory.” [Patt&Henn]
“…a system has been devised to make the core-drum combination appear to the programmer as a single level store, the requisite transfers taking place automatically” [Kilburn et al.]

CPEG3237 Terminology
 Page: a fixed-size block of memory
 Segment: a contiguous, variable-sized block of memory
 Page fault: a page is referenced, but is not in memory
 Virtual address: the address seen by the program
 Physical address: the address seen by the cache or memory
 Memory mapping or address translation: next slide

CPEG3238 Memory management unit
[Figure: the Memory Management Unit receives a Virtual Address from the processor and delivers the translated Physical Address to memory; on a page fault it invokes an elaborate software page-fault handling algorithm.]

CPEG3239 Address translation
[Figure: a Virtual Address (VA) consists of a virtual page number and a page offset; translation replaces the virtual page number with a physical page number, leaving the page offset unchanged, to form the Physical Address (PA).]
 So each memory request first requires an address translation from the virtual space to the physical space
 - A virtual memory miss (i.e., when the page is not in physical memory) is called a page fault
 A virtual address is translated to a physical address by a combination of hardware and software
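The split-and-reassemble step above can be sketched in a few lines of Python (the 4 KB page size and the page-table contents here are made-up, illustrative values):

```python
PAGE_OFFSET_BITS = 12          # assume 4 KB pages (an illustrative choice)
PAGE_SIZE = 1 << PAGE_OFFSET_BITS

# Hypothetical page table: virtual page number -> physical page number
page_table = {0x00000: 0x2F, 0x00001: 0x10}

def translate(va):
    """Translate a virtual address to a physical address.
    The page offset passes through unchanged; only the page number is mapped."""
    vpn = va >> PAGE_OFFSET_BITS          # virtual page number
    offset = va & (PAGE_SIZE - 1)         # page offset
    if vpn not in page_table:
        raise LookupError("page fault")   # page not in physical memory
    ppn = page_table[vpn]                 # physical page number
    return (ppn << PAGE_OFFSET_BITS) | offset
```

For example, translating 0x0ABC with this table keeps the offset 0xABC and replaces virtual page 0 with physical page 0x2F.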

CPEG32310 Mapping virtual to physical space
[Figure: (a) a 64K virtual address space and (b) a 32K main memory, divided into 4K pages; virtual addresses on the left map onto main memory addresses on the right.]

CPEG32311 A paging system
[Figure: virtual page numbers, page table, physical memory, disk storage.] The page table maps each page in virtual memory to either a page in physical memory or a page stored on disk, which is the next level in the hierarchy.
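A minimal sketch of such a page table (entries and disk sectors invented for illustration): each virtual page maps either to a physical page or to its home on disk, and referencing a disk-resident page is exactly a page fault:

```python
# Hypothetical page table: each virtual page maps either to a physical
# page ("mem") or to its location on disk ("disk"), the next level down.
page_table = {
    0: ("mem", 7),
    1: ("disk", 0x1A2B),
    2: ("mem", 3),
}

def pt_lookup(vpn):
    """Return the physical page for vpn, or raise a page fault naming
    the disk location the OS must page in from."""
    kind, where = page_table[vpn]
    if kind == "mem":
        return where
    raise LookupError(f"page fault: fetch from disk sector {where:#x}")
```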

CPEG32312 A virtual address cache (TLB)
[Figure: virtual page number, TLB, page table, physical memory, disk storage.] The TLB acts as a cache on the page table, holding only the entries that map to physical pages.

CPEG32313 Two Programs Sharing Physical Memory
[Figure: Program 1's and Program 2's virtual address spaces both map into main memory.]
 A program's address space is divided into pages (all one fixed size) or segments (variable sizes)
 - The starting location of each page (either in main memory or in secondary memory) is contained in the program's page table

CPEG32314 Typical ranges of VM parameters: contrasted with the values for caches, these figures represent increases of 10 to 100,000 times.

CPEG32315 Some virtual memory design parameters
                         Paged VM                    | TLBs
Total size:              16,000 to 250,000 words     | 16 to 512 entries
Total size (KB):         250,000 to 1,000,000,000    | 0.25 to 16
Block size (B):          4000 to 64,000              | 4 to 32
Miss penalty (clocks):   10,000,000 to 100,000,000   | 10 to 1000
Miss rates:              0.00001% to 0.001%          | 0.01% to 2%

CPEG32316 Technology
Technology     | Access Time       | $ per GB in 2004
SRAM           | 0.5 - 5 ns        | $4,000 - $10,000
DRAM           | 50 - 70 ns        | $100 - $200
Magnetic disk  | 5 - 20 x 10^6 ns  | $0.50 - $2

CPEG32317 Address Translation Consideration  Direct mapping using register sets  Indirect mapping using tables  Associative mapping of frequently used pages

CPEG32318 The Page Table (PT) must have one entry for each page in virtual memory! How many Pages? How large is PT? Fundamental considerations

CPEG32319 4 key design issues
 Pages should be large enough to amortize the high access time. From 4 KB to 16 KB are typical, and some designers are considering sizes as large as 64 KB.
 Organizations that reduce the page fault rate are attractive. The primary technique used here is to allow flexible placement of pages (e.g., fully associative).

CPEG32320 4 key design issues (cont.)
 Page faults (misses) in a virtual memory system can be handled in software, because the overhead will be small compared to the access time to disk. Furthermore, the software can afford to use clever algorithms for choosing how to place pages, because even small reductions in the miss rate will pay for the cost of such algorithms.
 Using write-through to manage writes in virtual memory will not work, since writes take too long. Instead, we need a scheme that reduces the number of disk writes.

CPEG32321 Page Size Selection Constraints  Efficiency of secondary memory device (slotted disk/drum)  Page table size  Page fragmentation: last part of last page  Program logic structure: logic block size: < 1K ~ 4K  Table fragmentation: full PT can occupy large, sparse space  Uneven locality: text, globals, stack  Miss ratio

CPEG32322 An Example - Case 1
VM page size: 512
VM address space: 64K
Total virtual pages = 64K / 512 = 128 pages

CPEG32323 An Example (cont.) - Case 2
VM page size: 512 = 2^9
VM address space: 4G = 2^32
Total virtual pages = 4G / 512 = 2^23 = 8M pages
Each PTE has 32 bits, so total PT size = 8M x 4 = 32M bytes
Note: assuming main memory holds a working set of 4M bytes, that is 4M / 512 = 2^13 = 8192 pages

CPEG32324 An Example (cont.)
How about a VM address space of 2^52 (R-6000) (4 Petabytes), with a 4K-byte page size? Total number of virtual pages: 2^52 / 2^12 = 2^40 pages!
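The page-count arithmetic in these examples is easy to check mechanically; a small sketch:

```python
# Check the page-count arithmetic from the example cases above.
def num_pages(addr_space_bytes, page_size_bytes):
    """Number of virtual pages = address-space size / page size."""
    return addr_space_bytes // page_size_bytes

def page_table_bytes(n_pages, pte_bytes):
    """One PTE per virtual page, so PT size = pages x PTE size."""
    return n_pages * pte_bytes
```

Case 1 gives 64K/512 = 128 pages; case 2 gives 2^32/2^9 = 8M pages and a 32 MB page table; the 2^52 address space with 4 KB pages gives 2^40 pages.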

CPEG32325 Techniques for Reducing PT Size
 Set a lower limit, and permit dynamic growth
 Permit growth from both directions (text, stack)
 Inverted page table (a hash table)
 Multi-level page table (segments and pages)
 The PT itself can be paged: i.e., put the PT itself in the virtual address space (Note: some small portion of its pages should be in main memory and never paged out)
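The multi-level (segment + page) idea can be sketched with dictionaries, where inner tables for unused regions are simply never allocated; the 12/10-bit field split and the single mapped entry below are made-up illustrative values:

```python
# Two-level translation sketch: the top bits of the VA index an outer
# (segment) table, the middle bits an inner page table. Inner tables for
# unused regions are absent, which is what shrinks the total PT size.
OFFSET_BITS, INNER_BITS = 12, 10          # illustrative split of a 32-bit VA

outer = {1: {3: 0x55}}                    # hypothetical: one region mapped

def translate2(va):
    inner_idx = (va >> OFFSET_BITS) & ((1 << INNER_BITS) - 1)
    outer_idx = va >> (OFFSET_BITS + INNER_BITS)
    inner = outer.get(outer_idx)
    if inner is None or inner_idx not in inner:
        raise LookupError("page fault")
    return (inner[inner_idx] << OFFSET_BITS) | (va & ((1 << OFFSET_BITS) - 1))
```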

CPEG32326 LSI-11/73 Segment Registers

CPEG32327 VM implementation issues
 Page fault handling: hardware, software or both
 Efficient input/output: slotted drum/disk
 Queue management: a process can be linked onto
 - the CPU ready queue: waiting for the CPU
 - the page-in queue: waiting for a page transfer from disk
 - the page-out queue: waiting for a page transfer to disk
 Protection issues: read/write/execute
 Management bits: dirty, reference, valid
 Multiple program issues: context switch, timeslice end

CPEG32328 Where to place pages
 Placement: OS designers always pick a lower miss rate over a simpler placement algorithm
 So, full associativity: VM pages can go anywhere in main memory (compare with a sector cache)
 Question: why not use associative hardware? (The number of PT entries is too big!)

CPEG32329 How to handle protection and multiple users
[Figure: virtual-to-real address translation using the page map. The virtual address (pid, page number p, word offset w) is looked up in the TLB and the page map; each page map entry PME(x) holds RWX protection bits, a pid, M/C/P bits, and the page frame address (PFA) in memory or in secondary memory. The requested access type (RWX, S/U) is validated by operation validation: a mismatch raises an access fault, a missing page raises a page fault, and a replacement policy chooses victims.]
If s/u = 1: supervisor mode
PME(x).C = 1: page has been modified
PME(x).P = 1: page is private to the process
PME(x).pid: process identification number
PME(x).PFA: page frame address

CPEG32330 Page fault handling
 When a virtual page number is not in the TLB, the PT in memory is accessed (through the PTBR) to find the PTE
 Hopefully, the PTE is in the data cache
 If the PTE indicates that the page is missing, a page fault occurs
 If so, put the disk sector number and page number on the page-in queue and continue with the next process
 If all page frames in main memory are occupied, find a suitable victim and put it on the page-out queue

CPEG32331 Fast address translation
 Translation through the PT requires at least two memory accesses for each memory fetch or store
 Improvements:
 - Store the PT in fast registers (example: Xerox, 256 registers)
 - Implement a VM address cache (TLB)
 - Make maximal use of the instruction/data cache
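A TLB in front of the page table can be sketched as a small map consulted first; the sizes, contents, and the FIFO eviction stand-in below are illustrative assumptions (a later slide notes that real TLBs often replace randomly for speed):

```python
# Tiny TLB sketch: check the small fast map first; fall back to the
# full page table and cache the translation on a TLB miss.
page_table = {vpn: vpn + 100 for vpn in range(1024)}   # hypothetical PT
tlb = {}
TLB_CAPACITY = 4

def tlb_lookup(vpn):
    if vpn in tlb:                        # TLB hit: no page-table access
        return tlb[vpn], "tlb hit"
    ppn = page_table[vpn]                 # TLB miss: walk the page table
    if len(tlb) >= TLB_CAPACITY:
        tlb.pop(next(iter(tlb)))          # evict (FIFO stand-in for random)
    tlb[vpn] = ppn
    return ppn, "tlb miss"
```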

CPEG32332 Some typical values for a TLB might be: the miss penalty may sometimes be as high as 100 cycles, and the TLB may have as few as 16 entries.

CPEG32333 TLB design issues
 Placement policy:
 - Small TLBs: fully associative placement can be used
 - Large TLBs: fully associative placement may be too slow
 Replacement policy: a random policy is used for speed/simplicity
 TLB miss rate is low (Clark-Emer data [85]): 3-4 times smaller than the usual cache miss rate
 TLB miss penalty is relatively low; it usually results in a cache fetch

CPEG32334 TLB design issues (cont.)
 A TLB miss implies a higher miss rate for the main cache
 TLB translation is process-dependent
 - Strategies for context switching: 1. tagging entries by context; 2. flushing (a complete purge, or a purge by context for shared entries)
 - No absolute answer

CPEG32335 A Case Study: DECStation 3100
[Figure: the virtual address (virtual page number + page offset) indexes the TLB (valid, dirty, tag, physical page number); the resulting physical address (tag, index, byte offset) then indexes the cache (valid, tag, data), producing the TLB hit and cache hit signals and the data.]

CPEG32336 DECStation 3100 TLB and cache
[Flowchart: the virtual address goes to TLB access. On a TLB miss, raise a TLB miss exception. On a TLB hit: for a read, try to read the data from the cache; a cache hit delivers the data, a cache miss stalls. For a write, check protection, then write the data into the cache, update the dirty bit, and put the data and the address into the write buffer.]

CPEG32337 IBM System/360 Model 67 memory management unit: CPU cycle time 200 ns, memory cycle time 750 ns

CPEG32338 IBM System/360 Model 67 address translation
[Figure: Dynamic Address Translation (DAT). The 32-bit Virtual Address is split into Segment (12) | Page (8) | Offset (12). The bus-out address from the CPU carries Page (12) | Offset (12), and DAT maps it to the bus-in address to memory, also Page (12) | Offset (12).]
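Following the field widths in this figure (Segment 12 | Page 8 | Offset 12 of a 32-bit virtual address), the split is plain shifting and masking:

```python
# Field split for the 32-bit virtual address shown above:
# Segment (12 bits) | Page (8 bits) | Offset (12 bits).
def split_va(va):
    offset = va & 0xFFF            # low 12 bits
    page = (va >> 12) & 0xFF       # next 8 bits
    segment = (va >> 20) & 0xFFF   # top 12 bits
    return segment, page, offset
```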

CPEG32339 IBM System/360 Model 67 associative registers
[Figure: the bus-out address from the CPU, VM Page (12) | Offset (12), is mapped by the associative registers to the bus-in address to memory, PH Page (12) | Offset (12).]

CPEG32340 IBM System/360 Model 67 segment/page mapping
[Figure: the 24-bit virtual address is split into Segment (4) | Page (8) | Offset (12). The Segment Table Register (32) plus the segment number selects a Segment Table entry, which points to a 256-entry Page Table; each Page Table entry carries V/R/W bits and a physical page number (24-bit physical address). V = valid bit, R = reference bit, W = write (dirty) bit.]

CPEG32341 Virtual addressing with a cache
[Figure: CPU issues a VA, translation produces a PA, then the cache is accessed; on a miss, main memory supplies the data.]
 Thus it takes an extra memory access to translate a VA to a PA
 This makes memory (cache) accesses very expensive (if every access were really two accesses)
 The hardware fix is to use a Translation Lookaside Buffer (TLB): a small cache that keeps track of recently used address mappings to avoid having to do a page table lookup

CPEG32342 Making address translation fast
[Figure: the TLB holds (V, tag, physical page base addr) entries indexed by virtual page number; on a miss, the Page Table (in physical memory) supplies the physical page base address, or points to disk storage.]

CPEG32343 Translation lookaside buffers (TLBs)
 Just like any other cache, the TLB can be organized as fully associative, set associative, or direct mapped
[TLB entry fields: V, virtual page #, physical page #, dirty, ref, access.]
 TLB access time is typically smaller than cache access time (because TLBs are much smaller than caches)
 - TLBs are typically not more than 128 to 256 entries even on high end machines

CPEG32344 A TLB in the memory hierarchy
 A TLB miss: is it a page fault or merely a TLB miss?
 - If the page is loaded into main memory, then the TLB miss can be handled (in hardware or software) by loading the translation information from the page table into the TLB
   - Takes 10's of cycles to find and load the translation info into the TLB
 - If the page is not in main memory, then it's a true page fault
   - Takes 1,000,000's of cycles to service a page fault
 TLB misses are much more frequent than true page faults
[Figure: CPU issues a VA to the TLB lookup; a hit goes straight to the cache and main memory with the PA, a miss goes through translation and back.]
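The three-way distinction above (TLB hit, TLB miss with the page in memory, true page fault) can be sketched as:

```python
# Decision flow for a memory reference, following the slide above.
# Returns which of the three events occurred.
def classify(vpn, tlb, page_table):
    if vpn in tlb:
        return "tlb hit"                   # the common, fast case
    if vpn in page_table:
        tlb[vpn] = page_table[vpn]         # refill the TLB from the page table
        return "tlb miss"                  # tens of cycles
    return "page fault"                    # millions of cycles: go to disk
```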

CPEG32345 Two Machines' Cache Parameters
TLB organization:
Intel P4: 1 TLB for instructions and 1 TLB for data; both 4-way set associative; both use ~LRU replacement; both have 128 entries; TLB misses handled in hardware.
AMD Opteron: 2 TLBs for instructions and 2 TLBs for data; both L1 TLBs fully associative with ~LRU replacement; both L2 TLBs 4-way set associative with round-robin LRU; both L1 TLBs have 40 entries; both L2 TLBs have 512 entries; TLB misses handled in hardware.

CPEG32346 TLB Event Combinations
TLB  | Page Table | Cache    | Possible? Under what circumstances?
Hit  | Hit        | Hit      |
Hit  | Hit        | Miss     |
Miss | Hit        | Hit      |
Miss | Hit        | Miss     |
Miss | Miss       | Miss     |
Hit  | Miss       | Miss/Hit |
Miss | Miss       | Hit      |

CPEG32347 TLB Event Combinations
TLB  | Page Table | Cache    | Possible? Under what circumstances?
Hit  | Hit        | Hit      | Yes: what we want!
Hit  | Hit        | Miss     | Yes: although the page table is not checked if the TLB hits
Miss | Hit        | Hit      | Yes: TLB miss, PA in page table
Miss | Hit        | Miss     | Yes: TLB miss, PA in page table, but data not in cache
Miss | Miss       | Miss     | Yes: page fault
Hit  | Miss       | Miss/Hit | Impossible: TLB translation not possible if page is not present in memory
Miss | Miss       | Hit      | Impossible: data not allowed in cache if page is not in memory
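The two "impossible" rows both follow from one invariant: neither the TLB nor the cache may hold anything for a page that is not present in memory. A sketch encoding that rule:

```python
# Encode the consistency rules behind the table above: a TLB hit or a
# cache hit each imply the page is present in memory (page-table hit).
def possible(tlb_hit, pt_hit, cache_hit):
    if tlb_hit and not pt_hit:
        return False    # TLB can't hold a translation for an absent page
    if cache_hit and not pt_hit:
        return False    # data can't be cached if its page isn't in memory
    return True
```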

CPEG32348 Reducing Translation Time
 Can overlap the cache access with the TLB access
 - Works when the high order bits of the VA are used to access the TLB while the low order bits are used as the index into the cache
[Figure: the VA tag goes to the TLB, producing the PA tag and the TLB hit signal, while the index and block offset select a set in a 2-way associative cache; the PA tag is then compared against both ways' tags to produce the cache hit signal and the desired word.]
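The overlap only works when every cache index and block-offset bit comes from the page offset, which translation leaves unchanged; equivalently, cache size divided by associativity must not exceed the page size. A quick check (the cache and page sizes below are illustrative):

```python
# For the TLB/cache overlap to be safe, the cache index and block-offset
# bits must all come from the page offset, which translation never changes.
# Equivalently: (cache size / associativity) <= page size.
def overlap_ok(cache_bytes, associativity, page_bytes):
    return cache_bytes // associativity <= page_bytes
```

This is one reason growing a cache often means growing its associativity rather than its per-way size.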

CPEG32349 Why Not a Virtually Addressed Cache?
 A virtually addressed cache would only require address translation on cache misses
[Figure: the CPU accesses the cache directly with the VA; translation to a PA happens only on a miss, on the way to main memory.]
but:
 - Two different virtual addresses can map to the same physical address (when processes are sharing data), i.e., two different cache entries hold data for the same physical address: synonyms
   - Must update all cache entries with the same physical address, or the memory becomes inconsistent

CPEG32350 The Hardware/Software Boundary
 What parts of the virtual to physical address translation are done by, or assisted by, the hardware?
 - Translation Lookaside Buffer (TLB) that caches the recent translations
   - TLB access time is part of the cache hit time
   - May allot an extra stage in the pipeline for TLB access
 - Page table storage, fault detection and updating
   - Page faults result in interrupts (precise) that are then handled by the OS
   - Hardware must support (i.e., update appropriately) Dirty and Reference bits (e.g., ~LRU) in the Page Tables
 - Disk placement
   - Bootstrap (e.g., out of disk sector 0) so the system can service a limited number of page faults before the OS is even loaded

CPEG32351 [Figure: virtual page number, TLB, page table, physical memory, disk storage. The TLB acts as a cache on the page table for the entries that map to physical pages only: it needs very little hardware, with software assist, while the page table itself is managed in software.]

CPEG32352 Summary
 The Principle of Locality:
 - A program is likely to access a relatively small portion of the address space at any instant of time
   - Temporal Locality: Locality in Time
   - Spatial Locality: Locality in Space
 Caches, TLBs, and Virtual Memory can all be understood by examining how they deal with four questions:
 1. Where can a block be placed?
 2. How is a block found?
 3. Which block is replaced on a miss?
 4. How are writes handled?
 Page tables map virtual addresses to physical addresses
 - TLBs are important for fast translation