Virtual Memory: Part Deux

Address Translation: putting it all together
A Virtual Address first goes to the TLB Lookup (hardware).
On a TLB hit, a Protection Check (hardware) follows: if the access is permitted, the Physical Address is sent to the cache; if it is denied, a Protection Fault is raised and the OS delivers SEGFAULT.
On a TLB miss, a Page Table Walk is performed (in hardware or software): if the page is resident in memory, the TLB is updated and the instruction is restarted; if it is not, a Page Fault is raised, the OS loads the page (software), and the instruction is restarted. The faulting instruction must therefore be restartable.
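
A minimal C sketch of this flow, assuming hypothetical helper routines (tlb_lookup, page_table_walk, os_load_page, and so on) that stand in for the hardware and OS mechanisms named in the diagram; this is not a real API.

```c
#include <stdbool.h>
#include <stdint.h>

typedef enum { XLATE_OK, XLATE_RESTART, XLATE_SEGFAULT } xlate_result_t;

/* Hypothetical placeholders for the mechanisms in the diagram. */
bool tlb_lookup(uint64_t va, uint64_t *pa, int *prot);   /* hardware        */
bool page_table_walk(uint64_t va, uint64_t *pte);        /* hw or software  */
bool pte_resident(uint64_t pte);
void tlb_update(uint64_t va, uint64_t pte);
void os_load_page(uint64_t va);                          /* page fault path */
bool access_permitted(int prot, int access_type);

xlate_result_t translate(uint64_t va, int access_type, uint64_t *pa)
{
    int prot;
    if (!tlb_lookup(va, pa, &prot)) {                    /* TLB miss              */
        uint64_t pte;
        if (!page_table_walk(va, &pte) || !pte_resident(pte)) {
            os_load_page(va);                            /* OS loads the page     */
            return XLATE_RESTART;                        /* restart instruction   */
        }
        tlb_update(va, pte);
        return XLATE_RESTART;                            /* restart; hits next time */
    }
    if (!access_permitted(prot, access_type))            /* protection check      */
        return XLATE_SEGFAULT;                           /* protection fault      */
    return XLATE_OK;                                     /* PA goes to the cache  */
}
```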

Address Translation in CPU Pipeline
In the pipeline, the PC feeds an Inst. TLB and Inst. Cache, and the memory stage uses a Data TLB and Data Cache; either TLB can report a TLB miss, a page fault, or a protection violation.
- Need a restartable exception on a page fault or protection violation for the software handler.
- TLB refill on a TLB miss can be done in either hardware or software.
- Coping with the additional latency of a TLB: slow down the clock, pipeline the TLB and cache access, use virtual-address caches, or access the TLB and cache in parallel.

Virtual Address Caches
Instead of a physical cache (CPU → TLB → cache → primary memory), place the cache before the TLB (CPU → virtual cache → TLB → primary memory), as in the StrongARM.
+ one-step process in case of a hit
- the cache needs to be flushed on a context switch unless address space identifiers (ASIDs) are included in the tags
- aliasing problems due to the sharing of pages

Aliasing in Virtual-Address Caches
Two virtual pages can share one physical page, e.g., when two processes sharing the same file map the same memory segment to different parts of their address spaces. A virtual cache can then hold two copies of the same physical data (a first and a second copy of the data at that PA, tagged by different VAs), and writes to one copy are not visible to reads of the other!
General solution: disallow aliases from coexisting in the cache.
Software (i.e., OS) solution for a direct-mapped cache: the VAs of shared pages must agree in the cache index bits; this ensures that all VAs accessing the same PA will conflict in the direct-mapped cache (early SPARCs).
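
A small sketch of the page-coloring check implied by that OS solution, assuming 4 KB pages and a 16 KB direct-mapped, virtually indexed cache (both sizes are illustrative, not from the slide):

```c
/* Two virtual addresses may safely map to the same physical page only if
 * they agree in the cache index bits that lie above the page offset
 * (their "page color"), so that they conflict in the direct-mapped cache. */
#include <stdint.h>
#include <stdio.h>

#define PAGE_BITS   12                                   /* 4 KB pages          */
#define CACHE_BITS  14                                   /* 16 KB direct-mapped */
#define COLOR_MASK  (((1u << (CACHE_BITS - PAGE_BITS)) - 1) << PAGE_BITS)

static int same_cache_color(uint32_t va1, uint32_t va2)
{
    return (va1 & COLOR_MASK) == (va2 & COLOR_MASK);
}

int main(void)
{
    printf("%d\n", same_cache_color(0x00001000, 0x00005000)); /* 1: colors match */
    printf("%d\n", same_cache_color(0x00001000, 0x00002000)); /* 0: would alias  */
    return 0;
}
```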

Concurrent Access to TLB & Cache
A direct-mapped cache with 2^L blocks of 2^b bytes is indexed with the virtual index while the TLB translates the VPN; the PPN is then compared against the physical tag to determine a hit.
Index L is available without consulting the TLB ⇒ cache and TLB accesses can begin simultaneously. The tag comparison is made after both accesses are completed.
Cases, where k is the number of page-offset bits: L + b = k, L + b < k, L + b > k.

Virtual-Index Physical-Tag Caches: Associative Organization
Here the cache is split into 2^a direct-mapped ways, each with 2^L blocks of 2^b bytes, so that the virtual index stays within the page offset. After the PPN is known, 2^a physical tags are compared.
Is this scheme realistic? Consider 4-Kbyte pages and caches with 32-byte blocks:
32-Kbyte cache ⇒ 2^a = 8
4-Mbyte cache ⇒ 2^a = 1024
No!
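
The slide's arithmetic can be rechecked in a few lines of C: with 4-Kbyte pages, each direct-mapped way can hold at most one page, so a cache of size C needs 2^a = C / (page size) ways (the 32-byte block size does not change this count). The program below is just that calculation.

```c
#include <stdio.h>

int main(void)
{
    const long page_bytes = 4 * 1024;                     /* 2^k bytes, k = 12 */
    const long sizes[]    = {32 * 1024, 4L * 1024 * 1024};

    for (int i = 0; i < 2; i++) {
        long ways = sizes[i] / page_bytes;                /* required 2^a */
        printf("%7ld-byte cache needs 2^a = %ld ways\n", sizes[i], ways);
    }
    return 0;   /* prints 8 and 1024, matching the slide */
}
```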

Concurrent Access to TLB & Large L1
Here the L1 is a large, physically addressed (PA), direct-mapped cache whose virtual index extends 'a' bits above the page offset; the TLB translates the VPN to a PPN, which is compared against the stored tag.
Can VA1 and VA2 both map to the same PA? Yes, if they differ only in the lower 'a' bits (of the VPN) and share a physical page.
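
A tiny C predicate capturing that aliasing condition; the 12-bit page offset and the value of 'a' are illustrative assumptions.

```c
#include <stdint.h>

#define PAGE_OFFSET_BITS 12                       /* 4 KB pages                    */
#define A_BITS            2                       /* index bits drawn from the VPN */
#define A_MASK  (((1u << A_BITS) - 1) << PAGE_OFFSET_BITS)

/* Given two virtual addresses and their translated physical addresses,
 * do they alias in the large L1, i.e., name the same physical data but
 * select different cache sets?                                          */
int aliases_in_large_l1(uint32_t va1, uint32_t pa1, uint32_t va2, uint32_t pa2)
{
    int same_physical_page = (pa1 >> PAGE_OFFSET_BITS) == (pa2 >> PAGE_OFFSET_BITS);
    int different_sets     = (va1 & A_MASK) != (va2 & A_MASK);
    return same_physical_page && different_sets;
}
```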

Second Level Cache
The CPU (with its register file) is backed by separate L1 Instruction and Data caches, which are in turn backed by a unified L2 cache in front of memory.
Usually a common L2 cache backs up both the Instruction and Data L1 caches, and the L2 is "inclusive" of both the Instruction and Data caches.

Anti-Aliasing Using L2 (MIPS R10000)
The L1 is a virtually indexed, physically tagged, direct-mapped PA cache, and the direct-mapped L2 stores, along with each tag, the virtual-index bits of the line's L1 location ("into L2 tag").
Suppose VA1 and VA2 both map to PA and VA1 is already in L1 and L2 (VA1 ≠ VA2). After VA2 is resolved to PA, a collision will be detected in L2; VA1 will be purged from L1 and L2, and VA2 will be loaded ⇒ no aliasing!

Virtually-Addressed L1: Anti-Aliasing Using L2
The L1 is a VA cache with "virtual tags", while the L2 is a PA cache indexed and tagged physically; the L2 "contains" the L1.
A physically-addressed L2 can also be used to avoid aliases in a virtually-addressed L1.

Page Fault Handler
When the referenced page is not in DRAM:
- the missing page is located (or created)
- it is brought in from disk, and the page table is updated (a different job may start running on the CPU while the first job waits for the requested page to be read from disk)
- if no free pages are left, a page is swapped out, chosen by a pseudo-LRU replacement policy
Since it takes a long time to transfer a page (msecs), page faults are handled completely in software by the operating system. An untranslated addressing mode is essential to allow the kernel to access page tables.
(Speaker note: pseudo-LRU means you can't afford to do the real thing. Essentially there is one bit to indicate whether a page has been recently used or not, and a round-robin policy is used to find the first page with a value of 0.)
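
The one-bit, round-robin scheme in the speaker note is essentially the classic clock (second-chance) algorithm; here is a minimal C sketch of it, with the frame count and names chosen purely for illustration.

```c
#include <stdint.h>

#define NUM_FRAMES 1024

static uint8_t  referenced[NUM_FRAMES];   /* set on each use of the frame */
static unsigned hand;                     /* round-robin scan position    */

/* Returns the physical frame chosen for eviction: the first frame whose
 * referenced bit is 0, clearing bits (second chances) along the way.     */
unsigned choose_victim(void)
{
    for (;;) {
        if (referenced[hand] == 0) {
            unsigned victim = hand;
            hand = (hand + 1) % NUM_FRAMES;
            return victim;
        }
        referenced[hand] = 0;             /* used recently: clear and skip */
        hand = (hand + 1) % NUM_FRAMES;
    }
}
```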

Swapping a Page of a Page Table
A PTE in primary memory contains primary or secondary memory addresses, while a PTE in secondary memory contains only secondary memory addresses.
⇒ A page of a page table can be swapped out only if none of its PTEs point to pages in primary memory. Why?
(Speaker note: what is the worst thing you can do with respect to storing page tables? Storing on disk a page-table page whose entries point to physical memory.)

Atlas Revisited
There is one PAR (page address register) for each physical page; the PARs contain the VPNs of the pages resident in primary memory.
Advantage: the size is proportional to the size of primary memory. What is the disadvantage?
(Speaker note: good old Atlas had all the ideas; it was an improvement on most of its successors.)

Hashed Page Table: Approximating Associative Addressing
The VPN and PID are hashed together, and the result, added to the base of the table, gives the PA of a PTE in the hashed page table, which resides in primary memory.
- The hashed page table is typically 2 to 3 times larger than the number of PPNs, to reduce the collision probability.
- It can also contain DPNs for some non-resident pages (not common).
- If a translation cannot be resolved in this table, then the software consults a data structure that has an entry for every existing page.
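
A minimal C sketch of such a hashed lookup; the hash function, entry layout, and table size below are illustrative assumptions rather than anything specified on the slide.

```c
#include <stdint.h>
#include <stddef.h>

typedef struct {
    uint64_t vpn;       /* virtual page number this entry maps */
    uint32_t pid;       /* address-space / process identifier  */
    uint64_t ppn;       /* physical page number                */
    int      valid;
} hashed_pte_t;

#define TABLE_ENTRIES (1u << 16)     /* sized ~2-3x the number of PPNs */
static hashed_pte_t page_table[TABLE_ENTRIES];

static size_t hash_slot(uint64_t vpn, uint32_t pid)
{
    uint64_t h = vpn * 0x9E3779B97F4A7C15ull ^ pid;   /* simple mixing hash */
    return (size_t)(h % TABLE_ENTRIES);
}

/* Returns 1 and fills *ppn on a hit; 0 means the translation must be
 * resolved from the full software structure with an entry per page.   */
int hashed_pt_lookup(uint64_t vpn, uint32_t pid, uint64_t *ppn)
{
    hashed_pte_t *pte = &page_table[hash_slot(vpn, pid)];
    if (pte->valid && pte->vpn == vpn && pte->pid == pid) {
        *ppn = pte->ppn;
        return 1;
    }
    return 0;
}
```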

Global System Address Space
Users' address spaces are mapped by a Level A map into a large global system address space, which a Level B map translates onto physical memory.
- Level A maps users' address spaces into the global space, providing privacy, protection, sharing, etc.
- Level B provides demand paging for the large global system address space.
- Level A and Level B translations may be kept in separate TLBs.

Hashed Page Table Walk: PowerPC Two-Level, Segmented Addressing
The 64-bit user VA consists of a Seg ID, a page number, and a page offset. The Seg ID is hashed (hashS, relative to the PA of the segment table) into the Hashed Segment Table to obtain a Global Seg ID, yielding the 80-bit system VA (Global Seg ID, page, offset). The Global Seg ID and page number are then hashed (hashP, relative to the PA of the page table) into the Hashed Page Table to obtain the PPN, yielding the 40-bit PA (PPN, offset).

PowerPC: Hashed Page Table
The 80-bit VA is hashed, and the result, combined with the base of the table, gives the PA of a slot in the hashed page table, which resides in primary memory.
- Each hash-table slot has 8 PTEs <VPN, PPN> that are searched sequentially.
- If the first hash slot fails, an alternate hash function is used to look in another slot. All these steps are done in hardware!
- The hashed table is typically 2 to 3 times larger than the number of physical pages.
- The full backup page table is a software data structure.
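
A software model of this two-probe, eight-PTE-per-slot search may make the hardware steps concrete; the hash functions, sizes, and names below are illustrative assumptions (on real PowerPC hardware the search is performed by the MMU).

```c
#include <stdint.h>

#define SLOT_PTES  8
#define NUM_SLOTS  (1u << 15)

typedef struct { uint64_t vpn; uint64_t ppn; int valid; } ppc_pte_t;
static ppc_pte_t htab[NUM_SLOTS][SLOT_PTES];

static uint32_t hash_primary(uint64_t vpn)   { return (uint32_t)(vpn % NUM_SLOTS); }
static uint32_t hash_alternate(uint64_t vpn) { return (uint32_t)(~vpn % NUM_SLOTS); }

/* Search one slot's 8 PTEs sequentially. */
static int search_slot(const ppc_pte_t *slot, uint64_t vpn, uint64_t *ppn)
{
    for (int i = 0; i < SLOT_PTES; i++)
        if (slot[i].valid && slot[i].vpn == vpn) { *ppn = slot[i].ppn; return 1; }
    return 0;
}

/* Probe the primary slot, then the alternate slot; a miss in both means
 * the software-managed backup page table must be consulted.             */
int htab_lookup(uint64_t vpn, uint64_t *ppn)
{
    if (search_slot(htab[hash_primary(vpn)], vpn, ppn))   return 1;
    if (search_slot(htab[hash_alternate(vpn)], vpn, ppn)) return 1;
    return 0;
}
```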

Interrupts: altering the normal flow of control
An interrupt is an external or internal event that needs to be processed by another (system) program, the interrupt handler; control is diverted from the running program to the handler and later returned. The event is usually unexpected or rare from the program's point of view.

Causes of Interrupts
An interrupt is an event that requests the attention of the processor.
Asynchronous: an external event
- input/output device service request
- timer expiration
- power disruptions, hardware failure
Synchronous: an internal event (a.k.a. exceptions)
- undefined opcode, privileged instruction
- arithmetic overflow, FPU exception
- misaligned memory access
- virtual memory exceptions: page faults, TLB misses, protection violations
- traps: system calls, e.g., jumps into kernel

Asynchronous Interrupts: invoking the interrupt handler
An I/O device requests attention by asserting one of the prioritized interrupt request lines. When the processor decides to process the interrupt, it:
- stops the current program at instruction Ii, completing all the instructions up to Ii-1 (precise interrupt)
- saves the PC of instruction Ii in a special register (EPC)
- disables interrupts and transfers control to a designated interrupt handler running in kernel mode
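
A minimal C model of these three steps as seen at the precise interrupt point; the register and field names (epc, kernel_mode, HANDLER_PC) are loosely MIPS-flavored assumptions rather than any particular ISA's definitions.

```c
#include <stdint.h>

typedef struct {
    uint64_t pc;             /* PC of the next instruction to issue (Ii) */
    uint64_t epc;            /* exception PC: where to resume later      */
    int      interrupts_on;  /* global interrupt-enable bit              */
    int      kernel_mode;    /* current privilege level                  */
} cpu_state_t;

#define HANDLER_PC 0x80000180u   /* assumed fixed handler entry point */

/* Invoked once every instruction up to I(i-1) has committed, so the
 * interrupt is taken at a precise point.                              */
void take_async_interrupt(cpu_state_t *cpu)
{
    cpu->epc           = cpu->pc;      /* save PC of Ii in EPC        */
    cpu->interrupts_on = 0;            /* disable further interrupts  */
    cpu->kernel_mode   = 1;            /* enter kernel mode           */
    cpu->pc            = HANDLER_PC;   /* transfer control to handler */
}
```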

Interrupt Handler
- To allow nested interrupts, the EPC is saved before enabling interrupts ⇒ we need an instruction to move the EPC into the GPRs, and a way to mask further interrupts at least until the EPC can be saved.
- A status register indicates the cause of the interrupt; it must be visible to the interrupt handler.
- The return from an interrupt handler is a simple indirect jump, but it usually involves enabling interrupts, restoring the processor to user mode, and restoring hardware status and control state ⇒ a special return-from-exception instruction (RFE).

Synchronous Interrupts
A synchronous interrupt (exception) is caused by a particular instruction.
- In general, the instruction cannot be completed and needs to be restarted after the exception has been handled; this requires undoing the effect of one or more partially executed instructions.
- In the case of a trap (system call), the instruction is considered to have been completed: it is a special jump instruction involving a change to privileged kernel mode.

Exception Handling (5-Stage Pipeline)
Each stage can raise its own exception: PC address exceptions in fetch (Inst. Mem), an illegal opcode in Decode, overflow in execute, data address exceptions in the memory stage (Data Mem), plus asynchronous interrupts arriving from outside the pipeline.
- How do we handle multiple simultaneous exceptions in different pipeline stages?
- How and where do we handle external asynchronous interrupts?

Exception Handling (5-Stage Pipeline): adding a commit point
The solution adds a commit point at the memory stage. Exception flags from each stage (PC address exceptions, illegal opcode, data address exceptions, overflow) and asynchronous interrupts are gathered there; when an exception commits, the handler PC is selected and the F, D, and E stages and the writeback are killed.

Exception Handling (5-Stage Pipeline)
- Hold exception flags in the pipeline until the commit point (M stage).
- Exceptions in earlier pipe stages override later exceptions for a given instruction.
- Inject external interrupts at the commit point (they override the others).
- If there is an exception at commit: update the Cause and EPC registers, kill all stages, and inject the handler PC into the fetch stage.
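
A compact C sketch of this commit-point logic; the enum values and register arguments are illustrative assumptions meant to mirror the bullets above, not a particular pipeline's implementation.

```c
#include <stdint.h>

typedef enum {
    EXC_NONE = 0,
    EXC_PC_ADDRESS,      /* fetch stage   */
    EXC_ILLEGAL_OPCODE,  /* decode stage  */
    EXC_OVERFLOW,        /* execute stage */
    EXC_DATA_ADDRESS,    /* memory stage  */
    EXC_EXTERNAL_IRQ     /* injected at the commit point */
} exc_cause_t;

typedef struct {
    uint32_t    pc;
    exc_cause_t cause;   /* earliest-stage exception recorded so far */
} inst_t;

/* Stages flag exceptions in pipeline order, so keeping only the first
 * recorded cause makes earlier stages override later ones.            */
void flag_exception(inst_t *inst, exc_cause_t cause)
{
    if (inst->cause == EXC_NONE)
        inst->cause = cause;
}

/* At the commit point: returns 1 if all stages must be killed and the
 * handler PC injected into fetch; updates the Cause and EPC registers. */
int commit(inst_t *inst, int external_irq_pending,
           uint32_t *cause_reg, uint32_t *epc_reg)
{
    exc_cause_t cause = external_irq_pending ? EXC_EXTERNAL_IRQ : inst->cause;
    if (cause == EXC_NONE)
        return 0;                       /* instruction commits normally */
    *cause_reg = (uint32_t)cause;
    *epc_reg   = inst->pc;
    return 1;
}
```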

Virtual Memory Use Today
Desktops/servers have full demand-paged virtual memory:
- portability between machines with different memory sizes
- protection between multiple users or multiple tasks
- sharing of a small physical memory among active tasks
- simplifies the implementation of some OS features
Vector supercomputers have translation and protection but not demand paging (Crays: base & bound; Japanese machines: pages):
- don't waste expensive CPU time thrashing to disk (make jobs fit in memory)
- mostly run in batch mode (run a set of jobs that fits in memory)
- more difficult to implement restartable vector instructions
Most embedded processors and DSPs provide physical addressing only:
- can't afford the area/speed/power budget for virtual memory support
- often there is no secondary storage to swap to!
- programs are custom written for a particular memory configuration in the product
- difficult to implement restartable instructions for exposed architectures