Operating Systems & Memory Systems: Address Translation Computer Science 220 ECE 252 Professor Alvin R. Lebeck Fall 2008.

Slides:



Advertisements
Similar presentations
Virtual Memory In this lecture, slides from lecture 16 from the course Computer Architecture ECE 201 by Professor Mike Schulte are used with permission.
Advertisements

Virtual Storage SystemCS510 Computer ArchitecturesLecture Lecture 14 Virtual Storage System.
Lecture 12 Reduce Miss Penalty and Hit Time
EECS 470 Virtual Memory Lecture 15. Why Use Virtual Memory? Decouples size of physical memory from programmer visible virtual memory Provides a convenient.
OS Fall’02 Virtual Memory Operating Systems Fall 2002.
4/14/2017 Discussed Earlier segmentation - the process address space is divided into logical pieces called segments. The following are the example of types.
CMSC 611: Advanced Computer Architecture Memory & Virtual Memory Some material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some material.
CSIE30300 Computer Architecture Unit 10: Virtual Memory Hsin-Chou Chi [Adapted from material by and
Virtual Memory Hardware Support
CSE 490/590, Spring 2011 CSE 490/590 Computer Architecture Virtual Memory I Steve Ko Computer Sciences and Engineering University at Buffalo.
Kevin Walsh CS 3410, Spring 2010 Computer Science Cornell University Virtual Memory 2 P & H Chapter
CS 153 Design of Operating Systems Spring 2015
Spring 2003CSE P5481 Introduction Why memory subsystem design is important CPU speeds increase 55% per year DRAM speeds increase 3% per year rate of increase.
Virtual Memory & Address Translation Vivek Pai Princeton University.
Memory Management (II)
S.1 Review: The Memory Hierarchy Increasing distance from the processor in access time L1$ L2$ Main Memory Secondary Memory Processor (Relative) size of.
Memory Management. 2 How to create a process? On Unix systems, executable read by loader Compiler: generates one object file per source file Linker: combines.
ECE 232 L27.Virtual.1 Adapted from Patterson 97 ©UCBCopyright 1998 Morgan Kaufmann Publishers ECE 232 Hardware Organization and Design Lecture 27 Virtual.
Chapter 3.2 : Virtual Memory
©UCB CS 162 Ch 7: Virtual Memory LECTURE 13 Instructor: L.N. Bhuyan
1 COMP 206: Computer Architecture and Implementation Montek Singh Wed, Nov 9, 2005 Topic: Caches (contd.)
Memory: Virtual MemoryCSCE430/830 Memory Hierarchy: Virtual Memory CSCE430/830 Computer Architecture Lecturer: Prof. Hong Jiang Courtesy of Yifeng Zhu.
ENGS 116 Lecture 131 Caches and Virtual Memory Vincent H. Berk October 31 st, 2008 Reading for Today: Sections C.1 – C.3 (Jouppi article) Reading for Monday:
Mem. Hier. CSE 471 Aut 011 Evolution in Memory Management Techniques In early days, single program run on the whole machine –Used all the memory available.
CSE431 L22 TLBs.1Irwin, PSU, 2005 CSE 431 Computer Architecture Fall 2005 Lecture 22. Virtual Memory Hardware Support Mary Jane Irwin (
Lecture 19: Virtual Memory
1 Chapter 3.2 : Virtual Memory What is virtual memory? What is virtual memory? Virtual memory management schemes Virtual memory management schemes Paging.
Operating Systems ECE344 Ding Yuan Paging Lecture 8: Paging.
IT253: Computer Organization
By Teacher Asma Aleisa Year 1433 H.   Goals of memory management  To provide a convenient abstraction for programming  To allocate scarce memory resources.
Chapter 8 – Main Memory (Pgs ). Overview  Everything to do with memory is complicated by the fact that more than 1 program can be in memory.
Lecture 9: Memory Hierarchy Virtual Memory Kai Bu
Operating Systems & Memory Systems: Address Translation Computer Science 220 ECE 252 Professor Alvin R. Lebeck Fall 2006.
1 COMP 206: Computer Architecture and Implementation Montek Singh Mon., Nov. 4, 2002 Topic: 1. Caches (contd.); 2. Virtual Memory.
Virtual Memory. Virtual Memory: Topics Why virtual memory? Virtual to physical address translation Page Table Translation Lookaside Buffer (TLB)
Multilevel Caches Microprocessors are getting faster and including a small high speed cache on the same chip.
Operating Systems & Memory Systems: Address Translation CPS 220 Professor Alvin R. Lebeck Fall 2001.
Constructive Computer Architecture Virtual Memory: From Address Translation to Demand Paging Arvind Computer Science & Artificial Intelligence Lab. Massachusetts.
LECTURE 12 Virtual Memory. VIRTUAL MEMORY Just as a cache can provide fast, easy access to recently-used code and data, main memory acts as a “cache”
1 Adapted from UC Berkeley CS252 S01 Lecture 17: Reducing Cache Miss Penalty and Reducing Cache Hit Time Hardware prefetching and stream buffer, software.
Memory Management. 2 How to create a process? On Unix systems, executable read by loader Compiler: generates one object file per source file Linker: combines.
CS203 – Advanced Computer Architecture Virtual Memory.
CS161 – Design and Architecture of Computer
Translation Lookaside Buffer
CMSC 611: Advanced Computer Architecture
ECE232: Hardware Organization and Design
Memory COMPUTER ARCHITECTURE
CS161 – Design and Architecture of Computer
From Address Translation to Demand Paging
From Address Translation to Demand Paging
Chapter 8: Main Memory Source & Copyright: Operating System Concepts, Silberschatz, Galvin and Gagne.
Memory Hierarchy Virtual Memory, Address Translation
CSE 153 Design of Operating Systems Winter 2018
Chapter 8: Main Memory.
Virtual Memory 3 Hakim Weatherspoon CS 3410, Spring 2011
Lecture 14 Virtual Memory and the Alpha Memory Hierarchy
Evolution in Memory Management Techniques
CMSC 611: Advanced Computer Architecture
Morgan Kaufmann Publishers Memory Hierarchy: Virtual Memory
Virtual Memory Overcoming main memory size limitation
CSE 451: Operating Systems Autumn 2003 Lecture 10 Paging & TLBs
CSE451 Virtual Memory Paging Autumn 2002
CSC3050 – Computer Architecture
CSE 451: Operating Systems Autumn 2003 Lecture 10 Paging & TLBs
CSE 471 Autumn 1998 Virtual memory
Paging and Segmentation
CS703 - Advanced Operating Systems
Main Memory Background
CSE 153 Design of Operating Systems Winter 2019
Virtual Memory Lecture notes from MKP and S. Yalamanchili.
Presentation transcript:

Operating Systems & Memory Systems: Address Translation Computer Science 220 ECE 252 Professor Alvin R. Lebeck Fall 2008

2 © Alvin R. Lebeck 2008 Outline Address Translation –basics –64-bit Address Space Managing memory OS Performance Throughout Review Computer Architecture Interaction with Architectural Decisions CPS 220

3 © Alvin R. Lebeck 2008 CPS 220 I/O Bus Core Chip Set Processor Cache Main Memory Disk Controller Disk Graphics Controller Network Interface Graphics Network interrupts System Organization

4 © Alvin R. Lebeck 2008 CPS 220 Computer Architecture Interface Between Hardware and Software Hardware Software Operating System Compiler Applications CPUMemoryI/O MultiprocessorNetworks This is IT

5 © Alvin R. Lebeck 2008 CPS 220 Memory Hierarchy 101 P $ Memory Very fast <1ns clock Multiple Instructions per cycle SRAM, Fast, Small Expensive DRAM, Slow, Big,Cheap (called physical or main) => Cost Effective Memory System (Price/Performance) Magnetic, Really Slow, Really Big, Really Cheap

6 © Alvin R. Lebeck 2008 CPS 220 Virtual Memory: Motivation Process = Address Space + thread(s) of control Address space = PA –programmer controls movement from disk –protection? –relocation? Linear Address space –larger than physical address space »32, 64 bits v.s. 28-bit physical (256MB) Automatic management Virtual Physical

7 © Alvin R. Lebeck 2008 CPS 220 Virtual Memory Process = virtual address space + thread(s) of control Translation –VA -> PA –What physical address does virtual address A map to –Is VA in physical memory? Protection (access control) –Do you have permission to access it?

8 © Alvin R. Lebeck 2008 CPS 220 Virtual Memory: Questions How is data found if it is in physical memory? Where can data be placed in physical memory? Fully Associative, Set Associative, Direct Mapped What data should be replaced on a miss? (Take Compsci 110 or 210 …)

9 © Alvin R. Lebeck 2008 CPS 220 Segmented Virtual Memory Virtual address (2 32, 2 64 ) to Physical Address mapping (2 30 ) Variable size, base + offset, contiguous in both VA and PA Virtual Physical 0x1000 0x6000 0x9000 0x0000 0x1000 0x2000 0x11000

10 © Alvin R. Lebeck 2008 CPS 220 Intel Pentium Segmentation Seg Selector Offset Logical Address Segment Descriptor Global Descriptor Table (GDT) Segment Base Address Physical Address Space

11 © Alvin R. Lebeck 2008 CPS 220 Pentium Segmention (Continued) Segment Descriptors –Local and Global –base, limit, access rights –Can define many Segment Registers –contain segment descriptors (faster than load from mem) –Only 6 Must load segment register with a valid entry before segment can be accessed – generally managed by compiler, linker, not programmer

12 © Alvin R. Lebeck 2008 CPS 220 Paged Virtual Memory Virtual address (2 32, 2 64 ) to Physical Address mapping (2 28 ) –virtual page to physical page frame Fixed Size units for access control & translation Virtual Physical 0x1000 0x6000 0x9000 0x0000 0x1000 0x2000 0x11000 Virtual page number Offset

13 © Alvin R. Lebeck 2008 CPS 220 Page Table Kernel data structure (per process) Page Table Entry (PTE) –VA -> PA translations (if none page fault) –access rights (Read, Write, Execute, User/Kernel, cached/uncached) –reference, dirty bits Many designs –Linear, Forward mapped, Inverted, Hashed, Clustered Design Issues –support for aliasing (multiple VA to single PA) –large virtual address space –time to obtain translation

14 © Alvin R. Lebeck 2008 CPS 220 Alpha VM Mapping (Forward Mapped) “64-bit” address divided into 3 segments –seg0 (bit 63=0) user code/heap –seg1 (bit 63 = 1, 62 = 1) user stack –kseg (bit 63 = 1, 62 = 0) kernel segment for OS Three level page table, each one page –Alpha only 43 unique bits of VA –(future min page size up to 64KB => 55 bits of VA) PTE bits; valid, kernel & user read & write enable (No reference, use, or dirty bit) –What do you do for replacement? POL3L2L1 base phys page frame number seg 0/1

15 © Alvin R. Lebeck 2008 CPS 220 Inverted Page Table (HP, IBM) One PTE per page frame –only one VA per physical frame Must search for virtual address More difficult to support aliasing Force all sharing to use the same VA Virtual page numberOffset VA PA,ST Hash Anchor Table (HAT) Inverted Page Table (IPT) Hash

16 © Alvin R. Lebeck 2008 CPS 220 Intel Pentium Segmentation + Paging Seg Selector Offset Logical Address Segment Descriptor Global Descriptor Table (GDT) Segment Base Address Linear Address Space Page Dir Physical Address Space DirOffsetTable Page Table

17 © Alvin R. Lebeck 2008 CPS 220 The Memory Management Unit (MMU) Input –virtual address Output –physical address –access violation (exception, interrupts the processor) Access Violations –not present –user v.s. kernel –write –read –execute

18 © Alvin R. Lebeck 2008 CPS 220 Translation Lookaside Buffers (TLB) Need to perform address translation on every memory reference –30% of instructions are memory references –4-way superscalar processor –at least one memory reference per cycle Make Common Case Fast, others correct Throw HW at the problem Cache PTEs

19 © Alvin R. Lebeck 2008 CPS 220 Fast Translation: Translation Buffer Cache of translated addresses Page Number Page offset... vrwtag phys frame... 48:1 mux

20 © Alvin R. Lebeck 2008 CPS 220 TLB Design Must be fast, not increase critical path Must achieve high hit ratio Generally small highly associative Mapping change –page removed from physical memory –processor must invalidate the TLB entry PTE is per process entity –Multiple processes with same virtual addresses –Context Switches? Flush TLB Add ASID (PID) –part of processor state, must be set on context switch

21 © Alvin R. Lebeck 2008 CPS 220 Hardware Managed TLBs Hardware Handles TLB miss Dictates page table organization Compilicated state machine to “walk page table” –Multiple levels for forward mapped –Linked list for inverted Exception only if access violation Control Memory TLB CPU

22 © Alvin R. Lebeck 2008 CPS 220 Software Managed TLBs Software Handles TLB miss Flexible page table organization Simple Hardware to detect Hit or Miss Exception if TLB miss or access violation Should you check for access violation on TLB miss? Control Memory TLB CPU

23 © Alvin R. Lebeck 2008 CPS 220 Kernel Mapping the Kernel Digital Unix Kseg –kseg (bit 63 = 1, 62 = 0) Kernel has direct access to physical memory One VA->PA mapping for entire Kernel Lock (pin) TLB entry –or special HW detection User Stack Kernel User Code/ Data Physical Memory

24 © Alvin R. Lebeck 2008 CPS 220 Considerations for Address Translation Large virtual address space Can map more things –files –frame buffers –network interfaces –memory from another workstation Sparse use of address space Page Table Design –space –less locality => TLB misses OS structure microkernel => more TLB misses

25 © Alvin R. Lebeck 2008 CPS 220 Address Translation for Large Address Spaces Forward Mapped Page Table –grows with virtual address space »worst case 100% overhead not likely –TLB miss time: memory reference for each level Inverted Page Table –grows with physical address space »independent of virtual address space usage –TLB miss time: memory reference to HAT, IPT, list search

26 © Alvin R. Lebeck 2008 CPS 220 Hashed Page Table (HP) Combine Hash Table and IPT [Huck96] –can have more entries than physical page frames Must search for virtual address Easier to support aliasing than IPT Space –grows with physical space TLB miss –one less memory ref than IPT Virtual page numberOffset VA PA,ST Hashed Page Table (HPT) Hash

27 © Alvin R. Lebeck 2008 CPS 220 Clustered Page Table (SUN) Combine benefits of HPT and Linear [Talluri95] Store one base VPN (TAG) and several PPN values –virtual page block number (VPBN) –block offset VPBNOffset VPBN next PA0 attrib Hash Boff VPBN next PA0 attrib... PA1 attrib PA2 attrib PA3 attrib VPBN next PA0 attrib VPBN next PA0 attrib

28 © Alvin R. Lebeck 2008 CPS 220 Reducing TLB Miss Handling Time Problem –must walk Page Table on TLB miss –usually incur cache misses Solution –build a small second-level cache in SW –on TLB miss, first check SW cache »use simple shift and mask index to hash table

29 © Alvin R. Lebeck 2008 CPS 220 Cache Indexing Tag on each block –No need to check index or block offset Increasing associativity shrinks index, expands tag Fully Associative: No index Direct-Mapped: Large index Block offset Block Address TAGIndex

30 © Alvin R. Lebeck 2008 CPS 220 Address Translation and Caches Where is the TLB wrt the cache? What are the consequences? Most of today’s systems have more than 1 cache –Digital has 3 levels –2 levels on chip (8KB-data,8KB-inst,96KB-unified) –one level off chip (2-4MB) Does the OS need to worry about this? Definition: page coloring = careful selection of va->pa mapping

31 © Alvin R. Lebeck 2008 CPS 220 TLBs and Caches CPU TLB $ MEM VA PA Conventional Organization CPU $ TLB MEM VA PA Virtually Addressed Cache Translate only on miss Alias (Synonym) Problem CPU $TLB MEM VA PA Tags PA Overlap $ access with VA translation: requires $ index to remain invariant across translation VA Tags L2 $

32 © Alvin R. Lebeck 2008 CPS 220 Virtual Caches Send virtual address to cache. Called Virtually Addressed Cache or just Virtual Cache vs. Physical Cache or Real Cache Avoid address translation before accessing cache –faster hit time to cache Context Switches? –Just like the TLB (flush or pid) –Cost is time to flush + “compulsory” misses from empty cache –Add process identifier tag that identifies process as well as address within process: can’t get a hit if wrong process I/O must interact with cache

33 © Alvin R. Lebeck 2008 CPS 220 I/O Bus Memory Bus Processor Cache Main Memory Disk Controller Disk Graphics Controller Network Interface Graphics Network interrupts I/O and Virtual Caches I/O Bridge Virtual Cache Physical Addresses I/O is accomplished with physical addresses DMA flush pages from cache need pa->va reverse translation coherent DMA

34 © Alvin R. Lebeck 2008 CPS 220 Aliases and Virtual Caches aliases (sometimes called synonyms); Two different virtual addresses map to same physical address But, but... the virtual address is used to index the cache Could have data in two different locations in the cache Kernel User Stack Kernel User Code/ Data Physical Memory

35 © Alvin R. Lebeck 2008 CPS 220 If index is physical part of address, can start tag access in parallel with translation so that can compare to physical tag Limits cache to page size: what if want bigger caches and use same trick? –Higher associativity –Page coloring Index with Physical Portion of Address Page Address Page Offset Address Tag Index Block Offset

36 © Alvin R. Lebeck 2008 CPS 220 Page Coloring for Aliases HW that guarantees that every cache frame holds unique physical address OS guarantee: lower n bits of virtual & physical page numbers must have same value; if direct-mapped, then aliases map to same cache frame –one form of page coloring Page Address Page Offset Address Tag Index Block Offset

37 © Alvin R. Lebeck 2008 CPS 220 Page Coloring to reduce misses Notion of bin –region of cache that may contain cache blocks from a page Random vs careful mapping Selection of physical page frame dictates cache index Overall goal is to minimize cache misses CachePage frames

38 © Alvin R. Lebeck 2008 CPS 220 A Case for Large Pages Page table size is inversely proportional to the page size –memory saved Fast cache hit time easy when cache <= page size (VA caches); –bigger page makes it feasible as cache size grows Transferring larger pages to or from secondary storage, possibly over a network, is more efficient Number of TLB entries are restricted by clock cycle time, –larger page size maps more memory –reduces TLB misses

39 © Alvin R. Lebeck 2008 CPS 220 A Case for Small Pages Fragmentation –large pages can waste storage –data must be contiguous within page Quicker process start for small processes(??)

40 © Alvin R. Lebeck 2008 CPS 220 Superpages Hybrid solution: multiple page sizes –8KB, 16KB, 32KB, 64KB pages –4KB, 64KB, 256KB, 1MB, 4MB, 16MB pages Need to identify candidate superpages –Kernel –Frame buffers –Database buffer pools Application/compiler hints Detecting superpages –static, at page fault time –dynamically create superpages Page Table & TLB modifications