Operating Systems & Memory Systems: Address Translation Computer Science 220 ECE 252 Professor Alvin R. Lebeck Fall 2008.


1 Operating Systems & Memory Systems: Address Translation Computer Science 220 ECE 252 Professor Alvin R. Lebeck Fall 2008

2 Outline (© Alvin R. Lebeck 2008, CPS 220)
Address Translation
– basics
– 64-bit Address Space
Managing memory
OS Performance
Throughout:
– Review Computer Architecture
– Interaction with Architectural Decisions

3 System Organization
[Figure: block diagram with labels Processor, Cache, Core Chip Set, Main Memory, I/O Bus, Disk Controller, Disk, Graphics Controller, Graphics, Network Interface, Network, interrupts]

4 Computer Architecture
Interface Between Hardware and Software
[Figure: layered diagram with labels Hardware, Software, Operating System, Compiler, Applications, CPU, Memory, I/O, Multiprocessor, Networks; annotation "This is IT"]

5 Memory Hierarchy 101
P: Very fast, <1ns clock, Multiple Instructions per cycle
$: SRAM, Fast, Small, Expensive
Memory: DRAM, Slow, Big, Cheap (called physical or main)
Magnetic: Really Slow, Really Big, Really Cheap
=> Cost Effective Memory System (Price/Performance)

6 Virtual Memory: Motivation
Process = Address Space + thread(s) of control
Address space = PA (physical address)
– programmer controls movement from disk
– protection?
– relocation?
Linear Address space
– larger than physical address space
» 32, 64 bits vs. 28-bit physical (256MB)
Automatic management
[Figure: Virtual address space mapped onto Physical]

7 Virtual Memory
Process = virtual address space + thread(s) of control
Translation
– VA -> PA
– What physical address does virtual address A map to?
– Is VA in physical memory?
Protection (access control)
– Do you have permission to access it?

8 Virtual Memory: Questions
How is data found if it is in physical memory?
Where can data be placed in physical memory?
– Fully Associative, Set Associative, Direct Mapped
What data should be replaced on a miss? (Take Compsci 110 or 210 …)

9 Segmented Virtual Memory
Virtual address (2^32, 2^64) to Physical Address mapping (2^30)
Variable size, base + offset, contiguous in both VA and PA
[Figure: Virtual segments mapped to Physical; address labels 0x1000, 0x6000, 0x9000, 0x0000, 0x1000, 0x2000, 0x11000]
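The base + offset rule on this slide can be sketched in a few lines. This is a minimal illustration, not any particular machine's mechanism: the segment table contents, the `SEGMENTS` name, and the choice to raise `MemoryError` on a limit violation are all assumptions made for the example.

```python
# Hypothetical segment table: segment number -> (base, limit in bytes).
# The values are illustrative only and are not taken from the slide's figure.
SEGMENTS = {
    0: (0x1000, 0x1000),
    1: (0x6000, 0x3000),
}


def translate_segment(seg, offset):
    """Return the physical address base + offset for a (segment, offset) pair.

    Raises MemoryError when the offset falls outside the segment, which
    models the limit check that gives segmentation its protection.
    """
    base, limit = SEGMENTS[seg]
    if offset >= limit:
        raise MemoryError(f"segment {seg}: offset {offset:#x} exceeds limit {limit:#x}")
    return base + offset
```

Because each segment is contiguous in both VA and PA, one (base, limit) pair is enough to describe the whole mapping for that segment.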

10 Intel Pentium Segmentation
[Figure: a Logical Address (Seg Selector, Offset) selects a Segment Descriptor in the Global Descriptor Table (GDT); the Segment Base Address plus the Offset locates the data in the Physical Address Space]

11 Pentium Segmentation (Continued)
Segment Descriptors
– Local and Global
– base, limit, access rights
– Can define many
Segment Registers
– contain segment descriptors (faster than load from mem)
– Only 6
Must load segment register with a valid entry before segment can be accessed
– generally managed by compiler, linker, not programmer

12 Paged Virtual Memory
Virtual address (2^32, 2^64) to Physical Address mapping (2^28)
– virtual page to physical page frame
Fixed Size units for access control & translation
[Figure: the virtual address splits into Virtual page number and Offset; Virtual pages mapped to Physical page frames; address labels 0x1000, 0x6000, 0x9000, 0x0000, 0x1000, 0x2000, 0x11000]
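A small sketch of the virtual page number / offset split may help here. It assumes 4 KB pages and a flat dictionary as the page table; both are choices made for the example (the slide does not fix a page size), and `PAGE_TABLE` and `translate` are hypothetical names.

```python
PAGE_SHIFT = 12                # assumption: 4 KB pages
PAGE_SIZE = 1 << PAGE_SHIFT

# Hypothetical page table: virtual page number -> physical page frame number.
PAGE_TABLE = {0x1: 0x6, 0x2: 0x9}


def translate(va):
    """Translate a virtual address by swapping its page number for a frame number."""
    vpn = va >> PAGE_SHIFT             # upper bits select the page
    offset = va & (PAGE_SIZE - 1)      # lower bits pass through unchanged
    if vpn not in PAGE_TABLE:
        raise LookupError(f"page fault at {va:#x}")
    return (PAGE_TABLE[vpn] << PAGE_SHIFT) | offset
```

Only the page number is translated; the offset is copied unchanged, which is what makes fixed-size pages cheap to translate.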

13 Page Table
Kernel data structure (per process)
Page Table Entry (PTE)
– VA -> PA translations (if none, page fault)
– access rights (Read, Write, Execute, User/Kernel, cached/uncached)
– reference, dirty bits
Many designs
– Linear, Forward mapped, Inverted, Hashed, Clustered
Design Issues
– support for aliasing (multiple VA to single PA)
– large virtual address space
– time to obtain translation
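One way to picture a PTE is as a frame number packed next to a handful of flag bits. The layout below is invented for illustration (the bit positions, the `PTEFlags` names, and the string results of `check_access` are all assumptions); real architectures each define their own encoding.

```python
from enum import IntFlag


class PTEFlags(IntFlag):
    """Hypothetical flag bits; the positions are illustrative only."""
    VALID = 1
    READ = 2
    WRITE = 4
    EXECUTE = 8
    USER = 16
    UNCACHED = 32
    REFERENCED = 64
    DIRTY = 128


FLAG_BITS = 8  # the low 8 bits hold flags, the rest hold the frame number


def make_pte(pfn, flags):
    """Pack a physical frame number and flags into one integer."""
    return (pfn << FLAG_BITS) | int(flags)


def pte_pfn(pte):
    return pte >> FLAG_BITS


def pte_flags(pte):
    return PTEFlags(pte & ((1 << FLAG_BITS) - 1))


def check_access(pte, want):
    """Classify an access: no valid translation, insufficient rights, or ok."""
    flags = pte_flags(pte)
    if not flags & PTEFlags.VALID:
        return "page fault"
    if (flags & want) != want:
        return "access violation"
    return "ok"
```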

14 Alpha VM Mapping (Forward Mapped)
"64-bit" address divided into 3 segments
– seg0 (bit 63 = 0) user code/heap
– seg1 (bit 63 = 1, 62 = 1) user stack
– kseg (bit 63 = 1, 62 = 0) kernel segment for OS
Three level page table, each one page
– Alpha 21064 only 43 unique bits of VA
– (future min page size up to 64KB => 55 bits of VA)
PTE bits: valid, kernel & user read & write enable (No reference, use, or dirty bit)
– What do you do for replacement?
[Figure: virtual address fields seg 0/1 (21 bits), L1 (10), L2 (10), L3 (10), PO (13); base + L1, then + L2, then + L3, yields the phys page frame number]
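The three-level walk can be sketched as follows, using the 10/10/10/13 field widths from the slide. Nested dictionaries stand in for the one-page tables, and the names `split_va` and `walk` are hypothetical; real tables are arrays of PTEs indexed by the level fields.

```python
OFFSET_BITS = 13                      # 8 KB pages: 13-bit page offset
LEVEL_BITS = 10                       # 10 index bits per level
LEVEL_MASK = (1 << LEVEL_BITS) - 1


def split_va(va):
    """Split the low 43 bits of a virtual address into (L1, L2, L3, offset)."""
    offset = va & ((1 << OFFSET_BITS) - 1)
    l3 = (va >> OFFSET_BITS) & LEVEL_MASK
    l2 = (va >> (OFFSET_BITS + LEVEL_BITS)) & LEVEL_MASK
    l1 = (va >> (OFFSET_BITS + 2 * LEVEL_BITS)) & LEVEL_MASK
    return l1, l2, l3, offset


def walk(root, va):
    """Walk a three-level table (modeled as nested dicts) and return the PA.

    Each level costs one memory reference in a real walk, which is why a
    TLB miss on a forward-mapped table is expensive.
    """
    l1, l2, l3, offset = split_va(va)
    table = root
    for index in (l1, l2):
        table = table.get(index)
        if table is None:
            raise LookupError(f"page fault at {va:#x}")
    pfn = table.get(l3)
    if pfn is None:
        raise LookupError(f"page fault at {va:#x}")
    return (pfn << OFFSET_BITS) | offset
```

The field widths add up as 13 + 3 x 10 = 43 bits, matching the 43 unique bits of VA on the slide.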

15 Inverted Page Table (HP, IBM)
One PTE per page frame
– only one VA per physical frame
Must search for virtual address
More difficult to support aliasing
Force all sharing to use the same VA
[Figure: the VA splits into Virtual page number and Offset; the Virtual page number is hashed into the Hash Anchor Table (HAT), which points into the Inverted Page Table (IPT); entries hold VA and PA,ST]
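The HAT and IPT lookup can be sketched with two small arrays. This is a simplified single-process model with made-up sizes and a modulo hash; a real inverted table would also tag entries with a process or address-space identifier.

```python
NUM_FRAMES = 8    # assumption: tiny physical memory, one IPT entry per frame
HAT_SIZE = 4      # assumption: number of hash buckets

ipt = [None] * NUM_FRAMES   # ipt[frame] = (vpn, next frame in the same bucket)
hat = [None] * HAT_SIZE     # hat[bucket] = first frame in that bucket's chain


def ipt_insert(vpn, frame):
    """Record that physical frame `frame` now holds virtual page `vpn`."""
    bucket = vpn % HAT_SIZE
    ipt[frame] = (vpn, hat[bucket])   # link in front of the existing chain
    hat[bucket] = frame


def ipt_lookup(vpn):
    """Search the bucket's chain for vpn; return the frame number or None."""
    frame = hat[vpn % HAT_SIZE]
    while frame is not None:
        tag, next_frame = ipt[frame]
        if tag == vpn:
            return frame
        frame = next_frame
    return None
```

The index of the matching IPT entry is itself the physical frame number, which is why the table grows with physical memory rather than with the virtual address space.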

16 Intel Pentium Segmentation + Paging
[Figure: a Logical Address (Seg Selector, Offset) selects a Segment Descriptor in the Global Descriptor Table (GDT); the Segment Base Address plus the Offset gives an address in the Linear Address Space; the linear address splits into Dir, Table, and Offset fields that index the Page Dir and Page Table to reach the Physical Address Space]

17 The Memory Management Unit (MMU)
Input
– virtual address
Output
– physical address
– access violation (exception, interrupts the processor)
Access Violations
– not present
– user vs. kernel
– write
– read
– execute

18 Translation Lookaside Buffers (TLB)
Need to perform address translation on every memory reference
– 30% of instructions are memory references
– 4-way superscalar processor
– at least one memory reference per cycle
Make Common Case Fast, others correct
Throw HW at the problem
Cache PTEs

19 Fast Translation: Translation Buffer
Cache of translated addresses
[Figure: the address splits into Page Number and Page offset; each translation buffer entry holds v, r, w, tag, and phys frame; a 48:1 mux selects the matching phys frame]

20 TLB Design
Must be fast, not increase critical path
Must achieve high hit ratio
Generally small, highly associative
Mapping change
– page removed from physical memory
– processor must invalidate the TLB entry
PTE is per process entity
– Multiple processes with same virtual addresses
– Context Switches?
Flush TLB
Add ASID (PID)
– part of processor state, must be set on context switch
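The ASID idea can be sketched as a small fully associative TLB whose lookup key is the pair (ASID, virtual page number). The class below is a software model only; the LRU replacement policy and the default of four entries are assumptions made for the example, not something the slide specifies.

```python
from collections import OrderedDict


class TLB:
    """Tiny fully associative TLB model keyed by (asid, vpn), LRU replacement."""

    def __init__(self, entries=4):
        self.entries = entries
        self.map = OrderedDict()      # oldest entry first

    def lookup(self, asid, vpn):
        """Return the cached frame number, or None on a TLB miss."""
        key = (asid, vpn)
        if key not in self.map:
            return None
        self.map.move_to_end(key)     # mark as most recently used
        return self.map[key]

    def insert(self, asid, vpn, pfn):
        """Install a translation, evicting the least recently used if full."""
        key = (asid, vpn)
        self.map[key] = pfn
        self.map.move_to_end(key)
        if len(self.map) > self.entries:
            self.map.popitem(last=False)

    def invalidate(self, asid, vpn):
        """Drop one entry, e.g. when its page leaves physical memory."""
        self.map.pop((asid, vpn), None)

    def flush(self):
        """Drop everything: the alternative to ASIDs on a context switch."""
        self.map.clear()
```

With the ASID in the key, two processes can use the same virtual page number without flushing on every context switch, since their entries never match each other.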

21 Hardware Managed TLBs
Hardware Handles TLB miss
Dictates page table organization
Complicated state machine to "walk page table"
– Multiple levels for forward mapped
– Linked list for inverted
Exception only if access violation
[Figure: CPU, TLB, Control, Memory]

22 Software Managed TLBs
Software Handles TLB miss
Flexible page table organization
Simple Hardware to detect Hit or Miss
Exception if TLB miss or access violation
Should you check for access violation on TLB miss?
[Figure: CPU, TLB, Control, Memory]

23 Mapping the Kernel
Digital Unix Kseg
– kseg (bit 63 = 1, 62 = 0)
Kernel has direct access to physical memory
One VA->PA mapping for entire Kernel
Lock (pin) TLB entry
– or special HW detection
[Figure: virtual address space from 0 to 2^64-1 with regions User Code/Data, Kernel, and User Stack; the Kernel region maps onto Physical Memory]

24 Considerations for Address Translation
Large virtual address space
Can map more things
– files
– frame buffers
– network interfaces
– memory from another workstation
Sparse use of address space
Page Table Design
– space
– less locality => TLB misses
OS structure
– microkernel => more TLB misses

25 Address Translation for Large Address Spaces
Forward Mapped Page Table
– grows with virtual address space
» worst case 100% overhead not likely
– TLB miss time: memory reference for each level
Inverted Page Table
– grows with physical address space
» independent of virtual address space usage
– TLB miss time: memory reference to HAT, IPT, list search
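A quick size calculation makes the two growth rules concrete. The sketch uses a fully populated flat table as the upper bound for the virtual-space case, and the page size, PTE size, and inverted-entry size are all assumed values chosen for the example.

```python
def flat_table_bytes(va_bits, page_bits, pte_bytes):
    """Upper bound: one PTE for every virtual page in the address space."""
    return (1 << (va_bits - page_bits)) * pte_bytes


def inverted_table_bytes(pa_bits, page_bits, entry_bytes):
    """One entry per physical page frame, whatever the virtual space size."""
    return (1 << (pa_bits - page_bits)) * entry_bytes


MB = 1 << 20

# Assumed parameters: 4 KB pages, 4-byte PTEs, 16-byte inverted entries,
# and the 28-bit (256MB) physical memory used earlier in the deck.
flat_32 = flat_table_bytes(32, 12, 4)          # 32-bit virtual space
inverted = inverted_table_bytes(28, 12, 16)    # 256MB of physical memory
```

Under these assumptions the flat table for a 32-bit space is 4 MB per process and the inverted table is 1 MB for the whole machine; widening the virtual address changes only the first number.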

26 Hashed Page Table (HP)
Combine Hash Table and IPT [Huck96]
– can have more entries than physical page frames
Must search for virtual address
Easier to support aliasing than IPT
Space
– grows with physical space
TLB miss
– one less memory ref than IPT
[Figure: the VA splits into Virtual page number and Offset; the Virtual page number is hashed directly into the Hashed Page Table (HPT); entries hold VA and PA,ST]

27 Clustered Page Table (SUN)
Combine benefits of HPT and Linear [Talluri95]
Store one base VPN (TAG) and several PPN values
– virtual page block number (VPBN)
– block offset
[Figure: the VA splits into VPBN, Boff, and Offset; the VPBN is hashed to a chain of entries, each holding VPBN, next, and PA0 attrib through PA3 attrib]

28 Reducing TLB Miss Handling Time
Problem
– must walk Page Table on TLB miss
– usually incur cache misses
Solution
– build a small second-level cache in SW
– on TLB miss, first check SW cache
» use simple shift and mask index to hash table
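The shift and mask indexing can be sketched as a direct-mapped software cache that sits in front of the page table walk. The page size, the table size, and the `walk_page_table` callback are assumptions made for the example; the callback stands in for whatever slow walk the OS would otherwise perform.

```python
PAGE_SHIFT = 13          # assumption: 8 KB pages
CACHE_ENTRIES = 256      # assumption: power of two, so a mask replaces modulo

sw_cache = [None] * CACHE_ENTRIES    # each slot holds (vpn, pfn) or None


def slot_for(va):
    """Shift off the page offset, then mask down to a table index."""
    return (va >> PAGE_SHIFT) & (CACHE_ENTRIES - 1)


def tlb_miss_handler(va, walk_page_table):
    """Return the frame for va, consulting the software cache first."""
    vpn = va >> PAGE_SHIFT
    slot = slot_for(va)
    entry = sw_cache[slot]
    if entry is not None and entry[0] == vpn:
        return entry[1]               # fast path: no page table walk
    pfn = walk_page_table(vpn)        # slow path: full walk
    sw_cache[slot] = (vpn, pfn)       # direct-mapped: overwrite the slot
    return pfn
```

A direct-mapped table keeps the fast path to a shift, a mask, one load, and one compare, which matters because this code runs on every TLB miss.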

29 Cache Indexing
Tag on each block
– No need to check index or block offset
Increasing associativity shrinks index, expands tag
Fully Associative: No index
Direct-Mapped: Large index
[Figure: the address splits into Block Address (TAG, Index) and Block offset]
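The trade between index and tag bits is easy to see by computing the fields for one address under different associativities. The function below is a sketch that assumes power-of-two cache, block, and set counts; the sizes in the usage note are chosen for the example.

```python
def cache_fields(addr, cache_bytes, block_bytes, ways):
    """Split an address into (tag, index, block offset).

    Assumes cache_bytes, block_bytes, and the resulting set count are all
    powers of two.
    """
    sets = cache_bytes // (block_bytes * ways)
    offset_bits = block_bytes.bit_length() - 1
    index_bits = sets.bit_length() - 1
    offset = addr & (block_bytes - 1)
    index = (addr >> offset_bits) & (sets - 1)
    tag = addr >> (offset_bits + index_bits)
    return tag, index, offset
```

For an 8 KB cache with 32-byte blocks, `cache_fields(0x12345, 8192, 32, 1)` uses an 8-bit index, while `cache_fields(0x12345, 8192, 32, 256)` (fully associative) has a single set, so the index is always 0 and every bit above the block offset goes to the tag.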

30 Address Translation and Caches
Where is the TLB with respect to the cache?
What are the consequences?
Most of today's systems have more than 1 cache
– Digital 21164 has 3 levels
– 2 levels on chip (8KB-data, 8KB-inst, 96KB-unified)
– one level off chip (2-4MB)
Does the OS need to worry about this?
Definition: page coloring = careful selection of va->pa mapping

31 TLBs and Caches
Conventional Organization: CPU -> TLB -> $ -> MEM (VA translated to PA before the cache access)
Virtually Addressed Cache: CPU -> $ (VA Tags) -> TLB -> MEM
– Translate only on miss
– Alias (Synonym) Problem
Overlap $ access with VA translation: CPU -> $ (PA Tags) in parallel with TLB -> L2 $ -> MEM
– requires $ index to remain invariant across translation

32 Virtual Caches
Send virtual address to cache. Called Virtually Addressed Cache or just Virtual Cache vs. Physical Cache or Real Cache
Avoid address translation before accessing cache
– faster hit time to cache
Context Switches?
– Just like the TLB (flush or pid)
– Cost is time to flush + "compulsory" misses from empty cache
– Add process identifier tag that identifies process as well as address within process: can't get a hit if wrong process
I/O must interact with cache

33 I/O and Virtual Caches
I/O is accomplished with physical addresses
DMA
– flush pages from cache
– need pa->va reverse translation
– coherent DMA
[Figure: Processor with Virtual Cache on the Memory Bus with Main Memory; an I/O Bridge connects the I/O Bus, which carries Physical Addresses to the Disk Controller (Disk), Graphics Controller (Graphics), and Network Interface (Network); interrupts]

34 Aliases and Virtual Caches
Aliases (sometimes called synonyms): two different virtual addresses map to the same physical address
But the virtual address is used to index the cache
Could have data in two different locations in the cache
[Figure: virtual address space from 0 to 2^64-1 with regions User Code/Data, Kernel, and User Stack; two Kernel virtual regions map onto the same Physical Memory]

35 Index with Physical Portion of Address
If the index is in the physical (untranslated) part of the address, the tag access can start in parallel with translation, and the result is then compared to the physical tag
Limits cache to page size: what if we want bigger caches and use the same trick?
– Higher associativity
– Page coloring
[Figure: the address viewed two ways: Page Address and Page Offset; Address Tag, Index, and Block Offset]
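The size limit follows from simple arithmetic: the index and block offset bits together must fit inside the page offset, so each way of the cache can be at most one page. The helper names below are hypothetical, and the sizes in the usage note are chosen for the example.

```python
def index_fits_in_page_offset(cache_bytes, ways, page_bytes):
    """True when index + block offset bits lie entirely in the page offset.

    Each way holds cache_bytes / ways bytes, and that span is exactly what
    the index and block offset bits address.
    """
    return cache_bytes // ways <= page_bytes


def min_ways(cache_bytes, page_bytes):
    """Smallest associativity that keeps the index within the page offset."""
    return max(1, -(-cache_bytes // page_bytes))   # ceiling division
```

With 8 KB pages, an 8 KB direct-mapped cache passes the check, a 32 KB 2-way cache does not, and `min_ways(32 * 1024, 8 * 1024)` reports that 4 ways would be needed. This is the "higher associativity" option on the slide; page coloring is the other way out.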

36 Page Coloring for Aliases
HW that guarantees that every cache frame holds unique physical address
OS guarantee: lower n bits of virtual & physical page numbers must have same value; if direct-mapped, then aliases map to same cache frame
– one form of page coloring
[Figure: the address viewed two ways: Page Address and Page Offset; Address Tag, Index, and Block Offset]
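The OS guarantee can be sketched as a frame allocator that only hands out a physical frame whose low page-number bits match those of the virtual page. The function names and the first-fit search over a free list are assumptions made for the example; a real allocator would keep per-color free lists.

```python
def num_colors(cache_bytes, ways, page_bytes):
    """Number of page-sized bins in one way of the cache (at least 1)."""
    return max(1, cache_bytes // (ways * page_bytes))


def pick_frame(vpn, free_frames, colors):
    """Return the first free frame with the same color as vpn, else None.

    Matching colors means the low bits of the virtual and physical page
    numbers agree, so virtual and physical indexing select the same
    cache frames.
    """
    wanted = vpn % colors
    for pfn in free_frames:
        if pfn % colors == wanted:
            return pfn
    return None
```

For a 64 KB direct-mapped cache and 8 KB pages there are 8 colors, so n = 3 low bits of the page numbers must match.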

37 Page Coloring to reduce misses
Notion of bin
– region of cache that may contain cache blocks from a page
Random vs careful mapping
Selection of physical page frame dictates cache index
Overall goal is to minimize cache misses
[Figure: Page frames mapped onto Cache bins]

38 A Case for Large Pages
Page table size is inversely proportional to the page size
– memory saved
Fast cache hit time easy when cache <= page size (VA caches)
– bigger page makes it feasible as cache size grows
Transferring larger pages to or from secondary storage, possibly over a network, is more efficient
Number of TLB entries is restricted by clock cycle time
– larger page size maps more memory
– reduces TLB misses
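The TLB argument comes down to reach: the amount of memory a full TLB can map at once. The sketch below assumes a 48-entry TLB and compares two page sizes; the entry count and both page sizes are example values, and the function names are hypothetical.

```python
KB, MB = 1 << 10, 1 << 20


def tlb_reach(entries, page_bytes):
    """Bytes of memory covered when every TLB entry maps a distinct page."""
    return entries * page_bytes


def entries_needed(region_bytes, page_bytes):
    """TLB entries required to map a region completely (ceiling division)."""
    return -(-region_bytes // page_bytes)
```

With 48 entries, `tlb_reach(48, 8 * KB)` is 384 KB while `tlb_reach(48, 4 * MB)` is 192 MB. Seen the other way, mapping a 16 MB frame buffer takes 2048 entries with 8 KB pages but only 4 with 4 MB pages.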

39 A Case for Small Pages
Fragmentation
– large pages can waste storage
– data must be contiguous within page
Quicker process start for small processes(??)

40 Superpages
Hybrid solution: multiple page sizes
– 8KB, 16KB, 32KB, 64KB pages
– 4KB, 64KB, 256KB, 1MB, 4MB, 16MB pages
Need to identify candidate superpages
– Kernel
– Frame buffers
– Database buffer pools
Application/compiler hints
Detecting superpages
– static, at page fault time
– dynamically create superpages
Page Table & TLB modifications

