1 Address Translation
Tore Larsen
Material developed by: Kai Li, Princeton University
2 Topics
- Virtual memory
- Address translation
- Virtualization
- Protection
- Base and bound
- Segmentation
- Paging
- Translation lookaside buffer (TLB)
3 Issues
- Many processes running concurrently
- Location transparency
- Address space may exceed memory size
  - Many small processes whose total size may exceed memory
  - Even one large process may exceed physical memory size
- Address space may be sparsely used
- Protection
  - OS protected from user processes
  - User processes protected from each other
4 The Big Picture
- Memory is fast but expensive
- Disks are cheap but slow
- Goals
  - Run programs as efficiently as possible
  - Make the system as safe as possible
5 Strategies
- Size: Can we use slow disks to "extend" the size of available memory? Disk accesses must be rare compared to memory accesses, so that each disk access is amortized over many memory accesses.
- Location: Can we devise a mechanism that delays the binding of program addresses to memory locations? Provides transparency and flexibility.
- Sparsity: Can we avoid reserving memory for unused regions of the address space?
- Process protection: Access rights must be checked on every memory access.
6 Protection Issue
- Errors in one process should not affect other processes
- For each process, need to enforce that every load or store is to a "legal" region of memory
7 Expansion and Location Transparency
- Issue: each process should be able to run regardless of its location in memory
  - Regardless of memory size?
  - Dynamically relocatable?
- Memory fragmentation
  - External fragmentation: among processes
  - Internal fragmentation: within a process
- Approach
  - Give each process a large "fake" address space
  - Relocate each memory access to the actual memory address
8 Why Virtual Memory?
- Use secondary storage: extend expensive DRAM with reasonable performance
- Provide protection: programs do not step over each other; communicating with each other requires explicit IPC operations
- Convenience: flat address space, and programs have the same view of the world
- Flexibility: processes may be located anywhere in memory, may be moved while executing, and may reside partially in memory and partially on disk
9 Design Issues
- How is memory partitioned?
- How are processes (re)located?
- How is protection enforced?
10 Address Mapping Granularity
- Mapping mechanism: virtual addresses are mapped to DRAM addresses or onto disk
- Mapping granularity?
  - Finer granularity increases flexibility and decreases internal fragmentation, but requires more mapping information and handling
- Extremes
  - Any byte to any byte: huge map size
  - Whole segments: large segments cause problems
11 Locality of Reference
- Behaviors exhibited by most programs
- Locality in time: when an item is addressed, it is likely to be addressed again shortly
- Locality in space: when an item is addressed, its neighboring items are likely to be addressed shortly
- Basis of caching: argues that a recently accessed item should be cached together with an encompassing region, a block (or line)
- 20/80 rule: 20% of memory gets 80% of references; keep that 20% in memory
12 Translation Overview
- Actual translation is done in hardware (MMU), controlled by privileged software
- CPU view: what the program sees, virtual memory
- Memory & I/O view: physical memory
- [Figure: the CPU issues a virtual address; the MMU translates it to a physical address used to access physical memory and I/O devices]
13 Goals of Translation
- Implicit translation for each memory reference
- A hit should be very fast
- Trigger an exception on a miss
- Protected from user's faults
- [Figure: memory hierarchy — registers, cache(s) (2-20x slower), DRAM, and disk via paging (20M-30M x slower)]
14 Base and Bound
- Built in Cray-1
- Translation: if the virtual address exceeds the bound, raise an error; otherwise physical address = virtual address + base
- Protection: a program can only access physical memory in [base, base+bound]
- On a context switch: save/restore the base and bound registers
- Pros: simple, flat
- Cons: fragmentation, difficult to share, difficult to use disks
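The check-then-relocate step above can be sketched in a few lines of Python; the base and bound values here are made up, and the off-by-one convention (whether `bound` itself is legal) is an assumption:

```python
# Minimal sketch of base-and-bound translation (hypothetical register values).
BASE = 0x4000    # start of the process's region in physical memory
BOUND = 0x1000   # size of the region

def translate(vaddr):
    """Every access is bound-checked, then relocated by the base register."""
    if vaddr >= BOUND:              # outside the process's address space
        raise MemoryError("address exceeds bound")
    return BASE + vaddr

print(hex(translate(0x0042)))       # -> 0x4042
```

A context switch only needs to save and restore `BASE` and `BOUND`, which is what makes the scheme so simple.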
15 Segmentation
- Provides separate virtual address spaces (segments)
- Each process has a table of (seg base, size) entries
- Translation: a virtual address is (segment #, offset); if the offset exceeds the segment size, raise an error; otherwise physical address = segment base + offset
- Protection: each entry has access bits (nil, read, write)
- On a context switch: save/restore the table, or a pointer to the table in kernel memory
- Pros: efficient, easy to share
- Cons: complex management, fragmentation within a segment
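A sketch of the per-process segment table lookup, including the access-bit check; the table contents and the string encoding of the access bits are invented for illustration:

```python
# Hypothetical per-process segment table.
# Each entry: (base, size, access) where access is "nil", "read", or "write".
segment_table = [
    (0x8000, 0x0400, "read"),    # segment 0: e.g. code
    (0xC000, 0x0800, "write"),   # segment 1: e.g. data
]

def translate(seg, offset, mode="read"):
    """Bound-check the offset, check access rights, then relocate."""
    base, size, access = segment_table[seg]
    if offset >= size:
        raise MemoryError("offset exceeds segment size")
    if access == "nil" or (mode == "write" and access != "write"):
        raise PermissionError("access violation")
    return base + offset
```

Sharing a segment between processes is just a matter of pointing two tables at the same (base, size) pair.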
16 Paging
- Use a fixed-size unit called a page; pages are not visible to the program
- Use a page table to translate: the virtual address splits into (virtual page #, offset); the table maps virtual page # to physical page #; physical address = (physical page #, offset)
- Various bits in each entry
- Context switch: similar to the segmentation scheme
- What should the page size be?
- Pros: simple allocation, easy to share
- Cons: big page tables; how to deal with holes?
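The split-and-lookup above can be sketched directly; the page table contents are made up, and a missing entry stands in for an invalid PTE:

```python
PAGE_SIZE = 4096   # 4KB pages, so the low 12 bits are the offset

# Toy page table: virtual page # -> physical page # (missing = invalid).
page_table = {0: 5, 1: 2, 7: 9}

def translate(vaddr):
    """Split into (virtual page #, offset), map the page #, keep the offset."""
    vpn, offset = divmod(vaddr, PAGE_SIZE)
    ppn = page_table.get(vpn)
    if ppn is None:
        raise MemoryError("page fault")    # invalid entry
    return ppn * PAGE_SIZE + offset
```

Note that the offset passes through unchanged: translation only rewrites the page-number bits.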
17 How Many PTEs Do We Need?
- Assume 4KB page size: 12-bit (low-order) displacement within the page, 20-bit (high-order) page #
- Worst case for a 32-bit address machine: 2^20 PTEs per page table (~4 MBytes). What about 10K processes?
- What about a 64-bit address machine? 2^52 pages per process; the page table won't even fit on disk (2^52 PTEs = 16 PBytes)
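The slide's numbers can be checked with a few lines of arithmetic, assuming 4-byte PTEs (the slide's ~4 MB and 16 PB figures are consistent with that assumption):

```python
# Reproduce the slide's page-table sizes, assuming 4 bytes per PTE.
PAGE = 4 * 1024
OFFSET_BITS = PAGE.bit_length() - 1           # 12 bits of offset

vpn_bits_32 = 32 - OFFSET_BITS                # 20-bit page number
ptes_32 = 2 ** vpn_bits_32                    # 2^20 PTEs per table
print(ptes_32 * 4 // 2**20, "MB per 32-bit page table")    # 4 MB

vpn_bits_64 = 64 - OFFSET_BITS                # 52-bit page number
ptes_64 = 2 ** vpn_bits_64                    # 2^52 PTEs
print(ptes_64 * 4 // 2**50, "PB for a flat 64-bit table")  # 16 PB
```

With 10K processes, the 32-bit case alone would cost ~40 GB of page tables, which is why flat tables don't scale.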
18 Segmentation with Paging
- Virtual address = (virtual segment #, virtual page #, offset)
- The segment table entry (seg base, size) selects a page table; if the virtual page # exceeds the segment size, raise an error; otherwise the page table maps virtual page # to physical page #, and physical address = (physical page #, offset)
- Multics was the first system to combine segmentation and paging
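The two-level lookup composes the previous two schemes; a sketch, with an invented segment table whose entries pair a page table with a size in pages:

```python
PAGE = 4096

# Hypothetical: seg # -> (page table, segment size in pages).
seg_table = {0: ({0: 3, 1: 8}, 2)}

def translate(seg, vpn, offset):
    """Segment lookup bounds the page #, then a per-segment page table maps it."""
    page_table, size = seg_table[seg]
    if vpn >= size:                    # bound check against segment size
        raise MemoryError("segment overflow")
    ppn = page_table[vpn]              # second-level lookup
    return ppn * PAGE + offset
```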
20 Inverted Page Tables
- Main idea: one PTE for each physical page frame; hash (pid, vpage) to a physical page #
- Pros: small page table for a large address space
- Cons: lookup is difficult; overhead of managing hash chains, etc.
- [Figure: the virtual address (pid, vpage, offset) hashes to entry k of the inverted page table (entries 0..n-1); physical address = (k, offset)]
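A sketch of the idea: the table is indexed by physical frame, and the frame number *is* the table index that matches. Linear probing stands in here for the hash chains the slide mentions; the table size is arbitrary:

```python
N_FRAMES = 8

# One entry per physical frame: frame k holds (pid, vpage) or None.
inverted = [None] * N_FRAMES

def insert(pid, vpage):
    """Hash (pid, vpage) to a frame; probe linearly on collision."""
    k = hash((pid, vpage)) % N_FRAMES
    for _ in range(N_FRAMES):
        if inverted[k] is None or inverted[k] == (pid, vpage):
            inverted[k] = (pid, vpage)
            return k
        k = (k + 1) % N_FRAMES
    raise MemoryError("no free frame")

def lookup(pid, vpage):
    """A hit returns the frame number, i.e. the matching table index."""
    k = hash((pid, vpage)) % N_FRAMES
    for _ in range(N_FRAMES):
        if inverted[k] == (pid, vpage):
            return k
        k = (k + 1) % N_FRAMES
    raise KeyError("page fault")
```

The table size is proportional to physical memory, not to the virtual address space, which is the whole point.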
21 Virtual-to-Physical Lookup
- The program only knows virtual addresses; each process's address space goes from 0 to its highest address
- Every memory access must be translated, involving a walk of the (hierarchical) page tables
- The page table is in memory: an extra memory access for each memory access?
- Solution: cache part of the page table (hierarchy) in fast associative memory, the translation lookaside buffer (TLB)
- Introduces TLB hits, misses, etc.
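A back-of-the-envelope calculation shows why the extra memory access matters and how a TLB recovers the cost; all the timing numbers below are assumed, not from the slides:

```python
# Hypothetical latencies.
MEM = 100        # ns per memory access
TLB = 1          # ns per TLB lookup

# Without a TLB: one page-table access plus one data access per reference.
no_tlb = MEM + MEM

def effective(hit_rate):
    """TLB is checked on every access; a miss adds a page-table access."""
    return TLB + MEM + (1 - hit_rate) * MEM

print(no_tlb, "ns without a TLB;", effective(0.99), "ns at a 99% hit rate")
```

Even a modest hit rate nearly halves the cost; at realistic hit rates the translation overhead becomes almost negligible.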
23 Bits in a TLB Entry
- Common (necessary) bits
  - Virtual page number: matched against the virtual address
  - Physical page number: the translated address
  - Valid
  - Access bits: kernel and user (nil, read, write)
- Optional (useful) bits
  - Process tag
  - Reference
  - Modify
  - Cacheable
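The entry layout above can be written out as a record, with a fully associative probe that matches on the virtual page number and the optional process tag; field names and defaults are illustrative:

```python
from dataclasses import dataclass

@dataclass
class TLBEntry:
    vpn: int                 # virtual page number (matched on lookup)
    ppn: int                 # physical page number (the translation)
    valid: bool = True
    access: str = "read"     # nil / read / write
    pid: int = 0             # optional process tag
    referenced: bool = False
    modified: bool = False
    cacheable: bool = True

def probe(tlb, vpn, pid=0):
    """Fully associative match on (vpn, process tag); None means a miss."""
    for e in tlb:
        if e.valid and e.vpn == vpn and e.pid == pid:
            e.referenced = True
            return e.ppn
    return None
```

The process tag is what lets entries from several processes coexist, avoiding a full flush on every context switch.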
24 Hardware-Controlled TLB
- On a TLB miss
  - Hardware loads the PTE into the TLB, writing back an entry if there is no free one
  - Hardware generates a fault if the page containing the PTE is invalid; VM software performs fault handling and restarts the CPU
- On a TLB hit, hardware checks the valid bit
  - If valid, use the pointer to the page frame in memory
  - If invalid, the hardware generates a page fault; perform page fault handling and restart the faulting instruction
25 Software-Controlled TLB
- On a TLB miss (handled in software)
  - Write back an entry if there is no free one
  - Check whether the page containing the PTE is in memory; if not, perform page fault handling
  - Load the PTE into the TLB and restart the faulting instruction
- On a TLB hit, the hardware checks the valid bit
  - If valid, use the pointer to the page frame in memory
  - If invalid, the hardware generates a page fault; perform page fault handling
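The software miss path above can be sketched as a handler; the TLB size, page-table format, and random replacement policy are all assumptions, and the disk I/O of real page fault handling is reduced to a flag flip:

```python
import random

TLB_SIZE = 2
tlb = {}                       # vpn -> PTE, at most TLB_SIZE entries

# Hypothetical in-memory page table.
page_table = {0: {"ppn": 4, "in_memory": True},
              1: {"ppn": 9, "in_memory": False}}

def handle_miss(vpn):
    """Software TLB miss handler, following the slide's steps."""
    if len(tlb) >= TLB_SIZE:                  # no free entry: evict one
        tlb.pop(random.choice(list(tlb)))     # (write-back omitted in sketch)
    pte = page_table[vpn]
    if not pte["in_memory"]:
        pte["in_memory"] = True               # stand-in for page fault handling
    tlb[vpn] = pte                            # load the PTE...
    return pte["ppn"]                         # ...then restart the instruction
```

Because this path is ordinary kernel code, the OS is free to choose any page-table organization it likes, which is the flexibility the next slide contrasts with the hardware approach.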
26 Hardware vs. Software Control
- Hardware approach: efficient, but inflexible, and needs more space for the page table
- Software approach: flexible
  - Software can do mappings by hashing: PP# -> (pid, VP#) and (pid, VP#) -> PP#
  - Can deal with a large virtual address space
27 Cache vs. TLB
- Similarities
  - Both are fast and expensive relative to their capacity
  - Both cache a portion of memory
  - Both write back on a miss
- Differences
  - A TLB is usually fully associative; a cache can be direct-mapped
  - A TLB does not deal with consistency with memory
  - A TLB can be controlled by software
- Logically the TLB lookup comes before the cache lookup, but careful design allows the lookups to overlap
  - Combine the L1 cache with the TLB: a virtually addressed cache
  - Why wouldn't everyone use a virtually addressed cache?
28 TLB-Related Issues
- Which TLB entry should be replaced? Random, or pseudo-LRU
- What happens on a context switch?
  - With a process tag: change the TLB register and process register
  - Without a process tag: invalidate the entire TLB contents
- What happens when a page table entry is changed? Change the entry in memory and invalidate the TLB entry
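The context-switch and invalidation cases above can be contrasted in a small sketch; the entry layout and contents are invented:

```python
# Hypothetical tagged TLB contents.
tlb = [{"vpn": 3, "ppn": 7, "pid": 1, "valid": True},
       {"vpn": 5, "ppn": 2, "pid": 2, "valid": True}]

def switch_without_tags():
    """No process tag: every entry must be invalidated on a switch."""
    for e in tlb:
        e["valid"] = False

def switch_with_tags(new_pid):
    """With tags, only the current-pid register changes; entries survive."""
    return new_pid      # lookups now match against new_pid

def invalidate(vpn, pid):
    """After changing a page table entry, drop the stale TLB entry."""
    for e in tlb:
        if e["vpn"] == vpn and e["pid"] == pid:
            e["valid"] = False
```

The cost difference is the reason process tags are worth their hardware: untagged TLBs restart cold after every switch.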
29 Consistency Issues
- Snoopy cache protocols maintain cache consistency with DRAM, even when DMA happens
- Consistency between DRAM and TLBs: software must flush the related TLB entries whenever a page table entry in memory is changed
- Multiprocessors need TLB "shootdown": when you modify a page table entry, you must flush ("shoot down") all related TLB entries on every processor
30 Summary
- Virtual memory: easier software development, better memory utilization, protection
- Address translation
  - Base & bound: simple, but limited
  - Segmentation: useful but complex
  - Paging: currently the best tradeoff
  - TLB: fast translation; VM must handle TLB consistency issues