Paging, Page Tables, and Such Andrew Whitaker CSE451.

1 Paging, Page Tables, and Such Andrew Whitaker CSE451

2 Today's Topics: Page Replacement Strategies; Making Paging Fast; Reducing the Overhead of Page Tables

3 Review: Working Sets [Figure: throughput (requests per second) plotted against the number of page frames allocated to the process, with regions labeled thrashing, working set, and over-allocation]

4 Page Replacement: What happens when we take a page fault and we've run out of memory? Goal: keep each process's working set in memory. Giving a process more than its working set is not necessary. Key issue: how do we identify working sets?

5 Belady's Algorithm: Evict the page that won't be used for the longest time in the future. This page is probably not in the working set; if it is in the working set, we're thrashing. This is optimal: it minimizes the number of page faults. Major problem: this requires a crystal ball, because there is no good way to predict future memory accesses.
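
Belady's policy can only be run offline, when the whole reference trace is known in advance, which makes it useful as a yardstick for other policies. The following is a minimal sketch under that assumption; the function name `belady_faults` and the sample trace are illustrative and not part of the slides.

```python
def belady_faults(references, num_frames):
    """Count page faults under Belady's policy, given the full future trace."""
    frames = set()
    faults = 0
    for i, page in enumerate(references):
        if page in frames:
            continue  # hit: nothing to do
        faults += 1
        if len(frames) < num_frames:
            frames.add(page)  # a free frame is still available
            continue
        future = references[i + 1:]

        def next_use(resident):
            # Distance to the next reference; pages never used again sort last.
            return future.index(resident) if resident in future else float("inf")

        # Evict the resident page whose next use lies farthest in the future.
        frames.remove(max(frames, key=next_use))
        frames.add(page)
    return faults


trace = [7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2, 1, 2, 0, 1, 7, 0, 1]
print(belady_faults(trace, 3))
```

The quadratic scan over `future` keeps the sketch short; it is not meant to be efficient.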

6 How Good are These Page Replacement Algorithms? LIFO: newest page is kicked out. FIFO: oldest page is kicked out. Random: a random page is kicked out. LRU: least recently used page is kicked out.

7 Temporal Locality: Assumption: recently accessed pages will be accessed again soon, so use the past to predict the future. LIFO is horrendous. Random is also pretty bad. LRU is pretty good. FIFO is mediocre. VAX/VMS used a form of FIFO because of hardware limitations.
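
One way to see the gap between FIFO and LRU is to count faults for both on the same trace. This is a small simulation sketch, not a measurement from the lecture; the function names and the trace are assumptions chosen for illustration, and a single trace proves nothing general.

```python
from collections import OrderedDict, deque


def fifo_faults(references, num_frames):
    """Count faults when the oldest resident page is always evicted."""
    resident = set()
    arrival_order = deque()
    faults = 0
    for page in references:
        if page in resident:
            continue
        faults += 1
        if len(resident) == num_frames:
            resident.remove(arrival_order.popleft())
        resident.add(page)
        arrival_order.append(page)
    return faults


def lru_faults(references, num_frames):
    """Count faults when the least recently used page is evicted."""
    recency = OrderedDict()  # least recently used page comes first
    faults = 0
    for page in references:
        if page in recency:
            recency.move_to_end(page)  # a hit refreshes the page's recency
            continue
        faults += 1
        if len(recency) == num_frames:
            recency.popitem(last=False)
        recency[page] = True
    return faults


trace = [7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2, 1, 2, 0, 1, 7, 0, 1]
print(fifo_faults(trace, 3), lru_faults(trace, 3))
```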

8 Implementing LRU: Approach #1. One (bad) approach: on each memory reference: long timeStamp = System.currentTimeMillis(); sortedList.insert(pageFrameNumber, timeStamp); Problem: this is too inefficient. It requires a time stamp plus data structure manipulation on each memory operation, and it is too complex for hardware.

9 Making LRU Efficient: Use hardware support. A reference bit is set when a page is accessed and can be cleared by the OS. Trade off accuracy for speed: it suffices to find a pretty old page. [Figure: page table entry layout: page frame number (20 bits), prot (2 bits), M (1 bit), R (1 bit), V (1 bit)]

10 Approach #2: LRU Approximation with Reference Bits. For each page, maintain a set of reference bits; let's call it a reference byte. Periodically, shift the HW reference bit into the highest-order bit of the reference byte. Suppose the reference byte was 10101010. If the HW bit was set, the new reference byte becomes 11010101. The frame with the lowest value is the LRU page.
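
The shift described on this slide is a one-line bit operation. The sketch below assumes an 8-bit reference byte per frame; the helper names `age_reference_byte` and `pick_lru_frame` are hypothetical.

```python
def age_reference_byte(reference_byte, hw_bit):
    """Shift the byte right by one and place the hardware bit in the top position."""
    return ((hw_bit & 1) << 7) | ((reference_byte & 0xFF) >> 1)


def pick_lru_frame(reference_bytes):
    """Return the index of the frame with the lowest reference byte."""
    return min(range(len(reference_bytes)), key=lambda frame: reference_bytes[frame])


# The slide's example: 10101010 with the HW bit set becomes 11010101.
print(format(age_reference_byte(0b10101010, 1), "08b"))
```

A frame that has not been touched for several intervals accumulates leading zeros, so its byte compares lower than that of any recently touched frame.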

11 Analyzing Reference Bits: Pro: does not impose overhead on every memory reference; the interval rate can be configured. Con: scanning all page frames can still be inefficient, e.g., 4 GB of memory with 4 KB pages => 1 million page frames.

12 Approach #3: LRU Clock. Use only a single bit per page frame; basically, this is a degenerate form of reference bits. On page eviction, scan through the list of reference bits: if the value is zero, replace this page; if the value is one, set the value to zero and keep scanning.
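
The eviction scan above can be sketched as a circular sweep over an array of reference bits. This is a simplified model: the class name `ClockReplacer` and its `touch` method (standing in for the hardware setting the bit) are assumptions made for illustration.

```python
class ClockReplacer:
    """One reference bit per frame plus a hand that sweeps in a circle."""

    def __init__(self, num_frames):
        self.ref_bits = [0] * num_frames
        self.hand = 0

    def touch(self, frame):
        # Models the hardware setting the reference bit on an access.
        self.ref_bits[frame] = 1

    def evict(self):
        # Sweep until a frame with a clear bit is found, clearing set bits on the way.
        while True:
            frame = self.hand
            self.hand = (self.hand + 1) % len(self.ref_bits)
            if self.ref_bits[frame] == 0:
                return frame
            self.ref_bits[frame] = 0


clock = ClockReplacer(4)
clock.touch(0)
clock.touch(1)
print(clock.evict())  # frames 0 and 1 get a second chance
```

The loop always terminates: after at most one full sweep every bit has been cleared, so the hand finds a victim on the second pass at the latest.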

13 Why Clock? [Figure: page-frame reference bits (0s and 1s) arranged in a circle, like the face of a clock] Typically implemented with a circular queue.

14 Analyzing Clock: Pro: very low overhead; it only runs when a page needs to be evicted and takes the first page that hasn't been referenced. Con: it isn't very accurate (one measly bit!) and degenerates into FIFO if all reference bits are set. Pro: but the algorithm is self-regulating; if there is a lot of memory pressure, the clock runs more often (and is more up-to-date).

15 When Does LRU Do Badly? LRU performs poorly when there is little temporal locality. [Figure: pages numbered 1 through 8] Example: many database workloads, such as SELECT * FROM Employees WHERE Salary < 25000
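
A table scan like the query above touches pages in sequence and may repeat the pass later. The sketch below assumes such a cyclic scan over one more page than there are frames, which is a standard worst case for LRU; the trace and the function name `lru_faults` are illustrative, not taken from the slides.

```python
def lru_faults(references, num_frames):
    """Count page faults under exact LRU."""
    recency = []  # least recently used page first
    faults = 0
    for page in references:
        if page in recency:
            recency.remove(page)
        else:
            faults += 1
            if len(recency) == num_frames:
                recency.pop(0)  # evict the least recently used page
        recency.append(page)
    return faults


scan = list(range(5)) * 3  # three passes over pages 0..4
print(lru_faults(scan, 4))  # every reference faults with 4 frames
print(lru_faults(scan, 5))  # only the first pass faults with 5 frames
```

With four frames, LRU always evicts exactly the page the scan will need next, so no reference ever hits.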

16 Today's Topics: Page Replacement Strategies; Making Paging Fast; Reducing the Overhead of Page Tables

17 Review: Mechanics of address translation [Figure: a virtual address is split into a virtual page # and an offset; the virtual page # indexes the page table to yield a page frame #, which is combined with the offset to form the physical address of a location in physical memory (page frames 0 through Y)] Problem: page tables live in memory.

18 Making Paging Fast: We must avoid a page table lookup for every memory reference, since this would double memory access time. Solution: the Translation Lookaside Buffer (TLB), a fancy name for a cache. The TLB stores a subset of PTEs (page table entries). TLBs are small and fast (16-48 entries) and can be accessed essentially for free.

19 TLB Details: In practice, most (> 99%) memory translations are handled by the TLB. Each processor has its own TLB. The TLB is fully associative: any TLB slot can hold any PTE, the full VPN is the cache key, and all entries are searched in parallel. Who fills the TLB? Two options: hardware (x86) walks the page table on a TLB miss, or a software routine (MIPS, Alpha) fills the TLB on a miss. The TLB itself needs a replacement policy, usually implemented in hardware (LRU).
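
The lookup path can be modeled in software as a small cache keyed by VPN that sits in front of the page table. This is only a sketch: real TLBs search all entries in parallel in hardware, and the 4 KB page size, the dictionary standing in for the page table, and the class name `TLB` are assumptions.

```python
from collections import OrderedDict

PAGE_SIZE = 4096  # assumed page size for this sketch


class TLB:
    """Small VPN -> PFN cache with LRU replacement in front of a page table."""

    def __init__(self, page_table, capacity=16):
        self.page_table = page_table  # dict VPN -> PFN, stands in for the in-memory table
        self.capacity = capacity
        self.entries = OrderedDict()  # least recently used entry first
        self.hits = 0
        self.misses = 0

    def translate(self, vaddr):
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        if vpn in self.entries:
            self.hits += 1
            self.entries.move_to_end(vpn)
        else:
            self.misses += 1
            pfn = self.page_table[vpn]  # the slow path: walk the page table
            if len(self.entries) == self.capacity:
                self.entries.popitem(last=False)  # evict the LRU entry
            self.entries[vpn] = pfn
        return self.entries[vpn] * PAGE_SIZE + offset


tlb = TLB({0: 7, 1: 3}, capacity=2)
print(hex(tlb.translate(0x10)), tlb.hits, tlb.misses)
```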

20 What Happens on a Context Switch? Each process has its own address space, so each process has its own page table, so page-table entries are only relevant for a particular process. Thus, the TLB must be flushed on a context switch. This is why context switches are so expensive.

21 Ben's Idea: We can avoid flushing the TLB if entries are associated with an address space. When would this work well? When would this not work well? [Figure: TLB entry fields: page frame number (20 bits), prot (2 bits), M (1 bit), R (1 bit), V (1 bit), plus an ASID (4 bits)]
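
Tagging each entry with an address space identifier can be sketched by making the ASID part of the lookup key. The class `TaggedTLB` and its methods are hypothetical, and capacity limits and replacement are left out to keep the focus on the tag.

```python
class TaggedTLB:
    """TLB whose entries are keyed by (ASID, VPN), so a context switch needs no flush."""

    def __init__(self):
        self.entries = {}  # (asid, vpn) -> pfn
        self.current_asid = 0

    def context_switch(self, asid):
        # Only the current ASID changes; entries of other address spaces stay cached.
        self.current_asid = asid

    def insert(self, vpn, pfn):
        self.entries[(self.current_asid, vpn)] = pfn

    def lookup(self, vpn):
        # Returns None on a miss, including for entries owned by another address space.
        return self.entries.get((self.current_asid, vpn))


tlb = TaggedTLB()
tlb.context_switch(1)
tlb.insert(5, 10)
tlb.context_switch(2)
print(tlb.lookup(5))  # miss: the entry belongs to ASID 1
tlb.context_switch(1)
print(tlb.lookup(5))  # hit: the entry survived the switches
```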

22 TLB Management Pain: The TLB is a cache of page table entries, so the OS must ensure that page tables and TLB entries stay in sync. Massive pain: TLB consistency across multiple processors. Q: How do we implement LRU if reference bits are stored in the TLB? One answer: we don't. Windows uses FIFO for multiprocessor machines.

23 Today's Topics: Page Replacement Strategies; Making Paging Fast; Reducing the Overhead of Page Tables

24 Page Table Overhead: For a large address space, page table sizes can become enormous. Example: Alpha architecture, 64-bit address space, 8 KB pages. Num PTEs = 2^64 / 2^13 = 2^51. Assuming 8 bytes per PTE: Num Bytes = 2^54 = 16 petabytes. And this is per-process!
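
The arithmetic on this slide can be checked directly. The helper `flat_page_table_bytes` is a hypothetical name, and the 32-bit case with 4-byte PTEs is an added comparison point rather than a figure from the slides.

```python
def flat_page_table_bytes(address_bits, page_bytes, pte_bytes):
    """Size of a single-level page table that covers the whole address space."""
    num_ptes = 2 ** address_bits // page_bytes
    return num_ptes * pte_bytes


# The slide's example: 64-bit addresses, 8 KB pages, 8-byte PTEs.
alpha_bytes = flat_page_table_bytes(64, 8 * 1024, 8)
print(alpha_bytes == 2 ** 54, alpha_bytes // 2 ** 50)  # size in units of 2^50 bytes

# Assumed comparison point: 32-bit addresses, 4 KB pages, 4-byte PTEs.
print(flat_page_table_bytes(32, 4 * 1024, 4) // 2 ** 20)  # size in units of 2^20 bytes
```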

25 Optimizing for Sparse Address Spaces: Observation: very little of the address space is in use at a given time; this is why virtual memory works. Basic idea: only allocate page tables where we need to, and fill in new page tables on demand. [Figure: virtual address space]

26 Implementing Sparse Address Spaces: We need a data structure to keep track of the page tables we have allocated, and this structure must be small; otherwise, we've defeated our original goal. Solution: multi-level page tables (page tables of page tables). Any problem in CS can be solved with a layer of indirection.

27 Two-level page tables [Figure: the virtual address is split into a master page #, a secondary page #, and an offset; the master page # indexes the master page table to find a secondary page table (some slots are empty), the secondary page # indexes that table to find a page frame number, and the page frame # plus the offset forms the physical address in physical memory (page frames 0 through Y)] Key point: not all secondary page tables must be allocated.
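
A two-level lookup can be sketched with nested dictionaries, where a missing master slot plays the role of an unallocated secondary page table. The 10/10/12 bit split of a 32-bit address is an assumption for illustration, and the names `split` and `TwoLevelPageTable` are hypothetical.

```python
OFFSET_BITS = 12     # assumed 4 KB pages
SECONDARY_BITS = 10  # assumed 1024 entries per secondary page table


def split(vaddr):
    """Split a virtual address into (master index, secondary index, offset)."""
    offset = vaddr & ((1 << OFFSET_BITS) - 1)
    secondary_idx = (vaddr >> OFFSET_BITS) & ((1 << SECONDARY_BITS) - 1)
    master_idx = vaddr >> (OFFSET_BITS + SECONDARY_BITS)
    return master_idx, secondary_idx, offset


class TwoLevelPageTable:
    def __init__(self):
        self.master = {}  # master index -> secondary table; absent means not allocated

    def map(self, vaddr, pfn):
        master_idx, secondary_idx, _ = split(vaddr)
        # Allocate the secondary table on demand, only when a page in its range is mapped.
        self.master.setdefault(master_idx, {})[secondary_idx] = pfn

    def translate(self, vaddr):
        master_idx, secondary_idx, offset = split(vaddr)
        secondary = self.master.get(master_idx)
        if secondary is None or secondary_idx not in secondary:
            raise LookupError("page fault")
        return (secondary[secondary_idx] << OFFSET_BITS) | offset


table = TwoLevelPageTable()
table.map(0x00403004, 0x2A)
print(hex(table.translate(0x00403ABC)), len(table.master))
```

Only one secondary table exists after the single `map` call, which is the space saving the slide points at.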

28 Generalizing: Early architectures used 1-level page tables. VAX, x86 used 2-level page tables. SPARC uses 3-level page tables. Alpha 68030 uses 4-level page tables. The key thing is that the outer level must be wired down (pinned in physical memory) in order to break the recursion.

29 Cool Paging Tricks Basic Idea: exploit the layer of indirection between virtual and physical memory

30 Trick #1: Shared Memory. Allow different processes to share physical memory. [Figure: virtual address space 1 and virtual address space 2 both map pages onto the same physical memory]

31 Trick #2: Copy-on-write. Recall that fork() copies the parent's address space to the child. This is inefficient, especially if the child calls exec. Copy-on-write allows for a fast copy by using shared pages: if the child tries to write to a page, the OS intervenes and makes a copy of the target page. Implementation: pages are shared as read-only, and the OS intercepts write faults. [Figure: page table entry fields: page frame number, prot, M, R, V]
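
The copy-on-write mechanism can be modeled at user level with reference-counted frames and a per-mapping writable flag. This is a toy model, not how any particular kernel implements fork(); the classes `Frame` and `AddressSpace` and their methods are assumptions.

```python
class Frame:
    """A physical page frame with a count of the mappings that share it."""

    def __init__(self, data):
        self.data = bytearray(data)
        self.refcount = 1


class AddressSpace:
    def __init__(self):
        self.pages = {}  # vpn -> [frame, writable]

    def fork(self):
        child = AddressSpace()
        for vpn, entry in self.pages.items():
            entry[1] = False  # the parent's mapping becomes read-only too
            entry[0].refcount += 1
            child.pages[vpn] = [entry[0], False]
        return child

    def write(self, vpn, offset, value):
        entry = self.pages[vpn]
        frame, writable = entry
        if not writable:  # models the write fault taken on a read-only page
            if frame.refcount > 1:
                frame.refcount -= 1
                frame = Frame(frame.data)  # make a private copy of the page
            entry[0], entry[1] = frame, True
        frame.data[offset] = value

    def read(self, vpn, offset):
        return self.pages[vpn][0].data[offset]


parent = AddressSpace()
parent.pages[0] = [Frame(b"\x01\x02"), True]
child = parent.fork()
child.write(0, 0, 9)
print(parent.read(0, 0), child.read(0, 0))
```

When the last sharer writes, the reference count is already 1, so the page is simply made writable again without a copy.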

32 Trick #3: Memory-mapped Files. Normally, files are accessed with system calls: open, read, write, close. Memory mapping allows a program to access a file with load/store operations. [Figure: a region of a virtual address space mapped to the file Foo.txt]
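
Python's standard `mmap` module exposes this idea: once a file is mapped, it can be read and modified with ordinary indexing instead of read and write calls. The sketch below assumes a non-empty file; the function name `uppercase_first_byte` and the temporary file are illustrative.

```python
import mmap
import os
import tempfile


def uppercase_first_byte(path):
    """Modify a file in place through a memory mapping instead of write()."""
    with open(path, "r+b") as f:
        # Length 0 maps the whole file; the file must not be empty.
        with mmap.mmap(f.fileno(), 0) as view:
            view[0:1] = view[0:1].upper()  # a plain store into the mapping
            view.flush()                   # push the change back to the file


fd, demo_path = tempfile.mkstemp()
os.write(fd, b"foo")
os.close(fd)
uppercase_first_byte(demo_path)
with open(demo_path, "rb") as f:
    print(f.read())
os.remove(demo_path)
```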
