CS533 - Concepts of Operating Systems Virtual Memory Primitives for User Programs Presentation by David Florey.


1 CS533 - Concepts of Operating Systems Virtual Memory Primitives for User Programs Presentation by David Florey

2 Overview
• This paper presents basic virtual memory primitives, how they are used, and how they are implemented on various OSs
• Discuss the various primitives and how they are used (in user-level algorithms)
• Discuss the performance on various OSs
• Discuss the ramifications of these uses (algorithms) on system design

3 The Primitives (VM Services)
• TRAP
  o Facility allowing user-level handling of page faults (protection or otherwise)
  o An event that is raised (in the form of a message or signal from the OS)
• PROT1
  o Decreases the accessibility of a single page
  o A procedure call (via messaging, trap to OS, etc.)
• PROTN
  o Decreases the accessibility of n pages
  o A procedure call (via messaging, trap to OS, etc.)
• UNPROT
  o Increases the accessibility of a single page
  o A procedure call (via messaging, trap to OS, etc.)
• DIRTY
  o Returns the set of pages that have been touched since the last call to DIRTY
  o A procedure call (via messaging, trap to OS, etc.)
• MAP2
  o Maps two different virtual addresses to the same physical page
  o Each virtual address has its own protection level
  o Both mappings are in the same address space (not two different processes, tasks, or address spaces)
  o A procedure call (via messaging, trap to OS, etc.)
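The primitives above can be pictured with a small toy model. This is only a sketch: `PagedMemory`, `ProtectionFault`, and the handler signature are hypothetical names invented for illustration, the model makes no real VM calls, and MAP2 is left out.

```python
class ProtectionFault(Exception):
    """Raised in this model when a protected page is touched and no handler fixes it."""


class PagedMemory:
    """Toy model of pages with per-page protection (hypothetical, not a real VM API)."""

    def __init__(self, num_pages, page_size=4):
        self.page_size = page_size
        self.data = [0] * (num_pages * page_size)
        self.protected = set()      # pages whose access has been removed
        self.dirty_pages = set()    # pages written since the last dirty() call
        self.trap_handler = None    # TRAP: user-level fault handler

    def prot1(self, page):          # PROT1: protect a single page
        self.protected.add(page)

    def protn(self, pages):         # PROTN: protect n pages in one call
        self.protected.update(pages)

    def unprot(self, page):         # UNPROT: restore access to a single page
        self.protected.discard(page)

    def dirty(self):                # DIRTY: pages written since the last call
        pages, self.dirty_pages = self.dirty_pages, set()
        return sorted(pages)

    def _check(self, addr):
        page = addr // self.page_size
        if page in self.protected:
            if self.trap_handler is None:
                raise ProtectionFault(page)
            self.trap_handler(self, page)    # handler is expected to UNPROT
            if page in self.protected:
                raise ProtectionFault(page)
        return page

    def read(self, addr):
        self._check(addr)
        return self.data[addr]

    def write(self, addr, value):
        page = self._check(addr)
        self.data[addr] = value
        self.dirty_pages.add(page)


faults = []

def handler(mem, page):
    faults.append(page)   # record the fault, then make the page accessible again
    mem.unprot(page)

mem = PagedMemory(num_pages=4)
mem.trap_handler = handler
mem.protn([1, 2])
mem.write(0, 7)           # page 0 is unprotected: no fault
mem.write(5, 9)           # page 1 is protected: TRAP, then UNPROT, then the write
touched = mem.dirty()
```

In this model only writes mark a page dirty, which is one possible reading of "touched".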

4 VM Service Usage: Concurrent Garbage Collection
• Stop all threads
• Divide memory into from-space and to-space
• Copy all objects reachable from “roots” and registers into to-space
• Use PROTN to protect all pages in the unscanned area
• Use MAP2 to allow the collector access to all pages while preventing mutators from accessing the same pages
• Restart threads
• As mutator threads attempt to access pages in to-space that are unscanned, a TRAP event:
  o Stops the mutator in its tracks
  o Calls the collector, which scans, forwards, and UNPROTs the page
  o Allows the mutator to continue
• At some point this process is restarted, and all objects left in from-space are considered garbage and removed
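The trap-driven part of these steps can be sketched as a single-threaded simulation. Everything here is an assumption made for illustration: the `Collector` class, the tiny `PAGE_SIZE`, and the encoding of from-space pointers as strings and to-space pointers as integers. The background collector thread and MAP2 are omitted.

```python
PAGE_SIZE = 2  # objects per to-space page (tiny, hypothetical value)


class Collector:
    def __init__(self, from_space, roots):
        self.from_space = from_space   # name -> list of names it points to
        self.to_space = []             # copies: [name, fields]
        self.forward = {}              # from-space name -> to-space index
        self.protected = set()         # to-space pages that are still unscanned
        self.trap_log = []
        # Flip: copy only the roots; their pages start out protected (PROTN).
        self.roots = [self.copy(name) for name in roots]

    def copy(self, name):
        """Forward one object into to-space; its fields are still unscanned."""
        if name not in self.forward:
            self.forward[name] = len(self.to_space)
            self.to_space.append([name, list(self.from_space[name])])
            self.protected.add(self.forward[name] // PAGE_SIZE)
        return self.forward[name]

    def scan_page(self, page):
        """Replace from-space names on this page by to-space indices, then UNPROT."""
        i = page * PAGE_SIZE
        # The bound is re-evaluated so objects appended to this page are scanned too.
        while i < min((page + 1) * PAGE_SIZE, len(self.to_space)):
            obj = self.to_space[i]
            obj[1] = [self.copy(f) if isinstance(f, str) else f for f in obj[1]]
            i += 1
        self.protected.discard(page)

    def mutator_read(self, index):
        page = index // PAGE_SIZE
        if page in self.protected:     # TRAP: mutator touched an unscanned page
            self.trap_log.append(page)
            self.scan_page(page)
        return self.to_space[index]


from_space = {"a": ["b", "c"], "b": ["c"], "c": [], "junk": ["a"]}
gc = Collector(from_space, roots=["a"])
name, fields = gc.mutator_read(gc.roots[0])
```

The mutator only ever sees to-space indices in `fields`, and `"junk"` is never copied because nothing reachable points to it.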

5 Concurrent Garbage Collection

6 VM Service Usage: Shared Virtual Memory
• Each CPU (or machine) has its own memory and memory mapping manager (MM)
• Memory mapping managers keep CPU memory consistent with the “shared” memory
• When a page is shared, it is marked “read-only” (PROT1)
• A write to this page faults in the writing thread, raising a TRAP event that is handled by the associated mapping manager
• The mapping manager uses the trap to notify the other MMs, which in turn flush their copies of the page (this mechanism may also be used to get an up-to-date copy of the page)
• The page is then marked writable (UNPROT) and written
• MAP2 is used to allow the trap handler to access the protected page while the client cannot
• TRAP is also used by the MM to pull down a page from another CPU or disk when it is not available locally
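A minimal sketch of this write-invalidate behaviour follows. It is a simulation under stated assumptions: `Node` and `Directory` are hypothetical names, a page holds a single value, and a central directory stands in for the mapping managers notifying each other directly.

```python
class Directory:
    """Hypothetical central bookkeeping standing in for MM-to-MM messages."""

    def __init__(self):
        self.holders = {}   # page -> set of nodes holding a copy
        self.backing = {}   # page -> last known contents

    def fetch(self, page, node):
        holders = self.holders.setdefault(page, set())
        for other in holders:
            if page in other.writable:
                other.writable.discard(page)             # PROT1: writer becomes read-only
                self.backing[page] = other.pages[page]   # pick up the up-to-date copy
        holders.add(node)
        return self.backing.get(page, 0)

    def invalidate_others(self, page, writer):
        holders = self.holders.setdefault(page, set())
        for other in holders - {writer}:
            other.pages.pop(page, None)                  # flush the stale copy
            other.writable.discard(page)
        self.holders[page] = {writer}


class Node:
    """One CPU with its own memory and mapping manager (toy model)."""

    def __init__(self, name, directory):
        self.name = name
        self.directory = directory
        self.pages = {}         # page -> local copy
        self.writable = set()   # pages this node may currently write
        self.traps = 0

    def read(self, page):
        if page not in self.pages:       # TRAP: no local copy, pull one down
            self.traps += 1
            self.pages[page] = self.directory.fetch(page, self)
        return self.pages[page]

    def write(self, page, value):
        if page not in self.writable:    # TRAP: page is missing or read-only
            self.traps += 1
            self.directory.invalidate_others(page, self)
            self.writable.add(page)      # UNPROT: now the only, writable copy
        self.pages[page] = value


d = Directory()
a, b = Node("A", d), Node("B", d)
a.write(7, "x")
first_read = b.read(7)    # B faults and fetches A's copy; A drops to read-only
b.write(7, "y")           # B faults; A's copy is flushed
second_read = a.read(7)   # A faults and fetches the new contents
```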

7 Shared Memory

8 VM Service Usage: Concurrent Checkpointing
• Checkpointing is the process of saving state such as the heap, stack, etc., which can be slow
• Instead of a synchronous save, we can simply use PROTN to mark the pages that need to be saved to disk read-only
• A second thread can then run concurrently with the user threads, writing out pages and UNPROTing each page as it is written
• If a user thread hits a “read-only” page, a fault occurs, TRAPping to the concurrent thread, which quickly writes the page and allows the faulting thread to continue
• Could also just do this with the DIRTY pages using PROT1
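The interplay between the saver thread and a faulting user thread can be sketched as follows. This is a single-threaded simulation; `Checkpointer`, `background_step`, and `user_write` are hypothetical names, and a dictionary stands in for the checkpoint file on disk.

```python
class Checkpointer:
    def __init__(self, memory):
        self.memory = memory      # page number -> contents (live state)
        self.protected = set()
        self.saved = {}           # stands in for the checkpoint written to disk
        self.traps = 0

    def begin(self):
        self.protected = set(self.memory)   # PROTN: mark every page read-only
        self.saved = {}

    def _save(self, page):
        self.saved[page] = self.memory[page]
        self.protected.discard(page)        # UNPROT once the page is written out

    def background_step(self):
        """One step of the concurrent saver; returns False when nothing is left."""
        if not self.protected:
            return False
        self._save(min(self.protected))
        return True

    def user_write(self, page, value):
        if page in self.protected:          # TRAP: save this page first
            self.traps += 1
            self._save(page)
        self.memory[page] = value


cp = Checkpointer({0: "a", 1: "b", 2: "c"})
cp.begin()
cp.background_step()        # saver writes page 0
cp.user_write(2, "C")       # page 2 not yet saved: trap, save old contents, then write
cp.user_write(0, "A")       # page 0 already saved: no trap
while cp.background_step():
    pass
```

The checkpoint ends up holding the state as of `begin()`, even though user writes continued while it was being taken.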

9 Concurrent Checkpointing

10 Concurrent Checkpointing With DIRTY

11 VM Service Usage: Generational Garbage Collection
• Objects are kept in generations
• The longer an object lives, the older its generation
• Typically garbage is in younger generations, but an old object might be pointing at a young object, so…
• Use DIRTY to see whether pages containing old objects were changed; objects in these dirty pages can then be scanned to see where they point
• Or
• PROTN all old pages and TRAP to a handler when an old page is written to; save the page id in a list for later scanning and UNPROT the page so the writer can write
• Later, the collector can scan the list of pages to see if any objects within the pages are pointing to younger generations
• Why use a small page size here?
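The PROTN/TRAP variant can be sketched as below. The `OldGeneration` class, the slot encoding, and `PAGE_SIZE` are assumptions made for illustration only.

```python
PAGE_SIZE = 4  # object slots per page (hypothetical value)


class OldGeneration:
    def __init__(self, num_slots):
        # Each slot is None or a (generation, object_id) reference.
        self.slots = [None] * num_slots
        num_pages = (num_slots + PAGE_SIZE - 1) // PAGE_SIZE
        self.protected = set(range(num_pages))   # PROTN all old pages
        self.written_pages = []                  # page ids saved for later scanning

    def store(self, slot, ref):
        page = slot // PAGE_SIZE
        if page in self.protected:               # TRAP on the first write to a page
            self.written_pages.append(page)
            self.protected.discard(page)         # UNPROT so the writer can write
        self.slots[slot] = ref

    def young_roots(self):
        """Scan only the recorded pages for pointers into the young generation."""
        roots = []
        for page in self.written_pages:
            end = min((page + 1) * PAGE_SIZE, len(self.slots))
            for slot in range(page * PAGE_SIZE, end):
                ref = self.slots[slot]
                if ref is not None and ref[0] == "young":
                    roots.append(ref[1])
        return roots


old = OldGeneration(num_slots=16)
old.store(1, ("young", "y1"))
old.store(2, ("old", "o9"))      # same page as slot 1: no second trap
old.store(13, ("young", "y2"))
roots = old.young_roots()
```

Only two of the four pages are scanned here, and each scan costs `PAGE_SIZE` slots, which is where the page-size question comes from.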

12

13 VM Service Usage: Others…
• Persistent Stores
  o Can use VM services to protect pages, trap on writes, and persist dirty pages on commit or toss them on abort
  o Primitives: TRAP, PROTN, UNPROT, MAP2
• Extending addressability
  o After 64-bit to 32-bit translation, pages may need to be protected so that a TRAP handler can properly “load” the page for suitable access, then UNPROT it
  o Primitives: TRAP, UNPROT, PROT1 or PROTN, and MAP2
• Data-compression Paging
  o Compressing n pages into a couple of pages may be faster than writing these pages to disk. The compressed pages can then be access-protected. When the user then tries to access such a page: TRAP, decompress, UNPROT
  o Could also use PROT1 to test the access frequency of a page
  o Primitives: TRAP, PROT1 or PROTN, UNPROT
• Heap overflow detection
  o Terminate the memory allocation area with a “guard” (PROT1) page
  o Upon access to this page, call the TRAP handler, which triggers the collector
  o The alternative is a conditional branch on every allocation
  o Primitives: PROT1, TRAP
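As one example from this list, data-compression paging can be sketched with the standard `zlib` module. `CompressedPager` and its methods are hypothetical names, and the protection is simulated by keeping compressed pages in a separate dictionary rather than by real page protection.

```python
import zlib


class CompressedPager:
    def __init__(self):
        self.resident = {}      # page -> bytes, directly accessible
        self.compressed = {}    # page -> compressed bytes, treated as access-protected
        self.traps = 0

    def page_out(self, page):
        """Compress a page in memory instead of writing it to disk, then protect it."""
        self.compressed[page] = zlib.compress(self.resident.pop(page))

    def access(self, page):
        if page in self.compressed:     # TRAP on a protected (compressed) page
            self.traps += 1
            # Decompress, then UNPROT by making the page resident again.
            self.resident[page] = zlib.decompress(self.compressed.pop(page))
        return self.resident[page]


pager = CompressedPager()
pager.resident[0] = b"A" * 4096
pager.page_out(0)
compressed_size = len(pager.compressed[0])
data = pager.access(0)     # first access traps and decompresses
again = pager.access(0)    # second access finds the page resident
```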

14 Persistent Store Example & Data Compression Example

15 Performance in OSs
• Devised Appel1 and Appel2, two benchmarks based on the algorithms’ patterns of primitive usage
• Appel1
  o PROT1, TRAP, UNPROT
  o e.g. Shared Virtual Memory
• Appel2
  o PROTN, TRAP, UNPROT
  o e.g. Concurrent garbage collection

16 Performance in OSs

17 Performance of Primitives
• All data normalized based on the speed of the Add instruction on each CPU
• Some OSs didn’t implement MAP2
• Some OSs did a crummy job of implementing these primitives
  o mprotect does not flush the TLB correctly
• OS designers seem to be relying on old notions like disk latency
  o Not relevant with CPU-based algorithms like these
• One OS performed exceptionally well, showing that these primitives don’t have to perform poorly

18 Ramifications on System Design
• Fault handling must be fast because we are no longer at the mercy of the disk; we can do it all in the CPU
• TLB Consistency
  o Making memory more accessible is good for TLB consistency
    - One less thing you need to worry about
  o Making memory less accessible in the multi-processor case forces a TLB “shootdown”
    - Stop all processors and tell each to flush entry 123 in its TLB
    - Better if done in batches
    - In fact, paging out could improve if done in batches too
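The benefit of batching can be illustrated by counting interrupts in a toy multiprocessor. `Machine`, `touch`, and `protect` are hypothetical names, and the model assumes one interrupt per CPU per shootdown round regardless of how many entries are flushed.

```python
class Machine:
    def __init__(self, num_cpus):
        self.tlbs = [set() for _ in range(num_cpus)]   # cached page entries per CPU
        self.interrupts = 0

    def touch(self, cpu, page):
        self.tlbs[cpu].add(page)

    def protect(self, pages):
        """Make pages less accessible: one shootdown round covers the whole batch."""
        for tlb in self.tlbs:
            self.interrupts += 1           # interrupt this CPU once per round
            tlb.difference_update(pages)   # flush every entry named in the batch


def loaded_machine():
    m = Machine(num_cpus=4)
    for cpu in range(4):
        for page in range(8):
            m.touch(cpu, page)
    return m

one_at_a_time = loaded_machine()
for page in range(8):
    one_at_a_time.protect([page])      # PROT1-style: a round per page

batched = loaded_machine()
batched.protect(range(8))              # PROTN-style: a single round
```

Under these assumptions, protecting 8 pages on 4 CPUs costs 32 interrupts page by page and 4 when batched.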

19 Ramifications on System Design
• Optimal Page Size
  o Some operations depend on the size of the page
    - “HEY OS DESIGNERS LISTEN UP!”
  o Disk latency can no longer be counted on to hide crummy design
  o Computations linearly proportional to page size are now going to be noticed, so we might benefit by cutting down the page size
    - Those algorithms that do a lot of scanning, like the generational garbage collector, would benefit from a smaller page size
  o Also be aware that shrinking page sizes will cause more page faults and more calls to the fault trap handler, so its overhead must also be very small
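The trade-off between scanning cost and fault count can be made concrete with a little arithmetic. The function `scan_cost` and the write addresses below are made up for illustration; the model assumes every written page faults once and is later scanned in full.

```python
def scan_cost(write_addresses, page_size):
    """Return (faults, bytes_scanned) for a collector that rescans each written page."""
    written_pages = {addr // page_size for addr in write_addresses}
    return len(written_pages), len(written_pages) * page_size


writes = [100, 5000, 5200, 20000]     # hypothetical byte addresses of stores
large_pages = scan_cost(writes, 4096)
small_pages = scan_cost(writes, 1024)
```

With these addresses, 4096-byte pages give 3 faults and 12288 bytes to scan, while 1024-byte pages give 4 faults and 4096 bytes to scan: less scanning, but more trips through the fault handler.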

20 Ramifications on System Design
• Access to Protected Pages
  o Mapping the same page two different ways with two different protections in the same address space is FAST
    - Although it does add some bookkeeping overhead
    - And cache consistency could be a problem
  o You could achieve the same results by copying memory around; only 65 copies and you’re there!
    - Or pounding your head on the desk; that works too
  o You could also use a heavyweight process and super heavy RPC to context switch heavily, relying on the OS’s support for sharing a page between processes
    - Techniques employed in LRPC and URPC can alleviate the context switch problem
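One way to get two views of the same page with different protections in a single process is to map the same file twice. The sketch below uses Python's standard `mmap` module; the names `collector_view` and `mutator_view` are illustrative, and note that Python reports the denied write as a `TypeError` rather than delivering a fault to a handler.

```python
import mmap
import tempfile

PAGE = mmap.PAGESIZE

with tempfile.TemporaryFile() as backing:
    backing.write(b"\0" * PAGE)
    backing.flush()

    # Two mappings of the same file page, in one process, with different protections.
    collector_view = mmap.mmap(backing.fileno(), PAGE, access=mmap.ACCESS_WRITE)
    mutator_view = mmap.mmap(backing.fileno(), PAGE, access=mmap.ACCESS_READ)

    collector_view[0:5] = b"hello"              # allowed through the writable mapping
    seen_by_mutator = bytes(mutator_view[0:5])  # same page, seen through the other view

    try:
        mutator_view[0:5] = b"HELLO"            # rejected through the read-only mapping
        write_blocked = False
    except TypeError:
        write_blocked = True

    mutator_view.close()
    collector_view.close()
```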

21 Ramifications on System Design
• What about pipelined processors?
  o Out-of-order execution
  o Dependence on sequential execution
  o Only a problem in the heap overflow detection case
    - Register tweaking can be a problem
    - All other algorithms work just like a typical page fault handler: handle fault, pull page in, make page accessible

22 Final Considerations
• Making memory more accessible one page at a time, and less accessible in large batches, is good for TLB consistency
• The total performance effect of page size should be considered (fixed costs vs. variable costs)
• Locality of reference is exploited in these algorithms
  o Better locality improves fault handling overhead (as data is closer to the CPU)
• Pages should be accessible in different ways in a single address space

