A REAL-TIME GARBAGE COLLECTOR WITH LOW OVERHEAD AND CONSISTENT UTILIZATION David F. Bacon, Perry Cheng, and V.T. Rajan IBM T.J. Watson Research Center.


1 A REAL-TIME GARBAGE COLLECTOR WITH LOW OVERHEAD AND CONSISTENT UTILIZATION David F. Bacon, Perry Cheng, and V.T. Rajan IBM T.J. Watson Research Center Presented by Srilakshmi Swati Pendyala

2 Outline  Motivation  Introduction & Previous Works  Overview of the Proposed Garbage Collector  Example of the Collection Process  Scheduling – Time-Based Vs. Work-Based  Experimental Results  Conclusion

3 Motivation  Real-time systems are growing in importance  ATMs, PDAs, web servers, points of sale, etc.  Constraints for real-time systems:  Hard constraints for continuous performance (low pause times)  Memory constraints (less memory in embedded systems)  Other constraints?  Hence the need for a real-time garbage collector with low memory usage.

4 Garbage Collection in Real-time Systems  Maximum Pause Time < Required Response  CPU Utilization sufficient to accomplish task  Measured with Minimum Mutator Utilization  Memory Requirement < Resource Limit  Important Constraint in Embedded Systems
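
To make the "Minimum Mutator Utilization" bullet concrete, here is a minimal sketch of how MMU could be computed from a recorded trace of collector pauses. The function names, the trace format (a list of non-overlapping `(start, end)` pause intervals), and the time units are assumptions made for illustration; they do not come from the slides or the paper.

```python
def mutator_time(pauses, lo, hi):
    """Time inside [lo, hi] that is not covered by a collector pause."""
    gc = sum(max(0.0, min(hi, end) - max(lo, start)) for start, end in pauses)
    return (hi - lo) - gc


def mmu(pauses, total, window):
    """Minimum mutator utilization over every window of length `window`.

    Assumes `pauses` are non-overlapping (start, end) intervals in [0, total].
    Collector time inside a sliding window is piecewise linear in the window
    start, and its maxima occur when the window starts at a pause start or
    ends at a pause end, so only those positions (plus the trace boundaries)
    need to be checked.
    """
    candidates = {0.0, total - window}
    for start, end in pauses:
        candidates.add(start)
        candidates.add(end - window)
    best = 1.0
    for t in candidates:
        t = min(max(t, 0.0), total - window)  # keep the window inside the trace
        best = min(best, mutator_time(pauses, t, t + window) / window)
    return best
```

For example, with pauses `[(10, 12), (13, 15)]` on a 100 ms trace, a 5 ms window can catch both pauses at once, so the MMU at that window size is much lower than the overall utilization would suggest.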

5 Problems with Previous Works  Fragmentation  Early works (Baker’s Treadmill) handle a single object size  Not suitable for modern languages  Fragmentation is not a major problem for a family of C and C++ benchmarks (Johnstone’s paper)  Not valid for long-running programs (web servers, embedded systems, etc.)  Use of a single (large) block size  Increases memory requirements  Leads to internal fragmentation

6 Problems with Previous Works  High Space Overhead  Copying algorithms avoid fragmentation  But lead to high space overhead  Uneven Mutator Utilization  Mutator utilization: the fraction of processor time devoted to mutator execution  Several copying algorithms suffer from poor/uneven mutator utilization  Long low-utilization periods render the mutator unsuitable for real-time applications  Inability to handle large data structures  When collecting a subset of the heap at a time, large structures generated by adversarial mutators force unbounded work

7 Outline  Motivation  Introduction & Previous Works  Overview of the Proposed Garbage Collector  Example of the Collection Process  Scheduling – Time-Based Vs. Work-Based  Experimental Results  Conclusion

8 Components and Concepts in Proposed GC  Segregated free list allocator  Geometric size progression limits internal fragmentation  Mostly non-copying  Objects are usually not moved  Defragmentation  Moves objects to a new page when a page is fragmented due to GC  Read barrier: to-space invariant [Brooks]  New techniques with only 4% overhead  Incremental mark-sweep collector  Mark phase fixes stale pointers  Arraylets: bound fragmentation, large object ops  Time-based scheduling

9 Segregated Free List Allocator  Heap divided into fixed-size pages  Each page divided into fixed-size blocks  Objects allocated in smallest block that fits  [Figure: example pages with block sizes 24, 16, and 12]

10 Limiting Internal Fragmentation  Choose page size P and block sizes s_k such that  s_k = s_{k-1} (1 + ρ)  How do we choose s_0 and ρ?  s_0 ~ minimum block size  ρ ~ sufficiently small to avoid internal fragmentation  Too small a ρ leads to too many pages and hence wasted space, but this should be acceptable for long-running processes  Too large a ρ leads to internal fragmentation  Memory for a page should be allocated only when there is at least one object in that page.
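
The geometric progression of block sizes can be sketched as follows. This is an illustrative sketch, not the collector's actual allocator: the function names are hypothetical, and rounding each size up to a whole number of bytes is an assumption added here so that the sizes stay integral.

```python
import bisect
import math


def size_classes(s0, rho, max_size):
    """Block sizes s_k = ceil(s_{k-1} * (1 + rho)), starting from s0.

    Generation stops once a class is at least max_size. The ceiling
    (and the +1 guard for very small rho) is an assumption of this sketch.
    """
    sizes = [s0]
    while sizes[-1] < max_size:
        nxt = math.ceil(sizes[-1] * (1 + rho))
        sizes.append(max(nxt, sizes[-1] + 1))  # always make progress
    return sizes


def block_for(sizes, request):
    """Smallest block size that fits `request` (sizes sorted ascending)."""
    return sizes[bisect.bisect_left(sizes, request)]


def internal_waste(sizes, request):
    """Bytes lost to internal fragmentation for one allocation."""
    return block_for(sizes, request) - request
```

With `s0 = 16` and `rho = 1/8`, a 20-byte request lands in a 21-byte block and wastes one byte. A larger `rho` would give fewer size classes but more waste per object, which is the trade-off the slide describes.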

11 Defragmentation  When do we move objects?  At the end of the sweep phase, when there are insufficient free pages for the mutator to execute, that is, when there is fragmentation  Usually, a program exhibits locality of size  Dead objects are re-used quickly  Defragment either when  Dead objects are not re-used for a GC cycle  Free pages fall below the limit for performing a GC  In practice: we move 2-3% of data traced  Major improvement over a copying collector

12 Read Barrier: To-space Invariant  Problem: Collector moves objects (defragmentation)  and mutator is finely interleaved  Solution: read barrier ensures consistency  Each object contains a forwarding pointer [Brooks]  Read barrier unconditionally forwards all pointers  Mutator never sees old versions of objects  Does the read barrier affect mutator utilization?  [Figure: from-space and to-space, before and after moving object A to A′]
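
A Brooks-style forwarding pointer can be sketched in a few lines. This is a toy model under stated assumptions: the class `Obj` and the helpers `read_barrier`, `load`, `store`, and `move` are hypothetical names, and a real collector would emit the barrier in compiled code rather than call a function.

```python
class Obj:
    """Heap object with a Brooks-style forwarding pointer."""

    def __init__(self, **fields):
        self.forward = self          # points to itself until the object moves
        self.fields = dict(fields)


def read_barrier(ref):
    """Unconditionally follow the forwarding pointer (one extra indirection)."""
    return ref.forward


def load(ref, name):
    """Mutator field read: always goes through the barrier."""
    return read_barrier(ref).fields[name]


def store(ref, name, value):
    """Mutator field write: also lands on the current copy."""
    read_barrier(ref).fields[name] = value


def move(old):
    """Collector side: copy the object, then redirect the old copy to it."""
    new = Obj(**old.fields)
    old.forward = new
    return new
```

After `move(a)`, a mutator that still holds the stale reference `a` reads and writes the new copy, which is the to-space invariant the slide refers to. Because the barrier is unconditional, it has no branch, only the extra load.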

13 Read Barrier Optimization  Previous studies: 20-40% overhead [Zorn, Nielsen]  Several optimizations applied to the read barrier reduce its overhead to <10% using eager read barriers  “Eager” read barrier preferred over “lazy” read barrier.

14 Incremental Mark-Sweep  Mark/sweep finely interleaved with mutator  Write barrier: snapshot-at-the-beginning [Yuasa]  Ensures no lost objects  Must treat objects in write buffer as roots  Read barrier ensures consistency  Marker always traces correct object  With barriers, interleaving is simple  Do the problems inherent to mark-sweep also apply here?
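
The snapshot-at-the-beginning idea can be illustrated with a toy incremental marker. This is a sketch under assumptions: `Node`, `Collector`, the `budget` parameter, and the dictionary-based object layout are all hypothetical and chosen only to show why the overwritten reference is logged and later treated as a root.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.refs = {}        # field name -> Node or None
        self.marked = False


class Collector:
    def __init__(self, roots):
        self.marking = True
        self.worklist = list(roots)   # objects still to be traced
        self.write_buffer = []        # old targets logged by the write barrier

    def write(self, obj, field, new_ref):
        """Mutator pointer store with a snapshot-at-the-beginning barrier."""
        old = obj.refs.get(field)
        if self.marking and old is not None and not old.marked:
            self.write_buffer.append(old)   # keep the overwritten target alive
        obj.refs[field] = new_ref

    def mark_step(self, budget):
        """Trace at most `budget` objects; return True once marking is done."""
        while budget > 0:
            if not self.worklist:
                if not self.write_buffer:
                    self.marking = False
                    return True
                # Objects in the write buffer are treated as extra roots.
                self.worklist.extend(self.write_buffer)
                self.write_buffer.clear()
            obj = self.worklist.pop()
            budget -= 1
            if obj.marked:
                continue
            obj.marked = True
            self.worklist.extend(r for r in obj.refs.values() if r is not None)
        return False
```

The classic failure case is a mutator that copies a pointer into an already-marked object and then erases the only unmarked path to the target. With the barrier above, the erased pointer is logged, so the target is still marked in the current cycle.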

15 Pointer Fix-up During Mark  When can a moved object be freed?  When there are no more pointers to it  Mark phase updates pointers  Redirects forwarded pointers as it marks them  Object moved in collection n can be freed:  At the end of mark phase of collection n+1  [Figure: from-space and to-space with objects A, X, Y, Z and the moved copy A′]
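
The fix-up step can be sketched as a marker that rewrites each pointer to its forwarded target while tracing. The names `Obj` and `mark_and_fix` are hypothetical, and the recursion-free stack traversal is just one convenient way to write it.

```python
class Obj:
    def __init__(self, name):
        self.name = name
        self.forward = self   # forwarding pointer; self if not moved
        self.refs = []        # outgoing pointers
        self.marked = False


def mark_and_fix(roots):
    """Mark reachable objects, redirecting stale pointers along the way.

    `roots` is a list that is updated in place so that roots also
    point at the current copies.
    """
    for i, r in enumerate(roots):
        roots[i] = r.forward
    stack = list(roots)
    while stack:
        obj = stack.pop()
        if obj.marked:
            continue
        obj.marked = True
        for i, child in enumerate(obj.refs):
            obj.refs[i] = child.forward   # redirect the forwarded pointer
            stack.append(obj.refs[i])
```

Once this pass finishes, no reachable object points at an old copy any more, which is why an object moved in collection n can be reclaimed at the end of the mark phase of collection n+1.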

16 Arraylets  Large arrays create problems  Fragment memory space  Cannot be moved in a short, bounded time  Solution: break large arrays into arraylets  Access via indirection; move one arraylet at a time  [Figure: array A split into arraylets A1, A2, A3]
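
Access through an arraylet spine can be sketched as below. The class name, the tiny default arraylet size, and the `move_arraylet` helper are assumptions for illustration; the slides do not specify the actual layout.

```python
class ArrayletArray:
    """Logical array stored as fixed-size chunks reached through a spine."""

    def __init__(self, length, arraylet_size=4):
        self.length = length
        self.arraylet_size = arraylet_size
        n_chunks = -(-length // arraylet_size)   # ceiling division
        self.spine = [[0] * arraylet_size for _ in range(n_chunks)]

    def _locate(self, i):
        if not 0 <= i < self.length:
            raise IndexError(i)
        return divmod(i, self.arraylet_size)     # (chunk index, offset)

    def __getitem__(self, i):
        chunk, offset = self._locate(i)
        return self.spine[chunk][offset]

    def __setitem__(self, i, value):
        chunk, offset = self._locate(i)
        self.spine[chunk][offset] = value

    def move_arraylet(self, chunk):
        """Relocate one chunk; the work is bounded by arraylet_size."""
        self.spine[chunk] = list(self.spine[chunk])
```

Each element access costs one extra indirection through the spine, but the collector only ever has to copy one fixed-size chunk at a time, so the time to move any part of a large array stays bounded.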

17 Outline  Motivation  Introduction & Previous Works  Overview of the Proposed Garbage Collector  Example of the Collection Process  Scheduling – Time-Based Vs. Work-Based  Experimental Results  Conclusion

18 Program start  [Figure: heap (one size only) and stack]

19 Program is allocating  [Figure: heap and stack; legend: free, allocated]

20 GC starts  [Figure: heap and stack; legend: free, unmarked]

21 Program allocating and GC marking  [Figure: heap and stack; legend: free, unmarked, marked or allocated]

22 Sweeping away blocks  [Figure: heap and stack; legend: free, unmarked, marked or allocated]

23 GC moving objects and installing redirection  [Figure: heap and stack; legend: free, allocated, evacuated]

24 2nd GC starts tracing and redirection fixup  [Figure: heap and stack; legend: free, unmarked, evacuated, marked or allocated]

25 2nd GC complete  [Figure: heap and stack; legend: free, allocated]

26 Outline  Motivation  Introduction & Previous Works  Overview of the Proposed Garbage Collector  Example of the Collection Process  Scheduling – Time-Based Vs. Work-Based  Experimental Results  Conclusion

27 Scheduling the Collector  Scheduling Issues  Bad CPU utilization and space usage  Loose program and collector coupling  Time-Based  Trigger the collector to run for C_T seconds whenever the mutator runs for Q_T seconds  Work-Based  Trigger the collector to perform C_W units of collection work whenever the mutator allocates Q_W bytes

28 Scheduling  Time-Based:  Very predictable mutator utilization  Memory allocation does not need to be monitored  Work-Based:  Uneven mutator utilization due to bursty allocation  Memory allocation rates need to be monitored to make sure real-time performance is obtained  Why is time-based scheduling better in terms of mutator utilization? (Analytically and experimentally shown in the paper)
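
The difference between the two policies under bursty allocation can be seen in a small tick-based simulation. Everything here is a simplifying assumption for illustration: time is discretized into ticks, each work-based collector increment is assumed to take a fixed `c_w` ticks, and the function names are hypothetical.

```python
def time_based(n_ticks, q_t, c_t):
    """Schedule string: 'M' = mutator tick, 'G' = collector tick.

    The mutator runs q_t ticks, then the collector runs c_t ticks, repeated.
    """
    period = "M" * q_t + "G" * c_t
    return (period * (n_ticks // len(period) + 1))[:n_ticks]


def work_based(alloc_per_tick, q_w, c_w):
    """Collector runs c_w ticks after every q_w bytes of allocation.

    alloc_per_tick[i] is the number of bytes allocated in the i-th mutator tick.
    """
    out, debt = [], 0
    for allocated in alloc_per_tick:
        out.append("M")
        debt += allocated
        while debt >= q_w:
            out.append("G" * c_w)
            debt -= q_w
    return "".join(out)


def min_utilization(schedule, window):
    """Minimum fraction of mutator ticks over all windows of `window` ticks."""
    return min(
        schedule[i:i + window].count("M") / window
        for i in range(len(schedule) - window + 1)
    )
```

With `q_t = c_t = 5`, every 10-tick window of the time-based schedule contains exactly 5 mutator ticks, regardless of what the program allocates. In the work-based schedule, a burst such as `[1] * 10 + [10] * 5 + [1] * 10` with `q_w = 10` and `c_w = 5` triggers a collector increment after every mutator tick during the burst, so the minimum utilization over a 10-tick window collapses.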

29 Outline  Motivation  Introduction & Previous Works  Overview of the Proposed Garbage Collector  Example of the Collection Process  Scheduling – Time-Based Vs. Work-Based  Experimental Results  Conclusion

30 Pause Time Distribution for javac (Time-Based vs. Work-Based)  [Figure; annotation: 12 ms]

31 Utilization vs. Time for javac (Time-Based vs. Work-Based)  [Figure; annotation: 0.45]

32 Minimum Mutator Utilization for javac (Time-Based vs. Work-Based)

33 Space Usage for javac (Time-Based vs. Work-Based)

34 Conclusions  The Metronome provides true real-time GC  First collector to do so without major sacrifice  Short pauses (4 ms)  High MMU during collection (50%)  Low memory consumption (2x max live)  Critical features  Time-based scheduling  Hybrid, mostly non-copying approach  Integration with the compiler

