Rifat Shahriyar Stephen M. Blackburn Australian National University


1 High Performance Reference Counting and Conservative Garbage Collection
Rifat Shahriyar and Stephen M. Blackburn, Australian National University; Kathryn S. McKinley, Microsoft Research

2 Down for the Count? Getting Reference Counting Back in the Ring ISMM’12
What happened 53 years ago?

3 Why Reference Counting?
Advantages: immediacy; object-local; basic RC is easy.
Disadvantages: cycles; performance.
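The trade-off on this slide can be illustrated with a minimal sketch of naive reference counting. The `Obj` class, the `freed` log, and the helper names are hypothetical and only for illustration; they are not the collector described in the talk. The acyclic pair is reclaimed the moment its last reference is dropped (immediacy), while the two-object cycle is never reclaimed.

```python
class Obj:
    """A toy heap object with a reference count and outgoing references."""
    def __init__(self, name):
        self.name = name
        self.rc = 0
        self.fields = []

freed = []  # names of reclaimed objects, in reclamation order

def inc(obj):
    obj.rc += 1

def dec(obj):
    obj.rc -= 1
    if obj.rc == 0:
        # The object is dead: reclaim it and release everything it references.
        freed.append(obj.name)
        for child in obj.fields:
            dec(child)
        obj.fields = []

def add_ref(src, dst):
    src.fields.append(dst)
    inc(dst)

# Acyclic case: root -> a -> b
a, b = Obj("a"), Obj("b")
inc(a)          # reference held by a root
add_ref(a, b)
dec(a)          # dropping the root reclaims a and b immediately

# Cyclic case: root -> c <-> d
c, d = Obj("c"), Obj("d")
inc(c)          # reference held by a root
add_ref(c, d)
add_ref(d, c)
dec(c)          # c.rc only falls to 1, so the cycle is never reclaimed

print(freed)        # ['a', 'b']
print(c.rc, d.rc)   # 1 1
```

Only the count of the object being referenced or dereferenced is touched, which is what makes the scheme object-local; reclaiming the cycle needs some extra mechanism such as a backup trace.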

4 Can we get RC back in the ring?
Problem: RC is one of the two fundamental GC algorithms and has many advantages, yet it is neglected by performance-conscious VMs.
So how much slower is it? 30%.
Can we get RC back in the ring?

5 RC vs. MS
New RC ≈ MS

6 Summary
Old RC (before 2012): performance is 30% slower than MS and 40% slower than production.
New RC (2012): limited bit count; optimization for new objects; performance matches MS but is still 10% slower than production.
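One way to read "limited bit count" is as a small, saturating count field. The sketch below assumes a 2-bit field and a sticky overflow policy, where a count that reaches the maximum is never changed again and the object is left to a backup trace. The field width and the policy are assumptions made for illustration, not details given on the slide.

```python
BITS = 2                  # assumed width of the count field
STUCK = (1 << BITS) - 1   # 3: the saturated ("stuck") value

def inc(rc):
    """Increment a limited-bit count; a stuck count stays stuck."""
    return rc if rc == STUCK else rc + 1

def dec(rc):
    """Decrement a limited-bit count; a stuck count is never decremented."""
    assert rc > 0, "decrement of a dead object"
    return rc if rc == STUCK else rc - 1

# A popular object overflows the field and can no longer be freed by RC alone.
stuck_rc = 0
for _ in range(5):
    stuck_rc = inc(stuck_rc)
for _ in range(5):
    stuck_rc = dec(stuck_rc)

# An object with few references behaves like ordinary reference counting.
small_rc = dec(dec(inc(inc(0))))

print(stuck_rc, small_rc)   # 3 0
```

The design bet in such a scheme is that most objects have very few references, so a few bits suffice and the rare stuck object can be handled elsewhere.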

7 Taking Off the Gloves with Reference Counting Immix OOPSLA’13

8 Why So Slow?
[Figure: GC, mutator, and total time]

9 Looking a Little Deeper…
[Figure: L1 D-cache misses, instructions retired, and time]
Using Managed Runtime Systems to Tolerate Holes in Wearable Memories

10 Looking a Little Deeper…
Let's see which GC uses which allocator: RC and MS use a free list; SS and Immix use a bump pointer.
[Figure: L1 D-cache misses, instructions retired, and time for free-list versus bump-pointer allocation]
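The difference between the two allocators can be sketched as follows. Both classes are simplified, hypothetical models (integer addresses, no headers, no alignment), not the allocators measured in the talk. The point is only that a bump pointer hands out consecutive addresses, while a free list hands out whatever cells happen to be free.

```python
class BumpAllocator:
    """Allocates contiguously by advancing a cursor through a region."""
    def __init__(self, start, end):
        self.cursor = start
        self.end = end

    def alloc(self, size):
        if self.cursor + size > self.end:
            return None            # region exhausted
        addr = self.cursor
        self.cursor += size
        return addr

class FreeListAllocator:
    """Size-segregated free lists: one list of free cells per size class."""
    def __init__(self, free_cells):
        self.free = {size: list(cells) for size, cells in free_cells.items()}

    def alloc(self, size):
        cells = self.free.get(size)
        if not cells:
            return None            # no free cell of this size
        return cells.pop()

bump = BumpAllocator(0, 1024)
bump_addrs = [bump.alloc(size) for size in (16, 32, 16)]

# Hypothetical free cells left behind by earlier collections.
free_list = FreeListAllocator({16: [0x400, 0x100], 32: [0x900]})
free_addrs = [free_list.alloc(size) for size in (16, 32, 16)]

print(bump_addrs)   # [0, 16, 48]
print(free_addrs)   # [256, 2304, 1024]
```

Objects allocated together by the bump pointer sit next to each other in memory, which is one plausible reading of the cache-miss difference the slide points at.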

11 RC Immix
Combines RC and Immix.
Line/block reclamation: a per-line live object count is kept alongside the per-object reference count.
Exploits Immix's opportunistic copying: new objects can be copied by their first GC, and old objects can be copied by the backup GC.
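The pairing of a line live-object count with an object reference count can be sketched as below. This is a toy model under stated assumptions: a 256-byte line, objects that never span lines, and a hypothetical `Heap` class that keeps counts in Python containers rather than in object headers and line metadata.

```python
LINE_SIZE = 256   # assumed line size in bytes

class Heap:
    def __init__(self, num_lines):
        self.line_live = [0] * num_lines   # live objects per line
        self.rc = {}                       # object address -> reference count

    def line_of(self, addr):
        return addr // LINE_SIZE

    def inc(self, addr):
        count = self.rc.get(addr, 0)
        if count == 0:
            # First reference: the object becomes live on its line.
            self.line_live[self.line_of(addr)] += 1
        self.rc[addr] = count + 1

    def dec(self, addr):
        self.rc[addr] -= 1
        if self.rc[addr] == 0:
            # Last reference gone: the object no longer keeps its line alive.
            del self.rc[addr]
            self.line_live[self.line_of(addr)] -= 1

    def free_lines(self):
        """Lines with no live objects can be reused by the allocator."""
        return [i for i, live in enumerate(self.line_live) if live == 0]

heap = Heap(3)
for addr in (0, 64, 300, 0):   # the object at address 0 gets two references
    heap.inc(addr)
before = list(heap.line_live)

heap.dec(300)                  # the only object on line 1 dies
heap.dec(0)                    # the object at 0 still has one reference
after = heap.free_lines()

print(before)   # [2, 1, 0]
print(after)    # [1, 2]
```

Reclaiming whole lines rather than individual cells is what lets the allocator keep bump-allocating into the freed space.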

12 Total time
RC Immix is 3% faster than Gen Immix (+6% worst case, -21% best case).

13 Summary
RC Immix: object-local collection; excellent mutator locality; copying with RC.
Great performance: outperforms the fastest production collector (-3%).
Transforms RC.

14 Fast Conservative Garbage Collection OOPSLA’14
What happened 53 years ago?

15 GC is Ubiquitous
GC implementations are either exact or conservative. High-performance systems use exact GC, but conservative GC is popular.
A GC needs to find all live and dead objects, starting from the roots. The roots are all references into the heap held by the runtime, including stacks, registers, statics, and JNI.
Conservative GC is generally used in less performant systems.
[Figure: roots and heap under exact and conservative GC]

16 Root Conservative GC
We are interested in root-conservative GC, where references in the roots are not precisely known but references in heap objects are precisely known.
[Figure labels: heap, roots, int]

17 Why Conservative GC
Advantages: needs no cooperation from the compiler and runtime (engineering accurate stack maps is challenging); enables some compiler optimizations.
Disadvantages: must handle ambiguous references; performance.
We are interested in managed languages.
Reference counting has some interesting advantages. Our goal is to make it faster than the production collector. Zoom in on the result.

18 Performance of Conservative GC
BDW suffers a 12% overhead and MCC (mostly copying) suffers a 45% overhead.

19 Ambiguous Reference
Pointers? Retain their referents and all transitively reachable objects (excess retention).
Values? Do not modify them, and pin their referents (pinning).
Corrupt heap? Validate every ambiguous reference before updating per-object metadata (filtering).
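Filtering and pinning can be sketched as a check applied to every word found in the roots. The heap bounds, the alignment, and the `object_starts` set standing in for an object map are all hypothetical values chosen for illustration, and interior pointers are deliberately not handled. A word is treated as a reference only if it passes every check; the word itself is never modified, and its referent is pinned.

```python
HEAP_START = 0x10000   # assumed heap bounds
HEAP_END = 0x20000
ALIGN = 8              # assumed object alignment in bytes

# Hypothetical object map: the addresses at which objects actually start.
object_starts = {0x10000, 0x10040, 0x18000}

def is_valid_reference(value):
    """Filter: accept a word only if it could really be an object reference."""
    return (HEAP_START <= value < HEAP_END
            and value % ALIGN == 0
            and value in object_starts)

def scan_roots(words):
    """Return the referents to pin; the root words are left untouched."""
    pinned = set()
    for word in words:
        if is_valid_reference(word):
            pinned.add(word)
    return pinned

stack_words = [42, 0x10040, 0x10044, 0x18000, 0x30000, 0x10008]
pinned = scan_roots(stack_words)

print(sorted(hex(addr) for addr in pinned))   # ['0x10040', '0x18000']
```

Here 42 and 0x30000 fall outside the heap, 0x10044 is misaligned, and 0x10008 is not the start of any object, so none of them can lead the collector to update metadata for a non-object.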

20 Non-moving
Boehm-Demers-Weiser (BDW) is widely used: a free-list allocator with a mark-sweep trace to reclaim garbage.
Problems: free-list allocation has worse locality than contiguous allocation; when object types are precisely known, never moving objects is an overly restrictive design.

21 Mostly copying
Also known as Bartlett style, with many variants.
Two twists over the classic semi-space collector: to-space and from-space are linked lists of discontiguous pages, and a page referenced by an ambiguous root is promoted.
Problems: semi-space suffers from a huge collection cost; page-level pinning wastes space; objects can't span pages, and the allocator can't use a pinned page.
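The space cost of page-level pinning can be sketched with a toy model. It assumes 4096-byte pages, ambiguous roots that have already been validated, and two hypothetical helpers, `collect` and `wasted_bytes`. A page referenced by any ambiguous root is promoted as a whole; live objects on other pages are copied.

```python
PAGE_SIZE = 4096   # assumed page size in bytes

def collect(live_objects, ambiguous_roots):
    """live_objects maps address -> size for every reachable object.

    Returns the set of promoted (pinned) page numbers and the sorted
    addresses of live objects that must be copied to to-space.
    """
    promoted = {addr // PAGE_SIZE for addr in ambiguous_roots}
    to_copy = sorted(addr for addr in live_objects
                     if addr // PAGE_SIZE not in promoted)
    return promoted, to_copy

def wasted_bytes(live_objects, promoted):
    """Bytes on promoted pages that hold no live data but cannot be reused."""
    used = sum(size for addr, size in live_objects.items()
               if addr // PAGE_SIZE in promoted)
    return len(promoted) * PAGE_SIZE - used

live = {0: 64, 4096: 32, 4200: 32, 8192: 128}
promoted, to_copy = collect(live, ambiguous_roots=[4200])
waste = wasted_bytes(live, promoted)

print(promoted)   # {1}
print(to_copy)    # [0, 8192]
print(waste)      # 4032
```

A single ambiguous reference keeps a whole page in place here even though only 64 of its bytes are live, which is the waste the slide attributes to page-level pinning.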

22 Total time
RC Immix_cons matches production Gen Immix.

23 Summary
Conservative GC: dominated by BDW and MCC; significant overheads; heap organization is key to performance.
New designs: low-overhead object map; Immix line-based pinning.
Conservative RC Immix: matches the fastest production collector.

24 Conclusion

