Ulterior Reference Counting: Fast GC Without The Wait

1 Ulterior Reference Counting: Fast GC Without The Wait
Steve Blackburn, Kathryn McKinley
Presented by: Dimitris Prountzos
Slides adapted from a presentation by Steve Blackburn

2 Outline
Throughput-Responsiveness problem
Reference counting & optimizations
Ulterior in detail
BG-RC in action
Experimental evaluation
Conclusion

3 Throughput/Responsiveness Trade-off
GC and mutator share CPU
Throughput: net GC/mutator ratio
Responsiveness: length of GC pauses
[Diagram: CPU utilization over time, split between GC and mutator, marking poor responsiveness and the maximum pause]

4 The Ulterior approach
Match mechanisms to object demographics
Copying nursery (young space)
  Highly mutated, high mortality young objects
  Ignores most mutations
  GC time proportional to survivors, space efficient
RC mature space
  Low mutation, low mortality old objects
  GC time proportional to mutations, space efficient
Generalize deferred RC to heap objects
  Defer fields of highly mutated objects & enumerate them quickly
  Reference count only infrequently mutated fields

5 Pure Reference Counting
Tracks mutations: RCM(p)
RCM(p) generates a decrement and an increment for the before and after values of p:
  RCM(p) → RC(p_before)--, RC(p_after)++
If RC == 0, free the object
[Diagram: objects a and b with reference counts in the RC space]

6 Pure Reference Counting
[Animation frame, same text as slide 5. Diagram: objects a, b, c in the RC space]

7 Pure Reference Counting
[Animation frame, same text as slide 5. Diagram: objects a, b, c in the RC space]

8 Pure Reference Counting
[Animation frame, same text as slide 5. Diagram: objects a and c in the RC space]
RCM(p) for every mutation is very expensive
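The RCM(p) rule on the slides above can be sketched as a toy Java model. This is only an illustration: the `Obj` class, the `write` and `decrement` helpers, and the way the root reference to `a` is faked are assumptions made for the example, not code from the presentation or from Jikes RVM.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of pure reference counting; every name here is hypothetical.
class PureRcSketch {
    static final class Obj {
        final String name;
        final Obj[] fields;   // pointer fields of this object
        int rc;               // reference count
        boolean freed;
        Obj(String name, int nFields) {
            this.name = name;
            this.fields = new Obj[nFields];
        }
    }

    static final List<String> freedLog = new ArrayList<>();

    // RCM(p) for the mutation src.fields[i] = after:
    // RC(p_before)-- and RC(p_after)++.
    static void write(Obj src, int i, Obj after) {
        Obj before = src.fields[i];
        // Increment first, so storing the same referent again cannot free it.
        if (after != null) after.rc++;
        src.fields[i] = after;
        if (before != null) decrement(before);
    }

    // If RC == 0, free the object and decrement each of its referents.
    static void decrement(Obj o) {
        if (--o.rc == 0) {
            o.freed = true;
            freedLog.add(o.name);
            for (int i = 0; i < o.fields.length; i++) {
                Obj child = o.fields[i];
                o.fields[i] = null;
                if (child != null) decrement(child);
            }
        }
    }

    public static void main(String[] args) {
        Obj a = new Obj("a", 1), b = new Obj("b", 0), c = new Obj("c", 0);
        a.rc = 1;          // assumption: pretend one root reference keeps a alive
        write(a, 0, b);    // a -> b
        write(a, 0, c);    // a -> c, so b's count drops to zero
        System.out.println(freedLog);   // prints [b]
    }
}
```

Every pointer store pays for two count updates here, which is the cost the last frame of the slide calls very expensive.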

9 RC Optimizations
Buffering: apply RC(p)--, RC(p)++ later
Coalescing: apply RCM(p) only for the initial and final values of p (coalesce intermediate values):
  {RCM(p), RCM(p1), ..., RCM(pn)} → RC(p_initial)--, RC(p_final)++
Deferral of RCM events
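A minimal sketch of buffering plus coalescing for a single slot, assuming a single-threaded toy heap. The `Slot` class, the `initialValues` map and the `rcOperations` counter are hypothetical names introduced for the example; freeing at RC == 0 is left out to keep the focus on coalescing.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model of buffered, coalesced reference counting; names are hypothetical.
class CoalescingSketch {
    static final class Obj {
        final String name;
        int rc;
        Obj(String name) { this.name = name; }
    }

    static final class Slot { Obj value; }

    // For each slot mutated since the last collection, only its initial value.
    private final Map<Slot, Obj> initialValues = new LinkedHashMap<>();
    int rcOperations = 0;   // number of count updates actually applied

    void write(Slot p, Obj newValue) {
        // containsKey rather than putIfAbsent: the initial value may be null.
        if (!initialValues.containsKey(p)) initialValues.put(p, p.value);
        p.value = newValue;   // intermediate values are never counted
    }

    // Buffered work applied later: RC(p_initial)--, RC(p_final)++ per mutated slot.
    void collect() {
        for (Map.Entry<Slot, Obj> e : initialValues.entrySet()) {
            Obj initial = e.getValue();
            Obj fin = e.getKey().value;
            if (fin != null) { fin.rc++; rcOperations++; }
            if (initial != null) { initial.rc--; rcOperations++; }
        }
        initialValues.clear();
    }
}
```

With this model, four successive writes to one slot cost two count updates at collection time instead of eight.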

10 Deferred Reference Counting
Goal: Ignore RCM(p) for stacks & registers
Deferral of p: a mutation of p does not generate an RCM(p)
Correctness: for all deferred p, apply RCR(p) at each GC
Retain event RCR(p): for p → o, temporarily retains o regardless of RC(o)
  Deutsch/Bobrow use a Zero Count Table
  Bacon et al. use a temporary increment

11 Classic Deferral
In deferral phase: Ignore RCM(p) for stacks & registers
[Diagram: stacks & registers, and objects a and b in the RC space]

12 Classic Deferral
Ignore RCM(p) for stacks & registers
[Diagram: stacks & registers, and objects a, b, c in the RC space]
Breaks RC==0 Invariant

13 Classic Deferral (Bacon et al.)
Divide execution into epochs
Store information in buffers
  Root buffer (RB): stores 1st-level objects
  Increment buffer (IB): stores increments to 1st-level objects
  Decrement buffer (DB): stores decrements to 1st-level objects
At GC time:
  Look at the RB and apply temporary increments to all objects there
  Process the IB of this epoch
  Look at the RB of the previous epoch and apply decrements to all objects there
  Process the DB of the previous epoch
  During DB processing, recycle o if RC(o) = 0
Avoid race conditions by
  Processing the IB before the DB
  Processing the DB one epoch behind
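The GC-time steps listed above can be sketched in a single-threaded toy model. The class and field names below are assumptions made for illustration, objects have no pointer fields (so there is no recursive decrement), and the real collector described by Bacon et al. runs concurrently, which this sketch does not attempt to model.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of epoch-based deferral; every name here is hypothetical.
class EpochDeferralSketch {
    static final class Obj {
        final String name;
        int rc;
        boolean recycled;
        Obj(String name) { this.name = name; }
    }

    // Stands in for stacks & registers: changes to it are never counted.
    final List<Obj> roots = new ArrayList<>();
    List<Obj> incBuffer = new ArrayList<>();        // IB of the current epoch
    List<Obj> decBuffer = new ArrayList<>();        // DB of the current epoch
    List<Obj> prevDecBuffer = new ArrayList<>();    // DB of the previous epoch
    List<Obj> prevRootBuffer = new ArrayList<>();   // RB of the previous epoch

    // Heap pointer mutation: buffer the RCM instead of applying it.
    void heapWrite(Obj before, Obj after) {
        if (after != null) incBuffer.add(after);
        if (before != null) decBuffer.add(before);
    }

    void collect() {
        // 1. Build the RB and apply temporary increments to root referents.
        List<Obj> rootBuffer = new ArrayList<>(roots);
        for (Obj o : rootBuffer) o.rc++;
        // 2. Process the IB of this epoch.
        for (Obj o : incBuffer) o.rc++;
        // 3. Undo the temporary increments recorded in the previous RB.
        for (Obj o : prevRootBuffer) decrement(o);
        // 4. Process the DB of the previous epoch.
        for (Obj o : prevDecBuffer) decrement(o);
        // Rotate buffers so this epoch's RB and DB are handled one epoch later.
        prevRootBuffer = rootBuffer;
        prevDecBuffer = decBuffer;
        incBuffer = new ArrayList<>();
        decBuffer = new ArrayList<>();
    }

    // In this sketch an object is recycled whenever a decrement reaches zero.
    private void decrement(Obj o) {
        if (--o.rc == 0) o.recycled = true;
    }
}
```

Because increments are applied before any decrement, an object that is still referenced from a root never passes through a count of zero during a collection.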

14 Classic Deferral (Bacon et al.)
At GC time, RCR(p) for root pointers applies temporary increments.
[Diagram: stacks & registers, objects a, b, c with reference counts in the RC space, and a and b in the dec buf and root buf]

15 Classic Deferral (Bacon et al.)
At next GC, apply decrements
[Animation frame: same diagram labels as slide 14]

16 Classic Deferral (Bacon et al.)
Key: Efficient enumeration of deferred pointers
At next GC, apply decrements
[Animation frame: same diagram labels as slide 14]

17 Classic Deferral (Bacon et al.)
Better, but not good enough!
[Animation frame: stacks & registers, objects a, b, c in the RC space, dec buf and root buf]

18 Ulterior Reference Counting
Idea: Extend deferral to select heap pointers
  e.g. all pointers within nursery objects
Deferral is not a fixed property of p
  e.g. a nursery object gets promoted
Integrate event I(p): changes p from deferred to not deferred

19 BG-RC: Bounded Nursery Generational - RC
Heap organization
  Bounded copying nursery: ignore mutations to nursery pointer fields
  RC old space: object remembering, coalescing, buffering
Collection
  Process roots
  Nursery phase promotes live p to old space and applies I(p)
  RC phase processes object buffer, dec buffer

20 View of heap in Ulterior RC
[Diagram: stacks and registers, RC space and non-RC space; objects a, b, d, e carry reference counts, objects r, s, t do not; pointers are labeled "defer" or "remember"]
How can we efficiently
  Enumerate all deferred pointer fields?
  Remember old to young pointers?

21 Bringing it Together
Deferral: defer nursery & roots
  Perform I(p) on nursery promotion
  Piggyback on copying nursery collection
Coalescing: remember mutated RC objects
  Upon first mutation, dec each referent
  At GC time, inc each referent
  Piggyback remset onto this mechanism

22 BG-RC Write Barrier
private void writeBarrier(VM_Address srcObj,
                          VM_Address srcSlot,
                          VM_Address tgtObj)
    throws VM_PragmaInline {
  if (getLogState(srcObj) != LOGGED)   // unsync check for uniqueness
    writeBarrierSlow(srcObj);
  VM_Magic.setMemoryAddress(srcSlot, tgtObj);
}
private void writeBarrierSlow(VM_Address srcObj)
    throws VM_PragmaNoInline {
  if (attemptToLog(srcObj)) {
    modifiedBuffer.push(srcObj);
    enumeratePointersToDecBuffer(srcObj);  // trade-off for sparsely
    setLogState(srcObj, LOGGED);           // modified objects
  }
}
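The barrier above depends on Jikes RVM internals, so it cannot run on its own. The following is a self-contained toy model of the same object-logging idea, assuming a single thread (so the synchronized `attemptToLog` step is dropped) and no nursery. The `Obj` class, the `logged` flag and the `collect` method are hypothetical stand-ins, not the actual implementation.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy, single-threaded model of an object-logging write barrier.
class ObjectLoggingBarrierSketch {
    static final class Obj {
        final String name;
        final Obj[] fields;
        int rc;
        boolean logged;   // stands in for the log state kept in the object header
        Obj(String name, int nFields) {
            this.name = name;
            this.fields = new Obj[nFields];
        }
    }

    final Deque<Obj> modifiedBuffer = new ArrayDeque<>();
    final Deque<Obj> decBuffer = new ArrayDeque<>();

    // Fast path: one check, then the store itself.
    void writeBarrier(Obj src, int slot, Obj tgt) {
        if (!src.logged) writeBarrierSlow(src);
        src.fields[slot] = tgt;
    }

    // Slow path: taken on the first mutation of an object since the last GC.
    void writeBarrierSlow(Obj src) {
        modifiedBuffer.push(src);
        for (Obj referent : src.fields) {
            // Upon first mutation, buffer a dec for each current referent.
            if (referent != null) decBuffer.push(referent);
        }
        src.logged = true;
    }

    // At GC time: inc each referent of every logged object, then apply the decs.
    void collect() {
        while (!modifiedBuffer.isEmpty()) {
            Obj src = modifiedBuffer.pop();
            for (Obj referent : src.fields) {
                if (referent != null) referent.rc++;
            }
            src.logged = false;
        }
        while (!decBuffer.isEmpty()) decBuffer.pop().rc--;
    }
}
```

Logging whole objects rather than individual slots keeps the fast path to a single test, at the price of enumerating every field of an object that may have had only one field changed, which is the trade-off the slide comment mentions for sparsely modified objects.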

23 BG-RC Mutation Phase
[Heap diagram: stacks and registers, objects a, b, d, e with reference counts, RC space and non-RC space, and the obj buf, dec buf, and root buf]

24 BG-RC Mutation Phase
[Animation frame: b, d, e are entered in the buffers]

25 BG-RC Mutation Phase
[Animation frame: same labels as slide 24]

26 BG-RC Mutation Phase
[Animation frame: object r appears]

27 BG-RC Mutation Phase
[Animation frame: object s appears]

28 BG-RC Mutation Phase
[Animation frame: object t appears]

29 BG-RC Mutation Phase
[Animation frame: same labels as slide 28]

30 BG-RC Nursery Collection: Scan Roots
[Heap diagram: stacks and registers, objects a, b, d, e, r, s, t, RC space and non-RC space, and the obj buf, dec buf, and root buf]

31 BG-RC Nursery Collection: Scan Roots
[Animation frame: a second object labeled s appears, and s is entered in the buffers]

32 BG-RC Nursery Collection: Scan Roots
[Animation frame: a second object labeled t appears]

33 BG-RC Nursery Collection: Process Object Buffer
[Animation frame: a second object labeled r appears; reference counts are updated]

34 BG-RC Nursery Collection: Reclaim Nursery
[Animation frame: label "Reclaim"; the buffers hold d, b, e, s]

35 BG-RC RC Collection: Process Decrement Buffer
[Animation frame: objects a, b, r, s, d, e, t remain; the buffers hold d, b, e, s]

36 BG-RC RC Collection: Recursive Decrement
[Animation frame: label "free"; the buffers hold e, b, s]

37 BG-RC RC Collection: Process Decrement Buffer
[Animation frame: objects a, b, r, s, e, t remain; the buffers hold e, b, s]

38 BG-RC Collection Complete!
[Animation frame: objects a, b, r, s, e, t remain; the buffers hold b and s]

39 Controlling Pause Times
Modest bounded nursery size
Meta data
  Decrement and modified object buffers
  Trigger a collection if too big
RC time cap
  Limits time recursively decrementing RC objects & in cycle detection
Cycles: pure RC is incomplete
  Use Bacon/Rajan trial deletion algorithm
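One way to picture the RC time cap is a decrement loop that stops once a deadline passes. The sketch below is an assumption-laden illustration: the class, the injected clock, and the choice to leave unfinished decrements in the buffer for a later call are all hypothetical, and cycle detection is not modeled.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.LongSupplier;

// Toy model of time-capped decrement processing; names are hypothetical.
class TimeCapSketch {
    static final class Obj {
        final Obj[] fields;
        int rc;
        boolean freed;
        Obj(int nFields) { this.fields = new Obj[nFields]; }
    }

    final Deque<Obj> decBuffer = new ArrayDeque<>();

    // Applies pending decrements until the buffer is empty or the cap expires.
    // Returns true if all pending work was finished.
    boolean processDecrements(LongSupplier clockMillis, long capMillis) {
        long deadline = clockMillis.getAsLong() + capMillis;
        while (!decBuffer.isEmpty()) {
            // Assumption: leftover work simply stays buffered for a later call.
            if (clockMillis.getAsLong() >= deadline) return false;
            Obj o = decBuffer.pop();
            if (--o.rc == 0) {
                o.freed = true;
                // Recursive decrement, expressed through the buffer.
                for (Obj child : o.fields) {
                    if (child != null) decBuffer.push(child);
                }
            }
        }
        return true;
    }
}
```

With a real clock the call might look like `processDecrements(System::currentTimeMillis, 60)`; passing the clock in as a parameter is only there to make the cap easy to exercise deterministically.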

40 Experimental evaluation
Jikes RVM with MMTk
Compare MS, BG-MS, BG-RC, RC
Examine various heap sizes
Collection triggers
  Each 4 MB of allocation for BG-RC (1 MB for RC)
  Time cap of 60 ms
  Cycle detection at 512 KB

41 Throughput/Pause time Moderate Heap Size
Columns: MS, BG-MS, BG-RC, RC, each with norm time and max pause
[Table flattened in extraction: each row lists its values in the extracted order with the benchmark name last, and some cells are missing]
175 1.53 53 0.98 210 1.00 214 1.23 mean
121 1.14 43 0.96 178 185 1.05 mpeg
1.11 59 1.01 244 238 db
297 1.33 281 264 pjbb
72 0.93 68 0.88 160 0.98 compress
130 1.75 49 1.04 180 241 1.29 mtrt
133 1.71 1.03 184 203 1.31 raytrace
1.66 44 0.94 1.52 jack
580 1.78 285 268 javac
131 2.36 0.99 181 182 1.91 jess

42 Throughput & Responsiveness

43 Conclusion
The Ulterior design is based on a careful study of object demographics and on making the collector aware of them
Extends deferred RC to heap objects
Shows in practice that high throughput & low pause times are compatible

