Mark and Split Kostis Sagonas Uppsala Univ., Sweden NTUA, Greece Jesper Wilhelmsson Uppsala Univ., Sweden.


1 Mark and Split
Kostis Sagonas (Uppsala Univ., Sweden; NTUA, Greece)
Jesper Wilhelmsson (Uppsala Univ., Sweden)

2 Copying vs. Mark-Sweep

Copying Collection
+ GC time proportional to the size of the live data set
- requires non-negligible additional space
  moves objects
  compacts the heap

Mark-Sweep Collection
- GC time proportional to the size of the collected heap
+ requires relatively little additional space
  non-moving collector
  may require compaction

3 Mark-Sweep Collection Algorithm

proc mark_sweep_gc()
    foreach root ∈ rootset do
        mark(*root)
    sweep()

proc mark(object)
    if marked(object) = false then
        marked(object) := true
        foreach pointer in object do
            mark(*pointer)
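The pseudocode above can be tried out on a toy heap. The following is only a minimal sketch: the heap model (a dict from address to a `(size, pointers)` pair), the `marked` set standing in for mark bits, and the explicit stack in place of recursion are all assumptions made for illustration, not part of the slides.

```python
def mark(heap, addr, marked):
    """Mark every object reachable from addr (explicit stack, no recursion)."""
    stack = [addr]
    while stack:
        a = stack.pop()
        if a not in marked:
            marked.add(a)
            _size, pointers = heap[a]
            stack.extend(pointers)


def sweep(heap, marked):
    """Visit every object in address order; return free chunks as (start, end)."""
    free = []
    for a in sorted(heap):
        if a in marked:
            continue
        end = a + heap[a][0]
        if free and free[-1][1] == a:
            free[-1] = (free[-1][0], end)  # coalesce with the previous free chunk
        else:
            free.append((a, end))
    for a in [a for a in heap if a not in marked]:
        del heap[a]  # dead objects disappear from the toy heap
    return free


def mark_sweep_gc(heap, roots):
    marked = set()
    for root in roots:
        mark(heap, root, marked)
    return sweep(heap, marked)


# Hypothetical 64-byte heap: object 0 points to 32; 16 and 48 are garbage.
heap = {0: (16, [32]), 16: (16, []), 32: (16, []), 48: (16, [16])}
print(mark_sweep_gc(heap, roots=[0]))
```

Note that `sweep` touches every object, live or dead, which is where the heap-proportional cost of mark-sweep comes from.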

4 Variants of Mark-Sweep

Lazy sweeping [Hughes 1982; Boehm 2000]
– Defer the sweep phase until allocation time and then perform it in a demand-driven (“pay-as-you-go”) way
– Improves paging and/or cache behavior

Selective sweeping [Chung, Moon, Ebcioğlu, Sahlin]
– During marking, record the addresses of all marked objects in an array (outside the heap)
– Once marking is finished, sort these addresses
– Perform the sweep phase selectively, guided by the sorted addresses
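The selective sweeping steps can be sketched in a few lines. This is an illustration under assumptions: the function name `selective_sweep` is hypothetical, and each recorded entry carries the object's size next to its address, whereas a real collector would read the size from the object header.

```python
def selective_sweep(heap_start, heap_end, marked):
    """marked: (address, size) pairs recorded during marking, in any order.

    Returns the free areas as (start, end) pairs, visiting only live objects.
    """
    free = []
    cursor = heap_start
    for addr, size in sorted(marked):      # the O(L log L) sorting step
        if addr > cursor:
            free.append((cursor, addr))    # gap before this live object
        cursor = addr + size
    if cursor < heap_end:
        free.append((cursor, heap_end))    # tail of the heap
    return free


print(selective_sweep(0, 100, [(40, 10), (0, 10), (50, 10)]))
```

Dead objects are never touched here; the cost depends only on the number of marked objects.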

5 Mark-Split Collection: Idea

Rather than (lazily/selectively) sweeping the heap after marking to locate free areas, maintain information about them during marking.

More specifically, optimistically assume that the entire heap will be free after collection and let the mark phase “repair” the free list by “rescuing” the memory of live objects.

6 Mark-Split Collection: Illustration

Heap to be collected: one free interval
Marking splits a free interval: two free intervals
Marking splits another free interval: three free intervals
Marking does not always increase the number of free intervals: three free intervals
Marking can actually decrease the number of free intervals: two free intervals

7 Mark-Split Collection: Algorithm (1)

Mark-sweep, for comparison:

proc mark_sweep_gc()
    foreach root ∈ rootset do
        mark(*root)
    sweep()

proc mark(object)
    if marked(object) = false then
        marked(object) := true
        foreach pointer in object do
            mark(*pointer)

Mark-split:

proc mark_split_gc()
    insert_interval(heap_start, heap_end)
    foreach root ∈ rootset do
        mark(*root)

proc mark(object)
    if marked(object) = false then
        marked(object) := true
        split(find_interval(&object), object)
        foreach pointer in object do
            mark(*pointer)

8 Mark-Split Collection: Algorithm (2)

proc split(interval, object)
    objectEnd := &object + size(object)
    keepLeft  := keep_interval(&object - interval.start)
    keepRight := keep_interval(interval.end - objectEnd)
    if keepLeft ∧ keepRight then
        insert_interval(objectEnd, interval.end)    // Case 1
        interval.end := &object
    else if keepLeft then
        interval.end := &object                     // Case 2
    else if keepRight then
        interval.start := objectEnd                 // Case 3
    else
        remove_interval(interval.end)               // Case 4

funct keep_interval(size)
    return size ≥ T    // T is a threshold
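The two algorithm slides can be combined into a small runnable sketch. Several things here are assumptions for illustration only: the class name `FreeIntervals`, the two parallel sorted lists searched with `bisect` (the slides use a tree), the toy heap as a dict from address to `(size, pointers)`, and the rule that an object not found in any interval (possible once small gaps have been dropped by the threshold) is simply skipped.

```python
import bisect


class FreeIntervals:
    """Free intervals [start, end) kept sorted by start address."""

    def __init__(self, threshold=1):
        self.starts, self.ends, self.T = [], [], threshold

    def insert_interval(self, start, end):
        i = bisect.bisect_left(self.starts, start)
        self.starts.insert(i, start)
        self.ends.insert(i, end)

    def find_interval(self, addr):
        """Index of the interval containing addr, or None."""
        i = bisect.bisect_right(self.starts, addr) - 1
        return i if i >= 0 and addr < self.ends[i] else None

    def keep_interval(self, size):
        return size >= self.T

    def split(self, i, obj_start, obj_size):
        obj_end = obj_start + obj_size
        keep_left = self.keep_interval(obj_start - self.starts[i])
        keep_right = self.keep_interval(self.ends[i] - obj_end)
        if keep_left and keep_right:              # Case 1: one interval becomes two
            self.insert_interval(obj_end, self.ends[i])
            self.ends[i] = obj_start
        elif keep_left:                           # Case 2: trim on the right
            self.ends[i] = obj_start
        elif keep_right:                          # Case 3: trim on the left
            self.starts[i] = obj_end
        else:                                     # Case 4: interval disappears
            del self.starts[i], self.ends[i]


def mark_split_gc(heap, roots, heap_start, heap_end, threshold=1):
    free = FreeIntervals(threshold)
    free.insert_interval(heap_start, heap_end)    # optimistic: everything is free
    marked, stack = set(), list(roots)
    while stack:
        addr = stack.pop()
        if addr in marked:
            continue
        marked.add(addr)
        size, pointers = heap[addr]
        i = free.find_interval(addr)
        if i is not None:                         # None: object lies in a dropped gap
            free.split(i, addr, size)
        stack.extend(pointers)
    return list(zip(free.starts, free.ends))


# Hypothetical 100-byte heap: object 0 points to 48; the rest is garbage.
heap = {0: (16, [48]), 16: (16, []), 32: (16, []), 48: (16, []), 64: (36, [])}
print(mark_split_gc(heap, [0], 0, 100))
print(mark_split_gc(heap, [0], 0, 100, threshold=40))
```

With the default threshold of 1 every non-empty gap is kept; with a threshold of 40 both gaps around the object at address 48 are too small, so Case 4 removes the last interval.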

9 Mark-Split Collection: Data Structure

For storing the free intervals we need a data structure that allows for:
– Fast location of an interval (find_interval)
– Fast insertion of new intervals (insert_interval)

Data structures with these properties are:
– Balanced search trees
– Splay trees
– Skip lists
– …

In our implementation we used the AA tree [Andersson 1993]

10 Mark-Split Collection: Best Cases

When nothing is live
When marking is consecutive
When live data set is a small percentage of the heap

11 Mark-Split Collection: Worst Case

Note:
- the number of free intervals is at most #L + 1
- this number will start decreasing once L ≥ H/2

12 Time Complexity

Copying               O(L)
Mark-sweep            O(L) + O(H)
Selective sweeping    O(L) + O(L log L) + O(L)
Mark-split            O(L log I)

where:
L = size of live data set
H = size of heap
I = number of free intervals

Note:
1. I ≤ L ≤ H
2. I is bounded by
       #L + 1    if L < H/2
       H/(2o)    if L ≥ H/2
   where o = size of smallest object
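The bound on I in note 2 can be checked on small layouts. The sketch below is an illustration under assumptions: both function names are hypothetical, #L is read as the number of live objects, and every gap between live objects is counted as a free interval (that is, no threshold T is applied).

```python
def count_free_intervals(heap_size, live):
    """live: (start, size) pairs of live objects. Counts maximal free runs."""
    count, cursor = 0, 0
    for start, size in sorted(live):
        if start > cursor:
            count += 1
        cursor = start + size
    if cursor < heap_size:
        count += 1
    return count


def interval_bound(num_live, live_size, heap_size, smallest):
    """Bound on I from note 2 (smallest = o, the size of the smallest object)."""
    if live_size < heap_size / 2:
        return num_live + 1
    return heap_size // (2 * smallest)


H, o = 64, 8
alternating = [(8, 8), (24, 8), (40, 8), (56, 8)]   # every other smallest object is live
sparse = [(8, 8)]                                   # a single live object
for live in (alternating, sparse):
    L = sum(size for _, size in live)
    print(count_free_intervals(H, live), interval_bound(len(live), L, H, o))
```

The alternating layout is the worst case: half the heap is live and every live object is surrounded by free space, so the count reaches H/(2o).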

13 Space Requirements

                      Best      Worst
Copying               L         H
Mark-sweep            M         M
Selective sweeping    M + #L    M + #H
Mark-split            M + k     M + k(H/2o)

where:
L = size of live data set
H = size of heap
o = size of smallest object
k = size of interval node
M = size of mark bit area

14 Mark-Split vs. Selective Sweeping

Mark-coalesce (the dual of mark-split)
– Maintains information about occupied intervals
– Can be seen as a variant of selective sweeping that eagerly merges neighboring marked intervals
– Requires an extra pass at the end of collection to construct the free intervals list

Assume marking is consecutive

Mark-split requires significantly less auxiliary space than selective sweeping

15 Mark-Split vs. Lazy Sweeping

Lazy sweeping does not affect the complexity of collection

But often improves the cache performance of applications run with GC because
– It avoids (some) negative caching effects
    Sweep phase disturbs the cache
– Compared with “plain” mark-sweep, it has positive caching effects
    Memory to allocate to is typically in the cache during object initialization

16 Adaptive Schemes

Basic idea is simple:
– Optimistically start with mark-split
– If it is detected that the cost will be too high, revert to mark-sweep

Criteria for switching:
– Auxiliary space is exhausted
– Number of tree nodes visited is too big
– Keep a record of prior history (last N collections)
– …

Note that no single mark-split collection that reverts to mark-sweep can be faster than a mark-sweep only collection, but a sequence of adaptive collections can!
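One of the switching criteria, exhausted auxiliary space, can be sketched as follows. Everything specific here is an assumption for illustration: the function name `adaptive_gc`, the `max_intervals` budget standing in for the auxiliary space limit, the toy heap (dict from address to `(size, pointers)`), and the absence of a threshold T (every non-empty gap is kept).

```python
import bisect


def adaptive_gc(heap, roots, heap_size, max_intervals):
    """Start with mark-split; revert to a sweep if too many intervals are needed.

    Returns (free_list, strategy_used).
    """
    starts, ends = [0], [heap_size]        # optimistic: the whole heap is free
    splitting = True
    marked, stack = set(), list(roots)
    while stack:
        addr = stack.pop()
        if addr in marked:
            continue
        marked.add(addr)
        size, pointers = heap[addr]
        stack.extend(pointers)
        if not splitting:
            continue                       # plain marking after the switch
        i = bisect.bisect_right(starts, addr) - 1
        obj_end = addr + size
        left, right = addr > starts[i], ends[i] > obj_end
        if left and right:                 # Case 1
            starts.insert(i + 1, obj_end)
            ends.insert(i + 1, ends[i])
            ends[i] = addr
        elif left:                         # Case 2
            ends[i] = addr
        elif right:                        # Case 3
            starts[i] = obj_end
        else:                              # Case 4
            del starts[i], ends[i]
        if len(starts) > max_intervals:
            splitting = False              # budget exhausted: give up on splitting
    if splitting:
        return list(zip(starts, ends)), "mark-split"
    # Fallback sweep: visits every object in the heap, live or dead.
    free, cursor = [], 0
    for addr in sorted(heap):
        if addr in marked:
            if addr > cursor:
                free.append((cursor, addr))
            cursor = addr + heap[addr][0]
    if cursor < heap_size:
        free.append((cursor, heap_size))
    return free, "mark-sweep"


# Hypothetical heap of eight 8-byte objects; object 0 points to 16, 32 and 48.
heap = {a: (8, []) for a in range(0, 64, 8)}
heap[0] = (8, [16, 32, 48])
print(adaptive_gc(heap, [0], 64, max_intervals=8))
print(adaptive_gc(heap, [0], 64, max_intervals=2))
```

Both runs produce the same free list; the second one pays for the abandoned interval work on top of the sweep, which matches the slide's remark that a single reverted collection cannot beat a pure mark-sweep collection.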

17 Implementation

Done in BEA’s JRockit
– Mark-sweep collector has existed for quite a long time
– Sweeps the heap by examining whole words of the bitmap array

Mark-split’s code is about 600 lines of C
– The threshold T is set at 2KB (because of TLA)

Benchmarking environment:
– 4-processor Intel Xeon 2GHz with hyper-threading
– 512KB of cache, 8GB of RAM, running Linux
– SPECjvm98 benchmarks run for 50 iterations

18 Performance Evaluation on SPECjvm98: compress

19 Performance Evaluation on SPECjvm98: jess

20 Performance Evaluation on SPECjvm98: db, javac, mtrt, jack

21 Performance Evaluation on SPECjvm98: compress

22 Performance Evaluation on SPECjvm98: jess

23 Performance Evaluation on SPECjvm98: db, javac, mtrt, jack

24 SPECjvm98 – GC times on a 128MB heap

25 SPECjvm98 – GC times on a 512MB heap

26 SPECjvm98 – GC times on a 2GB heap

27 Other Measurements (on SPECjvm98)

28 Performance Evaluation on SPECjbb

29 Concluding Remarks on Mark-Split

New non-moving garbage collection algorithm:
– Based on a simple idea: maintaining free intervals during marking, rather than sweeping the heap to find them
– Makes GC cost proportional to the size of the live data set, not the size of the heap that is collected
– Requires very small additional space
– Exploits the fact that in most programs live data tends to form (large) neighborhoods

