Presentation is loading. Please wait.

Presentation is loading. Please wait.

380C Lecture 17 Where are we & where we are going –Managed languages Dynamic compilation Inlining Garbage collection –Why you need to care about workloads.

Similar presentations


Presentation on theme: "380C Lecture 17 Where are we & where we are going –Managed languages Dynamic compilation Inlining Garbage collection –Why you need to care about workloads."— Presentation transcript:

1 380C Lecture 17 Where are we & where we are going –Managed languages Dynamic compilation Inlining Garbage collection –Why you need to care about workloads & experimental methodology –Alias analysis –Dependence analysis –Loop transformations –EDGE architectures 1

2 2 Today Garbage Collection –Why use garbage collection? –What is garbage? Reachable vs live, stack maps, etc. –Allocators and their collection mechanisms Semispace Marksweep Performance comparisons –Incremental age based collection Write barriers: Friend or foe? Generational Beltway –More performance

3 3 Basic VM Structure Executing Program Program/Bytecode Dynamic Compilation Subsystem Class Loader Verifier, etc. Heap Thread Scheduler Garbage Collector

4 4 True or False? Real programmers use languages with explicit memory management. –I can optimize my memory management much better than any garbage collector

5 5 True or False? Real programmers use languages with explicit memory management. –I can optimize my memory management much better than any garbage collector –Scope of effort?

6 6 Why Use Garbage Collection? Software engineering benefits –Less user code compared to explict memory management (MM) –Less user code to get correct –Protects against some classes of memory errors No free(), thus no premature free(), no double free(), or forgetting to free() Not perfect, memory can still leak –Programmers still need to eliminate all pointers to objects the program no longer needs Performance: space time tradeoff –Time proportional to dead objects (explicit mm, reference counting) or live objects (semispace, marksweep) –Throughput versus pause time Less frequent collection, typically reduces total time but increases space requirements and pause times –Hidden locality benefits?

7 7 What is Garbage? In theory, any object the program will never reference again –But compiler & runtime system cannot figure that out In practice, any object the program cannot reach is garbage –Approximate liveness with reachability Managed languages couple GC with “safe” pointers –Programs may not access arbitrary addresses in memory –The compiler can identify and provide to the garbage collector all the pointers, thus –“Once garbage, always garbage” –Runtime system can move objects by updating pointers –“Unsafe” languages can do non-moving GC by assuming anything that looks like a pointer is one.

8 8 Reachability stackglobalsregisters heap A B C {.... r0 = obj PC -> p.f = obj.... Compiler produces a stack-map at GC safe-points and Type Information Blocks GC safe points: new(), method entry, method exit, & back- edges (thread switch points) Stack-map: enumerate global variables, stack variables, live registers -- This code is hard to get right! Why? Type Information Blocks: identify reference fields in objects

9 9 Reachability stackglobalsregisters heap A B C {.... r0 = obj PC -> p.f = obj.... Compiler produces a stack-map at GC safe-points and Type Information Blocks Type Information Blocks: identify reference fields in objects for each type i (class) in the program, a map 302 TIB i

10 10 Reachability Tracing collector (semispace, marksweep) –Marks the objects reachable from the roots live, and then performs a transitive closure over them stackglobalsregisters heap A B C {.... r0 = obj PC -> p.f = obj.... mark

11 11 Reachability Tracing collector (semispace, marksweep) –Marks the objects reachable from the roots live, and then performs a transitive closure over them stackglobalsregisters heap A B C {.... r0 = obj PC -> p.f = obj.... mark

12 12 Reachability Tracing collector (semispace, marksweep) –Marks the objects reachable from the roots live, and then performs a transitive closure over them stackglobalsregisters heap A B C {.... r0 = obj PC -> p.f = obj.... mark

13 13 Reachability Tracing collector (semispace, marksweep) –Marks the objects reachable from the roots live, and then performs a transitive closure over them All unmarked objects are dead, and can be reclaimed stackglobalsregisters heap A B C {.... r0 = obj PC -> p.f = obj.... mark

14 14 Reachability Tracing collector (semispace, marksweep) –Marks the objects reachable from the roots live, and then performs a transitive closure over them All unmarked objects are dead, and can be reclaimed stackglobalsregisters heap A B C {.... r0 = obj PC -> p.f = obj.... sweep

15 15 Today Garbage Collection –Why use garbage collection? –What is garbage? Reachable vs live, stack maps, etc. –Allocators and their collection mechanisms Semispace Marksweep Performance comparisons –Incremental age based collection Write barriers: Friend or foe? Generational Beltway –More performance

16 16 Semispace Fast bump pointer allocation Requires copying collection Cannot incrementally reclaim memory, must free en masse Reserves 1/2 the heap to copy in to, in case all objects are live heap to spacefrom space

17 17 Semispace Fast bump pointer allocation Requires copying collection Cannot incrementally reclaim memory, must free en masse Reserves 1/2 the heap to copy in to, in case all objects are live heap to spacefrom space

18 18 Semispace Fast bump pointer allocation Requires copying collection Cannot incrementally reclaim memory, must free en masse Reserves 1/2 the heap to copy in to, in case all objects are live heap to spacefrom space

19 19 Semispace Fast bump pointer allocation Requires copying collection Cannot incrementally reclaim memory, must free en masse Reserves 1/2 the heap to copy in to, in case all objects are live heap to spacefrom space

20 20 Semispace Mark phase: –copies object when collector first encounters it –installs forwarding pointers heap from spaceto space

21 21 Semispace Mark phase: –copies object when collector first encounters it –installs forwarding pointers –performs transitive closure, updating pointers as it goes heap from spaceto space

22 22 Semispace Mark phase: –copies object when collector first encounters it –installs forwarding pointers –performs transitive closure, updating pointers as it goes heap from spaceto space

23 23 Semispace Mark phase: –copies object when collector first encounters it –installs forwarding pointers –performs transitive closure, updating pointers as it goes heap from spaceto space

24 24 Semispace Mark phase: –copies object when collector first encounters it –installs forwarding pointers –performs transitive closure, updating pointers as it goes –reclaims “from space” en masse heap from spaceto space

25 25 Semispace Mark phase: –copies object when collector first encounters it –installs forwarding pointers –performs transitive closure, updating pointers as it goes –reclaims “from space” en masse –start allocating again into “to space” heap from spaceto space

26 26 Semispace Mark phase: –copies object when collector first encounters it –installs forwarding pointers –performs transitive closure, updating pointers as it goes –reclaims “from space” en masse –start allocating again into “to space” heap from spaceto space

27 27 Semispace Notice: 4fast allocation 4locality of contemporaneously allocated objects 4locality of objects connected by pointers 8wasted space heap from spaceto space

28 28 Marksweep Free-lists organized by size –blocks of same size, or –individual objects of same size Most objects are small < 128 bytes 4 8 12 16... 128 free lists... heap

29 29 Marksweep Allocation –Grab a free object off the free list 4 8 12 16... 128 free lists... heap

30 30 Marksweep Allocation –Grab a free object off the free list 4 8 12 16... 128 free lists... heap

31 31 Marksweep Allocation –Grab a free object off the free list 4 8 12 16... 128 free lists... heap

32 32 Marksweep Allocation –Grab a free object off the free list –No more memory of the right size triggers a collection –Mark phase - find the live objects –Sweep phase - put free ones on the free list 4 8 12 16... 128 free lists... heap

33 33 Marksweep Mark phase –Transitive closure marking all the live objects Sweep phase –sweep the memory for free objects populating free list 4 8 12 16... 128 free lists... heap

34 34 Marksweep Mark phase –Transitive closure marking all the live objects Sweep phase –sweep the memory for free objects populating free list 4 8 12 16... 128 free lists... heap

35 35 Marksweep Mark phase –Transitive closure marking all the live objects Sweep phase –sweep the memory for free objects populating free list 4 8 12 16... 128 free lists... heap

36 36 Marksweep Mark phase –Transitive closure marking all the live objects Sweep phase –sweep the memory for free objects populating free list –can be made incremental by organizing the heap in blocks and sweeping one block at a time on demand 4 8 12 16... 128 free lists... heap

37 37 Marksweep 4space efficiency 4Incremental object reclamation 8relatively slower allocation time 8poor locality of contemporaneously allocated objects 4 8 12 16... 128 free lists... heap

38 38 How do these differences play out in practice? Marksweep 4space efficiency 4Incremental object reclamation 8relatively slower allocation time 8poor locality of contemporaneously allocated objects Semispace 4fast allocation 4locality of contemporaneously allocated objects 4locality of objects connected by pointers 8wasted space

39 39 Methodology [SIGMETRICS 2004] Compare Marksweep (MS) and Semispace (SS) Mutator time, GC time, total time Jikes RVM & MMTk replay compilation measure second iteration without compilation Platforms 1.6GHz G5 (PowerPC 970) 1.9GHz AMD Athlon 2600+ 2.6GHz Intel P4 Linux 2.6.0 with perfctr patch & libraries – Separate accounting of GC & Mutator counts SPECjvm98 & pseudojbb

40 40 Allocation Mechanism Bump pointer – ~70 bytes IA32 instructions, 726MB/s Free list – ~140 bytes IA32 instructions, 654MB/s Bump pointer 11% faster in tight loop – < 1% in practical setting – No significant difference (?)

41 41 Mutator Time

42 42 jess

43 43 jess

44 44 jess

45 45 jess

46 46 javac

47 47 pseudojbb

48 48 Geometric Mean Mutator Time

49 49 Garbage Collection Time

50 50 Garbage Collection Time Geometric mean pseudojbb jess javac

51 51 Total Time

52 52 Total Time pseudojbb Geometric mean jess javac

53 53 MS/SS Crossover: 1.6GHz PPC

54 54 MS/SS Crossover: 1.9GHz AMD

55 55 MS/SS Crossover: 2.6GHz P4

56 56 MS/SS Crossover: 3.2GHz P4

57 57 MS/SS Crossover 2.6GHz 1.9GHz 1.6GHz localityspace 3.2GHz

58 58 Today Garbage Collection –Why use garbage collection? –What is garbage? Reachable vs live, stack maps, etc. –Allocators and their collection mechanisms Semispace Marksweep Performance comparisons –Incremental age based collection Enabling mechanisms –write barrier & remembered sets Heap organizations –Generational –Beltway –Performance comparisons

59 59 One Big Heap? Pause times –it takes to long to trace the whole heap at once Throughput –the heap contains lots of long lived objects, why collect them over and over again? Incremental collection –divide up the heap into increments and collect one at a time. Increment 1 Increment 2 to spacefrom spaceto spacefrom space

60 60 Incremental Collection Ideally perfect pointer knowledge of live pointers between increments requires scanning whole heap, defeats the purpose to spacefrom spaceto spacefrom space Increment 1 Increment 2

61 61 Incremental Collection Ideally perfect pointer knowledge of live pointers between increments requires scanning whole heap, defeats the purpose to spacefrom spaceto spacefrom space Increment 1 Increment 2

62 62 Incremental Collection to spacefrom spaceto spacefrom space Increment 1 Increment 2 Ideally perfect pointer knowledge of live pointers between increments requires scanning whole heap, defeats the purpose

63 63 Incremental Collection Ideally perfect pointer knowledge of live pointers between increments requires scanning whole heap, defeats the purpose Mechanism: Write barrier records pointers between increments when the mutator installs them, conservative approximation of reachability to spacefrom spaceto spacefrom space Increment 1 Increment 2

64 64 Write barrier compiler inserts code that records pointers between increments when the mutator installs them // original program // compiler support for incremental collection p.f = o; if (incr(p) != incr(o) { remembered set (incr(o)) U p.f; } p.f = o; to spacefrom spaceto spacefrom space Increment 1 Increment 2 remset 1 ={w}remset 2 ={f,g} a b c d e f gt u v w x y z

65 65 Write barrier Install new pointer d -> v // original program // compiler support for incremental collection p.f = o; if (incr(p) != incr(o) { remembered set (incr(o)) U p.f; } p.f = o; to spacefrom spaceto spacefrom space Increment 1 Increment 2 remset 1 ={w}remset 2 ={f,g} a b c d e f gt u v w x y z

66 66 Write barrier Install new pointer d -> v, then update d-> y // original program // compiler support for incremental collection p.f = o; if (incr(p) != incr(o) { remembered set (incr(o)) = p.f; } p.f = o; to spacefrom spaceto spacefrom space Increment 1 Increment 2 remset 1 ={w}remset 2 ={f,g,d} a b c d e f gt u v w x y z

67 67 Write barrier Install new pointer d -> v, then update d-> y // original program // compiler support for incremental collection p.f = o; if (incr(p) != incr(o) { remembered set (incr(o)) = p.f; } p.f = o; to spacefrom spaceto spacefrom space Increment 1 Increment 2 remset 1 ={w}remset 2 ={f,g,d,d} a b c d e f gt u v w x y z

68 68 Write barrier At collection time collector re-examines all entries in the remset for the increment, treating them like roots Collect Increment 2 to spacefrom spaceto spacefrom space Increment 1 Increment 2 remset 1 ={w}remset 2 ={f,g,d,d} a b c d e f gt u v w x y z

69 69 Write barrier At collection time collector re-examines all entries in the remset for the increment, treating them like roots Collect Increment 2 to spacefrom spaceto spacefrom space Increment 1 Increment 2 remset 1 ={w}remset 2 ={f,g,d,d} a b c d e f gt u v w x y z

70 70 Summary of the costs of incremental collection write barrier to catch pointer stores crossing boundaries remsets to store crossing pointers processing remembered sets at collection time excess retention to spacefrom spaceto spacefrom space Increment 1 Increment 2 remset 1 ={w}remset 2 ={f,g,d,d} a b c d e f gt u v w x y z

71 71 Heap Organization What objects should we put where? Generational hypothesis –young objects die more quickly than older ones [Lieberman & Hewitt’83, Ungar’84] –most pointers are from younger to older objects [Appel’89, Zorn’90] ÜOrganize the heap in to young and old, collect young objects preferentially to space from space Young Old

72 72 Generational Heap Organization Divide the heap in to two spaces: young and old Allocate in to the young space When the young space fills up, –collect it, copying into the old space When the old space fills up –collect both spaces –Generalizing to m generations if space n < m fills up, collect n through n-1 to space from space Young Old

73 73 Generational Heap Organization Divide the heap in to two spaces: young and old Allocate in to the young space When the young space fills up, –collect it, copying into the old space When the old space fills up –collect both spaces –Generalizing to m generations if space n < m fills up, collect n through n-1 to space from space Young Old

74 74 Generational Heap Organization Divide the heap in to two spaces: young and old Allocate in to the young space When the young space fills up, –collect it, copying into the old space When the old space fills up –collect both spaces –Generalizing to m generations if space n < m fills up, collect n through n-1 to space from space Young Old

75 75 Generational Heap Organization Divide the heap in to two spaces: young and old Allocate in to the young space When the young space fills up, –collect it, copying into the old space When the old space fills up –collect both spaces –Generalizing to m generations if space n < m fills up, collect n through n-1 to space from space Young Old

76 76 Generational Heap Organization Divide the heap in to two spaces: young and old Allocate in to the young space When the young space fills up, –collect it, copying into the old space When the old space fills up –collect both spaces –Generalizing to m generations if space n < m fills up, collect n through n-1 to space from space Young Old

77 77 Generational Heap Organization Divide the heap in to two spaces: young and old Allocate in to the young space When the young space fills up, –collect it, copying into the old space When the old space fills up –collect both spaces –Generalizing to m generations if space n < m fills up, collect n through n-1 to space from space Young Old

78 78 Generational Heap Organization Divide the heap in to two spaces: young and old Allocate in to the young space When the young space fills up, –collect it, copying into the old space When the old space fills up –collect both spaces –Generalizing to m generations if space n < m fills up, collect n through n-1 to space from space Young Old

79 79 Generational Heap Organization Divide the heap in to two spaces: young and old Allocate in to the young space When the young space fills up, –collect it, copying into the old space When the old space fills up –collect both spaces –Generalizing to m generations if space n < m fills up, collect n through n-1 to space from space Young Old

80 80 Generational Heap Organization Divide the heap in to two spaces: young and old Allocate in to the young space When the young space fills up, –collect it, copying into the old space When the old space fills up –collect both spaces –Generalizing to m generations if space n < m fills up, collect n through n-1 to space from space Young Old

81 81 Generational Heap Organization Divide the heap in to two spaces: young and old Allocate in to the young space When the young space fills up, –collect it, copying into the old space When the old space fills up –collect both spaces –Generalizing to m generations if space n < m fills up, collect n through n-1 to space from space Young Old

82 82 Generational Heap Organization Divide the heap in to two spaces: young and old Allocate in to the young space When the young space fills up, –collect it, copying into the old space When the old space fills up –collect both spaces –Generalizing to m generations if space n < m fills up, collect n through n-1 to space from space Young Old

83 83 Generational Heap Organization Divide the heap in to two spaces: young and old Allocate in to the young space When the young space fills up, –collect it, copying into the old space When the old space fills up –collect both spaces –Generalizing to m generations if space n < m fills up, collect n through n-1 to spacefrom spaceto space Young Old

84 84 Generational Heap Organization Divide the heap in to two spaces: young and old Allocate in to the young space When the young space fills up, –collect it, copying into the old space When the old space fills up –collect both spaces - ignore remembered sets –Generalizing to m generations if space n < m fills up, collect n through n-1 to spacefrom spaceto space Young Old

85 85 Generational Heap Organization Divide the heap in to two spaces: young and old Allocate in to the young space When the young space fills up, –collect it, copying into the old space When the old space fills up –collect both spaces - ignore remembered sets –Generalizing to m generations if space n < m fills up, collect 1 through n-1 to spacefrom spaceto space Young Old

86 86 Generational Write Barrier Unidirectional barrier 4record only older to younger pointers 8no need to record younger to older pointers, since we never collect the old space independently most pointers are from younger to older objects [Appel’89, Zorn’90] track the barrier between young objects and old spaces to space from space Young Old   address barrier

87 87 Generational Write Barrier to space from space Young Old   unidirectional boundary barrier // original program // compiler support for incremental collection p.f = o; if (p > barrier && o < barrier) { remset nursery U p.f; } p.f = o;

88 88 Generational Write Barrier Unidirectional 4record only older to younger pointers 8no need to record younger to older pointers, since we never collect the old space independently –most pointers are from younger to older objects [Appel’89, Zorn’90] –most mutations are to young objects [Stefanovic et al.’99] to space from space Young Old  

89 89 Results

90 90 Garbage Collection Time

91 91 Mutator Time

92 92 Total Time

93 McKinley, UT Recap Copying improves locality Incrementality improves responsiveness Generational hypothesis –Young objects: Most very short lived Infant mortality: ~90% die young (within 4MB of alloc) –Old objects: most very long lived (bimodal) Mature morality: ~5% die each 4MB of new allocation Help from pointer mutations –In Java, pointers go in both directions, but older to younger pointers across many objects are rare less than 1% –Most mutations among young objects 92 to 98% of pointer mutations

94 380C Where are we & where we are going –Managed languages Dynamic compilation Inlining Garbage collection –Can we get mutator locality, space efficiency, and collector efficiency all in one collector? –Read: Blackburn and McKinley, Immix: A Mark-Region Garbage Collector with Space Efficiency, Fast Collection, and Mutator Performance, ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pp. 22-32, Tucson AZ, June 2008. –Why you need to care about workloads & methodology –Alias analysis –Dependence analysis –Loop transformations –EDGE architectures 94


Download ppt "380C Lecture 17 Where are we & where we are going –Managed languages Dynamic compilation Inlining Garbage collection –Why you need to care about workloads."

Similar presentations


Ads by Google