Copyright (c) 2004 Borys Bradel Myths and Realities: The Performance Impact of Garbage Collection Paper: Stephen M. Blackburn, Perry Cheng, and Kathryn.

Presentation transcript:

Copyright (c) 2004 Borys Bradel. Myths and Realities: The Performance Impact of Garbage Collection. Paper: Stephen M. Blackburn, Perry Cheng, and Kathryn S. McKinley. Presentation: Borys Bradel.

2 Introduction. Garbage collection makes life easier: an additional level of abstraction, no worries about memory management. How to make it efficient? What are the tradeoffs?

3 Outline: Terminology, Garbage Collectors, Benchmarks, Results, Conclusions.

4 Runtime Components. Mutator: the executing program; its time includes object allocation and, if necessary, a write barrier. Garbage collector: allocates memory in a simple manner, and periodically identifies unused memory and frees it.

5 Allocators. Contiguous: append new objects. Free-List: k size-segregated free-lists. Example: a: allocate 8 bytes; b: allocate 12 bytes; c: allocate 16 bytes; free b; d: allocate 16 bytes.

6 Contiguous (a: 8; b: 12; c: 16; free b; d: 16). [Figure: heap holds a]

7 Contiguous (a: 8; b: 12; c: 16; free b; d: 16). [Figure: heap holds a b]

8 Contiguous (a: 8; b: 12; c: 16; free b; d: 16). [Figure: heap holds a b c]

9 Contiguous (a: 8; b: 12; c: 16; free b; d: 16). [Figure: after free b, heap still holds a b c]

10 Contiguous (a: 8; b: 12; c: 16; free b; d: 16). [Figure: heap holds a b c d] Good spatial locality: sequential allocation creates sequential memory locations. No reclamation: quickly runs out of space.
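
The contiguous example on slides 6-10 can be sketched in a few lines of Python. This is only an illustration: the class name `BumpAllocator`, the 64-byte capacity, and the byte offsets are assumptions made for the sketch, not details from the slides.

```python
class BumpAllocator:
    """Contiguous allocation: every object is appended at the bump pointer."""

    def __init__(self, capacity):
        self.capacity = capacity   # hypothetical heap size in bytes
        self.top = 0               # offset of the next free byte
        self.objects = {}          # name -> (offset, size)

    def allocate(self, name, size):
        if self.top + size > self.capacity:
            raise MemoryError("heap exhausted: nothing is ever reclaimed")
        offset = self.top
        self.objects[name] = (offset, size)
        self.top += size
        return offset

    def free(self, name):
        # The bytes are not reused; the allocator only forgets the name.
        self.objects.pop(name)


heap = BumpAllocator(capacity=64)
offsets = {}
for name, size in [("a", 8), ("b", 12), ("c", 16)]:
    offsets[name] = heap.allocate(name, size)
heap.free("b")
offsets["d"] = heap.allocate("d", 16)
print(offsets, heap.top)
```

In this sketch d lands directly after c, which is the spatial locality the slide refers to, while the 12 bytes that b occupied stay unusable.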

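11 Free-List (a: 8; b: 12; c: 16; free b; d: 16). [Figure: 8-byte and 16-byte lists; allocated: a]

12 Free-List (a: 8; b: 12; c: 16; free b; d: 16). [Figure: 8-byte and 16-byte lists; allocated: a, b]

13 Free-List (a: 8; b: 12; c: 16; free b; d: 16). [Figure: 8-byte and 16-byte lists; allocated: a, b, c]

14 Free-List (a: 8; b: 12; c: 16; free b; d: 16). [Figure: 8-byte and 16-byte lists; allocated: a, c]

15 Free-List (a: 8; b: 12; c: 16; free b; d: 16). [Figure: 8-byte and 16-byte lists; allocated: a, c, d] Bad spatial locality: different queues. Reclamation. Some fragmentation.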
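
A size-segregated free list for the same request sequence might look like the following sketch. The class name, the number of cells per size class, and the addresses are assumptions chosen for illustration; the slides only name the 8-byte and 16-byte lists.

```python
from collections import deque


class SegregatedFreeList:
    """One free list per size class; a request takes a cell from the
    smallest class that fits it."""

    def __init__(self, size_classes, cells_per_class):
        self.size_classes = sorted(size_classes)
        self.free_cells = {}   # size class -> deque of free addresses
        self.owner = {}        # name -> (size class, address)
        address = 0
        for size_class in self.size_classes:
            cells = deque()
            for _ in range(cells_per_class):
                cells.append(address)
                address += size_class
            self.free_cells[size_class] = cells

    def allocate(self, name, size):
        for size_class in self.size_classes:
            if size <= size_class and self.free_cells[size_class]:
                address = self.free_cells[size_class].popleft()
                self.owner[name] = (size_class, address)
                return address
        raise MemoryError("no free cell large enough")

    def free(self, name):
        size_class, address = self.owner.pop(name)
        # A freed cell goes to the head of its list and is reused first.
        self.free_cells[size_class].appendleft(address)


heap = SegregatedFreeList(size_classes=[8, 16], cells_per_class=4)
addr = {}
for name, size in [("a", 8), ("b", 12), ("c", 16)]:
    addr[name] = heap.allocate(name, size)
heap.free("b")
addr["d"] = heap.allocate("d", 16)
print(addr)
```

Here b occupies a 16-byte cell although it needs only 12 bytes (the fragmentation the slide mentions), and d reuses the cell that b released (the reclamation). Objects of different sizes live in different regions, which is why locality suffers.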

16 Collectors. Whole heap: works on the entire heap and treats everything the same: slow. Generational: divides the heap into old and new and optimizes for the common case: faster. Tracing: computes the transitive closure. Reference Counting.

17 Example. [Figure: objects a b c d e f g h]

18 Tracing – Collection 1. [Figure: objects a b c d e f g h; Roots]

19 Tracing – Collection 1. [Figure: objects a b c d e f g h; Roots; Live]

20 Tracing – Collection 1. [Figure: objects a b c d e f g h; Roots; Live]

21 Tracing – Change. [Figure: objects a b d e f g h; Roots]

22 Tracing – Collection 2. [Figure: objects a b d e f g h; Roots; Live; Dead]
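
Tracing as a transitive closure can be sketched as below. The edges are not recoverable from the transcript, so the graph here is an assumption: a, b and c are treated as roots, and the pointers are chosen to be consistent with the reference counts and survivors shown on later slides.

```python
def trace(roots, edges):
    """Return every object reachable from the roots (the transitive closure)."""
    live = set()
    worklist = list(roots.values())
    while worklist:
        obj = worklist.pop()
        if obj in live:
            continue
        live.add(obj)
        worklist.extend(edges[obj])
    return live


# Assumed heap graph: object -> objects it points to.
edges = {"d": ["f"], "e": ["g", "h"], "f": [], "g": ["h"], "h": []}

# Collection 1: roots a, b and c all hold references.
roots = {"a": "d", "b": "d", "c": "e"}
live_1 = trace(roots, edges)

# Change: root c goes away, then Collection 2.
del roots["c"]
live_2 = trace(roots, edges)
dead_2 = set(edges) - live_2
print(sorted(live_1), sorted(live_2), sorted(dead_2))
```

Under these assumed edges everything is live at the first collection, and after c disappears only d and f remain reachable.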

23 SemiSpace. Contiguous allocator. Divide the space into two. At collection time, move live data over. [Figure: d e f g h copied over as defgh]

24 SemiSpace. [Figure: d e f g h copied over as defgh]

25 SemiSpace. Waste of space. Copies long-lived objects many times. Time proportional to survivors. [Figure: d f copied over as df]
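
The copying step can be sketched as a breadth-first copy in the style of Cheney's algorithm. This is a simplified model under stated assumptions: objects are names rather than addresses, the pointer graph is hypothetical, and a real collector would also rewrite pointers through forwarding addresses.

```python
def semispace_collect(roots, from_space):
    """Copy everything reachable from the roots into a fresh to-space.

    from_space maps an object name to the names it points to.
    Returns the to-space as a list in copy order.
    """
    to_space = []        # contiguous to-space, filled in copy order
    forwarded = set()    # objects that have already been copied

    def copy(name):
        if name not in forwarded:
            forwarded.add(name)
            to_space.append(name)

    for root in roots:
        copy(root)

    scan = 0             # objects before this index have been scanned
    while scan < len(to_space):
        for child in from_space[to_space[scan]]:
            copy(child)
        scan += 1
    return to_space


# Assumed pointer graph, as in the tracing sketch.
from_space = {"d": ["f"], "e": ["g", "h"], "f": [], "g": ["h"], "h": []}

first = semispace_collect(["d", "e"], from_space)   # all five objects survive
second = semispace_collect(["d"], from_space)       # only d and f survive
print(first, second)
```

The loop touches only objects that get copied, which is the sense in which collection time is proportional to the survivors. An object that stays live is copied again at every collection.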

26 Mark and Sweep. Free-list allocator. Collect when the heap is full. Use bitmaps. Does not move objects, so long-lived objects are never copied. [Figure: d e f g h; after collection, d f remain in place]
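
A minimal mark-and-sweep sketch, assuming a heap of fixed cells addressed by byte offset. The addresses and the pointer graph are hypothetical; the mark bits are kept in a side table to mirror the bitmap mentioned on the slide.

```python
def mark_sweep(root_addresses, cells):
    """cells maps address -> (name, [addresses it points to]).

    Returns (surviving cells, free list). Survivors keep their addresses.
    """
    # Mark phase: set a mark bit for every reachable cell.
    mark_bits = {address: False for address in cells}
    worklist = list(root_addresses)
    while worklist:
        address = worklist.pop()
        if mark_bits[address]:
            continue
        mark_bits[address] = True
        worklist.extend(cells[address][1])

    # Sweep phase: visit every cell and put the unmarked ones on the free list.
    free_list = [address for address in sorted(cells) if not mark_bits[address]]
    survivors = {address: cells[address] for address in cells if mark_bits[address]}
    return survivors, free_list


# Assumed layout: five 8-byte cells holding d, e, f, g, h.
cells = {
    0: ("d", [16]),
    8: ("e", [24, 32]),
    16: ("f", []),
    24: ("g", [32]),
    32: ("h", []),
}
survivors, free_list = mark_sweep([0], cells)
print(survivors, free_list)
```

Unlike the copying collector, the sweep visits dead cells as well as live ones, and the survivors d and f stay at their original addresses.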

27 Reference Counting. [Figure: Roots a b c; counts d: 2, e: 1, f: 1, g: 1, h: 2]

28 Reference Counting. [Figure: Roots a b; counts d: 2, e: 0, f: 1, g: 1, h: 2]

29 Reference Counting. [Figure: Roots a b; counts d: 2, f: 1, g: 0, h: 1]

30 Reference Counting. [Figure: Roots a b; counts d: 2, f: 1, h: 0]

31 Reference Counting. [Figure: Roots a b; counts d: 2, f: 1] Free-List allocator. Time proportional to dead objects. Detects cycles (this needs a cycle detector in addition to the counts). Uses object logging on writes.
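
The sequence on slides 27-31 can be reproduced with a small sketch. The edges are assumed (the transcript keeps only the counts), and this naive version decrements eagerly and has no cycle detection or write logging.

```python
def build_counts(roots, edges):
    """Count incoming references from roots and from other objects."""
    counts = {obj: 0 for obj in edges}
    for target in roots.values():
        counts[target] += 1
    for children in edges.values():
        for child in children:
            counts[child] += 1
    return counts


def drop_reference(target, counts, edges, freed):
    """Remove one reference to target; free it recursively at count zero."""
    counts[target] -= 1
    if counts[target] == 0:
        freed.append(target)
        for child in edges[target]:
            drop_reference(child, counts, edges, freed)
        del counts[target]


# Assumed graph, chosen to match the counts on slide 27.
edges = {"d": ["f"], "e": ["g", "h"], "f": [], "g": ["h"], "h": []}
roots = {"a": "d", "b": "d", "c": "e"}

counts = build_counts(roots, edges)
initial = dict(counts)

# Root c goes away, dropping its reference to e.
freed = []
drop_reference(roots.pop("c"), counts, edges, freed)
print(initial, freed, counts)
```

Only e, g and h are visited when c goes away, which illustrates why the work is proportional to the dead objects rather than to the live ones.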

32 Generational Collector. “Weak generational hypothesis”: the young die quickly, the old die slowly. Put young objects in a nursery. When the nursery is full, collect it and move survivors to the old generation, à la SemiSpace. When the heap is full, perform an all-out collection.

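33 Generational Example. [Figure: Nursery and Mature spaces; objects d e f g h]

34 Generational Example. [Figure: Nursery and Mature spaces; objects d f, i k m j l]

35 Generational Example. [Figure: Nursery and Mature spaces; objects d f, j l i m]

36 Generational Example. [Figure: Nursery and Mature spaces; objects d f, n o r p q, j l i m]

37 Generational Example. [Figure: Nursery and Mature spaces; objects d f, o r q, j l i m]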

38 Generational Collectors. On writes, record pointers from mature to nursery objects. The nursery is smaller than half of memory and is generally just contiguously allocated; survivors are copied over into the mature space. The mature space uses one of the three collectors, with its benefits and pitfalls.
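
The write barrier and nursery collection can be sketched as follows. The class and method names are hypothetical, objects are modelled as names with a list of outgoing pointers, and the mature space is never collected in this sketch.

```python
class GenerationalHeap:
    def __init__(self):
        self.nursery = {}        # name -> names it points to
        self.mature = {}         # name -> names it points to
        self.remembered = set()  # mature objects pointing into the nursery

    def allocate(self, name):
        self.nursery[name] = []

    def write(self, source, target):
        """Store a pointer source -> target, with the write barrier."""
        space = self.nursery if source in self.nursery else self.mature
        space[source].append(target)
        # Barrier: record pointers from mature to nursery objects.
        if source in self.mature and target in self.nursery:
            self.remembered.add(source)

    def minor_collect(self, roots):
        """Collect only the nursery and promote its survivors."""
        worklist = [r for r in roots if r in self.nursery]
        for source in self.remembered:
            worklist.extend(t for t in self.mature[source] if t in self.nursery)
        survivors = {}
        while worklist:
            name = worklist.pop()
            if name in survivors:
                continue
            survivors[name] = self.nursery[name]
            worklist.extend(t for t in survivors[name] if t in self.nursery)
        self.mature.update(survivors)   # survivors move to the mature space
        self.nursery = {}
        self.remembered.clear()
        return sorted(survivors)


heap = GenerationalHeap()
for name in "defgh":
    heap.allocate(name)
heap.write("d", "f")
heap.write("e", "g")
heap.write("e", "h")
heap.write("g", "h")
first = heap.minor_collect(roots=["d"])    # d and f are promoted

heap.allocate("i")
heap.allocate("j")
heap.write("d", "j")                       # mature -> nursery: remembered
second = heap.minor_collect(roots=["d"])   # j survives only via the barrier
print(first, second, sorted(heap.mature))
```

In the second collection the mature object d is never traced as a whole; j is found only because the barrier recorded d when the pointer was written. That is the cost and the benefit the slide describes: extra work on writes, less work at each collection.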

39 Benchmarks. Low memory usage: 201, 222. Low nursery survival: 202, 228, 205, 227. High nursery survival: 213, 209, jbb. Focus: accesses/size. Table 1 in [1].

40 Results. A larger heap size decreases the frequency of collection, up to a point. Small differences in size cause different behaviours. Generational collectors are a big win: less is examined at each collection, and there are fewer collections too, especially for Mark and Sweep. Figure 1 in [1].

41 More Results. Write barrier: up to 13.6% overhead, average of 3.2%; the benefit from generations more than makes up for the overhead. Free-List versus contiguous allocation: Free-List is 11% slower.

42 Mutator Costs. Whole heap: SemiSpace 7-15% over Mark and Sweep, mainly due to cache effects. Generational: locality of mature objects is not a factor (except when it is, as in jbb), so fewer collections for Mark and Sweep are a win. Figure 2 in [1].

43 More Results. For small heaps, collection overhead dominates: want generational Mark and Sweep. For large heaps: generational SemiSpace. Reference counting is too expensive.

44 Infinite Heaps. Performing no garbage collection has a small effect on the mutator compared to when garbage collection is performed. Most reference patterns exhibit temporal locality, not spatial locality. When spatial locality is important, garbage collection results in better performance.

45 Nursery Size. The sweet spot is well beyond the size of the L2 cache. A larger size leads to fewer collections. Eventually too big. Figure 4 in [1].

46 Conclusions. Contiguous allocation yields better locality. Free-lists are more space efficient, so less collection. Counters the myth that collection frequency is a first-order effect on performance? Explicit allocation is bad because a free list can’t capture locality???

47 References. Figures and Tables are all from: [1] Stephen M. Blackburn, Perry Cheng, and Kathryn S. McKinley. Myths and Realities: The Performance Impact of Garbage Collection. Sigmetrics – Performance, 2004.