Dynamic Compilation Vijay Janapa Reddi

Presentation transcript:

Dynamic Compilation: Garbage Collection
Vijay Janapa Reddi, The University of Texas at Austin

Today: Garbage Collection
- Why use garbage collection?
- What is garbage? Reachable vs. live, stack maps, etc.
- Allocators and their collection mechanisms
  - Semispace
  - Marksweep
- Performance comparisons
- Incremental age-based collection
- Write barriers: friend or foe?
- Generational
- Beltway
- More performance

Basic VM Structure
- [Diagram components: Program/Bytecode; Class Loader, Verifier, etc.; Executing Program; Heap; Thread Scheduler; Dynamic Compilation Subsystem; Garbage Collector]

True or False?
- Real programmers use languages with explicit memory management.
- I can optimize my memory management much better than any garbage collector.
- Scope of effort?

Why Use Garbage Collection?
- Software engineering benefits
  - Less user code compared to explicit memory management (MM)
  - Less user code to get correct
  - Protects against some classes of memory errors: no free(), thus no premature free(), no double free(), and no forgetting to free()
  - Not perfect: memory can still “leak”. Programmers still need to eliminate all pointers to objects the program no longer needs
- Performance: space/time tradeoff
  - Time proportional to dead objects (explicit MM, reference counting) or live objects (semispace, marksweep)
  - Throughput versus pause time: less frequent collection typically reduces total time but can increase space requirements and pause times
  - Hidden locality benefits?
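
The point that memory can still “leak” under garbage collection can be illustrated with a small Python sketch. The names `Session`, `registry` and `handle` are hypothetical, and a weak reference is used only as a probe to observe whether the object has been reclaimed. How promptly reclamation happens depends on the Python implementation, so the sketch calls `gc.collect()` before each check.

```python
import gc
import weakref


class Session:
    """Hypothetical object the program stops needing after use."""

    def __init__(self, name):
        self.name = name


registry = []  # long-lived container reachable from a global


def handle(name):
    s = Session(name)
    registry.append(s)      # forgotten reference: keeps s reachable
    return weakref.ref(s)   # weak reference lets us observe reclamation


probe = handle("alice")
gc.collect()
leaked = probe() is not None    # still reachable through registry

registry.clear()                # programmer eliminates the last pointer
gc.collect()
reclaimed = probe() is None     # only now can the collector reclaim it
```

The object is never used again after `handle` returns, yet it stays alive until the program drops the pointer in `registry`: the collector sees reachability, not intent.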

GC, A tool for all occasions? When might you NOT be willing to use a garbage collector?

What is Garbage?
- In theory, any object the program will never reference again
  - But the compiler and runtime system cannot figure that out
- In practice, any object the program cannot reach is garbage
  - Approximate liveness with reachability
- OK, so how do we know what data is reachable?
  - Keep track of pointers: they tell you how to “reach” some other piece of data
- What about programming languages like C?
  - X = *(<arbitrary address>)
  - Everything is (potentially) reachable
  - It’s up to the programmer: malloc() & free()

What is Garbage?
- Managed languages couple GC with “safe” pointers
  - Programs may not access arbitrary addresses in memory
  - The compiler can identify and provide to the garbage collector all the pointers, thus “Once garbage, always garbage”
  - Runtime system can potentially relocate objects by updating pointers

Reference Counting
- If we know whenever a pointer is assigned, we can update a reference count on the object it points to
- When an object’s count is decremented to 0, the object is freed
- Consider: pop of a stack…
- [Diagram, three steps: reference counts along the list from Head are 1 1 1, then 2 1, then 1 1]
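
The stack pop can be walked through with explicit counters. This is a minimal sketch, not any particular runtime's implementation: `Node`, `release` and the `rc` field are hypothetical names, and the Python variables `a`, `b`, `c` are only handles for inspecting the model, so they are not counted.

```python
class Node:
    """Heap cell with an explicit reference count (hypothetical model)."""

    def __init__(self, value, next_node=None):
        self.value = value
        self.next = next_node
        self.rc = 0
        if next_node is not None:
            next_node.rc += 1   # the new cell points at next_node


freed = []  # records the order in which cells are reclaimed


def release(node):
    """Drop one reference; free the cell when its count reaches 0."""
    node.rc -= 1
    if node.rc == 0:
        freed.append(node.value)
        if node.next is not None:
            release(node.next)  # freeing a cell drops its outgoing pointer


# Build the stack Head -> a -> b -> c; every cell has count 1.
c = Node("c")
b = Node("b", c)
a = Node("a", b)
head = a
a.rc += 1  # the Head pointer

# Pop: Head is reassigned to the second cell, then the old top is released.
old_top = head
head = old_top.next
head.rc += 1          # b is referenced by both a.next and Head
counts_mid_pop = (a.rc, b.rc, c.rc)
release(old_top)      # a drops to 0 and is freed, which decrements b
counts_after = (b.rc, c.rc)
```

Releasing the old top cascades exactly one step: `a` is freed, `b` falls back to a count of 1, and `c` is untouched.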

Reference Counting
- What if we want to delete this deque? Head = Tail = NULL;
- [Diagram: node counts between Head and Tail are 2 2 2 before the assignment and 1 2 1 after it]
- Cycles lead to orphaned garbage
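
The deque case can be reproduced with the same kind of explicit counters. In this sketch `DNode`, `link` and `roots` are hypothetical names, and a three-node deque is assumed, matching the three counts in the diagram.

```python
class DNode:
    """Doubly linked cell with an explicit reference count (hypothetical model)."""

    def __init__(self, value):
        self.value = value
        self.prev = None
        self.next = None
        self.rc = 0


def link(left, right):
    """Connect two neighbours; each stored pointer bumps its target's count."""
    left.next = right
    right.rc += 1
    right.prev = left
    left.rc += 1


x, y, z = DNode("x"), DNode("y"), DNode("z")
link(x, y)
link(y, z)

roots = {"Head": x, "Tail": z}
x.rc += 1  # Head
z.rc += 1  # Tail
counts_before = [n.rc for n in (x, y, z)]

# Head = Tail = NULL: drop both root references.
for name in ("Head", "Tail"):
    roots[name].rc -= 1
    roots[name] = None
counts_after = [n.rc for n in (x, y, z)]

# No count reached 0, so a pure reference counter frees nothing,
# even though no root can reach any of the three cells.
orphaned = [n.value for n in (x, y, z) if n.rc > 0]
```

Every cell is still held by a neighbour's `prev` or `next` pointer, so all three survive as orphaned garbage.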

Reference Counting
- Reference counting is used in C++ “smart pointers”
  - “Shared” pointers cause reference counting
  - “Weak” pointers won’t keep an object alive (they don’t affect the reference count)
  - Programmers need to pay attention to using the right kind of pointer
- Sort of a halfway point between manual new/delete and “real” garbage collection
- [Diagram: Head, Tail, counts 1 1 2]
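
The shared/weak split can be sketched in Python as an analogy to C++ `std::shared_ptr` and `std::weak_ptr`, not as a rendering of the C++ API. The `Cell` class is hypothetical. The sketch assumes CPython, whose primary reclamation mechanism is reference counting, and it disables the backup cycle collector so that only reference counting is at work.

```python
import gc
import weakref


class Cell:
    def __init__(self, value):
        self.value = value
        self.next = None      # strong ("shared"-like) forward pointer
        self._prev = None     # weak back pointer, does not keep target alive

    @property
    def prev(self):
        return self._prev() if self._prev is not None else None

    @prev.setter
    def prev(self, cell):
        self._prev = weakref.ref(cell) if cell is not None else None


gc.disable()                 # rely on reference counting alone (CPython)
try:
    head = Cell("x")
    head.next = Cell("y")
    head.next.prev = head    # back pointer is weak, so no strong cycle
    back_ok = head.next.prev is head

    probe_x = weakref.ref(head)
    probe_y = weakref.ref(head.next)
    head = None              # drop the only strong root
    both_freed = probe_x() is None and probe_y() is None
finally:
    gc.enable()
```

Because the back pointer does not contribute to the count, dropping `head` frees the first cell, which in turn frees the second. Choosing which direction is weak is the programmer's job, which is the "right kind of pointer" burden the slide mentions.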

Tracing Collectors
- A more robust solution is a “tracing” collector
- Start with a “root set” of all references
  - Objects you “know” without having to have a pointer to them (e.g. globals)
- See what the objects in the root set point to, follow those, etc.
  - Need to know how to find pointers within an object
- Basically, trace through to find all the reachable objects
- Everything else must be garbage

Tracing Collectors
- Compiler produces a stack-map at GC safe-points and Type Information Blocks
- GC safe points: new(), method entry, method exit, & back-edges (thread switch points)
- Stack-map: enumerate global variables, stack variables, live registers
  - This code is hard to get right! Why?
- Type Information Blocks: identify reference fields in objects
  - For each type i (class) in the program, a map TIBi
- [Diagram: roots (globals, stack, registers) point into a heap holding objects A, B, C; code fragment "{ .... r0 = obj" with PC -> "p.f = obj"; TIBi shown with entries 2 3]
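
One way to picture a type information block is as a per-type list of the slots that hold references. The sketch below is an illustration under that assumption. The type names, the toy heap layout and `reference_fields` are all hypothetical, and real TIBs in a VM carry more than this.

```python
# Hypothetical type information blocks: for each type, the slot indexes
# that hold references. All other slots are raw data.
TIB = {
    "Pair": (0, 1),      # both slots are references
    "Boxed": (),         # no reference fields
    "Entry": (2, 3),     # slots 0-1 are raw data, 2-3 are references
}


def reference_fields(heap, addr):
    """Return the addresses stored in the reference slots of one object."""
    type_name, slots = heap[addr]
    return [slots[i] for i in TIB[type_name] if slots[i] is not None]


# A toy heap: address -> (type name, slots). The integers in "Entry"
# slots 0-1 happen to look like addresses but are never followed.
heap = {
    100: ("Entry", [104, 108, 104, None]),
    104: ("Pair", [108, None]),
    108: ("Boxed", [42]),
}

children_of_100 = reference_fields(heap, 100)
children_of_104 = reference_fields(heap, 104)
children_of_108 = reference_fields(heap, 108)
```

With the map in hand the collector follows only genuine references, which is what later allows it to move objects and update pointers safely.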

Tracing Collectors
- Tracing collector (semispace, marksweep)
  - Mark: marks the objects reachable from the roots as live, then performs a transitive closure over them
  - Sweep: all unmarked objects are dead and can be reclaimed
- [Diagram: roots (globals, stack, registers) and heap objects A, B, C, stepped through mark and then sweep]
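
The mark and sweep steps can be sketched over a toy object graph. The heap here is just a dictionary from object id to the ids it points to, and the objects D and E are an assumption added to show an unreachable cycle; the slide's diagram contains only A, B and C.

```python
def mark(roots, heap):
    """Return the set of object ids reachable from the roots."""
    marked = set()
    worklist = list(roots)
    while worklist:
        obj = worklist.pop()
        if obj in marked:
            continue
        marked.add(obj)
        worklist.extend(heap[obj])   # follow every outgoing pointer
    return marked


def sweep(heap, marked):
    """Remove every unmarked object and return the reclaimed ids."""
    dead = sorted(obj for obj in heap if obj not in marked)
    for obj in dead:
        del heap[obj]
    return dead


# Toy heap: object id -> ids it points to. D and E form an unreachable cycle.
heap = {"A": ["B"], "B": ["C"], "C": [], "D": ["E"], "E": ["D"]}
roots = ["A"]           # stands in for globals, stack slots and registers

live = mark(roots, heap)
reclaimed = sweep(heap, live)
```

Unlike pure reference counting, the cycle between D and E is reclaimed, because nothing in the root set leads to it.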

Conservative Collectors
- What if we didn’t have the type information block?
  - That is, we can’t identify the pointers within an object, e.g. with C or vanilla C++ pointers
- Answer: do “conservative collection”
  - Treat every value like it might be a pointer
  - If it looks like it might point to a memory region in the heap, assume it is a pointer
  - Trace the block of data that was “malloc’d” that contains that address
- In some architectures, pointers must be word-aligned (least significant two bits are zero), which helps filter out random integers
- But “unfortunate integers” can keep memory alive
- Also, can’t move objects, since we can’t safely backpatch pointers (they might really be integers)
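
The "might be a pointer" test can be sketched as an alignment check followed by a lookup in the table of allocated blocks. The 4-byte word size, the block table and the sample stack words below are assumptions chosen for illustration; `find_block` and `conservative_roots` are hypothetical names.

```python
import bisect

WORD = 4  # assumed word size in bytes; pointers must be WORD-aligned


def find_block(blocks, value):
    """Return the (start, size) block containing value, or None.

    blocks is a list of (start, size) tuples sorted by start address.
    """
    if value % WORD != 0:
        return None                      # misaligned: cannot be a pointer
    starts = [start for start, _ in blocks]
    i = bisect.bisect_right(starts, value) - 1
    if i < 0:
        return None
    start, size = blocks[i]
    return (start, size) if start <= value < start + size else None


def conservative_roots(words, blocks):
    """Treat every word that might be a pointer as a root."""
    found = []
    for w in words:
        block = find_block(blocks, w)
        if block is not None and block not in found:
            found.append(block)
    return found


blocks = [(0x1000, 64), (0x2000, 32), (0x3000, 128)]
# Stack words: a real pointer, a small integer, a misaligned value,
# an interior pointer, and an "unfortunate integer" that hits a block.
stack_words = [0x1000, 7, 0x2001, 0x3010, 0x2008]
retained = conservative_roots(stack_words, blocks)
```

The last word, 0x2008, may be a plain integer, but the collector cannot tell, so the block at 0x2000 stays alive. For the same reason the collector cannot move that block: rewriting the word would corrupt the integer.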

Semispace
- Fast bump pointer allocation
- Requires copying collection
- Cannot incrementally reclaim memory, must free en masse
- Reserves 1/2 the heap to copy into, in case all objects are live
- [Diagram: heap split into “from space” and “to space”]
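
Bump pointer allocation amounts to a bounds check and an addition. A minimal sketch, assuming a heap addressed by byte offsets and a hypothetical `BumpAllocator` class:

```python
class BumpAllocator:
    """Bump pointer allocation into one half of a semispace heap."""

    def __init__(self, heap_size):
        self.half = heap_size // 2   # the other half is the copy reserve
        self.cursor = 0              # next free address in the active half

    def alloc(self, size):
        """Return the address of a new object, or None if a GC is needed."""
        if self.cursor + size > self.half:
            return None
        addr = self.cursor
        self.cursor += size          # the "bump"
        return addr


heap = BumpAllocator(heap_size=64)
first = heap.alloc(16)
second = heap.alloc(8)
third = heap.alloc(8)
too_big = heap.alloc(16)             # would cross the 32-byte half
```

Objects allocated one after another end up adjacent in memory, and nothing can be freed individually: space comes back only when the whole half is reclaimed.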

Semispace
- Mark phase:
  - copies each object when the collector first encounters it
  - installs forwarding pointers
  - performs transitive closure, updating pointers as it goes
- Reclaims “from space” en masse
- Starts allocating again into “to space”
- [Diagram: heap split into “from space” and “to space”]
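
The copying steps can be sketched with a Cheney-style breadth-first scan. That traversal order is an assumption made here for brevity, since the slides do not say which order the collector uses. Objects are modelled as dictionaries, pointers as list indexes, and the names `collect` and `forward` are hypothetical.

```python
def collect(from_space, roots):
    """Copy live objects into a fresh to-space; return (to_space, new_roots).

    from_space is a list of objects of the form {"fields": [index or None]}.
    Pointers are indexes into the space that holds the object.
    """
    to_space = []

    def forward(ptr):
        if ptr is None:
            return None
        old = from_space[ptr]
        if "forward" not in old:              # first encounter: copy it
            old["forward"] = len(to_space)    # install forwarding pointer
            to_space.append({"fields": list(old["fields"])})
        return old["forward"]                 # later encounters reuse it

    new_roots = [forward(r) for r in roots]

    scan = 0                                  # transitive closure over copies
    while scan < len(to_space):
        obj = to_space[scan]
        obj["fields"] = [forward(p) for p in obj["fields"]]
        scan += 1
    return to_space, new_roots


# 0 -> 2; 1 is garbage; 2 -> 3 and back to 0; 3 has no pointers;
# 4 is garbage that points at a live object.
from_space = [
    {"fields": [2]},
    {"fields": [None]},
    {"fields": [3, 0]},
    {"fields": []},
    {"fields": [3]},
]
to_space, new_roots = collect(from_space, roots=[0])
```

Work is proportional to the three live objects: the two dead ones are never visited, and dropping `from_space` afterwards reclaims them en masse. The forwarding pointer also keeps the cycle between objects 0 and 2 from being copied twice.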

Semispace
- Notice:
  - fast allocation
  - locality of contemporaneously allocated objects
  - locality of objects connected by pointers
  - wasted space

Marksweep
- Free lists organized by size
  - blocks of same size, or individual objects of same size
- Most objects are small (< 128 bytes)
- [Diagram: heap and free lists for sizes 4, 8, 12, 16, ..., 128, ...]

Marksweep
- Allocation
  - Grab a free object off the free list
  - No more memory of the right size triggers a collection
- Mark phase: find the live objects
- Sweep phase: put free ones on the free list
- [Diagram: heap and free lists for sizes 4, 8, 12, 16, ..., 128, ...]
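
Size-segregated free list allocation can be sketched as one list of cells per size class. The `FreeListHeap` class and its cell ids are hypothetical, and the sketch uses only the size classes named on the slide, leaving out the ones the slide elides.

```python
SIZE_CLASSES = [4, 8, 12, 16, 128]   # only the classes named on the slide


class FreeListHeap:
    def __init__(self, cells_per_class):
        # One free list per size class; each entry is a (class, slot) cell id.
        self.free = {
            sc: [(sc, i) for i in range(cells_per_class)] for sc in SIZE_CLASSES
        }

    def size_class(self, size):
        """Smallest class that fits the request."""
        for sc in SIZE_CLASSES:
            if size <= sc:
                return sc
        raise ValueError("large objects need a separate space")

    def alloc(self, size):
        """Pop a cell off the matching free list; None means 'collect now'."""
        cells = self.free[self.size_class(size)]
        return cells.pop() if cells else None

    def release(self, cell):
        """Called by the sweep phase for each dead cell."""
        self.free[cell[0]].append(cell)


heap = FreeListHeap(cells_per_class=2)
a = heap.alloc(10)        # rounds up to the 12-byte class
b = heap.alloc(12)
c = heap.alloc(9)         # 12-byte list is now empty
heap.release(a)           # sweep found a dead
d = heap.alloc(11)        # reuses the cell that a occupied
```

Compared with a bump pointer, each allocation needs a size-class lookup and a list operation, and two objects allocated back to back may land far apart in memory.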

Marksweep
- Mark phase
  - Transitive closure marking all the live objects
- Sweep phase
  - Sweep the memory for free objects, populating the free list
  - Can be made incremental by organizing the heap in blocks and sweeping one block at a time on demand
- [Diagram: heap and free lists for sizes 4, 8, 12, 16, ..., 128, ...]
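
Sweeping one block at a time on demand can be sketched as follows. The mark bits are assumed to have been set by an earlier mark phase, and `LazySweepHeap` with its fields is a hypothetical structure, not MMTk's.

```python
class LazySweepHeap:
    """Cells grouped in blocks; a block is swept only when allocation needs it."""

    def __init__(self, blocks):
        # blocks: list of lists of mark bits, one bit per cell
        # (True = live after the last mark phase).
        self.blocks = blocks
        self.next_block = 0       # first block not yet swept
        self.free = []            # (block, cell) pairs ready for allocation
        self.swept = []           # which blocks have been swept so far

    def _sweep_one_block(self):
        b = self.next_block
        for cell, live in enumerate(self.blocks[b]):
            if not live:
                self.free.append((b, cell))
        self.swept.append(b)
        self.next_block += 1

    def alloc(self):
        """Return a free cell, sweeping further blocks only if required."""
        while not self.free and self.next_block < len(self.blocks):
            self._sweep_one_block()
        return self.free.pop(0) if self.free else None


heap = LazySweepHeap([[True, False, True], [True, True, True], [False, False, True]])
first = heap.alloc()
swept_after_first = list(heap.swept)
second = heap.alloc()
third = heap.alloc()
fourth = heap.alloc()
```

The first allocation pays for sweeping only block 0. The remaining blocks are swept when the free list runs dry, which spreads the sweep cost across allocations instead of adding it to the collection pause.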

Marksweep
- space efficiency
- incremental object reclamation
- relatively slower allocation time
- poor locality of contemporaneously allocated objects

How do these differences play out in practice?
- Marksweep
  - space efficiency
  - incremental object reclamation
  - relatively slower allocation time
  - poor locality of contemporaneously allocated objects
- Semispace
  - fast allocation
  - locality of contemporaneously allocated objects
  - locality of objects connected by pointers
  - wasted space

Methodology [SIGMETRICS 2004]
- Compare Marksweep (MS) and Semispace (SS)
  - Mutator time, GC time, total time
- Jikes RVM & MMTk
  - replay compilation
  - measure second iteration without compilation
- Platforms
  - 1.6GHz G5 (PowerPC 970)
  - 1.9GHz AMD Athlon 2600+
  - 2.6GHz Intel P4
  - Linux 2.6.0 with perfctr patch & libraries
- Separate accounting of GC & Mutator counts
- SPECjvm98 & pseudojbb

Allocation Mechanism
- Bump pointer: ~70 bytes of IA32 instructions, 726MB/s
- Free list: ~140 bytes of IA32 instructions, 654MB/s
- Bump pointer is 11% faster in a tight loop
  - < 1% in a practical setting
  - No significant difference (?)
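
The 11% figure follows directly from the two allocation rates on the slide, as this short check shows (the variable names are illustrative):

```python
bump_mb_s = 726       # bump pointer allocation rate from the slide
free_list_mb_s = 654  # free list allocation rate from the slide

speedup_percent = (bump_mb_s / free_list_mb_s - 1) * 100
rounded = round(speedup_percent)
```

This is a tight-loop measurement of allocation alone. The slide's "< 1% in a practical setting" reflects that allocation is only a small part of total running time.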

Mutator Time
[Figures: jess, javac, pseudojbb, geometric mean]

Garbage Collection Time
[Figures: javac, pseudojbb, jess, geometric mean]

Total Time
[Figures: javac, pseudojbb, jess, geometric mean]

MS/SS Crossover
[Figures: 1.6GHz PPC, 1.9GHz AMD, 2.6GHz P4, 3.2GHz P4; a summary figure labelled "locality" and "space" shows all four platforms]