CMSC 330: Organization of Programming Languages Memory and Garbage Collection.

Slides:



Advertisements
Similar presentations
Garbage collection David Walker CS 320. Where are we? Last time: A survey of common garbage collection techniques –Manual memory management –Reference.
Advertisements

Dynamic Memory Management
CMSC 330: Organization of Programming Languages Memory and Garbage Collection.
Lecture 10: Heap Management CS 540 GMU Spring 2009.
Garbage Collection What is garbage and how can we deal with it?
Garbage Collection Introduction and Overview Christian Schulte Excerpted from presentation by Christian Schulte Programming Systems Lab Universität des.
Prof. Necula CS 164 Lecture 141 Run-time Environments Lecture 8.
Garbage Collection  records not reachable  reclaim to allow reuse  performed by runtime system (support programs linked with the compiled code) (support.
5. Memory Management From: Chapter 5, Modern Compiler Design, by Dick Grunt et al.
Memory Management Tom Roeder CS fa. Motivation Recall unmanaged code eg C: { double* A = malloc(sizeof(double)*M*N); for(int i = 0; i < M*N; i++)
Run-time organization  Data representation  Storage organization: –stack –heap –garbage collection Programming Languages 3 © 2012 David A Watt,
Garbage Collection CSCI 2720 Spring Static vs. Dynamic Allocation Early versions of Fortran –All memory was static C –Mix of static and dynamic.
Hastings Purify: Fast Detection of Memory Leaks and Access Errors.
CS 326 Programming Languages, Concepts and Implementation Instructor: Mircea Nicolescu Lecture 18.
CPSC 388 – Compiler Design and Construction
CS 536 Spring Automatic Memory Management Lecture 24.
Chapter 8 Runtime Support. How program structures are implemented in a computer memory? The evolution of programming language design has led to the creation.
Memory Management. History Run-time management of dynamic memory is a necessary activity for modern programming languages Lisp of the 1960’s was one of.
CS 1114: Data Structures – memory allocation Prof. Graeme Bailey (notes modified from Noah Snavely, Spring 2009)
CS 61C L07 More Memory Management (1) Garcia, Fall 2004 © UCB Lecturer PSOE Dan Garcia inst.eecs.berkeley.edu/~cs61c CS61C.
Memory Allocation. Three kinds of memory Fixed memory Stack memory Heap memory.
CS 536 Spring Run-time organization Lecture 19.
Runtime The optimized program is ready to run … What sorts of facilities are available at runtime.
Run time vs. Compile time
Pointers and Dynamic Variables. Objectives on completion of this topic, students should be able to: Correctly allocate data dynamically * Use the new.
Garbage collection (& Midterm Topics) David Walker COS 320.
Linked lists and memory allocation Prof. Noah Snavely CS1114
1 Run time vs. Compile time The compiler must generate code to handle issues that arise at run time Representation of various data types Procedure linkage.
Reference Counters Associate a counter with each heap item Whenever a heap item is created, such as by a new or malloc instruction, initialize the counter.
UniProcessor Garbage Collection Techniques Paul R. Wilson University of Texas Presented By Naomi Sapir Tel-Aviv University.
Garbage Collection Memory Management Garbage Collection –Language requirement –VM service –Performance issue in time and space.
SEG Advanced Software Design and Reengineering TOPIC L Garbage Collection Algorithms.
Dynamic Memory Allocation Questions answered in this lecture: When is a stack appropriate? When is a heap? What are best-fit, first-fit, worst-fit, and.
1 Names, Scopes and Bindings Aaron Bloomfield CS 415 Fall
C. Varela; Adapted w/permission from S. Haridi and P. Van Roy1 Declarative Computation Model Memory management (CTM 2.5) Carlos Varela RPI April 6, 2015.
Storage Management. The stack and the heap Dynamic storage allocation refers to allocating space for variables at run time Most modern languages support.
Runtime Environments. Support of Execution  Activation Tree  Control Stack  Scope  Binding of Names –Data object (values in storage) –Environment.
1 Lecture 22 Garbage Collection Mark and Sweep, Stop and Copy, Reference Counting Ras Bodik Shaon Barman Thibaud Hottelier Hack Your Language! CS164: Introduction.
C++ Memory Overview 4 major memory segments Key differences from Java
CS 2130 Lecture 5 Storage Classes Scope. C Programming C is not just another programming language C was designed for systems programming like writing.
1 Dynamic Memory Allocation –The need –malloc/free –Memory Leaks –Dangling Pointers and Garbage Collection Today’s Material.
CS 326 Programming Languages, Concepts and Implementation Instructor: Mircea Nicolescu Lecture 9.
11/26/2015IT 3271 Memory Management (Ch 14) n Dynamic memory allocation Language systems provide an important hidden player: Runtime memory manager – Activation.
1 Languages and Compilers (SProg og Oversættere) Heap allocation and Garbage Collection.
Pointers in C Computer Organization I 1 August 2009 © McQuain, Feng & Ribbens Memory and Addresses Memory is just a sequence of byte-sized.
Heap storage & Garbage collection Programming Language Design and Implementation (4th Edition) by T. Pratt and M. Zelkowitz Prentice Hall, 2001.
Garbage Collection and Memory Management CS 480/680 – Comparative Languages.
UniProcessor Garbage Collection Techniques Paul R. Wilson University of Texas Presented By Naomi Sapir Tel-Aviv University.
Runtime The optimized program is ready to run … What sorts of facilities are available at runtime.
CS412/413 Introduction to Compilers and Translators April 21, 1999 Lecture 30: Garbage collection.
Runtime Environments Chapter 7. Support of Execution  Activation Tree  Control Stack  Scope  Binding of Names –Data object (values in storage) –Environment.
Object Lifetime and Pointers
Garbage Collection What is garbage and how can we deal with it?
Inst.eecs.berkeley.edu/~cs61c CS61C : Machine Structures Lecture 7 – More Memory Management Lecturer PSOE Dan Garcia
Storage Management.
CS 153: Concepts of Compiler Design November 28 Class Meeting
Concepts of programming languages
Automatic Memory Management
Storage.
Memory Management and Garbage Collection Hal Perkins Autumn 2011
Memory Allocation CS 217.
Closure Representations in Higher-Order Programming Languages
Binding Times Binding is an association between two things Examples:
Heap storage & Garbage collection
Lecturer PSOE Dan Garcia
Heap storage & Garbage collection
Inst.eecs.berkeley.edu/~cs61c CS61C : Machine Structures Lecture 7 – More Memory Management Lecturer PSOE Dan Garcia
Automating Memory Management
CMPE 152: Compiler Design May 2 Class Meeting
Garbage Collection What is garbage and how can we deal with it?
Presentation transcript:

CMSC 330: Organization of Programming Languages Memory and Garbage Collection

CMSC 3302 Memory attributes Memory has several attributes: –Persistence (or lifetime) – How long the memory exists –Allocation – When the memory is available for use –Recovery – When the system recovers the memory for reuse Most programming languages are concerned with some subset of the following 4 memory classes: –Fixed (or static) memory –Automatic (or stack) memory –Allocated (or heap) memory –Persistent (or disk) memory

CMSC 3303 Memory classes Static memory – Usually a fixed address in memory –Persistence – Lifetime of execution of program –Allocation – By compiler for entire execution –Recovery – By system when program terminates Automatic memory – Usually on a stack –Persistence – Lifetime of method using that data –Allocation – When method is invoked –Recovery – When method terminates

CMSC 3304 Memory classes Allocated memory – Usually memory on a heap –Persistence – As long as memory is needed –Allocation – Explicitly by programmer –Recovery – Either by programmer or automatically (when possible and depends upon language) Persistent memory – Usually the file system –Persistence – Multiple execution of a program (e.g., files or databases) –Allocation – By program or user, often outside of program execution –Recovery – When data no longer needed –This form of memory is usually discussed in a database class (e.g., CMSC 424)

CMSC 3305 Memory Management in C Local variables live on the stack –Allocated at function invocation time –Deallocated when function returns –Storage space reused after function returns Space on the heap allocated with malloc() –Must be explicitly freed with free() –This is called explicit or manual memory management Deletions must be done by the programmer

CMSC 3306 Memory Management Mistakes May forget to free memory (memory leak) { int *x = (int *) malloc(sizeof(int)); } May retain ptr to freed memory (dangling pointer) { int *x =...malloc(); free(x); *x = 5; /* oops! */ } May try to free something twice { int *x =...malloc(); free(x); free(x); } This may corrupt the memory management data structures –The memory allocator maintains a free list of available space on the heap

CMSC 3307 Ways to Avoid Mistakes Don’t allocate memory on the heap –Often impractical –Leads to confusing code Never free memory –OS will reclaim process’s memory anyway at exit –Memory is cheap; who cares about a little leak? Use a garbage collector –E.g., conservative Boehm-Weiser collector for C The collector automatically recycles memory when it determines that it can no longer be otherwise accessed.

CMSC 3308 Memory Management in Ruby Local variables live on the stack –Storage reclaimed when method returns Objects live on the heap –Created with calls to Class.new Objects never explicitly freed –Ruby uses automatic memory management Uses a garbage collector to reclaim memory

CMSC 3309 Memory Management in OCaml Local variables live on the stack Tuples, closures, and constructed types live on the heap Garbage collection reclaims memory

CMSC Memory Management in Java Local variables live on the stack –Allocated at method invocation time –Deallocated when method returns Other data lives on the heap –Memory is allocated with new –But never explicitly deallocated Java uses automatic memory management

CMSC Another Memory Problem: Fragmentation allocate(a); allocate(x); allocate(y); free(a); allocate(z); free(y); allocate(b);  No contiguous space for b

CMSC Garbage collection goal Process to reclaim memory. (Also solve Fragmentation problem.) Algorithm: You can do garbage collection and memory compaction if you know where every pointer is in a program. If you move the allocated storage, simply change the pointer to it. This is true in some languages (LISP, OCaml, Java, Prolog), but not every language

CMSC Garbage Collection (GC) At any point during execution, can divide the objects in the heap into two classes: –Live objects will be used later –Dead objects will never be used again They are garbage Idea: Can reclaim memory from dead objects Goals: Reduce memory leaks, and make dangling pointers impossible

CMSC Many GC Techniques In most languages we can’t know for sure which objects are really live or dead –Undecidable, like solving the halting problem Thus we need to make an approximation Err on the conservative side: –OK if we decide something is live when it’s not –But we’d better not deallocate an object that will be used later on

CMSC Reference Counting (Smart Pointers) Old technique (1960) Each object has count of number of pointers to it from other objects and from the stack –When count reaches 0, object can be deallocated Counts tracked by either compiler or manually To find pointers, need to know layout of objects –In particular, need to distinguish pointers from ints Method works mostly for reclaiming memory; doesn’t handle fragmentation problem

CMSC Reference Counting Example

CMSC Reference Counting Example

CMSC Reference Counting Example

CMSC Reference Counting Example

CMSC Reference Counting Example

CMSC Reference Counting Example

CMSC Reference Counting Example

CMSC Tradeoffs Advantage: incremental technique –Generally small, constant amount of work per memory write Disadvantages: –Cascading decrements can be expensive –Also requires extra storage for reference counts –Can’t collect cycles, since counts never go to 0

CMSC Reachability An object is reachable if it can be accessed by following pointers from live data Safe policy: delete unreachable objects –An unreachable object can never be accessed again by the program The object is definitely garbage –A reachable object may be accessed in the future The object could be garbage but will be retained anyway

CMSC Roots At a given program point, we define liveness as being data reachable from the root set: –Global variables –Local variables of all live method activations (i.e., the stack) At the machine level, we also consider the register set (usually stores local or global variables) Recursively follow all pointers from the root set to find reachable data

CMSC Mark and Sweep GC Idea: Only objects reachable from stack could possibly be live –Every so often, stop the world and do GC: Mark all objects on stack as live Until no more reachable objects, –Mark object reachable from live object as live Deallocate any non-live objects This is a tracing garbage collector Does not handle fragmentation problem

CMSC Mark and Sweep Example stack

CMSC Mark and Sweep Example stack

CMSC Mark and Sweep Example stack

CMSC Mark and Sweep Example stack

CMSC Mark and Sweep Example stack

CMSC Mark and Sweep Example stack

CMSC Mark and Sweep Example stack

CMSC Tradeoffs with Mark and Sweep

CMSC Tradeoffs with Mark and Sweep Pros: –No problem with cycles –Memory writes have no cost Cons: –Fragmentation Available space broken up into many small pieces –Thus many mark-and-sweep systems may also have a compaction phase (like defragmenting your disk) –Cost proportional to heap size Sweep phase needs to traverse whole heap – it touches dead memory to put it back on to the free list –Not appropriate for real-time applications You wouldn’t like your auto’s braking system to stop working for a GC while you are trying to stop at a busy intersection

CMSC Stop and Copy GC Like mark and sweep, but only touches live objects –Divide heap into two equal parts (semispaces) –Only one semispace active at a time –At GC time, flip semispaces Trace the live data starting from the stack Copy live data into other semispace Declare everything in current semispace dead; switch to other semispace

CMSC Stop and Copy Example stack

CMSC Stop and Copy Example stack ① ①

CMSC Stop and Copy Example stack ① ① ② ②

CMSC Stop and Copy Example stack ① ① ② ② ③ ③

CMSC Stop and Copy Tradeoffs Pros: –Only touches live data –No fragmentation; automatically compacts Will probably increase locality Cons: –Requires twice the memory space –Like mark and sweep, need to “stop the world” Program must stop running to let garbage collector move around data in the heap

CMSC The Generational Principle Object lifetime increases  More objects live  “Young objects die quickly; old objects keep living”

CMSC Using The Generational Principle Some objects live longer –For instance, there are typically some objects allocated at initialization that live until the process exits. Some objects live shorter –For instance, loop variables don't last long at all. Between these two extremes are objects that live for the duration of some intermediate computation (the “lump”). Many applications have this general shape. Focus on the fact that a majority of objects "die young".

CMSC Generational Collection Long lived objects get copied over and over –Idea: Have more than one semispace, divide into generations Older generations collected less often Objects that survive many collections get pushed into older generations Need to track pointers from old to young generations to use as roots for young generation collection GC in Java 2 is based on this idea

CMSC More Issues in GC Stopping the world is a big problem –Unpredictable performance Bad for real-time systems –Need to stop all threads Without a much more sophisticated GC

CMSC More Issues in GC Building a “one-size fits all” solution is hard –Impossible to be optimal for all programs –So correctness and safety come first

CMSC What Does GC Mean to You? Ideally, nothing! –It should make your life easier –And shouldn’t affect performance too much If GC becomes a problem, hard to solve –You can often set parameters of the GC –You can modify your program

CMSC GC Parameters Can resize the generations –How much to use initially, plus max growth Change the total heap size –In terms of an absolute measure –In terms of ratio of free/allocated data For server applications, two common tweaks: –Make the total heap as big as possible –Make the young generation half the total heap

CMSC Bad Ideas (Usually) Calling System.gc() –This is probably a bad idea –You have no idea what the GC will do –And it will take a while Managing memory yourself –Object pools, free lists, object recycling –GC’s have been heavily tuned to be efficient

CMSC Dealing with GC Problems Best idea: Measure where your problems are coming from For Java VM, try running with –-verbose:gc –Prints out messages with statistics when a GC occurs [GC K->83000K(776768K), secs] [GC K->83372K(776768K), secs] [Full GC K->83769K(776768K), secs]

As a language designer... Don’t allow pointer types to mix with integer types –Makes liveness analysis much easier –Also helps with other types of analysis Less trouble with aliasing Discourage top-level allocation –Difficult or impossible to free top-level structures automatically Encourage tight scoping –Easier to make GC decisions CMSC 330