Precise Memory Leak Detection for Java Software Using Container Profiling Guoqing Xu, Atanas Rountev Program analysis and software tools group Ohio State.

Presentation transcript:

Precise Memory Leak Detection for Java Software Using Container Profiling
Guoqing Xu, Atanas Rountev
Program analysis and software tools group, Ohio State University
Supported by NSF under CAREER grant CCF

Memory Leaks
C: malloc without free
C++: new without delete
Java: garbage-collected language
– Unreachable objects are identified and freed
– How about reachable objects that are not used again?
A Java memory leak can cause serious problems
– Performance degradation due to GC cost
– Crash with OutOfMemoryError
– For long-running enterprise applications with large memory footprint: even small leaks are bad
PRESTO: Program Analyses and Software Tools Research Group, Ohio State University

Finding the Causes of Memory Leaks (1/2)
Compile-time analysis usually does not work
Run-time analysis is preferable, but tricky
– Millions of heap objects at any moment of time
– The statement that finally exhausts the memory has nothing to do with the source of heap growth
Continuous run-time heap analysis looking for suspicious behavior (symptom)
– E.g., a possible symptom is the growing number of objects: “the number of java.util.HashMap$Entry objects keeps growing”
Finding the leak cause, given this symptom

Finding the Causes of Memory Leaks (2/2)
Issue 1: what is a leak symptom?
– Growing number of instances of a type: LeakBot [ECOOP’05]; Cork [POPL’07]
– Staleness (time since last use) of an object: Sleigh [ASPLOS’06]
Issue 2: what is the leak cause?
– Starting with the suspicious objects, traverse backwards the run-time object graph to find the cause
This is all great, but …

What is a Leak Symptom?
A single factor is not enough as the leak symptom
– Growing number of instances may be due to perfectly legitimate useful data
– Staleness does not necessarily mean a leak: e.g., a JFrame window object is never used after creation, but it is not a leak
Other factors: e.g., volume of memory consumed by an object and all objects reachable from it
– E.g., a big container that is not used for a while may be more important than a never-used string

What is the Leak Cause?
From the suspicious objects, traverse backwards the object graph and examine the reference edges
– Very large and complex object graph on the heap
– The programmer is buried under a mountain of data
– How to decide if a reference edge is unnecessary?
– Why does this edge exist at all?
Where exactly in the program code
– was this reference edge created?
– should the edge have been destroyed?
– should the programmer look to find and fix the bug?

Outline
Motivation
New container-based approach
– Key idea
– Generic leak analysis
– Specific leak analysis for Java
Experimental evaluation
– Real-world memory leaks
– Run-time overhead

A New Perspective
Observation: containers are often the leak causes
– Elements are not properly removed from containers
– Many real-world JDK memory leak bugs are caused by misuse of containers
Let’s reverse the traditional diagnosis process
– Start by suspecting that all containers are leaking, and use symptoms to rule out those less likely to leak
– Avoid the effort to search for a cause starting from arbitrary suspicious objects

Our Proposal
Container-centric
– Track container operations
– Assign a confidence value to each container based on its symptoms
– Rank and report containers based on this confidence value
We only consider bugs caused by containers at the first and second levels of the tree

Run-time Leak Confidence Analysis
Generic “analysis template” that can be implemented in different ways
– Later we show a specific implementation for Java
Considers the combined effect of multiple factors
– Memory taken up by an individual container
– Overall memory consumption
– Staleness of a container
Container abstraction: ADT with three operations
– ADD($\sigma$, o) adds object o to container $\sigma$
– GET($\sigma$) retrieves an object from container $\sigma$
– REMOVE($\sigma$, o) removes object o from container $\sigma$

Leaking Region
A time region $[\tau_s, \tau_e]$ in which symptoms occur
– Garbage collection (GC) occurs at $\tau_s$, at $\tau_e$, and several times in between
– The live-memory consumption at these GC events (mostly) keeps increasing from one event to the next
Choice of $\tau_e$
– Offline, post-mortem analysis: the time at which the program ends or an OutOfMemoryError is thrown
– Online, while the program is running: any time when a user wants to generate a report
Choice of $\tau_s$: examine the history of GC events before $\tau_e$ and the live-memory usage at them
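
One way to picture the choice of the region start is a backward scan over the recorded GC events. The sketch below is an assumption-laden illustration, not the rule used in the talk: the class name `LeakingRegion`, the `tolerance` parameter, and the specific "mostly increasing" test (a drop of at most `tolerance` between consecutive events is still accepted) are all hypothetical.

```java
// Hypothetical sketch: choose the start of the leaking region by scanning
// backwards from the last GC event (taken as the end of the region).
final class LeakingRegion {
    /**
     * @param liveBytesAtGc live-memory size observed at each GC event, in time order
     * @param tolerance     assumed slack: a step from event i-1 to event i is accepted
     *                      if memory dropped by at most this fraction of the earlier value
     * @return index of the GC event chosen as the region start, or -1 for an empty history
     */
    static int startIndex(long[] liveBytesAtGc, double tolerance) {
        int s = liveBytesAtGc.length - 1;
        // Extend the region backwards while memory is (mostly) non-decreasing.
        while (s > 0 && liveBytesAtGc[s] >= liveBytesAtGc[s - 1] * (1.0 - tolerance)) {
            s--;
        }
        return s;
    }
}
```

For a history such as 50, 40, 20, 25, 30, 29, 35, 40 with a tolerance of 0.05, the scan accepts the small dip from 30 to 29 and stops at the event with value 20, because the step from 40 down to 20 is too large a drop.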

Memory Usage of a Container
At some GC event in the leaking region
– Find the total memory consumed by all objects reachable from the container
– Relative value: divide by the total live memory at this GC event; get a number $\in [0, 1]$
Memory usage graph:
– X axis: time relative to $\tau_e$
– Y axis: relative memory
Memory contribution $MC(\sigma)$: area under the curve, $\in [0, 1]$

Staleness of a Container
Time since last use [ASPLOS’06]
A new definition in terms of $\sigma$ and o
$SC(o) = (\tau_2 - \tau_1)/(\tau_2 - \tau_0)$
$SC(\sigma)$ is the average $SC(o)$ for objects o in $\sigma$
– A number $\in [0, 1]$
Large value of SC means that many elements are sitting in the container without being used
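
The staleness formula can be sketched in a few lines of Java. The transcript does not define the three timestamps, so the sketch assumes that $\tau_0$ is the time the element was added, $\tau_1$ the time of its last retrieval, and $\tau_2$ the time of its removal (or the end of the leaking region if it was never removed). The class and method names are hypothetical.

```java
// Hypothetical sketch of the staleness contribution SC.
// Assumed timestamp meanings (not stated in the transcript):
//   t0 = ADD time, t1 = last GET time, t2 = REMOVE time (or end of region).
final class Staleness {
    /** SC(o) = (t2 - t1) / (t2 - t0) for a single element. */
    static double scOfElement(double t0, double t1, double t2) {
        double lifetime = t2 - t0;
        if (lifetime <= 0.0) {
            return 0.0; // assumed convention for a zero-length lifetime
        }
        return (t2 - t1) / lifetime;
    }

    /** SC of a container: the average SC(o) over its elements; each row is {t0, t1, t2}. */
    static double scOfContainer(double[][] elementTimes) {
        if (elementTimes.length == 0) {
            return 0.0; // assumed convention for an empty container
        }
        double sum = 0.0;
        for (double[] t : elementTimes) {
            sum += scOfElement(t[0], t[1], t[2]);
        }
        return sum / elementTimes.length;
    }
}
```

Under these assumptions, an element added at time 0, last retrieved at time 2, and removed at time 10 has SC(o) = 8/10 = 0.8, while an element retrieved right up to its removal has SC(o) = 0.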

Putting it All Together
Leaking confidence $LC = SC \times MC^{1-SC} \in [0, 1]$
– If SC or MC increases, LC also increases
Properties
– $MC = 0$ and $SC \in [0, 1] \Rightarrow LC = 0$
– $SC = 0$ and $MC \in [0, 1] \Rightarrow LC = 0$
– $SC = 1$ and $MC \in [0, 1] \Rightarrow LC = 1$
– $MC = 1$ and $SC \in [0, 1] \Rightarrow LC = SC$
Analysis output:
– Containers ranked by their LC value
– ADD/GET call sites ranked by the average staleness of their elements
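
As a quick check of the formula, the sketch below evaluates $LC = SC \times MC^{1-SC}$ directly. The class name `LeakConfidence` is hypothetical. Plugging in the SC and MC values printed in the Sun JDK bug report later in the talk (for example SC = 0.480, MC = 0.855) gives results that agree with the reported LC values up to rounding of the inputs.

```java
// Hypothetical sketch of the leak confidence formula LC = SC * MC^(1 - SC).
final class LeakConfidence {
    static double lc(double sc, double mc) {
        if (sc < 0.0 || sc > 1.0 || mc < 0.0 || mc > 1.0) {
            throw new IllegalArgumentException("SC and MC must lie in [0, 1]");
        }
        // Note: Math.pow(x, 0.0) is 1.0 for any x, so lc(1.0, 0.0) evaluates to 1.0.
        return sc * Math.pow(mc, 1.0 - sc);
    }
}
```

The exponent `1 - SC` makes memory contribution matter less as staleness grows: a container whose elements are never used again (SC = 1) gets full confidence regardless of its size, while a heavily used container (SC near 0) is scored close to 0.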

Outline
Motivation
New container-based approach
– Key idea
– Generic leak analysis
– Specific leak analysis for Java
Experimental evaluation
– Real-world memory leaks
– Run-time overhead

Modeling and Tracking of Containers
    class HashMap {
        Object put(Object key, Object value) {…}
        Object get(Object key) {…}
        Object remove(Object key) {…}
    }
    class Java_util_HashMap {
        static void put_after(int csID, Map map, Object key, Object value, Object result) {
            if (result == null) {
                …
                Recorder.v().record(csID, map, key, …, Recorder.EFFECT_ADD);
            }
        }
    }
    Object result = m.put(a, b);
    Java_util_HashMap.put_after(1234, m, a, b, result);
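
To make the glue-code idea concrete, here is a small self-contained sketch. It is not the tool's actual Recorder: `SimpleRecorder`, `HashMapGlue`, and the `getAfter`/`removeAfter` rules are hypothetical, and only the "a null result from put means ADD" check mirrors the slide.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for the event recorder: stores (call site, effect, element, time).
final class SimpleRecorder {
    enum Effect { ADD, GET, REMOVE }

    static final class Event {
        final int callSiteId;
        final Effect effect;
        final Object element;
        final long timeNanos;

        Event(int callSiteId, Effect effect, Object element, long timeNanos) {
            this.callSiteId = callSiteId;
            this.effect = effect;
            this.element = element;
            this.timeNanos = timeNanos;
        }
    }

    private final List<Event> events = new ArrayList<>();

    void record(int callSiteId, Effect effect, Object element) {
        events.add(new Event(callSiteId, effect, element, System.nanoTime()));
    }

    List<Event> events() {
        return Collections.unmodifiableList(events);
    }
}

// Hypothetical glue methods, called right after the real HashMap operation.
final class HashMapGlue {
    // As on the slide: put counts as ADD only when it returned null (no previous mapping).
    static void putAfter(SimpleRecorder rec, int csId, Object key, Object result) {
        if (result == null) rec.record(csId, SimpleRecorder.Effect.ADD, key);
    }

    // Assumption: a get counts as GET only when it found something.
    static void getAfter(SimpleRecorder rec, int csId, Object key, Object result) {
        if (result != null) rec.record(csId, SimpleRecorder.Effect.GET, key);
    }

    // Assumption: a remove counts as REMOVE only when a mapping was actually removed.
    static void removeAfter(SimpleRecorder rec, int csId, Object key, Object result) {
        if (result != null) rec.record(csId, SimpleRecorder.Effect.REMOVE, key);
    }
}
```

An instrumented call site would then read `Object r = m.put(a, b); HashMapGlue.putAfter(rec, 1234, a, r);`. One caveat of the null check: `HashMap.put` also returns null when the key was previously mapped to a null value, so this test can over-count ADD events for maps that store nulls.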

Tracking of Memory Usage
Approximation of MC
– An object graph traversal thread is launched periodically to calculate the total amount of memory consumed by objects reachable from the container object
– Precision and overhead tradeoff is defined by the interval between two runs of the thread
– Our experience shows that once every 50 GC events is a good compromise
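
The traversal step can be illustrated on a toy object graph. The sketch below does not walk a real JVM heap; `HeapNode` is a hypothetical stand-in for a heap object with a known size and outgoing references, and `ReachableSize` sums the sizes of everything reachable from a container, visiting shared objects and cycles only once.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Hypothetical model of a heap object: a size plus outgoing references.
final class HeapNode {
    final long sizeBytes;
    final List<HeapNode> refs = new ArrayList<>();

    HeapNode(long sizeBytes) {
        this.sizeBytes = sizeBytes;
    }
}

final class ReachableSize {
    /** Total size of all nodes reachable from root, counting each node once. */
    static long reachableBytes(HeapNode root) {
        // Identity-based visited set: two distinct objects are never merged.
        Set<HeapNode> visited = Collections.newSetFromMap(new IdentityHashMap<>());
        Deque<HeapNode> work = new ArrayDeque<>();
        visited.add(root);
        work.push(root);
        long total = 0;
        while (!work.isEmpty()) {
            HeapNode n = work.pop();
            total += n.sizeBytes;
            for (HeapNode next : n.refs) {
                if (visited.add(next)) {
                    work.push(next);
                }
            }
        }
        return total;
    }

    /** Relative memory of a container at one sample: reachable bytes over total live bytes. */
    static double relativeMemory(HeapNode container, long totalLiveBytes) {
        return (double) reachableBytes(container) / totalLiveBytes;
    }
}
```

Running such a traversal is costly on a heap with millions of objects, which is why the talk samples it only once every several GC events.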

Data Analysis
Decide what should be the leaking region
Compute the approximation of MC
– $MC = \sum_i M_{T_i} \times (T_{i+1} - T_i)$
Compute SC
– Scan the Recorder data and remove data entries outside the leaking region
– For each element, find its REMOVE event and its last GET event
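
The MC approximation is a left Riemann sum over the memory samples. The sketch below assumes that the sample times $T_i$ have been normalized so the leaking region spans [0, 1] and that $M_{T_i}$ is the container's relative memory at sample $i$; the class name `MemoryContribution` is hypothetical.

```java
// Hypothetical sketch: MC as the sum of M_{T_i} * (T_{i+1} - T_i).
final class MemoryContribution {
    /**
     * @param times  sample times in ascending order, assumed normalized to [0, 1]
     * @param relMem relative memory of the container at each sample, each in [0, 1]
     */
    static double mc(double[] times, double[] relMem) {
        if (times.length != relMem.length) {
            throw new IllegalArgumentException("times and relMem must have the same length");
        }
        double area = 0.0;
        // Each sample contributes a rectangle that lasts until the next sample.
        for (int i = 0; i + 1 < times.length; i++) {
            area += relMem[i] * (times[i + 1] - times[i]);
        }
        return area;
    }
}
```

With samples at times 0, 0.5, and 1.0 and relative memory 0.2, 0.4, and 0.6, the sum is 0.2 × 0.5 + 0.4 × 0.5 = 0.3; the last sample closes the region and adds no rectangle of its own.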

Evaluation on Real-World Memory Leaks
Java AWT/Swing bugs
– Sun JDK bug # – existed in Java 5, fixed in 6
– Sun JDK bug # – still open in Java 6
SPECjbb bug
The generated reports are precise
– Top-ranked containers are the actual causes of the bugs
– Confidence values for bug-inducing containers and correctly-used containers differ significantly

Sun JDK Bug #
The bug manifests when switching between two Swing applications
According to a developer’s report, it is very hard to track down
We instrumented the entire java.awt and javax.swing packages, and the test case that triggered the bug

Sun JDK Bug #
Container: type: java.util.HashMap (LC: 0.443, SC: 0.480, MC: 0.855)
---cs: javax.swing.RepaintManager:591
Container: type: class java.util.LinkedList (LC: 0.145, SC:0.172, MC: 0.814)
---cs: java.awt.DefaultKeyboardFocusManager:738
Container: type: class javax.swing.JPanel (LC: 0.038, SC:0.044, MC: 0.860)
---cs: javax.swing.JComponent:796

Sun JDK Bug #
Line 591 of javax.swing.RepaintManager
– A GET operation: image = (VolatileImage) volatileMap.get(config);
– The container that is misused is the volatileMap
– This information is sufficient for a developer to locate the bug
Where is the actual bug?
– VolatileImage objects are cached in the map
– Upon a display mode switch, the old configuration object gets invalidated and will not be used again
– But the images are still maintained in the map
Similar bugs exist for # and SPECjbb

Overhead
Compile-time analysis
Dynamic overhead
– Sampling rate: 1/15GC, 1/50GC
– Initial heap size: default, 512M

Overhead for Different Sampling Rates
Y-axis: (NewTime-OldTime)/OldTime
– 1/15GC: 121.2%
– 1/50GC: 87.5%

Overhead for Different Initial Heap Sizes
– Default heap: 177.2%
– 512M heap: 87.5%

Summary
Proposed a container-centric approach
– Tracking all modeled containers
– Computing a leak confidence for each container: memory contribution and staleness contribution
– Can be used for both online and offline diagnosis
Memory leak detection for Java
– Code transformation + run-time profiling
Future work
– Lower overhead (e.g., selective profiling; JVM internals)
– Evaluate other confidence models
– Larger experimental study