Bell: Bit-Encoding Online Memory Leak Detection Michael D. Bond Kathryn S. McKinley University of Texas at Austin.

Slides:



Advertisements
Similar presentations
Department of Computer Sciences Dynamic Shape Analysis via Degree Metrics Maria Jump & Kathryn S. McKinley Department of Computer Sciences The University.
Advertisements

Garbage collection David Walker CS 320. Where are we? Last time: A survey of common garbage collection techniques –Manual memory management –Reference.
Introduction to Memory Management. 2 General Structure of Run-Time Memory.
Garbage Collection What is garbage and how can we deal with it?
Michael Bond Kathryn McKinley The University of Texas at Austin Presented by Na Meng Most of the slides are from Mike’s original talk. Many thanks go to.
U NIVERSITY OF M ASSACHUSETTS A MHERST Department of Computer Science 2007 Exterminator: Automatically Correcting Memory Errors with High Probability Gene.
Resurrector: A Tunable Object Lifetime Profiling Technique Guoqing Xu University of California, Irvine OOPSLA’13 Conference Talk 1.
Probabilistic Calling Context Michael D. Bond Kathryn S. McKinley University of Texas at Austin.
Precise Detection of Memory Leaks Jonas Maebe, Michiel Ronsse, Koen De Bosschere WODA May 2004 Dammit, Jim. I’m an Eiffel Tower, not a Star Trek.
PRESTO: Program Analyses and Software Tools Research Group, Ohio State University Precise Memory Leak Detection for Java Software Using Container Profiling.
Department of Computer Sciences Cork: Dynamic Memory Leak Detection with Garbage Collection Maria Jump Kathryn S. McKinley
Guoquing Xu, Atanas Rountev Ohio State University Oct 9 th, 2008 Presented by Eun Jung Park.
Hastings Purify: Fast Detection of Memory Leaks and Access Errors.
LOW-OVERHEAD MEMORY LEAK DETECTION USING ADAPTIVE STATISTICAL PROFILING WHAT’S THE PROBLEM? CONTRIBUTIONS EVALUATION WEAKNESS AND FUTURE WORKS.
CORK: DYNAMIC MEMORY LEAK DETECTION FOR GARBAGE- COLLECTED LANGUAGES A TRADEOFF BETWEEN EFFICIENCY AND ACCURATE, USEFUL RESULTS.
SOS: Saving Time in Dynamic Race Detection with Stationary Analysis Du Li, Witawas Srisa-an, Matthew B. Dwyer.
MC 2 : High Performance GC for Memory-Constrained Environments N. Sachindran, E. Moss, E. Berger Ivan JibajaCS 395T *Some of the graphs are from presentation.
380C Where are we & where we are going – Managed languages Dynamic compilation Inlining Garbage collection What else can you do when you examine the heap.
CPSC 388 – Compiler Design and Construction
Free-Me: A Static Analysis for Individual Object Reclamation Samuel Z. Guyer Tufts University Kathryn S. McKinley University of Texas at Austin Daniel.
1 The Compressor: Concurrent, Incremental and Parallel Compaction. Haim Kermany and Erez Petrank Technion – Israel Institute of Technology.
Lecture 36: Programming Languages & Memory Management Announcements & Review Read Ch GU1 & GU2 Cohoon & Davidson Ch 14 Reges & Stepp Lab 10 set game due.
LeakChaser: Helping Programmers Narrow Down Causes of Memory Leaks Guoqing Xu, Michael D. Bond, Feng Qin, Atanas Rountev Ohio State University.
Dynamic Tainting for Deployed Java Programs Du Li Advisor: Witawas Srisa-an University of Nebraska-Lincoln 1.
An Adaptive, Region-based Allocator for Java Feng Qian & Laurie Hendren 2002.
Age-Oriented Concurrent Garbage Collection Harel Paz, Erez Petrank – Technion, Israel Steve Blackburn – ANU, Australia April 05 Compiler Construction Scotland.
1 An Efficient On-the-Fly Cycle Collection Harel Paz, Erez Petrank - Technion, Israel David F. Bacon, V. T. Rajan - IBM T.J. Watson Research Center Elliot.
The Memory Behavior of Data Structures Kartik K. Agaram, Stephen W. Keckler, Calvin Lin, Kathryn McKinley Department of Computer Sciences The University.
Quarantine: A Framework to Mitigate Memory Errors in JNI Applications Du Li , Witawas Srisa-an University of Nebraska-Lincoln.
Precise Memory Leak Detection for Java Software Using Container Profiling Guoqing Xu, Atanas Rountev Program analysis and software tools group Ohio State.
U NIVERSITY OF M ASSACHUSETTS A MHERST Department of Computer Science 2006 Exterminator: Automatically Correcting Memory Errors Gene Novark, Emery Berger.
Tolerating Memory Leaks Michael D. Bond Kathryn S. McKinley.
Michael Bond Kathryn McKinley The University of Texas at Austin.
Exploiting Prolific Types for Memory Management and Optimizations By Yefim Shuf et al.
15-740/ Oct. 17, 2012 Stefan Muller.  Problem: Software is buggy!  More specific problem: Want to make sure software doesn’t have bad property.
Computer Security and Penetration Testing
P ath & E dge P rofiling Michael Bond, UT Austin Kathryn McKinley, UT Austin Continuous Presented by: Yingyi Bu.
Fast Conservative Garbage Collection Rifat Shahriyar Stephen M. Blackburn Australian National University Kathryn S. M cKinley Microsoft Research.
1 Fast and Efficient Partial Code Reordering Xianglong Huang (UT Austin, Adverplex) Stephen M. Blackburn (Intel) David Grove (IBM) Kathryn McKinley (UT.
Dynamic Object Sampling for Pretenuring Maria Jump Department of Computer Sciences The University of Texas at Austin Stephen M. Blackburn.
Free-Me: A Static Analysis for Automatic Individual Object Reclamation Samuel Z. Guyer, Kathryn McKinley, Daniel Frampton Presented by: Dimitris Prountzos.
Chameleon Automatic Selection of Collections Ohad Shacham Martin VechevEran Yahav Tel Aviv University IBM T.J. Watson Research Center Presented by: Yingyi.
Breadcrumbs: Efficient Context Sensitivity for Dynamic Bug Detection Analyses Michael D. Bond University of Texas at Austin Graham Z. Baker Tufts / MIT.
How’s the Parallel Computing Revolution Going? 1How’s the Parallel Revolution Going?McKinley Kathryn S. McKinley The University of Texas at Austin.
Targeted Path Profiling : Lower Overhead Path Profiling for Staged Dynamic Optimization Systems Rahul Joshi, UIUC Michael Bond*, UT Austin Craig Zilles,
Highly Scalable Distributed Dataflow Analysis Joseph L. Greathouse Advanced Computer Architecture Laboratory University of Michigan Chelsea LeBlancTodd.
Drinking from Both Glasses: Adaptively Combining Pessimistic and Optimistic Synchronization for Efficient Parallel Runtime Support Man Cao Minjia Zhang.
380C lecture 19 Where are we & where we are going –Managed languages Dynamic compilation Inlining Garbage collection –Opportunity to improve data locality.
Practical Path Profiling for Dynamic Optimizers Michael Bond, UT Austin Kathryn McKinley, UT Austin.
1 Garbage Collection Advantage: Improving Program Locality Xianglong Huang (UT) Stephen M Blackburn (ANU), Kathryn S McKinley (UT) J Eliot B Moss (UMass),
Targeted Path Profiling : Lower Overhead Path Profiling for Staged Dynamic Optimization Systems Rahul Joshi, UIUC Michael Bond*, UT Austin Craig Zilles,
GARBAGE COLLECTION IN AN UNCOOPERATIVE ENVIRONMENT Hans-Juergen Boehm Computer Science Dept. Rice University, Houston Mark Wieser Xerox Corporation, Palo.
Consider Starting with 160 k of memory do: Starting with 160 k of memory do: Allocate p1 (50 k) Allocate p1 (50 k) Allocate p2 (30 k) Allocate p2 (30 k)
Sampling Dynamic Dataflow Analyses Joseph L. Greathouse Advanced Computer Architecture Laboratory University of Michigan University of British Columbia.
Survey of Tools to Support Safe Adaptation with Validation Alain Esteva-Ramirez School of Computing and Information Sciences Florida International University.
1 GC Advantage: Improving Program Locality Xianglong Huang, Zhenlin Wang, Stephen M Blackburn, Kathryn S McKinley, J Eliot B Moss, Perry Cheng.
Tracking Bad Apples: Reporting the Origin of Null & Undefined Value Errors Michael D. Bond UT Austin Nicholas Nethercote National ICT Australia Stephen.
GC Assertions: Using the Garbage Collector To Check Heap Properties Samuel Z. Guyer Tufts University Edward Aftandilian Tufts University.
1 The Garbage Collection Advantage: Improving Program Locality Xianglong Huang (UT), Stephen M Blackburn (ANU), Kathryn S McKinley (UT) J Eliot B Moss.
MSP’05 1 Gated Memory Control for Memory Monitoring, Leak Detection and Garbage Collection Chen Ding, Chengliang Zhang Xipeng Shen, Mitsunori Ogihara University.
Dynamic Bug Detection & Tolerance Kathryn S McKinley The University of Texas at Austin.
Garbage Collection What is garbage and how can we deal with it?
YAHMD - Yet Another Heap Memory Debugger
Debugging Memory Issues
Cork: Dynamic Memory Leak Detection with Garbage Collection
Concepts of programming languages
CACHETOR Detecting Cacheable Data to Remove Bloat
Jipeng Huang, Michael D. Bond Ohio State University
Strategies for automatic memory management
Garbage Collection What is garbage and how can we deal with it?
Presentation transcript:

Bell: Bit-Encoding Online Memory Leak Detection Michael D. Bond Kathryn S. McKinley University of Texas at Austin

Bugs in Deployed Software Humans rely on software for critical tasks Bugs are costly & risky Software more complex More bugs & harder to fix

Bugs in Deployed Software Humans rely on software for critical tasks Bugs are costly & risky Software more complex More bugs & harder to fix Bugs are a problem in deployed software In-house testing incomplete Performance is critical Focus on space overhead

Why do bug tools want so much space? Store lots of info about the program Correlate program locations (sites) & data Ex: DirectedGraph.java:309 Tag each object with one or more sites Last-use siteHeaderField 2Field 3Field 1Alloc site

Why do bug tools want so much space? Store lots of info about the program Correlate program locations (sites) & data Ex: DirectedGraph.java:309 Tag each object with one or more sites Bug detection applications AVIO tracks last-use site of each object Leak detection reports leaking objects’ sites [JRockit,.NET Memory Profiler, Purify, SWAT, Valgrind] High space overhead if many small objects Last-use siteHeaderField 2Field 3Field 1Alloc site

Why do bug tools want so much space? Store lots of info about the program Correlate program locations (sites) & data Ex: DirectedGraph.java:309 Tag each object with one or more sites Bug detection applications AVIO tracks last-use site of each object Leak detection reports leaking objects’ sites [JRockit,.NET Memory Profiler, Purify, SWAT, Valgrind] High space overhead if many small objects SWAT: 75% space overhead on twolf Last-use siteHeaderField 2Field 3Field 1Alloc site

Site How many bits do we need? HeaderField 2Field 3Field 1

Site How many bits do we need? 32 bits HeaderField 2Field 3Field 1

Site How many bits do we need? 32 bits 20 bits if # sites < 1,000, bits for common case (hot sites) HeaderField 2Field 3Field 1

How many bits do we need? 32 bits 20 bits if # sites < 1,000, bits for common case (hot sites) 1 bit? HeaderField 2Field 3Field 1 Site

How many bits do we need? HeaderField 2Field 3Field 1 Site 32 bits 20 bits if # sites < 1,000, bits for common case (hot sites) 1 bit? One bit loses info about site

1 bit? One bit loses info about site But with many objects… How many bits do we need? ?

Bell: Bit-Encoding Leak Location Stores per-object sites in single bit Reconstructs sites by looking at multiple objects’ bits site

Outline Introduction Memory leaks Bell encoding and decoding Leak detection using Bell Related work

Memory Leaks Memory bugs Memory corruption: dangling refs, buffer overflows Memory leaks Lost objects: unreachable but not freed Useless objects: reachable but not used again

Memory Leaks Memory bugs Memory corruption: dangling refs, buffer overflows Memory leaks Lost objects: unreachable but not freed Useless objects: reachable but not used again Managed Languages 80% of new software in Java or C# by 2010 [Gartner] Type safety & GC eliminate many bugs

Memory Leaks Memory bugs Memory corruption: dangling refs, buffer overflows Memory leaks Lost objects: unreachable but not freed Useless objects: reachable but not used again Managed Languages 80% of new software in Java or C# by 2010 [Gartner] Type safety & GC eliminate many bugs

Memory Leaks Memory bugs Memory corruption: dangling refs, buffer overflows Memory leaks Lost objects: unreachable but not freed Useless objects: reachable but not used again Managed Languages 80% of new software in Java or C# by 2010 [Gartner] Type safety & GC eliminate many bugs Leaks occur in practice in managed languages [Cork, JRockit, JProbe, LeakBot,.NET Memory Profiler]

Outline Introduction Memory leaks Bell encoding and decoding Leak detection using Bell Related work

Bell’s Encoding Function f (, ) = 0 or 1 site object

Bell’s Encoding Function f (, ) = 0 or 1 site object Color indicates site (ex: allocation site)

Bell’s Encoding Function site object f (, ) = 0 or 1 site object f (, ) = 0 or 1 may match

Bell’s Encoding Function may match site object f (, ) = 0 or 1 object f (, ) = 0 or 1 Probability of match is ½  unbiased function site

How do we find leaking sites? Problem: leaking objects with unknown allocation sites

How do we find leaking sites? site object f (, ) Solution: for each site, see how many objects it matches

yes How do we find leaking sites? site object f (, ) Site matches all objects it allocated

no yes no yes no How do we find leaking sites? site object f (, ) Site matches all objects it allocated Site matches ~50% objects it didn’t allocate

no yes no yes no How do we find leaking sites? site object f (, ) Site matches all objects it allocated Site matches ~50% objects it didn’t allocate yes matches ≈ allocObjs + ½ (leakingObjs - allocObjs)

no yes no yes no How do we find leaking sites? site object f (, ) yes matches ≈ allocObjs + ½ (leakingObjs - allocObjs) allocObjs ≈ 2 x matches - leakingObjs

no yes no yes no How do we find leaking sites? site object f (, ) yes matches ≈ allocObjs + ½ (leakingObjs - allocObjs) allocObjs ≈ 2 x matches - leakingObjs 6

no yes no yes no How do we find leaking sites? site object f (, ) yes matches ≈ allocObjs + ½ (leakingObjs - allocObjs) allocObjs ≈ 2 x matches - leakingObjs 69

no yes no yes no How do we find leaking sites? site object f (, ) yes matches ≈ allocObjs + ½ (leakingObjs - allocObjs) allocObjs ≈ 2 x matches - leakingObjs 69 3

Bell Decoding foreach possible matches  0 foreach potentially leaking if f (, ) = ’s site bit matches  matches + 1 allocObjs = 2 x matches – leakingObjs if allocObjs > threshold(leakingObjs) print is the site for allocObjs objects site object site

Bell Decoding foreach possible matches  0 foreach potentially leaking if f (, ) = ’s site bit matches  matches + 1 allocObjs = 2 x matches – leakingObjs if allocObjs > threshold(leakingObjs) print is the site for allocObjs objects site object site

Bell Decoding foreach possible matches  0 foreach potentially leaking if f (, ) = ’s site bit matches  matches + 1 allocObjs = 2 x matches – leakingObjs if allocObjs > threshold(leakingObjs) print is the site for allocObjs objects site object site

Bell Decoding foreach possible matches  0 foreach potentially leaking if f (, ) = ’s site bit matches  matches + 1 allocObjs = 2 x matches – leakingObjs if allocObjs > threshold(leakingObjs) print is the site for allocObjs objects site object site Threshold avoids reporting sites that allocated no objects (false positives)

Bell Decoding foreach possible matches  0 foreach potentially leaking if f (, ) = ’s site bit matches  matches + 1 allocObjs = 2 x matches – leakingObjs if allocObjs > threshold(leakingObjs) print is the site for allocObjs objects site object site Decoding misses sites that allocated few objects (false negatives) Threshold avoids reporting sites that allocated no objects (false positives)

Bell Decoding foreach possible matches  0 foreach potentially leaking where is possible if f (, ) = ’s site bit matches  matches + 1 allocObjs = 2 x matches – leakingObjs if allocObjs > threshold(leakingObjs) print is the site for allocObjs objects site object site Dynamic type check narrows possible objects

Outline Introduction Memory leaks Bell encoding and decoding Leak detection using Bell Related work

Sleigh Bell encodes allocation and last-use sites Stale objects  potential leaks [SWAT] Periodic decoding of highly stale objects Leak Detection using Bell

Sleigh Bell encodes allocation and last-use sites Stale objects  potential leaks [SWAT] Periodic decoding of highly stale objects Implementation in Jikes RVM Find leaks in Eclipse and SPEC JBB2000 Leak Detection using Bell

0101 o.allocSite o.lastUseSite o.staleness

0101 o.allocSite o.lastUseSite o.staleness Leak Detection using Bell No space overhead since four free bits in object header

0101 Maintaining Sleigh’s Bits o.allocSite o.lastUseSite o.staleness // Object allocation: s 1 : o = new MyObject();

0101 Maintaining Sleigh’s Bits o.allocSite o.lastUseSite o.staleness // Object allocation: s 1 : o = new MyObject(); // Instrumentation: o.allocSite = f(s 1, o);

0101 Maintaining Sleigh’s Bits o.allocSite o.lastUseSite o.staleness // Object allocation: s 1 : o = new MyObject(); // Instrumentation: o.allocSite = f(s 1, o); // Object use: s 2 : tmp = o.f;

0101 Maintaining Sleigh’s Bits o.allocSite o.lastUseSite o.staleness // Object allocation: s 1 : o = new MyObject(); // Instrumentation: o.allocSite = f(s 1, o); // Object use: s 2 : tmp = o.f; // Instrumentation: o.lastUseSite = f(s 2, o); o.staleness = 0;

// Object allocation: s 1 : o = new MyObject(); // Instrumentation: o.allocSite = f(s 1, o); // Object use: s 2 : tmp = o.f; // Instrumentation: o.lastUseSite = f(s 2, o); o.staleness = 0; site object The Encoding Function f (, ) := bit 31 ( х )

// Object allocation: s 1 : o = new MyObject(); // Instrumentation: o.allocSite = f(s 1, o); // Object use: s 2 : tmp = o.f; // Instrumentation: o.lastUseSite = f(s 2, o); o.staleness = 0; site object The Encoding Function f (, ) := bit 31 ( х х ) object

// Object allocation: s 1 : o = new MyObject(); // Instrumentation: o.allocSite = f(s 1, o); // Object use: s 2 : tmp = o.f; // Instrumentation: o.lastUseSite = f(s 2, o); o.staleness = 0; site object Object Movement Restrictions f (, ) := bit 31 ( х х ) object Objects may not move (Mostly) non-moving collector Mark-sweep Generational mark-sweep C and C++ do not move objects

// Object allocation: s 1 : o = new MyObject(); // Instrumentation: o.allocSite = f(s 1, o); // Object use: s 2 : tmp = o.f; // Instrumentation: o.lastUseSite = f(s 2, o); o.staleness = 0; site object Sleigh’s Time Overhead f (, ) := bit 31 ( х х ) object DaCapo [Blackburn et al. ’06] SPEC JBB2000 SPEC JVM98

// Object allocation: s 1 : o = new MyObject(); // Instrumentation: o.allocSite = f(s 1, o); // Object use: s 2 : tmp = o.f; // Instrumentation: o.lastUseSite = f(s 2, o); o.staleness = 0; site object Sleigh’s Time Overhead f (, ) := bit 31 ( х х ) object 29% time overhead (11% with adaptive profiling) DaCapo [Blackburn et al. ’06] SPEC JBB2000 SPEC JVM98

Finding and Fixing Leaks Leaks in Eclipse and SPEC JBB2000

Data structures leak Finding and Fixing Leaks

Leaks in Eclipse and SPEC JBB2000 Data structures leak Finding and Fixing Leaks

Leaks in Eclipse and SPEC JBB2000 Data structures leak Most interesting: stale roots Finding and Fixing Leaks

Leaks in Eclipse and SPEC JBB2000 Data structures leak Most interesting: stale roots many Finding and Fixing Leaks

Leaks in Eclipse and SPEC JBB2000 Data structures leak Most interesting: stale roots few many Finding and Fixing Leaks

Leaks in Eclipse and SPEC JBB2000 Data structures leak Most interesting: stale roots Need significant number of stale data structures Finding and Fixing Leaks

Leaks in Eclipse and SPEC JBB2000 Data structures leak Most interesting: stale roots Sleigh’s output directly useful for fixing leaks Finding and Fixing Leaks

Bell Decoding Again foreach possible matches  0 foreach potentially leaking where is possible and is root of stale data structure if f (, ) = ’s site bit matches  matches + 1 allocObjs = 2 x matches – leakingObjs if allocObjs > threshold(leakingObjs) print is the site for allocObjs objects site object site object site object Consider roots of stale data structures only

Leak detectors store per-object sites [JRockit,.NET Memory Profiler, Purify, SWAT, Valgrind] Sampling [Jump et al. ’04] Trades accuracy for lower overhead (like Bell) Adds some overhead; requires conditional instrumentation No encoding or decoding Communication complexity & information theory Related Work

Summary Bell encodes sites in a single bit and decodes sites using multiple objects’ bits Leak detection with low overhead site

Thank You Questions?