Tolerating Memory Leaks Michael D. Bond Kathryn S. McKinley.

Presentation transcript:

Tolerating Memory Leaks Michael D. Bond Kathryn S. McKinley

Bugs in Deployed Software
Deployed software fails
◦ Different environment and inputs → different behaviors
Greater complexity & reliance
Memory leaks are a real problem
Fixing leaks is hard

Memory Leaks in Deployed Systems
Memory leaks are a real problem
◦ Managed languages do not eliminate them
◦ Slow & crash real programs
◦ Unacceptable for some applications
Fixing leaks is hard
◦ Leaks take time to materialize
◦ Failure far from cause
[Diagram: heap objects classified as Live, Reachable, and Dead]

Example
Driverless truck
◦ 10,000 lines of C#
Leak: past obstacles remained reachable
No immediate symptoms
“This problem was pernicious because it only showed up after 40 minutes to an hour of driving around and collecting obstacles.”
Quick “fix”: restart after 40 minutes
Environment sensitive
◦ More obstacles in deployment
◦ Unresponsive after 28 minutes

Uncertainty in Deployed Software
Unknown leaks; unexpected failures
Online leak diagnosis helps
◦ Too late to help failing systems
Also tolerate leaks
◦ Illusion of fix
◦ Eliminate bad effects: don’t slow, don’t crash
◦ Preserve semantics: defer out-of-memory (OOM) errors

Predicting the Future
Dead objects → not used again
Highly stale objects → likely leaked [Chilimbi & Hauswirth ’04] [Qin et al. ’05] [Bond & McKinley ’06]
[Diagram: heap objects classified as Live, Reachable, and Dead]

Tolerating Leaks with Melt
Move highly stale objects to disk
◦ Much larger than memory
◦ Time & space proportional to live memory
◦ Preserve semantics
[Diagram: in-use objects and stale objects]

Sounds like Paging!
Paging insufficient for managed languages
◦ Need object granularity
◦ GC’s working set is all reachable objects
Bookmarking collection [Hertz et al. ’05]

Challenge #1: How does Melt identify stale objects?
[Diagram: roots and heap objects A through F]
GC:
    for all fields a.f
        a.f |= 0x1;
Application:
    b = a.f;
    if (b & 0x1) {
        b &= ~0x1;
        a.f = b;   [atomic]
    }
Adds 6% to application time
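The two pseudocode fragments on this slide can be simulated in a few lines. The sketch below is only an illustration, not Melt's implementation: it models references as word-aligned integers whose low bit is free, and the names `Obj`, `gc_mark_all_stale`, `read_barrier`, and `stale_references` are hypothetical. The collector tags every reference; the application's read barrier clears the tag on use; whatever is still tagged at the next collection has not been used in between.

```python
STALE_BIT = 0x1  # low bit of a word-aligned reference, as in the slide's pseudocode


class Obj:
    """Hypothetical heap object; `fields` maps a field name to a tagged address (0 = null)."""

    def __init__(self, addr):
        self.addr = addr      # word-aligned, so the low bit is available as a tag
        self.fields = {}


def gc_mark_all_stale(heap):
    """GC side: set the stale bit on every non-null reference field."""
    for obj in heap.values():
        for name, ref in list(obj.fields.items()):
            if ref != 0:
                obj.fields[name] = ref | STALE_BIT


def read_barrier(obj, name):
    """Application side: load a reference and clear its stale bit on first use."""
    ref = obj.fields[name]
    if ref & STALE_BIT:
        ref &= ~STALE_BIT
        obj.fields[name] = ref    # the slide marks this store as [atomic]
    return ref


def stale_references(heap):
    """References still tagged have not been read since the last marking pass."""
    return {
        (obj.addr, name)
        for obj in heap.values()
        for name, ref in obj.fields.items()
        if ref & STALE_BIT
    }
```

In this model the barrier costs one bit test on every reference load, which is the kind of per-read work behind the overhead figure quoted on the slide.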

Stale Space
Heap nearly full → move stale objects to disk
[Diagram: roots, in-use space, and stale space; objects A through F]

Challenge #2: How does Melt maintain pointers?
[Diagram: roots, in-use space, and stale space; objects A through F]

Stub-Scion Pairs
[Diagram: roots, in-use space, stale space, and scion space; objects A through F, a B stub, a B scion, and a scion table entry B → B scion]
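A minimal sketch of the stub-scion idea, under stated assumptions: the slides show only the diagram labels, so the classes `Scion`, `Stub`, and `ScionTable`, and the policy of one scion per in-use object, are illustrative choices rather than Melt's actual data structures. The point it tries to show is that a reference from a stale object to an in-use object goes through an in-memory scion, so the collector can treat scions as roots and never has to follow pointers out of the stale space.

```python
class Scion:
    """In-memory proxy for an in-use object that stale objects refer to."""

    def __init__(self, target):
        self.target = target   # a moving collector would update only this slot


class Stub:
    """Stands in for a direct pointer from the stale space into the in-use space."""

    def __init__(self, scion):
        self.scion = scion

    def resolve(self):
        return self.scion.target


class ScionTable:
    """Maps an in-use object to its scion so that each object gets at most one."""

    def __init__(self):
        self._scions = {}      # keyed by object identity (default hash)

    def stub_for(self, target):
        scion = self._scions.get(target)
        if scion is None:
            scion = Scion(target)
            self._scions[target] = scion
        return Stub(scion)

    def roots(self):
        """In-use objects kept alive on behalf of the stale space."""
        return [scion.target for scion in self._scions.values()]
```

With this layout, two stale objects that both point at B share one scion, which matches the single table entry B → B scion in the diagram.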

Scion-Referenced Object Becomes Stale
[Diagram: before, the scion table maps B → B scion; after B becomes stale, the B scion and its table entry are gone and the B stub remains]

Challenge #3: What if the program accesses a highly stale object?
[Diagram: roots, in-use space, stale space, scion space, and scion table; objects A through F and a B stub]

Application Accesses Stale Object
    b = a.f;
    if (b & 0x1) {
        b &= ~0x1;
        if (inStaleSpace(b))
            b = activate(b);
        a.f = b;   [atomic]
    }
[Diagram: after the access, the heap also holds a C stub, a C scion, and a scion table entry C → C scion]
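The extended read barrier can be exercised with a small simulation. This is a sketch under assumptions, not the Jikes RVM code: addresses are plain integers, `STALE_BASE` is a made-up boundary that puts the stale space at high addresses, and a `forward` dictionary stands in for the stub and scion left behind when an object is activated. Only `inStaleSpace` and `activate` come from the slide; everything else is hypothetical.

```python
STALE_BIT = 0x1
STALE_BASE = 0x1000_0000   # hypothetical: addresses at or above this are in the stale space


class Heap:
    def __init__(self):
        self.in_use = {}     # address -> fields of an in-memory object
        self.stale = {}      # address -> fields (stands in for the on-disk stale space)
        self.forward = {}    # stale address -> in-use address (stands in for stub + scion)
        self._next = 8       # next free word-aligned in-use address

    def alloc(self, fields=None):
        addr = self._next
        self._next += 8
        self.in_use[addr] = dict(fields or {})
        return addr

    def in_stale_space(self, addr):
        return addr >= STALE_BASE

    def activate(self, addr):
        """Copy a stale object back into the in-use space and return its new address."""
        if addr in self.forward:            # already activated through another reference
            return self.forward[addr]
        new_addr = self.alloc(self.stale[addr])
        self.forward[addr] = new_addr       # the old location now only forwards
        return new_addr

    def read(self, obj_addr, name):
        """Read barrier from the slide, including the stale-space check."""
        fields = self.in_use[obj_addr]
        ref = fields[name]
        if ref & STALE_BIT:
            ref &= ~STALE_BIT
            if self.in_stale_space(ref):
                ref = self.activate(ref)
            fields[name] = ref              # the slide marks this store as [atomic]
        return ref
```

Activation is lazy in this model: references inside the copied object keep their stale tags, so objects they point to are brought back only if the program actually reads them.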

Implementation
Integrated into Jikes RVM
◦ Works with any tracing collector
◦ Evaluation uses generational copying collector
[Diagram: a 32-bit, 2 GB space and a 64-bit, 120 GB space, linked by mapping stubs]

Performance Evaluation
Methodology
◦ DaCapo, SPECjbb2000, SPECjvm98
◦ Dual-core Pentium 4
◦ Deterministic execution (replay)
Results
◦ 6% overhead (read barriers)
◦ Stress test: still 6% overhead
  ◦ Speedups in tight heaps (reduced GC workload)

Tolerating Leaks

Eclipse Diff: Reachable Memory

Eclipse Diff: Performance

Related Work
Managed [LeakSurvivor, Tang et al. ’08] [Panacea, Goldstein et al. ’07, Breitgand et al. ’07]
◦ Don’t guarantee time & space proportional to live memory
Native [Cyclic memory allocation, Nguyen & Rinard ’07] [Plug, Novark et al. ’08]
◦ Different challenges & opportunities
◦ Less coverage or change semantics
Orthogonal persistence & distributed GC
◦ Barriers, swizzling, object faulting, stub-scion pairs

Conclusion
Finding bugs before deployment is hard
Online diagnosis helps developers
Help users in meantime
Tolerate leaks with Melt: illusion of fix
◦ Time & space proportional to live memory
◦ Preserve semantics
Buys developers time to fix leaks
Thank you!

Backup

Triggering Melt
[State diagram: states INACTIVE (start), MARK STALE, MOVE & MARK STALE, and WAIT; transitions labeled “Heap not nearly full”, “Heap full or nearly full”, “Expected heap fullness”, “Unexpected heap fullness”, and “After marking”]

Conclusion
Finding bugs before deployment is hard
Online diagnosis helps developers
To help users in meantime, tolerate bugs
Tolerate leaks with Melt: illusion of fix

Related Work: Tolerating Bugs
Nondeterministic errors [Atom-Aid] [DieHard] [Grace] [Rx]
◦ Memory corruption: perturb layout
◦ Concurrency bugs: perturb scheduling
General bugs
◦ Ignore failing operations [FOC]
◦ Need higher level, more proactive approaches

Melt’s GC Overhead