Presentation is loading. Please wait.

Presentation is loading. Please wait.

Department of Computer Sciences Cork: Dynamic Memory Leak Detection with Garbage Collection Maria Jump Kathryn S. McKinley

Similar presentations


Presentation on theme: "Department of Computer Sciences Cork: Dynamic Memory Leak Detection with Garbage Collection Maria Jump Kathryn S. McKinley"— Presentation transcript:

1 Department of Computer Sciences Cork: Dynamic Memory Leak Detection with Garbage Collection Maria Jump Kathryn S. McKinley {mjump,mckinley}@cs.utexas.edu

2 Department of Computer Sciences 11-Dec-2006UMCP2  Best case : increases GC workload  Worst case: systematic heap growth causes crash after days of execution A memory leak in a garbage-collected language occurs when a program inadvertently maintains references to objects that it no longer needs, preventing the collector from reclaiming space. Cork accurately pinpoints systematic heap growth completely online

3 Department of Computer Sciences 11-Dec-2006UMCP3 Cork’s Solution 1. Summarize heap growth by calculating type points-from graph  Piggybacks on full-heap object scan  Summarizes the heap by type 2. Interpret the summarization using differencing 3. Generate debugging reports  Candidate Report  Slice Report  Allocation Site Report

4 Department of Computer Sciences 11-Dec-2006UMCP4 3 Type Points-From Graph Heap 3 4 2 2 1 4 1 1 1 TPFG =instance =type =HashTable =Queue =PQueue =Company =People

5 Department of Computer Sciences 11-Dec-2006UMCP5 1 2 Differencing TPFGs 1 2 1 2 2 3 2 1 TPFG i+1 TPFG i+2 1 4 1 1 2 1 2 3 3 4 TPFG i 1 1 1 1 1 1 1 1 1

6 Department of Computer Sciences 11-Dec-2006UMCP6 1 4 1 1 4 1 1 1 4  Rank growing nodes  Rank all growing nodes  Designate node as a candidate if Finding Growth (SRT)

7 Department of Computer Sciences 11-Dec-2006UMCP7 Reported Candidates # of Candidates SRT jess SPECjbb fop

8 Department of Computer Sciences 11-Dec-2006UMCP8 1 4 1 1 4 1 1 1 4  Find nodes that are growing  Rank all growing nodes  Designate node as a candidate if Finding Growth (RRT)

9 Department of Computer Sciences 11-Dec-2006UMCP9 Reported Candidates # of Candidates SRT RRT jess SPECjbb fop

10 Department of Computer Sciences 11-Dec-2006UMCP10 Finding Data Structure  Type is not enough Growing edges identify the data structure Rank edges  Calculate a slice from each candidate Set of all paths ( n 0 … n n ) such that “Sees” beyond non-candidate nodes 1 1 1 1 1 4 1 4 1 4

11 Department of Computer Sciences 11-Dec-2006UMCP11 Implementation and Methodology  Jikes RVM with MMTk  Benchmarks: SPECjvm98, DaCapo, SPECjbb2000 Eclipse 3.1.2  Garbage collector Generational with 4MB bounded nursery  For performance, report application only Replay compilation 2 nd run methodology

12 Department of Computer Sciences 11-Dec-2006UMCP12 Efficiency and Scalability  Node/type data stored in type information block (TIB) adding 5 words 1 word for type volume and edge list pointer for each of the previous 4 collections 1 word for # of phases (p)  Edge data stored in lists Prune parts of TPFG that are non-growing

13 Department of Computer Sciences 11-Dec-2006UMCP13 Space Overhead jessEclipse Geomean # of types bm+VM174433651747 TPFG avg318667334 TPFG max319775346 # of edges TPFG avg8444090904 TPFG max86175851142 % pruned66%42%60% Increased Alloc %0.094%0.167%0.233% 19% 2.7X 0.233%

14 Department of Computer Sciences 11-Dec-2006UMCP14 Time Overhead Normalized Total Time Heap Size Relative to Minimum

15 Department of Computer Sciences 11-Dec-2006UMCP15 Benchmarks on Cork  Cork identified: Systematic heap growth Growing types Growing data structure  Analysis: fop – application design jess – memory leak SPECjbb2000 – memory leak SPECjbb fop jess

16 Department of Computer Sciences 11-Dec-2006UMCP16 SPECjbb2000 Heap Occupancy (MB) Time (MB of allocation)

17 Department of Computer Sciences 11-Dec-2006UMCP17 Slice Diagram: SPECjbb2000 Order Orderline Date NewOrder Object[] longBTreeNode longBTree longStaticBTree Types:1663 (71) Nodes:318 Edges:904 Candidate Non-candidate

18 Department of Computer Sciences 11-Dec-2006UMCP18 SPECjbb2000 Heap Occupancy (MB) Time (MB of allocation)

19 Department of Computer Sciences 11-Dec-2006UMCP19 Eclipse 3.1.2 on Cork  IDE  Big, complex, and open-source  Bug repository details known memory leaks and how to reproduce them #115789: Memory Leak Comparing 2 source trees or jar files Manually repeat while running Cork

20 Department of Computer Sciences 11-Dec-2006UMCP20 Eclipse 115789 Heap Occupancy (MB) Time (MB of allocation)

21 Department of Computer Sciences 11-Dec-2006UMCP21 Slice Diagram: Eclipse 115789 Path FolderFile ResourceCompareInput$ FilteredBufferedResourceNode ArrayList Object[] ListenerList RuleBasedCollator ResourceCompareInput$ MyDiffNode HashMap$ HashEntry HashMap$ HashEntry[] HashMap HashMap$ HashIterator ResourceCompareInput ElementTree$ ChildIDsCache ElementTree Types:3365 (1773) Nodes:667 Edges:4090 Candidate Non-candidate

22 Department of Computer Sciences 11-Dec-2006UMCP22 Eclipse 115789 Heap Occupancy (MB) Time (MB of allocation)

23 Department of Computer Sciences 11-Dec-2006UMCP23 Slice Diagram: Eclipse 115789 Path FolderFile ResourceCompareInput$ FilteredBufferedResourceNode ArrayList Object[] ListenerList RuleBasedCollator ResourceCompareInput$ MyDiffNode HashMap$ HashEntry HashMap$ HashEntry[] HashMap HashMap$ HashIterator ResourceCompareInput ElementTree$ ChildIDsCache ElementTree Types:3365 (1773) Nodes:667 Edges:4090 Candidate Non-candidate

24 Department of Computer Sciences 11-Dec-2006UMCP24 Eclipse 115789 Heap Occupancy (MB) Time (MB of allocation)

25 Department of Computer Sciences 11-Dec-2006UMCP25 Cork’s Contributions  Very low-overhead technique <0.5% space overhead ~2% time overhead  Accurately identifies Systematic heap growth Data structure containing the growth  First mechanism for detecting memory leaks in production systems

26 Department of Computer Sciences 11-Dec-2006UMCP26 Thank You! mjump@cs.utexas.edu http://www.cs.utexas.edu/~mjump

27 Department of Computer Sciences 11-Dec-2006UMCP27 Second Run Methodology  Replay compilation Profiling runs chooses hot methods Deterministically applies optimizing compiler Mixture of optimized & unoptimized code  Measure 2 nd run First run applies replay compilation Turn off compilation Flush compiler objects from heap Measure second run

28 Department of Computer Sciences 11-Dec-2006UMCP28 Gartner Report predicts that by 2010, 80% of all new software will be in Java or C# C++Java Execution efficiencyDeveloper productivity Trusts the programmerProtects the programmer Arbitrary memory access possible Memory access only through objects Can arbitrarily override typesType safety Procedural or object-orientedObject-oriented Operator overloadingMeaning of operators immutable Powerful capabilities of language Feature-rich, easy-to-use standard library Explicit memory control Automatic memory management [Wikipedia: Comparison of Java and C++, Dec 2006]

29 Department of Computer Sciences 11-Dec-2006UMCP29 Panacea for Bugs?  PMD, FindBugs, JLint, …  ESC/Java, Bandera, …  HPROF, JProbe, HAT, Leakbot, … Microsoft reports that, even in C#, 75% of development time is spent in debugging  Provide a good start  Programs still ship with memory and semantic errors

30 Department of Computer Sciences 11-Dec-2006UMCP30 My Research Focus PROBLEM: Dynamically detect statistical and anomalous per-object behavior 5in production systems Low overhead and high accuracy SOLUTION: Exploit GC and underlying runtime system Focus only on interesting objects Find ways to summarize object properties

31 Department of Computer Sciences 11-Dec-2006UMCP31 Outline  Motivation: Programs have bugs  Cork: Dynamic Memory Leak Detection for Garbage-Collected Languages Summarize using a type points-from graph Interpret the summarization Find memory leaks with Cork  How to focus only on interesting objects  Heap summarization with focus  Conclusions and future work

32 Department of Computer Sciences 11-Dec-2006UMCP32 Memory-Related Bugs with GC  Lost Pointer : lose pointer to memory before freeing  Dangling Pointer : de-referencing pointer to memory previously freed  Unnecessary Reference : keeping pointer to memory no longer needed   Objects are live, can not reclaim  Object is live  Reclaims automatically

33 Department of Computer Sciences 11-Dec-2006UMCP33 Heap Occupancy Graph Heap Occupancy (MB) Time (MB of allocation)

34 Department of Computer Sciences 11-Dec-2006UMCP34 Related Work  Offline Techniques: Static analysis [Heine et al. 03] Heap differencing [JProbe, DePauw et al. 98, 99, 00] Allocation and/or usage tracking [OptimizeIt, Rationale, Purify, HAT, HPROF, Shaham et al. 00]  Online Techniques: Leakbot (partially online) [Mitchell et al. 03] Adaptive usage tracking [Chilimbi et al. 04, Bond et al. 06] Cork accurately pinpoints systematic heap growth completely online

35 Department of Computer Sciences 11-Dec-2006UMCP35 Outline  Motivation: Programs have bugs  Cork: Dynamic Memory Leak Detection for Garbage-Collected Languages Summarize using a type points-from graph Interpret the summarization Find memory leaks with Cork  How to focus only on interesting objects  Heap summarization with focus  Conclusions and future work

36 Department of Computer Sciences 11-Dec-2006UMCP36 What do we know?  Objects have special properties Lifetime, allocation site, last-use site, calling context, thread usage, etc.  Tracking individual object properties is useful for debugging  Can use dynamic object sampling to gather fine-grained object statistics at very low overhead [Jump et al. 04]

37 Department of Computer Sciences 11-Dec-2006UMCP37 Dynamic Object Sampling  Tag objects with special properties One bit in the header indicates a tag Sample tag encodes object properties  Examples: Allocation site Last-use site Lifetime Which data structure

38 Department of Computer Sciences 11-Dec-2006UMCP38  For example, modify a bump-pointer allocator Dynamic Object Sampling Sample Tag

39 Department of Computer Sciences 11-Dec-2006UMCP39 During Garbage Collection  Gather object statistics Piggyback on object scanning SAMPLE TAG FOUND! 1. Examine tag 2. Collect statistics survivors

40 Department of Computer Sciences 11-Dec-2006UMCP40 Focus DOS Overhead  Sampling every object 12% space overhead 6-7% time overhead  What is interesting depends application Memory leak detection … candidate types Malformed data structures … nodes Dynamic pretenuring … random sampling  Focus only on 6% of objects 0.8% space overhead 2-3% time overhead 6%

41 Department of Computer Sciences 11-Dec-2006UMCP41 DOS in Cork  Encode allocation site and lifetime for candidates <1.3% space overhead, ~4% time overhead Find specific allocation sites causing growth  Future work Encode last-use site in sample tag Requires read/write barrier for candidates Will overhead still be low enough for use in production systems?

42 Department of Computer Sciences 11-Dec-2006UMCP42 Conclusions  Developed synergistic two techniques Dynamic object sampling Points-from graphs  See detailed object characteristics in high-level summarizations  Unique ways to debug software in production systems


Download ppt "Department of Computer Sciences Cork: Dynamic Memory Leak Detection with Garbage Collection Maria Jump Kathryn S. McKinley"

Similar presentations


Ads by Google