Presentation is loading. Please wait.

Presentation is loading. Please wait.

Locating Matching Method Calls by Mining Revision History Data Benjamin Livshits Stanford University and Thomas Zimmermann Saarland University.

Similar presentations


Presentation on theme: "Locating Matching Method Calls by Mining Revision History Data Benjamin Livshits Stanford University and Thomas Zimmermann Saarland University."— Presentation transcript:

1 Locating Matching Method Calls by Mining Revision History Data Benjamin Livshits Stanford University and Thomas Zimmermann Saarland University

2 The Usual Suspects Much bug-detection research in recent years Focus: generic patterns, sometimes language-specific NULL dereferences Security Buffer overruns Format string violations Memory Double-deletes Memory leaks Locking errors/threads Deadlock/races/atomicity Lets look at the space of error patterns in more detail

3 Classification of Error Patterns NULL dereferences Buffer overruns Double-deletes Locking errors/threads Generic patterns -- the usual suspects App-specific patterns particular to a system or a set of APIs Bugs in Linux code Bugs in J2EE servlets Device drivers Error Pattern Iceberg NULL dereferences Buffer overruns Double-deletes Locks/threads …

4 Classification of Error Patterns App-specific patterns particular to a system or a set of APIs Intuition: Many other application-specific patterns exist Much of application-specific stuff remains a gray area so far Goal: Lets figure out what the patterns are Generic patterns -- the usual suspects NULL dereferences Buffer overruns Double-deletes Locking errors/threads Anybody knows any good error patterns specific to WinAmp plugins? ? There are hundreds of WinAmp plugins out there

5 Focus: Matching Method Pairs Start small: Matching method pairs Only two methods A very simple state machine Calls have match perfectly Very common, our inspiration is System calls fopen/fclose lock/unlock … GUI operations addNotify/removeNotify addListener/removeListener createWidget/destroy Widget … Want to find more of the same

6 Our Insight Our problem: Want to find matching method pairs that are error patterns Our technique: Look at revision histories (think CVS) Crucial observation: Use data mining techniques to find method that are often added at the same time Things that are frequently checked in together often form a pattern

7 Our Insight (continued) Now we know the potential patterns Profile the patterns Run the application See how many times each pattern hits – number of times a pattern is followed misses – number of times a pattern is violated Based on this statistics, classify the patterns Usage patterns – almost always hold Error patterns – violated a large number of the times, but still hold most of the time Unlikely patterns – not validated enough times

8 System Architecture mine CVS histories patterns run the application post-process usage patterns error patterns unlikely patterns sort and filter revision history mining dynamic analysis report bugs report patterns reporting instrument relevant method calls

9 Mining Basics Sequence of revisions Files Foo.java, Bar.java, Baz.java, Qux.java Simplification: look at method calls only Look for interesting patterns in the way methods are called o1.addListener o1.removeListener o2.addListener o2.removeListener System.out.println o3.addListener o3.removeListener list.iterator iter.hasNext iter.next o4.addListener System.out.println o4.removeListener Foo.java 1.12 Bar.java 1.47 Baz.java 1.23 Qux.java 1.41 1.42

10 Mining Matching Method Calls Use our observation: Methods that are frequently added simultaneously often represent a use pattern For instance: … addListener(…); … removeListener(…); … o1.addListener o1.removeListener o2.addListener o2.removeListener System.out.println o3.addListener o3.removeListener list.iterator iter.hasNext iter.next o4.addListener System.out.println o4.removeListener Foo.java 1.12 Bar.java 1.47 Baz.java 1.23 Qux.java 1.41 1.42

11 Data Mining Summary We consider method calls added in each check-in We want to find patterns of method calls Too many potential pairs to consider Want to filter and rank them Use support and confidence for that Support and confidence of each pattern Standard metrics used in data mining Support reflects how many times each pair appears Confidence reflects how strongly a particular pair is correlated Refer to the paper for details

12 Improvements Over the Traditional Approach The default data mining approach doesnt really work Filters based on confidence and support Still too many potential patterns! 1.Filtering: Consider only pairs with the same initial subsequence as potential patterns 2.Ranking: Use one-line fixes to find likely error patterns

13 Matching Initial Call Sequences o1.addListener o1.removeListener o2.addListener o2.removeListener System.out.println o3.addListener o3.removeListener list.iterator iter.hasNext iter.next o4.addListener System.out.println o4.removeListener Foo.java 1.12 Bar.java 1.47 Baz.java 1.23 Qux.java 1.41 1.42 1 Pair 3 Pairs 1 Pair 10 Pairs 2 Pairs 1 Pair 0 Pairs 0 Pairs

14 Using Fixes to Rank Patterns Look for one-call additions which likely indicate fixes. Rank pairs for such methods higher. o1.addListener o1.removeListener o2.addListener o2.removeListener System.out.println o3.addListener o3.removeListener list.iterator iter.hasNext iter.next o4.addListener System.out.println o4.removeListener Foo.java 1.12 Bar.java 1.47 Baz.java 1.23 Qux.java 1.41 1.42 This is a fix! Move pairs containing removeListener up

15 Mining Setup Apply these ideas to the revision history of Eclipse Very large open-source project Many people working on it Perform filtering based on Pattern support Pattern strength Get 32 strongly correlated method pairs in Eclipse

16 Some Interesting Method Pairs (1) kEventControlActivatekEventControlDeactivate addDebugEventListenerremoveDebugEventListener beginRuleendRule suspendresume NewPtrDisposePtr addListenerremoveListener registerderegister addElementChangedListenerremoveElementChangedListener addResourceChangeListenerremoveResourceChangeListener addPropertyChangeListenerremovePropertyChangeListener createPropertyListreapPropertyList preReplaceChildpostReplaceChild addWidgetremoveWidget stopMeasuringcommitMeasurements blockSignalunblockSignal HLockHUnlock OpenEventfireOpen …

17 Some Interesting Method Pairs (2) kEventControlActivatekEventControlDeactivate addDebugEventListenerremoveDebugEventListener beginRuleendRule suspendresume NewPtrDisposePtr addListenerremoveListener registerderegister addElementChangedListenerremoveElementChangedListener addResourceChangeListenerremoveResourceChangeListener addPropertyChangeListenerremovePropertyChangeListener createPropertyListreapPropertyList preReplaceChildpostReplaceChild addWidgetremoveWidget stopMeasuringcommitMeasurements blockSignalunblockSignal HLockHUnlock OpenEventfireOpen … Begin applying a thread scheduling rule to a Java thread

18 Some Interesting Method Pairs (3) kEventControlActivatekEventControlDeactivate addDebugEventListenerremoveDebugEventListener beginRuleendRule suspendresume NewPtrDisposePtr addListenerremoveListener registerderegister addElementChangedListenerremoveElementChangedListener addResourceChangeListenerremoveResourceChangeListener addPropertyChangeListenerremovePropertyChangeListener createPropertyListreapPropertyList preReplaceChildpostReplaceChild addWidgetremoveWidget stopMeasuringcommitMeasurements blockSignalunblockSignal HLockHUnlock OpenEventfireOpen … Register/unregister the current widget with the parent display object for subsequent event forwarding

19 Some Interesting Method Pairs (4) kEventControlActivatekEventControlDeactivate addDebugEventListenerremoveDebugEventListener beginRuleendRule suspendresume NewPtrDisposePtr addListenerremoveListener registerderegister addElementChangedListenerremoveElementChangedListener addResourceChangeListenerremoveResourceChangeListener addPropertyChangeListenerremovePropertyChangeListener createPropertyListreapPropertyList preReplaceChildpostReplaceChild addWidgetremoveWidget stopMeasuringcommitMeasurements blockSignalunblockSignal HLockHUnlock OpenEventfireOpen … Add/remove listener for a particular kind of GUI events

20 Some Interesting Method Pairs kEventControlActivatekEventControlDeactivate addDebugEventListenerremoveDebugEventListener beginRuleendRule suspendresume NewPtrDisposePtr addListenerremoveListener registerderegister addElementChangedListenerremoveElementChangedListener addResourceChangeListenerremoveResourceChangeListener addPropertyChangeListenerremovePropertyChangeListener createPropertyListreapPropertyList preReplaceChildpostReplaceChild addWidgetremoveWidget stopMeasuringcommitMeasurements blockSignalunblockSignal HLockHUnlock OpenEventfireOpen … Use OS native locking mechanism for resources such as icons, etc.

21 Dynamically Check the Patterns Home-grown bytecode instrumenter Get a list of matching method pairs Instrument calls to any of the methods to dump parameters Post-processing of the output Process a stream of events Find and count matches and mismatches … o.register(d) … o.deregister(d) … o.deregister(d) matched mismatched ???

22 Experimental Setup Applied our approach to Eclipse One of the biggest Java applications 2,900,000 lines of Java code Included many Eclipse plugins consisting of lower quality code than the core Chose 32 matching method pairs Times: 5 days to fetch and process CVS histories 30 minutes to compute the patterns An hour to instrument Eclipse And we are done!

23 Experimental Summary Pattern classification: 5 are usage patterns 5 are error patterns 5 are unlikely patterns 17 were not hit at runtime Error patterns Resulted in a total of 107 dynamically confirmed bugs Results for a single run of Eclipse

24 A Preview of Coming Attractions… We have a paper in FSE 2005 Describes a tool called DynaMine Provides various extensions: More complex patterns: State machines Grammars More applications analyzed More bugs found


Download ppt "Locating Matching Method Calls by Mining Revision History Data Benjamin Livshits Stanford University and Thomas Zimmermann Saarland University."

Similar presentations


Ads by Google