Ahmed Y. Tamrawi Electrical and Computer Engineering Department Iowa State University 2011.


1 Ahmed Y. Tamrawi Electrical and Computer Engineering Department Iowa State University 2011

2 Software Bugs { Introduction }
Definition (Software Bug): A common term used to describe a flaw, mistake, or failure in a computer system that produces an incorrect or unexpected result, or causes it to behave in unintended ways.
Bugs can occur in any software, ranging from operating systems and flight autopilot software to a simple arithmetic program.
Software bugs cost about US$60 billion per year.
Figure: The term "bug" (September 9, 1947).
Iowa State University, Fuzzy Set and Cache-based Approach for Bug Triaging

3 More Bugs { Introduction }

4 Bug Repository { Introduction }
Software users and developers report bugs so that software developers can fix them. Bugs are filed as bug reports, which are stored in an issue tracking system or bug repository.
Figure: An interface for a bugs repository (bugs are reported, then stored).

5 Bug Triaging { Introduction }
Definition (Bug Triaging): Assigning a bug to the most appropriate/capable developer who will fix it.
Manual bug triaging is a difficult, expensive, and lengthy process, because the bug triager must manually read and analyze each newly reported bug and assign a fixer to it.

6 Bug Triaging { Introduction }
Figure: New bug reports enter the bugs repository; the bug triager performs bug assignment to software developers.

7 Bug Triaging { Introduction }
Bug triager challenges:
– Knowledge about the system/project;
– Descriptiveness of the bug report;
– Rate of bug reporting;
– Many developers, different projects, and varied expertise.
Why not automate the bug triaging process?
– Improve software quality;
– Reduce cost and time.
Figure: Eclipse, Feb 2011.

8 Example { Motivation }
Technical aspect: Version Control Management (VCM). This aspect concerns the Concurrent Versions System (CVS) repository features and operations within the Eclipse project. Fixer: James Moody.
Bug report 1
Assigned to: James Moody
Summary: New Repository wizard follows implementation model, not user model.
Description: The new CVS Repository Connection wizard's layout is confusing. This is because it follows the implementation model of the order of fields in the full CVS location path rather than the user model...
Bug report 2
Assigned to: James Moody
Summary: Opening repository resources doesn't honor type.
Description: Opening repository resource always open the default text editor and doesn't honor any mapping between resource types and editors. As a result it is not possible to view the contents of an image (*.gif file) in a sensible way....

9 Technical Aspects & Terms { Motivation }
A software system has many technical aspects. Technical aspects are described via the technical terms extracted from software artifacts. A bug report describes issues related to technical aspects via its terms.

10 Automatic Bug Triaging { Motivation }
Key philosophy for automatic bug triaging: The developer(s) with the most bug-fixing capability/expertise with respect to the technical aspect(s) reported in a given bug report should be the fixer(s).

11 Problem Definition { Bugzie Model }
Problem (Automatic Bug Assignment): In a software system, given a bug report B and a set of developers D who have past fixing activity, find the developer(s) with the most fixing expertise with respect to the technical aspect(s) reported in B.
Figure: Bugs repository, new bug report B, software developers.

12 Bugzie Overview { Bugzie Model }
Bugzie treats the problem as a ranking problem.
– State-of-the-art approaches treat it as a classification problem.
For a bug report, Bugzie produces a ranked list of the developers most capable of handling the reported issue(s).

13 Bugzie Overview { Bugzie Model }
Bugzie uses fuzzy set theory to rank the fixing expertise of developers toward the technical aspects. Bugzie models the association between a developer and technical aspects. If a developer has a higher fixing association with a technical aspect, that developer has higher expertise and rank for that aspect.

14 Association of Fixer & Term { Bugzie Model }
Definition (Capable Fixer toward a Term): C_t

15 Association of Fixer & Term { Bugzie Model }
The membership score of a developer d toward a term t is computed from two sets:
D_d: bug reports d has fixed.
D_t: bug reports containing t.
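The formula on this slide is not legible in the transcript. As a minimal sketch, assuming the score is the overlap ratio |D_d ∩ D_t| / |D_d ∪ D_t| (an assumption that fits the two sets defined on the slide, not something the transcript confirms), the score could be computed as below. The function name and the toy report ids are hypothetical.

```python
def membership_score(fixed_by_dev: set, containing_term: set) -> float:
    """Overlap ratio between a developer and a term (assumed formula).

    fixed_by_dev    -- ids of bug reports the developer has fixed (D_d)
    containing_term -- ids of bug reports that contain the term (D_t)
    """
    union = fixed_by_dev | containing_term
    if not union:
        return 0.0  # no evidence at all for this developer/term pair
    return len(fixed_by_dev & containing_term) / len(union)


# Toy data (hypothetical): the developer fixed reports 1-4,
# and the term occurs in reports 3-6, so two reports overlap out of six.
d_d = {1, 2, 3, 4}
d_t = {3, 4, 5, 6}
score = membership_score(d_d, d_t)
```

Under this reading the score is 1 when a developer fixed exactly the reports that mention the term, and 0 when the two sets never overlap.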

16 Association of Fixer & Bug Report { Bugzie Model }
Figure: Bug report B, its terms t_1, t_2, ..., t_n, and C_B.

17 Association of Fixer & Bug Report { Bugzie Model }
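Slides 16 and 17 show the report-level association only as figures. One common fuzzy-set reading, assumed here rather than taken from the transcript, treats C_B as the union of the per-term sets C_t, so a developer's score for B is 1 minus the product of (1 - score) over the terms of B. The sketch below ranks developers that way; all names and scores are hypothetical.

```python
from math import prod


def report_score(term_scores):
    """Fuzzy-union combination (assumed): 1 - prod(1 - mu_t) over the report's terms."""
    return 1.0 - prod(1.0 - s for s in term_scores)


def rank_developers(report_terms, mu):
    """Rank developers for one report.

    mu maps developer -> {term: membership score}; a missing term counts as 0.
    """
    scored = {
        dev: report_score(scores.get(t, 0.0) for t in report_terms)
        for dev, scores in mu.items()
    }
    return sorted(scored, key=scored.get, reverse=True)


# Toy scores (hypothetical developers and terms).
mu = {
    "dev_a": {"cvs": 0.5, "wizard": 0.5},
    "dev_b": {"cvs": 0.2},
}
ranking = rank_developers(["cvs", "wizard"], mu)
```

With this combination, one strongly associated term is enough to give a developer a high score, and adding weakly associated terms never lowers it.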

18 Bugzie Model { Bugzie Model }
Figure: Bugzie workflow with numbered steps: 1 Initial Training (from the bugs repository), 2 Pre-processing (bug report B into terms t_1, t_2, ..., t_n), 3 Recommendation (recommendation list), 4 Bug Report (B), 5 Updating.
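The training and updating steps in this workflow can be supported by simple counters, so that a newly fixed report is absorbed without retraining from scratch. The following is a sketch under stated assumptions: the class and method names are hypothetical, and the score uses the overlap ratio |D_d ∩ D_t| / |D_d ∪ D_t|, which the transcript does not confirm.

```python
from collections import Counter


class FixingHistory:
    """Counts needed to score developer/term associations incrementally."""

    def __init__(self):
        self.n_dev = Counter()       # reports fixed by each developer
        self.n_term = Counter()      # reports containing each term
        self.n_dev_term = Counter()  # reports fixed by a developer that contain a term

    def update(self, developer, terms):
        """Absorb one fixed report (used for initial training and for updating)."""
        self.n_dev[developer] += 1
        for term in set(terms):  # count each term once per report
            self.n_term[term] += 1
            self.n_dev_term[developer, term] += 1

    def membership(self, developer, term):
        """Assumed score: |intersection| / |union|, derived from the counts."""
        both = self.n_dev_term[developer, term]
        either = self.n_dev[developer] + self.n_term[term] - both
        return both / either if either else 0.0


history = FixingHistory()
history.update("dev_a", ["cvs", "wizard"])
history.update("dev_b", ["cvs"])
```

Because only counts are stored, each update costs time proportional to the number of distinct terms in the report.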

19 Bugzie Caching { Bugzie Model }
Fixer candidates selection (Developers Caching).
Significant terms selection (Terms Caching).
Figure: Initial training on the bugs repository feeds the developers cache F(x) and the terms cache T(k).

20 Data Collection { Bugzie Model }
We collected all fixed bug reports from 7 bug repositories. For each bug report, we extracted and merged the summary and description. For each system, we pre-processed these reports: stemming, stop-word removal, etc.
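The pre-processing described on this slide can be illustrated with a small sketch. The stop-word list and the suffix-stripping stemmer below are deliberately naive stand-ins (assumptions for illustration); the transcript does not say which stemmer or stop-word list was actually used.

```python
import re

# Tiny illustrative stop-word list (hypothetical, far from complete).
STOP_WORDS = {"the", "a", "an", "is", "of", "to", "and", "in", "not", "it"}


def naive_stem(word):
    """Strip a few common suffixes; a crude stand-in for a real stemmer."""
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(summary, description):
    """Merge summary and description, tokenize, drop stop words, and stem."""
    text = f"{summary} {description}".lower()
    tokens = re.findall(r"[a-z]+", text)
    return [naive_stem(t) for t in tokens if t not in STOP_WORDS]


terms = preprocess("Opening repository resources", "The editor is not opening")
```

The output is the list of terms that stands for the bug report in later steps.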

21 Locality of Fixing Activity { Bugzie Model }
Figure: Bug reports on a timeline, 2006 to 2010.

22 Locality of Fixing Activity { Bugzie Model }
Hypothesis (Locality of Fixing Activity): Developers who have fixed bugs recently are likely to fix bug reports in the near future.
Figure: Fixing timeline, 2006 to 2010, for a bug report B fixed by d. Of all developers who fixed bugs before B, the most recent x% form the developers cache F(x).

23 Locality of Fixing Activity { Bugzie Model }
Figure labels: 94% - 98% and 96% - 99%.

24 Selection of Fixer Candidates { Bugzie Model }
The locality of fixing activity suggests that the actual fixer of a given bug report is likely to be someone with recent fixing activity. For each bug report, Bugzie sorts developers by their fixing time and takes the top x% as the fixer candidates F(x).
Figure: Fixing timeline, 2006 to 2010, for a bug report B fixed by d. Of all developers who fixed bugs before B, the most recent x% form the developers cache F(x).
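The candidate selection described above can be sketched as a sort followed by a cut. This is a minimal illustration: the function name, the rounding rule (round up, keep at least one developer), and the toy timestamps are assumptions, not details given in the slides.

```python
import math


def developer_cache(last_fix_time, x_percent):
    """Return F(x): the most recently active x% of developers.

    last_fix_time maps developer -> time of that developer's latest fix
    (any sortable value, e.g. a year or a timestamp).
    """
    if not last_fix_time:
        return []
    ranked = sorted(last_fix_time, key=last_fix_time.get, reverse=True)
    size = max(1, math.ceil(len(ranked) * x_percent / 100))
    return ranked[:size]


# Toy data (hypothetical): latest fixing year per developer.
last_fix = {"dev_a": 2006, "dev_b": 2010, "dev_c": 2008, "dev_d": 2009}
candidates = developer_cache(last_fix, 50)
```

Only the developers in the returned list are scored for the new report, which is where the time savings from caching come from.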

25 Developers Caching { Bugzie Model }
Figure: Bugzie workflow with the developers cache F(x) added: bugs repository, initial training, developers cache F(x), pre-processing of bug report B into terms t_1, t_2, ..., t_n, recommendation, recommendation list, and updating (steps numbered 1 to 6).

26 Selection of Descriptive Terms { Bugzie Model }

27 Selection of Descriptive Terms { Bugzie Model }

28 Terms Caching { Bugzie Model }
Figure: Bugzie workflow with the terms cache T(k) added: bugs repository, initial training, terms cache T(k), pre-processing of bug report B into terms t_1, t_2, ..., t_n, recommendation, recommendation list, and updating.
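The slides on term selection survive only as figures, so the selection rule is not stated in the transcript. One plausible sketch, assuming T(k) collects each developer's k highest-scoring terms and that a new report is then reduced to the terms found in T(k), is shown below. The function names and scores are hypothetical.

```python
def term_cache(mu, k):
    """Build T(k) as the union of each developer's k best terms (assumed rule).

    mu maps developer -> {term: membership score}.
    """
    cache = set()
    for scores in mu.values():
        best = sorted(scores, key=scores.get, reverse=True)[:k]
        cache.update(best)
    return cache


def filter_terms(report_terms, cache):
    """Keep only the report terms that appear in the cache, in original order."""
    return [t for t in report_terms if t in cache]


# Toy scores (hypothetical developers and terms).
mu = {
    "dev_a": {"cvs": 0.9, "wizard": 0.4, "layout": 0.1},
    "dev_b": {"editor": 0.7, "cvs": 0.2},
}
cache = term_cache(mu, 1)
kept = filter_terms(["cvs", "layout", "editor"], cache)
```

A small k keeps only the terms that most strongly characterize each developer, which matches the later observation that a small yet significant set of terms is enough.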

29 Empirical Evaluation { Empirical Evaluation }
We evaluated Bugzie on our collected datasets.
Experiments:
– Selection of fixer candidates;
– Selection of terms;
– Selection of developers and terms;
– Comparison with state-of-the-art approaches.

30 Experiment Setup { Empirical Evaluation }
Bug reports are ordered on a creation timeline divided into frames 0 to 10.
1 Bugzie uses frame 0 for initial training.
2 Using the training data, Bugzie recommends the top-n developers to fix bug report B (the recommendation list for B).
3 Bugzie updates the training data with the tested bug report B, then moves to the next bug report.
Bugzie repeats steps 2 and 3 until it has consumed all bug reports.
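The three steps of this setup amount to a train, test, then update loop over frames. The sketch below shows that loop with a deliberately simple stand-in recommender (most frequent fixers so far), which is NOT Bugzie; all function names and the toy frames are hypothetical.

```python
from collections import Counter


def evaluate(frames, train, recommend, update, top_n):
    """Return per-frame accuracy; frames[0] is used only for initial training.

    Each report is a (terms, actual_fixer) pair.
    """
    model = train(frames[0])                              # step 1
    accuracies = []
    for frame in frames[1:]:
        hits = 0
        for terms, fixer in frame:
            if fixer in recommend(model, terms, top_n):   # step 2
                hits += 1
            update(model, terms, fixer)                   # step 3
        accuracies.append(hits / len(frame) if frame else 0.0)
    return accuracies


# Stand-in recommender (not Bugzie): rank developers by number of past fixes.
def train_counts(frame):
    return Counter(fixer for _terms, fixer in frame)


def recommend_frequent(model, terms, top_n):
    return [dev for dev, _count in model.most_common(top_n)]


def update_counts(model, terms, fixer):
    model[fixer] += 1


frames = [
    [(["cvs"], "dev_a"), (["ui"], "dev_a")],   # frame 0: training only
    [(["cvs"], "dev_a"), (["ui"], "dev_b")],   # frame 1
    [(["ui"], "dev_b"), (["ui"], "dev_b")],    # frame 2
]
top1 = evaluate(frames, train_counts, recommend_frequent, update_counts, top_n=1)
top2 = evaluate(frames, train_counts, recommend_frequent, update_counts, top_n=2)
```

Each report is tested before it is added to the training data, so no recommendation ever sees its own answer.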

31 Prediction Accuracy { Empirical Evaluation }
If the recommendation list for a bug report contains its actual fixer, we count this as a hit (i.e., a correct recommendation). For each frame under test, we calculated the prediction accuracy (PA).
Example: If we have 100 bugs and, for 60 of them, the actual fixer is in our Top-2 list, then the Top-2 prediction accuracy is 60%.
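The hit-counting definition above translates directly into code. This sketch reproduces the slide's example of 100 bugs with 60 hits; the function name and the toy developer labels are hypothetical.

```python
def top_n_accuracy(recommendations, actual_fixers, n):
    """Fraction of reports whose actual fixer is among the first n recommendations."""
    hits = sum(
        1
        for recs, fixer in zip(recommendations, actual_fixers)
        if fixer in recs[:n]
    )
    return hits / len(actual_fixers) if actual_fixers else 0.0


# 100 reports, each with the same ranked list; the actual fixer is the
# second-ranked developer for 60 reports and an unlisted one for 40.
recommendations = [["dev_a", "dev_b"]] * 100
actual_fixers = ["dev_b"] * 60 + ["dev_c"] * 40
pa_top2 = top_n_accuracy(recommendations, actual_fixers, 2)
pa_top1 = top_n_accuracy(recommendations, actual_fixers, 1)
```

Top-n accuracy can only stay the same or grow as n grows, since a longer list contains every shorter one.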

32 Selection of Fixer Candidates { Empirical Evaluation }
Figure: Bugzie workflow with the developers cache F(x), alongside the fixing timeline (2006 to 2010) in which the most recent x% of developers who fixed bugs before bug report B form F(x).

33 Selection of Fixer Candidates { Empirical Evaluation }
Figure: Top-1 and Top-5 prediction accuracy.
Firefox: at x = 10%, PA = 72.4%; at x = 100%, PA = 70.7%.

34 Selection of Fixer Candidates { Empirical Evaluation }
Selecting a suitable portion of recent fixers does not reduce accuracy much, and sometimes improves it, as in the cases of Firefox, Eclipse, etc. Selecting only a portion of the available developers as candidates also improves time efficiency.

35 Selection of Terms { Empirical Evaluation }
Figure: Bugzie workflow with the terms cache T(k): bugs repository, initial training, pre-processing of bug report B into terms t_1, t_2, ..., t_n, recommendation, recommendation list, and updating (steps numbered 1 to 6).

36 Selection of Terms { Empirical Evaluation }
Figure: Top-1 and Top-5 prediction accuracy, with the peak range marked.
Eclipse: at k = 16, PA = 80%; at k = all terms, PA = 72%.

37 Selection of Terms { Empirical Evaluation }
Selecting terms can improve prediction accuracy considerably. The results suggest that a small yet significant set of terms per developer is enough to describe that developer's bug-fixing expertise.

38 Selection of Developers & Terms { Empirical Evaluation }
This experiment studies the combined impact of developer selection (x) and term selection (k).
Figure: Results for Eclipse and Firefox.

39 Selection of Developers & Terms { Empirical Evaluation }
Base: Base model with all developers and all terms.
C.S.: Candidate Selection.
T.S.: Terms Selection.
Both: The best PA when applying both C.S. and T.S.

40 Comparison { Empirical Evaluation }
We compared Bugzie's results with state-of-the-art approaches. We used Weka to re-implement those approaches.

41 Comparison { Empirical Evaluation }
Some of the approaches (C4.5 decision trees) cannot scale up well to our dataset. We prepared a smaller dataset: 3-year histories of the full dataset.

42 Comparison Results { Empirical Evaluation }
Table legend: (d) days, (h) hours, (m) minutes, (s) seconds.

43 Conclusions { Conclusions }
Bugzie achieves higher accuracy and efficiency than state-of-the-art approaches. Bugzie can accommodate the locality of fixing activity and software evolution through flexible caching of developers and terms.

44 Thesis Contributions { Conclusions }
Bugzie, a scalable, fuzzy set and cache-based automatic bug triaging approach that is significantly more efficient and accurate than existing state-of-the-art approaches.
The finding of the locality of fixing activity.
A comprehensive evaluation of the efficiency and correctness of Bugzie in comparison with state-of-the-art approaches.
An observation/method to capture a small and significant set of terms describing developers' bug-fixing expertise.

45 Future Work { Conclusions }
Use different caching mechanisms for developers and terms.
Explore the use of other textual and non-textual contents of bug reports for bug triaging.
Use other software artifacts to measure developers' expertise more accurately.

47 Thank You!

