Department of Computer Science, Graduate School of Information Science & Technology, Osaka University Retrieving Similar Code Fragments based on Identifier.

Presentation transcript:

Retrieving Similar Code Fragments based on Identifier Similarity for Defect Detection
Norihiro Yoshida, Takashi Ishio, Makoto Matsushita, Katsuro Inoue
Department of Computer Science, Graduate School of Information Science & Technology, Osaka University

1 Similar code fragment
A similar code fragment is a code fragment that resembles another part of the source code.
- Such fragments are introduced into source code for various reasons, e.g. “copy-and-paste”.
- They make software maintenance difficult.
[Figure: a source file containing similar code fragments CF1, CF2, and CF3.]
If CF1 is defective, it is necessary to check CF2 and CF3.

2 Similar defects in Linux

    for(iter=0; iter<num_regs; iter++) {
        prom_prom_taken[iter].start_adr = prom_reg_memlist[iter].phys_addr;
        prom_prom_taken[iter].num_bytes = prom_reg_memlist[iter].reg_size;
        prom_prom_taken[iter].theres_more = &prom_phys_total[iter+1]; // should be: &prom_prom_taken[iter+1];
    }

    for(iter=0; iter<num_regs; iter++) {
        prom_prom_taken[iter].start_adr = (char *) prom_reg_memlist[iter].phys_addr;
        prom_prom_taken[iter].num_bytes = (unsigned long) prom_reg_memlist[iter].reg_size;
        prom_prom_taken[iter].theres_more = &prom_phys_total[iter+1]; // should be: &prom_prom_taken[iter+1];
    }

3 Similar defects in Linux

    for(iter=0; iter<num_regs; iter++) {
        prom_prom_taken[iter].start_adr = prom_reg_memlist[iter].phys_addr;
        prom_prom_taken[iter].num_bytes = prom_reg_memlist[iter].reg_size;
        prom_prom_taken[iter].theres_more = &prom_phys_total[iter+1]; // should be: &prom_prom_taken[iter+1];
    }

    for(iter=0; iter<num_regs; iter++) {
        prom_prom_taken[iter].start_adr = (char *) prom_reg_memlist[iter].phys_addr;
        prom_prom_taken[iter].num_bytes = (unsigned long) prom_reg_memlist[iter].reg_size;
        prom_prom_taken[iter].theres_more = &prom_phys_total[iter+1]; // should be: &prom_prom_taken[iter+1];
    }

Type cast operations are inserted in the second fragment. Clone detection tools cannot treat the two code fragments as a clone pair.

4 An overview of the proposed method
The method retrieves code fragments similar to an input code fragment.
1. Lexical analysis converts the input code fragment (query) into an input identifier list Ii[0] ... Ii[ni].
2. Lexical analysis converts the target source files into target identifier lists It1[0] ... It1[nt1], It2[0] ... It2[nt2], ..., Itn[0] ... Itn[ntn].
3. Comparison of the input identifier list with the target identifier lists yields similar sublists Is1[0] ... Is1[ns1], Is2[0] ... Is2[ns2], ..., Isn[0] ... Isn[nsn].
4. Ranking orders the similar sublists into a similarity ranking with the columns Rank, Start line #, End line #, and Similarity (rank 1: Line s1, Line e1, Sim 1; rank 2: Line s2, Line e2, Sim 2).
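The lexical analysis step can be illustrated with a minimal Python sketch. The slides do not say how the tool tokenizes code or whether it drops keywords and type names, so the regular expression, the `C_KEYWORDS` set, and the function name `identifier_list` are all assumptions made for illustration. String literals and block comments are not handled.

```python
import re

# Assumption: keywords and built-in type names are not counted as identifiers.
# The slides do not state this; the set below is a hypothetical, partial list.
C_KEYWORDS = {
    "for", "if", "else", "while", "do", "return", "break", "continue",
    "switch", "case", "default", "sizeof", "struct", "union", "enum",
    "typedef", "static", "extern", "const", "void", "char", "short",
    "int", "long", "unsigned", "signed", "float", "double",
}

IDENT_RE = re.compile(r"\b[A-Za-z_][A-Za-z0-9_]*\b")


def identifier_list(code):
    """Return (identifier, line_number) pairs in order of appearance."""
    result = []
    for line_no, line in enumerate(code.splitlines(), start=1):
        # Simplification: drop everything after a // line comment.
        line = line.split("//", 1)[0]
        for match in IDENT_RE.finditer(line):
            name = match.group()
            if name not in C_KEYWORDS:
                result.append((name, line_no))
    return result
```

Keeping the line number next to each identifier lets a later step report the start and end lines of a matching fragment. Under the keyword assumption above, the casts `(char *)` and `(unsigned long)` in the Linux example contribute nothing to the list.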

5 Comparison
Scan a target identifier list with a sliding window of fixed length.
- The identifiers in the sliding window are compared with the input identifier list.
- A code fragment corresponding to the sliding window is extracted if the window contains one or more identifiers from the input list.
[Figure: a fixed-length sliding window moving along the target identifier list It[0], It[1], ..., It[n], compared with the input identifier list Ii[0], Ii[1], Ii[2].]
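The sliding-window scan described on this slide might look like the following Python sketch. It assumes the target identifier list is a sequence of (identifier, line number) pairs; the function name `candidate_windows` and the parameter `window_len` are hypothetical, and the slides give no concrete window length.

```python
def candidate_windows(target, query, window_len):
    """Yield (start_line, end_line, names) for each window position that
    contains at least one identifier from the query.

    target:     list of (identifier, line_number) pairs for one source file
    query:      iterable of identifiers from the input code fragment
    window_len: fixed window length in identifiers (hypothetical parameter)
    """
    query_set = set(query)
    # If the list is shorter than the window, a single short window is used.
    last_start = max(len(target) - window_len, 0)
    for start in range(last_start + 1):
        window = target[start:start + window_len]
        names = [name for name, _ in window]
        if query_set.intersection(names):
            # The first and last identifiers give the fragment's line range.
            yield window[0][1], window[-1][1], names
```

Each yielded tuple identifies a candidate code fragment by its line range, which matches the start and end line columns of the ranking shown in the overview.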

6 Similarity-based ranking
The extracted code fragments are sorted by a similarity defined over two sets:
- Si: the set of elements in the input identifier list
- Sw: the set of elements in a sliding window
Developers investigate the resulting similarity-based ranking.
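The ranking step can be sketched in Python as follows. The similarity formula from the slide is not preserved in this transcript, so the sketch substitutes the Jaccard coefficient |Si ∩ Sw| / |Si ∪ Sw| as a stand-in. That choice is an assumption and may differ from the authors' definition. The function names are hypothetical.

```python
def jaccard(s_i, s_w):
    """Stand-in similarity (assumption): size of the intersection of the
    two identifier sets divided by the size of their union."""
    union = s_i | s_w
    return len(s_i & s_w) / len(union) if union else 0.0


def rank_fragments(query, fragments):
    """Sort extracted fragments by similarity to the query, highest first.

    query:     iterable of identifiers from the input code fragment
    fragments: list of (start_line, end_line, names) tuples
    Returns rows of (rank, start_line, end_line, similarity).
    """
    s_i = set(query)
    scored = [(start, end, jaccard(s_i, set(names)))
              for start, end, names in fragments]
    # list.sort is stable, so ties keep their original order.
    scored.sort(key=lambda row: row[2], reverse=True)
    return [(rank, start, end, sim)
            for rank, (start, end, sim) in enumerate(scored, start=1)]
```

Swapping in a different set-based measure only requires replacing `jaccard`, which is one way to explore how a change in the similarity definition affects the ranking.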

7 Case Study
Target open source software systems
- arch/ directory in Linux: architecture-specific implementations in the OS; 2 incorrect pointer accesses
- server/ directory in Canna 3.6: Japanese input system; 19 buffer overflow errors
Procedure
1. Extract code fragments sharing similar defects.
2. Enter each code fragment into the tool implementing our method.
3. Inspect whether the similarity ranking ranks the code fragments involving defects highly.
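Step 3 of the procedure can be made mechanical with a small check like the Python sketch below. It counts how many known defective fragments are covered by the top k entries of a ranking. The function name, the overlap criterion, and the single-file simplification (line ranges only, no file names) are assumptions for illustration, not details from the slides.

```python
def defects_in_top_k(ranking, known_defects, k):
    """Count known defective fragments that overlap one of the top-k
    ranked fragments.

    ranking:       list of (start_line, end_line), best match first
    known_defects: list of (start_line, end_line) of known defects
    k:             how many ranked fragments a developer inspects
    """
    found = set()
    for start, end in ranking[:k]:
        for d_start, d_end in known_defects:
            # Two line ranges overlap if each starts before the other ends.
            if start <= d_end and d_start <= end:
                found.add((d_start, d_end))
    return len(found)
```

For example, with a hypothetical ranking and two known defects, `defects_in_top_k(ranking, defects, 30)` reports how many of them a developer would encounter by reading the first 30 results.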

8 Result
Linux
- We used 2 code fragments as queries. Each code fragment involves an incorrect pointer access.
- For both queries, the 2 code fragments are ranked in the top 2.
Canna 3.6
- We used 19 code fragments as queries. Each code fragment involves a buffer overflow error.
- For all of those queries, 18 or 19 of the code fragments are ranked in the top 30.
In our case studies, we could detect most of the similar defects.

9 Summary & Future work
We proposed a method to retrieve similar code fragments based on identifier similarity.
- Sliding window comparison
- Similarity-based ranking
We need further case studies.
- Application to similar defects in other software systems
- Effects of changing the “similarity” definition