Software Engineering Laboratory, Department of Computer Science, Graduate School of Information Science and Technology, Osaka University Finding Similar.

1 Finding Similar Defects Using Synonymous Identifier Retrieval
Norihiro Yoshida, Takeshi Hattori, Katsuro Inoue
Osaka University, Japan

2 Similar Code Fragments
Similar code fragments are one of the factors that make software maintenance more difficult.
[Figure: source files A and B. A code fragment CF is modified; for the similar code fragments SF1 and SF2, it is necessary to determine whether or not to modify them as well.]
It is necessary to develop an automatic code retrieval tool based on code similarity.

3 Key Idea
In many cases, code fragments involving similar identifier names have similar functionalities.
- e.g., type, variable, and function names
Developers often need to inspect those code fragments simultaneously.
It is necessary to develop an automatic code retrieval tool based on identifier similarity.

4 SC-Retriever: Code retrieval tool based on identifier similarity
Retrieves code fragments that are similar to a query code fragment.
Based on identifier similarity
- e.g., type, variable, and function names
Determines synonymous words in the target source files.
[Figure: the query code fragment and the target source files each go through identifier extraction; synonymous identifier determination and retrieval then produce the similar code fragments.]

5 Why should we determine synonymous words in source code?
SC-Retriever needs to identify a set of code fragments that have similar functionalities.
Different developers often use different identifier names even when they implement the same functionality.
It is necessary to determine synonymous words in order to identify code fragments that have similar functionalities.

6 How to determine synonymous words?
We use an automatic synonymous word determination technique from NLP.
- Dagan's method [1], which is based on co-occurrence relations and does not use thesauruses or dictionaries.
- The method treats words as synonymous when they often co-occur with a similar set of words in statements.
- e.g., “Kids play soccer”, “Children play soccer”: both “kids” and “children” co-occur with the set of words “play” and “soccer”, so they are judged synonymous.
- Note that a threshold must be set for synonymous word determination.
[1] I. Dagan, L. Lee, and F. C. N. Pereira. Similarity-based models of word cooccurrence probabilities. Machine Learning, 34(1-3):43–69, 1999.
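
The co-occurrence idea on this slide can be sketched as follows. This is only an illustration, not SC-Retriever's code and not the measure of Dagan et al.: the class name, the method names, and the use of Jaccard distance between co-occurrence sets are all assumptions made for the example. The threshold is applied to a distance, so a higher th accepts more word pairs as synonymous.

```java
import java.util.*;

// Hypothetical sketch: Jaccard distance stands in for the real similarity model.
class CooccurrenceSynonyms {
    // Map each word to the set of other words it shares a statement with.
    static Map<String, Set<String>> contexts(List<List<String>> statements) {
        Map<String, Set<String>> ctx = new HashMap<>();
        for (List<String> statement : statements) {
            for (String word : statement) {
                Set<String> others = ctx.computeIfAbsent(word, k -> new HashSet<>());
                for (String other : statement) {
                    if (!other.equals(word)) others.add(other);
                }
            }
        }
        return ctx;
    }

    // Jaccard distance between two context sets: 0 = identical, 1 = disjoint.
    static double distance(Set<String> a, Set<String> b) {
        Set<String> union = new HashSet<>(a);
        union.addAll(b);
        if (union.isEmpty()) return 1.0; // no evidence: treat as maximally distant
        Set<String> common = new HashSet<>(a);
        common.retainAll(b);
        return 1.0 - (double) common.size() / union.size();
    }

    // Two words are taken as synonymous when their contexts are within th.
    static boolean synonymous(Map<String, Set<String>> ctx, String w1, String w2, double th) {
        Set<String> a = ctx.getOrDefault(w1, Collections.emptySet());
        Set<String> b = ctx.getOrDefault(w2, Collections.emptySet());
        return distance(a, b) <= th;
    }
}
```

For the two example statements, "kids" and "children" both have the context {play, soccer}, so their distance is 0 and they pass any non-negative threshold, while "kids" and "play" share only "soccer" and do not pass th = 0.1.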

7 How to determine synonymous words? (2/2)
Why not use natural-language thesauruses?
- Source files of application software often involve domain-specific words.
- Thesauruses generally contain only a few domain-specific synonyms.
- SC-Retriever needs to determine domain-specific synonymous words in the target source files.

8 How to match a query code fragment with target source files?
If code fragments have the same identifiers as the query identifiers, or synonymous ones, those code fragments are extracted from the target source files as similar code fragments.
[Figure: identifiers in the query code fragment (node, alloc, add) matched against identifiers in the target source files (host, alloc, add, host).]
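
A minimal sketch of this matching rule, under stated assumptions: the slide does not say whether every query identifier must be covered, so the example requires all of them; the class and method names are hypothetical; and the synonym table pairing "node" with "host" is an illustrative input suggested by the figure, not a reported result.

```java
import java.util.*;

// Hypothetical sketch of matching by identical or synonymous words.
class IdentifierMatcher {
    // True if the fragment contains the query word itself or one of its synonyms.
    static boolean covers(Set<String> fragmentWords, String queryWord,
                          Map<String, Set<String>> synonyms) {
        if (fragmentWords.contains(queryWord)) return true;
        for (String syn : synonyms.getOrDefault(queryWord, Collections.emptySet())) {
            if (fragmentWords.contains(syn)) return true;
        }
        return false;
    }

    // Assumption: a fragment matches only when every query word is covered.
    static boolean matches(Set<String> queryWords, Set<String> fragmentWords,
                           Map<String, Set<String>> synonyms) {
        for (String q : queryWords) {
            if (!covers(fragmentWords, q, synonyms)) return false;
        }
        return true;
    }
}
```

With the query words {node, alloc, add} and a fragment containing {host, alloc, add}, the fragment matches only if "host" is listed as a synonym of "node".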

9 Case Study
Overview
- Conduct a case study with SC-Retriever and CCFinder
- Retrieve defective functions in 2 software systems
- Compare the efficiency of the retrieval
Target Systems
- Canna (90 KLOC, 2361 functions)
  - Client-server Japanese character input system
  - Ver. 3.6 involves 19 buffer overflow defects; those defects exist in 18 functions.
- SPARS-J (36 KLOC, 859 functions)

10 Experimental Steps
1. Choose defective code fragments from the Canna source code.
2. Retrieve C functions in the Canna source code.
   - SC-Retriever: we give the 3 chosen code fragments as queries.
   - CCFinder: we detect code clones for those 3 code fragments.
3. Calculate the precision, recall, and F-score from the retrieved results and the bug records.
   - The F-score is the harmonic mean of precision and recall.
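
The three measures in step 3 can be written down directly. The class and method names below are hypothetical; the formulas are the standard definitions, with the F-score as the harmonic mean named on the slide.

```java
// Hypothetical helper for the evaluation measures in step 3.
class RetrievalScores {
    // Fraction of retrieved functions that are actually defective.
    static double precision(int correctlyRetrieved, int retrieved) {
        return retrieved == 0 ? 0.0 : (double) correctlyRetrieved / retrieved;
    }

    // Fraction of defective functions that were retrieved.
    static double recall(int correctlyRetrieved, int relevant) {
        return relevant == 0 ? 0.0 : (double) correctlyRetrieved / relevant;
    }

    // Harmonic mean of precision and recall.
    static double fScore(double precision, double recall) {
        double sum = precision + recall;
        return sum == 0.0 ? 0.0 : 2.0 * precision * recall / sum;
    }
}
```

For example, a precision of 0.50 and a recall of 0.72 give 2 × 0.50 × 0.72 / 1.22, which is about 0.59.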

11 Resultant precisions, recalls, and F-scores

         SC-Retriever (th = 0.1)    SC-Retriever (th = 0.2)    CCFinder
Query    Prec.  Recall  F-score     Prec.  Recall  F-score     Prec.  Recall  F-score
CF A     0.50   0.72    0.59        0.18   1.00    0.31        1.00   0.06    0.11
CF B     0.19   0.33    0.25        0.18   1.00    0.31        1.00   0.06    0.11
CF C     1.00   0.06    0.11        0.33   0.06    0.10        1.00   0.06    0.11

We set the threshold th for synonymous word determination to 0.1 and 0.2.
- If th is set to a high value, many synonymous words are detected.
F-scores of SC-Retriever are higher than those of CCFinder for CF A and CF B.
- Recalls of SC-Retriever are relatively high.
- Precisions of CCFinder are relatively high.
The results of SC-Retriever depend on the queries and on th.

12 Future Work
- Further case studies on defects in other software systems
- A code clone detection tool based on synonymous word determination
- A method to calculate a code fragment ranking based on identifier similarity
- Other methods to determine synonymous words
  - LSI-, dictionary-, or thesaurus-based methods

13 Word extraction and synonymous word determination
Word extraction from identifier names
- Extracts identifiers from both the query code fragment and the target source files
- Applies several normalization rules to the extracted identifiers
  - e.g., dividing at underscores, number elimination
It is necessary to develop an automatic code retrieval tool based on identifier similarity.

14 Software Maintenance
The number of large-scale software systems that have been maintained for a long time is increasing.
Improving the efficiency of maintenance activities has become an important topic in software engineering.

15 Maintaining Similar Code Fragments
Simultaneous Modification
[Figure: code fragment CF is modified; the similar code fragments SF1 and SF2 are modified simultaneously.]
Merging (Refactoring)
[Figure: similar code fragments are merged into a new method and replaced by call statements.]

16 My Talk
On Simultaneous Modification
- Norihiro Yoshida, et al.: "Finding Similar Defects Using Synonymous Identifier Retrieval", the 4th International Workshop on Software Clones (IWSC 2010), May 2010. (to appear)
On Merging Similar Code Fragments
- Norihiro Yoshida, et al.: "On Refactoring Support Based on Code Clone Dependency Relation", Proceedings of the 11th IEEE International Software Metrics Symposium (METRICS 2005), Sep. 2005.

17 Finding Similar Defects Using Synonymous Identifier Retrieval
Norihiro Yoshida
Osaka University, Japan

18 Identifying similar code fragments in source code
A process to identify code fragments that are similar to a query code fragment.
To support simultaneous modification, it is necessary to develop an automatic code retrieval tool based on code similarity.
[Figure: a defective code fragment is given as input. The input code fragment and the target source files are converted into lists of program elements (e_i[0], e_i[1], ..., e_i[n_i] for the input; e_t1[0], ..., e_t1[n_t1] and e_t2[0], ..., e_t2[n_t2] for the targets). Matching the lists yields similar elements (e_s1[0], ..., e_s1[n_s1] and e_s2[0], ..., e_s2[n_s2]), that is, similar code fragments, which are then inspected.]

19 Proposed Method
We propose a code retrieval method based on identifier similarity.
- Identifier normalization
  - “add_host” → “add” and “host”
  - “type1” → “type”
- Synonymous identifier determination based on NLP
- Present code fragments that have the same or synonymous identifiers as the query identifiers
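
The two normalization rules shown on this slide (dividing at underscores and eliminating numbers) can be sketched as below. The class and method names are hypothetical, and any further rules SC-Retriever may apply are not covered.

```java
import java.util.*;

// Hypothetical sketch of the two normalization rules named on the slide.
class IdentifierNormalizer {
    static List<String> normalize(String identifier) {
        List<String> words = new ArrayList<>();
        for (String part : identifier.split("_")) {        // divide at underscores
            String word = part.replaceAll("[0-9]", "");    // eliminate digits
            if (!word.isEmpty()) words.add(word);          // skip empty pieces
        }
        return words;
    }
}
```

Here normalize("add_host") yields the words "add" and "host", and normalize("type1") yields "type", matching the examples on the slide.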

20 [Figure labels: node, alloc, add; host, alloc, add, host]

21 [Figure: identifiers in the input code fragment (node, alloc, add); identifiers in the target source code (host, alloc, add, host).]

22 On Refactoring Support Based on Code Clone Dependency Relation
Norihiro Yoshida (1), Yoshiki Higo (1), Toshihiro Kamiya (2), Shinji Kusumoto (1), Katsuro Inoue (1)
(1) Osaka University
(2) National Institute of Advanced Industrial Science and Technology

23 Background (1): What is a code clone?
- A set of code fragments identical or similar to each other
- Introduced into source programs for various reasons, such as reusing code by copy-and-paste
- Makes software maintenance more difficult

24 Background (1): Refactoring
Refactoring [1] is a way to deal with the code clone problem.
Refactoring is a technique for restructuring existing code.
- Alter software's internal structure without changing its external behavior [2]
- Improve the maintainability of software
- Number one in the stink parade is duplicate code [1]
[Figure: new method, call statements]
[1] M. Fowler, Refactoring: Improving the Design of Existing Code, Addison-Wesley, 1999.
[2] http://www.refactoring.com

25 Background (2): Difficulty of Refactoring
It is difficult to identify refactoring opportunities in large-scale source code.
- Where are the code fragments that should be merged into one method?
- How should they be merged into one method?
  - Extract Method or Pull Up Method refactoring?
[Figure: Extract Method refactoring and Pull Up Method refactoring, each showing a new method and call statements.]

26 Related Work on Clone Refactoring
Aries: A Tool for Supporting Clone Refactoring [1]
- Calculates metrics that represent the difficulty of applying Extract Method refactoring to each clone set
- Suggests a destination class for Pull Up Method refactoring
Code Clone Categorization [2]
- Categorizes code clones based on the types of differences between them
[1] Y. Higo, et al., Refactoring Support Based on Code Clone Analysis, PROFES 2004.
[2] M. Balazinska, et al., Advanced Clone-analysis to Support Object-oriented System Refactoring, WCRE 2000.

27 Motivation
There are dependency relations between methods belonging to different code clones.
[Figure: code clone A (methods a1, a2), code clone B (methods b1, b2), and code clone C (methods c1, c2); each clone is merged (refactored) separately into method a, method b, and method c.]

28 Motivation
There are dependency relations between methods belonging to different code clones.
- Merging all of the code clones at once is more effective.
[Figure: methods a1/a2, b1/b2, and c1/c2 are merged together into methods a, b, and c.]

29 Research Overview
- Define a set of code clones having dependency relations as a chained clone
- Suggest an applicable refactoring pattern for each chained clone based on chained clone categorization
[Figure: a chained clone consisting of methods a1, b1, c1 and methods a2, b2, c2.]

30 Definition of chained clone (1)
Chained Method
- A set of methods that hold dependency relations
Chained Method Graph
- A node represents a method
- An edge represents a dependency relation
- Three types of labels for the dependency relation
  - “Call”: calling methods
  - “A_i”: sharing variable i in terms of assignment
  - “R_j”: sharing variable j in terms of reference
[Figure: a chained method and its chained method graph, with edges labeled Call, Ax, and Rx.]

31 Definition of chained clone (2)
Chained Clone
- For 2 given chained methods CM1 and CM2, we transform them into chained method graphs G1 and G2.
- If G1 and G2 satisfy the following three conditions, we call the pair of CM1 and CM2 a chained clone.
  1. G1 and G2 are isomorphic.
  2. Each pair of corresponding nodes in G1 and G2 holds a clone relation.
  3. The labels of corresponding edges in G1 and G2 are identical.
Chained Clone Set
- An equivalence class of chained clones
[Figure: CM1 and CM2 with their graphs G1 and G2, each with edges labeled Call, Rx, and Ax; a pair of nodes filled with the same color is a code clone.]
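
The three conditions can be illustrated with a small check. This sketch makes several assumptions: it verifies one candidate node correspondence supplied by the caller rather than searching for an isomorphism, it models each graph as a set of (from, to, label) edges so isolated nodes are not represented, and all names are hypothetical.

```java
import java.util.*;

// Hypothetical sketch: checks the three conditions for one given node mapping.
class ChainedCloneCheck {
    // A labeled edge, e.g. edge("a1", "b1", "Call").
    static List<String> edge(String from, String to, String label) {
        return Arrays.asList(from, to, label);
    }

    static boolean isChainedClone(Set<List<String>> g1, Set<List<String>> g2,
                                  Map<String, String> mapping,
                                  Set<List<String>> clonePairs) {
        // The correspondence must be one-to-one.
        if (new HashSet<>(mapping.values()).size() != mapping.size()) return false;
        // Condition 2: every pair of corresponding nodes holds a clone relation.
        for (Map.Entry<String, String> pair : mapping.entrySet()) {
            if (!clonePairs.contains(Arrays.asList(pair.getKey(), pair.getValue()))) {
                return false;
            }
        }
        // Conditions 1 and 3: renaming G1's nodes through the mapping must
        // reproduce G2 edge for edge, labels included.
        Set<List<String>> renamed = new HashSet<>();
        for (List<String> e : g1) {
            String from = mapping.get(e.get(0));
            String to = mapping.get(e.get(1));
            if (from == null || to == null) return false;
            renamed.add(edge(from, to, e.get(2)));
        }
        return renamed.equals(g2);
    }
}
```

A complete detector would also have to enumerate candidate correspondences; the clone pairs themselves would come from a clone detector such as CCFinder.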

32 Applicable Refactorings for Chained Clones
The following refactorings [1] can be applied to merge chained clones.
- Pull Up Method refactoring
- Extract Method refactoring
- Extract Super Class refactoring
According to the characteristics of a chained clone, we suggest an appropriate refactoring for it.
[1] M. Fowler: Refactoring: Improving the Design of Existing Code, Addison-Wesley, 1999.

33 Typical Chained Clones, Case 1: Extract Method Refactoring
All the methods in a chained clone are contained in a single class.
All methods can be merged into two new methods in class A (“Extract Method” refactoring).
[Figure: before refactoring, class A contains the chained clone (methods a11, a12, a21, a22); after refactoring, class A contains methods a1 and a2.]

34 Typical Chained Clones, Case 2: Pull Up Method Refactoring
All methods in a chained clone belong to classes that have a common parent class.
All methods of each chained method are in the same class.
All methods of each code clone can be merged into a new method in the parent class (“Pull Up Method” refactoring).
[Figure: before refactoring, class A (methods a1, a2) and class B (methods b1, b2) under a common super class form a chained clone; after refactoring, the merged methods 1 and 2 replace them.]

35 Typical Chained Clones, Case 3: Extract SuperClass Refactoring
Some methods in a chained clone belong to classes that have no common parent class.
All methods of each chained method are in the same class.
All methods of each code clone can be merged into a new method in a new superclass (“Extract SuperClass” refactoring).
[Figure: before refactoring, class A (methods a1, a2) and class B (methods b1, b2) form a chained clone; after refactoring, the merged methods 1 and 2 are placed in a new superclass of classes A and B.]

36 Typical Chained Clones, Case 4 (difficult to apply refactoring)
Chained methods exist in different classes.
It is difficult to apply refactoring to all methods at one time. (The “Pull Up Method” refactoring can be applied to each code clone.)
[Figure: a chained clone spanning class A (method a), class B (method b), class C (method c), and class D (method d), with classes S1 and S2.]

37 Categorization of Chained Clones
We propose a method to classify chained clones by using two metrics.
Two method groups for classifying chained clones
- G1: the group of methods having clone relations
- G2: the group of methods having dependency relations
These metrics evaluate the relationship of distance and position in the class hierarchy among the methods belonging to these two groups.
- R1: All methods exist in the same class.
- R2: All methods belong to classes that have a common parent class.
- R3: Some methods belong to classes that have no common parent class.

38 The metric DCH(S) (the Dispersion in the Class Hierarchy)
DCH(S) represents the dispersion in the class hierarchy among methods.
If there are classes that have no common parent class, the value of DCH is undefined.
[Figure: methods a and b in classes A and B, with class S1: DCH(S) = 1. Methods c, d, and e in classes C, D, and E, with classes S2 and S3: DCH(S) = 2.]
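
The slide gives only examples of DCH(S), so the following sketch rests on an assumed definition: DCH(S) is taken to be the largest number of inheritance steps from any method's class up to the lowest class that is an ancestor of (or equal to) all of them, and it is undefined when no such class exists. The class name, the parent map, and the hierarchy used in the usage note are hypothetical.

```java
import java.util.*;

// Hypothetical sketch of DCH(S) under the assumed definition stated above.
class ClassHierarchyDispersion {
    // Ancestor chain of a class, starting with the class itself.
    // Assumes parentOf describes an acyclic single-inheritance hierarchy.
    static List<String> chain(String cls, Map<String, String> parentOf) {
        List<String> result = new ArrayList<>();
        for (String c = cls; c != null; c = parentOf.get(c)) result.add(c);
        return result;
    }

    // classes: the classes that own the methods in S.
    static OptionalInt dch(Collection<String> classes, Map<String, String> parentOf) {
        Iterator<String> it = classes.iterator();
        if (!it.hasNext()) return OptionalInt.empty();
        // Common ancestors of all classes, nearest first.
        List<String> common = chain(it.next(), parentOf);
        while (it.hasNext()) common.retainAll(chain(it.next(), parentOf));
        if (common.isEmpty()) return OptionalInt.empty(); // undefined
        String lowest = common.get(0);
        int max = 0;
        for (String cls : classes) {
            max = Math.max(max, chain(cls, parentOf).indexOf(lowest));
        }
        return OptionalInt.of(max);
    }
}
```

With A and B both extending S1, dch for {A, B} is 1, and for methods that all live in A it is 0. Assuming a hierarchy in which C and D extend S2 while S2 and E extend S3, dch for {C, D, E} is 2.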

39 Metrics to classify chained clone sets (1)
DCHS: evaluates the dispersion in the class hierarchy of the methods belonging to G1 (the group of methods having clone relations)
1. Calculate the DCH(S) metric for the methods of each clone relation.
2. Select the maximum value among them as DCHS.
DCHD: evaluates the dispersion in the class hierarchy of the methods belonging to G2 (the group of methods having dependency relations)
1. Calculate the DCH(S) metric for the methods in each chained method.
2. Select the maximum value among them as DCHD.
[Figure: methods a1, b1, c1 and methods a2, b2, c2.]

40 Metrics to classify chained clone sets (2)
Using the two metrics, we classify the chained clones into 9 categories.

DCHS \ DCHD          0              greater than 0   cannot be defined
0                    Category 11    Category 12      Category 13
greater than 0       Category 21    Category 22      Category 23
cannot be defined    Category 31    Category 32      Category 33
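
The 9-way classification can be sketched as a lookup on the two metric values. The sketch assumes that the first digit of the category follows DCHS and the second follows DCHD, which is how the flattened table reads and is consistent with the later description of Extract Method for methods that all sit in one class. An undefined metric is modeled as an empty OptionalInt, and the names are hypothetical.

```java
import java.util.OptionalInt;

// Hypothetical sketch of the 9-category classification.
class ChainedCloneCategory {
    // 1 = metric is 0, 2 = greater than 0, 3 = cannot be defined.
    static int level(OptionalInt dch) {
        if (!dch.isPresent()) return 3;
        return dch.getAsInt() == 0 ? 1 : 2;
    }

    // Assumption: first digit from DCHS, second digit from DCHD.
    static int category(OptionalInt dchs, OptionalInt dchd) {
        return 10 * level(dchs) + level(dchd);
    }
}
```

Under this reading, DCHS = 0 with DCHD = 0 gives category 11, a positive DCHS with DCHD = 0 gives category 21, and an undefined DCHS with DCHD = 0 gives category 31.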

41 Metrics to classify chained clone sets (2)
Using the two metrics, we classify the chained clones into 9 categories.
[Table: the same 9 categories (11 to 33) as on slide 40.]
Extract Method Refactoring: all the methods in a chained clone are contained in a single class.

42 Metrics to classify chained clone sets (2)
Using the two metrics, we classify the chained clones into 9 categories.
[Table: the same 9 categories (11 to 33) as on slide 40.]
Pull Up Method Refactoring: all methods in a chained clone belong to classes that have a common parent class. All methods of each chained method are in the same class.

43 Metrics to classify chained clone sets (2)
Using the two metrics, we classify the chained clones into 9 categories.
[Table: the same 9 categories (11 to 33) as on slide 40.]
Extract SuperClass Refactoring: some methods in a chained clone belong to classes that have no common parent class. All methods of each chained method are in the same class.

44 Metrics to classify chained clone sets (2)
Using the two metrics, we classify the chained clones into 9 categories.
[Table: the same 9 categories (11 to 33) as on slide 40.]
Difficult to apply refactoring to all methods at one time: chained methods exist in different classes.

45 Evaluation Overview
Objective
- How many chained clone sets exist in actual Java programs?
- Is it possible to classify chained clone sets by using the proposed metrics and to apply the suggested refactorings to them?
Target software
- Open source software
  - ANTLR 2.7.4 (47,000 LOC, 285 classes): compiler-compiler (Java, C++, C#)
  - JBoss 3.2.6 (640,000 LOC, 3364 classes): J2EE application server
- Commercial software
  - X (70,000 LOC, 309 classes)
  - Y (81,000 LOC, 290 classes)
We used CCFinder [1] to detect code clones.
[1] T. Kamiya, et al., CCFinder: A multi-linguistic token-based code clone detection system for large scale source code, IEEE TSE, vol.28, no.7, pp.654-670, Jul. 2002.

46 Evaluation: Detected chained clone sets

ANTLR 2.7.4
Category   # of chained clone sets   # of methods (max)   # of methods (min)
11         3                         4                    4
21         6                         40                   6
31         1                         4                    4
Other      0
Total      10
In category 21, the maximum number of methods is very large.
- Similar functionalities for Java, C#, and C++

Software X
Category   # of chained clone sets   # of methods (max)   # of methods (min)
11         2                         13 [max and min not separable in the transcript]
21         0
31         7                         26                   4
Other      0
Total      9
The number of chained clone sets in category 31 is large.
- Two packages have similar utility classes.

47 Evaluation: Refactoring for Category 31 (ANTLR)
We applied the suggested refactorings to chained clone sets in ANTLR.
[Figure: before refactoring, CSharpCharFormatter and JavaCharFormatter each contain escapeString and escapeChar, connected by a call relation; after the Extract Super Class refactoring, escapeString and escapeChar are in a new GeneralCharFormatter above both classes.]

48 Conclusion
We focus on refactoring for chained clones, which consist of sets of methods with dependency relations.
- Defined the chained clone
- Proposed two metrics to classify chained clones according to their applicable refactorings
- Conducted case studies to show the usefulness of the proposed metrics
Future Work
- Provide information about the internal structure of chained clones
- Apply the proposed method to other Java programs

49 Evaluation: Detected chained clone sets (Open source software)

ANTLR 2.7.4
Category   # of chained clone sets   # of methods (max)   # of methods (min)
11         3                         4                    4
21         6                         40                   6
31         1                         4                    4
Other      0
Total      10
In category 21, the maximum number of methods is very large.
- Similar functionalities for each language (Java, C#, C++)

JBoss 3.2.6
Category   # of chained clone sets   # of methods (max)   # of methods (min)
11         16                        13                   4
21         17                        8                    4
31         13                        29                   4
Other      4                         44                   6
Total      50
The number of chained clone sets in category 31 is large.
- JBoss contains several products. As a result, it has code clones among them.

50 Evaluation: Detected chained clone sets (Commercial software)

X
Category   # of chained clone sets   # of methods (max)   # of methods (min)
11         2                         13 [max and min not separable in the transcript]
21         0
31         7                         26                   4
Other      0
Total      9
The number of chained clone sets in category 31 is large.
- Two packages have similar utility classes.

Y
Category   # of chained clone sets   # of methods (max)   # of methods (min)
11         0
21         9                         14                   4
31         0
Other      0
Total      9
Chained clone sets were detected only in category 21.
- The software has code clones among several classes that inherit the same component class.

51 [Figure: a chained clone set built from methods A1, A2, A3, B1, B2, C1, and C2, made up of chained methods 1 and 2 (A1, B1, C1 and A2, B1, C1) and chained method 3 (A3, B2, C2).]

52 CCFinder
CCFinder directly compares source code token by token and detects code clones.
- Normalization of name spaces
- Replacement of user-defined names
- Removal of table initializations
- Consideration of module delimiters
CCFinder can analyze systems on the scale of millions of lines in practical time.

53 CCFinder (Clone Detection Process)
Source files → Lexical analysis → Token sequence → Transformation → Transformed token sequence → Match detection → Clones on transformed sequence → Formatting → Clone pairs

Example source:

    static void foo() throws RESyntaxException {
        String a[] = new String [] { "123,400", "abc", "orange 100" };
        org.apache.regexp.RE pat = new org.apache.regexp.RE("[0-9,]+");
        int sum = 0;
        for (int i = 0; i < a.length; ++i)
            if (pat.match(a[i]))
                sum += Sample.parseNumber(pat.getParen(0));
        System.out.println("sum = " + sum);
    }
    static void goo(String [] a) throws RESyntaxException {
        RE exp = new RE("[0-9,]+");
        int sum = 0;
        for (int i = 0; i < a.length; ++i)
            if (exp.match(a[i]))
                sum += parseNumber(exp.getParen(0));
        System.out.println("sum = " + sum);
    }
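
The effect of the transformation step on this example can be illustrated with a toy normalizer. This is not CCFinder's implementation: the regular-expression tokenizer, the short keyword list, and the "$id" placeholder are assumptions chosen for the example, and only the replacement of user-defined names is modeled.

```java
import java.util.*;
import java.util.regex.*;

// Hypothetical sketch: tokenizes a line and replaces identifiers by a placeholder.
class TokenNormalizer {
    // Small illustrative keyword list, not the full Java keyword set.
    static final Set<String> KEYWORDS = new HashSet<>(Arrays.asList(
            "if", "for", "int", "new", "static", "void", "throws"));

    // Identifiers, integer literals, string literals, or any single symbol.
    static final Pattern TOKEN = Pattern.compile(
            "[A-Za-z_][A-Za-z0-9_]*|[0-9]+|\"[^\"]*\"|\\S");

    static List<String> normalize(String source) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(source);
        while (m.find()) {
            String t = m.group();
            boolean identifier = Character.isJavaIdentifierStart(t.charAt(0))
                    && !KEYWORDS.contains(t);
            tokens.add(identifier ? "$id" : t); // user-defined names become "$id"
        }
        return tokens;
    }
}
```

After normalization, the line `if (pat.match(a[i]))` from foo and the line `if (exp.match(a[i]))` from goo produce the same token sequence, which is why a match on the transformed sequence can report them as a clone even though the variable names differ.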

