1 Software Engineering Laboratory, Department of Computer Science, Graduate School of Information Science and Technology, Osaka University
Measuring Copying of Java Archives
Tetsuya Kanda (1), Daniel M. German (2,1), Takashi Ishio (1), Katsuro Inoue (1)
(1) Osaka University, Japan
(2) University of Victoria, Canada

2 Reusing a library
Reuse existing libraries by copying them into the software development project
Black-box reuse
[Figure labels: Copy; THE USEFUL LIBRARY]

3 Library in Java
JAR (Java archive) files are built on the ZIP file format
A jar file can contain another jar file inside
[Figure labels: THE USEFUL LIBRARY; jar files in the library]
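
Because a jar file is a ZIP archive, nested jar files can be listed with any ZIP reader. The following is a minimal Python sketch, not the authors' tooling: the function names, the in-memory test jars, and the choice to recurse into every nesting level are assumptions made for illustration. The "!/" separator is borrowed from jar URL syntax.

```python
import io
import zipfile


def list_inner_jars(jar_bytes, prefix=""):
    """Return the paths of all jar files nested inside a jar, at any depth."""
    found = []
    with zipfile.ZipFile(io.BytesIO(jar_bytes)) as archive:
        for name in archive.namelist():
            if name.lower().endswith(".jar"):
                path = prefix + name
                found.append(path)
                # An inner jar is itself a ZIP archive, so recurse into it.
                found.extend(list_inner_jars(archive.read(name), path + "!/"))
    return found


def make_jar(entries):
    """Build an in-memory jar (ZIP) from a {name: bytes} mapping (demo helper)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name, data in entries.items():
            archive.writestr(name, data)
    return buffer.getvalue()


# Hypothetical three-level nesting: top.jar > lib/middle.jar > lib/inner.jar
inner = make_jar({"B.class": b"class B"})
middle = make_jar({"lib/inner.jar": inner})
top = make_jar({"A.class": b"class A", "lib/middle.jar": middle})

print(list_inner_jars(top))
# ['lib/middle.jar', 'lib/middle.jar!/lib/inner.jar']
```

An entry that ends in ".jar" but is not a valid ZIP archive would raise zipfile.BadZipFile here; a real scan would likely need to catch that.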

4 Duplication of jar files
Since a jar file can contain another jar file inside, jar files can be duplicated
Jar files in another jar file might cause further duplication
[Figure label: Copy]

5 Question
How many jar files in a large software repository contain jar files inside?
Is there any duplication of jar files inside?
[Figure label: jar files in the library]

6 Definition: Top-level jar file
A jar file found in the repository
–A component ready to be reused
[Figure label: Top-level jar]

7 Definition: Inner jar file
A jar file that is included in another jar file
[Figure labels: A.jar (Top-level jar); Inner jar files of A.jar]

8 The experiment
Objective:
–Find how many top-level jar files contain duplicate inner jar files inside
Target:
–Maven Central repository
  Default repository of Apache Maven
  Contains many popular libraries and projects.

9 Counting inner jar files
599,498 top-level jar files in Maven Central repository (without duplications)
4,747 contain jar files inside

# inner jar files
Max       282
Average   13.1
Median    2
Min       1 (in 1,833 of top-level jar files)

10 Reused jar files
118,361 different inner jar files are contained in other jar files
89,054 of them are found as top-level jar files in Maven Central repository
–There is a possibility of causing further duplication in software projects.
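
One way to check whether an inner jar also exists as a top-level jar is to compare content hashes. The slides do not say how this match was made, so the sketch below is an assumption throughout: SHA-1 as the hash, the function names, and the byte strings standing in for jar contents are all hypothetical. The last lines only restate the slide's two counts as a share.

```python
import hashlib


def content_hash(data):
    """Hex SHA-1 digest of a jar file's bytes (the hash choice is an assumption)."""
    return hashlib.sha1(data).hexdigest()


def inner_jars_found_as_top_level(top_level_jars, inner_jars):
    """Return hashes of distinct inner jars whose content also occurs as a top-level jar."""
    top_hashes = {content_hash(data) for data in top_level_jars}
    inner_hashes = {content_hash(data) for data in inner_jars}
    return inner_hashes & top_hashes


# Hypothetical stand-ins for jar contents.
top_level_jars = [b"libA-1.0", b"libB-1.2", b"app"]
inner_jars = [b"libA-1.0", b"libA-1.0", b"libC-0.9"]
reused = inner_jars_found_as_top_level(top_level_jars, inner_jars)
print(len(reused))  # 1

# Share reported on the slide: 89,054 of 118,361 different inner jar files.
share = 89_054 / 118_361
print(f"{share:.1%}")  # 75.2%
```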

11 Duplication of inner jar files

Top-level   Contains inner jar   Having duplication
                                 Same   Different   Both   Total
#files      4,747                105    394         30     469

The same version: having the same file name and the same file hash of the contents (e.g. libA-1.0.jar hash:3bf7 and libA-1.0.jar hash:3bf7)
The different versions: having the same file name with the exception of version names (e.g. libB-1.0.jar and libB-1.2.jar)
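
The two definitions above can be sketched as a check over the inner jar files of one top-level jar. This is an illustrative sketch, not the authors' implementation: the version-stripping regular expression, the use of SHA-1, and all function names are assumptions. Files that share a name but differ in content fall into neither category under these definitions.

```python
import hashlib
import re
from collections import defaultdict

# Hypothetical heuristic: the version is whatever follows the first "-<digit>".
VERSION = re.compile(r"-(\d[\w.\-]*)\.jar$")


def split_version(file_name):
    """Split 'libA-1.0.jar' into ('libA', '1.0'); version is '' if none is found."""
    match = VERSION.search(file_name)
    if match:
        return file_name[:match.start()], match.group(1)
    return file_name[:-len(".jar")], ""


def classify_duplicates(inner_jars):
    """inner_jars: list of (path, content_bytes) for one top-level jar.

    Returns (has_same_version, has_different_versions).
    """
    copies = defaultdict(int)     # (file name, content hash) -> occurrences
    versions = defaultdict(set)   # library name -> version strings seen
    for path, content in inner_jars:
        file_name = path.rsplit("/", 1)[-1]
        copies[(file_name, hashlib.sha1(content).hexdigest())] += 1
        library, version = split_version(file_name)
        versions[library].add(version)
    has_same = any(count > 1 for count in copies.values())
    has_different = any(len(seen) > 1 for seen in versions.values())
    return has_same, has_different


lib_a = [("lib/libA-1.0.jar", b"A"), ("plugins/libA-1.0.jar", b"A")]
lib_b = [("lib/libB-1.0.jar", b"B1"), ("ext/libB-1.2.jar", b"B2")]
print(classify_duplicates(lib_a))          # (True, False)
print(classify_duplicates(lib_b))          # (False, True)
print(classify_duplicates(lib_a + lib_b))  # (True, True)
```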

12 Duplication of inner jar files
[Table repeated from slide 11]
Contain the same version of the same library
[Figure label: Ver.1]

13 Duplication of inner jar files
[Table repeated from slide 11]
Contain different versions of the same library
[Figure labels: Ver.1; Ver.2]

14 Duplication of inner jar files
[Table repeated from slide 11]
Contain both the same version and different versions of the same library
[Figure labels: Ver.1; Ver.2; Ver.1]
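
The #files figures (Same 105, Different 394, Both 30, Total 469) are consistent with reading "Same" and "Different" as each including the "Both" files, since 105 + 394 - 30 = 469. That reading is an inference from the arithmetic, not something the slides state. A small sketch of such a tally, with hypothetical per-file flags:

```python
def tally(flags):
    """flags: list of (has_same_version, has_different_versions), one per top-level jar.

    Returns (same, different, both, total), where same and different each
    include the jars counted under both.
    """
    same = sum(1 for s, _ in flags if s)
    different = sum(1 for _, d in flags if d)
    both = sum(1 for s, d in flags if s and d)
    total = sum(1 for s, d in flags if s or d)
    return same, different, both, total


# Hypothetical flags for four top-level jars.
flags = [(True, False), (False, True), (True, True), (False, False)]
print(tally(flags))  # (2, 2, 1, 3)

# The slide's figures satisfy the same inclusion-exclusion relation.
same, different, both, total = 105, 394, 30, 469
print(same + different - both == total)  # True
```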

15 Concluding remarks
About 5,000 jar files in the Maven Central repository contain other jar files
About 470 of them contain duplicate libraries
Most inner jar files are also found as Maven components
–There is still a possibility of further duplication.

16 Future work
Find duplications of jar files and class files in distributed software applications
–Eclipse, JBoss, …
Analyze the behavior of software that contains duplicated libraries
–Understanding the impact of duplication

17 HIDDEN

19 Inner Jar Files
Maven2
599,498 top-level jar files (without duplications)
4,747 contain jar files inside
Max: 282 inner jar files
Median: 2
Min: 1 (in 1,833 of top-level jar files)

20 Duplication of Inner Jar Files

Top-level   Contains inner jar   Having duplication
                                 Same   Different   Both   Total
#files      4,747                105    394         30     469
#projects   886                  39     49          14     73

The same version: having the same file name and the same file hash of the contents (e.g. libA-1.0.jar hash:3bf7 and libA-1.0.jar hash:3bf7)
The different versions: having the same file name with the exception of version names (e.g. libB-1.0.jar and libB-1.2.jar)

21 Duplication of Inner Jar Files
[Table repeated from slide 20]
Contain the same version of the same library
[Figure label: Ver.1]

22 Duplication of Inner Jar Files
[Table repeated from slide 20]
Contain different versions of the same library
[Figure labels: Ver.1; Ver.2]

23 Duplication of Inner Jar Files
[Table repeated from slide 20]
Contain both the same version and different versions of the same library
[Figure labels: Ver.1; Ver.2; Ver.1]

