Mining Large Software Compilations over Time: Another Perspective on Software Evolution. Gregorio Robles, Jesus M. Gonzalez-Barahona, Martin Michlmayr, Juan Jose Amor


1 Mining Large Software Compilations over Time: Another Perspective on Software Evolution. Gregorio Robles, Jesus M. Gonzalez-Barahona, Martin Michlmayr, Juan Jose Amor. Presented by Brian Chan, CISC 864, 12th October 2007

2 Overview: Background Information; Motivations for Paper; Problems Addressed; Solutions and Data Analysis; Conclusions; Thoughts about the paper; Questions/Comments

3 Background: Libre software (open source software). Compilations of software by vendors: group different software sources together as a product. Must be easy to install, configure, and administer.

4 Background Information: Example of libre software: Debian, a distribution built around the Linux kernel. Versions 2.0, 2.1, 2.2, 3.0 and 3.1. Lots of volunteers; all information (mail, etc.) is publicly available.

5 Motivation for Paper: The study of how products created by software compilation evolve is new. Companies have trouble categorizing all the programs built by different vendors. This differs from normal software evolution: integration vs. development. Maintenance means adding new software, not removing faults or adding new functionality.

6 Problems Addressed: Dealing with the addition and removal of packages across Debian releases, and with libre software "in the large". Addressing versioning in packages. The paper treats Debian as indicative of libre software in general because of its size.

7 Solutions/Data Gathered: Information about the product (Sources.gz) contains: name, version, list of binary packages built from it, and name and email address of the maintainer. The experiment focuses on source lines of code (SLOC), counted using SLOCCount.
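The per-package fields listed on this slide can be pulled out of a Debian-style source index with a few lines of Python. This is only a minimal sketch, not the authors' tooling: it assumes the index has already been decompressed into text made of blank-line-separated stanzas of `Field: value` lines, and the sample package names, versions and maintainers are invented for illustration.

```python
def parse_sources(text):
    """Parse blank-line-separated 'Field: value' stanzas into dicts."""
    packages = []
    for stanza in text.strip().split("\n\n"):
        fields = {}
        key = None
        for line in stanza.splitlines():
            if line[:1] in (" ", "\t") and key is not None:
                # Continuation line: append to the previous field.
                fields[key] += " " + line.strip()
            elif ":" in line:
                # Split on the first colon only, so values may contain colons.
                key, _, value = line.partition(":")
                fields[key] = value.strip()
        if fields:
            packages.append(fields)
    return packages


def summarize(fields):
    """Keep only the items named on the slide."""
    name_part, _, email_part = fields.get("Maintainer", "").partition("<")
    binaries = [b.strip() for b in fields.get("Binary", "").split(",") if b.strip()]
    return {
        "name": fields.get("Package"),
        "version": fields.get("Version"),
        "binaries": binaries,
        "maintainer": name_part.strip(),
        "email": email_part.rstrip(">").strip(),
    }


# Hypothetical sample data, not taken from any real Debian release.
SAMPLE = """\
Package: examplepkg
Binary: examplepkg, examplepkg-dev
Version: 1.2-3
Maintainer: Jane Doe <jane@example.org>

Package: otherpkg
Binary: otherpkg
Version: 0.9-1
Maintainer: John Roe <john@example.org>
"""

records = [summarize(p) for p in parse_sources(SAMPLE)]
for record in records:
    print(record)
```

Each resulting record could then be joined with a SLOC count for the same source package.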

8 Solutions/Data Gathered: SLOCCount data is transformed into relational and XML formats for viewing purposes.

9 Solutions/Data Gathered: Size is measured in MSLOC (millions of source lines of code) and in number of packages. Both roughly double every two years, with faster growth in the earlier years.
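To make the "doubles every two years" claim concrete, it can be read as exponential growth with a two-year doubling period. The sketch below works under that assumption; the function names and input figures are illustrative, not measurements from the paper.

```python
import math


def projected_size(initial, years, doubling_period=2.0):
    """Size after `years`, assuming it doubles every `doubling_period` years."""
    return initial * 2 ** (years / doubling_period)


def doubling_time(size_start, size_end, years):
    """Doubling period implied by growing from size_start to size_end in `years`."""
    return years * math.log(2) / math.log(size_end / size_start)


# Doubling every two years corresponds to about 41% growth per year.
annual_growth = projected_size(1.0, 1.0) - 1.0

print(projected_size(10.0, 4.0))       # hypothetical 10 MSLOC release, 4 years on
print(doubling_time(10.0, 40.0, 4.0))  # doubling period implied by a 4x increase
print(round(annual_growth, 3))
```

The second function is the more useful one for checking the claim: feed it the sizes of two releases and the years between them, and compare the result with two.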

10 Solutions/Data Gathered: Rule 1: Large packages grow over time. Rule 2: Many small packages are introduced. Result: The mean package size stays the same.
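The two rules pull the mean in opposite directions, which is how it can stay flat. A toy illustration with made-up package sizes (in SLOC), chosen only to show the arithmetic:

```python
from statistics import mean

# Hypothetical earlier release: three packages.
old_sizes = [10_000, 20_000, 30_000]

# Rule 1: every existing package grows by 50%.
grown = [int(size * 1.5) for size in old_sizes]

# Rule 2: new packages are introduced, most of them small.
new_packages = [5_000, 5_000, 20_000]

new_sizes = grown + new_packages

print(mean(old_sizes), mean(new_sizes))  # same mean in both releases
print(sum(old_sizes), sum(new_sizes))    # total size has doubled
```

Here the total doubles while the mean does not move, so a constant mean says little about how any individual package behaves.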

11 Solutions/Data Gathered: Common packages: same files, but updated in later versions. Common versions: same files with no change.

12 Solutions/Data Gathered: 25% of packages have been completely removed. 15% of packages have remained unchanged. The number of packages with versions in common increases.
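The categories on slides 11 and 12 (common packages, common versions, removed packages) reduce to set operations once each release is a mapping from package name to version. The sketch below assumes that a package is identified by its name and counts as unchanged when its version string is identical in both releases; that is a simplification of the slide's "same files" wording, and the release contents are invented.

```python
def compare_releases(old, new):
    """Classify packages between two releases given as {name: version} dicts."""
    old_names, new_names = set(old), set(new)
    common = old_names & new_names
    return {
        # Present in both releases, whether or not the version changed.
        "common_packages": common,
        # Present in both releases with exactly the same version string.
        "common_versions": {name for name in common if old[name] == new[name]},
        "removed": old_names - new_names,
        "added": new_names - old_names,
    }


# Hypothetical releases for illustration only.
old_release = {"alpha": "1.0-1", "beta": "2.3-1", "gamma": "0.5-2"}
new_release = {"alpha": "1.0-1", "beta": "2.4-1", "delta": "1.1-1"}

result = compare_releases(old_release, new_release)
removed_share = len(result["removed"]) / len(old_release)

for category, names in result.items():
    print(category, sorted(names))
print(f"removed share: {removed_share:.0%}")
```

Percentages like those on slide 12 would come from dividing each category's size by the package count of the earlier release.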

13 Solutions/Data Gathered: C dominates in all versions (between 55% and 85%).

14 Solutions/Data Gathered: 300% increase in lines of C code, but the overall direction is toward Python and Perl. Reasoning: many more shell scripts for installation purposes.

15 Conclusions: Evidence shows: Stable versions double in size (in packages or lines of code) every two years. The mean package size stays the same. This is not indicative of individual package behavior, because existing packages gain more files and lines while many small packages are added as well.

16 Conclusions: One developer can only handle a limited number (N) of files, but software is getting larger, so more developers are needed. C is becoming less important even though it still leads in percentage of lines.

17 Conclusions: More research is needed to find whether there is a link between skills, number of developers, complexity, and activities performed. Debian provides a good example for understanding compilation evolution.

18 Thoughts about the paper: Strong points: The data shows an interesting progression of versioning for this product, another face of software evolution. Good choice of a Linux product with mainstream versioning as the example; Ubuntu may have been too new. Good explanation of the reasons for trends, e.g. same mean, more shell code.

19 Thoughts about the paper: Points that need improvement: Borrows terms like "maintenance" but departs from their usual definition; "versioning" probably would have sufficed. Does not really explain the significance of common packages and files between versions; it just lists them. It is a bold claim that Debian is indicative of software compilation evolution as a whole: other distributions may show different patterns, so background research on that should be shown.

20 Questions/Comments

