Evolution in Open Source Software (OSS) SEVO seminar at Simula, 16 March 2006 Software Engineering (SU) group Reidar Conradi, Andreas Røsdal, Jingyue Li.




1 Evolution in Open Source Software (OSS) SEVO seminar at Simula, 16 March 2006 Software Engineering (SU) group Reidar Conradi, Andreas Røsdal, Jingyue Li Reidar Conradi, 30.jan.06

2 Motivation: Open Source Software is fast becoming the major way of making software. 38% of European IT companies used OSS in 2003, 56% in 2005 (Evans Data Corp.). Need to understand how OSS is developed and evolved, e.g. are revised processes needed? OSS is partly used “as-is”, partly made in cooperative projects.

3 Context and intention of our two OSS studies Study 1: Survey of OSS-based and COTS-based (Commercial Off-The-Shelf) development in Norway, Germany and Italy – 145 projects in companies. Studies development and risk management processes. Study 2: Data mining of evolution in two OSS projects – Mozilla and Portage. Analyzes change logs, source code etc.

4 Main findings from Study 1 (survey) Source available? – 100% in OSS projects, 30% in COTS. Source being read: 68% in OSS, 77% in COTS with source. Source being modified: 36% in OSS, 15% in COTS with source + glueware/addware.

5 Empirical Study 2 of Software Evolution in OSS Projects Goal of study: identify factors which can explain and possibly predict software evolution. How: by performing an empirical study on two open source projects and analyzing the results. Focus: observe changes in software architecture, quality and change-rates. Motivation: Reduce software maintenance costs by identifying evolution-prone software. Slide 5, 16 March 2006 - SEVO seminar on Software Evolution

6 Research questions RQ1: How much do the architectural properties of the software change over time? RQ2: Are modules with high complexity more evolution-prone than modules with low complexity? RQ3: Are modules with high coupling more evolution-prone than modules with low coupling? RQ4: Does the amount of software development decrease over time? RQ5: What is the relationship between the defect density and the architectural changes of a system?

7 Overview of Methods Used Analyzing changes over time: measuring software metrics for all releases of the software. Data mining defect reports from the defect-tracking system. Collecting change-rates from change logs and source code. Applying software metric tools to measure evolution: C and C++ Code Counter and Pythonmetric.
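As an illustration of collecting change-rates from change logs, the sketch below sums the added and removed line counts reported in `cvs log` output. It assumes the usual per-revision metadata line containing `lines: +N -M`; the function name `changed_lines` and the log excerpt are hypothetical, and the slides do not say how the study itself parsed its logs.

```python
import re

# Matches the line-count field of a cvs log revision entry, e.g.
# "date: 2004/03/01 10:15:00;  author: alice;  state: Exp;  lines: +12 -3"
LINES_RE = re.compile(r"lines:\s*\+(\d+)\s+-(\d+)")

def changed_lines(cvs_log_text):
    """Return total (added, removed) line counts found in cvs log output."""
    added = removed = 0
    for match in LINES_RE.finditer(cvs_log_text):
        added += int(match.group(1))
        removed += int(match.group(2))
    return added, removed

# Hypothetical excerpt, invented for illustration only.
SAMPLE_LOG = """\
revision 1.3
date: 2004/03/01 10:15:00;  author: alice;  state: Exp;  lines: +12 -3
fix parser
----------------------------
revision 1.2
date: 2004/02/11 09:00:00;  author: bob;  state: Exp;  lines: +5 -5
cleanup
----------------------------
revision 1.1
date: 2004/01/20 08:30:00;  author: bob;  state: Exp;
initial import
"""

added, removed = changed_lines(SAMPLE_LOG)
print(added, removed)
```

The initial revision carries no `lines:` field, so it contributes nothing to the totals.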

8 Data Sources: Mozilla (large) and Portage (small) Mozilla: Open source web browser developed in C/C++. Access to source code of 86 releases from 1999 to 2005. The latest version consists of 1.4 million lines of code. Access to change logs from CVS and reported defects from Bugzilla. Portage: Open source system utility for Gentoo Linux developed in Python. Access to source code of 4 major releases including 278 minor releases from 2003 to 2005. The latest version consists of 13 thousand lines of code. Access to change logs from CVS and reported defects from Bugzilla.

9 Software Metrics (1) The following metrics were measured for each release: Lines of Code: simple metric for the size and complexity of source code. McCabe's Cyclomatic Complexity: metric for the number of independent paths through a program; a measure of program complexity. Henry-Kafura / Shepperd: metric for the information flow to and from a module (FAN-IN, FAN-OUT); a measure of structural complexity.
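A minimal sketch of these metrics follows. It is not how C and C++ Code Counter or Pythonmetric work internally; it only approximates McCabe's measure for Python source as one plus the number of decision points, and applies the standard Henry-Kafura formula, length × (fan-in × fan-out)², together with Shepperd's variant that drops the length term. The function names and the sample function are hypothetical.

```python
import ast

# Node types counted as decision points in this simplified approximation.
DECISION_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)

def cyclomatic_complexity(source):
    """Approximate McCabe complexity: 1 + number of decision points."""
    decisions = 0
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, DECISION_NODES):
            decisions += 1
        elif isinstance(node, ast.BoolOp):
            # Each extra operand of an and/or chain adds one branch.
            decisions += len(node.values) - 1
    return 1 + decisions

def henry_kafura(length, fan_in, fan_out):
    """Henry-Kafura information flow: length * (fan_in * fan_out)^2."""
    return length * (fan_in * fan_out) ** 2

def shepperd(fan_in, fan_out):
    """Shepperd's variant of information flow, without the length term."""
    return (fan_in * fan_out) ** 2

# Hypothetical function used only to exercise the counter.
SAMPLE = """\
def f(x, y):
    if x > 0 and y > 0:
        return 1
    for i in range(x):
        if i == y:
            return i
    return 0
"""

print(cyclomatic_complexity(SAMPLE))   # two ifs, one and, one for, plus 1
print(henry_kafura(100, 3, 2), shepperd(3, 2))
```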

10 Software Metrics (2) Module Coupling: a measure for how many relations a module has to other modules. It is a way to measure semantic coherence, or how the responsibilities of a module are related. Defect density: the number of known defects divided by the size of the software. Number of changes (line-based) to a module.
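The two ratio-style metrics above can be sketched as follows. The slides do not state the size unit or the exact coupling definition, so this assumes defects per thousand lines of code (KLOC) and counts coupling as the number of distinct other modules a module depends on or is depended on by. The dependency map and defect count are hypothetical.

```python
def defect_density(known_defects, lines_of_code):
    """Known defects per thousand lines of code (KLOC is an assumed unit)."""
    return known_defects / (lines_of_code / 1000)

def coupling(module, dependencies):
    """Count distinct other modules related to `module` in either direction.

    `dependencies` maps each module name to the set of modules it uses.
    """
    outgoing = set(dependencies.get(module, set()))
    incoming = {m for m, used in dependencies.items() if module in used}
    return len((outgoing | incoming) - {module})

# Hypothetical dependency map, invented for illustration.
DEPS = {"a": {"b", "c"}, "b": {"c"}, "c": set(), "d": {"a"}}

print(defect_density(26, 13000))   # hypothetical defect count, 13 KLOC
print(coupling("a", DEPS), coupling("c", DEPS))
```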

11 Results RQ1: Architectural changes over time? McCabe's cyclomatic complexity measured in Mozilla and Portage increases over time. The measurements in Portage show a linear trend in cyclomatic complexity, while those in Mozilla show a logarithmic trend. This result is in accordance with Lehman’s 2nd law of software evolution, which states “As an E-type system is evolved its complexity increases unless work is done to maintain or reduce it.”
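One way to judge whether a trend is closer to linear or logarithmic is to fit both models by least squares and compare the R² values. The sketch below does this in plain Python on synthetic data; the helper `r_squared` and the data are hypothetical, and the slides do not say which fitting procedure the study used.

```python
import math

def r_squared(xs, ys):
    """R^2 of the ordinary least-squares line y = a + b*x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy ** 2 / (sxx * syy)

# Synthetic series: release index versus total complexity, generated
# from a logarithmic curve purely for illustration.
releases = list(range(1, 11))
complexity = [100 + 40 * math.log(r) for r in releases]

# Linear model: complexity ~ a + b * release
r2_linear = r_squared(releases, complexity)
# Logarithmic model: complexity ~ a + b * ln(release)
r2_log = r_squared([math.log(r) for r in releases], complexity)

print("linear R^2:", round(r2_linear, 3))
print("logarithmic R^2:", round(r2_log, 3))
```

The model with the higher R² describes the series better; on this synthetic input that is the logarithmic one by construction.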

12 Results RQ1: Architectural changes over time? Measurements of Information Flow for three of the largest modules in Mozilla over a period of 5 years: all with increasing measures. This metric will be compared to defect density in RQ5.

13 Results RQ2: Evolution-proneness of complex modules Question: Does module complexity have an impact on evolution-proneness? Approach: A sample of 30 modules with high cyclomatic complexity and a sample of 30 modules with low cyclomatic complexity were taken. Evolution-proneness was measured as the number of changes to these modules, collected from the change log. Result: A t-test performed on the samples supports the hypothesis that modules with high cyclomatic complexity have a higher number of changes than modules with low cyclomatic complexity.
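The slides do not say which t-test variant was applied. As a sketch, the code below computes Welch's two-sample t statistic and its approximate degrees of freedom for two samples of 30 change counts; the resulting t value would then be compared against a t-distribution table. The change counts are synthetic and the function name is hypothetical.

```python
import math
import statistics

def welch_t(sample_a, sample_b):
    """Return (t, degrees_of_freedom) for Welch's two-sample t-test."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Per-sample variance of the mean (sample variance divided by n).
    se_a = statistics.variance(sample_a) / n_a
    se_b = statistics.variance(sample_b) / n_b
    t = (statistics.mean(sample_a) - statistics.mean(sample_b)) / math.sqrt(se_a + se_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (se_a + se_b) ** 2 / (se_a ** 2 / (n_a - 1) + se_b ** 2 / (n_b - 1))
    return t, df

# Synthetic change counts for 30 high-complexity and 30 low-complexity
# modules, invented for illustration only.
high_complexity = [20 + (i % 7) for i in range(30)]
low_complexity = [10 + (i % 5) for i in range(30)]

t_value, dof = welch_t(high_complexity, low_complexity)
print(round(t_value, 2), round(dof, 1))
```

A large positive t indicates that the high-complexity sample has the higher mean number of changes. The same calculation applies unchanged to samples split by coupling.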

14 Results RQ3: Evolution-proneness of modules with high coupling Question: Does module coupling have an impact on evolution-proneness? Approach: A sample of 30 modules with high coupling and a sample of 30 modules with low coupling were taken, and the number of changes to these modules was measured. Result: A t-test performed on the samples supports the hypothesis that modules with high coupling have a higher number of changes than modules with low coupling.

15 Results RQ4: Decreasing software change over time? Observation for both OSS systems: the cumulative number of changes increases linearly over time. This means the amount of software development, measured by the number of changed lines, proceeded at a constant rate over time. Also: LOC > 15X code changes!! This result supports Lehman’s 4th law of software evolution, in that “the average activity rate in an E-type system tends to remain constant over system lifetime or segments of that lifetime”.
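A linearly growing cumulative count is equivalent to a roughly constant number of changes per period. The sketch below builds the cumulative series and compares the mean change rate of the second half of the history with that of the first half; a ratio near 1 suggests a constant rate, while a ratio well below 1 would suggest decreasing activity. The per-period counts and function names are hypothetical.

```python
from itertools import accumulate

def cumulative(changes_per_period):
    """Running total of changed lines across periods."""
    return list(accumulate(changes_per_period))

def half_rate_ratio(changes_per_period):
    """Mean changes per period in the second half divided by the first half."""
    mid = len(changes_per_period) // 2
    first = sum(changes_per_period[:mid]) / mid
    second = sum(changes_per_period[mid:]) / (len(changes_per_period) - mid)
    return second / first

# Hypothetical changed-line counts per period, invented for illustration.
changes = [100, 110, 90, 105, 95, 100, 98, 102]

totals = cumulative(changes)
ratio = half_rate_ratio(changes)
print(totals)
print(round(ratio, 3))
```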

16 Results RQ5: Defect density versus architectural changes A study by Allen Nikora and John Munson indicates that measures of an evolving system's structure are strongly related to its number of faults. To answer RQ5, defect density and Information Flow were measured in three of the largest modules in Mozilla and compared in the plots below. No clear pattern!

17 Results RQ5: Defect density versus architectural changes The defect density of both OSS systems was collected by data mining the shared Bugzilla defect-tracking system. Both systems show an increasing defect density over time, though least for the largest system.

18 Conclusions Little difference in OSS/COTS source reads/modifications, so similar in practice. RQ1: Architectural change: more in large systems. RQ2: High complexity is correlated with relatively more changes. RQ3: High coupling is correlated with relatively more changes. RQ4: Linear change rates over time. RQ5: Defect density and architectural changes: unrelated. Rather use change rates in RQ2 and RQ3? Do RQ2/RQ3 indicate that modules should be split/merged? Problem with “over-counting” line moves in change logs? Are Mozilla and Portage representative OSS systems – too volatile? Much meat for future studies!

