Presenter: Randy Hunt Presenter: Vitaliy Krestnikov

1 Presentation on the article: Identifying effective software metrics using genetic algorithms
Presenter: Randy Hunt Presenter: Vitaliy Krestnikov Date: April 27, 2009 course: comp 589 9/23/2018

2 Introduction Team leaders commonly use software metrics as a measure of the overall quality of the design and the eventual implementation of systems. Predicting the quality of a software object from a set of software metrics is, in essence, a classification problem.

3 Classification Take a set of objects with known features (software metrics), combine them with group labels (quality rankings), and you get a classifier that can predict the quality of new objects using only the computed metrics.

4 Software Metrics Software metrics quantitatively map a set of numerical values, such as the number of lines of code in a file or the number of methods in a class, to a subjective measure of quality in terms of apparent complexity, maintainability and usability. Not all metrics provide the same classification power, though, and different combinations of metrics can yield the results a particular audience is looking for.

5 Proposition This article proposes using a genetic algorithm (GA) feature-selection procedure to identify the optimal metrics for the classification process. To test this proposal, the EvIdent software was used.

6 Software Metrics All 338 software objects in EvIdent were subjectively labeled by an experienced software architect in terms of maintainability. The architect ranked each Java class as low, medium-low, medium or high. High represents easy to modify. Low represents difficult to modify.

7 Software Metrics There were 16 different software metrics used:
LOC, SLC, CLC, WLC, RCC, RCS, SMC, MET, ANL, CAN, AE, ALC, ASC, ASL, ACC, AEC

8 The Genetic Algorithm

9 Genetic Algorithm step 1: initialize population
Population of genes: each chromosome is a software metric. [Table: genes #1 to #5, each a row of bits under the chromosome columns TYP, LOC, S, W, R, SMC, ME, ANL, CAN, AIE; the bit values were lost in extraction.]
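A minimal sketch of this initialization step, assuming each gene is a list of 0/1 bits with one bit per metric (1 means the metric is used by the classifier). The metric names come from the 16 abbreviations listed on the metrics slide; the function names and the random initialization are illustrative assumptions, not taken from the article.

```python
import random

# The 16 metric abbreviations from the metrics slide, one bit position each.
METRICS = ["LOC", "SLC", "CLC", "WLC", "RCC", "RCS", "SMC", "MET",
           "ANL", "CAN", "AE", "ALC", "ASC", "ASL", "ACC", "AEC"]


def random_gene(rng, n_bits=len(METRICS)):
    """Return one gene: a list of 0/1 bits, one bit per metric."""
    return [rng.randint(0, 1) for _ in range(n_bits)]


def init_population(rng, size):
    """Return `size` random genes (the initial population)."""
    return [random_gene(rng) for _ in range(size)]


def selected_metrics(gene):
    """List the names of the metrics whose bit is set to 1."""
    return [name for name, bit in zip(METRICS, gene) if bit]


rng = random.Random(0)          # fixed seed so the run is repeatable
population = init_population(rng, size=5)
for gene in population:
    print(gene, selected_metrics(gene))
```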

10 2. Begin the algorithm* for creating offspring for generation N, starting with generation 1
* The algorithm is shown on the following slides.

11 3. Calculate fitness by LDA* 4. Select pair based on fitness
* LDA is explained later in this presentation. [Table: genes #1 to #5 under the chromosome columns TYP, LOC, S, W, R, SMC, ME, ANL, CAN, AIE; the bit values were lost in extraction.] Fitness (LDA %): gene #1 = 44, gene #2 = 63, gene #3 = 37, gene #4 = 67, gene #5 = 50.
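The slides do not say how the pair is chosen "based on fitness". One common choice is fitness-proportional (roulette-wheel) selection, sketched below as an assumption; `select_pair` is a hypothetical helper name. The fitness values are the LDA percentages shown on the slide.

```python
import random


def select_pair(fitness, rng):
    """Pick two distinct gene indices, each with probability proportional to fitness."""
    indices = list(range(len(fitness)))
    first = rng.choices(indices, weights=fitness, k=1)[0]
    # Remove the first parent so the second pick is a different gene.
    rest = [i for i in indices if i != first]
    second = rng.choices(rest, weights=[fitness[i] for i in rest], k=1)[0]
    return first, second


fitness = [44, 63, 37, 67, 50]   # LDA accuracy (%) for genes #1 to #5
a, b = select_pair(fitness, random.Random(0))
print("selected genes:", a + 1, "and", b + 1)
```

Fitter genes are more likely to be picked, but weaker genes still have a chance, which keeps some diversity in the population.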

12 5. Produce child gene by swapping bits starting from the randomly-picked crossover point
[Table: genes #2 and #4 and the resulting crossover child under the chromosome columns TYP, LOC, S, W, R, SMC, ME, ANL, CAN, AIE, with a * marker; the bit values were lost in extraction.]
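A minimal sketch of single-point crossover, assuming genes are equal-length bit lists: the child keeps the first parent's bits up to the crossover point and takes the second parent's bits from that point on. The function name and the choice of which parent supplies the head are illustrative assumptions.

```python
import random


def crossover(parent_a, parent_b, rng):
    """Single-point crossover: head of parent_a joined to the tail of parent_b."""
    # Pick a point strictly inside the gene so both parents contribute bits.
    point = rng.randint(1, len(parent_a) - 1)
    child = parent_a[:point] + parent_b[point:]
    return child, point


parent_a = [0] * 10   # toy parents, chosen so the crossover point is easy to see
parent_b = [1] * 10
child, point = crossover(parent_a, parent_b, random.Random(0))
print(point, child)
```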

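13 6. Mutate each child bit for which a random number falls below the control parameter (the mutation probability)
* The control parameter should be small (e.g. 10%). [Table: the crossover child and its mutated version under the chromosome columns TYP, LOC, S, W, R, SMC, ME, ANL, CAN, AIE; the bit values were lost in extraction.]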
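A minimal sketch of bit-flip mutation, assuming the control parameter is the per-bit mutation probability, so a bit flips when a uniform random draw in [0, 1) falls below it. The function name `mutate` is hypothetical.

```python
import random


def mutate(gene, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [1 - bit if rng.random() < rate else bit for bit in gene]


child = [1, 0, 1, 1, 0, 0, 1, 0]
mutated = mutate(child, rate=0.10, rng=random.Random(0))
print(child)
print(mutated)
```

With a rate of 10%, most bits pass through unchanged, so mutation nudges the child rather than scrambling it.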

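14 7. Insert child into population, replacing the least fit gene
[Table: gene #1, gene #2, child, gene #4 and gene #5 under the chromosome columns TYP, LOC, S, W, R, SMC, ME, ANL, CAN, AIE; the bit values were lost in extraction.] Fitness (LDA %): gene #1 = 44, gene #2 = 63, child = N/A, gene #4 = 67, gene #5 = 50.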
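A minimal sketch of the replacement step, using the fitness values from the slides. The gene labels stand in for the actual bit lists, and `replace_least_fit` is a hypothetical helper name.

```python
def replace_least_fit(population, fitness, child):
    """Overwrite the least-fit gene with the child and return its index."""
    worst = min(range(len(fitness)), key=fitness.__getitem__)
    population[worst] = child
    # The child has not been evaluated yet (shown as N/A on the slide),
    # so its fitness must be recomputed before the next selection.
    fitness[worst] = None
    return worst


population = ["gene #1", "gene #2", "gene #3", "gene #4", "gene #5"]
fitness = [44, 63, 37, 67, 50]
index = replace_least_fit(population, fitness, "child")
print(index, population, fitness)
```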

15 8. Return to step 3 and repeat this process until one generation has reproduced.
* There is a control parameter, the number of elite genes (those which survive to the next generation), which determines when one generation is complete.

16 9. Return to step 2 and repeat for the next generation, until N generations have reproduced.
* There is a control parameter, the number of generations, which determines when this loop terminates.

17 Control parameters for the GA
Number of genes in the population; number of generations; percent of elite genes (those that survive to the next generation); the probability of mutations. * In the previous example, we have a very small population and only one reproduction was demonstrated. There are many reproductions per generation.
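A rough sketch of how these four control parameters could drive the two nested loops (steps 3 to 8 inside one generation, step 9 across generations). Several points are assumptions rather than details from the article: the default parameter values, the reading that one generation is complete once every non-elite slot has been refilled, roulette-wheel selection (which may occasionally pick the same parent twice), and the toy fitness function, which simply counts selected bits in place of LDA accuracy.

```python
import random
from dataclasses import dataclass


@dataclass
class GAParams:
    population_size: int = 20     # number of genes in the population
    generations: int = 30         # number of generations
    elite_fraction: float = 0.2   # share of genes that survive each generation
    mutation_rate: float = 0.1    # per-bit probability of mutation


def run_ga(fitness_fn, n_bits, params, rng):
    """Return the fittest gene found; fitness_fn maps a bit list to a number >= 0."""
    pop = [[rng.randint(0, 1) for _ in range(n_bits)]
           for _ in range(params.population_size)]
    n_elite = int(params.elite_fraction * params.population_size)
    for _ in range(params.generations):
        # One generation: many reproductions, one per non-elite slot.
        for _ in range(params.population_size - n_elite):
            scores = [fitness_fn(g) for g in pop]
            weights = [s + 1e-9 for s in scores]   # avoid an all-zero weight list
            mum, dad = rng.choices(pop, weights=weights, k=2)
            point = rng.randint(1, n_bits - 1)
            child = mum[:point] + dad[point:]
            child = [1 - b if rng.random() < params.mutation_rate else b
                     for b in child]
            worst = min(range(len(pop)), key=scores.__getitem__)
            pop[worst] = child
    return max(pop, key=fitness_fn)


# Toy run: `sum` (the count of selected bits) stands in for LDA accuracy.
best = run_ga(sum, 16, GAParams(), random.Random(0))
print(best, sum(best))
```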


19 Linear Discriminant Analysis (LDA)

20 Computing “Fitness” using LDA
[Diagram: three labeled Java objects feed an objective function using LDA, which returns a fitness value.] Java object: Zoo. Quality ranking: low. SW metrics: TYP: 1, LOC: 539, SLC: 401, CLC: 138, etc. Java object: Bar. Quality ranking: low. SW metrics: TYP: 1, LOC: 539, SLC: 401, CLC: 138, etc. Java object: Foo. Quality ranking: high. SW metrics: TYP: 1, LOC: 539, SLC: 401, CLC: 138, etc. * This shows computing fitness for only one gene (set of SW metrics).
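A rough sketch of a fitness function of this shape: keep only the metrics a gene selects, classify every object, and report the percentage classified correctly. To stay within the standard library it uses a nearest-centroid classifier as a stand-in for LDA (LDA reduces to nearest-centroid when the groups have equal priors and a shared spherical covariance). It also scores on the same objects it was fitted to, which is an assumption about the evaluation. The data values are invented purely for illustration.

```python
def project(row, gene):
    """Keep only the metric values whose bit in the gene is 1."""
    return [value for value, bit in zip(row, gene) if bit]


def centroid(rows):
    """Component-wise mean of a list of equal-length rows."""
    return [sum(column) / len(rows) for column in zip(*rows)]


def fitness(gene, rows, labels):
    """Percent of objects whose nearest group centroid matches their label."""
    if not any(gene):
        return 0.0   # no metrics selected, nothing to classify with
    data = [project(row, gene) for row in rows]
    groups = {}
    for x, label in zip(data, labels):
        groups.setdefault(label, []).append(x)
    centres = {label: centroid(xs) for label, xs in groups.items()}

    def predict(x):
        # Squared Euclidean distance to each group centroid; ties go to
        # the group that appears first in the labels.
        return min(centres,
                   key=lambda g: sum((a - b) ** 2 for a, b in zip(x, centres[g])))

    hits = sum(predict(x) == label for x, label in zip(data, labels))
    return 100.0 * hits / len(labels)


# Hypothetical toy data: two metrics per object, values made up.
rows = [[100, 5], [120, 6], [900, 5], [950, 6]]
labels = ["high", "high", "low", "low"]
print(fitness([1, 0], rows, labels))   # uses the first metric only
print(fitness([0, 1], rows, labels))   # uses the second metric only
```

In the toy data the first metric separates the two groups and the second does not, so a gene that selects only the first metric scores higher.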

21 [Graph: software objects plotted on two axes, SW metric (“known feature”) LOC and SW metric (“known feature”) TYP, with regions for the groups high, medium, medium-low and low.]
* In reality, we can have up to 16 dimensions (only 2 are shown here).

22 Aspects of LDA function logic
For a point on the previous graph, the LDA algorithm allocates it to the group whose probability distribution gives that point the greatest probability. The prior probability of each group (typically its share of the labeled objects) is also a factor.
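In standard textbook notation (the symbols below are not from the slides), this allocation rule can be written as follows. Here $x$ is the vector of metric values for one object, $p$ is the number of metrics used, $\pi_k$ is the prior probability of group $k$, $\mu_k$ is the mean of group $k$, and $\Sigma$ is the covariance matrix that LDA assumes is shared by all groups.

```latex
\hat{k}(x) = \arg\max_{k} \; \pi_k \, f_k(x),
\qquad
f_k(x) = \frac{1}{(2\pi)^{p/2} \, |\Sigma|^{1/2}}
         \exp\!\left( -\tfrac{1}{2} (x - \mu_k)^{\top} \Sigma^{-1} (x - \mu_k) \right)

\delta_k(x) = x^{\top} \Sigma^{-1} \mu_k
            - \tfrac{1}{2} \, \mu_k^{\top} \Sigma^{-1} \mu_k
            + \ln \pi_k
```

Maximizing $\pi_k f_k(x)$ is equivalent to maximizing the linear discriminant $\delta_k(x)$, because the terms dropped after taking logarithms do not depend on $k$. The $\ln \pi_k$ term is where the prior enters the decision.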

23 Results

24 Top 5 These are the 6 metrics that were common to the top 5 genes: SLC, WLC, RCC, AE, ASL, ACC

25 Conclusion The metrics selected by the GA appear to indicate that easy-to-read code, together with comments, helps developers understand the purpose of the code.

