1 INFO 636 Software Engineering Process I
Prof. Glenn Booker
Weeks 4-5 – Estimating Software Size
www.ischool.drexel.edu

2 Why Plan?
As emphasized earlier, we need a good estimate of the amount of work to be performed in order to predict effort and time accurately (per Boehm)
Estimation is one of the most challenging aspects of managing software development, hence our substantial focus on it here

3 Estimation Example
Other fields have well-established formulas for estimating work
– Construction knows the cost per square foot of various types of construction
– More complex projects look at the linear amount of walls, and the areas of various parts (walls, ceilings, etc.), to develop good estimates

4 Size Estimation Process
The framework, or process, for planning a project was covered last lecture
– Define system requirements
– Produce conceptual design
– Estimate product size
– Estimate resources and schedule
– Develop the product
– Refine basis for later estimates

5 Estimation Tools
Most software estimation tools have been calibrated to use software size as an input, and produce effort and schedule as outputs
– COCOMO, SLIM, PriceS, and McConnell’s tables in Rapid Development
– Often start at fairly large project sizes, e.g. 10,000 LOC and up

6 Estimation Tools
We need a basis for estimation which works for an individual (programmer)
Most organizations use either no estimation methods, or use terribly unreliable ones
– 100% error is far too common

7 Desired Estimation Goals
Criteria for a good estimation method include:
– Use structured and trainable methods
– Should apply to both development and maintenance
– Should be able to handle all aspects of development, not just code

8 Desired Estimation Goals
– It should be suitable for statistical analysis
– It should be adaptable to future types of work
– It should be possible to judge the accuracy of your work (and hence refine the model)
We’ll briefly cover four estimation methods, then explain the proxy-based PROBE approach

9 Estimation Methods
– Wideband-Delphi Method
– Fuzzy Logic Method
– Standard Component Method
– Function Point Method
– Proxy-based Estimating

10 Wideband-Delphi Method
This method was developed by the Rand Corporation
It uses several people to estimate the same task, then applies a Delphi method to get a consensus estimate
The process is:
– Discuss the problem

11 Wideband-Delphi Method
– Get anonymous estimates, and hand them to a moderator
– Find the median estimate, and show everyone the set of estimates
– Discuss the results, to uncover different views of the project scope
Repeat the process until estimates converge to within a predefined range
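
The moderator's bookkeeping for each round can be sketched in a few lines of Python. This is only an illustration, not part of the method as described: the function name `delphi_round` and the convergence rule (spread of the estimates compared with a tolerance times the median) are assumptions standing in for the "predefined range", and the estimates are made up.

```python
from statistics import median

def delphi_round(estimates, tolerance=0.25):
    """Summarize one round of anonymous estimates.

    Returns (median_estimate, converged). The convergence rule is an
    assumption: the spread (max - min) must be no more than
    `tolerance` times the median.
    """
    mid = median(estimates)
    spread = max(estimates) - min(estimates)
    return mid, spread <= tolerance * mid

# Hypothetical estimates (in LOC) from two successive rounds
print(delphi_round([400, 900, 650, 1200]))  # wide disagreement, keep discussing
print(delphi_round([650, 700, 720, 760]))   # estimates have pulled together
```

In practice the discussion between rounds matters more than the arithmetic; the code only decides when to stop.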

12 Fuzzy Logic Method
This approach uses historic data to arrive at some meaningful estimates based on qualitative descriptions
– Size categories such as Very Small, Small, Medium, Large, and Very Large
How data are divided into these categories depends on the type of data

13 Fuzzy Logic Method
Data with a small range (say, a factor of five from very small to very large) can use linear divisions
Data with a large range can use a base-10 logarithmic division (as shown in the text)

14 Fuzzy Logic Method
Linear division breaks up sizes into evenly divided pieces
Here’s an example for the N track
– If your work to read the text involves chapters from 23 to 75 pages long (I made those numbers up), then the range of sizes is 75-23=52 pages
– Five categories have five midpoints with four gaps between them, so divide the range by four to get the spacing: 52/4 = 13

15 Fuzzy Logic Method
– The midpoints of each size are just the lowest size, then add the 13 four times
    Very Small midpoint = 23 pages
    Small midpoint = 23+13=36 pages
    Medium midpoint = 23+13*2=49 pages
    Large midpoint = 23+13*3=62 pages
    Very Large midpoint = 23+13*4=75 pages (which equals the largest chapter size)

16 Fuzzy Logic Method
– Use half of 13, or 6.5, to find the ranges for each size
    Very Small range is up to 23+6.5=29.5 pages
    Small range is 29.5 to 36+6.5=42.5 pages
    Medium range is 42.5 to 49+6.5=55.5 pages
    Large range is 55.5 to 62+6.5=68.5 pages
    Very Large range is 68.5 pages and up
– Notice the Small, Medium, and Large ranges are each 13 pages wide, since we have linear divisions
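
The linear division can be sketched in Python as follows. The function name `linear_size_ranges` is made up for illustration; the numbers are the 23-to-75-page chapter example from the slides.

```python
def linear_size_ranges(smallest, largest, categories=5):
    """Return (midpoints, boundaries) for evenly spaced size categories."""
    # Five categories have four gaps between their midpoints.
    step = (largest - smallest) / (categories - 1)
    midpoints = [smallest + i * step for i in range(categories)]
    # Each boundary sits half a step above the midpoint below it.
    boundaries = [m + step / 2 for m in midpoints[:-1]]
    return midpoints, boundaries

midpoints, boundaries = linear_size_ranges(23, 75)
print(midpoints)   # [23.0, 36.0, 49.0, 62.0, 75.0]
print(boundaries)  # [29.5, 42.5, 55.5, 68.5]
```

The four boundaries separate the five categories: anything below 29.5 pages is Very Small, and anything above 68.5 pages is Very Large.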

17 Fuzzy Logic Method
The logarithmic version is messier, since we have to
– Convert the sizes to their base-10 logarithms
– Follow the linear approach using the logarithms
– Raise 10 to the power of each result to convert it back to the original units

18 Fuzzy Logic Method
– The example in the book has LOC ranging from 173 to 10,341 LOC
    The log10 of 173 is 2.238
    The log10 of 10,341 is 4.014
– The difference is 4.014 - 2.238 = 1.776
– Divide the difference by four to get the interval: 1.776/4 = 0.444
– Mimic slide 15 to find the midpoints

19 Fuzzy Logic Method
– The midpoints of each size are just the lowest size, then add the 0.444 four times
    Very Small midpoint = 2.238
    Small midpoint = 2.238 + 0.444 = 2.682
    Medium midpoint = 2.238 + 0.444*2 = 3.126
    Large midpoint = 2.238 + 0.444*3 = 3.570
    Very Large midpoint = 2.238 + 0.444*4 = 4.014 (which equals the largest code size)
– Mimic slide 16 to find the ranges of each size category

20 Fuzzy Logic Method
– Use half of 0.444, or 0.222, to find the ranges for the first size (then just keep adding 0.444 to each range boundary)
    Very Small range is up to 2.238+0.222=2.460
    Small range is 2.460 to 2.460+0.444=2.904
    Medium range is 2.904 to 2.904+0.444=3.348
    Large range is 3.348 to 3.348+0.444=3.792
    Very Large range is 3.792 and up

21 Fuzzy Logic Method
– Now take 10 to the power of the logarithms to find the actual LOC
    Very Small range is up to 10^2.460=288 LOC
    Small range is 288 to 10^2.904=802 LOC
    Medium range is 802 to 10^3.348=2228 LOC
    Large range is 2228 to 10^3.792=6194 LOC
    Very Large range is 6194 LOC and up
– This is the basis for the poorly labeled table at the bottom of page 104 in the text
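
The logarithmic division follows the same pattern, just carried out on the logarithms. Here is a rough Python sketch using the book's 173 to 10,341 LOC example; the function name `log_size_ranges` is made up. Because the code keeps full precision instead of rounding the logarithms to three decimals, its boundaries land a few LOC away from the hand-calculated values on the slides.

```python
import math

def log_size_ranges(smallest, largest, categories=5):
    """Return (midpoints, boundaries) in original units, evenly spaced in log10."""
    lo, hi = math.log10(smallest), math.log10(largest)
    step = (hi - lo) / (categories - 1)
    log_midpoints = [lo + i * step for i in range(categories)]
    log_boundaries = [m + step / 2 for m in log_midpoints[:-1]]
    # Raise 10 to each logarithm to get back to LOC.
    midpoints = [10 ** m for m in log_midpoints]
    boundaries = [10 ** b for b in log_boundaries]
    return midpoints, boundaries

midpoints, boundaries = log_size_ranges(173, 10341)
for name, upper in zip(["Very Small", "Small", "Medium", "Large"], boundaries):
    print(f"{name}: up to about {upper:.0f} LOC")
```

Each category spans the same ratio of sizes rather than the same number of LOC, which is why this form suits data covering a wide range.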

22 Fuzzy Logic Method
An aside… Table 5.2 in the text divides each of the five basic categories (Very Small, etc.) into five more “subranges”
– This follows the same approach, just adding more detail to each category
– It’s unlikely you’ll have enough data to worry about subranges

23 Standard Component Method
The Standard Component Method, by Putnam, assumes you have a substantial database from which to make your estimates
– Make a realistic estimate of how many screens you think will be in your system
– Estimate the lowest and highest possible numbers of screens you could imagine will be in your system

24 Standard Component Method
– For actual estimation, use n = (lowest number + highest number + 4*realistic number)/6
– The idea is to try to account for possible error in your estimate
Repeat this process for each type of component in your system
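
As a small illustration, the weighted formula looks like this in Python. The function name and the screen counts are hypothetical.

```python
def standard_component_estimate(lowest, realistic, highest):
    """Weighted estimate: the realistic value counts four times."""
    return (lowest + highest + 4 * realistic) / 6

# Hypothetical screen counts for one component type
screens = standard_component_estimate(lowest=8, realistic=12, highest=25)
print(screens)  # 13.5
```

The result is pulled above the realistic guess of 12 because the highest case (25) is further from it than the lowest case (8).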

25 Function Point Method
The function point approach uses “function points” as a proxy for the complexity of the system, independent of the programming language used

26 Function Point Method
– Each input or output function, interface, file, and inquiry is judged on a fixed complexity scale of small to large (not shown in the Humphrey text), and assigned some number of function points
– The total number of function points is adjusted for 14 “influence” factors, such as the developers’ expertise, business environment, etc.

27 Function Point Method
While function points are a great language-independent method for judging the complexity of a program, they aren’t as reliable for estimating development effort
– See IFPUG for more details

28 Proxy-based Estimating
We are trying to predict the final size of a software product
Measuring or estimating that directly is tricky at best, so we use proxies to help get there
– A proxy is an intermediate concept or substitute for what we really want to predict

29 Proxy-based Estimating
The overall process is like this
– We want to take the conceptual design, and break it into parts which correspond to the proxies available
– Estimate each part of the system, based on the proxies
– Add them up to get the overall product size

30 Choosing a Proxy
– The proxy size should correspond to the development effort size
– Proxy content should be countable and easy to visualize
– Proxy must be customizable
– The proxy should be sensitive to the same factors which affect development

31 Possible Proxies
In a manner similar to function points, any characteristic of the system could serve as a proxy
– Input screens, output reports, data files
– Objects or classes
The fuzzy logic and function point concepts are essentially blended to produce the PROBE approach

32 PROBE Method
PROxy-Based Estimation (PROBE) uses objects as proxies
– See also Appendix C, Tables C36 and C40
First choose appropriate proxy categories (e.g. Table 5.7, p. 117)
– For code, categories such as calculation, data, I/O, control, and print might be suitable proxies
– Reading, discussion, homework,… (N track)

33 PROBE Method
Choose reasonable size options for the proxies
– For class, you might only have enough data for three sizes instead of five
Analyze your historic data to determine approximate sizes (LOC) for each proxy
– For N track, the amount of effort needed

34 PROBE Method
Now start using your method for a given assignment
– Develop a conceptual design for the solution
– Use your proxies to estimate the amount of code or effort needed to develop them
– The example on page 120 is the first use of form C39 (p. 683)

35 A Course Note
P track students will use the estimating pretty much as written in the text
– Our forms are slightly different
N track students will develop their own proxies to correspond to their weekly activities, and create a custom form N39 to follow a similar process

36 PROBE Method
The BASE PROGRAM section of C39 is a summary of the expected changes to the preexisting code
– Base Size (B) is the amount of code already present
– LOC Deleted (D) is how much existing code you plan to remove
– LOC Modified (M) is how much existing code you expect to change

37 PROBE Method
The PROJECTED LOC section contains:
– Base Additions (BA) are planned additions to existing code (new lines within existing modules)
– New Objects (NO) are new modules or classes which will need to be implemented
Your proxy structure is used to describe the Type, Methods, and Relative Size of the changes to BA and NO
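
To make the Type / Methods / Relative Size idea concrete, here is a minimal Python sketch of totalling New Objects from a proxy table. Everything in it is hypothetical: the table `LOC_PER_METHOD`, its LOC-per-method values, and the three size options stand in for whatever your own historic data gives you.

```python
# Hypothetical LOC-per-method values derived from historic data:
# LOC_PER_METHOD[type][relative_size]
LOC_PER_METHOD = {
    "calculation": {"small": 6.0, "medium": 12.0, "large": 24.0},
    "data":        {"small": 4.0, "medium": 8.0,  "large": 16.0},
    "io":          {"small": 9.0, "medium": 18.0, "large": 36.0},
}

def new_object_loc(objects):
    """Sum estimated LOC over (type, methods, relative_size) entries."""
    return sum(methods * LOC_PER_METHOD[kind][size]
               for kind, methods, size in objects)

planned_objects = [
    ("calculation", 3, "medium"),  # 3 methods * 12 LOC = 36
    ("io", 2, "small"),            # 2 methods * 9 LOC  = 18
]
NO = new_object_loc(planned_objects)
print(NO)  # 54.0
```

Base Additions can be sized the same way, by describing each planned addition with a type, a method count, and a relative size.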

38 PROBE Method
The REUSED OBJECTS (R) section of C39 is used to describe
– Code you’ll reuse from another preexisting source
– Code you’ll create during this assignment which will be reusable
These tend to be rare during the course

39 PROBE Method
Now comes the number crunching part
– The Projected LOC (P) is the total amount of new development for this assignment; P = BA + NO
– The terms β0 (hereafter beta0) and β1 (beta1) are linear regression parameters from your work history
– By now you have a history of planned LOC or effort, and the corresponding actual values

40 PROBE Method
What are beta0 and beta1?
The classic equation for a line is y = mx + b
– ‘m’ is the slope, which corresponds to beta1
– ‘b’ is the y-intercept, which is beta0
Here the ‘x’ axis is the planned LOC or effort, and the ‘y’ axis has actual values

41 PROBE Method
[Figure: scatter plot of Actual LOC (Y) against Planned LOC (X). The x marks are data points from weekly assignments, with the linear regression line drawn through them. Beta1 is the slope of the line (its rise per unit of X) and Beta0 is its y-intercept.]

42 PROBE Method
See “regression” handout for an example of calculating beta0 and beta1
– Note that $\sum x_i^2$ means $\sum (x_i^2)$, not $[\sum x_i]^2$
When you use this, make sure the formulas are correct
– ‘n’ changes each week as new data is created
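
For reference, a least-squares fit can be sketched in Python as below. It uses the standard textbook formulas, which are assumed (not confirmed here) to match the "regression" handout; the planned and actual LOC values are hypothetical.

```python
def linear_regression(xs, ys):
    """Return (beta0, beta1) for the least-squares line y = beta0 + beta1*x."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two (x, y) pairs of equal length")
    x_avg = sum(xs) / n
    y_avg = sum(ys) / n
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)  # sum of the squares, not square of the sum
    beta1 = (sum_xy - n * x_avg * y_avg) / (sum_x2 - n * x_avg ** 2)
    beta0 = y_avg - beta1 * x_avg
    return beta0, beta1

planned = [100, 200, 300, 400, 500]  # hypothetical planned LOC (x)
actual = [100, 300, 200, 500, 400]   # hypothetical actual LOC (y)
beta0, beta1 = linear_regression(planned, actual)
print(f"beta0 = {beta0:.1f}, beta1 = {beta1:.2f}")
```

With this made-up history the fit comes out to roughly beta0 = 60 and beta1 = 0.8. Rerun it each week, since 'n' grows as new data pairs are added.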

43 PROBE Method
Incidentally, if your estimates were always perfect, you’d have beta1 = 1, and beta0 = 0 (why?)
Once you have beta0 and beta1, find:
– New and Changed LOC (N) = beta0 + beta1*(P + M)
– It’s critical to note that later calculations for prediction interval use ‘N’, not ‘P’

44 PROBE Method
The expected size of the application after this project is
– Total LOC (T) = N + B - D - M + R
The Total New Reused is the sum of the code flagged (with a *) in the New Objects section that is being reused
– Don’t need to use this very often
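
Putting the P, N, and T formulas together, a minimal Python sketch might look like this. The function name `probe_size` and all input values are hypothetical; the formulas are the ones given on the slides.

```python
def probe_size(B, D, M, BA, NO, R, beta0, beta1):
    """Return (P, N, T) from the planning-form quantities."""
    P = BA + NO                  # Projected LOC
    N = beta0 + beta1 * (P + M)  # New and Changed LOC
    T = N + B - D - M + R        # Total LOC after this project
    return P, N, T

# Hypothetical inputs: 500 LOC base, 20 deleted, 40 modified,
# 60 base additions, 150 LOC of new objects, nothing reused.
P, N, T = probe_size(B=500, D=20, M=40, BA=60, NO=150, R=0,
                     beta0=60.0, beta1=0.8)
print(f"P = {P}, N = {N:.0f}, T = {T:.0f}")
```

Here P = 210, N comes out at about 260, and T at about 700.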

45 PROBE Method
Then we get to the Range calculation
We have a refined estimate of the size of the system, but want to establish a prediction interval in which the real outcome is likely to fall
– See the PSP_Calculation_Example.xls spreadsheet

46 PROBE Method
To find the Range, we start with a parameter from the ‘t’ distribution, called ‘t(α/2, n-2)’, where
– α/2 is the width of the prediction interval, generally 70% or 90%
– ‘n-2’ is the number of degrees of freedom; again, ‘n’ is the number of data pairs
– In Excel, use TINV(1 - α/2, n - 2); for a 70% interval that is TINV(0.30, n - 2), since TINV takes the two-tailed probability

47 PROBE Method
Next we need the standard deviation, σ
– That’s why column G adds up $(Y_i - b_0 - b_1 X_i)^2$, the squared distance of each actual value from the regression line (b0 and b1 are beta0 and beta1)
– $\sigma = \sqrt{\sum (Y_i - b_0 - b_1 X_i)^2 / (n-2)}$
Now there’s a new term, xk ($x_k$)
– $x_k = P + M$
– This is the same term used in the N formula: the projected and modified LOC

48 PROBE Method
Now use this to plug into formula 5.3 on page 124
– I’m not going to copy it here
– Notice in the spreadsheet the column H calculation of (Xi - Xavg)**2 which is also used to find the Range

49 PROBE Method
Finally, find the Upper and Lower Prediction Intervals (UPI and LPI)
– UPI = N + Range
– LPI = N - Range
The Prediction Interval Percent is either 70% or 90%, the value used to find ‘t’
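
The whole Range calculation can be sketched in Python as follows. Since formula 5.3 is not reproduced in the slides, this sketch uses the standard prediction-interval formula for simple linear regression, Range = t * σ * sqrt(1 + 1/n + (xk - Xavg)^2 / Σ(Xi - Xavg)^2), on the assumption that it is the formula the text refers to. The function name and data are hypothetical, and the t value is passed in (from a table or from a spreadsheet) rather than computed.

```python
import math

def prediction_interval(xs, ys, xk, t_value):
    """Return (N, range_, LPI, UPI) for a new point xk = P + M.

    t_value is t(alpha/2, n-2), looked up from a table or spreadsheet.
    """
    n = len(xs)
    if n < 3 or n != len(ys):
        raise ValueError("need at least three (x, y) pairs of equal length")
    x_avg = sum(xs) / n
    y_avg = sum(ys) / n
    sxx = sum((x - x_avg) ** 2 for x in xs)  # the "column H" total
    beta1 = sum((x - x_avg) * (y - y_avg) for x, y in zip(xs, ys)) / sxx
    beta0 = y_avg - beta1 * x_avg
    # Standard deviation about the regression line, n - 2 degrees of freedom
    sigma = math.sqrt(sum((y - beta0 - beta1 * x) ** 2
                          for x, y in zip(xs, ys)) / (n - 2))
    range_ = t_value * sigma * math.sqrt(1 + 1 / n + (xk - x_avg) ** 2 / sxx)
    N = beta0 + beta1 * xk
    return N, range_, N - range_, N + range_

# Hypothetical history and a placeholder t value of 1.0 (not a real lookup)
N, rng, lpi, upi = prediction_interval([1, 2, 3, 4, 5], [1, 3, 2, 5, 4],
                                       xk=3, t_value=1.0)
print(f"N = {N:.2f}, Range = {rng:.2f}, LPI = {lpi:.2f}, UPI = {upi:.2f}")
```

Note that the Range grows as xk moves away from the average of your historic x values, so estimates far outside your past experience carry wider intervals.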

50 PROBE Method
If Range is comparable to N in magnitude
– Choose a Prediction Interval Percent of 70% to keep Range smaller, and/or
– Look for data fliers (outliers), which can have a strong influence on sigma (σ)
    E.g. data points with a relatively large value of $(Y_i - b_0 - b_1 X_i)^2$

51 Object Size Ranges
The fuzzy logic method (starting on slide 12) summarizes the two most likely approaches for defining size ranges based on your historic data
– A linear approach, generally best if the range of the data is well under a factor of 10
– A logarithmic approach for wider range data

52 Object Size Ranges
If your work is following a true normal distribution, then your objects should have
– 6.68% each in Very Small and Very Large categories
– 24.17% each in Small and Large categories
– 38.30% in the Medium category
It’s good to see if this holds
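
These percentages can be reproduced with a short Python sketch, under the assumption that the category boundaries sit at 0.5 and 1.5 standard deviations on either side of the mean (that placement is inferred from the percentages, not stated on the slide). The Medium share comes out a hair under the rounded 38.30% quoted above.

```python
import math

def normal_cdf(z):
    """Cumulative probability of the standard normal distribution."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def category_shares():
    """Expected fraction of objects per category.

    Assumes boundaries at -1.5, -0.5, +0.5, and +1.5 standard deviations.
    """
    cuts = [-1.5, -0.5, 0.5, 1.5]
    cdf = [0.0] + [normal_cdf(z) for z in cuts] + [1.0]
    names = ["Very Small", "Small", "Medium", "Large", "Very Large"]
    return {name: cdf[i + 1] - cdf[i] for i, name in enumerate(names)}

for name, share in category_shares().items():
    print(f"{name}: {share:.2%}")
```

Comparing these shares with a count of your own objects per category is a quick way to see whether your sizes are skewed.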

53 Object Size Ranges
If your object size distribution is really skewed, you could
– Reconsider the size categories
– Look for better proxies
– See if your design approach is leaning toward very large or very small objects, or very inconsistent object sizes

54 N Track Notes
You’ll use most of the preceding discussion
– You’ll have different proxies instead of the {Base Program, Projected LOC, and Reused Objects}
– You’ll have some equivalent of ‘P’ and ‘N’, and still find beta0, beta1, and Range
Your P and N will measure time instead of LOC
– You’ll still find prediction intervals UPI, LPI

55 Improving Estimation
We tend to try to estimate many small things for a large task
– The estimation errors tend to cancel each other somewhat
The PSP allows you to know what your estimation errors have been, and hence improve later estimates
– Though that’s hard to see during the term

56 Improving Estimation
As you follow this consistently, your values for beta0 and beta1 will tend to stabilize
– Then you don’t have to keep recalculating them!
– If you get really weird beta0 and beta1, or have no history yet, look at other options for refining your estimate, on page 679 (Table C35)

57 Improving Estimation
On large projects, look for a consistent, and fairly low, level of abstraction
– The conceptual design might need to be refined to provide enough detail for a good estimate
– If a single object performs the work of many kinds of proxies, then it probably needs to be broken down

58 Improving Estimation
Estimating products which have no precedent is really tough
– Make sure the level of uncertainty is clear to your customer
Avoid overcompensating for your own history of errors
– Make small changes in your approach and try them for a while

