If you fix everything you lose fixes for everything else Tim Menzies (WVU) Jairus Hihn (JPL) Oussama Elrawas (WVU) Dan Baker (WVU) Karen Lum (JPL) International.

If you fix everything you lose fixes for everything else
Tim Menzies (WVU), Jairus Hihn (JPL), Oussama Elrawas (WVU), Dan Baker (WVU), Karen Lum (JPL)
International Workshop on Living with Uncertainty, IEEE ASE 2007, Atlanta, Georgia, Nov 5, 2007
This work was conducted at West Virginia University and the Jet Propulsion Laboratory under grants with NASA's Software Assurance Research Program. Reference herein to any specific commercial product, process, or service by trademark, manufacturer, or otherwise, does not constitute or imply its endorsement by the United States Government.
tim@menzies.us oelrawas@mix.wvu.edu

2 What does this mean?
– Q: for what models does (a few peeks) = (many hard stares)?
– A supposedly NP-hard task: abduction over first-order theories (nogood/2)

3 A: models with “collars”
Grow
– Monte Carlo a model, picking input settings at random
– For each run: score each output, then add the score to each input setting
Harvest
– Rule generation experiments, favoring settings with better scores
If “collars”, then
– … small rules …
– … learned quickly …
– … will suffice
“Collar” variables set the other variables
– Narrows: Amarel in the 60s
– Minimal environments: DeKleer ’85
– Master variables: Crawford & Baker ’94
– Feature subset selection: Kohavi & John ’97
– Back doors: Williams et al ’03
– Etc
Implications for uncertainty?
Feather & Menzies RE’02

4 STAR: collars + simulated annealing on Boehm’s USC software process models
USC software process models for effort, defects, threats
– y[i] = impact[i] * project[i] + b[i] for i ∈ {1,2,3,…}
– project[i] lies between a lower and an upper bound: uncertainty in project description
– impact[i] lies between a lower and an upper bound: uncertainty in model calibration
Random solution
– Pick project[i] and impact[i] from anywhere within their ranges
– project[i] range set via domain knowledge; e.g. process maturity in 3 to 5
– impact[i] range known from history
Score solution by effort (Ef), defects (De) and threat (Th)
For example (figure labels: uncontrollable, controllable)
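A minimal sketch of the "random solution" step above, assuming uniform sampling inside each range. The attribute name `pmat`, the impact range, and the value of `b` are made up for illustration; only the model form y[i] = impact[i] * project[i] + b[i] and the "process maturity in 3 to 5" example come from the slide.

```python
import random

def random_solution(project_ranges, impact_ranges, rng):
    """Pick one value per attribute, uniformly from its (low, high) range."""
    project = {k: rng.uniform(lo, hi) for k, (lo, hi) in project_ranges.items()}
    impact = {k: rng.uniform(lo, hi) for k, (lo, hi) in impact_ranges.items()}
    return project, impact

def evaluate(project, impact, b):
    """y[i] = impact[i] * project[i] + b[i] for every attribute i."""
    return {k: impact[k] * project[k] + b[k] for k in project}

# Hypothetical ranges: names and numbers are illustrative only.
project_ranges = {"pmat": (3.0, 5.0)}    # e.g. process maturity in 3 to 5
impact_ranges = {"pmat": (-1.0, -0.5)}   # assumed calibration range
b = {"pmat": 6.0}                        # assumed offset

rng = random.Random(1)
project, impact = random_solution(project_ranges, impact_ranges, rng)
y = evaluate(project, impact, b)
```

Each call to `random_solution` yields one candidate that can then be scored; the scoring by effort, defects and threat is not shown here.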

5 Two studies
y[i] = impact[i] * project[i] + b[i]
One: certain methods (tame uncontrollables via historical records)
– Using much historical data
– Learn the magnitude of the impact[i] relationship
– With fixed impact[i], Monte Carlo at random across the project[i] settings
– E.g. regression-based tools that learn impact[i] from historical records
  – 93 records of JPL systems
  – SCAT: JPL’s current methods
  – 2CEE: WVU’s improvement over SCAT (currently under test)
Two: methods with more uncertainty
– Using no historical data
– Monte Carlo at random across the project[i] settings and impact[i] settings
– E.g. STAR
  – Monte Carlo a model
  – Score each output
  – Sort settings by their “C”, where “C” = cumulative score
  – Rule generation experiments, favoring settings with better “C”

6 Inside STAR
1. Sampling: simulated annealing
– for setting ∈ S_x { value[setting] += E }
2. Summarizing: post-processor
– Sort all settings by their value
– Ignore uncontrollables impact[i]
– Assume the top (1 ≤ i ≤ max) project[i] settings
– Randomly select the rest
“Policy point”: smallest i with lowest E
Median = 50th percentile; spread = (75th - 50th) percentile
Figure labels: Bad, Good; 22 good ideas, 38 not-so-good ideas
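The two steps above can be sketched as follows. This is an illustration under stated assumptions, not STAR itself: exhaustive enumeration of a toy model stands in for the simulated annealing samples, lower E is taken to be better, and the names `rank_settings`, `policy_point`, `options` and `energy` are hypothetical.

```python
import itertools
import random
from collections import defaultdict
from statistics import median

def rank_settings(samples):
    """samples: list of (settings_dict, E). Accumulate value[setting] += E,
    then sort settings so the lowest cumulative E comes first."""
    value = defaultdict(float)
    for settings, energy in samples:
        for setting in settings.items():
            value[setting] += energy
    return sorted(value, key=value.get)

def policy_point(ranked, options, energy_fn, rng, runs=50):
    """Fix the top-i ranked settings, pick the rest at random, and return
    the smallest i with the lowest median E."""
    best_i, best_med = 0, float("inf")
    for i in range(1, len(ranked) + 1):
        fixed = {}
        for attr, val in ranked[:i]:
            fixed.setdefault(attr, val)  # better-ranked value wins a clash
        energies = [
            energy_fn({a: fixed.get(a, rng.choice(v)) for a, v in options.items()})
            for _ in range(runs)
        ]
        med = median(energies)
        if med < best_med:  # strict comparison keeps the smallest i on ties
            best_i, best_med = i, med
    return best_i, best_med

# Toy model (assumed): "a" dominates, "b" matters a little, "c" not at all.
options = {"a": [0, 1], "b": [0, 1], "c": [0, 1]}
def energy(s):
    return 10 * s["a"] + s["b"]

samples = []
for combo in itertools.product(*options.values()):
    settings = dict(zip(options, combo))
    samples.append((settings, energy(settings)))

ranked = rank_settings(samples)
best_i, best_med = policy_point(ranked, options, energy, random.Random(0))
```

In this toy model `("a", 0)` ranks first and fixing one or two settings already reaches the lowest median, which is the "collar" effect in miniature.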

7-16 SCAT vs 2CEE vs STAR
– All three: Monte Carlo across project[i]
– SCAT, 2CEE: control impact[i] via historical data
– STAR: stagger around superset of possible impact[i]
– Median: 50% point; spread: (75 - 50)%
Ratios shown on the plots
– STAR/2CEE = 400/1600 = 25%; STAR/SCAT = 400/1900 = 21%
– STAR/2CEE = 50/800 = 6%; STAR/SCAT = 50/1300 = 4%
– STAR/2CEE = 30/620 = 5%; STAR/SCAT = 30/730 = 4%
– STAR/2CEE = 180/400 = 45%; STAR/SCAT = 180/1900 = 60%
Ignoring historical data is useful (!!!?)
If you fix everything, you lose fixes for everything else
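As a sketch of the two summary statistics used in this comparison (median = 50% point, spread = 75% point minus 50% point), assuming a simple nearest-rank percentile; the slides do not say which percentile rule was used, and the function names are hypothetical.

```python
def percentile(values, p):
    """Value found p% of the way through the sorted list (nearest-rank style)."""
    ordered = sorted(values)
    index = min(len(ordered) - 1, int(len(ordered) * p / 100))
    return ordered[index]

def median_and_spread(values):
    """Median is the 50% point; spread is the 75% point minus the 50% point."""
    med = percentile(values, 50)
    return med, percentile(values, 75) - med

def ratio_percent(star, other):
    """Express a STAR figure as a rounded percentage of another method's figure."""
    return round(100 * star / other)

estimates = list(range(1, 101))          # stand-in data: 1, 2, ..., 100
summary = median_and_spread(estimates)   # (51, 25)
example = ratio_percent(50, 800)         # 6, as in "50/800 = 6%"
```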

Luke, trust the force, I mean, collars
– “The Strangest Thing About Software”, IEEE Computer, Jan 2007

Extra Material

19 Related work
Feather, DDP, treatment learning
– Optimization of requirement models
Xerox PARC, 1980s, qualitative representations (QR)
– Not overly specific
– Quickly collected in a new domain
– Used for model diagnosis and repair
– Can find creative solutions in the larger space of possible qualitative behaviors than in the tighter space of precise quantitative behaviors
Abduction
– World W = minimal set of assumptions (w.r.t. size) such that
  – T ∪ A ⇒ G
  – Not (T ∪ A ⇒ error)
– Framework for validation, diagnosis, planning, monitoring, explanation, tutoring, test case generation, prediction, …
– Theoretically slow (NP-hard), but this should be practical:
  – Abduction + stochastic sampling
  – Find collars
  – Learn constraints on collars

20 Possible optimizations (not used here)
STAR, an example of a general process:
– Stochastic sampling
– Sort settings by “value”
– Rule generation experiments favoring highly “value”-ed settings
See also: elite sampling in the cross-entropy method
If SA convergence is too slow
– Try moving back select into the SA
– Constrain solution mutation to prefer highly “value”-ed settings
BORE (best or rest)
– n runs
– Best = top 10% of scores
– Rest = remaining 90%
– {a, b} = frequency of a discretized range in {best, rest}
– Sort settings by -1 * (a/n)^2 / (a/n + b/n)
Other valuable tricks:
– Incremental discretization: Gama & Pinto’s PID + Fayyad & Irani
– Limited discrepancy search: Harvey & Ginsberg
– Treatment learning: Menzies & Yu
Ask me why, off-line
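The BORE ranking above might look like the sketch below. It assumes each run is a (settings, score) pair with lower scores being better, and that a and b are raw counts in best and rest; the slide does not state either, and `bore_rank` is a hypothetical name.

```python
from collections import Counter

def bore_rank(runs, best_fraction=0.10):
    """Rank settings by -1 * (a/n)^2 / (a/n + b/n), most promising first."""
    n = len(runs)
    ordered = sorted(runs, key=lambda run: run[1])  # assumption: lower is better
    cut = max(1, int(n * best_fraction))
    best, rest = ordered[:cut], ordered[cut:]
    # a, b: how often each (attribute, value) setting appears in best and rest.
    a = Counter(s for settings, _ in best for s in settings.items())
    b = Counter(s for settings, _ in rest for s in settings.items())

    def key(setting):
        fa, fb = a[setting] / n, b[setting] / n
        return -1 * fa ** 2 / (fa + fb)

    return sorted(set(a) | set(b), key=key)

# Toy data (assumed): runs with x="lo" always score better than x="hi".
runs = [({"x": "lo"}, score) for score in range(1, 11)]
runs += [({"x": "hi"}, score) for score in range(11, 21)]
ranking = bore_rank(runs)
```

Squaring a/n rewards settings that are common in best, not merely rare in rest.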

“Uncertainty helps planning” (questions? comments?)

22 At the “policy point”, STAR’s random solutions are surprisingly accurate
– LC: learn impact[i] via regression (JPL data)
– STAR: no tuning, randomly pick impact[i]
– Diff = ∑ mre(lc) / ∑ mre(star)
– Mre = abs(predicted - actual) / actual
– * marks entries flagged as the same at 95% or 99% confidence (Mann-Whitney U)
Why so little Diff (median = 75%)?
– Most influential inputs tightly constrained

∑ mre(lc) / ∑ mre(star)
          strategic   tactical
ground    66%         63%
all       91%         75%
OSP2      99%         125% *
OSP       112% *      111% *
flight    101% *      121% *
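The two definitions on this slide, written out as a small sketch. The function names and the sample numbers are illustrative, not taken from the JPL data.

```python
def mre(predicted, actual):
    """Magnitude of relative error: abs(predicted - actual) / actual."""
    return abs(predicted - actual) / actual

def diff(lc_predictions, star_predictions, actuals):
    """Diff = sum of mre(lc) / sum of mre(star) over the same projects."""
    lc_total = sum(mre(p, a) for p, a in zip(lc_predictions, actuals))
    star_total = sum(mre(p, a) for p, a in zip(star_predictions, actuals))
    return lc_total / star_total

# Made-up example: LC is off by 10% on both projects, STAR by 20%.
actuals = [100, 100]
ratio = diff([110, 90], [120, 80], actuals)  # about 0.5
```

A Diff below 100% means LC had the lower total error; a Diff above 100% means STAR did.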

23 (Model uncertainty = collars) << inputs
In many models, a few “collar” variables set the other variables
– Narrows (Amarel in the 60s)
– Minimal environments (DeKleer ’85)
– Master variables (Crawford & Baker ’94)
– Feature subset selection (Kohavi & John ’97)
– Back doors (Williams et al ’03)
– See “The Strangest Thing About Software” (IEEE Computer, Jan ’07)
Collars appear in all execution traces (by definition)
– You don’t have to find the collars, they’ll find you
So, to handle uncertainty
– Write a simulator
– Stagger over uncertainties
– From stagger, find collars
– Constrain collars
This talk: a very simple example of this process

24 Comparisons
Standard software process modeling
– Models written more than run (PROSIM community)
  – Limited sensitivity analysis
  – Limited trade space
– Or, expensive, error-prone, incomplete data collection programs
  – Point solutions
Here:
– No data collection
– Found stable conclusions within a space of possibilities
– Search: very simple
– Solution, not brittle
  – With trade-off space
Figure label: 22 good ideas, sorted

25 Summary
Living with uncertainty
– Sometimes simpler than you may think
– More useful than you might think
Simple:
– Here, the smallest change to simulated annealing
Useful:
– Sometimes uncertainty can teach you more than certainty
– If you fix everything, you lose fixes to everything else
Collars control certainty
– Uncertainty plus constrained collars ⇒ more certainty
– Also, can drive model to better performance
An example you can explain to any business user
Figure labels: Bad, Good; 22 good ideas, sorted
