Intraprocedural Dataflow Analysis for Software Product Lines




1 Intraprocedural Dataflow Analysis for Software Product Lines
Claus Brabrand (IT University of Copenhagen / Universidade Federal de Pernambuco), Márcio Ribeiro (Universidade Federal de Alagoas / Universidade Federal de Pernambuco), Paulo Borba (Universidade Federal de Pernambuco), Társis Tolêdo (Universidade Federal de Pernambuco)

2 Abstract Software product lines (SPLs) are commonly developed using annotative approaches such as conditional compilation that come with an inherent risk of constructing erroneous products. For this reason, it is essential to be able to analyze SPLs. However, as dataflow analysis techniques are not able to deal with SPLs, developers must generate and analyze all valid methods individually, which is expensive for non-trivial SPLs. In this paper, we demonstrate how to take any standard intraprocedural dataflow analysis and automatically turn it into a feature-sensitive dataflow analysis in three different ways. All are capable of analyzing all valid methods of an SPL without having to generate all of them explicitly. We have implemented all analyses as extensions of SOOT’s intraprocedural dataflow analysis framework and experimentally evaluated their performance and memory characteristics on four qualitatively different SPLs. The results indicate that the feature-sensitive analyses are on average 5.6 times faster than the brute force approach on our SPLs, and that they have different time and space tradeoffs.

3 < Outline >
■ Introduction
■ Software Product Lines (recap)
■ Dataflow Analysis (recap)
■ Dataflow Analyses for Software Product Lines: feature-insensitive (A1) vs. feature-sensitive (A2, A3, A4)
■ Results: A1 vs. A2 vs. A3 vs. A4 (in theory and practice)
■ Related Work
■ Conclusion

4 Introduction
Traditional software development: one program = one product (e.g., 1 car, 1 cell phone, 1 application).
Software product line (SPL): a family of programs that is customized into a "family" of N "similar" products (cars, cell phones, applications).

5 Software Product Line
An SPL consists of a family of programs, a set of features, and a feature model.
Features: F = { COLOR, VIDEO }
Feature model (example): ψFM ≡ VIDEO ⇒ COLOR
Configurations, i.e., the elements of 2^F: Ø, {COLOR}, {VIDEO}, {COLOR, VIDEO}
The VALID configurations are those satisfying ψFM; customizing the family of programs with a valid configuration yields one concrete product.
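To make the configuration space concrete, here is a minimal Java sketch (not from the paper; the class and method names are illustrative, and the feature model VIDEO ⇒ COLOR is an assumed example consistent with the slide) that enumerates 2^F for F = {COLOR, VIDEO} and marks the configurations satisfying ψFM as VALID:

    import java.util.*;
    import java.util.function.Predicate;

    // Sketch: enumerate 2^F and filter by the feature model psi_FM.
    public class ValidConfigurations {
        public static void main(String[] args) {
            List<String> features = List.of("COLOR", "VIDEO");
            Predicate<Set<String>> featureModel =
                c -> !c.contains("VIDEO") || c.contains("COLOR");   // VIDEO => COLOR (assumed example)

            // All 2^|F| configurations.
            List<Set<String>> all = new ArrayList<>();
            for (int bits = 0; bits < (1 << features.size()); bits++) {
                Set<String> c = new HashSet<>();
                for (int i = 0; i < features.size(); i++)
                    if ((bits & (1 << i)) != 0) c.add(features.get(i));
                all.add(c);
            }

            // Only configurations satisfying psi_FM are valid and yield actual products.
            for (Set<String> c : all)
                System.out.println(c + (featureModel.test(c) ? "  VALID" : "  invalid"));
        }
    }

Only the VALID configurations correspond to products that are actually built from the SPL.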

6 Software Product Line
SPLs are commonly implemented via conditional compilation: code fragments are guarded by #ifdef (...) ... #endif (alternatively via aspects, as in AOSD).
Example:
Logo logo;
#ifdef (VIDEO)
logo = new Logo();
#endif
...
use(logo);
*** uninitialized variable "logo"! in configurations: {Ø, {COLOR}}
Similarly for, e.g.: ■ null pointers ■ unused variables ■ undefined variables

7 Analysis of SPLs
The compilation process for a single program: ANALYZE (catch errors), compile, run, result.
...and for software product lines: every valid configuration (up to 2^F of them) must be customized and then analyzed, compiled, and run separately, so errors have to be caught in each generated product.
What is needed: feature-sensitive dataflow analysis!

8 < Outline >
■ Introduction
■ Software Product Lines (recap)
■ Dataflow Analysis (recap)
■ Dataflow Analyses for Software Product Lines: feature-insensitive (A1) vs. feature-sensitive (A2, A3, A4)
■ Results: A1 vs. A2 vs. A3 vs. A4 (in theory and practice)
■ Related Work
■ Conclusion

9 Dataflow Analysis
A dataflow analysis is given by:
1) a control-flow graph
2) a lattice L (of finite height)
3) transfer functions (monotone)
Example: the "sign-of-x analysis", which tracks whether a variable x is negative, zero, or positive.
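As a concrete illustration of ingredients 2) and 3), the following self-contained Java sketch models the sign-of-x lattice and the (monotone) transfer functions for the statement kinds used in the running example; it is an illustrative model, not the SOOT-based implementation:

    // Sketch of the "sign-of-x" lattice and transfer functions (illustrative only).
    public class SignAnalysis {
        // Finite-height lattice: BOTTOM below NEG, ZERO, POS, ZERO_OR_POS, all below TOP.
        enum Sign { BOTTOM, NEG, ZERO, POS, ZERO_OR_POS, TOP }

        // Least upper bound (join), used when control-flow paths merge.
        static Sign join(Sign a, Sign b) {
            if (a == b) return a;
            if (a == Sign.BOTTOM) return b;
            if (b == Sign.BOTTOM) return a;
            if ((a == Sign.ZERO && b == Sign.POS) || (a == Sign.POS && b == Sign.ZERO)
                    || (a == Sign.ZERO_OR_POS && (b == Sign.ZERO || b == Sign.POS))
                    || (b == Sign.ZERO_OR_POS && (a == Sign.ZERO || a == Sign.POS)))
                return Sign.ZERO_OR_POS;
            return Sign.TOP;
        }

        // Monotone transfer functions for the statements of the running example.
        static Sign assignZero(Sign in) { return Sign.ZERO; }            // x = 0
        static Sign increment(Sign in) {                                  // x++
            switch (in) {
                case ZERO: case POS: case ZERO_OR_POS: return Sign.POS;
                case NEG: return Sign.TOP;   // coarse lattice: could become zero or stay negative
                default:  return in == Sign.BOTTOM ? Sign.BOTTOM : Sign.TOP;
            }
        }
        static Sign decrement(Sign in) {                                  // x--
            switch (in) {
                case ZERO: return Sign.NEG;
                case POS:  return Sign.ZERO_OR_POS;                       // e.g. 1-1=0, 2-1>0
                case NEG:  return Sign.NEG;
                default:   return in == Sign.BOTTOM ? Sign.BOTTOM : Sign.TOP;
            }
        }
    }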

10 < Outline >
■ Introduction
■ Software Product Lines (recap)
■ Dataflow Analysis (recap)
■ Dataflow Analyses for Software Product Lines: feature-insensitive (A1) vs. feature-sensitive (A2, A3, A4)
■ Results: A1 vs. A2 vs. A3 vs. A4 (in theory and practice)
■ Related Work
■ Conclusion

11 A1 (brute force)
A1 (feature-insensitive): N ≤ 2^F separate compilations!
void m() { int x=0; ifdef(A) x++; ifdef(B) x--; }
ψFM = A ∨ B, so the valid configurations are {A}, {B}, and {A,B}; A1 generates and analyzes each variant separately:
c = {A}: analyze "int x = 0; x++;", final sign of x: +
c = {B}: analyze "int x = 0; x--;", final sign of x: -
c = {A,B}: analyze "int x = 0; x++; x--;", final sign of x: 0/+
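A minimal Java sketch of the A1 scheme (the types and helpers are illustrative assumptions, not the paper's code): each valid configuration customizes the method by dropping disabled statements, and the ordinary, feature-oblivious analysis is then run on every resulting variant:

    import java.util.*;
    import java.util.function.BiFunction;

    // A1 sketch: one preprocessing step plus one ordinary analysis run per valid configuration.
    public class A1BruteForce {
        // A statement with its (possibly trivial) presence condition, modelled here as
        // a set of required features; the empty set stands for "true".
        record Stmt(String code, Set<String> requiredFeatures) {}

        static boolean enabled(Stmt s, Set<String> config) {
            return config.containsAll(s.requiredFeatures());
        }

        // analyze() stands in for any ordinary, feature-oblivious dataflow analysis.
        static <R> R analyzeVariant(List<Stmt> method, Set<String> config,
                                    BiFunction<List<Stmt>, Set<String>, R> analyze) {
            List<Stmt> variant = new ArrayList<>();
            for (Stmt s : method)
                if (enabled(s, config)) variant.add(s);   // "customize": drop disabled code
            return analyze.apply(variant, config);         // ordinary analysis on the variant
        }

        static <R> Map<Set<String>, R> runA1(List<Stmt> method,
                                             Collection<Set<String>> validConfigs,
                                             BiFunction<List<Stmt>, Set<String>, R> analyze) {
            Map<Set<String>, R> results = new LinkedHashMap<>();
            for (Set<String> c : validConfigs)              // N separate analysis runs
                results.put(c, analyzeVariant(method, c, analyze));
            return results;
        }
    }

For ψFM = A ∨ B this performs three independent analysis runs, one per valid configuration.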

12 A2 (consecutive)
A2 (feature-sensitive!): analyze each valid configuration consecutively, but directly on the annotated method (no preprocessing); a statement's transfer function is applied only when the configuration satisfies its ifdef condition (c |- [[A]], c |- [[B]], ...), and otherwise acts as the identity.
void m() { int x=0; ifdef(A) x++; ifdef(B) x--; }
ψFM = A ∨ B
c = {A}: "int x = 0" and "x++" apply, "x--" is skipped; final sign of x: +
c = {B}: "x++" is skipped, "x--" applies; final sign of x: -
c = {A,B}: all statements apply; final sign of x: 0/+
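A sketch of the A2 lifting, using the same illustrative Stmt type as in the A1 sketch (repeated so the code is self-contained): the analysis still runs once per valid configuration, but on the annotated method itself, and a statement's transfer function is applied only when the configuration satisfies its ifdef condition:

    import java.util.*;
    import java.util.function.BiFunction;

    // A2 sketch (feature-sensitive, consecutive): no preprocessing; the ifdef condition
    // is consulted while analyzing.
    public class A2Consecutive {
        record Stmt(String code, Set<String> requiredFeatures) {}

        static boolean enabled(Stmt s, Set<String> config) {
            return config.containsAll(s.requiredFeatures());
        }

        // One straight-line pass for one configuration (a full analysis would iterate
        // to a fixed point over the CFG; a loop-free method needs only one pass).
        static <V> V analyzeFor(Set<String> config, List<Stmt> method, V initial,
                                BiFunction<Stmt, V, V> transfer) {
            V value = initial;
            for (Stmt s : method)
                value = enabled(s, config) ? transfer.apply(s, value)   // lifted transfer
                                           : value;                     // identity if disabled
            return value;
        }

        static <V> Map<Set<String>, V> runA2(Collection<Set<String>> validConfigs,
                                             List<Stmt> method, V initial,
                                             BiFunction<Stmt, V, V> transfer) {
            Map<Set<String>, V> results = new LinkedHashMap<>();
            for (Set<String> c : validConfigs)
                results.put(c, analyzeFor(c, method, initial, transfer));
            return results;
        }
    }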

13 A3 (simultaneous)
A3 (feature-sensitive!): analyze all valid configurations simultaneously in one run; the lattice is lifted to a map from configurations to base-lattice values, and each statement updates exactly those configurations that satisfy its ifdef condition.
void m() { int x=0; ifdef(A) x++; ifdef(B) x--; }
ψFM = A ∨ B, ∀c ∈ {{A}, {B}, {A,B}}:
after "int x = 0" ([[true]]): {A} = 0, {B} = 0, {A,B} = 0
after "ifdef(A) x++" ([[A]]): {A} = +, {B} = 0, {A,B} = +
after "ifdef(B) x--" ([[B]]): {A} = +, {B} = -, {A,B} = 0/+
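A sketch of the A3 lifting under the same illustrative types: the lattice becomes a map from valid configurations to base-lattice values, so a single pass (a single fixed-point computation in the full framework) updates all configurations at once:

    import java.util.*;
    import java.util.function.BiFunction;

    // A3 sketch (feature-sensitive, simultaneous): one analysis run over the lifted
    // lattice Configuration -> V.
    public class A3Simultaneous {
        record Stmt(String code, Set<String> requiredFeatures) {}

        static boolean enabled(Stmt s, Set<String> config) {
            return config.containsAll(s.requiredFeatures());
        }

        static <V> Map<Set<String>, V> runA3(Collection<Set<String>> validConfigs,
                                             List<Stmt> method, V initial,
                                             BiFunction<Stmt, V, V> transfer) {
            // Lifted lattice element: every valid configuration starts at the same value.
            Map<Set<String>, V> state = new LinkedHashMap<>();
            for (Set<String> c : validConfigs) state.put(c, initial);

            for (Stmt s : method)                        // single pass over the method
                for (Set<String> c : state.keySet())     // lifted transfer: per-configuration update
                    if (enabled(s, c))
                        state.put(c, transfer.apply(s, state.get(c)));
            return state;
        }
    }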

14 A4 (shared)
A4 (feature-sensitive!): one run in which lattice values are shared among all configurations that agree on them; each value is tagged with the set of configurations it holds for, written as a propositional formula over the features and stored using a BDD representation (compact + efficient).
void m() { int x=0; ifdef(A) x++; ifdef(B) x--; }
ψFM = A ∨ B:
after "int x = 0" ([[true]]): one shared value, [[ψ]] = 0
at "ifdef(A) x++" ([[A]]): the value splits: [[ψ∧A]] = +, [[ψ∧¬A]] = 0
at "ifdef(B) x--" ([[B]]): [[ψ∧A∧B]] = 0/+, [[ψ∧A∧¬B]] = +, [[ψ∧¬A∧B]] = -, while [[ψ∧¬A∧¬B]] is dropped because (A∨B) ∧ ¬A ∧ ¬B ≡ false, i.e., that set of configurations is invalid with respect to the feature model, ψ!
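A sketch of the A4 idea under the same illustrative types: each lattice value is tagged with the set of configurations it holds for, and it only splits at an ifdef whose condition distinguishes configurations within that set; an empty (infeasible) set is dropped. The configuration sets are explicit here for readability, whereas the slides' point is that they are represented as propositional formulas via BDDs:

    import java.util.*;
    import java.util.function.BiFunction;

    // A4 sketch (feature-sensitive, shared): lattice values shared between all the
    // configurations that agree on them.
    public class A4Shared {
        record Stmt(String code, Set<String> requiredFeatures) {}

        static boolean enabled(Stmt s, Set<String> config) {
            return config.containsAll(s.requiredFeatures());
        }

        static <V> Map<Set<Set<String>>, V> runA4(Set<Set<String>> validConfigs,
                                                  List<Stmt> method, V initial,
                                                  BiFunction<Stmt, V, V> transfer) {
            // Start with ONE shared entry: all valid configurations map to the same value.
            Map<Set<Set<String>>, V> state = new LinkedHashMap<>();
            state.put(validConfigs, initial);

            for (Stmt s : method) {
                Map<Set<Set<String>>, V> next = new LinkedHashMap<>();
                for (Map.Entry<Set<Set<String>>, V> e : state.entrySet()) {
                    Set<Set<String>> yes = new HashSet<>(), no = new HashSet<>();
                    for (Set<String> c : e.getKey())
                        (enabled(s, c) ? yes : no).add(c);    // split on the ifdef condition
                    if (!yes.isEmpty()) next.put(yes, transfer.apply(s, e.getValue()));
                    if (!no.isEmpty())  next.put(no, e.getValue());
                    // an empty set (e.g. psi_FM and not A and not B, which is false) is dropped;
                    // a full implementation would also re-merge entries whose values become equal
                }
                state = next;
            }
            return state;
        }
    }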

15 < Outline >
■ Introduction
■ Software Product Lines (recap)
■ Dataflow Analysis (recap)
■ Dataflow Analyses for Software Product Lines: feature-insensitive (A1) vs. feature-sensitive (A2, A3, A4)
■ Results: A1 vs. A2 vs. A3 vs. A4 (in theory and practice)
■ Related Work
■ Conclusion

16 Evaluation
Four (qualitatively different) SPL benchmarks.
Implementation: A1, A2, A3, A4 in SOOT + CIDE.
Evaluation: total time, analysis time, memory usage.
All results are averages over 10 runs, with the minimum and maximum data points eliminated.

17 Results (total time)
In theory, A1 needs up to 2^F separate analyses; in practice (Reaching Definitions), per-benchmark gain factors of the feature-sensitive analyses over A1 range from about 2x up to 14x.
Feature-sensitive average gain factor: A2 (3x), A3 (4x), A4 (5x).
A1 was computed as the average of [c = Ø] and [c = 2^F_local] (due to parse errors in some configurations).
A2: (17% + 33% + 105% + 47%) / 4 = 51% of A1's total time; gain factor (1/17% + 1/33% + 1/105% + 1/47%) / 4 = 2.9x (≈3x)
A3: (12% + 21% + 105% + 39%) / 4 = 44%; gain factor (1/12% + 1/21% + 1/105% + 1/39%) / 4 = 4.2x (≈4x)
A4: (7% + 30% + 104% + 54%) / 4 = 49%; gain factor (1/7% + 1/30% + 1/104% + 1/54%) / 4 = 5.1x (≈5x)
BEST gain factor, averaged over benchmarks: 5.6x!

18 Results (analysis time)
In theory: A2 performs N fix-point calculations where each step costs 1; A3 performs 1 fix-point calculation where each step costs N.
In practice (Reaching Definitions): on average, A3's analysis time is about 70% of A2's, i.e., A3 is about 1.5x faster (caching!).
TIME(A4): depends on the degree of sharing in the SPL!

19 Results (memory usage)
In theory, A3 stores one lattice value per valid configuration (up to 2^F of them), while SPACE(A4) depends on the degree of sharing in the SPL.
In practice (Reaching Definitions), the average memory ratio of A3 vs A4 is 6.3 : 1.

20 < Outline >
■ Introduction
■ Software Product Lines (recap)
■ Dataflow Analysis (recap)
■ Dataflow Analyses for Software Product Lines: feature-insensitive (A1) vs. feature-sensitive (A2, A3, A4)
■ Results: A1 vs. A2 vs. A3 vs. A4 (in theory and practice)
■ Related Work
■ Conclusion

21 Related Work (DFA)
Path-sensitive DFA: the idea of "conditionally executed statements"; compute different analysis information along different paths (~ A2, A3, A4) to improve precision or to optimize "hot paths". "Constant Propagation with Conditional Branches" (Wegman and Zadeck), TOPLAS 1991.
Predicated DFA: guard lattice values by propositional-logic predicates (~ A4), yielding "optimistic dataflow values" that are kept distinct during analysis (~ A3 and A4). "Predicated Array Data-Flow Analysis for Run-time Parallelization" (Moon, Hall, and Murphy), ICS 1998.
Our work: automatically lift any DFA to SPLs (with ψFM) ⇒ feature-sensitive analyses for analyzing an entire program family.

22 Related Work (Lifting for SPLs)
Model checking (similar goal, different techniques): model checks all products of an SPL at the same time, 3.5x faster than one by one. "Model Checking Lots of Systems: Efficient Verification of Temporal Properties in Software Product Lines" (Classen, Heymans, Schobbens, Legay, and Raskin), ICSE 2010.
Type checking (similar goals, different techniques): type checking ↔ DFA; our approach automatically lifts any DFA (uninitialized variables, null pointers, ...). "Type-Checking Software Product Lines - A Formal Approach" (Kästner and Apel), ASE 2008; "Type Safety for Feature-Oriented Product Lines" (Apel, Kästner, Größlinger, and Lengauer), ASE 2010.
Parsing (similar techniques, different goal): split-and-merge parsing (~ A4), also uses instrumentation. "Variability-Aware Parsing in the Presence of Lexical Macros and Conditional Compilation" (Kästner, Giarrusso, Rendel, Erdweg, Ostermann, and Berger), OOPSLA 2011.
Testing: selects relevant feature combinations for a given test case; uses a (hardwired) DFA (without a feature model) to compute reachability. "Reducing Combinatorics in Testing Product Lines" (Kim, Batory, and Khurshid), AOSD 2011.

23 Related Work (emerging interfaces)
Compute emergent interfaces to flag dependencies and show how an edit in one place affects feature(s) elsewhere.
"Emergent Feature Modularization" (Ribeiro, Pacheco, Teixeira, and Borba), Onward! 2010.
"EMERGO: A Tool for Improving Maintainability of Preprocessor-Based Product Lines" (Ribeiro, Tolêdo, Winther, Brabrand, and Borba), AOSD 2012 tool demo (Thursday at 14:00 and Friday at 16:00).

24 < Outline >
■ Introduction
■ Software Product Lines (recap)
■ Dataflow Analysis (recap)
■ Dataflow Analyses for Software Product Lines: feature-insensitive (A1) vs. feature-sensitive (A2, A3, A4)
■ Results: A1 vs. A2 vs. A3 vs. A4 (in theory and practice)
■ Related Work
■ Conclusion

25 Conclusion(s)
It is possible to analyze SPLs using DFAs.
We can automatically "lift" any dataflow analysis and make it feature-sensitive:
A2) consecutive
A3) simultaneous
A4) shared simultaneous
A2, A3, A4 are much faster (3x, 4x, 5x) than the naive A1.
A3 is about 1.5x faster than A2 (caching!).
A4 saves a lot of memory compared to A3 (sharing!): 6.3 : 1.

26 < Obrigado* > *) Thanks

27 BONUS SLIDES

28 Future Work
INTER-procedural dataflow analysis: in progress...!
Explore how all of this scales to the interprocedural setting; in particular: the relative speed of A1 vs A2 vs A3 vs A4, and which analyses are feasible vs infeasible.

29 Specification: A1, A2, A3, A4

30 Results (analysis time)
?! N×1 ≠ 1×N (caching!)
In theory: A2 performs N fix-point calculations where each step costs 1; A3 performs 1 fix-point calculation where each step costs N.
In practice (Reaching Definitions): A3 is on average about 1.5x faster than A2.
TIME(A4): depends on the degree of sharing in the SPL!

31 A2 vs A3 (caching)
Cache misses in A2 vs A3:
Normal cache: as expected, A2 incurs more cache misses (⇒ slower!).
Full/no cache*: as hypothesized, this indeed affects A2 more than A3, i.e., A3 has better cache properties than A2.
*) We flush the L2 cache by traversing an 8 MB "bogus array" to invalidate the cache.

32 Analyzing a Program
1) Program
2) Build CFG
3) Make equations
4) Solve equations: fixed-point computation (iteration)
5) Solution (least fixed point)
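To make step 4 concrete, here is a small self-contained Java sketch of a round-robin solver for forward dataflow equations over a CFG; the node numbering, lattice, and transfer-function types are illustrative assumptions:

    import java.util.*;
    import java.util.function.BiFunction;
    import java.util.function.BinaryOperator;

    // Sketch: solve out[n] = transfer(n, join of out[p] over predecessors p of n)
    // by iterating until nothing changes. The least fixed point exists because the
    // lattice has finite height and the transfer functions are monotone.
    public class FixedPoint {
        static <V> Map<Integer, V> solve(int numNodes,
                                         Map<Integer, List<Integer>> predecessors,
                                         V bottom, V entryValue,
                                         BinaryOperator<V> join,
                                         BiFunction<Integer, V, V> transfer) {
            Map<Integer, V> out = new HashMap<>();
            for (int n = 0; n < numNodes; n++) out.put(n, bottom);

            boolean changed = true;
            while (changed) {                          // iterate until a fixed point is reached
                changed = false;
                for (int n = 0; n < numNodes; n++) {
                    List<Integer> preds = predecessors.getOrDefault(n, List.of());
                    V in = preds.isEmpty() ? entryValue : bottom;   // entry node gets the initial value
                    for (int p : preds) in = join.apply(in, out.get(p));
                    V newOut = transfer.apply(n, in);
                    if (!newOut.equals(out.get(n))) {
                        out.put(n, newOut);
                        changed = true;
                    }
                }
            }
            return out;
        }
    }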

33 IFDEF normalization
Normalize "ifdef"s (by transformation): refactor "undisciplined" (lexical) ifdefs into "disciplined" (syntactic) ifdefs, as illustrated below.
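An illustrative example of the kind of rewrite meant here (assuming Antenna-style //#ifdef comments; the variable and feature names are made up): the undisciplined ifdef guards only part of a statement, while the normalized, disciplined version guards whole statements:

    // Undisciplined (lexical) ifdef: the annotation cuts across a single statement.
    total = base
    //#ifdef DISCOUNT
            - rebate
    //#endif
            ;

    // Disciplined (syntactic) ifdef after normalization: whole statements are guarded.
    //#ifdef DISCOUNT
    total = base - rebate;
    //#else
    total = base;
    //#endif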

34 Feature Model (Example)
Feature set: F = {Car, Engine, 1.0, 1.4, Air}
Formula: ψFM ≡ Car ∧ Engine ∧ (1.0 ⊻ 1.4) ∧ (Air ⇒ 1.4)
Set of configurations: [[ψFM]] = { {Car, Engine, 1.0}, {Car, Engine, 1.4}, {Car, Engine, 1.4, Air} }
Note: |[[ψFM]]| = 3 < 32 = |2^F|

35 Example Bug from Lampiro
Lampiro SPL (IM client for the XMPP protocol):
*** uninitialized variable "logo" (if feature "GLIDER" is defined)
Similar problems with: undeclared variables, unused variables, null pointers, ...

36 BDD (Binary Decision Diagram)
A compact and efficient representation of boolean functions (i.e., of sets of sets of names).
FAST operations: negation, conjunction, disjunction, equality!
Example: F(A,B,C) = A ∧ (B ∨ C), shown as a decision diagram over A, B, C and as a minimized BDD.
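As an illustration of these operations (a sketch assuming the open-source JavaBDD library, which the slides do not name; any BDD package with the same operations would do), one can build the example function and test equivalence directly on the canonical BDDs:

    import net.sf.javabdd.BDD;
    import net.sf.javabdd.BDDFactory;
    import net.sf.javabdd.JFactory;

    // Sketch using JavaBDD (an assumption about the library used).
    public class BddExample {
        public static void main(String[] args) {
            BDDFactory factory = JFactory.init(1000, 100);   // node table and cache sizes
            factory.setVarNum(3);                            // variables: A, B, C
            BDD a = factory.ithVar(0), b = factory.ithVar(1), c = factory.ithVar(2);

            BDD f = a.and(b.or(c));                 // F(A,B,C) = A and (B or C)
            BDD g = a.and(b).or(a.and(c));          // an equivalent formula, different syntax

            // Negation, conjunction, disjunction, and equality are all fast on BDDs;
            // equivalence reduces to checking that the XOR is the constant false.
            System.out.println(f.xor(g).isZero());  // true: same boolean function
            System.out.println(f.satCount());       // number of satisfying assignments: 3
        }
    }

The three satisfying assignments of A ∧ (B ∨ C) are exactly the configurations {A,B}, {A,C}, {A,B,C} listed on the next slide.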

37 Formula ~ Set of Configurations
Definitions (given F, a set of feature names):
f ∈ F: feature name
c ∈ 2^F: configuration (a set of feature names), c ⊆ F
X ⊆ 2^F: set of configurations (a set of sets of feature names)
Example ifdef formulas:
[[B ∨ A]] = { {A}, {B}, {A,B} } for F = {A,B}
[[A ∧ (B ∨ C)]] = { {A,B}, {A,C}, {A,B,C} } for F = {A,B,C}

38 Emerging Interfaces

39 Emerging Interfaces
CBSoft 2011: *** Best Tool Award ***
"A Tool for Improving Maintainability of Preprocessor-based Product Lines" (Márcio Ribeiro, Társis Tolêdo, Paulo Borba, Claus Brabrand)

40 Errors
Logo logo;
#ifdef (VIDEO)
logo = new Logo();
#endif
use(logo);
*** uninitialized variable! in configurations: {Ø, {COLOR}}

Logo logo;
#ifdef (VIDEO)
logo = new Logo();
#endif
logo.use();
*** null-pointer exception! in configurations: {Ø, {COLOR}}

Logo logo;
#ifdef (VIDEO)
logo = new Logo();
#endif
...
*** unused variable! in configurations: {Ø, {COLOR}}

