CBSE'05
Component-Level Dataflow Analysis
Atanas (Nasko) Rountev
Ohio State University

Outline
- Interprocedural dataflow analysis
- Whole-program analysis: limitations
- Problem: making dataflow analysis usable and useful for component-based software
- Technical challenges
- Ongoing and future work

Uses of Dataflow Analysis
- Software understanding tools
  - e.g., dependence analysis for program slicing, change impact analysis, refactoring, etc.
- Software testing
  - e.g., dataflow-based testing; testing of object interactions in OO software
- Software checking
  - e.g., object protocols: open(read|write)*close
- Performance optimizations in compilers
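
The object protocol open(read|write)*close can be illustrated on a single concrete event trace. The sketch below is only that: the one-letter event encoding and the function name are assumptions made for illustration, and a dataflow-based checker would have to reason about all paths through the program rather than one recorded trace.

```python
import re

# Hypothetical encoding: one letter per event observed on a single object.
EVENT_CODES = {"open": "o", "read": "r", "write": "w", "close": "c"}

# The protocol open(read|write)*close, expressed over the encoded letters.
PROTOCOL = re.compile(r"o[rw]*c")


def follows_protocol(events):
    """Return True if the event sequence is a complete, legal use of the object."""
    encoded = "".join(EVENT_CODES[e] for e in events)
    return PROTOCOL.fullmatch(encoded) is not None
```

For example, the trace open, read, write, close is accepted, while a trace that reads before opening, or never closes, is rejected.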

Model for Whole-Program Analysis
- Input: code for C_1, code for C_2, …, code for C_n
- C_1 + C_2 + … + C_n constitute a complete program
- Whole-program dataflow analysis produces a dataflow solution for C_1 + C_2 + … + C_n
- Implicit assumption: it is possible and desirable to analyze the source code of the entire program as a single unit

Limitations of Whole-Program Analysis
- What if some of the components are only available in binary form?
- What if we are building a library?
- What if we are using large libraries that need to be re-analyzed from scratch?
  - e.g., the standard Java libraries contain a few thousand classes
- What if one part of the program changes?
  - may have to re-analyze the entire program

Outline
- Interprocedural dataflow analysis
- Whole-program analysis: limitations
- Problem: making dataflow analysis usable and useful for component-based software
  - Dozens of existing analyses could potentially become useful for component-based software
  - In tools for software understanding, testing, checking, and optimization
- Technical challenges

A Simple Case: Main + Lib
- Code for Lib -> component-level dataflow analysis -> summary for Lib, dataflow solution for Lib
- Code for Main + summary for Lib -> component-level dataflow analysis -> dataflow solution for Main
- Goal: the solution for Main should be as good as the solution that would have been computed by a whole-program analysis (no loss of precision)
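
The two-step Main + Lib workflow can be sketched on a deliberately simple analysis. The example assumes a flow-insensitive "may-modify" side-effect analysis and made-up procedures (`Q`, `helper`, `main`) and variables; it is not the analysis from the talk, only an illustration of analyzing Lib once, keeping a summary, and reusing it for Main without losing precision.

```python
def mod_sets(direct_mods, calls):
    """Fixed-point computation of the variables each procedure may modify,
    either directly or through the procedures it calls."""
    result = {p: set(m) for p, m in direct_mods.items()}
    changed = True
    while changed:
        changed = False
        for caller, callees in calls.items():
            for callee in callees:
                new = result[callee] - result[caller]
                if new:
                    result[caller] |= new
                    changed = True
    return result


# Hypothetical component Lib: Q is exported, helper is internal.
lib_mods = {"Q": {"x"}, "helper": {"y"}}
lib_calls = {"Q": ["helper"], "helper": []}

# Hypothetical component Main, which calls Q.
main_mods = {"main": {"z"}}
main_calls = {"main": ["Q"]}

# Step 1: analyze Lib alone and keep only the summary for the exported procedure.
lib_summary = {"Q": mod_sets(lib_mods, lib_calls)["Q"]}

# Step 2: analyze Main, treating Q as a leaf whose effect is given by the summary.
main_solution = mod_sets({**main_mods, **lib_summary},
                         {**main_calls, "Q": []})["main"]

# Reference: whole-program analysis over the source code of Main + Lib.
whole_program = mod_sets({**lib_mods, **main_mods},
                         {**lib_calls, **main_calls})["main"]
```

In this sketch `main_solution` and `whole_program` are the same set, which is the "no loss of precision" goal in miniature. The example has no callbacks from Lib into Main.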

Component Model and Summary Info
- Component = set of related procedures or classes
- Component interactions: synchronous calls, shared variables
  - Challenge: more sophisticated component models
- Summary information is computed based only on the source code of Lib
  - Challenge: use info from component specifications

Summary Functions
- Main calls procedure Q in Lib
- Path p_1 through Q: dataflow function f_1
- Path p_2 through Q: dataflow function f_2
- Summary function for Q: f_Q = f_1 ∧ f_2 (the meet of the path functions), computed by the analysis of Lib
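
A summary function built as the meet of two path functions can be sketched with plain Python closures. The example assumes a "must" analysis over sets of definitely-initialized variables, so the meet is set intersection; the variables `a`, `b`, `c` and the two path functions are invented for illustration.

```python
def meet(f, g):
    """Pointwise meet of two dataflow functions.
    Assumption: a 'must' analysis, so the meet of two fact sets is intersection."""
    return lambda facts: f(facts) & g(facts)


def f1(facts):
    """Hypothetical path p_1 through Q: initializes a and b."""
    return facts | {"a", "b"}


def f2(facts):
    """Hypothetical path p_2 through Q: initializes only a."""
    return facts | {"a"}


# Summary function for Q: what holds after Q regardless of the path taken.
f_Q = meet(f1, f2)

print(sorted(f_Q(frozenset({"c"}))))  # ['a', 'c']
```

A caller in Main can apply `f_Q` at each call site of Q instead of re-analyzing the body of Q.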

Open Questions
- Challenge: compact representation of dataflow functions and their transitive composition and meet
  - Existing work solves this problem for some analysis categories; need generalizations
- Challenge: callbacks
  - e.g., function pointers in C
  - e.g., virtual dispatch in C++ and Java
  - Fundamental problem, not addressed adequately by existing work
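
One analysis category in which a compact representation is well known is the gen/kill (bit-vector) family, where every function has the form f(X) = (X - kill) ∪ gen and the family is closed under composition and meet. The class below is a minimal sketch of that idea, assuming a "may" analysis whose meet is set union; the class name, method names, and the facts `d1` to `d4` are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GenKill:
    """Dataflow function f(X) = (X - kill) | gen, stored as two sets."""
    gen: frozenset
    kill: frozenset

    def __call__(self, facts):
        return (frozenset(facts) - self.kill) | self.gen

    def then(self, later):
        """Composition: apply self first, then `later`; the result is again gen/kill."""
        return GenKill((self.gen - later.kill) | later.gen,
                       self.kill | later.kill)

    def meet(self, other):
        """Meet for a 'may' analysis (union at merge points); again gen/kill."""
        return GenKill(self.gen | other.gen, self.kill & other.kill)


# Hypothetical functions over facts d1..d4.
f1 = GenKill(frozenset({"d1"}), frozenset({"d2"}))
f2 = GenKill(frozenset({"d3"}), frozenset({"d1"}))

sequence = f1.then(f2)   # same effect as applying f1 and then f2
merged = f1.meet(f2)     # same effect as f1(X) | f2(X)
```

Because composition and meet both return another `GenKill`, a summary for an arbitrarily long path stays the size of two sets. Analyses whose functions do not have this shape are where the generalizations mentioned above would be needed.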

Callbacks
- Main calls procedure Q; during the call, Lib calls R (a procedure in Main)
- The function for p_2 cannot be computed until Main is analyzed
- Solution: summary functions for subpaths, computed during the analysis of Lib; later, compose them with the functions from Main
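
The subpath idea can be sketched as a summary with a hole for the callback. The example assumes a "may" analysis over sets of possibly-modified variables (meet is union) and invents all the concrete functions; it only illustrates deferring the composition until the function for R is known, not the theoretical model from the talk.

```python
def compose(*functions):
    """Compose in execution order: compose(f, g)(x) == g(f(x))."""
    def composed(facts):
        for f in functions:
            facts = f(facts)
        return facts
    return composed


def meet(f, g):
    """Assumption: a 'may' analysis, so the meet is set union."""
    return lambda facts: f(facts) | g(facts)


# Computed during the analysis of Lib (hypothetical functions).
def f_p1(facts):
    """Path p_1 through Q: no callback involved."""
    return facts | {"x"}


def before_R(facts):
    """Subpath of p_2 from the entry of Q up to the call to R."""
    return facts | {"y"}


def after_R(facts):
    """Subpath of p_2 from the return of R to the exit of Q."""
    return facts - {"z"}


def summary_for_Q(f_R):
    """Partial summary for Q: complete once the function for R is supplied."""
    f_p2 = compose(before_R, f_R, after_R)
    return meet(f_p1, f_p2)


# Later, during the analysis of Main, the function for R becomes known.
def f_R(facts):
    return facts | {"z", "w"}


f_Q = summary_for_Q(f_R)
```

Here `f_Q(frozenset())` evaluates to the set containing `x`, `y`, and `w`: `x` comes from p_1, while along p_2 the `z` added by R is removed again by the subpath after the call.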

Ongoing Work
- Goal 1 (achieved): theoretical model for computing and using summary functions in the presence of callbacks
- Goal 2 (ongoing): instantiate the model to common categories of analyses
  - dependence analysis, pointer analysis, etc.
- Goal 3: experimental evaluation
  - e.g., how large are the summaries?
  - Eclipse plug-in for call graph construction: needs summary info for all Java 1.4 libraries

Future Work
- Beyond the traditional restrictions
  - Use not only code, but also component specifications: e.g., “sharpen” the summary functions based on preconditions
  - Higher level of abstraction for component interfaces and interactions
    - Right now: low-level mechanisms such as procedure calls and shared variables
- Extensive experimental evaluation on real-world software systems