The Pivot: Static Analysis of C++ Applications Bjarne Stroustrup Texas A&M University


1 The Pivot: Static Analysis of C++ Applications Bjarne Stroustrup Texas A&M University http://www.research.att.com/~bs

2 Overview
Static analysis of C++
  – What would be useful
  – Why it is hard
  – C++0x
The Pivot
  – Context
  – Aims
  – Organization
  – Basic representations
High-level program representation for HPC
  – Concept-based checking and transformation

3 What would be useful?
Direct representation of high-level ideas in code
  – E.g. no side effects, idempotent operation, always gives the same answer for the same element, no security violation, no memory leak, no race condition, no deadlock, being sorted, being band-diagonal, parallel application, …
Use of such direct representation
  – For providing guarantees
  – For information
  – For optimization
  – For program transformation

4 It’s hard
C++ is
  – Large
  – Extremely flexible and general
  – Quite irregular
  – Has its type-unsafe C subset
High-level ideas tend to be represented as templated classes and functions
  – Generic programming, template metaprogramming, generative programming
  – We have little experience with tools representing and manipulating templates
  – Such templates tend to be provided as part of domain-specific libraries

5 Bell Labs proverbs
Library design is language design
Language design is library design
But the devil is in the details

6 C++0x
1998: ISO C++ standard
2009 (estimated): ISO C++ standard
  – Better libraries and better support for library building
      Hash maps, regular expressions, file system, …
      Threads and memory model
      Concepts
        – A type system for types, integers, and operations
      auto, template aliases, general initializer lists, …

7 Concept: trivial example
// Caveat: likely C++0x
template where Assignable Iter find(Iter first, Iter last, Val v);
vector<int> v = { 2, 3, 5, 8, 13, 21, 34 };
auto p1 = find(v.begin(), v.end(), 42);
auto p2 = find(v,42.3);
auto p3 = find(7,42);   // error: 7 is not a Container

8 Concepts
Can express many high-level abstractions
  – A type system for sets of types, integers, and operations
We have experimental implementations of concepts
A concept is a handle to which we can attach
  – some “standard semantics” within the language
  – essentially arbitrary semantics outside the language using tools
Until we get concepts, we can “fake them” with static analysis and transformation tools

9 Context for the Pivot
Semantically Enhanced Library (Language)
  – Enhanced notation through libraries
  – Restrict semantics through tools
      And take advantage of that semantics
[Diagram labels: C++, Domain Specific Library, Semantic Restrictions]

10 Context for the Pivot
Provide the advantages of specialized languages
  – Without introducing new “special purpose” languages
  – Without supporting special-purpose language tool chains
  – Avoiding the 99.?% language death rate
Provide general support for the SELL idea
  – Not just a specialized tool per application/library
  – The Pivot fits here
[Diagram labels: C++, Domain Specific Library, Semantic Restrictions]

11 Example SELL: Safe C++
Add
  – Range-checked std::vector iterators
  – Resource handles
  – Any (if needed) (a type-safe union type)
Subtract
  – Arrays
  – Pointers
  – New/delete
  – Unions
  – Excessively complex/obscure code
      Uses of undefined constructs not caught by compilers (e.g. a[++i] = i)
Transforms
  – Pointers into iterators and resource handles (if porting)
  – New/delete into resource handle uses
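The "add" and "transform" sides of such a dialect can be approximated with ordinary library code. The sketch below is only an illustration: `checked_vector`, `Widget`, and `make_widget` are hypothetical names, and standard `std::vector::at` and `std::unique_ptr` stand in for whatever range-checked containers and resource handles a real Safe C++ library would provide.

```cpp
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <vector>

// Hypothetical range-checked container: every subscript goes through at(),
// so an out-of-range index throws std::out_of_range instead of being undefined.
template <class T>
class checked_vector {
public:
    explicit checked_vector(std::size_t n, const T& init = T()) : elems_(n, init) {}
    T& operator[](std::size_t i) { return elems_.at(i); }
    const T& operator[](std::size_t i) const { return elems_.at(i); }
    std::size_t size() const { return elems_.size(); }
private:
    std::vector<T> elems_;
};

// Hypothetical resource: the handle returned by make_widget owns the object,
// so user code contains neither new nor delete.
struct Widget { int id; };

std::unique_ptr<Widget> make_widget(int id) {
    return std::make_unique<Widget>(Widget{id});
}
```

The "subtract" side (banning arrays, raw pointers, and unions) cannot be expressed in a library; that is the part the slides assign to analysis tools.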

12 Example SELL: STAPL
Wait for Lawrence’s talk

13 Aims
To allow fully general analysis of C++ source code
  – “What a human can do”
  – Foci
      Templates (e.g. specialization)
      C++0x features (e.g. concepts, generalized initializers)
      Distributed programming
      Embedded systems
  – Limitation: we work after macro expansion
To allow transformation of C++ code
  – i.e. production of new code from old source
Non-aim: handling other languages
  – e.g. Fortran, Java
  – but C and C++ dialects are relatively easy

14 Related work
Lots
  – 20+ tools for analyzing C++
But
  – Most are specialized
      E.g. alias analysis, flow analysis, numeric optimizations
  – Most are attached to a single compiler/parser
  – None handles all of C++
      E.g. C + classes, C++ but not standard libraries
        – (that requires full handling of templates)
      Hardly any two tools handle the same subset
      None handles the key C++0x features (e.g. concepts)
  – Some are proprietary
  – No serious interoperability

15 The Pivot

16 The Pivot
[Diagram labels: C++ source, Compiler, IPR, XPR, Tool 1, Tool 2, Tool 3, Tool 4, IDL, “information”, Specialized representation (e.g. flow graph), Compiler, C++ source, Object code]

17 Why? The original project
Communication with remote mobile device
  – Calling interface
      CORBA, DCOM, Java RMI, …, homebrew interface
  – Transport
      TCP/IP, XML, …, homebrew protocol
Big, Ugly, Slow, Proprietary, …
  – Why can’t I just write ISO Standard C++?

18 The original project
Distributed programs in ISO C++
“as similar as possible to non-distributed programming, but no more similar”
// use local object:
X x;
A a;
std::string s("abc");
// …
x.f(a, s);   // a function call

// use remote object:
proxy<X> x;   // remote at “my host”
x.connect("my_host");
A a;
std::string s("abc");
// …
x.f(a, s);   // a message send
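To make the "same call syntax, different meaning" idea concrete, here is a minimal hand-written stand-in for a proxy. It is a sketch under several assumptions: the slide does not show how `proxy` is defined, so the `Transport` type, the constructor taking a transport, the message format, and the body of `X::f` are all invented for illustration. A tool working from X's declaration could plausibly generate something of this shape.

```cpp
#include <string>
#include <vector>

struct A { int value = 0; };

// Local class: x.f(a, s) is an ordinary function call.
struct X {
    std::string f(const A& a, const std::string& s) {
        return s + ":" + std::to_string(a.value);
    }
};

// Stand-in transport (assumption): records the messages it is asked to send.
struct Transport { std::vector<std::string> sent; };

template <class T> class proxy;  // primary template intentionally undefined

// Hypothetical specialization standing in for generated code.
template <>
class proxy<X> {
public:
    explicit proxy(Transport& t) : wire_(t) {}
    void connect(const std::string& host) { host_ = host; }
    // Same call syntax as X::f, but the arguments are marshalled into a message.
    void f(const A& a, const std::string& s) {
        wire_.sent.push_back(host_ + " X::f " + std::to_string(a.value) + " " + s);
    }
private:
    Transport& wire_;
    std::string host_;
};
```

The caller writes `x.f(a, s)` in both cases; only the declaration of `x` and the `connect` call reveal that the second version sends a message.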

19 IPR high-level principles
Complete: direct representation of C++
  – Built-in types, classes, templates, expressions, statements, translation units, …
  – Can represent erroneous and incomplete C++ programs
Regular
  – The structure contains all of C++ but doesn’t mimic irregularities
Programming effort proportional to complexity of task
  – IPR is not just a data structure
Extensible
  – Node types
  – Information associated with a node
  – Operations
No integration with compilers

20 IPR design choices
Type safe
IPR (not its users) handles memory management
Minimal (run-time and space)
  – Minimal number of nodes (unification)
  – Minimal number of checked indirections (usually, virtual function calls)
Expression-based regular superset of C++
  – E.g. statements, declarations are expressions too
  – C++0x features (most important: concepts – types have types)
Interfaces:
  – Purely functional, abstract classes, for most users
      No mutation operation on abstract classes
      Users don't get pointers directly
  – Mutating (operates on concrete classes)
      Users get to use pointers for in-place transformation
  – Traversals (and queries)
      Several, most not in “the Pivot core”
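The split between read-only abstract node interfaces and traversals that live outside the nodes can be illustrated with a small visitor. This is a sketch only: `Expr`, `Int_literal`, `Plus`, `Visitor`, and `Prefix_printer` are hypothetical names chosen for the example and are not the IPR class hierarchy.

```cpp
#include <string>

struct Int_literal;
struct Plus;

// One virtual call per node visited: the only "checked indirection" here.
struct Visitor {
    virtual void visit(const Int_literal&) = 0;
    virtual void visit(const Plus&) = 0;
    virtual ~Visitor() = default;
};

// Abstract, purely functional interface: no mutating members,
// and children are exposed as references rather than pointers.
struct Expr {
    virtual void accept(Visitor&) const = 0;
    virtual ~Expr() = default;
};

struct Int_literal : Expr {
    explicit Int_literal(int v) : value(v) {}
    void accept(Visitor& v) const override { v.visit(*this); }
    const int value;
};

struct Plus : Expr {
    Plus(const Expr& l, const Expr& r) : lhs(l), rhs(r) {}
    void accept(Visitor& v) const override { v.visit(*this); }
    const Expr& lhs;
    const Expr& rhs;
};

// A traversal kept outside the node classes: renders the tree in prefix form.
struct Prefix_printer : Visitor {
    std::string out;
    void visit(const Int_literal& n) override { out += std::to_string(n.value); }
    void visit(const Plus& n) override {
        out += "(+ ";
        n.lhs.accept(*this);
        out += " ";
        n.rhs.accept(*this);
        out += ")";
    }
};
```

New analyses are added by writing another `Visitor`, without touching the node classes, which matches the point that most traversals need not be in the core.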

21 IPR is minimal
Necessary for dealing with real-world code
  – Multi-million line programs are not uncommon
Given the constraint of completeness
  – C++ is complex, especially when we use the advanced template features essential for high-performance work
Unified representation
  – E.g., there is only one int and only one 1
  – Type comparison becomes pointer comparison
Indirections are minimized
  – An indirection (only) when there is a choice of different types of information
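Unification is essentially interning: a factory hands out one node per distinct type, so two types are equal exactly when their node addresses are equal. The following sketch assumes a string-keyed table and the hypothetical names `Type` and `TypeFactory`; the actual IPR implementation is not shown in the slides and is likely organized differently.

```cpp
#include <map>
#include <memory>
#include <string>

// Hypothetical type node; users only ever see pointers to const nodes.
struct Type {
    std::string name;
};

// The factory, not its users, owns every node.
class TypeFactory {
public:
    // Returns the unique node for `name`, creating it on first request.
    const Type* get(const std::string& name) {
        auto it = table_.find(name);
        if (it == table_.end())
            it = table_.emplace(name, std::make_unique<Type>(Type{name})).first;
        return it->second.get();
    }

    // Derived types are unified the same way: one node per distinct pointee.
    const Type* pointer_to(const Type* pointee) {
        return get("*" + pointee->name);
    }

private:
    std::map<std::string, std::unique_ptr<Type>> table_;
};
```

Because each node lives behind a `std::unique_ptr`, its address stays stable as the table grows, so comparing two `const Type*` values is a valid type-equality test for the lifetime of the factory.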

22 Original idea (XTI)
Too large, too slow

23 Current hierarchy (IPR)
Compact, minimal call overhead

24 IPR – Example 1 void foo(float b = 2.4)

25 IPR – Example 2

26 XPR (eXternal Program Representation)
Can be thought of as a specialized portable object database
  – Easy/fast to parse
  – Easy/fast to write
Compact
  – About as compact as C++ source code
Robust
  – Read/write without using a symbol table
LR(1), strictly prefix declaration syntax
Human readable
Human writeable
Can represent almost all of C++ directly
  – No preprocessor directives
  – No multiple declarators in a declaration
  – No <, >, >>, or << in template arguments, except in parentheses

27 XPR
i : int                        // int i;
C : class {                    // class C {
    m : const int              //     const int m;
    mm : *const int            //     const int* mm;
    f : (:int,:*char) double   //     double f(int,char*);
    f : (z:complex) C          //     C f(complex z);
}                              // };
vector : class {               // template class vector {
    p : *T                     //     T* p;
    sz : int                   //     int sz;
}                              // };
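The strictly prefix reading of these declarations (`*const int` is "pointer to const int") falls out naturally when a type tree is printed root first. The sketch below reproduces only the handful of forms visible on the slide. The node layout and the helper names `named`, `const_of`, `pointer_to`, `function_of`, `xpr`, and `xpr_decl` are assumptions for illustration; the full XPR grammar is not given here.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical miniature type tree; not the Pivot's own node classes.
struct Ty;
using TyPtr = std::shared_ptr<const Ty>;

struct Ty {
    enum Kind { Named, Const, Pointer, Function } kind;
    std::string name;           // used by Named
    TyPtr inner;                // operand of Const/Pointer, result of Function
    std::vector<TyPtr> params;  // used by Function
};

TyPtr named(std::string n) { return std::make_shared<Ty>(Ty{Ty::Named, std::move(n), nullptr, {}}); }
TyPtr const_of(TyPtr t)    { return std::make_shared<Ty>(Ty{Ty::Const, "", std::move(t), {}}); }
TyPtr pointer_to(TyPtr t)  { return std::make_shared<Ty>(Ty{Ty::Pointer, "", std::move(t), {}}); }
TyPtr function_of(std::vector<TyPtr> ps, TyPtr result) {
    return std::make_shared<Ty>(Ty{Ty::Function, "", std::move(result), std::move(ps)});
}

// Prints a type in prefix form: each constructor is written before its operand.
std::string xpr(const Ty& t) {
    switch (t.kind) {
    case Ty::Named:   return t.name;
    case Ty::Const:   return "const " + xpr(*t.inner);
    case Ty::Pointer: return "*" + xpr(*t.inner);
    case Ty::Function: {
        std::string s = "(";
        for (std::size_t i = 0; i < t.params.size(); ++i) {
            if (i != 0) s += ",";
            s += ":" + xpr(*t.params[i]);  // unnamed parameter
        }
        return s + ") " + xpr(*t.inner);
    }
    }
    return "";
}

// A declaration is written as "name : type".
std::string xpr_decl(const std::string& name, const Ty& t) {
    return name + " : " + xpr(t);
}
```

Printing `pointer_to(const_of(named("int")))` under the name `mm` gives `mm : *const int`, matching the slide's rendering of `const int* mm;`.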

28 Extremely simple SELL example
template<class T> void f(const T& v)
{
    double d = v[2];     // OK
    double* p = &v[2];   // not OK
}
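In the slides this restriction is something a tool would check. A library-only approximation is possible for this one rule: if the subscript operator returns a copy rather than a reference, reading an element still works but taking its address does not compile. The sketch below assumes the hypothetical names `read_only_view` and `third`; it illustrates the idea and is not how the Pivot enforces the rule.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical view whose subscript returns a value, not a reference.
// `double d = view[2];` compiles; `&view[2]` is rejected because the
// result of operator[] is a temporary.
template <class T>
class read_only_view {
public:
    explicit read_only_view(const std::vector<T>& v) : v_(v) {}
    T operator[](std::size_t i) const { return v_.at(i); }
    std::size_t size() const { return v_.size(); }
private:
    const std::vector<T>& v_;
};

// Generic code in the restricted style: only the "OK" use of v[2].
template <class V>
double third(const V& v) {
    double d = v[2];
    return d;
}
```

A library trick like this covers a single rule for a single type; checking such restrictions uniformly across arbitrary code is the job the slides give to analysis tools.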

29 Current and future work
Complete infrastructure
  – Complete EDG and GCC interfaces
  – Represent headers (modularity) directly
  – Complete type representation in XPR
Initial applications
  – Style analysis
      including type safety and security
  – Analysis and transformation of STAPL programs
Build alliances

30 References
[GJS+06] Gregor, Douglas; Järvi, Jaakko; Siek, Jeremy; Lumsdaine, Andrew; Dos Reis, Gabriel; Stroustrup, Bjarne: Concepts: Linguistic Support for Generic Programming in C++. To appear, OOPSLA'06.
[DRS05] Stroustrup, Bjarne; Dos Reis, Gabriel: A concept design. C++ Committee, paper N1782. April 2005.
[SDR05] Stroustrup, Bjarne; Dos Reis, Gabriel: Supporting SELL for High Performance Computing. LCPC '05.
[Str05] Stroustrup, Bjarne: A rationale for semantically enhanced libraries. LCSD '05.

