Download presentation
Presentation is loading. Please wait.
Published byJessica Ryan Modified over 9 years ago
1
The Pivot: Static Analysis of C++ Applications Bjarne Stroustrup Texas A&M University http://www.research.att.com/~bs
2
2 Overview Static analysis of C++ –What would be useful –Why it is hard –C++0x The Pivot –Context –Aims –Organization –Basic representations High-level program representation for HPC –Concept-based checking and transformation
3
3 What would be useful? Direct representation of high-level ideas in code –E.g. no sideeffects, idempotent operation, always gives the same answer for the same element, no security violation, no memory leak, no race condition, no deadlock, being sorted, being band-diagonal, parallel application … Use of such direct representation –For providing guarantees –For information –For optimization –For program transformation
4
4 It’s hard C++ is –Large –Extremely flexible and general –Quite irregular –Has it’s type-unsafe C subset High-level ideas tend to be represented as templated classes and functions –Generic programming, Template meta-programming, generative programming –We have little experience with tools representing and manipulating templates –Such templates tend to be provided as part of domain specific libraries
5
5 Bell Labs proverbs Library design is language design Language design is library design But the devil is in the details
6
6 C++0x 1998: ISO C++ standard 2009 (estimated): ISO C++ standard –Better libraries and better support for library building Hash maps, regular expressions, file system, … Threads and memory model Concepts –A type system for types, integers, and operations Auto, template aliases, general, initializer lists, …
7
7 Concept: trivial example // Caveat: likely C++0x template where Assignable Iter find(Iter first, Iter last, Val v); template where Assignable Iter find(Iter first, Iter last, Val v); vector v = { 2, 3, 5, 8, 13, 21, 34 }; auto p1 = find(v.begin(), v.end(), 42); auto p2 = find(v,42.3); auto p3 = find(7,42);// error: 7 is not a Container
8
8 Concepts Can express many high-level abstractions –A type system for sets of types, integers, and operations We have experimental implementations of concepts A concept is a handle to which we can attach –some “standard semantics” within the language –essentially arbitrary semantics outside the language using tools Until we get concepts, we can “fake them” with static analysts and transformation tools
9
9 Context for the Pivot Semantically Enhanced Library (Language) –Enhanced notation through libraries –Restrict semantics through tools And take advantage of that semantics C++ Domain Specific Library Semantic Restriction s
10
10 Context for the Pivot Provide the advantages of specialized languages –Without introducing new “special purpose” languages –Without supporting special-purpose language tool chains –Avoiding the 99.?% language death rate Provide general support for the SELL idea –Not just a specialized tool per application/library –The Pivot fits here C++ Domain Specific Library Semantic Restriction s
11
11 Example SELL: Safe C++ Add –Range-checked std::vector iterators –Resource handles –Any (if needed) (a typesafe union type) Subtract –Arrays –Pointers –New/delete –Unions –Excessively complex/obscure code Uses of undefined construct not caught by compilers (e.g. a[++i] = i) Transforms –Pointers into iterators and resource handles (if porting) –New/delete into resource handle uses
12
12 Example SELL: STAPL Wait for Lawrence’s talk
13
13 Aims To allow fully general analysis of C++ source code –“What a human can do” –Foci Templates (e.g. specialization) C++0x features (e.g. concepts, generalized initializers) Distributed programming Embedded systems –Limitation: we work after macro expansion To allow transformation of C++ code –i.e. production of new code from old source Non-aim: handling other languages –e.g. Fortran, Java –but C and C++ dialects are relatively easy
14
14 Related work Lots –20+ tools for analyzing C++ But –Most are specialized E.g. alias analysis, flow analysis, numeric optimizations –Most are attached to a single compiler/parser –None handles all of C++ E.g. C + classes, C++ but not standard libraries –(that requires full handling of templates) Hardly two tools handle the same subset None handles the key C++0x features (e.g. concepts) –Some are proprietary –No serious interoperability
15
The Pivot
16
16 The Pivot Compiler IPR XPR Tool 2 Tool 1 C++ source Object code C++ source IDL Tool 4 “information” Tool 3 Specialized representation (e.g. flow graph) Compiler
17
17 Why? The Original Project Communication with remote mobile device –Calling interface CORBA, DCOM, Java RMI, …, homebrew interface –Transport TCP/IP, XML, …, homebrew protocol Big, Ugly, Slow, Proprietary, … –Why can’t I just write ISO Standard C++?
18
18 The original Project Distributed programs in ISO C++ “as similar as possible to non-distributed programming, but no more similar” // use local object: X x; // remote at “my host” A a; std::string s("abc"); // … x.f(a, s); // a function call // use remote object : proxy x; x.connect("my_host"); A a; std::string s("abc"); // … x.f(a, s); // a message send
19
19 IPR high-level principles Complete: Direct representation of C++ –Built-in types, classes, templates, expressions, statements, translation units … –Can represent erroneous and incomplete C++ programs Regular –The structure contains all of C++ but doesn’t mimic irregularities Programming effort proportional to complexity of task –IPR is not just a data structure Extensible –Node types –Information associated with a node –Operations No integration with compilers
20
20 IPR design choices Type safe IPR (not its users) handles memory management Minimal (run-time and space) –Minimal number of nodes (unification) –Minimal number of checked indirections (usually, virtual function calls) Expression-based regular superset of C++ –E.g. statements, declarations are expressions too –C++0x features (most important: concepts – types have types) Interfaces: –Purely functional, abstract classes, for most users No mutation operation on abstract classes Users don't get pointers directly –Mutating (operates on concrete classes) Users get to use pointers for in-place transformation –Traversals (and queries) Several, most not in “the Pivot core”
21
21 IPR is minimal Necessary for dealing with real-world code –Multi-million line programs are not uncommon Given the constraint of completeness –C++ is complex especially when we use the advanced template features essential for high-performance work Unified representation –E.g., there is only one int and only one 1 –Type comparison becomes pointer comparison Indirections are minimized –An indirection (only) when there is a choice of different types of information
22
22 Original idea (XTI) Too large, too slow
23
23 Current hierarchy (IPR) Compact minimal call overhead
24
IPR – Example 1 void foo(float b = 2.4)
25
IPR – Example 2
26
26 XPR (eXternal Program Representation) Can be thought of as a specialized portable object database –Easy/fast to parse –Easy/fast to write Compact –About as compact as C++ source code Robust –Read/write without using a symbol table LR(1), strictly prefix declaration syntax Human readable Human writeable Can represent almost all of C++ directly –No preprocessor directives –No multiple declarators in a declaration –No, >>, or << in template arguments, except in parentheses
27
27 XPR i : int// int i; C : class {// class C { m : const int// const int m; mm : *const int// const int* mm; f : (:int,:*char) double// double f(int,char*); f : (z:complex) C//C f(complex z); }// }; vector : class {// template class vector { p : *T//T* p; sz : int//int sz; }// };
28
Extremely simple SELL example template void f(const T& v) { double d = v[2]; // OK double* d = &v[2]; // not OK };
29
29 Current and future work Complete infrastructure –Complete EDG and GCC interfaces –Represent headers (modularity) directly –Complete type representation in XPR Initial applications –Style analysis including type safety and security – Analysis and transformation of STAPL programs Build alliances
30
References [GJS+06] Gregor, Douglas; Järvi, Jaako; Siek, Jeremy; Lumsdaine, Andrew; Dos Reis, Gabriel; Stroustrup, Bjarne: Concepts: Linguistic Support for Generic Programming in C++. to appear OOPSLA'06. [DRS05]Stroustrup, Bjarne; Dos Reis, Gabriel: A concept design. C++ Committee, paper N1782. April 2005. [SDR05]Stroustrup, Bjarne; Dos Reis, Gabriel: Supporting SELL for High Performance Computing. LCPC '05. [Str05] Stroustrup, Bjarne: A rational for semantically enhanced libraries. LCSD '05.
Similar presentations
© 2024 SlidePlayer.com Inc.
All rights reserved.