The Pivot: Static Analysis of C++ Applications Bjarne Stroustrup Texas A&M University

2 Overview Static analysis of C++ –What would be useful –Why it is hard –C++0x The Pivot –Context –Aims –Organization –Basic representations High-level program representation for HPC –Concept-based checking and transformation

3 What would be useful? Direct representation of high-level ideas in code –E.g. no side effects, idempotent operation, always gives the same answer for the same element, no security violation, no memory leak, no race condition, no deadlock, being sorted, being band-diagonal, parallel application … Use of such direct representation –For providing guarantees –For information –For optimization –For program transformation

4 It’s hard C++ is –Large –Extremely flexible and general –Quite irregular –Has its type-unsafe C subset High-level ideas tend to be represented as templated classes and functions –Generic programming, template meta-programming, generative programming –We have little experience with tools representing and manipulating templates –Such templates tend to be provided as part of domain-specific libraries

5 Bell Labs proverbs Library design is language design Language design is library design But the devil is in the details

6 C++0x 1998: ISO C++ standard 2009 (estimated): ISO C++ standard –Better libraries and better support for library building Hash maps, regular expressions, file system, … Threads and memory model Concepts –A type system for types, integers, and operations Auto, template aliases, general initializer lists, …

7 Concept: trivial example
    // Caveat: likely C++0x
    template where Assignable Iter find(Iter first, Iter last, Val v);
    template where Assignable Iter find(Iter first, Iter last, Val v);
    vector<int> v = { 2, 3, 5, 8, 13, 21, 34 };
    auto p1 = find(v.begin(), v.end(), 42);
    auto p2 = find(v,42.3);
    auto p3 = find(7,42);    // error: 7 is not a Container
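The kind of constraint this slide attaches to `find` can be approximated in standard C++11 with `std::enable_if`. The following is only a sketch under stated assumptions: `RequireAssignable` and `constrained_find` are hypothetical names introduced here, and the C++0x concept syntax on the slide is not what this code uses.

```cpp
#include <cassert>
#include <iterator>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for an "Assignable" requirement:
// a Val can be assigned to an lvalue of type Elem.
template<class Elem, class Val>
using RequireAssignable =
    typename std::enable_if<std::is_assignable<Elem&, Val>::value, int>::type;

// Iterator-pair overload, constrained on the iterator's value type.
template<class Iter, class Val,
         RequireAssignable<typename std::iterator_traits<Iter>::value_type, Val> = 0>
Iter constrained_find(Iter first, Iter last, const Val& v)
{
    for (; first != last; ++first)
        if (*first == v) return first;
    return last;
}

// Container overload: only types with value_type and const_iterator take part.
template<class Cont, class Val,
         RequireAssignable<typename Cont::value_type, Val> = 0>
typename Cont::const_iterator constrained_find(const Cont& c, const Val& v)
{
    return constrained_find(c.begin(), c.end(), v);
}

// Small self-check of both overloads.
inline void constrained_find_demo()
{
    std::vector<int> v = { 2, 3, 5, 8, 13, 21, 34 };
    assert(constrained_find(v.begin(), v.end(), 42) == v.end());
    assert(constrained_find(v, 13) == v.cbegin() + 4);
    // constrained_find(7, 42);   // would not compile: int is not a container
}
```

With this emulation the call `constrained_find(7, 42)` is rejected because no overload survives substitution, but the diagnostic is far less direct than "7 is not a Container".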

8 Concepts Can express many high-level abstractions –A type system for sets of types, integers, and operations We have experimental implementations of concepts A concept is a handle to which we can attach –some “standard semantics” within the language –essentially arbitrary semantics outside the language using tools Until we get concepts, we can “fake them” with static analysis and transformation tools

9 Context for the Pivot Semantically Enhanced Library (Language) –Enhanced notation through libraries –Restrict semantics through tools And take advantage of that semantics [Diagram labels: C++, Domain Specific Library, Semantic Restrictions]

10 Context for the Pivot Provide the advantages of specialized languages –Without introducing new “special purpose” languages –Without supporting special-purpose language tool chains –Avoiding the 99.?% language death rate Provide general support for the SELL idea –Not just a specialized tool per application/library –The Pivot fits here [Diagram labels: C++, Domain Specific Library, Semantic Restrictions]

11 Example SELL: Safe C++ Add –Range-checked std::vector iterators –Resource handles –Any (if needed) (a typesafe union type) Subtract –Arrays –Pointers –New/delete –Unions –Excessively complex/obscure code Uses of undefined construct not caught by compilers (e.g. a[++i] = i) Transforms –Pointers into iterators and resource handles (if porting) –New/delete into resource handle uses
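The "add" and "transform" halves of this SELL can be illustrated in plain C++. This is a sketch under stated assumptions: `checked_vector`, `out_of_range_is_caught` and `sum_with_handle` are hypothetical names, `std::unique_ptr` stands in for the resource handle, and the "subtract" half (banning arrays, raw pointers and the like) would need a tool and is not shown.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <vector>

// Hypothetical range-checked container: every subscript goes through at().
template<class T>
class checked_vector {
public:
    explicit checked_vector(std::size_t n, const T& init = T()) : elems_(n, init) {}
    T& operator[](std::size_t i) { return elems_.at(i); }
    const T& operator[](std::size_t i) const { return elems_.at(i); }
    std::size_t size() const { return elems_.size(); }
private:
    std::vector<T> elems_;
};

// An out-of-range subscript is reported as an exception rather than
// being undefined behavior.
inline bool out_of_range_is_caught()
{
    checked_vector<int> a(3);
    try {
        a[3] = 7;                      // one past the end
    } catch (const std::out_of_range&) {
        return true;
    }
    return false;
}

// A resource handle owns the object, so no explicit delete is written.
inline int sum_with_handle()
{
    std::unique_ptr<checked_vector<int>> p(new checked_vector<int>(3, 10));
    assert(p->size() == 3);
    int i = 0;
    ++i;                               // increment sequenced on its own,
    (*p)[i] = i;                       // instead of a[++i] = i
    return (*p)[0] + (*p)[1] + (*p)[2];
}
```

Here `sum_with_handle()` yields 21 (10 + 1 + 10), and the vector is released when `p` goes out of scope.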

12 Example SELL: STAPL Wait for Lawrence’s talk

13 Aims To allow fully general analysis of C++ source code –“What a human can do” –Foci Templates (e.g. specialization) C++0x features (e.g. concepts, generalized initializers) Distributed programming Embedded systems –Limitation: we work after macro expansion To allow transformation of C++ code –i.e. production of new code from old source Non-aim: handling other languages –e.g. Fortran, Java –but C and C++ dialects are relatively easy

14 Related work Lots –20+ tools for analyzing C++ But –Most are specialized E.g. alias analysis, flow analysis, numeric optimizations –Most are attached to a single compiler/parser –None handles all of C++ E.g. C + classes, C++ but not standard libraries –(that requires full handling of templates) Hardly two tools handle the same subset None handles the key C++0x features (e.g. concepts) –Some are proprietary –No serious interoperability

The Pivot

16 The Pivot [Architecture diagram. Labels: C++ source, IDL, Compiler, IPR, XPR, Tool 1, Tool 2, Tool 3, Tool 4, Object code, “information”, Specialized representation (e.g. flow graph)]

17 Why? The Original Project Communication with remote mobile device –Calling interface CORBA, DCOM, Java RMI, …, homebrew interface –Transport TCP/IP, XML, …, homebrew protocol Big, Ugly, Slow, Proprietary, … –Why can’t I just write ISO Standard C++?

18 The original Project Distributed programs in ISO C++ “as similar as possible to non-distributed programming, but no more similar”
    // use local object:
    X x;
    A a;
    std::string s("abc");
    // …
    x.f(a, s);               // a function call

    // use remote object:
    proxy<X> x;              // remote at “my host”
    x.connect("my_host");
    A a;
    std::string s("abc");
    // …
    x.f(a, s);               // a message send

19 IPR high-level principles Complete: Direct representation of C++ –Built-in types, classes, templates, expressions, statements, translation units … –Can represent erroneous and incomplete C++ programs Regular –The structure contains all of C++ but doesn’t mimic irregularities Programming effort proportional to complexity of task –IPR is not just a data structure Extensible –Node types –Information associated with a node –Operations No integration with compilers

20 IPR design choices Type safe IPR (not its users) handles memory management Minimal (run-time and space) –Minimal number of nodes (unification) –Minimal number of checked indirections (usually, virtual function calls) Expression-based regular superset of C++ –E.g. statements, declarations are expressions too –C++0x features (most important: concepts – types have types) Interfaces: –Purely functional, abstract classes, for most users No mutation operation on abstract classes Users don't get pointers directly –Mutating (operates on concrete classes) Users get to use pointers for in-place transformation –Traversals (and queries) Several, most not in “the Pivot core”
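The split between purely functional abstract interfaces and concrete implementation classes, with traversals written as visitors, can be sketched as follows. All names here (`sketch::Expr`, `sketch::impl::Plus`, `Printer`, `print`) are hypothetical and chosen for illustration; this is not the actual IPR interface.

```cpp
#include <cassert>
#include <string>

namespace sketch {   // hypothetical names, not the real IPR classes

struct Int_literal;
struct Plus;

// One visit function per node kind.
struct Visitor {
    virtual void visit(const Int_literal&) = 0;
    virtual void visit(const Plus&) = 0;
    virtual ~Visitor() {}
};

// Abstract, read-only view of a node: no mutating operations, no raw pointers.
struct Expr {
    virtual void accept(Visitor&) const = 0;
    virtual ~Expr() {}
};

struct Int_literal : Expr {
    virtual int value() const = 0;
    void accept(Visitor& v) const override;
};

struct Plus : Expr {
    virtual const Expr& first() const = 0;
    virtual const Expr& second() const = 0;
    void accept(Visitor& v) const override;
};

inline void Int_literal::accept(Visitor& v) const { v.visit(*this); }
inline void Plus::accept(Visitor& v) const { v.visit(*this); }

// Concrete classes that a front end (not an analysis tool) would construct.
namespace impl {
struct Int_literal : sketch::Int_literal {
    explicit Int_literal(int v) : val(v) {}
    int value() const override { return val; }
    int val;
};
struct Plus : sketch::Plus {
    Plus(const Expr& a, const Expr& b) : lhs(a), rhs(b) {}
    const Expr& first() const override { return lhs; }
    const Expr& second() const override { return rhs; }
    const Expr& lhs;
    const Expr& rhs;
};
}

// A traversal written only against the abstract interface.
struct Printer : Visitor {
    std::string out;
    void visit(const Int_literal& n) override { out += std::to_string(n.value()); }
    void visit(const Plus& n) override
    {
        out += "(";
        n.first().accept(*this);
        out += " + ";
        n.second().accept(*this);
        out += ")";
    }
};

inline std::string print(const Expr& e)
{
    Printer p;
    e.accept(p);
    return p.out;
}

}  // namespace sketch
```

Building `impl::Plus` from two `impl::Int_literal` nodes holding 1 and 2 and passing it to `sketch::print` produces `(1 + 2)`. Each step of the traversal costs one virtual call per node, which is the kind of indirection the slide proposes to keep to a minimum.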

21 IPR is minimal Necessary for dealing with real-world code –Multi-million line programs are not uncommon Given the constraint of completeness –C++ is complex especially when we use the advanced template features essential for high-performance work Unified representation –E.g., there is only one int and only one 1 –Type comparison becomes pointer comparison Indirections are minimized –An indirection (only) when there is a choice of different types of information
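Unification can be illustrated with a small interning factory: each distinct type is represented by exactly one node, so comparing two types reduces to comparing addresses. `Type_node`, `Type_factory` and `same_type` are hypothetical names for this sketch, and keying nodes by a string is a simplification, not how IPR identifies types.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>

// Hypothetical node type: one object per distinct type.
struct Type_node {
    explicit Type_node(const std::string& n) : name(n) {}
    std::string name;
};

// The factory owns every node, so users never manage memory themselves.
class Type_factory {
public:
    // Returns the unique node for a name, creating it on first request.
    const Type_node& get(const std::string& name)
    {
        auto it = nodes_.find(name);
        if (it == nodes_.end())
            it = nodes_.emplace(name,
                     std::unique_ptr<Type_node>(new Type_node(name))).first;
        return *it->second;
    }
    std::size_t size() const { return nodes_.size(); }
private:
    std::map<std::string, std::unique_ptr<Type_node>> nodes_;
};

// Because nodes are unified, type equality is address equality.
inline bool same_type(const Type_node& a, const Type_node& b)
{
    return &a == &b;
}

// Small self-check: two requests for "int" give the same node.
inline void unification_demo()
{
    Type_factory types;
    const Type_node& a = types.get("int");
    const Type_node& b = types.get("int");
    assert(same_type(a, b));
    assert(types.size() == 1);
}
```

The same scheme applies to literals such as `1`: sharing nodes bounds memory use on multi-million-line programs and replaces structural comparison with a pointer test.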

22 Original idea (XTI) Too large, too slow

23 Current hierarchy (IPR) Compact minimal call overhead

IPR – Example 1 void foo(float b = 2.4)

IPR – Example 2

26 XPR (eXternal Program Representation) Can be thought of as a specialized portable object database –Easy/fast to parse –Easy/fast to write Compact –About as compact as C++ source code Robust –Read/write without using a symbol table LR(1), strictly prefix declaration syntax Human readable Human writeable Can represent almost all of C++ directly –No preprocessor directives –No multiple declarators in a declaration –No <, >, >>, or << in template arguments, except in parentheses

27 XPR
    i : int                          // int i;
    C : class {                      // class C {
        m : const int                //     const int m;
        mm : *const int              //     const int* mm;
        f : (:int,:*char) double     //     double f(int,char*);
        f : (z:complex) C            //     C f(complex z);
    }                                // };
    vector : class {                 // template<class T> class vector {
        p : *T                       //     T* p;
        sz : int                     //     int sz;
    }                                // };

Extremely simple SELL example
    template<class T> void f(const T& v)
    {
        double d = v[2];       // OK
        double* p = &v[2];     // not OK
    }
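In the SELL approach a rule like this is meant to be enforced by a tool over otherwise ordinary C++. As a rough library-only approximation, a container can return elements by value, so that taking an element's address fails to compile. `Value_only_array` and `third_element` are hypothetical names; this sketch is not how the Pivot implements the check.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical container whose subscript yields a copy, never a reference.
class Value_only_array {
public:
    explicit Value_only_array(std::vector<double> init) : elems_(std::move(init)) {}
    double operator[](std::size_t i) const { return elems_.at(i); }
    std::size_t size() const { return elems_.size(); }
private:
    std::vector<double> elems_;
};

template<class T>
double third_element(const T& v)
{
    double d = v[2];          // OK: reads a value
    // double* p = &v[2];     // rejected for Value_only_array: v[2] is a temporary
    return d;
}

// Small self-check with the value-only container.
inline void value_only_demo()
{
    Value_only_array a(std::vector<double>{1.0, 2.0, 3.5});
    assert(a.size() == 3);
    assert(third_element(a) == 3.5);
}
```

For a `std::vector<double>` argument the commented-out line would compile, because `operator[]` returns a reference there. Catching that case is where an external checking tool is needed and a library alone falls short.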

29 Current and future work Complete infrastructure –Complete EDG and GCC interfaces –Represent headers (modularity) directly –Complete type representation in XPR Initial applications –Style analysis including type safety and security – Analysis and transformation of STAPL programs Build alliances

References
[GJS+06] Gregor, Douglas; Järvi, Jaakko; Siek, Jeremy; Lumsdaine, Andrew; Dos Reis, Gabriel; Stroustrup, Bjarne: Concepts: Linguistic Support for Generic Programming in C++. To appear, OOPSLA'06.
[DRS05] Stroustrup, Bjarne; Dos Reis, Gabriel: A concept design. C++ Committee, paper N1782. April
[SDR05] Stroustrup, Bjarne; Dos Reis, Gabriel: Supporting SELL for High Performance Computing. LCPC '05.
[Str05] Stroustrup, Bjarne: A rationale for semantically enhanced libraries. LCSD '05.