Finding Optimal Program Abstractions

Presentation transcript:

Finding Optimal Program Abstractions
Mayur Naik, Georgia Tech
Joint work with: Xin Zhang (Georgia Tech), Hongseok Yang (Oxford), Percy Liang (Stanford), Mooly Sagiv (Tel-Aviv U)

Static Analysis: 70’s to 90’s
client-oblivious
“Because clients have different precision and scalability needs, future work should identify the client they are addressing …” (M. Hind, Pointer Analysis: Haven’t We Solved This Problem Yet?, 2001)
Diagram labels: program p; abstraction a; query $q_1$, query $q_2$; $p \models q_1$? $p \models q_2$?
Dagstuhl, April 2013

Static Analysis: 00’s to Present
client-driven
– demand-driven points-to analysis: Heintze & Tardieu ’01, Guyer & Lin ’03, Sridharan & Bodik ’06, …
– CEGAR model checkers: SLAM, BLAST, …
Diagram labels: program p; abstraction a; query $q_1$, query $q_2$; $p \models q_1$? $p \models q_2$?

Static Analysis: 00’s to Present
client-driven
– demand-driven points-to analysis: Heintze & Tardieu ’01, Guyer & Lin ’03, Sridharan & Bodik ’06, …
– CEGAR model checkers: SLAM, BLAST, …
Diagram labels: program p; abstraction $a_1$ with query $q_1$, abstraction $a_2$ with query $q_2$; $p \models q_1$? $p \models q_2$?

Our Static Analysis Setting
client-driven + parametric
– new search algorithms: testing, machine learning, …
– new analysis questions: optimality, impossibility, …
Diagram labels: program p; abstraction $a_1$ with query $q_1$, abstraction $a_2$ with query $q_2$; $p \models q_1$? $p \models q_2$?

Example 1: Predicate Abstraction (CEGAR)
Abstraction parameter: predicates to use in predicate abstraction
Diagram labels: program p; abstraction $a_1$ with query $q_1$, abstraction $a_2$ with query $q_2$; $p \models q_1$? $p \models q_2$?

Example 2: Shape Analysis (TVLA)
Abstraction parameter: predicates to use as abstraction predicates
Diagram labels: program p; abstraction $a_1$ with query $q_1$, abstraction $a_2$ with query $q_2$; $p \models q_1$? $p \models q_2$?

Example 3: Cloning-based Pointer Analysis
Abstraction parameter: K value to use for each call and each allocation site
Diagram labels: program p; abstraction $a_1$ with query $q_1$, abstraction $a_2$ with query $q_2$; $p \models q_1$? $p \models q_2$?

Problem Statement
An efficient algorithm with:
INPUTS:
– program p and query q
– abstractions $A = \{a_1, \ldots, a_n\}$
– boolean function $S(p, q, a)$
OUTPUT:
– $a \in A$: $S(p, q, a) = \text{true}$
– Proof: $a \in A$: $S(p, q, a) = \text{true}$ AND $\forall a' \in A: (a' \le a \wedge S(p, q, a') = \text{true}) \Rightarrow a' = a$ (Optimal Abstraction)
Diagram labels: S takes p, q, and a, and reports $p \vdash q$ or $p \nvdash q$.
Example lattice: 1111 (finest), 0100 (optimal), 0000 (coarsest), with regions labeled $\neg S(p, q, a)$ and $S(p, q, a)$.
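For a tiny bit-vector family, the optimality condition above can be checked by brute force. The following is only an illustrative sketch: it assumes abstractions are tuples of 0/1 bits, that $\le$ is the pointwise order, and that `S` is a hypothetical oracle supplied as a Python function (here `toy_S`, which is not from the talk). It enumerates all $2^n$ abstractions, which is exactly the cost an efficient algorithm has to avoid.

```python
from itertools import product

def pointwise_leq(a1, a2):
    """True if every bit of a1 is <= the matching bit of a2."""
    return all(x <= y for x, y in zip(a1, a2))

def optimal_abstractions(S, n):
    """All a in {0,1}^n with S(a) true and no strictly smaller a' with S(a') true."""
    family = list(product((0, 1), repeat=n))
    proving = [a for a in family if S(a)]
    return [a for a in proving
            if not any(b != a and pointwise_leq(b, a) for b in proving)]

# Hypothetical oracle (an assumption for illustration):
# the query is provable exactly when component 1 is kept.
def toy_S(a):
    return a[1] == 1

print(optimal_abstractions(toy_S, 4))
```

With this toy oracle the only optimal abstraction is 0100, matching the lattice example on the slide.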

Orderings on A
Efficiency Partial Ordering
– $a_1 \le_{cost} a_2$: sum of $a_1$’s bits $\le$ sum of $a_2$’s bits
– $S(p, q, a_1)$ runs faster than $S(p, q, a_2)$
Precision Partial Ordering
– $a_1 \le_{prec} a_2$: $a_1$ is pointwise $\le a_2$
– $S(p, q, a_1) = \text{true} \Rightarrow S(p, q, a_2) = \text{true}$
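The two orderings can be written down directly for bit-vector abstractions. This is a minimal sketch assuming abstractions are tuples of 0/1 bits; the function names `leq_cost` and `leq_prec` are chosen here for illustration.

```python
def leq_cost(a1, a2):
    """Cost ordering: a1 keeps no more components than a2."""
    return sum(a1) <= sum(a2)

def leq_prec(a1, a2):
    """Precision ordering: a1 is pointwise <= a2."""
    return all(x <= y for x, y in zip(a1, a2))

a = (0, 1, 0, 0)
b = (1, 0, 0, 0)
c = (1, 1, 0, 0)

print(leq_prec(a, c), leq_cost(a, c))  # a is below c in both orderings
print(leq_prec(a, b), leq_prec(b, a))  # a and b are incomparable in precision
print(leq_cost(a, b), leq_cost(b, a))  # but they have equal cost
```

Pointwise comparison implies the cost comparison, but not the other way round: `a` and `b` cost the same while neither is more precise than the other.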

Why Optimality?
Empirical lower bounds for static analysis
Efficient to compute
Better for user consumption
– analysis imprecision facts
– assumptions about missing program parts
Better for machine learning

Why is this Hard in Practice?
|A| is exponential in the size of p, or even infinite
$S(p, q, a) = \text{false}$ for most p, q, a
A different a is optimal for different p, q

Talk Outline
Abstraction Coarsening [POPL’11]
Abstractions from Tests [POPL’12]
Abstraction Refinement [PLDI’13]

Abstraction Coarsening [POPL’11]
For given p, q: start with the finest a, incrementally replace 1’s with 0’s
Two algorithms:
– deterministic vs. randomized
In practice, use a combination of the algorithms
Example lattice: 1111 (finest), 0100 (optimal), 0000 (coarsest), with regions labeled $\neg S(p, q, a)$ and $S(p, q, a)$.
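The slide does not spell out the deterministic algorithm, so the following is only one plausible reading: scan the components once and try to drop each in turn. It assumes abstractions are lists of 0/1 bits, that `S` is a hypothetical oracle passed as a function, and that `S` is monotone (adding components never loses a proof); under that assumption the result is minimal in the pointwise order and costs n + 1 calls to `S`.

```python
def scan_coarsen(S, n):
    """One-at-a-time coarsening sketch: start from the finest abstraction
    and replace each 1 by 0 whenever the query is still proven."""
    a = [1] * n
    if not S(a):
        return None  # even the finest abstraction fails to prove the query
    for i in range(n):
        a[i] = 0          # tentatively drop component i
        if not S(a):
            a[i] = 1      # the proof needs component i, so restore it
    return a

# Hypothetical oracle: the proof needs components 1 and 3.
def needs_1_and_3(a):
    return a[1] == 1 and a[3] == 1

print(scan_coarsen(needs_1_and_3, 4))
```

The linear number of oracle calls is what makes a randomized alternative attractive when the number of components is large.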

Randomized Coarsening Algorithm
$a \leftarrow (1, \ldots, 1)$
Loop:
– Remove each component from a with probability $(1 - \alpha)$
– Run $S(p, q, a)$
– If $\neg S(p, q, a)$ then add components back
– Else remove components permanently
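The loop above can be sketched as follows. This is a minimal illustration, not the authors' implementation: `S` is a hypothetical oracle passed as a function, and the stopping rule (a fixed iteration budget `max_iters`) and the `seed` parameter are assumptions added here, since the slide does not state when the loop ends.

```python
import math
import random

def randomized_coarsen(S, n, alpha, max_iters=1000, seed=0):
    """Randomized coarsening sketch over bit-vector abstractions."""
    rng = random.Random(seed)
    a = [1] * n                      # start from the finest abstraction
    if not S(a):
        return None                  # nothing to coarsen if the query is unproven
    for _ in range(max_iters):       # assumed stopping rule: fixed budget
        # Keep each remaining component with probability alpha,
        # i.e. remove it with probability 1 - alpha.
        candidate = [1 if bit and rng.random() < alpha else 0 for bit in a]
        if S(candidate):
            a = candidate            # still proven: removal becomes permanent
        # otherwise the removed components are added back (a is unchanged)
    return a

# Hypothetical oracle: the query is provable exactly when component 1 is kept,
# so the largest optimal abstraction has s = 1 component.
result = randomized_coarsen(lambda a: a[1] == 1, 4, alpha=math.exp(-1.0))
print(result)
```

With a finite budget the result is always a proving abstraction, but it is only minimal with high probability rather than with certainty.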

Performance of Randomized Coarsening
Let:
– n = total # components
– s = # components in largest optimal abstraction
If the probability is set to $\alpha = e^{-1/s}$, the algorithm outputs an optimal abstraction in $O(s \log n)$ expected time.
Significance: s is small, and there is only a log dependence on the total # components.
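To get a feel for the bound, here is a worked instance with made-up sizes (n = 1000 and s = 5 are assumptions for illustration, not figures from the talk), taking the logarithm as natural and ignoring the constant hidden in the O-notation:

```latex
\[
\alpha = e^{-1/s} = e^{-1/5} \approx 0.819,
\qquad
1 - \alpha \approx 0.181,
\]
\[
s \ln n = 5 \cdot \ln 1000 \approx 5 \cdot 6.908 \approx 34.5 .
\]
```

So each round removes roughly 18% of the remaining components, and the bound is on the order of tens, compared with n = 1000 components in total.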

Application: Pointer Analysis Abstractions
Client: static datarace detector [PLDI’06]
– Pointer analysis using k-CFA with heap cloning
– Uses call graph, may-alias, thread-escape, and may-happen-in-parallel analyses
Table (values not preserved): for hedc, weblech, and lusearch, # components (x 1000) split into alloc sites and call sites, and # unproven queries (dataraces) (x 1000) under 0-CFA, 1-CFA, diff, 1-obj, 2-obj, diff.

Experimental Results: All Queries
Columns: # components (x 1000), BasicRefine (x 1000), ActiveCoarsen. The component counts and the absolute BasicRefine counts are not preserved; the surviving values are:
K-CFA       BasicRefine   ActiveCoarsen
hedc        (83%)         90 (1.0%)
weblech     (85%)         157 (1.0%)
lusearch    (88%)         250 (1.5%)
K-obj       BasicRefine   ActiveCoarsen
hedc        (57%)         37 (2.3%)
weblech     (68%)         48 (1.9%)
lusearch    (73%)         56 (1.9%)

Empirical Results: Per Query

Empirical Results: Per Query, contd.

Talk Outline
Abstraction Coarsening [POPL’11]
Abstractions from Tests [POPL’12]
Abstraction Refinement [PLDI’13]

Abstractions From Tests [POPL’12]
Diagram labels: p, q; dynamic analysis; static analysis; $p \models q$? and optimal!

Combining Dynamic and Static Analysis
Previous work:
– Counterexamples: a query false on some input is false; this suffices if most queries are expected to be false
– Likely invariants: a query true on some inputs is likely true on all inputs [Ernst 2001]
Our approach:
– Proofs: a query true on some inputs is likely true on all inputs, and likely for the same reason!

Example: Thread-Escape Analysis
// u, v, w are local variables
// g is a global variable
// start() spawns new thread
for (i = 0; i < N; i++) {
    u = new h1;
    v = new h2;
    g = new h3;
    v.f = g;
    w = new h4;
    u.f2 = w;
pc: w.id = i;
    u.start();
}
Query: local(pc, w)?
Abstractions shown for allocation sites h1, h2, h3, h4 (one per slide build):
– L L L L
– L L E L: but not optimal
– L E E L: and optimal!

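Benchmarks
Table (values mostly not preserved): classes (app, total), bytecodes (x 1000) (app, total), and alloc. sites (x 1000) for hedc, weblech, lusearch, sunflow, avrora, and hsqldb. Surviving fragments: sunflow 164 and 1,… ; avrora 1,159 and 1,…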

Precision: Thread-Escape Analysis

Running Time (seconds) CDFs (two slides of plots)

Talk Outline
Abstraction Coarsening [POPL’11]
Abstractions from Tests [POPL’12]
Abstraction Refinement [PLDI’13]

Example: Type-State Analysis
x = new File;
y = x;
if (*) z = x;
x.open();
y.close();
if (*) check1(x, closed);
else check2(x, opened);
Result:
Query     Abstraction
check1    Any $\supseteq$ { x, y }
check2    None
Abstractions listed per query, as added across the slide builds:
Query     Abstraction
check1    { }, { x }, { x, y }
check2    { }, { x }
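The result table can be reproduced with a toy analysis. The sketch below is a simplification made up for illustration, not the analysis from the talk: it assumes a single File object, an abstraction given as the set of variables whose must-alias facts are tracked, a strong update when the receiver is known to alias the object, and a weak update otherwise. All names (`PROGRAM`, `analyze`, `proves`, `cheapest_proof`) are hypothetical.

```python
from itertools import combinations

# The slide's program, encoded as abstract statements on one File object.
PROGRAM = [
    ("new", "x"),               # x = new File;
    ("copy", "y", "x"),         # y = x;
    ("maybe_copy", "z", "x"),   # if (*) z = x;
    ("call", "x", "opened"),    # x.open();
    ("call", "y", "closed"),    # y.close();
]

def analyze(tracked):
    """Return the possible type-states of the File after the program."""
    states, must = set(), set()
    for stmt in PROGRAM:
        op = stmt[0]
        if op == "new":
            states = {"closed"}
            must = {stmt[1]} & tracked
        elif op in ("copy", "maybe_copy"):
            dst, src = stmt[1], stmt[2]
            if src in must and dst in tracked:
                after = must | {dst}
            else:
                after = must - {dst}
            # A conditional copy joins both branches: keep facts valid on both.
            must = after if op == "copy" else (must & after)
        elif op == "call":
            receiver, target = stmt[1], stmt[2]
            if receiver in must:
                states = {target}            # strong update
            else:
                states = states | {target}   # weak update
    return states

def proves(tracked, expected):
    """check(x, expected) is proven if expected is the only possible state."""
    return analyze(set(tracked)) == {expected}

def cheapest_proof(expected, variables=("x", "y", "z")):
    """Smallest tracked set that proves the check, or None if none does."""
    for size in range(len(variables) + 1):
        for subset in combinations(variables, size):
            if proves(subset, expected):
                return set(subset)
    return None

print(cheapest_proof("closed"))  # check1
print(cheapest_proof("opened"))  # check2
```

Under these assumptions, check1 needs both x and y tracked so that `y.close()` is a strong update, while check2 cannot be proven by any choice of tracked variables.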

Precision: Thread-Escape Analysis

Comparison with Abstractions from Tests

Number of Iterations
Table (values not preserved): min, max, and avg number of iterations for proven queries and for impossible queries on hsqldb, antlr, avrora, and lusearch.

Running Time
            proven queries        impossible queries
            min    max    avg     min    max    avg
hsqldb      20s    25m    94s     4s     50m    55s
antlr       18s    77m    98s     6s     21m    64s
avrora      16s    28m    67s     5s     3h     41s
lusearch    14s    13m    112s    6s     45m    131s

Size of Optimal Abstraction (two slides of plots)

Key Takeaways
New questions: optimality, impossibility, …
New applications: lower bounds, lib assumptions, …
New techniques: search algorithms, abstractions, …
New tools: meta-analysis, parallelism, …
pag.gatech.edu/prism