Survey of program slicing techniques


Survey of program slicing techniques Presenter’s Name: Keyur Malaviya

Purpose of this paper
A survey that presents an overview of program slicing:
- The various general approaches used to compute slices
- Specific techniques used to address procedures, unstructured control flow, composite data types and pointers, and concurrency
- Static and dynamic slicing methods for each of these features
- Comparison and classification of the methods in terms of their accuracy and efficiency

Topics Covered
- Definitions
- Static slicing vs Dynamic slicing
- Basic slicing algorithms for single-procedure and multi-procedure programs
  - Weiser algorithm
  - Hausler
  - Bergeretti and Carré
  - Horwitz, Reps, and Binkley algorithm
- Applications

Definitions (Basics)
- Slicing? Slicing criteria? Static and dynamic slicing? Program slicing?
- Program dependence graph (PDG), control flow graph (CFG), and system dependence graph (SDG)
Example program:
(1)  read(n);
(2)  i := 1;
(3)  sum := 0;
(4)  product := 1;
(5)  while i <= n do begin
(6)    sum := sum + i;
(7)    product := product * i;
(8)    i := i + 1
     end;
(9)  write(sum);
(10) write(product)

Definitions (CFG / PDG)
PDG: a directed graph. Vertices = statements and control predicates; edges = data and control dependences.
[Figure: CFG]

Definitions
- Program slice: consists of the parts of a program that affect the values computed at some point of interest.
- Slicing criterion: this point of interest, specified by a pair (program point, set of variables).
- Original concept by Weiser: a slice corresponds to the mental abstractions that people make when they are debugging a program.
- Static slicing: computed without making assumptions regarding a program's input.
- Dynamic slicing: relies on some specific test case.

Definitions (criteria and slicing)
Example program:
(1)  read(n);
(2)  i := 1;
(3)  sum := 0;
(4)  product := 1;
(5)  while i <= n do begin
(6)    sum := sum + i;
(7)    product := product * i;
(8)    i := i + 1
     end;
(9)  write(sum);
(10) write(product)
Slice of this program w.r.t. criterion (10, product):
(1)  read(n);
(2)  i := 1;
(4)  product := 1;
(5)  while i <= n do begin
(7)    product := product * i;
(8)    i := i + 1
     end;
(10) write(product)
Single-procedure programs (PDG): the shading in the PDG shown before marks the vertices in the slice w.r.t. write(product).

Static slicing vs Dynamic slicing
Dynamic slicing:
- First introduced by Korel and Laski
- A non-interactive variation of Balzer's flowback analysis
- Only the dependences that occur in a specific execution of the program are taken into account
- The dynamic slicing criterion is a triple (input, occurrence of a statement, variable): it specifies the input and distinguishes between different occurrences of a statement in the execution history
- Dynamic slicing assumes a fixed input for a program; static slicing does not make assumptions regarding the input
Flowback analysis: interactively traverse a graph of the data and control dependences between the statements in the program. E.g. if the value of a variable V at statement S depends on its value at statement T, there is an edge T → S in the graph, and one can trace back from the vertex for S to the vertex for T.

Static slicing vs Dynamic slicing: criteria
Static criterion: (8, x). Dynamic criterion: (n=2, 8^1, x), where 8^1 is the first occurrence of statement 8 in the execution history.
Example program (its static slice w.r.t. criterion (8, x) is the entire program):
(1) read(n);
(2) i := 1;
(3) while (i <= n) do begin
(4)   if (i mod 2 = 0) then
(5)     x := 17
      else
(6)     x := 18;
(7)   i := i + 1
    end;
(8) write(x)
Dynamic slice w.r.t. criterion (n=2, 8^1, x):
(1) read(n);
(2) i := 1;
(3) while (i <= n) do begin
(4)   if (i mod 2 = 0) then
(5)     x := 17
      else
        ;
(7)   i := i + 1
    end;
(8) write(x)
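The difference between the two criteria can be tried out with a small Python sketch. It hand-instruments the example program, records for every executed statement occurrence the occurrences it depends on, and then walks those dependences backward from the last write(x). This is a dependence-based approximation chosen for illustration, not Korel and Laski's original formulation; the names dynamic_slice, record, trace and last_def are hypothetical.

```python
def dynamic_slice(n_input):
    """Run the example program on n_input and return the statement
    numbers in the dynamic slice w.r.t. x at the final write(x)."""
    trace = []      # executed occurrences: (statement number, dependency indices)
    last_def = {}   # variable -> trace index of its most recent definition

    def record(stmt, uses=(), defs=(), ctrl=None):
        # Data dependences: the latest definitions of the used variables.
        deps = {last_def[v] for v in uses if v in last_def}
        # Control dependence: the predicate occurrence guarding this one.
        if ctrl is not None:
            deps.add(ctrl)
        trace.append((stmt, deps))
        idx = len(trace) - 1
        for v in defs:
            last_def[v] = idx
        return idx

    record(1, defs=["n"]); n = n_input           # read(n)
    record(2, defs=["i"]); i = 1                 # i := 1
    loop_ctrl = None
    while True:
        loop_ctrl = record(3, uses=["i", "n"], ctrl=loop_ctrl)   # while i <= n
        if not (i <= n):
            break
        if_ctrl = record(4, uses=["i"], ctrl=loop_ctrl)          # if i mod 2 = 0
        if i % 2 == 0:
            record(5, defs=["x"], ctrl=if_ctrl)                  # x := 17
        else:
            record(6, defs=["x"], ctrl=if_ctrl)                  # x := 18
        record(7, uses=["i"], defs=["i"], ctrl=loop_ctrl); i += 1
    criterion = record(8, uses=["x"])            # write(x)

    # Backward closure over the recorded dynamic dependences.
    seen, stack = set(), [criterion]
    while stack:
        k = stack.pop()
        if k not in seen:
            seen.add(k)
            stack.extend(trace[k][1])
    return sorted({trace[k][0] for k in seen})


print(dynamic_slice(2))   # statement 6 (x := 18) is not in this slice
```

For n = 2 the assignment x := 18 is executed but later overwritten, so it drops out of the slice, whereas a static slicer must keep both branches.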

Slicing Algorithm Approaches
Slices are computed through one of three algorithmic approaches:
1) data-flow equations
2) system dependence graph
3) parallel algorithm
All are based on control and data dependences and are defined in terms of a graph representation of a program (as seen before).

Approaches
- Weiser's approach: compute slices from consecutive sets of transitively relevant statements, according to data-flow and control-flow dependences
- Ottenstein's approach: slicing as a reachability problem in a PDG. The slicing criterion is a vertex in the PDG; a slice corresponds to all PDG vertices from which the vertex under consideration can be reached
- Other approaches: based on modified and extended versions of PDGs
In each case, statements and control predicates are gathered by way of a backward traversal of the program's control flow graph (CFG) or PDG, starting at the slicing criterion.
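The reachability view can be illustrated with a short Python sketch. The edge list is a hand-built PDG for the example program (statement numbers 1 to 10), with the entry vertex left out for brevity; PDG_EDGES and backward_slice are hypothetical names introduced only for this illustration.

```python
from collections import deque

# Hand-built PDG of the example program: (source, target) means that the
# target statement is data or control dependent on the source statement.
PDG_EDGES = [
    # flow (data) dependences
    (1, 5),                                   # n
    (2, 5), (2, 6), (2, 7), (2, 8),           # i := 1
    (8, 5), (8, 6), (8, 7), (8, 8),           # i := i + 1
    (3, 6), (3, 9), (6, 6), (6, 9),           # sum
    (4, 7), (4, 10), (7, 7), (7, 10),         # product
    # control dependences on the while predicate
    (5, 6), (5, 7), (5, 8),
]


def backward_slice(edges, criterion):
    """Return every vertex from which the criterion vertex can be reached."""
    preds = {}
    for src, dst in edges:
        preds.setdefault(dst, set()).add(src)
    seen = {criterion}
    work = deque([criterion])
    while work:
        v = work.popleft()
        for p in preds.get(v, ()):
            if p not in seen:
                seen.add(p)
                work.append(p)
    return seen


print(sorted(backward_slice(PDG_EDGES, 10)))   # slice w.r.t. write(product)
```

Once the graph is built, every slice is a single linear-time traversal, which is the main attraction of the dependence-graph formulation.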

Weiser Algorithm (single procedure)
Two levels of iteration:
1. Tracing transitive data dependences, which requires iteration in the presence of loops in the program
2. Tracing control dependences, which brings control predicates into the slice; for each such predicate, step 1 is repeated to include the statements it depends upon
Determine the directly relevant variables first and then the indirectly relevant variables; from these, compute the sets of relevant statements.

Parameters and equations
Defined and referenced variables: DEF(i) and REF(i)
- E.g. if node i holds the statement a = b + c, then DEF(i) = {a} and REF(i) = {b, c}
Directly relevant variables: R0(i) is the set of directly relevant variables at node i, where the slicing criterion is (n, V)
- R0(n) = V
- R0(i) is derived from R0(j) for every node j that is a direct CFG successor of i: the variables in R0(j) that are not defined at i, plus REF(i) if i defines a variable in R0(j)

Parameters and equations
Directly relevant statements: the set of all nodes i that define a variable v that is relevant at a successor node of i.
Indirectly relevant variables: the variables referenced in a control predicate b are indirectly relevant when at least one of the statements in its body is relevant. The set of statements controlled by b is known as its range of influence, INFL(b).

Example program

Applying the Weiser algo
Slicing criterion (10, product) & our example program. R0 holds the directly relevant variables. Node 5 has relevant statements in INFL(5), so the same procedure is repeated for slicing criterion (5, {i, n}); merging the result into R0 gives R1.
NODE  DEF        REF            INFL       R0            R1
1     {n}        ∅              ∅          ∅             ∅
2     {i}        ∅              ∅          ∅             {n}
3     {sum}      ∅              ∅          {i}           {i, n}
4     {product}  ∅              ∅          {i}           {i, n}
5     ∅          {i, n}         {6, 7, 8}  {product, i}  {product, i, n}
6     {sum}      {sum, i}       ∅          {product, i}  {product, i, n}
7     {product}  {product, i}   ∅          {product, i}  {product, i, n}
8     {i}        {i}            ∅          {product, i}  {product, i, n}
9     ∅          {sum}          ∅          {product}     {product}
10    ∅          {product}      ∅          {product}     {product}

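Equations for relevant statements (for criterion C; R0 and R1 in the table are $R^0_C$ and $R^1_C$):
$S^0_C = \{\, i \mid \mathrm{DEF}(i) \cap R^0_C(j) \neq \emptyset,\ i \to_{\mathrm{CFG}} j \,\}$
$B^k_C = \{\, b \mid \mathrm{INFL}(b) \cap S^k_C \neq \emptyset \,\}$
$R^{k+1}_C(i) = R^k_C(i) \cup \bigcup_{b \in B^k_C} R^0_{(b,\,\mathrm{REF}(b))}(i)$
$S^{k+1}_C = B^k_C \cup \{\, i \mid \mathrm{DEF}(i) \cap R^{k+1}_C(j) \neq \emptyset,\ i \to_{\mathrm{CFG}} j \,\}$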
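As a check on the table, the iteration can be written down as a short Python sketch. The CFG, DEF, REF and INFL tables are transcribed from the example program; the function names are hypothetical, and the code is a small illustration of the iterative scheme rather than Weiser's original formulation.

```python
# CFG successors and per-node tables for the example program (nodes 1..10).
CFG_SUCC = {1: [2], 2: [3], 3: [4], 4: [5], 5: [6, 9],
            6: [7], 7: [8], 8: [5], 9: [10], 10: []}
DEF = {1: {"n"}, 2: {"i"}, 3: {"sum"}, 4: {"product"}, 5: set(),
       6: {"sum"}, 7: {"product"}, 8: {"i"}, 9: set(), 10: set()}
REF = {1: set(), 2: set(), 3: set(), 4: set(), 5: {"i", "n"},
       6: {"sum", "i"}, 7: {"product", "i"}, 8: {"i"},
       9: {"sum"}, 10: {"product"}}
INFL = {5: {6, 7, 8}}   # range of influence of the while predicate


def directly_relevant(node, variables):
    """Directly relevant variables for criterion (node, variables)."""
    rel = {i: set() for i in CFG_SUCC}
    rel[node] = set(variables)
    changed = True
    while changed:                      # iterate to a fixed point (loops)
        changed = False
        for i, succs in CFG_SUCC.items():
            for j in succs:
                new = rel[j] - DEF[i]   # relevant after i and not killed by i
                if DEF[i] & rel[j]:     # i defines a relevant variable
                    new |= REF[i]
                if not new <= rel[i]:
                    rel[i] |= new
                    changed = True
    return rel


def relevant_statements(rel):
    """Nodes that define a variable relevant at one of their successors."""
    return {i for i, succs in CFG_SUCC.items()
            for j in succs if DEF[i] & rel[j]}


def weiser_slice(node, variables):
    rel = directly_relevant(node, variables)
    stmts = relevant_statements(rel)
    while True:
        # Predicates whose range of influence contains a relevant statement.
        branches = {b for b, body in INFL.items() if body & stmts}
        new_rel = {i: set(vs) for i, vs in rel.items()}
        for b in branches:
            extra = directly_relevant(b, REF[b])
            for i in new_rel:
                new_rel[i] |= extra[i]
        new_stmts = branches | relevant_statements(new_rel)
        if new_rel == rel and new_stmts == stmts:
            return stmts, rel
        rel, stmts = new_rel, new_stmts


stmts, rel = weiser_slice(10, {"product"})
print(sorted(stmts))
```

The returned statement set does not contain node 10 itself, because write(product) defines no variable; the criterion statement is added separately when the slice is displayed.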

Hausler (functional style)
For each type of statement there are two functions, δ and α, which express how the statement transforms the set of relevant variables and the set of relevant statements, respectively. The functions for a while statement are obtained by transforming it into an infinite sequence of if statements.

Information-flow relations (Bergeretti and Carré)
For a statement S, a variable v, and an expression e (e can be a control predicate or the right-hand side of an assignment), we define the relations λS, μS, and ρS. They possess the following properties:
- (v, e) ∈ λS iff the value of v on entry to S potentially affects the value computed for e
- (e, v) ∈ μS iff the value computed for e potentially affects the value of v on exit from S
- (v, v') ∈ ρS iff the value of v on entry to S may affect the value of v' on exit from S

Information-flow relations (Bergeretti and Carré)
How to get the slice with respect to the final value of v?
- The set of all expressions e for which (e, v) ∈ μS can be used to construct "partial statements": replace all statements in S that do not contain expressions in this set by empty statements
- The relations are computed in a syntax-directed, bottom-up manner, starting from base cases such as an assignment S of the form v := e

Information-flow relations (Bergeretti and Carré)
- The set of expressions that potentially affect the value of product at the end of the program is {1, 2, 4, 5, 7, 8}
- The partial statement is obtained by omitting all statements from the program that do not contain expressions in this set, i.e., both assignments to sum and both write statements
- The slice is the same as the one computed by Weiser's algorithm

Dependence graph based approaches (PDG) and Procedures
- The PDG variant of Ottenstein shows considerably more detail than that of Horwitz, Reps, and Binkley
Procedures:
- The call-return structure of interprocedural execution paths has to be taken into account
- A single pass considers infeasible execution paths, a problem known as the "calling context" problem
- We will see two approaches: Weiser's approach (CFG) and Horwitz, Reps, and Binkley (SDG)

Dependence graph based approaches (PDG) and Procedures
Weiser's approach for interprocedural static slicing:
(1) Interprocedural summary information is computed, using previously developed techniques: for each procedure P, the set MOD(P) of variables that may be modified by P and the set USE(P) of variables that may be used by P
(2) Intraprocedural slicing algorithm: treat a call 'P()' as a conditional assignment statement 'if SomePredicate then MOD(P) := USE(P)' (what about external procedures whose source code is unavailable?)

Weiser's approach
(3) The actual interprocedural slicing algorithm generates new slicing criteria iteratively w.r.t. the slices computed in step (2). For a slice in procedure P, the new criteria concern:
(i) procedures Q called by P: the criteria consist of all pairs (last statement of Q, relevant variables of P that are in the scope of Q, with formals substituted for actuals)
(ii) procedures R that call P: the criteria consist of all pairs (call to P in R, variables relevant at the first statement of P that are in the scope of R, with actuals substituted for formals)

Weiser's Algo
To formalize the generation of new criteria:
- UP(S): maps a set S of slicing criteria in a procedure P to a set of criteria in the procedures that call P
- DOWN(S): maps a set S of slicing criteria in a procedure P to a set of criteria in the procedures called by P
- Set of all criteria: obtained from the transitive and reflexive closure of the UP and DOWN relations, (UP ∪ DOWN)*
- Computing the UP and DOWN sets requires the sets of relevant variables to be known at all call sites; these sets are computed by slicing the procedures
- When does the iteration stop? When no new criteria are generated

Main issue:
program Main;
  …
  while ( … ) do
    P(x1, x2, …, xn);
    z := x1;
    x1 := x2;
    x2 := x3;
    …
    x(n-1) := xn
  end;
(L) write(z)
end
procedure P(y1, y2, …, yn);
begin
  write(y1);
  write(y2);
  …
(M) write(yn)
end
Procedure P is sliced n times by Weiser's algorithm for criterion (L, {z}).

Weiser's Algo
- L: program point at S = write(z); M: program point at the last statement in P
- Slice w.r.t. criterion (L, {z})? It takes n iterations of the body of the while loop
- During the i-th iteration, variables x1, …, xi will be relevant at the call site
- DOWN(Main): criterion (M, {y1, …, yi}) gets included
- Issue: procedure P will be sliced n times

What was the problem?
Weiser's algorithm does not take into account which output parameters depend on which input parameters; this is a source of imprecision. Let's see another example that shows this problem:

What was the problem?
Example program:
program Example;
begin
(1) a := 17;
(2) b := 18;
(3) P(a, b, c, d);
(4) write(d)
end
procedure P(v, w, x, y);
(5) x := v;
(6) y := w
end
Actual slice w.r.t. write(d):
program Example;
begin
  ;
  b := 18;
  P(a, b, c, d);
  write(d)
end
procedure P(v, w, x, y);
  ;
  y := w
end
Slice with Weiser's algo:
program Example;
begin
  a := 17;
  b := 18;
  P(a, b, c, d);
  write(d)
end
procedure P(v, w, x, y);
  ;
  y := w
end
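A small Python sketch makes the source of the extra statement visible. It compares a precise parameter summary for P (derived from x := v; y := w) with the coarse MOD(P) := USE(P) treatment, and lists which assignments before the call end up in the slice for d. All names here (PRECISE, WEISER, ACTUALS, MAIN, statements_before_call_in_slice) are hypothetical helpers for this one example.

```python
# Statements of Main before the call: (text, variables defined).
MAIN = [("a := 17", {"a"}), ("b := 18", {"b"})]

# Binding of P's formal parameters to the actuals at the call P(a, b, c, d).
ACTUALS = {"v": "a", "w": "b", "x": "c", "y": "d"}

# Output parameter -> input parameters it may depend on.
PRECISE = {"x": {"v"}, "y": {"w"}}             # from the body: x := v; y := w
WEISER = {"x": {"v", "w"}, "y": {"v", "w"}}    # call treated as MOD(P) := USE(P)


def statements_before_call_in_slice(summary, wanted_actual):
    """Assignments before the call that the slice for wanted_actual keeps."""
    # Find the output formal bound to the variable of interest.
    formal = next(f for f in summary if ACTUALS[f] == wanted_actual)
    # Translate the inputs it depends on back to actual arguments.
    relevant = {ACTUALS[f] for f in summary[formal]}
    return [text for text, defs in MAIN if defs & relevant]


print(statements_before_call_in_slice(PRECISE, "d"))   # only b := 18
print(statements_before_call_in_slice(WEISER, "d"))    # a := 17 is kept as well
```

With the coarse summary every modified variable depends on every used variable, so a := 17 is kept even though d never depends on a.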

Horwitz, Reps, and Binkley Algo
Computes precise interprocedural static slices, in three parts:
1. The SDG, a graph representation for multi-procedure programs
2. Computation of interprocedural summary information: precise dependence relations between input and output parameters, explicitly present in the SDG as summary edges
3. A two-pass algorithm for extracting interprocedural slices from an SDG

Multi-procedure program

Horwitz, Reps, and Binkley Algo
1) Structure of SDG
- SDG = a PDG for the main program plus a procedure dependence graph for each procedure
- SDG ≠ PDG (the vertices and edges are different)
- For each call statement, there is a call-site vertex in the SDG as well as actual-in and actual-out vertices

1) Structure of SDG
Each procedure dependence graph has an entry vertex, and formal-in and formal-out vertices.
Interprocedural dependence edges:
(i) a control dependence edge between a call-site vertex and the entry vertex of the called procedure
(ii) a parameter-in edge between corresponding actual-in and formal-in vertices
(iii) a parameter-out edge between corresponding formal-out and actual-out vertices
(iv) summary edges that represent transitive interprocedural data dependences

1) Structure of SDG

Horwitz, Reps, and Binkley Algo 2) and 3)
Second part:
- Model the calling relationships between the procedures (as in a call graph)
- Compute the subordinate characteristic graphs: for each procedure in the program, this graph contains edges that correspond to precise transitive flow dependences between its input and output parameters
Third part: the summary edges of an SDG serve to circumvent the calling context problem
- First phase: starting at vertex s, find all vertices from which s can be reached without descending into procedure calls
- Second phase: find the remaining vertices in the slice by descending into all previously side-stepped calls
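The two phases can be sketched in Python on a hand-built SDG for the small program Example with procedure P(v, w, x, y). The vertex names and the EDGES list are assumptions made for this illustration, and the summary edges are written down by hand instead of being computed. Phase one follows every edge kind except parameter-out edges; phase two follows every kind except call and parameter-in edges.

```python
from collections import deque

# (source, target, kind): the target depends on the source.
EDGES = [
    # Main program
    ("main", "a := 17", "control"), ("main", "b := 18", "control"),
    ("main", "call P", "control"), ("main", "write(d)", "control"),
    ("call P", "a_in", "control"), ("call P", "b_in", "control"),
    ("call P", "c_out", "control"), ("call P", "d_out", "control"),
    ("a := 17", "a_in", "flow"), ("b := 18", "b_in", "flow"),
    ("d_out", "write(d)", "flow"),
    # Interprocedural edges
    ("call P", "enter P", "call"),
    ("a_in", "v_in", "param_in"), ("b_in", "w_in", "param_in"),
    ("x_out", "c_out", "param_out"), ("y_out", "d_out", "param_out"),
    ("a_in", "c_out", "summary"), ("b_in", "d_out", "summary"),
    # Procedure P
    ("enter P", "v_in", "control"), ("enter P", "w_in", "control"),
    ("enter P", "x := v", "control"), ("enter P", "y := w", "control"),
    ("enter P", "x_out", "control"), ("enter P", "y_out", "control"),
    ("v_in", "x := v", "flow"), ("w_in", "y := w", "flow"),
    ("x := v", "x_out", "flow"), ("y := w", "y_out", "flow"),
]


def reach_backward(start, skip):
    """Vertices that reach a start vertex without using the skipped edge kinds."""
    seen = set(start)
    work = deque(start)
    while work:
        v = work.popleft()
        for src, dst, kind in EDGES:
            if dst == v and kind not in skip and src not in seen:
                seen.add(src)
                work.append(src)
    return seen


def two_phase_slice(criterion):
    # Phase 1: stay at the level of the criterion or ascend to callers;
    # summary edges stand in for the side-stepped calls.
    phase1 = reach_backward({criterion}, skip={"param_out"})
    # Phase 2: descend into the side-stepped calls, never ascending again.
    phase2 = reach_backward(phase1, skip={"call", "param_in"})
    return phase1, phase2 - phase1


first, second = two_phase_slice("write(d)")
print(sorted(first))
print(sorted(second))
```

The assignment a := 17 and the statement x := v appear in neither phase, matching the actual slice of the Example program.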

COMPLETE SDG NEXT: Complete SDG for the example program shown above

SDG style interpretation
- Thin solid arrows: flow dependences
- Thick solid arrows: control dependences
- Thin dashed arrows: call, parameter-in, and parameter-out dependences
- Thick dashed arrows: transitive interprocedural flow dependences
- Shaded vertices: vertices in the slice w.r.t. statement write(product)
  - Light shading: vertices identified in the first phase
  - Dark shading: vertices identified in the second phase

The slice with criterion (10, product)
program Example;
begin
(1)  read(n);
(2)  i := 1;
(3)  sum := 0;
(4)  product := 1;
(5)  while i <= n do
(6)    Add(sum, i);
(7)    Multiply(product, i);
(8)    Add(i, 1)
     end;
(9)  write(sum);
(10) write(product)
end
procedure Add(a; b);
begin
(11) a := a + b
end
procedure Multiply(c; d);
begin
(12) j := 1;
(13) k := 0;
(14) while j <= d do
(15)   Add(k, c);
(16)   Add(j, 1);
     end;
(17) c := k
end

Application of slicing
- Debugging and program analysis
- Program differencing and program integration: analyzing an old and a new version of a program by partitioning their components and comparing slices in order to detect equivalent behaviors
- Software maintenance: determining how a change at some place in a program affects the behavior of other parts of the program

QUESTIONS