Kai Pan, Xintao Wu University of North Carolina at Charlotte Generating Program Inputs for Database Application Testing Tao Xie North Carolina State University.


1 Generating Program Inputs for Database Application Testing
Kai Pan, Xintao Wu (University of North Carolina at Charlotte)
Tao Xie (North Carolina State University)
26th IEEE/ACM International Conference on Automated Software Engineering, Nov 11, 2011, Lawrence, Kansas

2 Background: Functional Testing, Test Generation, Program Inputs

3 Background: Functional Testing, Test Generation, Program Inputs, Database States

4 An Example: Program inputs, Database

5 Motivation

6 Benefits of using an existing database state
It represents real-world objects’ characteristics, helping detect faults that could cause failures in real-world settings.
It reduces the cost of generating new database records.

7 Dynamic Symbolic Execution (DSE)
Execute the program both concretely and symbolically (also called concolic testing).
Collect the constraints along the executed path as the path condition.
Negate part of the path condition and solve the new path condition to obtain an input that leads to a new path.
DSE tools exist for various programming languages, e.g., Pex for .NET from Microsoft Research.
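The DSE loop described on this slide can be sketched in a few lines. This is only a toy illustration, not how Pex works: the `program` function, its two branch conditions, and the constant 27695 are hypothetical, and a brute-force search over a tiny hand-picked input domain stands in for a real constraint solver.

```python
from itertools import product

def program(type_, zip_):
    """Hypothetical program under test; returns the branch decisions taken."""
    fzip = zip_ + 1
    return (type_ == 0, fzip == 27695)

def solve(target, domain):
    """Stand-in for a constraint solver: find any input in the domain
    whose execution starts with the target branch decisions."""
    for candidate in domain:
        if program(*candidate)[:len(target)] == tuple(target):
            return candidate
    return None

def explore(seed, domain):
    """Run the program, flip each branch decision of the observed path in
    turn, and 'solve' for an input that follows the flipped prefix."""
    seen_paths, worklist, inputs = set(), [seed], []
    while worklist:
        candidate = worklist.pop()
        path = program(*candidate)
        if path in seen_paths:
            continue
        seen_paths.add(path)
        inputs.append(candidate)
        for i in range(len(path)):
            target = path[:i] + (not path[i],)
            new_input = solve(target, domain)
            if new_input is not None:
                worklist.append(new_input)
    return inputs

# Assumed input domain; it already contains the "magic" zip value 27694.
domain = list(product((0, 1), (0, 27694)))
inputs = explore((0, 0), domain)
print(sorted(inputs))
```

The sketch only reaches every path because the assumed domain already contains the right zip value; when that value is determined by the database contents, a solver that sees only the program constraints has no way to pick it.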

8-12 Motivation
Path condition:
C1: Query construction constraints
C2: Query/DB constraints
C3: Result manipulation constraints
C1 ∧ C2 ∧ C3: a hard part

13 Motivation
How can we derive high-covering program input values based on a given database state?

14 Outline: Background, Approach, Evaluation, Conclusion and future work

15 SQL query forms
Fundamental structure: SELECT, FROM, WHERE, GROUP BY, and HAVING clauses.
SELECT select-list
FROM from-list
WHERE qualification
(GROUP BY grouping-list)
(HAVING group-qualification)

16 SQL query forms (cont’d)
Nested query: a query with another query embedded within it.
Using transformation rules, a nested query can be unnested into an equivalent single-level canonical query.
A nested query:
SELECT S.sname
FROM Sailors S
WHERE EXISTS (SELECT * FROM Reserves R WHERE R.bid=103 AND R.sid=S.sid)
Its canonical form:
SELECT S.sname
FROM Sailors S, Reserves R
WHERE R.sid=S.sid AND R.bid=103
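As a quick check of the unnesting on this slide, the following sketch runs both query forms against a small in-memory SQLite database. The table contents are made up for illustration; only the table and column names come from the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Sailors (sid INTEGER PRIMARY KEY, sname TEXT);
CREATE TABLE Reserves (sid INTEGER, bid INTEGER);
-- Hypothetical sample rows
INSERT INTO Sailors VALUES (1, 'Dustin'), (2, 'Lubber'), (3, 'Horatio');
INSERT INTO Reserves VALUES (1, 103), (2, 102), (3, 103);
""")

nested = """
SELECT S.sname FROM Sailors S
WHERE EXISTS (SELECT * FROM Reserves R
              WHERE R.bid = 103 AND R.sid = S.sid)
"""
canonical = """
SELECT S.sname FROM Sailors S, Reserves R
WHERE R.sid = S.sid AND R.bid = 103
"""

nested_rows = sorted(conn.execute(nested).fetchall())
canonical_rows = sorted(conn.execute(canonical).fetchall())
print(nested_rows, canonical_rows)
```

On this sample data each sailor reserves boat 103 at most once, so both forms return the same rows. If a sailor had several reservations for the same boat, the join form would return that name once per reservation, so the comparison would have to be made on distinct names.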

17 SQL query forms of focus
WHERE clause consisting of a disjunction of conjunctions:
SELECT C1, C2, ..., Ch
FROM from-list
WHERE (A11 AND ... AND A1n) OR ... OR (Am1 AND ... AND Amn)

18 Outline: Background, Approach, Evaluation, Conclusion and future work

19 Illustrative example

20 Apply DSE on the existing database
Step 1: DSE chooses “type=0, zip=0” → executed query:
Q1: SELECT C.SSN, C.income, M.balance FROM customer C, mortgage M WHERE M.year=15 AND C.zipcode=1 AND C.SSN=M.SSN
Execution of Q1 → zero records, not covering the loop body.

21 Apply DSE on the existing database (cont’d)
Step 2: DSE flips “type == 0” to “type != 0” → “type=1, zip=0” → executed query:
Q2: SELECT C.SSN, C.income, M.balance FROM customer C, mortgage M WHERE M.year=30 AND C.zipcode=1 AND C.SSN=M.SSN
Execution of Q2 → zero records, not covering the loop body.

22 Apply DSE on the existing database (cont’d)
However, an input like “type=0, zip=27694” → executed query:
Q3: SELECT C.SSN, C.income, M.balance FROM customer C, mortgage M WHERE M.year=15 AND C.zipcode=27695 AND C.SSN=M.SSN
Execution of Q3 → one record {C.SSN = 001, C.income = 50000, M.balance = 20000}, covering Line14=true and Line18=false.

23 Apply DSE on the existing database (cont’d)
Furthermore, an input like “type=0, zip=28222” → executed query:
Q4: SELECT C.SSN, C.income, M.balance FROM customer C, mortgage M WHERE M.year=15 AND C.zipcode=28223 AND C.SSN=M.SSN
Execution of Q4 → one record {C.SSN = 002, C.income = , M.balance = 30000}, covering Line14=true and Line18=true.
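The four runs above can be replayed with a small sketch. The program code itself is not part of this transcript, so `calc_stat` below is a hypothetical reconstruction from the queries and line labels on the slides. Two things are assumptions made only so the example runs: the income of record 002 (150000) and the definition `diff = income - 1.5 * balance`. The sketch also uses query parameters where the slides suggest string concatenation.

```python
import sqlite3

def make_db():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    CREATE TABLE customer (SSN TEXT, income INTEGER, zipcode INTEGER);
    CREATE TABLE mortgage (SSN TEXT, year INTEGER, balance INTEGER);
    INSERT INTO customer VALUES ('001', 50000, 27695);
    INSERT INTO customer VALUES ('002', 150000, 28223);  -- income is assumed
    INSERT INTO mortgage VALUES ('001', 15, 20000);
    INSERT INTO mortgage VALUES ('002', 15, 30000);
    """)
    return conn

def calc_stat(conn, type_, zip_):
    """Hypothetical program under test; returns the branch outcomes it hit."""
    covered = []
    fzip = zip_ + 1                      # query construction constraint
    if type_ == 0:                       # "Line04"
        covered.append("Line04=true")
        years = 15
    else:
        covered.append("Line04=false")
        years = 30
    rows = conn.execute(
        "SELECT C.SSN, C.income, M.balance FROM customer C, mortgage M "
        "WHERE M.year = ? AND C.zipcode = ? AND C.SSN = M.SSN",
        (years, fzip),
    ).fetchall()
    if not rows:
        covered.append("Line14=false")   # loop body never entered
    for _ssn, income, balance in rows:   # "Line14"
        covered.append("Line14=true")
        diff = income - 1.5 * balance    # assumed definition of diff
        covered.append("Line18=true" if diff > 100000 else "Line18=false")
    return covered

conn = make_db()
for args in [(0, 0), (1, 0), (0, 27694), (0, 28222)]:
    print(args, calc_stat(conn, *args))
```

Only the last two inputs enter the loop, and they differ in the Line18 outcome, which matches the coverage described for Q3 and Q4.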

24 Assist DSE to generate program inputs
How can we derive high-covering program input values based on a given database state?

25-27 Our idea: construct auxiliary queries
Auxiliary query:
SELECT C.zipcode FROM customer C, mortgage M WHERE M.year=15 AND C.SSN=M.SSN
For example, the result set includes “fzip=27695”. From “fzip=zip+1”, we derive “zip=27694”, which covers Line14=true and Line18=false.
The auxiliary query acts like a “constraint solver” for program constraints + DB state constraints.

28-29 Approach
Collect query construction constraints on program variables used in the executed queries from the program code.
Collect result manipulation constraints on comparisons with record values in the query’s result set (such as “if (diff>100000)”).

30-34 Construct auxiliary queries
For path “Line04=true, Line14=true”, construct the abstract query (our target):
SELECT C.SSN, C.income, M.balance FROM customer C, mortgage M WHERE M.year=15 AND C.zipcode=‘fzip’ AND C.SSN=M.SSN
Construct the auxiliary query clause by clause:
SELECT C.zipcode
FROM customer C, mortgage M
WHERE M.year=15 AND C.SSN=M.SSN

35-36 Generate program input values
Run auxiliary query:
SELECT C.zipcode FROM customer C, mortgage M WHERE M.year=15 AND C.SSN=M.SSN
→ fzip: 27695 or 28223 → zip: 27694 or 28222

37 Generate program input values
Input combinations: type: 0 or !0 × zip: 27694 or 28222
“type=0, zip=27694” covers Line04=true, Line14=true, but Line18=false.
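The derivation on these slides (run the auxiliary query, then invert “fzip=zip+1”) can be sketched as follows. The schema is inferred from the queries on the slides; the `type` value 1 is just one representative of “!0”.

```python
import sqlite3
from itertools import product

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (SSN TEXT, zipcode INTEGER);
CREATE TABLE mortgage (SSN TEXT, year INTEGER, balance INTEGER);
INSERT INTO customer VALUES ('001', 27695), ('002', 28223);
INSERT INTO mortgage VALUES ('001', 15, 20000), ('002', 15, 30000);
""")

# Auxiliary query: the abstract query without the predicate on fzip,
# selecting the column that fzip is compared against.
auxiliary_query = """
SELECT C.zipcode FROM customer C, mortgage M
WHERE M.year = 15 AND C.SSN = M.SSN
"""
fzips = sorted(row[0] for row in conn.execute(auxiliary_query))

# Invert the query construction constraint fzip = zip + 1.
zips = [fzip - 1 for fzip in fzips]

# Combine with the values for `type` (0, and 1 standing in for "!0").
input_combinations = list(product((0, 1), zips))
print(fzips, zips, input_combinations)
```

Inverting the constraint by hand works here because it is a simple linear relation; for more involved relations between program inputs and query variables, a constraint solver would be the natural tool for this step.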

38 Approach (cont’d)
Not enough! Program variables in a branch condition evaluated after executing the query may be data-dependent on the returned record values. How do we cover the true branch of Line18?

39 Approach (cont’d)
To cover the path Line04=true, Line14=true, Line18=true, we need to extend the previous auxiliary query.

40-42 Construct auxiliary queries
We extend the WHERE clause of the auxiliary query with the result manipulation constraint:
SELECT C.zipcode FROM customer C, mortgage M WHERE M.year=15 AND C.SSN=M.SSN AND C.income * M.balance >

43-44 Generate program input values
Run auxiliary query:
SELECT C.zipcode FROM customer C, mortgage M WHERE M.year=15 AND C.SSN=M.SSN AND C.income * M.balance >
→ fzip=28223 → zip=28222
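A minimal sketch of the extended auxiliary query follows. The extra predicate on the slide is truncated in this transcript, so the sketch substitutes a hypothetical one, `C.income - 1.5 * M.balance > 100000`, chosen to match the threshold in “if (diff>100000)”. The income of record 002 (150000) is likewise an assumed value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (SSN TEXT, income INTEGER, zipcode INTEGER);
CREATE TABLE mortgage (SSN TEXT, year INTEGER, balance INTEGER);
INSERT INTO customer VALUES ('001', 50000, 27695);
INSERT INTO customer VALUES ('002', 150000, 28223);  -- income is assumed
INSERT INTO mortgage VALUES ('001', 15, 20000);
INSERT INTO mortgage VALUES ('002', 15, 30000);
""")

# The previous auxiliary query, extended with a result manipulation
# constraint rewritten over the columns (hypothetical predicate).
extended_auxiliary_query = """
SELECT C.zipcode FROM customer C, mortgage M
WHERE M.year = 15 AND C.SSN = M.SSN
  AND C.income - 1.5 * M.balance > 100000
"""
fzips = [row[0] for row in conn.execute(extended_auxiliary_query)]
zips = [fzip - 1 for fzip in fzips]   # invert fzip = zip + 1
print(fzips, zips)
```

Under these assumptions only the record with zipcode 28223 satisfies the added predicate, so the derived input is zip=28222, in line with the result on the slide.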

45 Other issues (aggregate calculation)
Aggregate calculations involve multiple records.
Extend the auxiliary query with GROUP BY and HAVING clauses.

46 Other issues (aggregate calculation)
SELECT C.zipcode, sum(M.balance) FROM customer C, mortgage M WHERE M.year=15 AND C.SSN=M.SSN AND C.income * M.balance > GROUP BY C.zipcode HAVING sum(M.balance) >

47 Other issues (cardinality constraints)
SELECT C.zipcode FROM customer C, mortgage M WHERE M.year=15 AND C.SSN=M.SSN AND C.income * M.balance > GROUP BY C.zipcode HAVING COUNT(*) >= 3
Use a special DSE technique for dealing with input-dependent loops:
P. Godefroid and D. Luchaup. Automatic partial loop summarization in dynamic test generation. In ISSTA.
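The cardinality case can be illustrated with a small sketch: the auxiliary query groups the joined records by zipcode and keeps only zipcodes with at least three matching records. All rows are made up for illustration, and the truncated predicate on income and balance is left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (SSN TEXT, zipcode INTEGER);
CREATE TABLE mortgage (SSN TEXT, year INTEGER, balance INTEGER);
-- Hypothetical rows: three 15-year mortgages in 28223, one in 27695
INSERT INTO customer VALUES
  ('001', 27695), ('002', 28223), ('003', 28223), ('004', 28223);
INSERT INTO mortgage VALUES
  ('001', 15, 20000), ('002', 15, 30000),
  ('003', 15, 10000), ('004', 15, 40000);
""")

cardinality_query = """
SELECT C.zipcode FROM customer C, mortgage M
WHERE M.year = 15 AND C.SSN = M.SSN
GROUP BY C.zipcode
HAVING COUNT(*) >= 3
"""
fzips = [row[0] for row in conn.execute(cardinality_query)]
zips = [fzip - 1 for fzip in fzips]   # invert fzip = zip + 1
print(fzips, zips)
```

An input derived this way makes the program's query return at least three records, so the loop over the result set runs at least three times. An aggregate condition such as a bound on sum(M.balance) could be added to the HAVING clause in the same manner.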

48 Outline: Background, Approach, Evaluation, Conclusion and future work

49 Research questions
RQ1 (Effectiveness): What is the percentage increase in code coverage by the program inputs generated by Pex with our approach’s assistance?
RQ2 (Cost): What is the cost of our approach’s assistance?

50 Evaluation subjects
Two open source database applications:
RiskIt: 4.3K LOC; database: 13 tables, 57 attributes, and >1.2 million records; 17 DB-interacting methods selected for testing
UnixUsage: 2.8K LOC; database: 8 tables, 31 attributes, and >0.25 million records; 28 DB-interacting methods selected for testing

51 Evaluation setup
Measurement for test generation:
effectiveness: code coverage
cost: number of runs/paths, execution time
Procedure:
run Pex without our approach’s assistance
apply our algorithms to generate additional test inputs

52-54 Evaluation results: RiskIt
Higher code coverage
Low additional cost
Pex (only) timeout: 120 seconds. Even given longer time, no new coverage was observed for Pex (only).

55 Evaluation results: UnixUsage

56 Summary of evaluation results
RQ1: Effectiveness
RiskIt: 26% higher block coverage over Pex only
UnixUsage: 35% higher block coverage over Pex only
RQ2: Cost
RiskIt: #runs/paths: 131 more over 1135 (Pex); execution time: 517 secs more over 1781 (Pex)
UnixUsage: #runs/paths: 93 more over 1197 (Pex); execution time: 580 secs more over 1718 (Pex)
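To put the RQ2 numbers in relative terms, the additional cost can be expressed as a percentage of the Pex-only baseline. The slide gives only the absolute figures; the percentages below are simple arithmetic on them.

```python
# (additional, Pex-only baseline) pairs taken from the summary slide
results = {
    "RiskIt": {"runs/paths": (131, 1135), "seconds": (517, 1781)},
    "UnixUsage": {"runs/paths": (93, 1197), "seconds": (580, 1718)},
}

# Relative overhead in percent, rounded to one decimal place
overhead = {
    app: {
        metric: round(100 * extra / baseline, 1)
        for metric, (extra, baseline) in metrics.items()
    }
    for app, metrics in results.items()
}

for app, metrics in overhead.items():
    print(app, metrics)
```

This gives roughly 11.5% more runs and 29.0% more time for RiskIt, and roughly 7.8% more runs and 33.8% more time for UnixUsage.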

57 Outline: Background, Approach, Evaluation, Conclusion

58 Conclusion
A new approach that formulates auxiliary queries to bridge the gap between program constraints and DB constraints.
The auxiliary queries act like a “constraint solver” for program constraints + DB constraints.
Empirical evaluations on 2 open source DB applications show that our approach can assist DSE in generating program inputs effectively, achieving higher code coverage with low additional cost.

59 Future Work
Construct auxiliary queries directly from embedded complex queries (e.g., nested queries), rather than from their transformed normal forms.
Handle complex program contexts such as multiple queries.

60 Acknowledgment
This work was supported in part by U.S. National Science Foundation under CCF for Kai Pan and Xintao Wu, and under CCF for Tao Xie.
Thank you! Questions?

61 Related Work
All previous related work addresses a different problem: constructing both program inputs and database states (from scratch).
M. Emmi, R. Majumdar, and K. Sen. Dynamic test input generation for database applications. In ISSTA.
K. Taneja, Y. Zhang, and T. Xie. MODA: Automated test generation for database applications via mock objects. In ASE.

