Parallel Execution of Test Runs for Database Application Systems Donald Kossmann: ETH Zurich, i-TV-T AG Florian Haftmann: i-TV-T AG Eric Lo: ETH Zurich.


1 Parallel Execution of Test Runs for Database Application Systems Donald Kossmann: ETH Zurich, i-TV-T AG Florian Haftmann: i-TV-T AG Eric Lo: ETH Zurich

2 2 Some Facts
- Microsoft spends 50% of its development cost on testing
- SAP product cycle = 18 months
  - 6 months to execute tests
- Testing is the most expensive phase of the software development cycle

3 3 Observation
- The more test runs, the better
- However, it takes more time!
- Goal: Optimize Testing Time

4 4 Definition: Test Run T_i
- A sequence of requests
- Example: test run Login (2 requests)
Req  Action            Value      Expected Result
1    Fill-in Login     Eric
     Fill-in Password  ********
2    Click             Sign-in

5 5 Expected Result
Req  Action            Value      Expected Result
1    Fill-in ID        Eric
     Fill-in Password  ********
2    Click             Sign-in

6 6 More Definitions
- Failed Test Run: at least one request does not return the expected result
- Test Database D: the state of the Application + Database at the beginning of each test
- Database Reset R: bring the database back to D
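The definitions above can be sketched as a small data model. This is only an illustration in Python, not the authors' implementation: the names `Request`, `DBTestRun`, `run_fails` and the `execute` callback are assumptions introduced here, and the expected result "ok" is a placeholder because the slides do not show the actual expected values.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Request:
    action: str    # e.g. "Fill-in Login" or "Click"
    value: str     # e.g. "Eric" or "Sign-in"
    expected: str  # expected result of this request (placeholder values below)


@dataclass
class DBTestRun:
    name: str
    requests: List[Request]  # a test run is a sequence of requests


def run_fails(test_run: DBTestRun, execute: Callable[[Request], str]) -> bool:
    """A test run fails if at least one request does not return the expected result."""
    return any(execute(req) != req.expected for req in test_run.requests)


# Hypothetical encoding of the Login test run (2 requests) from the slides.
login = DBTestRun("Login", [
    Request("Fill-in Login / Fill-in Password", "Eric / ********", "ok"),
    Request("Click", "Sign-in", "ok"),
])
```

Passing `execute` in as a callback keeps the sketch independent of how requests are actually sent to the application.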

7 7 A Test Run Fails When:
1. The application has a real bug
2. Or the test database is in the wrong state due to the execution of earlier test runs
Carry out resets to find the real bugs

8 8 Resetting the Test Database?
Application: P.O. Insertion, Count P.O., …
Database: PurchaseOrder = {P1}

9 9 Resetting the Test Database?
T_A: Insert Purchase Order P2
Database: PurchaseOrder = {P1}

10 10 Resetting the Test Database?
T_A: Insert Purchase Order P2
Database: PurchaseOrder = {P1, P2}

11 11 Resetting the Test Database?
T_B: Get Total Purchase Order (Expected Result: 1, Actual Result: 1)
Database: PurchaseOrder = {P1}

12 12 Resetting the Test Database?
T_A: Insert Purchase Order P2
T_B: Get Total Purchase Order (Expected Result: 1, Actual Result: 2)
Database: PurchaseOrder = {P1, P2}

13 13 Database Reset is needed!
T_A: Insert Purchase Order P2
Reset DB
T_B: Get Total Purchase Order (Expected Result: 1)
Database: PurchaseOrder = {P1}

14 14 Database Reset
- Resetting a database for a large-scale application takes about 2 minutes!
- Back-of-the-envelope calculation:
  - 10,000 test runs = 10,000 resets x 2 min = 20,000 min, about 2 weeks spent on DB resets for 1 complete test
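The back-of-the-envelope figure can be checked with a few lines of Python. The constant names are chosen here for illustration; the inputs (10,000 test runs, 2 minutes per reset, one reset per test run) are the ones stated on the slide.

```python
TEST_RUNS = 10_000     # one reset per test run in the naive approach
RESET_MINUTES = 2      # approximate cost of one database reset

total_minutes = TEST_RUNS * RESET_MINUTES   # 20,000 minutes
total_days = total_minutes / (60 * 24)      # roughly 13.9 days
total_weeks = total_days / 7                # roughly 2 weeks

print(f"{total_minutes} min = {total_days:.1f} days = {total_weeks:.1f} weeks")
```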

15 15 Reordering Test Runs
T_A: Insert Purchase Order P2
T_B: Get Total Purchase Order (Expected Result: 1, Actual Result: 2)
Database: PurchaseOrder = {P1, P2}

16 16 Reordering Test Runs
T_B: Get Total Purchase Order (Expected Result: 1, Actual Result: 1)
Database: PurchaseOrder = {P1}

17 17 Order Matters!
T_B: Get Total Purchase Order (Expected Result: 1, Actual Result: 1)
T_A: Insert Purchase Order P2
Database: PurchaseOrder = {P1, P2}
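The purchase-order example can be replayed in a toy model to see why the order matters. This is a sketch under stated assumptions: the database is modelled as a Python set, the functions `insert_po`, `count_po` and `resets_needed` are hypothetical names, and a failed test run is handled by resetting and rerunning it once.

```python
def insert_po(db: set) -> bool:
    """T_A: insert purchase order P2; it has no expectation that can fail here."""
    db.add("P2")
    return True


def count_po(db: set) -> bool:
    """T_B: passes only if the total number of purchase orders is 1."""
    return len(db) == 1


def resets_needed(order, initial=frozenset({"P1"})) -> int:
    """Run the test runs in the given order; on failure, reset and rerun once.

    Returns the number of extra resets (the initial reset is not counted).
    """
    db = set(initial)
    resets = 0
    for test in order:
        if not test(db):
            db = set(initial)   # bring the database back to state D
            resets += 1
            if not test(db):
                raise RuntimeError("still failing after a reset: real bug")
    return resets


print(resets_needed([insert_po, count_po]))  # T_A before T_B
print(resets_needed([count_po, insert_po]))  # T_B before T_A
```

Running T_A first leaves {P1, P2} behind, so T_B needs a reset; running T_B first needs none.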

18 18 Our Previous Work (CIDR 2005)
- A test run depends on a correct state of the database
  - Control the database state
- Reduce the number of database resets
- Algorithms to optimize the order of test runs
- No parallelism in testing

19 19 Can we do better if we have > 1 machine?

20 20 Parallel Testing is a Two-dimensional Problem!
1. Fully utilize the available resources: load balancing!
2. As on a single machine, we still have to control the database state: reduce the database resets!

21 21 More about the Problem
- Regression test
  - Later stage of the development cycle
- Minor changes between versions
  - Execute the same set of test runs
- Version 1.1
  - Execute test: T1 T2 T3 T4
- Version 1.2 (bug fixed and/or minor changes)
  - Execute test: T1 T2 T3 T4

22 22 Parallel Testing Shared-Nothing vs. Shared-Database

23 23 Shared-Nothing (SN)
- If I work for IBM, I can install:
  - N applications
  - N databases
  - N machines
- One more machine:
  - More admin. work!
  - More license fees!
- Applications do not SHARE the database
Diagram: Machine 1 ... Machine N, each with its own Application and Database; the test runs (T12, T4, ..., T5, T31, ...) are spread over the machines.

24 24 Shared-Database (SDB)
- If I work for PoorEric.com, I install:
  - N threads (e.g., open N browsers)
  - 1 database
  - 1 machine
- The threads SHARE the database
- Test runs interfere with each other
  - Cannot scale as well as Shared-Nothing
Diagram: Thread 1 ... Thread N run the Application against a single shared Database; the test runs (T12, T4, ..., T5, T31, ...) are spread over the threads.

25 25 Parallel Testing Framework
Diagram: the Scheduler, backed by the Conflicts DB and the execution History, assigns test runs (T1, T2, T5, T6, ...) to Machine/Thread 1 ... Machine/Thread N (M1 ... MN), each running the Application against a Database, and decides whether a Reset is needed.

27 27 Execution Strategies
- Optimistic Execution:
  - Reset the database only when it is a must
  - Example: R T1 T2 T3 T4 R T4
- Optimistic++ Execution:
  - Avoid executing the same test run twice in later tests
  - Example (Wk 1): R T1 T2 T3 T4 R T4 T5
  - Example (Wk 2): R T1 T2 T3 (Next is T4?) R T4 T5
- Slice Reordering Heuristics:
  - Slice: a sequence of test runs without conflicts
  - Example: R T1 T2 T3 T4 R T4 T5
  - Collect slices during each test
- Graph Reordering Heuristics

28 28 Parallel Testing Shared-Nothing (SN)

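29 29 Shared-Nothing
Diagram: the Scheduler, backed by the Conflicts DB, takes test runs from the Test Run Input Queue (T1, T5, T2, T6, ...) and assigns them to Machine 1, Machine 2, ..., each with its own Application and Database; it can trigger a Reset on a machine.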

30-41 Test 1 (animation sequence)
Test Run Input Queue: T1 T5 T2 T6 T3 T7 T8
The Scheduler hands the test runs to two machines, M1 and M2, each starting from a reset (R).
- M2: T6 fails after T5, so M2 is reset and T6 is rerun; the conflict T5 -> T6 is recorded in the Conflicts DB
- M1: T3 fails after T1 T2, so M1 is reset and T3 is rerun; the conflict T1 T2 -> T3 is recorded in the Conflicts DB
- Resulting schedules: M1: R T1 T2 T3 R T3; M2: R T5 T6 R T6 T7 T8

42 42 Shared-Nothing - Slice
3 major principles:
1. The slices in the input queue are ordered by:
   - Reordering the slices on each machine locally
   - Merging the partial orders
2. Execute all test runs of the same slice on the same machine
3. The scheduler makes sure that conflicting slices are executed on different machines as much as possible
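Principles 2 and 3 can be sketched as a simple selection rule for an idle machine. This is an assumption-laden illustration, not the scheduler from the paper: `pick_slice`, the string slice ids, the pair encoding of conflicts and the order in which machines become idle are all hypothetical.

```python
from collections import deque


def pick_slice(queue, executed_on_machine, conflicts):
    """Pick the next slice for an idle machine.

    queue: deque of slice ids in priority order
    executed_on_machine: slice ids run on this machine since its last reset
    conflicts: set of (earlier, later) slice-id pairs known to conflict
    """
    for s in list(queue):
        if not any((done, s) in conflicts for done in executed_on_machine):
            queue.remove(s)   # the whole slice goes to this one machine
            return s
    # Every remaining slice conflicts here; take the head (a reset may follow).
    return queue.popleft() if queue else None


# Illustrative input loosely based on the slides' later "Test 10" example.
queue = deque(["T3", "T6 T7 T8", "T1 T2", "T5"])
conflicts = {("T5", "T6 T7 T8"), ("T1 T2", "T3"), ("T3", "T1 T2")}
machines = {"M1": [], "M2": []}
for m in ["M1", "M2", "M1", "M2"]:   # hypothetical order of idle machines
    s = pick_slice(queue, set(machines[m]), conflicts)
    if s is not None:
        machines[m].append(s)
```

With these inputs the slice "T1 T2" is kept off M1, where "T3" has already run, and lands on M2 instead.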

43 43 Collect Slices
After Test 1 (M1: R T1 T2 T3 R T3; M2: R T5 T6 R T6 T7 T8), the conflict-free sequences on each machine are collected as slices, together with the recorded conflicts T5 -> T6 and T1 T2 -> T3.

44 44 Reordering Slices
Diagram: the slices of each machine are reordered locally, giving "Local Order M1" and "Local Order M2".

45 45 Merge Partial Order
Diagram: the two local orders are merged into the Test Run Input Queue for the next test.

47-54 Test 10 (animation sequence)
Test Run Input Queue (slices): [T3], [T6 T7 T8], [T1 T2], [T5]
Conflicts DB: T5 -> T6, T1 T2 -> T3, T3 -> T1
- Both machines start from a reset: M1: R, M2: R
- M1 takes [T3] (M1: R T3); [T6 T7 T8] is taken next
- Before assigning [T1 T2] and [T5], the scheduler checks the Conflicts DB ("Conflict?")
- At the end of the animation the input queue is empty

55 55 Parallel Testing Shared-Database (SDB)

56 56 Shared-Database
Diagram: the Scheduler, backed by the Conflicts DB, feeds test runs (T1, T5, T2, T6, ...) from the Test Run Input Queue to Thread 1, Thread 2, ...; all threads run the Application against one shared Database, which the scheduler can reset.

57 57 Shared-Database, Slice
- Similar to Shared-Nothing
- Different definition of a slice
- Different scheduling decisions

58 58 Performance Experiments
- Simulation:
  - 10,000 test runs (0 min to 3 min)
  - 10,000 (low) to 5M (high) conflicts
  - Uniform + Zipf distribution
  - SN: 1 to 50 machines
  - SDB: 1 to 10 threads
- Real data: 61 test runs
- Reporting the average running time and resets of the last 10 tests (30 tests in total)

59 59 Shared-DB (Real Data)
Time unit: minute
Approach        1 thread        5 threads       10 threads
                Time   Reset    Time   Reset    Time   Reset
Optimistic++    41     7        22     6.6      16     5.8
Graph(MWD)      37     3.5      19     4.2      13     4.2
Slice           31     3        18     3.8      12     4.2
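The scale-up implied by the real-data table can be computed directly from the Time columns. The snippet below is only a reading aid; the dictionary layout and the name `speedup` are choices made here, and the numbers are copied from the table (minutes).

```python
# Running time in minutes per approach, keyed by number of threads.
sdb_time = {
    "Optimistic++": {1: 41, 5: 22, 10: 16},
    "Graph(MWD)": {1: 37, 5: 19, 10: 13},
    "Slice": {1: 31, 5: 18, 10: 12},
}

# Speed-up relative to the same approach running on 1 thread.
speedup = {
    name: {n: times[1] / times[n] for n in times}
    for name, times in sdb_time.items()
}

for name, s in speedup.items():
    print(f"{name}: {s[5]:.2f}x with 5 threads, {s[10]:.2f}x with 10 threads")
```

Every approach gets faster with more threads, but the speed-up stays well below the thread count, which matches the slides' remark that Shared-Database cannot scale as well as Shared-Nothing.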

61 61 Experiment Summary
- Shared-Nothing (SN)
  - Linear scale-up, sometimes super-linear
- Shared-Database (SDB)
  - Scales up to 10 threads
- Heuristics:
  - Slice is the winner
- How about other distributions (e.g., Zipf)?
  - Similar results

62 62 Conclusions and Future Work
- Parallel execution of test runs?
  - It SCALES!
- Studied a dynamic scheduling approach for the SN and SDB architectures:
  - Control the database state: minimize DB resets
  - and load balancing
- How to generate test runs and test data for database application programs?
- More in the paper

63 63 Thank You Main contact: eric.lo@inf.ethz.ch

64 64 Parallel Testing Framework
Diagram: test runs (T4, T31, T5, T12, ...) wait in the Test Run Input Queue; the Scheduler, backed by the Conflicts DB and the execution History, assigns them to Machine/Thread 1 ... Machine/Thread N (M1 ... MN), each running the Application against a Database, and decides whether a Reset is needed.

65 65 Example: Shared-Nothing, Slice
Diagram: Scheduler with Conflicts DB and Test Run Input Queue (T1, T5, T2, T6, ...); Machine 1 and Machine 2, each with its own Application and Database.
History entry (generic form): Test 1: M1: R T1 T4 ... R ...; M2: R T2 T3 T5 R ...
Test 1: M1: R T1 T2 T3 R T3; M2: R T5 T6 R T6 T7 T8

66 66 Shared-Database, Slice
Diagram: Scheduler with Conflicts DB and Test Run Input Queue (T1, T5, T2, T6, ...); Thread 1 and Thread 2 run the Application against one shared Database.
Test 1: Th1: T1 T2 T2 T3; Th2: T5 T6 T7 T8 T8; three resets (R) of the shared database

67 67 Shared-Database, Slice - Test 1
Same diagram and schedule as on slide 66.

68 68 SDB - Subsequent Tests
Test 1: Th1: T1 T2 T2 T3; Th2: T5 T6 T7 T8 T8; three resets (R) of the shared database
Reordering produces the Test Run Input Queue for Test N from the test runs T2, T7, T3, T8, T1, T5, T6.

69 69 Additional Issues - SDB
How to do a database reset when a test run fails?
- Deferred: the database reset is deferred and the failed test run is re-scheduled at the end
- Eager: abort all concurrent test runs and reset immediately
- Lazy*: do not accept new test runs, let the active test runs finish, then reset
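The three reset strategies can be written down as a small decision function. This is a sketch only: `ResetPolicy`, `plan_reset` and the returned plan dictionary are hypothetical, and rerunning the aborted test runs under Eager is an assumption the slide does not state.

```python
from enum import Enum


class ResetPolicy(Enum):
    DEFERRED = "deferred"
    EAGER = "eager"
    LAZY = "lazy"


def plan_reset(policy, failed_run, active_runs):
    """Describe the scheduler's reaction when failed_run fails (hypothetical format)."""
    if policy is ResetPolicy.DEFERRED:
        # Keep scheduling; reset later and rerun the failed test run at the end.
        return {"abort": [], "accept_new": True, "reset": "later",
                "rerun": [failed_run]}
    if policy is ResetPolicy.EAGER:
        # Abort all concurrent test runs and reset immediately.
        aborted = [r for r in active_runs if r != failed_run]
        # Assumption: aborted runs are executed again after the reset.
        return {"abort": aborted, "accept_new": False, "reset": "now",
                "rerun": [failed_run] + aborted}
    if policy is ResetPolicy.LAZY:
        # Accept no new test runs, let the active ones finish, then reset.
        return {"abort": [], "accept_new": False, "reset": "after_active_finish",
                "rerun": [failed_run]}
    raise ValueError(f"unknown policy: {policy!r}")
```

For example, `plan_reset(ResetPolicy.EAGER, "T2", ["T2", "T7"])` aborts T7 and resets at once, while the Lazy variant lets T7 finish first.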

70 70 Shared-Nothing Performance
- Achieve linear scale-up?
  - Yes
- The best among the three:
  - Slice
- How about low conflict?
  - Similar results
- How about other distributions (e.g., Zipf)?
  - Similar results

71 71 Shared-Database Performance
- Scale-up when increasing the number of threads?
  - Yes, up to 10 threads
- If the number of conflicts is high, more than 10 test threads might hurt performance
- The best among the three:
  - Slice

72 72 SN Simulation (High Conflict)
Time unit: hour
Approach        1 machine       5 machines      10 machines     50 machines
                Time   Reset    Time   Reset    Time   Reset    Time   Reset
Optimistic++    358    1788     72     1787     36     1775     6.8    1753
Slice           306    867      64     1098     32     1038     6.4    1048
Graph(MWD)      359    1792     71     1784     36     1780     7.6    1767
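The "linear, sometimes super-linear" scale-up can be checked against the Time columns of this table. The snippet is a reading aid with names chosen here (`sn_time`, `speedup`, `super_linear`); each speed-up is measured against the same approach on 1 machine.

```python
# Running time in hours per approach, keyed by number of machines.
sn_time = {
    "Optimistic++": {1: 358, 5: 72, 10: 36, 50: 6.8},
    "Slice": {1: 306, 5: 64, 10: 32, 50: 6.4},
    "Graph(MWD)": {1: 359, 5: 71, 10: 36, 50: 7.6},
}


def speedup(times, n):
    """Speed-up on n machines relative to 1 machine for the same approach."""
    return times[1] / times[n]


# Machine counts at which the speed-up exceeds the number of machines.
super_linear = {
    name: [n for n in times if n > 1 and speedup(times, n) > n]
    for name, times in sn_time.items()
}

for name, times in sn_time.items():
    row = ", ".join(f"{n} machines: {speedup(times, n):.1f}x" for n in times if n > 1)
    print(f"{name}: {row}")
```

In these numbers the speed-up stays close to the machine count throughout and exceeds it in two cells (Optimistic++ on 50 machines, Graph(MWD) on 5 machines).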

74 74 SDB Simulation
Time unit: hour
Approach        1 thread        5 threads       10 threads      50 threads
                Time   Reset    Time   Reset    Time   Reset    Time   Reset
Optimistic++    358    1788     160    1385     157    1231     258    1425
Slice           306    867      120    793      112    796      259    1422
MWD             359    1792     164    1396     156    1251     204    1067

75 75 Optimistic
Let the test runs execute until a DB reset is really needed!
- Optimistic: R T1 T2 T3 T4
- If a test run T reports a failure:
  - Reset the database and then rerun T
  - Then, if T still reports a failure: a real bug!
- Example: Optimistic: R T1 T2 T3 T4 R T4
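The Optimistic strategy can be sketched as a loop that resets only after a failure. This is an illustration under assumptions: `ToyDB` is a hypothetical stand-in in which a test run fails when a conflicting test run has executed since the last reset, and `run_optimistic` is a name chosen here.

```python
class ToyDB:
    """Hypothetical database model driven by (earlier, later) conflict pairs."""

    def __init__(self, breaks):
        self.breaks = breaks   # set of (earlier, later) test-run pairs
        self.ran = []          # test runs executed since the last reset

    def reset(self):
        self.ran = []

    def execute(self, t):
        ok = not any((prev, t) in self.breaks for prev in self.ran)
        self.ran.append(t)
        return ok


def run_optimistic(tests, db):
    """Return (schedule, real_bugs); 'R' in the schedule marks a reset."""
    db.reset()
    schedule, real_bugs = ["R"], []
    for t in tests:
        schedule.append(t)
        if db.execute(t):
            continue
        db.reset()                 # a reset is needed only after a failure
        schedule += ["R", t]
        if not db.execute(t):      # still failing on a fresh database
            real_bugs.append(t)
    return schedule, real_bugs


schedule, bugs = run_optimistic(["T1", "T2", "T3", "T4"], ToyDB({("T3", "T4")}))
print(" ".join(schedule))
```

With the single conflict T3 -> T4 the schedule reproduces the slide's example: T4 fails once, the database is reset, and the rerun passes, so no real bug is reported.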

76 76 Optimistic++
Optimistic++: record all failures (conflicts) so that the same test run is not executed twice again
- Test on Monday: R T1 T2 T3 R T3 … Tn
- Test on Tuesday: R T1 T2 (Next? T3?)
- Test on Tuesday: R T1 T2 R T3 … Tn
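A minimal sketch of Optimistic++ follows, assuming that a conflict is remembered as "the test runs executed since the last reset broke T". That representation, the `ToyDB` stand-in and the name `run_optimistic_pp` are assumptions made for illustration; the slides do not specify how conflicts are stored.

```python
class ToyDB:
    """Hypothetical database model driven by (earlier, later) conflict pairs."""

    def __init__(self, breaks):
        self.breaks = breaks
        self.ran = []

    def reset(self):
        self.ran = []

    def execute(self, t):
        ok = not any((prev, t) in self.breaks for prev in self.ran)
        self.ran.append(t)
        return ok


def run_optimistic_pp(tests, db, known_conflicts):
    """Like Optimistic, but reset *before* a test run that is known to conflict."""
    db.reset()
    schedule, since_reset = ["R"], []
    for t in tests:
        if known_conflicts.get(t, set()) & set(since_reset):
            db.reset()             # avoid executing t twice
            schedule.append("R")
            since_reset = []
        schedule.append(t)
        if not db.execute(t) and since_reset:
            suspects = set(since_reset)
            db.reset()
            schedule += ["R", t]
            since_reset = []
            if db.execute(t):      # passed after reset: it was a conflict
                known_conflicts.setdefault(t, set()).update(suspects)
        since_reset.append(t)
    return schedule


conflicts = {}
tests = ["T1", "T2", "T3", "T4"]
monday = run_optimistic_pp(tests, ToyDB({("T2", "T3")}), conflicts)
tuesday = run_optimistic_pp(tests, ToyDB({("T2", "T3")}), conflicts)
print(" ".join(monday))
print(" ".join(tuesday))
```

On Monday T3 is executed twice (fail, reset, pass); on Tuesday the recorded conflict triggers the reset first, so T3 runs only once.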

77 77 Reordering Heuristics - Slice
- Slice: a sequence of test runs without conflicts
- Collect slices during each test
  - Test Monday = R T1 T2 T3 R T3 T4 T5 R T5
  - Slices = the conflict-free sequences of that test: T1 T2, T3 T4 and T5
- Run the test again?
  - Reorder the slices according to the conflicts collected
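Collecting slices from an executed schedule can be sketched as follows. The sketch assumes the Optimistic convention that a reset in the middle of a schedule always follows a failed test run, which is then rerun right after the reset; `collect_slices` and the tuple encoding of conflicts are names chosen here.

```python
def collect_slices(schedule):
    """Split a schedule such as R T1 T2 T3 R T3 ... into conflict-free slices.

    Returns (slices, conflicts), where each conflict is (slice, failed_run).
    """
    slices, conflicts, current = [], [], []
    for item in schedule:
        if item == "R":
            if current:
                failed = current.pop()   # the run just before a reset failed
                if current:
                    slices.append(current)
                    conflicts.append((tuple(current), failed))
            current = []
        else:
            current.append(item)
    if current:
        slices.append(current)           # the last slice ends the test
    return slices, conflicts


monday = "R T1 T2 T3 R T3 T4 T5 R T5".split()
slices, conflicts = collect_slices(monday)
print(slices)
print(conflicts)
```

For the Monday schedule this yields the slices T1 T2, T3 T4 and T5, plus the conflicts (T1 T2) -> T3 and (T3 T4) -> T5 that a reordering step could then use.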

78 78 Test on Yesterday and Test on Today
Yesterday: M1: R T1 T2 T3 R T3; M2: R T5 T6 R T6 T7 T8
- Reordering (local orders O1, O2) and Merge give the Test Run Input Queue with the slices [T6 T7 T8], [T3], [T1 T2], [T5]
Today: M1: R T3 T1 R T1 T2; M2: R T6 T7 T8 T5 R T5
- Reordering and Merge are applied again to build the next Test Run Input Queue

79 79 False Positive
- Case 1: Buggy Application + Tx fails: consistent DB state
- Case 2: Buggy Application + Tx succeeds: inconsistent DB state
  - The inconsistent DB helps the test run by coincidence!
- This is a tradeoff between speed and nitpicking accuracy

