Presentation is loading. Please wait.

Presentation is loading. Please wait.

@andy_pavlo FAS TER Making Fast Databases. Fast Cheap +

Similar presentations


Presentation on theme: "@andy_pavlo FAS TER Making Fast Databases. Fast Cheap +"— Presentation transcript:

1 @andy_pavlo FAS TER Making Fast Databases

2

3 Fast Cheap +

4 OLTP Through the Looking Glass, and What We Found There SIGMOD 2008 CPU Cycles TPC-C NewOrder 8.1 % 21.1 % 18.7 % 10.2 % 29.6 % 12.3 % Legacy Systems

5 FastRepetitiveSmall OLTP Transactions

6 Main Memory Parallel Shared-Nothing Transaction Processing H-Store: A High-Performance, Distributed Main Memory Transaction Processing System VLDB vol. 1, issue 2, 2008

7 Client Application Database Cluster Procedure Name Input Parameters Procedure Name Input Parameters Stored Procedure Execution Database Cluster Transaction Result Transaction Result

8

9

10 TPC-C NewOrder

11 Database Cluster Client Application Two-Phase Commit Prepare Two-Phase Commit Prepare Two-Phase Commit Finish Two-Phase Commit Finish Transaction Result Transaction Result

12 Skew-Aware Automatic Database Partitioning in Shared-Nothing, Parallel OLTP Systems SIGMOD 2012 Partition database to reduce the number of distributed txns. Optimization #1:

13 o_ido_c_ido_w_i d … 7870310045- 7870410023- 7870510067- 7870610056- 7870710056- 78708100312- c_idc_w_idc_last… 10015RZA- 10023GZA- 100312Raekwo n - 10045Deck- 10056Killah- 10067ODB- CUSTOMERORDERS CUSTO MER ORDER S CUSTO MER ORDER S CUSTO MER ORDER S ITEM i_idi_namei_price… 603514XXX23.99- 267923XXX19.99- 475386XXX14.99- 578945XXX9.98- 476348XXX103.49- 784285XXX69.99- ITEMITEMITEM CUSTOMER c_idc_w_idc_last… 10015RZA- 10023GZA- 100312Raekwo n - 10045Deck- 10056Killah- 10067ODB-

14 CUSTO MER ORDER S CUSTO MER ORDER S CUSTO MER ORDER S ITEMITEMITEM Client Application NewOrder(5, “Method Man”, 1234)

15 … Large-Neighorhood Search Algorithm Sche ma Workl oad -------------- ------- D DL SELECT * FROM WAREHOUSE WHERE W_ID = 10; INSERT INTO ORDERS (O_W_ID, O_D_ID, O_C_ID) VALUES (10, 9, 12345); ⋮ SELECT * FROM WAREHOUSE WHERE W_ID = 10; INSERT INTO ORDERS (O_W_ID, O_D_ID, O_C_ID) VALUES (10, 9, 12345); ⋮ SELECT * FROM WAREHOUSE WHERE W_ID = 10; INSERT INTO ORDERS (O_W_ID, O_D_ID, O_C_ID) VALUES (10, 9, 12345); ⋮ SELECT * FROM WAREHOUSE WHERE W_ID = 10; INSERT INTO ORDERS (O_W_ID, O_D_ID, O_C_ID) VALUES (10, 9, 12345); ⋮ SELECT * FROM WAREHOUSE WHERE W_ID = 10; SELECT * FROM DISTRICT D_W_ID = 10 AND D_ID =9; INSERT INTO ORDERS (O_W_ID, O_D_ID, O_C_ID) VALUES (10, 9, 12345); ⋮ SELECT * FROM WAREHOUSE WHERE W_ID = 10; SELECT * FROM DISTRICT D_W_ID = 10 AND D_ID =9; INSERT INTO ORDERS (O_W_ID, O_D_ID, O_C_ID) VALUES (10, 9, 12345); ⋮ SELECT * FROM WAREHOUSE WHERE W_ID = 10; SELECT * FROM DISTRICT WHERE D_W_ID = 10 AND D_ID =9; INSERT INTO ORDERS (O_W_ID, O_D_ID, O_C_ID,…) VALUES (10, 9, 12345,…); ⋮ SELECT * FROM WAREHOUSE WHERE W_ID = 10; SELECT * FROM DISTRICT WHERE D_W_ID = 10 AND D_ID =9; INSERT INTO ORDERS (O_W_ID, O_D_ID, O_C_ID,…) VALUES (10, 9, 12345,…); ⋮ NewOrde r DDL CUSTOMER ITEM ORDERS

16 Workl oad -------------- ------- Sche ma D DL Initial Design Relaxation Local Search Restart Large- Neighborhood Search

17 Distributed Transactions Workload Skew Factor Cost Model +

18 TATPTPC-CTPC-C Skewed (txn/s) +88%+16%+183% HorticultureState-of-the-Art Throughput

19 TATP SEATS TPC-CTPC-C Skewed AuctionMark TPC-E Search Times % Single-Partitioned Transactions

20

21 Database Cluster Client Application Undo Log

22 On Predictive Modeling for Optimizing Transaction Execution in Parallel OLTP Systems VLDB, vol 5. issue 2, October 2011 Predict what txns will do before they execute. Optimization #2:

23 Database Cluster Client Application » Partitions Touched? » Undo Log? » Done with Partitions?

24

25

26 w_id=0 i_w_ids=[0,1] i_ids=[1001,1002] w_id=0 i_w_ids=[0,1] i_ids=[1001,1002] Input Parameters: Current State: SELECT * FROM WAREHOUSE WHERE W_ID = ? GetWarehouse:

27 Confidence Coefficient:0.96 Best Partition:0 Partitions Accessed:{ 0 } Use Undo Logging:Yes Transaction Estimate: Estimated Execution Path w_id=0 i_w_ids=[0,1] i_ids=[1001,1002] w_id=0 i_w_ids=[0,1] i_ids=[1001,1002] Input Parameters:

28 SELECT S_QTY FROM STOCK WHERE S_W_ID = ? AND S_I_ID = ?; Current State: X w_id=0 i_w_ids=[0,1] i_ids=[1001,1002] w_id=0 i_w_ids=[0,1] i_ids=[1001,1002] CheckStock: Input Parameters: INSERT INTO ORDERS (o_id, o_w_id) VALUES (?, ?); INSERT INTO ORDERS (o_id, o_w_id) VALUES (?, ?); InsertOrder:

29 w_id=0 i_w_ids=[0,1] i_ids=[1001,1002] w_id=0 i_w_ids=[0,1] i_ids=[1001,1002]

30 TATPTPC-CAuctionMark (txn/s) +57%+126%+117% HoudiniAssume Single-Partitioned Throughput

31 TATPTPC-CAuctionMark Prediction Overhead

32

33 Future Work : Reduce distributed txn overhead through creative scheduling. Conclusion: Achieving fast performance is more than just using only RAM.

34 h- store hstore.cs.bro wn.edu github.com/apavlo/h-store

35 Help is Available Graduate Student Abuse Hotline Available 24/7 Collect Calls Accepted +1-203-432-2003

36 Help is Available Graduate Student Abuse Hotline Available 24/7 Collect Calls Accepted +1-212-939-7064

37 Database Node … Txn Coordinator Execution Engine


Download ppt "@andy_pavlo FAS TER Making Fast Databases. Fast Cheap +"

Similar presentations


Ads by Google