A Fault-Tolerant Environment for Large-Scale Query Processing Mehmet Can Kurt Gagan Agrawal Department of Computer Science and Engineering The Ohio State.

A Fault-Tolerant Environment for Large-Scale Query Processing Mehmet Can Kurt Gagan Agrawal Department of Computer Science and Engineering The Ohio State University HiPC’12 Pune, India 1

Motivation “big data” problem – Walmart handles 1 million customer transaction every hour, estimated data volume is 2.5 Petabytes. – Facebook handles more than 40 billion images – LSST generates 6 petabytes every year massive parallelism is the key HiPC’12 Pune, India 2

Motivation Mean-Time To Failure (MTTF) decreases Typical first year for a new cluster* – 1000 individual machine failures – 1 PDU failure (~500-1000 machines suddenly disappear) – 20 rack failures (40-80 machines disappear, 1-6 hours to get back) HiPC’12 Pune, India 3 * taken from Jeff Dean’s talk in Google IO (http://perspectives.mvdirona.com/2008/06/11/JeffDeanOnGoogleInfrastructure.aspx)

Our Work supporting fault-tolerant query processing and data analysis for a massive scientific dataset focusing on two specific query types: 1.Range Queries on Spatial datasets 2.Aggregation Queries on Point datasets supported failure types: single-machine failures and rack failures HiPC’12 Pune, India 4 * rack: a number of machines connected to the same hardware (network switch, …)

Our Work Primary Goals 1)high efficiency of execution when there are no failures (indexing if applicable, ensuring load-balance) 2)handling failures efficiently up to a certain number of nodes (low-overhead fault tolerance through data replication) 3)a modest slowdown in processing times when recovered from a failure (preserving load-balance) HiPC’12 Pune, India 5

Range Queries on Spatial Data nature of the task: – each data object is a rectangle in 2D space – each query is defined with a rectangle – return intersecting data rectangles computational model: – master/worker model – master serves as coordinator – each worker responsible for a portion of data HiPC’12 Pune, India 6 Y X query data worker query master

Range Queries on Spatial Data data organization: – chunk is the smallest data unit – create chunks by grouping data objects together – assign chunks to workers in round-robin fashion HiPC’12 Pune, India 7 Y X chunk 1 chunk 2 chunk 3 worker chunk 4 worker * actual number of chunks depends on chunk size parameter.

Range Queries on Spatial Data ensuring load-balance: – enumerate & sort data objects according to Hilbert Space-Filling Curve, then pack sorted data objects into chunks spatial index support: – Hilbert R-Tree deployed on master node – leaf nodes correspond to data chunks – initial filtering at master, tells workers which chunks to look HiPC’12 Pune, India 8 1 2 3 4 o1o1 o4o4 o3o3 o8o8 o6o6 o5o5 o2o2 o7o7 sorted objects: o 1, o 3, o 8, o 6, o 2, o 7, o 4, o 5 chunk 1chunk 2chunk 3chunk 4

Range Queries on Spatial Data Fault-Tolerance Support – Sub-chunk Replication: step1: divide data chunks into k sub-chunks step2: distribute sub-chunks in round-robin fashion HiPC’12 Pune, India 9 Worker 1Worker 2 Worker 3Worker 4 chunk1chunk2 chunk3chunk4 chunk1,1chunk1,2 step1 chunk2,1chunk2,2 step1 chunk3,1chunk3,2 step1 chunk4,1chunk4,2 step1 * rack-failure: same approach, but distribute sub-chunks to nodes in different rack k = 2

Range Queries on Spatial Data Fault-Tolerance Support - Bookkeeping: – add a sub-leaf level to the bottom of Hilbert R-Tree – Hilbert R-Tree both as a filtering structure and failure management tool HiPC’12 Pune, India 10

Aggregation Queries on Point Data nature of the task: – each data object is a point in 2D space – each query is defined with a dimension (X or Y), and aggregation function (SUM, AVG, …) computational model: – master/worker model – divide space into M partitions – no indexing support – standard 2-phase algorithm: local and global aggregation HiPC’12 Pune, India 11 worker 1 worker 2 worker 3 worker 4 X Y partial result in worker 2 M = 4

Aggregation Queries on Point Data reducing communication volume – initial partitioning scheme has a direct impact – have insights about data and query workload: P(X) and P(Y) = probability of aggregation along X and Y-axis |r x | and |r y | = range of X and Y coordinates expected communication volume V comm defined as: Goal: choose a partitioning scheme (c v and c h ) that minimizes V comm HiPC’12 Pune, India 12

Aggregation Queries on Point Data Fault-Tolerance Support – Sub-partition Replication: step1: divide each partition evenly into M’ sub-partitions step2: send each of M’ sub-partitions to a different worker node Important questions: 1)how many sub-partitions (M’)? 2)how to divide a partition (cv’ and ch’) ? 3)where to send each sub-partition? (random vs. rule-based) HiPC’12 Pune, India 13 Y X M’ = 4 ch’ = 2 cv’ = 2 a better distribution reduces comm. overhead rule-based selection: assign to nodes which share the same coordinate- range

Experiments local cluster with nodes – two quad-core 2.53 GHz Xeon(R) processors with 12 GB RAM entire system implemented in C by using MPI-library range queries: – comparison with chunk replication scheme – 32 GB spatial data – 1000 queries are run, and aggregate time is reported aggregation queries: – comparison with partition replication scheme – 24 GB point data 64 nodes used, unless noted otherwise HiPC’12 Pune, India 14

Experiments: Range Queries Optimal Chunk Size SelectionScalability HiPC’12 Pune, India 15 - Execution Times with No Replication and No Failures (chunk size = 10000)

Experiments: Range Queries Single-Machine FailureRack Failure HiPC’12 Pune, India 16 -Execution Times under Failure Scenarios (64 workers in total) -k is the number of sub-chunks for a chunk

Experiments: Aggregation Queries Effect of Partitioning Scheme On Normal Execution Single-Machine Failure HiPC’12 Pune, India 17 P(X) = P(Y) = 0.5, |r x | = |r y | = 10000 P(X) = P(Y) = 0.5, |r x | = |r y | = 100000

Conclusion a fault-tolerant environment that can process – range queries on spatial data and aggregation queries on point data – but, proposed approaches can be extended for other type of queries and analysis tasks high efficiency under normal execution sub-chunk and sub-partition replications – preserve load-balance in presence of failures, and hence – outperform traditional replication schemes HiPC’12 Pune, India 18

Thank you for listening … Questions HiPC’12 Pune, India 19

A Fault-Tolerant Environment for Large-Scale Query Processing Mehmet Can Kurt Gagan Agrawal Department of Computer Science and Engineering The Ohio State.

Similar presentations

Presentation on theme: "A Fault-Tolerant Environment for Large-Scale Query Processing Mehmet Can Kurt Gagan Agrawal Department of Computer Science and Engineering The Ohio State."— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

A Fault-Tolerant Environment for Large-Scale Query Processing Mehmet Can Kurt Gagan Agrawal Department of Computer Science and Engineering The Ohio State.

Similar presentations

Presentation on theme: "A Fault-Tolerant Environment for Large-Scale Query Processing Mehmet Can Kurt Gagan Agrawal Department of Computer Science and Engineering The Ohio State."— Presentation transcript:

Similar presentations

About project

Feedback