Scalable Data Partitioning Techniques for Parallel Sliding Window Processing over Data Streams DMSN 2011 Cagri Balkesen & Nesime Tatbul.

Talk Outline
- Intro & Motivation
- Stream Partitioning Techniques
  - Basic window partitioning
  - Batch partitioning
  - Pane-based partitioning
- Ring-based Query Evaluation
- Experimental Evaluation
- Conclusions & Future Work

Intro & Motivation
[Figure: DSMS]

Architectural Overview
[Figure: input stream → Split stage (split node) → Query nodes → Merge stage (merge node) → output stream; example QoS: latency < 5 seconds, disorder < 3 tuples]
- Classical Split-Merge pattern from parallel DBs
- Adjustable parallelism level, d
- QoS on max latency & order

Related Work: How to Partition?
- Content-sensitive
  - FluX: Fault-tolerant, load-balancing Exchange [1,2]
  - Uses group-by values from the query to partition
  - Needs explicit load balancing due to skewed data
- Content-insensitive
  - GDSM: Window-based parallelization (fixed-size tumbling windows) [3]
  - Win-Distribute: Partition at window boundaries
  - Win-Split: Partition each window into equi-length subwindows
- The Problem:
  - How to handle sliding windows?
  - How to handle queries without group-by or with only a few groups?
[1] Flux: An Adaptive Partitioning Operator for Continuous Query Systems, ICDE '03
[2] Highly-Available, Fault-Tolerant, Parallel Dataflows, SIGMOD '04
[3] Customizable Parallel Execution of Scientific Stream Queries, VLDB '05

Stream Partitioning Techniques

Approach 1: Basic Sliding Window Partitioning
- Chunk the stream into independently processable units
- Window-aware splitting of the stream
- Each window has an id & tuples are marked (first-winid, last-winid, is-win-closer)
- Tuples are replicated for each of their windows
[Figure: tuples t1..t9 on a time axis covered by windows W1..W4; Split sends windows to Node1, Node2, Node3]
w = 6 units, s = 2 units, Replication = 6/2 = 3

Approach 1: Basic Sliding Window Partitioning
The problem with basic sliding window partitioning:
- Tuples belong to many windows, depending on the slide
- Excessive replication of tuples for each window
- Increase in the output data volume of the split
[Figure: tuples t1..t9 on a time axis covered by windows W1..W4; Split sends windows to Node1, Node2, Node3]
w = 6 units, s = 2 units, Replication = 6/2 = 3
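The marking and replication steps of Approach 1 can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes integer timestamps, windows numbered from 0 with window i covering the half-open interval [i*s, i*s + w), and a round-robin assignment of windows to nodes. The names `window_range` and `split_by_window` are hypothetical, and the is-win-closer flag is omitted because the slides do not define it.

```python
def window_range(t, w, s):
    """Return (first_winid, last_winid) for a tuple with integer timestamp t.

    Assumption: window i covers [i*s, i*s + w), with i >= 0.
    """
    last = t // s
    first = max(0, (t - w) // s + 1)
    return first, last


def split_by_window(timestamps, w, s, num_nodes):
    """Replicate each tuple once per window it belongs to.

    Assumption: window i is processed by node i % num_nodes (round-robin).
    """
    out = {n: [] for n in range(num_nodes)}
    for t in timestamps:
        first, last = window_range(t, w, s)
        for win in range(first, last + 1):
            out[win % num_nodes].append((win, t))
    return out


# With w = 6 and s = 2, a tuple in steady state is copied w/s = 3 times.
routed = split_by_window(range(6, 12), w=6, s=2, num_nodes=3)
copies = sum(len(v) for v in routed.values())
```

Here 6 input tuples become 18 output tuples at the split, which is the output-volume increase the slide points to.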

Approach 2: Batch-based Partitioning
- Batch several windows together to reduce replication
- "Batch-window": wb = w + (B-1)*s ; sb = B*s
- All the tuples in a batch go to the same partition
- Only tuples overlapping between batches are replicated
- Replication reduced to wb/sb partitions instead of w/s
Definitions: w: window-size, s: slide-size, B: batch-size
Example: w = 3, s = 1, B = 3 → wb = 5, sb = 3; Replication: 3 → 5/3
[Figure: tuples t1..t9 on a time axis; windows w1..w8 grouped into batches B1 and B2]
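The batch-window arithmetic can be checked with a few lines of Python. This is only a sketch of the formulas on the slide; the function names `batch_window` and `replication` are illustrative, and exact fractions are used so that 5/3 is not rounded.

```python
from fractions import Fraction


def batch_window(w, s, B):
    """Return (wb, sb): the size and slide of a batch of B consecutive windows."""
    wb = w + (B - 1) * s
    sb = B * s
    return wb, sb


def replication(w, s, B=1):
    """Average number of partitions each tuple is sent to (wb/sb; w/s when B = 1)."""
    wb, sb = batch_window(w, s, B)
    return Fraction(wb, sb)


print(batch_window(3, 1, 3))   # (5, 3)
print(replication(3, 1))       # 3
print(replication(3, 1, 3))    # 5/3
```

As B grows, wb/sb approaches 1, so larger batches trade replication for coarser units of work.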

The Panes Technique
- Divide overlapping windows into disjoint panes
- Reduce cost by sub-aggregation and sharing
- Each window has w/gcd(w,s) panes of size gcd(w,s)
- Query is decomposed into pane-level (PLQ) & window-level (WLQ) queries
[Figure: windows w1..w5 laid over disjoint panes p1..p8]
[1] No Pane, No Gain: Efficient Evaluation of Sliding Window Aggregates over Data Streams, SIGMOD Record '05
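To make the PLQ/WLQ decomposition concrete, here is a small sketch for a SUM aggregate. It assumes one tuple per time unit (so `values[i]` is the tuple at unit i) and windows that start at unit 0; `pane_params`, `sliding_sum_with_panes`, and `sliding_sum_naive` are hypothetical names used only for this illustration.

```python
from math import gcd


def pane_params(w, s):
    """Return (pane size, panes per window, panes per slide)."""
    p = gcd(w, s)
    return p, w // p, s // p


def sliding_sum_with_panes(values, w, s):
    p, ww, sw = pane_params(w, s)
    # PLQ: one partial sum per disjoint pane.
    pane_sums = [sum(values[i:i + p]) for i in range(0, len(values) - p + 1, p)]
    # WLQ: each window combines ww consecutive pane results, advancing by sw panes.
    return [sum(pane_sums[k:k + ww]) for k in range(0, len(pane_sums) - ww + 1, sw)]


def sliding_sum_naive(values, w, s):
    """Reference result: re-scan every window from scratch."""
    return [sum(values[i:i + w]) for i in range(0, len(values) - w + 1, s)]


values = list(range(1, 13))
print(sliding_sum_with_panes(values, w=6, s=4))  # [21, 45]
```

Each pane sum is computed once and reused by every window that contains the pane, which is where the sharing comes from.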

Approach 3: Pane-based Partitioning
- Mark each tuple with pane-id + win-id
- Treat panes as tumbling windows with wp = sp = gcd(w,s)
- Route tuples to a node based on pane-id
- Nodes compute PLQ with pane tuples
- Combine all PLQ results of a window to form WLQ
- Need for an organized topology of nodes
- We propose organizing the nodes in a ring
[Figure: Split feeding Node1, Node2, Node3; w = 6 units, s = 2 units]
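The split-side marking and routing for pane-based partitioning might look like the sketch below. It assumes integer timestamps, pane and window ids starting at 0, and a simple `pane_id % num_nodes` routing rule; the talk itself assigns panes to ring nodes in groups of consecutive windows, so the modulo rule is a stand-in for illustration only. `mark` and `route_by_pane` are hypothetical names.

```python
from collections import defaultdict
from math import gcd


def mark(t, w, s):
    """Return (pane_id, first_winid, last_winid) for integer timestamp t."""
    p = gcd(w, s)
    ww, sw = w // p, s // p          # window size and slide, in panes
    pane = t // p                    # panes are tumbling windows of size p
    last = pane // sw
    first = max(0, (pane - ww) // sw + 1)
    return pane, first, last


def route_by_pane(timestamps, w, s, num_nodes):
    """Send each tuple to exactly one node, chosen by its pane id (assumed policy)."""
    per_node = defaultdict(list)
    for t in timestamps:
        pane, first, last = mark(t, w, s)
        per_node[pane % num_nodes].append((t, pane, first, last))
    return per_node


routed = route_by_pane(range(12), w=6, s=2, num_nodes=3)
total = sum(len(v) for v in routed.values())
```

Unlike window-based splitting, `total` equals the number of input tuples (12), since every tuple belongs to exactly one pane.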

Ring-based Query Evaluation
- High amount of pipelined result sharing among nodes
- Organized communication topology
Example: W = 6, S = 4 tuples; P = GCD(6,4) = 2 tuples
- Window1 = Pane1 (tuples 1-2), Pane2 (tuples 3-4), Pane3 (tuples 5-6)
- Window2 = Pane3 (tuples 5-6), Pane4 (tuples 7-8), Pane5 (tuples 9-10)
- Window3 = Pane5 (tuples 9-10), Pane6 (tuples 11-12), Pane7 (tuples 13-14)
[Figure: Input Source → Split distributes panes P1, P2, ... to Node1, Node2, Node3; nodes pass pane results (R3, R5, R7, ...) around the ring and send window results W1, W2, W3 to Merge]

Assignment of Windows and Panes to Nodes
- All pane results arrive only from predecessors
- Pane results sent to the successor are only local panes
- Each node is assigned n consecutive windows
- Min n s.t. [equation missing]
Definitions: ww: window-size in # of panes, sw: slide-size in # of panes

Flexible Result Merging
- FIFO
- Fully-ordered (k = 0)
- k-ordered: k-ordering constraint [1], certain disorder allowed
  Defn: For any tuple s, every tuple s' that arrives at least k+1 tuples after s satisfies s'.A ≥ s.A
[1] Exploiting k-Constraints to Reduce Memory Overhead in Continuous Queries over Data Streams, ACM TODS '04
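The k-ordering definition can be turned into a direct check over a finite output sequence. The sketch below is a brute-force test of the definition, not the merge operator from the talk; `is_k_ordered` is a hypothetical helper and `values` stands for the ordering attribute A of each output tuple in arrival order.

```python
def is_k_ordered(values, k):
    """True if every value at least k+1 positions after values[i] is >= values[i]."""
    n = len(values)
    for i in range(n):
        for j in range(i + k + 1, n):
            if values[j] < values[i]:
                return False
    return True


print(is_k_ordered([1, 2, 3], 0))  # True: fully ordered
print(is_k_ordered([2, 1, 3], 0))  # False: adjacent tuples are swapped
print(is_k_ordered([2, 1, 3], 1))  # True: disorder stays within one position
```

With k = 0 the check reduces to a plain non-decreasing order test, matching the fully-ordered case on the slide.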

Experimental Evaluation
- Implementation of techniques in Borealis
- Workload adapted from Linear Road Benchmark
  - Slightly modified segment statistics queries
  - Basic aggregation functions with different window/slide ratios

Scalability of Split Operator
[Figure: maximum input rate (tuples/second) vs. window-size/slide ratio (window overlap)]
- Pane-partitioning: cost & throughput constant regardless of overlap ratio
- Window- & batch-partitioning: cost ↑ and throughput ↓ as overlap ↑
- Excessive replication in window-partitioning is reduced by batching

Scalability of Partitioning Techniques
* w/s = overlap ratio = 100
- Pane-based scales close to linearly until the split is saturated
  - Per-tuple cost is constant
- Window- & batch-based: extremely high replication
  - Split is not saturated, but scales very slowly

Summary & Conclusions
- Partitioning techniques: 1) Window-based 2) Batch-based 3) Pane-based
- Pane-partitioning is the choice of partitioning
  - Avoids tuple replication
  - Incurs less overhead in split and aggregate
  - Scales close to linearly

Ongoing & Future Work
- Generalization of the framework
- Support for adaptivity during runtime
- Extending complexity of query plans
- Extending performance analysis & experiments

Thank You!
