Windows in Niagara Jin (Jenny) Li, David Maier, Vassilis Papadimos, Peter Tucker, Kristin Tufte.


1 Windows in Niagara Jin (Jenny) Li, David Maier, Vassilis Papadimos, Peter Tucker, Kristin Tufte

2 Overview
- Make windows explicit
  - Tag tuples with a window ID
  - Standard operators don't know about different kinds of windows; they work with the window ID attribute
- Use the punctuation infrastructure
  - Punctuation signals the end of a window
  - No need for specialized window operators; just use punctuate-aware operators
- Flexible
  - Windows on system time, external time, or tuple-based
  - Data can arrive and be processed out of order

3 Niagara Control Structure
- Push-based (pipelined) system.
- Each operator is a thread.
- Operators are connected by queues of tuples.
- Each operator waits on its input queue; when a tuple is ready, it is processed and the result is inserted into the output queue.
[Figure: operator pipeline streamscan -> unnest (path expr) -> select]

4 Niagara Query Execution
Query: Find all bids that the bidder with id = 501 has made.
Plan: streamscan -> unnest (bid.bidderid) -> select (bidderid = 501)
- streamscan: reads and parses data from a stream.
- unnest: uses a path expression to extract matching elements from input tuples.
[Figure: sample values 501, 42, $10.00 and bid tuples with bidderid: 501, price: $5.00 moving through the plan]
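The control structure and the bidder-501 query above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Niagara's code: the names `run_operator`, `unnest_bidderid`, `select_501`, and `run_pipeline`, the `END` sentinel, and the dictionary layout of the input documents are all hypothetical. It shows one thread per operator, with operators blocking on an input queue and pushing results to an output queue.

```python
import queue
import threading

END = object()  # hypothetical end-of-stream sentinel


def run_operator(fn, inq, outq):
    """Operator loop: block on the input queue, process, push results."""
    while True:
        item = inq.get()
        if item is END:
            outq.put(END)  # forward end-of-stream downstream
            return
        for result in fn(item):
            outq.put(result)


def unnest_bidderid(doc):
    """Stand-in for unnest (bid.bidderid): one tuple per bid element."""
    for bid in doc.get("bids", []):
        yield {"bidderid": bid["bidderid"], "price": bid["price"]}


def select_501(tup):
    """Stand-in for select (bidderid = 501)."""
    if tup["bidderid"] == 501:
        yield tup


def run_pipeline(docs):
    q0, q1, q2 = queue.Queue(), queue.Queue(), queue.Queue()
    threads = [
        threading.Thread(target=run_operator, args=(unnest_bidderid, q0, q1)),
        threading.Thread(target=run_operator, args=(select_501, q1, q2)),
    ]
    for t in threads:
        t.start()
    for doc in docs:  # the caller plays the role of streamscan
        q0.put(doc)
    q0.put(END)
    out = []
    while (item := q2.get()) is not END:
        out.append(item)
    for t in threads:
        t.join()
    return out
```

Because each operator is a single thread reading a FIFO queue, tuples leave the pipeline in the order they entered it.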

6 NEXMark Schema
Streams:
  bid(auctionid, bidderid, price, datetime, auctionsite)
  auction(id, description, reserve, expires, auctionsite, itemname, seller, category)
Note: bid.datetime and auction.expires are times generated at the source sites.

7 Three Example Queries
All three queries are window aggregates, specifically time-based window counts.
- Query 1: uses internal system time and internal punctuations
- Query 2: uses external timestamps and internal punctuations
- Query 3: uses external timestamps and external punctuations

8 Query 1: Select the number of bids on each item in the past five minutes. Update the results every minute.
  SELECT B1.auctionid, count(*)
  FROM Bid [RANGE 5 MINUTES SLIDE 1 MINUTE] B1
  GROUP BY B1.auctionid
Plan with WindowGroupby: streamscan (Bid) B1 -> unnest (auctionid) -> WindowGroupby(B1.auctionid, count(*))
Plan with Punctuator/Timestamper, Bucketizer, and Groupby: streamscan (Bid) B1 -> unnest (auctionid) -> Punctuator/Timestamper -> Bucketizer -> Groupby(B1.auctionid, B1.wid, count(*))
- Punctuator/Timestamper: adds a timestamp field to each tuple and punctuates at the end of each minute. It is driven by a Timer, with Timestamp = CURRENT_TIME.
- Bucketizer: adds window ranges to tuples and punctuates the end of each window.

9 Query 1 Details
  SELECT B1.auctionid, count(*)
  FROM Bid [RANGE 5 MINUTES SLIDE 1 MINUTE] B1
  GROUP BY B1.auctionid
Plan: streamscan (Bid) B1 -> unnest (auctionid) -> punctuator/timestamper (timer, Timestamp = CURRENT_TIME) -> bucketizer -> groupby(B1.auctionid, B1.winId, count(*))
[Figure: tuples T1, T2, T3 flowing through the plan. Each is stamped TS = 5:00; T1 has auctionid 10, T2 and T3 have auctionid 15; the bucketizer tags each with window range 1-5. At 5:01 the punctuation (,5:00] is emitted and appears downstream as a punctuation on window 1. Result (auctionid, count): (10, 1), (15, 2).]
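The bucketizer's job, adding window ranges to tuples and turning a punctuation on time into a punctuation on window ID, can be sketched as below. This is a minimal sketch under assumptions that are not in the slides: times are integer seconds, window w covers the half-open interval [origin + w*slide - range, origin + w*slide), and the function names `window_range` and `last_closed_wid` are hypothetical. The slides' own boundary convention may differ.

```python
def window_range(ts, range_size, slide, origin=0):
    """Return (first_wid, last_wid) of the windows that contain ts.

    Assumed convention: window w covers
    [origin + w*slide - range_size, origin + w*slide).
    """
    offset = ts - origin
    first = offset // slide + 1
    last = (offset + range_size) // slide
    return first, last


def last_closed_wid(punct_ts, slide, origin=0):
    """Map a time punctuation (no more tuples with ts <= punct_ts)
    to the largest window ID that can no longer receive tuples."""
    return (punct_ts - origin) // slide


# RANGE 5 MINUTES SLIDE 1 MINUTE, times in seconds since the origin
print(window_range(0, 300, 60))     # a tuple at the origin falls in windows 1..5
# RANGE 5 MINUTES SLIDE 5 MINUTES (tumbling windows)
print(window_range(420, 300, 300))  # 7 minutes after the origin: window 2 only
print(last_closed_wid(300, 300))    # punctuation at +5 minutes closes window 1
```

With the origin taken as 5:00, a tuple stamped 5:00 maps to windows 1 through 5 under this convention, which coincides with the range shown for T1 to T3 on the slide. Downstream operators then only need the window ID attribute.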

10 Query 1 vs. Query 2
Query 1: Select the number of bids on each item in the past five minutes. Update the results every minute.
  SELECT B1.auctionid, count(*)
  FROM Bid [RANGE 5 MINUTES SLIDE 1 MINUTE] B1
  GROUP BY B1.auctionid
Query 2: Select the number of bids made at each auction site in the past five minutes. Update the results every minute.
  SELECT B1.auctionsite, count(*)
  FROM Bid [RANGE 5 MINUTES SLIDE 1 MINUTE ATTR datetime SLACK 5 MINUTES] B1
  GROUP BY B1.auctionsite
“CQL2004”

11 Query 2: Select the number of bids made at each auction site in the past five minutes. Update the results every minute.
  SELECT B1.auctionsite, count(*)
  FROM Bid [RANGE 5 MINUTES SLIDE 1 MINUTE ATTR datetime SLACK 5 MINUTES] B1
  GROUP BY B1.auctionsite
Plan: streamscan (Bid) B1 -> unnest (auctionsite, datetime) -> punctuator/enforcer -> bucketizer -> groupby(B1.auctionsite, B1.winId, count(*))
- punctuator/enforcer: enforces datetime > current timestamp and punctuates at the end of each minute. It is driven by a timer, with Timestamp = CURRENT_TIME - 5 MINUTES.
- bucketizer: adds window ranges to tuples and punctuates the end of each window.
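The punctuator/enforcer step for Query 2 can be illustrated as follows. This is a sketch under assumptions, not the Niagara operator: the class name `SlackEnforcer`, the `Punctuation` record, the explicit `now` argument, and the policy of silently dropping late tuples are all hypothetical choices. It shows the timer setting the timestamp to CURRENT_TIME minus the slack, and the operator letting through only tuples whose datetime is greater than that timestamp.

```python
from dataclasses import dataclass


@dataclass
class Punctuation:
    upto: float  # promise: no later tuple has datetime <= upto


class SlackEnforcer:
    def __init__(self, slack):
        self.slack = slack
        self.bound = float("-inf")  # current timestamp being enforced

    def on_timer(self, now):
        """Called at the end of each minute: advance the bound to
        now - slack and emit a punctuation for it."""
        self.bound = now - self.slack
        return Punctuation(upto=self.bound)

    def on_tuple(self, tup):
        """Pass the tuple on if it is not too late; otherwise drop it
        (assumed policy) so the punctuation stays truthful."""
        if tup["datetime"] > self.bound:
            return tup
        return None


enforcer = SlackEnforcer(slack=300)         # SLACK 5 MINUTES, in seconds
punct = enforcer.on_timer(now=1000)         # punctuation covers datetime <= 700
late = enforcer.on_tuple({"datetime": 650, "auctionsite": "A"})
ok = enforcer.on_tuple({"datetime": 701, "auctionsite": "A"})
```

The slack trades latency for tolerance of disorder: a larger slack admits tuples that arrive later, but each window's result is delayed by the same amount.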

13 Query 2 vs. Query 3
Query 2: Select the number of bids made at each auction site in the past five minutes. Update the results every minute.
  SELECT B1.auctionsite, count(*)
  FROM Bid [RANGE 5 MINUTES SLIDE 1 MINUTE ATTR datetime SLACK 5 MINUTES] B1
  GROUP BY B1.auctionsite
Query 3: Select the number of bids made at each auction site in the past five minutes. Update the results every five minutes.
  SELECT B1.auctionsite, count(*), B1.wid
  FROM Bid [RANGE 5 MINUTES SLIDE 5 MINUTES ATTR datetime] B1
  GROUP BY B1.auctionsite

14 Query 3: Select the number of bids made at each auction site in the past five minutes. Update the results every five minutes.
  SELECT B1.auctionsite, count(*)
  FROM Bid [RANGE 5 MINUTES SLIDE 5 MINUTES ATTR datetime] B1
  GROUP BY B1.auctionsite
Plan: streamscan (Bid) B1 -> unnest (auctionsite, datetime) -> bucketizer -> groupby(B1.auctionsite, B1.winId, count(*))
- bucketizer: adds window ranges to tuples and punctuates the end of each window.
[Figure: Site A, Site B, and Site C feed tuples (… T1 … T2) into streamscan]

15 Query 3 Details
  SELECT B1.auctionsite, count(*), B1.wid
  FROM Bid [RANGE 5 MINUTES SLIDE 5 MINUTES ATTR datetime] B1
  GROUP BY B1.auctionsite
Plan: streamscan (Bid) B1 -> unnest (auctionsite, datetime) -> bucketizer -> groupby(B1.auctionsite, B1.winId, count(*))
- bucketizer: adds window ranges to tuples and punctuates the end of each window.
Legend: Window 1: 5:00 to 5:05. Window 2: 5:05 to 5:10.
[Figure, as far as it can be recovered: Site A, Site B, and Site C feed tuples into the plan.
  Tuples after bucketizing: T1 A (datetime 5:01, windows 1-1), T2 A (5:07, windows 2-2), T1 B (5:02, windows 1-1), T2 B (5:04, windows 1-1).
  Punctuations: A (,5:05] becomes a punctuation on window 1 for site A; A (,5:10] becomes a punctuation on window 2 for site A.
  Groupby state: window 1, site A: T1 A; window 1, site B: T1 B, T2 B; window 2, site A: T2 A.
  Output (auctionsite, count, wid): (A, 1, 1), (A, 1, 2).]

16 Discussion
- Bucketizer
  - Applies a function to the stream
  - Encapsulates window semantics
  - Punctuate-aware, e.g. punctuation on time -> punctuation on wid
  - Wid is used as a grouping/join attribute
- Punctuator
  - Adds a timestamp as an attribute (optional)
  - Enforces punctuations (optional)
  - Converts stream semantics to punctuations
  - Outputs punctuations
- Punctuations signal the end of windows; results are output and state is purged
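The last point, that a punctuation on wid lets an ordinary group-by emit results and purge state, can be sketched as follows. This is an illustrative sketch only: the class name `PunctuatedCount`, its method names, and the convention that a punctuation names the largest completed window ID are assumptions, not taken from the slides.

```python
from collections import defaultdict


class PunctuatedCount:
    """count(*) grouped by (wid, key), unblocked by punctuations on wid."""

    def __init__(self):
        self.counts = defaultdict(int)  # (wid, key) -> running count

    def on_tuple(self, key, first_wid, last_wid):
        """Count the tuple once in every window of its window range."""
        for wid in range(first_wid, last_wid + 1):
            self.counts[(wid, key)] += 1

    def on_punctuation(self, upto_wid):
        """Windows with wid <= upto_wid are complete: emit their
        results as (wid, key, count) and purge their state."""
        done = sorted(k for k in self.counts if k[0] <= upto_wid)
        return [(wid, key, self.counts.pop((wid, key))) for wid, key in done]


# Three bids tagged with window range 1-5, as on the Query 1 details slide
op = PunctuatedCount()
op.on_tuple(10, 1, 5)
op.on_tuple(15, 1, 5)
op.on_tuple(15, 1, 5)
results = op.on_punctuation(1)  # window 1 is complete
```

The operator never inspects timestamps or window definitions; it treats wid as one more grouping attribute, which is why no specialized window operator is needed.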

17 Conclusions
- Process window queries without specialized window operators
- Flexible window semantics
- Use punctuate-aware operators; introduce a minimum number of new operators

18 Future Work
- Semantics of window operators
- Performance of different implementations
- Study the effect of disorder
- Groupby ? Window

19 Questions?

