Continuous Data Stream Management: Music Virtual Channel (copyright violations), Data Stream Monitoring (counting sketches), Continuous Query Processing


1 Continuous Data Stream Management
- Music Virtual Channel: copyright violations
- Data Stream Monitoring: counting sketches
- Continuous Query Processing: sequence queries
Prof. 陳良弼
Date: 2005/7/1
Post-Excellence Project, Subproject 6

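2 P2P R&D should be done in a trust domain.
- U.S. Supreme Court: P2P file-sharing software companies liable for infringement (2005/06/28)
  - MGM and other music and film companies, 28 in all, sued the file-sharing software companies Grokster and StreamCast.
  - In its 1984 ruling on the Sony Betamax video recorder, the Supreme Court held that Sony was not responsible for users who used the recorder to make pirate copies of copyrighted films.
  - Supreme Court: "A distributor of software who intends to encourage infringement must answer for the resulting infringement by third parties."
- Legal cannot compete with illegal! iMusic (艾比茲), the country's first legal music download site, shut down (2005/05/30).
- Two core principles of copyright law: "reproduction and public distribution" "require prior authorization".
- Music Virtual Channel: listen-only music + authorized channels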

3 Paid download

4 Paid download vs. paid broadcast

11 Finding Music is beyond Providing Music
- Paul Lamere, Sun Labs, 2005
  - People have a strong emotional connection to their music and understand the difficulties in finding new music (the music they like).
  - Listen-only music will make it expensive for MIR researchers to get data for MIR experiments.
- Music Virtual Channel
  - Finding music/channels by sharing user experience
  - Content-based + collaborative filtering

12 Music Virtual Channel (architecture diagram)
Diagram labels: music collections 1, 2, …, N; Internet; V.C. player; clustering engine; filtering engine; music channel simulator; music metadata; interface; profile monitor; cluster monitor; channel monitor; favorite channel.

13 Feature 1: Virtual Channel (diagram)
Diagram labels: ranked channel list; virtual channel; clustering engine; collaborative filtering.

14 Feature 2: Favorite Channel (diagram)
Diagram labels: user query; matched music; # of matched music; filtering engine; content-based.

15 Data Stream Monitoring
- Counting Sketches
  - Maintain the moving sum of every data source in the sliding window
  - On the clustering engine, given $n_1$ channels and $n_2$ users, $m = n_1 \times n_2$
  - $O\big((W/t)\,\log(1/\delta)/\varepsilon\big)$
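The moving-sum idea can be sketched as follows. This is a minimal illustration, not the method from the slides: it assumes the window of length W is split into W/t sub-windows, each summarized by a standard Count-Min sketch with roughly e/ε counters per row and ln(1/δ) rows, which gives a counter budget of the same order as the bound above. The class name `WindowedCountMin`, the `(channel, user)` key, and the requirement that timestamps arrive in non-decreasing order are all assumptions made for the example.

```python
import math
import random
from collections import deque


class WindowedCountMin:
    """Approximate moving sums per data source over a sliding window.

    The window of length W is split into ceil(W / t) sub-windows; each
    sub-window gets its own Count-Min table, and the oldest table is
    dropped once it falls out of the window. Values must be non-negative
    and timestamps non-decreasing (assumptions of this sketch).
    """

    def __init__(self, W, t, eps, delta, seed=0):
        self.t = t
        self.num_sub = math.ceil(W / t)                      # W/t sub-windows
        self.width = math.ceil(math.e / eps)                 # counters per row
        self.depth = max(1, math.ceil(math.log(1 / delta)))  # hash rows
        rng = random.Random(seed)
        self.salts = [rng.getrandbits(64) for _ in range(self.depth)]
        self.subs = deque()                                  # (sub-window index, table)

    def _cols(self, key):
        # One column per row, derived from a salted hash of the key.
        return [hash((salt, key)) % self.width for salt in self.salts]

    def _expire(self, idx):
        # Drop sub-windows that are no longer inside the sliding window.
        while self.subs and self.subs[0][0] <= idx - self.num_sub:
            self.subs.popleft()

    def add(self, time, key, value=1):
        idx = time // self.t
        if not self.subs or self.subs[-1][0] != idx:
            table = [[0] * self.width for _ in range(self.depth)]
            self.subs.append((idx, table))
        self._expire(idx)
        table = self.subs[-1][1]
        for row, col in enumerate(self._cols(key)):
            table[row][col] += value

    def moving_sum(self, time, key):
        """Estimate of the key's sum in the window; never an underestimate."""
        self._expire(time // self.t)
        cols = self._cols(key)
        return sum(min(table[r][c] for r, c in enumerate(cols))
                   for _, table in self.subs)


sketch = WindowedCountMin(W=10, t=2, eps=0.01, delta=0.01)
for time in range(10):
    sketch.add(time, ("channel-1", "user-1"))
print(sketch.moving_sum(9, ("channel-1", "user-1")))
```

Expiry happens at sub-window granularity, so the window boundary is only accurate to within t time units; a smaller t tightens the boundary at the cost of more tables.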

16 Continuous Query Processing over Event Streams Based on Approximate Mechanism

17 Motivation
- Sequential (Temporal) Patterns on Data Streams
  - Abnormal / special behavior detection
    - Temperature monitoring
    - Intrusion detection
  - Interesting pattern finding
    - Document filtering
    - Multimedia content-based filtering

18 Problem
- Given a set of sequence queries (SQs), how can we continuously monitor the event stream and, as soon as a segment arrives, report it if it is an approximate answer to some query within that query's error bound?
(Figure: an event stream and a sequence query with error bound ε = 1.)
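To make the problem statement concrete, here is a naive baseline monitor. It is only a sketch: the slides do not say which distance measure defines an "approximate answer", so Hamming distance between a query and an equal-length stream segment is assumed here, and the names `hamming` and `monitor` are hypothetical. It compares every query against every new window, which is exactly the cost that an indexing and pruning scheme would try to avoid.

```python
from collections import deque


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def monitor(stream, queries):
    """Yield (end_position, query_id, segment) as soon as a segment matches.

    `queries` maps a query id to (pattern, eps); a segment matches when its
    Hamming distance to the pattern is at most eps (assumed distance).
    """
    longest = max(len(pattern) for pattern, _ in queries.values())
    recent = deque(maxlen=longest)        # the most recent events only
    for pos, event in enumerate(stream):
        recent.append(event)
        for qid, (pattern, eps) in queries.items():
            if len(recent) < len(pattern):
                continue                  # not enough events seen yet
            segment = tuple(recent)[-len(pattern):]
            if hamming(segment, pattern) <= eps:
                yield pos, qid, segment


for hit in monitor("xabcyabd", {"q1": ("abc", 1)}):
    print(hit)
```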

19 Related Work
- N-gram indexing (for retrieval and exact match)
(Figure: inverted lists of the databases' n-grams, n = 3.)
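A minimal sketch of n-gram indexing for exact match, assuming each posting in an inverted list is a (sequence id, position) pair. The function names `build_ngram_index` and `exact_match` are made up for this example, and the query is assumed to be at least n symbols long.

```python
from collections import defaultdict


def build_ngram_index(sequences, n=3):
    """Map every n-gram to the list of (sequence id, position) where it occurs."""
    index = defaultdict(list)
    for seq_id, seq in sequences.items():
        for pos in range(len(seq) - n + 1):
            index[seq[pos:pos + n]].append((seq_id, pos))
    return index


def exact_match(index, sequences, query, n=3):
    """Find exact occurrences of `query`, using its first n-gram as a filter."""
    hits = []
    for seq_id, pos in index.get(query[:n], []):
        # Verify the candidate against the stored sequence.
        if sequences[seq_id][pos:pos + len(query)] == query:
            hits.append((seq_id, pos))
    return hits


sequences = {"a": "ABCDE", "b": "XABCX"}
index = build_ngram_index(sequences)
print(index["ABC"])
print(exact_match(index, sequences, "ABCD"))
```

The index answers lookups on stored data, which suits retrieval; a continuous setting reverses the roles, because the queries are stored and the data keeps arriving.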

20 Architecture (diagram)
Diagram labels: event stream; data n-grams (sliding window); sequence queries; query n-grams (non-overlapping); Query Manager; Filtering Engine; Pruning Mechanism; Merging Mechanism; final results.
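The two ways of cutting n-grams named in the architecture can be illustrated as below. This is a sketch under assumptions: the helper names are hypothetical, and dropping a trailing query remainder shorter than n is a simplification chosen for the example, not something stated on the slide.

```python
def data_ngrams(events, n):
    """Overlapping n-grams: one starting at every stream position."""
    return [(i, tuple(events[i:i + n])) for i in range(len(events) - n + 1)]


def query_ngrams(query, n):
    """Non-overlapping n-grams: the query cut into consecutive pieces.

    A trailing remainder shorter than n is dropped (simplifying assumption).
    """
    return [(i, tuple(query[i:i + n]))
            for i in range(0, len(query) - n + 1, n)]


print(data_ngrams("abcde", 3))
print(query_ngrams("abcdefg", 3))
```

Because the stream side produces an n-gram at every offset, any alignment of a query against the stream lines each query piece up with some data n-gram, so the query side can afford to store only the non-overlapping pieces.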

21 Query Manager
- Cluster Summarization
  - A virtual n-gram for each cluster
(Figure: query n-grams QNG1 to QNG4 and the virtual n-gram of their cluster.)

22 Pruning Mechanism
- Lower Bound Estimation (I)
  - The estimate between a data n-gram and a virtual n-gram cannot exceed the maximum error bound of the query n-grams in the cluster.
(Figure labels: d1, d2, d3; C1, C2, C3; A, B.)
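One generic way to realize cluster-level pruning is a triangle-inequality lower bound, sketched here. This is not necessarily the estimate used in the talk: Hamming distance, the choice of the first cluster member as the virtual n-gram, and the `radius` field are all assumptions of the example. If the lower bound on the distance from a data n-gram to every member of a cluster already exceeds the cluster's largest error bound, no member can match and the whole cluster is skipped.

```python
def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def make_cluster(members):
    """Summarize a list of (query_ngram, eps) pairs.

    The first member stands in for the virtual n-gram (an assumption).
    """
    virtual = members[0][0]
    return {
        "virtual": virtual,
        "radius": max(hamming(virtual, q) for q, _ in members),
        "max_eps": max(eps for _, eps in members),
        "members": members,
    }


def lower_bound(data_ngram, cluster):
    """Lower bound on the distance from data_ngram to any cluster member."""
    d = hamming(data_ngram, cluster["virtual"])
    return max(0, d - cluster["radius"])     # triangle inequality


def candidates(data_ngram, clusters):
    """Query n-grams within their own error bound of data_ngram."""
    found = []
    for cluster in clusters:
        if lower_bound(data_ngram, cluster) > cluster["max_eps"]:
            continue                          # whole cluster pruned
        found.extend(q for q, eps in cluster["members"]
                     if hamming(data_ngram, q) <= eps)
    return found


clusters = [make_cluster([("abc", 1), ("abd", 0)]),
            make_cluster([("xyz", 1), ("xyy", 1)])]
print(candidates("abd", clusters))
```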

23 Merging Mechanism
- Lower Bound Estimation (II)
  - The estimate between the new partial answer (by merging) and the query segment cannot exceed the error bound of the query.
(Figure labels: query n-grams Q21, Q22, Q23; data segments a, b, c and a', b', c'.)
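The merging step can be sketched as follows, again under assumed Hamming distance and with hypothetical names. A partial answer records how many non-overlapping query pieces have been matched and the error accumulated so far; since Hamming error can only grow as more pieces are merged, the accumulated error is a lower bound on the final error, and a partial answer is discarded as soon as it exceeds the query's bound. The query length is assumed to be a multiple of n.

```python
def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def find_answers(events, query, eps, n):
    """Return (start, error) for segments within Hamming distance eps of query."""
    if len(query) % n != 0:
        raise ValueError("query length must be a multiple of n in this sketch")
    pieces = [query[i:i + n] for i in range(0, len(query), n)]
    partial = {}      # start position -> (pieces matched, accumulated error)
    answers = []
    for gram_start in range(len(events) - n + 1):
        gram = events[gram_start:gram_start + n]     # newest data n-gram
        # Extend partial answers whose next piece must begin exactly here.
        for start, (k, err) in list(partial.items()):
            if start + k * n != gram_start:
                continue
            err += hamming(gram, pieces[k])
            if err > eps:
                del partial[start]                   # lower bound exceeds eps
            elif k + 1 == len(pieces):
                answers.append((start, err))         # complete answer
                del partial[start]
            else:
                partial[start] = (k + 1, err)
        # Open a new partial answer from the first query piece.
        err = hamming(gram, pieces[0])
        if err <= eps:
            if len(pieces) == 1:
                answers.append((gram_start, err))
            else:
                partial[gram_start] = (1, err)
    return answers


print(find_answers("xxabcdefxx", "abcdxf", eps=1, n=3))
```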

24 Performance: Real Time
- The ratios of processing time to playing time (number of SQs = 1000)

25 Performance: Scalability

26 Future Work
- Continuous Query Processing in a Resource-limited Environment
  - CPU limitation
  - Memory limitation
  - Load shedding mechanisms
- Continuous Query Processing over Multiple Data Streams
  - Temporal relationship among data streams
  - Geographical relationship among data streams
  - Sensor network

27 Future Work (Cont.)
- Detect whether the behavior of a sensor is different from that of its neighbors in a time period
  - Abnormal behavior detection
- Detect whether the behavior of a sensor follows that of another sensor
  - Trend analysis and correlation

28 Future Work (Cont.)
- Possible Query Formats
(Figure: example query formats over A, B, ?, and (A,B), with parameters such as (1,3) and d.)

29 Temperature Monitoring
- In a warehouse, temperature sensors attached to the items send temperature measurements periodically.
- Simple Queries
  - Return the abnormal temperatures (those above a threshold value).
  - Every minute, retrieve the maximum among all the temperature measurements in the last five minutes.
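The two simple queries can be sketched as below. The reading format `(time, sensor, temp)` with time in seconds, and the names `abnormal` and `SlidingMax`, are assumptions made for illustration. The sliding maximum keeps a deque of candidates with strictly decreasing values, so each reading is added and removed at most once.

```python
from collections import deque


def abnormal(readings, threshold):
    """Yield the (time, sensor, temp) readings whose temperature exceeds threshold."""
    for time, sensor, temp in readings:
        if temp > threshold:
            yield time, sensor, temp


class SlidingMax:
    """Maximum of the values received in the last `window` seconds."""

    def __init__(self, window):
        self.window = window
        self.items = deque()          # (time, value), values strictly decreasing

    def add(self, time, value):
        # Older readings that are not larger can never be the maximum again.
        while self.items and self.items[-1][1] <= value:
            self.items.pop()
        self.items.append((time, value))

    def max_at(self, now):
        # Drop readings that have left the window (now - window, now].
        while self.items and self.items[0][0] <= now - self.window:
            self.items.popleft()
        return self.items[0][1] if self.items else None


five_minutes = SlidingMax(window=300)
for time, temp in [(0, 36), (60, 40), (120, 38), (350, 37)]:
    five_minutes.add(time, temp)
print(five_minutes.max_at(360))
```

Calling `max_at` once per minute gives the second query; the threshold query needs no state at all.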

30 Temperature Monitoring (Cont.)
- Complex Queries
  - Find the sensors that send three consecutive measurements of 38°, 39°, and 40°, in that order.
  - Find the sensors where, starting from a temperature between 35° and 38°, the temperature goes up more than 2° in each of three subsequent measurements.
  - Find temporal patterns where the temperature jumps from a value below 38° to a value between 40° and 50°, followed by two successive drops, returning to a temperature below 38°.
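The first two complex queries can be sketched as per-sensor checks over a short history. The reading format `(time, sensor, temp)`, exact comparison of integer temperatures, and all function names are assumptions of this example; the third query could be handled in the same style with a slightly longer history.

```python
from collections import defaultdict, deque


def consecutive(values, pattern=(38, 39, 40)):
    """True if the most recent measurements equal `pattern`, in order."""
    values = list(values)
    return tuple(values[-len(pattern):]) == tuple(pattern)


def steady_rise(values, low=35, high=38, step=2, rises=3):
    """True if, starting between low and high, each of the next `rises`
    measurements is more than `step` degrees above the previous one."""
    values = list(values)
    if len(values) < rises + 1:
        return False
    tail = values[-(rises + 1):]
    return (low <= tail[0] <= high
            and all(b - a > step for a, b in zip(tail, tail[1:])))


def monitor(readings):
    """Yield (time, sensor, label) whenever a sensor's history matches a query."""
    history = defaultdict(lambda: deque(maxlen=4))   # last 4 readings per sensor
    for time, sensor, temp in readings:
        recent = history[sensor]
        recent.append(temp)
        if consecutive(recent):
            yield time, sensor, "38-39-40"
        if steady_rise(recent):
            yield time, sensor, "steady rise"


readings = [(0, "s1", 38), (0, "s2", 36), (1, "s1", 39), (1, "s2", 39),
            (2, "s1", 40), (2, "s2", 42), (3, "s2", 45)]
for alert in monitor(readings):
    print(alert)
```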

31 Intrusion Detection (diagram)
Diagram labels: Center of Intrusion Detection System; Intrusion Detection Device.

32 Applications

34 Document Filtering
- Whenever a new document is published, the continuous queries that match it are found and the corresponding users are notified.
(Figure: a document and a query.)

35 Document Filtering (Cont.)
- In one or multiple streams of news stories, all stories related to an old (previously seen) event are automatically filtered out for the user.

36 Music Content-based Filtering
- The user can issue a musical segment that s/he is interested in as a query to monitor the music channels.
- The system collects all the queries, monitors every music channel, and notifies each user which songs broadcast on the music channels contain a segment close to his/her query.

