1
Multimedia Data Stream Management System
By David Kleinman
2
Outline
Definition
Motivating Examples
Nine Requirements
Current Systems Comparison
Brief Overview of Current Stream Systems
Preview of My Project
3
What is it?
A stream of multimedia data from a source (e.g. a video camera)
A query stored in the system (this query may itself change)
Processes high volumes of data in real time
4
Motivating Examples
Security Surveillance: Crowd Security, Air Security, Burglary
Baby Sitting
Traffic Reports
Science: Animal Behavior, Ocean
Sensors can be expensive, especially when applied across a large area. They are also difficult to install (e.g. in Washington, D.C., you do not have the right to install them on other people's buildings). Sensors are not 100% dependable, and they can be disabled easily because they are often located on the ground and are easily accessible. A video camera can be placed in one safe location high above the ground. A video camera can also detect items that sensors cannot.
5
Requirement #1 - Process Quickly
Low latency
Messages processed “in-stream”
No storage needed to perform an operation
Active system
Avoid polling
Low latency: the system must be able to process messages without a costly storage operation. Storage adds latency to the process; for example, writing to a database requires a disk write.
Passive systems wait to be told what to do by an application before beginning processing, so applications must continuously poll for conditions. Polling adds latency because, on average, half the polling interval is added to the processing delay. Active systems avoid this by having built-in event-driven and data-driven processing capabilities.
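The contrast between polling and active processing in the notes above can be sketched in Python. This is a minimal illustration, not part of the original slides: the class `ActiveStream`, the function `expected_polling_delay`, and the message fields `frame` and `motion` are hypothetical names chosen for the example, and the delay formula assumes messages arrive uniformly at random within a polling interval.

```python
from typing import Callable, Dict, List

Message = Dict[str, object]


class ActiveStream:
    """Hypothetical push-based source: handlers run as each message arrives."""

    def __init__(self) -> None:
        self._handlers: List[Callable[[Message], None]] = []

    def subscribe(self, handler: Callable[[Message], None]) -> None:
        self._handlers.append(handler)

    def publish(self, message: Message) -> None:
        # No storage step and no polling loop: the data drives the processing.
        for handler in self._handlers:
            handler(message)


def expected_polling_delay(poll_interval_s: float) -> float:
    """Average extra delay of a polling client, assuming arrivals are
    uniformly distributed within the polling interval."""
    return poll_interval_s / 2.0


alerts: List[object] = []


def on_message(message: Message) -> None:
    # Illustrative condition: remember frames in which motion was detected.
    if message["motion"]:
        alerts.append(message["frame"])


stream = ActiveStream()
stream.subscribe(on_message)
stream.publish({"frame": 1, "motion": False})
stream.publish({"frame": 2, "motion": True})

print(alerts)                       # [2]
print(expected_polling_delay(1.0))  # 0.5
```

With a one-second polling interval the passive design adds about half a second on average, while the subscribed handler runs as soon as the message is published.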
6
Requirement #2 – Query using SigmaQL for Streams (StreamSigmaQL)
Querying mechanism based on SQL
Express continuous streams of data
Window construct: time, frames, breakpoints
Merge operator
SQL has remained the most enduring standard database language for over 30 years because it is very good at expressing complex data transformations. It is based on a set of very powerful data processing primitives that do filtering, merging, correlation, and aggregation. SQL is also widely understood and used by programmers, and the language should be easy to learn.
Windows should be definable over time, number of frames, or breakpoints in other attributes of a message. Windows should be able to slide by a variable amount; depending on the slide amount, windows can be made disjoint or overlapping.
A merge operator is needed to join multiple streams.
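The effect of the slide amount on a frame-count window can be sketched as follows. This is a plain Python illustration rather than SigmaQL syntax, which the slides do not show; `count_windows` and its parameters are hypothetical names, and the sketch works on a finite list of frame numbers instead of an unbounded stream.

```python
from typing import Iterator, List, Sequence


def count_windows(frames: Sequence[int], size: int, slide: int) -> Iterator[List[int]]:
    """Yield windows of `size` frames, advancing `slide` frames each time."""
    if size <= 0 or slide <= 0:
        raise ValueError("size and slide must be positive")
    # Only complete windows are emitted in this sketch.
    for start in range(0, len(frames) - size + 1, slide):
        yield list(frames[start:start + size])


frames = list(range(6))  # frame numbers 0..5

# slide == size gives disjoint windows.
disjoint = list(count_windows(frames, size=2, slide=2))
# slide < size gives overlapping windows.
overlapping = list(count_windows(frames, size=3, slide=1))

print(disjoint)     # [[0, 1], [2, 3], [4, 5]]
print(overlapping)  # [[0, 1, 2], [1, 2, 3], [2, 3, 4], [3, 4, 5]]
```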
7
Requirement #3 – Handle Imperfections
Data might be late, delayed, missing, or out of sequence
Time out individual calculations or computations
Challenges in dealing with out-of-order data
Mechanism for additional time
Networks aren't reliable. Suppose the system is computing the average number of people in ten rooms, and the camera in one room is broken. You don't want the system to block waiting for a result that will not come, so a time-out mechanism is a must.
Suppose you have a time window of 9:00 to 9:01. Ordinarily, the window is closed once a timestamp greater than 9:01 is received. However, this assumes that data arrives in timestamp order, which is not always the case. To deal with out-of-order data, a mechanism must be provided that allows windows to stay open for an additional period of time.
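One way to keep a window open for an additional period is a slack (grace) interval, sketched below. This is an assumption about how such a mechanism could look, not the design from the slides: `SlackWindow`, `run`, and the use of seconds as timestamps are all hypothetical. The window covers seconds 0 to 60, standing in for the 9:00 to 9:01 example.

```python
from typing import List, Optional, Sequence, Tuple


class SlackWindow:
    """Time window [start, end) that closes only after a timestamp at or
    beyond end + slack has been seen."""

    def __init__(self, start: float, end: float, slack: float) -> None:
        self.start, self.end, self.slack = start, end, slack
        self.values: List[float] = []
        self.closed = False

    def offer(self, timestamp: float, value: float) -> Optional[List[float]]:
        """Feed one message; return the window contents when it closes."""
        if self.closed:
            return None  # arrived after the slack expired: dropped in this sketch
        if self.start <= timestamp < self.end:
            self.values.append(value)
        if timestamp >= self.end + self.slack:
            self.closed = True
            return self.values
        return None


def run(messages: Sequence[Tuple[float, float]], slack: float) -> Optional[List[float]]:
    window = SlackWindow(start=0.0, end=60.0, slack=slack)
    for timestamp, value in messages:
        result = window.offer(timestamp, value)
        if result is not None:
            return result
    return None


# The message stamped 59.0 arrives after the one stamped 61.0 (out of order).
messages = [(10.0, 1.0), (61.0, 9.0), (59.0, 2.0), (66.0, 9.0)]

print(run(messages, slack=0.0))  # [1.0]       late message is lost
print(run(messages, slack=5.0))  # [1.0, 2.0]  late message is included
```

A larger slack tolerates more disorder but delays the result by the same amount, so the value is a trade-off against the low-latency requirement.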
8
Requirement #4 – Generate Predictable Outcomes
Generate deterministic and repeatable results
Time-ordered deterministic processing throughout the entire pipeline
Important for fault tolerance and recovery
A stream processing system must process time-series messages in a predictable manner to ensure that the results of the processing are deterministic and repeatable.
Soundtrack with movie: it is important that the frames of the movie and the sound WAV file are processed in the correct order.
Time ordering is needed to guarantee correctness. It is also needed for fault tolerance and recovery, as replaying the same stream should yield the same results.
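The movie-and-soundtrack example can be sketched as a time-ordered merge of two streams. This is an illustrative assumption, not the presenter's design: `merge_by_time` and the message layout `(timestamp_ms, stream_name, payload)` are hypothetical. Ties on the timestamp are broken by stream name so that the output does not depend on the order in which the inputs are supplied.

```python
import heapq
from typing import Iterable, List, Tuple

Message = Tuple[int, str, str]  # (timestamp_ms, stream_name, payload)


def merge_by_time(*streams: Iterable[Message]) -> List[Message]:
    """Merge streams that are each already sorted by timestamp.

    The sort key (timestamp, stream_name) makes tie-breaking explicit,
    so a replay of the same inputs always yields the same sequence.
    """
    return list(heapq.merge(*streams, key=lambda m: (m[0], m[1])))


video = [(0, "video", "f0"), (40, "video", "f1"), (80, "video", "f2")]
audio = [(0, "audio", "a0"), (40, "audio", "a1")]

merged = merge_by_time(video, audio)
replayed = merge_by_time(audio, video)  # same data, inputs given in another order

for message in merged:
    print(message)
print(merged == replayed)  # True
```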
9
Requirement #5 – Integrate Stored and Streaming Data
Comparing present with past
Capability to efficiently store, access, and modify state information
A query may need to include stored pictures of known terrorists.
Finding unusual activity requires gathering the usual activity patterns and comparing against them.
10
Requirement #6 – Guarantee Data Safety
Must use a high-availability solution
Secondary system
Synchronizes with the primary frequently
Takes over in case of failure
Mission-critical information needs a backup plan. A monitoring system cannot be allowed to fail.
11
Requirement #7 – Partition and Scale Automatically
Take advantage of distributed computing
Support multi-threading: takes advantage of multiple processors, avoids blocking
Load balance across machines: automatic, transparent
12
Requirement #8 – Process and Respond Instantaneously
Needs to respond in real time
Highly optimized, minimal-overhead execution path
All system components have high performance
13
Requirement #9 - Adaptability
Change queries without restarting
Accept all different types of multimedia streams
Allow for custom configuration
Work with different systems' APIs
14
DBMS
Widely used
Use SQL, but not equipped for streams
Passive
Do not keep data moving
Difficult to handle out-of-order data
Difficulty with predictable outcomes
Incur latency with seamless integration
Widely used: valued for their ability to reliably store large data sets and efficiently process human-initiated queries.
Passive: they wait to be told what to do. Some have trigger mechanisms, but triggers are not scalable.
Do not keep data moving: they require a write to disk and then an access, which is not real time.
Difficult to handle out-of-order data: trigger systems have no obvious way to time out.
Predictable outcomes are difficult because these systems are passive.
15
Rule Engine
Example: Prolog
Active
Handle imperfections
Troubles with seamless integration
A rule engine typically accepts condition/action pairs, written in if-then notation, and enforces a collection of rules.
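The condition/action pairs described above can be sketched in a few lines of Python. This is only an illustration of the if-then idea, not how Prolog or any particular rule engine works: `run_rules`, the fact fields `room` and `people`, and the two rules are hypothetical.

```python
from typing import Any, Callable, Dict, List, Tuple

Fact = Dict[str, Any]
Rule = Tuple[Callable[[Fact], bool], Callable[[Fact], str]]  # (condition, action)


def run_rules(rules: List[Rule], fact: Fact) -> List[str]:
    """Apply every rule whose condition holds for the incoming fact."""
    return [action(fact) for condition, action in rules if condition(fact)]


rules: List[Rule] = [
    # IF more than 50 people are in the room THEN raise a crowd alert.
    (lambda f: f["people"] > 50, lambda f: f"crowd alert in {f['room']}"),
    # IF nobody is in the room THEN report it as empty.
    (lambda f: f["people"] == 0, lambda f: f"{f['room']} is empty"),
]

print(run_rules(rules, {"room": "lobby", "people": 72}))  # ['crowd alert in lobby']
print(run_rules(rules, {"room": "vault", "people": 0}))   # ['vault is empty']
print(run_rules(rules, {"room": "hall", "people": 12}))   # []
```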
16
Stream Processing Engine
Handle all the requirements
Not specifically designed to handle multimedia constraints
Not specifically designed to handle streams of multimedia
17
Chart
Columns: DBMS, Rule Engine, SPE, MSPE
Keep data moving: No, Yes
SigmaQL
Handle imperfections: Difficult, Possible
Predictable outcome
High availability
Stored and streamed data
Distribution and scalability: Possible
Real time
Adaptability
18
Aurora
DSMS developed at MIT and Brown
19
Aurora Query Network
Consists of operator boxes and connection points (storage points)
Uses QoS graphs to determine the best path
Has built-in scheduling, optimization, and load shedding
Supports a distributed environment
20
Stream Management System
Developed at Stanford
Uses synopses and queues
21
Simple Query Plan
[Figure labels: Q1, Q2, two join operators (⋈), State1 to State4, Scheduler, Stream1 to Stream3]
Consists of queues, which connect producers and consumers
Synopses: tables kept at operators to store state
Has a scheduler
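The three pieces named on this slide (queues between producers and consumers, synopses holding operator state, and a scheduler) can be sketched with a single join operator. This is a simplified assumption for illustration and not the Stanford system's implementation: `JoinOperator`, `run_scheduler`, the round-robin policy, and the `(key, payload)` tuple layout are hypothetical.

```python
from collections import deque
from typing import Deque, Dict, List, Tuple

Item = Tuple[str, int]  # (join key, payload)


class JoinOperator:
    """Two input queues, one synopsis per input, one output queue."""

    def __init__(self) -> None:
        self.inputs: Tuple[Deque[Item], Deque[Item]] = (deque(), deque())
        # Synopsis: join key -> payloads seen so far on that input.
        self.synopses: Tuple[Dict[str, List[int]], Dict[str, List[int]]] = ({}, {})
        self.output: Deque[Tuple[str, Tuple[int, int]]] = deque()

    def step(self) -> bool:
        """Consume at most one queued tuple; return True if work was done."""
        for side in (0, 1):
            if self.inputs[side]:
                key, payload = self.inputs[side].popleft()
                self.synopses[side].setdefault(key, []).append(payload)
                # Probe the other input's synopsis for matching keys.
                for other in self.synopses[1 - side].get(key, []):
                    pair = (payload, other) if side == 0 else (other, payload)
                    self.output.append((key, pair))
                return True
        return False


def run_scheduler(operators: List[JoinOperator]) -> None:
    """Round-robin scheduler: step each operator until none has input left."""
    progressed = True
    while progressed:
        progressed = False
        for operator in operators:
            if operator.step():
                progressed = True


join = JoinOperator()
join.inputs[0].extend([("cam1", 1), ("cam2", 2)])  # tuples from a first stream
join.inputs[1].extend([("cam1", 10)])              # tuples from a second stream
run_scheduler([join])

print(list(join.output))  # [('cam1', (1, 10))]
```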
22
NiagaraCQ
Developed at Wisconsin
First DSMS
Uses a grouping strategy
Not as complete as the other two
23
System Architecture
24
TelegraphCQ
Developed at Berkeley
SteM: storage point
Eddy: routes tuples
Good at handling multiple queries
Adaptive
25
Adaptivity (Telegraph)
[Figure labels: input streams R, S, T; grouped filter (R.A); grouped filter (S.B); Eddy; SteMs for join R x S x T; output queues]
Runtime adaptivity
Multi-query optimization
Framework: implements arbitrary schemes
26
My Project
Design a multimedia streaming database
Outline the specifications
The scheduling algorithm
The query structure
The operators
Etc.