A new model and architecture for data stream management.

Why on earth would one need it? Data Stream Management

The Problem: Tokyo Traffic Control

Stream Processing for Traffic Control
• 24-hour real-time control
  • traffic intersections
  • traffic signals
• Input
  • Cameras
  • Helicopters
  • Police
  • Citizen reports
  • vehicle detectors
  • Onboard vehicle sensors
  • Traffic jams, accidents & closed streets
• Output
  • Central monitors
  • 300 traffic information boards
  • Digital speed signs
  • Route signs
• Affectors
  • Adjusted traffic signal lights (7.000)
  • Communications with officers on site

TTC: Center Display Board

TTC: Information Board

Example Domains
• Smart Energy Grid Management
• Network Traffic Management
• System Monitoring
• Road Traffic Monitoring
• Military Logistics
• Online Auctions
• Habitat Monitoring
• Immersive Environments

Stream Processing Engines
• HADP vs DAHP
• Events & Triggers
• Continuous Queries
• Real-time processing
• Transient data
• Lossy information

Overview Aurora

The Topic
• Aurora
• The prototype
• DBMS / SPE / DSMS
• UI
• The query language
• The project
• The authors

The Authors
• M.I.T., Department of EECS and Laboratory of Computer Science
  • Michael Stonebraker
• Brandeis University, Department of Computer Science
  • Daniel J. Abadi
  • Mitch Cherniack
• Brown University, Department of Computer Science
  • Don Carney
  • Uğur Çetintemel
  • Christian Convey
  • Sangdon Lee
  • Nesime Tatbul
  • Stan Zdonik

Talk Overview
• Stream Processing Engines
• SQuAl
• Runtime
• Related work

SQuAl (Stream Query Algebra) Aurora

SQuAl Overview
• Connection Points
• Models
  • Continuous Query
  • View
  • Ad-hoc Query
• Operators
  • Order-agnostic
  • Order-sensitive

SQuAl Operators
• Order-agnostic
  • Filter
  • Map
  • Union
• Order-sensitive
  • BSort
  • Aggregate
  • Join
  • Resample
• Quirks!
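The three order-agnostic operators can be pictured as simple stream transformers. The following is only a minimal Python sketch under assumptions: the names `filter_box`, `map_box` and `union_box` are hypothetical, tuples are modelled as plain dicts, and Union is shown as a round-robin merge because the slide only says its output is unordered. It is not Aurora's implementation.

```python
from typing import Callable, Dict, Iterable, Iterator

Row = Dict[str, object]  # hypothetical tuple representation


def filter_box(stream: Iterable[Row], predicate: Callable[[Row], bool]) -> Iterator[Row]:
    """Pass through only the tuples that satisfy the predicate."""
    for row in stream:
        if predicate(row):
            yield row


def map_box(stream: Iterable[Row], fn: Callable[[Row], Row]) -> Iterator[Row]:
    """Apply a function to every tuple, one at a time."""
    for row in stream:
        yield fn(row)


def union_box(*streams: Iterable) -> Iterator:
    """Merge several streams; round-robin is one arbitrary interleaving,
    since no output order is promised."""
    iterators = [iter(s) for s in streams]
    while iterators:
        for it in list(iterators):
            try:
                yield next(it)
            except StopIteration:
                iterators.remove(it)


readings = [{"road": "A", "speed": 20}, {"road": "B", "speed": 80}]
slow = filter_box(readings, lambda r: r["speed"] < 40)
alerts = list(map_box(slow, lambda r: {"road": r["road"], "alert": "jam"}))
```

Because none of the three looks at more than one tuple at a time, they need no assumption about arrival order, which is what makes them order-agnostic.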

Union (Unordered)

BSort (Ordered)
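BSort is usually described as an approximate, bounded-buffer sort: hold a small buffer and emit the smallest buffered tuple each time the buffer fills. A minimal sketch of that idea follows, assuming a hypothetical `slack` parameter for the buffer bound and a flush at end of input (a real stream never ends, so the flush is purely an assumption for finite test data).

```python
import heapq
from typing import Callable, Iterable, Iterator, TypeVar

T = TypeVar("T")


def bsort(stream: Iterable[T], key: Callable[[T], float], slack: int) -> Iterator[T]:
    """Approximate sort holding at most slack + 1 tuples in memory."""
    buffer: list = []  # min-heap of (key, arrival number, item)
    for seq, item in enumerate(stream):
        heapq.heappush(buffer, (key(item), seq, item))
        if len(buffer) == slack + 1:
            # Buffer is full: evict and emit the smallest buffered tuple.
            yield heapq.heappop(buffer)[2]
    while buffer:  # assumption: drain the buffer when a finite input ends
        yield heapq.heappop(buffer)[2]


nearly_sorted = list(bsort([2, 1, 4, 3, 6, 5], key=lambda x: x, slack=1))
still_disordered = list(bsort([3, 2, 1], key=lambda x: x, slack=1))
```

With `slack=1` the first input comes out fully sorted, while `[3, 2, 1]` comes out as `[2, 1, 3]`: a tuple that arrives more than `slack` positions late cannot be put back in order, which is the price of bounded memory.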

SQuAl: Example

Runtime Aurora

Query Optimization
• Dynamic Continuous Query Optimization
  • Inserting projections
  • Combining boxes
  • Reordering boxes
• Ad-hoc query optimization

Real-time Scheduling
• Timestamped Tuples
• Train scheduling
• Interbox nonlinearities
• Intrabox nonlinearities
• Superboxes
• Introspection
  • Static
  • Run-time

Handling overload
• QoS specifications
  • Response times
  • Tuple drops
  • Values produced
• Load Shedding
  • Not implemented at the time
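To make the load-shedding idea concrete, here is a minimal sketch of the simplest possible policy: random tuple drops at a rate chosen so the surviving load fits the available capacity. The function names, the rate-based capacity model and the uniform random drop are all assumptions for illustration; the slide notes that load shedding was not implemented at the time, and a QoS-driven shedder would choose which tuples to drop far more carefully.

```python
import random
from typing import Iterable, Iterator, Optional, TypeVar

T = TypeVar("T")


def required_drop_fraction(arrival_rate: float, capacity: float) -> float:
    """Fraction of tuples to drop so the remaining rate fits the capacity."""
    if arrival_rate <= capacity:
        return 0.0
    return 1.0 - capacity / arrival_rate


def shed(stream: Iterable[T], drop_fraction: float,
         rng: Optional[random.Random] = None) -> Iterator[T]:
    """Drop each tuple independently with probability drop_fraction."""
    rng = rng or random.Random()
    for item in stream:
        if rng.random() >= drop_fraction:
            yield item


# Hypothetical numbers: 200 tuples/s arriving, 100 tuples/s can be processed.
fraction = required_drop_fraction(arrival_rate=200.0, capacity=100.0)
survivors = list(shed(range(10_000), fraction, random.Random(0)))
```

Roughly half of the input survives in this example. Which half is lost is left to chance here, which is exactly what QoS specifications on tuple drops and produced values are meant to improve on.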

Related work Aurora

Related work
• STREAM: Stanford University,
• Telegraph: UC Berkeley, ?
• SASE: UC Berkeley / UMass Amherst, ?
• Cayuga: Cornell University, ?
• PIPES: University of Marburg, ?
• NiagaraCQ: University of Wisconsin-Madison,

Aurora’s Evolution
Timespan / Project
• Aurora (and Aurora*)
• Medusa
• Borealis (Medusa + Aurora*)
• 2003-present: StreamBase (Commercialized)

Complex Event Processing Today
• Oracle
  • Oracle CEP
• Microsoft
  • MS SQL Server StreamInsight
• Open Source
  • OpenPDC
• Aleri
• Coral8
• TruViso
• StreamBase
  • Aurora’s Grandchild
• IBM
  • SPADE
  • Active Middleware Technology

Summary
• SPEs address different problems
  • e.g. dynamic realtime monitoring
  • Data Active, Human Passive
  • Realtime, transient, even lossy data
• Aurora evolved into StreamBase
• SQuAl evolved into StreamSQL
• Many production-quality alternatives

Filter (Unordered)

Map (Unordered)

Aggregate (Ordered)
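A windowed aggregate over an ordered stream can be sketched as follows. This is a simplified, count-based illustration: the `size` and `advance` parameters are hypothetical names, and Aurora's own Aggregate defines its windows in terms of an ordering attribute rather than a tuple count.

```python
from collections import deque
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def aggregate(stream: Iterable[T], fn: Callable[[List[T]], R],
              size: int, advance: int) -> Iterator[R]:
    """Apply fn to each window of `size` tuples, sliding by `advance` tuples."""
    window: deque = deque(maxlen=size)  # oldest tuples fall out automatically
    for seen, item in enumerate(stream, start=1):
        window.append(item)
        # Emit once the first window is full, then every `advance` tuples.
        if seen >= size and (seen - size) % advance == 0:
            yield fn(list(window))


sliding_sums = list(aggregate([1, 2, 3, 4, 5, 6], sum, size=3, advance=1))
tumbling_sums = list(aggregate([1, 2, 3, 4, 5, 6], sum, size=3, advance=3))
```

The operator is order-sensitive because the content of each window depends on arrival order: feeding the same tuples in a different order changes the results.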

Join (Ordered)

Resample (Ordered)
• Based on RRDTool’s philosophy?
• Paper:
  • Simple interpolation
• Use The Force, Read The Source:
  • Average
  • Count
  • Sum
  • Max
  • Min
  • LastVal
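The "simple interpolation" reading can be illustrated with a short sketch that re-emits a timestamped stream at a fixed interval, interpolating linearly between neighbouring samples. This is an assumption-laden simplification, not Aurora's Resample: the function name, the `(time, value)` tuple shape and the fixed `interval` are hypothetical, and timestamps are assumed to be strictly increasing.

```python
from typing import Iterable, Iterator, Tuple

Sample = Tuple[float, float]  # (timestamp, value), hypothetical shape


def resample_linear(points: Iterable[Sample], interval: float) -> Iterator[Sample]:
    """Emit one interpolated sample every `interval` time units."""
    it = iter(points)
    try:
        t0, v0 = next(it)
    except StopIteration:
        return
    next_t = t0  # first output coincides with the first input sample
    for t1, v1 in it:
        while next_t <= t1:
            frac = (next_t - t0) / (t1 - t0)
            yield (next_t, v0 + (v1 - v0) * frac)
            next_t += interval
        t0, v0 = t1, v1


even = list(resample_linear([(0, 0.0), (4, 8.0), (6, 4.0)], interval=2))
```

The aggregation functions listed from the source (Average, Count, Sum, Max, Min, LastVal) would replace the interpolation step with a summary over the input samples falling inside each output interval.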