Distributed Query Processing over Streaming and Stored Data Alasdair J G Gray Information Management Group University of Manchester Dagstuhl Seminar –

Presentation transcript:

Distributed Query Processing over Streaming and Stored Data
Alasdair J G Gray
Information Management Group, University of Manchester
Dagstuhl Seminar – Semantic Challenges in Sensor Networks, 25 – 29 January 2010

Acknowledgements
RAs/PhDs
–Christian Y. A. Brenninkmeijer
–Ixent Galpin
–Alasdair J. G. Gray
–Farhana Jabeen
Academics
–Alvaro A. A. Fernandes
–Norman W. Paton
MSc Students
–Jamil Naja
–Varadarajan Rajagopalan
26 January 2010, Semantic Challenges in Sensor Networks

Overview of the Talk
Motivation
Data source characteristics
Query language: SNEEql
Query processor: SNEE-DQP

Motivating Scenario
Discover relevant data sources (see Manolis Koubarakis’s talk)
Unify data models (see Oscar Corcho’s talk)
Extract, combine, and process relevant data (this talk and Alvaro’s)
[Diagram labels: Stored data; Sensor Network]

Motivating Scenario
[Diagram labels: Stored data; Sensor Network (two instances); Stored data service; Streaming data service]

Data Source Characteristics
Traditional stored data
–Data stored in a database
–User observes a static data set
–One-off query execution
Streaming data
–Data processed on-the-fly (may also be stored for later access)
–User observes changes in data set
–Continuous or snapshot query execution

Types of Data Stream
Pull Stream
–[Diagram labels: Stream Processor; Source; GetData(); Data]
Push Stream
–[Diagram labels: Stream Processor; Source; Data]
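The difference between the two stream types on this slide can be sketched in a few lines of Python. This is only an illustration, not code from SNEE: the class names `PullSource` and `PushSource`, the `subscribe` callback mechanism, and the helper functions are all hypothetical. The one name taken from the slide is the `GetData()` call, rendered here as `get_data`.

```python
from typing import Callable, Iterator, List


class PullSource:
    """Hypothetical pull-stream source: it returns data only when asked."""

    def __init__(self, readings: List[float]) -> None:
        self._readings: Iterator[float] = iter(readings)

    def get_data(self) -> float:
        # Corresponds to the GetData() request on the slide:
        # the stream processor asks, the source answers with one reading.
        return next(self._readings)


class PushSource:
    """Hypothetical push-stream source: it decides when to send data."""

    def __init__(self, readings: List[float]) -> None:
        self._readings = list(readings)
        self._subscribers: List[Callable[[float], None]] = []

    def subscribe(self, callback: Callable[[float], None]) -> None:
        # The stream processor registers once and then waits for data.
        self._subscribers.append(callback)

    def run(self) -> None:
        # The source drives the flow, handing each reading to every subscriber.
        for reading in self._readings:
            for callback in self._subscribers:
                callback(reading)


def pull_n(source: PullSource, n: int) -> List[float]:
    """Stream processor for a pull stream: it controls the acquisition rate."""
    return [source.get_data() for _ in range(n)]


def collect_pushed(source: PushSource) -> List[float]:
    """Stream processor for a push stream: it receives whatever is sent."""
    received: List[float] = []
    source.subscribe(received.append)
    source.run()
    return received
```

In the pull case the processor chooses how many readings to request, which is what lets a query processor control acquisition; in the push case the source sets the pace.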

Query Processing Challenges
Variety of data sources
–Stored
–Push-stream
–Pull-stream
No common query semantics
–Streaming data languages
–Stored data languages
Distributed data sources

SNEE-DQP
[Diagram labels: SNEE-DQP; Stored data; Stored data service; Sensor Network (two instances); Streaming data service (two instances)]

Query Language: SNEEql
Aimed at in-WSN query processing
–Pull streams
–Reactive/periodic operators
–Controls network behaviour
Also capable of querying
–Push streams
–Stored sources
Well defined semantics
–Independent of system

SNEEql Query Syntax
    SELECT {RSTREAM | DSTREAM | ISTREAM} + attribute list
    FROM extent list
    WHERE expression
*STREAM optional
–Converts a window to a stream
Extent list:
–Streams with windows of the form [FROM t1 TO t2 SLIDE int unit]
–Relations with windows of the form [SCAN EVERY t1 unit]
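The stream window form [FROM t1 TO t2 SLIDE int unit] can be illustrated with a small Python sketch. It rests on assumptions not stated on the slide: t1 and t2 are treated as offsets back from the evaluation time, both bounds are inclusive, and timestamps are integer minutes. The function names `time_window` and `sliding_windows` are hypothetical and are not part of SNEEql.

```python
from typing import List, Tuple

Reading = Tuple[int, float]  # (timestamp in minutes, value); an assumed layout


def time_window(stream: List[Reading], now: int,
                from_offset: int, to_offset: int) -> List[Reading]:
    """Return the readings with now - from_offset <= timestamp <= now - to_offset.

    Assumption: both bounds are inclusive and expressed as offsets from `now`.
    """
    lo, hi = now - from_offset, now - to_offset
    return [(ts, value) for ts, value in stream if lo <= ts <= hi]


def sliding_windows(stream: List[Reading], start: int, end: int,
                    from_offset: int, to_offset: int,
                    slide: int) -> List[Tuple[int, List[Reading]]]:
    """Re-evaluate the window every `slide` minutes between start and end."""
    return [(now, time_window(stream, now, from_offset, to_offset))
            for now in range(start, end + 1, slide)]


# Made-up readings taken every 15 minutes.
stream = [(0, 1.0), (15, 2.0), (30, 3.0), (45, 4.0)]

# [FROM NOW-15 TO NOW-15 SLIDE 15]: the single reading taken 15 minutes earlier.
lagged = sliding_windows(stream, start=15, end=45,
                         from_offset=15, to_offset=15, slide=15)

# [NOW]: a window containing only the current reading.
current = time_window(stream, now=30, from_offset=0, to_offset=0)
```

Under these assumptions, a window with equal FROM and TO offsets selects exactly one instant, which is how a window such as [AT NOW-15 MINUTES] can pick out a lagged reading.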

Example Query
Every 15 minutes, and within 24 hours of their being taken, we wish to obtain time-correlated measurements of the river depth now and the rainfall at the top of the hill 15 minutes before, provided that it is now raining less at the river than it was at the hilltop, that the rainfall at the hilltop was above 5mm, and that it was at least the average rainfall.
    SELECT RSTREAM r.time, h.rain, r.depth
    FROM River[NOW] r, Hilltop[AT NOW-15 MINUTES] h
    WHERE h.rain > 5
      AND r.rain < h.rain
      AND h.rain >= (SELECT AVG(weather.rain)
                     FROM Weather [rescan every day]
                     WHERE weather.region = 'Peak District');
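The WHERE clause of the example query can be read as a predicate over one river reading, one lagged hilltop reading, and the average rainfall from the stored Weather relation. The Python sketch below evaluates that predicate for a single pair of readings. The record types, the function name `evaluate`, and the sample values are hypothetical and purely illustrative; the three conditions and the projected attributes are taken from the query.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class RiverReading:
    time: int     # minutes since some epoch (assumed unit)
    rain: float   # mm
    depth: float  # assumed to be metres


@dataclass
class HilltopReading:
    time: int
    rain: float


def evaluate(river_now: RiverReading,
             hilltop_before: HilltopReading,
             avg_weather_rain: float) -> Optional[Tuple[int, float, float]]:
    """Apply the example query's WHERE clause to one pair of readings.

    Returns (r.time, h.rain, r.depth) when all three conditions hold,
    otherwise None.
    """
    if (hilltop_before.rain > 5
            and river_now.rain < hilltop_before.rain
            and hilltop_before.rain >= avg_weather_rain):
        return (river_now.time, hilltop_before.rain, river_now.depth)
    return None


# Made-up readings: hilltop rain 15 minutes ago was 7.5 mm, average is 6.0 mm.
match = evaluate(RiverReading(30, 2.0, 1.4), HilltopReading(15, 7.5), 6.0)

# Hilltop rain of 4.0 mm fails the h.rain > 5 condition.
no_match = evaluate(RiverReading(30, 2.0, 1.4), HilltopReading(15, 4.0), 3.0)
```

In a continuous query this check would be repeated at every evaluation, with RSTREAM emitting the qualifying tuples each time.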

SNEE DQP Query Stack
Metadata
–Logical schema
–Physical schema
Source Allocation
–Splitting the query into parts for each data source
Source Planning
–Physical operator selection
–Generate plan for source
[Diagram labels: inputs Metadata and SNEEql query + QoS; stages Parsing, Logical Planning, Source Allocation, Source Planning; output Query Execution Plan]
More details on in-WSN planning in Alvaro’s talk
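The source allocation step, which splits a query into parts for each data source, might be sketched as a grouping of the query's extents by the source that the physical schema assigns them to. This is a deliberately simplified illustration, not the SNEE-DQP algorithm: the dictionary `PHYSICAL_SCHEMA`, the source names, and the function `allocate_sources` are hypothetical, and placing River and Hilltop in the same sensor network is an assumption.

```python
from collections import defaultdict
from typing import Dict, List

# Hypothetical physical schema: which source hosts each extent.
PHYSICAL_SCHEMA: Dict[str, str] = {
    "River": "sensor-network",
    "Hilltop": "sensor-network",
    "Weather": "stored-data-service",
}


def allocate_sources(extents: List[str],
                     physical_schema: Dict[str, str]) -> Dict[str, List[str]]:
    """Group the extents of a query by the data source that hosts them.

    Each group stands for the part of the query that would be planned
    for, and sent to, one source.
    """
    fragments: Dict[str, List[str]] = defaultdict(list)
    for extent in extents:
        if extent not in physical_schema:
            raise KeyError(f"no source registered for extent {extent!r}")
        fragments[physical_schema[extent]].append(extent)
    return dict(fragments)


# Extents referenced by the example query.
fragments = allocate_sources(["River", "Hilltop", "Weather"], PHYSICAL_SCHEMA)
```

A real allocator would split the operator tree rather than a flat list of extents, but the grouping shows why the physical schema is needed before source planning can start.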

Stream Data Query Processing
[Diagram with two panels: Acquisitional Stream Processing and Event Stream Processing. Labels: Sensor Network; In-Network SNEE; Event Stream SNEE; Data Service; Stream; WSDL; Stream Access Service]

Worked Example
    SELECT RSTREAM r.time, h.rain, r.depth
    FROM River[NOW] r, Hilltop[AT NOW-15 MINUTES] h
    WHERE h.rain > 5
      AND r.rain < h.rain
      AND h.rain >= (SELECT AVG(weather.rain)
                     FROM Weather [rescan every day]
                     WHERE weather.region = 'Peak District');
[Query plan diagram; operator labels:]
–DELIVER
–JOIN h.rain >= AVG(weather.rain)
–EXCHANGE (two instances)
–JOIN river.rain<hilltop.rain
–TIME_WINDOW [t-15, t-15, 15]
–ACQUIRE [time,rain] rain > 5 hilltop EVERY 15 min
–ACQUIRE [time,rain, depth] true river EVERY 15 min
–AVERAGE (rain)
–SCAN [rain] region = 'Peak District' weather EVERY HOUR

Conclusions
Query-based access to distributed data sources, both streaming and stored
SNEEql provides well defined, unified semantics for streaming and stored data
SNEE-DQP provides execution environment

Motivating Scenario
[Diagram labels: Stored data; Stored data service; Sensor Network (two instances); Streaming data service (two instances)]