Data Stream Management Systems


Data Streams
  - Traditional DBMS: data stored in finite, persistent data sets
  - New applications: data input as continuous, ordered data streams

Applications?
  - Network monitoring and traffic engineering
  - Healthcare monitoring
  - Network security
  - Financial applications
  - Sensor networks
  - Manufacturing processes
  - Web logs and click streams
  - Massive data sets

Sample Applications
  - Network security (e.g., iPolicy, NetForensics/Cisco, Niksun)
      - Streams: network packet streams, user session information
      - Queries: URL filtering, detecting intrusions, DoS attacks, and viruses
  - Financial applications (e.g., Traderbot)
      - Streams: trading data, stock tickers, news feeds
      - Queries: arbitrage opportunities, analytics, patterns
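A network-security query of the kind listed above (spotting a possible DoS source) can be sketched as a continuous query over a packet stream. This is only an illustration: the function name `flag_heavy_sources`, the `(timestamp, source)` tuple layout, and the window and threshold parameters are assumptions made for the example and are not taken from any of the products named on the slide.

```python
from collections import Counter, deque
from typing import Deque, Iterable, Iterator, Tuple

Packet = Tuple[float, str]  # (timestamp in seconds, source address); hypothetical layout


def flag_heavy_sources(
    packets: Iterable[Packet], window_seconds: float, threshold: int
) -> Iterator[Packet]:
    """Yield (timestamp, source) whenever a source exceeds `threshold`
    packets within the last `window_seconds`.

    Assumes packets arrive in nondecreasing timestamp order.
    """
    window: Deque[Packet] = deque()  # packets still inside the time window
    counts: Counter = Counter()      # per-source packet count within the window

    for ts, src in packets:
        window.append((ts, src))
        counts[src] += 1

        # Expire packets that have fallen out of the time window.
        while window and window[0][0] <= ts - window_seconds:
            _, old_src = window.popleft()
            counts[old_src] -= 1
            if counts[old_src] == 0:
                del counts[old_src]

        if counts[src] > threshold:
            yield ts, src


if __name__ == "__main__":
    stream = [(0.0, "a"), (0.1, "a"), (0.2, "b"), (0.3, "a"), (5.0, "a")]
    print(list(flag_heavy_sources(stream, window_seconds=1.0, threshold=2)))
    # [(0.3, 'a')]
```

State here grows with the number of packets inside the window, not with the length of the stream; a real system at high packet rates would likely need approximate counting instead.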

Data Stream Management System
  [Figure: a user/application registers a query with the Data Stream Management System (DSMS) and receives results. Inside the DSMS, a stream query processor works with scratch space (memory and/or disk).]
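The register-query / results loop in this architecture can be sketched in a few lines. This is a toy model under stated assumptions, not the design of any actual DSMS: the class name `MiniDSMS`, its methods, and the choice of a fixed-size buffer as the scratch space are all hypothetical.

```python
from collections import deque
from typing import Callable, Deque, Dict, List

Query = Callable[[Deque[dict]], object]  # a query sees only the scratch space


class MiniDSMS:
    """Toy stream engine: queries are registered once and re-run on every arrival."""

    def __init__(self, scratch_size: int) -> None:
        # Bounded scratch space: only the most recent `scratch_size` tuples are kept.
        self.scratch: Deque[dict] = deque(maxlen=scratch_size)
        self.queries: Dict[str, Query] = {}
        self.results: Dict[str, List[object]] = {}

    def register(self, name: str, query: Query) -> None:
        """Register a standing (continuous) query under `name`."""
        self.queries[name] = query
        self.results[name] = []

    def push(self, item: dict) -> None:
        """Accept one stream tuple and refresh the answer of every query."""
        self.scratch.append(item)
        for name, query in self.queries.items():
            self.results[name].append(query(self.scratch))


if __name__ == "__main__":
    dsms = MiniDSMS(scratch_size=3)
    dsms.register("avg_price", lambda w: sum(t["price"] for t in w) / len(w))
    for price in [10.0, 20.0, 30.0, 40.0]:
        dsms.push({"price": price})
    print(dsms.results["avg_price"])  # [10.0, 15.0, 20.0, 30.0]
```

The query is registered once and produces a new result per arrival, and the oldest tuple is silently dropped once the scratch space is full, which is why the last average covers only 20, 30, and 40.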

Meta-Questions
  - Killer apps
      - Application stream rates exceed DBMS capacity?
      - Can DSMS handle high rates anyway?
  - Motivation
      - Need for general-purpose DSMS?
      - Not ad hoc, application-specific systems?
  - Non-trivial
      - DSMS = merely DBMS with enhanced support for triggers, temporal constructs, data rate management?

DBMS versus DSMS

  DBMS                                    DSMS
  --------------------------------------  --------------------------------------
  Persistent relations                    Transient streams
  One-time queries                        Continuous queries
  Random access (pull)                    Sequential access (push)
  "Unbounded" disk store                  Bounded main memory
  Only current state matters              History/arrival order is critical
  Passive repository                      Active stores
  Relatively low update rate              Multi-GB arrival rates
  No real-time services                   Real-time requirements
  Assume precise data                     Data stale/imprecise
  Access plan determined by query         Unpredictable/variable data arrival
  processor, physical DB design           and characteristics
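The contrast between one-time queries over stored relations and continuous queries over streams in bounded memory can be made concrete with an average. A minimal sketch, with both function names invented for illustration:

```python
from typing import Iterable, Iterator, List


def one_time_average(relation: List[float]) -> float:
    """DBMS style: the whole finite relation is stored; the query runs once."""
    return sum(relation) / len(relation)


def continuous_average(stream: Iterable[float]) -> Iterator[float]:
    """DSMS style: a single sequential pass, constant state,
    and a fresh answer after every arrival."""
    count = 0
    total = 0.0
    for value in stream:
        count += 1
        total += value
        yield total / count


if __name__ == "__main__":
    readings = [4.0, 8.0, 6.0]
    print(one_time_average(readings))                # 6.0
    print(list(continuous_average(iter(readings))))  # [4.0, 6.0, 6.0]
```

The continuous version keeps only two numbers regardless of how long the stream runs, and it never revisits an earlier element.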