Moirae: History-Enhanced Monitoring

Magdalena Balazinska, YongChul Kwon, Nathan Kuchta, and Dennis Lee
University of Washington and Marchex Inc.

Monitoring Applications
- Continuously observe current state
- Produce near real-time information and alerts
- Event: (eid, timestamp, a1, ..., an)
- Examples
  - Sensor-based environment monitoring
  - Computer system monitoring
  - Network intrusion detection
[Figure: load level plotted against time]

Problem: Exploit History
- Monitoring applications accumulate history
- Exploiting history can improve monitoring applications
  - Refine event detection
  - Explain newly detected events
- What types of queries to support?
- How to support them?

Types of Queries
- Standard hybrid queries
  - A standard SQL query over the data archive executes as part of the continuous query
  - “For each network intrusion, show historical activity of the intruder on the network”
- Contextual hybrid queries
  - For each newly detected event, produce an approximate set of the k most similar past events
  - “If a server fails, show similar alerts that occurred in the past”

Standard Hybrid Query
- Query model based on Borealis
- Continuous stream processing (event query)
- Look up historical information (historical query)
[Figure: dataflow diagram; labels include Input Streams, Stream Proc. Operators, Recall, Other Stream Proc. Ops., Archive, and historical information]
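The standard hybrid query above can be illustrated with a small sketch. This is not Moirae's implementation: SQLite stands in for the archive, and the table name `archive`, its columns, and the `recall` function are hypothetical names chosen for illustration. For each event arriving from the event query, the sketch runs a plain SQL query over the archive and emits the event together with its history.

```python
import sqlite3
from typing import Iterable, Iterator


def build_archive() -> sqlite3.Connection:
    """Create a toy in-memory archive (SQLite is an assumed stand-in)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE archive (ts INTEGER, src_ip TEXT, action TEXT)")
    conn.executemany(
        "INSERT INTO archive VALUES (?, ?, ?)",
        [
            (1, "10.0.0.5", "login"),
            (2, "10.0.0.9", "login"),
            (3, "10.0.0.5", "port_scan"),
        ],
    )
    return conn


def recall(conn: sqlite3.Connection, events: Iterable[dict]) -> Iterator[tuple]:
    """For each detected event, run the historical query and emit the pair."""
    query = (
        "SELECT ts, action FROM archive "
        "WHERE src_ip = ? AND ts < ? ORDER BY ts DESC"  # newest history first
    )
    for event in events:
        history = conn.execute(query, (event["src_ip"], event["timestamp"])).fetchall()
        yield event, history


conn = build_archive()
# Hypothetical output of the event query: one detected intrusion.
intrusions = [{"eid": 1, "timestamp": 4, "src_ip": "10.0.0.5"}]
results = list(recall(conn, intrusions))
```

Here `results` pairs the intrusion with the intruder's earlier activity, which mirrors the example query "for each network intrusion, show historical activity of the intruder on the network".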

Contextual Hybrid Query
- Find similar past events
- Use TF-IDF for similarity computation
[Figure: dataflow diagram; labels include Input Stream, Event detection, Window Join, Context, Past contexts, Recall, Archive, and Similar past events]
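A minimal sketch of TF-IDF similarity between a new event's context and past contexts follows. It assumes each context has already been reduced to a list of tokens; the smoothed IDF formula, the cosine measure, and the function names are illustrative choices, not details taken from the slides. The sketch also ranks every past context exactly, whereas the slides describe an approximate result set.

```python
import math
from collections import Counter


def tfidf_vectors(docs: list) -> list:
    """Return one sparse TF-IDF vector (a dict) per tokenized document."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        total = len(doc) or 1  # guard against empty contexts
        vectors.append(
            # Smoothed IDF keeps every weight strictly positive.
            {t: (c / total) * (math.log((1 + n) / (1 + df[t])) + 1.0) for t, c in tf.items()}
        )
    return vectors


def cosine(a: dict, b: dict) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm = math.sqrt(sum(w * w for w in a.values())) * math.sqrt(sum(w * w for w in b.values()))
    return dot / norm if norm else 0.0


def top_k_similar(new_context: list, past_contexts: list, k: int) -> list:
    """Return indices of the k past contexts most similar to the new one."""
    vectors = tfidf_vectors(past_contexts + [new_context])
    new_vec, past_vecs = vectors[-1], vectors[:-1]
    scored = [(cosine(new_vec, vec), i) for i, vec in enumerate(past_vecs)]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))  # best score first, stable on ties
    return [i for _, i in scored[:k]]


# Hypothetical contexts: tokens extracted from the window around each event.
past = [
    ["disk", "full", "db01"],
    ["cpu", "high", "web02"],
    ["disk", "latency", "db01"],
]
top = top_k_similar(["disk", "full", "db01"], past, k=2)
```

In this toy data the first past context matches the new one exactly and the third shares two of its three tokens, so those two are returned ahead of the unrelated CPU alert.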

Framework
- Three key issues: history size, near real-time, concurrent events
- Three goals
  - Responsiveness and fairness
    - Incremental processing integrated with stream processing
    - Retrieve at least some historical data for all new events
  - Relevance: favor recent history over older history
  - Similarity: find similar past events
    - Exploit context similarity in other ways as well

Moirae’s Design
- Contextual & standard hybrid queries
- Approximate & incrementally improving results
- Stream processor (SPE) based on Borealis
- Storage manager (RDBMS) based on PostgreSQL
[Figure: architecture diagram; labels include Application, Query, Stop Improving, MOIRAE, Deploy Manager, Recall Manager, Archiver, Raw Streams, Present Chunk, Raw Stream Archive, Materialized Events & Context, and Other Materialized Views]

Design Components
- Archiver: partitioned stream archive
  - Archive raw and intermediate streams
  - Present chunks in memory
  - Recent chunks on disk
  - Materialize & index necessary streams and contexts
  - Old chunks on disk
- Recall Manager: partitioned, incremental queries
  - Execute queries one chunk at a time (present -> past)
  - Schedule concurrent queries to ensure fairness
  - Incorporate user feedback to drop events
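The Recall Manager behavior listed above can be sketched roughly as follows. The round-robin policy, the in-memory list of chunks, and the names `chunked_query` and `run_round_robin` are assumptions made for illustration; the slides only say that queries run one chunk at a time from present to past, are scheduled fairly, and can be dropped on user feedback.

```python
from collections import deque
from typing import Callable, Iterator


def chunked_query(event_id: str, chunks: list, predicate: Callable) -> Iterator[tuple]:
    """Scan chunks newest-first, yielding a partial result after each chunk."""
    for chunk in chunks:  # chunks are assumed to be ordered present -> past
        yield event_id, [row for row in chunk if predicate(row)]


def run_round_robin(queries: dict, dropped: set) -> Iterator[tuple]:
    """Advance each active query by one chunk per turn so no event starves."""
    active = deque(queries.items())
    while active:
        event_id, query = active.popleft()
        if event_id in dropped:  # user feedback: stop improving this result
            continue
        try:
            yield next(query)
        except StopIteration:  # this query has reached the oldest chunk
            continue
        active.append((event_id, query))


# Hypothetical archive partitions, newest chunk first.
chunks = [
    [{"ts": 9, "host": "a"}, {"ts": 8, "host": "b"}],  # present chunk
    [{"ts": 5, "host": "a"}],                          # recent chunk
    [{"ts": 1, "host": "b"}],                          # old chunk
]
queries = {
    "e1": chunked_query("e1", chunks, lambda r: r["host"] == "a"),
    "e2": chunked_query("e2", chunks, lambda r: r["host"] == "b"),
}
dropped = set()
output = []
for event_id, rows in run_round_robin(queries, dropped):
    output.append((event_id, [r["ts"] for r in rows]))
    if event_id == "e2":
        dropped.add("e2")  # simulate the user dismissing event e2
```

Both events receive a first result from the present chunk before either query touches older data, and the query for `e2` stops as soon as it is dropped.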

Related Work
- Queries over live data & data archives [chandrasekaran:04, chandrasekaran:05, franklin:05]
- Log-structured access method [muth:00]
- Multi-level storage manager [stonebraker:91]
- Materialized views (e.g., [goldstein:01])
- Partial indexes [stonebraker:89, sartori:94, seshadri:95]
- Online processing [hellerstein:97, hellerstein:00, raman:02, shanmugasundaram:01, tan:99]
- Top-K, kNN, IR+RDBMS (e.g., [carey:97, fagin:03, chaudhuri:05, li:05])

Conclusion
- Monitoring applications accumulate history
- How to leverage history?
  - By supporting queries for specific historical data
  - Through new types of queries
- How to support all these different queries?
  - Can reuse several database/IR techniques
  - Need to integrate and extend these techniques
- More information about Moirae: http://data.cs.washington.edu/moirae/moirae.shtml