Event-Condition-Action Rule Languages over Semistructured Data George Papamarkos.



13/10/2006

Outline
What Event-Condition-Action (ECA) rules are and what we can do with them
ECA rules for XML: ECA language, system architecture, performance
ECA rules for RDF: ECA language, system architecture, performance

What is an ECA Rule?
An Event-Condition-Action rule performs actions in response to events, provided that a stated condition holds.
An event in a database system can be, for example, the insertion of a new tuple.
The condition can be a query.
The action may be an update to a relational table.
This behaviour is called reactive functionality.

What is an ECA Rule?
An ECA rule has the general syntax: on event if condition do action
The event part specifies when the rule is triggered.
The condition part determines whether the data are in a particular state, in which case the rule fires.
The action part describes the actions to be performed if the rule fires.
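
The on/if/do structure can be illustrated with a minimal Python sketch. The names `ECARule` and `process_event`, and the stock-level data, are hypothetical and chosen only for illustration; they are not part of any system described in the talk.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class ECARule:
    event: str                                   # event kind that triggers the rule
    condition: Callable[[Dict[str, Any]], bool]  # query over the current data state
    action: Callable[[Dict[str, Any]], None]     # update performed when the rule fires


def process_event(event: str, db: Dict[str, Any], rules: List[ECARule]) -> int:
    """Run every rule triggered by `event`; return how many rules fired."""
    fired = 0
    for rule in rules:
        if rule.event != event:   # "on event": the rule is not triggered
            continue
        if rule.condition(db):    # "if condition": the data are in the required state
            rule.action(db)       # "do action"
            fired += 1
    return fired


# Hypothetical data and rule: flag a reorder when a deletion leaves stock low.
stock = {"items": 3, "reorder": False}
low_stock = ECARule(
    event="delete",
    condition=lambda db: db["items"] < 5,
    action=lambda db: db.update(reorder=True),
)
```

Calling `process_event("delete", stock, [low_stock])` fires the rule, while an `"insert"` event leaves the data unchanged because the rule is never triggered.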

Advantages of using ECA Rules
They allow an application's reactive functionality to be defined and managed within a single rule base rather than being encoded in application programs.
They use a high-level declarative syntax and are thus amenable to analysis and optimisation techniques that cannot be applied if the functionality is encoded directly in programming code.

Outline
What Event-Condition-Action (ECA) rules are and what we can do with them
ECA rules for XML: ECA language, system architecture, performance
ECA rules for RDF: ECA language, system architecture, performance

ECA Rules for XML - Outline
Design issues of an ECA language for XML
The XTL language
Implementing an XTL rule processing system
Performance study

Design issues of an ECA language for XML
Compared with relational triggers, the most important XML-specific issues in designing an ECA language for XML are the following.
Event granularity: specifying the granularity at which data has been modified is more complex and requires path expressions.
Action granularity: an action may affect an entire sub-document, meaning that:
  an action can trigger a different set of events;
  the analysis of which events are triggered by an action cannot be based on syntax alone.

The XTL Language
The general syntax of XTL rules is: on event if condition do action
Fragments of XPath and XQuery are used to specify the event, condition and action parts of XTL rules.
XPath is used for selecting and matching fragments of XML.
XQuery is used within actions where a new XML fragment needs to be constructed.

The XTL Language
Event part
Syntax: (INSERT | DELETE) e
where e is an XPath expression evaluating to a set of nodes.
A rule is triggered if this set of nodes includes any node in the XML fragment inserted or deleted.
The system-defined variable $delta contains this set of nodes and is available for use in the condition and action parts of the rule.
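
The triggering test can be sketched in Python with the standard `xml.etree.ElementTree` module. This is only an illustration under assumptions: the helper `delta_for_insert` is hypothetical, the sample document is invented, and `$delta` is taken to be the nodes selected by e that lie inside the inserted fragment.

```python
import xml.etree.ElementTree as ET


def delta_for_insert(root, event_path, inserted):
    """Return the nodes selected by event_path that lie inside the inserted fragment."""
    inserted_ids = {id(node) for node in inserted.iter()}
    return [node for node in root.findall(event_path) if id(node) in inserted_ids]


# Invented sample document, loosely modelled on the shares example.
root = ET.fromstring(
    "<shares><share><day-info><prices><price>10</price></prices></day-info></share></shares>"
)
prices = root.find("share/day-info/prices")
new_price = ET.SubElement(prices, "price")  # the update: insert a new price node
new_price.text = "12"

# A rule on .../prices/price is triggered; a rule on .../high is not.
delta = delta_for_insert(root, "share/day-info/prices/price", new_price)
unrelated = delta_for_insert(root, "share/day-info/high", new_price)
```

A rule is triggered exactly when the returned list is non-empty, so here the first rule is triggered with the new price node as its delta and the second is not.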

The XTL Language
Condition part
The condition part is either the constant TRUE or one or more XPath expressions connected by the boolean connectives and, or, not.
Each of these expressions is evaluated on the data to determine whether the condition is TRUE or FALSE.

The XTL Language
Action part
The action part is a sequence of one or more actions.
Syntax: INSERT r BELOW e (BEFORE | AFTER) q
  r is an XQuery expression specifying the XML fragment to be inserted,
  e is an XPath expression specifying the set of nodes under which the new fragment will be inserted,
  q is either a constant or an XPath qualifier specifying the set of nodes BEFORE or AFTER which the new nodes will be placed.
Syntax: DELETE e
  e is an XPath expression specifying the set of nodes to be deleted.

XTL Language
Example rule:
ON INSERT doc('s.xml')/shares/share/day-info/prices/price
IF $delta > $delta/../../high
DO DELETE $delta/../high;
   INSERT $delta/text() BELOW $delta/../.. AFTER prices
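
The effect of this rule can be approximated in Python. The sketch rests on several assumptions that the slide does not confirm: that `high` is a child of `day-info` (as the condition path suggests), that the inserted text is wrapped in a new `high` element, and that prices are plain numbers. The function `on_price_insert` and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET


def on_price_insert(day_info, delta):
    """Apply the example rule to one inserted <price> node (delta) under day_info."""
    high = day_info.find("high")
    # IF $delta > .../high
    if high is not None and float(delta.text) > float(high.text):
        # DO DELETE the old high ...
        day_info.remove(high)
        # ... and INSERT the new value BELOW day-info AFTER prices
        new_high = ET.Element("high")
        new_high.text = delta.text
        position = list(day_info).index(day_info.find("prices")) + 1
        day_info.insert(position, new_high)
        return True
    return False


# Invented sample document in the assumed shape.
doc = ET.fromstring(
    "<shares><share><day-info>"
    "<prices><price>10</price></prices><high>10</high>"
    "</day-info></share></shares>"
)
day_info = doc.find("share/day-info")
price = ET.SubElement(day_info.find("prices"), "price")  # the triggering insertion
price.text = "12"
fired = on_price_insert(day_info, price)
```

After the call the recorded high is replaced by the newly inserted price; inserting a lower price afterwards would leave the document unchanged.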

XTL rule processing system

XTL rule processing system - Architecture
ECA Rules Management: validates a rule and registers it in the Rule Base.
ECA Rule Processing Engine: evaluates the event and condition parts of the rules and schedules their actions for execution in the Action Schedule.

System Performance
The system performance was studied by:
  developing an analytical model of the system;
  performing experiments on the actual system.
We have studied the effects of rule base indexes on system performance.
Performance criterion: update response time, the mean time taken to complete all rule execution resulting from a single update submitted by a top-level update transaction.

System Performance
Varying quantity: number of rules in the rule base.
Experiments on the actual system were performed with three different rule sets.
XML data set: a fragment of the DBLP database.

System Performance - Analytical Model
The analytical model is a mathematical description of the system behaviour.
It uses queueing theory to model the transaction queues and database processing.
It uses a set of simplifying assumptions to approximate the behaviour of some system parameters (e.g. triggering probability, transaction arrival rate).

System Performance - Analytical Model Results

System Performance - Analytical Model
Response time increases non-linearly for as long as the system is stable (i.e. the arrival rate in the transaction queue is less than the service rate).
Beyond the stability point the transaction queue grows without bound, filling the memory and slowing the system down.
Reasons:
  everything is served by a single queue;
  a high number of event query evaluations is needed to find which rules are triggered.
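
The slides do not say which queueing model was used. Purely as an illustration of the non-linear growth and the stability point, the sketch below assumes the simplest textbook case, a single M/M/1 queue, whose mean response time is 1 / (service rate - arrival rate). The function name and the rates are hypothetical.

```python
def mean_response_time(arrival_rate: float, service_rate: float) -> float:
    """Mean time in an M/M/1 system, 1 / (mu - lambda); infinite once unstable."""
    if arrival_rate >= service_rate:
        return float("inf")  # past the stability point the queue grows without bound
    return 1.0 / (service_rate - arrival_rate)


# Assumed rates in transactions per second, chosen only for illustration.
service_rate = 10.0
times = [mean_response_time(lam, service_rate) for lam in (2.0, 5.0, 8.0, 9.5)]
```

Under this assumption the response time rises slowly at low load and steeply as the arrival rate approaches the service rate, which is the qualitative shape described above.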

System Performance - Experimental Results

System Performance - Experimental Results
The differences from the analytical model are due to implementation choices (use of DOM etc.) and the simplifying assumptions made in the analytical model.

System Performance

System Performance - Indexing Rule Base

System Performance - Indexing Rule Base
Better overall behaviour and scalability, because fewer rules need to be checked for triggering.
Fewer rules checked means fewer queries to evaluate.
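
The slides do not describe how the rule base index is built. One possible scheme, shown here only as a hypothetical sketch, keys each rule by its event kind and the last step of its event path, so that an update is checked only against rules that could match the element names it touches. Wildcards and predicates in paths are ignored in this simplification.

```python
from collections import defaultdict


class RuleIndex:
    """Hypothetical index of rules by (event kind, last step of the event path)."""

    def __init__(self):
        self._by_key = defaultdict(list)
        self.size = 0

    def add(self, name, kind, event_path):
        last_step = event_path.rstrip("/").split("/")[-1]
        self._by_key[(kind, last_step)].append(name)
        self.size += 1

    def candidates(self, kind, fragment_tags):
        """Rules that could be triggered by an update touching these element names."""
        found = []
        for tag in fragment_tags:
            found.extend(self._by_key.get((kind, tag), []))
        return found


# Invented rule names and paths.
index = RuleIndex()
index.add("r1", "INSERT", "/shares/share/day-info/prices/price")
index.add("r2", "DELETE", "/shares/share/day-info/high")
index.add("r3", "INSERT", "/dblp/article/author")
to_check = index.candidates("INSERT", ["price"])
```

With three rules registered, an inserted `price` element leads to only one event query being evaluated instead of three.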

Outline
What Event-Condition-Action (ECA) rules are and what we can do with them
ECA rules for XML: ECA language, system architecture, performance
ECA rules for RDF: ECA language, system architecture, performance

ECA Rules for RDF
The RDFTL ECA language
Implementing an RDFTL processing system in P2P environments
System performance

The RDFTL Language
We have designed the language from scratch specifically for RDF.
General syntax: ON event IF condition DO action

The RDFTL Language
Event part
May contain let expressions of the form: LET $var := e
(INSERT | DELETE) e
  e is a path expression that evaluates to a set of RDF nodes. Catches the insertion or deletion of a node.
(INSERT | DELETE) triple
  triple is an expression of the form (source, arc, target) specifying an RDF triple. Catches the insertion or deletion of a property in an RDF triple.
UPDATE upd_triple
  upd_triple is an expression of the form (source, arc, old_target->new_target). Catches the update of a property from one RDF node to another.
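
Matching a triple event against an update can be sketched as follows. This is a simplification under stated assumptions: patterns are plain tuples with `None` as a wildcard rather than RDFTL path expressions, and the names `matches` and `triggered` as well as the sample URIs are hypothetical.

```python
from typing import Optional, Tuple

Triple = Tuple[str, str, str]
Pattern = Tuple[Optional[str], Optional[str], Optional[str]]


def matches(pattern: Pattern, triple: Triple) -> bool:
    """True if every non-wildcard position of the pattern equals the triple's value."""
    return all(p is None or p == t for p, t in zip(pattern, triple))


def triggered(event_kind: str, pattern: Pattern, update_kind: str, triple: Triple) -> bool:
    """A triple event is triggered by an update of the same kind on a matching triple."""
    return event_kind == update_kind and matches(pattern, triple)


# Invented rule event and update.
rule_event = ("INSERT", (None, "dc:creator", None))
update = ("INSERT", ("ex:paper1", "dc:creator", "ex:alice"))
hit = triggered(rule_event[0], rule_event[1], update[0], update[1])
```

Here the rule event is triggered because the update is an insertion whose arc matches the pattern; a DELETE event with the same pattern would not be.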

The RDFTL Language
Condition part
It is a boolean-valued expression.
It may consist of conjunctions, disjunctions and negations.
It may also contain let expressions.
The $delta variable is bound to the set of nodes or arcs modified and caught by the event part.
Action part
A sequence of actions.
Each action has a syntax similar to that of the event part.

RDFTL Rules in P2P Environments - System Architecture

RDFTL Rules in P2P Environments
Each peer (P) is supervised by a superpeer (SP).
The set of Ps supervised by an SP forms a peergroup.
An RDFTL processing engine is installed at each SP.
Each P or SP hosts a fragment of the RDF schema, which may change due to updates.
Hybrid fragmentation with possible replication.

RDFTL Rules in P2P Environments
Ps notify their SPs of any updates to their local data.
An ECA rule generated at one P or SP may be replicated, triggered, evaluated or executed at different sites in the network.

Distributed Rule Registration
A newly generated rule is sent from P to its SP for validation and storage.
From there it is sent to all other SPs.
A replica of it is also stored at those SPs that are e-relevant to the rule, i.e. those on which the event part queries of the rule can be evaluated.
At each SP, each rule is annotated with the IDs of the local peers that are e-, c- and a-relevant to the rule.
c- and a-relevance have a meaning similar to e-relevance, for the condition and action parts respectively.
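
The annotation step can be sketched in Python. The relevance test used here is a deliberate simplification and an assumption: a peer counts as relevant to a rule part if it hosts at least one property that the part mentions. The function `annotate_rule`, the peer IDs and the property names are hypothetical.

```python
def annotate_rule(rule_parts, peer_schemas):
    """Map each rule part ('e', 'c', 'a') to the sorted IDs of peers relevant to it."""
    annotations = {}
    for part, properties in rule_parts.items():
        annotations[part] = sorted(
            peer_id
            for peer_id, hosted in peer_schemas.items()
            if properties & hosted  # the peer hosts something this part refers to
        )
    return annotations


# Properties mentioned by the event, condition and action parts of one rule.
rule_parts = {"e": {"dc:creator"}, "c": {"dc:title"}, "a": {"ex:notified"}}
# Schema fragment hosted by each local peer of the superpeer.
peer_schemas = {
    "P1": {"dc:creator", "dc:title"},
    "P2": {"ex:notified"},
    "P3": {"dc:title"},
}
annotations = annotate_rule(rule_parts, peer_schemas)
```

In this example only P1 is e-relevant, P1 and P3 are c-relevant, and P2 is a-relevant.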

Distributed Rule Execution
Each SP manages its own rule execution schedule.
Each execution schedule is a sequence of updates to be executed on the local peergroup.
Once an update u occurs at P, its SP is notified.
SP determines whether u may trigger any rule r whose event part is annotated with P's ID.
If so, the event query is sent to P for evaluation.
If the rule is triggered, its condition is evaluated.
If the condition is true, SP sends each instance of r's action part to the local peers that are a-relevant to it.
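
The superpeer's handling of one update can be sketched as a single function. This is a hedged, single-process illustration: the rule representation as a dictionary, the function `handle_update` and the sample data are all hypothetical, and the event query is evaluated locally here although in the described system it is sent to P.

```python
def handle_update(update, peer_id, rules, schedule):
    """Superpeer-side handling of an update reported by a local peer."""
    for rule in rules:
        if peer_id not in rule["e_peers"]:
            continue  # P is not e-relevant: the rule cannot be triggered by P
        if not rule["event"](update):  # event query (sent to P in the real system)
            continue
        if not rule["condition"](update):
            continue
        for target in rule["a_peers"]:  # one action instance per a-relevant local peer
            schedule.append((target, rule["action"](update)))
    return schedule


# Invented rule: when a creator is inserted at P1, record a notification at P2 and P3.
rules = [{
    "e_peers": {"P1"},
    "a_peers": ["P2", "P3"],
    "event": lambda u: u["kind"] == "INSERT" and u["arc"] == "dc:creator",
    "condition": lambda u: u["target"] != "",
    "action": lambda u: ("INSERT", u["target"], "ex:notified", "true"),
}]
update = {"kind": "INSERT", "source": "ex:paper1", "arc": "dc:creator", "target": "ex:alice"}
schedule = handle_update(update, "P1", rules, [])
```

The resulting schedule holds one pending update for each a-relevant peer; the same update reported by a peer that is not e-relevant produces an empty schedule.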

System Performance
The system performance was studied by:
  developing an analytical model of the system;
  developing a system simulator and performing experiments with it.
Performance criterion: update response time, the mean time taken to complete all rule execution resulting from a single update submitted by a top-level update transaction.

System Performance
Cases studied with both the analytical model and the simulator:
  random network topology between SPs, with various degrees of data replication;
  HyperCup network topology between SPs, with various degrees of data replication.
Varying quantities:
  number of peergroups;
  number of rules.

System Performance
Random topology - Replication 10%
[Figure panels: Analytical Model, Simulation]

System Performance
With a random topology the system does not scale well, even with low replication and small numbers of rules and peergroups.
Update response time grows exponentially.
The system becomes unusable due to high load.

System Performance
HyperCup organises the SPs into hypercubes.
The HyperCup topology guarantees that:
  each peer receives a message only once;
  a total of N-1 messages is needed to broadcast a message to N peers;
  the most distant peers are reached after log2 N hops.
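
These guarantees can be checked on a small example. The sketch below assumes a complete binary hypercube with N = 2^d peers and a forwarding scheme in which a peer that received the message along dimension i forwards it only along higher dimensions. It is an illustration, not the HyperCup implementation, and the function name is hypothetical.

```python
def hypercube_broadcast(dimensions: int, origin: int = 0):
    """Broadcast in a complete hypercube; return hop count per node and messages sent."""
    hops = {origin: 0}
    messages = 0
    frontier = [(origin, 0)]  # (node, lowest dimension it may still forward on)
    while frontier:
        node, first_dim = frontier.pop()
        for dim in range(first_dim, dimensions):
            neighbour = node ^ (1 << dim)  # flipping one bit follows one hypercube edge
            hops[neighbour] = hops[node] + 1
            messages += 1
            frontier.append((neighbour, dim + 1))
    return hops, messages


# N = 2^3 = 8 superpeers.
hops, messages = hypercube_broadcast(3)
```

For 8 peers the broadcast reaches every peer with 7 messages in total, and the farthest peer is 3 hops from the origin, matching N-1 and log2 N.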

System Performance
HyperCup - Replication 10%
[Figure panels: Analytical Model, Simulation]

System Performance
HyperCup - Replication 90%
[Figure panels: Analytical Model, Simulation]

System Performance
With HyperCup we achieve higher performance for various replication levels and numbers of peergroups.
The system scales better.
The system remains stable and the update response time stays within acceptable values.
The analytical and simulation approaches show good agreement.

Conclusions
We have described two ECA languages, one for XML and one for RDF.
We have studied and defined the architectural characteristics of an ECA rule processing system in centralised and distributed environments.
We have conducted a study to determine the system performance in both the centralised and the distributed case.

Conclusions
The whole study shows that ECA rules are a usable technology for a variety of application environments over semistructured data.

Thank you!