17th International World Wide Web Conference 2008, Beijing, China. XML Data Dissemination using Automata on top of Structured Overlay Networks. Iris Miliaraki.

Presentation transcript:

17th International World Wide Web Conference 2008, Beijing, China. XML Data Dissemination using Automata on top of Structured Overlay Networks. Iris Miliaraki, Zoi Kaoudi, Manolis Koubarakis. Department of Informatics and Telecommunications, National and Kapodistrian University of Athens.

24 April 2008, 17th International World Wide Web Conference 2008, Beijing, China

Slide 2. Outline: XML dissemination scenario; Problems; Background: DHTs; Our approach; Experiments; Future work.

Slide 3. XML Dissemination scenario. [Figure: publishers send XML documents to an XML dissemination system, and subscribers pose XPath/XQuery queries to it.] Example applications: news monitoring, publication monitoring. Related systems, labelled on the slide as centralized or distributed: YFilter, XTrie, FiST, Index-Filter, ONYX, Gong et al. [ICDE05], XPush, Parallel/Hierarchical XTrie, Snoeren [SOSP 2001].

Slide 4. XML Dissemination: Broker-based architecture. Mesh or tree-based overlays. [Figure: publishers sending XML documents and subscribers posing XPath/XQuery queries through an overlay of brokers.]

Slide 5. Problems: load imbalances.

Slide 6. XML Dissemination: Broker-based architecture. Systems like ONYX and the work of Gong et al. [ICDE05]. Mesh or tree-based overlays. [Figure: publishers sending XML documents and subscribers posing XPath/XQuery queries through an overlay of brokers.]

Slide 7. Problems: load imbalances; centralized control; single point of failure and bottleneck.

Slide 8. XML Dissemination: Broker-based architecture. Systems like ONYX and the work of Gong et al. [ICDE05]. Mesh or tree-based overlays. [Figure: publishers sending XML documents and subscribers posing XPath/XQuery queries through an overlay of brokers.]

Slide 9. Problems: load imbalances; centralized control; single point of failure and bottleneck; scalability (size of routing tables).


Slide 11. Background: DHTs. Structured overlay networks solve the item location problem in a distributed and dynamic network of nodes in O(log N) hops: let x be some data item; find x. A DHT is a distributed version of the hash table data structure, with id = Hash(K). Main operations: Put (given a key for a data item, map the key onto a node) and Get (find the location of the data item with a given key). [Figure labels: successor peer, responsible peer.]
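The Put and Get operations above can be sketched with a toy identifier ring. This is a minimal illustration, not the Chord protocol itself: it keeps the whole ring in one process, so the O(log N) routing is not modelled. The names `Ring`, `ident`, and the peer labels P1 to P10 are assumptions made for the example, and SHA-1 is used only as a convenient hash.

```python
import hashlib
from bisect import bisect_left


def ident(key: str) -> int:
    """Hash a key or peer name onto the identifier circle (id = Hash(K))."""
    return int.from_bytes(hashlib.sha1(key.encode("utf-8")).digest(), "big")


class Ring:
    """Toy DHT: every key is stored at the successor peer of its identifier."""

    def __init__(self, peer_names):
        self.peers = sorted((ident(name), name) for name in peer_names)
        self.ids = [peer_id for peer_id, _ in self.peers]
        self.store = {name: {} for name in peer_names}

    def successor(self, key_id: int) -> str:
        # First peer whose id is >= key_id, wrapping around the circle.
        index = bisect_left(self.ids, key_id) % len(self.peers)
        return self.peers[index][1]

    def put(self, key: str, value):
        peer = self.successor(ident(key))
        self.store[peer][key] = value
        return peer

    def get(self, key: str):
        return self.store[self.successor(ident(key))].get(key)


ring = Ring([f"P{i}" for i in range(1, 11)])
owner = ring.put("x", "some data item")
print(owner, ring.get("x"))
```

In a real DHT each peer knows only a few other peers and a lookup is forwarded hop by hop; the sorted list here stands in for that routing.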

Slide 12. XML Dissemination revisited: Structured overlay network architecture. [Figure: publishers sending XML documents and subscribers posing XPath/XQuery queries over a structured overlay network.]

Slide 13. Problems revisited: load imbalances; centralized control; single point of failure and bottleneck; scalability (size of routing tables).

Slide 14. Automata-based approaches: XFilter and YFilter, ONYX, XTrie, IndexFilter, FiST, etc. Main idea: construct an automaton from a set of XPath/XQuery queries and use it as a matching engine against the XML documents.

Slide 15. YFilter – NFA Construction. Q1: /dblp/phdthesis/year = 2008. [Figure: NFA path 0 -dblp-> 1 -phdthesis-> 2 -year-> 3, where state 3 accepts Q1.]

Slide 16. YFilter – NFA Construction. Q1: /dblp/phdthesis/year = 2008. Q2: /dblp/proceedings/school = Univ. of Athens. [Figure: Q2 shares the dblp transition with Q1 and adds the branch 1 -proceedings-> 4 -school-> 5, where state 5 accepts Q2.]

Slide 17. YFilter – NFA Construction. Q1: /dblp/phdthesis/year = 2008. Q2: /dblp/proceedings/school = Univ. of Athens. Q3: /dblp/proceedings/title = XML Dissemination. [Figure: Q3 shares the path to state 4 with Q2 and adds the transition 4 -title-> 6, where state 6 accepts Q3.]

Slide 18. YFilter – NFA Construction. Q1: /dblp/phdthesis/year = 2008. Q2: /dblp/proceedings/school = Univ. of Athens. Q3: /dblp/proceedings/title = XML Dissemination. Q4: /dblp/*/author = John Doe. [Figure: Q4 adds the branch 1 -*-> 7 -author-> 8, where state 8 accepts Q4.]

Slide 19. YFilter – NFA Construction. Q1: /dblp/phdthesis/year = 2008. Q2: /dblp/proceedings/school = Univ. of Athens. Q3: /dblp/proceedings/title = XML Dissemination. Q4: /dblp/*/author = John Doe. Q5: //*/cite = [12743]. [Figure: Q5 adds the branch 0 -ε-> 9 -*-> 10 -cite-> 11, where state 9 has a self-loop on * and state 11 accepts Q5.]
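The construction steps on these slides can be sketched as a shared-prefix automaton builder. This is a minimal sketch in the style of YFilter, not the authors' code: it handles only the structural part of each query (the value predicates such as "= 2008" are left out), and the names `parse_steps`, `QueryNFA`, and the `#eps` label are assumptions made for the example. States are numbered in creation order, which reproduces the numbering visible on the slides.

```python
EPS = "#eps"  # label for the epsilon transition used by the '//' axis


def parse_steps(xpath: str):
    """Split '/a//b/*' into [('/', 'a'), ('//', 'b'), ('/', '*')]."""
    steps, i = [], 0
    while i < len(xpath):
        if xpath.startswith("//", i):
            axis, i = "//", i + 2
        elif xpath[i] == "/":
            axis, i = "/", i + 1
        else:
            raise ValueError(f"expected '/' at position {i} in {xpath!r}")
        j = xpath.find("/", i)
        j = len(xpath) if j == -1 else j
        if j == i:
            raise ValueError(f"empty step in {xpath!r}")
        steps.append((axis, xpath[i:j]))
        i = j
    return steps


class QueryNFA:
    def __init__(self):
        self.trans = {0: {}}    # state -> {label: next state}
        self.self_loop = set()  # states with a '*' self-loop
        self.accept = {}        # accepting state -> set of query ids

    def _target(self, state: int, label: str) -> int:
        """Follow an existing transition or create a new state."""
        nxt = self.trans[state].get(label)
        if nxt is None:
            nxt = len(self.trans)
            self.trans[nxt] = {}
            self.trans[state][label] = nxt
        return nxt

    def add_query(self, query_id: str, xpath: str) -> int:
        state = 0
        for axis, name in parse_steps(xpath):
            if axis == "//":
                state = self._target(state, EPS)
                self.self_loop.add(state)
            state = self._target(state, name)
        self.accept.setdefault(state, set()).add(query_id)
        return state


nfa = QueryNFA()
for qid, path in [("Q1", "/dblp/phdthesis/year"),
                  ("Q2", "/dblp/proceedings/school"),
                  ("Q3", "/dblp/proceedings/title"),
                  ("Q4", "/dblp/*/author"),
                  ("Q5", "//*/cite")]:
    print(qid, "accepts in state", nfa.add_query(qid, path))
```

Because common prefixes are followed rather than rebuilt, Q2 and Q3 reuse states 1 and 4, which is what keeps the automaton small as the number of queries grows.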


Slide 21. Main idea: utilize a distributed version of a state-of-the-art approach, YFilter. Instead of a centralized NFA, distribute the NFA in the DHT.

Slide 22. Distributing the NFA on top of DHT. [Figure: a ring of peers P1 to P10 and a table that assigns each NFA state key to its successor peer.]


Slide 24. Distributing the NFA on top of DHT. [Figure: the same ring of peers P1 to P10 and state-key table, with the annotations "=0" and "=1".]

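Slide 25. Distributing the NFA on top of DHT. [Figure: a ring of peers P1 to P10 and a table that assigns each NFA state key to its successor peer. Successor peers as listed in the table: P3, P5, P1, P2, P6, P7, P8, P10, P4, P9, P10; the state keys are not recoverable from the transcript.]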
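A rough sketch of this placement step follows. The transcript does not preserve how state keys are formed, so the example assumes, purely for illustration, that a state's key is the path of labels leading to it; the names `distribute` and `ident` and the small `NFA_FRAGMENT` table are likewise hypothetical. Each peer ends up holding the transition tables of the states whose keys hash to it.

```python
import hashlib
from bisect import bisect_left

# Hypothetical NFA fragment: state key -> {label: key of the next state}.
NFA_FRAGMENT = {
    "/": {"dblp": "/dblp"},
    "/dblp": {"phdthesis": "/dblp/phdthesis",
              "proceedings": "/dblp/proceedings"},
    "/dblp/phdthesis": {"year": "/dblp/phdthesis/year"},
    "/dblp/phdthesis/year": {},
    "/dblp/proceedings": {},
}


def ident(text: str) -> int:
    """Hash a state key or peer name onto the identifier circle."""
    return int.from_bytes(hashlib.sha1(text.encode("utf-8")).digest(), "big")


def distribute(nfa, peer_names):
    """Store each state at the successor peer of Hash(state key)."""
    ring = sorted((ident(name), name) for name in peer_names)
    ids = [peer_id for peer_id, _ in ring]
    tables = {name: {} for name in peer_names}
    for state_key, transitions in nfa.items():
        index = bisect_left(ids, ident(state_key)) % len(ring)
        tables[ring[index][1]][state_key] = transitions
    return tables


peers = [f"P{i}" for i in range(1, 11)]
for peer, states in distribute(NFA_FRAGMENT, peers).items():
    if states:
        print(peer, sorted(states))
```

Since a transition stores the key of its target state, a peer can locate the next state with an ordinary DHT lookup and needs no global view of the automaton.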

Slide 26. YFilter - NFA Execution. [Figure: an incoming XML document (elements dblp, proceedings, school, title; text values "Univ. of Athens" and "XML and DHTs") is processed from start of document to end of document, with the active NFA states kept on a runtime stack.] These paths can be executed in parallel!
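The stack-based execution on this slide can be sketched as follows. This is an illustrative sketch only: the transition table is typed in by hand from the five example queries, value predicates are ignored (so a query matches on structure alone), and the sample document is an assumed reconstruction of the one in the figure. `TRANS`, `step`, and `run` are names chosen for the example.

```python
import io
import xml.etree.ElementTree as ET

EPS = "#eps"
TRANS = {
    0: {"dblp": 1, EPS: 9},
    1: {"phdthesis": 2, "proceedings": 4, "*": 7},
    2: {"year": 3},
    4: {"school": 5, "title": 6},
    7: {"author": 8},
    9: {"*": 10},
    10: {"cite": 11},
}
SELF_LOOP = {9}  # state 9 loops on any element (the '//' axis)
ACCEPT = {3: "Q1", 5: "Q2", 6: "Q3", 8: "Q4", 11: "Q5"}


def closure(states):
    """Add every state reachable through epsilon transitions."""
    out, todo = set(states), list(states)
    while todo:
        target = TRANS.get(todo.pop(), {}).get(EPS)
        if target is not None and target not in out:
            out.add(target)
            todo.append(target)
    return out


def step(states, tag):
    """Active states after reading a start tag; each state is independent."""
    nxt = set()
    for state in states:
        edges = TRANS.get(state, {})
        for label in (tag, "*"):
            if label in edges:
                nxt.add(edges[label])
        if state in SELF_LOOP:
            nxt.add(state)
    return closure(nxt)


def run(xml_text: str):
    stack = [closure({0})]  # runtime stack of active state sets
    matched = set()
    source = io.BytesIO(xml_text.encode("utf-8"))
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start":
            active = step(stack[-1], elem.tag)
            matched.update(ACCEPT[s] for s in active if s in ACCEPT)
            stack.append(active)
        else:
            stack.pop()  # end tag: backtrack to the parent's states
    return matched


DOC = ("<dblp><proceedings><school>Univ. of Athens</school>"
       "<title>XML and DHTs</title></proceedings></dblp>")
print(sorted(run(DOC)))
```

The loop over `states` inside `step` treats every active state on its own, which is the independence that lets the paths run in parallel once the states live on different peers.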

Slide 27. Distributed NFA execution – Iterative. [Figure: the publisher, the incoming XML document (text values "Univ. of Athens" and "XML and DHTs", from start of document to end of document), and the ring of peers P1 to P10.] Publisher becomes overloaded!

Slide 28. Distributed NFA execution - Recursive. [Figure: the publisher, the incoming XML document (text values "Univ. of Athens" and "XML and DHTs", from start of document to end of document), and the ring of peers P1 to P10.]
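To see why the publisher is the bottleneck in the iterative variant, here is a toy cost model. It is not the protocol from the talk; the transcript does not spell out the message exchange, so the model assumes that in iterative mode the publisher sends a request to, and receives a reply from, the peer responsible for every visited state, while in recursive mode the publisher sends the document once and each peer forwards it to the peers responsible for the next states. DHT routing hops are ignored, and `EXECUTION` is a hypothetical execution tree.

```python
# Hypothetical execution tree: state -> states reached from it
# while filtering one document.
EXECUTION = {0: [1, 9], 1: [4, 7], 9: [10], 4: [5, 6]}


def visited_states(tree, root=0):
    """All states touched during execution, starting from the root."""
    order, todo = [], [root]
    while todo:
        state = todo.pop()
        order.append(state)
        todo.extend(tree.get(state, []))
    return order


def iterative_cost(tree, root=0):
    """Assumed model: one request and one reply per visited state,
    all of them sent or received by the publisher."""
    messages = 2 * len(visited_states(tree, root))
    return {"total": messages, "publisher": messages}


def recursive_cost(tree, root=0):
    """Assumed model: the publisher contacts the start state once,
    then every peer forwards to the peers of the next states."""
    forwards = sum(len(tree.get(s, [])) for s in visited_states(tree, root))
    return {"total": 1 + forwards, "publisher": 1}


print("iterative:", iterative_cost(EXECUTION))
print("recursive:", recursive_cost(EXECUTION))
```

Under these assumptions the publisher's share grows with the number of visited states in the iterative case and stays constant in the recursive one.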

Slide 29. Experimental evaluation. Chord simulator. Two different document workloads: Aggregated (including DBLP, NITF, ebXML, Auction (XMark)) and NITF. Two kinds of query sets: Random and Distinct.

Slide 30. Metrics. Network traffic: total number of messages. Latency: longest chain of hops. Filtering load: number of messages received during execution.
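One way to read these three definitions is as functions of a message log. The sketch below assumes a hypothetical log format, a list of (sender, receiver, parent) tuples in which parent is the index of the message that triggered this one, or None for the first message; the simulator used in the talk may record things differently.

```python
from collections import Counter


def compute_metrics(messages):
    """messages: list of (sender, receiver, parent_index_or_None),
    ordered so that a parent always appears before its children."""
    depth = []
    for _sender, _receiver, parent in messages:
        depth.append(1 if parent is None else depth[parent] + 1)
    return {
        # Network traffic: total number of messages.
        "network_traffic": len(messages),
        # Latency: longest chain of dependent messages (hops).
        "latency": max(depth, default=0),
        # Filtering load: messages received by each peer.
        "filtering_load": Counter(receiver for _s, receiver, _p in messages),
    }


log = [
    ("publisher", "P3", None),
    ("P3", "P5", 0),
    ("P3", "P1", 0),
    ("P5", "P2", 1),
]
print(compute_metrics(log))
```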

Slide 31. Iterative vs Recursive.

Slide 32. Varying number of queries – Network traffic.

Slide 33. Varying number of queries - Latency.

Slide 34. Load balancing. Virtual peers: originally proposed in Chord; mapping of multiple virtual peers to each real peer. Load-shedding: replicate on demand.
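The virtual-peer idea can be illustrated with a small experiment. The sketch assumes that each real peer joins the ring under several identifiers (here `"P1#0"`, `"P1#1"`, and so on, a naming scheme invented for the example) and counts how many keys each real peer ends up responsible for. The key names and the function `load_per_real_peer` are hypothetical.

```python
import hashlib
from bisect import bisect_left
from collections import Counter


def ident(text: str) -> int:
    """Hash a key or (virtual) peer name onto the identifier circle."""
    return int.from_bytes(hashlib.sha1(text.encode("utf-8")).digest(), "big")


def load_per_real_peer(keys, real_peers, virtual_per_peer):
    """Count keys per real peer when each one runs several virtual peers."""
    ring = sorted(
        (ident(f"{peer}#{v}"), peer)
        for peer in real_peers
        for v in range(virtual_per_peer)
    )
    ids = [virtual_id for virtual_id, _ in ring]
    load = Counter({peer: 0 for peer in real_peers})
    for key in keys:
        index = bisect_left(ids, ident(key)) % len(ring)
        load[ring[index][1]] += 1
    return load


keys = [f"state-{i}" for i in range(200)]
peers = [f"P{i}" for i in range(1, 11)]
for v in (1, 8):
    load = load_per_real_peer(keys, peers, v)
    print(v, "virtual peer(s): min", min(load.values()),
          "max", max(load.values()))
```

With more virtual peers each real peer owns several small arcs of the circle instead of one large arc, which tends to narrow the gap between the lightest and the heaviest peer; the exact numbers depend on the hash values.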

Slide 35. Load balancing – Filtering load.

Slide 36. Conclusions: DHT-based protocols overcoming weaknesses of broker-based architectures; utilize a distributed YFilter engine; exploit inherent parallelism of an automaton; experimental evaluation.

Slide 37. Future work: implementation and experimentation on an Internet-scale testbed like PlanetLab; more sophisticated methods for predicate evaluation.

Slide 38. Thank you for your attention. Questions?