FoXtrot: Distributed Structural & Value XML Filtering Iris Miliaraki* Department of Informatics and Telecommunications National and Kapodistrian University.

1 FoXtrot: Distributed Structural & Value XML Filtering
Iris Miliaraki*
Department of Informatics and Telecommunications, National and Kapodistrian University of Athens
* Supported by Microsoft Research through the European PhD Scholarship Programme

2 FoXtrot: Distributed structural and value XML filtering, National and Kapodistrian University of Athens
Outline of the talk
1. XML Filtering scenario
2. FoXtrot system
   - Distributed structural matching
   - Distributed value matching
3. Experimental evaluation
4. Sum up and future work

3 XML Filtering scenario
[Figure: subscribers register XPath/XQuery queries with an XML filtering system, and publishers send it XML documents.]
Centralized and distributed systems named on the slide: YFilter, XTrie, FiST, Index-Filter, ONYX, Gong et al. [ICDE05], XPush, Parallel/Hierarchical XTrie, Snoeren [SOSP 2001], Li et al. [ICDCS08]

4 XML Filtering scenario
[Figure: subscribers and publishers connected through an overlay network, exchanging XPath/XQuery queries and XML documents.]
Mesh or tree-based overlays
× Load imbalances
× Potential bottlenecks due to centralized control

5 FoXtrot
Filtering of XML data using structured overlay networks
[Figure: subscribers and publishers connected through a DHT.]
DHT
√ Fully distributed
√ Load balanced
√ Scalable

6 XML data model - example
Q1: /bib/*/author[text()="John Smith"]
Q2:
Q3:
Q4:
[Figure: example XML document; the only surviving text value is "John Smith".]
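As a rough illustration of what matching Q1 against a document means, the sketch below evaluates /bib/*/author[text()="John Smith"] with Python's standard xml.etree.ElementTree. The document is hypothetical (only the author value "John Smith" comes from the slide), and the helper name match_q1 is an assumption made for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical bibliography document; only the author text "John Smith"
# is taken from the slide, everything else is made up for illustration.
DOC = """
<bib>
  <phdthesis>
    <author>John Smith</author>
    <title>An example thesis</title>
  </phdthesis>
  <article>
    <author>Jane Doe</author>
  </article>
</bib>
""".strip()


def match_q1(xml_text):
    """Return the author elements selected by /bib/*/author[text()="John Smith"]."""
    root = ET.fromstring(xml_text)
    if root.tag != "bib":          # the absolute path must start at <bib>
        return []
    # "./*/author" is the structural part; the text test is the value predicate.
    return [a for a in root.findall("./*/author")
            if (a.text or "").strip() == "John Smith"]


matches = match_q1(DOC)
print(len(matches))
```

The structural part (bib, any child, author) and the value part (the text comparison) are checked separately here, which mirrors the distinction the talk draws between structural and value matching.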

7 Automata-based approaches
XFilter and YFilter, ONYX, XTrie, IndexFilter, FiST etc.
Main idea
- Construct an automaton from a set of XPath/XQuery queries
- Use it as a matching engine against the XML documents
Structural matching!

8 Example NFA (YFilter)
Q1: /bib/phdthesis/year = ‘2010’
Q2: /bib/proceedings/school = ‘Univ. of Athens’
Q3: /bib/proceedings/title = ‘XML Dissemination’
Q4: /bib/*/author = ‘Michael Smith’
Q5: //*/cite = 12743]
[Figure: NFA with transitions labelled bib, phdthesis, proceedings, *, ε, year, author, title, school and cite; accepting states are annotated with Q1 to Q5.]
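The idea of sharing one automaton among many queries can be sketched as follows. This is a simplified illustration rather than YFilter itself: it handles only child steps and the * wildcard (no descendant axis, no ε-transitions), ignores the value comparisons, and the class name PathNFA is an assumption made for this example. The queries are the structural parts of Q1 to Q4 from the slide.

```python
from collections import defaultdict


class PathNFA:
    """Shared automaton for simple path queries (child axis and '*' only)."""

    def __init__(self):
        self.trans = [{}]                 # state id -> {label: next state id}
        self.accept = defaultdict(set)    # state id -> ids of queries accepted there

    def add_query(self, qid, path):
        """Insert a query incrementally, reusing states of common prefixes."""
        state = 0
        for step in (s for s in path.split("/") if s):
            nxt = self.trans[state].get(step)
            if nxt is None:               # no shared prefix: create a new state
                nxt = len(self.trans)
                self.trans.append({})
                self.trans[state][step] = nxt
            state = nxt
        self.accept[state].add(qid)

    def match(self, element_path):
        """Run the NFA over a root-to-element tag path; return matched query ids."""
        active = {0}
        for tag in element_path:
            nxt = set()
            for s in active:
                for label in (tag, "*"):  # follow the exact label and the wildcard
                    t = self.trans[s].get(label)
                    if t is not None:
                        nxt.add(t)
            active = nxt
        return {q for s in active for q in self.accept.get(s, ())}


nfa = PathNFA()
nfa.add_query("Q1", "/bib/phdthesis/year")
nfa.add_query("Q2", "/bib/proceedings/school")
nfa.add_query("Q3", "/bib/proceedings/title")
nfa.add_query("Q4", "/bib/*/author")
print(nfa.match(["bib", "phdthesis", "author"]))
```

Q2 and Q3 share the states for /bib/proceedings, so the four queries need nine states in total instead of thirteen. The same add_query traversal is what indexing a new query incrementally amounts to in this simplified setting.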

9 Designing FoXtrot
- Moving to a distributed solution
- Utilize automata-based techniques
- Instead of a single centralized automaton, the automaton is shared by the DHT peers
- Design and employ methods for filtering of XML data against a distributed automaton

10 FoXtrot: Distributing the NFA on top of DHT
[Figure: DHT ring of peers P1 to P10 and a table mapping each NFA state key to its responsible peer. The state keys are not preserved; the responsible peers listed are P3, P5, P1, P2, P6, P7, P8, P10, P4, P9, P10.]

11 FoXtrot: Distributing the NFA on top of DHT
[Figure: same DHT ring and state-key table as the previous slide. Additional label on this build: 247.]

12 FoXtrot: Distributing the NFA on top of DHT
[Figure: same DHT ring and state-key table as the previous slide. Additional labels on this build: ℓ=0 ℓ=]
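A minimal sketch of how a state key could be mapped to a responsible peer. Everything specific here is an assumption for illustration: the state-key format (labels joined with "/"), the SHA-1 hash, and the successor rule for picking the peer. FreePastry, which the talk later names as the implementation platform, routes to the numerically closest node id instead; the successor rule is used only to keep the sketch short.

```python
import bisect
import hashlib


def ring_id(name):
    """Map a peer name or state key to a position on the identifier ring."""
    return int.from_bytes(hashlib.sha1(name.encode("utf-8")).digest(), "big")


class Ring:
    """Toy DHT: a key belongs to the first peer at or after its ring position."""

    def __init__(self, peers):
        self.points = sorted((ring_id(p), p) for p in peers)
        self.ids = [i for i, _ in self.points]

    def responsible_peer(self, state_key):
        # Wrap around to the first peer when the key hashes past the last one.
        idx = bisect.bisect_left(self.ids, ring_id(state_key)) % len(self.ids)
        return self.points[idx][1]


peers = [f"P{i}" for i in range(1, 11)]      # P1 .. P10, as on the slide
ring = Ring(peers)

# Hypothetical state keys: the labels on the path from the start state.
for key in ["start", "start/bib", "start/bib/*", "start/bib/*/author"]:
    print(key, "->", ring.responsible_peer(key))
```

Because placement depends only on the hash of the state key, any peer can locate the peer responsible for a state without central coordination.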

13 Load balancing in FoXtrot
Static replication
- Create a fixed number r of replicas for each state
- The load previously carried by 1 peer is now shared by r+1 peers

14 Load balancing in FoXtrot cont.
Assumption: the frequency f of visiting an NFA state during filtering is inversely proportional to the NFA depth d of this state.
Dynamic replication
- Create r/d replicas for each state, where d is the NFA depth of the state
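The two replication rules can be compared with a few lines of arithmetic. The slides do not say how r/d is rounded or at which depth numbering starts, so the sketch assumes integer (floor) division and depths starting at 1; both function names are hypothetical.

```python
def replica_count(r, depth, dynamic=False):
    """Number of replicas for a state.

    Static replication: always r.
    Dynamic replication: r/d, here rounded down (an assumption), d >= 1.
    """
    if depth < 1:
        raise ValueError("depth is assumed to start at 1")
    return r // depth if dynamic else r


def load_share(replicas):
    """Fraction of a state's load carried by each of the replicas + 1 holders."""
    return 1.0 / (replicas + 1)


r = 8
for d in (1, 2, 4, 8):
    static = replica_count(r, d)
    dynamic = replica_count(r, d, dynamic=True)
    print(f"depth {d}: static {static} replicas, dynamic {dynamic} replicas, "
          f"dynamic share {load_share(dynamic):.2f}")
```

Under the stated assumption about visit frequency, the dynamic rule spends replicas where the load is expected to be highest (shallow states) and saves storage on deep states.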

15 Incremental construction
To index a new query
- Traverse the distributed automaton

16 Centralized NFA Execution (YFilter)
[Figure: an incoming XML document (text values "Univ. of Athens" and "XML and DHTs") is processed from start of document to end of document; a runtime stack tracks the active NFA states reached through the transitions bib, proceedings, school, title, * and ε.]
These paths can be executed in parallel!
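The stack-based execution can be sketched with a streaming parser: each start tag pushes the set of states reachable from the current top of the stack, and each end tag pops it. The NFA below is hand-built and hypothetical (two structural queries named QA and QB), and the document is a plausible reconstruction from the values visible on the slide, not the original.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical NFA for two structural queries:
#   QA: /bib/proceedings/school     QB: /bib/*/title
TRANSITIONS = {
    0: {"bib": 1},
    1: {"proceedings": 2, "*": 4},
    2: {"school": 3},
    4: {"title": 5},
}
ACCEPTING = {3: "QA", 5: "QB"}

DOC = ("<bib><proceedings><school>Univ. of Athens</school>"
       "<title>XML and DHTs</title></proceedings></bib>")


def filter_document(xml_text):
    """Run the NFA over the parsing events, keeping active states on a stack."""
    stack = [{0}]                         # bottom of the stack: the start state
    matched = set()
    source = io.BytesIO(xml_text.encode("utf-8"))
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start":
            nxt = set()
            for state in stack[-1]:       # every active path advances independently
                for label in (elem.tag, "*"):
                    target = TRANSITIONS.get(state, {}).get(label)
                    if target is not None:
                        nxt.add(target)
            matched.update(ACCEPTING[s] for s in nxt if s in ACCEPTING)
            stack.append(nxt)
        else:                             # end tag: backtrack to the parent's states
            stack.pop()
    return matched


print(sorted(filter_document(DOC)))
```

After reading proceedings, the top of the stack holds two states at once (one reached via proceedings, one via *). Those are the independent paths that the slide notes can be executed in parallel.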

17 Distributed NFA execution – Iterative
[Figure: DHT ring of peers (P1 to P9 visible) with a runtime stack.]
Publisher becomes overloaded!

18 Distributed NFA execution – Recursive
[Figure: DHT ring of peers (P1 to P9 visible).]
Several parallel executions
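The difference between the two execution styles can be illustrated with a toy message count. This is a single-process simulation under strong assumptions: every NFA state is taken to live on a different peer, the iterative publisher sends one request per active state per element, and in the recursive variant each peer forwards the work to the peer holding the next state. The NFA, the document and all function names are hypothetical.

```python
# Hypothetical NFA: QA = /bib/proceedings/school, QB = /bib/*/title.
TRANSITIONS = {0: {"bib": 1}, 1: {"proceedings": 2, "*": 4},
               2: {"school": 3}, 4: {"title": 5}}
ACCEPTING = {3: "QA", 5: "QB"}

# Document as (tag, children) tuples.
DOC = ("bib", [("proceedings", [("school", []), ("title", [])])])


def next_states(state, tag):
    row = TRANSITIONS.get(state, {})
    return [row[label] for label in (tag, "*") if label in row]


def iterative(doc):
    """The publisher drives the execution and contacts a peer for every active state."""
    publisher_msgs = 0
    matches = set()

    def visit(node, active):
        nonlocal publisher_msgs
        tag, children = node
        nxt = set()
        for state in active:
            publisher_msgs += 1           # one request to the peer holding `state`
            nxt.update(next_states(state, tag))
        matches.update(ACCEPTING[s] for s in nxt if s in ACCEPTING)
        for child in children:
            visit(child, nxt)

    visit(doc, {0})
    return matches, publisher_msgs


def recursive(doc):
    """The publisher hands the document over once; peers forward among themselves."""
    publisher_msgs = 1                    # document sent to the start-state peer
    peer_msgs = 0
    matches = set()

    def run_at(state, node):
        nonlocal peer_msgs
        tag, children = node
        for target in next_states(state, tag):
            peer_msgs += 1                # forwarded to the peer holding `target`
            if target in ACCEPTING:
                matches.add(ACCEPTING[target])
            for child in children:
                run_at(target, child)     # each branch could proceed in parallel

    run_at(0, doc)
    return matches, publisher_msgs, peer_msgs


print(iterative(DOC))
print(recursive(DOC))
```

Both variants find the same matches; in this toy setting the iterative publisher sends six messages while the recursive publisher sends one, with the remaining work spread over the peers.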

19 Distributed NFA
I. Miliaraki, Z. Kaoudi and M. Koubarakis. XML Data Dissemination using Automata on top of Structured Overlay Networks. In WWW 2008.
Structural matching!
What about value matching?

20 What about value matching?
- Automata-based approaches are efficient for structural matching
- Apart from defining a structural path, queries also contain value-based predicates
- We want FoXtrot to scale with both the size of the query set and the number of predicates per query

21 Definitions
Attribute predicates: op value] e.g.
Textual predicates: element[text() op value] e.g. /bib/*/author[text()="John Smith"]
So, how can we deal with value matching along with structural matching?

22 Direct evaluation with automaton/trie
- Treat predicates as elements!
Q1:
Q2: /bib/*/author[text()="Michael Smith"]
Q3: /bib/article/conference[text()="WWW 2009"]
[Figure: NFA with transitions bib, phdthesis, article, *, author, year, conference, nationality and text().]
Huge increase of NFA states!
Destroys sharing of path expressions!

23 Bottom-up evaluation
- Common rule in relational query optimization: apply selections as early as possible
- Works well for relational query processing
A lot of effort is spent evaluating predicates while the structure may not be matched

24 Top-down evaluation
- Check predicates after structural matching
Depending on predicate selectivity, the number of false positives may be very large

25 Step-by-step evaluation
- XPath queries consist of distinct steps
- Each step contains one or more value-based predicates
- Perform value matching with structural matching in a stepwise manner
Effort is spent evaluating predicates while the structure may not be fully matched

26 Moving on to details
- Parse XML document and generate a set of candidate predicates to perform predicate evaluation
[Figure: enriched parsing events mapped to candidate predicates. Surviving entries: a candidate predicate ending in Filtering"] and CP4: author[text()="John Smith"]]
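Generating candidate predicates from a document can be sketched as a single pass over its elements. The textual form tag[text()="..."] follows CP4 on the slide; the attribute form tag[@name="..."] is standard XPath syntax but is an assumption about how the system writes such predicates, and the function name and sample document are hypothetical.

```python
import xml.etree.ElementTree as ET


def candidate_predicates(xml_text):
    """List the equality predicates that the document itself satisfies."""
    predicates = []
    for elem in ET.fromstring(xml_text).iter():      # document order
        for name, value in elem.attrib.items():
            predicates.append(f'{elem.tag}[@{name}="{value}"]')
        text = (elem.text or "").strip()
        if text:                                     # skip whitespace-only content
            predicates.append(f'{elem.tag}[text()="{text}"]')
    return predicates


doc = ('<bib><article><author>John Smith</author>'
       '<cite paper-id="2770"/></article></bib>')
for cp in candidate_predicates(doc):
    print(cp)
```

Only equality predicates can be enumerated this way; range comparisons would need the raw values rather than ready-made predicate strings.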

27 Top-down evaluation
Delay value matching until after structural matching
- Execute distributed NFA
- Only check predicates if a final state is reached
- Each peer uses a local index mapping predicates to the list of queries that contain them (hash index)

28 Example
Q1:
Q2: /bib/*/author[text()="John Smith"]
Q3:
Q4:
Q5:
Q6:
[Figure: NFA with transitions bib, phdthesis, article, *, author, conference and cite.]
PREDICATE          QUERY LIST
[paper-id=2770]    {Q6}
[paper-id=2392]    {Q5}
Candidate predicates: a predicate ending in Filtering"] and CP4: author[text()="John Smith"]
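A minimal sketch of the top-down check at a peer, assuming the structural matches are already known. The index contents reuse the predicates visible on the slide as plain strings; the per-query predicate counts, the function name and the simplification that all candidate predicates of the document are checked at once are assumptions made for this example.

```python
from collections import Counter

# Hypothetical local hash index: predicate -> queries that contain it.
PREDICATE_INDEX = {
    'author[text()="John Smith"]': {"Q2"},
    "[paper-id=2770]": {"Q6"},
    "[paper-id=2392]": {"Q5"},
}
# Hypothetical number of predicates each query contains.
PREDICATE_COUNT = {"Q2": 1, "Q5": 1, "Q6": 1}


def top_down(structural_matches, candidate_predicates):
    """Keep the structurally matched queries whose predicates are all satisfied."""
    satisfied = Counter()
    for predicate in set(candidate_predicates):      # ignore duplicate candidates
        for query in PREDICATE_INDEX.get(predicate, ()):
            satisfied[query] += 1
    return {q for q in structural_matches
            if satisfied[q] == PREDICATE_COUNT.get(q, 0)}


result = top_down({"Q2", "Q5"},
                  ['author[text()="John Smith"]', "[paper-id=2770]"])
print(result)
```

Q5 matches structurally but its predicate is not among the candidates, so it is the kind of false positive that is only discarded after the full structural match.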

29 Top-down evaluation with pruning
- At each step of the execution, part of the NFA is revealed
- Applies to equality predicates
IDEA: Use a compact summary of predicate information to stop NFA execution (prune) if we can deduce that no match can be found
[Figure: NFA with transitions bib, phdthesis, article, *, author, conference and cite.]
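One way such pruning could work is sketched below. The slide does not say what the compact summary is, so this example swaps in a plain Python set per NFA state, holding the equality predicates of all queries reachable below that state; a space-efficient structure with one-sided error would play the same role. The pruning condition and all names are assumptions, not the system's actual rule.

```python
def build_summary(queries_below, predicates_of):
    """Summarize the equality predicates of the queries reachable below a state.

    Returns (summary, all_have_predicates). Pruning is only safe when every
    query below the state carries at least one equality predicate.
    """
    summary = set()
    all_have_predicates = True
    for query in queries_below:
        predicates = predicates_of.get(query, set())
        if not predicates:
            all_have_predicates = False
        summary |= predicates
    return summary, all_have_predicates


def can_prune(summary, all_have_predicates, candidate_predicates):
    """Stop the execution if no query below this state can possibly match."""
    return all_have_predicates and summary.isdisjoint(candidate_predicates)


# Hypothetical queries; Q7 has no value predicate at all.
predicates_of = {
    "Q2": {'author[text()="John Smith"]'},
    "Q3": {'conference[text()="WWW 2009"]'},
    "Q7": set(),
}
summary, complete = build_summary(["Q2", "Q3"], predicates_of)
print(can_prune(summary, complete, {'author[text()="Jane Doe"]'}))
```

The check is conservative: it never prunes a branch that could still produce a match, but it may keep executing branches that later fail the full predicate check.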

30 Experimental evaluation
Implemented FoXtrot in Java using FreePastry release (http://freepastry.org)
Environment
- 400 peers in PlanetLab (http://www.planet-lab.org/)
- 112 peers in a local shared cluster (http://www.grid.tuc.gr)

31 Experimental evaluation – Datasets
Sets of 10^6 distinct XPath queries
- depth 5-15
- predicates 1-3
- wildcard probability 0.2
- descendant axis probability
XML documents
- depth 5-25

32 Indexing throughput

33 Filtering latency & notifications

34 Load balancing I – 10 most loaded peers

35 Load balancing II – storage overhead

36 Network size

37 Parameter l

38 Cluster (4 predicates per query)

39 Sum up & future work
- Overcome weaknesses of distributed XML filtering systems
- Described methods to combine both structural and value XML filtering in a distributed environment
- Future work
  - ….

40 Other research
Atlas system
Distributed RDF query processing

41 Thank you for your attention
Questions?
References
1. I. Miliaraki, Z. Kaoudi and M. Koubarakis. XML Data Dissemination using Automata on top of Structured Overlay Networks. 17th International World Wide Web Conference (WWW 2008), Beijing, China, April 21-25, 2008.
2. I. Miliaraki and M. Koubarakis. Distributed Structural and Value XML Filtering. 4th ACM International Conference on Distributed Event-Based Systems (DEBS 2010), Cambridge, United Kingdom, July 12-15, 2010.
3. I. Miliaraki and M. Koubarakis. FoXtrot: Distributed Structural and Value XML Filtering. Journal paper. To be submitted to ACM TWEB.

42 NFA size

43 Network traffic

44 Iterative vs. Recursive

45 Network traffic – parameter l

46 Load balancing II – load distribution

47 Filtering network traffic

48 Experimental evaluation

49 [Example XML document: a Twitter status record whose element markup was lost; the surviving field values follow.]
Tue Apr 07 22:52: At least I can get your humor through tweets. I don't mean this in a bad way, but genetically speaking your a cul-de-sac. <a href="http://www.tweetdeck.com/">TweetDeck</a> false false Doug Williams dougw San Francisco, CA Twitter API Support. Internet, greed, users, dougw and opportunities are my passions. false ae4e ff e0ff92 87bc Sun Mar 18 06:42: Eastern Time (US & Canada) false 3390 false true... truncated...

