Graph Data Management Lab, School of Computer Science Branch Code: A Labeling Scheme for Efficient Query Answering on Tree

Presentation transcript:

Branch Code: A Labeling Scheme for Efficient Query Answering on Trees
Yanghua Xiao, Ji Hong, Wanyun Cui, Zhenying He, Wei Wang, Guodong Feng
Graph Data Management Lab, School of Computer Science {shawyh,
The 28th International Conference on Data Engineering, April 2012

2 Background
- The tree is a widely used data model
  - XML data
  - File directory
  - Spanning tree in graphs
- One typical task on tree data is querying structural relationships
  - PC: Parent/Child
  - AD: Ancestor/Descendant
  - SR: Sibling Relation
  - LCA: Lowest Common Ancestor
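For reference, the four relationships can be answered without any labeling scheme by walking parent pointers. The sketch below is only such a naive baseline, not the BranchCode method; the tree and the names `PARENT`, `path_to_root`, and the query functions are hypothetical and chosen for illustration.

```python
# Hypothetical tree stored as child -> parent (the root maps to None).
PARENT = {"A": None, "B": "A", "C": "A", "D": "B", "E": "B", "F": "E"}


def path_to_root(node):
    """Return [node, parent(node), ..., root]."""
    path = []
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path


def is_parent(p, c):
    """PC: p is the parent of c."""
    return PARENT[c] == p


def is_ancestor(a, d):
    """AD: a is a proper ancestor of d (costs O(depth) here)."""
    return a in path_to_root(d)[1:]


def are_siblings(u, v):
    """SR: u and v are distinct nodes with the same parent."""
    return u != v and PARENT[u] is not None and PARENT[u] == PARENT[v]


def lca(u, v):
    """LCA: first node on v's root path that also lies on u's root path."""
    on_u_path = set(path_to_root(u))
    for node in path_to_root(v):
        if node in on_u_path:
            return node
    return None
```

Every query here costs time proportional to the depth of the tree, which is the cost a labeling scheme tries to avoid for PC and AD.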

3 Previous Labeling Schemes
- Interval-based
  - A triple, generated by pre-order/post-order traversal
  - Cannot support SR
  - Hard to compute LCA
  - Hard to update
- Prefix-based
  - Dewey code and its variants
  - Storage-costly for deep trees
  - Hard to update
- Prime-based (integer-based)
  - Uses primes to encode (X. Wu et al., ICDE’04)
  - Storage-costly
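A minimal sketch of the first two families, under stated assumptions: the triple is taken to be (pre-order rank, post-order rank, depth), the Dewey code is a tuple of child positions, and the tree `CHILDREN` and all function names are hypothetical. The exact encodings used by the schemes cited on the slide may differ.

```python
# Hypothetical tree: node -> ordered list of children.
CHILDREN = {"A": ["B", "C"], "B": ["D", "E"], "C": [],
            "D": [], "E": ["F"], "F": []}


def interval_labels(root):
    """Assign an assumed (pre, post, depth) triple to every node in one DFS."""
    labels = {}
    counters = {"pre": 0, "post": 0}

    def visit(node, depth):
        pre = counters["pre"]
        counters["pre"] += 1
        for child in CHILDREN[node]:
            visit(child, depth + 1)
        labels[node] = (pre, counters["post"], depth)
        counters["post"] += 1

    visit(root, 0)
    return labels


def interval_is_ancestor(la, ld):
    """AD: the ancestor starts earlier in pre-order and ends later in post-order."""
    return la[0] < ld[0] and la[1] > ld[1]


def interval_is_parent(la, ld):
    """PC: ancestor test plus a depth difference of exactly one."""
    return interval_is_ancestor(la, ld) and ld[2] == la[2] + 1


def dewey_labels(root):
    """Assign each node the tuple of child positions on its root path."""
    labels = {}

    def visit(node, code):
        labels[node] = code
        for i, child in enumerate(CHILDREN[node]):
            visit(child, code + (i,))

    visit(root, ())
    return labels


def dewey_is_ancestor(ca, cd):
    """AD: the ancestor's code is a proper prefix of the descendant's code."""
    return len(ca) < len(cd) and cd[:len(ca)] == ca


def dewey_lca(cu, cv):
    """LCA: the longest common prefix of the two codes."""
    common = []
    for x, y in zip(cu, cv):
        if x != y:
            break
        common.append(x)
    return tuple(common)
```

The Dewey code grows by one component per level, which illustrates the storage cost for deep trees noted on the slide.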

4 Our Labeling Scheme: Branch Codes
- Supports various queries efficiently
  - PC, AD in constant time
  - LCA in O(d), where d is the depth of the tree
- Space efficient
  - Exact labeling costs O(Nd) space, but in most cases uses less space than other labelings
  - Approximate labeling allows us to trade off accuracy for space cost
- Supports updates on trees
  - Amortized O(log N) modification cost using a splay tree

5 Outline
- Original Idea
- Definition of BranchCode
- Addressing Update Operations on Trees
- Compression Method
- Experimental Evaluation
- Conclusion

6 Basic Idea
- Prefix-based
  A : *
  B : *.0
  C : *.1
  D : *.0.0
  E : *.0.1
  F : *
- Prime-based
  A : 2
  B : 3 × A
  C : 5 × A
  D : 7 × B
  E : 11 × B
  F : 13 × E
Our Idea
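The prime-based labels on this slide can be reproduced with a short sketch. The primes and the parent relation are read off the slide (for example, F : 13 × E means F is a child of E); the names `OWN_PRIME`, `PARENT`, and `LABEL` are hypothetical, and the divisibility tests are the standard way such product labels are queried rather than a confirmed detail of the cited scheme.

```python
# Own prime of each node and the parent relation, as read off the slide.
OWN_PRIME = {"A": 2, "B": 3, "C": 5, "D": 7, "E": 11, "F": 13}
PARENT = {"A": None, "B": "A", "C": "A", "D": "B", "E": "B", "F": "E"}


def prime_label(node):
    """Label = own prime times the parent's label (root: own prime only)."""
    label = 1
    while node is not None:
        label *= OWN_PRIME[node]
        node = PARENT[node]
    return label


LABEL = {node: prime_label(node) for node in OWN_PRIME}


def is_ancestor(a, d):
    """AD: the ancestor's label divides the descendant's label."""
    return a != d and LABEL[d] % LABEL[a] == 0


def is_parent(p, c):
    """PC: the child's label is the parent's label times the child's own prime."""
    return LABEL[c] == LABEL[p] * OWN_PRIME[c]
```

With these inputs the label of F is 2 × 3 × 11 × 13 = 858, which shows how quickly the labels grow with depth.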

7 Representation of Numbers
Simple Radix
- Decimal (base 10): 123, 78, 23472, …
- Binary (base 2): 0, 1, 101, 1010, 1101, …
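As a reminder of how a simple (fixed) radix works, the sketch below converts between a digit sequence and its value. The function names `from_digits` and `to_digits` are hypothetical; this covers only the fixed-radix case listed on this slide.

```python
def from_digits(digits, radix):
    """Evaluate most-significant-first digits in a fixed radix (Horner's rule)."""
    value = 0
    for d in digits:
        if not 0 <= d < radix:
            raise ValueError(f"digit {d} is out of range for radix {radix}")
        value = value * radix + d
    return value


def to_digits(value, radix):
    """Inverse of from_digits for non-negative integers."""
    if value == 0:
        return [0]
    digits = []
    while value > 0:
        value, d = divmod(value, radix)
        digits.append(d)
    return digits[::-1]
```

For example, `from_digits([1, 1, 0, 1], 2)` evaluates the binary string 1101 as ((1 × 2 + 1) × 2 + 0) × 2 + 1.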

8 Complex Radix Prefix form

9 Outline
- Original Idea
- Definition of BranchCode
- Addressing Update Operations on Trees
- Compression Mechanism
- Experimental Evaluation
- Conclusion

10 Definition of BranchCode

11 Example
[-, 1] [2, 1] [3, 1] [3, -]
- R =
- D =
- b(n) = S(D, R) = × (1 + 3 × 1) = 13

12 Query Answering
2. Navigability
3. Lowest Common Ancestor (LCA): stems from Navigability.

13 Outline
- Original Idea
- Definition of BranchCode
- Addressing Update Operations on Trees
- Compression Mechanism
- Experimental Evaluation
- Conclusion

14 BranchCode for Dynamic Trees
- S(D,R), where D = R =
- S’(D’,R’), where D’ = R’ =
- Delta = |S’ – S|
- How to calculate Delta?

15 Incremental Update of BranchCode

16 Incremental Update of BranchCode

17 Incremental Update of BranchCode

18 Incremental Update of BranchCode

19 Example

20 Affected Nodes after Update
- When we insert (or delete) a child of a particular node, all its descendants will be affected.
- According to mathematical proofs, in expectation O(n) nodes can be affected after an insertion operation in some bad cases, where n is the size of the tree.

21 Affected Nodes after Update (Cont’d)
- Post-order traversal on trees. Seq = {2, 3, 6, 7, 4, 5, 1}
- Two properties of the post-order sequence:
  1) All descendants of a single node are consecutive in the post-order sequence.
  2) All descendants of a set of consecutive siblings are consecutive in the post-order sequence.
- Use a splay tree to maintain the sequence.
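The two properties can be checked on a small example. The tree below is hypothetical: the slide's figure is not available, so `CHILDREN` is simply one tree whose post-order sequence matches the Seq shown. For property 2 the sketch assumes that "descendants" includes the siblings themselves, that is, it checks the union of their subtrees.

```python
# Hypothetical tree whose post-order sequence is 2, 3, 6, 7, 4, 5, 1.
CHILDREN = {1: [2, 3, 4, 5], 2: [], 3: [], 4: [6, 7], 5: [], 6: [], 7: []}


def post_order(root):
    """Return the nodes in post-order (children first, then the node)."""
    seq = []

    def visit(node):
        for child in CHILDREN[node]:
            visit(child)
        seq.append(node)

    visit(root)
    return seq


def descendants(node):
    """All proper descendants of node."""
    out = []
    for child in CHILDREN[node]:
        out.append(child)
        out.extend(descendants(child))
    return out


def subtree(node):
    """The node together with all of its descendants."""
    return [node] + descendants(node)


def occupies_one_block(nodes, seq):
    """True if the given nodes sit at consecutive positions of seq."""
    if not nodes:
        return True
    positions = sorted(seq.index(n) for n in nodes)
    return positions[-1] - positions[0] + 1 == len(positions)
```

Because the affected nodes always form one contiguous block of the sequence, an update reduces to a range operation, which is what a balanced sequence structure such as a splay tree can support.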

22 Update and query based on splay tree
- Update Based on Splay Tree

23 Maintenance of Buffered Marks

24 Outline
- Original Idea
- Definition of BranchCode
- Addressing Update Operations on Trees
- Compression Mechanism
- Experimental Evaluation
- Conclusion

25 Compressed BranchCode
- Definition of Compressed Code:

26 Property of Compressed Code
- Congruence:
- CA Determination:

27 Outline
- Original Idea
- Definition of BranchCode
- Addressing Update Operations on Trees
- Compression Mechanism
- Experimental Evaluation
- Conclusion

28 Accuracy of Compressed Code

29 Results on Real Data
- Data sets:

30 Results on Real Data (Cont’d)

31 Results on Synthetic Data

32 Outline
- Original Idea
- Definition of BranchCode
- Addressing Update Operations on Trees
- Compression Mechanism
- Experimental Evaluation
- Conclusion

33 Conclusions
- We systematically explore the basic properties of branch codes and construct conditions for correctly determining the relationships of nodes in trees.
- The compressed BranchCode reduces the storage cost to linear complexity.
- We also design an incremental approach (with O(log N) amortized update and query cost) based on a splay tree to maintain branch codes on dynamic trees.

34 Open Question
How can we theoretically estimate the probability of FP given a particular modulo set?

35 Thank you for your attention!

36 Motivation of Problem
- Why do you study this problem?

37 Related works
- How did people solve this problem in previous works?
- Survey of any other related works
  - Problems that are similar to your work
  - Techniques that are used in your solution
  - Any other related works

38 Problem definition
- Formal definition
- Property of proposed problem
- Is this problem novel?
  - Difference between this problem and the related problem
- Does this problem deserve our research efforts?
  - Challenges of this problem
  - Is this problem NP-hard? If so, give the proof

39 Baseline Solution
- What is the naive solution to this problem?
- Why is this solution unacceptable?
  - Complexity
  - Scalability
  - Or any other issues

40 Your solution
- Basic idea of your solution
  - Example, if one exists
- Algorithm framework of your solution

41 Key technique of your solution
- For each technique, give the following
  - Rationale of this technique
  - Procedure of the technique
  - Can we prove the efficiency or effectiveness of your solution? If so, give the proof
  - Optimization of your technique when handling large data or dynamic data

42 Planning of next step
- What do you plan to do as the next step?
- Checkpoint
- Delivery