XML Labeling and Query Optimization, SIGMOD 2009, 2009-7-3.


Presentation transcript:

XML Labeling and Query Optimization (SIGMOD)

Outline
- XML DBS-related research at SIGMOD 2009
- DDE labeling scheme
- XQuery optimization
- Conclusion

SIGMOD 2009

Research Session 16: Query Processing on Semi-structured Data
- Cost Based Plan Selection for XPath. Haris Georgiadis, Minas Charalambides, Vasilis Vassalos (all Athens University of Economics and Business)
- ROX: Run-time Optimization of XQueries. Riham Abdel Kader (University of Twente), Peter Boncz (CWI), Stefan Manegold (CWI), Maurice Van Keulen (University of Twente)

Research Session 19: Semi-structured Data Management
- DDE: From Dewey to a Fully Dynamic XML Labeling Scheme. Liang Xu, Tok Wang Ling, Huayu Wu, Zhifeng Bao (all National University of Singapore)
- Simplifying XML Schema: Effortless Handling of Nondeterministic Regular Expressions. Geert Jan Bex (Hasselt University and Transnational University of Limburg), Wouter Gelade (Hasselt University and Transnational University of Limburg), Wim Martens (Technical University of Dortmund), Frank Neven (Hasselt University and Transnational University of Limburg)
- FlexRecs: Expressing and Combining Flexible Recommendations. Georgia Koutrika, Benjamin Bercovitz, Hector Garcia-Molina (all Stanford University)

Dewey labeling
- A node's label is the concatenation of its parent's label and its local order among its siblings
- Helpful for keyword search
- High relabeling cost for dynamic XML documents
- Is there a labeling scheme that not only has compact size and high query performance but also completely avoids relabeling?
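As an illustration of the two points above (label construction and relabeling cost), here is a minimal Python sketch. The tuple-based tree, the node names, and the helper names `dewey_labels` and `fmt` are assumptions made for this example and do not come from the slides.

```python
def dewey_labels(tree, label=(1,)):
    """Yield (name, label) pairs in document order.

    `tree` is a hypothetical (name, children) tuple. A child's label is
    its parent's label plus its 1-based position among its siblings.
    """
    name, children = tree
    yield name, label
    for order, child in enumerate(children, start=1):
        yield from dewey_labels(child, label + (order,))


def fmt(label):
    """Render a label tuple in dotted form, e.g. (1, 1, 2) -> '1.1.2'."""
    return ".".join(str(c) for c in label)


subtree_b = ("B", [("E", []), ("F", [])])
doc = ("A", [subtree_b, ("C", []), ("D", [])])
labels = {name: fmt(lab) for name, lab in dewey_labels(doc)}

# Insert a new node X as the first child of A and label the tree again.
doc_after = ("A", [("X", []), subtree_b, ("C", []), ("D", [])])
labels_after = {name: fmt(lab) for name, lab in dewey_labels(doc_after)}

# Every old node whose label changed had to be relabeled.
changed = [name for name in labels if labels_after[name] != labels[name]]
print(labels)
print(changed)
```

In this toy document, one insertion at the front forces every following sibling and all of their descendants to be relabeled, which is the cost the slide refers to.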

DDE Labeling (1)

Characteristics
- Completely avoids relabeling
- Supports queries efficiently
- Does not increase label length
- Needs only the definition of "preorder"

Preorder: for labels A: a1.a2…am and B: b1.b2…bn, A ≤dde B if [condition not preserved in the transcript]

Source: DDE: From Dewey to a Fully Dynamic XML Labeling Scheme. Liang Xu, Tok Wang Ling. School of Computing, National University of Singapore.

DDE Labeling (2)
- Leftmost insertion: insert before node A: a1.a2…an (A is the first child). New node: a1.a2…(an - 1)
- Rightmost insertion: insert after node A: a1.a2…an (A is the last child). New node: a1.a2…(an + 1)
- Insertion below a leaf node: insert below leaf node A: a1.a2…an. New node: a1.a2…an.1
- Insertion between two consecutive siblings: insert between A and B. New node: A + B
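The four insertion rules above can be sketched in Python as follows. Labels are modelled as integer tuples. The function names are hypothetical, and reading "A + B" as a component-wise sum of two equal-length labels is an assumption, since the slide does not spell the operation out.

```python
def insert_leftmost(first_child):
    """New label before the first child: decrement the last component."""
    return first_child[:-1] + (first_child[-1] - 1,)


def insert_rightmost(last_child):
    """New label after the last child: increment the last component."""
    return last_child[:-1] + (last_child[-1] + 1,)


def insert_below_leaf(leaf):
    """New label for the first child of a leaf: append the component 1."""
    return leaf + (1,)


def insert_between(a, b):
    """New label between consecutive siblings a and b.

    Assumption: 'A + B' is the component-wise sum of the two labels.
    """
    if len(a) != len(b):
        raise ValueError("siblings must have labels of equal length")
    return tuple(x + y for x, y in zip(a, b))


print(insert_leftmost((1, 1)))         # label placed before 1.1
print(insert_rightmost((1, 3)))        # label placed after 1.3
print(insert_below_leaf((1, 2)))       # first child of leaf 1.2
print(insert_between((1, 1), (1, 2)))  # label placed between 1.1 and 1.2
```

None of these operations touches an existing label, which is how the scheme avoids relabeling. Note that the leftmost rule can produce zero or negative components.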

DDE Labeling (3): example
[Figure: example tree with nodes A through I]

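DDE Labeling (4)
- AD relationship: A (with m components) is an ancestor of B (with n components) if m < n and [condition not preserved in the transcript]
- PC relationship: A is the parent of B if m = n - 1 and A is an ancestor of B
- Document order: A precedes B if A <dde B
- Sibling relationship: A is a sibling of B if [condition not preserved in the transcript]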
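The conditions missing from this slide can be illustrated with a hedged Python sketch. It assumes a proportionality test on the label components, with the first component always positive. That test is this sketch's reading of the published DDE scheme, not something preserved in the transcript, and all function names are hypothetical.

```python
from functools import cmp_to_key


def is_ancestor(a, b):
    """Assumed test: a is shorter than b and proportional to b's prefix."""
    return len(a) < len(b) and all(x * b[0] == y * a[0] for x, y in zip(a, b))


def is_parent(a, b):
    """Parent = ancestor that is exactly one level above (m = n - 1)."""
    return len(a) == len(b) - 1 and is_ancestor(a, b)


def precedes(a, b):
    """Assumed document order: compare components scaled by the first ones."""
    for x, y in zip(a, b):
        lhs, rhs = x * b[0], y * a[0]
        if lhs != rhs:
            return lhs < rhs
    return len(a) < len(b)  # an ancestor comes before its descendants


def is_sibling(a, b):
    """Assumed test: same length, proportional parents, distinct positions."""
    same_parent = all(x * b[0] == y * a[0] for x, y in zip(a[:-1], b[:-1]))
    distinct = a[-1] * b[0] != b[-1] * a[0]
    return len(a) == len(b) > 1 and same_parent and distinct


def order(x, y):
    """Three-way comparison built on precedes, for sorting."""
    return -1 if precedes(x, y) else (1 if precedes(y, x) else 0)


a, b = (1, 1), (1, 2)  # two consecutive siblings
c = (2, 3)             # inserted between them as a + b
d = (2, 3, 1)          # first child of c
print(sorted([b, d, c, a], key=cmp_to_key(order)))
print(is_parent(c, d), is_ancestor(a, d), is_sibling(a, c))
```

Cross-multiplying keeps the arithmetic in integers and avoids dividing by a zero component.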

CDDE Labeling (1)
- CDDE: Compact DDE
- Enhances the performance of DDE for insertions
- Relationship between the two: CDDE label -> DDE label

CDDE Labeling (2): example
[Figure: example tree with nodes A through K]

Optimization of XQueries
- Takes a join graph as input and accounts for correlations
- Interleaves optimization and execution steps
- Uses samples to estimate cost
- Uses indexes to obtain the samples
- Uses chain sampling to find the optimal path

Source: ROX: Run-time Optimization of XQueries. Riham Abdel Kader, University of Twente, Enschede, The Netherlands.

Join Graphs

let $r := doc("auction.xml")
for $a in $r//open_auction[./reserve]/bidder//personref,
    $b in $r//person[.//education]
where … = …
return $a

Related notation

Cut-off sampled operators
- One operator returns a sample of size l
- The other returns partial execution results of the operator OP, of size l

Weight of an edge
- cost(p) = cost(p') + est × card(source) / T
- sf(p) = est / T
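The two weight formulas above are small enough to check by hand. The Python sketch below assumes that `est` is the result size estimated from a sample, that `T` is the sample size, and that `card(source)` is the cardinality of the edge's source. The slide does not define these symbols, so this reading, the function name `edge_weight`, and the input numbers are all illustrative assumptions.

```python
def edge_weight(cost_prev, est, card_source, sample_size):
    """Return (cost, sf) for a path p that extends a path p'.

    cost(p) = cost(p') + est * card(source) / T
    sf(p)   = est / T
    """
    sf = est / sample_size
    cost = cost_prev + est * card_source / sample_size
    return cost, sf


# Hypothetical inputs: a sample of 100 source tuples yields 150 results,
# and the full source holds 1000 tuples.
cost, sf = edge_weight(cost_prev=0, est=150, card_source=1000, sample_size=100)
print(cost, sf)
```

Under this reading, sf is the average number of results per sampled input tuple, and the added cost term scales that figure up to the full source.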

Chain Sampling
- Lets ROX avoid local minima caused by correlations
- Explores only those paths that branch from the edge with the smallest weight
- Finds the optimal path pi

Example for Chain Sampling
- [cost, sf](p1) = [1500, 1.5]
- [cost, sf](p2) = [2000, 1]
- [cost, sf](p3) = [1300, 0.1]
- [cost, sf](p4) = [3200, 2]
[Figure: join graph with vertices V1 through V8 and candidate paths p1 through p4]
p3 is selected.
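The slide does not state the rule by which p3 is chosen. The Python sketch below assumes the simplest rule consistent with the numbers: pick the path with the lowest estimated cost and break ties on the lower sf. The function name `select_path` is hypothetical, and this is not presented as ROX's actual stopping condition.

```python
# [cost, sf] estimates copied from the slide.
estimates = {
    "p1": (1500, 1.5),
    "p2": (2000, 1.0),
    "p3": (1300, 0.1),
    "p4": (3200, 2.0),
}


def select_path(candidates):
    """Assumed rule: minimise cost first, then sf (tuple comparison)."""
    return min(candidates, key=lambda path: candidates[path])


best = select_path(estimates)
print(best)
```

Here p3 has both the lowest cost and the lowest sf, so any rule that prefers small values of either would select it.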

Optimization Algorithm
- Goal: reduce the intermediate results
- Explore the search space by chain sampling
- Find the optimal path
- Interleave optimization and execution steps

Illustration

let $d := doc("xmark.xml")
for $o in $d//open_auction[.//current/text() < 145],
    $p in $d//person[.//province],
    $i in $d//item[./quantity = 1]
where … = … and … = …
return $a

Illustration (continued)
[Figure not preserved; surviving path labels: p1, p]

Conclusion
- DDE and CDDE: tailored for both static and dynamic XML documents
- ROX (run-time optimization): reduces the number of intermediate results