
Presentation transcript:

TJFast: Effective Processing of XML Twig Pattern Matching
Jiaheng Lu, Ting Chen and Tok Wang Ling
National University of Singapore

[1. INTRODUCTION]

Finding all the occurrences of a twig pattern in an XML database is a core operation for efficient evaluation of XML queries. Our motivation is twofold:
(1) The performance of previous holistic twig join algorithms [1][2] can be further improved.
(2) Algorithms based on region encoding cannot answer queries with wildcards in branching nodes.

For example, in Figure 1, which document, Doc1 or Doc2, matches the query according to the region codes? By reading the region encodings of elements a, b and c alone, we cannot answer this wildcard branching query.

Figure 1. An example to illustrate the limitation of region encoding

Tatarinov et al. [4] proposed the Dewey labeling scheme, which can be used to answer this wildcard query (see Figure 2). Since b and c do not share the same parent in Doc1, only Doc2 matches the query. However, a twig join algorithm based on the Dewey scheme is less efficient than one based on region encoding, because prefix comparison is more time-consuming than the integer comparison used in region encoding.

Figure 2. An example to answer a wildcard query with the Dewey scheme

In this paper, we extend the Dewey labeling scheme. Extended Dewey solves both problems: it can answer wildcard queries, and it gives better query performance than algorithms based on region encoding.

[2. Our new labeling scheme: EXTENDED DEWEY]

Figure 3. An example to answer a wildcard query with the Dewey scheme

Figure 4. DTD for the XML tree in Figure 3

Labeling method: Given a document and its DTD, we use a modulo function to map the integers in a label to tag names. For example, take the DTD rule book → author, title, chapter, and let x(t) denote the last integer of the label of an element with tag t. Then x(author) mod 3 = 1, x(title) mod 3 = 2 and x(chapter) mod 3 = 0. The label of any text value ends with 0.

Figure 5. A finite state transducer for the DTD in Figure 4

Given an extended Dewey label, we can use the finite state transducer in Figure 5 to derive the path of the labeled node, for example:
bib/book/chapter/section
bib/book/chapter/title

[3. A new holistic algorithm: TJFAST]

To answer a twig pattern query, we propose a new holistic twig join algorithm called TJFast. Unlike previous algorithms, TJFast only needs to access the labels of the query's leaf nodes to answer path and twig queries, which significantly reduces I/O cost.

For example, given the path query //chapter/section/text, we only access the labels of text to answer the query. Given the twig query //chapter/section[.//keyword]/text, we only scan keyword and text.

[4. Preliminary experiments]

Experimental setting:
(1) We use random data sets (with 3 million nodes) consisting of seven labels, namely a, b, ..., g. The node labels in the data are uniformly distributed.
(2) We issue four twig queries: a[.//b]//c, a[./b]/c, a[./b/c]/d/e and a[.//b/c]//d/e.
(3) We compare our method with the previous work TwigStack [1] and TwigStackList [2].

Results analysis: TJFast outperforms TwigStack and TwigStackList under all settings. The improvement is due to the fact that TJFast only scans the labels for query leaf nodes. Algorithms based on region encoding are comparable to TJFast only when the number of elements for internal query nodes is very small.

References:
[1] N. Bruno, N. Koudas, and D. Srivastava. Holistic twig joins: optimal XML pattern matching. In SIGMOD Conference.
[2] J. Lu, T. Chen, and T. W. Ling. Efficient processing of XML twig patterns with parent child edges: a look-ahead approach. In CIKM, pages 533-542, 2004.
[3] P. O'Neil et al. ORDPATHs: Insert-friendly XML node labels. In SIGMOD, pages 903-908.
[4] I. Tatarinov et al. Storing and querying ordered XML using a relational database system. In Proc. of SIGMOD, pages 204-215.
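
The label decoding described under [2. Our new labeling scheme: EXTENDED DEWEY] and the same-parent test used for the wildcard query in the introduction can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation. Only the book row of the table (author, title, chapter with residues 1, 2, 0) comes from the poster; the bib and chapter rows, the tuple representation of labels, and the names TRANSDUCER, decode_path and same_parent are assumptions made for the example.

```python
# Transition table of a finite state transducer: for each element tag, map
# the residue (x mod number-of-child-tags) to the child tag it stands for.
TRANSDUCER = {
    "bib": {0: "book"},                                # assumed: one child tag
    "book": {1: "author", 2: "title", 0: "chapter"},   # from the poster
    "chapter": {1: "title", 0: "section"},             # assumed for illustration
}
ROOT = "bib"


def decode_path(label):
    """Derive the tag path of a node from its extended Dewey label.

    `label` is a tuple of integers, one per level below the root.
    """
    path = [ROOT]
    state = ROOT
    for x in label:
        children = TRANSDUCER[state]
        # The residue of x selects the child tag of the current state.
        state = children[x % len(children)]
        path.append(state)
    return "/".join(path)


def same_parent(label_a, label_b):
    """Two Dewey labels have the same parent if they agree on everything
    except their last component."""
    return (
        len(label_a) == len(label_b)
        and len(label_a) > 0
        and label_a[:-1] == label_b[:-1]
    )


if __name__ == "__main__":
    print(decode_path((0, 3, 2)))  # bib/book/chapter/section
    print(decode_path((0, 3, 1)))  # bib/book/chapter/title
    print(same_parent((0, 1, 1), (0, 1, 2)))  # True
    print(same_parent((0, 1, 1), (0, 2, 1)))  # False
```

Because the full path is recoverable from a leaf label alone, a query such as //chapter/section/text can in principle be checked without reading the labels of chapter or section elements, which is the property TJFast relies on.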