Indexing Semistructured Data J. McHugh, J. Widom, S. Abiteboul, Q. Luo, and A. Rajaraman Stanford University January 1998


Indexing Semistructured Data J. McHugh, J. Widom, S. Abiteboul, Q. Luo, and A. Rajaraman Stanford University January EECS /21/2000 Presented by Weiming Zhou

Outline
Introduction
- Data Model
- Query Language
Indexes in Lore
Query plans using indexes
Conclusions

Data Model - Object Exchange Model (OEM)

The Lorel Query Language (Lorel)
Example 1:
select DB.Movie.Title
where DB.Movie.Actor.Name = “Harrison Ford”
Example 2:
select T
from DB.Movie M, M.Title T
where exists A in M.Actor : exists N in A.Name : N = “Harrison Ford”

Indexes in Lore
- Value index
- Text index
- Link index
- Path index
- Edge index

Value Index
- Similar to attribute indexes in a relational DBMS.
- Example: suppose we create a Value Index for DB.Movie.Year. A lookup for DB.Movie.Year = “1956” returns &12.
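The lookup on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the index is modelled as a dictionary keyed by (incoming label, value), the edge list is a toy fragment whose wiring is assumed, and the value "1977" and the title string are made up. Only the object ids &7 and &12 and the value "1956" come from the slides.

```python
from collections import defaultdict

# Toy OEM fragment as (parent, label, child) edges. The wiring is an assumption.
EDGES = [(2, "Year", 7), (3, "Year", 12), (2, "Title", 5)]
# Atomic values; "1977" and "Title A" are hypothetical placeholders.
VALUES = {7: "1977", 12: "1956", 5: "Title A"}

def build_value_index(edges, values):
    """Map (incoming label, atomic value) to the atomic objects holding that value."""
    index = defaultdict(list)
    for _parent, label, child in edges:
        if child in values:
            index[(label, values[child])].append(child)
    return dict(index)

def value_lookup(index, label, value):
    """Return the objects with incoming edge `label` whose value equals `value`."""
    return index.get((label, value), [])

value_index = build_value_index(EDGES, VALUES)
print(value_lookup(value_index, "Year", "1956"))
```

With this toy data the lookup prints [12], matching the &12 result on the slide. A dictionary only supports equality lookups; an ordered structure would be needed for range predicates.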

Text Index
- An information-retrieval style keyword search, restricted by incoming labels.
- Locates string values containing specific words.
- Useful for strings containing a significant amount of text.
- Implementation: inverted lists that map a given word w and label l to a list of atomic values with incoming edge l that contain word w.
- Example: look up all objects with an atomic string value containing the word “Ford” and an incoming edge Name. Results: {, }.
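A minimal sketch of such an inverted list, assuming whitespace tokenization and case-insensitive matching (neither is specified on the slide). The toy edges, object ids, and title string are assumptions made for illustration.

```python
from collections import defaultdict

# Toy OEM fragment as (parent, label, child) edges. The wiring is an assumption.
EDGES = [(6, "Name", 17), (13, "Name", 21), (2, "Title", 5)]
# Atomic string values; "Title A" is a hypothetical placeholder.
VALUES = {17: "Harrison Ford", 21: "Harrison Ford", 5: "Title A"}

def build_text_index(edges, values):
    """Inverted lists: (word, incoming label) -> set of atomic objects."""
    index = defaultdict(set)
    for _parent, label, child in edges:
        text = values.get(child)
        if isinstance(text, str):
            for word in text.lower().split():
                index[(word, label)].add(child)
    return dict(index)

def text_lookup(index, word, label):
    """Objects with incoming edge `label` whose string value contains `word`."""
    return sorted(index.get((word.lower(), label), set()))

text_index = build_text_index(EDGES, VALUES)
print(text_lookup(text_index, "Ford", "Name"))
```

Keying the lists by both word and label is what lets the lookup be restricted by incoming label without scanning unrelated strings.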

Link Index
- Locates parents of a given object; serves as back-pointers.
- Implementation: extendible hashing, with one Link Index for the entire database graph.
- Example: the Link Index lookup for object &17 returns parent object &6, and the lookup for object &21 returns object &13.
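The back-pointer idea can be sketched as a single hash map from child to parents. Here a Python dictionary stands in for the extendible hash table, and the edge list is an assumed toy fragment chosen so that the two lookups from the slide can be reproduced.

```python
from collections import defaultdict

# Toy OEM fragment as (parent, label, child) edges. The wiring is an assumption.
EDGES = [(2, "Actor", 6), (6, "Name", 17), (4, "Actor", 13), (13, "Name", 21)]

def build_link_index(edges):
    """One index for the whole graph: child -> list of (label, parent)."""
    index = defaultdict(list)
    for parent, label, child in edges:
        index[child].append((label, parent))
    return dict(index)

def parents_of(index, obj):
    """Return the parent objects of `obj`, ignoring edge labels."""
    return [parent for _label, parent in index.get(obj, [])]

link_index = build_link_index(EDGES)
print(parents_of(link_index, 17), parents_of(link_index, 21))
```

With this toy data the lookups return [6] and [13], the same parents the slide reports for &17 and &21.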

Path Index
- Locates all objects reachable by a given labeled path.
- Provided by the DataGuide.
- Example: for select DB.Movie.Title, the Path Index directly locates all objects reachable via DB.Movie.Title. Results: &5, &9, &14.
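As a rough illustration, a path index can be approximated by a table from label paths to the objects they reach. This is not how a DataGuide is actually stored (it is a summary graph, not a flat table), and the sketch assumes an acyclic toy graph so that the traversal terminates. The edge wiring is an assumption; the object ids follow the slides.

```python
from collections import defaultdict

ROOT = 1  # the object named DB (assumed id)
# Toy OEM fragment as (parent, label, child) edges. The wiring is an assumption.
EDGES = [
    (1, "Movie", 2), (1, "Movie", 3), (1, "Movie", 4),
    (2, "Title", 5), (3, "Title", 9), (4, "Title", 14),
    (2, "Year", 7), (3, "Year", 12),
]

def build_path_index(root_name, root, edges):
    """Map every label path from the root to the set of objects it reaches."""
    children = defaultdict(list)
    for parent, label, child in edges:
        children[parent].append((label, child))
    index = defaultdict(set)
    stack = [(root_name, root)]
    while stack:  # terminates only because the toy graph is acyclic
        path, obj = stack.pop()
        index[path].add(obj)
        for label, child in children.get(obj, []):
            stack.append((f"{path}.{label}", child))
    return dict(index)

path_index = build_path_index("DB", ROOT, EDGES)
print(sorted(path_index["DB.Movie.Title"]))
```

With this toy data the lookup for DB.Movie.Title yields [5, 9, 14] in one step, with no traversal at query time.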

Edge Index
- Locates all parent-child pairs connected via a specified label.
- Example: looking up label “Year” in the Edge Index returns &2-&7 and &3-&12.
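A minimal sketch of an edge index as a map from label to parent-child pairs. The dictionary representation and the Title edge are assumptions; the two Year pairs are the ones listed on the slide.

```python
from collections import defaultdict

# Toy OEM fragment as (parent, label, child) edges. The Title edge is an assumption.
EDGES = [(2, "Year", 7), (3, "Year", 12), (2, "Title", 5)]

def build_edge_index(edges):
    """Map each label to the (parent, child) pairs it connects."""
    index = defaultdict(list)
    for parent, label, child in edges:
        index[label].append((parent, child))
    return dict(index)

edge_index = build_edge_index(EDGES)
print(edge_index.get("Year", []))
```

With this toy data the lookup for "Year" returns [(2, 7), (3, 12)], which corresponds to the pairs &2-&7 and &3-&12.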

Query Plans Using Indexes
- Top-Down
- Bottom-Up
- Hybrid
Example:
select T
from DB.Movie M, M.Title T
where exists A in M.Actor : exists N in A.Name : N = “Harrison Ford”

Top-Down Query Plan
- Exhaustive top-down traversals
- DB.Movie.Actor.Name = “Harrison Ford”: &17, &21
- Link Index: &17 → &2, &21 → &4
- DB.Movie.Title: &5, &14
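A top-down plan for the example query can be sketched as a plain traversal from the root that uses no index at all. The toy graph below is an assumption (object ids follow the slides, while the wiring and the title and year strings are made up), and the helper names `follow` and `top_down` are hypothetical.

```python
from collections import defaultdict

ROOT = 1  # the object named DB (assumed id)
# Toy OEM graph as (parent, label, child) edges. The wiring is an assumption.
EDGES = [
    (1, "Movie", 2), (1, "Movie", 3), (1, "Movie", 4),
    (2, "Title", 5), (2, "Actor", 6), (2, "Year", 7),
    (3, "Title", 9), (3, "Year", 12),
    (4, "Title", 14), (4, "Actor", 13),
    (6, "Name", 17), (13, "Name", 21),
]
# Atomic values; titles and "1977" are hypothetical placeholders.
VALUES = {5: "Title A", 9: "Title B", 14: "Title C", 7: "1977",
          12: "1956", 17: "Harrison Ford", 21: "Harrison Ford"}

children = defaultdict(list)
for parent, label, child in EDGES:
    children[parent].append((label, child))

def follow(objs, label):
    """Follow every `label` edge leaving any object in `objs`."""
    return [c for o in objs for (l, c) in children.get(o, []) if l == label]

def top_down(root, name):
    """Visit every movie, test its actors' names, then emit its titles."""
    titles = []
    for movie in follow([root], "Movie"):
        names = follow(follow([movie], "Actor"), "Name")
        if any(VALUES.get(n) == name for n in names):
            titles.extend(follow([movie], "Title"))
    return titles

print(top_down(ROOT, "Harrison Ford"))
```

With this toy data the plan returns [5, 14], but only after visiting every movie, including the one with no matching actor.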

Bottom-Up Query Plan
- Look up Value Index: DB.Movie.Actor.Name = “Harrison Ford” gives &17, &21
- Link Index: &17 → &2, &21 → &4
- DB.Movie.Title: &5, &14
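The bottom-up steps can be sketched as follows: start from a value lookup, climb with the link index, then step down to the titles. The toy graph is an assumption (object ids follow the slides, the wiring and title strings are made up), the indexes are plain dictionaries, and the helper names `climb` and `bottom_up` are hypothetical.

```python
from collections import defaultdict

ROOT = 1  # the object named DB (assumed id)
# Toy OEM graph as (parent, label, child) edges. The wiring is an assumption.
EDGES = [
    (1, "Movie", 2), (1, "Movie", 3), (1, "Movie", 4),
    (2, "Title", 5), (2, "Actor", 6),
    (3, "Title", 9),
    (4, "Title", 14), (4, "Actor", 13),
    (6, "Name", 17), (13, "Name", 21),
]
# Atomic values; the title strings are hypothetical placeholders.
VALUES = {5: "Title A", 9: "Title B", 14: "Title C",
          17: "Harrison Ford", 21: "Harrison Ford"}

value_index = defaultdict(list)  # (label, value) -> atomic objects
link_index = defaultdict(list)   # child -> (label, parent) back-pointers
children = defaultdict(list)     # parent -> (label, child)
for parent, label, child in EDGES:
    link_index[child].append((label, parent))
    children[parent].append((label, child))
    if child in VALUES:
        value_index[(label, VALUES[child])].append(child)

def climb(objs, label):
    """Parents reached by walking one `label` edge backwards, without duplicates."""
    found = [p for o in objs for (l, p) in link_index.get(o, []) if l == label]
    return list(dict.fromkeys(found))

def bottom_up(name):
    names = value_index.get(("Name", name), [])  # value lookup
    actors = climb(names, "Name")                # Name -> Actor object
    movies = climb(actors, "Actor")              # Actor -> Movie object
    # Keep only movies that hang off the root via a Movie edge (path check).
    movies = [m for m in movies if ("Movie", ROOT) in link_index.get(m, [])]
    return [c for m in movies for (l, c) in children.get(m, []) if l == "Title"]

print(bottom_up("Harrison Ford"))
```

With this toy data the plan returns [5, 14] and never touches the movie that has no matching actor, which is the appeal of starting from a selective value predicate.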

Hybrid Query Plan
select X from A.B X where exists Y in X.C : Y = 5
- Bottom-up: Value Index, A.B.C = “5”
- Top-down: A.B
- Intersect
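One way to picture the hybrid plan is to compute candidate X objects from both directions and intersect them. Everything in this sketch is an assumption made for illustration: the graph, the object ids, the extra D edge, and the use of integer values rather than strings.

```python
from collections import defaultdict

ROOT = 1  # the object named A (assumed id)
# Hypothetical graph as (parent, label, child) edges.
EDGES = [(1, "B", 2), (1, "B", 3), (1, "D", 6),
         (2, "C", 4), (3, "C", 5), (6, "C", 7)]
VALUES = {4: 5, 5: 7, 7: 5}  # hypothetical atomic values

value_index = defaultdict(list)  # (label, value) -> atomic objects
link_index = defaultdict(list)   # child -> (label, parent) back-pointers
children = defaultdict(list)     # parent -> (label, child)
for parent, label, child in EDGES:
    link_index[child].append((label, parent))
    children[parent].append((label, child))
    if child in VALUES:
        value_index[(label, VALUES[child])].append(child)

def hybrid(root, value):
    # Bottom-up half: objects with a C child equal to `value`.
    ys = value_index.get(("C", value), [])
    from_below = {p for y in ys for (l, p) in link_index.get(y, []) if l == "C"}
    # Top-down half: objects reachable via A.B.
    from_above = [c for (l, c) in children.get(root, []) if l == "B"]
    # Intersect, keeping the top-down order.
    return [x for x in from_above if x in from_below]

print(hybrid(ROOT, 5))
```

With this data the bottom-up half yields objects 2 and 6, the top-down half yields 2 and 3, and the intersection is [2]. Object 6 is discarded because it is not reachable via A.B.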

Conclusions
- The paper presents Lore’s indexing structures: Value Index, Text Index, Link Index, Path Index, and Edge Index.
- It describes query plans that use these indexes.
- Preliminary performance results show at least an order of magnitude improvement when indexes are used for query processing.