Information Retrieval and Databases: Synergies and Syntheses
IDM Workshop Panel, 15 Sep 2003
Jayavel Shanmugasundaram, Cornell University

Presentation transcript:

10000 foot view of Data Management
Data: Structured, Unstructured
Queries: Complex and Structured, Ranked Keyword Search
Systems: Database Systems, Information Retrieval Systems

Applications
Information discovery over structured databases
Keyword search over relational databases
–DBXplorer [Agrawal et al.]
–DISCOVER [Hristidis et al.]
–BANKS [Hulgeri et al.]
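
To make the idea of keyword search over relational databases concrete, here is a minimal sketch in Python. It is not the algorithm used by DBXplorer, DISCOVER, or BANKS; it only shows the basic effect of letting a user type keywords that may match in different tables connected by a foreign key. The schema (author, paper), the sample rows, and the function names build_demo_db and keyword_search are all hypothetical.

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory database (hypothetical schema and rows)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE paper (
            id INTEGER PRIMARY KEY,
            title TEXT,
            author_id INTEGER REFERENCES author(id)
        );
        INSERT INTO author VALUES (1, 'Ada Smith'), (2, 'Ben Jones');
        INSERT INTO paper VALUES
            (10, 'Ranked search over XML', 1),
            (11, 'Query optimization', 1),
            (12, 'Ranked search over text', 2);
        """
    )
    return conn


def keyword_search(conn, keywords):
    """Return joined (author, paper) rows that contain every keyword.

    A keyword may match in either table; the join is what connects them.
    """
    rows = conn.execute(
        "SELECT a.name, p.title FROM author a "
        "JOIN paper p ON p.author_id = a.id ORDER BY p.id"
    ).fetchall()
    hits = []
    for name, title in rows:
        words = f"{name} {title}".lower().split()
        if all(k.lower() in words for k in keywords):
            hits.append((name, title))
    return hits


conn = build_demo_db()
print(keyword_search(conn, ["smith", "ranked"]))
```

The query "smith ranked" matches only when the author tuple and the paper tuple are joined, which is the kind of result a user cannot get by searching each table in isolation.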

Applications
Content management
–Mix of structured and unstructured data
  Database with date and time of accident (structured data) and accident description (unstructured data)
–Semi-structured data
  Scientific documents, Shakespeare’s plays, …
Support flexible ranked keyword search interface over such data
–XRANK [Guo et al., SIGMOD 2003]
–XIRQL [Fuhr et al., SIGIR 2001]

XML Keyword Search
Example XML document (figure):
–Workshop: XML and Information Retrieval: A SIGIR 2000 Workshop (David Carmel, Yoelle Maarek, Aya Soffer)
–Paper: XQL and Proximal Nodes (Ricardo Baeza-Yates, Gonzalo Navarro)
–Text: We consider the recently proposed language … Searching on structured text is becoming more important with XML … … …
Most specific results (exploits structure!)
Ranking at granularity of elements
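
The notion of "most specific results" can be sketched as follows: an element is returned only if its subtree contains all the keywords and none of its children already does. This is a simplified illustration in Python, not the XRANK implementation, and it omits ranking entirely. The element names (workshop, title, paper, body) and the helper names subtree_words and most_specific are assumptions made for the example; the text content is taken from the slide.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup around the text shown on the slide.
DOC = """
<workshop>
  <title>XML and Information Retrieval</title>
  <paper>
    <title>XQL and Proximal Nodes</title>
    <body>Searching on structured text is becoming more important with XML</body>
  </paper>
</workshop>
"""


def subtree_words(elem):
    """Lower-cased words found anywhere in the element's subtree."""
    words = set()
    for text in elem.itertext():
        words.update(text.lower().split())
    return words


def most_specific(root, keywords):
    """Elements that contain all keywords while no single child does."""
    wanted = {k.lower() for k in keywords}
    results = []
    for elem in root.iter():  # document order
        if not wanted <= subtree_words(elem):
            continue
        if any(wanted <= subtree_words(child) for child in elem):
            continue  # a child is a more specific answer
        results.append(elem)
    return results


root = ET.fromstring(DOC)
print([e.tag for e in most_specific(root, ["xql", "structured"])])
```

For the query "xql structured", the workshop element also contains both keywords, but the paper element is returned instead because it is the smallest element that covers them. Recomputing subtree_words for every element is quadratic and is done here only to keep the sketch short.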

Applications
The Internet is enabling end-users to directly ask queries and explore results
–E.g., Used car marketplace
–Find all “bright red ford mustangs” that cost less than 20% of the average price of cars in their class
Characteristics of queries
–Keyword search (for ease of use)
–Complex query operations (information synthesis)
–Want to see ranked results!
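
The used car query combines all three characteristics: a keyword match, an aggregate comparison, and a ranked answer. A minimal sketch in Python, assuming a hypothetical in-memory listing table (CARS), a hypothetical search function, and a deliberately naive score (the number of query keywords found in the description):

```python
from collections import defaultdict

# Hypothetical listings; "cls" is the car's class.
CARS = [
    {"id": 1, "cls": "sports", "price": 3000, "desc": "bright red ford mustang"},
    {"id": 2, "cls": "sports", "price": 30000, "desc": "red ford mustang"},
    {"id": 3, "cls": "sports", "price": 26000, "desc": "blue chevrolet camaro"},
    {"id": 4, "cls": "sedan", "price": 3000, "desc": "bright red ford taurus"},
    {"id": 5, "cls": "sedan", "price": 17000, "desc": "silver honda accord"},
    {"id": 6, "cls": "sports", "price": 2500, "desc": "red ford mustang convertible"},
]


def search(cars, keywords, fraction=0.2):
    """Rank cars priced below `fraction` of their class average by keyword hits."""
    # Complex query operation: average price per class.
    totals = defaultdict(lambda: [0.0, 0])
    for car in cars:
        totals[car["cls"]][0] += car["price"]
        totals[car["cls"]][1] += 1
    avg = {cls: total / count for cls, (total, count) in totals.items()}

    wanted = [k.lower() for k in keywords]
    ranked = []
    for car in cars:
        if car["price"] >= fraction * avg[car["cls"]]:
            continue  # fails the structured price predicate
        words = car["desc"].lower().split()
        score = sum(1 for k in wanted if k in words)  # keyword search
        if score > 0:
            ranked.append((car["id"], score))
    # Ranked results: best keyword score first, ties broken by id.
    ranked.sort(key=lambda pair: (-pair[1], pair[0]))
    return ranked


print(search(CARS, ["bright", "red", "ford", "mustang"]))
```

Car 4 matches three keywords but is filtered out by the price predicate, while car 6 passes the predicate and is ranked below car 1 because it matches fewer keywords. Neither a pure keyword engine nor a pure SQL query expresses both halves of this behavior naturally.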

Towards Unifying DB and IR
No standard query language for both DB and IR
–SQL and XQuery mostly “database” query languages
Currently developing TeXQuery: a full-text search extension to XQuery
–With S. Amer-Yahia, C. Botev, J. Robie
–Full composability of database and IR primitives, ranking
–Submitted to W3C committee on full-text extensions to XQuery

Summary
Applications have mix of structured (DB domain) and unstructured (IR domain) data
–Stark difference in how they can be processed
Benefits of unifying DB & IR
–Ranked keyword search (information discovery) over both structured and unstructured data
–Complex queries over structured/semi-structured data
A truly unified data store
–Need to generalize DB and IR techniques