Information Retrieval and Databases: Synergies and Syntheses IDM Workshop Panel 15 Sep 2003 Jayavel Shanmugasundaram Cornell University.

1 Information Retrieval and Databases: Synergies and Syntheses IDM Workshop Panel 15 Sep 2003 Jayavel Shanmugasundaram Cornell University

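2 10,000-foot view of Data Management
[Diagram: data vs. queries]
Data: Structured | Unstructured
Queries: Complex and Structured | Ranked Keyword Search
Database Systems: structured data, complex and structured queries
Information Retrieval Systems: unstructured data, ranked keyword search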

4 Applications
Information discovery over structured databases
Keyword search over relational databases
–DBXplorer [Agrawal et al.]
–DISCOVER [Hristidis et al.]
–BANKS [Hulgeri et al.]
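
The idea of keyword search over a relational database can be illustrated with a small sketch. The schema, the sample rows, and the function name `keyword_search` are hypothetical, and this is a naive join-then-filter approach, not the algorithm used by DBXplorer, DISCOVER, or BANKS. It returns joined author-paper rows in which every keyword appears in some column, so the user does not need to know which table holds which keyword.

```python
import sqlite3

def keyword_search(conn, keywords):
    """Return joined (author, title) rows that contain every keyword.

    Naive illustration: join the two tables along their foreign key and
    keep rows where each keyword occurs in at least one text column.
    """
    rows = conn.execute(
        "SELECT a.name, p.title FROM author a "
        "JOIN paper p ON p.author_id = a.id"
    ).fetchall()
    wanted = [k.lower() for k in keywords]
    return [
        row for row in rows
        if all(any(k in col.lower() for col in row) for k in wanted)
    ]

# Hypothetical two-table schema with made-up data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE paper (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
    INSERT INTO author VALUES (1, 'Alice Smith'), (2, 'Bob Jones');
    INSERT INTO paper VALUES
        (1, 1, 'Keyword search in databases'),
        (2, 2, 'Ranking XML elements'),
        (3, 1, 'Query optimization');
""")

# "smith" lives in the author table, "keyword" in the paper table.
results = keyword_search(conn, ["smith", "keyword"])
```

A real system would avoid materializing the full join and would rank the matching tuple trees; this sketch only shows why the answer to a keyword query may span several tables.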

7 Applications
Content management
–Mix of structured and unstructured data
    Database with date and time of accident (structured data) and accident description (unstructured data)
–Semi-structured data
    Scientific documents, Shakespeare’s plays, …
Support flexible ranked keyword search interface over such data
–XRANK [Guo et al., SIGMOD 2003]
–XIRQL [Fuhr et al., SIGIR 2001]

8 XML Keyword Search
[Example XML document: workshop "XML and Information Retrieval: A SIGIR 2000 Workshop" (David Carmel, Yoelle Maarek, Aya Soffer); paper "XQL and Proximal Nodes" (Ricardo Baeza-Yates, Gonzalo Navarro); text "We consider the recently proposed language …" and "Searching on structured text is becoming more important with XML …"]
Most specific results (exploits structure!)
Ranking at granularity of elements
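
A minimal sketch of the "most specific results" idea, under stated assumptions: the tag names (`workshop`, `paper`, `title`, `body`), the second paper, and the helper names are hypothetical, and the matching is plain substring containment. This is not XRANK's or XIRQL's algorithm and it does no ranking; it only returns the deepest elements whose subtree text contains every keyword.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup loosely modeled on the slide's example document.
DOC = """
<workshop>
  <title>XML and Information Retrieval</title>
  <paper>
    <title>XQL and Proximal Nodes</title>
    <body>Searching on structured text is becoming more important with XML</body>
  </paper>
  <paper>
    <title>Another paper</title>
    <body>Ranking documents</body>
  </paper>
</workshop>
"""

def contains_all(elem, keywords):
    """True if the element's subtree text contains every keyword."""
    text = " ".join(elem.itertext()).lower()
    return all(k in text for k in keywords)

def most_specific(elem, keywords):
    """Return the deepest elements whose subtree contains every keyword."""
    if not contains_all(elem, keywords):
        return []
    # Prefer matching descendants; fall back to this element if none match.
    hits = [h for child in elem for h in most_specific(child, keywords)]
    return hits or [elem]

root = ET.fromstring(DOC)
tags = [e.tag for e in most_specific(root, ["xql", "structured"])]
single = [e.tag for e in most_specific(root, ["xml"])]
```

Here `tags` is `['paper']`: no single child holds both keywords, so the enclosing paper is returned rather than the whole workshop. For the single keyword `"xml"`, the individual `title` and `body` elements are returned instead.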

10 Applications
The Internet is enabling end-users to directly ask queries and explore results
–E.g., Used car marketplace
–Find all “bright red ford mustangs” that cost less than 20% of the average price of cars in its class
Characteristics of queries
–Keyword search (for ease of use)
–Complex query operations (information synthesis)
–Want to see ranked results!
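
The used-car query mixes a structured condition (an aggregate over the car's class) with keyword matching and ranking. The sketch below is one possible way to combine them, assuming a hypothetical `car` table, made-up rows, and a crude score (number of keywords matched). It reads the price condition literally, as price below 0.2 times the class average.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE car (id INTEGER PRIMARY KEY, class TEXT, price REAL, descr TEXT);
    INSERT INTO car VALUES
        (1, 'sports', 3000,  'bright red ford mustang, runs well'),
        (2, 'sports', 40000, 'red ford mustang, like new'),
        (3, 'sports', 50000, 'blue chevrolet corvette'),
        (4, 'sedan',  900,   'red ford sedan'),
        (5, 'sedan',  9000,  'grey sedan');
""")

def search(conn, keywords, fraction=0.2):
    """Rank cars priced below `fraction` of their class average by keyword hits."""
    # Structured part: correlated subquery computes the per-class average.
    rows = conn.execute(
        "SELECT c.id, c.descr FROM car c "
        "WHERE c.price < ? * (SELECT AVG(price) FROM car WHERE class = c.class)",
        (fraction,),
    ).fetchall()
    # Keyword part: score each row by the number of keywords it contains.
    scored = []
    for car_id, descr in rows:
        words = set(descr.lower().replace(",", " ").split())
        score = sum(1 for k in keywords if k in words)
        if score:
            scored.append((score, car_id))
    # Highest score first; ties broken by id for a stable order.
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [car_id for _, car_id in scored]

ranked = search(conn, ["bright", "red", "ford", "mustang"])
```

With this data `ranked` is `[1, 4]`: both cars pass the price condition, and car 1 matches all four keywords while car 4 matches two. Doing the keyword scoring outside SQL is exactly the kind of seam that a unified DB and IR query language would remove.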

11 Towards Unifying DB and IR
No standard query language for both DB and IR
–SQL and XQuery mostly “database” query languages
Currently developing TeXQuery: a full-text search extension to XQuery
–With S. Amer-Yahia, C. Botev, J. Robie
–Full composability of database and IR primitives, ranking
–Submitted to W3C committee on full-text extensions to XQuery

12 Summary
Applications have mix of structured (DB domain) and unstructured (IR domain) data
–Stark difference in how they can be processed
Benefits of unifying DB & IR
–Ranked keyword search (information discovery) over both structured and unstructured data
–Complex queries over structured/semi-structured data
A truly unified data store
–Need to generalize DB and IR techniques

