Querying for Information Integration: How to go from an Imprecise Intent to a Precise Query? Aditya Telang, Sharma Chakravarthy, Chengkai Li.


1 Querying for Information Integration: How to go from an Imprecise Intent to a Precise Query? Aditya Telang, Sharma Chakravarthy, Chengkai Li

2 Motivation
– “Retrieve castles near London that are reachable by train in less than 2 hours”
– “Find 3-bedroom houses in Houston within 2 miles of a school and within 5 miles of a highway and priced under $250,000”
– “Retrieve French restaurants within 1 mile of IMAX Theater in Dallas, Texas”
– …

3 Current Scenario
Intent: “Retrieve castles near London that are reachable by train in less than 2 hours”
– Searches shown: London train schedules; trains from London; castles near London
– Decision-making process: manually combine the results to arrive at a decision

4 Ideal Scenario
Intent: “Retrieve castles near London that are reachable by train in less than 2 hours” → Information Integration System → Actual results for the intent

5 The InfoMosaic Approach

6 Query Specification Query: “Bank within 1 mile of ‘University of Texas, Arlington’” Query: Castle within 2 hours by train from London

7 How to specify a query? Search Method (e.g., Google)
– Just needs to search for the ‘keyword’ in a set of documents
– Get a list of documents and post-process it (rank, cluster, classify, etc.)
– In an integration scenario, this doesn’t work:
  ‘bank’ – D1
  ‘University of Texas, Arlington’ – D2
  ‘1’ (out of ‘1 mile’) is ignored
  Intersecting the returned documents will not generate the desired results
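The failure mode on this slide can be illustrated with a minimal sketch. The toy corpus, the document IDs beyond D1 and D2, and the function names below are assumptions made for illustration; they are not taken from the presentation. The point is that intersecting per-keyword hits only finds documents that mention both terms, and the distance constraint (“within 1 mile”) is never evaluated.

```python
# Hypothetical toy corpus; IDs and text are made up for illustration.
DOCS = {
    "D1": "First National bank branch locations and opening hours",
    "D2": "University of Texas, Arlington campus map and directions",
    "D3": "Student bank accounts near University of Texas, Arlington",
}


def keyword_search(keyword: str) -> set:
    """Return IDs of documents whose text contains the keyword (case-insensitive)."""
    needle = keyword.lower()
    return {doc_id for doc_id, text in DOCS.items() if needle in text.lower()}


def intersect_search(keywords: list) -> set:
    """Intersect per-keyword hits, as a plain keyword engine would combine terms."""
    hits = [keyword_search(k) for k in keywords]
    return set.intersection(*hits) if hits else set()


bank_docs = keyword_search("bank")
uta_docs = keyword_search("University of Texas, Arlington")

# The intersection is just "documents mentioning both terms". Nothing here
# computes a distance, so "within 1 mile" plays no role in the result.
both = intersect_search(["bank", "University of Texas, Arlington"])
print(sorted(both))
```

A document that happens to mention both terms is returned whether or not it describes a bank that is actually within one mile of the university.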

8 How to specify a query? Database Method (e.g., SQL)
– Too rigid
– Need to know the database (or source) and its corresponding attributes: SELECT T1.a1, T2.a2 FROM T1, T2 WHERE …
– The Web is not organized as a database, hence an exact mapping between sources and attributes is neither feasible nor available
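To make the rigidity concrete, here is a hedged sketch of the “bank within 1 mile” query against an in-memory SQLite database. The table names, column names, bank names, and coordinates are all illustrative assumptions, not part of the presentation. The query only works because the author already knows both tables, their attributes, and how to join them.

```python
import math
import sqlite3


def miles_between(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles (haversine formula)."""
    r = 3958.8  # mean Earth radius in miles
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


conn = sqlite3.connect(":memory:")
# Expose the Python distance function to SQL under the same name.
conn.create_function("miles_between", 4, miles_between)

# Hypothetical schema and rows; coordinates are approximate and illustrative.
conn.executescript("""
CREATE TABLE university (name TEXT, lat REAL, lon REAL);
CREATE TABLE bank (name TEXT, lat REAL, lon REAL);
INSERT INTO university VALUES ('University of Texas, Arlington', 32.7299, -97.1140);
INSERT INTO bank VALUES ('Bank A', 32.7350, -97.1080);
INSERT INTO bank VALUES ('Bank B', 32.8998, -97.0403);
""")

# The user must know both tables, their columns, and the join condition.
rows = conn.execute("""
SELECT b.name
FROM bank AS b, university AS u
WHERE u.name = 'University of Texas, Arlington'
  AND miles_between(b.lat, b.lon, u.lat, u.lon) <= 1.0
""").fetchall()
print(rows)
```

On the Web there is no such agreed schema, so the user cannot be expected to name `bank.lat` or `university.name` up front.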

9 How to specify a query? Natural Language
– Ideal mechanism
– Inherently hard, considering the ambiguities of natural language
  e.g., school: institution for education; group of fish
– Mechanisms such as Question-Answering frameworks focus on sophisticated language models built for specific domains independently
– Incorrect assumption in an integration scenario

10 Query Specification
Query – “castle near London”
– Search: list of documents retrieved from the Web containing the text “castle near London”
– Database: relation containing tuples: SELECT castle.name, … FROM castle_DB WHERE Castle.location = ‘London’

11 Query Specification Query – “castle near London” Information Integration No idea about source, schema, attributes, etc. No idea about how to pose a query No idea about user intent – Castle:= building, move in chess, … ? SELECT castle.* WHERE castle.place = “London”

12 Proposed Approach
– Approach: refine-as-you-input
– Approach: verify-after-input
[Figure: the two approaches positioned by amount of input, from succinct (keyword) to verbose (natural language, SQL), and by number of interactions, relative to the ideal formulation]

13 Approach 1: Refine-as-you-input
Based on the most popular querying paradigm in use today: keyword search
– Input: set of keywords/concepts (e.g., castle, train, …)
– Output: set of one or more precise structured queries
– Challenge: keyword resolution (entity, attribute, or value?); generating a query from minimal information
– Problem: could result in too many non-relevant queries
– Positive: paradigm accepted by the Web, IR, and even DB communities
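The keyword-resolution challenge and the “too many queries” problem can be sketched as follows. This is not the InfoMosaic algorithm; the catalog, its entries, and the function names are hypothetical, chosen only to show how ambiguous keywords multiply into candidate interpretations that the user would then have to refine.

```python
from itertools import product

# Hypothetical catalog: each keyword may resolve to several (role, target) readings.
CATALOG = {
    "castle": [("entity", "Castle"), ("value", "ChessMove.name")],
    "train": [("entity", "Train"), ("attribute", "Trip.mode")],
    "london": [("value", "City.name")],
}


def resolve(keyword):
    """Return candidate readings for a keyword; unknown keywords stay unresolved."""
    return CATALOG.get(keyword.lower(), [("unknown", keyword)])


def candidate_queries(keywords):
    """Enumerate every combination of readings as one candidate interpretation."""
    readings = [resolve(k) for k in keywords]
    return [dict(zip(keywords, combo)) for combo in product(*readings)]


candidates = candidate_queries(["castle", "train", "London"])
for c in candidates:
    print(c)
```

With two readings each for “castle” and “train” and one for “London”, the sketch yields 2 × 2 × 1 = 4 candidates. The count grows multiplicatively with every ambiguous keyword, which is why the user has to prune interpretations while typing.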

14 Approach 1: Refine-as-you-input User Interaction

15 Approach 2: Verify-after-input
Based on a rigorous method of formulating queries, similar to SQL
– Input: user-filled template
– Output: single precise query
– Problem: users don’t like filling in too many details; coming up with a unique template across domains
– Positive: less ambiguous; reduced number of user interactions
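A template-based input might look like the sketch below. The template fields, the `FIND … WHERE …` output syntax, and the attribute names are assumptions made up for illustration; the presentation does not specify the template format. The sketch shows why a filled template maps to exactly one query: every role (entity, attribute, operator, value) is stated explicitly by the user.

```python
from dataclasses import dataclass, field

ALLOWED_OPS = {"=", "<", "<=", ">", ">="}


@dataclass
class Condition:
    attribute: str  # hypothetical attribute name, e.g. "train_time_hours"
    operator: str   # must be one of ALLOWED_OPS
    value: object


@dataclass
class QueryTemplate:
    entity: str
    conditions: list = field(default_factory=list)


def to_query(template):
    """Turn a filled template into one query string in a made-up FIND syntax."""
    if not template.entity:
        raise ValueError("template needs an entity")
    parts = []
    for c in template.conditions:
        if c.operator not in ALLOWED_OPS:
            raise ValueError(f"unsupported operator: {c.operator}")
        parts.append(f"{c.attribute} {c.operator} {c.value!r}")
    where = " AND ".join(parts)
    return f"FIND {template.entity}" + (f" WHERE {where}" if where else "")


template = QueryTemplate(
    entity="Castle",
    conditions=[
        Condition("near", "=", "London"),
        Condition("train_time_hours", "<", 2),
    ],
)
query = to_query(template)
print(query)
```

The trade-off named on the slide is visible here: the output is unambiguous, but the user has to supply every field of the template before any query exists to verify.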

16 Approach 2: Verify-after-input

17 Evaluation Plan
– Testing the approaches on an RDBMS where the schema and output are known
– Actual user studies

18 Related Work

19 Future Work
– Perform extensive experiments to prove the validity of the proposed approaches
– Address other issues in information integration
– Current focus: ranking [Telang:DBRank’07]

20 Thank You!

