Discovering Query Context using Concept Hierarchy Mukesh Mohania IBM Research – India.


1 Discovering Query Context using Concept Hierarchy Mukesh Mohania IBM Research – India

2 Discovering Query Context using Concept Hierarchy for Information Integration
Motivation: We keep refining the search query keywords until we get the desired results. That is, the onus of specifying an appropriate set of keywords (called the query context) remains with the user.
- This is a limitation, since the user might not be aware of the overall context at the point of submitting the query.
- It is inevitable when the search query is submitted from a mobile device, particularly when the data source is not available for searching all the time or the bandwidth is limited.
- The problem is how to derive the full context (as a set of keywords) of a query without sending the search query to the back-end data source.
- We propose to store and use concept hierarchies on the mobile device for discovering the query context.
- The advantage of discovering the query context is that it can then be used to retrieve all documents, from enterprise content repositories and/or the external web, that are highly relevant to the query results.

3 Motivating Example

Content Manager (unstructured documents):
Report.doc: “… Report on Database Research at IRL … Policy Definition …”
New.txt: “Some other document”
Another.pdf: “Expense Reimbursement Policy at IRL”

Document metadata:
DocID | Author | Year
Report.doc | Mukesh | 2000
Another.pdf | Manish | 2002
New.txt | Prasan | 2003

DB2 (structured data):

PROJECTS
Proj | Name | Description | Group
1 | CORTEX | Policy Management | 6

PROJEMPS
Proj | Emp
1 | 7
1 | 4

EMPLOYEES
Emp | Name | Group
4 | Mukesh | 6
7 | Manish | 6

GROUPS
Group | Name | Manager | Org
6 | Databases | 4 | 5

ORG
Org | Org Name | Address
5 | IRL | Hauz Khas, New Delhi

Query:
    Select Name, Description From PROJECTS Where Name = “CORTEX”

Directives:
    Year < 2003

Result:
Name | Description
CORTEX | Policy Management
Report.doc: “… Report on Database Research at IRL … Policy Definition …”

Report.doc is additionally retrieved based on the context. Note: the document makes no mention of CORTEX.
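The structured half of the motivating example can be reproduced with a small in-memory database. The following is a minimal sketch using Python's built-in sqlite3 module rather than DB2. The column names Grp and OrgName, the helper names build_example_db and run_core_query, and the declared foreign keys are assumptions made for illustration; the rows are the ones shown on the slide.

```python
import sqlite3

# Assumed column names: the slide labels these columns "Group" and "Org Name",
# which are awkward as SQL identifiers, so Grp and OrgName are used instead.
SCHEMA_AND_DATA = """
CREATE TABLE ORG (Org INTEGER PRIMARY KEY, OrgName TEXT, Address TEXT);
CREATE TABLE "GROUPS" (Grp INTEGER PRIMARY KEY, Name TEXT, Manager INTEGER,
                       Org INTEGER REFERENCES ORG(Org));
CREATE TABLE EMPLOYEES (Emp INTEGER PRIMARY KEY, Name TEXT,
                        Grp INTEGER REFERENCES "GROUPS"(Grp));
CREATE TABLE PROJECTS (Proj INTEGER PRIMARY KEY, Name TEXT, Description TEXT,
                       Grp INTEGER REFERENCES "GROUPS"(Grp));
CREATE TABLE PROJEMPS (Proj INTEGER REFERENCES PROJECTS(Proj),
                       Emp INTEGER REFERENCES EMPLOYEES(Emp));
INSERT INTO ORG VALUES (5, 'IRL', 'Hauz Khas, New Delhi');
INSERT INTO "GROUPS" VALUES (6, 'Databases', 4, 5);
INSERT INTO EMPLOYEES VALUES (4, 'Mukesh', 6), (7, 'Manish', 6);
INSERT INTO PROJECTS VALUES (1, 'CORTEX', 'Policy Management', 6);
INSERT INTO PROJEMPS VALUES (1, 7), (1, 4);
"""


def build_example_db() -> sqlite3.Connection:
    """Load the slide's five tables into an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA_AND_DATA)
    return conn


def run_core_query(conn: sqlite3.Connection, project_name: str) -> list:
    """Run the slide's query: Select Name, Description From PROJECTS Where Name = ..."""
    cursor = conn.execute(
        "SELECT Name, Description FROM PROJECTS WHERE Name = ?",
        (project_name,),
    )
    return cursor.fetchall()


if __name__ == "__main__":
    print(run_core_query(build_example_db(), "CORTEX"))
```

On its own the query returns only the CORTEX row. Retrieving Report.doc additionally requires the context derived from the related tables.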

4 Main Idea
- Specify the information need in terms of SQL over the structured database.
  - Additional information needs are specified using “directives” (optional).
- Automatically synthesize the “context” of the SQL query from its result as well as from related information present elsewhere in the database (found using the known semantic dependencies in the structured data).
  - Considers the foreign-key dependencies in either direction (PK → FK, FK → PK).
  - Automatically determines the most promising subset of the related information to explore.
- Use this context and the directives to retrieve the relevant unstructured data.
CRUX
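The context-synthesis step above can be sketched as a breadth-first walk over foreign-key links in both directions, starting from a result row. This is only an illustration under stated assumptions, not the authors' algorithm: the slides say the system picks "the most promising subset" of related information but do not say how, so a fixed hop limit (max_hops) stands in for that choice. The names TABLES, FOREIGN_KEYS, neighbours and synthesize_context are hypothetical, and the column names Grp and OrgName are assumed spellings of the slide's "Group" and "Org Name".

```python
from collections import deque

# Rows from the motivating example, keyed by table name.
TABLES = {
    "PROJECTS": [{"Proj": 1, "Name": "CORTEX",
                  "Description": "Policy Management", "Grp": 6}],
    "PROJEMPS": [{"Proj": 1, "Emp": 7}, {"Proj": 1, "Emp": 4}],
    "EMPLOYEES": [{"Emp": 4, "Name": "Mukesh", "Grp": 6},
                  {"Emp": 7, "Name": "Manish", "Grp": 6}],
    "GROUPS": [{"Grp": 6, "Name": "Databases", "Manager": 4, "Org": 5}],
    "ORG": [{"Org": 5, "OrgName": "IRL", "Address": "Hauz Khas, New Delhi"}],
}

# (referencing table, FK column, referenced table, PK column); assumed schema.
FOREIGN_KEYS = [
    ("PROJECTS", "Grp", "GROUPS", "Grp"),
    ("PROJEMPS", "Proj", "PROJECTS", "Proj"),
    ("PROJEMPS", "Emp", "EMPLOYEES", "Emp"),
    ("EMPLOYEES", "Grp", "GROUPS", "Grp"),
    ("GROUPS", "Org", "ORG", "Org"),
]


def neighbours(table, row):
    """Yield (table, row) pairs one foreign-key hop away, in either direction."""
    for child, fk, parent, pk in FOREIGN_KEYS:
        if table == child:  # FK -> PK: follow the reference to the parent row
            for candidate in TABLES[parent]:
                if candidate[pk] == row[fk]:
                    yield parent, candidate
        if table == parent:  # PK -> FK: find rows that reference this row
            for candidate in TABLES[child]:
                if candidate[fk] == row[pk]:
                    yield child, candidate


def synthesize_context(start_table, start_row, max_hops=2):
    """Collect the text values of every row within max_hops of the start row."""
    seen, context = set(), set()
    queue = deque([(start_table, start_row, 0)])
    while queue:
        table, row, depth = queue.popleft()
        key = (table, tuple(sorted(row.items())))
        if key in seen:
            continue
        seen.add(key)
        context.update(v for v in row.values() if isinstance(v, str))
        if depth < max_hops:
            for next_table, next_row in neighbours(table, row):
                queue.append((next_table, next_row, depth + 1))
    return context


if __name__ == "__main__":
    print(sorted(synthesize_context("PROJECTS", TABLES["PROJECTS"][0])))
```

With two hops, the context of the CORTEX row picks up the group name, the employee names and the organization, none of which appear in the query result itself.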

5 Filled in by the user

Core query:
    Select GROUPS.Name, PROJECTS.Description
    From PROJECTS, PROJEMPS, EMPLOYEES, GROUPS
    Where PROJECTS.Proj# = PROJEMPS.Proj#
    And PROJEMPS.Emp# = EMPLOYEES.Emp#
    And EMPLOYEES.Grp# = GROUPS.Grp#
    And PROJECTS.Name = “CORTEX”

Search directives:
    Select ARTICLES.id
    From ARTICLES
    Where ARTICLES.contains(“Report AND SET-OR($GROUPS.Name)”)
    And ARTICLES.year > 2001
    ORDER BY RELEVANCE USING THRESHOLD = 0.8

6 Core SQL Query and Search Directives

Core SQL query:
    Select GROUPS.Name, PROJECTS.Description
    From PROJECTS, PROJEMPS, EMPLOYEES, GROUPS
    Where PROJECTS.Proj# = PROJEMPS.Proj#
    And PROJEMPS.Emp# = EMPLOYEES.Emp#
    And EMPLOYEES.Grp# = GROUPS.Grp#
    And PROJECTS.Name = “CORTEX”

Search directives (optional):
    Select ARTICLES.id
    From ARTICLES
    Where ARTICLES.contains(“Report AND SET-OR($GROUPS.Name)”)
    And ARTICLES.year > 2001
    ORDER BY RELEVANCE USING THRESHOLD = 0.8

What the directives do:
- Restrict to documents containing “Report” and at least one group name.
- Restrict to documents published after 2001.
- Order by relevance to the structured query, with a cutoff threshold of 0.8.
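To make the directive semantics concrete, here is a minimal sketch of how SET-OR($GROUPS.Name) might be expanded from the core query's result, and how the year and threshold restrictions might then be applied. The slides do not describe the implementation, so everything here is an assumption: the function names expand_set_or and apply_directives, the regular expression, the inclusive comparison against the threshold, and the document fields year and score.

```python
import re

# Matches SET-OR($TABLE.Column) and captures "TABLE.Column".
SET_OR_PATTERN = re.compile(r"SET-OR\(\$([A-Za-z_]+\.[A-Za-z_]+)\)")


def expand_set_or(template, bindings):
    """Replace each SET-OR($T.C) with a parenthesised OR over the bound values."""
    def replace(match):
        values = bindings[match.group(1)]
        return "(" + " OR ".join(values) + ")"
    return SET_OR_PATTERN.sub(replace, template)


def apply_directives(docs, published_after, threshold):
    """Keep documents newer than published_after whose relevance score reaches
    the threshold (assumed inclusive), ordered from most to least relevant."""
    kept = [d for d in docs
            if d["year"] > published_after and d["score"] >= threshold]
    return sorted(kept, key=lambda d: d["score"], reverse=True)


if __name__ == "__main__":
    # "Databases" is the group name returned by the core query for CORTEX.
    print(expand_set_or("Report AND SET-OR($GROUPS.Name)",
                        {"GROUPS.Name": ["Databases"]}))
    # Hypothetical documents and scores, only to exercise the filter.
    docs = [
        {"id": "a", "year": 2002, "score": 0.90},
        {"id": "b", "year": 2003, "score": 0.85},
        {"id": "c", "year": 2001, "score": 0.95},
        {"id": "d", "year": 2004, "score": 0.50},
    ]
    print([d["id"] for d in apply_directives(docs, 2001, 0.8)])
```

Expanding the directive before it reaches the text search keeps the search engine unaware of the relational schema; it only ever sees an ordinary boolean keyword expression.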

7 Concept Hierarchies (CH)
- CHs are defined over relational tables using FK and PK relationships (not all attributes are considered; which ones depends on the application or scenario).
- They are predefined hierarchical relationships that map a set of lower-level concepts to their higher-level counterparts.
- Example: {tennis, rugby, hockey, football} can be generalized to the higher-level concept “sports”.
- A CH can be constructed in many ways.
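A concept hierarchy of this kind can be held in a very small data structure, which is what makes it plausible to keep on a mobile device. The sketch below stores the slide's sports example as a child-to-parent map and derives a keyword context from it without contacting any back end. The function names generalize and query_context are hypothetical, and including sibling concepts in the context is an assumption, not something the slides specify.

```python
# Child -> parent map holding the slide's example hierarchy.
PARENT = {
    "tennis": "sports",
    "rugby": "sports",
    "hockey": "sports",
    "football": "sports",
}


def generalize(term, parent=PARENT):
    """Return the higher-level concepts above term, nearest first."""
    chain, seen = [], {term}
    while term in parent:
        term = parent[term]
        if term in seen:  # guard against accidental cycles in the map
            break
        seen.add(term)
        chain.append(term)
    return chain


def query_context(keyword, parent=PARENT):
    """Return the keyword, its ancestors, and (by assumption) its siblings."""
    ancestors = generalize(keyword, parent)
    siblings = set()
    if ancestors:
        siblings = {child for child, up in parent.items()
                    if up == ancestors[0] and child != keyword}
    return {keyword, *ancestors, *siblings}


if __name__ == "__main__":
    print(generalize("tennis"))
    print(sorted(query_context("tennis")))
```

A keyword that is not in the hierarchy simply yields itself as its only context, so the lookup degrades gracefully for unknown terms.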

8 Scenarios
- Health: patient-specific reports and medical articles
- Manufacturing: defect statistics and engineering specifications
- Marketing: customer transaction history and marketing documents
- Travel: traveler itineraries, promotional flyers, and travel advisories
- Management: employee records and status reports

9 Research Issues
Limitation: The context is a set of keywords.
  -- Semantics-based context?
Limitation: The context of a search query is determined by the concept hierarchy.
  -- Other avenues could be used, such as previously retrieved results, the query workload, and the user profile, if given.
Limitation: The context of a search query is mapped to one or more categories.
  -- Precise context-based document retrieval may be helpful.
Limitation: The documents and search query results are returned as unordered sets.
  -- For usability reasons, efficient ranking algorithms would be needed to order the returned results with respect to the input query context.
Limitation: The entire query result is returned in addition to the documents in response to a query.
  -- (HCI issue) The query results need to be presented in a browsable manner, or may even be presented as smart tags dynamically attached to the documents.

