Discovering Query Context using Concept Hierarchy Mukesh Mohania IBM Research – India.


Discovering Query Context using Concept Hierarchy for Information Integration

Motivation:
- Users keep refining their search query keywords until they get the desired results. The onus of specifying an appropriate set of keywords (called the query context) therefore remains with the user.
- This is a limitation, since the user might not be aware of the overall context at the point of submitting the query.
- The limitation is inevitable when the search query is submitted from a mobile device, particularly when the data source is not available for searching all the time or the bandwidth is limited.
- The problem is how to derive the full context of a query (as a set of keywords) without sending the search query to the back-end data source.
- We propose to store and use concept hierarchies on the mobile device for discovering the query context.
- Once the query context is discovered, it can be used to retrieve all documents from enterprise content repositories and/or the external web that are highly relevant to the query results.

Motivating Example

Content Manager:
Report.doc: “… Report on Database Research at IRL … Policy Definition …”
New.txt: “Some other document”
Another.pdf: “Expense Reimbursement Policy at IRL”

DB2:

PROJECTS
Proj | Name | Description | Group
1 | CORTEX | Policy Management | 6

PROJEMPS
Proj | Emp
1 | 7
1 | 4

EMPLOYEES
Emp | Name | Group
4 | Mukesh | 6
7 | Manish | 6

GROUPS
Group | Name | Manager | Org
6 | Databases | 4 | 5

ORG
Org | Org Name | Address
5 | IRL | Hauz Khas, New Delhi

Document metadata:
DocID | Author | Year
Report.doc | Mukesh | 2000
Another.pdf | Manish | 2002
New.txt | Prasan | 2003

Query: Select Name, Description From PROJECTS Where Name = “CORTEX”
Directives: Year < 2003

Result:
Name | Description
CORTEX | Policy Management
Report.doc: “… Report on Database Research at IRL … Policy Definition …”

Report.doc is additionally retrieved based on the context. Note: the document makes no mention of CORTEX.
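The structured half of this example can be tried out with a rough Python sketch using the standard `sqlite3` module. It loads only the PROJECTS row and the document metadata from the slide, runs the core query, and applies the `Year < 2003` directive. The table name `DOCS` and the column name `Grp` (used because `Group` is a reserved word in SQL) are assumptions made for illustration. Choosing among the surviving candidates by context relevance is not shown here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- 'Grp' stands in for the slide's 'Group' column (reserved word in SQL).
    CREATE TABLE PROJECTS (Proj INTEGER, Name TEXT, Description TEXT, Grp INTEGER);
    -- 'DOCS' is a hypothetical name for the document metadata table.
    CREATE TABLE DOCS (DocID TEXT, Author TEXT, Year INTEGER);

    INSERT INTO PROJECTS VALUES (1, 'CORTEX', 'Policy Management', 6);
    INSERT INTO DOCS VALUES
        ('Report.doc', 'Mukesh', 2000),
        ('Another.pdf', 'Manish', 2002),
        ('New.txt', 'Prasan', 2003);
""")

# Core query from the slide.
core_result = conn.execute(
    "SELECT Name, Description FROM PROJECTS WHERE Name = ?", ("CORTEX",)
).fetchall()

# Directive from the slide: only documents with Year < 2003 remain candidates.
candidates = [
    row[0]
    for row in conn.execute(
        "SELECT DocID FROM DOCS WHERE Year < ? ORDER BY DocID", (2003,)
    )
]

print(core_result)
print(candidates)
```

The directive only narrows the candidate set (New.txt is dropped because of its year); deciding which of the remaining documents is relevant is the job of the synthesized context.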

Main Idea
- Specify the information need in terms of SQL over the structured database.
  - Additional information needs are specified using “directives” (optional).
- Automatically synthesize the “context” of the SQL query from its result as well as from related information present elsewhere in the database (found using the known semantic dependencies in the structured data).
  - Considers the foreign-key dependencies in either direction (PK → FK, FK → PK).
  - Automatically determines the most promising subset of the related information to explore.
- Use this context and the directives to retrieve the relevant unstructured data.
CRUX
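One way to picture context synthesis is as a walk over foreign-key links in both directions, starting from the rows returned by the SQL query. The sketch below is a minimal illustration, not the method used by the system: the rows come from the motivating example, the foreign-key links are inferred from the column names, and a simple hop limit (`max_hops`, a hypothetical parameter) stands in for the step that picks the most promising subset of related information.

```python
from collections import deque

# Text attributes per row, keyed by (table, primary key). Only a subset of
# attributes is kept; PROJEMPS rows carry no text of their own.
ROWS = {
    ("PROJECTS", 1): {"Name": "CORTEX", "Description": "Policy Management"},
    ("EMPLOYEES", 4): {"Name": "Mukesh"},
    ("EMPLOYEES", 7): {"Name": "Manish"},
    ("GROUPS", 6): {"Name": "Databases"},
    ("ORG", 5): {"Org Name": "IRL"},
}

# Foreign-key links (referencing row, referenced row), inferred from the
# column names in the example tables. This inference is an assumption.
FK_LINKS = [
    (("PROJECTS", 1), ("GROUPS", 6)),
    (("PROJEMPS", (1, 7)), ("PROJECTS", 1)),
    (("PROJEMPS", (1, 7)), ("EMPLOYEES", 7)),
    (("PROJEMPS", (1, 4)), ("PROJECTS", 1)),
    (("PROJEMPS", (1, 4)), ("EMPLOYEES", 4)),
    (("EMPLOYEES", 4), ("GROUPS", 6)),
    (("EMPLOYEES", 7), ("GROUPS", 6)),
    (("GROUPS", 6), ("EMPLOYEES", 4)),  # Manager column
    (("GROUPS", 6), ("ORG", 5)),
]


def synthesize_context(start, max_hops):
    """Collect text values from rows within max_hops FK links of start."""
    # Store every link in both directions (PK -> FK and FK -> PK).
    neighbours = {}
    for src, dst in FK_LINKS:
        neighbours.setdefault(src, set()).add(dst)
        neighbours.setdefault(dst, set()).add(src)

    seen = {start}
    queue = deque([(start, 0)])
    keywords = set()
    while queue:
        node, hops = queue.popleft()
        keywords.update(ROWS.get(node, {}).values())
        if hops == max_hops:
            continue
        for nxt in neighbours.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, hops + 1))
    return keywords


print(sorted(synthesize_context(("PROJECTS", 1), max_hops=2)))
```

With one hop, only the project and its group contribute keywords; with two hops, the employees and the organization are reached as well, which is how a term such as IRL can enter the context even though the query never mentions it.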

FILLED IN BY THE USER

CORE QUERY
    Select GROUPS.Name, PROJECTS.Description
    From PROJECTS, PROJEMPS, EMPLOYEES, GROUPS
    Where PROJECTS.Proj# = PROJEMPS.Proj#
      And PROJEMPS.Emp# = EMPLOYEES.Emp#
      And EMPLOYEES.Grp# = GROUPS.Grp#
      And PROJECTS.Name = “CORTEX”

SEARCH DIRECTIVES
    Select ARTICLES.id
    From ARTICLES
    Where ARTICLES.contains(“Report AND SET-OR($GROUPS.Name)”)
      And ARTICLES.year > 2001
    ORDER BY RELEVANCE USING THRESHOLD = 0.8

Core SQL Query and Search Directives

Core SQL Query
    Select GROUPS.Name, PROJECTS.Description
    From PROJECTS, PROJEMPS, EMPLOYEES, GROUPS
    Where PROJECTS.Proj# = PROJEMPS.Proj#
      And PROJEMPS.Emp# = EMPLOYEES.Emp#
      And EMPLOYEES.Grp# = GROUPS.Grp#
      And PROJECTS.Name = “CORTEX”

Search Directives (optional)
    Select ARTICLES.id
    From ARTICLES
    Where ARTICLES.contains(“Report AND SET-OR($GROUPS.Name)”)
      And ARTICLES.year > 2001
    ORDER BY RELEVANCE USING THRESHOLD = 0.8

- ARTICLES.contains(…): restrict to documents containing “Report” and at least one group name.
- ARTICLES.year > 2001: restrict to documents published after 2001.
- ORDER BY RELEVANCE USING THRESHOLD = 0.8: order by relevance to the structured query, with a cutoff threshold of 0.8.
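The three directives can be read as three filters followed by a sort. The Python sketch below is one possible reading under stated assumptions: `SET-OR($GROUPS.Name)` is taken to mean "any of the group names returned by the core query", matching is plain case-insensitive substring search, the relevance score is supplied from elsewhere, and a score equal to the threshold is kept. The `Article` class, the `apply_directives` function, and the sample articles are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Article:
    id: str
    text: str
    year: int
    relevance: float  # assumed to be computed by some ranking component


def apply_directives(articles, group_names, min_year=2001, threshold=0.8):
    """Apply the three search directives and return matching article ids."""
    groups = [g.lower() for g in group_names]
    hits = []
    for article in articles:
        text = article.text.lower()
        # "Report AND SET-OR($GROUPS.Name)": the word "Report" plus at least
        # one of the group names from the core query result.
        if "report" not in text:
            continue
        if not any(g in text for g in groups):
            continue
        # ARTICLES.year > 2001
        if article.year <= min_year:
            continue
        # THRESHOLD = 0.8 (assumption: scores equal to the cutoff are kept)
        if article.relevance < threshold:
            continue
        hits.append(article)
    # ORDER BY RELEVANCE, most relevant first.
    hits.sort(key=lambda a: a.relevance, reverse=True)
    return [a.id for a in hits]


# Made-up articles, purely for illustration.
sample = [
    Article("a1", "Annual report of the Databases group", 2003, 0.90),
    Article("a2", "Databases group report draft", 2002, 0.85),
    Article("a3", "Report on networking", 2004, 0.95),
    Article("a4", "Databases report", 2000, 0.99),
    Article("a5", "Databases report summary", 2005, 0.50),
]
print(apply_directives(sample, ["Databases"]))
```

In this made-up data, a3 is dropped for naming no group, a4 for its year, and a5 for falling below the relevance cutoff.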

Concept Hierarchies (CH)
- CH are defined over relational tables using FK and PK relationships (not all attributes are considered; which ones depends on the application or scenario).
- They are predefined hierarchical relationships that map a set of lower-level concepts to their higher-level counterparts.
- Example: {tennis, rugby, hockey, football} can be generalized to the higher-level concept “sports”.
- CH can be constructed in many ways.
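A concept hierarchy of this kind can be held in a small child-to-parent map, which is also light enough to keep on a device. The following Python sketch uses the sports example from the slide; the extra level "recreation" and the function names are assumptions added only to show a chain longer than one step.

```python
# Child -> parent links. The "sports" grouping is from the slide; the
# "recreation" level is a hypothetical addition for illustration.
PARENT = {
    "tennis": "sports",
    "rugby": "sports",
    "hockey": "sports",
    "football": "sports",
    "sports": "recreation",
}


def ancestors(concept):
    """Return the higher-level concepts of a concept, nearest first."""
    chain = []
    while concept in PARENT:
        concept = PARENT[concept]
        chain.append(concept)
    return chain


def generalize(concepts):
    """Return the lowest concept covering every input concept, or None."""
    concepts = list(concepts)
    if not concepts:
        return None
    # Candidates are the first concept and its ancestors, lowest first.
    for candidate in [concepts[0]] + ancestors(concepts[0]):
        if all(candidate == c or candidate in ancestors(c) for c in concepts):
            return candidate
    return None


print(ancestors("tennis"))
print(generalize(["tennis", "rugby", "hockey", "football"]))
```

Looking up the ancestors of a query keyword gives a set of broader terms that can be added to the query context without contacting the back-end data source.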

Scenarios
- Health: patient-specific reports and medical articles
- Manufacturing: defect statistics and engineering specifications
- Marketing: customer transaction history and marketing documents
- Travel: traveler itinerary and promotional flyers, travel advisories
- Management: employee records and status reports

Research Issues
- Limitation: Context is a set of keywords.
  -- Semantics-based context?
- Limitation: The context of a search query is determined by the concept hierarchy.
  -- Other avenues could be used, such as previously retrieved results, the query workload, and the user profile, if given.
- Limitation: The context of a search query is mapped to one or more categories.
  -- Precise context-based document retrieval may be helpful.
- Limitation: The documents and search query results are returned as unordered sets.
  -- For usability reasons, efficient ranking algorithms would be needed to order the returned results with respect to the input query context.
- Limitation: The entire query result is returned in addition to the documents in response to a query.
  -- (HCI issue) The query results need to be presented in a browsable manner, or may even be presented as smart tags dynamically attached to the documents.