Q2Semantic: A Lightweight Keyword Interface to Semantic Search Haofen Wang 1, Kang Zhang 1, Qiaoling Liu 1, Thanh Tran 2, and Yong Yu 1 1 Apex Lab, Shanghai.

Slides:



Advertisements
Similar presentations
Schema Matching and Query Rewriting in Ontology-based Data Integration Zdeňka Linková ICS AS CR Advisor: Július Štuller.
Advertisements

Date: 2014/05/06 Author: Michael Schuhmacher, Simon Paolo Ponzetto Source: WSDM’14 Advisor: Jia-ling Koh Speaker: Chen-Yu Huang Knowledge-based Graph Document.
Processing XML Keyword Search by Constructing Effective Structured Queries Jianxin Li, Chengfei Liu, Rui Zhou and Bo Ning Swinburne University of Technology,
1 gStore: Answering SPARQL Queries Via Subgraph Matching Presented by Guan Wang Kent State University October 24, 2011.
Oracle Labs Graph Analytics Research Hassan Chafi Sr. Research Manager Oracle Labs Graph-TA 2/21/2014.
GENERATING AUTOMATIC SEMANTIC ANNOTATIONS FOR RESEARCH DATASETS AYUSH SINGHAL AND JAIDEEP SRIVASTAVA CS DEPT., UNIVERSITY OF MINNESOTA, MN, USA.
 Copyright 2005 Digital Enterprise Research Institute. All rights reserved. 1 The Architecture of a Large-Scale Web Search and Query Engine.
Semantic Search Jiawei Rong Authors Semantic Search, in Proc. Of WWW Author R. Guhua (IBM) Rob McCool (Stanford University) Eric Miller.
Retrieving Documents with Geographic References Using a Spatial Index Structure Based on Ontologies Database Laboratory University of A Coruña A Coruña,
1 Draft of a Matchmaking Service Chuang liu. 2 Matchmaking Service Matchmaking Service is a service to help service providers to advertising their service.
The Data Mining Visual Environment Motivation Major problems with existing DM systems They are based on non-extensible frameworks. They provide a non-uniform.
Keyword Proximity Search on XML Graphs Vagelis Hristidis Yannis Papakonstatinou Andrey Presenter: Feng Shao.
Disambiguation Algorithm for People Search on the Web Dmitri V. Kalashnikov, Sharad Mehrotra, Zhaoqi Chen, Rabia Nuray-Turan, Naveen Ashish For questions.
IST NeOn-project.org The Semantic Web is growing… #SW Pages Lee, J., Goodwin, R. (2004) The Semantic.
Query Biased Snippet Generation in XML Search Yi Chen Yu Huang, Ziyang Liu, Yi Chen Arizona State University.
1 MARG-DARSHAK: A Scrapbook on Web Search engines allow the users to enter keywords relating to a topic and retrieve information about internet sites (URLs)
Query Planning for Searching Inter- Dependent Deep-Web Databases Fan Wang 1, Gagan Agrawal 1, Ruoming Jin 2 1 Department of Computer.
Amarnath Gupta Univ. of California San Diego. An Abstract Question There is no concrete answer …but …
Managing Large RDF Graphs (Infinite Graph) Vaibhav Khadilkar Department of Computer Science, The University of Texas at Dallas FEARLESS engineering.
Authors: Bhavana Bharat Dalvi, Meghana Kshirsagar, S. Sudarshan Presented By: Aruna Keyword Search on External Memory Data Graphs.
NUITS: A Novel User Interface for Efficient Keyword Search over Databases The integration of DB and IR provides users with a wide range of high quality.
Page 1 WEB MINING by NINI P SURESH PROJECT CO-ORDINATOR Kavitha Murugeshan.
An Integrated Approach to Extracting Ontological Structures from Folksonomies Huairen Lin, Joseph Davis, Ying Zhou ESWC 2009 Hyewon Lim October 9 th, 2009.
Graph Data Management Lab, School of Computer Science gdm.fudan.edu.cn XMLSnippet: A Coding Assistant for XML Configuration Snippet.
Mehdi Kargar Aijun An York University, Toronto, Canada Keyword Search in Graphs: Finding r-cliques.
-1- Philipp Heim, Thomas Ertl, Jürgen Ziegler Facet Graphs: Complex Semantic Querying Made Easy Philipp Heim 1, Thomas Ertl 1 and Jürgen Ziegler 2 1 Visualization.
DBXplorer: A System for Keyword- Based Search over Relational Databases Sanjay Agrawal, Surajit Chaudhuri, Gautam Das Cathy Wang
Querying Structured Text in an XML Database By Xuemei Luo.
SemSearch: A Search Engine for the Semantic Web Yuangui Lei, Victoria Uren, Enrico Motta Knowledge Media Institute The Open University EKAW 2006 Presented.
GStore: Answering SPARQL Queries via Subgraph Matching Lei Zou, Jinghui Mo, Lei Chen, M. Tamer Ozsu ¨, Dongyan Zhao {
EASE: An Effective 3-in-1 Keyword Search Method for Unstructured, Semi-structured and Structured Data Cuoliang Li, Beng Chin Ooi, Jianhua Feng, Jianyong.
Keyword Searching and Browsing in Databases using BANKS Seoyoung Ahn Mar 3, 2005 The University of Texas at Arlington.
Chapter 6: Information Retrieval and Web Search
SPARQL Query Graph Model (How to improve query evaluation?) Ralf Heese and Olaf Hartig Humboldt-Universität zu Berlin.
LOD for the Rest of Us Tim Finin, Anupam Joshi, Varish Mulwad and Lushan Han University of Maryland, Baltimore County 15 March 2012
Mehdi Kargar Aijun An York University, Toronto, Canada Keyword Search in Graphs: Finding r-cliques.
Keyword Query Routing.
Improving Web Search Results Using Affinity Graph Benyu Zhang, Hua Li, Yi Liu, Lei Ji, Wensi Xi, Weiguo Fan, Zheng Chen, Wei-Ying Ma Microsoft Research.
P2P Concept Search Fausto Giunchiglia Uladzimir Kharkevich S.R.H Noori April 21st, 2009, Madrid, Spain.
BNCOD07Indexing & Searching XML Documents based on Content and Structure Synopses1 Indexing and Searching XML Documents based on Content and Structure.
Efficient RDF Storage and Retrieval in Jena2 Written by: Kevin Wilkinson, Craig Sayers, Harumi Kuno, Dave Reynolds Presented by: Umer Fareed 파리드.
Templated Search over Relational Databases Date: 2015/01/15 Author: Anastasios Zouzias, Michail Vlachos, Vagelis Hristidis Source: ACM CIKM’14 Advisor:
Date : 2013/03/18 Author : Jeffrey Pound, Alexander K. Hudek, Ihab F. Ilyas, Grant Weddell Source : CIKM’12 Speaker : Er-Gang Liu Advisor : Prof. Jia-Ling.
Tool for Ontology Paraphrasing, Querying and Visualization on the Semantic Web Project By Senthil Kumar K III MCA (SS)‏
VLDB2005 CMS-ToPSS: Efficient Dissemination of RSS Documents Milenko Petrovic Haifeng Liu Hans-Arno Jacobsen University of Toronto.
Shridhar Bhalerao CMSC 601 Finding Implicit Relations in the Semantic Web.
Scalable Keyword Search on Large RDF Data. Abstract Keyword search is a useful tool for exploring large RDF datasets. Existing techniques either rely.
Date: 2012/08/21 Source: Zhong Zeng, Zhifeng Bao, Tok Wang Ling, Mong Li Lee (KEYS’12) Speaker: Er-Gang Liu Advisor: Dr. Jia-ling Koh 1.
Supporting Ranking and Clustering as Generalized Order-By and Group-By Chengkai Li (UIUC) joint work with Min Wang Lipyeow Lim Haixun Wang (IBM) Kevin.
Date: 2013/4/1 Author: Jaime I. Lopez-Veyna, Victor J. Sosa-Sosa, Ivan Lopez-Arevalo Source: KEYS’12 Advisor: Jia-ling Koh Speaker: Chen-Yu Huang KESOSD.
APEX: An Adaptive Path Index for XML data Chin-Wan Chung, Jun-Ki Min, Kyuseok Shim SIGMOD 2002 Presentation: M.S.3 HyunSuk Jung Data Warehousing Lab. In.
An Effective SPARQL Support over Relational Database Jing Lu, Feng Cao, Li Ma, Yong Yu, Yue Pan SWDB-ODBIS 2007 SNU IDB Lab. Hyewon Lim July 30 th, 2009.
Leveraging Knowledge Bases for Contextual Entity Exploration Categories Date:2015/09/17 Author:Joonseok Lee, Ariel Fuxman, Bo Zhao, Yuanhua Lv Source:KDD'15.
Multilingual Information Retrieval using GHSOM Hsin-Chang Yang Associate Professor Department of Information Management National University of Kaohsiung.
Instance Discovery and Schema Matching With Applications to Biological Deep Web Data Integration Tantan Liu, Fan Wang, Gagan Agrawal {liut, wangfa,
Efficient Semantic Web Service Discovery in Centralized and P2P Environments Dimitrios Skoutas 1,2 Dimitris Sacharidis.
GoRelations: an Intuitive Query System for DBPedia Lushan Han and Tim Finin 15 November 2011
哈工大信息检索研究室 HITIR ’ s Update Summary at TAC2008 Extractive Content Selection Using Evolutionary Manifold-ranking and Spectral Clustering Reporter: Ph.d.
Meta-Path-Based Ranking with Pseudo Relevance Feedback on Heterogeneous Graph for Citation Recommendation By: Xiaozhong Liu, Yingying Yu, Chun Guo, Yizhou.
PAIR project progress report Yi-Ting Chou Shui-Lung Chuang Xuanhui Wang.
Semantic Graph Mining for Biomedical Network Analysis: A Case Study in Traditional Chinese Medicine Tong Yu HCLS
Proposal for Term Project
Web News Sentence Searching Using Linguistic Graph Similarity
An Empirical Study of Learning to Rank for Entity Search
Wikitology Wikipedia as an Ontology
A Graph-Based Approach to Learn Semantic Descriptions of Data Sources
Magnet & /facet Zheng Liang
Bidirectional Query Planning Algorithm
Semantic Markup for Semantic Web Tools:
WSExpress: A QoS-Aware Search Engine for Web Services
Presentation transcript:

Q2Semantic: A Lightweight Keyword Interface to Semantic Search Haofen Wang 1, Kang Zhang 1, Qiaoling Liu 1, Thanh Tran 2, and Yong Yu 1 1 Apex Lab, Shanghai Jiao Tong University 2 Institute AIFB, University Karlsruhe, Germany

Agenda Introduction Q2Semantic – Workflow – Data Pre-Processing – Query Interpretation – Query Ranking Experiments Demo Conclusions and Future Work

Introduction Semantic Web can be seen as an ever growing web of structured and interlinked data – Large repositories of such data are available in RDF (DBpedia, TAP, DBLP and etc.) – Increasing available of these semantic data offers opportunities for semantic search engines to support more expressive queries Query interface in semantic search engines – Formal query interface (e.g. SPARQL) is supported in current semantic search engines – Natural language query interface as one solution – keyword query interface is the most popular one (our focus) Information need Find specifications about “SVG” whose author's name is “Capin” The SPARQL query PREFIX tap: SELECT ?spec WHERE { ?spec tap:hasAuthor ?person. ?spec tap:label “SVG”. ?person tap:name “Capin”. } The keyword query “SVG” + “Capin”

Introduction (cont’d) Many studies have been carried out to bridge the gap between keyword queries and formal queries – Keyword interfaces for DB or XML – Keyword Interfaces for semantic search engines Challenges – How to deal with keyword phrases which are expressed in the user's own words which do not appear in the RDF data? – How to find the relevant query when keywords are ambiguous (ranking)? – How to return the relevant queries as quickly as possible (scalability)?

Our Contributions We leverage terms extracted from Wikipedia to enrich literals described in the original RDF data. We adopt several mechanisms for query ranking, which can consider many relevant factors. We propose a novel graph data structure called clustered graph and an exploration algorithm. Additionally, the exploration algorithm also allows for the construction of the top-k queries.

Workflow of Q2Semantic Input: a keyword query K composed of keyword phrases {k 1, k 2, …, k n }. Search Process – Phrase Mapping – Query Construction and Ranking Index Process – Mapping, Clustering and Indexing Output: a formal query F as a tree of the form, where r is the root node of F and p i is a path in F. In our example, K includes k 1 = “Capin” and k 2 = “SVG”, and F =, where r = W3CSpecification, p 1 = and p 2 =.

Data Pre-Processing in Q2Semantic Four rules for mapping from RDF graph to RACK graph -Every instance of the RDF graph is mapped to a C-Node labeled by the concept name that the instance belongs to. -Every attribute value is mapped to a K-Node labeled by the value literal. -Every relation is mapped to a R-Edge that is labeled by the relation name and connects two C- Nodes. -Every attribute is mapped to an A-Edge that is labeled by the attribute name and connects a C- Node with a K-Node. Four rules for clustering RACK graph -Two C-Nodes are clustered to one if they have the same label. -Two R-Edges are clustered to one if they have the same label and connect the same pair of C-Nodes. -Two A-Edges are clustered to one if they have the same label and connected to the same C-Node. -Two K-Nodes are clustered to one if they are connected to the same A-Edge. The resulting node inherits the labels of both these K-Nodes.

Query Interpretation in Q2Semantic Phrase Mapping Query Construction – Thread Expansion (T-Expansion) – Cursor Expansion (C-Expansion) – Two strategies for expansion Intra-Thread Strategy Inter-Thread Strategy – Optimization for Top-k Termination – Optimization for Repeated Expansion

Query Ranking in Q2Semantic Path only Adding matching relevance Adding importance of edges and nodes

Experiment Setup TAP (220K triples) DBLP (26M triples) – 100 valid queries by combining literals from different attributes (from one to three keywords) LUBM(1,0), LUBM(20,0) and LUBM(50,0) – 8 queries from the LUBM Query Set (LQ) are used by removing 2 cyclic queries and 4 queries requiring reasoning support QueryKeywordsPotential information need Q3SupergirlWho is called “supergirl" Q5Strip, Las VegasWhat is the well-known “Strip" in Las Vegas Q9 Web Accessibility Initiative, www- rdf-perllib Find persons who work for Web Accessibility Initiative and involve in the activity with mailing list “www-rdf-perllib"

Effectiveness Evaluation A simple but effective metric Target Query Position (TQP): TQP = 11 – P target TQPs of different ranking schemes on TAP TQPs on LUBM benchmark queries TQP QueryLQ1LQ3LQ4LQ6LQ7LQ8LQ10LQ14

Efficiency Evaluation Search time under different ranking schemes Search time under different top-k Performance of penalty parameters Index size and search time on different datasets RACK graph vs. clustered RACK graph

Demo Q2Semantic – Find specifications about “SVG” whose author‘s name is “Capin"

Conclusions and Future Work For the efficiency purpose, we propose a new clustered graph index structure as a summary of the original RDF data and support top-k formal query construction on it. For the effectiveness purpose, we design well- performed ranking schemes. Additionally, we leverage knowledge from Wikipedia to enrich and disambiguates the keyword queries. Future Work – Query Capability Extension – Clustering Method

Questions? Thank you for your attending!