Doctoral Thesis Presentation Mohammed Nazim Uddin Dept. of Computer Science & Information Engineering, INHA University, Korea Advisor: Professor Geun-Sik.

Presentation transcript:

Doctoral Thesis Presentation Mohammed Nazim Uddin Dept. of Computer Science & Information Engineering, INHA University, Korea Advisor: Professor Geun-Sik Jo 1 Personalized Semantic Search using Ontological User Profile

Outline 2 Introduction; Related Works; Personalized Semantic Search; Experimental Evaluations; Conclusions.

Introduction 3 Personalized Information Search: User modeling (user profile). Search for information based on the user profile. Re-rank the search results to produce a newly ordered list.

Motivation 4 Personalized Semantic Search: Traditional search is keyword-based and does not capture any semantics. Most of the time, users' interests are not matched by the search results. Different users with diverse intentions who submit the same keyword receive the same set of results. Personalized semantic search provides search results that take into account various concepts and their relations to the user's intention.

Research Issue 5 Personalized search is not new in information retrieval, but effective personalization is still an open challenge. A number of studies have focused on personalized information search, using different methods to improve how well the retrieved results match user intention. Only a few methods have addressed a semantic approach to personalized search and applied it successfully in the domain of information retrieval.

Goal and Research Questions 6 This research considers the following issues in providing personalized semantic information search. How can user information be collected and modeled with an ontological approach to construct a user profile? How can the query be extended based on the user profile to create a semantic context describing the user's interests and preferences? How can the semantic user context be utilized for searching and ranking information?

Research Approach 7 Propose a framework for personalized information searching and ranking using semantic web technology in the domain of scientific research in computer science and information engineering. A novel method models user details with an ontological approach to represent the user's interests and preferences semantically. The semantic user profile is utilized to provide personalized search services. Scientific publications semantically related to a given query are extracted, and the results are re-ranked to present them to the user in a proper order. Experts for a particular field are searched and ranked using a social network in a semantic web environment.

Related Works 8 Learning Ontology-Based User Profiles: A Semantic Approach to Personalized Web Search [Ahu Sieg et al. (2007), IEEE Intelligent Informatics Bulletin, Vol. 8, No. 1]. Presents a method for building ontological user profiles by assigning interest scores to existing concepts (ODP). A spreading activation algorithm is applied to maintain the interest scores and update the profile based on the user's behavior. Search results are re-ranked based on interest scores and semantic evidence in the ontological user profile.

Related Works 9 Personalized information retrieval based on context and ontological knowledge [P. Mylonas et al. (2008), The Knowledge Engineering Review, Cambridge University Press]. Focuses on combining conceptualization and personalization methods to improve the performance of personalized information retrieval. Context is represented by concepts and the relationships between them, which build an ontology structure described using fuzzy relational algebra. User profiles are modeled with positive (P+) and negative (P-) preferences based on user actions and usage histories.

Related Works 10 Contextual Information Search Based on Ontological User Profile [Nazim et al., ICCCI 2010]. Proposes a framework for searching information based on a user profile. The user profile is modeled with an ontological approach. WordNet is used to extend the query context to provide semantic information about the user's interests. Log file analysis is used to monitor the user's interest in accessed pages in order to learn the initial profile. Results are filtered and ranked based on the profile.

Related Works 11 A combination approach to web user profiling [Jie Tang et al., ACM Transactions, 2010]. Aims at extracting and fusing a semantic-based user profile from the web. Researcher profiles are constructed by extending the FOAF ontology with relevant information from the web. Based on the profile information, an academic expert list is determined. Researchers' interests are extracted based on topics and publication venues.

Personalized Semantic Search 12

Personalized Semantic Search 13 The goal of personalized semantic search is to utilize user context through an ontological approach. The main intention is to accomplish semantic search over structured scientific research information based on the user profile. Search mainly focuses on: searching and ranking experts based on a query topic; searching academic research information related to user preferences. Personalization is carried out to customize search information by utilizing the user's preferences in the form of an ontological profile.

System Structure for Personalized Semantic Search 14 [Figure: system architecture. Components: GUI, Query Generator, User Profile Ontology, Semantic Search Space, AKB, Modeling Academic Information, Extraction of Academic Information, Experts Search, Publication Search, Ranking. Numbered flow: 1. Query; 2. Matching; 3. Extended Query; 4. Personalized Search Results.]

System Components 15 The system consists of four main components: Ontological User Profile; User Interface and Query Generator; Semantic Search Space; Searching and Ranking Academic Information.

Ontological User Profile 16 An ontology is defined as a formal, explicit specification of a shared conceptual understanding of a domain. A new ontology can be designed to model the users' details. Alternatively, a pre-existing domain ontology can be utilized as a reference ontology to model the user's information. An instance of the reference ontology can be defined as the semantic profile of an individual user.

Ontological User Profile 17 Information Collection: A combined approach with minimal user intervention is used. The user explicitly provides a name, address, and social web ID. Related information is then crawled automatically and implicitly. The collected information describes user activities and preferences.

Ontological User Profile 18 Construction of Ontological User Profile [Figure: user activities and preferences are enriched with thesauri, links, and ontologies (ODP, WordNet) to form the User Profile Ontology, shown as a hierarchy of concepts and sub-concepts.]

Ontological User Profile 19 Representation of User Activities and Preferences. Concept vector generation uses the Vector Space Model (TF-IDF). For each document d in a collection of documents D, a weighted concept vector is constructed as $\vec{d} = (w_1, w_2, \ldots, w_n)$, where $w_i$ is the weight of term i in document d. The weights are calculated as $w_i = f_i \cdot \log(N / n_i)$, where $f_i$ is the frequency of term i in document d, N is the number of documents in collection D, and $n_i$ is the number of documents that contain term i.
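
The concept vector generation described on this slide can be sketched as follows. This is a minimal illustration, assuming the standard TF-IDF weighting w_i = f_i * log(N / n_i) implied by the slide's definitions; the function name `tfidf_vectors` and the toy documents are hypothetical and not taken from the thesis.

```python
import math
from collections import Counter


def tfidf_vectors(documents):
    """Return one {term: weight} dict per tokenized document.

    Assumed weighting: w_i = f_i * log(N / n_i), where f_i is the term
    frequency in the document, N the number of documents, and n_i the
    number of documents containing the term.
    """
    n_docs = len(documents)
    # n_i: count each term once per document it appears in
    doc_freq = Counter(term for doc in documents for term in set(doc))
    vectors = []
    for doc in documents:
        term_freq = Counter(doc)  # f_i for this document
        vectors.append(
            {t: f * math.log(n_docs / doc_freq[t]) for t, f in term_freq.items()}
        )
    return vectors


# Hypothetical toy collection of already tokenized documents
docs = [
    ["semantic", "web", "ontology"],
    ["semantic", "search"],
    ["machine", "learning", "search"],
]
vectors = tfidf_vectors(docs)
```

With this weighting, a term that occurs in every document receives weight 0, so only discriminating terms contribute to a concept vector.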

Ontological User Profile 20 Enhance Preferences with ODP Ontology: The domain ontology ODP (Open Directory Project) has been investigated as a reference ontology to model the user details. In ODP, topics are organized hierarchically, together with the web pages that belong to each topic, and are maintained by volunteer users. Each topic is considered a concept, and the related documents represent the concept.

Ontological User Profile 21 ODP (Open Directory Project): human-edited web directory. [Figure: ODP directory page with the labels Child Directory, Related Page Link, and About AI.]

Ontological User Profile 22 Overview of Ontological User Profile Construction [Figure labels: Reference Ontology (ODP); Concept Vectors of Preferences and Activities; Mapping; LOD Data; Portion of Semantic User Profile. Concepts shown: Computer, AI, DB, Machine Learning, Classification, WWW, Web 2.0, Internet.]

User Interface and Query Generator 23 Query Expansion: The key point of a semantic search is to define the semantics (meaning) of the user query in order to find the desired information related to that query. Query expansion is the process of adding new terms or concepts based on the user profile. The query is extended by similarity matching with the ontological profile. The extended query is sent to the search space to extract the related information.
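
The query expansion step can be illustrated with a small sketch. It assumes the profile is a mapping from concept names to weighted terms and uses a simple term-overlap score as a stand-in for the similarity matching mentioned on the slide; the function `expand_query`, the parameter `max_new_terms`, and the sample profile are all hypothetical.

```python
def expand_query(query_terms, profile_concepts, max_new_terms=3):
    """Extend a query with the heaviest terms of the best-matching profile concept.

    profile_concepts: {concept_name: {term: weight}} (assumed representation).
    The overlap score below is an illustrative stand-in for the similarity
    matching used in the presentation.
    """
    query = set(query_terms)
    best_name, best_score = None, 0.0
    for name, terms in profile_concepts.items():
        # Score a concept by the total weight of the query terms it contains
        score = sum(weight for term, weight in terms.items() if term in query)
        if score > best_score:
            best_name, best_score = name, score
    if best_name is None:
        return list(query_terms)  # no profile concept matches: query unchanged
    # Add the highest-weighted terms of the matched concept not already in the query
    candidates = sorted(
        ((t, w) for t, w in profile_concepts[best_name].items() if t not in query),
        key=lambda tw: (-tw[1], tw[0]),
    )
    return list(query_terms) + [t for t, _ in candidates[:max_new_terms]]


# Hypothetical profile with two concepts
profile = {
    "Semantic Web": {"semantic": 0.8, "ontology": 0.9, "rdf": 0.7, "owl": 0.5},
    "Machine Learning": {"classification": 0.9, "learning": 0.8},
}
expanded = expand_query(["semantic", "search"], profile, max_new_terms=2)
```

Here the query matches the "Semantic Web" concept, so its two heaviest unseen terms are appended to the original query terms.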

Semantic Search Space 24 Documents are organized semantically in the form of resources and relationships rather than as simple links between HTML pages. An ontological approach is employed to build a knowledge base of concepts and their relationships, called the Academic Knowledge Base (AKB). An AKB is built for a particular domain. We select scientific research in computer science as the domain for building the AKB.

Semantic Search Space 25 Building the Academic Knowledge Base (AKB): Scientific research information related to the computer science domain is investigated with an ontological approach to build the AKB. In this approach an ontology is defined as $(C, R, C_f, R_f)$, where C is the set of concepts, R is the set of relations, $C_f$ is the set of concepts with relevance weights, and $R_f$ is the set of relation relevance weights. Concepts are referred to as classes when describing the AKB.
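
One way to picture the four-part ontology definition is as a small data structure. The sketch below is only an assumed in-memory representation (relations stored as subject, name, object triples); the class `Ontology`, its field names, and the sample weights are hypothetical, while the class and relation names come from the AKB slides.

```python
from dataclasses import dataclass, field


@dataclass
class Ontology:
    """Assumed container for the tuple (C, R, C_f, R_f)."""
    concepts: set = field(default_factory=set)            # C
    relations: set = field(default_factory=set)           # R, as (subject, name, object)
    concept_weights: dict = field(default_factory=dict)   # C_f: concept -> weight
    relation_weights: dict = field(default_factory=dict)  # R_f: triple -> weight

    def add_relation(self, subject, name, obj, weight=1.0):
        """Register both endpoints as concepts and store the weighted relation."""
        self.concepts.update((subject, obj))
        triple = (subject, name, obj)
        self.relations.add(triple)
        self.relation_weights[triple] = weight
        return triple


# Hypothetical fragment of an AKB; the weights are made up for illustration
akb = Ontology()
akb.add_relation("Journal", "is_A", "Publication", weight=0.9)
akb.add_relation("Publication", "written_By", "Researcher")
akb.concept_weights["Publication"] = 0.7
```

Keeping the weights in separate dictionaries mirrors the split between the plain sets C and R and their weighted counterparts in the definition.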

Semantic Search Space 26 AKB Ontology [Figure: classes Researcher, Field, and Publication. Subclasses: Topic_1, Topic_2, Topic_3; Journal, Book, Proceeding, Technical Report. Relations: is_A (weighted), has_Publication, include, belong_To_Field, written_By. Legend: Class, Subclass, Relation.]

Searching and Ranking Academic Information 27 Two scenarios have been considered for searching and ranking academic information: searching and ranking experts; searching and ranking scientific publications.

Searching and Ranking Experts 28 An expert list for a particular query topic is generated by constructing an Academic Social Network (ASN). All the authors and co-authors in the publication list generated by a matching algorithm are extracted. The ASN is constructed by analyzing the author and co-author relationships in the retrieved publications.
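
Building a network from author and co-author relationships can be sketched as below. This is a simplified illustration, assuming an undirected graph whose edge weights count shared publications; the presentation's own model distinguishes Outward and Inward relations, which this sketch does not attempt to reproduce. The function `build_asn` and the author names are hypothetical.

```python
from collections import defaultdict


def build_asn(publications):
    """Build a co-authorship graph from author lists.

    publications: list of author-name lists, one per retrieved publication.
    Returns {author: {coauthor: number of shared publications}}.
    """
    graph = defaultdict(lambda: defaultdict(int))
    for authors in publications:
        for a in authors:
            graph[a]  # touch the node so single-author papers still appear
            for b in authors:
                if a != b:
                    graph[a][b] += 1
    return {author: dict(neighbors) for author, neighbors in graph.items()}


# Hypothetical retrieved publications
retrieved = [["Kim", "Lee", "Park"], ["Kim", "Lee"], ["Choi"]]
asn = build_asn(retrieved)
```

Each author of a retrieved publication becomes a node, and every pair of co-authors gains an edge whose weight grows with each joint paper.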

Searching and Ranking Academic Information 29 ASN Construction: Topic-document Relationship Model (TRM). An initial score is measured, based on the AKB, for all the authors (including co-authors) of the publications for a given query topic. In the equation for the initial score of a researcher, c is the expert candidate (researcher/author), t is a given topic, w(c|1; p) is the relevance degree of publication p with c as first author, and w(c|2; q) is the relevance degree with c as a co-author. The equation also uses two damping factors.

Searching and Ranking Academic Information 30 Author and Co-Author Relationship Model (ARM). In this model the initial scores of expert candidates in the ASN are updated based on Outward and Inward relations. The relation between expert candidates is calculated from the Outward and Inward relations by an equation in which r(x; y) is the relation weight from node x (expert candidate) to node y (expert candidate) and $y_i$ is the Inward relation of node y. Based on the relation weights, the initial scores measured earlier are updated to rank the experts, using an equation in which $O_x$ is the Outward relation of node x, with damping factors for the Outward and Inward relations.

Searching and Ranking Academic Information 31 Searching and Ranking Scientific Publications: Academic information such as publications is searched by matching the query to the semantic search space. The semantic search space includes the "Field" hierarchy, to which publications are assigned considering the concepts and relations. [Figure: Field concepts such as Semantic Web and Ontology, each linked to publication instances (P19, P91, P101) by weighted belong_To relations.]

Searching and Ranking Academic Information 32 Matching: The query is extended with metadata from the ontological user profile. Each concept of the "Field" concept hierarchy contains the topic, the feature vector of the topic, and the related publication list with abstracts or index keywords. The query concept with its metadata is mapped to the concepts of the Field class (Topic) using cosine similarity. The best-matched concepts are selected with a similarity threshold. Related publications are extracted from the matched concepts.
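
The matching step (cosine similarity plus a threshold) can be sketched as follows. The sparse-dictionary vector representation, the function names `cosine` and `match_concepts`, the threshold value 0.3, and the sample vectors are assumptions made for illustration; the presentation does not state the threshold it used.

```python
import math


def cosine(u, v):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def match_concepts(query_vec, field_concepts, threshold=0.3):
    """Return (similarity, concept) pairs at or above the threshold, best first."""
    scored = ((cosine(query_vec, vec), name) for name, vec in field_concepts.items())
    return sorted((pair for pair in scored if pair[0] >= threshold), reverse=True)


# Hypothetical extended query and Field concept feature vectors
query_vec = {"ontology": 1.0, "semantic": 1.0}
fields = {
    "Semantic Web": {"semantic": 1.0, "web": 1.0},
    "Ontology": {"ontology": 1.0},
    "Data Mining": {"mining": 1.0},
}
matches = match_concepts(query_vec, fields)
```

Publications attached to the concepts that survive the threshold would then be collected as the candidate result set.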

Searching and Ranking Academic Information 33 Ranking the Results: Field concepts contain the list of publications with several annotated relations. The weight of each publication is calculated by adding all its relation weights, which can be denoted as P_w = belong_To + cite_By, where the belong_To weight measures the degree of relevance between a publication and a field concept, and cite_By is the number of other publications that cite this publication. Finally, the publications are re-ranked.
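
The ranking rule P_w = belong_To + cite_By can be sketched directly. The dictionary layout, the function name `rank_publications`, and the numeric values below are hypothetical; the publication identifiers follow the labels used in the search space figure.

```python
def rank_publications(publications):
    """Rank publications by P_w = belong_to + cite_by, highest first.

    publications: {pub_id: {"belong_to": float, "cite_by": int}} (assumed layout).
    """
    weights = {
        pub_id: rel["belong_to"] + rel["cite_by"]
        for pub_id, rel in publications.items()
    }
    return sorted(weights.items(), key=lambda item: item[1], reverse=True)


# Hypothetical relation weights for three publications
pubs = {
    "P19": {"belong_to": 0.8, "cite_by": 12},
    "P91": {"belong_to": 0.6, "cite_by": 30},
    "P101": {"belong_to": 0.9, "cite_by": 3},
}
ranking = rank_publications(pubs)
```

Because the sum as stated adds a relevance degree to a raw citation count, the citation term tends to dominate in this sketch; whether the thesis normalizes either term is not stated in the slides.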

Experimental Evaluations 34 The ultimate goal of the thesis is to provide a framework for personalized searching and ranking using semantic web technology. Experiments are carried out to test the efficiency and accuracy of the framework, which depend on: How accurately is the user profile built? How effectively is the user query expanded using the profile? Finally, the searching and ranking accuracy for academic information is tested.

Building User Profile 35 Data collection: The RDF representation of ODP is downloaded from the website. The Top Computer concept is taken as the root concept. The main goal of using ODP in this experiment is to construct a reference ontology, which is populated with the users' details to form the ontological user profile. The reference ontology is constructed from the ODP concept hierarchy, where each concept includes feature vectors generated from that concept.

Building User Profile 36 Evaluation Metrics: Precision and recall, the most commonly used information retrieval measures, are used in this experiment to evaluate profile accuracy. Additionally, the F-measure is calculated by combining precision and recall.
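
The three profile accuracy metrics can be sketched as follows. The slide's own F-measure formula is not preserved, so this sketch assumes the common balanced F-measure (harmonic mean of precision and recall); the function name and the sample concept sets are hypothetical.

```python
def precision_recall_f(retrieved, relevant):
    """Return (precision, recall, F-measure) for two collections of items.

    Assumes the balanced F-measure F = 2PR / (P + R).
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


# Hypothetical example: concepts placed in a profile vs. concepts judged relevant
profile_concepts = {"AI", "Machine Learning", "WWW", "Internet"}
judged_relevant = {"AI", "Machine Learning", "DB"}
p, r, f = precision_recall_f(profile_concepts, judged_relevant)
```

In the profile setting, "retrieved" corresponds to the concepts assigned to a user's profile and "relevant" to the concepts the assessors judged as matching that user's interests.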

Building User Profile 37 Profile Accuracy: The user profile accuracy test is meant to demonstrate that the constructed ontological user profile represents user interests and preferences accurately. To construct the ontological user profiles, the details of fifty users are collected from the social network site Facebook and from Google Scholar by querying their names and addresses. The users' details are mapped to the reference ontology. Judgments of whether the profiles were relevant to the users were made using the necessary information about the users whose profiles had been constructed.

Profile Accuracy 38 To evaluate the accuracy of the user profiles, precision, recall, and F-measure are calculated. Judgments are based on how relevant each user's profile is to the concepts that interest the user.

Building User Profile 39 In this test, different levels of the user profile are examined; each level contains a number of relevant and non-relevant concepts for representing the users' actual information needs. The top three levels of the ODP hierarchy are utilized for the experiment. [Figures: distribution of concepts in different depth levels; profile accuracy in different levels.]

Query Expansion 40 Extension of the query with the ontological user profile for personalized searching and ranking. The goal of this experiment is to evaluate whether the expanded query concepts obtained using the profile are relevant to the user's context. Ten queries using twenty profiles (divided into 5 groups) are tested for evaluation.

Experiments in Searching and Ranking 41 The main goal of this experiment is to measure the retrieval accuracy for academic information using the ontological user profile. Searching and ranking of academic experts using the models in this research is evaluated by comparison with baseline methods. A similar approach is tested using the ontological profile to generate a personalized ranking list of academic experts. Additionally, searching and ranking of personalized scientific publications for a given query is tested by comparing the methods described in this thesis with a baseline.

Experiments in Searching and Ranking 42 Data collection: Real-world data has been collected from CiteSeerX, a scientific literature digital library that focuses primarily on the literature in computer and information science. The metadata covers publications related to computer and information science, with a total size of approximately 2 GB. Table 1 shows the collection based on CiteSeer after cleaning the downloaded data.
Table 1. Columns: Record Name, Data Volume. Rows: No. of Publications; No. of Authors, 75000.

Experiments in Searching and Ranking 43 Building the Academic Knowledge Base (AKB): We have built the AKB automatically from the corpus. Topics for our AKB are selected by investigating ODP and Eventseer with the help of senior researchers and research faculty members who are skilled in ontological knowledge representation. Table 2 shows part of the topic selection for the Field class of the AKB.
Table 2. Topics: Artificial Intelligence; Belief Network; Semantic Web; Knowledge Representation; Agents; Multi Agents; Ontology; Fuzzy; Machine Learning; Data Mining.

Experiments in Searching and Ranking 44 Evaluation Metrics: Precision at k; R-precision (R-prec); Mean Average Precision (MAP).
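
The three ranking metrics named on this slide can be sketched using their standard information retrieval definitions. The function names and the toy ranking are hypothetical, and the slides do not specify which values of k were used.

```python
def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k ranked items that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    return sum(1 for item in ranked[:k] if item in relevant) / k


def r_precision(ranked, relevant):
    """Precision at R, where R is the number of relevant items."""
    return precision_at_k(ranked, relevant, len(relevant)) if relevant else 0.0


def average_precision(ranked, relevant):
    """Mean of the precision values at each rank where a relevant item appears."""
    hits, total = 0, 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


def mean_average_precision(runs):
    """runs: list of (ranked_list, relevant_set) pairs, one per query."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


# Hypothetical ranked expert list and relevance judgments for one query
ranked_experts = ["e1", "e2", "e3", "e4", "e5"]
relevant_experts = {"e1", "e3", "e6"}
p_at_2 = precision_at_k(ranked_experts, relevant_experts, 2)
r_prec = r_precision(ranked_experts, relevant_experts)
ap = average_precision(ranked_experts, relevant_experts)
```

Average precision divides by the total number of relevant items, so a relevant expert that never appears in the ranking (here "e6") lowers the score.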

Experiments in Searching and Ranking 45 Assessments: To evaluate quality, web search systems typically use human judgments that indicate which results are relevant for a given query, or some approximation of a "ground truth" inferred from users' clicks, or a combination of both. Initially, for a given query, the top 50 results were given to researchers, including research faculty members and doctoral and master's students, to assess the expert candidates returned by our system. To help the researchers in the evaluation process, we provided the necessary information about the expert candidates.

Experiments in Searching and Ranking 46 Searching and Ranking Experts. Baseline: the Hybrid model of Language and Topic model (HLT) by Deng et al. [Deng08] and ArnetMiner by Tang et al. [Tang08] are set as baselines. [Figure: comparison of our methods (TRM and ARM) with the baseline methods by Precision at k.]

Experiments in Searching and Ranking 47 Searching and Ranking Experts: R-Prec and MAP
Approach   R-Prec   MAP
Baseline   48.8%    32.39%
TRM        52.2%    39.4%
ARM        63.8%    46.4%

Experiments in Searching and Ranking 48 Searching and Ranking Experts [Figure: Precision at k, personalization vs. non-personalization.]

Searching and Ranking Academic Information 49 Searching and Ranking Scientific Publications: In this experiment, personalized searching and ranking of academic documents (publications) using the ontological profile is evaluated. The baseline is the semantic approach to Personalized Web Search (SPWS) by Sieg et al. [Sien2007].

Conclusions 50 A novel framework for personalized semantic searching and ranking of information using an ontological user profile has been presented and tested with a series of experiments. Experiments on the different components of the framework, such as semantic profile building and query expansion, have shown significant achievements in profile accuracy (90% precision and 70% recall) and in the generation of a search context with relevant information. In both scenarios, empirical results show that the semantic search framework: provides considerably improved searching and ranking accuracy and efficiency for finding academic information; improves user satisfaction by presenting information based on the individual user's needs; offers robust performance in finding information in the related domain, such as expert finding and publication search.

Future Works 51 Only limited social network techniques have been adopted in this thesis for modeling experts and collecting users' information for building profiles. The semantic search space was constructed with the CiteSeer collection only. Future plans: Utilize the full range of social network features (Facebook, Twitter, and others) to infer user preferences and interests for constructing the semantic user profile. Integrate different data sources to construct an inclusive search space using semantic technology.