A Probabilistic Model for Fine-Grained Expert Search
Shenghua Bao, Huizhong Duan, Qi Zhou, Miao Xiong, Yunbo Cao, Yong Yu
June 16-18, 2008, Columbus, Ohio



Schedule
1. Introduction
2. Fine-grained Expert Search
3. Experimental Results
4. Conclusion

Introduction: Expert Search
- “Who is an expert on X?”
- Diagram labels: User, Query, Search Engine, Experts
- Example query: Who are experts on Semantic Web Search Engine?

Introduction: Pioneering Expert Search Systems
- Log data in software development: Kautz et al., 1996; Mockus and Herbsleb, 2002; McDonald and Ackerman, 1998; etc.
- Email communications: Campbell et al., 2003; Dom et al., 2003; Sihn and Heeren, 2001; etc.
- General documents: Yimam, 1996; Davenport and Prusak, 1998; Streeter and Lochbaum, 1988; Mattox et al., 1999; Hertzum and Pejtersen, 2000; Craswell et al., 2001; etc.

Introduction: Expert Search at TREC
- A new task at TREC 2005, 2006, 2007 (Craswell et al., 2005; Soboroff et al., 2006; Bailey et al., 2007)
- Many approaches have been proposed:
  - Two generative models (Balog et al.)
  - Prior distribution, relevance feedback (Fang et al.)
  - Hierarchical language model (Petkova and Croft, 2006)
  - Voting and data fusion (Macdonald and Ounis, 2006)
  - …

Introduction: Coarse-grained Approach
- Expert search is carried out at the granularity of whole documents.
- Further improvements are hard to achieve at this granularity.
- Different blocks of an electronic document have:
  - Different functions and qualities
  - Different impacts on expert search

Examples: Windowed Section Relation (figure labels: queried topic, window, relevant, irrelevant)

Examples: Title-Author Relation (figure labels: Title, Author). Query: Timed Text

Examples: Reference Section Relation

Examples: Section Title-Body Relation. Query: W3C Management Team


Fine-grained Expert Search: Evidence Extraction
Query: Who are experts on Semantic Web Search Engine?
Document-001:
- E1: “…a high-level plan of the architecture of the semantic web by Tim Berners-Lee…”
- E2: “…later, Berners-Lee describes a semantic web search engine experience…”
Fine-grained evidence: Tim Berners-Lee
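The extraction step on this slide can be illustrated with a minimal sketch. It assumes that evidence is recorded whenever a person mention and the queried topic occur within a fixed token window; the `Evidence` tuple, the function names, and the default window size are hypothetical and are not taken from the slides.

```python
import re
from typing import List, NamedTuple


class Evidence(NamedTuple):
    """Hypothetical record for one piece of evidence."""
    topic: str
    person: str
    relation: str
    doc_id: str


def tokenize(text: str) -> List[str]:
    # Lowercase and split on non-alphanumerics ("Berners-Lee" -> "berners", "lee").
    return re.findall(r"[a-z0-9]+", text.lower())


def find_positions(tokens: List[str], phrase: List[str]) -> List[int]:
    # Start offsets of every exact occurrence of `phrase` in `tokens`.
    n = len(phrase)
    return [i for i in range(len(tokens) - n + 1) if tokens[i:i + n] == phrase]


def extract_windowed_evidence(doc_id: str, text: str, topic: str,
                              persons: List[str], window: int = 20) -> List[Evidence]:
    # Emit evidence for each person mentioned within `window` tokens of the topic.
    tokens = tokenize(text)
    topic_pos = find_positions(tokens, tokenize(topic))
    found = []
    for person in persons:
        person_pos = find_positions(tokens, tokenize(person))
        if any(abs(tp - pp) <= window for tp in topic_pos for pp in person_pos):
            found.append(Evidence(topic, person, "windowed_section", doc_id))
    return found


doc = ("a high-level plan of the architecture of the semantic web by "
       "Tim Berners-Lee. Later, Berners-Lee describes a semantic web "
       "search engine experience.")
evidence = extract_windowed_evidence("Document-001", doc, "semantic web",
                                     ["Tim Berners-Lee", "Alice Smith"])
```

Only the window relation is sketched here; the other relation types in the deck (title-author, reference section, section title-body) would need document structure that this toy tokenizer does not see.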

Fine-grained Expert Search: Search Model
- Components: evidence (t, p, r, d), expert candidate (c), query (q)
- Models: Expert Matching Model, Evidence Matching Model
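The slide names the two sub-models without giving a formula. One plausible reading, shown here purely as an assumption, is that a candidate's score sums over evidence the product of an expert-matching probability P(c|e) and an evidence-matching probability P(e|q). The function name, the dictionary layout, and all probabilities below are made up for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Evidence is the quadruple (t, p, r, d) from the slide.
EvidenceTuple = Tuple[str, str, str, str]


def rank_candidates(
    evidence: List[Tuple[EvidenceTuple, float]],
    mention_to_candidates: Dict[str, Dict[str, float]],
) -> List[Tuple[str, float]]:
    """Score candidates as sum over evidence of P(c|e) * P(e|q) (assumed form).

    evidence: pairs of (t, p, r, d) and an assumed evidence-matching score P(e|q).
    mention_to_candidates: for each person mention p, assumed expert-matching
        scores P(c|e) for the candidates the mention may refer to.
    """
    scores: Dict[str, float] = defaultdict(float)
    for (_t, p, _r, _d), p_e_given_q in evidence:
        for candidate, p_c_given_e in mention_to_candidates.get(p, {}).items():
            scores[candidate] += p_c_given_e * p_e_given_q
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Illustrative numbers only.
evidence = [
    (("semantic web", "Tim Berners-Lee", "windowed_section", "d1"), 0.5),
    (("semantic web", "TBL", "title_author", "d2"), 0.25),
]
mention_to_candidates = {
    "Tim Berners-Lee": {"candidate-1": 1.0},
    "TBL": {"candidate-1": 0.5, "candidate-2": 0.5},
}
ranking = rank_candidates(evidence, mention_to_candidates)
```

An ambiguous mention such as the short name splits its contribution across the candidates it could denote, which is one way the expert matching step could influence the final ranking.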

Fine-grained Expert Search: Expert Matching
Mask | Sample
Full Name | Ritu Raj Tiwari
Combined Name | Tiwari, Ritu R
Abbr. Name | Ritu Raj; Ritu
Short Name | RRT
Alias, new | ( for short)
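The name masks in the table can be generated mechanically from a full name. The sketch below reproduces the four masks whose samples survive on the slide; the function name and the returned dictionary layout are hypothetical, and the alias mask is left out because its sample is not preserved.

```python
from typing import Dict, List


def name_masks(full_name: str) -> Dict[str, List[str]]:
    """Derive name variants from a full name of at least two parts."""
    parts = full_name.split()
    if len(parts) < 2:
        raise ValueError("expected at least a given name and a family name")
    first, *middle, last = parts
    given = [first] + middle
    # Combined name: family name, first name, then initials of middle names.
    combined_given = " ".join([first] + [m[0] for m in middle])
    return {
        "full": [" ".join(parts)],
        "combined": [f"{last}, {combined_given}"],
        # Abbreviated names: progressively shorter prefixes of the given names.
        "abbr": [" ".join(given[:k]) for k in range(len(given), 0, -1)],
        # Short name: initials of every part.
        "short": ["".join(p[0].upper() for p in parts)],
    }


masks = name_masks("Ritu Raj Tiwari")
```

Shorter masks such as "Ritu" or "RRT" match more people, so a matcher built on them would presumably assign them lower confidence than a full-name match.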

Fine-grained Expert Search: Evidence Matching
Query matching:
Type | Sample
Query | Semantic Web Search Engine
Phrase | “Semantic Web Search Engine”
Bi-gram | “Semantic Web”, “Search Engine”
Proximity | “Semantic … Web Search Engine”
Fuzzy | “Samentic Web Saerch Engine”
Stemmed | “Semantic Web Search Engin”
Relation Type: Same Section; Windowed Section; Reference Section; Title-Author; Section Title-Body
Quality Type: Dynamic Quality; Static Quality
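The query matching types can be approximated with a few small helpers. This is a sketch under stated assumptions: the stemmer is a toy suffix stripper rather than a real stemming algorithm, fuzzy matching uses `difflib.SequenceMatcher` with an arbitrary 0.7 threshold, and proximity is read as "all query terms inside a window slightly longer than the query". None of these choices come from the slides.

```python
import re
from difflib import SequenceMatcher
from typing import List


def tokens(text: str) -> List[str]:
    return re.findall(r"[a-z0-9]+", text.lower())


def phrase_match(query: str, text: str) -> bool:
    # Query terms must appear contiguously and in order.
    q, t = tokens(query), tokens(text)
    return any(t[i:i + len(q)] == q for i in range(len(t) - len(q) + 1))


def bigrams(query: str) -> List[str]:
    # Every adjacent pair of query terms (the slide lists two of them).
    q = tokens(query)
    return [" ".join(q[i:i + 2]) for i in range(len(q) - 1)]


def proximity_match(query: str, text: str, slack: int = 3) -> bool:
    # All query terms fall inside one window of len(query) + slack tokens.
    q, t = tokens(query), tokens(text)
    span = len(q) + slack
    return any(set(q) <= set(t[i:i + span]) for i in range(len(t)))


def fuzzy_term_match(a: str, b: str, threshold: float = 0.7) -> bool:
    # Tolerates misspellings such as "samentic" for "semantic".
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold


def crude_stem(term: str) -> str:
    # Toy stemmer: strips one common suffix; stands in for a real stemmer.
    for suffix in ("ing", "es", "s", "e"):
        if term.endswith(suffix) and len(term) - len(suffix) >= 3:
            return term[: -len(suffix)]
    return term


query = "Semantic Web Search Engine"
text = "a semantic and linked web search engine"
```

On this example the phrase matcher fails while the proximity matcher succeeds, which is the kind of recall gain the relaxed match types are meant to provide.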


Experimental Result
W3C Corpus
- 331,307 web pages
- 10 training topics of TREC 2005
- 50 test topics of TREC 2005
- 49 test topics of TREC 2006
Evaluation Metrics
- Mean average precision (MAP)
- R-precision (R-P)
- Top N precision
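For reference, the three metrics can be computed as follows. This is a minimal sketch using the standard definitions; the function names and the toy ranking are illustrative, and it is not the evaluation tooling used for the reported experiments.

```python
from typing import Dict, List, Set


def precision_at_n(ranked: List[str], relevant: Set[str], n: int) -> float:
    # Fraction of the top n results that are relevant.
    return sum(1 for c in ranked[:n] if c in relevant) / n


def r_precision(ranked: List[str], relevant: Set[str]) -> float:
    # Precision at rank R, where R is the number of relevant items.
    r = len(relevant)
    return precision_at_n(ranked, relevant, r) if r else 0.0


def average_precision(ranked: List[str], relevant: Set[str]) -> float:
    # Mean of the precision values at each rank where a relevant item appears.
    if not relevant:
        return 0.0
    hits, total = 0, 0.0
    for rank, candidate in enumerate(ranked, start=1):
        if candidate in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant)


def mean_average_precision(runs: Dict[str, List[str]],
                           qrels: Dict[str, Set[str]]) -> float:
    # Average of per-topic average precision over all judged topics.
    return sum(average_precision(runs.get(t, []), qrels[t]) for t in qrels) / len(qrels)


# Toy example: one topic, four ranked candidates, two true experts.
ranked = ["a", "b", "c", "d"]
relevant = {"a", "c"}
ap = average_precision(ranked, relevant)       # (1/1 + 2/3) / 2
rp = r_precision(ranked, relevant)             # top 2 contains one expert
p3 = precision_at_n(ranked, relevant, 3)       # two experts in the top 3
```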

Experimental Result: Query Matching
Columns: TREC 2005, TREC 2006
Rows: Baseline; Bi-gram; Proximity; Fuzzy, Stemmed
Improv.: …, 17.09%, 9.80%, 22.07%, 11.49%, 5.30%
T-test

Experimental Result: Person Matching
Columns: TREC 2005, TREC 2006
Rows: Baseline; Combined Name; Abbr. Name; Short Name; Alias, new
Improv.: 6.21%, 3.60%, 1.19%, 14.00%, 1.96%, 1.68%
T-test

Experimental Result: Multiple Relations
Columns: TREC 2005, TREC 2006
Rows: Baseline; Windowed Section; Reference Section; Title-Author; Section Title-Body
Improv.: …, 19.91%, 10.00%, 8.35%, 8.77%, 8.50%
T-test

Experimental Result: Evidence Quality
Columns: TREC 2005, TREC 2006
Rows: Baseline; Static quality; Dynamic quality
Improv.: 6.13%, 4.67%, 3.74%, 2.86%, 3.67%, 8.61%
T-test
Rank


Conclusion
- Fine-grained expert search
- Probabilistic model and its implementation
- Evaluation on the TREC data set