A Probabilistic Model for Fine-Grained Expert Search
Shenghua Bao, Huizhong Duan, Qi Zhou, Miao Xiong, Yunbo Cao, Yong Yu
June 16-18, 2008, Columbus, Ohio
Schedule
1. Introduction
2. Fine-grained Expert Search
3. Experimental Results
4. Conclusion
Introduction
Expert search answers the question "Who is an expert on X?": a user submits a query (e.g., "Semantic Web Search Engine") to a search engine, which returns a ranked list of experts.
Introduction
Pioneering expert search systems, by evidence source:
- Log data in software development: Kautz et al., 1996; Mockus and Herbsleb, 2002; McDonald and Ackerman, 1998; etc.
- Email communications: Campbell et al., 2003; Dom et al., 2003; Sihn and Heeren, 2001; etc.
- General documents: Yimam, 1996; Davenport and Prusak, 1998; Steer and Lochbaum, 1988; Mattox et al., 1999; Hertzum and Pejtersen, 2000; Craswell et al., 2001; etc.
Introduction
Expert search at TREC: a new task at TREC 2005, 2006, and 2007 (Craswell et al., 2005; Soboroff et al., 2006; Bailey et al., 2007). Many approaches have been proposed:
- Two generative models (Balog et al., 2006)
- Prior distribution and relevance feedback (Fang et al., 2006)
- Hierarchical language model (Petkova and Croft, 2006)
- Voting and data fusion (Macdonald and Ounis, 2006)
- …
Introduction
Existing methods take a coarse-grained approach: expert search is carried out at the granularity of whole documents, so further improvements are hard to achieve. In fact, different blocks of an electronic document have different functions and qualities, and therefore different impacts on expert search.
Examples: Windowed Section Relation. Text within a window around the queried topic is treated as relevant evidence; text outside the window is irrelevant.
Examples: Title-Author Relation (query: "Timed Text"). The author name attached to a matching title provides expert evidence.
Examples: Reference Section Relation.
Examples: Section Title-Body Relation (query: "W3C Management Team").
Schedule
1. Introduction
2. Fine-grained Expert Search
3. Experimental Results
4. Conclusion
Fine-grained Expert Search: Evidence Extraction
Query: Who are experts on Semantic Web Search Engine?
Document-001 yields two pieces of fine-grained evidence for the candidate Tim Berners-Lee:
E1: "…a high-level plan of the architecture of the semantic web by Tim Berners-Lee…"
E2: "…later, Berners-Lee describes a semantic web search engine experience…"
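The extraction step can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name, the fixed character window, and the exact-match name lookup are all assumptions.

```python
import re

def extract_evidence(doc_text, candidate_names, window=60):
    """Collect (candidate, passage) evidence pairs: for every mention of a
    candidate name, keep a character window around the mention.
    Toy sketch; the real system uses richer block/relation analysis."""
    evidence = []
    for name in candidate_names:
        for m in re.finditer(re.escape(name), doc_text):
            start = max(0, m.start() - window)
            end = min(len(doc_text), m.end() + window)
            evidence.append((name, doc_text[start:end]))
    return evidence
```

Run over Document-001's text with the candidate "Tim Berners-Lee", this yields one evidence snippet per mention.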
Fine-grained Expert Search: Search Model
A piece of evidence (t, p, r, d) connects an expert candidate (c) and a query (q): the Expert Matching Model scores the candidate against the evidence, and the Evidence Matching Model scores the evidence against the query.
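The two-stage scoring can be sketched with toy estimates. This is a hypothetical sketch, not the paper's probabilistic model: evidence matching is approximated by query-term coverage of the evidence text, and expert matching by an exact match on the evidence's person field.

```python
def score_candidate(candidate, query_terms, evidence_list):
    """Sum evidence-matching scores over all evidence whose person field
    matches the candidate (toy stand-ins for the two matching models)."""
    score = 0.0
    for person, text in evidence_list:
        if person != candidate:          # expert matching (exact match only)
            continue
        tokens = text.lower().split()
        # evidence matching: fraction of query terms present in the text
        score += sum(t.lower() in tokens for t in query_terms) / len(query_terms)
    return score
```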
Fine-grained Expert Search: Expert Matching

Mask               Sample
Full Name          Ritu Raj Tiwari
Email Name         rtiwari@nuance.com
Combined Name      Tiwari, Ritu R
Abbr. Name         Ritu Raj; Ritu
Short Name         RRT
Alias (new email)  rtiwari@hotmail.com ( for short)
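The name masks in the table can be generated mechanically. A minimal sketch; the function name and the specific mask rules (surname-first combination, initial-based short name) are assumptions inferred from the samples above.

```python
def name_masks(full_name, email):
    """Derive the mask forms shown above from a full name and an email."""
    parts = full_name.split()
    first, last = parts[0], parts[-1]
    return {
        "full_name": full_name,                       # "Ritu Raj Tiwari"
        "email_name": email.split("@")[0],            # "rtiwari"
        # surname first, middle names abbreviated: "Tiwari, Ritu R"
        "combined_name": last + ", " + " ".join([first] + [p[0] for p in parts[1:-1]]),
        # abbreviated forms: drop the surname, or keep the first name only
        "abbr_names": [" ".join(parts[:-1]), first],  # ["Ritu Raj", "Ritu"]
        "short_name": "".join(p[0] for p in parts),   # initials "RRT"
    }
```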
Fine-grained Expert Search: Evidence Matching

Type       Sample
Query      Semantic Web Search Engine
Phrase     "Semantic Web Search Engine"
Bi-gram    "Semantic Web" "Search Engine"
Proximity  "Semantic … Web Search Engine"
Fuzzy      "Samentic Web Saerch Engine"
Stemmed    "Semantic Web Search Engin"

Relation types: Same Section; Windowed Section; Reference Section; Title-Author; Section Title-Body
Quality types: Dynamic Quality; Static Quality
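The query variants in the table can be produced from the raw query. A toy sketch under stated assumptions: it uses sliding-window bi-grams (the table shows non-overlapping pairs, which are a subset of these) and a crude suffix-stripping rule standing in for a real stemmer such as Porter's.

```python
def query_variants(query):
    """Produce phrase, bi-gram, and stemmed variants of a query."""
    terms = query.split()
    bigrams = [" ".join(terms[i:i + 2]) for i in range(len(terms) - 1)]

    def crude_stem(word):
        # naive suffix stripping; a real system would use a proper stemmer
        for suffix in ("es", "e", "s"):
            if word.lower().endswith(suffix) and len(word) > len(suffix) + 2:
                return word[: -len(suffix)]
        return word

    return {
        "phrase": f'"{query}"',
        "bigrams": bigrams,
        "stemmed": " ".join(crude_stem(t) for t in terms),
    }
```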
Schedule
1. Introduction
2. Fine-grained Expert Search
3. Experimental Results
4. Conclusion
Experimental Results: Setup
Data: W3C corpus, 331,307 web pages
Topics: 10 training topics of TREC 2005; 50 test topics of TREC 2005; 49 test topics of TREC 2006
Evaluation metrics: mean average precision (MAP), R-precision (R-P), top-N precision (P@N)
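The three metrics can be computed as follows; a standard IR sketch (function names are my own), shown for a single topic. The reported numbers average the per-topic values over all test topics.

```python
def average_precision(ranked, relevant):
    """AP for one topic: mean of the precision values at each relevant hit."""
    hits, total = 0, 0.0
    for i, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / i
    return total / len(relevant) if relevant else 0.0

def precision_at(ranked, relevant, n):
    """P@N: fraction of the top-N ranked items that are relevant."""
    return sum(1 for item in ranked[:n] if item in relevant) / n

def r_precision(ranked, relevant):
    """R-P: precision at rank R, where R is the number of relevant items."""
    return precision_at(ranked, relevant, len(relevant))
```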
Experimental Results: Query Matching

                      TREC 2005               TREC 2006
                  MAP     R-P     P@10     MAP     R-P     P@10
Baseline         0.1840  0.2136  0.3060   0.3752  0.4585  0.5604
+Bi-gram         0.1957  0.2438  0.3320   0.4140  0.4910  0.5799
+Proximity       0.2024  0.2501  0.3360   0.4530  0.5137  0.5922
+Fuzzy, Stemmed  0.2030  0.2501  0.3360   0.4580  0.5112  0.5901
Improv.          10.33%  17.09%  9.80%    22.07%  11.49%  5.30%
T-test p-values: 0.0084 (TREC 2005), 0.0000 (TREC 2006)
Experimental Results: Person Matching

                        TREC 2005               TREC 2006
                    MAP     R-P     P@10     MAP     R-P     P@10
Baseline           0.2030  0.2501  0.3360   0.4580  0.5112  0.5901
+Combined Name     0.2056  0.2539  0.3463   0.4709  0.5152  0.5931
+Abbr. Name        0.2106  0.2545  0.3400   0.5010  0.5181  0.6000
+Short Name        0.2111  0.2578  0.3400   0.5121  0.5192  0.6000
+Alias, new email  0.2156  0.2591  0.3400   0.5221  0.5212  0.6000
Improv.            6.21%   3.60%   1.19%    14.00%  1.96%   1.68%
T-test p-values: 0.0064 (TREC 2005), 0.0057 (TREC 2006)
Experimental Results: Multiple Relations

                          TREC 2005               TREC 2006
                      MAP     R-P     P@10     MAP     R-P     P@10
Baseline             0.2156  0.2591  0.3400   0.5221  0.5212  0.6000
+Windowed Section    0.2158  0.2633  0.3380   0.5255  0.5311  0.6082
+Reference Section   0.2160  0.2630  0.3380   0.5272  0.5314  0.6061
+Title-Author        0.2234  0.2634  0.3580   0.5354  0.5355  0.6245
+Section Title-Body  0.2586  0.3107  0.3740   0.5657  0.5669  0.6510
Improv.              19.94%  19.91%  10.00%   8.35%   8.77%   8.50%
T-test p-values: 0.0013 (TREC 2005), 0.0043 (TREC 2006)
Experimental Results: Evidence Quality

                      TREC 2005               TREC 2006
                  MAP     R-P     P@10     MAP     R-P     P@10
Baseline         0.2586  0.3107  0.3740   0.5657  0.5669  0.6510
+Static quality  0.2711  0.3188  0.3720   0.5900  0.5813  0.6796
+Dynamic quality 0.2755  0.3252  0.3880   0.5943  0.5877  0.7061
Improv.          6.13%   4.67%   3.74%    2.86%   3.67%   8.61%
Rank 1 @ TREC    0.2749  0.3330  0.4520   0.5947  0.5783  0.7041
T-test p-values: 0.0360 (TREC 2005), 0.0252 (TREC 2006)
Schedule
1. Introduction
2. Fine-grained Expert Search
3. Experimental Results
4. Conclusion
Conclusion
- Proposed fine-grained expert search
- Developed a probabilistic model and its implementation
- Evaluated the approach on the TREC data sets