
A Probabilistic Model for Fine-Grained Expert Search Shenghua Bao, Huizhong Duan, Qi Zhou, Miao Xiong, Yunbo Cao, Yong Yu June 16--18, 2008, Columbus Ohio.


1 A Probabilistic Model for Fine-Grained Expert Search Shenghua Bao, Huizhong Duan, Qi Zhou, Miao Xiong, Yunbo Cao, Yong Yu June 16--18, 2008, Columbus Ohio

2 Schedule
1 Introduction
2 Fine-grained Expert Search
3 Experimental Results
4 Conclusion

3 Introduction: Expert Search
- “Who is an expert on X?”
- Example query: Who are experts on Semantic Web Search Engine?
[Diagram labels: User, Query, Search Engine, Experts]

4 Introduction: Pioneering Expert Search Systems
- Log data in software development: Kautz et al., 1996; Mockus and Herbsleb, 2002; McDonald and Ackerman, 1998; etc.
- Email communications: Campbell et al., 2003; Dom et al., 2003; Sihn and Heeren, 2001; etc.
- General documents: Yimam, 1996; Davenport and Prusak, 1998; Steer and Lochbaum, 1988; Mattox et al., 1999; Hertzum and Pejtersen, 2000; Craswell et al., 2001; etc.

5 Introduction: Expert Search at TREC
- A new task at TREC 2005, 2006, 2007: Craswell et al., 2005; Soboroff et al., 2006; Bailey et al., 2007
- Many approaches have been proposed:
  - Two generative models (Balog et al., 2006)
  - Prior distribution, relevance feedback (Fang et al., 2006)
  - Hierarchical language model (Petkova and Croft, 2006)
  - Voting and data fusion (Macdonald and Ounis, 2006)
  - …

6 Introduction: Coarse-grained approach
- Expert search is carried out at the granularity of whole documents.
- Further improvements are hard to achieve.
Different blocks of electronic documents
- Different functions and qualities
- Different impacts on expert search

7 Examples: Windowed Section Relation
[Figure labels: queried topic, Window, relevant, irrelevant]
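The windowed relation on this slide can be illustrated with a small Python sketch. It treats a person mention as relevant only if it falls within a fixed number of tokens of the queried topic. The function name `persons_in_window`, the tokenizer, and the window size of 5 are assumptions made for illustration; the slide does not state how the window is defined or sized.

```python
import re


def _tokens(text):
    # Lowercased word tokens; "Berners-Lee" becomes ["berners", "lee"].
    return re.findall(r"\w+", text.lower())


def _positions(haystack, needle):
    # Start indices where the token list `needle` occurs contiguously.
    n = len(needle)
    return [i for i in range(len(haystack) - n + 1) if haystack[i:i + n] == needle]


def persons_in_window(text, topic, persons, window=5):
    """Return the persons mentioned within `window` tokens of the topic.

    `window` is a hypothetical parameter, not a value from the slides.
    """
    doc = _tokens(text)
    topic_toks = _tokens(topic)
    topic_hits = _positions(doc, topic_toks)
    found = set()
    for person in persons:
        p_toks = _tokens(person)
        for j in _positions(doc, p_toks):
            p_end = j + len(p_toks) - 1
            for i in topic_hits:
                t_end = i + len(topic_toks) - 1
                # The person span overlaps the topic span widened by `window`.
                if p_end >= i - window and j <= t_end + window:
                    found.add(person)
    return found
```

A mention far from the topic (outside the window) is ignored, which matches the relevant/irrelevant split suggested by the figure labels.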

8 Examples: Title-Author Relation
Query: Timed Text
[Figure labels: Title, Author]

9 Examples: Reference Section Relation

10 Examples: Section Title-Body Relation
Query: W3C Management Team

11 Schedule
1 Introduction
2 Fine-grained Expert Search
3 Experimental Results
4 Conclusion

12 Fine-grained Expert Search -- Evidence Extraction
Fine-grained Evidence
Query: Who are experts on Semantic Web Search Engine?
Document-001:
“…a high-level plan of the architecture of the semantic web by Tim Berners-Lee…”
“…later, Berners-Lee describes a semantic web search engine experience…”
[Figure labels: E1, E2, Tim Berners-Lee]

13 Fine-grained Expert Search -- Search Model
[Diagram labels: Query (q), Evidence Matching Model, (t,p,r,d), Expert Matching Model, Expert Candidate (c)]
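One way to read this diagram is that a candidate's score is accumulated over pieces of evidence, with the evidence matching model linking the query to each evidence item and the expert matching model linking each evidence item to a candidate. The Python sketch below assumes the factorization score(c, q) = Σ_e P(c | e) · P(e | q) and assumes that (t,p,r,d) stands for topic, person, relation, document; neither is spelled out on the slide. The relation weights, the mention table, and all function names are invented toy values, and the scores are unnormalized.

```python
from collections import defaultdict, namedtuple

# Assumed reading of (t,p,r,d): topic, person, relation, document.
Evidence = namedtuple("Evidence", ["t", "p", "r", "d"])

# Toy, made-up weights and mention table (not from the slides).
RELATION_WEIGHT = {"same_section": 0.2, "title_author": 0.5}
MENTION_TO_CANDIDATE = {
    "Tim Berners-Lee": "timbl",
    "Berners-Lee": "timbl",
    "Jane Doe": "jdoe",
}


def evidence_match(e, query):
    # Stand-in for P(e | q): non-zero only if the evidence topic equals the query.
    return RELATION_WEIGHT.get(e.r, 0.0) if e.t == query.lower() else 0.0


def expert_match(e):
    # Stand-in for P(c | e): maps the person mention to a candidate id.
    candidate = MENTION_TO_CANDIDATE.get(e.p)
    return {candidate: 1.0} if candidate else {}


def rank_experts(evidences, query):
    scores = defaultdict(float)
    for e in evidences:
        p_e_q = evidence_match(e, query)
        if p_e_q == 0.0:
            continue  # evidence unrelated to the query contributes nothing
        for candidate, p_c_e in expert_match(e).items():
            scores[candidate] += p_c_e * p_e_q
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


evidences = [
    Evidence("semantic web", "Tim Berners-Lee", "same_section", "doc-001"),
    Evidence("semantic web", "Berners-Lee", "title_author", "doc-002"),
    Evidence("timed text", "Jane Doe", "title_author", "doc-003"),
]
ranking = rank_experts(evidences, "Semantic Web")
```

Keeping the two matching functions separate mirrors the two boxes in the diagram, so either can be refined without touching the other.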

14 Fine-grained Expert Search -- Expert Matching
Mask               Sample
Full Name          Ritu Raj Tiwari
Email Name         rtiwari@nuance.com
Combined Name      Tiwari, Ritu R;
Abbr. Name         Ritu Raj; Ritu
Short Name         RRT
Alias, new email   rtiwari@hotmail.com ( for short)
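The masks in this table can be approximated mechanically from a full name and a known email address. The sketch below is a guess at plausible generation rules that reproduce the samples shown; the function name `name_masks` and the rules themselves are assumptions, not the authors' implementation. Aliases and new email addresses cannot be derived from the name alone, so they are left out.

```python
def name_masks(full_name, email):
    """Generate name variants resembling the masks on the slide.

    Assumes a Western-style name of at least two tokens, given names first.
    """
    parts = full_name.split()
    if len(parts) < 2:
        raise ValueError("expected at least a first and a last name")
    first, last, middle = parts[0], parts[-1], parts[1:-1]

    # "Tiwari, Ritu R": last name, first name, middle initials.
    combined = f"{last}, " + " ".join([first] + [m[0] for m in middle])
    # "Ritu Raj" and "Ritu": given names without the last name.
    given = " ".join(parts[:-1])
    abbreviated = list(dict.fromkeys([given, first]))  # drop duplicates, keep order
    # "RRT": initials of every name token.
    short = "".join(p[0] for p in parts).upper()

    return {
        "full_name": [full_name],
        "email_name": [email],
        "combined_name": [combined],
        "abbr_name": abbreviated,
        "short_name": [short],
    }


masks = name_masks("Ritu Raj Tiwari", "rtiwari@nuance.com")
```

Short masks such as initials are highly ambiguous, which is presumably why each mask type would carry its own matching confidence in a full system.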

15 Fine-grained Expert Search -- Evidence Matching
Type        Sample
Query       Semantic Web Search Engine
Phrase      “Semantic Web Search Engine”
Bi-gram     “Semantic Web” “Search Engine”
Proximity   “Semantic … Web Search Engine”
Fuzzy       “Samentic Web Saerch Engine”
Stemmed     “Semantic Web Search Engin”
Relation Type
- Same Section
- Windowed Section
- Reference Section
- Title-Author
- Section Title-Body
Quality Type
- Dynamic Quality
- Static Quality
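The five query matching types can be sketched as simple predicates over token lists. The Python below is an illustrative approximation only: the window size, the edit-distance threshold, and the crude suffix-stripping stemmer (a stand-in, not the Porter stemmer) are all assumptions, and every function name is hypothetical.

```python
import re


def tokens(text):
    return re.findall(r"\w+", text.lower())


def contains_phrase(doc, phrase):
    # True if `phrase` occurs as a contiguous token sequence in `doc`.
    n = len(phrase)
    return any(doc[i:i + n] == phrase for i in range(len(doc) - n + 1))


def bigram_match(doc, query):
    # True if any adjacent pair of query terms is adjacent in the document.
    doc_bigrams = set(zip(doc, doc[1:]))
    return any(bg in doc_bigrams for bg in zip(query, query[1:]))


def proximity_match(doc, query, window=10):
    # True if all query terms occur inside some span of `window` tokens.
    needed = set(query)
    return any(needed <= set(doc[i:i + window]) for i in range(len(doc)))


def edit_distance(a, b):
    # Standard Levenshtein distance with a rolling row.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def fuzzy_match(doc, query, max_edits=2):
    # Phrase match where each token may differ by up to `max_edits` edits.
    n = len(query)
    return any(
        all(edit_distance(d, q) <= max_edits for d, q in zip(doc[i:i + n], query))
        for i in range(len(doc) - n + 1)
    )


def crude_stem(word):
    # Toy stemmer: strips one common suffix; NOT a real stemming algorithm.
    for suffix in ("ing", "es", "s", "e"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[:-len(suffix)]
    return word


def stemmed_match(doc, query):
    return contains_phrase([crude_stem(w) for w in doc], [crude_stem(w) for w in query])


def match_types(document, query):
    doc, q = tokens(document), tokens(query)
    checks = {
        "phrase": contains_phrase,
        "bigram": bigram_match,
        "proximity": proximity_match,
        "fuzzy": fuzzy_match,
        "stemmed": stemmed_match,
    }
    return [name for name, check in checks.items() if check(doc, q)]
```

A fixed edit threshold is too permissive for very short words; a real system would likely scale it with word length.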

16 Schedule
1 Introduction
2 Fine-grained Expert Search
3 Experimental Results
4 Conclusion

17 Experimental Result
W3C Corpus
- 331,307 web pages
- 10 training topics of TREC 2005
- 50 test topics of TREC 2005
- 49 test topics of TREC 2006
Evaluation Metrics
- Mean average precision (MAP)
- R-precision (R-P)
- Top N precision (P@N)
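For readers unfamiliar with the three metrics, the following Python sketch gives their standard definitions over a ranked list of candidates and a set of relevant experts. It is a minimal illustration, not the official TREC evaluation tool, and the function names are chosen here for clarity.

```python
def precision_at_n(ranked, relevant, n):
    # Fraction of the top-n ranked candidates that are relevant.
    return sum(1 for c in ranked[:n] if c in relevant) / n


def r_precision(ranked, relevant):
    # Precision at rank R, where R is the number of relevant candidates.
    r = len(relevant)
    return precision_at_n(ranked, relevant, r) if r else 0.0


def average_precision(ranked, relevant):
    # Mean of the precision values at each rank where a relevant candidate appears,
    # divided by the total number of relevant candidates.
    hits, total = 0, 0.0
    for k, c in enumerate(ranked, 1):
        if c in relevant:
            hits += 1
            total += hits / k
    return total / len(relevant) if relevant else 0.0


def mean_average_precision(runs):
    # `runs` is a list of (ranked, relevant) pairs, one per topic.
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


ranked = ["a", "b", "c", "d"]
relevant = {"a", "c", "e"}
```

With this toy ranking, two of the three relevant experts are retrieved, and the missing one ("e") lowers average precision because the sum is divided by all three.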

18 Experimental Result: Query Matching
                   TREC 2005                 TREC 2006
                   MAP     R-P     P@10      MAP     R-P     P@10
Baseline           0.1840  0.2136  0.3060    0.3752  0.4585  0.5604
+ Bi-gram          0.1957  0.2438  0.3320    0.4140  0.4910  0.5799
+ Proximity        0.2024  0.2501  0.3360    0.4530  0.5137  0.5922
+ Fuzzy, Stemmed   0.2030  0.2501  0.3360    0.4580  0.5112  0.5901
Improv.            10.33%  17.09%  9.80%     22.07%  11.49%  5.30%
T-test             0.0084 (TREC 2005)        0.0000 (TREC 2006)
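The "Improv." figures are consistent with the relative gain of the last row over the baseline, (final - baseline) / baseline. This reading is inferred from the numbers rather than stated on the slide; the short Python check below recomputes the percentages for the query matching results under that assumption.

```python
def relative_improvement(baseline, final):
    # Relative gain in percent, rounded to two decimals.
    return round((final - baseline) / baseline * 100, 2)


# (baseline, final) pairs taken from the query matching results:
# TREC 2005 MAP, R-P, P@10, then TREC 2006 MAP, R-P, P@10.
pairs = [
    (0.1840, 0.2030), (0.2136, 0.2501), (0.3060, 0.3360),
    (0.3752, 0.4580), (0.4585, 0.5112), (0.5604, 0.5901),
]
improvements = [relative_improvement(b, f) for b, f in pairs]
```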

19 Experimental Result: Person Matching
                     TREC 2005                 TREC 2006
                     MAP     R-P     P@10      MAP     R-P     P@10
Baseline             0.2030  0.2501  0.3360    0.4580  0.5112  0.5901
+ Combined Name      0.2056  0.2539  0.3463    0.4709  0.5152  0.5931
+ Abbr. Name         0.2106  0.2545  0.3400    0.5010  0.5181  0.6000
+ Short Name         0.2111  0.2578  0.3400    0.5121  0.5192  0.6000
+ Alias, new email   0.2156  0.2591  0.3400    0.5221  0.5212  0.6000
Improv.              6.21%   3.60%   1.19%     14.00%  1.96%   1.68%
T-test               0.0064 (TREC 2005)        0.0057 (TREC 2006)

20 Experimental Result: Multiple Relations
                       TREC 2005                 TREC 2006
                       MAP     R-P     P@10      MAP     R-P     P@10
Baseline               0.2156  0.2591  0.3400    0.5221  0.5212  0.6000
+ Windowed Section     0.2158  0.2633  0.3380    0.5255  0.5311  0.6082
+ Reference Section    0.2160  0.2630  0.3380    0.5272  0.5314  0.6061
+ Title-Author         0.2234  0.2634  0.3580    0.5354  0.5355  0.6245
+ Section Title-Body   0.2586  0.3107  0.3740    0.5657  0.5669  0.6510
Improv.                19.94%  19.91%  10.00%    8.35%   8.77%   8.50%
T-test                 0.0013 (TREC 2005)        0.0043 (TREC 2006)

21 Experimental Result: Evidence Quality
                    TREC 2005                 TREC 2006
                    MAP     R-P     P@10      MAP     R-P     P@10
Baseline            0.2586  0.3107  0.3740    0.5657  0.5669  0.6510
+ Static quality    0.2711  0.3188  0.3720    0.5900  0.5813  0.6796
+ Dynamic quality   0.2755  0.3252  0.3880    0.5943  0.5877  0.7061
Improv.             6.13%   4.67%   3.74%     2.86%   3.67%   8.61%
T-test              0.0360 (TREC 2005)        0.0252 (TREC 2006)
Rank 1 @TREC        0.2749  0.3330  0.4520    0.5947  0.5783  0.7041

22 Schedule
1 Introduction
2 Fine-grained Expert Search
3 Experimental Results
4 Conclusion

23 Conclusion
- Fine-grained expert search
- Probabilistic model and its implementation
- Evaluation on the TREC data set


