
1 Intelligent Database Systems Lab N.Y.U.S.T. I. M.
A language modeling framework for expert finding
Presenter: Lin, Shu-Han
Authors: Krisztian Balog, Leif Azzopardi, Maarten de Rijke
Information Processing and Management (IPM) 45 (2009) 1–19

2 Outline: Motivation, Objective, Methodology, Experiments, Conclusion, Comments

3 Motivation
Expert finding: finding experts on a given topic.
Yellow Pages:
- Profiles: employees self-assess their skills.
- Keywords, e.g., marketing.
Problems:
- Information: outdated.
- Keywords: restrictive.

4 Objectives
Within the organization:
- Mine published intranet documents.
- Search for all kinds of expertise.
Example query: 'Who are the experts on the topic "Internet marketing and internet advertising" in my organization?'

5 Methodology – Overview
Goal: capture the association between a candidate expert and an area of expertise, i.e., answer "What is the probability of a candidate ca being an expert given the query topic q?"
By Bayes' Theorem, p(ca | q) = p(q | ca) · p(ca) / p(q). Here p(q) is constant for a given query and p(ca) is taken to be uniform, so candidates can be ranked by p(q | ca).
- Model 1: candidate-based (query-independent) approach. Idea: build a profile of each candidate expert, then rank the candidates against the query.
- Model 2: document-based (query-dependent) approach. Idea: find the query-relevant documents, then associate them with experts.

6 Methodology – Model 1
Build a textual representation (model) θ_ca of a person's knowledge from the documents associated with that person. Then estimate the probability of the query given the candidate's model, as a product over the query terms:
p(Internet Marketing | θ_ca) = p("Internet" | θ_ca) · p("Marketing" | θ_ca)
e.g., p(Internet marketing and internet advertising | θ_ca) = p("Internet" | θ_ca)^2 · p("Marketing" | θ_ca) · p("and" | θ_ca) · p("Advertising" | θ_ca)
Formula annotations: (smoothed), (weighted).
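The Model 1 scoring on this slide can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the candidate model is a weighted mixture of document term probabilities, smoothed by linear interpolation with a collection-wide term probability. The function names, the toy documents, and the value `lam = 0.5` are all hypothetical.

```python
import math
from collections import Counter

def candidate_model(docs, assoc, collection, lam=0.5):
    """Return a function t -> p(t | theta_ca) for one candidate.

    docs:       {doc_id: list of tokens}
    assoc:      {doc_id: weight}, document-candidate association weights (assumed given)
    collection: Counter of term counts over the whole collection
    lam:        smoothing weight (assumed value)
    """
    total = sum(collection.values())

    def p(term):
        # Weighted sum over the candidate's documents of the
        # maximum-likelihood estimate p(t | d).
        p_t_ca = sum(
            weight * docs[d].count(term) / len(docs[d])
            for d, weight in assoc.items()
        )
        # Smooth with the collection-wide term probability.
        return (1 - lam) * p_t_ca + lam * collection[term] / total

    return p

def log_query_likelihood(query_tokens, p):
    """log p(q | theta_ca): each distinct term contributes n(t, q) * log p(t | theta_ca).

    Raises ValueError (log of zero) for a term that never occurs in the collection.
    """
    return sum(n * math.log(p(t)) for t, n in Counter(query_tokens).items())

# Toy collection (hypothetical data).
docs = {
    "d1": "internet marketing and internet advertising".split(),
    "d2": "database systems and indexing".split(),
}
collection = Counter(t for tokens in docs.values() for t in tokens)
john = candidate_model(docs, {"d1": 1.0}, collection)
mary = candidate_model(docs, {"d2": 1.0}, collection)
query = "internet marketing".split()
```

Working in log space avoids underflow for long queries. A repeated query term such as "internet" enters with its count as the multiplier, which corresponds to the exponent 2 in the slide's example.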

7 Methodology – Model 1B
Estimate p(t | d, ca) from the text around the candidate's mentions in the document:
- Candidate identifier
- Window size (w)
e.g., p("Internet" | "Mail.No.43", "John")
… John (john@gmail.com) is a major in marketing. …
… john@gmail.com … ( ) is a major in marketing. …
Note: the closer a term is to the candidate identifier, the more weight it carries. (weighted)
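A window-based estimate of p(t | d, ca) might look like the sketch below. It assumes a hard window: every term within `w` tokens of a candidate identifier counts equally and everything outside is ignored, which is simpler than the distance-sensitive weighting the slide hints at. The function names and the tokenized example are hypothetical.

```python
from collections import Counter

def window_term_counts(tokens, identifiers, w):
    """Count terms that fall within w tokens on either side of each
    occurrence of a candidate identifier. Terms covered by several
    overlapping windows are counted once per window."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok in identifiers:
            lo, hi = max(0, i - w), min(len(tokens), i + w + 1)
            for j in range(lo, hi):
                # Skip the identifier itself and any other identifier tokens.
                if j != i and tokens[j] not in identifiers:
                    counts[tokens[j]] += 1
    return counts

def p_term_given_doc_candidate(term, tokens, identifiers, w):
    """Maximum-likelihood estimate of p(t | d, ca) from the window counts."""
    counts = window_term_counts(tokens, identifiers, w)
    total = sum(counts.values())
    return counts[term] / total if total else 0.0

# Hypothetical tokenization of the example sentence on the slide.
tokens = "john john@gmail.com is a major in marketing".split()
identifiers = {"john", "john@gmail.com"}
```

With `w = 2` the word "marketing" lies outside every window and gets probability 0. With `w = 5` it falls inside the window around the e-mail address and gets a non-zero probability.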

8 Methodology – Model 2 (smoothed)
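The document-based idea of Model 2 (score documents against the query, then pass that evidence on to the associated candidates) can be sketched as below. The sketch assumes the formulation p(q | ca) = Σ_d p(q | θ_d) · p(d | ca), with each document model smoothed by linear interpolation with the collection. The function names, the toy data, and `lam = 0.5` are hypothetical.

```python
from collections import Counter

def query_likelihood(query_tokens, doc_tokens, collection, lam=0.5):
    """p(q | theta_d): product over query tokens of the smoothed p(t | theta_d)."""
    total = sum(collection.values())
    doc_counts = Counter(doc_tokens)
    score = 1.0
    for t in query_tokens:  # repeated tokens multiply in once per occurrence
        p_t_d = doc_counts[t] / len(doc_tokens)
        score *= (1 - lam) * p_t_d + lam * collection[t] / total
    return score

def model2_score(query_tokens, docs, assoc, collection, lam=0.5):
    """Sum document scores, each weighted by its association with the candidate.

    assoc: {doc_id: weight}, document-candidate association weights (assumed given)
    """
    return sum(
        query_likelihood(query_tokens, docs[d], collection, lam) * weight
        for d, weight in assoc.items()
    )

# Toy collection (hypothetical data).
docs = {
    "d1": "internet marketing and internet advertising".split(),
    "d2": "database systems and indexing".split(),
}
collection = Counter(t for tokens in docs.values() for t in tokens)
query = "internet marketing".split()
john_score = model2_score(query, docs, {"d1": 1.0}, collection)
mary_score = model2_score(query, docs, {"d2": 1.0}, collection)
```

Because the per-document scores depend only on the query and the document, they can come from an ordinary document retrieval run. The candidate enters only through the association weights.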

9 Methodology – Model 2B (Model 2, Model 2B)

10 Methodology – document-candidate associations
- Boolean model
- TF-IDF
Annotations: (document importance), (senior member of organization)
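The two association schemes named on this slide could be sketched as follows. The slide's own formulas are not preserved, so the TF-IDF variant here is only one plausible weighting: candidates are treated like terms, with a relative mention frequency inside the document and a log inverse document frequency across documents. The function names and the `mentions` data are hypothetical.

```python
import math

def boolean_association(mentions, doc_id, cand):
    """1.0 if the candidate is mentioned in the document at all, else 0.0."""
    return 1.0 if mentions.get(doc_id, {}).get(cand, 0) > 0 else 0.0

def tfidf_association(mentions, doc_id, cand):
    """Frequency-based weight: relative mention frequency in the document
    times log(number of documents / documents mentioning the candidate)."""
    in_doc = mentions.get(doc_id, {})
    n = in_doc.get(cand, 0)
    if n == 0:
        return 0.0
    tf = n / sum(in_doc.values())
    df = sum(1 for m in mentions.values() if m.get(cand, 0) > 0)
    # idf is 0 for a candidate mentioned in every document.
    idf = math.log(len(mentions) / df)
    return tf * idf

# mentions[doc_id][candidate] = number of times the candidate occurs (hypothetical data).
mentions = {
    "d1": {"john": 3, "mary": 1},
    "d2": {"mary": 2},
    "d3": {"mary": 1},
}
```

In this toy data the Boolean scheme ties John and Mary on `d1`, while the frequency-based scheme favors John, who is mentioned more often there and in fewer documents overall.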

11 Experiments
Evaluation measures:
- MAP (mean average precision)
- MRR (mean reciprocal rank), e.g., (1/3 + 1/2 + 1)/3 = 11/18
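Both measures are easy to compute by hand for small cases. The sketch below uses exact fractions and reproduces the slide's MRR arithmetic, assuming it describes three queries whose first correct expert appears at ranks 3, 2, and 1. The function names are hypothetical.

```python
from fractions import Fraction

def mean_reciprocal_rank(first_relevant_ranks):
    """Average of 1/rank of the first relevant result, one rank per query."""
    return sum(Fraction(1, r) for r in first_relevant_ranks) / len(first_relevant_ranks)

def average_precision(ranked, relevant):
    """Mean of precision@k taken at each rank k that holds a relevant item."""
    hits, total = 0, Fraction(0)
    for k, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += Fraction(hits, k)
    return total / len(relevant) if relevant else Fraction(0)

def mean_average_precision(runs):
    """runs: list of (ranked list, set of relevant items), one pair per query."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)

mrr = mean_reciprocal_rank([3, 2, 1])                # Fraction(11, 18)
ap = average_precision(["a", "b", "c"], {"a", "c"})  # (1/1 + 2/3) / 2 = 5/6
```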

12 Experiments: Model 1 vs. Model 2; window-based models

13 Experiments: association methods; parameter sensitivity

14 Conclusions
Model 1: build a profile of each candidate expert, then rank the candidates against the query.
Model 2: find the query-relevant documents, then associate them with experts.
Model 2 is to be preferred over Model 1:
- Effectiveness: better in terms of average precision and reciprocal rank.
- Implementation: requires only a regular document index.
Window-based extensions improved:
- Effectiveness: especially on top of Model 1.
Frequency-based (TF-IDF) document-candidate associations are helpful.

15 Comments
Advantage
- Integrates ideas
Drawback
- …
Application
- …

