Analyzing and Predicting Question Quality in Community Question Answering Services Baichuan Li, Tan Jin, Michael R. Lyu, Irwin King, and Barley Mak CQA2012,


1 Analyzing and Predicting Question Quality in Community Question Answering Services Baichuan Li, Tan Jin, Michael R. Lyu, Irwin King, and Barley Mak CQA2012, Lyon, France

2 Introduction 22/2/2016

3 Question Quality
Number of tag-of-interests
Number of answers

4 Motivation
Question quality affects answer quality
  – Low-quality questions hinder CQA services
  – High-quality questions promote the development of the community
Identifying question quality facilitates question search and recommendation

5 Outline
Problem Definition
Data
Two Studies
  – Factors Affecting Question Quality
  – Prediction of Question Quality
Discussion and Conclusion

6 Problem Definition
Figure 1. Construct of question quality in CQA

7 Data Description
Table 1. Summary of data in the Entertainment & Music category and its subcategories

8 Ground Truth
Table 2. Rule base for the ground truth setting (rows: RM level; columns: NTA level; cells: question quality level)
RM \ NTA   4  3  2  1
4          4  4  3  2
3          4  3  3  2
2          3  3  2  1
1          2  2  1  1
Table 3. Summary of questions in four levels
Level   1       2       3       4
Count   53,806  62,192  69,836  52,715
NTA: number of tag-of-interests + number of answers
RM: reciprocal of the minutes for getting the best answer
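The rule base in Table 2 can be read as a simple lookup from an (NTA level, RM level) pair to a quality level. The sketch below is only an illustration: the names `RULE_BASE`, `quality_level`, and `reciprocal_minutes` are hypothetical, and the slides do not say how raw NTA and RM values are binned into levels 1 to 4, so the function takes already-discretized levels as input. The digits in the table are symmetric, so the result does not depend on which axis is taken as NTA and which as RM.

```python
# Hypothetical sketch of the Table 2 rule base; names are illustrative only.
# RULE_BASE[rm_level][nta_level] -> question quality level (1 = lowest, 4 = highest).
RULE_BASE = {
    4: {4: 4, 3: 4, 2: 3, 1: 2},
    3: {4: 4, 3: 3, 2: 3, 1: 2},
    2: {4: 3, 3: 3, 2: 2, 1: 1},
    1: {4: 2, 3: 2, 2: 1, 1: 1},
}


def reciprocal_minutes(minutes_to_best_answer: float) -> float:
    """RM as defined on the slide: reciprocal of the minutes to the best answer."""
    if minutes_to_best_answer <= 0:
        raise ValueError("minutes_to_best_answer must be positive")
    return 1.0 / minutes_to_best_answer


def quality_level(nta_level: int, rm_level: int) -> int:
    """Look up the ground-truth quality level for discretized NTA and RM levels.

    Assumption: both inputs are already binned into 1..4; the binning
    thresholds are not given in the slides.
    """
    return RULE_BASE[rm_level][nta_level]


if __name__ == "__main__":
    print(quality_level(nta_level=4, rm_level=3))
    print(quality_level(nta_level=1, rm_level=2))
```

A question with many answers and interested users but a slow best answer (high NTA level, low RM level) lands in a middle quality level under this rule base.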

9 Study One: Factors Affecting Question Quality
Possible Factors
  – Askers
  – Topics
Process
  – Select the two most popular subcategories (say, Music and Movies) and check their distributions of question quality
  – Track askers with at least five questions in both of these subcategories

10 Observations
Table 4. Summary of question quality for different askers

11 Observations
Question Quality

12 Study Two: Prediction of Question Quality
Modeling the relationships among questions, topics, and askers as a bipartite graph
[Figure labels: Asking Expertise, Question Quality]

13 Mutual Reinforcement Label Propagation (MRLP) for Predicting Question Quality
[Figure: graph in which unlabeled nodes are marked "?"]

14 MRLP
[Diagram labels: similar users’ asking expertise, question quality, asking expertise, similar questions’ quality]

15 Data for Study Two

16 Methods Comparison
Logistic Regression
  – LG_Q and LG_QA
Stochastic Gradient Boosted Tree (Friedman, J. H., 1999)
  – SGBT_Q and SGBT_QA
Harmonic Function (Zhou et al., 2007)
  – HF_Q and HF_QA

17 Experimental Results: Accuracy

18 Sensitivity & Specificity
Sensitivity measures the algorithm’s ability to identify high-quality questions
  Sensitivity = TP / (TP + FN)
Specificity measures the algorithm’s ability to identify low-quality questions
  Specificity = TN / (TN + FP)
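The two formulas above can be computed directly from true and predicted labels. The following is a minimal sketch, assuming a binary setting in which high-quality questions are the positive class (label 1) and low-quality questions the negative class (label 0); the function name `sensitivity_specificity` and the toy labels are illustrative and not taken from the presentation.

```python
import math


def sensitivity_specificity(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity) for binary labels.

    Sensitivity = TP / (TP + FN); Specificity = TN / (TN + FP).
    Assumption: `positive` marks high-quality questions.
    """
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1  # high-quality question correctly identified
            else:
                fn += 1  # high-quality question missed
        else:
            if pred == positive:
                fp += 1  # low-quality question flagged as high quality
            else:
                tn += 1  # low-quality question correctly identified
    sensitivity = tp / (tp + fn) if (tp + fn) else math.nan
    specificity = tn / (tn + fp) if (tn + fp) else math.nan
    return sensitivity, specificity


if __name__ == "__main__":
    # Toy labels: three high-quality and two low-quality questions.
    y_true = [1, 1, 1, 0, 0]
    y_pred = [1, 0, 1, 0, 1]
    print(sensitivity_specificity(y_true, y_pred))
```

With these toy labels, TP = 2, FN = 1, TN = 1, and FP = 1, so sensitivity is 2/3 and specificity is 1/2.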

19 Experimental Results: Music

20 Experimental Results: Movies

21 Discussion
MRLP is more effective than state-of-the-art methods at distinguishing high-quality questions from low-quality ones
At present, neither MRLP nor the other methods achieve satisfactory performance, because of the features used

22 Discussion
Salient features?
  – User study via crowdsourcing systems

23 Conclusion
Define Question Quality in CQA
Conduct two studies to investigate question quality in CQA services
  – Analyze the factors influencing question quality
  – Propose a mutual reinforcement-based label propagation algorithm to predict question quality
Future Work
  – Explore more salient features
  – Utilize question quality to improve question search and question recommendation

24 Thank You! Q&A

25 Data Description
238,549 resolved questions under the Entertainment & Music category of Yahoo! Answers
Question Features
  – Text, post time, etc.
Asker Features
  – Total points, No. of questions asked, No. of questions resolved, etc.

26 MRLP
For the question part of the bipartite graph, we create edges between any two questions within the same topic, which gives an n × n probabilistic transition matrix.
For the asker part of the bipartite graph, we generate the probabilistic transition matrix M similarly.
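One way to picture the question-side construction is sketched below. The edge weights used on the slide are not preserved in this transcript, so the sketch assumes unit weights between any two distinct questions that share a topic, followed by row normalization so that each row sums to 1. The function name `question_transition_matrix` and the self-loop given to a question that is alone in its topic are assumptions made here to keep every row a valid probability distribution; they are not taken from the presentation.

```python
def question_transition_matrix(topics):
    """Build an n x n row-stochastic matrix from per-question topic labels.

    Assumptions (not from the slides): unit edge weight between two distinct
    questions in the same topic, and a self-loop for a question that has no
    same-topic neighbor so that its row still sums to 1.
    """
    n = len(topics)
    matrix = []
    for i in range(n):
        # Connect question i to every other question with the same topic.
        weights = [
            1.0 if i != j and topics[i] == topics[j] else 0.0
            for j in range(n)
        ]
        total = sum(weights)
        if total == 0.0:
            # Isolated question: fall back to a self-loop (assumption).
            weights[i] = 1.0
            total = 1.0
        matrix.append([w / total for w in weights])
    return matrix


if __name__ == "__main__":
    topics = ["music", "music", "movies", "music"]
    for row in question_transition_matrix(topics):
        print(row)
```

The asker-side matrix M could be built the same way from a grouping over askers, although the slide does not state which similarity links askers to one another.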

