Vote Calibration in Community Question-Answering Systems Bee-Chung Chen (LinkedIn), Anirban Dasgupta (Yahoo! Labs), Xuanhui Wang (Facebook), Jie Yang (Google) SIGIR 2012 This work was conducted when all authors were affiliated with Yahoo! 1
Why Present This Paper? Vote bias exists in many social media platforms. This paper tackles a problem in a relatively old context, CQA, from a new perspective: crowdsourced identification of quality content. 2
Outline Motivation Related Work Data Set Vote Calibration Model Exploratory Analysis Features Experimental Results Conclusion 3
Community Question Answering Crowdsourced alternative to search engines for providing information 4
Community Question Answering Commercial spam: can mostly be tackled by conventional machine learning. Low-quality content: difficult for machines to detect! Solution: crowdsourced identification of quality content. 5
Voting Mechanism Content quality User expertise 6
Votes in Yahoo! Answers The asker votes for the best answer; if the asker does not vote within a certain period, other users in the community vote. Thumb-up or thumb-down votes on each individual answer. However… are users’ votes always unbiased? 7
Potential Bias Voting more positively for friends’ answers; using votes to show appreciation instead of identifying high-quality content; gaming the system to obtain high status (multiple accounts voting for one another); for questions about opinions, voting for answers that share the same opinion; … 8
Potential Bias Trained human editors judged answers based on a set of well-defined guidelines. Raw user votes have low correlation with the editorial judgments. 9
Motivation Propose the problem of vote calibration in CQA systems. Based on exploratory data analysis, identify a variety of potential factors that bias the votes. Develop a model for vote calibration based on supervised learning, using a content-agnostic approach. 10
Related Work Predicting the user-voted best answer – Assumption: readily available user-voted best answers are ground truth. Predicting editorial judgments – User votes are used as features, but calibration of each individual vote has not been studied. Content-agnostic user expertise estimation. 11
Dataset Editorial data – Sample questions and answers from Yahoo! Answers – Each answer receives a quality grade according to a pre-determined set of editorial guidelines: excellent, good, fair, bad – 21,525 editorially judged answers on 7,372 questions 12
Dataset The distribution of editorial grades for best answers is not very different from that for non-best answers: low correlation between users’ best-answer votes and answer quality. A significant percentage (>70%) of best answers are not even good, and many non-best answers are actually good or excellent. 13
Dataset Numeric quality scores: excellent = 1, good = 0.5, fair = 0, bad = -0.5. Voting data: 1.3M questions, 7.0M answers, 0.5M asker best-answer votes, 2.1M community best-answer votes, 9.1M thumb-up/down votes. 14
Vote Calibration Model 15
Vote Calibration Model Three types of votes – Asker votes: best-answer votes by the asker (+1 for the best answer, -1 for the other answers) – CBA votes: community best-answer votes (+1 from a voter for the answer they vote best, -1 from that voter for the other answers) – Thumb votes: thumb-up and thumb-down (+1 for thumb-up, -1 for thumb-down) 16
Average Vote of an Answer Average of the calibrated type-t votes on an answer, smoothed by pseudo votes that act as a prior 17
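The equation on this slide did not survive extraction; the following is a hedged reconstruction from its labels ("pseudo votes", "prior", "calibrated type-t votes"), where the pseudo-vote count $n_0$, the prior value $\mu_0$, and the calibrated-vote symbol $\hat{v}$ are assumed names, not notation confirmed by the slide:

```latex
\bar{v}_t(j) \;=\; \frac{n_0\,\mu_0 \;+\; \sum_{i \in V_t(j)} \hat{v}_t(i,j)}{n_0 \;+\; |V_t(j)|}
```

Here $V_t(j)$ is the set of users who cast a type-$t$ vote on answer $j$, and $\hat{v}_t(i,j)$ is user $i$'s calibrated type-$t$ vote; the $n_0$ pseudo votes with value $\mu_0$ pull the average toward the prior when an answer has few votes.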
Average Vote of an Answerer/User 18
Quality Prediction Function Quality prediction: a weighted sum of the answer-level and user-level average vote values of all types on an answer. Calibrated vote aggregation model: bias term + answer-level terms + user-level terms 19
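The model equation on this slide was also lost to extraction; a hedged reconstruction consistent with the slide's labels (bias term, answer-level, user-level), where $b$, $\alpha_t$, $\beta_t$, and the author notation $a_j$ are assumed symbols:

```latex
\hat{q}(j) \;=\; b \;+\; \sum_{t} \alpha_t\, \bar{v}_t(j) \;+\; \sum_{t} \beta_t\, \bar{u}_t(a_j)
```

where $b$ is the bias term, $\bar{v}_t(j)$ is the answer-level average of calibrated type-$t$ votes on answer $j$, and $\bar{u}_t(a_j)$ is the corresponding user-level average for $j$'s author $a_j$; the weights $\alpha_t$ and $\beta_t$ are learned per vote type.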
Training Algorithm Determine model parameters by minimizing a loss function against the editorial quality scores, using gradient descent. 20
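A minimal sketch of this training step, assuming squared loss between the predicted quality and the editorial score and a plain linear parameterization; the learning rate, epoch count, and function name are illustrative choices, not the paper's:

```python
import numpy as np

def fit_weights(X, y, lr=0.1, epochs=2000):
    """Fit a linear quality predictor q(j) = b + w . x_j by gradient
    descent on mean squared loss. Each row x_j stacks the answer-level
    and user-level average vote values for one answer; y holds the
    editorial quality scores (e.g. excellent = 1, bad = -0.5)."""
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(epochs):
        pred = X @ w + b              # predicted quality scores
        grad = pred - y               # d(loss)/d(pred) for squared loss
        w -= lr * (X.T @ grad) / n    # gradient step on the weights
        b -= lr * grad.mean()         # gradient step on the bias term
    return w, b
```

On a toy dataset generated as y = 0.5·x + 0.1, the routine recovers the weight and bias, which is enough to illustrate the mechanics of the update loop.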
Self Voting Self votes contribute 33% of all CBA votes. Among users who cast at least 20 votes, the percentage of self votes exceeds 40%. 21
Vote Spread and Reciprocity 22
Interaction Bias A chi-squared statistic and a randomization test show that past interactions could be useful features for vote calibration. 23
Features Voter features 24
Features Relation features 25
Feature Transformation For each count feature C, consider log(1 + C) as an additional feature. For each ratio feature R, include a quadratic term R². 26
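The transformation above can be sketched in a few lines; the function name and the concatenated layout of the output vector are my assumptions for illustration:

```python
import numpy as np

def expand_features(counts, ratios):
    """Augment raw features as the slide describes: for each count
    feature C append log(1 + C); for each ratio feature R append R**2.
    Returns one flat vector: [counts, log1p(counts), ratios, ratios**2]."""
    counts = np.asarray(counts, dtype=float)
    ratios = np.asarray(ratios, dtype=float)
    return np.concatenate([counts, np.log1p(counts), ratios, ratios ** 2])
```

The log transform compresses heavy-tailed count features (e.g. number of votes cast), while the quadratic term lets a linear model capture simple curvature in ratio features.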
Experimental Results User-level expert ranking – how well we rank users based on the predicted user-level scores. Answer ranking – how well we rank answers based on the predicted answer-level scores. 27
Experimental Results 28
Comparison of Calibration Models 29
Impact on Heavy Users 30
Conclusion Introduced the vote calibration problem for CQA. Proposed a set of features to capture bias by analyzing potential biases in users’ voting behavior. Supervised calibrated models outperform their non-calibrated versions. 31
Thanks Q & A 32