Recommending Forum Posts to Designated Experts


Recommending Forum Posts to Designated Experts. Jason Hyun Duk Cho (1, 3), Yanen Li (2), Roxana Girju (1), Chengxiang Zhai (1). (1) Department of Computer Science, University of Illinois at Urbana-Champaign; (2) LinkedIn; (3) @WalmartLabs

Designated Experts in Online Domains: Expert participation in forums is rising. Examples include education (Coursera, Piazza), health (MedHelp), and legal (ask-a-lawyer) sites.

Designated Experts in Online Domains: 5.2 Million Students over 10,000 courses. The first example is Coursera. Here, instructors or fellow students answer the questions that students ask. Notice that not all questions are answered.

Designated Experts in Online Domains: 12 million visitors per month. MedHelp has ‘Ask A Doctor’ forums where doctors respond to patients’ questions. Here, a patient asks a doctor what a ‘rapid cycler’ is.

Designated Experts in Online Domains: Expert participation in forums is rising. Examples include education (Coursera, Piazza), health (MedHelp), and legal (ask-a-lawyer) sites. We call people who have credentials ‘Designated Experts.’

Problem: The number of users and questions overwhelms experts! 62.1% of online forums benefit from medical experts, but only 4.7% had responses from experts [1]. Coursera has 5.2M students over 532 courses [2], so each course has an average of about 10,000 students.

Solutions: (1) Hire more designated experts. This is not very realistic and offers lower returns. (2) Model designated experts’ behaviors and route to them the questions that they are most likely to answer.

Approach: (1) Utilize an existing framework. We used a matrix factorization framework that combines collaborative filtering with encoded user and document information. (2) Explore experts’ behaviors to improve recommendation performance. More on the second point in the next slide…

Designated Expert Behavior: Most forum posts had either zero or one designated expert response! This data was taken from MedHelp.

Outline: Experimental setup; Expert-document modeling; Document-word modeling; Expert-word modeling; Expert similarities; Expert behavior constraints; Analysis

Experimental Setup: We used a matrix factorization framework to run the experiments, combining collective matrix factorization with regularization. We used MedHelp’s Ask A Doctor forum for evaluation: 56,194 threads across 18 forum categories, with 168 designated experts. We used stochastic gradient descent, which is parallelizable and can therefore be used on big data.

Framework: We used matrix factorization to model the problem. Matrices X and Y are low-rank matrices, and the goal is to infer k latent features. Such problems are often solved using stochastic gradient descent (SGD) or least squares.
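To make the framework concrete, here is a minimal sketch of matrix factorization fitted by SGD. It is not the authors' implementation: the function names (`factorize`, `squared_error`), the plain squared-error loss with L2 regularization, and the hyperparameters `lr`, `reg`, and `epochs` are all assumptions chosen for illustration.

```python
import random

def squared_error(R, X, Y):
    """Sum of squared differences between R and the product X Y^T."""
    total = 0.0
    for i, row in enumerate(R):
        for j, r in enumerate(row):
            pred = sum(a * b for a, b in zip(X[i], Y[j]))
            total += (r - pred) ** 2
    return total

def factorize(R, k=2, epochs=500, lr=0.05, reg=0.01, seed=0):
    """Fit R ~ X Y^T with k latent features by SGD over every cell of R.

    Hypothetical helper: loss and hyperparameters are illustrative only.
    """
    rng = random.Random(seed)
    n_rows, n_cols = len(R), len(R[0])
    # Small random initial factors: one k-vector per row and per column.
    X = [[rng.uniform(0.0, 0.1) for _ in range(k)] for _ in range(n_rows)]
    Y = [[rng.uniform(0.0, 0.1) for _ in range(k)] for _ in range(n_cols)]
    cells = [(i, j) for i in range(n_rows) for j in range(n_cols)]
    for _ in range(epochs):
        rng.shuffle(cells)
        for i, j in cells:
            pred = sum(a * b for a, b in zip(X[i], Y[j]))
            err = R[i][j] - pred
            for f in range(k):
                x_if, y_jf = X[i][f], Y[j][f]
                # Gradient step on the regularized squared error.
                X[i][f] += lr * (err * y_jf - reg * x_if)
                Y[j][f] += lr * (err * x_if - reg * y_jf)
    return X, Y

R = [[1.0, 0.0], [0.0, 1.0]]
X, Y = factorize(R)
print(squared_error(R, X, Y))
```

Each update touches only one row of X and one row of Y, which is the property that makes SGD straightforward to parallelize.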

Expert-Document Model: This models the document-expert matrix. Matrix C gives the weight of a given row and column, Sim(U,P) is cosine similarity, and R is the feedback matrix. An entry of R is set to 1 if an expert responded to a thread, and 0 otherwise.
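The slide's equation did not survive in this transcript, so the following is only one plausible reading of the pieces it names: a weighted squared error between the binary feedback R and the cosine similarity of latent vectors. The exact form of the objective, the orientation (rows as experts, columns as threads), and the helper names are assumptions.

```python
import math

def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def feedback_matrix(responses, n_experts, n_threads):
    """R[e][t] = 1 if expert e responded to thread t, else 0."""
    R = [[0] * n_threads for _ in range(n_experts)]
    for expert, thread in responses:
        R[expert][thread] = 1
    return R

def weighted_loss(R, C, U, P):
    """Assumed objective: sum of C[i][j] * (R[i][j] - Sim(U[i], P[j]))^2."""
    return sum(
        C[i][j] * (R[i][j] - cosine(U[i], P[j])) ** 2
        for i in range(len(R))
        for j in range(len(R[0]))
    )

R = feedback_matrix([(0, 1)], n_experts=2, n_threads=2)
C = [[1.0, 1.0], [1.0, 1.0]]
U = [[1.0, 0.0], [0.0, 1.0]]   # hypothetical expert latent vectors
P = [[0.0, 1.0], [1.0, 0.0]]   # hypothetical thread latent vectors
print(weighted_loss(R, C, U, P))
```

Lowering an entry of C reduces how much a mismatch in that cell contributes, which is the usual reason for weighting unobserved (zero) feedback less than observed responses.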

Document-Words: Encode the documents’ words in the objective function. Matrix D is modeled using TF-IDF weighting.
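As an illustration of TF-IDF weighting, the sketch below builds a small document-word matrix. The slides do not say which TF-IDF variant was used; raw term frequency times log(N / document frequency) is assumed here, and `tfidf_matrix` is a hypothetical helper name.

```python
import math
from collections import Counter

def tfidf_matrix(docs):
    """Build a document-word matrix from tokenized documents.

    Assumed variant: weight = raw term count * log(N / document frequency).
    Returns (vocab, rows) with one row per document, one column per word.
    """
    vocab = sorted({w for d in docs for w in d})
    n_docs = len(docs)
    # Document frequency: number of documents containing each word.
    df = Counter(w for d in docs for w in set(d))
    rows = []
    for d in docs:
        tf = Counter(d)
        rows.append([tf[w] * math.log(n_docs / df[w]) for w in vocab])
    return vocab, rows

docs = [["rapid", "cycler", "bipolar"], ["bipolar", "mood"]]
vocab, D = tfidf_matrix(docs)
print(vocab)
print(D)
```

Words that occur in every document (here "bipolar") get weight zero under this variant, so they do not help distinguish one thread from another.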

Evaluation Results: This model did not perform well because it does not capture experts’ preferences. Can we improve performance by adding words that experts prefer?

Expert-Words: Encode the experts’ words in the objective function. Matrix E is modeled using TF-IDF weighting.

Evaluation Results: This performs significantly better than the previous model. Encoding both documents and expert profiles helps tremendously. Can we do better?

Expert Similarity: There is not much collaborative filtering going on, because the vast majority of posts have only one expert response. We encode expert-expert similarity to mitigate this issue. Cosine similarity is used for matrix S.
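A minimal sketch of how an expert-expert similarity matrix S could be computed with cosine similarity. It assumes each expert is represented by a row vector of word weights (for example a row of the expert-word matrix); the slides do not state which representation was used, and the function names are illustrative.

```python
import math

def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def similarity_matrix(E):
    """S[i][j] = cosine similarity between expert profiles E[i] and E[j]."""
    n = len(E)
    return [[cosine(E[i], E[j]) for j in range(n)] for i in range(n)]

# Hypothetical expert profiles over a two-word vocabulary.
E = [[1.0, 0.0], [2.0, 0.0], [0.0, 3.0]]
S = similarity_matrix(E)
print(S)
```

Experts 0 and 1 use the same words in different amounts and so are maximally similar, while expert 2 shares no words with them; this lets a response by one expert inform predictions for similar experts even when threads rarely have more than one responder.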

Evaluation Results: Performs somewhat better. Can we explicitly encode experts’ behavior?

Constraint 1 – Propensity to answer: Some experts respond more than others. We should capture these characteristics.

Constraint 2 – One expert per thread: Once an expert has answered a forum post, another expert is highly unlikely to respond to it. We still try to give a response to each forum post.
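Both behaviors can be measured directly from a binary feedback matrix before they are encoded as constraints. The sketch below is only a descriptive check, not the constraint formulation from the slides; it assumes rows are experts and columns are threads, and the helper names are hypothetical.

```python
def expert_propensity(R):
    """Fraction of threads each expert (row) responded to."""
    return [sum(row) / len(row) for row in R]

def responses_per_thread(R):
    """Number of expert responses for each thread (column)."""
    return [sum(col) for col in zip(*R)]

def share_with_at_most_one(R):
    """Fraction of threads with zero or one expert response."""
    counts = responses_per_thread(R)
    return sum(1 for c in counts if c <= 1) / len(counts)

# Toy feedback matrix: 3 experts, 4 threads.
R = [
    [1, 1, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 1],
]
print(expert_propensity(R))
print(responses_per_thread(R))
print(share_with_at_most_one(R))
```

A skewed propensity list motivates Constraint 1, and a high share of threads with at most one response motivates Constraint 2.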

Evaluation Results: Adding the constraints improved performance significantly.

Objective Function: There are many parameters to tune. How sensitive is the algorithm to different parameter settings?

Sensitivity Analysis

Sensitivity Analysis: Other than for the expert-word matrices, the algorithm was not very sensitive to parameters.

Impact of Data Size: The study was conducted across all 18 forum categories. Circles indicate cases where the combined method performed better; it appears to perform better consistently. Note that we held the parameters constant throughout the experiment.

Impact of Data Size: In all cases, combining all the methods yielded the best performance in terms of MAP. MAP was chosen because it is a standard metric.

Conclusion: We introduced a new type of expert called the Designated Expert; websites grant these users their expert status. By utilizing the experts’ behavior, we can improve recommendation performance. Our proposed algorithm was sensitive neither to parameters nor to data size.

Future Work: We would like to apply our algorithm to other domains, such as Coursera. Modeling the interaction between experts and average users may also be of interest. We used a bag-of-words model; we would also like to model the semantics of how experts talk and see whether it differs from that of average users.

Acknowledgements We would like to thank the anonymous reviewers for their helpful comments. We would like to thank @WalmartLabs for partially funding this research.

Q&A Thank you!

Appendix 0: P@K, Mean Average Precision (MAP), Mean Reciprocal Rank (MRR)
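For reference, the three metrics can be sketched as follows, using their standard definitions. Here `ranked` is a recommendation list (best first) and `relevant` is the set of correct items for one query; the function names and the toy data are illustrative assumptions.

```python
def precision_at_k(ranked, relevant, k):
    """P@K: fraction of the top-k ranked items that are relevant."""
    return sum(1 for item in ranked[:k] if item in relevant) / k

def average_precision(ranked, relevant):
    """Mean of the precision values at each rank where a relevant item appears."""
    hits, total = 0, 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0

def reciprocal_rank(ranked, relevant):
    """1 / rank of the first relevant item, or 0.0 if none is found."""
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0

def mean_average_precision(runs):
    """MAP over a list of (ranked, relevant) pairs."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)

def mean_reciprocal_rank(runs):
    """MRR over a list of (ranked, relevant) pairs."""
    return sum(reciprocal_rank(r, rel) for r, rel in runs) / len(runs)

runs = [
    (["a", "b", "c", "d"], {"b", "d"}),
    (["x", "y"], {"x"}),
]
print(precision_at_k(runs[0][0], runs[0][1], 2))
print(mean_average_precision(runs))
print(mean_reciprocal_rank(runs))
```

In the expert-routing setting, a query would be an expert (or a thread) and the relevant set would be the threads that expert actually answered.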

Appendix 1: Stochastic Gradient Descent for inference

Appendix 2: In-depth explanation of the constraints. One expert per thread: enforcing this as a strict rule is an NP-complete problem, so we change the formulation from an integer linear program (ILP) to a linear program (LP).