Exploiting Social Context for Review Quality Prediction Yue Lu University of Illinois at Urbana-Champaign Panayiotis Tsaparas Microsoft Research Alexandros.


1 Exploiting Social Context for Review Quality Prediction Yue Lu University of Illinois at Urbana-Champaign Panayiotis Tsaparas Microsoft Research Alexandros Ntoulas Microsoft Research Livia Polanyi Microsoft April 28, WWW’2010 Raleigh, NC

2 Why do we care about Predicting Review Quality? User reviews (1764). User “helpfulness” votes help prioritize reading. But not all reviews have votes: 1. New reviews 2. Reviews aggregated from multiple sources

3 What has been done? Treated as a classification or regression problem [Zhang&Varadarajan `06] [Kim et al. `06] [Liu et al. `08] [Ghose&Ipeirotis `10]. Features used: Textual features, Meta-data features. [Figure: Labeled reviews (√ / ×) and Unlabeled reviews (?)]

4 Reviews are NOT Stand-Alone Documents. We also observe… Social Context = Reviewer Identity + Social Network. Our Work: Exploiting Social Context for Review Quality Prediction

5 Roadmap: Motivation; Review Quality Prediction Algorithms; Experimental Evaluation; Conclusions

6 Text-only Baseline. FeatureVector(review) = Textual Features: Text Statistics (NumSent, NumTokens, SentLen, CapRatio, UniqWordRatio); Syntactic (POS:RB, POS:PP, POS:V, POS:CD, POS:JJ, POS:NN, POS:SYM, POS:COM, POS:FW); Conformity (KLDiv); Sentiment (SentiPositive, SentiNegative)
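
The text-statistics group of features could be computed roughly as follows. This is only a minimal sketch: the slide lists the feature names but does not define them, so the sentence splitter, the tokenizer, and the exact definitions of `SentLen`, `CapRatio` and `UniqWordRatio` below are assumptions, and `text_statistics` is a hypothetical helper name.

```python
import re


def text_statistics(review: str) -> dict:
    """Compute illustrative text-statistics features for one review.

    All definitions here are assumptions; the slide only names the features.
    """
    # Naive sentence split: runs of . ! ? followed by whitespace or end of text.
    sentences = [s for s in re.split(r"[.!?]+(?:\s+|$)", review) if s.strip()]
    # Naive tokenizer: runs of letters, digits and apostrophes.
    tokens = re.findall(r"[A-Za-z0-9']+", review)
    num_sent = len(sentences)
    num_tokens = len(tokens)
    return {
        "NumSent": num_sent,
        "NumTokens": num_tokens,
        # Average sentence length, measured in tokens.
        "SentLen": num_tokens / num_sent if num_sent else 0.0,
        # Fraction of tokens whose first character is upper case.
        "CapRatio": sum(t[0].isupper() for t in tokens) / num_tokens if num_tokens else 0.0,
        # Fraction of distinct tokens, ignoring case.
        "UniqWordRatio": len({t.lower() for t in tokens}) / num_tokens if num_tokens else 0.0,
    }
```

The syntactic, conformity and sentiment groups would need a POS tagger, a language model and a sentiment lexicon respectively, which are left out of this sketch.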

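7 Base Model: Linear Regression. Quality(review) = Weights × FeatureVector(review), i.e. $\mathrm{Quality}(x_i) = w^\top x_i$. The weights minimize the squared error over the labeled reviews: $w = \arg\min_w L(w) = \arg\min_w \left\{ \frac{1}{n_\ell} \sum_{i=1}^{n_\ell} (w^\top x_i - q_i)^2 \right\}$, where $x_i$ is the feature vector and $q_i$ the gold-standard quality of labeled review $i$. Closed-form: $w = \left( \sum_{i=1}^{n_\ell} x_i x_i^\top \right)^{-1} \sum_{i=1}^{n_\ell} q_i x_i$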
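
The base model on this slide can be sketched with NumPy. The sketch assumes one review per row of `X` and a vector `q` of gold-standard quality scores; the function names are hypothetical. `np.linalg.lstsq` solves the same least-squares problem as the closed form, without forming an explicit matrix inverse.

```python
import numpy as np


def fit_linear_regression(X: np.ndarray, q: np.ndarray) -> np.ndarray:
    """Return weights w minimizing the squared error ||X w - q||^2.

    X: (n_labeled, n_features) feature matrix, one review per row.
    q: (n_labeled,) gold-standard quality scores.
    """
    # Equivalent to solving the normal equations (X^T X) w = X^T q,
    # but numerically more stable than inverting X^T X directly.
    w, *_ = np.linalg.lstsq(X, q, rcond=None)
    return w


def predict_quality(X: np.ndarray, w: np.ndarray) -> np.ndarray:
    """Quality(review) = Weights x FeatureVector(review), for every row of X."""
    return X @ w
```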

8 Straightforward Approach: Adding Social Context as Features. FeatureVector(review) = Textual Features + Social Context Features: Reviewer History (NumReview, AvgRating); Social Network (InDegree, OutDegree, PageRank). Disadvantages: Social context features not always available (Anonymous reviews? A new reviewer?); Need more training data

9 Our Approach: Social Context as Constraints. Our Intuitions: Quality(review) is related to its Reviewer Identity; Quality(reviewer) is related to its Social Network. How to combine such intuitions with Textual info?

10 Formally: Graph-based Regularizers. w = argmin { Baseline Loss function + β × Graph Regularizer }, where β is the trade-off parameter and the Graph Regularizer is designed to “favor” our intuitions. Advantages: Semi-supervised: makes use of unlabeled data (both Labeled and Unlabeled reviews enter the regularizer); Applicable to reviews without social context. We will define four regularizers based on four hypotheses.

11 1. Reviewer Consistency Hypothesis: $\mathrm{Quality}(r_i) \sim \mathrm{Quality}(r_j)$ when reviews $r_i$ and $r_j$ are written by the same reviewer. Reviewers are consistent!

12 Regularizer for Reviewer Consistency. Reviewer Regularizer $= \sum_{(i,j)} [\mathrm{Quality}(r_i) - \mathrm{Quality}(r_j)]^2$, summed over all data (train + test) for all pairs of reviews connected in the Same-Author Graph (A). Closed-form solution! w is expressed in terms of the Graph Laplacian and the Review-Feature Matrix. [Zhou et al. 03] [Zhu et al. 03] [Belkin et al 06]
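
One way such a closed form can arise is sketched below. The assumptions are mine, not the slide's: the baseline loss is the mean squared error over labeled reviews, and the regularizer sums over unordered same-author pairs, so it equals w^T X^T L X w with Laplacian L = D - A. Setting the gradient to zero then gives (X_l^T X_l + β n_l X^T L X) w = X_l^T q. The function names are hypothetical and the slide's exact formula may differ in its constants.

```python
import numpy as np


def same_author_laplacian(authors) -> np.ndarray:
    """Graph Laplacian L = D - A of the same-author graph over all reviews."""
    authors = np.asarray(authors)
    # A[i, j] = 1 when reviews i and j share an author (no self-loops).
    A = (authors[:, None] == authors[None, :]).astype(float)
    np.fill_diagonal(A, 0.0)
    return np.diag(A.sum(axis=1)) - A


def fit_reviewer_regularized(X, q, labeled, authors, beta: float) -> np.ndarray:
    """Minimize (1/n_l) * sum_labeled (x_i.w - q_i)^2
              + beta * sum_same_author_pairs (x_i.w - x_j.w)^2.

    X: all reviews (labeled and unlabeled), one row each.
    q: gold-standard quality of the labeled rows, in the order of `labeled`.
    labeled: indices of the labeled rows of X.
    authors: author id for every row of X.
    """
    X = np.asarray(X, dtype=float)
    q = np.asarray(q, dtype=float)
    X_l = X[labeled]
    n_l = len(labeled)
    L = same_author_laplacian(authors)
    # Zero-gradient condition of the convex objective above.
    lhs = X_l.T @ X_l + beta * n_l * (X.T @ L @ X)
    rhs = X_l.T @ q
    return np.linalg.solve(lhs, rhs)
```

With `beta=0` this reduces to ordinary least squares on the labeled reviews; unlabeled reviews only influence the solution through the Laplacian term.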

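13 2. Trust Consistency Hypothesis: $\mathrm{Quality}(u_1) - \mathrm{Quality}(u_2) \le 0$ when reviewer $u_1$ trusts reviewer $u_2$. I trust people with quality at least as good as mine! Quality(reviewer) is defined as AVG(Quality(review)) over that reviewer's reviews.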

14 Regularizer for Trust Consistency. Trust Regularizer $= \sum_{(u_1,u_2)} \max[0, \mathrm{Quality}(u_1) - \mathrm{Quality}(u_2)]^2$, summed over all data (train + test) for all pairs of reviewers connected in the Trust Graph. No closed-form solution… Still convex → Gradient Descent
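
A minimal gradient-descent sketch for the trust regularizer follows. It assumes a mean-squared-error baseline loss, reviewer quality equal to the average predicted quality of that reviewer's reviews, and trust edges given as `(truster, trusted)` pairs; the function name, learning rate and step count are illustrative choices, not values from the talk.

```python
import numpy as np


def fit_trust_regularized(X, q, labeled, authors, trust_edges, beta,
                          lr=0.01, steps=5000):
    """Gradient descent on
        (1/n_l) * sum_labeled (x_i.w - q_i)^2
        + beta * sum_{(u, v): u trusts v} max(0, Quality(u) - Quality(v))^2.
    """
    X = np.asarray(X, dtype=float)
    q = np.asarray(q, dtype=float)
    authors = np.asarray(authors)
    X_l = X[labeled]
    n_l = len(labeled)
    # Mean feature vector per reviewer, so that Quality(u) = w . mean_vec[u].
    mean_vec = {u: X[authors == u].mean(axis=0) for u in set(authors.tolist())}
    # One row per trust edge: the difference of the two reviewers' mean vectors.
    D = np.array([mean_vec[u] - mean_vec[v] for u, v in trust_edges])
    D = D.reshape(-1, X.shape[1])  # also handles an empty edge list
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        grad_loss = (2.0 / n_l) * X_l.T @ (X_l @ w - q)
        # Only edges where the truster currently scores higher are penalized.
        violation = np.maximum(0.0, D @ w)
        grad_reg = 2.0 * D.T @ violation
        w -= lr * (grad_loss + beta * grad_reg)
    return w
```

Because the objective is convex, a sufficiently small fixed step size converges to the global minimum; in practice a line search or an off-the-shelf optimizer may be preferable.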

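15 3. Co-Citation Consistency Hypothesis: $\mathrm{Quality}(u_1) - \mathrm{Quality}(u_2) \to 0$ for reviewers $u_1$ and $u_2$ connected in the Co-citation Graph, which is derived from the Trust Graph. I am consistent with my “trust standard”!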

16 Regularizer for Co-citation Consistency. Co-citation Regularizer $= \sum_{(u_1,u_2)} [\mathrm{Quality}(u_1) - \mathrm{Quality}(u_2)]^2$, summed over all data (train + test) for all pairs of reviewers connected in the Co-citation Graph (C). Closed-form solution! w is expressed using the Review-Reviewer Matrix.

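17 4. Link Consistency Hypothesis: $\mathrm{Quality}(u_1) - \mathrm{Quality}(u_2) \to 0$ for reviewers $u_1$ and $u_2$ connected in the Link Graph, which is derived from the Trust Graph. I trust people with similar quality as mine!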

18 Regularizer for Link Consistency. Link Regularizer $= \sum_{(u_1,u_2)} [\mathrm{Quality}(u_1) - \mathrm{Quality}(u_2)]^2$, summed over all data (train + test) for all pairs of reviewers connected in the Link Graph. Closed-form solution!
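
The co-citation and link graphs can both be derived from the directed trust graph. The sketch below rests on two assumptions that the slides do not spell out: two reviewers are co-cited when at least one common reviewer trusts both of them, and two reviewers are linked when a trust edge exists between them in either direction. Function names are hypothetical.

```python
from itertools import combinations


def cocitation_edges(trust_edges):
    """Undirected pairs of reviewers trusted by at least one common reviewer.

    trust_edges: iterable of (truster, trusted) pairs.
    """
    trusted_by_truster = {}
    for truster, trusted in trust_edges:
        trusted_by_truster.setdefault(truster, set()).add(trusted)
    pairs = set()
    for trusted_set in trusted_by_truster.values():
        # Every pair inside one truster's trust list is co-cited.
        for a, b in combinations(sorted(trusted_set), 2):
            pairs.add((a, b))
    return pairs


def link_edges(trust_edges):
    """Undirected pairs connected by a trust edge in either direction."""
    return {tuple(sorted((u, v))) for u, v in trust_edges if u != v}
```

Either edge set can then be turned into an adjacency matrix and a graph Laplacian over reviewers to build the corresponding regularizer.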

19 Roadmap: Motivation; Review Quality Prediction Algorithms; Experimental Evaluation; Conclusions

20 Data from Ciao UK
Statistics                     Cellphone   Beauty   Digital Camera
# Reviews                      1943        4849     3697
Reviews/Reviewer ratio         2.21        2.84     1.06
Trust Graph Density            0.0075      0.014    0.0006
Summary                        Cellphone   Beauty   Digital Camera
Social Context                 rich        rich     sparse
Gold-std Quality Distribution  balanced    skewed   balanced

21 Hypotheses Testing: Reviewer Consistency. [Figure (Cellphone): Density of the Difference in Review Quality, Qg(review 1) - Qg(review 2) for reviews from the same reviewer vs. Qg(review 1) - Qg(review 3) for reviews from different reviewers.] Reviewer Consistency Hypothesis supported by data

22 Hypotheses Testing: Social Network-based Consistencies. [Figure (Cellphone): Density of the Difference in Reviewer Quality, Qg(B) - Qg(A), for four cases: B is not linked to A; B trusts A; B is co-cited with A; B is linked to A.] Social Network-based Consistencies supported by data

23 Prediction Performance: Exploiting Social Context. [Figure (Cellphone): % of MSE Difference vs. Percentage of Training Data (10%, 25%, 50%, 100%) for AddFeatures, Reg:Reviewer, Reg:Trust, Reg:Cocitation, Reg:Link.] AddFeatures is most effective given sufficient training data. With limited training data, Reg methods work best. Reg:Reviewer > Reg:Trust > Reg:Cocitation > Reg:Link
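
The "% of MSE Difference" metric is not defined on the slide. A plausible reading, assumed here, is the relative reduction in mean squared error with respect to the text-only baseline, with positive values meaning the method is better. Both function names are hypothetical.

```python
def mse(predicted, gold):
    """Mean squared error between predicted and gold-standard quality."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    return sum((p - g) ** 2 for p, g in zip(predicted, gold)) / len(gold)


def pct_mse_difference(mse_method, mse_baseline):
    """Relative MSE reduction over the baseline, in percent (assumed definition)."""
    return 100.0 * (mse_baseline - mse_method) / mse_baseline
```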

24 Prediction Performance: Compare Three Categories. [Figure: % of MSE Difference on Cellphone, Beauty and Digital Camera for Reg:Reviewer, Reg:Trust, Reg:Cocitation, Reg:Link.] Improvement on Digital Camera is smaller due to sparse social context (Reviews/Reviewer ratio = 1.06)

25 Parameter Sensitivity. [Figure (Cellphone), (Beauty): Mean Squared Error vs. Regularization Parameter, compared with the Text-only Baseline.] Consistently better than Baseline when parameter < 0.1

26 Conclusions: Improve Review Quality Prediction using Social Context; Formalize into a Semi-supervised Graph Regularization framework (Utilize both labeled and unlabeled data; Applicable to data with no social context); Promising results on real-world data, esp. with limited labels and rich social context

27 Future Work: Combine multiple regularizers; Optimize by nDCG instead of MSE; Infer trust network; Spam detection

28 Thank you! & Questions?

