Text-based User-kNN: Measuring user similarity based on text reviews


1 Text-based User-kNN: Measuring user similarity based on text reviews
Maria Terzi, Matthew Rowe, Maria-Angela Ferrario, Jon Whittle. School of Computing & Communications, Infolab21, Lancaster University, United Kingdom

2 Recommender systems
Information filtering systems: they seek to predict the interest of a user in an item or social element that the user has not yet considered, at the right time and in the right context, by measuring similarity between users and between items/social elements.
Text-based user-kNN: Measuring user similarity based on text reviews 1/19

3 User-based k-Nearest Neighbors Collaborative Filtering
Matches similar users based on the similarity of their ratings in order to predict ratings and make recommendations to a target user. How it works:
a. weight all users with respect to their similarity to the target user
b. select the k users with the highest similarity (the neighbors)
c. compute a prediction from a weighted average of the selected neighbors' ratings.
Example table (ratings by Dave, Alice, Bob and Carol):
The Wolverine: 5 / 4 / 3 / ?
(rows for World War Z and To Rome with Love complete the table; To Rome with Love was rated 2 by Dave)
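The three steps above can be sketched in a few lines of Python. This is a minimal illustration with hypothetical ratings (Carol's missing values are filled in for the example), not the authors' implementation:

```python
# Hypothetical ratings table; Carol's rating of "The Wolverine" is the "?" to predict.
ratings = {
    "Dave":  {"The Wolverine": 5, "World War Z": 4, "To Rome with Love": 2},
    "Alice": {"The Wolverine": 4, "World War Z": 5, "To Rome with Love": 1},
    "Bob":   {"The Wolverine": 3, "World War Z": 3, "To Rome with Love": 4},
    "Carol": {"World War Z": 4, "To Rome with Love": 2},
}

def pearson(a, b):
    """Pearson correlation over the items both users rated."""
    common = list(set(a) & set(b))
    if len(common) < 2:
        return 0.0
    xs = [a[i] for i in common]
    ys = [b[i] for i in common]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = (sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys)) ** 0.5
    return num / den if den else 0.0

def predict(target, item, k=2):
    # a. weight all users by their similarity to the target user
    sims = [(pearson(ratings[target], ratings[u]), u)
            for u in ratings if u != target and item in ratings[u]]
    # b. keep the k most similar users (the neighbors)
    neighbors = sorted(sims, reverse=True)[:k]
    # c. weighted average of the neighbors' ratings
    num = sum(s * ratings[u][item] for s, u in neighbors)
    den = sum(abs(s) for s, _ in neighbors)
    return num / den if den else None

print(predict("Carol", "The Wolverine"))  # → 4.5
```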

4 Similarity reflection problems
Ratings cannot fully reflect the similarity between users:
ratings do not capture the rationale behind a user's rating
there is a high probability (p = 0.8) that two ratings with the same value on the same item were given for different reasons.
…so new sources of knowledge are required!
"Not as good as the first, but I still love Johnny Depp." "It was a really good movie. The storyline to this one was great."

5 Importance of Online User Text Reviews
Why do users share?
46% feel they can be brutally honest on the Internet; 38% aim to influence others when they express their preferences online (Harris Poll, April 2010)
34% have turned to social media to air their feelings about a company: 26% to express dissatisfaction, 23% to share companies or products they like (Harris Poll, April 2010)
90% of customers say buying decisions are influenced by online reviews (Dimensional Research, April 2013)
97% of those who made a purchase based on an online review found the review to be accurate (comScore/The Kelsey Group, Oct. 2007)
Users trust text reviews more than ratings, which makes reviews an ideal source of knowledge for recommender systems and for measuring similarity between users…
…but they can also be fake, noisy, hard to quantify, and computationally expensive to process.

6 How existing work used text reviews in RecSys
Matrix Factorization CF approaches use text reviews to define a regularizer:
opinion score: feature extraction and sentiment analysis (Pero et al., 2013)
sentiment score (Singh et al., 2011)
review quality score (Raghavan et al., 2011)
ratings & features (McAuley & Leskovec, 2013)
…not focused on improving the performance of neighborhood-based models such as user-kNN.
Using the sentiment of text reviews instead of ratings in user-kNN: like a rating, the sentiment says how much a person liked an item, but it misses the reason why.
…the reasons behind a user's rating remain unexploited.

7 How existing work used text reviews in RecSys
Use text reviews to construct user preference profiles for CF:
profiles: sets of item features (such as "plot") extracted from text reviews and used to match user profiles (Chen and Wang, 2013)
…not focused on predicting ratings; Topic Profile CF (TPCF) (Musat et al., 2013) recommends items whose reviews are similar to the profile.
…TPCF does not form neighborhoods of similar users: "the overall number of opinions regarding a certain item feature reveals how important that feature is to the item"
…these approaches do not distinguish user preferences across domains.

8 Contributions
A text-based user-kNN approach.
An extensive evaluation:
a) an investigation of the performance of various text similarity measures within the user-kNN approach → a significant improvement of semantic similarity measures over lexical text similarity measures
b) a comparison of text-based user-kNN with a range of ratings-based approaches in a ratings prediction task over two datasets → consistently higher accuracy of text-based user-kNN.
Novelty of Text-based User-kNN: text-based user-kNN directly calculates the similarity of text reviews to measure the similarity between users, form neighborhoods of similar users, and predict ratings.

9 How text-based user-kNN works
Text-based user-kNN measures the similarity between users by applying a text similarity measure to the content of the reviews of each of their co-reviewed items, instead of statistical measures on ratings.
Notation: a set of users u, v ∈ U, a set of items i ∈ I, and a set of quadruples (u, i, r, c) ∈ D, where u is a user, i an item, r a rating, and c the content of a text review.
UserItemAverage baseline
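The user-level similarity described above can be sketched as follows. The review texts are hypothetical, and `difflib.SequenceMatcher` over word lists stands in for the WordNet-based review similarity used in the paper; only the structure (averaging a text similarity over co-reviewed items) is the point:

```python
from difflib import SequenceMatcher

def text_sim(a, b):
    """Placeholder review-level similarity: word-sequence overlap in [0, 1]."""
    return SequenceMatcher(None, a.lower().split(), b.lower().split()).ratio()

def user_similarity(reviews_u, reviews_v, sim=text_sim):
    """Average text similarity over the users' co-reviewed items.
    reviews_*: dict mapping item id -> review text for one user."""
    co_reviewed = set(reviews_u) & set(reviews_v)
    if not co_reviewed:
        return 0.0
    return sum(sim(reviews_u[i], reviews_v[i]) for i in co_reviewed) / len(co_reviewed)

# Hypothetical review data for two users over two co-reviewed movies.
u = {"m1": "great storyline and acting", "m2": "dull plot"}
v = {"m1": "great acting but weak ending", "m2": "loved the plot"}
sim = user_similarity(u, v)
print(round(sim, 3))
```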

10 Short text similarity measures
Goal: identify "similar reviews", i.e. reviews that use semantically similar wordings to review an item.
Word overlap: produces a similarity score based on the number of words that occur in both segments… but it cannot identify semantic similarity.
WordNet: a structure of synsets, sets of terms with synonymous meanings.
Why WordNet?
WordNet offers two families of measures of similarity between terms.
WordNet-based measures had the highest correlation with human judgments and outperformed LSA in paraphrase detection tasks.
They do not require building a semantic space like LSA, nor a training corpus like LSA or LDA.

11 Short text similarity measures
Measures based on path length: the path is the count of links between two terms in the taxonomy (path between synonyms = 0). Examples: Leacock & Chodorow, Wu & Palmer.
Measures based on information content (IC): IC is a measure of specificity derived from the frequency counts of a word in a WordNet-tagged corpus (e.g. high IC: "mouse"; lower IC: "animal"). Examples: Resnik, Lin, Jiang & Conrath.
All similarity measures are publicly provided by Pedersen et al., 2004.
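The measures can be illustrated on a toy is-a hierarchy, a hypothetical stand-in for WordNet. The taxonomy, corpus probabilities, and the max-depth constant are all illustrative; the formulas follow the standard definitions of these measures:

```python
import math

# Toy is-a taxonomy: child -> parent (hypothetical, standing in for WordNet).
parent = {
    "mouse": "rodent", "rodent": "mammal", "cat": "mammal",
    "mammal": "animal", "animal": "entity",
}

def depth(term):
    """Number of links from `term` up to the root."""
    d = 0
    while term in parent:
        term, d = parent[term], d + 1
    return d

def ancestors(term):
    out = [term]
    while term in parent:
        term = parent[term]
        out.append(term)
    return out

def lcs(a, b):
    """Least common subsumer: walking up from `a`, the first shared ancestor."""
    anc_b = set(ancestors(b))
    for t in ancestors(a):
        if t in anc_b:
            return t

def path_len(a, b):
    c = lcs(a, b)
    return (depth(a) - depth(c)) + (depth(b) - depth(c))

def leacock_chodorow(a, b, max_depth=4):
    # -log(path / (2 * D)); a zero path (synonyms) is floored at 1
    return -math.log(max(path_len(a, b), 1) / (2.0 * max_depth))

def wu_palmer(a, b):
    # 2 * depth(LCS) / (depth(a) + depth(b))
    return 2.0 * depth(lcs(a, b)) / (depth(a) + depth(b))

# IC-based measures: IC(c) = -log p(c), with p(c) estimated from corpus counts.
# Hypothetical probabilities; note the specific "mouse" gets a higher IC than
# the general "animal", as on the slide.
prob = {"entity": 1.0, "animal": 0.6, "mammal": 0.3,
        "rodent": 0.1, "cat": 0.12, "mouse": 0.02}

def ic(t):
    return -math.log(prob[t])

def resnik(a, b):
    return ic(lcs(a, b))

def lin(a, b):
    return 2.0 * ic(lcs(a, b)) / (ic(a) + ic(b))

def jiang_conrath(a, b):
    # similarity as the inverse of the J&C distance IC(a) + IC(b) - 2*IC(LCS)
    return 1.0 / (ic(a) + ic(b) - 2.0 * ic(lcs(a, b)))
```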

12 Short text similarity measures
Example: "Great storyline!" vs. "...very interesting plot"

Measure               Similarity score
Word Overlap
Leacock & Chodorow    2.9957
Wu & Palmer           0.9524
Resnik                9.5685
Lin                   0.9457
Jiang & Conrath       0.9102

All similarity measures are publicly provided by Pedersen et al., 2004.

13 Short text similarity measures
Example: "Great storyline!" vs. "...very interesting plot"

Measure               Similarity score
Word Overlap
Leacock & Chodorow    2.9957
Wu & Palmer           0.9524
Resnik                9.5685
Lin                   0.9457
Jiang & Conrath       0.9102

Preprocessing: remove stop words. Review similarity = average similarity between all words.
All similarity measures are publicly provided by Pedersen et al., 2004.
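The review-level score can be sketched as follows: strip stop words, then average a word-to-word similarity over all cross pairs. The stop-word list is illustrative, and `toy_sim` is a hypothetical lookup standing in for a WordNet measure such as Lin:

```python
STOP = {"a", "the", "very", "and", "is"}  # illustrative stop-word list

def tokenize(text):
    return [w.strip("!.,?").lower() for w in text.split()]

def review_similarity(r1, r2, word_sim):
    """Average word-to-word similarity over all cross pairs of content words."""
    w1 = [w for w in tokenize(r1) if w not in STOP]
    w2 = [w for w in tokenize(r2) if w not in STOP]
    pairs = [(a, b) for a in w1 for b in w2]
    return sum(word_sim(a, b) for a, b in pairs) / len(pairs) if pairs else 0.0

# Toy word similarity: identical words 1.0, known near-synonyms 0.9, else 0.0.
SYN = {frozenset(("storyline", "plot")): 0.9}

def toy_sim(a, b):
    return 1.0 if a == b else SYN.get(frozenset((a, b)), 0.0)

print(review_similarity("Great storyline!", "...very interesting plot", toy_sim))
```

Here "very" is removed as a stop word, and the single near-synonym pair (storyline, plot) contributes 0.9 across the four cross pairs, giving 0.225.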

14 Experimental Setup: Datasets
Implementation: MyMediaLite 3.07.
Datasets: {itemid, userid, score, text, timestamp}
RottenTomatoes: critics' reviews, collected using the API (0.85 cosine similarity with standard users); all reviews of the Top-100 movies for the years 2001 to 2010; scores normalized to a 0-to-1 scale.
AudioCDs, from the Amazon Product Reviews corpus (Jindal and Liu, 2008): used by related work (Pero et al., 2013; Raghavan et al., 2013).
Three-way dataset splitting method: 64.64% for training, 1.36% for validation, 1.36% for each of 25 test sets; time ordering is preserved; allows for significance testing.

Dataset          Users   Items   Total reviews  Training  Validation  Test set (fold size)  Sparsity
RottenTomatoes     451    1000          32365     40371         848        21200 (848)        86.17%
AudioCDs         53060   36381         102714     66394        1397       34925 (1397)        99.99%

15 Dataset splitting method
Three-way dataset splitting method: 64.64% for training, 1.36% for validation, and 1.36% for each of 25 test sets; time ordering is preserved; allows for significance testing.
Netflix Prize split: training 95.9%, validation 1.36%, testing 1.36%, with no significance testing!
Why is significance testing important?
because of the marginal increases in performance often reported in the literature
to assess whether chance is involved
to be more confident in any improvement we find
Why preserve time ordering?
it resembles, most closely, the situation of a recommender system in a real deployment
cross-validation (CV) approaches introduce bias into a model by training on future data.
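The split described above can be sketched as follows, assuming each quadruple carries a trailing timestamp; the fold fraction 1.36% and the 25 folds come from the slide, while the rounding behavior is an assumption (so very small datasets may not split exactly):

```python
def chronological_split(data, n_test_folds=25, fold_frac=0.0136):
    """Time-ordered split: ~64.64% train, one 1.36% validation fold,
    then 25 consecutive 1.36% test folds."""
    data = sorted(data, key=lambda q: q[-1])      # preserve time ordering
    n = len(data)
    fold = round(fold_frac * n)                   # size of validation/test folds
    n_train = n - fold * (n_test_folds + 1)       # the remaining ~64.64%
    train = data[:n_train]
    validation = data[n_train:n_train + fold]
    start = n_train + fold
    test_folds = [data[start + f * fold: start + (f + 1) * fold]
                  for f in range(n_test_folds)]
    return train, validation, test_folds

# Hypothetical quadruples with timestamps: (user, item, rating, text, time)
data = [("u", "i", 3, "ok", t) for t in range(10000)]
train, val, tests = chronological_split(data)
```

Because the data is sorted by timestamp first, everything in the validation and test folds is strictly later than the training data, which is what prevents the model from "training on future results".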

16 Experimental Setup: Metrics
We evaluate the approaches in a ratings prediction task.
Measure: Root Mean Square Error (RMSE) between the actual ratings and the predictions, averaged over the 25 test folds.
Significance testing using the Sign Test.
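Both the metric and the test are simple to state in code. This sketch uses a one-sided sign test over per-fold RMSE comparisons, which is one common way to apply the Sign Test to 25 paired folds (the exact variant used in the paper is not specified on the slide):

```python
import math
from math import comb

def rmse(actual, predicted):
    """Root Mean Square Error between actual and predicted ratings."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted))
                     / len(actual))

def sign_test_p(errors_a, errors_b):
    """One-sided sign test over paired per-fold errors: probability of
    system A beating system B on at least this many folds by chance,
    if both systems were really equally good. Ties are ignored."""
    n = sum(a != b for a, b in zip(errors_a, errors_b))
    wins = sum(a < b for a, b in zip(errors_a, errors_b))
    return sum(comb(n, k) for k in range(wins, n + 1)) / 2 ** n

# If A has the lower RMSE on all 25 folds, p = 0.5**25, about 3e-8.
p = sign_test_p([0.9] * 25, [1.0] * 25)
```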

17 Experimental Setup: Ratings-based approaches
Modified ratings-based approaches are usually compared to:
their ratings equivalent (Pero et al., 2013)
a non-personalized baseline (Chen et al., 2013; Musat et al., 2013)
Ratings-based approaches used in our experiment:
Non-personalized baseline: UserItemAverage
Memory-based neighborhood algorithms: user-kNN and item-kNN, each with Pearson and Cosine correlation coefficient similarity
Matrix factorization methods: Singular Value Decomposition Plus Plus (SVD++: SVD + asymmetric SVD) and Biased Matrix Factorization (BMF: MF + user and item biases)

18 Training the text-based user-kNN approaches
Text-based user-kNN approaches perform better when using the most proximate user in terms of sharing similar views on items (k = 1) than when using a weighted average of the ratings of a large number of users (k ≥ 200), even though the latter more closely resembles a real-life scenario.

19 Results
Text-based user-kNN approaches perform consistently and significantly (p < 0.0001) better than the ratings-based approaches.
Semantic similarity measures using IC (Lin, Resnik, Jiang & Conrath) perform better than path-based measures.
(Chart: mean RMSE over the 25 test folds; lower is better.)
"Even a small improvement in a rating prediction error can affect the ordering of items and have significant impact on the quality of the top few presented recommendations." (Koren, 2008)

20 Conclusions
Measuring the similarity between users based on text reviews can help to overcome similarity reflection problems in user-kNN.
Text-based user-kNN performs significantly better than the ratings-based approaches used in this experiment.
Semantic measures (based on WordNet) tend to perform better than simple lexical matching measures, particularly those using information content (IC).

21 Future Work
Provide evidence of how significant the results are to users: text-based user-kNN vs. other text-based approaches in an item prediction task (top-N), and a user study.
How to enhance the measurement of similarity: combinations of text, ratings and sentiment analysis; identifying hidden similarity in text reviews using Linked Open Data.

22 Questions?
Thank you. Contact: m.terzi@lancaster.ac.uk

