Privacy risks of collaborative filtering Yuval Madar, June 2012 Based on a paper by J.A. Calandrino, A. Kilzer, A. Narayanan, E. W. Felten & V. Shmatikov.


• Recommender systems help suggest items and other users that match a user's tastes, deduced from previous purchases.
• Numerous examples
  ◦ Amazon, iTunes, CNN, Last.fm, Pandora, Netflix, YouTube, Hunch, Hulu, LibraryThing, IMDb and many others.

• User to item: "You might also like…"
• Item to item: Similar items list
• User to user: Another customer with common interests

• Content-based filtering
  ◦ Based on a priori similarity between items; recommendations are derived from a user's own history.
  ◦ Doesn't pose a privacy threat.
• Collaborative filtering
  ◦ Based on correlations between other users' purchases.
  ◦ Our attacks target this type of system.
• Hybrid
  ◦ A system employing both filtering techniques.

• The data the recommendation system uses is modeled as a matrix, where rows correspond to users and columns correspond to items.
• Some auxiliary information on the target user is available (a subset of the target user's transaction history).
• An attack is successful if it allows the attacker to learn transactions that are not part of the auxiliary information.
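The model above can be sketched in a few lines of Python. This is only an illustration under assumed toy data: the user names, item names, and the helper functions `transactions` and `attack_succeeded` are hypothetical and do not come from the paper.

```python
# Hypothetical toy data: rows are users, columns are items.
users = ["alice", "bob", "target"]
items = ["book_a", "book_b", "album_c", "film_d"]

# matrix[u][i] == 1 means user u has a transaction on item i.
matrix = [
    [1, 0, 1, 0],  # alice
    [0, 1, 0, 0],  # bob
    [1, 1, 0, 1],  # target
]


def transactions(user):
    """Return the set of items in the given user's row of the matrix."""
    row = matrix[users.index(user)]
    return {item for item, bought in zip(items, row) if bought}


# Auxiliary information: the part of the target's history the attacker already knows.
auxiliary = {"book_a", "book_b"}


def attack_succeeded(inferred, user, known):
    """True if some inferred transaction is real and was not already known."""
    newly_learned = (inferred & transactions(user)) - known
    return bool(newly_learned)
```

Under this definition, inferring `film_d` for the target counts as a success, while "inferring" `book_a` does not, because it was already part of the auxiliary information.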

• Users' public ratings and comments on products
• Shared transactions (via Facebook or other media)
• Discussions on third-party sites
• Favorite books in a Facebook profile
• Offline interactions (with friends, neighbors, coworkers, etc.)
• Other sources…

• Input:
  ◦ A set of target items T and a set of auxiliary items A
• Observe the related-items lists of the items in A until an item in T appears or moves up.
• If a target item appears in enough related-items lists at the same time, the attacker may infer that it was bought by the target user.
• Note 1: Scoring may be far more complex, since different items in A are correlated (books that belong to a single series, bundle discounts, etc.).
• Note 2: It is preferable that A consist of obscure and uncommon items, so that the target user's choices have a larger effect on their related-items lists.
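A minimal sketch of this inference, assuming the attacker has two snapshots of each auxiliary item's related-items list. The function name, the snapshot format, and the simple counting threshold are assumptions made for illustration; as Note 1 says, realistic scoring would be more involved.

```python
def infer_from_related_lists(before, after, auxiliary, targets, threshold):
    """Return the targets that appeared or moved up in at least `threshold`
    of the auxiliary items' related-items lists between two snapshots.

    `before` and `after` map an item to its ordered related-items list.
    """
    def rank(related, item):
        # Position in the list; an absent item gets the worst possible rank.
        return related.index(item) if item in related else float("inf")

    inferred = set()
    for target in targets:
        score = 0
        for aux in auxiliary:
            old = rank(before.get(aux, []), target)
            new = rank(after.get(aux, []), target)
            if new < old:  # newly appeared, or moved closer to the top
                score += 1
        if score >= threshold:
            inferred.add(target)
    return inferred


# Hypothetical snapshots for two auxiliary items.
before = {"aux1": ["x", "y"], "aux2": ["y", "t1"]}
after = {"aux1": ["x", "t1", "y"], "aux2": ["t1", "y"]}
result = infer_from_related_lists(before, after, ["aux1", "aux2"], ["t1", "t2"], threshold=2)
```

Here `t1` appears in the list of `aux1` and moves up in the list of `aux2`, so it meets the threshold of 2, while `t2` never shows up.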

• On some sites, the covariance matrix describing the correlation between items is exposed to users (Hunch is one such website).
• As in the related-items attack, the attacker watches for an increase in the correlation between the auxiliary items and the target items.
• Note 1: Different matrix cells may be updated asynchronously.
• Note 2: Inference probability improves if the auxiliary items are user-unique (no other user bought all of the auxiliary items). This is more likely if some of them are unpopular, or if there are enough of them.
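The covariance-based variant might look like the following sketch. It assumes the attacker records a series of snapshots of the exposed matrix as dictionaries keyed by `(auxiliary_item, target_item)` pairs and compares the first with the last, which is one simple way to tolerate cells that update at different times. The function name and the `min_fraction` and `min_delta` parameters are hypothetical.

```python
def infer_from_covariances(snapshots, auxiliary, targets,
                           min_fraction=0.5, min_delta=0.0):
    """Return the targets whose covariance with at least `min_fraction` of the
    auxiliary items rose by more than `min_delta` between the first and the
    last snapshot."""
    first, last = snapshots[0], snapshots[-1]
    inferred = set()
    for target in targets:
        increased = sum(
            1
            for aux in auxiliary
            if last.get((aux, target), 0.0) - first.get((aux, target), 0.0) > min_delta
        )
        if auxiliary and increased / len(auxiliary) >= min_fraction:
            inferred.add(target)
    return inferred


# Hypothetical snapshots; the middle one shows cells updating at different times.
snapshots = [
    {("a1", "t1"): 0.10, ("a2", "t1"): 0.20, ("a1", "t2"): 0.30},
    {("a1", "t1"): 0.15, ("a2", "t1"): 0.20, ("a1", "t2"): 0.30},
    {("a1", "t1"): 0.15, ("a2", "t1"): 0.25, ("a1", "t2"): 0.30},
]
cov_result = infer_from_covariances(snapshots, ["a1", "a2"], ["t1", "t2"], min_fraction=1.0)
```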

• System model
  ◦ For each user, the system finds the k users most similar to that user and ranks the items they purchased by total number of sales.
• Active attack
• Create k dummy users, each buying all known auxiliary items.
• With high probability, the k dummy users and the target user will be clustered together (given an auxiliary items list of size logarithmic in the total number of users; in practice, 8 items were found to be enough for most sites).
• In that case, the recommendations to the dummy users will consist of transactions of the target user previously unknown to the attacker.
• Note: The attack is more feasible in a system where user interactions with items do not involve spending money.
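The active attack can be simulated on a toy recommender. The sketch below assumes Jaccard similarity between purchase sets and alphabetical tie-breaking; the slides do not specify a similarity measure, so both are illustrative choices, as are the user and item names.

```python
from collections import Counter


def recommend(purchases, user, k):
    """Toy k-nearest-neighbor recommender: find the k users most similar to
    `user` and rank the items they bought (and `user` did not) by total sales."""
    own = purchases[user]

    def similarity(other):
        theirs = purchases[other]
        union = own | theirs
        return len(own & theirs) / len(union) if union else 0.0

    others = [u for u in purchases if u != user]
    neighbors = sorted(others, key=lambda u: (-similarity(u), u))[:k]
    sales = Counter(item for basket in purchases.values() for item in basket)
    candidates = set().union(*(purchases[n] for n in neighbors)) - own
    return sorted(candidates, key=lambda item: (-sales[item], item))


k = 2
auxiliary = {"a1", "a2", "a3"}  # items the attacker knows the target bought
purchases = {
    "target": {"a1", "a2", "a3", "secret"},
    "u1": {"pop1", "pop2"},
    "u2": {"pop1", "a1"},
    "u3": {"pop2", "pop3"},
}

# The attacker registers k dummy users who each "buy" every auxiliary item.
for i in range(k):
    purchases[f"dummy{i}"] = set(auxiliary)

leaked = recommend(purchases, "dummy0", k)
```

In this toy setup the neighbors of `dummy0` are the other dummy and the target, so the only item recommended to it is the target's otherwise unknown purchase.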

• The main parameters for evaluating an inference attack are:
  ◦ Yield: how many inferences are produced.
  ◦ Accuracy: how likely each inference is to be correct.
• Yield-accuracy tradeoff: algorithms with stricter accuracy requirements reject less probable inferences, which lowers yield.
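The tradeoff can be made concrete with a small sketch. It assumes each inference comes with a confidence score and that the attacker keeps only inferences at or above a cutoff; the scores, item names, and the `evaluate` helper are made up for illustration.

```python
def evaluate(inferences, ground_truth, min_score):
    """Return (yield, accuracy) for the inferences scoring at least `min_score`.

    `inferences` is a list of (item, score) pairs; `ground_truth` is the set of
    items the target really bought. Accuracy is None when nothing is kept.
    """
    kept = [item for item, score in inferences if score >= min_score]
    correct = sum(1 for item in kept if item in ground_truth)
    accuracy = correct / len(kept) if kept else None
    return len(kept), accuracy


# Hypothetical scored inferences and the target's true hidden purchases.
inferences = [("x", 0.9), ("y", 0.8), ("z", 0.4), ("w", 0.3)]
ground_truth = {"x", "y", "w"}

lenient = evaluate(inferences, ground_truth, min_score=0.0)  # keeps everything
strict = evaluate(inferences, ground_truth, min_score=0.5)   # keeps only confident ones
```

With these numbers the lenient cutoff yields 4 inferences at 75% accuracy, while the strict cutoff yields only 2 inferences, both correct.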

• The paper further discusses specific attacks performed against:
  ◦ Hunch
  ◦ LibraryThing
  ◦ Last.fm
  ◦ Amazon
• It measures the accuracy and yield of these attacks, arriving in some instances at impressive tradeoff figures (such as 70% accuracy for 100% yield on Hunch).

• Not discussed in the paper.
• Achieved in other papers for static recommendation databases.
• Remains an open problem for dynamic systems (which all real-world examples are).

• Limited-length related-items lists: the first elements of such lists have low sensitivity to single purchases.
• Factoring item popularity into update frequency: less popular items are more sensitive to single purchases. Batching their purchases together will decrease the information leak.

• Limit the data access rate: this prevents large-scale privacy attacks, though it lowers utility and may be circumvented using a botnet.
• User opt-out: a privacy-conscious user may decide to opt out of recommender systems entirely (at a clear cost in utility).

• A passive attack on recommender systems using auxiliary information on a certain user's purchases, allowing the attacker to infer undisclosed private transactions.
• Increased user awareness is required.
• Several methods were suggested to decrease the information leaked by these systems.

Questions?