

1 Agenda
1. What is (Web) data mining? And what does it have to do with privacy? – a simple view –
2. Examples of data mining and "privacy-preserving data mining":
   - Association-rule mining (& privacy-preserving AR mining)
   - Collaborative filtering (& privacy-preserving collaborative filtering)
3. A second look at ... privacy
4. A second look at ... Web / data mining
5. The goal: More than modelling and hiding – Towards a comprehensive view of Web mining and privacy. Threats, opportunities and solution approaches.
6. An outlook: Data mining for privacy

2 Privacy Problems: Example 1
Technical background of the problem:
- The dataset allows for Web mining (e.g., which search queries lead to which site choices).
- It violates k-anonymity (e.g., "Lilburn" → a likely k = number of inhabitants of Lilburn).
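To make the k-anonymity point concrete, here is a minimal sketch of how k could be measured for a table: group the records by their quasi-identifier values and take the size of the smallest group. The function name `k_anonymity`, the field names, and the sample records are all hypothetical and only serve as illustration.

```python
from collections import Counter

def k_anonymity(records, quasi_identifiers):
    """Return the size of the smallest group of records that share the
    same values on all quasi-identifiers (0 for an empty table)."""
    groups = Counter(
        tuple(record[q] for q in quasi_identifiers) for record in records
    )
    return min(groups.values()) if groups else 0

# Hypothetical records: every field name and value here is made up.
records = [
    {"town": "A", "age_band": "60-69"},
    {"town": "A", "age_band": "60-69"},
    {"town": "A", "age_band": "20-29"},
    {"town": "B", "age_band": "20-29"},
    {"town": "B", "age_band": "20-29"},
]

k_town = k_anonymity(records, ["town"])
k_town_age = k_anonymity(records, ["town", "age_band"])
print(k_town, k_town_age)
```

Each additional quasi-identifier can only split groups further, so k can only stay the same or shrink. This is why a place name combined with other details in a query log can narrow a record down to very few people.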

3 Privacy Problems: Example 2
Where do people live who will buy the Koran soon?
Technical background of the problem: a mashup of different data sources
- Amazon wishlists
- Yahoo! People (addresses)
- Google Maps
each with insufficient k-anonymity, allows for attribute matching and thereby inferences.
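Attribute matching of the kind described above can be sketched as a join on attributes that two sources share. The sketch below is an illustration under assumptions: the function `link_by_attributes`, the field names, and all sample rows are hypothetical, and it does not call any real service.

```python
def link_by_attributes(left, right, keys):
    """Join two lists of dict records on the attributes named in `keys`."""
    # Index the right-hand source by its values on the shared attributes.
    index = {}
    for row in right:
        index.setdefault(tuple(row[k] for k in keys), []).append(row)

    # Merge every left-hand record with each right-hand record it matches.
    linked = []
    for row in left:
        for match in index.get(tuple(row[k] for k in keys), []):
            linked.append({**match, **row})
    return linked

# Hypothetical data: one source knows interests, the other knows addresses.
wishlists = [
    {"name": "A. Example", "city": "Springfield", "wishlist_item": "Book X"},
    {"name": "C. Nobody", "city": "Shelbyville", "wishlist_item": "Book Y"},
]
directory = [
    {"name": "A. Example", "city": "Springfield", "address": "1 Sample St"},
    {"name": "B. Other", "city": "Springfield", "address": "2 Sample St"},
]

linked = link_by_attributes(wishlists, directory, ["name", "city"])
print(linked)
```

Neither source on its own ties an interest to a street address. The inference becomes possible only because both sources expose the same identifying attributes.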

4 Privacy Problems: Example 3
Predicting political affiliation from Facebook profile and link data (1): Most Conservative Traits

Trait Name   Trait Value                          Weight Conservative
Group        george w bush is my homeboy          45.88831329
Group        college republicans                  40.51122488
Group        texas conservatives                  32.23171423
Group        bears for bush                       30.86484689
Group        kerry is a fairy                     28.50250433
Group        aggie republicans                    27.64720818
Group        keep facebook clean                  23.653477
Group        i voted for bush                     23.43173116
Group        protect marriage one man one woman   21.60830487

Lindamood et al. 09 & Heatherly et al. 09

5 Predicting political affiliation from Facebook profile and link data (2): Most Liberal Traits per Trait Name

Trait Name            Trait Value               Weight Liberal
activities            amnesty international     4.659100601
Employer              hot topic                 2.753844959
favorite tv shows     queer as folk             9.762900035
grad school           computer science          1.698146579
hometown              mumbai                    3.566007713
Relationship Status   in an open relationship   1.617950632
religious views       agnostic                  3.15756412
looking for           whatever i can get        1.703651985

Lindamood et al. 09 & Heatherly et al. 09

6 "Privacy-preserving Web mining" example: find patterns, unlink personal data
The Volvo S40 website targets people in their 20s.
- Are visitors in their 20s or 40s?
- Which demographic groups like/dislike the website?
- An example of the "Randomization Approach" to PPDM: R. Agrawal and R. Srikant, "Privacy Preserving Data Mining", SIGMOD 2000.

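7 Randomization Approach Overview
[Figure: original records (50 | 40K | ..., 30 | 70K | ...) pass through a Randomizer, which outputs perturbed records (65 | 20K | ..., 25 | 60K | ...). From the perturbed records, the distribution of Age and the distribution of Salary are reconstructed; these feed the Data Mining Algorithms, which produce the Model.]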
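The flow in this overview can be sketched in a few lines: each true value is perturbed with noise of a known distribution, and the miner then recovers the distribution of the values (not the individual values) from the perturbed data. The sketch below is a simplified illustration, not the exact procedure from the cited paper. It assumes Gaussian noise, only two age groups, and an iterative Bayes update of the group shares. The function names, group centers, noise level, and synthetic ages are all assumptions.

```python
import math
import random

def randomize(values, sigma, rng):
    """Perturb each value with independent Gaussian noise of known spread."""
    return [v + rng.gauss(0.0, sigma) for v in values]

def reconstruct_shares(perturbed, centers, sigma, iterations=100):
    """Estimate the share of each group using only the perturbed values.

    Start from equal shares; in every round, compute for each perturbed
    value how likely it is to stem from each group (Bayes' rule with the
    known noise density), then average these probabilities.
    """
    shares = [1.0 / len(centers)] * len(centers)
    for _ in range(iterations):
        totals = [0.0] * len(centers)
        for w in perturbed:
            likes = [
                s * math.exp(-((w - c) ** 2) / (2 * sigma ** 2))
                for s, c in zip(shares, centers)
            ]
            norm = sum(likes)
            if norm == 0.0:
                continue  # value too far from every group to be informative
            for i, like in enumerate(likes):
                totals[i] += like / norm
        grand_total = sum(totals)
        shares = [t / grand_total for t in totals]
    return shares

# Synthetic visitors (assumed): 70% in their 20s, 30% in their 40s.
rng = random.Random(0)
true_ages = ([rng.randint(20, 29) for _ in range(1400)]
             + [rng.randint(40, 49) for _ in range(600)])
perturbed = randomize(true_ages, sigma=10.0, rng=rng)

shares = reconstruct_shares(perturbed, centers=[24.5, 44.5], sigma=10.0)
print("estimated share of 20s / 40s:", shares)
```

In this sketch the reconstruction only ever sees `perturbed`. A single perturbed age says little about the person behind it, while the aggregate shares remain estimable.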

8 Seems to work well!

9 What is collaborative filtering?
"People like what people like them like" – regardless of support and confidence

10 User-based Collaborative Filtering
- Idea: People who agreed in the past are likely to agree again.
- To predict a user's opinion for an item, use the opinions of similar users.
- Similarity between users is decided by looking at their overlap in opinions for other items.
- Next step: build a model of user types → "global model" rather than "local patterns" as mining result.
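The idea above can be sketched as follows: similarity between two users is measured on the items both have rated, and a prediction is the user's own average rating shifted by the similarity-weighted deviations of the other users. The slide does not fix a similarity measure, so the sketch assumes Pearson correlation and a mean-centered weighted average, which is one common choice. The ratings and user names are made up.

```python
from math import sqrt

def mean(values):
    values = list(values)
    return sum(values) / len(values)

def pearson(ratings_u, ratings_v):
    """Pearson correlation of two users over the items both have rated."""
    common = set(ratings_u) & set(ratings_v)
    if len(common) < 2:
        return 0.0
    mu_u = mean(ratings_u[i] for i in common)
    mu_v = mean(ratings_v[i] for i in common)
    num = sum((ratings_u[i] - mu_u) * (ratings_v[i] - mu_v) for i in common)
    den = (sqrt(sum((ratings_u[i] - mu_u) ** 2 for i in common))
           * sqrt(sum((ratings_v[i] - mu_v) ** 2 for i in common)))
    return num / den if den else 0.0

def predict(ratings, user, item):
    """Predict `user`'s rating of `item` from the users who rated it."""
    base = mean(ratings[user].values())
    num = den = 0.0
    for other, other_ratings in ratings.items():
        if other == user or item not in other_ratings:
            continue
        sim = pearson(ratings[user], other_ratings)
        # How far the other user's rating lies from their own average.
        num += sim * (other_ratings[item] - mean(other_ratings.values()))
        den += abs(sim)
    return base + num / den if den else base

# Hypothetical ratings on a 1-5 scale.
ratings = {
    "alice": {"a": 5, "b": 1, "c": 4},
    "bob":   {"a": 4, "b": 2, "c": 5, "d": 5},
    "carol": {"a": 1, "b": 5, "c": 2, "d": 1},
}

prediction = predict(ratings, "alice", "d")
print(prediction)
```

Here bob agrees with alice on the shared items and carol disagrees, so bob's high rating and carol's low rating of item "d" both push alice's prediction above her average.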

11 1. Privacy as confidentiality: "the right to be let alone" – and to hide data
[Figure: Data]
Is this all there is to privacy?

12 2. Privacy as control: informational self-determination
[Figure: Data, labelled "Don't do THIS!"]
- e.g. data privacy: "the right of the individual to decide what information about himself should be communicated to others and under what circumstances" (Westin, 1970)
- behind much of data-protection legislation (see Eleni Kosta's talk)

13 Discussion item: What is this an example of?
Tracing anonymous edits in Wikipedia: http://wikiscanner.virgil.gr/

14 [Method: Attribute matching]

15 Results (an example)

