Finding your friends and following them to where you are, by Adam Sadilek, Henry Kautz, and Jeffrey P. Bigham. Presented by Guang Ling.


1 Finding your friends and following them to where you are, by Adam Sadilek, Henry Kautz, and Jeffrey P. Bigham. Presented by Guang Ling.

2 Why should you care about this work? It won the best paper award at WSDM 2012. It was covered by well-known news media: New Scientist and Gizmodo. It claims to infer your fine-grained location even when you keep your data private (dangerous!).

3 What is it about? Leveraging geo-tagged Twitter data for two predictive tasks: social ties (link prediction) and location prediction.

4 The system: Flap. Three main components of Flap (Friendship + Location Analysis and Prediction): a data crawler; a data visualizer; and a machine learning module for social tie prediction and location prediction.

5 Data crawler: Uses the Twitter Search API to collect data. To avoid Twitter’s query rate limitations: distribute the work across a number of machines with different IP addresses; issue asynchronous queries; merge the results to form the dataset.

6 The data: New York City and Los Angeles. 26M tweets crawled in one month: 1.2M unique users; 7.6M geo-tagged tweets.

7 The data: 11K geo-active users; 4M tweets by geo-active users; 123K “follows” relationships; 52K “friend” relationships. Geo-active user: a user who posted more than 100 GPS-tagged tweets during one month. Friend relationship: two users who follow each other (a reciprocal follow relationship) are friends.
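
The two definitions on this slide can be sketched in a few lines of Python. This is only an illustration of the thresholds stated above, not the authors' code; the function names and the input shapes (tuples of user IDs and coordinates) are assumptions made for the example.

```python
from collections import Counter

def geo_active_users(geo_tweets, min_tweets=100):
    """Users with MORE than `min_tweets` GPS-tagged tweets.

    geo_tweets: iterable of (user_id, lat, lon) tuples (hypothetical format).
    """
    counts = Counter(user for user, _lat, _lon in geo_tweets)
    return {user for user, n in counts.items() if n > min_tweets}

def friend_pairs(follows):
    """Unordered pairs of users who follow each other.

    follows: iterable of directed (follower, followee) edges.
    """
    edges = set(follows)
    return {frozenset((a, b)) for a, b in edges if a != b and (b, a) in edges}
```

For example, with the edges a→b, b→a, and a→c, only the pair {a, b} counts as a friendship, because a→c is not reciprocated.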

8 Data visualizer: Take a look yourself!

9 Machine learning module (the meat of the paper). Feature-based link prediction: features and tools. Location prediction: treat users with known GPS positions as noisy sensors of the location of their friends.

10 Feature-based link prediction: No single property (feature) of a pair of individuals is a good indicator of friendship on its own, so multiple disparate features are combined. Which features?

11 Feature-based link prediction. Features: text similarity; co-location score; graph structure (meet/min coefficient). [Diagram: text similarity and co-location score → regression decision tree → one feature.]

12 Feature-based link prediction. Features: text similarity; co-location score; graph structure. What about other features? The Jaccard coefficient, preferential attachment, and the hypergeometric coefficient were tried and discarded, to keep the model efficient and scalable.
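
For readers unfamiliar with the graph-structure scores named on slides 11 and 12, here is a minimal sketch using their common textbook definitions over neighbor sets. The slides do not give formulas, so these definitions (in particular meet/min as the overlap divided by the smaller neighborhood) are an assumption and may differ in detail from what the paper uses.

```python
def meet_min(nu, nv):
    """Meet/min (overlap) coefficient: |N(u) & N(v)| / min(|N(u)|, |N(v)|)."""
    smaller = min(len(nu), len(nv))
    return len(nu & nv) / smaller if smaller else 0.0

def jaccard(nu, nv):
    """Jaccard coefficient: |N(u) & N(v)| / |N(u) | N(v)|."""
    union = nu | nv
    return len(nu & nv) / len(union) if union else 0.0

def preferential_attachment(nu, nv):
    """Preferential attachment score: |N(u)| * |N(v)|."""
    return len(nu) * len(nv)
```

With neighbor sets {1, 2, 3} and {2, 3, 4, 5}, the overlap is 2, so meet/min is 2/3, Jaccard is 2/5, and preferential attachment is 12.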

13 Feature-based link prediction. Features: text similarity; co-location score; graph structure. [Diagram: text similarity and co-location score → regression decision tree → one feature.]

14 Feature-based link prediction

15 Location prediction. Idea: treat friends as noisy sensors of u’s location. Task: infer the most likely location of person u at any time. Input: the sequence of locations visited by u’s friends, and the locations of u over the training period (supervised learning only).

16 Location prediction. Procedure to extract important locations: for each user, extract the set of distinct locations from which he/she tweeted; merge (cluster) all locations within 100 meters of each other, to account for GPS sensor noise; remove locations with fewer than 5 visits. Then merge the locations extracted for all users. Result: 89,077 unique locations, 25,830 significant locations.
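
The per-user extraction step can be sketched as below. The slide does not say which clustering method is used, so this example assumes a simple greedy scheme: each point joins the first existing cluster whose seed point lies within 100 meters, otherwise it starts a new cluster. The function names and the (lat, lon) input format are hypothetical.

```python
import math

EARTH_RADIUS_M = 6371000.0

def haversine_m(p, q):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def significant_locations(points, radius_m=100.0, min_visits=5):
    """Greedily merge points within `radius_m`, then drop rarely visited clusters.

    Returns a list of (seed_point, visit_count) for clusters with at least
    `min_visits` visits. Greedy merging is an assumption of this sketch.
    """
    clusters = []  # each entry is [seed_point, visit_count]
    for p in points:
        for cluster in clusters:
            if haversine_m(cluster[0], p) <= radius_m:
                cluster[1] += 1
                break
        else:
            clusters.append([p, 1])
    return [(seed, n) for seed, n in clusters if n >= min_visits]
```

A greedy pass depends on the order of the points; a density-based method would be a reasonable alternative if order independence matters.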

17 Location prediction. Time: locations are modeled in 20-minute increments, so the domain of the time-of-day random variable is 0, 1, …, 71 (24 × 60 / 20 = 72 values).
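
The discretization amounts to integer division of the minutes since midnight by 20. A small sketch, with hypothetical names:

```python
from datetime import datetime

SLOT_MINUTES = 20
SLOTS_PER_DAY = 24 * 60 // SLOT_MINUTES  # 72 slots, indexed 0..71

def time_slot(ts):
    """Map a datetime to its 20-minute time-of-day slot (0..71)."""
    return (ts.hour * 60 + ts.minute) // SLOT_MINUTES
```

Midnight falls in slot 0, 08:45 in slot 26, and 23:59 in slot 71.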

18 Location prediction. Two settings. Supervised learning: the location of u over the training period is given. Unsupervised learning: only the locations of u’s friends during the training period are given.

19 Location prediction. Solution: dynamic Bayesian network.

20 Location prediction. Learning in the supervised setting: optimization objective function. [Figure labels: observed values, hidden values.]

21 Location prediction. Learning in the unsupervised setting: optimization objective function; EM algorithm; intractable for sizable domains; optimize a lower bound.

22 Location prediction. Inference: given a learned model, what is the most likely sequence of locations visited by a user? Viterbi decoding algorithm.
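
As a reminder of how Viterbi decoding works, here is a generic sketch for a plain hidden Markov model in log space. It is not the paper's dynamic Bayesian network, which conditions on several friends' locations and on time; the single observation per step and all parameter names here are simplifying assumptions.

```python
import math

def viterbi(obs, states, log_start, log_trans, log_emit):
    """Most likely hidden state sequence for `obs` under a simple HMM.

    log_start[s], log_trans[r][s], and log_emit[s][o] are log-probabilities.
    """
    if not obs:
        return []
    # best[t][s]: log-probability of the best path that ends in state s at step t
    best = [{s: log_start[s] + log_emit[s][obs[0]] for s in states}]
    back = []  # back[t][s]: predecessor of s on that best path
    for o in obs[1:]:
        scores, pointers = {}, {}
        for s in states:
            prev = max(states, key=lambda r: best[-1][r] + log_trans[r][s])
            scores[s] = best[-1][prev] + log_trans[prev][s] + log_emit[s][o]
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)
    # Trace back from the best final state
    path = [max(states, key=lambda s: best[-1][s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path
```

In a toy setting, the hidden states could be a user's own locations ("home", "work") and the observations a coarse signal derived from where the friends are.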

23 Experiments and evaluation. Friendship prediction experiments: AUC adopted as the evaluation metric; observed edges ranging from 0% to 50%; two-fold cross-validation.
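
Assuming AUC here means the area under the ROC curve, it can be computed directly as the probability that a randomly chosen true friendship gets a higher score than a randomly chosen non-friendship, with ties counted as one half. The sketch below uses that pairwise definition; the function name and input format are illustrative.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count as 0.5).

    labels: 1 for a true edge (friends), 0 for a non-edge.
    scores: predicted friendship scores, same length as labels.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

The pairwise form is quadratic in the number of examples, which is fine for illustration; a rank-based computation would scale better on a graph with millions of candidate pairs.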

24 Experiments and evaluation. Friendship prediction experiments.

25 Experiments and evaluation. Location prediction experiments: cross-validation over all users; train on the first 3 weeks of data, test on the 4th week.

26 Experiments and evaluation

27 Conclusion. A lot of information can be inferred: user friendships, even when no ties are given; a user’s fine-grained location, even for a user who has never revealed his or her location. This raises ethical questions: would you trade your privacy to an automated system?

28 A thought on the paper. This paper solves problems using existing tools: regression decision trees, belief propagation, dynamic Bayesian networks. Why is this a best paper? It demos a working system (Flap); makes creative use of existing tools; reports very impressive experimental results; and is well written.

29 Q&A: Any questions?

