CSE390 Advanced Computer Networks Lecture 15: Online Social Networks (The network is people) Based on slides by A. Mislove, F. Schneider, and W. Willinger.

1 CSE390 Advanced Computer Networks Lecture 15: Online Social Networks (The network is people) Based on slides by A. Mislove, F. Schneider, and W. Willinger. Updated by P Gill. Fall 2014

2 What are (online) social networks?
- Social networks are graphs of people
  - Graph edges connect friends
  - “Friend” has different implications
  - How hard is it to be Facebook “friends”?
- Online social networking
  - Social network hosted by a Web site
  - Friendship represents shared interest or trust
  - Online friends may have never met

3 What are online social networks used for?
- Popular for sharing content
  - Photos (Flickr), videos (YouTube), blogs (LiveJournal), profiles (Facebook, Orkut)
  - Fixed broadband traffic share (Sandvine, Q1 2014)
    - YouTube: 5.5% upload, 13.2% download
    - Facebook: 2.2% upload, 2.0% download
- Popular with users on the go
  - Mobile traffic share (Sandvine, Q1 2014)
    - YouTube: 3.8% upload, 17.6% download
    - Facebook: 27.0% upload, 14.0% download

4 Why are social networks interesting?
- Popular way to connect
  - Facebook: an estimated 1.32B monthly active users (mid-2014)
  - Average American spends 40 minutes/day on the site
- Changing the flow of information
  - Formerly few “writers”, many “readers” online
  - Now anyone can write!
  - What does this mean for Internet traffic?
- Important in regions with strict media controls
  - E.g., Iran and Egypt, where social media platforms were used to get the word out in times of unrest
- Useful in times of disaster

5 Notable incidents …

6 Not just a social phenomenon…
- Facebook now hosts photos and video
  - Content delivery challenges!
- YouTube is a large fraction of Google’s traffic!
- Understanding the properties of these networks is important for building systems to support them!

7 Outline
- Graph Properties of OSNs (Mislove et al. 2007)
- The Demise of MySpace (Torkjazi et al. 2009)
- How do people use OSNs (Schneider et al. 2009)

8 Required reading: Mislove et al. 2007
- One of the first measurement studies of online social networks (OSNs)
- Large-scale measurement study and analysis of multiple online social networks
  - 11 M users, 328 M links
- Four diverse OSNs
  - Flickr: photo sharing
  - LiveJournal: blogging
  - Orkut: social networking
  - YouTube: video sharing
- Goals:
  - Measure OSNs at scale
  - Understand their static structural properties

9 How to measure OSNs at scale?
- Sites are reluctant to give out data
  - Cannot enumerate user list in general
- Instead, performed crawls of the user graph
  - Picked a known seed user
  - Crawled all of that user’s friends
  - Added newly discovered users to the list
  - Continued until all known users were crawled
- Effectively performed a BFS of the graph
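The crawl procedure on this slide can be sketched as a breadth-first search. This is a minimal illustration, not the crawler used in the study: `get_friends` is a hypothetical callback standing in for an API call or screen scraper, and the toy graph is invented.

```python
from collections import deque

def bfs_crawl(seed, get_friends):
    """Breadth-first crawl of a social graph starting from one seed user.

    get_friends is a hypothetical callback (an API call or a scraper in
    practice) that returns the users linked from the given user.
    """
    seen = {seed}            # every user discovered so far
    queue = deque([seed])    # users discovered but not yet crawled
    edges = []               # directed links observed during the crawl
    while queue:
        user = queue.popleft()
        for friend in get_friends(user):
            edges.append((user, friend))
            if friend not in seen:
                seen.add(friend)
                queue.append(friend)
    return seen, edges

# Invented toy graph: "eve" and "frank" cannot be reached from "alice".
toy_graph = {
    "alice": ["bob", "carol"],
    "bob": ["alice", "dave"],
    "carol": ["alice"],
    "dave": [],
    "eve": ["frank"],
    "frank": ["eve"],
}
users, links = bfs_crawl("alice", lambda u: toy_graph.get(u, []))
```

In the toy graph the crawl never finds "eve" or "frank", which is the coverage problem a crawl from a single seed runs into when the network is not connected.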

10 Challenges faced
- Obtaining data using crawling presents unique challenges
- Need to crawl quickly!
  - Underlying network changes rapidly
  - Consistent snapshot is hard to get
- Crawling completely
  - Social networks aren’t necessarily connected
    - Some users have no links, or are in small clusters
  - Need to estimate the crawl coverage

11 How fast could they crawl?
- Crawled using a cluster of 58 machines
  - Used APIs where available
  - Otherwise, used screen scraping
- Crawls took varying times
  - Flickr, YouTube: 1 day
  - LiveJournal: 3 days
  - Orkut (partial): 39 days
- Crawls subject to rate-limiting
  - Discovered appropriate rates

12 Data collected
- Able to crawl a large portion of the network
- Node degrees vary by orders of magnitude
- However, networks share many key properties
- To ground analysis, will compare to Web [Broder et al.,

13 How are links distributed?

14 What fraction of links are symmetric?

15 Aside: User relationships on Twitter
- Broadcasters
  - News outlets, radio stations
  - No reason to follow anyone
  - Post playlists, headlines

16 Aside: User relationships on Twitter
- Acquaintances
  - Similar number of followers and following
  - Along the diagonal
  - Green portion is the top 1 percentile of tweeters

17 Aside: User relationships on Twitter
- Miscreants?
  - Some people follow many users (programmatically)
  - Hoping some will follow them back
  - Spam, widgets, celebrities (at top)

18 Aside: User relationships on Twitter
- Twitter noticed the miscreants…
- … and enacted the 10% rule (you can follow 10% more people than follow you)

19 Implications of high symmetry
- High link symmetry implies that indegree closely tracks outdegree
  - Users tend to receive as many links as they give
- Unlike other complex networks, such as the Web
  - Sites like cnn.com receive many more links than they give
- The implication is that ‘hubs’ become ‘authorities’
  - May impact search algorithms (PageRank, HITS)
- So far, observed networks are power-law with high symmetry
  - Take a closer look next
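Link symmetry and the indegree/outdegree comparison can be checked on any directed edge list. The sketch below is one straightforward way to do it, not the analysis code from the paper; the function names and the toy edge list are invented for illustration.

```python
def symmetric_fraction(edges):
    """Fraction of directed links (u, v) whose reverse link (v, u) also exists."""
    edge_set = set(edges)
    if not edge_set:
        return 0.0
    symmetric = sum(1 for (u, v) in edge_set if (v, u) in edge_set)
    return symmetric / len(edge_set)

def degrees(edges):
    """Return {user: (indegree, outdegree)} for a directed edge list."""
    deg = {}
    for u, v in set(edges):
        in_u, out_u = deg.get(u, (0, 0))
        deg[u] = (in_u, out_u + 1)      # u gives one link
        in_v, out_v = deg.get(v, (0, 0))
        deg[v] = (in_v + 1, out_v)      # v receives one link
    return deg

# Invented toy network: four of the five links are reciprocated.
toy_edges = [("a", "b"), ("b", "a"), ("a", "c"), ("c", "a"), ("d", "a")]
frac = symmetric_fraction(toy_edges)
deg = degrees(toy_edges)
```

When most links are reciprocated, each user's indegree and outdegree end up close to each other, which is the property the slide contrasts with the Web.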

20 Complex network structure

21 Does a core exist?

22 How clustered is the fringe?

23 Implications

24 Outline
- Graph Properties of OSNs (Mislove et al. 2007)
- The Demise of MySpace (Torkjazi et al. 2009)
- How do people use OSNs (Schneider et al. 2009)

25 Hot Today, Gone Tomorrow…
- Slides borrowed from W. Willinger
- Paper: Hot Today, Gone Tomorrow: On the Migration of MySpace Users. M. Torkjazi, R. Rejaie, and W. Willinger.

26 Motivation
- Most empirical studies of Online Social Networks (OSNs) have focused on their associated friendship graphs
  - What about the temporal dynamics of OSNs?
  - What about the “active” portion of an OSN?
- Most empirical studies of OSNs have examined the growth of these systems
  - What about the patterns of decline in user population?
  - What about changes over time in user activity?
- Most empirical studies of OSNs have been based on connectivity information
  - What about timing information?
  - How to obtain relevant timing information?
8/17/2009, WOSN 2009 - Barcelona

27 This Study
- We examine the evolution of user population and user activity in MySpace
  - User arrival/activity/departure, life cycle of MySpace
- Why MySpace?
  - It is one of the largest and most popular OSNs
  - It provides several features making our study feasible
- Main challenges
  - OSNs are often studied when they are popular and the number of departures is negligible
  - Popular OSNs tend to hide information about user departures

28 MySpace Features (I)
- Provides explicit profile status
  - Public
  - Private
  - Invalid
- Availability of users’ last login
  - Enables assessment of the level of activity among users
  - Importantly, allows inference of population growth of MySpace (see later for details)
- Global visibility
  - http://www.myspace.com/user_id

29 MySpace Features (II)
- Monotonic assignment of numeric IDs
  - Searched periodically for the currently smallest unassigned ID and checked that all larger IDs were unassigned; after waiting for a short period, we observed that the smallest unassigned ID (and others after it) had been assigned
  - Found no apparent patterns in gaps between consecutive invalid IDs
  - No evidence for reassignment of deleted IDs
- Makes the selection of random samples of MySpace users easy
[Figure annotation: No visible pattern]

30 Measurement
- Feb. 26, 2009: MySpace ID space [1 … 455,881,700]
- 50 parallel samplers to collect 360K users in less than 12 hours (0.1% of MySpace population)
- Used an HTML parser to post-process the downloaded profiles and extract
  - Users’ profile status (invalid, public, private)
  - Users’ last login date
  - Users’ friend list (only for public profiles)
- Unable to parse last login info for 0.96% of public and 0.08% of private profiles
  - Last login info is not provided or is provided with obvious errors (e.g. 1/1/0001)
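Because IDs are assigned monotonically and not reused, a uniform random sample of users reduces to drawing random integers from the ID space. The sketch below illustrates that idea only; it is not the authors' sampler, and `fetch_status` is a hypothetical callback standing in for downloading and parsing a profile page.

```python
import random
from collections import Counter

def sample_user_ids(n, max_id, seed=None):
    """Draw n distinct IDs uniformly at random from [1, max_id]."""
    rng = random.Random(seed)
    return rng.sample(range(1, max_id + 1), n)

def tally_statuses(user_ids, fetch_status):
    """Count profile statuses over a sample.

    fetch_status is a hypothetical callback that would fetch one profile
    and return "invalid", "public", or "private".
    """
    return Counter(fetch_status(uid) for uid in user_ids)

# Upper bound of the ID space reported in the slides for Feb. 26, 2009.
MAX_ID = 455_881_700
ids = sample_user_ids(1000, MAX_ID, seed=42)

# Stand-in for a real fetch: labels IDs by a made-up rule, for illustration only.
fake_status = lambda uid: ("invalid", "public", "private")[uid % 3]
counts = tally_statuses(ids, fake_status)
```

Splitting the sampled IDs across parallel workers, as the study did with 50 samplers, does not change the statistics of the sample, since each ID is drawn independently of the others' results.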

31 On the Population Size of MySpace

Total   Invalid        Public         Private
362K    149K (41.2%)   150K (41.5%)   63K (17.3%)

- Population of valid MySpace users (Feb. 26, 2009) was about (41.5 + 17.3)% of 455,881,700 = 268M
  - Compare with www.myspace.com/tom, who had 266,029,430 friends (Aug. 13, 2009)
- How has MySpace grown during the past years?
- How many “active” users are there in MySpace?
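The population estimate on this slide is a scaled sample proportion. The short calculation below reproduces the arithmetic; the standard-error step is an addition that assumes a simple random sample and is not taken from the slides.

```python
import math

def estimate_valid_population(id_space, valid_fraction):
    """Scale the sampled fraction of valid profiles up to the whole ID space."""
    return id_space * valid_fraction

def proportion_std_error(p, n):
    """Standard error of a sample proportion, assuming simple random sampling."""
    return math.sqrt(p * (1 - p) / n)

ID_SPACE = 455_881_700          # from the slides
PUBLIC, PRIVATE = 0.415, 0.173  # sampled shares from the table
SAMPLE_SIZE = 362_000           # total sampled profiles from the table

valid_fraction = PUBLIC + PRIVATE
valid_users = estimate_valid_population(ID_SPACE, valid_fraction)
std_err = proportion_std_error(valid_fraction, SAMPLE_SIZE)

print(f"valid users: about {valid_users / 1e6:.0f}M")
print(f"standard error of the valid fraction: {std_err:.4f}")
```

The first line of output matches the 268M figure on the slide. Under the stated sampling assumption the standard error is small relative to the estimate, so sampling noise alone would not move the estimate by much.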

32 On User Arrival
- What does user ID say about account creation time?
- Plot user ID vs. last login of that user for all our users
[Figure panels: Public users; Private users]

33 On User Arrival
- What does user ID say about account creation time?
- “Clean edge” = users whose last login is shortly after their account creation time = “MySpace tourists”
- 32% of public and 18% of private users are tourists
- Discovery of “tourists” enables accurate estimation of user account creation time based on their associated user ID
[Figure label: Tourists]

34 On MySpace’s Growth
- Estimating the user population of MySpace in the past?
  - Use the observed uniform spread of tourists across the entire ID space
  - Estimate a tourist’s account creation time by its last login time
  - Estimate account creation time of all sampled accounts based on their ID
- Slope of the top line shows the growth rate of MySpace population
- Exponential growth until about April 2008
- Visible knee around April 2008 followed by a slow-down in growth
[Figure annotation: April 2008]
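One way to turn the tourist observation into an estimator is to treat tourists as (ID, date) reference points and interpolate between them. The sketch below assumes linear interpolation between neighboring tourists, which the slides do not specify, and the reference data is invented.

```python
from bisect import bisect_left
from datetime import date

def estimate_creation_date(user_id, tourists):
    """Estimate an account's creation date from its numeric ID.

    tourists: list of (user_id, last_login_date) pairs sorted by user_id.
    A tourist's last login is used as a proxy for its creation date.
    Linear interpolation between neighbors is an assumption of this sketch.
    """
    ids = [uid for uid, _ in tourists]
    i = bisect_left(ids, user_id)
    if i == 0:                      # at or before the first reference point
        return tourists[0][1]
    if i == len(ids):               # after the last reference point
        return tourists[-1][1]
    (id_lo, d_lo), (id_hi, d_hi) = tourists[i - 1], tourists[i]
    frac = (user_id - id_lo) / (id_hi - id_lo)
    span_days = d_hi.toordinal() - d_lo.toordinal()
    return date.fromordinal(round(d_lo.toordinal() + frac * span_days))

# Invented reference points, for illustration only.
toy_tourists = [
    (100, date(2005, 1, 1)),
    (200, date(2005, 1, 11)),
    (400, date(2005, 1, 31)),
]
mid_estimate = estimate_creation_date(150, toy_tourists)
late_estimate = estimate_creation_date(300, toy_tourists)
```

Counting how many sampled valid accounts have an estimated creation date before a given day then gives a population curve over time, which is the kind of plot the growth-rate bullets describe.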

35 On User Departure
- Are newer users more likely to leave than older ones?
  - More public and private profiles in the first half of ID space
  - More invalid profiles in the second half of ID space
- Users joining the system earlier have been more likely to keep their accounts than newer users

36 MySpace Life Cycle (I)
- Possible reasons behind MySpace’s decline?
- Slow-down in the growth rate of MySpace is related to the emergence of Facebook
  - Informal evidence (Alexa.com): daily accesses to Facebook surpassed those of MySpace around April 2008

37 Outline
- Graph Properties of OSNs (Mislove et al. 2007)
- The Demise of MySpace (Torkjazi et al. 2009)
- How do people use OSNs (Schneider et al. 2009)

38 Understanding Online Social Network Usage from a Network Perspective
- F. Schneider, A. Feldmann, B. Krishnamurthy, and W. Willinger. ACM Internet Measurement Conference 2009
- Slides borrowed from F. Schneider
- This study differs from much related work by looking at OSN behavior at the network traffic level
  - Vs. crawling the application-level social graph

39 General Approach

40 OSNs studied

41 HTTP Traces

42 Categories of pages
- Pages manually classified based on small user-generated traces in a lab setting

43 Session Characteristics

44 Action popularity

45 Feature sequences

46 OSNs: Wrap up
- Many different types of OSNs
  - Photos, video, profile-based
- Some are extremely popular and the source of much Internet traffic
  - Facebook, YouTube
- New ones emerging
  - Instagram, Snapchat
- Old ones fading
  - MySpace, Friendster
- Studying their properties can inform how we build networks and systems to support them!

47 End

48 BitTorrent Overview
[Figure labels: Tracker, Swarm, Leechers, Seeder]

