CSE390 Advanced Computer Networks Lecture 15: Online Social Networks (The network is people) Based on slides by A. Mislove, F. Schneider, and W. Willinger.


CSE390 Advanced Computer Networks Lecture 15: Online Social Networks (The network is people) Based on slides by A. Mislove, F. Schneider, and W. Willinger. Updated by P Gill. Fall 2014

What are (online) social networks?
- Social networks are graphs of people
  - Graph edges connect friends
  - ‘Friend’ has different implications
  - How hard is it to be Facebook ‘friends’?
- Online social networking
  - Social network hosted by a Web site
  - Friendship represents shared interest or trust
  - Online friends may have never met

What are online social networks used for?
- Popular for sharing content
  - Photos (Flickr), videos (YouTube), blogs (LiveJournal), profiles (Facebook, Orkut)
  - Fixed broadband (Sandvine Q1 2014)
    - YouTube: 5.5% upload, 13.2% down
    - Facebook: 2.2% upload, 2.0% down
- Popular with users on the go
  - Mobile (Sandvine Q1 2014)
    - YouTube: 3.8% up, 17.6% down
    - Facebook: 27.0% up, 14.0% down

Why are social networks interesting?
- Popular way to connect
  - Estimated 1.32B users online each day
  - Average American spends 40 minutes/day on the site
- Changing the flow of information
  - Formerly few “writers”, many “readers” online
  - Now anyone can write!
  - What does this mean for Internet traffic?
- Important in regions with strict media controls
  - E.g., Iran, Egypt using social media platforms to get word out in times of unrest
- Useful in times of disaster

Notable incidents …

Not just a social phenomenon…
- Facebook now contains photo and video
  - Content delivery challenges!
- YouTube is a large fraction of Google’s traffic!
- Understanding properties of these networks is important to understand how we build systems to support them!

Outline
- Graph Properties of OSNs (Mislove et al 2007)
- The Demise of MySpace (Torkjazi et al 2009)
- How do people use OSNs (Schneider et al 2009)

Required reading: Mislove et al
- One of the first measurement studies of online social networks (OSNs)
- Large-scale measurement study and analysis of multiple online social networks
  - 11 M users, 328 M links
- Four diverse OSNs
  - Flickr: photo sharing
  - LiveJournal: blogging
  - Orkut: social networking
  - YouTube: video sharing
- Goals:
  - Measure OSNs at scale
  - Understand their static structural properties

How to measure OSNs at scale?
- Sites are reluctant to give out data
  - Cannot enumerate user list in general
- Instead, performed crawls of the user graph
  - Picked known seed user
  - Crawled all of that user's friends
  - Added new users to list
  - Continued until all known users crawled
- Effectively performed a BFS of graph
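The crawl described on this slide can be sketched as a breadth-first search. This is a minimal illustration, not the authors' crawler: `get_friends` is a hypothetical callable standing in for an API call or a screen scraper, and `toy_graph` is made-up data.

```python
from collections import deque


def bfs_crawl(seed, get_friends):
    """Breadth-first crawl of a user graph starting from one seed user.

    get_friends is a hypothetical callable that returns the users linked
    from a given user; a real crawler would wrap an API or scraper here.
    """
    seen = {seed}          # every user discovered so far
    queue = deque([seed])  # discovered but not yet crawled
    edges = []             # (user, friend) links observed during the crawl
    while queue:
        user = queue.popleft()
        for friend in get_friends(user):
            edges.append((user, friend))
            if friend not in seen:
                seen.add(friend)
                queue.append(friend)
    return seen, edges


# Toy graph standing in for a real site (made-up data).
toy_graph = {
    "alice": ["bob", "carol"],
    "bob": ["alice"],
    "carol": ["dave"],
    "dave": [],
    "erin": ["frank"],  # a small cluster not reachable from alice
    "frank": ["erin"],
}

users, links = bfs_crawl("alice", lambda u: toy_graph.get(u, []))
```

In the toy graph the crawl never reaches erin or frank, which is the coverage problem that disconnected users and small clusters create for any seed-based crawl.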

Challenges faced
- Obtaining data using crawling presents unique challenges
- Need to crawl quickly!
  - Underlying network changes rapidly
  - Consistent snapshot is hard to get
- Crawling completely
  - Social networks aren’t necessarily connected
    - Some users have no links! Or are in small clusters.
  - Need to estimate the crawl coverage

How fast could they crawl?
- Crawled using a cluster of 58 machines
  - Used APIs where available
  - Otherwise, used screen scraping
- Crawls took varying times
  - Flickr, YouTube: 1 day
  - LiveJournal: 3 days
  - Orkut (partial): 39 days
- Crawls subject to rate-limiting
  - Discovered appropriate rates

Data collected
- Able to crawl a large portion of the network
- Node degrees vary by orders of magnitude
- However, networks share many key properties
- To ground analysis, will compare to Web [Broder et al.,

How are links distributed?

What fraction of links are symmetric?

Aside: User relationships on Twitter
- Broadcasters
  - News outlets, radio stations
  - No reason to follow anyone
  - Post playlists, headlines

Aside: User relationships on Twitter
- Acquaintances
  - Similar number of followers and following
  - Along the diagonal
  - Green portion is top 1 percentile of tweeters

Aside: User relationships on Twitter
- Miscreants?
  - Some people follow many users (programmatically)
  - Hoping some will follow them back
  - Spam, widgets, celebrities (at top)

Aside: User relationships on Twitter
- Twitter noticed the miscreants…
- … enacted the 10% rule (you can follow 10% more people than follow you)

Implications of high symmetry
- High link symmetry implies indegree equals outdegree
  - Users tend to receive as many links as they give
- Unlike other complex networks, such as the Web
  - Sites like cnn.com receive many more links than they give
- The implication is that ‘hubs’ become ‘authorities’
  - May impact search algorithms (PageRank, HITS)
- So far, observed networks are power-law with high symmetry
  - Take a closer look next
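Link symmetry and the indegree/outdegree comparison can be measured directly from a directed edge list. The sketch below is illustrative only: the function names and `toy_edges` are made up here, and a self-loop would count as symmetric under this definition.

```python
def symmetric_fraction(edges):
    """Fraction of directed links (u, v) whose reverse link (v, u) also exists."""
    edge_set = set(edges)
    if not edge_set:
        return 0.0
    symmetric = sum(1 for (u, v) in edge_set if (v, u) in edge_set)
    return symmetric / len(edge_set)


def degrees(edges):
    """Return (indegree, outdegree) dictionaries for a directed edge list."""
    indeg, outdeg = {}, {}
    for u, v in set(edges):
        outdeg[u] = outdeg.get(u, 0) + 1
        indeg[v] = indeg.get(v, 0) + 1
    return indeg, outdeg


# Made-up directed links: a<->b and a<->c are reciprocated, c->d is not.
toy_edges = [("a", "b"), ("b", "a"), ("a", "c"), ("c", "a"), ("c", "d")]

frac = symmetric_fraction(toy_edges)
indeg, outdeg = degrees(toy_edges)
```

When the fraction is close to 1, each user's indegree and outdegree are nearly equal, which is why a user with many outgoing links (a hub) also has many incoming links (an authority).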

Complex network structure

Does a core exist?

How clustered is the fringe?

Implications

Outline
- Graph Properties of OSNs (Mislove et al 2007)
- The Demise of MySpace (Torkjazi et al 2009)
- How do people use OSNs (Schneider et al 2009)

Hot Today, Gone Tomorrow…
- Slides borrowed from W. Willinger
- Paper: Hot Today, Gone Tomorrow: On the Migration of MySpace Users. M. Torkjazi, R. Rejaie, and W. Willinger.

Motivation
- A majority of empirical studies of Online Social Networks (OSNs) has focused on their associated friendship graphs
  - What about the temporal dynamics of OSNs?
  - What about the “active” portion of an OSN?
- A majority of empirical studies of OSNs has examined the growth of these systems
  - What about the patterns of decline in user population?
  - What about changes over time in user activity?
- A majority of empirical studies of OSNs has been based on connectivity information
  - What about timing information?
  - How to obtain relevant timing information?
8/17/2009, WOSN, Barcelona

This Study
- We examine the evolution of user population and user activity in MySpace
  - User arrival/activity/departure, life cycle of MySpace
- Why MySpace?
  - It is one of the largest and most popular OSNs
  - It provides several features making our study feasible
- Main challenges
  - OSNs are often studied when they are popular and the number of departures is negligible
  - Popular OSNs tend to hide the information about user departures

MySpace Features (I)
- Provides explicit profile status
  - Public
  - Private
  - Invalid
- Availability of users’ last login
  - Enables assessment of the level of activity among users
  - Importantly, allows inference of population growth of MySpace (see later for details)
- Global visibility

MySpace Features (II)
- Monotonic assignment of numeric ID
  - Searched periodically for the currently smallest unassigned ID and checked that all larger IDs were unassigned; after waiting for a short period, we observed that the smallest unassigned ID (and others after it) had been assigned.
  - Found no apparent patterns in gaps between consecutive invalid IDs
  - No evidence for reassignment of deleted IDs
- Makes the selection of random samples of MySpace users easy.
(Figure label: No visible pattern)
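Because IDs are assigned monotonically and apparently never reused, a uniform random sample of users reduces to a uniform random sample of integers. The sketch below assumes that reading; `sample_ids`, `status_fractions`, and `toy_status` are hypothetical names, and `toy_status` returns made-up statuses in place of a real profile fetch.

```python
import random


def sample_ids(max_id, n, seed=None):
    """Draw n distinct user IDs uniformly at random from [1, max_id]."""
    rng = random.Random(seed)
    return rng.sample(range(1, max_id + 1), n)


def status_fractions(ids, fetch_status):
    """Fraction of sampled IDs per profile status.

    fetch_status is a hypothetical callable mapping an ID to a status
    string such as "public", "private", or "invalid".
    """
    if not ids:
        return {}
    counts = {}
    for uid in ids:
        status = fetch_status(uid)
        counts[status] = counts.get(status, 0) + 1
    return {status: count / len(ids) for status, count in counts.items()}


def toy_status(uid):
    """Made-up stand-in for downloading and parsing a profile page."""
    return ("public", "private", "invalid")[uid % 3]


ids = sample_ids(455_881_700, 1000, seed=1)
fractions = status_fractions(ids, toy_status)
```

Sampling from a `range` object keeps memory use small even though the ID space has hundreds of millions of entries.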

Measurement
- Feb. 26th, 2009: MySpace ID space [1 … 455,881,700]
- 50 parallel samplers to collect 360K users in less than 12 hours (0.1% of MySpace population)
- Using HTML parser to post-process the downloaded profiles and extract
  - Users’ profile status (invalid, public, private)
  - Users’ last login date
  - Users’ friend list (only for public profiles)
- Unable to parse last login info for 0.96% of public and 0.08% of private profiles
  - Last login info is not provided or is provided with obvious errors (e.g. 1/1/0001)

On the Population size of MySpace

  Total   Invalid        Public         Private
  362K    149K (41.2%)   150K (41.5%)   63K (17.3%)

- Population of valid MySpace users (Feb. 26, 2009) was about (41.5 + 17.3)% of 455,881,700 = 268M
- Compare with who has 266,029,430 friends (Aug. 13, 2009)
- How has MySpace grown during the past years?
- How many “active” users are there in MySpace?
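The population estimate is simple arithmetic on the sample percentages. The sketch below reproduces it and adds a rough margin of error; the margin is not from the slides and assumes the sampled IDs are independent and uniform over the ID space.

```python
import math

ID_SPACE = 455_881_700   # size of the ID space on Feb. 26, 2009 (from the slide)
SAMPLE_SIZE = 362_000    # sampled IDs, per the table (362K)
PCT_PUBLIC = 41.5        # percent of sampled IDs with public profiles
PCT_PRIVATE = 17.3       # percent of sampled IDs with private profiles

# Valid users are the public plus private share of the ID space.
valid_fraction = (PCT_PUBLIC + PCT_PRIVATE) / 100
valid_users = valid_fraction * ID_SPACE

# Assumption (not in the slides): binomial standard error of a sampled
# proportion, scaled to the ID space, with a 95% normal approximation.
std_err = math.sqrt(valid_fraction * (1 - valid_fraction) / SAMPLE_SIZE)
margin_users = 1.96 * std_err * ID_SPACE

print(f"valid users: about {valid_users / 1e6:.0f}M")
```

Under these assumptions the sampling error is well under one percent of the estimate, so a 0.1% sample is enough to pin down the population size.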

On User Arrival
- What does user ID say about account creation time?
- Plot user ID vs. last login of that user for all our users
(Figure panels: Public users; Private users)

On User Arrival
- What does user ID say about account creation time?
- “Clean edge” = users whose last login is shortly after their account creation time = “MySpace tourists”
- 32% of public and 18% of private users are tourists
- Discovery of “tourists” enables accurate estimation of user account creation time based on their associated user ID
(Figure label: Tourists)

On MySpace’s Growth
- Estimating the user population of MySpace in the past?
  - Use the observed uniform spread of tourists across entire ID space
  - Estimate account creation time by last login time
  - Estimate account creation time of all sampled accounts based on their ID.
- Slope of the top line shows the growth rate of MySpace population
- Exponential growth until about April 2008
- Visible knee around April 2008 followed by a slow-down in growth
(Figure label: April 2008)
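One way to trace the “clean edge” in code is a running minimum. Since IDs are assigned in order and nobody logs in before their account exists, an account's creation time cannot be later than the earliest last login among accounts with the same or a larger ID. This is a sketch of that idea, not necessarily the authors' procedure; the function names and the sample values (days since an arbitrary origin) are made up.

```python
from bisect import bisect_left


def lower_edge(samples):
    """Trace the lower edge of the (user ID, last login) scatter plot.

    samples is a list of (user_id, last_login_day) pairs. For each sampled
    ID, the estimate is the smallest last login seen at that ID or any
    larger ID, which upper-bounds the account creation day.
    """
    ordered = sorted(samples)
    edge = []
    running_min = float("inf")
    for uid, login in reversed(ordered):
        running_min = min(running_min, login)
        edge.append((uid, running_min))
    edge.reverse()
    return edge


def estimate_creation(edge, user_id):
    """Creation-day estimate for any ID, taken from the next sampled ID."""
    ids = [uid for uid, _ in edge]
    pos = bisect_left(ids, user_id)
    if pos == len(edge):
        return None  # ID is larger than every sampled ID
    return edge[pos][1]


# Made-up sample: IDs 200 and 500 are long-lived users, the rest are tourists.
samples = [(100, 5), (200, 400), (300, 12), (400, 20), (500, 300), (600, 31)]
edge = lower_edge(samples)
day_for_450 = estimate_creation(edge, 450)
```

The denser the tourists are across the ID space, the tighter this bound becomes, which is why their uniform spread matters for the growth estimate.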

On User Departure
- Are newer users more likely to leave than older ones?
- More public and private profiles in the first half of ID space
- More invalid profiles in the second half of ID space
- Users joining the system earlier have been more likely to keep their accounts than newer users

MySpace Life Cycle (I)
- Possible reasons behind MySpace’s decline?
- Slow-down in the growth rate of MySpace is related to emergence of Facebook
- Informal evidence (Alexa.com): Daily accesses to Facebook surpassed those of MySpace at around April 2008

Outline
- Graph Properties of OSNs (Mislove et al 2007)
- The Demise of MySpace (Torkjazi et al 2009)
- How do people use OSNs (Schneider et al 2009)

Understanding Online Social Network Usage from a Network Perspective
- F. Schneider, A. Feldmann, B. Krishnamurthy, and W. Willinger. ACM Internet Measurement Conference 2009
- Slides borrowed from F. Schneider.
- This study differs from a lot of related work by looking at OSN behavior at the network traffic level
  - Vs. crawling the application-level social graph

General Approach

OSNs studied

HTTP Traces

Categories of pages
- Pages manually classified based on small user-generated traces in a lab setting

Session Characteristics

Action popularity

Feature sequences

OSNs: Wrap up
- Many different types of OSNs
  - Photos, video, profile-based
- Some are extremely popular and the source of much Internet traffic
  - Facebook, YouTube
- New ones emerging
  - Instagram, Snapchat
- Old ones fading
  - MySpace, Friendster
- Studying their properties can inform how we build networks and systems to support them!

End

BitTorrent Overview
(Figure labels: Tracker, Swarm, Leechers, Seeder)