Danny Hendler Advanced Topics in on-line Social Networks Analysis Social networks analysis seminar Introductory lecture Danny Hendler, Ben-Gurion University.


1 Advanced Topics in On-Line Social Networks Analysis. Social networks analysis seminar: introductory lecture. Danny Hendler, Ben-Gurion University. CS20225921, Advanced Topics in On-Line Social Networks Analysis

2 Seminar requirements
1. Select a paper and notify me by Thursday, November 5, 2015
2. Study the paper well and prepare a good presentation
3. Meet with me to receive feedback before your talk (at least 1 week before the presentation)
4. Give the seminar talk
5. Participate in at least 80% of seminar talks
Recommended reading:
• “Networks, Crowds, and Markets: Reasoning About a Highly Connected World”. Easley & Kleinberg, 2010. Available online.
• “Social Media Mining: An Introduction”. Zafarani, Abbasi & Liu, 2014. Available online.
• Papers list (to be published soon).

3 Seminar schedule (timeline figure)
Timeline labels:
• Introductory lecture #1: 25/10/15
• Introductory lecture #2, papers list published
• Students send their 3 preferences
• First student talk
• 10 weeks of student talks
• Semester ends
Other dates shown on the timeline: 1/11/15, 3/11/15, 8/11/15

4 Talk outline
• Social networks
• Properties of social networks
  o Small-world phenomenon
  o Power-law distribution
  o Community structure
• Community detection
  o Newman & Girvan algorithm
  o Clique Percolation Method (CPM)

5 Social networks
What is a social network? A network represented by a graph in which nodes represent actors and edges represent interactions or relationships.

6 Social networks: an example

7 Social networks: an example. Giant component

8 Social networks: an example. Some nodes are very active
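To make the graph view concrete, here is a minimal Python sketch, assuming a small made-up edge list (the node names and edges are hypothetical, not taken from the slides). It builds an undirected graph, finds the largest connected component (the "giant component") and picks the node with the most neighbours (a "very active" node).

```python
from collections import deque

# Hypothetical toy network: each pair is an interaction between two actors.
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("d", "e"), ("f", "g")]


def build_adjacency(edge_list):
    """Return an undirected graph as a dict: node -> set of neighbours."""
    adj = {}
    for u, v in edge_list:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def connected_components(adj):
    """Return the connected components as a list of node sets (BFS)."""
    seen, comps = set(), []
    for start in adj:
        if start in seen:
            continue
        comp, queue = {start}, deque([start])
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if w not in comp:
                    comp.add(w)
                    queue.append(w)
        seen |= comp
        comps.append(comp)
    return comps


adj = build_adjacency(edges)
giant = max(connected_components(adj), key=len)     # largest component
most_active = max(adj, key=lambda v: len(adj[v]))   # highest-degree node
```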

9 Types of online social media

10 Top 20 USA websites
 1  Google.com       11  Craigslist.com
 2  Facebook.com     12  Netflix.com
 3  Amazon.com       13  Live.com
 4  Youtube.com      14  Bing.com
 5  Yahoo.com        15  Linkedin.com
 6  Wikipedia.org    16  Pinterest.com
 7  Ebay.com         17  Espn.go.com
 8  Twitter.com      18  Imgur.com
 9  Go.com           19  Tumblr.com
10  Reddit.com       20  Chase.com
Source: Alexa report, October, 2015

11 Top 20 USA websites: 25% social network sites

12 Top 20 USA websites: 25% social network sites, 25% additional sites with social network aspects. Source: Alexa report, February, 2014

13 Knowledge we may gain: identifying romantic ties in Facebook.
(*) Backstrom & Kleinberg. Romantic partnerships and the dispersion of social ties: a network analysis of relationship status on Facebook. CSCW 2014, pp. 831-841.

14 Knowledge we may gain: Web structure.
(*) Broder et al. Graph structure in the Web. WWW 2000, pp. 309-320.

15 Knowledge we may gain: dynamics of viral marketing.
(*) Leskovec et al. The dynamics of viral marketing. Transactions on the Web, 2007.

16 Knowledge we may gain: identify “key players”, collaborations.
Paul Erdős, 1913-1996: “A mathematician is a machine for turning coffee into theorems”

18 Knowledge we may gain: identify “key players”, collaborations. Erdős number, Bacon number.
Paul Erdős's Bacon number is 5:
• Paul Erdős and Ronald Graham appeared in N Is a Number: A Portrait of Paul Erdős.
• Ronald Graham and Merce Cunningham appeared in Great Genius and Profound Stupidity.
• Merce Cunningham and Dennis Hopper appeared in John Cage: The Revenge of the Dead Indians.
• Dennis Hopper and Chris Penn appeared in True Romance.
• Chris Penn and Kevin Bacon appeared in Footloose.
Source: wiki
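A Bacon (or Erdős) number is a shortest-path distance in a collaboration graph. The sketch below is an illustration only: it builds a graph from just the five co-appearances listed on the slide (the real film graph is far larger) and uses breadth-first search to measure the distance. The function names are introduced here and are not from the slides.

```python
from collections import deque

# Co-appearances exactly as listed on the slide; nothing else is assumed.
films = {
    "N Is a Number: A Portrait of Paul Erdős": ["Paul Erdős", "Ronald Graham"],
    "Great Genius and Profound Stupidity": ["Ronald Graham", "Merce Cunningham"],
    "John Cage: The Revenge of the Dead Indians": ["Merce Cunningham", "Dennis Hopper"],
    "True Romance": ["Dennis Hopper", "Chris Penn"],
    "Footloose": ["Chris Penn", "Kevin Bacon"],
}


def build_graph(film_casts):
    """Link every pair of people who appear in the same film."""
    adj = {}
    for cast in film_casts.values():
        for a in cast:
            for b in cast:
                if a != b:
                    adj.setdefault(a, set()).add(b)
    return adj


def distance(adj, source, target):
    """Length of a shortest path from source to target, or None if unreachable."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        if v == target:
            return dist[v]
        for w in adj.get(v, ()):
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return None


graph = build_graph(films)
bacon_number = distance(graph, "Paul Erdős", "Kevin Bacon")
```

On this five-edge chain the search can only confirm the path given on the slide; on a full collaboration graph the same search would look for a shortest chain among all films.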

19 Properties of social networks

20 Milgram's small world phenomenon experiment (1967)
Six degrees of separation: “I read somewhere that everybody on this planet is separated by only six other people. Six degrees of separation between us and everyone else on this planet.” (*)
(*) John Guare. Six Degrees of Separation: A Play. Vintage Books, 1990.
Milgram decided to check if this is the case…

21 Milgram's experiment
• Budget: $680!!!
• A set of “starters”, all trying to forward a letter to a single “target” person
• Starters were notified of the target’s name, address and occupation
• Each letter must be forwarded to someone known on a “first-name basis”
Image taken from Wiki.

22 Milgram's experiment: results
• 64 chains arrived
• Median length: 6
Source: “Networks, Crowds, and Markets”, D. Easley & J. Kleinberg. (Book is online)

23 A slightly more modern example (2008): Microsoft instant messenger shortest paths

24 Average path-length in real-world networks
Source: “Social Media Mining, an Introduction”, R. Zafarani, M. A. Abbasi & H. Liu. (Book is online)

25 Properties of social networks

26 A matter of popularity…
As a function of k: what fraction of Web pages have k in-links? ~1/k^2.1 (*)
(*) Broder et al. Graph structure in the Web. WWW 2000, pp. 309-320.

27 The power law distribution
A.k.a. long tail distribution, scale-free distribution.
Figure axes: degrees (horizontal), fraction of nodes (vertical).
• Most nodes have low degrees
• A few nodes have extremely high degrees

28 Web pages in-degree: log-log scale
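Why a power law looks like a straight line on a log-log plot: a short worked step, assuming the in-degree fraction has the form f(k) ≈ C/k^2.1 for some constant C (the symbols f and C are introduced here for illustration and do not appear on the slides).

```latex
\[
f(k) \approx \frac{C}{k^{2.1}}
\quad\Longrightarrow\quad
\log f(k) \approx \log C - 2.1\,\log k
\]
```

So plotting log f(k) against log k gives a line with slope of about -2.1.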

29 Some more examples
Figure panels: Friendship Network in Flickr; Friendship Network in YouTube

30 Why is popularity power-law?

31 A simple game…
Procedure for creating Web page j ∈ {1, 2, …, N}:
Choose a page i < j uniformly at random, then:
a. With probability p, create a link to page i
b. With probability 1-p, create a link to the page pointed to by page i
As a function of k: what fraction of Web pages have k in-links? ~1/k^c, where c → 2 as p → 0

32 Rich get richer…
The procedure for creating Web page j ∈ {1, 2, …, N}:
Choose a page i < j uniformly at random, then:
a. With probability p, create a link to page i
b. With probability 1-p, create a link to the page pointed to by page i
is equivalent to:
a. With probability p, choose a page i < j uniformly and create a link to page i
b. With probability 1-p, choose a page i < j with probability proportional to i's current number of in-links and create a link to i
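The copying procedure is easy to simulate. The following is a minimal sketch under stated assumptions: pages are numbered from 0 rather than 1, every page creates exactly one link, and when the chosen page i has no outgoing link (only the very first page) the new page links to i itself. The function names are made up for this illustration.

```python
import random
from collections import Counter


def copy_model(n, p, seed=0):
    """Simulate n pages; return the list of in-degrees.

    Page j picks an earlier page i uniformly at random. With probability p
    it links to i; otherwise it copies i's link (links to i's target).
    """
    rng = random.Random(seed)
    target = [None] * n   # target[j] is the page that j links to
    indeg = [0] * n
    for j in range(1, n):
        i = rng.randrange(j)                      # uniform over 0..j-1
        if target[i] is None or rng.random() < p:
            dest = i                              # link directly (assumed fallback for page 0)
        else:
            dest = target[i]                      # copy i's link: rich get richer
        target[j] = dest
        indeg[dest] += 1
    return indeg


def in_degree_fractions(indeg):
    """Map k -> fraction of pages that have exactly k in-links."""
    counts = Counter(indeg)
    return {k: c / len(indeg) for k, c in sorted(counts.items())}


indeg = copy_model(2000, p=0.2, seed=1)
fractions = in_degree_fractions(indeg)
```

Plotting `fractions` on log-log axes for a large n is one way to eyeball whether the tail is roughly a straight line.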

33 The situation in random graphs
• Nodes connected at random
• Node degrees follow a binomial distribution
• Probability of “very popular” nodes practically 0
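For reference, the binomial degree distribution can be written out, assuming the usual random-graph model in which each of the n-1 possible edges at a node is present independently with probability q (the symbols n and q are introduced here and are not on the slide):

```latex
\[
\Pr[\deg(v) = k] = \binom{n-1}{k}\, q^{k} (1-q)^{\,n-1-k},
\qquad
\mathbb{E}[\deg(v)] = (n-1)\,q
\]
```

For k far above the mean, this probability shrinks much faster than any fixed power of 1/k, which is why very popular nodes are practically absent.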

34 Communities (a.k.a. clusters/modules)
• Community structure: the organization of vertices in clusters, with “many” edges joining vertices of the same community and “relatively few” edges joining different communities
• Communities often represent sets of actors sharing similar properties or roles.

35 Community detection

36 Why is community detection important?
• A community “summarizes” a group of actors and is relatively easy to visualize and understand
• A partition into communities reveals high-level domain structure
• May reveal important properties without compromising individuals' privacy

37 Community detection applications
• Clustering web clients with geographical proximity and similar access patterns
  o Cache servers positioning [Krishnamurthy & Wang, SIGCOMM 2000]
• Clustering customers with similar interests
  o Recommendation systems [Reddy et al., DNIS 2002]
• Analysing structural positions
  o Identifying central actors and inter-community mediators
• Follow political trends
• Detect malicious actors (e.g. spammers)
• …

39 Community detection

40 “Edge-betweenness” based detection
• A divisive method (as opposed to agglomerative methods)
• Look for an edge that is most “between” pairs of nodes
  o Responsible for connecting many pairs
• Remove the edge and recalculate
Newman and Girvan. Finding and evaluating community structure in networks, 2003

41 Shortest-path betweenness
• Compute all-pairs shortest paths
• For each edge, compute the number of such paths it belongs to
• Remove a maximum-weight edge
Repeat until no edges remain (more on this later)
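The steps above can be sketched in plain Python. This is an illustration under assumptions, not the authors' implementation: the graph is undirected and unweighted with no self-loops, edge betweenness is accumulated with a BFS from every node (Brandes-style), and when a pair has several shortest paths the pair's unit of credit is split equally among them. The example graph (two triangles joined by one edge) and all function names are hypothetical.

```python
from collections import deque


def edge_betweenness(adj):
    """Shortest-path betweenness of every edge; keys are frozenset({u, v})."""
    score = {frozenset((u, v)): 0.0 for u in adj for v in adj[u]}
    for s in adj:
        # BFS from s: distances, shortest-path counts, predecessors.
        dist, sigma, preds, order = {s: 0}, {s: 1.0}, {s: []}, []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w], sigma[w], preds[w] = dist[v] + 1, 0.0, []
                    queue.append(w)
                if dist[w] == dist[v] + 1:        # v lies on a shortest path to w
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, pushing credit onto edges.
        delta = {v: 0.0 for v in order}
        for w in reversed(order):
            for v in preds[w]:
                credit = sigma[v] / sigma[w] * (1.0 + delta[w])
                score[frozenset((v, w))] += credit
                delta[v] += credit
    # Every unordered pair was counted once from each endpoint.
    return {e: b / 2.0 for e, b in score.items()}


def components(adj):
    """Connected components as a list of node sets."""
    seen, comps = set(), []
    for s in adj:
        if s in seen:
            continue
        comp, queue = {s}, deque([s])
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if w not in comp:
                    comp.add(w)
                    queue.append(w)
        seen |= comp
        comps.append(comp)
    return comps


def split_once(adj):
    """Remove max-betweenness edges, recalculating each time, until the graph splits."""
    adj = {v: set(ns) for v, ns in adj.items()}   # work on a copy
    start = len(components(adj))
    while len(components(adj)) == start:
        scores = edge_betweenness(adj)
        if not scores:
            break                                  # no edges left to remove
        u, v = max(scores, key=scores.get)
        adj[u].discard(v)
        adj[v].discard(u)
    return components(adj)


# Hypothetical example: triangles {0,1,2} and {3,4,5} joined by the edge 2-3.
barbell = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
scores = edge_betweenness(barbell)
parts = split_once(barbell)
```

Calling `split_once` again on each resulting part would continue the divisive process and yield the nested partitions that a dendrogram displays.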

42-56 Shortest-path betweenness: an example (figure sequence on a 10-node graph with nodes 0 to 9). Edge values shown on successive slides: 24, 9, 3, then 1. In the last slides the node labels appear in two groups: 6, 7, 8, 9 and 0, 1, 2, 3, 4, 5.

57 Shortest-path betweenness: an example
What if there are several shortest paths? (figure with edge values, including a fractional value of 2.5)

58 Dendrograms (hierarchical trees)
• A dendrogram (hierarchical tree) illustrates the output of hierarchical clustering algorithms
• Leaves represent graph nodes; the top represents the original graph
• As we move down the tree, larger communities are partitioned into smaller ones

59-71 Shortest-path betweenness: an example (the same figure sequence on the 10-node graph, shown next to a dendrogram whose leaves are the graph nodes). Edge values shown on successive slides: 24, 9, 3, then 1.

72 Evaluation: computer-generated networks
• A large number of graphs with 128 nodes and 4 communities of 32 nodes each
• Probability p_in for intra-community edges
• Probability p_ext for inter-community edges
• Chosen such that the expected vertex degree is 16
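A generator for such benchmark graphs can be sketched as follows. This is an illustration under assumptions: the edge probabilities are derived from the expected numbers of intra- and inter-community neighbours, here called z_in and z_out, via p_in = z_in / 31 and p_ext = z_out / 96, so that z_in + z_out = 16 gives an expected degree of 16. The function name and parameters are made up for this example.

```python
import random


def planted_partition(z_in, z_out, groups=4, size=32, seed=0):
    """Return (edges, community) for a random graph with planted communities.

    z_in  : expected number of neighbours inside a node's own community
    z_out : expected number of neighbours in other communities
    """
    rng = random.Random(seed)
    n = groups * size
    p_in = z_in / (size - 1)       # 31 possible intra-community neighbours
    p_ext = z_out / (n - size)     # 96 possible inter-community neighbours
    community = [v // size for v in range(n)]
    edges = []
    for u in range(n):
        for v in range(u + 1, n):
            p = p_in if community[u] == community[v] else p_ext
            if rng.random() < p:
                edges.append((u, v))
    return edges, community


edges, community = planted_partition(z_in=12, z_out=4)
mean_degree = 2 * len(edges) / len(community)
```

Raising z_out while keeping z_in + z_out = 16 blurs the planted communities, which makes the detection task harder.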

73 Results (for 64-node networks): z_in = 6, z_out = 2

74 Evaluation: the Zachary karate club

75 Results on Zachary club network
Figure panels: shortest-path; shortest-path, no recalculation.
• The shortest-path 2-communities partition missed just a single person!
• Recalculation of betweenness is essential

76 Quality functions
• Hierarchical clustering algorithms create numerous partitions
• In general, we do not know how many communities we should seek. How will we know that our clustering is “good”?
We need a quality function

77 The modularity quality function
Newman and Girvan. Finding and evaluating community structure in networks, 2003
• No communities in random graphs
  o Equal probabilities for all edges
• Check how far intra-community and inter-community densities are from those you would expect in a random graph with the same nodes and the same degree distribution

78 The modularity quality function
Clauset, Newman and Moore. Finding community structure in very large networks, 2004
Labels on the formula: modularity value; number of edges; graph adjacency matrix; degrees of the node pair; probability of an edge if degrees are fixed and edges are placed at random; in-same-cluster indicator variable.
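The labelled quantities correspond to the commonly quoted form of modularity, Q = (1/2m) Σ_ij [A_ij - k_i k_j / 2m] δ(c_i, c_j), where m is the number of edges, A the adjacency matrix, k_i the degree of node i, and δ is 1 when i and j are in the same cluster. The sketch below computes it directly from that sum, assuming an undirected, unweighted graph without self-loops; the function name and the example graph are hypothetical.

```python
def modularity(adj, community):
    """Modularity Q of a partition.

    adj       : dict node -> set of neighbours (undirected, unweighted)
    community : dict node -> community label
    """
    two_m = sum(len(ns) for ns in adj.values())   # 2m = sum of all degrees
    q = 0.0
    for i in adj:
        for j in adj:
            if community[i] != community[j]:
                continue                           # indicator delta(c_i, c_j) is 0
            a_ij = 1.0 if j in adj[i] else 0.0     # adjacency matrix entry
            expected = len(adj[i]) * len(adj[j]) / two_m   # k_i * k_j / 2m
            q += a_ij - expected
    return q / two_m


# Hypothetical example: two triangles joined by the single edge 2-3.
barbell = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
by_triangle = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
all_in_one = {v: "A" for v in barbell}

q_split = modularity(barbell, by_triangle)
q_whole = modularity(barbell, all_in_one)
```

Putting every node in one community gives Q = 0, so a positive value for the two-triangle split indicates more intra-community edges than the degree-preserving random baseline would predict.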

79 Computer-generated networks: modularity
Modularity maximized at the correct partition

80 Zachary club network: modularity
One of two local maxima at the correct partition

81 Community detection

82 Clique Percolation Method (CPM)
• Input: a parameter k, and a network
• Procedure:
  o Find all cliques of size k in the given network
  o Construct a clique graph: two cliques are adjacent if they share k-1 nodes
  o Each connected component in the clique graph forms a community
Slide based on “Social Media Mining, an Introduction”, R. Zafarani, M. A. Abbasi & H. Liu.

83 Clique Percolation Method: an example
Cliques of size 3: {1, 2, 3}, {3, 4, 5}, {4, 5, 7}, {4, 5, 6}, {4, 6, 7}, {5, 6, 7}, {6, 7, 8}, {8, 9, 10}
Communities: {1, 2, 3}, {8, 9, 10}, {3, 4, 5, 6, 7, 8}
Slide based on “Social Media Mining, an Introduction”, R. Zafarani, M. A. Abbasi & H. Liu.

84 Clique Percolation Method: an example
Reveals overlapping community structure
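The CPM procedure can be sketched in a few lines of Python. Two assumptions are worth flagging: the edge list below is reconstructed from the size-3 cliques listed in the example (the figure itself is not in the transcript), and cliques are found by brute-force enumeration, which is only practical for small graphs. The function name is made up for this illustration.

```python
from itertools import combinations


def clique_percolation(adj, k):
    """Return (k-cliques, communities) following the CPM procedure."""
    nodes = sorted(adj)
    # Step 1: every k-subset whose members are pairwise adjacent is a clique.
    cliques = [frozenset(c) for c in combinations(nodes, k)
               if all(v in adj[u] for u, v in combinations(c, 2))]

    # Steps 2-3: union-find over cliques; merge cliques sharing k-1 nodes.
    parent = list(range(len(cliques)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    for a, b in combinations(range(len(cliques)), 2):
        if len(cliques[a] & cliques[b]) == k - 1:
            parent[find(a)] = find(b)

    # Each connected component of the clique graph becomes one community.
    groups = {}
    for idx, clique in enumerate(cliques):
        groups.setdefault(find(idx), set()).update(clique)
    return cliques, list(groups.values())


# Edge list assumed from the cliques listed in the example.
edge_list = [(1, 2), (1, 3), (2, 3), (3, 4), (3, 5), (4, 5), (4, 6), (4, 7),
             (5, 6), (5, 7), (6, 7), (6, 8), (7, 8), (8, 9), (8, 10), (9, 10)]
adj = {}
for u, v in edge_list:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

cliques, communities = clique_percolation(adj, k=3)
```

Nodes 3 and 8 each end up in two communities, which is the overlap the example points out.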

