
Connectivity & Cohesion Overview Background: Small World = Connected What distinguishes simple connection from cohesion? Moody & White Bearman, Faris &


1 Connectivity & Cohesion
Overview
Background: Small World = Connected
What distinguishes simple connection from cohesion?
Moody & White
Bearman, Faris & Moody -> Narrative Networks
Powell et al -> Biotech firm network evolution
Methods: Identify components and bicomponents

2 Across a large number of substantive settings, Barabási points out that the distribution of network involvement (degree) is highly and characteristically skewed. Scale-Free Networks Measuring Networks: Large-Scale Models

3 Many large networks are characterized by a highly skewed distribution of the number of partners (degree) Scale Free Networks Measuring Networks: Large-Scale Models

4 Many large networks are characterized by a highly skewed distribution of the number of partners (degree) Scale Free Networks Measuring Networks: Large-Scale Models

5 The scale-free model focuses on the distance-reducing capacity of high-degree nodes: Scale Free Networks Measuring Networks: Large-Scale Models

6 The scale-free model focuses on the distance-reducing capacity of high-degree nodes, as ‘hubs’ create shortcuts that carry network flow. Scale Free Networks Measuring Networks: Large-Scale Models

7 Colorado Springs High-Risk (Sexual contact only) Network is approximately scale-free, with an exponent of -1.3, but connectivity does not depend on the hubs. Scale Free Networks Measuring Networks: Large-Scale Models

8 Connectivity and the Small World
Started by asking the probability that any two people would know each other.
Extended to the probability that people could be connected through paths of 2, 3, …, k steps.
Linked to diffusion processes: If people can reach others, then their diseases can reach them as well, and we can use the structure of the network to model the disease.
The reachability structure was captured by comparing curves with a random network, which we will do later today.

9 Connectivity and the Small World Travers and Milgram’s work on the small world is responsible for the standard belief that “everyone is connected by a chain of about 6 steps.” Two questions: Given what we know about networks, what is the longest path (defined by handshakes) that separates any two people? Is 6 steps a long distance or a short distance?

10 Longest Possible Path: Two hermits on opposite sides of the country. About 12-13 steps.
[Figure: chain running from the OH Hermit to the Mt. Hermit through store owners, truck drivers, managers, corporate managers, corporate presidents, and congressional representatives.]

11 What if everyone maximized structural holes? Associates do not know each other: Results in an exponential growth curve. Reach entire planet quickly.

12 What if people know each other randomly? Random graph theory shows us that we could reach people quite quickly if ties were random.

13 Random Reachability: By number of close friends

14 Milgram’s test: Send a packet from sets of randomly selected people to a stockbroker in Boston. Experimental Setup: Arbitrarily select people from 3 pools: a) People in Boston b) Random in Nebraska c) Stockholders in Nebraska

15 Milgram’s Findings: Distance to target person, by sending group.

16 Most chains found their way through a small number of intermediaries. What do these two findings tell us of the global structure of social relations?

17 Milgram’s Findings: Length of completed chains

18 Duncan Watts: Networks, Dynamics and the Small-World Phenomenon Asks why we see the small world pattern and what implications it has for the dynamical properties of social systems. His contribution is to show that globally significant changes can result from locally insignificant network change.

19 Duncan Watts: Networks, Dynamics and the Small-World Phenomenon
Watts says there are 4 conditions that make the small world phenomenon interesting:
1) The network is large: O(billions).
2) The network is sparse: people are connected to a small fraction of the total network.
3) The network is decentralized: no single star (or small number of stars).
4) The network is highly clustered: most friendship circles are overlapping.

20 Duncan Watts: Networks, Dynamics and the Small-World Phenomenon
Formally, we can characterize a graph through 2 statistics.
1) The characteristic path length, L: the average length of the shortest paths connecting any two actors. (Note this only works for connected graphs.)
2) The clustering coefficient, C.
Version 1: the average local density. That is, C_v = ego-network density of node v, and C = (sum of C_v over all nodes) / n.
Version 2: the transitivity ratio. Number of closed triads divided by the number of closed and open triads.
A small world graph is any graph with a relatively small L and a relatively large C.
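
The two statistics can be sketched in a few lines of plain Python. This is a minimal illustration, not Watts's own code: the four-node example graph and the function names are hypothetical, and nodes with fewer than two neighbors are assumed to contribute a local density of 0.

```python
from collections import deque
from itertools import combinations

def path_length(adj):
    """Characteristic path length L: mean shortest-path distance over
    all ordered pairs of distinct nodes (assumes a connected graph)."""
    total, pairs = 0, 0
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:                      # breadth-first search from s
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(d for node, d in dist.items() if node != s)
        pairs += len(dist) - 1
    return total / pairs

def local_density_c(adj):
    """Version 1: average ego-network density.
    Assumption: nodes with fewer than 2 neighbors score 0."""
    scores = []
    for v, nbrs in adj.items():
        k = len(nbrs)
        if k < 2:
            scores.append(0.0)
            continue
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        scores.append(links / (k * (k - 1) / 2))
    return sum(scores) / len(scores)

def transitivity_c(adj):
    """Version 2: closed two-paths divided by all two-paths,
    counting each two-path at its center node."""
    closed = total = 0
    for v, nbrs in adj.items():
        for a, b in combinations(nbrs, 2):
            total += 1
            if b in adj[a]:
                closed += 1
    return closed / total

# Hypothetical example: a triangle A-B-C with a pendant node D tied to C.
graph = {"A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B", "D"}, "D": {"C"}}
L = path_length(graph)
C1 = local_density_c(graph)
C2 = transitivity_c(graph)
```

On this small graph the two versions of C differ (about 0.58 versus 0.6), which is a reminder to report which version is being used.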

21 The most clustered graph is Watts’s “Caveman” graph:

22 Duncan Watts: Networks, Dynamics and the Small-World Phenomenon
[Figure: C and L as functions of k for a Caveman graph of n=1000. Horizontal axis: Degree (k). Vertical axes: Clustering Coefficient and Characteristic Path Length.]

23 Compared to random graphs, C is large and L is long. The intuition, then, is that clustered graphs tend to have (relatively) long characteristic path lengths. But the small world phenomenon rests on just the opposite: high clustering and short path distances. How is this so? Duncan Watts: Networks, Dynamics and the Small-World Phenomenon

24 A model for pair formation, as a function of mutual contacts. Duncan Watts: Networks, Dynamics and the Small-World Phenomenon
Using this equation, varying α produces networks that range from completely ordered to completely random. (M_ij is the number of friends i and j have in common, p is a baseline probability of a tie, and k is the average degree of the graph.)

25 A model for pair formation, as a function of mutual contacts formations. Duncan Watts: Networks, Dynamics and the Small-World Phenomenon

26 C=Large, L is Small = SW Graphs

27 Why does this work?
The key is the fraction of shortcuts in the network.
In a highly clustered, ordered network, a single random connection will create a shortcut that lowers L dramatically.
Watts demonstrates that small world graphs occur in graphs with a small number of shortcuts.

28 Empirical Examples
1) Movie network (actors through movies): L_o/L_r = 1.22, C_o/C_r = 2925
2) Western Power Grid: L_o/L_r = 1.50, C_o/C_r = 16
3) C. elegans: L_o/L_r = 1.17, C_o/C_r = 5.6

29 What are the substantive implications? Return to the initial interest in connectivity: disease diffusion 1) Diseases move more slowly in highly clustered graphs (fig. 11) - not a new finding. 2) The dynamics are very non-linear -- with no clear pattern based on local connectivity. Implication: small local changes (shortcuts) can have dramatic global outcomes (disease diffusion)

30 Duncan Watts: Networks, Dynamics and the Small-World Phenomenon
How do we know if an observed graph fits the SW model?
Random expectations: For basic one-mode networks (such as acquaintance nets), we can get approximate random values for L and C as:
L_random ~ ln(n) / ln(k)
C_random ~ k / n
as k and n get large. Note that C_random essentially approaches zero as n increases with k held fixed. This formula uses the density-based measure of C.
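
As a quick check on these approximations, the sketch below computes the random expectations and the observed-to-random ratios used in the empirical examples. The function names and the sample values (n = 1000, k = 10, and the "observed" L and C) are hypothetical, chosen only for illustration.

```python
import math

def random_expectations(n, k):
    """Approximate L and C for a random graph with n nodes and mean degree k."""
    l_random = math.log(n) / math.log(k)
    c_random = k / n
    return l_random, c_random

def small_world_ratios(l_obs, c_obs, n, k):
    """Return (L_observed / L_random, C_observed / C_random)."""
    l_random, c_random = random_expectations(n, k)
    return l_obs / l_random, c_obs / c_random

# Hypothetical network: 1000 nodes, mean degree 10.
l_rand, c_rand = random_expectations(1000, 10)
# Hypothetical observed values: L = 3.6, C = 0.45.
l_ratio, c_ratio = small_world_ratios(3.6, 0.45, 1000, 10)
```

A small world signature is an L ratio close to 1 together with a C ratio well above 1, as in the three empirical examples listed earlier.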

31 Duncan Watts: Networks, Dynamics and the Small-World Phenomenon
How do we know if an observed graph fits the SW model?
One problem with using the simple formulas for most extant data on large graphs is that, because the data result from people overlapping in groups/movies/publications, necessary clustering results from the assignment to groups.

          G1  G2  G3  G4  G5
Amy        1   0   1   0   0
Billy      0   1   0   1   0
Charlie    0   1   0   1   0
Debbie     1   0   0   0   0
Elaine     1   0   1   0   1
Frank      0   1   0   1   0
George     0   1   0   1   0
.... LINES CUT .....
William    0   1   0   0   0
Xavier     0   1   0   1   0
Yolanda    1   0   1   0   0
Zanfir     0   1   1   1   1
Total     12  14   9  14   5

32 Duncan Watts: Networks, Dynamics and the Small-World Phenomenon
How do we know if an observed graph fits the SW model?
Newman, M. E. J.; Strogatz, S. H.; and Watts, D. J. “Random graphs with arbitrary degree distributions and their applications.” Phys. Rev. E, 2001.
This paper extends the formulas for expected clustering and path length using a generating functions approach, making it possible to calculate E(C,L) for graphs with any degree distribution. Importantly, this procedure also makes it possible to account for clustering in a two-mode graph caused by the distribution of assignment to groups.

33 Duncan Watts: Networks, Dynamics and the Small-World Phenomenon
How do we know if an observed graph fits the SW model?
Newman, M. E. J.; Strogatz, S. H.; and Watts, D. J. “Random graphs with arbitrary degree distributions and their applications.” Phys. Rev. E, 2001.
Where N is the size of the graph, z_1 is the average number of people 1 step away (degree), and z_2 is the average number of people 2 steps away.
Theoretically, these formulas can be used to calculate many properties of the network, including largest component size, based on degree distributions.
A word of warning: The math in these papers is not simple. Sharpen your calculus pencil before reading the paper…
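
The formula itself did not survive in this transcript. For orientation, the path length approximation from the cited paper, which holds when N is much larger than z_1 and z_2 is much larger than z_1, is usually written as below. Treat it as the presumed content of the slide rather than a verified copy.

```latex
\ell \approx \frac{\ln\left(N / z_1\right)}{\ln\left(z_2 / z_1\right)} + 1
```

For a purely random graph z_2 is approximately z_1 squared, in which case the expression reduces to ln(N) / ln(z_1), the same form as the simple L_random approximation given earlier in the deck.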

34 Duncan Watts: Networks, Dynamics and the Small-World Phenomenon How do we know if an observed graph fits the SW model? Since C is just the transitivity ratio, there are a number of good formulas for calculating the expected value. Using the ratio of complete to (incomplete + complete) triads, we can use the expected values from the triad distribution in PAJEK for a simple graph or we can use the expected value conditional on the dyad types (if we have directed data) using the formulas in SPAN and W&F.

35 The line of work most closely related to the small world is that on biased and random networks. Recall the reachability curves in a random graph:
[Figure: Random Reachability, by number of close friends. Vertical axis: Percent Contacted (0 to 100%). Horizontal axis: Remove (0 to 15). Curves for Degree = 2, Degree = 3, Degree = 4.]

36 For a random network, we can estimate the trace curves with the following equation: Where P i is the proportion of the population newly contacted at step i, X i is the cumulative number contacted by step i, and a is the mean number of contacts people have. This model describes the reach curves for a random network. The model is based on a, which (essentially) tells us how many new people we will reach from the new people we just contacted. This is based on the assumption that people’s friends know each other at a simple random rate.
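
The equation is missing from this transcript. A minimal sketch follows, assuming it is the standard random-net recursion P_(i+1) = (1 - X_i)(1 - exp(-a * P_i)), with X_i treated as the cumulative proportion contacted. The function name, the seed proportion, and the number of steps are hypothetical.

```python
import math

def reach_curve(a, p0, steps):
    """Cumulative proportion contacted at each step in a random network.

    a     : mean number of contacts per person
    p0    : proportion of the population contacted at step 0 (the seed)
    steps : number of further steps to trace
    Assumed recursion: P_next = (1 - X) * (1 - exp(-a * P)).
    """
    newly, cumulative = p0, p0
    curve = [cumulative]
    for _ in range(steps):
        newly = (1 - cumulative) * (1 - math.exp(-a * newly))
        cumulative += newly
        curve.append(cumulative)
    return curve

# Compare reach for people with 2, 3, and 4 contacts, starting from 0.1% seeded.
curves = {a: reach_curve(a, 0.001, 15) for a in (2, 3, 4)}
```

Printing `curves[2][-1]`, `curves[3][-1]`, and `curves[4][-1]` shows how the final share reached rises with a, which is the pattern in the random reachability figure.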

37 For a real network, people’s friends are not random, but clustered. We can modify the random equation by adjusting a, such that some portion of the contacts are random, the rest not. This adjustment is a ‘bias’ (i.e., a non-random element in the model) that gives rise to the notion of ‘biased networks’.
People have studied (mathematically) biases associated with:
Race (and categorical homophily more generally)
Transitivity (friends of friends are friends)
Reciprocity (i -> j, j -> i)
There is still a great deal of work to be done in this area empirically, and it promises to be a good way of studying the structure of very large networks.

38 Figure 1. Connectivity Distribution for a large Jr. High School (Add Health data) Random graph Observed

39 How useful are C & L for characterizing a network? These two graphs both have high C

40 Uzzi & Spiro: Small worlds on Broadway

41

42

43

44

45 Figure 1. Q over time, from descriptive table.

46 Uzzi & Spiro: Small worlds on Broadway Components: CC Ratio

47 Uzzi & Spiro: Small worlds on Broadway Beware ratio of ratios (of ratios!)

48 To calculate Average Path Length and Clustering in UCINET: Clustering Coefficient
1) Load the network.
2) To stay consistent with Watts, make the network symmetric: Transform > Symmetrize > Maximum. Note the name under which you saved the graph.
3) Calculate the clustering coefficient: Network > Network Properties > Clustering Coefficient.
The local density version is the “overall clustering coefficient”.
The transitivity version is the “weighted clustering coefficient”.

49 To calculate Average Path Length and Clustering in UCINET: Average Length
1) Load the network.
2) To stay consistent with Watts, make the network symmetric: Transform > Symmetrize > Maximum. Note the name under which you saved the graph.
3) Calculate distance: Network > Cohesion > Distance, then Tools > Statistics > Univariate > Matrix.

50 Connectivity & Cohesion Background: 1) Durkheim: What is social solidarity? 2) Simmel: Dyad and Triad 3) Small world: What does it mean to be connected? 4) Can we move beyond small-group ideas of cohesion?

51 Connectivity & Cohesion What are the essential elements of solidarity? 1) Ideological: Common Consciousness 2) Relational: Structural Cohesion Groups that are ‘held together’ well Groups should have ‘connectedness’ cohesion = a “field of forces” that keep people in the group “resistance of the group to disruptive forces” “sticking together”

52 Connectivity & Cohesion Analytically, most of these definitions & operationalizations of cohesion do not distinguish the social fact of cohesion from the psychological or behavior outcomes resulting from cohesion. Def. 1: “A collectivity is cohesive to the extent that the social relations of its members hold it together.” What network pattern embodies all the elements of this intuitive definition?

53 Connectivity & Cohesion
This definition contains 5 essential elements:
1. Focuses on what holds the group together
2. Expressed as a group-level property
3. The conception is continuous
4. Rests on observable social relations
5. Applies to groups of any size

54 Connectivity & Cohesion 1) Actors must be connected: a collection of isolates is not cohesive. Not cohesive Minimally cohesive: a single path connects everyone

55 Connectivity & Cohesion 1) Reachability is an essential element of relational cohesion. As more paths re-link actors in the group, the ability to ‘hold together’ increases. Cohesion increases as # of paths connecting people increases The important feature is not the density of relations, but the pattern.

56 Connectivity & Cohesion Consider the minimally cohesive group: D = .25. Moving a line keeps density constant, but changes reachability.

57 Connectivity & Cohesion What if density increases, but through a single person? D = .25, D = .39. Removal of 1 person destroys the group.

58 Connectivity & Cohesion Cohesion increases as the number of independent paths in the network increases. Ties through a single person are minimally cohesive. D = .39: Minimal cohesion. D = .39: More cohesive.

59 Connectivity & Cohesion
Substantive differences between networks connected through a single actor and those connected through many.

Minimally Cohesive                 Strongly Cohesive
Power is centralized               Power is decentralized
Information is concentrated        Information is distributed
Expect actor inequality            Actor equality
Vulnerable to unilateral action    Robust to unilateral action
Segmented structure                Even structure

Def 2. “A group is structurally cohesive to the extent that multiple independent relational paths among all pairs of members hold it together.”

60 Networks are structurally cohesive if they remain connected even when nodes are removed. Social Cohesion Measuring Networks: Large-Scale Models
[Figure: example networks with node connectivity 0, 1, 2, and 3.]

61 Connectivity & Cohesion Formalize the argument: If there is a path between every pair of nodes in a graph, the graph is connected, and called a component. In every component, the paths linking actors i and j must pass through a set of nodes, S, that if removed would disconnect the graph. The number of nodes in the smallest S is equal to the number of independent paths connecting i and j.

62 Connectivity & Cohesion 1 2 5 4 3 6 8 7 Components and cut-sets: Every path from 1 to 8 must go through 4. S(1,8) = 4, and N(1,8)=1. That is, the graph is a component.

63 Connectivity & Cohesion 1 2 5 4 3 6 8 7 Components and cut-sets: In this graph, there are multiple paths connecting nodes 1 and 8. 1 2 5 8 3 6 7 8 4 6 7 8 4 5 8 But only 2 of them are independent. 1 5 8 1 2 3 6 7 8 N(1,8) = 2.

64 Connectivity & Cohesion The relation between cut-set size and number of paths leads to the two versions of our final definition: Def 3a “A group’s structural cohesion is equal to the minimum number of actors who, if removed from the group, would disconnect the group.” Def 3b “A group’s structural cohesion is equal to the minimum number of independent paths linking each pair of actors in the group.” These two definitions are equivalent.
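
Definition 3a can be checked directly on small graphs by brute force: try every set of 1, 2, 3, ... actors and see whether removing it disconnects the group. The sketch below does exactly that. It is only practical for tiny networks, and the example graphs and function names are hypothetical.

```python
from itertools import combinations

def is_connected(adj, removed=frozenset()):
    """True if the graph stays connected after deleting the nodes in `removed`."""
    nodes = [n for n in adj if n not in removed]
    if not nodes:
        return True
    seen, stack = {nodes[0]}, [nodes[0]]
    while stack:                          # depth-first search over surviving nodes
        u = stack.pop()
        for v in adj[u]:
            if v not in removed and v not in seen:
                seen.add(v)
                stack.append(v)
    return len(seen) == len(nodes)

def structural_cohesion(adj):
    """Smallest number of nodes whose removal disconnects the graph (Def 3a).
    By convention a clique of n nodes returns n - 1."""
    n = len(adj)
    if not is_connected(adj):
        return 0
    for size in range(1, n - 1):
        for cut in combinations(adj, size):
            if not is_connected(adj, frozenset(cut)):
                return size
    return n - 1

def from_edges(edges):
    """Build an undirected adjacency dict from a list of (u, v) ties."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

chain = from_edges([(1, 2), (2, 3), (3, 4)])            # single path
ring = from_edges([(1, 2), (2, 3), (3, 4), (4, 1)])     # cycle
bowtie = from_edges([(1, 2), (2, 3), (1, 3), (3, 4), (4, 5), (3, 5)])
clique = from_edges([(a, b) for a in range(4) for b in range(a + 1, 4)])
results = {name: structural_cohesion(g) for name, g in
           [("chain", chain), ("ring", ring), ("bowtie", bowtie), ("clique", clique)]}
```

The bow-tie (two triangles sharing one actor) is denser than the ring but less cohesive, which is the contrast between density and pattern drawn in the earlier slides.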

65 Connectivity & Cohesion
Some graph theoretic properties of k-components:
1) Every member of a k-component must have at least k ties. If a person has fewer than k ties, then there would be fewer than k paths connecting them to the rest of the network.
2) A graph where every person has k ties is not necessarily a k-component. That is, (1) does not work in reverse. Structures can have high degree, but low connectivity.
3) Two k-components can only overlap by k-1 members. If the k-components overlap by more than k-1 members, then there would be at least k paths connecting the two components, and they would be a single k-component.
4) A clique is n-1 connected.
5) k-components can be nested, such that a k+l component is contained within a k-component.

66 Connectivity & Cohesion Nested connectivity sets: An operationalization of embeddedness.
[Figure: example network with nodes numbered 1 to 23.]

67 Connectivity & Cohesion Nested connectivity sets: An operationalization of embeddedness. “Embeddedness” refers to the fact that economic action and outcomes, like all social action and outcomes, are affected by actors’ dyadic (pairwise) relations and by the structure of the overall network of relations. As a shorthand, I will refer to these as the relational and the structural aspects of embeddedness. The structural aspect is especially crucial to keep in mind because it is easy to slip into “dyadic atomization,” a type of reductionism. (Granovetter 1992:33, italics in original)

68 Connectivity & Cohesion Nested connectivity sets: An operationalization of embeddedness.
G
  {7, 8, 9, 10, 11, 12, 13, 14, 15, 16}
    {7, 8, 11, 14}
  {1, 2, 3, 4, 5, 6, 7, 17, 18, 19, 20, 21, 22, 23}
    {1, 2, 3, 4, 5, 6, 7}
    {17, 18, 19, 20, 21, 22, 23}

69 Connectivity & Cohesion Empirical Examples: a) Embeddedness and School Attachment b) Political similarity among Large American Firms

70 Connectivity & Cohesion School Attachment

71 Connectivity & Cohesion Business Political Action

72 Connectivity & Cohesion Theoretical Implications: Resource and Risk Flow Structural cohesion increases the probability of diffusion in a network, particularly if flow depends on individual behavior (as opposed to edge capacity).

73 [Figure: Probability of infection by distance and number of paths, assuming a constant p_ij of 0.6. Horizontal axis: Path distance (2 to 6). Vertical axis: probability. Curves for 1 path, 2 paths, 5 paths, and 10 paths.]
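
The shape of these curves can be reproduced with a simple calculation. The sketch below assumes (this is not stated on the slide) that each tie transmits independently with probability p, that all paths have the same length, and that the paths are node-independent, so the chance of infection is 1 - (1 - p^d)^m for m paths of length d. The function name is hypothetical.

```python
def infection_probability(p, distance, n_paths):
    """Probability that at least one of n_paths independent paths of the
    given length transmits, when each tie transmits with probability p."""
    single_path = p ** distance           # every tie on the path must transmit
    return 1 - (1 - single_path) ** n_paths

# Rows: number of paths; columns: path distance 2..6, with p fixed at 0.6.
table = {m: [round(infection_probability(0.6, d, m), 3) for d in range(2, 7)]
         for m in (1, 2, 5, 10)}
```

Under these assumptions a single path of length 2 transmits 36% of the time, while ten such paths transmit almost certainly, which is why cohesion matters more as distance grows.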

74 Connectivity & Cohesion

75

76 Theoretical Implications: Community & Class Formation Community is conceptualized as a structurally cohesive group, and class reproduction is generated by information/resource flow within that group. Power Structurally cohesive groups are fundamentally more equal than are groups dominated by relations through a single person, since nobody can monopolize resource flow.

77 Connectivity & Cohesion Blocking the Future: Uses bicomponents to identify historical cases. Argument: The Danto Problem: Sociologically, the future can always change the meaning of a past event, as new information changes the significance of a past event. Examples: 1) The battle of Wounded Knee 2) If we were to discover Clinton was from Mars 3) Battles over the meaning of historical monuments & events (such as Pearl Harbor, or dropping the bomb on Hiroshima, etc.) Not an issue just of data: An “Ideal Chronicler” would have the same problem. The problem of doing history is identifying a case: telling a convincing story, that is robust to changes in our knowledge and our understanding of relations among past events.

78 Connectivity & Cohesion Blocking the Future: Uses bicomponents to identify historical cases. Basic argument: The meaning of an event is conditioned by its position in a sequence of interrelated events. If we can capture the structure of interrelation among events, we can identify the unique features that define an historical case. We propose that multiple connectivity (here bicomponents) linking narratives provide just such a way of casing historical events.

79 Connectivity & Cohesion Blocking the Future An example: Sewell’s account of “Inventing Revolution at the Bastille”. Food problems Dispute over National Assembly France is nearly bankrupt Set of ‘crises’

80 Connectivity & Cohesion Blocking the Future The problem with these kinds of narratives, is that small changes in facts or understanding of events changes the entire flow of the narrative. Strong theories (i.e. parsimonious) generate weak structures. In contrast, we propose connecting multiple “histories” and based on the resulting pattern, induce historical cases.

81 Connectivity & Cohesion Blocking the Future The empirical setting: A small village in northern China (Liu Ling), reporting on events surrounding the communist revolution. The data: Life stories of 14 people in the village.

82 Connectivity & Cohesion Blocking the Future Kinship structure of the storytellers. Different positions in the village yield different insights into their life stories.

83 Connectivity & Cohesion Blocking the Future An example of a villager life story

84 Connectivity & Cohesion Blocking the Future Traditional summary of events in Liu Ling (condensed)

85 Connectivity & Cohesion Blocking the Future Combining all individual stories:

86 Blocking the Future Of the nearly 2000 total events, about 1500 are linked in a single component

87 Blocking the Future Of the nearly 1500 events in that component, about 500 are linked in a single bicomponent. This is our candidate for a ‘case’.

88 Blocking the Future Same figure, with dark cases being representatives of the events in the summary history of the village.

89 Blocking the Future Case Resilience to Perturbation
[Figure: two panels, Adding Edges at Random and Subtracting Edges at Random. Horizontal axis: Number of Edges Changed. Vertical axis: Adjusted Rand Statistic (0.8 to 1.0).]

90 Hybrid Model of Firm Ownership

91

92

93

94

95 Business groups nested within the core

96 Hybrid Model of Firm Ownership Core Members are multiply connected & higher in revenue

97 Hybrid Model of Firm Ownership Core Members are multiply connected & higher in revenue

98 Network Dynamics and Field Evolution Powell, White, Koput & Smith AJS 2005 Goal is to ask how biotech organizational networks grow over time, contrasting different models for network growth

99 Network Dynamics and Field Evolution Powell, White, Koput & Smith AJS 2005 Goal is to ask how biotech organizational networks grow over time, contrasting different models for network growth. H1: Preferential Attachment

100 Network Dynamics and Field Evolution Powell, White, Koput & Smith AJS 2005

101 Network Dynamics and Field Evolution Powell, White, Koput & Smith AJS 2005

102 Network Dynamics and Field Evolution Powell, White, Koput & Smith AJS 2005

103 Network Dynamics and Field Evolution Powell, White, Koput & Smith AJS 2005

104 Network Dynamics and Field Evolution Powell, White, Koput & Smith AJS 2005

105 Network Dynamics and Field Evolution Powell, White, Koput & Smith AJS 2005

106 Network Dynamics and Field Evolution Powell, White, Koput & Smith AJS 2005

107 Network Dynamics and Field Evolution Powell, White, Koput & Smith AJS 2005

108 Network Dynamics and Field Evolution Powell, White, Koput & Smith AJS 2005

109 Network Dynamics and Field Evolution Powell, White, Koput & Smith AJS 2005

110 Methods & Analysis Strategies
Node connectivity is difficult to identify in graphs. The intuition for finding node-connected sets is to identify the smallest cut-sets, then identify the chunks left over after the cut-sets are removed.
Finding cut-sets is nontrivial. The basic idea is to trick a maximum flow algorithm into identifying the nodes with the weakest flow: split each node (a, b, c, d, e) into two copies (a1 and a2, b1 and b2, and so on), make the “within”-node edges weak and the between-node arcs strong, then run a max-flow.
This tells you the size of the cut, but you also have to keep track of the identity of the cut(s). You then repeat that for all nodes, and then loop recursively within each identified remainder.
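
The node-splitting trick can be sketched for a single pair of actors as follows. This is an illustrative simplification, not the SPAN or igraph implementation: it returns only the number of node-independent paths between s and t, it does not track the identity of the cut, and it gives every between-node arc capacity 1 so that a direct tie counts as exactly one path. All names are hypothetical.

```python
from collections import defaultdict, deque

def independent_paths(adj, s, t):
    """Number of node-independent paths between s and t, found by max-flow
    on a graph where each node v is split into (v, 'in') -> (v, 'out')."""
    cap = defaultdict(int)                # residual capacity of each arc
    nbrs = defaultdict(set)               # residual-graph adjacency

    def add_arc(u, v, c):
        cap[(u, v)] += c
        nbrs[u].add(v)
        nbrs[v].add(u)                    # reverse arc, capacity 0 until used

    big = len(adj)
    for v in adj:
        # "Within"-node arc: weak (capacity 1) except for the two endpoints.
        add_arc((v, "in"), (v, "out"), big if v in (s, t) else 1)
        for w in adj[v]:
            add_arc((v, "out"), (w, "in"), 1)

    source, sink = (s, "in"), (t, "out")
    flow = 0
    while True:
        # Breadth-first search for an augmenting path in the residual graph.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v in nbrs[u]:
                if v not in parent and cap[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return flow
        v = sink
        while parent[v] is not None:      # push one unit of flow along the path
            u = parent[v]
            cap[(u, v)] -= 1
            cap[(v, u)] += 1
            v = u
        flow += 1

def from_edges(edges):
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

bowtie = from_edges([(1, 2), (2, 3), (1, 3), (3, 4), (4, 5), (3, 5)])
ring = from_edges([(i, i % 6 + 1) for i in range(1, 7)])
k4 = from_edges([(a, b) for a in range(4) for b in range(a + 1, 4)])
n_bowtie = independent_paths(bowtie, 1, 5)
n_ring = independent_paths(ring, 1, 4)
n_k4 = independent_paths(k4, 0, 1)
```

Taking the minimum of this count over all pairs gives the group's cohesion under Def 3b. The full procedure described above additionally records the cut-sets and recurses within the pieces they leave behind.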

111 Methods & Analysis Strategies
Can do it in SAS (SPAN) or igraph.
Can find pair-wise connectivity in UCINET (but pair-wise connectivity is not sufficient to identify nested sets).
The algorithm returns a fully-nested partition tree (for some examples: http://www.soc.duke.edu/~jmoody77/Prot/index.htm).
Since groups can overlap, each partition is a unique (0/1) split on the graph.

112 Methods & Analysis Strategies
Summarizing the effects of cohesion:
[Figure: nested groups with K=1 (N=10), K=2 (N=9), K=3 (N=4), K=4 (N=5).]
Pairwise connectivity matrix:

      1  2  3  4  5  6  7  8  9 10
 1    .  3  3  3  2  2  2  2  2  1
 2    3  .  3  3  2  2  2  2  2  1
 3    3  3  .  3  2  2  2  2  2  1
 4    3  3  3  .  2  2  2  2  2  1
 5    2  2  2  2  .  4  4  4  4  1
 6    2  2  2  2  4  .  4  4  4  1
 7    2  2  2  2  4  4  .  4  4  1
 8    2  2  2  2  4  4  4  .  4  1
 9    2  2  2  2  4  4  4  4  .  1
10    1  1  1  1  1  1  1  1  1  .

Average K = 2.38
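
The average can be verified from the matrix by averaging over the 45 distinct pairs. The short sketch below encodes the block structure of the matrix (nodes 1 to 4, nodes 5 to 9, and node 10); the function name is hypothetical.

```python
from itertools import combinations

def pairwise_k(i, j):
    """Pairwise connectivity for nodes i and j, read off the slide's matrix."""
    if 10 in (i, j):
        return 1                          # node 10 is tied in by a single path
    i_low, j_low = i <= 4, j <= 4
    if i_low and j_low:
        return 3                          # both in the K=3 group (nodes 1-4)
    if not i_low and not j_low:
        return 4                          # both in the K=4 group (nodes 5-9)
    return 2                              # one node in each group

pairs = list(combinations(range(1, 11), 2))
average_k = sum(pairwise_k(i, j) for i, j in pairs) / len(pairs)
```

The sum over pairs is 107, and 107 / 45 rounds to the 2.38 reported on the slide.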

113 Methods & Analysis Strategies Summarizing the effects of cohesion: Prosper Distribution of Connectivity Mean across nets

114 Methods & Analysis Strategies Summarizing the effects of cohesion: Prosper Distribution of Connectivity Mean across nets

115 Methods & Analysis Strategies Summarizing the effects of cohesion: Prosper Distribution of Connectivity Mean across nets

116 Methods & Analysis Strategies
Approximations for large networks:
a) Use k-cores: much faster, often right, but runs a risk of mistaking “star” patterns for cohesion.
b) Sample pairs of nodes and generate mean pairwise cohesion from the sample.
c) There are some new heuristic approaches that are quite fast… but not exact.
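
The risk in option (a) can be seen in a small example. The sketch below computes a k-core by repeatedly dropping nodes with fewer than k remaining ties, then applies it to a hypothetical network of two 4-person cliques that share one member, "x". Every node sits in the 3-core, yet removing "x" alone disconnects the group, so its node connectivity is only 1.

```python
def k_core(adj, k):
    """Nodes that survive repeated removal of anyone with fewer than k ties."""
    alive = set(adj)
    changed = True
    while changed:
        changed = False
        for v in list(alive):
            if sum(1 for w in adj[v] if w in alive) < k:
                alive.discard(v)
                changed = True
    return alive

def connected_without(adj, dropped):
    """True if the graph is still connected after deleting node `dropped`."""
    nodes = [n for n in adj if n != dropped]
    seen, stack = {nodes[0]}, [nodes[0]]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v != dropped and v not in seen:
                seen.add(v)
                stack.append(v)
    return len(seen) == len(nodes)

def clique(members):
    """All ordered pairs of distinct members (both directions of each tie)."""
    return {(u, v) for u in members for v in members if u != v}

# Hypothetical network: cliques {a, b, c, x} and {x, d, e, f} joined only at x.
adj = {}
for u, v in clique("abcx") | clique("xdef"):
    adj.setdefault(u, set()).add(v)

core3 = k_core(adj, 3)
survives_x = connected_without(adj, "x")
```

Here `core3` contains all seven nodes while `survives_x` is False, so a k-core screen would overstate cohesion for this structure.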

