1 Communities & Roles
Two ways of identifying nodes that “go together”:
a) Communities/Groups
   a) Cohesive subgroups literature: start w. Freeman
   b) Network operationalization
      a) Graph theoretic
      b) Heuristic algorithms
         a) Graph search & modularity
         b) Cluster analysis
         c) LDA/Principal components
      c) Fundamental limitations
b) Roles/Positions
   a) Literature grounded in structural anthropology & kinship
   b) Roles as relations imply paired sets
   c) Goal is to identify nodes with common patterns
      a) Original is CONCOR
      b) Alternatives based on triads, other clusterings

2 Social Sub-groups Lin Freeman: The sociological concept of “Group” Focus on collectivities that are: “Relatively small, informal, and involve close personal ties.” What we would call “Primary Groups” What (network) structure characterizes such a group? Goal: Identify (a) non-overlapping groups that allow one to (b) identify internal group structure.

3 Social Sub-groups Lin Freeman: The sociological concept of “Group” Winship’s Model: 1) Assign people to equivalence classes that are hierarchically nested:

4 Social Sub-groups Lin Freeman: The sociological concept of “Group” In words, this means that whatever metric you define, a person is closer to themselves than to anyone else, that the relation be symmetric, and that triads be transitive (which, given the symmetric condition, means that they be complete). You can then identify partitions by scaling the proximity, such that these three conditions are met. Winship’s Model:

5 Social Sub-groups Lin Freeman: The sociological concept of “Group” Winship’s Model:
     A  B  C  D  E  F  G  H  I  J  K
A    .  5  5  4  4  4  4  3  3  3  3
B    5  .  5  4  4  4  4  3  3  3  3
C    5  5  .  4  4  4  4  3  3  3  3
D    4  4  4  .  5  5  5  3  3  3  3
E    4  4  4  5  .  5  5  3  3  3  3
F    4  4  4  5  5  .  5  3  3  3  3
G    4  4  4  5  5  5  .  3  3  3  3
H    3  3  3  3  3  3  3  .  5  5  5
I    3  3  3  3  3  3  3  5  .  5  5
J    3  3  3  3  3  3  3  5  5  .  5
K    3  3  3  3  3  3  3  5  5  5  .

6 Social Sub-groups Lin Freeman: The sociological concept of “Group” Winship’s Model: [Figure: partition tree. The total set splits into {A-G} and {H-K}; {A-G} splits into {A-C} and {D-G}.]
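
The nested partitions above can be recovered by thresholding the proximity matrix. The following is a minimal Python sketch, not Winship's or Freeman's own procedure: it rebuilds the 11-person matrix from its block structure and, at each proximity level, groups people who are linked by chains of proximities at or above that level. The function names (`partition_at`, `proximity`) are made up for illustration.

```python
def partition_at(prox, labels, level):
    """Split nodes into groups linked by chains of proximities >= level."""
    seen, groups = set(), []
    for start in range(len(labels)):
        if start in seen:
            continue
        seen.add(start)
        stack, group = [start], []
        while stack:
            i = stack.pop()
            group.append(labels[i])
            for j in range(len(labels)):
                if j not in seen and prox[i][j] >= level:
                    seen.add(j)
                    stack.append(j)
        groups.append("".join(sorted(group)))
    return sorted(groups)

# Rebuild the example matrix: 5 within a block, 4 between the first two
# blocks, 3 between either of them and the third block.
labels = list("ABCDEFGHIJK")
block = {name: 0 for name in "ABC"}
block.update({name: 1 for name in "DEFG"})
block.update({name: 2 for name in "HIJK"})

def proximity(a, b):
    if block[a] == block[b]:
        return 5
    return 4 if max(block[a], block[b]) < 2 else 3

prox = [[proximity(a, b) for b in labels] for a in labels]
partitions = {level: partition_at(prox, labels, level) for level in (3, 4, 5)}
```

Because the matrix satisfies the symmetry and transitivity conditions, each higher level refines the partition at the level below it, which is what makes the classes hierarchically nested.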

7 Social Sub-groups Lin Freeman: The sociological concept of “Group” Granovetter’s Model: Proceed exactly as in Winship, but treat intransitivity differently when looking at strong or weak ties. If x and y are strongly connected, and y and z are strongly connected, then x and z should be at least weakly connected.
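
The G-transitivity condition can be checked mechanically on a valued network. This is a hedged Python sketch: the cut-offs `strong` and `weak` are hypothetical parameters chosen by the analyst, and the helper `valued` is made up to build a small symmetric example.

```python
from itertools import permutations

def g_intransitive_triples(tie, strong, weak):
    """List ordered triples (x, y, z) where x-y and y-z are strong
    but x-z is not even weakly connected."""
    nodes = sorted(tie)
    return [(x, y, z) for x, y, z in permutations(nodes, 3)
            if tie[x][y] >= strong and tie[y][z] >= strong and tie[x][z] < weak]

def valued(pairs, nodes):
    """Build a symmetric valued tie matrix (dict of dicts) from pair values."""
    tie = {a: {b: 0 for b in nodes} for a in nodes}
    for (a, b), value in pairs.items():
        tie[a][b] = tie[b][a] = value
    return tie

nodes = ["x", "y", "z"]
# x-z is weak but present, so the strong x-y and y-z ties are allowed.
closed = valued({("x", "y"): 5, ("y", "z"): 5, ("x", "z"): 2}, nodes)
# x-z is absent, which violates the condition.
broken = valued({("x", "y"): 5, ("y", "z"): 5}, nodes)

closed_violations = g_intransitive_triples(closed, strong=4, weak=1)
broken_violations = g_intransitive_triples(broken, strong=4, weak=1)
```

A network meets the condition at a given pair of cut-offs when the returned list is empty.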

8 Social Sub-groups Lin Freeman: The sociological concept of “Group” Granovetter’s Model: An example of a graph fitting the prohibition against G-intransitive relations.

9 Social Sub-groups The Davis - “Old South” Example

10 Social Sub-groups The Davis - “Old South” Example: Ties > 2

11 Social Sub-groups The Davis - “Old South” Example: Ties > 3

12 Social Sub-groups The Davis - “Old South” Example: Ties > 4 Meets the G-transitivity condition

13 Social Sub-groups The Davis - “Old South” Example: Ties > 5 Stronger than the G-transitivity condition

14 Social Sub-groups Lin Freeman: The sociological concept of “Group” Freeman argues that the G-intransitivity model fits the data best for each of the 7 groups he studies. Substantively, the types of groups this model predicts are very similar to those predicted by the general transitivity model, except re-cast as a valued relation. Empirically, if you want to identify groups based on levels like this, you can use PAJEK and walk through the model in just the same way as we did with “Old South” or you can use UCI-NET (or program it, it’s not hard)

15 Methods: How do we identify primary groups in a network?
A) Classic graph theoretical methods: cliques and extensions of cliques
- Cliques
- k-cores
- k-plexes
- Freeman (1992) models
- K-components (we talked about these already)
B) Algorithmic methods: search through a network trying to maximize for a particular pattern (i.e., like Frank & Yasumoto). Adjust assignment of actors to groups until a particular pattern of ties (block diagonal, usually) is identified.
Standard models:
- Factions (UCI-NET)
- KliqueFinder (Frank)
- RNM/CROWDS/JIGGLE (Moody)
- Principal component analysis (PCA)
- Flow models (MCL)
- Modularity maximization routines
- General distance & clustering methods

16 Methods: How do we identify primary groups in a network? Graph Theoretical Models. Start with a clique. A clique is defined as a maximal subgraph in which every member of the subgraph is connected to every other member of the subgraph. Cliques are collections of nodes where density = 1.0.
Properties of cliques:
- Density: 1.0
- Everyone connected to n-1 alters
- Distance between every pair is 1
- Ratio of within-group ties to between-group ties is infinite
- All triads are transitive
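
As an illustration of the clique definition, here is a minimal Python sketch that enumerates maximal cliques with the basic Bron-Kerbosch recursion (one standard approach, not necessarily what UCINET or PAJEK use) and confirms that within-clique density is 1.0. The small graph is a made-up example.

```python
def maximal_cliques(adj):
    """Enumerate maximal cliques of an undirected graph given as
    {node: set_of_neighbours} using basic Bron-Kerbosch."""
    found = []

    def expand(current, candidates, excluded):
        if not candidates and not excluded:
            found.append(sorted(current))  # nothing can extend it: maximal
            return
        for v in sorted(candidates):
            expand(current | {v}, candidates & adj[v], excluded & adj[v])
            candidates = candidates - {v}
            excluded = excluded | {v}

    expand(set(), set(adj), set())
    return sorted(found)

def density(adj, members):
    """Share of possible within-group ties that are present (needs >= 2 members)."""
    members = set(members)
    n = len(members)
    ties = sum(len(adj[v] & members) for v in members)  # counts each tie twice
    return ties / (n * (n - 1))

# Hypothetical example: a triangle 1-2-3 with a pendant node 4 attached to 3.
adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
cliques = maximal_cliques(adj)
clique_densities = [density(adj, c) for c in cliques]
```

Note that node 3 belongs to both cliques, a small-scale version of the overlap problem that makes raw clique lists hard to use on real data.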

17 Methods: How do we identify primary groups in a network? Graph Theoretical Models. In practice, complete cliques are not very useful. They tend to overlap heavily and are limited in their size. Graph theorists have thus relaxed the complete connectivity requirement (with varying degrees of success). See the Moody & White paper on cohesion for a discussion of many of these attempts.

18 Methods: How do we identify primary groups in a network? Graph Theoretical Models. k-cores: every person is connected to at least k other people in the group. Ideally, they would look something like this (here two 3-cores). However, adding a single tie from A to B would make the whole graph a 3-core.
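
The k-core idea, and its sensitivity to a single bridging tie, can be sketched in Python by repeatedly pruning nodes with fewer than k remaining ties. The graph below is a hypothetical stand-in for the figure (two complete groups of four plus one hanger-on), and the helper names are made up.

```python
def k_core(adj, k):
    """Return the k-core as {node: neighbours}: prune nodes with degree < k
    until every remaining node has at least k remaining ties."""
    core = {v: set(nbrs) for v, nbrs in adj.items()}
    changed = True
    while changed:
        changed = False
        for v in list(core):
            if len(core[v]) < k:
                for u in core[v]:
                    core[u].discard(v)
                del core[v]
                changed = True
    return core

def components(adj):
    """Connected components of an undirected graph."""
    seen, comps = set(), []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        comp, stack = set(), [start]
        while stack:
            v = stack.pop()
            comp.add(v)
            for u in adj[v]:
                if u not in seen:
                    seen.add(u)
                    stack.append(u)
        comps.append(comp)
    return comps

def complete(nodes):
    """Complete graph on the given nodes."""
    return {v: set(nodes) - {v} for v in nodes}

graph = {**complete(["a1", "a2", "a3", "a4"]), **complete(["b1", "b2", "b3", "b4"])}
graph["p"] = {"a1"}          # a pendant node that is not in any 3-core
graph["a1"].add("p")
before = components(k_core(graph, 3))   # two separate 3-cores

graph["a1"].add("b1")        # add one tie between the two groups
graph["b1"].add("a1")
after = components(k_core(graph, 3))    # now a single connected 3-core
```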

19 Methods: How do we identify primary groups in a network? Graph Theoretical Models. Extensions of this idea include:
K-plex: Every member is connected to at least n-k other people in the group (recall that in a clique everyone is connected to n-1, so this relaxes that condition).
n-clique: Every pair of members is connected by a path of length n or less (recall that a clique has distance = 1).
n-clan: Same as an n-clique, but all paths must be inside the group.
I’ve never had much luck with any of these methods empirically. Real data is usually too messy to work well. You should try them, and gain some intuition for yourself. The place to start is in UCINET.
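
To build intuition for these relaxations, the two membership conditions can be written as short Python checks. This sketch only tests whether a given node set satisfies the condition; it does not search for maximal sets, and the four-node cycle is a made-up example.

```python
from collections import deque

def is_k_plex(adj, members, k):
    """Every member is tied to at least (group size - k) other members."""
    members = set(members)
    return all(len(adj[v] & members) >= len(members) - k for v in members)

def distances_from(adj, source):
    """Breadth-first path lengths from source over the whole graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for u in adj[v]:
            if u not in dist:
                dist[u] = dist[v] + 1
                queue.append(u)
    return dist

def is_n_clique(adj, members, n):
    """Every pair of members is within n steps (paths may leave the group)."""
    members = list(members)
    for s in members:
        dist = distances_from(adj, s)
        if any(dist.get(t, float("inf")) > n for t in members):
            return False
    return True

# Hypothetical example: a 4-cycle a-b-c-d-a.
square = {"a": {"b", "d"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"a", "c"}}
```

In the cycle each person has two of the three possible ties, so the set fails the clique (1-plex) condition but passes as a 2-plex and as a 2-clique.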

20 Methods: How do we identify primary groups in a network? UCINET will compute all of the best-known graph theoretic treatments for subgroups Graph Theoretical Models.

21 Methods: How do we identify primary groups in a network? Consider running different methods on a known group structure: Graph Theoretical Models.

22 Methods: How do we identify primary groups in a network? Graph Theoretical Models.

23 Methods: How do we identify primary groups in a network? Cliques Graph Theoretical Models.

24 Methods: How do we identify primary groups in a network? The only way to get something meaningful from this is to analyze the clique overlap matrix, which is what the “Clique by partion” dataset does, using cluster analysis Cliques

25 Methods: How do we identify primary groups in a network? Heuristic strategies for identifying primary groups:
Search: 1) Fit measure: Identify a measure of groupness (usually a function of the number of ties that fall within groups compared to the number of ties that fall between groups). 2) Algorithm to maximize fit: Once we have the index, we need a clever method for searching through the network to maximize the fit.
Destroy: Break apart the network in strategic ways, removing the weakest parts first; what’s left are your primary groups. See “edge betweenness” and “MCL”.
Evade: Don’t look directly; instead find a simpler problem that correlates. Examples: generalized cluster analysis, factor analysis, RNM.

26 Methods: How do we identify primary groups in a network? Search: Optimize a partition to fit. Segregation Index (Freeman, L. C. 1978. "Segregation in Social Networks." Sociological Methods and Research 6:411-30.) Freeman asked how we could identify segregation in a social network. Theoretically, he argues, if a given attribute (group label) does not matter for social relations, then relations should be distributed randomly with respect to the attribute. Thus, the difference between the number of cross-group ties expected by chance and the number observed measures segregation.

27 Consider the (hypothetical) network below. There are two attributes in this network: people with Blue eyes and Brown eyes and people who are square or not (they must be hip). Methods: How do we identify primary groups in a network? Search: Optimize a partition to fit

28 Methods: How do we identify primary groups in a network? Search: Optimize a partition to fit
Segregation Index
Mixing Matrix (eye color):
         Blue  Brown
Blue        6     17
Brown      17     16
Seg = -0.25
Mixing Matrix (hip vs. square):
          Hip  Square
Hip        20       3
Square      3      30
Seg = 0.78
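
A rough Python sketch of the index for the hip/square matrix follows. It assumes the simplest chance model, in which the expected number of cross-group ties is the total number of ties times the share of possible dyads that cross groups, and it takes the group sizes (6 hip, 9 square) reported elsewhere in this deck. Under that assumption the result is about 0.79, close to but not identical to the 0.78 shown, so the original calculation presumably uses a slightly different expectation.

```python
def segregation_index(mixing, sizes):
    """(expected cross-group ties - observed cross-group ties) / expected,
    with the expectation taken from overall density (an assumed chance model)."""
    groups = list(sizes)
    n = sum(sizes.values())
    total_ties = sum(mixing[g][h] for g in groups for h in groups)
    observed_cross = sum(mixing[g][h] for g in groups for h in groups if g != h)
    possible = n * (n - 1)  # ordered pairs, matching the directed tie counts
    possible_cross = sum(sizes[g] * sizes[h] for g in groups for h in groups if g != h)
    expected_cross = total_ties * possible_cross / possible
    return (expected_cross - observed_cross) / expected_cross

hip_mixing = {"hip": {"hip": 20, "square": 3}, "square": {"hip": 3, "square": 30}}
hip_sizes = {"hip": 6, "square": 9}
seg_hip = segregation_index(hip_mixing, hip_sizes)
```

The eye-color matrix is not recomputed here because the number of blue-eyed and brown-eyed people is not given in the text.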

29 Methods: How do we identify primary groups in a network? Search: Optimize a partition to fit. Segregation Index: One problem with the segregation index is that it is not ‘margin free.’ That is, if you were to change the distribution of the category of interest (say race) by a constant but not the core association between race and friendship choice, you can get a different segregation level. One antidote to this problem is to use odds ratios. In this case, an odds ratio tells us the relative likelihood that two people in the same category will choose each other as friends.

30 Methods: How do we identify primary groups in a network? Search: Optimize a partition to fit. Odds Ratios: The odds ratio tells us how much more likely people in the same group are to nominate each other. You calculate the odds ratio based on the number of ties in a group and their relative size, based on the following table:
               Member of:
               Same Group   Different Group
Friends             A              B
Not Friends         C              D
OR = AD / BC

31 Observed Odds Ratios
Observed ties:
          Hip  Square  Total
Hip        20       3     23
Square      3      30     33
Total      23      33     56
There are 6 hip people and 9 square people in this network. This implies that there are the following number of possible ties in the network:
          Hip  Square
Hip        30      54
Square     54      72
Diagonal = n_i(n_i - 1); off-diagonal = n_i n_j
               Group
               Same   Dif
Friend  Yes      50     6
        No       52   102
OR = (50)(102) / ((52)(6)) = 16.35
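
The arithmetic on this slide can be reproduced with a few lines of Python. The function name and data layout are made up for illustration; the counts are the ones shown above.

```python
def same_group_odds_ratio(mixing, sizes):
    """Odds of a tie within groups relative to the odds of a tie between groups."""
    groups = list(sizes)
    same_yes = sum(mixing[g][g] for g in groups)
    diff_yes = sum(mixing[g][h] for g in groups for h in groups if g != h)
    # Possible directed ties: n_i(n_i - 1) within a group, n_i * n_j between groups.
    same_possible = sum(sizes[g] * (sizes[g] - 1) for g in groups)
    diff_possible = sum(sizes[g] * sizes[h] for g in groups for h in groups if g != h)
    same_no = same_possible - same_yes
    diff_no = diff_possible - diff_yes
    return (same_yes * diff_no) / (diff_yes * same_no)

mixing = {"hip": {"hip": 20, "square": 3}, "square": {"hip": 3, "square": 30}}
sizes = {"hip": 6, "square": 9}
odds_ratio = same_group_odds_ratio(mixing, sizes)  # (50 * 102) / (6 * 52)
```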

32 Complete Network Analysis Network Connections: Social Subgroups. Segregation index compared to the odds ratio: r=.95 [Figure: scatter plot of Friendship Segregation Index against Log(Same-Sex Odds Ratio)]

33 The second problem is that the Segregation index has no clear maximum – if every node is assigned to a single group the value can be higher than if everyone is assigned to the “right” group. -- it tends to have a monotonically changing score. This means you can’t just keep adjusting nodes until you see a best fit, but instead have to look for changes in fit. The modularity score solves this problem by re-organizing the expectation in a way that forces the value to 0 if everyone is in a single group. Methods: How do we identify primary groups in a network? Search: Optimize a partition to fit

34 Methods: How do we identify primary groups in a network? Search: Optimize a partition to fit. We can also measure the extent that ties fall within clusters with the modularity score:
$$M = \sum_{s} \left[ \frac{l_s}{L} - \left( \frac{d_s}{2L} \right)^{2} \right]$$
Where:
- s indexes clusters in the network
- l_s is the number of lines in cluster s
- d_s is the sum of the degrees of the nodes in s
- L is the total number of lines
M has the advantage of going to 0 if there is only 1 group, which means maximizing the score is sensible.

35 Methods: How do we identify primary groups in a network? Search: Optimize a partition to fit. We can also measure the extent that ties fall within clusters with the modularity score:
$$Q = \frac{1}{2m} \sum_{ij} \left[ A_{ij} - \gamma \frac{k_i k_j}{2m} \right] \delta(c_i, c_j)$$
Where:
- m is the number of edges
- k_i is the degree of node i
- A_ij is the edge weight between i and j
- δ(c_i, c_j) is 1 if i and j are in the same group, 0 otherwise
- γ is the resolution parameter
Q has the advantage of going to 0 if there is only 1 group (with γ = 1), which means maximizing the score is sensible. Note that the resolution parameter means the number of groups is not truly “automatic.”
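
To make the score concrete, here is a minimal Python sketch that computes modularity for an unweighted, undirected graph, assuming the standard form Q = (1/2m) Σ_ij [A_ij - γ k_i k_j / 2m] δ(c_i, c_j). The example graph (two triangles joined by one tie) and the function name are illustrative only.

```python
def modularity(adj, group, gamma=1.0):
    """Modularity of a partition; adj is {node: set_of_neighbours},
    group is {node: label}, gamma is the resolution parameter."""
    m = sum(len(nbrs) for nbrs in adj.values()) / 2  # number of edges
    q = 0.0
    for i in adj:
        for j in adj:
            if group[i] == group[j]:
                a_ij = 1.0 if j in adj[i] else 0.0
                q += a_ij - gamma * len(adj[i]) * len(adj[j]) / (2 * m)
    return q / (2 * m)

# Two triangles (1-2-3 and 4-5-6) joined by the tie 3-4.
adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3, 5, 6}, 5: {4, 6}, 6: {4, 5}}
two_groups = {1: "a", 2: "a", 3: "a", 4: "b", 5: "b", 6: "b"}
one_group = {v: "all" for v in adj}

q_two = modularity(adj, two_groups)   # 2 * (3/7 - (7/14)**2) = 5/14
q_one = modularity(adj, one_group)    # 0 when everyone is in one group
```

Raising `gamma` above 1 penalizes large groups more heavily and so favors more, smaller groups, which is why the number of groups found is not fully automatic.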

36 Modularity Scores Comparison to Segregation Index – comparing values for known solutions Modularity Score Plotted against Segregation Index for various nets Methods: How do we identify primary groups in a network? Search: Optimize a partition to fit

37 Methods: How do we identify primary groups in a network? Search: Optimize a partition to fit [Figure: panels arranged by number of groups and by in-group density]

38 Methods: How do we identify primary groups in a network? Search: Optimize a partition to fit
- Louvain Method (Blondel et al) in PAJEK & R
- Factions in UCI-NET: multiple options for the exact factor maximized. I recommend either the density or the correlation function, and I would calculate the distance in each case.
- Frank’s KliqueFinder
- Moody’s CROWDS / Jiggle
- Generalized blockmodel in PAJEK
- igraph (R) has a couple of routines of this sort (Fast-Greedy is good)

39 Factions in UCI-NET Methods: How do we identify primary groups in a network? Search: Optimize a partition to fit

40 Factions in UCI-NET

41

42 Reduced BlockMatrix
      1   2   3   4   5   6
     --  --  --  --  --  --
1    59   1   2  14   1   0
2     1  54   0   1  12   2
3     1   2  55   0   1  12
4     9   1   1  51   0   0
5     0  12   2   0  62   1
6     1   0   9   2   0  64
Fit perfectly

43 Methods: How do we identify primary groups in a network? Search: Optimize a partition to fit. UCINET: Biggest drawbacks of FACTIONS are: A) Slow B) Have to specify the number of groups.

44 PAJEK – Generalized Blockmodel

45

46 Fits fine, but it’s slow!

47 R – “Fast Greedy” This is a direct optimization of Modularity

48 PAJEK – “Louvain” This is a direct optimization of Modularity

49 Methods: How do we identify primary groups in a network? Evade: Find a “cheap” indicator, and cluster/optimize that. Cluster analysis: In addition to tools like FACTIONS, we can use the distance information contained in a network to cluster observations that are ‘close’ to each other. In general, cluster analysis is a set of techniques that allows you to identify collections of objects that are similar to each other in some degree. A very good reference is the SAS/STAT manual section called “Introduction to clustering procedures” ( http://wks.uts.ohio-state.edu/sasdoc/8/sashtml/stat/chap8/index.htm ). See also Wasserman and Faust, though the coverage is spotty. We are going to start with the general problem of hierarchical clustering applied to any set of analytic objects based on similarity, and then transfer that to clustering nodes in a network.

50 Cluster analysis: Imagine a set of objects (say people) arrayed in a two dimensional space. You want to identify groups of people based on their position in that space. How do you do it? [Figure: scatter plot of people; axes: How Cool you are, How Smart you are]

51 Methods: How do we identify primary groups in a network? Evade: Find a “cheap” indicator, and cluster/optimize that. Start by choosing a pair of people who are very close to each other (such as 15 & 16) and now treat that pair as one point, with a value equal to the mean position of the two nodes.

52 Now repeat that process for as long as possible. Methods: How do we identify primary groups in a network? Evade: Find a “cheap” indicator, and cluster/optimize that
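
The merge-and-repeat procedure just described can be sketched in Python as follows. This is a simplified illustration, not the SAS implementation: it always merges the two closest points, and it replaces a merged pair with the size-weighted mean of its members (the weighting is an assumption; the slides only describe the mean of two nodes). The five points are made up.

```python
from math import dist

def agglomerate(points, n_clusters):
    """Repeatedly merge the two closest clusters, representing each merged
    cluster by the mean position of its members, until n_clusters remain."""
    clusters = [([i], tuple(p)) for i, p in enumerate(points)]  # (members, centre)
    merges = []
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = dist(clusters[a][1], clusters[b][1])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        (members_a, centre_a), (members_b, centre_b) = clusters[a], clusters[b]
        members = members_a + members_b
        centre = tuple((len(members_a) * x + len(members_b) * y) / len(members)
                       for x, y in zip(centre_a, centre_b))
        merges.append((sorted(members_a), sorted(members_b), d))  # dendrogram record
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append((members, centre))
    return sorted(sorted(m) for m, _ in clusters), merges

points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11)]
groups, merges = agglomerate(points, n_clusters=2)
```

The `merges` list records which clusters were joined and at what distance, which is the information a dendrogram displays.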

53 This process is captured in the cluster tree (called a dendrogram) Methods: How do we identify primary groups in a network? Evade: Find a “cheap” indicator, and cluster/optimize that

54 Methods: How do we identify primary groups in a network? Evade: Find a “cheap” indicator, and cluster/optimize that. As with the network cluster algorithms, there are many options for clustering. The three that I use most are:
- Ward’s Minimum Variance -- the one I use almost 95% of the time
- Average Distance -- the one used in the example above
- Median Distance -- very similar
Again, the SAS manual is the best single place I’ve found for information on each of these techniques. Some things to keep in mind:
- Units matter. The example above draws together pairs horizontally because the range there is smaller. Get around this by standardizing your data.
- This is an inductive technique. You can find clusters in a purely random distribution of points. Consider the following example.

55 Methods: How do we identify primary groups in a network? Evade: Find a “cheap” indicator, and cluster/optimize that. Cluster analysis: The data in this scatter plot are produced using this code:
data random;
  do i=1 to 20;
    x=rannor(0);
    y=rannor(0);
    output;
  end;
run;

56 Cluster analysis Resulting dendrogram Methods: How do we identify primary groups in a network? Evade: Find a “cheap” indicator, and cluster/optimize that

57 Cluster analysis Resulting cluster solution

58 Methods: How do we identify primary groups in a network? Evade: Find a “cheap” indicator, and cluster/optimize that. Cluster analysis works by building a distance matrix between each pair of points. In the example above, it used the Euclidean distance, which in two dimensions is simply the physical distance between the points in a plot. It can work on any number of dimensions. To use cluster analysis in a network, we base the distance on the path-distance between pairs of people in the network. Consider again the blue-eye hip example:

59 Methods: How do we identify primary groups in a network? Evade: Find a “cheap” indicator, and cluster/optimize that. Cluster analysis: Distance Matrix
0 1 3 2 3 3 4 3 3 2 3 2 2 1 1
1 0 2 2 2 3 3 3 2 1 2 2 1 2 1
3 2 0 3 2 4 3 3 2 1 1 1 2 2 3
2 2 3 0 1 1 2 1 1 2 3 3 3 2 1
3 2 2 1 0 2 1 1 1 1 2 2 3 3 2
3 3 4 1 2 0 1 1 2 3 4 4 4 3 2
4 3 3 2 1 1 0 2 2 2 3 3 4 4 3
3 3 3 1 1 1 2 0 1 2 3 3 4 3 2
3 2 2 1 1 2 2 1 0 1 2 2 3 3 2
2 1 1 2 1 3 2 2 1 0 1 1 2 2 2
3 2 1 3 2 4 3 3 2 1 0 1 2 2 3
2 2 1 3 2 4 3 3 2 1 1 0 1 1 2
2 1 2 3 3 4 4 4 3 2 2 1 0 2 2
1 2 2 2 3 3 4 3 3 2 2 1 2 0 1
1 1 3 1 2 2 3 2 2 2 3 2 2 1 0
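
A path-distance matrix like this one can be computed from an adjacency list with breadth-first search. The Python sketch below is a stand-in for that step under simple assumptions (unweighted, undirected ties); the four-node chain is a made-up example, since the full adjacency list for the 15-node network is not reproduced in the text.

```python
from collections import deque

def path_distances(adj):
    """Matrix of shortest path lengths (None where no path exists),
    with rows and columns in sorted node order."""
    nodes = sorted(adj)
    rows = []
    for source in nodes:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            for u in adj[v]:
                if u not in dist:
                    dist[u] = dist[v] + 1
                    queue.append(u)
        rows.append([dist.get(target) for target in nodes])
    return rows

chain = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
dmat = path_distances(chain)
```

The resulting matrix can be handed to any clustering routine that accepts precomputed distances, provided unreachable pairs are first given a large finite value.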

60 The distance matrix implies a space that nodes are embedded within. Using something like MDS, we can represent the space implied by the distance matrix in two dimensions. This is the image of the network you would get if you did that. Methods: How do we identify primary groups in a network? Evade: Find a “cheap” indicator, and cluster/optimize that

61 Cluster analysis When you use variables, the cluster analysis program generates a distance matrix. We can, instead use the network distance matrix directly. If we do that with this example network, we get the following:

62 Cluster analysis

63 In SAS you use two commands to get a cluster analysis. The first does the hierarchical clustering. The second analyzes the cluster output to create the tree. Example 1. Using variables to define the space (like income and musical taste):
proc cluster data=a method=ave out=clustd std;
  var x y;
  id node;
run;
proc tree data=clustd ncl=5 out=cluvars;
run;

64 Cluster analysis Example 2. Using a pre-defined distance matrix to define the space (as in a social network). You first create the distance matrix (in IML), then use it in the cluster program.
proc iml;
  %include 'c:\moody\sas\programs\modules\reach.mod';
  /* blue eye example */
  mat2=j(15,15,0);
  mat2[1,{2 14 15}]=1;
  /* lines cut here */
  mat2[15,{1 14 2 4}]=1;
  dmat=reach(mat2);
  mattrib dmat format=1.0;
  print dmat;
  id=1:nrow(dmat);
  id=id`;
  ddat=id||dmat;
  create ddat from ddat; /* creates the dataset */
  append from ddat;
quit;
data ddat (type=dist); /* tells SAS it is a distance matrix */
  set ddat;
run;

65 Cluster analysis Example 2. Using a pre-defined distance matrix to define the space (as in a social network). Once you have it, the cluster program is just the same.
proc cluster data=ddat method=ward out=clustd;
  id col1;
run;
proc tree data=clustd ncl=3 out=netclust;
  copy col1;
run;
proc freq data=netclust;
  tables cluster;
run;
proc print data=netclust;
  var col1 cluster;
run;

66 Moody’s CROWDS algorithm combines the search approach with an initial cluster analysis and a routine for determining how many clusters are in the network. It does so by using the Segregation index and all of the information from the cluster hierarchy, combining two groups only if it improves the segregation fit for both groups. Methods: How do we identify primary groups in a network? Evade: Find a “cheap” indicator, and cluster/optimize that

67 The logic behind these algorithms is that you remove some weak links and see what is left. Most popular is the “edge betweenness” algorithm. Methods: How do we identify primary groups in a network? Destroy: Remove lines/nodes until what is left over reveals something of interest
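
A compact Python sketch of the edge-betweenness idea follows: score every tie by the number of shortest paths that run over it (using Brandes-style accumulation), remove the highest-scoring tie, and repeat until the network falls apart. This is a simplified illustration that stops at the first split; published implementations continue and pick a cut using a fit score. The example graph is made up.

```python
from collections import deque

def edge_betweenness(adj):
    """Shortest-path betweenness of each undirected edge."""
    score = {}
    for s in adj:
        order, preds = [], {v: [] for v in adj}
        sigma = {v: 0 for v in adj}   # number of shortest paths from s
        sigma[s] = 1
        depth = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in depth:
                    depth[w] = depth[v] + 1
                    queue.append(w)
                if depth[w] == depth[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):     # accumulate from the farthest nodes back
            for v in preds[w]:
                share = sigma[v] / sigma[w] * (1 + delta[w])
                edge = tuple(sorted((v, w)))
                score[edge] = score.get(edge, 0.0) + share
                delta[v] += share
    return {edge: total / 2 for edge, total in score.items()}  # each pair seen twice

def components(adj):
    seen, comps = set(), []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        comp, stack = [], [start]
        while stack:
            v = stack.pop()
            comp.append(v)
            for u in adj[v]:
                if u not in seen:
                    seen.add(u)
                    stack.append(u)
        comps.append(sorted(comp))
    return sorted(comps)

def split_once(adj):
    """Remove highest-betweenness edges until the graph first splits."""
    graph = {v: set(nbrs) for v, nbrs in adj.items()}
    removed = []
    while len(components(graph)) == 1 and any(graph.values()):
        scores = edge_betweenness(graph)
        u, v = max(sorted(scores), key=scores.get)
        graph[u].discard(v)
        graph[v].discard(u)
        removed.append((u, v))
    return components(graph), removed

# Two triangles joined by the single tie 3-4.
adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3, 5, 6}, 5: {4, 6}, 6: {4, 5}}
groups, removed = split_once(adj)
```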

68 UCINET has the MCL (Markov clustering, based on flow betweenness in a random walk sense) algorithm programmed. Methods: How do we identify primary groups in a network? Destroy: Remove lines/nodes until what is left over reveals something of interest

69 “Evade” – look for something that correlates with your split Newman’s Leading Eigenvector (in R – this is the “bottom” partition, not the best fit, which aggregates/joins from here)

70 “Evade” – look for something that correlates with your split
The Recursive Neighborhood Means algorithm creates the variables that are then used in the cluster analysis to identify groups.
- Start by randomly assigning every node a value on k variables.
- Then calculate the average for each variable for the people each person is tied to.
- Repeat this process multiple times.
This results in people who have many ties to each other having similar values on the k random variables. This similarity then gets picked up in a cluster analysis.
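
The averaging step can be sketched in a few lines of Python. This is only an illustration of the procedure as described here, not Moody's implementation: the seed, the number of iterations, and the rule that isolates keep their own values are all assumptions, and the follow-up cluster analysis is left out.

```python
import random

def recursive_neighborhood_means(adj, k=2, iterations=5, seed=0):
    """Give each node k random values, then repeatedly replace each value
    with the mean of that variable among the node's neighbours."""
    rng = random.Random(seed)
    values = {v: [rng.random() for _ in range(k)] for v in adj}
    for _ in range(iterations):
        updated = {}
        for v, nbrs in adj.items():
            if nbrs:
                updated[v] = [sum(values[u][d] for u in nbrs) / len(nbrs)
                              for d in range(k)]
            else:
                updated[v] = values[v]  # assumption: isolates keep their values
        values = updated
    return values

def complete(nodes):
    return {v: set(nodes) - {v} for v in nodes}

# Two separate, fully connected groups of four (a deliberately easy case).
adj = {**complete(["a1", "a2", "a3", "a4"]), **complete(["b1", "b2", "b3", "b4"])}
scores = recursive_neighborhood_means(adj, k=2, iterations=40)

def spread(nodes, d):
    column = [scores[v][d] for v in nodes]
    return max(column) - min(column)
```

In this easy case members of each group converge on a shared value for every variable. In a connected network too many iterations would pull everyone toward a common value, so the number of repetitions matters.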

71 Example of the RNM procedure Time 1 Time 2 Time 3

72 Example of the RNM procedure

73 As an example, consider the process active on a known-to-be clustered network, starting with k = 2 random variables. You get something like this, where the nodes are now placed according to their resulting values on the 2 variables.

74

75 The algorithm does a good job uncovering clusters in fake datasets.

76

77 Compared to real data: RNM Partition on the Prison data

78 Strategies for identifying primary groups: Evade Factor Analysis: Treat the adjacency/similarity matrix as a set of N variables and look for latent factors that explain the variance in the data. We often use simple indicators and assume they measure our concepts. [Figure: SES and IQ linked to the indicators Income and Math Score, with path values 1.0 and 0.0]

79 Strategies for identifying primary groups: Evade Factor Analysis: Treat the adjacency/similarity matrix as a set of N variables and look for latent factors that explain the variance in the data. But we don’t have to! We can imagine that each latent concept causes our indicators, and build a measurement model. [Figure: latent concepts SES and IQ with indicators Income, Reading Score, Occupation, Highest Degree, House Size, Languages Spoken, Math Score]

80 Strategies for identifying primary groups: Evade Factor Analysis: Treat the adjacency/similarity matrix as a set of N variables and look for latent factors that explain the variance in the data. But we don’t have to! We can imagine that each latent concept causes our indicators, and build a measurement model.

81 Strategies for identifying primary groups: Evade Factor Analysis: Treat the adjacency/similarity matrix as a set of N variables and look for latent factors that explain the variance in the data. In a network, we assume that the tie pattern is an imperfect measure of an underlying latent structure that we can explain with similar factors. Instead of lots of “measurements” we have many columns in the adjacency (sim) matrix, and we can summarize that with factor scores. This works best if the similarity matrix has more information, so multiple account data are perfect, or you can transform the data in some way to add more information (like using a distance matrix).

82 Strategies for identifying primary groups: Evade Factor Analysis: Treat the adjacency/similarity matrix as a set of N variables and look for latent factors that explain the variance in the data. Here is code I used in the PROSPER data:
/* this section builds info on how to weight dyads for in-group, out-group. */
twostp=((adjmat+adjmat`)>0)*adjmat; /* make it either direction w. the first term */
ttie=adjmat#twostp; /* =1 if tie contributes to a transitive triple */
ttie=((ttie+ttie`));
adjraw=adjmat;
adjmat=(adjmat+adjmat`); /* force it to be symmetric, 1=asym 2=reciped */
adjmat=adjmat-diag(adjmat); /* remove any self ties */
d2=reachlim((adjmat>0),3);
/* re-weight to bias toward recip ties */
wm_4 = (d2=1)#(adjmat=2)#8; /* recip direct ties */
wm_2a = (d2=1)#(adjmat=1)#4; /* unrecip direct ties */
wm_1 = 2*(d2=2); /* ties 2-steps out */
wm_p5 = 0*(d2=3); /* ties 3-steps out - note it's zeroed out here */
wm=wm_4+wm_2a+wm_1+wm_p5+(3*(ttie/(max(ttie)))); /* transitivity is at the end */
wm=wm-diag(wm);

83 Strategies for identifying primary groups: Evade Factor Analysis: Treat the adjacency/similarity matrix as a set of N variables and look for latent factors that explain the variance in the data. Here is code I used in the PROSPER data:
/* run factor analysis. Note nfactors is a high value, should only take those w. EV > 2, but this gives us room... */
proc factor rotate=varimax min=&minev out=factset data=symmat nfactors=175 outstat=fscores noprint;
run;
quit;

84 Strategies for identifying primary groups: Evade Result:

85 Strategies for identifying primary groups: Evade Result: Each column is a person, these are the factor loadings for each person on each retained factor.

86 Strategies for identifying primary groups: Evade Result: Sociogram for a single school

87 Strategies for identifying primary groups: Evade Result: Sociogram for a single school. Problem is that there are no necessary connectivity checks – you can get “groups” that are disconnected. Biggest strengths are: a) Really fast b) Allows for overlapping groups c) Gives you “embeddedness” scores based on factor loadings

88 Strategies for identifying primary groups: Hybrid
The Crowds Algorithm
1. Identify members of network bicomponents, remove people not included.
2. Cluster the reduced network.
   - Identify optimal number of groups (TREEWALK):
   - For each level of the cluster partition tree do (BFS):
     - Move up the tree from smaller to larger groups.
     - If the fit for both groups is improved by joining them, then do so.
     - If not, then identify group at that level.
   - End TREEWALK.
Do until all groups are identified (GLOBAL LOOP):
3. Evaluate node fit. Do until nodes cannot be moved:
   For each identified cluster do (GRPCHECK):
   - Ensure group is a bi-component.
   - Calculate effect on group a of moving node j to group a.
   - Calculate effect on j's present group of removing j.
   - If there is a positive net gain to moving j from own group to a, then do so.
   End.
4. Identify bridging members.
   - If removing j from group a would improve the fit of group a, AND assigning j to any other group would lower the fit for that group, then j is considered a bridge. Place all bridges in separate class.
5. Group check. Check returns to combining groups.
   - If merging groups would improve the fit of all groups to be merged, then do so.
   - Evaluate bridges, to be sure that they are not bridging two groups that have now merged.
End Global loop.

89 Social Sub-groups Frank & Yasumoto: Action and Structure. They expect to find evidence of enforceable trust within social subgroups and evidence of reciprocity between such groups. To do so, they must identify primary subgroups within the network. They do so using a density based criterion. Frank’s algorithm iteratively assigns nodes to subgroups until a parameter that maximizes in-group density is reached. Basic model is: logit(Y_ij) = θ0 + θ1·samegroup_ij, where samegroup_ij = 1 if i and j are assigned to the same group. Seek to find an assignment of nodes to groups (g) that maximizes fit. This results in a ‘block diagonal’ adjacency matrix, where most of the ties fall along the diagonal.

90 Relations among the French Financial Elite (as drawn by F&Y) Group-weighted MDS Relations within group are weighted heavier than between to generate this picture:

91 Return to first question: What is a group?
- The simple notions of a complete clique are difficult to square w. real-world data.
- Density is an indicator, but subject to over-grouping (no connectivity) and star-patterns.
- Groups are likely internally differentiated – with “core” vs. “periphery” members.
- Most sociological theories of groups rest on transitive closure and short distances.
- There’s a sense that members are equal – a tight-knit group.
- The group should be fairly small – face-to-face scale.
- The social processes underlying the group turn on reciprocity, trust, communication, homogeneity of norms & beliefs.
- Almost all require a comparative set: in-group to out-group. It is relational, not essential.
- Cross-cutting social circles would lead us to expect overlapping groups, but in practice most methods do not do that, as it’s analytically too cumbersome.
Practically, group detection is hard and most methods will give you (slightly) different results. You can compare results using a Rand statistic (proportion of pairs similarly categorized in two partitions), but for small settings these differences can matter.
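
The Rand statistic mentioned above is straightforward to compute. This Python sketch follows the definition given in the text (the proportion of node pairs that the two partitions treat the same way); the two example partitions are made up.

```python
from itertools import combinations

def rand_index(part_a, part_b):
    """Proportion of node pairs placed together in both partitions
    or apart in both partitions."""
    agree = total = 0
    for i, j in combinations(sorted(part_a), 2):
        same_a = part_a[i] == part_a[j]
        same_b = part_b[i] == part_b[j]
        agree += same_a == same_b
        total += 1
    return agree / total

method_1 = {1: "x", 2: "x", 3: "y", 4: "y"}
method_2 = {1: "p", 2: "p", 3: "p", 4: "q"}
agreement = rand_index(method_1, method_2)
```

Group labels do not need to match across partitions, since only the together/apart status of each pair is compared.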

92 [Figure panels: Fast & Greedy; Louvain; Edge Betweenness; Markov Chain; Leading Eigenvector; RNM (CROWDS)]

93

94 Roles & Positions
Overview: Social life can be described (at least in part) through social roles. To the extent that roles can be characterized by regular interaction patterns, we can summarize roles through common relational patterns. Identifying these sets is the goal of block-model analyses.
Nadel: The Coherence of Role Systems
- Background ideas for White, Boorman and Breiger.
- Social life as interconnected system of roles
- Important feature: thinking of roles as connected in a role system = social structure
White, Boorman and Breiger: Social structure from Multiple Networks I. Blockmodels of Roles and Positions
- The key article describing the theoretical and technical elements of block-modeling

95 Nadel: The Coherence of Role Systems. Elements of a Role: Rights and obligations with respect to other people or classes of people. Roles require a ‘role complement’: another person who the role-occupant acts with respect to. Examples: Parent - child, Teacher - student, Lover - lover, Friend - Friend, Husband - Wife, etc. Nadel (following functional anthropologists and sociologists) defines ‘logical’ types of roles, and then examines how they can be linked together.

96 Nadel: The Coherence of Role Systems. Nadel describes how various roles fit together to form a coherent whole. Roles are collected in people through the ‘summation of roles’. Necessary: Some roles fit together necessarily. For example, the expected interaction patterns of “son-in-law” are implied through the joint roles of “Husband” and “Spouse-Parent”. Coincidental: Some roles tend to go together empirically, but they need not (businessman & club member, for example). Distinguishing the two is a matter of usefulness and judgement, but relates to social substitutability. The distinction reverts to how the system as a whole will be held together in the face of changes in role occupants.

97 Given that roles can be identified as ‘going together’ is there a logic that underlies their connection? Nadel uses a functional description based on ascription and achievement:

98 Nadel: The Coherence of Role Systems And he gives an example of a simple role system: Nadel’s task is to make sense of these roles, to identify how they are interconnected to form a system -- a coherent structure. This is a difficult task to do analytically, as the eventual failure of Parsonian functionalism shows.

99 White et al: From logical role systems to empirical social structures. With the fall of Parsons and functionalism in the late 60s, many of the ideas about social structure and system were also tossed. White et al demonstrate how we can understand social structure as the intercalation of roles, without the a priori logical categories. Start with some basic ideas of what a role is: An exchange of something (support, ideas, commands, etc) between actors. Thus, we might represent a family as:

100 Start with some basic ideas of what a role is: An exchange of something (support, ideas, commands, etc) between actors. Thus, we might see an exchange network such as: Provides food for Romantic Love Bickers with White et al: From logical role systems to empirical social structures

101 Start with some basic ideas of what a role is: An exchange of something (support, ideas, commands, etc) between actors. Which is a summary of a (sort of) family. H W C C C Provides food for Romantic Love Bickers with (and there are, of course, many other relations inside the family) White et al: From logical role systems to empirical social structures

102 The key idea is that we can express a role through a relation (or set of relations) and thus a social system by the inventory of roles. If roles equate to positions in an exchange system, then we need only identify particular aspects of a position. But what aspect? Block modeling focuses on equivalence positions. Structural Equivalence: Two actors are structurally equivalent if they have the same types of ties to the same people. That is, they have the exact same ties.

103 Structural Equivalence A single relation

104 Structural Equivalence Graph reduced to positions

105 Alternative notions of equivalence Instead of exact same ties to exact same alters, you look for nodes with similar ties to similar types of alters

106 Blockmodeling: basic steps In any positional analysis, there are 4 basic steps: 1) Identify a definition of equivalence 2) Measure the degree to which pairs of actors are equivalent 3) Develop a representation of the equivalencies 4) Assess the adequacy of the representation

107 1) Identify a definition of equivalence Structural Equivalence: Two actors are equivalent if they have the same type of ties to the same people.

108 Automorphic Equivalence: Actors occupy indistinguishable structural locations in the network. That is, they are in isomorphic positions in the network. Two graphs are isomorphic if there is some mapping of nodes to positions that equates the two. For example, all 030T triads are isomorphic. A graph is automorphic if there are patterns internal to the graph that are equated (if the mapping goes from the set of nodes in the graph to other nodes in the graph). In general, automorphically equivalent nodes are equivalent with respect to all graph theoretic properties (i.e. degree, number of people reachable, centrality, etc.) and are structurally indistinguishable. The key difference from structural equivalence is relaxing the necessity of being linked to the same nodes. 1) Identify a definition of equivalence

109 Automorphic Equivalence:

110 Regular Equivalence: Regular equivalence does not require actors to have identical ties to identical actors or to be structurally indistinguishable. Actors who are regularly equivalent have identical ties to and from equivalent actors. If actors i and j are regularly equivalent, and actor i has a tie to/from some actor, k, then actor j must have the same kind of tie to/from some actor l, and actors k and l must be regularly equivalent. So effectively this is a recursive definition, and not necessarily unique. There may be several ways to assign actors to clusters that satisfy this definition. (This is related to graph colorings, regular equivalence definitions are those where nodes have neighbors of the same color). 1) Identify a definition of equivalence

111 Regular Equivalence: There may be multiple regular equivalence partitions in a network, and thus we tend to want to find the maximal regular equivalence partition, the one with the fewest positions.

112 Role or Local Equivalence: While most equivalence measures focus on position within the full network, some measures focus only on the patterns within the local tie neighborhood. These have been called ‘local role’ equivalence. Note that: Structurally equivalent actors are automorphically equivalent. Automorphically equivalent actors are regularly equivalent. Structurally equivalent and automorphically equivalent actors are role equivalent. In practice, we tend to ignore some of these fine distinctions, as they get blurred quickly once we have to operationalize them in real graphs. It turns out that few people are ever exactly equivalent, and thus we approximate the links between the types. In all cases, the procedure can work over multiple relations simultaneously. The process of identifying positions is called blockmodeling, and requires identifying a measure of similarity among nodes.

113 0 1 1 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 1 0 0 0 0 0 0 0 0 1 0 0 1 0 0 1 1 1 1 0 0 0 0 1 0 1 0 0 0 1 1 1 1 0 0 0 0 0 1 0 0 0 1 0 0 0 0 1 1 1 1 0 1 0 0 1 0 0 0 0 0 1 1 1 1 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 Blockmodeling is the process of identifying these types of positions. A block is a section of the adjacency matrix - a “group” of people. Here I have blocked structurally equivalent actors

114 Once you block the matrix, reduce it, based on the number of ties in the cell of interest. The key values are a zero block (no ties) and a one-block (all ties present):
Blocked adjacency matrix (diagonal shown as "."):
. 1 1 1 0 0 0 0 0 0 0 0 0 0
1 . 0 0 1 1 0 0 0 0 0 0 0 0
1 0 . 1 0 0 1 1 1 1 0 0 0 0
1 0 1 . 0 0 1 1 1 1 0 0 0 0
0 1 0 0 . 1 0 0 0 0 1 1 1 1
0 1 0 0 1 . 0 0 0 0 1 1 1 1
0 0 1 1 0 0 . 0 0 0 0 0 0 0
0 0 1 1 0 0 0 . 0 0 0 0 0 0
0 0 1 1 0 0 0 0 . 0 0 0 0 0
0 0 1 1 0 0 0 0 0 . 0 0 0 0
0 0 0 0 1 1 0 0 0 0 . 0 0 0
0 0 0 0 1 1 0 0 0 0 0 . 0 0
0 0 0 0 1 1 0 0 0 0 0 0 . 0
0 0 0 0 1 1 0 0 0 0 0 0 0 .
Reduced (image) matrix:
   1 2 3 4 5 6
1  0 1 1 0 0 0
2  1 0 0 1 0 0
3  1 0 1 0 1 0
4  0 1 0 1 0 1
5  0 0 1 0 0 0
6  0 0 0 1 0 0
Structural equivalence thus generates 6 positions in the network.
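
The reduction step can be sketched in Python. This is a minimal illustration rather than the software used for the slides: `example_network`, `block_density`, and the 0-based node ids are names assumed here, and the six positions are the structurally equivalent sets read off the example matrix.

```python
from itertools import product

def example_network():
    """Adjacency matrix of the 14-node example (nodes renumbered 0..13)."""
    n = 14
    adj = [[0] * n for _ in range(n)]
    edges = [(0, 1), (0, 2), (0, 3), (1, 4), (1, 5), (2, 3), (4, 5)]
    edges += [(i, j) for i in (2, 3) for j in range(6, 10)]
    edges += [(i, j) for i in (4, 5) for j in range(10, 14)]
    for i, j in edges:
        adj[i][j] = adj[j][i] = 1
    return adj

def block_density(adj, rows, cols):
    """Share of possible ties present in one block, ignoring the diagonal."""
    cells = [(i, j) for i, j in product(rows, cols) if i != j]
    if not cells:  # the diagonal block of a one-node position has no cells
        return 0.0
    return sum(adj[i][j] for i, j in cells) / len(cells)

# Positions assumed from the example: {1}, {2}, {3,4}, {5,6}, {7..10}, {11..14}
positions = [[0], [1], [2, 3], [4, 5], [6, 7, 8, 9], [10, 11, 12, 13]]
adj = example_network()
density = [[block_density(adj, r, c) for c in positions] for r in positions]

# With exact structural equivalence every block is a zero block or a one-block.
image = [[1 if d == 1.0 else 0 for d in row] for row in density]
for row in image:
    print(*row)
```

Because the positions are exactly structurally equivalent, every block density is either 0 or 1, so no density cutoff is needed for this example.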

115 Once you partition the matrix, reduce it: Regular equivalence
Adjacency matrix (diagonal shown as "."):
. 1 1 1 0 0 0 0 0 0 0 0 0 0
1 . 0 0 1 1 0 0 0 0 0 0 0 0
1 0 . 1 0 0 1 1 1 1 0 0 0 0
1 0 1 . 0 0 1 1 1 1 0 0 0 0
0 1 0 0 . 1 0 0 0 0 1 1 1 1
0 1 0 0 1 . 0 0 0 0 1 1 1 1
0 0 1 1 0 0 . 0 0 0 0 0 0 0
0 0 1 1 0 0 0 . 0 0 0 0 0 0
0 0 1 1 0 0 0 0 . 0 0 0 0 0
0 0 1 1 0 0 0 0 0 . 0 0 0 0
0 0 0 0 1 1 0 0 0 0 . 0 0 0
0 0 0 0 1 1 0 0 0 0 0 . 0 0
0 0 0 0 1 1 0 0 0 0 0 0 . 0
0 0 0 0 1 1 0 0 0 0 0 0 0 .
Image matrix:
   1 2 3
1  1 1 0
2  1 1 1
3  0 1 0
(here I placed a one in the image matrix if there were any ties in the ij block)
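
One way to test whether a proposed partition satisfies the regular equivalence definition is sketched below. It assumes the three positions are nodes {1, 2}, {3, 4, 5, 6} and {7..14} of the example (0-based in the code); `is_regular` and `example_network` are illustrative names, not functions from any blockmodeling package.

```python
def example_network():
    """Adjacency matrix of the 14-node example (nodes renumbered 0..13)."""
    n = 14
    adj = [[0] * n for _ in range(n)]
    edges = [(0, 1), (0, 2), (0, 3), (1, 4), (1, 5), (2, 3), (4, 5)]
    edges += [(i, j) for i in (2, 3) for j in range(6, 10)]
    edges += [(i, j) for i in (4, 5) for j in range(10, 14)]
    for i, j in edges:
        adj[i][j] = adj[j][i] = 1
    return adj

def is_regular(adj, classes):
    """True if all actors in a class send ties to, and receive ties from,
    the same set of classes (the regular equivalence condition)."""
    n = len(adj)
    profile_of_class = {}
    for i in range(n):
        sends_to = frozenset(classes[j] for j in range(n) if adj[i][j])
        receives_from = frozenset(classes[j] for j in range(n) if adj[j][i])
        profile = (sends_to, receives_from)
        # The first actor seen in a class fixes the profile the others must match.
        if profile_of_class.setdefault(classes[i], profile) != profile:
            return False
    return True

adj = example_network()
three_positions = [0, 0, 1, 1, 1, 1] + [2] * 8
six_positions = [0, 1, 2, 2, 3, 3] + [4] * 4 + [5] * 4
arbitrary = [0, 1, 1, 1, 1, 1, 0] + [1] * 7  # nodes 1 and 7 lumped together

print(is_regular(adj, three_positions))
print(is_regular(adj, six_positions))
print(is_regular(adj, arbitrary))
```

The six-position structural equivalence partition also passes the check, which illustrates that several partitions can satisfy the definition at once.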

116 To get a block model, you have to measure the similarity between each pair. If two actors are structurally equivalent, then they will have exactly similar patterns of ties to other people. Consider the example again:. 1 1 1 0 0 0 0 0 0 0 0 0 0 1. 0 0 1 1 0 0 0 0 0 0 0 0 1 0. 1 0 0 1 1 1 1 0 0 0 0 1 0 1. 0 0 1 1 1 1 0 0 0 0 0 1 0 0. 1 0 0 0 0 1 1 1 1 0 1 0 0 1. 0 0 0 0 1 1 1 1 0 0 1 1 0 0. 0 0 0 0 0 0 0 0 0 1 1 0 0 0. 0 0 0 0 0 0 0 0 1 1 0 0 0 0. 0 0 0 0 0 0 0 1 1 0 0 0 0 0. 0 0 0 0 0 0 0 0 1 1 0 0 0 0. 0 0 0 0 0 0 0 1 1 0 0 0 0 0. 0 0 0 0 0 0 1 1 0 0 0 0 0 0. 0 0 0 0 0 1 1 0 0 0 0 0 0 0. 123456 1 2 3 4 5 6 C D Match 1 1 1 0 0 1. 1 0 1. 0 0 0 1 1 1 1 0 0 1 Sum: 12 C and D match on 12 other people

117 If the model is going to be based on asymmetric or multiple relations, you simply stack the various relations: H W C C C Provides food for Romantic Love Bickers with Romance 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 Feeds 0 0 1 1 1 0 0 0 0 0 Bicker 0 0 0 0 0 0 0 0 1 1 0 0 1 0 0 0 0 1 1 0 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 1 0 1 0 0 1 1 0 Stacked
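
A hedged sketch of the stacking idea: for each actor, the ties sent and received on every relation are concatenated into one long profile, which can then be compared across actors. The three matrices below are illustrative assumptions for a five-person family (H, W, C1, C2, C3), not the slide's data, and `stack_relations` is a hypothetical helper name.

```python
def transpose(matrix):
    """Swap rows and columns so row i holds the ties actor i receives."""
    return [list(col) for col in zip(*matrix)]

def stack_relations(relations):
    """Build one profile per actor: sent then received ties, relation by relation."""
    n = len(relations[0])
    profiles = [[] for _ in range(n)]
    for rel in relations:
        received = transpose(rel)
        for i in range(n):
            profiles[i] += rel[i] + received[i]
    return profiles

# Actors in order: H, W, C1, C2, C3 (assumed example values).
romance = [[0, 1, 0, 0, 0], [1, 0, 0, 0, 0], [0] * 5, [0] * 5, [0] * 5]
feeds = [[0, 0, 1, 1, 1], [0] * 5, [0] * 5, [0] * 5, [0] * 5]
bickers = [[0] * 5, [0] * 5,
           [0, 0, 0, 1, 1], [0, 0, 1, 0, 1], [0, 0, 1, 1, 0]]

profiles = stack_relations([romance, feeds, bickers])
for name, profile in zip(["H", "W", "C1", "C2", "C3"], profiles):
    print(name, profile)
```

Including the received ties (the transpose) is a design choice that matters only for asymmetric relations such as "provides food for"; for symmetric relations it simply duplicates the row.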

118 For the entire matrix, we get (number of agreements for each ij pair):
 0  8  7  7  5  5 11 11 11 11  7  7  7  7
 8  0  5  5  7  7  7  7  7  7 11 11 11 11
 7  5  0 12  0  0  8  8  8  8  4  4  4  4
 7  5 12  0  0  0  8  8  8  8  4  4  4  4
 5  7  0  0  0 12  4  4  4  4  8  8  8  8
 5  7  0  0 12  0  4  4  4  4  8  8  8  8
11  7  8  8  4  4  0 12 12 12  8  8  8  8
11  7  8  8  4  4 12  0 12 12  8  8  8  8
11  7  8  8  4  4 12 12  0 12  8  8  8  8
11  7  8  8  4  4 12 12 12  0  8  8  8  8
 7 11  4  4  8  8  8  8  8  8  0 12 12 12
 7 11  4  4  8  8  8  8  8  8 12  0 12 12
 7 11  4  4  8  8  8  8  8  8 12 12  0 12
 7 11  4  4  8  8  8  8  8  8 12 12 12  0
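
The agreement counts can be reproduced with a short sketch. It assumes, consistently with the table, that each pair i, j is compared on their ties to the 12 other actors only (the entries for i and j themselves are skipped); `agreements` and `example_network` are names introduced here for illustration.

```python
def example_network():
    """Adjacency matrix of the 14-node example (nodes renumbered 0..13)."""
    n = 14
    adj = [[0] * n for _ in range(n)]
    edges = [(0, 1), (0, 2), (0, 3), (1, 4), (1, 5), (2, 3), (4, 5)]
    edges += [(i, j) for i in (2, 3) for j in range(6, 10)]
    edges += [(i, j) for i in (4, 5) for j in range(10, 14)]
    for i, j in edges:
        adj[i][j] = adj[j][i] = 1
    return adj

def agreements(adj):
    """counts[i][j] = number of third parties k to whom i and j have the same tie."""
    n = len(adj)
    counts = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                counts[i][j] = sum(
                    adj[i][k] == adj[j][k] for k in range(n) if k not in (i, j)
                )
    return counts

counts = agreements(example_network())
for row in counts:
    print(" ".join(f"{c:2d}" for c in row))
```

A count of 12 means a pair agrees on every other actor, so those pairs are exactly structurally equivalent.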

119 The metric used to measure structural equivalence by White, Boorman and Breiger is the correlation between each node’s set of ties. For the example, this would be:
 1.00 -0.20  0.08  0.08 -0.19 -0.19  0.77  0.77  0.77  0.77 -0.26 -0.26 -0.26 -0.26
-0.20  1.00 -0.19 -0.19  0.08  0.08 -0.26 -0.26 -0.26 -0.26  0.77  0.77  0.77  0.77
 0.08 -0.19  1.00  1.00 -1.00 -1.00  0.36  0.36  0.36  0.36 -0.45 -0.45 -0.45 -0.45
-0.19  0.08 -1.00 -1.00  1.00  1.00 -0.45 -0.45 -0.45 -0.45  0.36  0.36  0.36  0.36
 0.77 -0.26  0.36  0.36 -0.45 -0.45  1.00  1.00  1.00  1.00 -0.20 -0.20 -0.20 -0.20
-0.26  0.77 -0.45 -0.45  0.36  0.36 -0.20 -0.20 -0.20 -0.20  1.00  1.00  1.00  1.00
Another common metric is the Euclidean distance between pairs of actors, which you then use in a standard cluster analysis.
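
Both metrics can be sketched in a few lines. The version below assumes that, for a pair i, j, the tie profiles are compared over the other 12 actors only, which reproduces the tabled values such as -0.20 and 0.77; the function names are illustrative, and the Pearson correlation is undefined for an actor whose profile is constant.

```python
from math import sqrt

def example_network():
    """Adjacency matrix of the 14-node example (nodes renumbered 0..13)."""
    n = 14
    adj = [[0] * n for _ in range(n)]
    edges = [(0, 1), (0, 2), (0, 3), (1, 4), (1, 5), (2, 3), (4, 5)]
    edges += [(i, j) for i in (2, 3) for j in range(6, 10)]
    edges += [(i, j) for i in (4, 5) for j in range(10, 14)]
    for i, j in edges:
        adj[i][j] = adj[j][i] = 1
    return adj

def tie_profiles(adj, i, j):
    """Ties of i and of j to every actor other than i and j."""
    others = [k for k in range(len(adj)) if k not in (i, j)]
    return [adj[i][k] for k in others], [adj[j][k] for k in others]

def pearson(x, y):
    """Pearson correlation of two equal-length lists with non-zero variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def se_correlation(adj, i, j):
    return pearson(*tie_profiles(adj, i, j))

def se_distance(adj, i, j):
    """Euclidean distance between the two tie profiles (0 = structurally equivalent)."""
    x, y = tie_profiles(adj, i, j)
    return sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

adj = example_network()
print(round(se_correlation(adj, 0, 1), 2), round(se_correlation(adj, 0, 6), 2))
print(se_distance(adj, 2, 3), se_distance(adj, 2, 4))
```

Either matrix (correlations as similarities, or distances) can then be handed to a standard clustering routine.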

120 The initial method for finding structurally equivalent positions was CONCOR, the CONvergence of iterated CORrelations. Concor iteration 1:
 1.00 -.77  0.55  0.55 -.57 -.57  0.95  0.95  0.95  0.95 -.75 -.75 -.75 -.75
 -.77 1.00  -.57  -.57 0.55 0.55  -.75  -.75  -.75  -.75 0.95 0.95 0.95 0.95
 0.55 -.57  1.00  1.00 -1.0 -1.0  0.73  0.73  0.73  0.73 -.75 -.75 -.75 -.75
 -.57 0.55  -1.0  -1.0 1.00 1.00  -.75  -.75  -.75  -.75 0.73 0.73 0.73 0.73
 0.95 -.75  0.73  0.73 -.75 -.75  1.00  1.00  1.00  1.00 -.77 -.77 -.77 -.77
 -.75 0.95  -.75  -.75 0.73 0.73  -.77  -.77  -.77  -.77 1.00 1.00 1.00 1.00

121 Concor iteration 2:
 1.00 -.99  0.94  0.94 -.94 -.94  0.99  0.99  0.99  0.99 -.99 -.99 -.99 -.99
 -.99 1.00  -.94  -.94 0.94 0.94  -.99  -.99  -.99  -.99 0.99 0.99 0.99 0.99
 0.94 -.94  1.00  1.00 -1.0 -1.0  0.97  0.97  0.97  0.97 -.97 -.97 -.97 -.97
 -.94 0.94  -1.0  -1.0 1.00 1.00  -.97  -.97  -.97  -.97 0.97 0.97 0.97 0.97
 0.99 -.99  0.97  0.97 -.97 -.97  1.00  1.00  1.00  1.00 -.99 -.99 -.99 -.99
 -.99 0.99  -.97  -.97 0.97 0.97  -.99  -.99  -.99  -.99 1.00 1.00 1.00 1.00

122 1.00 -1.0 1.00 1.00 -1.0 -1.0 1.00 1.00 1.00 1.00 -1.0 -1.0 -1.0 -1.0 -1.0 1.00 -1.0 -1.0 1.00 1.00 -1.0 -1.0 -1.0 -1.0 1.00 1.00 1.00 1.00 1.00 -1.0 1.00 1.00 -1.0 -1.0 1.00 1.00 1.00 1.00 -1.0 -1.0 -1.0 -1.0 -1.0 1.00 -1.0 -1.0 1.00 1.00 -1.0 -1.0 -1.0 -1.0 1.00 1.00 1.00 1.00 1.00 -1.0 1.00 1.00 -1.0 -1.0 1.00 1.00 1.00 1.00 -1.0 -1.0 -1.0 -1.0 -1.0 1.00 -1.0 -1.0 1.00 1.00 -1.0 -1.0 -1.0 -1.0 1.00 1.00 1.00 1.00 Concor iteration 3: The initial method for finding structurally equivalent positions was CONCOR, the CONvergence of iterated CORrelations.

123 Concor iteration 3, with columns ordered 1 3 4 7 8 9 10 2 5 6 11 12 13 14:
 1.00 1.00 1.00 1.00 1.00 1.00 1.00 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0
 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 1.00 1.00 1.00 1.00 1.00 1.00 1.00

124 Repeat the process on the resulting 1-blocks until you have reached structurally equivalent blocks. Because CONCOR splits every sub-group into two groups, you get a partition tree that looks something like this:
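
A single CONCOR split can be sketched as follows. This is a simplified illustration under stated assumptions, not the original CONCOR program: the first correlation matrix compares each pair over the other 12 actors, later iterations correlate whole rows of the previous matrix, and the split is read from the signs in the first row. For the 14-node example it should recover nodes 1, 3, 4, 7-10 versus 2, 5, 6, 11-14 (0-based in the code).

```python
from math import sqrt

def example_network():
    """Adjacency matrix of the 14-node example (nodes renumbered 0..13)."""
    n = 14
    adj = [[0] * n for _ in range(n)]
    edges = [(0, 1), (0, 2), (0, 3), (1, 4), (1, 5), (2, 3), (4, 5)]
    edges += [(i, j) for i in (2, 3) for j in range(6, 10)]
    edges += [(i, j) for i in (4, 5) for j in range(10, 14)]
    for i, j in edges:
        adj[i][j] = adj[j][i] = 1
    return adj

def pearson(x, y):
    """Pearson correlation of two equal-length lists with non-zero variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def first_correlations(adj):
    """Correlate tie profiles of each pair over all third parties."""
    n = len(adj)
    corr = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                others = [k for k in range(n) if k not in (i, j)]
                corr[i][j] = pearson([adj[i][k] for k in others],
                                     [adj[j][k] for k in others])
    return corr

def concor_split(adj, max_iter=100, tol=1e-9):
    """Iterate correlations of correlations, then split by sign."""
    corr = first_correlations(adj)
    for _ in range(max_iter):
        if all(abs(abs(v) - 1.0) < tol for row in corr for v in row):
            break  # every entry is (numerically) +1 or -1
        corr = [[pearson(ri, rj) for rj in corr] for ri in corr]
    same_side = {j for j, v in enumerate(corr[0]) if v > 0}
    return same_side, set(range(len(adj))) - same_side

group_a, group_b = concor_split(example_network())
print(sorted(group_a), sorted(group_b))
```

To build the full partition tree, the same function would be applied again to the sub-matrix of each resulting group.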

125 CONCOR example: Consider a simple senate voting network: The network is dense, since every cell has some score, and dynamic, since the pattern changes over time. Color by structural equivalence…

126 The network is dense, since every cell has some score, and dynamic, since the pattern changes over time. Adjust position to collapse SE positions. CONCOR example: Consider a simple senate voting network:

127 The network is dense, since every cell has some score, and dynamic, since the pattern changes over time. And then adjust color, line width, etc. for clarity. While we’ve gone some distance with identifying relevant information from the mass, how do we account for time? CONCOR example: Consider a simple senate voting network:

128 CONCOR example: Repeat at each wave, linking positions over time

129 CONCOR example:

130 Automorphic and Regular equivalence are more difficult to find, and require iteratively searching over possible class assignments for sets that have the same graph theoretic patterns. Usually start with a set of nodes defined as similar on a number of network measures, then look within these classes for automorphic equivalence classes. The classic reference is REGE (White & Reitz 1985), which recursively defines the degree of equivalence between pairs and then adjusts for as many iterations as you specify. A theoretically appealing method for finding structures that are very similar to regular equivalence, role equivalence, uses the triad census. Each node is involved in (n-1)(n-2)/2 triads, and occupies a particular position in each of these triads. These positions are summarized in the following figure:

131 Network Sub-Structure: Triads 003 (0) 012 (1) 102 021D 021U 021C (2) 111D 111U 030T 030C (3) 201 120D 120U 120C (4) 210 (5) 300 (6) Intransitive Transitive Mixed

132 An Example of the triad census
Type           Number of triads
---------------------------------------
 1 - 003       21
---------------------------------------
 2 - 012       26
 3 - 102       11
 4 - 021D       1
 5 - 021U       5
 6 - 021C       3
 7 - 111D       2
 8 - 111U       5
 9 - 030T       3
10 - 030C       1
11 - 201        1
12 - 120D       1
13 - 120U       1
14 - 120C       1
15 - 210        1
16 - 300        1
---------------------------------------
Sum (2 - 16):  63
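
As a quick arithmetic check on triad counts: a network of n nodes contains C(n, 3) triads in total, and each node sits in (n-1)(n-2)/2 of them. The census above sums to 21 + 63 = 84 triads, which equals C(9, 3), so it is consistent with a 9-node network (an inference from the totals, not something stated on the slide). The helper names below are illustrative.

```python
from math import comb

def triads_per_node(n):
    """Number of triads that include one given node: choose 2 of the other n-1."""
    return (n - 1) * (n - 2) // 2

def total_triads(n):
    """Number of distinct triads in a network of n nodes."""
    return comb(n, 3)

census_total = 21 + 63  # the 003 triads plus the sum over types 2 to 16
print(total_triads(9), census_total)
print(triads_per_node(9), triads_per_node(14))
```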

133 Triadic Position Census: 36 Positions within 16 Directed Triads 003 012_S 012_E 012_I 102_D 102_I 021D_S 021D_E 021U_S 021U_E 021C_S 021C_B 021C_E 111D_S 111D_B 111D_E 111U_S 111U_B 111U_E 030T_S 030T_B 030T_E 030C 201_S 201_B 120D_S 120D_E 120U_S 120U_E 120C_S 120C_B 120C_E 210_S 210_B 300 Indicates the position.

134 Triadic Position Census: 40 Positions within all mutual ties but two types of relations

135 Triad position vectors for the example network, resulting in 3 positions (one column per node, one row per triad position):
36 36 10 10 10 10 43 43 43 43 43 43 43 43
 0  0  0  0  0  0  0  0  0  0  0  0  0  0
20 20 41 41 41 41 14 14 14 14 14 14 14 14
 9  9 11 11 11 11 12 12 12 12 12 12 12 12
 0  0  0  0  0  0  0  0  0  0  0  0  0  0
10 10  1  1  1  1  8  8  8  8  8  8  8  8
 2  2 10 10 10 10  0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0  0  0  0  0  0  0
 1  1  5  5  5  5  1  1  1  1  1  1  1  1
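
The non-zero rows of these vectors can be reproduced with a small triad position census for an undirected network. The sketch below is an assumption-laden illustration: the six position labels (`empty`, `tie_member`, `isolate`, `path_end`, `path_centre`, `closed`) are names chosen here, and the directed positions that account for the all-zero rows are not modelled.

```python
from itertools import combinations

def example_network():
    """Adjacency matrix of the 14-node example (nodes renumbered 0..13)."""
    n = 14
    adj = [[0] * n for _ in range(n)]
    edges = [(0, 1), (0, 2), (0, 3), (1, 4), (1, 5), (2, 3), (4, 5)]
    edges += [(i, j) for i in (2, 3) for j in range(6, 10)]
    edges += [(i, j) for i in (4, 5) for j in range(10, 14)]
    for i, j in edges:
        adj[i][j] = adj[j][i] = 1
    return adj

def triad_positions(adj, ego):
    """Count the position ego occupies in every triad it belongs to.

    Returns counts in the order:
    (empty, tie_member, isolate, path_end, path_centre, closed).
    """
    counts = dict.fromkeys(
        ["empty", "tie_member", "isolate", "path_end", "path_centre", "closed"], 0
    )
    others = [k for k in range(len(adj)) if k != ego]
    for j, k in combinations(others, 2):
        ego_ties = adj[ego][j] + adj[ego][k]   # 0, 1 or 2 ties from ego
        alters_tied = adj[j][k]                # is the other pair connected?
        if ego_ties == 0:
            key = "isolate" if alters_tied else "empty"
        elif ego_ties == 1:
            key = "path_end" if alters_tied else "tie_member"
        else:
            key = "closed" if alters_tied else "path_centre"
        counts[key] += 1
    return tuple(counts.values())

adj = example_network()
vectors = [triad_positions(adj, i) for i in range(len(adj))]
print(vectors[0], vectors[2], vectors[6])
print(len(set(vectors)), "distinct position vectors")
```

Grouping nodes with identical vectors yields the three positions; with real data the vectors are rarely identical, so they would be correlated or clustered instead.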

136 Correlating each person’s triad position vector with every other person’s results in the following table, which clearly shows the positions that are equivalent:
1.00 1.00 0.64 0.64 0.64 0.64 0.98 0.98 0.98 0.98 0.98 0.98 0.98 0.98
0.64 0.64 1.00 1.00 1.00 1.00 0.50 0.50 0.50 0.50 0.50 0.50 0.50 0.50
0.98 0.98 0.50 0.50 0.50 0.50 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00

137 Jefferson High School Sunshine High School School provides a good boundary for social relations School does not provide a good boundary for social relations Complete Network Analysis Network Connections: Role Positions

138 Jefferson High School Sunshine High School Image networks. Width of tie is proportional to the ratio of cell density to mean cell density. 34% 32% 33% 4% 43% 52% Complete Network Analysis Network Connections: Role Positions

139 Once you have decided on a number of blocks, you need to determine what counts as a ‘one’ block or a ‘zero’ block. Usually this is some function of the density of the resulting block. General rules: “Fat Fit”: Only put a one in blocks with all ones in the adjacency matrix. “Lean Fit”: Put a zero if all the cells are zero, else put a one. “Density fit”: Put a one if the average value of the cell is above a certain cutoff. White, Boorman and Breiger used a ‘lean fit’ (zeroblock) rule for the examples in their paper:
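
The three rules can be written as one small function over a table of block densities. This is a sketch: `image_matrix`, the `rule` names, and the 0.5 default cutoff are assumptions made for illustration, and the density values are made up.

```python
def image_matrix(density, rule="lean", cutoff=0.5):
    """Turn block densities into a 0/1 image matrix under one of three rules."""
    def cell(d):
        if rule == "fat":       # one only if every tie in the block is present
            return int(d == 1.0)
        if rule == "lean":      # zero only if the block is completely empty
            return int(d > 0.0)
        if rule == "density":   # one if the block density exceeds the cutoff
            return int(d > cutoff)
        raise ValueError(f"unknown rule: {rule}")
    return [[cell(d) for d in row] for row in density]

# Hypothetical block densities for a two-position model.
density = [[0.9, 0.2],
           [0.0, 1.0]]

for rule in ("fat", "lean", "density"):
    print(rule, image_matrix(density, rule))
```

The same densities give three different images here, which is why the choice of rule needs to be reported alongside the blockmodel.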

140 An example: White et al, figure 1. Biomedical Specialty data:

141 White et al, figure 3. Biomedical Specialty data: Key to structure lies in zero blocks

142 Recent models Recent work has generalized blockmodels in two directions: Specific structural hypotheses, for example Core-periphery models or Structural Hole ideas. Generalized blockmodeling based on particular relationship types & patterns: Pat Doreian’s recent work with the PAJEK folks. Connectivity sets: Identifying sets of nodes with some common pattern of connectivity. This is a merge/mingle of community detection & positions. Moody & White would be an example.

143 To identify a core-periphery structure, we compare an observed block structure to an ideal block structure. 1 1 1 1 1 1 1 1 1 1 An ideal core-periphery network: Borgatti SP and Everett M G (1999) Models of core/periphery structures. Social Networks 21 375-395 Recent models Core-Periphery

144 To identify a core-periphery structure, we compare an observed block structure to an ideal block structure. (observed blocked network) Recent models Core-Periphery

145 (observed blocked network) (Ideal CP blocked network) To identify a core-periphery structure, we compare an observed block structure to an ideal block structure. Recent models Core-Periphery

146 (observed blocked network) (Ideal CP blocked network) A core periphery structure exists to the extent that the correlation between the ideal structure and the observed structure is high. We can search for cores by simply proposing a partition (many times) and then selecting the best fitting partition. But that’s silly-slow! To identify a core-periphery structure, we compare an observed block structure to an ideal block structure. Recent models Core-Periphery
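
The propose-and-compare search can be sketched by brute force on a toy network. Everything here is an assumption for illustration: the 5-node network is made up, the ideal pattern treats core-core and core-periphery ties as present and periphery-periphery ties as absent, and exhaustive enumeration is only feasible for very small networks (which is the "silly-slow" point).

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation; returns 0.0 when either list is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)

def cp_fit(adj, core):
    """Correlation between observed ties and the ideal core-periphery pattern."""
    observed, ideal = [], []
    for i, j in combinations(range(len(adj)), 2):
        observed.append(adj[i][j])
        ideal.append(1 if (i in core or j in core) else 0)
    return pearson(observed, ideal)

def best_core(adj):
    """Try every non-empty proper subset as the core and keep the best fit."""
    n = len(adj)
    candidates = (set(c) for size in range(1, n)
                  for c in combinations(range(n), size))
    return max(candidates, key=lambda core: cp_fit(adj, core))

# Toy network: nodes 0 and 1 are tied to everyone, nodes 2-4 only to 0 and 1.
toy = [[0, 1, 1, 1, 1],
       [1, 0, 1, 1, 1],
       [1, 1, 0, 0, 0],
       [1, 1, 0, 0, 0],
       [1, 1, 0, 0, 0]]

core = best_core(toy)
print(sorted(core), round(cp_fit(toy, core), 3))
```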

147 A continuous version of “coreness” can be had by generalizing the ideal image seen above. Instead of just 0/1, pairs of “high core” nodes have a very strong tie connecting them, and core-periphery nodes have a very low score. Coreness can thus be defined as a type of centrality, but one that assumes a particular underlying structure to the network. Nodes with high coreness are more likely to be at the center of a core-periphery structure. As it turns out, coreness is essentially eigenvector centrality, and UCINET sorts nodes by eigenvector centrality and builds the “core” until the correlation between ideal/observed drops. To identify a core-periphery structure, we compare an observed block structure to an ideal block structure. Recent models Core-Periphery
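
Since coreness is described as essentially eigenvector centrality, a candidate ordering of nodes can be sketched with plain power iteration. This is not UCINET's routine: the function name, the fixed iteration count, and the toy network are assumptions, and the iteration multiplies by A + I (same leading eigenvector as A) to avoid oscillation on bipartite-like graphs.

```python
from math import sqrt

def eigenvector_centrality(adj, iterations=200):
    """Approximate the leading eigenvector of a connected, undirected network."""
    n = len(adj)
    x = [1.0] * n
    for _ in range(iterations):
        # One step of power iteration on (A + I).
        nxt = [x[i] + sum(adj[i][j] * x[j] for j in range(n)) for i in range(n)]
        norm = sqrt(sum(v * v for v in nxt))
        x = [v / norm for v in nxt]
    return x

# Toy network: nodes 0 and 1 are tied to everyone, nodes 2-4 only to 0 and 1.
toy = [[0, 1, 1, 1, 1],
       [1, 0, 1, 1, 1],
       [1, 1, 0, 0, 0],
       [1, 1, 0, 0, 0],
       [1, 1, 0, 0, 0]]

scores = eigenvector_centrality(toy)
ranking = sorted(range(len(toy)), key=lambda i: -scores[i])
print([round(s, 3) for s in scores], ranking)
```

Nodes would then be added to the core in ranking order, checking the ideal/observed correlation after each addition.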

148 Recent models Core-Periphery

149 The recent work on generalization focuses on the patterns that determine a block. Instead of focusing on just the density of a block, you can identify a block as any set that has a particular pattern of ties to any other set. This work starts from the observation that types of equivalence limit the observed types of blocks. So, for example, regularly equivalent blocks must be either empty, complete, or 1-covered. The “direct” approach is thus to search for these sorts of coverings. Recent models Generalized Block Models

150 Recent models Generalized Block Models

151 Recent models Generalized Block Models From Carrington, Scott & Wasserman. Models & Methods in Social Network Analysis

152 “A friend of a friend is a friend” “The enemy of an enemy is a friend” + + + + - - F x F = F E x E = F We can generalize the balance rule to multitudes of “compound relations” Use matrices for primary relations and matrix multiplication for compounds Compound Relations.

153 One of the most powerful tools in role analysis involves looking at role systems through compound relations. A compound relation is formed by combining relations in single dimensions. The best example of compound relations comes from kinship. Sibling Child of Sibling 0 1 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0 Child of 0 0 1 1 0 0 0 0 0 1 0 0 0 0 0 x = Nephew/Niece 0 0 0 0 1 0 0 1 1 0 0 0 0 0 0 S × C = SC
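
The kinship compound can be checked with a Boolean matrix product. In this sketch the Sibling matrix is taken from the slide; only the first rows of the Child-of and Nephew/Niece matrices appear there, so the remaining rows are assumed to be all zero. `bool_product` is an illustrative helper name.

```python
def bool_product(a, b):
    """Boolean matrix product: entry (i, j) is 1 if some k has a[i][k] and b[k][j]."""
    n = len(a)
    return [[int(any(a[i][k] and b[k][j] for k in range(n))) for j in range(n)]
            for i in range(n)]

# sibling[i][j] = 1 if j is a sibling of i
sibling = [[0, 1, 0, 0, 0],
           [1, 0, 0, 0, 0],
           [0, 0, 0, 1, 0],
           [0, 0, 1, 0, 0],
           [0, 0, 0, 0, 0]]

# child_of[i][j] = 1 if j is a child of i (last rows assumed zero)
child_of = [[0, 0, 1, 1, 0],
            [0, 0, 0, 0, 1],
            [0, 0, 0, 0, 0],
            [0, 0, 0, 0, 0],
            [0, 0, 0, 0, 0]]

# A sibling's child is a nephew or niece: S x C = SC
nephew_niece = bool_product(sibling, child_of)
for row in nephew_niece:
    print(*row)
```

Longer compounds are built the same way, by multiplying the result with a further primary relation.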

154 An example of compound relations can be found in W&F. This role table catalogues the compounds for two relations “Is boss of” and “Is on the same level as” Consider a system with two sorts of relations. Here, one is hierarchical and the other defines “within class”. We can build a role table with Boolean multiplication of the relations

155 An example of compound relations can be found in W&F. This role table catalogues the compounds for two relations “Is boss of” and “Is on the same level as” “Boss” X “boss of my boss is my boss”

156 An example of compound relations can be found in W&F. This role table catalogues the compounds for two relations “Is boss of” and “Is on the same level as” “On the same level” X

157 An example of compound relations can be found in W&F. This role table catalogues the compounds for two relations “Is boss of” and “Is on the same level as”

158 Kinship networks form a foundation to social structures. In the west, we have 2 primary relations (Parent of, married to) and one partitioning attribute (male or female). So: Parent of a Parent = Grandparent Father’s Father = Paternal Grandfather Mother’s Father = Maternal Grandfather Wife’s Mother’s Son = Brother-in-law Mother’s Mother’s son’s son = Cousin (mom’s side) Quality: The entire western kinship structure can be decomposed into a set of equations consisting of only Parent, Child, and Gender. Quantity: Given a fertility rate of 2 kids, the two-step* kinship neighborhood would have 26 people; if the fertility rate were 3 the same count goes up to 46. *2-steps includes aunts & uncles, but not their spouses. Compound Relations.

159 The scientist’s second rule has to be to look for regularity and exploit that for theory. Consider as a good example, Harrison White’s Kinship model: Compound Relations.

160 Ego connects to any of these Compound Relations. The scientist’s second rule has to be to look for regularity and exploit that for theory. Consider as a good example, Harrison White’s Kinship model:

161 Kinship networks form a foundation to social structures. In China, we have the same 2 primary relations: Parent of, Married to. But 3 partitioning attributes: Gender, Relative Age, Relational Order (1st wife, 2nd wife, etc.). This means that compounds we name as equivalent (cousin, uncle) are named differently. But, while westerners largely ignore gender for anything other than final designation (aunt/uncle, niece/nephew), Chinese kinship terms are differentiated by parent’s line (maternal aunt, maternal uncle, etc.). We know this designation, but use it rarely. Compound Relations.

162 *2-steps includes aunts & uncles, but not their spouses. Compound Relations.

163 Uncles Compound Relations.

164

165 The Chinese extended family network – for “normal” relations westerners would recognize – includes 74 unique kinship terms. The same set in the west has 28 different terms. Each of these terms carries a different expected gift exchange system at holidays and mourning attire at death. Compound Relations.

166 How has this system changed? Consider the effects of the 1-child policy: Source: Population research Bureau With a fertility of 6, 2-step kinship nets would have 166 people; with 2 it’s 26. A full implementation of 1-child removes the “relative age” operator, erasing every kinship term dependent on “older” or “younger” and means that families play either in a maternal or a paternal line, but not both. Compound Relations.

167 Using Compound Relations theoretically: James Montgomery & Patronage systems

168 Using Compound Relations theoretically: James Montgomery & Patronage systems

169 Using Compound Relations theoretically:

170

171 Other work on this general topic:

172 Using Compound Relations theoretically: Other work on this general topic:

173 Methods: How to? The basic block model formation can be done in multiple ways: 1. Apply any of our group-finding algorithms to a role-based similarity matrix. Here you’re simply converting the conditions for equivalence to adjacency and solving for modularity. Requires either a community detection algorithm that uses valued ties or a binarization of the similarity matrix. 2. Cluster node-level structural indices (get at regular/automorphic equivalence). This is the “evade” correlate to SE from community detection: cluster on a BUNCH of easy-to-calculate node-level network statistics and this gives you nodes that are equivalent (with respect to the measures you used!)

174 Methods: How to? The basic block model formation can be done in multiple ways: Role-specific algorithms:

175 Methods: How to? The basic block model formation can be done in multiple ways: Role-specific algorithms:

176 Methods: How to? Triad Structural Equivalence in SAS

177 Methods: How to? Triad Structural Equivalence in SAS

178 Methods: How to? Triad Structural Equivalence in SAS

179 Addendum A new statistic for determining the number of groups in a network. Proc cluster gives you a statistic for the basic “fit” of a cluster solution. This statistic varies depending on the method used, but is usually something like an R2. Consider this dendrogram:

180 Addendum A new statistic for determining the number of groups in a network. Proc cluster gives you a statistic for the basic “fit” of a cluster solution. This statistic varies depending on the method used, but is usually something like an R2. Consider this dendrogram: The SPRSQ and the RSQ are your fit statistics.

181 Addendum A new statistic for determining the number of groups in a network. SPRSQ RSQ A sharp change in the statistic is your best indicator.

182 Addendum A new statistic for determining the number of groups in a network. Modularity: M is the modularity score; s indexes each group (“module”); l_s is the number of lines in group s; L is the total number of lines; d_s is the sum of the degrees of the nodes in s; N_m is the number of groups.
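
Assuming the standard modularity expression that fits these symbol definitions, $M = \sum_{s=1}^{N_m} \left[ \frac{l_s}{L} - \left( \frac{d_s}{2L} \right)^2 \right]$, the score for a given grouping can be computed as sketched below. The function name and the two-triangle test network are illustrative assumptions.

```python
def modularity(adj, groups):
    """Modularity M of a grouping of an undirected 0/1 network.

    For each group s: (lines inside s) / L  -  (sum of degrees in s / 2L) ** 2.
    """
    n = len(adj)
    total_lines = sum(adj[i][j] for i in range(n) for j in range(i + 1, n))
    score = 0.0
    for members in groups:
        lines_in = sum(adj[i][j] for i in members for j in members if i < j)
        degree_sum = sum(sum(adj[i]) for i in members)
        score += lines_in / total_lines - (degree_sum / (2 * total_lines)) ** 2
    return score

def two_triangles():
    """Toy network: two disconnected triangles (nodes 0-2 and 3-5)."""
    adj = [[0] * 6 for _ in range(6)]
    for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5)]:
        adj[i][j] = adj[j][i] = 1
    return adj

adj = two_triangles()
print(modularity(adj, [[0, 1, 2], [3, 4, 5]]))   # one group per triangle
print(modularity(adj, [[0, 1, 2, 3, 4, 5]]))     # everything in one group
```

Evaluating M at each cut of a cluster tree and keeping the cut with the highest score is one way to choose the number of groups.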

183 Role Positions Identifying positions: Could use the Modularity score at each tree cut…

184 Role Positions Example positions identified in a single school network (role 7 is a “leading crowd” in the simplest sum-of-in-degree sense)

185 Repeating this process across all networks generates a population of within-school position profiles. We then pool & cluster these position profiles in a “2nd-order clustering” to identify a set of roles that can be compared across the populations. We settle on a 5-position solution: Role Positions 89/2313260/501 39/107850/1235 4/815 35/263 50/819 0/416 Outsiders Aloofs Friends Hangers Central Core

186 Role Positions Uninvolved outsiders (35% of students, 28% of role groups) Largely uninvolved: nominate few and are nominated rarely by others. Includes isolated dyads & small groups; the mixing matrix shows that their few friends tend to be others in the same position.

187 Role Positions Non-Reciprocated (17% of students, 15% of role groups) Makes nominations, but rarely reciprocated and has low in-degree, targeting highly central nodes with nominations. “Hangers on” position.

188 Role Positions Basically average – positive scores largely because the isolates have been removed – liked by some, like others. Everyday kids: good friends (21% of students, 29% of role groups)

189 Role Positions “Popular Aloof” (9% of students, 9% of role groups) High in-degree but low out-degree, but the few they do nominate tend to reciprocate.

190 Role Positions Central Core (17.5% of students, 17.8% of role groups) Highly reciprocated ties, active, very central; both high in-degree and reciprocation rates.

191 Role Positions How stable is occupancy of a school role?

192 Role Positions How stable is occupancy of a school role?

