An Introduction to Social Network Analysis James Moody Department of Sociology The Ohio State University.


1 An Introduction to Social Network Analysis James Moody Department of Sociology The Ohio State University

2 Introduction The world we live in is connected: Jim Moody Craig Calhoun Isaias Afworki

3 Introduction These patterns of connection form a social space. Social network analysis maps and analyzes this social space.

4 Adolescent Social Structure

5

6

7 Introduction Yet standard social science analysis methods do not take this space into account. Moreover, the complexity of the relational world makes it impossible (in most cases) to understand this connectivity using only our intuitive understanding of a setting.

8 Introduction Why networks matter: Intuitive: information travels through contacts between actors, which can reflect a power distribution or influence attitudes and behaviors. Our understanding of social life improves if we account for this social space. Less intuitive: patterns of inter-actor contact can have effects on the spread of “goods” or power dynamics that could not be seen focusing only on individual behavior.

9 Introduction Social network analysis is: a set of relational methods for systematically understanding and identifying connections among actors, and a body of theory relating to types of observable social spaces and their relation to individual and group behavior.

10 Introduction Network analysis assumes that: How actors behave depends in large part on how they are linked together. Example: Adolescents with peers that smoke are more likely to smoke themselves. The success or failure of organizations may depend on the pattern of relations within the organization. Example: The ability of companies to survive strikes depends on how product flows through factories and storehouses. (continued…)

11 Introduction Network analysis assumes that: Patterns of relations reflect the power structure of a given setting, and clustering may reflect coalitions within the group. Example: Overlapping voting patterns in a coalition government.

12 Introduction An information network: Email exchanges within the Reagan White House, early 1980s (source: Blanton, 1995)

13 Introduction Power positions and potential influence

14 Overview: Introduction; Basic Concepts; Flows within Networks; Structure of Social Space; Tools, Models & Methods for Flows and Structures; Conclusions

15 Basic Concepts Origins of network analysis: Beginning in the 1930s, a systematic approach to theory and research, based on the notion that relations matter, began to emerge. In 1934 Jacob Moreno introduced the ideas and tools of sociometry. At the end of World War II, Alex Bavelas founded the Group Networks Laboratory at M.I.T.

16 Basic Concepts From the outset, network analysis has been: a. guided by formal theory organized in mathematical terms, and b. grounded in the systematic analysis of empirical data. It was in the 1970s, when modern discrete combinatorics (esp. graph theory) developed rapidly and powerful computers became readily available, that the study of social networks began to flourish.

17 Basic Concepts Actors are nodes: ideas, papers, events, individuals, organizations, nations. Relations are lines between pairs of nodes: symmetric (shares a room with), asymmetric (gives an order to), valued (number of times seen together).

18 Basic Concepts Network data are familiar to you. For example: - Personal, face-to-face contact - Telephone contact - Email contact - Contact through faxes or wires - Snail-mail contact - Membership in the same organization - Attendance at the same meetings - Graduates of the same university

19 Basic Concepts For example, you might be tracking the activities of a number of people in related, but not identical cases, including meetings they attended. You may know little of the content of the event, or what they may have said to each other, only whether particular people were at the event. Your data might look like:

20
- 11.19.2001. Meeting at Brussels. Attending: Smith, Johnson, Davis, James, Jackson
- 12.22.2001. Meeting at Paris. Attending: Johnson, James, Jones, Wilson
- 1.12.2001. Meeting in New York. Attending: Jones, Carter, Burns
- 2.14.2001. Meeting in Denver. Attending: Wilson, Burns, Wilf, Newman
(Red bold indicates people who are the focus of an investigation)

21 Basic Concepts While perhaps not immediately apparent when looking at the list of names, a simple algorithm reveals connections among these actors. [Figure: network diagram linking Jackson, Davis, Smith, Johnson, Wilson, Newman, Wilf, Burns, Carter, Jones and James]
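One way such a "simple algorithm" might look is a co-attendance count: two people are tied if they attended the same meeting. The sketch below is an illustration only, not the procedure used for the slide; the function name `co_attendance` and the dictionary layout are assumptions, while the attendance lists come from the example data above.

```python
from itertools import combinations
from collections import defaultdict

# Attendance lists copied from the example meeting data.
meetings = {
    "Brussels": ["Smith", "Johnson", "Davis", "James", "Jackson"],
    "Paris": ["Johnson", "James", "Jones", "Wilson"],
    "New York": ["Jones", "Carter", "Burns"],
    "Denver": ["Wilson", "Burns", "Wilf", "Newman"],
}

def co_attendance(meetings):
    """Return {frozenset({a, b}): number of meetings both a and b attended}.

    Hypothetical helper: every pair of attendees at one meeting gets a tie,
    and the tie value counts how many meetings the pair shared.
    """
    ties = defaultdict(int)
    for attendees in meetings.values():
        for a, b in combinations(sorted(set(attendees)), 2):
            ties[frozenset((a, b))] += 1
    return dict(ties)

ties = co_attendance(meetings)
# Valued relation: Johnson and James were seen together at two meetings.
johnson_james = ties[frozenset(("Johnson", "James"))]
```

The result is a valued, symmetric relation in the sense of the Basic Concepts slides: people who never shared a meeting (for instance Smith and Newman) have no direct tie, yet they may still be linked indirectly through others.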

22 Basic concepts Types of network data: 1) Ego-network - Have data on a respondent (ego) and the people they are connected to (alters) - May include estimates of connections among alters

23 Basic concepts Types of network data: 2) Partial network - Ego networks plus some amount of tracing to reach contacts of contacts - Something less than full account of connections among all pairs of actors in the relevant population - Example: CDC Contact tracing data for STDs

24 Basic concepts Types of network data: 3) Complete - Data on all actors within a particular (relevant) boundary - Never exactly complete, but boundaries are set - Example: Coauthorship data among all writers in the social sciences

25 Examples: linked levels of data [Figure labels: Actor, Key contact, Alter, Contact’s contact; Primary Relation, Alter Relation, Trace Relation]

26 Why networks matter: Consider the following (much simplified) scenario: The probability that actor i passes information to actor j ($p_{ij}$) is a constant over all relations: $p_{ij} = 0.6$. S and T are connected through the following structure: [Figure: network connecting S and T] The probability that S passes the information to T through either path would be: 0.09

27 Why networks matter: Distance Probability of transfer of information over independent paths: The probability that the information passes from i to j is assumed constant at $p_{ij}$. The probability that the information passes through multiple links (i to j, and from j to k) is the joint probability of each (link 1 and link 2 and … link d) = $p_{ij}^{d}$, where d is the path distance. To calculate the probability of the information passing through multiple paths, use the complement of it not passing through any path. The probability of not passing through path l is $1 - p_{ij}^{d}$, and thus the probability of not passing through any path is $(1 - p_{ij}^{d})^{k}$, where k is the number of paths. Thus, the probability of i passing the information to j given k independent paths is: $P = 1 - (1 - p_{ij}^{d})^{k}$
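The independent-paths calculation described here, one minus the probability that none of k paths of length d transmits, can be written as a few lines of code. This is a minimal sketch; the function name `transfer_probability` is an assumption, and it assumes every path has the same length d and every link the same probability p.

```python
def transfer_probability(p, d, k):
    """Probability that information crosses at least one of k independent
    paths, each d links long, when every link transmits with probability p."""
    path_prob = p ** d                # all d links on a single path must transmit
    return 1 - (1 - path_prob) ** k   # complement of "no path transmits"

# Tabulate distances 2 to 6 for 1, 2, 5 and 10 paths at p = 0.6.
table = {
    k: [round(transfer_probability(0.6, d, k), 3) for d in range(2, 7)]
    for k in (1, 2, 5, 10)
}
```

Each row of `table` falls as distance grows and rises with the number of paths, which is the pattern the later topology slides describe.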

28 Probability of information passing over non-independent paths: To get the probability that i passes the information to j given that paths intersect at 4, I calculate [expression not preserved in the transcript] using the independent paths formula.

29 Why networks matter: Now consider the following (similar?) scenario: [Figure: network connecting S and T] Every actor but one has the exact same number of contacts. The category-to-category mixing is identical. The distance from S to T is the same (7 steps). S and T have not changed their behavior. Their contacts’ contacts have the same behavior. But the probability of the information passing from S to T is: 0.148. Different outcomes & different potentials for intervention.

30 Overview: Introduction; Basic Concepts; Flows within Networks; Structure of Social Space; Tools, Models & Methods for Flows and Structures; Conclusions

31 Network Flow In addition to the simple probability that one actor passes information on to another ($p_{ij}$), two factors affect flow through a network: Topology - the shape, or form, of the network - Example: one actor cannot pass information to another unless they are either directly or indirectly connected. Time - the timing of contact matters - Example: an actor cannot pass on information he has not yet received.

32 Topology Two features of the network’s shape are known to be important: connectivity and centrality Connectivity refers to how actors in one part of the network are connected to actors in another part of the network. Reachability: Is it possible for actor i to reach actor j? This can only be true if there is a chain of contact from one actor to another. Distance: Given they can be reached, how many steps are they from each other? Number of paths: How many different paths connect each pair?

33 Network topology: reachability Without full network data, you can’t distinguish actors with limited information from those more deeply embedded in a setting. [Figure: panels a, b and c]

34 Network topology: distance & number of paths Given that ego can reach alter, distance determines the likelihood of information passing from one end of the chain to another. Because information spread is never certain, the probability of transfer decreases over distance. However, the probability of transfer increases with each alternative path connecting pairs of people in the network.

35 Network topology: distance & number of paths Distance is measured by the (weighted) number of relations separating a pair. In the figure, actor “a” is:
- 1 step from 4 actors
- 2 steps from 5 actors
- 3 steps from 4 actors
- 4 steps from 3 actors
- 5 steps from 1 actor
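A distance tally of this kind can be produced with a breadth-first search from the focal actor. The sketch below assumes unweighted, undirected ties and uses a small hypothetical graph (not the one on the slide); the function names are illustrative. Actors that never appear in the result are unreachable, which also answers the reachability question.

```python
from collections import Counter, deque

def distances_from(graph, source):
    """Breadth-first search: steps from source to every reachable actor."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:       # first visit is the shortest route
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist

def distance_distribution(graph, source):
    """Count how many actors sit at 1 step, 2 steps, ... from source."""
    dist = distances_from(graph, source)
    return Counter(d for node, d in dist.items() if node != source)

# Hypothetical network: "f" is isolated, so it is unreachable from "a".
graph = {
    "a": {"b", "c"}, "b": {"a", "d"}, "c": {"a", "d"},
    "d": {"b", "c", "e"}, "e": {"d"}, "f": set(),
}
steps_from_a = distance_distribution(graph, "a")
```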

36 Network topology: distance & number of paths Paths are the different routes one can take. Node-independent paths are particularly important. [Figure: network containing actors a and b] There are 2 independent paths connecting a and b. There are many non-independent paths.

37 [Figure: Probability of information transfer by distance and number of paths, assuming a constant $p_{ij}$ of 0.6. x-axis: path distance (2 to 6); y-axis: probability (0 to 1.2); series: 1 path, 2 paths, 5 paths, 10 paths]

38 Reachability in Colorado Springs (Sexual contact only) (Node size = log of degree) High-risk actors over 4 years 695 people represented Longest path is 17 steps Average distance is about 5 steps Average person is within 3 steps of 75 other people 137 people connected through 2 independent paths, core of 30 people connected through 4 independent paths

39 Network topology: centrality Centrality refers to (one dimension of) location, identifying where an actor resides in a network. For example, we can compare actors at the edge of the network to actors at the center. In general, this is a way to formalize intuitive notions about the distinction between insiders and outsiders.

40 Centrality example: At the local level, we expect people like NSJMP and NSOLN to have greater access to information than others in the network. Network analysis gives us a set of tools to quantify this difference.

41 (Node size proportional to betweenness centrality ) Centrality example: Actors that appear very different when seen individually, are comparable in the global network.

42 Information flows Two factors that affect network flows: Topology - the shape, or form, of the network - simple example: one actor cannot pass information to another unless they are either directly or indirectly connected. Time - the timing of contacts matters - simple example: an actor cannot pass on information he has not yet received.

43 Timing in networks A focus on contact structure often slights the importance of network dynamics Time affects networks in two important ways: 1) The structure itself goes through phases that are correlated with information spread 2) The timing of contact constrains information flow

44 Sexual Relations among A syphilis outbreak Jan - June, 1995 Rothenberg et al map the pattern of sexual contact among youth involved in a Syphilis outbreak in Atlanta over a one year period. (Syphilis cases in red) Changes in Network Structure

45 Sexual Relations among A syphilis outbreak July-Dec, 1995

46 Sexual Relations among A syphilis outbreak July-Dec, 1995

47 Data on drug users in Colorado Springs, over 5 years Drug Relations, Colorado Springs, Year 1

48 Data on drug users in Colorado Springs, over 5 years Drug Relations, Colorado Springs, Year 2 Current year in red, past relations in gray

49 Data on drug users in Colorado Springs, over 5 years Drug Relations, Colorado Springs, Year 3 Current year in red, past relations in gray

50 Data on drug users in Colorado Springs, over 5 years Drug Relations, Colorado Springs, Year 4 Current year in red, past relations in gray

51 Data on drug users in Colorado Springs, over 5 years Drug Relations, Colorado Springs, Year 5 Current year in red, past relations in gray

52 What impact does timing have on flow through the network? In addition to changes in the shape over time, contact timing constrains how information can flow through the network. Consider the following example:

53 A hypothetical contact network [Figure: contact network among actors A, B, C, D, E and F. Numbers above lines indicate contact periods: 2 - 5, 3 - 7, 0 - 1, 8 - 9, 3 - 5]

54 The path graph for the hypothetical contact network [Figure: path graph among actors A, B, C, D, E and F]
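The effect of contact timing can be illustrated by computing, for each actor, the earliest time information could arrive along time-respecting paths. The sketch below is an assumption-laden illustration: the transcript does not preserve which contact period belongs to which pair, so the pairing of periods to ties in `contacts` is hypothetical, as is the function name `earliest_arrival`. Information crosses a tie only if the holder has it before the contact period ends.

```python
import math

# (actor, actor, period start, period end). The periods are those listed on
# the slide; their assignment to specific pairs is a HYPOTHETICAL choice.
contacts = [
    ("A", "B", 2, 5),
    ("B", "C", 3, 7),
    ("C", "E", 0, 1),
    ("C", "D", 8, 9),
    ("D", "F", 3, 5),
]

def earliest_arrival(contacts, source, start_time=0):
    """Earliest time each actor can hold information that starts at source."""
    arrival = {source: start_time}
    changed = True
    while changed:                      # relax contacts until nothing improves
        changed = False
        for u, v, begin, end in contacts:
            for giver, receiver in ((u, v), (v, u)):
                held_since = arrival.get(giver, math.inf)
                if held_since <= end:   # giver has it before the contact ends
                    received = max(held_since, begin)
                    if received < arrival.get(receiver, math.inf):
                        arrival[receiver] = received
                        changed = True
    return arrival

from_a = earliest_arrival(contacts, "A")
from_e = earliest_arrival(contacts, "E")
```

Under these assumed timings, E can reach A (the C-E contact comes early), but A cannot reach E, because that contact has ended before anything from A arrives at C. The implied path graph is therefore directed even though every contact is mutual.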

55 Direct contact network of 8 people in a ring (adjacency matrix: cell = number of paths from row to column)

56 Implied contact network of 8 people in a ring All contacts concurrent

57 Implied contact network of 8 people in a ring: Mixed Concurrent. Tie labels: 2, 2, 1, 1, 2, 2, 3, 3. Density = 0.57

58 Implied contact network of 8 people in a ring: Serial (1). Tie labels: 1, 2, 3, 7, 6, 5, 8, 4. Density = 0.73

59 Implied contact network of 8 people in a ring: Serial (2). Tie labels: 1, 2, 3, 7, 6, 1, 8, 4. Density = 0.51

60 Implied contact network of 8 people in a ring: Serial (3). Tie labels: 1, 2, 1, 1, 2, 1, 2, 2. Density = 0.43

61 Information flows Summary: Topology: - Information requires connected communication chains - Real-world networks are too complex to map these without specialized tools. Time: - Network topology changes over time. This has implications for information flow. - Because small changes in relationship timing can have dramatic effects on information flow, it is impossible to know this intuitively.

62 Overview: Introduction; Basic Concepts; Flows within Networks; Structure of Social Space; Tools, Models & Methods for Flows and Structures; Conclusions

63 Structure of Social Space Information flows are only one use of networks It is also possible to characterize the key topological features of any social network. These features include things such as the extent of hierarchy and clustering.

64 Structure of Social Space 1) Identify core groups & patterns of relations among groups: a. embeddedness in groups constrains action; b. group structure affects stability & resource distribution. 2) Locate tensions or inconsistencies in a relational structure that might indicate sources of social change.

65 Structure of Social Space Two features of interest related to network structure: 1) Cohesive groups: Sets of people who interact frequently with each other. These are often groups that work together. Groups are often organized into positions within a network that indicate particular roles or access to resources. 2) Hierarchy: Relational structure can identify the leadership positions within a network, through either direction of ties or periphery status.

66 Structure of cohesive groups A cohesive group is a set of actors with more interaction inside the group than outside the group, mutually connected through multiple paths.

67 Cohesive Group Structure “Immaculate Preparatory High School”

68 Cohesive Group Structure: 3 types of positions “Immaculate Preparatory High School”

69 Cohesive Group Structure: Group member “Immaculate Preparatory High School”

70 Cohesive Group Structure: Group Member “Immaculate Preparatory High School”

71 Cohesive Group Structure: Bridge between groups “Immaculate Preparatory High School”

72 Cohesive Group Structure: Outsider “Immaculate Preparatory High School”

73 Cohesive Groups: Relevance Identify people who bridge important constituencies: people who are between groups have a unique ability to control information. Such actors are said to bridge structural holes; the number of “holes” an actor bridges gives insight into that actor’s power position in the network.

74 Hierarchy and network position Many cohesive groups are embedded within a hierarchy, which one can map using relational tools. Changes in the hierarchical position indicate changes in the power structure.

75 Examples of Hierarchical Systems Linear Hierarchy (all triads transitive) Simple Hierarchy Branched Hierarchy Mixed Hierarchy

76 Hierarchy and network position If you don’t know the hierarchy of the network, asymmetry optimization techniques allow one to identify levels in a hierarchy

77 Hierarchy and network position If you don’t know the hierarchy of the network, asymmetry optimization techniques allow one to identify levels in a hierarchy

78 Group structure through multiple relations Start with some basic ideas of what a role is: An exchange of something (support, ideas, commands, etc.) between actors. Thus, we might represent a family as: [Figure: family network of H, W and three C nodes, with relations “Provides food for”, “Romantic Love” and “Bickers with”] (and there are, of course, many other relations inside the family)

79 Group structure through multiple relations The key idea is that we can express a role through a relation (or set of relations) and thus a social system by the inventory of roles. If roles equate to positions in an exchange system, then we need only identify particular aspects of a position. But what aspect? Structural Equivalence: Two actors are structurally equivalent if they have the same types of ties to the same people.
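The definition of structural equivalence can be checked directly by comparing tie sets. The sketch below is a simplification: it assumes a single undirected relation (the slide's definition covers several types of ties), and the family graph and function names are hypothetical. Two actors count as equivalent when their ties to all third parties are identical.

```python
def structurally_equivalent(graph, a, b):
    """True if a and b have identical ties to every third party."""
    return (graph[a] - {a, b}) == (graph[b] - {a, b})

def equivalence_classes(graph):
    """Group actors into positions of mutually equivalent actors."""
    classes = []
    for node in sorted(graph):
        for members in classes:
            if structurally_equivalent(graph, node, members[0]):
                members.append(node)
                break
        else:                           # no existing position fits this actor
            classes.append([node])
    return classes

# Hypothetical single relation in a family: parents H and W are tied to each
# other and to each child; children are tied only to the parents.
family = {
    "H": {"W", "C1", "C2", "C3"},
    "W": {"H", "C1", "C2", "C3"},
    "C1": {"H", "W"},
    "C2": {"H", "W"},
    "C3": {"H", "W"},
}
positions = equivalence_classes(family)
```

Here the three children collapse into one position and the two parents into another, which is the kind of reduction from a graph of actors to a graph of positions that structural equivalence is meant to support.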

80 Structural Equivalence A single relation

81 Structural Equivalence Graph reduced to positions

82 Alternative notions of equivalence Instead of exact same ties to exact same alters, you look for nodes with similar ties to similar types of alters

83 Overview: Introduction; Basic Concepts; Flows within Networks; Structure of Social Space; Tools, Models & Methods for Flows and Structures; Conclusions

84 Tools, Methods & Models Data Representations [Figure: the same five-node network (nodes 1 to 5) shown four ways: as a graph, an adjacency matrix, an arc list with Send and Recv columns, and a node list]
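The representations named on this slide carry the same information and can be converted into one another. The sketch below shows one possible conversion from an arc list to an adjacency matrix and a node list; the arcs are hypothetical (the values on the original slide did not survive transcription) and the function names are assumptions.

```python
# Hypothetical directed arc list (Send, Recv) for a five-node network.
arcs = [(1, 2), (1, 3), (2, 4), (3, 2), (4, 5), (5, 1)]

def to_adjacency_matrix(arcs, n):
    """n x n matrix with a 1 in row Send, column Recv for every arc."""
    matrix = [[0] * n for _ in range(n)]
    for send, recv in arcs:
        matrix[send - 1][recv - 1] = 1   # node ids start at 1, indexes at 0
    return matrix

def to_node_list(arcs, n):
    """Map each node to the list of nodes it sends to."""
    nodes = {i: [] for i in range(1, n + 1)}
    for send, recv in arcs:
        nodes[send].append(recv)
    return nodes

matrix = to_adjacency_matrix(arcs, 5)
node_list = to_node_list(arcs, 5)
```

The matrix needs n x n cells regardless of how many ties exist, while the arc and node lists grow only with the number of ties, which is one reason list formats are usually preferred for large, sparse networks.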

85 Tools, Methods & Models Graphical Display Benefits: Intuitive way to display networks. Helps people see the social space – it is a map. A concise presentation of a great deal of data. Costs: Lack of standards for how to display can create misleading images. Displays of large networks tend to reveal only the roughest properties of the network

86 Tools, Methods & Models Graphical Display: Software PAJEK Program for analyzing and plotting very large networks Intuitive windows interface Used for most of the real data plots in this presentation Mainly a graphics program, but is expanding the analytic capabilities Free Available from:

87 Tools, Methods & Models Graphical Display: Software Cyram Netminer for Windows Very new: largely untested Price range depends on application Limited to smaller networks O(100)

88 Tools, Methods & Models Graphical Display: Software NetDraw Also very new, but by one of the best known names in network analysis software. Free Limited to smaller networks O(100)

89 Tools, Methods & Models Analysis Methods: Descriptive / Measurement The key text for methods and measurement is: Wasserman, Stanley and Katherine Faust. 1994. Social Network Analysis. Cambridge: Cambridge University Press. The basic network measures use graph theory to formalize aspects of the network, and always work from either an adjacency matrix (slow for large graphs) or an edge/node list.

90 Tools, Methods & Models Analysis Methods: Descriptive / Measurement Properties of interest include: Individual Level: Degree: Number of contacts for each person - Sum over the row/column of the adjacency matrix. Closeness Centrality: Inverse of the distance to every other node in the network. Count path distances from ego to alters. Sub-group Level: Group Membership: Which groups are there? Various search algorithms for identifying groups. Group Position: Where does a given group fit in the overall flow of relations? Various Equivalence algorithms. Graph Level: Density: Number of ties present as a percentage of all possible ties. Centralization: To what degree are edges focused through a small number of nodes. Various formulas for different centrality indices.
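Three of the measures listed above (degree, closeness and density) can be computed from an adjacency matrix in a few lines. The sketch below assumes a symmetric 0/1 matrix for an undirected network and takes closeness as the inverse of the summed distances from ego to all alters, which is one common reading of the definition; function names and the example matrix are hypothetical.

```python
from collections import deque

def degree(matrix):
    """Number of contacts per actor: the row sums of the adjacency matrix."""
    return [sum(row) for row in matrix]

def density(matrix):
    """Ties present as a share of all possible ordered pairs."""
    n = len(matrix)
    return sum(sum(row) for row in matrix) / (n * (n - 1))

def closeness(matrix, ego):
    """Inverse of the total path distance from ego to every other actor.

    Returns 0.0 when some actor cannot be reached from ego.
    """
    n = len(matrix)
    dist = {ego: 0}
    queue = deque([ego])
    while queue:
        i = queue.popleft()
        for j in range(n):
            if matrix[i][j] and j not in dist:
                dist[j] = dist[i] + 1
                queue.append(j)
    total = sum(dist.values())
    return 1 / total if len(dist) == n and total > 0 else 0.0

# Hypothetical chain of four actors: 0 - 1 - 2 - 3.
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
```

In this chain the two middle actors have both higher degree and higher closeness than the two end actors, matching the intuition about insiders and outsiders.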

91 Tools, Methods & Models Analysis Methods: Descriptive / Measurement: Software 1) UCI-NET General Network analysis program, runs in Windows Good for computing measures of network topography for single nets Input-Output of data is a little clunky, but workable. Not optimal for large networks Available from: Analytic Technologies Borgatti@mediaone.net 2) STRUCTURE “A General Purpose Network Analysis Program providing Sociometric Indices, Cliques, Structural and Role Equivalence, Density Tables, Contagion, Autonomy, Power and Equilibria In Multiple Network Systems.” DOS Interface w. somewhat awkward syntax Great for role and structural equivalence models Manual is a very nice, substantive, introduction to network methods Available from a link at the INSNA web site: http://www.heinz.cmu.edu/project/INSNA/soft_inf.html

92 Tools, Methods & Models Analysis Methods: Descriptive / Measurement: Software 3) NEGOPY Program designed to identify cohesive sub-groups in a network, based on the relative density of ties. DOS based program, need to have data in arc-list format Moving the results back into an analysis program is difficult. Available from: William D. Richards http://www.sfu.ca/~richards/Pages/negopy.htm 4) SPAN - Sas Programs for Analyzing Networks (Moody, ongoing) is a collection of IML and Macro programs that allow one to: a) create network data structures from nomination data b) import/export data to/from the other network programs c) calculate measures of network pattern and composition d) analyze network models Allows one to work with multiple, large networks Easy to move from creating measures to analyzing data All of the Add Health data are already in SAS Available by sending an email to: Moody.77@osu.edu

93 Tools, Methods & Models Analysis Methods: Statistical Models There are two general classes of statistical models for networks: 1) Models of the network itself. The statistical question is how an observed network fits into the class of all possible random graphs with a given set of topological characteristics. The whole network is the substantive unit of analysis, though technically one works with the dyads from the network. Examples: p* models (Wasserman and Pattison), MCMC random graph models (Tom Snijders, Mark Handcock). 2) Models of individual behavior that incorporate network characteristics. The statistical question is whether or not network properties affect individual behaviors. Examples: Network regressive-autoregressive models (Doreian), Peer influence models (Friedkin)

94 Tools, Methods & Models Analysis Methods: Statistical Models Exponential Random Graph Models $P(X = x) = \frac{\exp\{\theta' z(x)\}}{k}$ Where: z is a collection of r explanatory variables, calculated on x; $\theta$ is a collection of r parameters to be estimated; k is a normalizing constant that ensures the probability sums to 1. As it turns out, k is incredibly difficult to identify, introducing a number of complexities to the model.
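Why the normalizing constant is so troublesome becomes clearer from a brute-force calculation. The sketch below assumes the usual exponential form, probability proportional to exp(θ'z(x)), with two illustrative statistics (edge count and triangle count) on undirected graphs; all function names are hypothetical. It obtains k by summing over every possible graph, which is feasible only for a handful of nodes.

```python
from itertools import combinations, product
from math import exp

def statistics(edges, n):
    """z(x): (number of edges, number of triangles) of an undirected graph.

    Edges are (i, j) tuples with i < j.
    """
    edge_set = set(edges)
    triangles = sum(
        1 for a, b, c in combinations(range(n), 3)
        if {(a, b), (a, c), (b, c)} <= edge_set
    )
    return (len(edge_set), triangles)

def all_graphs(n):
    """Yield every undirected graph on n nodes as a list of edges."""
    dyads = list(combinations(range(n), 2))
    for present in product((0, 1), repeat=len(dyads)):
        yield [dyad for dyad, on in zip(dyads, present) if on]

def weight(edges, theta, n):
    """Unnormalized weight exp(theta' z(x))."""
    return exp(sum(t * z for t, z in zip(theta, statistics(edges, n))))

def normalizing_constant(theta, n):
    """k: the sum of weights over all 2 ** (n * (n - 1) / 2) graphs."""
    return sum(weight(g, theta, n) for g in all_graphs(n))

def graph_probability(edges, theta, n):
    return weight(edges, theta, n) / normalizing_constant(theta, n)
```

With 4 nodes the sum runs over 64 graphs; with 10 nodes it would run over 2^45 (about 3.5 x 10^13) graphs, so direct enumeration is out of reach for networks of realistic size and estimation has to work around k.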

95 Exponential Random Graph Model Details Kindly provided by Mark Handcock, University of Washington Statistics Department.

96 Exponential Random Graph Model Details Kindly provided by Mark Handcock, University of Washington Statistics Department.

97 Exponential Random Graph Model Details Kindly provided by Mark Handcock, University of Washington Statistics Department.

98 Exponential Random Graph Model Details Kindly provided by Mark Handcock, University of Washington Statistics Department.

99 Exponential Random Graph Model Details Kindly provided by Mark Handcock, University of Washington Statistics Department.

100 Exponential Random Graph Model Details Kindly provided by Mark Handcock, University of Washington Statistics Department.

101 Exponential Random Graph Model Details Kindly provided by Mark Handcock, University of Washington Statistics Department.

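102 Tools, Methods & Models Analysis Methods: Statistical Models Exponential Random Graph Models To estimate the model, we work with the conditional probabilities ($X_{ij} \mid X^{c}_{ij}$) instead of the full graph. This transforms the exponential model to a logit model on the dyads: $\log \frac{P(X_{ij} = 1 \mid X^{c}_{ij})}{P(X_{ij} = 0 \mid X^{c}_{ij})} = \theta' [z(x^{+}_{ij}) - z(x^{-}_{ij})]$, where $x^{+}_{ij}$ and $x^{-}_{ij}$ denote the graph with the tie from i to j present and absent.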
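In the logit form, the predictors for each dyad are usually the change statistics: the difference in z between the graph with that tie present and the graph with it absent. The sketch below builds those rows for an undirected graph with edge and triangle counts as illustrative statistics; this is an assumed, simplified setup with hypothetical function names, and the resulting rows would still need to be passed to a logistic regression routine to obtain pseudo-likelihood estimates.

```python
from itertools import combinations

def statistics(edge_set, n):
    """z(x): (number of edges, number of triangles); edges are (i, j), i < j."""
    triangles = sum(
        1 for a, b, c in combinations(range(n), 3)
        if {(a, b), (a, c), (b, c)} <= edge_set
    )
    return (len(edge_set), triangles)

def change_statistics(edges, n):
    """One row per dyad: (dyad, observed tie as 0/1, z(x+) - z(x-))."""
    observed = set(edges)
    rows = []
    for dyad in combinations(range(n), 2):
        with_tie = statistics(observed | {dyad}, n)     # tie forced present
        without_tie = statistics(observed - {dyad}, n)  # tie forced absent
        delta = tuple(p - m for p, m in zip(with_tie, without_tie))
        rows.append((dyad, int(dyad in observed), delta))
    return rows

# Hypothetical three-node graph with ties 0-1 and 1-2.
rows = change_statistics([(0, 1), (1, 2)], 3)
```

Adding the missing 0-2 tie would create one edge and close one triangle, so its change statistics are (1, 1), while each observed tie contributes only an edge.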

103 Analysis Methods: Statistical Models Exponential Random Graph Models Software for analyzing these models is available from: Logit Pseudo-Likelihood estimation: http://kentucky.psych.uiuc.edu/pstar/index.html (SPSS programs) http://www.sfu.ca/~richards/Pages/pspar.html (Program for large graphs) Empirically, these models are tricky to estimate, as the potential result space can easily become degenerate, particularly as z starts to include a more complicated range of dependencies. MCMC Estimation: Ongoing work by Mark Handcock, Tom Snijders and Co.

104 Tools, Methods & Models Analysis Methods: Statistical Models Network Effect Models Question is whether or not being connected to a particular set of people affects an individual’s behavior. The key statistical point is that we have abandoned the assumption that our cases are independent. These models originated in spatial statistics – looking at the effect of an adjacent geographic area on outcomes for any given area.

105 Basic Peer Influence Model Formal Model (1) $Y^{(1)} = XB$ (2) $Y^{(t)} = \alpha W Y^{(t-1)} + (1 - \alpha) Y^{(1)}$ where $Y^{(1)}$ = an N x M matrix of initial opinions on M issues for N actors; X = an N x K matrix of K exogenous variables that affect Y; B = a K x M matrix of coefficients relating X to Y; $\alpha$ = a weight of the strength of endogenous interpersonal influences; W = an N x N matrix of interpersonal influences

106 Basic Peer Influence Model Formal Model (1) $Y^{(1)} = XB$ This is the basic general linear model. It says that a dependent variable (Y) is some function (B) of a set of independent variables (X). At the individual level, the model says that: $y_{im} = \sum_{k=1}^{K} x_{ik} b_{km}$ Usually, one of the covariates is $\varepsilon$, the model error term.

107 Basic Peer Influence Model (2) $Y^{(t)} = \alpha W Y^{(t-1)} + (1 - \alpha) Y^{(1)}$ This part of the model taps social influence. It says that each person’s final opinion is a weighted average of their own initial opinions and the opinions of those they communicate with (which can include their own current opinions).

108 Basic Peer Influence Model The key to the peer influence part of the model is W, a matrix of interpersonal weights. W is a function of the communication structure of the network, and is usually a transformation of the adjacency matrix. In general: Various specifications of the model change the value of w ii, the extent to which one weighs their own current opinion and the relative weight of alters.
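How W and the influence weight shape final opinions can be seen by iterating the model. The sketch below assumes the recursion Y(t) = αWY(t-1) + (1 - α)Y(1), the standard form of this peer influence model, and a row-normalized W in which each actor weighs the other two equally; the matrices, the value of α and the function names are hypothetical choices for illustration.

```python
def mat_mul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def peer_influence(y1, w, alpha, steps=50):
    """Iterate Y(t) = alpha * W Y(t-1) + (1 - alpha) * Y(1) from Y(1)."""
    y = [row[:] for row in y1]
    for _ in range(steps):
        wy = mat_mul(w, y)                     # opinions of one's contacts
        y = [
            [alpha * wy[i][m] + (1 - alpha) * y1[i][m]
             for m in range(len(y1[0]))]
            for i in range(len(y1))
        ]
    return y

# Three actors, one issue: only the first actor initially holds the opinion.
initial = [[1.0], [0.0], [0.0]]
# Hypothetical row-normalized W: each actor averages the other two (w_ii = 0).
w = [
    [0.0, 0.5, 0.5],
    [0.5, 0.0, 0.5],
    [0.5, 0.5, 0.0],
]
final = peer_influence(initial, w, alpha=0.5)
```

With α = 0.5 the opinions settle near 0.6 for the first actor and 0.2 for the other two: everyone is pulled toward their contacts, but each remains anchored to an initial position. Setting α = 0 leaves the initial opinions unchanged.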

109 Basic Peer Influence Model Formal Properties of the model The model is directly related to spatial econometric models: If we allow the model to run over t, we can describe the model as: Where the two coefficients ($\alpha$ and $\beta$) are estimated directly (see Doreian, 1982, SMR)

110 Overview: Introduction; Basic Concepts; Flows within Networks; Structure of Social Space; Tools, Models & Methods for Flows and Structures; Conclusions

111

