Social Network Analysis and Complex Systems Science

Presentation transcript:

Social Network Analysis and Complex Systems Science Pip Pattison University of Melbourne CSIRO Complex Systems Symposium, Pelican Beach, 10-12 Aug 2004

In collaboration with:
  Garry Robins, University of Melbourne
  Tom Snijders, University of Groningen
  Henry Wong, University of Melbourne
  Jodie Woolcock, University of Melbourne
  Emmanuel Lazega, University of Lille I
  Kim Albert, University of Melbourne
  Anne Mische, Rutgers University
  John Padgett, University of Chicago
  Peng Wang, University of Melbourne

1. Why are social networks important?
For understanding action in relation to its social context:
  network ties link actors to each other as well as to groups, cultural resources, neighbourhoods, communities
  networks structure opportunities and constraints
For understanding social dynamics:
  social action is interactive: one person's action changes the context for those to whom they are connected
To understand the cumulation of local processes into population-level outcomes:
  the structure of networks and the dynamics of local processes are critical to understanding how locally interactive, context-dependent actions cumulate into outcomes at higher levels (e.g. communities, populations)

A simplified multi-layered and relational framework for the social world
Social units: individuals, groups, ...
Ties among social units: person-to-person, person-to-group
Settings: geographical, sociocultural
For example:
  Interactions between social units depend on proximity through ties
  Interactions between ties depend on proximity through settings
There are interactions within and between levels
Social structure: regularities in interactions

2: Typical data structures
Network observations give rise to relational data structures, e.g.:
  people × groups, people × attributes, groups × attributes, people × settings, groups × settings, people × people
  people × people × types of tie, people × people × settings, …
Some important design issues:
  Network boundaries? Complete: which "nodes" to include?
  Which network ties? What are the relevant network links? How do we best "measure" them?

Example 1: Management consulting firm node colour codes workgroup membership node size codes extent of cohesive beliefs ties: “Who do you ask when you want to find out what is going on..?”

Example 2: Network of Mutual Collaboration Ties (Lazega, 1999)

Example 3: Change in interorganizational networks (Goldman et al, 1994)
Data are from an evaluation of the Robert Wood Johnson Program on Chronic Mental Illness in 6 US cities (one of which was a "control" site)
Organisations: mental health agencies in the "control" site (n = 37)
Networks at time 1 and time 2 (x1, x2):
  Client referrals
  Information-sharing
  Fund-sharing
Data are from key informants and were gathered two years apart

Client referrals: time 1

Client referrals: time 2

3: Modelling networks and other relational structures
Guiding principles:
  1. Network ties (and other observations) are the outcome of unobserved processes that tend to be local and interactive
  2. There are both regularities and irregularities in these local interactive processes
Hence we aim for a stochastic model formulation in which:
  local interactions are permitted and assumptions about "locality" are explicit
  regularities are represented by model parameters and estimated from data
  consequences of local regularities for global network properties can be understood and can also provide an exacting approach to model evaluation

Building models for social networks
We model tie variables: $X = [X_{ij}]$, where $X_{ij} = 1$ if $i$ has a tie to $j$ and $X_{ij} = 0$ otherwise; a realisation of $X$ is denoted by $x = [x_{ij}]$
Two modelling steps:
  Methodological: define two network tie variables to be neighbours if they are conditionally dependent, given the values of all other tie variables
  Substantive: what are appropriate assumptions about the neighbourhood relation (i.e. about the network topology)?

Network topologies: which tie variables are neighbours?
Two tie variables are neighbours if:
  they share a dyad: dyad-independent model
  they share an actor: Markov model
  they share a connection with the same tie: realisation-dependent model
  they share a connection with two ties: k-triangle model
  etc...

Models for interactive systems of variables (Besag, 1974)
Hammersley-Clifford theorem: a model for X has a form determined by its neighbourhoods, where a neighbourhood is a set of mutually neighbouring variables
This general approach leads to:
$$P(X = x) = \frac{1}{c} \exp\Big\{ \sum_Q \lambda_Q z_Q(x) \Big\}$$
where:
  the summation is over all neighbourhoods $Q$
  $\lambda_Q$ is a parameter
  $z_Q(x) = \prod_{X_{ij} \in Q} x_{ij}$ is a network statistic that signifies whether all ties in $Q$ are observed in $x$
  $c = \sum_x \exp\{ \sum_Q \lambda_Q z_Q(x) \}$ is a normalizing quantity
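To make the role of the normalizing quantity c concrete, the following is a minimal sketch that enumerates every undirected graph on a handful of nodes and evaluates the exponential form directly. The function names, the choice of statistics (edge and triangle counts) and the parameter values are illustrative assumptions, not taken from the slides; enumeration is only feasible for very small n.

```python
from itertools import combinations, product
from math import exp

def edges_and_triangles(edges):
    """Illustrative statistics z(x): (number of edges, number of triangles)."""
    nodes = sorted({v for e in edges for v in e})
    triangles = sum(
        1 for a, b, c in combinations(nodes, 3)
        if {(a, b), (a, c), (b, c)} <= edges
    )
    return (len(edges), triangles)

def ergm_distribution(n, stats, params):
    """Return {edge set: probability} over all undirected graphs on n nodes.

    stats  : function mapping a frozenset of edges (i, j), i < j, to a tuple
    params : one parameter per statistic (hypothetical values)
    """
    dyads = list(combinations(range(n), 2))
    weights = {}
    for bits in product((0, 1), repeat=len(dyads)):
        edges = frozenset(d for d, b in zip(dyads, bits) if b)
        z = stats(edges)
        weights[edges] = exp(sum(p * s for p, s in zip(params, z)))
    c = sum(weights.values())  # normalizing quantity: sum over all graphs
    return {g: w / c for g, w in weights.items()}

# 4 nodes -> 6 dyads -> 64 possible graphs
dist = ergm_distribution(4, edges_and_triangles, (-1.0, 0.5))
print(len(dist), sum(dist.values()))
```

With the triangle parameter set to zero the model reduces to independent ties, which gives a simple check on the enumeration.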

Neighbourhoods depend on proximity assumptions
Assumptions: two ties are neighbours:
  if they share a dyad (dyad-independence)
  if they share an actor (Markov)
  if they share a connection with the same tie (realisation-dependent)
Configurations for neighbourhoods:
  dyad-independence: edge
  Markov: + 2-star, 3-star, 4-star, ..., triangle
  realisation-dependent: + ..., 3-path, 4-cycle, "coathanger"

Neighbourhoods, continued
k-triangle model: 2 ties are neighbours if they create a 4-cycle
Configurations include (each involving k nodes): k-triangle, k-independent 2-path
Useful for higher-order clustering effects

Homogeneous Markov random graphs (Frank & Strauss, 1986)
$$P(X = x) = \frac{1}{c} \exp\{ \theta L(x) + \sigma_2 S_2(x) + \dots + \sigma_k S_k(x) + \dots + \tau T(x) \}$$
where:
  $L(x)$ = no of edges in x
  $S_2(x)$ = no of 2-stars in x
  …
  $S_k(x)$ = no of k-stars in x
  …
  $T(x)$ = no of triangles in x
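As a rough illustration of these statistics, the sketch below counts edges, k-stars and triangles from a symmetric 0/1 adjacency matrix. It relies on the fact that a node of degree d is the centre of "d choose k" k-stars. The function name and the small example graph are assumptions made for illustration.

```python
from itertools import combinations
from math import comb

def markov_statistics(adj, max_k=3):
    """Count edges, k-stars (k = 2..max_k) and triangles.

    adj is a symmetric 0/1 adjacency matrix (list of lists) with a zero diagonal.
    """
    n = len(adj)
    degrees = [sum(row) for row in adj]
    stats = {"edges": sum(degrees) // 2}
    for k in range(2, max_k + 1):
        # a node of degree d is the centre of C(d, k) k-stars
        stats[f"{k}-stars"] = sum(comb(d, k) for d in degrees)
    stats["triangles"] = sum(
        adj[i][j] * adj[j][k] * adj[i][k]
        for i, j, k in combinations(range(n), 3)
    )
    return stats

# Hypothetical example: a triangle on nodes 0, 1, 2 plus a pendant node 3 tied to node 0
example = [
    [0, 1, 1, 1],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
]
print(markov_statistics(example))
# {'edges': 4, '2-stars': 5, '3-stars': 1, 'triangles': 1}
```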

Simulating from homogeneous Markov random graph distributions on 36 nodes: a typical graph
Parameter values: $\theta = -3$, $\sigma_2 = 2$, $\tau = 0$, $\sigma_3 = -2$
Average statistics: edges 57.0, 2-stars 133.8, triangles 2.3, 3-stars 68.4

Typical graphs for $\tau$ = 0, 2, 5, 6

A typical graph for $\tau = 10$
Parameter values: $\theta = -3$, $\sigma_2 = 2$, $\tau = 10$, $\sigma_3 = -2$
Average statistics: edges 92.0, 2-stars 390.0, triangles 130.0, 3-stars 440.0

These models can represent very different network structures: e.g. small worlds: $\theta = -4$, $\sigma_2 = 0.1$, $\sigma_3 = -0.05$, $\tau = 1$ [Robins, Pattison & Woolcock, in press]
No of edges: L = 126
Path length distribution: Q1 = 4 (5), Q2 = 5 (7), Q3 = ∞ (∞)
Clustering coefficient: Cluster = 0.09 (0.02)
Figures for the Bernoulli distribution are in parentheses (red on the slide)

Longer path worlds: $\theta = -1.2$, $\sigma_2 = 0.05$, $\sigma_3 = -1$, $\tau = 1$, but levels of clustering are still high
No of edges = 118
Q1 = 5 (5), Q2 = 7 (7), Q3 = 9 (∞)
Cluster = 0.08 (0.02)

Very long path worlds: $\theta = -2.2$, $\sigma_2 = 0.05$, $\sigma_3 = -2$, $\tau = 1$ (no clustering)
Q1 = ∞ (11), Q2 = ∞ (∞), Q3 = ∞ (∞)
Cluster = 0.00 (0.02)

Simulations of two-star models (n=30) (see also Handcock, 2004; Park & Newman, 2004; Snijders, 2002)
(a) $\theta = 0$, $\sigma_2$ = [0.00, 0.01, …0.10]
Metropolis algorithm, multiple random starts
[Plots: average degree, no of 2-stars, no of successful moves]
The complete graph has high probability for high values of $\sigma_2$
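The simulations mentioned here use a Metropolis algorithm. The following is a minimal sketch of one such sampler for a homogeneous Markov random graph, under several assumptions that are not stated on the slides: the names theta, sigma2, sigma3 and tau stand for the edge, 2-star, 3-star and triangle parameters; each step proposes toggling one uniformly chosen dyad; and the chain starts from the empty graph. It also returns the number of accepted ("successful") moves.

```python
import random
from math import comb, exp

def metropolis_markov_graph(n, theta, sigma2, sigma3, tau, steps, seed=0):
    """Metropolis sampler for exp{theta*L + sigma2*S2 + sigma3*S3 + tau*T} / c.

    Returns (adjacency matrix, number of accepted moves).
    """
    rng = random.Random(seed)
    adj = [[0] * n for _ in range(n)]
    deg = [0] * n
    accepted = 0
    for _ in range(steps):
        i, j = rng.sample(range(n), 2)  # propose toggling dyad (i, j)
        present = adj[i][j]
        # degrees of i and j, not counting the (i, j) tie itself
        di, dj = deg[i] - present, deg[j] - present
        common = sum(adj[i][k] and adj[j][k] for k in range(n))
        # change in the exponent if the tie is added:
        # +1 edge, +(di + dj) 2-stars, +C(di,2)+C(dj,2) 3-stars, +common triangles
        delta = (theta + sigma2 * (di + dj)
                 + sigma3 * (comb(di, 2) + comb(dj, 2)) + tau * common)
        if present:
            delta = -delta  # removing the tie reverses the change
        # accept with probability min(1, exp(delta))
        if delta >= 0 or rng.random() < exp(delta):
            new = 1 - present
            adj[i][j] = adj[j][i] = new
            deg[i] += new - present
            deg[j] += new - present
            accepted += 1
    return adj, accepted

# Hypothetical run for a two-star model on 30 nodes
graph, moves = metropolis_markov_graph(30, -2.5, 0.05, 0.0, 0.0, steps=20000)
print(sum(map(sum, graph)) // 2, moves)
```

Tracking the accepted-move count alongside the average degree is one way to see the behaviour described on these slides: when the chain settles in a near-empty or near-complete graph, few proposals are accepted.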

(b) $\theta = -2.5$, $\sigma_2$ = [-0.50, -0.45, …, 0.25]
[Plots: average degree, no of 2-stars, no of successful moves]
Sharp transition from low to high density graphs around $\sigma_2 = -\theta/(n-2)$

"Freezing" at $\sigma_2 = -\theta/(n-2)$: $(\theta, \sigma_2) = (-14, 0.5)/t$ for t = 0,1,…
[Plots: average degree, successful moves]
See Park and Newman (2004) for an analytical solution (including phase diagram)

4: Applications: Estimation of model parameters and model evaluation
Estimation of model parameters from data: MLE via MCMC approaches (Snijders, 2002; Handcock et al, 2004)
Model evaluation: do substantively important global properties of the observed data resemble those of data simulated from the fitted model? For example:
  Degree distribution
  Path length distribution
  Presence of clustering, cycles
The overall aim is to identify regularities in local relational structures, and at the same time build models that reproduce global network structure from empirically-grounded local regularities
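The global properties listed for model evaluation can be summarised with a few short functions, applied both to the observed network and to each simulated one. The sketch below is one possible set of summaries, with assumptions flagged: path-length quartiles are taken over all node pairs, unreachable pairs count as infinite, the quartile convention is a simple order-statistic rule, and clustering is measured as 3 × triangles / 2-stars. None of these conventions is confirmed by the slides.

```python
from collections import Counter, deque
from itertools import combinations
from math import ceil, inf

def degree_distribution(adj):
    """Map each degree to the number of nodes having it."""
    return Counter(sum(row) for row in adj)

def path_length_quartiles(adj):
    """Q1, Q2, Q3 of shortest-path lengths over all node pairs (inf if unreachable)."""
    n = len(adj)
    lengths = []
    for s in range(n):
        dist = [inf] * n
        dist[s] = 0
        queue = deque([s])
        while queue:  # breadth-first search from s
            u = queue.popleft()
            for v in range(n):
                if adj[u][v] and dist[v] == inf:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        lengths.extend(dist[t] for t in range(s + 1, n))
    lengths.sort()
    m = len(lengths)
    # assumed convention: the ceil(q*m/4)-th smallest value
    return tuple(lengths[ceil(q * m / 4) - 1] for q in (1, 2, 3))

def clustering_coefficient(adj):
    """3 * triangles / 2-stars, or 0.0 when there are no 2-stars."""
    n = len(adj)
    two_stars = sum(d * (d - 1) // 2 for d in (sum(row) for row in adj))
    triangles = sum(
        adj[i][j] * adj[j][k] * adj[i][k]
        for i, j, k in combinations(range(n), 3)
    )
    return 3 * triangles / two_stars if two_stars else 0.0

# Hypothetical graph: triangle 0-1-2, pendant node 3 tied to 0, isolated node 4
g = [
    [0, 1, 1, 1, 0],
    [1, 0, 1, 0, 0],
    [1, 1, 0, 0, 0],
    [1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
print(degree_distribution(g), path_length_quartiles(g), clustering_coefficient(g))
```

Comparing each observed summary with its spread across many simulated graphs gives an informal check of whether the local model reproduces the global structure.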

The alternating k-star, k-independent 2-path and k-triangle hypotheses (Snijders, Pattison, Robins & Handcock, 2004)
Suppose that (alternating k-star hypothesis):
$$\sigma_k = -\sigma_{k-1}/\lambda$$
where $\lambda \ge 1$ is a (fixed) constant. Then:
$$\sum_{k \ge 2} \sigma_k S_k(x) = \sigma_2 S^{[\lambda]}(x)$$
where the alternating k-star statistic is
$$S^{[\lambda]}(x) = \lambda^2 \sum_i \Big\{ (1 - 1/\lambda)^{d(i)} + d(i)/\lambda - 1 \Big\}$$
and $d(i)$ denotes the degree of node $i$.
Likewise, if $U_k(x)$ = no of k-independent 2-paths in x, with corresponding parameter $\upsilon_k$, and $T_k(x)$ = no of k-triangles in x, with corresponding parameter $\tau_k$, we can suppose that:
  $\upsilon_{k+1} = -\upsilon_k/\lambda$ (alternating independent 2-path hypothesis)
  $\tau_{k+1} = -\tau_k/\lambda$ (alternating k-triangle hypothesis)
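The alternating k-star statistic can be checked numerically. Writing lam for the fixed constant (at least 1) and d(i) for the degree of node i, the sketch below computes the closed form lam² Σ_i {(1 − 1/lam)^d(i) + d(i)/lam − 1} and compares it with the alternating sum of k-star counts with weights (−1)^k / lam^(k−2), which is what the hypothesis implies for k ≥ 2. The function names and the example degree sequence are illustrative assumptions.

```python
from math import comb

def alternating_k_star(degrees, lam):
    """Closed form: lam^2 * sum_i {(1 - 1/lam)^d(i) + d(i)/lam - 1}."""
    return lam ** 2 * sum((1 - 1 / lam) ** d + d / lam - 1 for d in degrees)

def alternating_k_star_from_counts(degrees, lam):
    """Same quantity as an alternating sum of k-star counts S_k, k >= 2."""
    total = 0.0
    for k in range(2, max(degrees, default=0) + 1):
        s_k = sum(comb(d, k) for d in degrees)  # number of k-stars
        total += (-1) ** k * s_k / lam ** (k - 2)
    return total

# Hypothetical degree sequence: a triangle with one pendant node
degrees = [3, 2, 2, 1]
print(alternating_k_star(degrees, 3.0))              # approximately 14/3
print(alternating_k_star_from_counts(degrees, 3.0))  # approximately 14/3
```

For this degree sequence there are five 2-stars and one 3-star, so the alternating sum is 5 − 1/3 = 14/3, matching the closed form.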

Network of Collaboration Ties

Realisation-dependent model for collaboration ties among lawyers (Pattison & Robins, 2002)

MCMCML parameter estimates for collaboration network (SIENA, conditioning on total ties, partners only)

                                      Model 1           Model 2
Parameter                             est      s.e.     est      s.e.
Alternating k-stars (λ=3)            -0.083    0.316
Alternating ind. 2-paths (λ=3)       -0.042    0.154
Alternating k-triangles (λ=3)         0.572    0.190    0.608    0.089
No pairs connected by a 2-path       -0.025    0.188
No pairs lying on a triangle          0.486    0.513
Seniority main effect                 0.023    0.006    0.024    0.006
Practice (corp. law) main effect      0.391    0.116    0.375    0.109
Same practice                         0.390    0.100    0.385    0.101
Same gender                           0.343    0.124    0.359    0.120
Same office                           0.577    0.110    0.572    0.100

Modelling group cohesion (Albert, 2002)
Network ties are important in understanding social processes, but so are:
  cultural and psychological resources and aspirations (beliefs, values, attitudes, knowledge)
  settings (geographical locations, physical and organisational constraints)
Lindenberg (1997) on groups: three overlapping forms of interdependence:
  functional (common goals and tasks): workgroup membership
  cognitive (psychological representations): beliefs
  structural (patterning of interpersonal ties): network ties
Albert (2002) on group cohesion: an illustrative analysis of interdependent functional, cognitive and structural aspects of group cohesion using generalised relational structures

Management consulting firm node colour codes group membership node size codes extent of cohesive beliefs ties: “Who do you ask when you want to find out what is going on..?”

Functional, structural and cognitive interdependence
Evidence for separable tendencies:
  structural logic of information seeking, hierarchical with differentiation in information seeking: structural interdependence
  information ties within groups: structural & functional interdependence
  shared beliefs within groups: cognitive and functional interdependence
  shared beliefs within groups among those linked by an information tie: cognitive, structural and functional interdependence

5: A dynamic perspective co-evolution of action, networks, settings

Dynamic models
Suppose that $X_{ij}(t)$ are time-dependent relational variables
At any moment $t$, suppose that there is a possible change in status for some randomly chosen $X_{ij}$ with a transition rate
$$\rho \, \mathrm{logistic}\Big( \sum_Q \lambda_Q \big( z_Q(x^*_{ij}(t)) - z_Q(x(t)) \big) \Big)$$
where:
  $x(t)$ denotes the state of the network at time $t$;
  $x^*_{ij}(t)$ equals $x(t)$ but with the value of $X_{ij}(t)$ changed from $x_{ij}(t)$ to $1 - x_{ij}(t)$;
  $\rho$ is a rate parameter;
  $\mathrm{logistic}(z) = \exp(z)/(1 + \exp(z))$
Then this continuous-time Markov process converges to the distribution
$$\Pr(X = x) = \frac{1}{c} \exp\Big\{ \sum_Q \lambda_Q z_Q(x) \Big\}$$
Parameters can be estimated from longitudinal data (using an approach adapted from that developed by Snijders, 2001, 2002)
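A process of this kind can be simulated directly. The sketch below is one reading of the slide, with assumptions flagged: change opportunities arrive as a Poisson process with total rate rho; at each opportunity one dyad is chosen uniformly and toggled with probability logistic(change in the exponent); and the model has only an edge parameter (theta) and a triangle parameter (tau) on an undirected network. These names and choices are illustrative, not the authors' implementation.

```python
import random
from math import exp

def logistic(z):
    """exp(z) / (1 + exp(z)), written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + exp(-z))
    ez = exp(z)
    return ez / (1.0 + ez)

def simulate_dynamics(n, theta, tau, rho, t_end, seed=0):
    """Simulate tie changes up to time t_end; return (adjacency, no. of changes)."""
    rng = random.Random(seed)
    adj = [[0] * n for _ in range(n)]
    n_changes = 0
    t = rng.expovariate(rho)  # time of the first change opportunity
    while t <= t_end:
        i, j = rng.sample(range(n), 2)
        common = sum(adj[i][k] and adj[j][k] for k in range(n))
        # change in the exponent when x_ij is flipped to 1 - x_ij:
        # creating the tie adds one edge and `common` triangles, dissolving removes them
        sign = 1 if adj[i][j] == 0 else -1
        delta = sign * (theta + tau * common)
        if rng.random() < logistic(delta):
            adj[i][j] = adj[j][i] = 1 - adj[i][j]
            n_changes += 1
        t += rng.expovariate(rho)  # waiting time to the next opportunity
    return adj, n_changes

# Hypothetical run: 20 nodes, 50 time units, about 100 opportunities per unit
net, changes = simulate_dynamics(20, theta=-2.0, tau=0.5, rho=100.0, t_end=50.0)
print(sum(map(sum, net)) // 2, changes)
```

Because each toggle is accepted with the logistic of the change in the exponent, long runs of this chain should settle into the exponential-family distribution given on the slide; the rate only rescales time.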

Client referrals: time 1

Client referrals: time 2

Modelling client referrals

                   Time 1    Time 2    Time 2           Time 1 → Time 2
                   PLE       PLE       cond MCMCMLE*    cond estimate
Edge               -3.02     -3.20     -                -2.74 (0.35)
2-in-star           0.01      0.05      0.06 (.03)       0.04 (0.03)
2-path             -0.08     -0.07     -0.05 (.02)      -0.05 (0.02)
2-out-star          0.09      0.10      0.08 (.02)       0.09 (0.02)
mutual tie          2.54      1.73      1.72 (.29)       1.39 (0.28)
3-cycle            -0.20     -0.14     -0.15 (.09)      -0.14 (0.09)
transitive triad    0.21      0.19      0.16 (.03)       0.14 (0.03)

*using SIENA, conditioning on number of ties

Early 1990s in Brazil: student, civic, political and business groups
Estimates by time point; each row corresponds to a configuration pictured on the slide (key: organisation, project, event)

time 1          time 2          time 3
-3.222 (.44)    -3.805 (.44)    -4.678 (.46)
-2.223 (1.1)    -6.665 (1.8)    -10.71 (1.5)
-4.405 (.98)    -6.333 (1.5)    -9.322 (1.8)
 0.099 (.02)     0.116 (.02)     0.170 (.02)
 0.123 (.17)     0.734 (.17)     1.051 (.15)
 0.198 (.02)     0.207 (.03)     0.202 (.02)
 0.204 (.04)     0.309 (.06)     0.459 (.14)
 0.745 (.10)     0.886 (.14)     0.906 (.12)
 0.320 (.06)     0.443 (.09)     0.444 (.06)
-0.177 (.04)    -0.123 (.05)    -0.022 (.04)
-0.461 (.06)    -0.307 (.06)     0.000 (.06)
-0.146 (.07)    -0.041 (.05)    -0.024 (.03)
 0.808 (.08)     0.472 (.07)     0.139 (.06)

6. Concluding comments
Models can display complex behaviour (e.g. nonlinearities, phase transitions), creating some statistical difficulties!
Nonetheless, a statistical approach allows us to stay close to empirical data, and model parameters can be estimated from data.
For a well-specified model:
  We can test hypotheses about local contextual effects
  We can predict the evolution of the system (and its variability)
  We can understand the aggregate-level consequences of local contextual effects (and their variability)
Realisation-dependent models appear to be necessary, and reflect a "capacity for actors to transform as well as reproduce long-standing structures, frameworks and networks of interaction" (Emirbayer & Goodwin, 1994)

Some modelling challenges
Scaling up: the role of space
  Spatial random graph models (Henry Wong)
Co-evolution
  Dynamic interactions across levels
  Evolution of multiple networks
Social "innovation" and transformation
  Multiple networks are implicated theoretically, e.g. Padgett et al on the evolution of markets in Florence
"Emergent" phenomena?
  E.g. emergence of social institutions such as groups
Technical issues
  Sampling, estimation, missing data…