Published by Noel nolan Glover. Modified over 2 years ago.

1
Missing data in social networks - Problems and prospects for model-based inference
Johan Koskinen
The Social Statistics Discipline Area, School of Social Sciences
Mitchell Centre for Network Analysis
Tuesday, 20 December

2
A relational perspective – networks matter
Dr D eats (predominantly) vegetarian food... Vegetarian partner x; Ethical; Economics; Health; Taste
Dr Dean Lusher's relational take

3
A relational perspective – networks matter
Someone close to you is unhappy: will you remain unaffected?

4
A relational perspective – networks matter
Equal opportunities based on our individual qualities...

5
A relational perspective – networks matter
Some people bowl alone... others bowl in leagues

6
Part 1 Network representations

7
Social networks
We conceive of a network as a relation defined on a collection of individuals.
mary "relates to" paul, e.g. "go to for advice"

8
Social networks
We conceive of a network as a relation defined on a collection of individuals.
mary "relates to" paul, e.g. "consider a friend"

9
Social networks
We conceive of a network as a relation defined on a collection of individuals.
mary "relates to" paul: on or off
Generally binary: tie present or tie absent

10
Network representations A non-directed graph A social network of tertiary students – Kalish (2003)

11
Network representations

12
Police training squad: Confiding network (Pane, 2003)

13
Network representations World Trade in 1992 Plümper, 2003, JOSS

14
Network representations: attributes
The actors (nodes) in the network are individuals with
– attitudes, behaviours, and attributes
These may
– guide them in their choices of partners
– be shaped (influenced) by their partners
The actors may have individual and collective outcomes

15
Network representations: attributes A non-directed graph A social network of tertiary students – Kalish (2003)

16
Network representations: attributes
A non-directed graph. A social network of tertiary students – Kalish (2003). Legend: Jewish, Arab

17
Network representations: attributes
High School friendship, Moody, 2001. Legend: white, black, other

18
Network representations: attributes
Romantic/sexual relationships at a US high school (Bearman, Moody & Stovel, 2004). Guess the blue and pink

19
Network representations: attributes
Team structures in training squads (Pane, 2003) (friendship network in 12th week of training). Legend: detached, team oriented, positive

20
Multiple relations – entrainment, exchange, and generalized exchange
Physical violence. Violence & attitudes among school boys (Lusher, 2003)

21
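Social networks
We conceive of the Graph as a collection of Tie variables: $\{X_{ij} : i,j \in V\}$ (actors in the figure: john, pete, mary, paul). With rows and columns ordered $i, j, k, l$:
$$
x = \begin{pmatrix}
- & x_{ij} & x_{ik} & x_{il} \\
x_{ji} & - & x_{jk} & x_{jl} \\
x_{ki} & x_{kj} & - & x_{kl} \\
x_{li} & x_{lj} & x_{lk} & -
\end{pmatrix}
$$

22
Social networks
We conceive of the Graph as a collection of Tie variables: $\{X_{ij} : i,j \in V\}$ (figure: graph on nodes $i, j, k, l$). With rows and columns ordered $i, j, k, l$:
$$
x = \begin{pmatrix}
- & x_{ij} & x_{ik} & x_{il} \\
x_{ji} & - & x_{jk} & x_{jl} \\
x_{ki} & x_{kj} & - & x_{kl} \\
x_{li} & x_{lj} & x_{lk} & -
\end{pmatrix}
$$

23
Social networks
The Adjacency matrix: the matrix of the collection of Tie variables $\{X_{ij} : i,j \in V\}$
$$
x = \begin{pmatrix}
- & x_{ij} & x_{ik} & x_{il} \\
x_{ji} & - & x_{jk} & x_{jl} \\
x_{ki} & x_{kj} & - & x_{kl} \\
x_{li} & x_{lj} & x_{lk} & -
\end{pmatrix}
$$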

24
Social networks: adjacency matrix
Read's Highland tribes

27
Social networks: adjacency matrix
Read's Highland tribes. Symmetric for a non-directed network

29
Social networks: adjacency matrix
Read's Highland tribes. Zeroes along the diagonal – self ties not permitted
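The two properties named on these slides (symmetry for a non-directed network, zeroes on the diagonal) can be checked directly on a small matrix. A minimal sketch, assuming a made-up four-actor network; the tie list is hypothetical and is not the Highland tribes data.

```python
import numpy as np

# Hypothetical 4-actor non-directed network (illustrative ties only).
actors = ["john", "pete", "mary", "paul"]
ties = [("john", "pete"), ("pete", "mary"), ("mary", "paul")]

index = {name: k for k, name in enumerate(actors)}
x = np.zeros((len(actors), len(actors)), dtype=int)
for a, b in ties:
    i, j = index[a], index[b]
    x[i, j] = 1
    x[j, i] = 1  # non-directed: x_ij = x_ji

is_symmetric = bool(np.array_equal(x, x.T))          # symmetric matrix
has_zero_diagonal = bool(np.all(np.diag(x) == 0))    # no self ties
n_edges = int(x.sum() // 2)                          # each tie is stored twice
```

Each tie appears twice in a symmetric matrix, which is why the edge count halves the matrix sum.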

30
Part 2 Analysing social networks – Putting the building blocks of networks together using ERGM

31
Do we need to analyse networks?
- Is the network a unique narrative?
- Stick to an ethnography?
Possible answers
- Detecting systematic tendencies
- Social mechanisms
- Lift the description to describe the network in generalizable terms

32
Networks matter – ERGMs matter
6018 grade 6 children, 1966. Legend: female, male

33
Networks matter – ERGMs matter
6018 grade 6 children, 1966 – 300 schools, Stockholm

34
Networks matter – ERGMs matter
6018 grade 6 children, 1966 – 200 schools, Stockholm
Koskinen and Stenberg (in press), JEBS

38
Do we need to analyse networks?
- Is the network a unique narrative?
- Stick to an ethnography?
Possible answers
- Detecting systematic tendencies
- Social mechanisms
- Lift the description to describe the network in generalizable terms
Conceptualising the network as a graph is what enables this

39
ERGMs – modelling graphs

41
ERGMs – modelling graphs: example
Marriage network of Padgett's Florentine families

42
ERGMs – modelling graphs: example
Marriage network of Padgett's Florentine families
Model this as a combination of 4 local structures
Their importance is measured by their parameters
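Counting local structures is the basic computation behind such a model. A minimal sketch, assuming the four structures are edges, 2-stars, 3-stars and triangles (an assumption based on the effect table in this deck) and using a hypothetical toy graph rather than the Florentine data; `markov_statistics` is an illustrative name.

```python
import numpy as np
from math import comb

def markov_statistics(x):
    """Count edges, 2-stars, 3-stars and triangles in a non-directed graph."""
    x = np.asarray(x)
    degrees = x.sum(axis=1)
    edges = int(x.sum() // 2)
    # A k-star is a node together with k of its ties.
    two_stars = sum(comb(int(d), 2) for d in degrees)
    three_stars = sum(comb(int(d), 3) for d in degrees)
    # trace(x^3) counts each triangle 6 times (3 start nodes x 2 directions).
    triangles = int(np.trace(x @ x @ x) // 6)
    return edges, two_stars, three_stars, triangles

# Hypothetical toy graph: a triangle 0-1-2 plus a pendant tie 2-3.
x = np.zeros((4, 4), dtype=int)
for i, j in [(0, 1), (1, 2), (0, 2), (2, 3)]:
    x[i, j] = x[j, i] = 1

stats = markov_statistics(x)  # (4, 5, 1, 1)
```

In an ERGM each of these counts gets one parameter, so the parameter says how strongly the corresponding structure is favoured or disfavoured given the others.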

43
ERGMs – modelling graphs: example
Table columns: effect, MLE, S.E. Rows: Edge, 2-star, 3-star, Triangle

45
Part 3 Modelling graphs – deriving building blocks out of dependencies

46
Independence - Deriving the ERGM
[Figure: coin tosses (heads/tails) deciding the ties of dyads {i,l} and {i,k} among nodes i, j, k, l, m, n]

47
Independence - Deriving the ERGM
[Figure: AUD coin for dyad {i,l}, SEK coin for dyad {i,k}; probabilities 0.5, 0.5 and 0.25]
Knowledge of AUD (e.g. the outcome for dyad {i,l}) does not help us predict SEK (e.g. whether the tie {i,k} is present or absent)

48
Independence - Deriving the ERGM
Knowledge of AUD does not help us predict SEK, even though dyad {i,l} and dyad {i,k} have vertex i in common
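The independence described here can be written out as a short worked calculation. This is a sketch under stated assumptions: the AUD coin decides the tie variable $X_{il}$, the SEK coin decides $X_{ik}$, and both coins are fair (the 0.5 values on the slide). The symbols are this sketch's notation.

```latex
\begin{align*}
\Pr(X_{il}=1,\; X_{ik}=1) &= \Pr(X_{il}=1)\,\Pr(X_{ik}=1) = 0.5 \times 0.5 = 0.25,\\
\Pr(X_{ik}=1 \mid X_{il}=1) &= \frac{0.25}{0.5} = 0.5 = \Pr(X_{ik}=1).
\end{align*}
```

The conditional probability equals the marginal one, which is exactly the statement that knowing one toss does not help predict the other.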

49
Independence - Deriving the ERGM
AUD: 0.5, SEK: 0.5
May we find a model such that knowledge of AUD does help us predict SEK (e.g. whether the tie {i,k} is present or absent)?

50
Deriving the ERGM: From Markov graph to Dependence graph
Actors: john, pete, mary, paul
Consider the tie-variables that have Mary in common. How may we make these dependent?

51
Deriving the ERGM: From Markov graph to Dependence graph
[Figure sequence, slides 51 to 56: the tie variables among john, pete, mary and paul are drawn one dyad at a time]

57
Deriving the ERGM: From Markov graph to Dependence graph
Actors: john, pete, mary, paul. Tie variables: m,pa; pa,pe; pa,j; m,pe; pe,j; m,j
The probability structure of a Markov graph is described by cliques of the dependence graph (Hammersley-Clifford)...
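The dependence graph for the four actors can be built mechanically. A minimal sketch, using the standard Markov dependence rule that two tie variables are conditionally dependent when their dyads share an actor; the variable names are illustrative.

```python
from itertools import combinations

actors = ["john", "pete", "mary", "paul"]

# Each tie variable is an unordered pair of actors (a dyad).
dyads = list(combinations(actors, 2))

# Markov dependence: link two tie variables if their dyads share an actor.
dependence_edges = [
    (d1, d2) for d1, d2 in combinations(dyads, 2) if set(d1) & set(d2)
]

# The tie variables that have Mary in common form a clique in this graph.
mary_dyads = [d for d in dyads if "mary" in d]
```

With four actors there are 6 tie variables and 15 pairs of them; only the 3 pairs of disjoint dyads are left unlinked, so the dependence graph has 12 edges.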

58
Deriving the ERGM: From Markov graph to Dependence graph
[Figure sequence, slides 58 to 69: the dependence graph on the tie variables m,pa; pa,pe; pa,j; m,pe; pe,j; m,j, with subsets of its nodes shown in turn]

70
From Markov graph to Dependence graph – distinct subgraphs? too many statistics (parameters)

71
The homogeneity assumption
[Figure: configurations set equal to one another]

72
A log-linear model (ERGM) for ties
Aggregated to a joint model for the entire adjacency matrix
Interaction terms in the log-linear model of types [figure]

73
A log-linear model (ERGM) for ties
By definition of (in-)dependence
E.g. ties {i,j} and {i,k} co-occurring: main effects plus an interaction term
More than is explained by margins
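For reference, the joint model is commonly written in exponential-family form. The notation is an assumption of this sketch: $\theta$ for the parameter vector, $z(x)$ for the vector of subgraph counts, $\psi(\theta)$ for the log normalising constant, and $\mathcal{X}$ for the set of all graphs on the node set.

```latex
\Pr(X = x \mid \theta) = \exp\{\theta^{\top} z(x) - \psi(\theta)\},
\qquad
\psi(\theta) = \log \sum_{y \in \mathcal{X}} \exp\{\theta^{\top} z(y)\}
```

A count such as the number of 2-stars acts as an interaction term: it contributes only when two ties sharing a node are both present.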

74
Part 4 Estimation of ERGM

75
Likelihood equations for exponential families
Aggregated to a joint model for the entire adjacency matrix X
Sum over all $2^{n(n-1)/2}$ graphs
The MLE solves the equation (cf. Lehmann, 1983):
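For an exponential family, the MLE is the parameter value at which the expected statistics equal the observed ones, $E_\theta[z(X)] = z(x_{obs})$. The expectation requires the sum over all graphs, which is feasible only for tiny n. A minimal sketch, assuming n = 4 and an edge-plus-triangle model chosen purely for illustration; `z` and `expected_stats` are hypothetical helper names.

```python
import numpy as np
from itertools import combinations, product

n = 4
dyads = list(combinations(range(n), 2))  # n(n-1)/2 = 6 tie variables

def z(tie_values):
    """Statistics (edges, triangles) of the graph given 0/1 values per dyad."""
    x = np.zeros((n, n), dtype=int)
    for (i, j), v in zip(dyads, tie_values):
        x[i, j] = x[j, i] = v
    return np.array([x.sum() // 2, np.trace(x @ x @ x) // 6], dtype=float)

# Enumerate all 2^(n(n-1)/2) graphs; already 64 graphs for n = 4.
all_stats = np.array([z(g) for g in product([0, 1], repeat=len(dyads))])

def expected_stats(theta):
    """E_theta[z(X)] by exact summation over every graph."""
    log_w = all_stats @ theta
    w = np.exp(log_w - log_w.max())  # subtract the max for numerical stability
    return (w[:, None] * all_stats).sum(axis=0) / w.sum()

n_graphs = len(all_stats)
mean_at_zero = expected_stats(np.zeros(2))  # uniform over graphs: [3.0, 0.5]
```

The number of graphs doubles with every added dyad, which is why the estimation methods in this part replace the exact sum with simulation.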

76
Likelihood equations for exponential families
Solving
- Using the cumulant generating function (Corander, Dahmström, and Dahmström, 1998)
- Stochastic approximation (Snijders, 2002, based on Robbins-Monro, 1951)
- Importance sampling (Handcock, 2003; Hunter and Handcock, 2006, based on Geyer-Thompson, 1992)

77
Robbins-Monro algorithm
Solving: Snijders, 2002, algorithm
- Initialisation phase
- Main estimation
- Convergence check and calculation of standard errors
MAIN: Draw using MCMC
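The main estimation phase can be summarised by a Robbins-Monro type update. The form below is a sketch in assumed notation: $a_m$ is a decreasing gain sequence, $D_0$ a scaling matrix fixed in the initialisation phase, and $x^{(m)}$ a graph drawn by MCMC from the model at the current value $\theta^{(m)}$.

```latex
\theta^{(m+1)} = \theta^{(m)} - a_m\, D_0^{-1}\,\bigl( z(x^{(m)}) - z(x_{\mathrm{obs}}) \bigr)
```

The update moves the parameters until simulated statistics match the observed statistics on average, which is the likelihood equation.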

78
Geyer-Thompson
Solving: Handcock, 2003, approximate Fisher scoring
MAIN: Approximated using importance sample from MCMC
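The importance-sampling idea can be stated as an approximation to the log-likelihood ratio. This is a sketch in assumed notation: $\theta_0$ is a fixed reference value and $x^{(1)}, \dots, x^{(M)}$ are graphs simulated by MCMC from the model at $\theta_0$.

```latex
\ell(\theta) - \ell(\theta_0)
= (\theta - \theta_0)^{\top} z(x_{\mathrm{obs}})
  - \log E_{\theta_0}\!\left[ \exp\{(\theta - \theta_0)^{\top} z(X)\} \right]
\approx (\theta - \theta_0)^{\top} z(x_{\mathrm{obs}})
  - \log \frac{1}{M} \sum_{m=1}^{M} \exp\{(\theta - \theta_0)^{\top} z(x^{(m)})\}
```

The approximation is reliable only for $\theta$ close to $\theta_0$, so in practice the reference value is updated and the sample redrawn.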

79
Bayes: dealing with likelihood
The normalising constant of the posterior is not essential for Bayesian inference; all we need is: … but
Sum over all $2^{n(n-1)/2}$ graphs

80
Bayes: MCMC?
Consequently, in e.g. Metropolis-Hastings, acceptance probability of move to θ … which contains
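To make the difficulty concrete, here is the Metropolis-Hastings acceptance probability for a move from $\theta$ to $\theta^{*}$ written out in assumed notation: a symmetric proposal, prior $\pi$, statistics $z(x)$, and $c(\theta)$ for the normalising constant of the model.

```latex
\alpha(\theta \to \theta^{*}) = \min\left\{ 1,\;
\frac{\pi(\theta^{*})\, \exp\{\theta^{*\top} z(x)\}}
     {\pi(\theta)\, \exp\{\theta^{\top} z(x)\}}
\cdot \frac{c(\theta)}{c(\theta^{*})} \right\},
\qquad
c(\theta) = \sum_{y} \exp\{\theta^{\top} z(y)\}
```

The ratio $c(\theta)/c(\theta^{*})$ involves the sum over all graphs, so the acceptance probability cannot be evaluated directly.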

81
Bayes: Linked Importance Sampler Auxiliary Variable MCMC
LISA (Koskinen, 2008; Koskinen, Robins & Pattison, 2010): Based on Møller et al. (2006), we define an auxiliary variable and produce draws from the joint posterior using the proposal distributions … and …

82
Bayes: alternative auxiliary variable
LISA (Koskinen, 2008; Koskinen, Robins & Pattison, 2010): Based on Møller et al. (2006), we define an auxiliary variable
Many linked chains:
- Computation time
- Storage (memory and time issues)
Improvement: use exchange algorithm (Murray et al. 2006): … and accept θ* with log-probability: … (Caimo & Friel, 2011)

83
Bayes: Implications of using alternative auxiliary variable
Improvement: use exchange algorithm (Murray et al. 2006): … and accept θ* with log-probability: … (Caimo & Friel, 2011)
- Storing only parameters
- No pre-tuning: no need for good initial values
- Standard MCMC properties of sampler
- Less sensitive to near degeneracy in estimation
- Easier than anything else to implement
QUICK and ROBUST
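The exchange step can be illustrated on the simplest possible ERGM. A minimal sketch, assuming an edge-only model (so the auxiliary graph can be drawn exactly through its edge count), a normal prior, a random-walk proposal, and hypothetical data; a real ERGM would draw the auxiliary graph by MCMC. The log acceptance probability used is $(\theta - \theta^{*})^{\top} z(x^{*}) + (\theta^{*} - \theta)^{\top} z(x_{obs})$ plus the log prior ratio, in which the normalising constants cancel.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 20 actors (190 dyads) with 40 observed ties.
n_dyads = 190
z_obs = 40

def log_prior(theta, sd=10.0):
    """Log density (up to a constant) of an assumed N(0, sd^2) prior."""
    return -0.5 * (theta / sd) ** 2

def exchange_sampler(n_iter=5000, step=0.3):
    theta = 0.0
    draws = np.empty(n_iter)
    for t in range(n_iter):
        theta_star = theta + step * rng.normal()
        # Auxiliary graph from the model at theta_star. For the edge-only
        # model only its edge count matters, and that count is binomial.
        p_star = 1.0 / (1.0 + np.exp(-theta_star))
        z_star = rng.binomial(n_dyads, p_star)
        log_accept = ((theta - theta_star) * z_star
                      + (theta_star - theta) * z_obs
                      + log_prior(theta_star) - log_prior(theta))
        if np.log(rng.uniform()) < log_accept:
            theta = theta_star
        draws[t] = theta
    return draws

draws = exchange_sampler()
posterior_mean = draws[1000:].mean()  # should sit near logit(40/190)
```

Only the current parameter value is stored between iterations, and no tuning run or good starting value is needed, matching the points listed on the slide.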

84
Part 5 Types of missing data

85
Sampling in/on networks

93
Sampling in/on networks
[Figure labels: missing data, observed data]

94
[Figure: network and its adjacency matrix x]

95
Sampling in/on networks
[Figure: network and its adjacency matrix x]

99
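Sampling in/on networks
[Figure: adjacency matrix x of the sampled network, with ? marking unobserved tie variables]

100
Ignoring non-sampled?
[Figure: adjacency matrix x of the sampled network, with ? marking unobserved tie variables]

101
What about alter – alter across ego?
[Figure: adjacency matrix x in which the alter-alter entries are also ?]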
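A partially observed adjacency matrix of this kind can be represented with an explicit missing-value marker. A minimal sketch, assuming directed nominations, two interviewed egos, and made-up reports; `NaN` plays the role of "?" on the slides.

```python
import numpy as np

n = 5
respondents = [0, 1]            # hypothetical sampled egos
x = np.full((n, n), np.nan)     # start with every tie variable unobserved
np.fill_diagonal(x, 0)          # self ties are not permitted

# Hypothetical reports: ego 0 names actor 2; ego 1 names actors 0 and 3.
reports = {0: [2], 1: [0, 3]}
for ego in respondents:
    x[ego, :] = 0               # an interviewed ego's whole row is observed
    x[ego, reports[ego]] = 1

observed = ~np.isnan(x)
off_diagonal = ~np.eye(n, dtype=bool)
n_observed = int((observed & off_diagonal).sum())
n_missing = int((~observed & off_diagonal).sum())
```

Rows of actors who were never interviewed stay entirely missing, including all ties between alters, which is the situation the alter-alter slide asks about.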

102
School classes

107
Multilevel attribute models
If network like another level:
- Groups:
- Group indicators:
- Networks in groups (scaled):
- With random intercepts:

108
Empirical setup

110
Problem of boundary specification By design – children do not nominate alters outside of school class

111
Problem of boundary specification
By design – children do not nominate alters outside of school class
[Figure labels: Out of school; To other school class]

113
Multilevel autocorrelation/nef models

114
[Figure: matrix with unobserved (?) entries]

115
Part 6 Estimation of ERGM with missing data

116
Model assisted treatment of missing network data
[Figure labels: missing data, observed data]
If you don't have a model for what you have observed, how are you going to be able to say something about what you have not observed using what you have observed?

117
Model assisted treatment of missing network data
- Importance sampling (Handcock & Gile, 2010; Koskinen, Robins & Pattison, 2010)
- Stochastic approximation and the missing data principle (Orchard & Woodbury, 1972) (Koskinen & Snijders, forthcoming)
- Bayesian data augmentation (Koskinen, Robins & Pattison, 2010)

118
What about alter – alter across ego?
[Figure labels: missing data, observed data]
Available case analysis: pretend missing does not exist

119
The principled approach in ERGM framework
[Figure labels: missing data, observed data]
We have to simulate the missing (complement) and pool our inferences

120
Subgraph of ERGM not ERGM
Dependence in ERGM [figure: i, j, k]
We may also have dependence [figure: i, j, l, k]
But if k ? j, we should include counts of: [figure]
Marginalisation (Snijders, 2010; Koskinen et al., 2010)

121
Bayesian Data Augmentation
With missing data: simulate parameters; in each iteration simulate graphs (the missing data)

122
Bayesian Data Augmentation
With missing data: simulate parameters; in each iteration simulate graphs
Most likely missing given current θ

123
Bayesian Data Augmentation
With missing data: simulate parameters; in each iteration simulate graphs
Most likely θ given current missing

124
Bayesian Data Augmentation
With missing data: simulate parameters; in each iteration simulate graphs
Most likely missing given current θ

125
Bayesian Data Augmentation
With missing data: simulate parameters; in each iteration simulate graphs
Most likely θ given current missing

126
Bayesian Data Augmentation
With missing data: simulate parameters; in each iteration simulate graphs
Most likely missing given current θ

127
Bayesian Data Augmentation
With missing data: simulate parameters; in each iteration simulate graphs
and so on…

128
Bayesian Data Augmentation
With missing data: simulate parameters; in each iteration simulate graphs
… until

129
Bayesian Data Augmentation
What does it give us?
- Distribution of parameters
- Distribution of missing data
Subtle point: missing data does not depend on the parameters (we don't have to choose parameters to simulate missing)
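The alternation between simulating missing ties and simulating parameters can be shown on a toy case. A minimal sketch, assuming an edge-only (Bernoulli) model with a conjugate Beta(1, 1) prior on the tie probability so that both steps are exact draws; the counts are hypothetical, and a real ERGM would need MCMC in both steps.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data: 430 observed dyads with 80 ties, 200 missing dyads.
n_obs_dyads, n_obs_ties = 430, 80
n_missing_dyads = 200

def data_augmentation(n_iter=2000):
    p = 0.5  # arbitrary starting value for the tie probability
    p_draws = np.empty(n_iter)
    missing_tie_draws = np.empty(n_iter, dtype=int)
    for t in range(n_iter):
        # Step 1: simulate the missing ties given the current parameter.
        missing_ties = rng.binomial(n_missing_dyads, p)
        # Step 2: simulate the parameter given the completed graph.
        ties = n_obs_ties + missing_ties
        non_ties = n_obs_dyads + n_missing_dyads - ties
        p = rng.beta(1 + ties, 1 + non_ties)
        p_draws[t] = p
        missing_tie_draws[t] = missing_ties
    return p_draws, missing_tie_draws

p_draws, missing_tie_draws = data_augmentation()
```

The two output arrays are the two products named on the slide: a distribution for the parameter and a distribution for the missing data, the latter averaged over parameter uncertainty rather than tied to one chosen parameter value.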

131
Part 7 Estimation of ERGM with missing data - Example Missing ties

132
Bayesian Data Augmentation: Lazega's (2001) Lawyers
Collaboration network among 36 lawyers in a New England law firm (Lazega, 2001)
[Figure legend: Boston office; Hartford office; Providence office; least senior; most senior]

133
Bayesian Data Augmentation: Lazega's (2001) Lawyers
Main effect: Edges; Seniority; Practice ($b_i = 1$ if $i$ corporate, 0 if litigation)
Homophily: Practice; Sex; Office
GWESP: with 8 = log( )
Statistics: $t_1$, $t_2$, $t_3$, etc.

134
Bayesian Data Augmentation: Lazega's (2001) Lawyers – ERGM posteriors (Koskinen, 2008)

135
Bayesian Data Augmentation
Cross validation (Koskinen, Robins & Pattison, 2010)
- Remove 200 of the 630 dyads at random
- Fit inhomogeneous Bernoulli model and obtain the posterior predictive tie-probabilities for the missing tie-variables
- Fit ERGM and obtain the posterior predictive tie-probabilities for the missing tie-variables (Koskinen et al., in press)
- Fit Hoff's (2008) latent variable probit model with linear predictor [coefficient vector]$^T z(x_{ij}) + w_i w_j^T$
- Repeat many times

136
Bayesian Data Augmentation ROC curve for predictive probabilities combined over 20 replications (Koskinen et al. 2010)
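A ROC curve for held-out tie variables is often summarised by the area under it. A minimal sketch, assuming hypothetical held-out tie values and predictive probabilities; `roc_auc` is an illustrative helper, not code from the cited study.

```python
import numpy as np

def roc_auc(y_true, scores):
    """AUC as the probability that a random tie outscores a random non-tie."""
    y_true = np.asarray(y_true, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos, neg = scores[y_true], scores[~y_true]
    diff = pos[:, None] - neg[None, :]
    # Ties in score count as half a win.
    return float(((diff > 0) + 0.5 * (diff == 0)).mean())

# Hypothetical held-out dyads: true tie values and predictive probabilities.
y_held_out = [1, 0, 1, 0, 0, 1]
p_pred = [0.9, 0.2, 0.6, 0.7, 0.1, 0.8]
auc = roc_auc(y_held_out, p_pred)  # 8 of 9 tie/non-tie pairs ranked correctly
```

An AUC of 0.5 corresponds to random guessing, so comparing the models' AUCs gives one number per model to set beside the curves.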

139
Part 8 Estimation of ERGM with missing data - Sampled data and covert actors

140
Bayesian Data Augmentation: Snowball sampling
Snowball sampling design ignorable for ERGM (Thompson and Frank, 2000; Handcock & Gile, 2010; Koskinen, Robins & Pattison, 2010)
... but snowball sampling rarely used when population size is known...
Using the Sageman (2004) clandestine network as test-bed for unknown N

142
Bayesian Data Augmentation: the Sageman (2004) N = 366 network

146
Bayesian Data Augmentation: the Sageman (2004) N = 366 network
Take seed of size n = 120. Snowball out 1 wave. Additional nodes m = 160
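A one-wave snowball design is easy to state in code. A minimal sketch on a hypothetical seven-actor network (not the Sageman data); `snowball_one_wave` is an illustrative name.

```python
def snowball_one_wave(adjacency, seed):
    """Return the seed set and the new nodes reached in one wave."""
    seed = set(seed)
    wave = set()
    for i in seed:
        wave.update(adjacency[i])  # everyone a seed node is tied to
    wave -= seed                   # keep only nodes not already in the seed
    return seed, wave

# Hypothetical network as an adjacency list.
adjacency = {0: {1, 2}, 1: {0, 3}, 2: {0}, 3: {1, 4}, 4: {3}, 5: {6}, 6: {5}}
seed, wave = snowball_one_wave(adjacency, seed=[0, 1])
unobserved = set(adjacency) - seed - wave
```

Nodes in `unobserved` are never seen by the design, so in a real study their number, and hence N, is unknown.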

147
Bayesian Data Augmentation: the Sageman (2004) N = 366 network
Seed n = 120, first wave m = 160, N ≥ 280

148
Bayesian Data Augmentation: the Sageman (2004) N = 366 network
Seed n = 120, first wave m = 160, N ≥ 280
Assume N = 281

149
Bayesian Data Augmentation: the Sageman (2004) N = 366 network
Seed n = 120, first wave m = 160, N ≥ 280
Assume N = 281, N = 291

150
Bayesian Data Augmentation: the Sageman (2004) N = 366 network
Seed n = 120, first wave m = 160, N ≥ 280
Assume N = 281, N = 291, N = 301

151
Bayesian Data Augmentation: the Sageman (2004) N = 366 network
Seed n = 120, first wave m = 160, N ≥ 280
Assume N = 281, N = 291, N = 301, N = 311

152
Bayesian Data Augmentation: the Sageman (2004) N = 366 network
Seed n = 120, first wave m = 160, N ≥ 280
Assume N = 281, N = 291, N = 301, N = …, N = 391, N = 396, N = 399

153
Bayesian Data Augmentation: the Sageman (2004) N = 366 network
Seed n = 120, first wave m = 160, N ≥ 280
Assume N = 281, N = 291, N = 301, N = …, N = 391, N = 396, N = …
[Figure: credibility intervals by N]

154
Bayesian Data Augmentation: the Sageman (2004) N = 366 network
Seed n = 120, first wave m = 160, N ≥ 280
Assume N = 281, N = 291, N = 301, N = …, N = 391, N = 396, N = …
[Figure: credibility intervals]

155
Bayesian Data Augmentation: the Sageman (2004) N = 366 network
Seed n = 120, first wave m = 160, N ≥ 280
Assume N = 281, N = 291, N = 301, N = …, N = 391, N = 396, N = …
[Figure: credibility intervals]

156
Bayesian Data Augmentation: the Sageman (2004) N = 366 network
Seed n = 120, first wave m = 160
[Figure: prediction intervals by N; legend: Bernoulli, ERGM, obs]

158
Bayesian Data Augmentation: Snowball sampling – Next steps
We can fit and predict missing conditional on N
Next: marginalise with respect to N, and estimate N
- Use path sampler
- Take combinatorics of zero block into account

159
Part 9 Further issues

160
How large networks can we allow for?
- Large N: ERGMs do not scale up (cf. missing data experiments)
- Lot of unobserved data – lot of unobserved covariates
- Computational issues – time and memory
- Heterogeneity…

161
How large networks can we allow for?
ERGMs typically assume homogeneity
(A) Block modelling and ERGM (Koskinen, 2009)
(B) Latent class ERGM (Schweinberger & Handcock)

162
Solutions and future directions
Ignoring unknown N:
- Conditional MLE for snowball sample does not require knowledge of N (sic!) (Pattison et al., in preparation)
Estimating N:
- Bernoulli assumptions (Frank and Snijders, 1994, JOS)
- Using ERGM and Bayes factors? (Koskinen et al., in preparation)
- Using heuristic GOF; posterior predictive distributions, re-sampling and copula (?)

163
Wrap-up
ERGMs
- Increasingly being used
- Increasingly being understood
- Increasingly being able to handle imperfect data (also missing link prediction)
Methods
- Plenty of open issues
- Bayes is the way of the future
Legitimacy and dissemination
- e.g. Lusher, Koskinen, Robins, ERGMs for SN, CUP, 2011
