Introduction to ERGM/p* model Kayo Fujimoto, Ph.D. Based on presentation slides by Nosh Contractor and Mengxiao Zhu.


1 Introduction to ERGM/p* model Kayo Fujimoto, Ph.D. Based on presentation slides by Nosh Contractor and Mengxiao Zhu

2 Four parts of ERGM: (1) Observed network data: network statistics (or counts) of each configuration. (2) ERG modeling: conditional probability and change statistics. (3) Estimation and simulation: estimate parameters by simulation (MCMC ML estimation); goodness-of-fit test (convergence t-test), comparing observed and simulated graphs. (4) Recent developments in ERGM: new model specifications.

3 Exponential Random Graph Model (ERGM). ERGMs take the form of a probability distribution of graphs: $P(Y = y) = \exp\{\theta' g(y)\} / k(\theta)$, where Y is a set of tie indicator variables; y is a realization, the observed network; g(y) is a vector of network statistics; θ is a parameter vector corresponding to g(y); and k(θ) is a normalizing factor calculated by summing exp{θ'g(y)} over all possible network configurations.
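
The definitions on this slide imply P(Y = y) = exp{θ'g(y)} / k(θ). For a very small network this can be checked by brute force. The sketch below is illustrative only: the choice of statistics (edge and triangle counts), the network size n = 4, the parameter values, and all function names are assumptions, not taken from the slides.

```python
from itertools import combinations, product
from math import exp

def graph_stats(edges, n):
    """Return g(y) = (edge count, triangle count) for an undirected graph."""
    edge_set = set(edges)
    triangles = sum(
        1
        for a, b, c in combinations(range(n), 3)
        if {(a, b), (a, c), (b, c)} <= edge_set
    )
    return (len(edge_set), triangles)

def ergm_distribution(theta, n):
    """Enumerate every undirected graph on n nodes; return {graph: P(Y = y)}."""
    dyads = list(combinations(range(n), 2))
    weights = {}
    for indicators in product([0, 1], repeat=len(dyads)):
        edges = tuple(d for d, y in zip(dyads, indicators) if y)
        g = graph_stats(edges, n)
        # Unnormalized weight exp{theta' g(y)}
        weights[edges] = exp(sum(t * s for t, s in zip(theta, g)))
    k = sum(weights.values())  # normalizing factor k(theta)
    return {y: w / k for y, w in weights.items()}

# Hypothetical parameter values, chosen only for illustration
dist = ergm_distribution(theta=(-1.0, 0.5), n=4)
print(len(dist), sum(dist.values()))
```

Enumeration like this is only feasible for a handful of nodes, which is why the later slides turn to conditional probabilities and simulation.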

4 Observed network Graph statistics (or counts) of each configuration

5 Network Statistics Examples for Undirected Networks. Example (five-node network with nodes a, b, c, d, e; figure not preserved): Edge: 6; 2-Star: 1+3+1+6+0 = 11; 3-Star: 0+1+0+4+0 = 5; 4-Star: 1; Triangle: 2.
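
The counts on this slide can be reproduced from an edge list. The per-node 2-star terms (1, 3, 1, 6, 0) correspond to degrees 2, 3, 2, 4, 1, since a node of degree d is the center of C(d, k) k-stars. The edge list below is a hypothetical reconstruction that matches those degrees and the triangle count; the actual figure is not available, so the assignment of labels to nodes is an assumption.

```python
from itertools import combinations
from math import comb

# Hypothetical edge list consistent with the slide's counts (labels are assumed)
edges = [("c", "a"), ("c", "b"), ("c", "d"), ("c", "e"), ("b", "a"), ("b", "d")]

def adjacency(edge_list):
    """Build an undirected adjacency map {node: set of neighbours}."""
    adj = {}
    for u, v in edge_list:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def k_star_count(adj, k):
    """A node of degree d is the center of C(d, k) k-stars."""
    return sum(comb(len(neigh), k) for neigh in adj.values())

def triangle_count(adj):
    """Count unordered node triples that are mutually connected."""
    return sum(
        1
        for a, b, c in combinations(sorted(adj), 3)
        if b in adj[a] and c in adj[a] and c in adj[b]
    )

adj = adjacency(edges)
stats = {
    "edge": len(edges),
    "2-star": k_star_count(adj, 2),
    "3-star": k_star_count(adj, 3),
    "4-star": k_star_count(adj, 4),
    "triangle": triangle_count(adj),
}
print(stats)
```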

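6 A Simple Example of ERGM (Homogeneous Assumption). Table: number of edges (0, 1, 2, 3) against the corresponding undirected network configurations (figure not preserved). Number of configurations for a directed network and for an undirected network (formulas not preserved).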

7 A Simple ERG model: predict the network using the edge count L(y). θ can take different values: θ = 0, θ = -0.69, θ = 0.69. L(y) can take the following values: L(y) = 0, L(y) = 1, L(y) = 2, L(y) = 3.

8 Example 1: θ = 0, L = 0. Model: ERGM formula with θ = 0. Probability of getting networks with 0 edges.

9 Example 1: θ = 0, L = 1. Model: ERGM formula with θ = 0. Probability of getting networks with 1 edge.

10 Example 1: θ = 0, L = 2. Model: ERGM formula with θ = 0. Probability of getting networks with 2 edges.

11 Example 1: θ = 0, L = 3. Model: ERGM formula with θ = 0. Probability of getting networks with 3 edges.

12 Example 1: θ = 0. Model: ERGM formula with θ = 0.

13 Example 2: θ = -0.69. Model: ERGM formula with θ = -0.69.

14 Example 3: θ = 0.69. Model: ERGM formula with θ = 0.69.
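
The three examples can be worked numerically. The sketch below assumes the setting implied by the slides (an undirected network on 3 nodes, hence 3 possible edges, with the edge count as the only statistic) and treats ±0.69 as rounded values of ±ln 2. Under the homogeneity assumption, all networks with the same edge count are grouped, so P(L edges) = C(3, L) exp(θL) / (1 + exp(θ))^3. The function name is hypothetical.

```python
from math import comb, exp, log

N_DYADS = 3  # assumed: undirected network on 3 nodes

def prob_edge_count(theta, n_edges, n_dyads=N_DYADS):
    """P(network has n_edges edges) under the edge-only ERGM."""
    # k(theta) = sum over all 2^n_dyads graphs of exp(theta * L(y))
    k = (1 + exp(theta)) ** n_dyads
    return comb(n_dyads, n_edges) * exp(theta * n_edges) / k

for theta in (0.0, -log(2), log(2)):
    probs = [prob_edge_count(theta, L) for L in range(N_DYADS + 1)]
    print(f"theta = {theta:+.2f}:", [round(p, 3) for p in probs])
```

With θ = 0 every network is equally likely, a negative θ shifts probability toward sparse networks, and a positive θ shifts it toward dense ones.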

15 Why Change Statistics? Huge sample space: the number of possible network configurations grows very rapidly with the number of nodes.
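
To give a sense of scale: a network on n nodes has n(n-1)/2 undirected dyads and n(n-1) directed ones, each of which can be present or absent. The slide's own formulas are not preserved, so the short sketch below simply evaluates these standard counts; the function names are hypothetical.

```python
def n_undirected_graphs(n):
    """Number of labeled undirected graphs on n nodes: 2^(n(n-1)/2)."""
    return 2 ** (n * (n - 1) // 2)

def n_directed_graphs(n):
    """Number of labeled directed graphs on n nodes: 2^(n(n-1))."""
    return 2 ** (n * (n - 1))

for n in (3, 5, 10, 20):
    print(
        n,
        len(str(n_undirected_graphs(n))), "digits (undirected),",
        len(str(n_directed_graphs(n))), "digits (directed)",
    )
```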

16 ERG modeling Conditional Probability and Change Statistics

17 Conditional Probability vs. Total Probability. Total probability of the whole network: it becomes infeasible to calculate as the size of the network gets large. Introduce the conditional probability of edges: this reduces the sample space.

18 Avoid the Calculation on Sample Space. Conditional probability that an edge exists; conditional probability that an edge is absent. Logit p* model: model the log odds of the tie Yij being present.
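
The formulas for this slide are not preserved. A standard form of the derivation, consistent with the ERGM definition on slide 3, is sketched below. The notation is an assumption: $y_{ij}^{+}$ is the network with the tie (i, j) present, $y_{ij}^{-}$ the same network with it absent, and $Y_{ij}^{c}$ all tie variables other than $Y_{ij}$.

```latex
\begin{align}
P(Y_{ij}=1 \mid Y_{ij}^{c}=y_{ij}^{c})
  &= \frac{\exp\{\theta' g(y_{ij}^{+})\}}
          {\exp\{\theta' g(y_{ij}^{+})\} + \exp\{\theta' g(y_{ij}^{-})\}} \\
P(Y_{ij}=0 \mid Y_{ij}^{c}=y_{ij}^{c})
  &= \frac{\exp\{\theta' g(y_{ij}^{-})\}}
          {\exp\{\theta' g(y_{ij}^{+})\} + \exp\{\theta' g(y_{ij}^{-})\}} \\
\log \frac{P(Y_{ij}=1 \mid Y_{ij}^{c}=y_{ij}^{c})}
          {P(Y_{ij}=0 \mid Y_{ij}^{c}=y_{ij}^{c})}
  &= \theta' \left[ g(y_{ij}^{+}) - g(y_{ij}^{-}) \right]
\end{align}
```

The normalizing factor k(θ) appears in both the numerator and the denominator and cancels, so the sum over the whole sample space is never needed.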

19 Change Statistics (logit p* model). From the end of the last slide, we have the log odds of a tie. Define the change statistics as the difference in the network statistics between the network with the tie present and the network with the tie absent. Model the log odds of a tie being present versus absent as θ' times the change statistics.
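
As a small illustration of change statistics, the sketch below computes them for a single dyad and converts the resulting log odds into a conditional tie probability. The statistics used (edges and triangles), the example network, the parameter values, and the function names are all assumptions made for illustration.

```python
from itertools import combinations
from math import exp

def network_stats(edge_set, n):
    """g(y) = (edge count, triangle count); edges are (i, j) tuples with i < j."""
    triangles = sum(
        1
        for a, b, c in combinations(range(n), 3)
        if {(a, b), (a, c), (b, c)} <= edge_set
    )
    return (len(edge_set), triangles)

def change_statistics(edge_set, i, j, n):
    """g(y+) - g(y-): effect on the statistics of toggling tie (i, j) from 0 to 1."""
    dyad = (min(i, j), max(i, j))
    y_plus = edge_set | {dyad}
    y_minus = edge_set - {dyad}
    return tuple(
        p - m for p, m in zip(network_stats(y_plus, n), network_stats(y_minus, n))
    )

def conditional_tie_probability(theta, edge_set, i, j, n):
    """P(Y_ij = 1 | rest of the network) via the logit p* form."""
    delta = change_statistics(edge_set, i, j, n)
    log_odds = sum(t * d for t, d in zip(theta, delta))
    return 1.0 / (1.0 + exp(-log_odds))

# Hypothetical 4-node network with ties 0-1 and 1-2; consider the dyad (0, 2)
y = {(0, 1), (1, 2)}
delta = change_statistics(y, 0, 2, n=4)
p = conditional_tie_probability((-1.0, 0.5), y, 0, 2, n=4)
print(delta, round(p, 4))
```

Adding the tie 0-2 creates one new edge and closes one triangle, so only the local neighbourhood of the dyad has to be inspected.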

20 Estimation and Simulation (Markov Chain Monte Carlo Maximum Likelihood Method)

21 Review: Maximum Likelihood Estimation (MLE). Likelihood function: estimate the parameter θ given the observed network. Maximum likelihood estimation: find θ values such that the observed statistics are equal to the expected statistics. Approximate the MLE by simulation.

22 Procedures for simulating the ERG distribution: Markov Chain Monte Carlo Maximum Likelihood Estimation (MCMCMLE). 1. Simulate a distribution of random graphs from a starting set of parameter values. 2. Refine the parameter values by comparing the distribution of graphs against the observed graph. 3. Repeat this process until the parameter estimates stabilize.
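
The three steps can be sketched for the simplest case, the edge-only model. The code below is a toy illustration under stated assumptions, not the procedure used by any particular package: graphs are simulated with a Metropolis sampler that toggles one dyad at a time, and the parameter is refined with a Robbins-Monro stochastic-approximation step (gain 1/t), which is one of several possible refinement schemes. The function names, iteration counts, and seed are hypothetical.

```python
import random
from math import exp, log

def simulate_mean_edges(theta, state, n_steps, rng):
    """Step 1: Metropolis sampling from the edge-only ERGM; returns the mean edge count."""
    total = 0
    for _ in range(n_steps):
        d = rng.randrange(len(state))
        delta = -1 if state[d] else 1  # change in edge count if dyad d is toggled
        if rng.random() < min(1.0, exp(theta * delta)):
            state[d] = 1 - state[d]
        total += sum(state)
    return total / n_steps

def estimate_theta(observed_edges, n_dyads, n_iter=200, n_steps=500, seed=0):
    rng = random.Random(seed)
    theta = 0.0              # starting parameter value
    state = [0] * n_dyads    # start the chain from the empty graph
    for t in range(1, n_iter + 1):
        mean_edges = simulate_mean_edges(theta, state, n_steps, rng)
        # Step 2: move theta so that simulated and observed statistics agree
        theta -= (mean_edges - observed_edges) / t
    # Step 3: the shrinking gain 1/t makes the estimate stabilize
    return theta

# Toy data: 3 dyads (3-node undirected network) with 1 observed edge
theta_hat = estimate_theta(observed_edges=1, n_dyads=3)
print(round(theta_hat, 2), "vs closed form", round(log(0.5), 2))
```

For the edge-only model the MLE is available in closed form as the logit of the observed density, here log((1/3)/(2/3)) = log 0.5, which gives a reference value to compare the simulated estimate against.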

23 Convergence t-statistics. Test the adequacy of the estimated parameter values. 1. t-statistic for each configuration: |t| < 0.1 indicates good fit. NOTE: If the parameter estimates do not converge, the model may be degenerate.
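
A minimal sketch of such a convergence check, assuming the t-ratio is the difference between the observed statistic and the mean of the simulated statistics, divided by the standard deviation of the simulated statistics. The sign convention and the function name are assumptions; only the absolute value matters for the 0.1 threshold.

```python
from statistics import mean, stdev

def convergence_t_ratio(simulated, observed):
    """(observed - mean of simulated) / sample standard deviation of simulated."""
    return (observed - mean(simulated)) / stdev(simulated)

# Hypothetical simulated edge counts from graphs drawn at the estimated parameters
simulated_edges = [0, 1, 2, 1, 1, 2, 0, 1]
t = convergence_t_ratio(simulated_edges, observed=1)
print(t, abs(t) < 0.1)
```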

24 A Simple Example of MCMCMLE. Model: the edge-count ERG model. Observed network y: (figure not preserved). Goal: find the θ value such that the observed number of edges is equal to the expected number of edges.

25 Given the observed network y: if θ can be chosen from the following 3 cases, θ = -0.69 is preferred because it gives the highest probability for the observed network.

θ        P(Y=y)
-0.69    0.444
0        0.375
0.69     0.222

26 Markov dependence (Frank and Strauss, 1986). Potential ties are dependent only if they share a common actor: two possible network ties are conditionally independent unless they share a common actor. Once the homogeneity assumption is imposed, we obtain the following configurations…

27 Markov random graph models (non-directed networks): Density or edge (θ), Two-star (σ2), Three-star (σ3), Triangle (τ)

28 Problems of degeneracy for Markov random graph models. Certain parameter values place almost all of the probability mass on either the empty or the full graph. Simulation studies showed that Markov random graph models are degenerate for many empirical networks with a high level of clustering: a few very high degree nodes; some regions of high triangulation.

29 Two possibilities for the degeneracy problem (Snijders et al., 2006). The Markov dependence assumption may be too restrictive. The representation of transitivity by the total number of triangles might be too simplistic. → New specification of higher-order network dependency.

30 New development in ERGM Partial conditional dependence assumption and new model specification

31 Partial conditional dependence (social circuit dependence). Two possible network ties are conditionally dependent if their observation would lead to a 4-cycle. Figure (not preserved): nodes i, j, k, l; legend: possible edges, observed edges.

32 Partial conditional dependence (Example). Figure (not preserved) with four actors: Daughter A, Father A, Daughter B, Father B.

33 Difference between the two types of dependence assumptions. Two panels (figures not preserved), each on nodes i, j, k, l: Markov dependence assumptions; partial conditional dependence assumptions. Legend: potential tie; ties which affect the formation of the potential tie; ties with no effect on the potential tie.

34 New Specifications of ERGM. Represent structural parameters similar to the Markov parameters. Effects are incorporated within the one configuration parameter. Three new statistics for non-directed networks: alternating k-stars; alternating k-triangles; alternating independent two-paths.

35 Examples of new specifications. Alternating k-star configuration (degree distribution); alternating k-triangle (tendency to form triads); alternating k-two-path (tendency to form cycles). (Figures not preserved.)
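
As an illustration of the first of these, the alternating k-star statistic of Snijders et al. (2006) combines all k-star counts into one term with alternating signs and geometrically decreasing weights: u = Σ_{k=2}^{n-1} (-1)^k S_k / λ^(k-2). The sketch below evaluates it from a degree sequence. The degree sequence (2, 3, 2, 4, 1) is the one implied by the star counts on slide 5; λ = 2 and the function name are assumptions chosen for illustration.

```python
from math import comb

def alternating_k_star(degrees, lam=2.0):
    """u = sum_{k=2}^{n-1} (-1)^k * S_k / lam^(k-2), with S_k = sum_i C(d_i, k)."""
    n = len(degrees)
    total = 0.0
    for k in range(2, n):  # k runs from 2 to n-1
        s_k = sum(comb(d, k) for d in degrees)  # number of k-stars
        total += (-1) ** k * s_k / lam ** (k - 2)
    return total

degrees = [2, 3, 2, 4, 1]  # implied by the per-node 2-star terms on slide 5
u = alternating_k_star(degrees)
print(u)  # S_2 - S_3/2 + S_4/4 = 11 - 2.5 + 0.25
```

Because higher-order stars enter with shrinking, sign-alternating weights, a single very high degree node contributes far less than it would to the raw star counts.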

36 Interpretation of the parameters. Positive alternating k-star parameter: networks with some higher-degree nodes are highly probable → core-periphery structure. Positive alternating k-triangle parameter: triangulation in the network, as well as a tendency for triangles themselves to group together in larger higher-order “clumps”. Positive alternating k-two-path parameter: tendency for 4-cycles in the network.

37 Summary for model construction. Random variables: each network tie (Yij) among the nodes of a network is a random tie variable; Yij = 1 if a tie from i to j exists, Yij = 0 otherwise; yij is the observed value of the variable Yij. Dependence assumptions: define contingencies among network variables; determine the type of parameters in the model; ties also depend on node-level attributes (homophily). Homogeneity assumption: simplify parameters by imposing homogeneity constraints. Estimation procedures: find the best parameter values based on the observed network; use simulation (MCMCMLE).

38 Software for ERGM: SIENA (Snijders and colleagues); PNet (Robins and colleagues); Statnet (Butts and colleagues).

39 References
Harrigan, Nicholas. “Exponential Random Graph (ERG) models and their application to the study of corporate elites.”
Robins, Garry (manuscript). “Exponential Random Graph (p*) models for social networks.” Published on the MelNet website.
Robins, G., Pattison, P., Kalish, Y., Lusher, D. (2007). “An introduction to exponential random graph (p*) models for social networks.” Social Networks, 29, 173-191.
Snijders, T.A.B., Pattison, P., Robins, G., Handcock, M. (2006). “New specifications for exponential random graph models.” Sociological Methodology, 36, 99-153.

40 Thank you for your attention Any questions?

