Assortativity (people associate based on common attributes)


Are people associating based on gender similarity?

Homophily or assortativity
Sociologists have observed that networks partition along several kinds of ties (friendships, acquaintances, business relationships) and along certain characteristics of the people involved (age, nationality, language, education, income level).
Homophily is the tendency of individuals to choose friends with similar characteristics: “Like links with like.” How do we compute it?

Naïve approach: calculate the actual number of same-gender ties

First make a block model
Start from the adjacency matrix of nodes N1 to N9, with each node labeled by its gender attribute. Reorder the rows and columns so that nodes of the same gender sit together: females N2, N4, N9, then males N1, N3, N5, N6, N7, N8.
[Adjacency matrix figures: entries not recoverable from the transcript.]
Block densities: female-female $3 / \binom{3}{2}$, female-male $5 / (6 \cdot 3)$, male-male $10 / \binom{6}{2}$

Naïve approach: calculate the fraction of same-gender ties
[Figure: the network of nodes N1 to N9.]
72% (13/18) of the edges are between vertices of the same gender.

Finding the number of same-class ties (“Turn off the mixed-class ties with a Kronecker delta”)
Kronecker delta:
$$\delta(c_i, c_j) = \begin{cases} 0, & \text{if } c_i \neq c_j \\ 1, & \text{if } c_i = c_j \end{cases}$$
Actual number of same-class ties:
$$\sum_{\text{edges } (i,j)} \delta(c_i, c_j) = \frac{1}{2} \sum_{ij} A_{ij}\, \delta(c_i, c_j) = 13$$
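The two forms of the count can be checked against each other in a few lines of Python. This is only a sketch: the slide's drawing is not recoverable from the transcript, so the edge list below is hypothetical, invented to match the slide's totals (9 nodes, 18 edges, females N2, N4, N9), and the names `gender`, `edges`, and `delta` are illustrative.

```python
# Hypothetical edge list (NOT the slide's actual graph): chosen so that
# there are 18 edges, of which 3 are female-female and 10 are male-male.
gender = {"N1": "M", "N2": "F", "N3": "M", "N4": "F", "N5": "M",
          "N6": "M", "N7": "M", "N8": "M", "N9": "F"}
edges = [
    ("N2", "N4"), ("N2", "N9"), ("N4", "N9"),                  # female-female
    ("N1", "N3"), ("N1", "N5"), ("N1", "N6"), ("N1", "N7"),    # male-male
    ("N3", "N5"), ("N3", "N6"), ("N5", "N7"), ("N5", "N8"),
    ("N6", "N8"), ("N7", "N8"),
    ("N2", "N1"), ("N2", "N3"), ("N4", "N6"), ("N9", "N7"),    # mixed
    ("N9", "N8"),
]

def delta(a, b):
    """Kronecker delta: 1 if the two classes match, else 0."""
    return 1 if a == b else 0

# Form 1: sum the delta over the edge list.
same_by_edges = sum(delta(gender[u], gender[v]) for u, v in edges)

# Form 2: (1/2) * sum over all ordered pairs (i, j) of A_ij * delta.
nodes = sorted(gender)
A = {(i, j): 0 for i in nodes for j in nodes}
for u, v in edges:
    A[(u, v)] = 1
    A[(v, u)] = 1  # undirected graph: the adjacency matrix is symmetric
same_by_matrix = sum(A[(i, j)] * delta(gender[i], gender[j])
                     for i in nodes for j in nodes) / 2

print(same_by_edges, same_by_matrix)
```

The factor of 1/2 in the matrix form compensates for each undirected edge appearing twice in the adjacency matrix, once as (i, j) and once as (j, i).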

Kleinberg’s method: estimating the number of expected edges

Proportion of males and females (the probability of selecting a male or a female node)
Nodes: P(male) = p = (number of males) / (total number of nodes) = 6/9 = 2/3
Nodes: P(female) = q = (number of females) / (total number of nodes) = 3/9 = 1/3

Probability of an edge being male-male, female-female, or male-female
Nodes: P(male) = p = 6/9 = 2/3; P(female) = q = 3/9 = 1/3
Edges: P(m-m) = $p^2$ = 4/9
Edges: P(f-f) = $q^2$ = 1/9
Edges: P(male-female) + P(female-male) = $2pq$ = 4/9

Expected number of male-male, female-female, and male-female ties
Nodes: P(male) = p = 6/9 = 2/3; P(female) = q = 3/9 = 1/3
Edges: P(m-m) = $p^2$ = 4/9 = 8/18, so we expect 8 of the 18 edges to be male-male (8 M-M)
Edges: P(f-f) = $q^2$ = 1/9 = 2/18 (2 F-F)
Edges: P(male-female) + P(female-male) = $2pq$ = 4/9 = 8/18 (8 M-F)
Total expected number of same-gender ties/edges: 10
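The arithmetic on this slide can be reproduced with exact fractions. A minimal sketch, using only the node counts (6 males, 3 females) and the edge count (18) given above; the variable names are illustrative.

```python
from fractions import Fraction

n_male, n_female, m = 6, 3, 18  # node counts and number of edges from the slides

p = Fraction(n_male, n_male + n_female)  # P(male)   = 2/3
q = 1 - p                                # P(female) = 1/3

# Expected edge counts if both endpoints are picked independently at random.
expected = {
    "male-male": p * p * m,
    "female-female": q * q * m,
    "male-female": 2 * p * q * m,
}
expected_same = expected["male-male"] + expected["female-female"]

for kind, count in expected.items():
    print(kind, count)
print("same-gender:", expected_same)
```

Note that this estimate depends only on the class proportions, not on how many ties each individual node has.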

Newman’s approach
“Make connections at random while preserving the vertex degrees. Ignoring vertex degrees and making connections truly at random has been shown to give much poorer results.”
Expected number of same-class ties:
$$\frac{1}{2} \sum_{ij} \frac{k_i k_j}{2m}\, \delta(c_i, c_j)$$
Note: $2m$ is the sum of the degrees, where $m$ is the number of edges.

Expected number of same-class ties
$m$ = number of edges = 18
$$\frac{1}{2} \sum_{ij} \frac{k_i k_j}{2m}\, \delta(c_i, c_j) = 10.36$$
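As a worked check of the 10.36 figure, the double sum can be grouped by class, because the Kronecker delta keeps only pairs of nodes in the same class. The sketch below assumes the block edge counts that the block-model slide appears to report (3 female-female, 5 female-male, 10 male-male ties); the symbol $K_c$, the total degree of class $c$, is introduced here only for convenience.

```latex
\begin{align*}
\frac{1}{2} \sum_{ij} \frac{k_i k_j}{2m}\, \delta(c_i, c_j)
  &= \frac{1}{4m} \sum_{c} K_c^2,
  \qquad K_c = \sum_{i \,:\, c_i = c} k_i \\
K_F &= 2 \cdot 3 + 5 = 11, \qquad K_M = 2 \cdot 10 + 5 = 25 \\
\frac{1}{4m} \left( K_F^2 + K_M^2 \right)
  &= \frac{121 + 625}{72} = \frac{746}{72} \approx 10.36
\end{align*}
```

Each same-class edge adds 2 to its class's total degree and each mixed edge adds 1 to each class, which is where the $2 \cdot 3 + 5$ and $2 \cdot 10 + 5$ come from.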

Computing assortativity/homophily: the difference between the observed number of same-class ties and the expected number of same-class ties

Measuring the presence of homophily: calculating modularity
If there is no homophily effect, we should expect to see 10.36 same-gender ties. Since we see 13 same-gender ties instead of 10.36, there is some evidence of homophily: we see about 2.6 more same-gender ties than we would expect if gender had no effect on tie formation.
$$\frac{1}{2} \sum_{ij} A_{ij}\, \delta(c_i, c_j) - \frac{1}{2} \sum_{ij} \frac{k_i k_j}{2m}\, \delta(c_i, c_j) = \frac{1}{2} \sum_{ij} \left( A_{ij} - \frac{k_i k_j}{2m} \right) \delta(c_i, c_j)$$

Measuring the presence of homophily: calculating modularity
If there is no homophily effect, we should expect 57.56% (10.36/18) of the ties to be same-gender. Since we see 72.22% (13/18) same-gender ties instead of 57.56%, there is some evidence of homophily: the share of same-gender ties is about 14.7 percentage points higher than we would expect if gender had no effect on tie formation. The modularity score is this difference:
$$Q = \frac{1}{2m} \sum_{ij} \left( A_{ij} - \frac{k_i k_j}{2m} \right) \delta(c_i, c_j) = 0.7222 - 0.5756 \approx 0.147$$
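Because the Kronecker delta groups everything by class, the modularity score can be computed from the block edge counts alone. The following is a minimal sketch, not a library routine: the function name `modularity_from_block_counts` is hypothetical, and the counts passed in (3 female-female, 5 female-male, 10 male-male) are assumed from the block-model slide.

```python
def modularity_from_block_counts(block_edges):
    """Return (observed, expected, Q) as fractions of all edges.

    block_edges maps an unordered class pair (a, b) to the number of
    edges running between class a and class b.
    """
    m = sum(block_edges.values())   # total number of edges
    same = 0                        # edges joining two nodes of the same class
    degree_sum = {}                 # total degree of each class
    for (a, b), count in block_edges.items():
        # Every edge contributes one edge end to each of its two classes.
        degree_sum[a] = degree_sum.get(a, 0) + count
        degree_sum[b] = degree_sum.get(b, 0) + count
        if a == b:
            same += count
    observed = same / m
    # Degree-preserving expectation: sum over classes of (K_c / 2m)^2.
    expected = sum(k * k for k in degree_sum.values()) / (2 * m) ** 2
    return observed, expected, observed - expected


# Block counts assumed from the block-model slide.
blocks = {("F", "F"): 3, ("F", "M"): 5, ("M", "M"): 10}
observed, expected, q_score = modularity_from_block_counts(blocks)
print(f"{observed:.4f} {expected:.4f} {q_score:.4f}")  # 0.7222 0.5756 0.1466
```

Under these assumed counts the sketch reproduces the slide's observed and expected fractions, and its modularity score, to within rounding.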

Making Sociology Relevant: What do we want to say?
A few empirical facts: some racially heterogeneous schools are socially segregated, while other heterogeneous schools are socially integrated. Why?

Assortative Mixing by Scalar Characteristics

How do we compute/visualize it with NetworkX and Python?

Assortativity/Homophily in Gephi
See the Gephi layouts tutorial (the inner circle is the hub node): https://gephi.org/tutorials/gephi-tutorial-layouts.pdf

Assortativity/Homophily in Python
In NetworkX, to check degree assortativity (the categories are the degrees rather than gender):
    assortativity = nx.degree_assortativity_coefficient(G)
To check an attribute’s assortativity (the attribute "gender" can be replaced by any other attribute that your data is tagged with):
    assortativity = nx.attribute_assortativity_coefficient(G, "gender")
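To see both calls end to end, here is a small self-contained sketch. The six-node graph and its gender labels are hypothetical (two triangles joined by a single mixed-gender edge), chosen only so that the attribute is clearly assortative; it assumes NetworkX is installed along with NumPy, which these functions rely on.

```python
import networkx as nx

# Hypothetical toy graph: two same-gender triangles joined by one mixed edge.
G = nx.Graph()
G.add_edges_from([
    ("a", "b"), ("b", "c"), ("a", "c"),   # all female
    ("d", "e"), ("e", "f"), ("d", "f"),   # all male
    ("c", "d"),                           # the only mixed-gender tie
])

# Tag every node with the attribute that the coefficient will group by.
nx.set_node_attributes(
    G,
    {"a": "F", "b": "F", "c": "F", "d": "M", "e": "M", "f": "M"},
    "gender",
)

# Degree assortativity: do high-degree nodes link to other high-degree nodes?
degree_r = nx.degree_assortativity_coefficient(G)

# Attribute assortativity: do nodes link to others with the same gender?
gender_r = nx.attribute_assortativity_coefficient(G, "gender")

print("degree assortativity:", degree_r)
print("gender assortativity:", gender_r)
```

The attribute coefficient is the modularity-style difference between observed and expected same-class ties, rescaled so that a perfectly assortative network scores 1. For this toy graph that is (6/7 − 1/2) / (1 − 1/2) = 5/7.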