A measure of betweenness centrality based on random walks Author: M. E. J. Newman Presented by: Amruta Hingane Department of Computer Science Kent State.

Presentation transcript:

A measure of betweenness centrality based on random walks Author: M. E. J. Newman Presented by: Amruta Hingane Department of Computer Science Kent State University

Overview: Introduction; Centrality Measures; Types of Betweenness; Random-walk Betweenness (a current flow analogy, random walks); Comparison of Different Betweenness Measures; Correlation With Other Measures; Examples; Applications

Centrality Measures Degree: the simplest centrality measure. The number of edges incident on a vertex in a network, or, in social network parlance, the number of ties an actor has. A measure, in some sense, of the popularity of an actor. [Figure: vertices a, b, c. Degree of b = ?]
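As a small illustration of the degree definition, here is a minimal sketch. The slide's figure did not survive extraction, so the three-vertex path a, b, c used below is an assumption, as are the names `adjacency` and `degree`.

```python
# Hypothetical graph: a path a - b - c (the original figure is not available).
adjacency = {
    "a": ["b"],
    "b": ["a", "c"],
    "c": ["b"],
}


def degree(graph, vertex):
    """Number of edges incident on `vertex` in an undirected simple graph."""
    return len(graph[vertex])


degrees = {v: degree(adjacency, v) for v in adjacency}
```

Under this assumed graph, b has two ties and the end vertices have one each.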

Centrality Measures Continued… Closeness: the mean geodesic (shortest-path) distance between a vertex and all other vertices reachable from it; a measure of how long it will take information to spread from a given vertex to others in the network. Betweenness: a measure of the extent to which a vertex lies on the paths between others.
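The closeness definition above can be sketched with a breadth-first search. This is only an illustration for unweighted, undirected graphs; the function names and the five-vertex path graph are assumptions, not part of the presentation.

```python
from collections import deque


def bfs_distances(graph, source):
    """Geodesic (hop-count) distance from `source` to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in graph[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def mean_geodesic_distance(graph, source):
    """Closeness as defined on the slide: mean distance to reachable vertices."""
    dist = bfs_distances(graph, source)
    others = [d for v, d in dist.items() if v != source]
    return sum(others) / len(others) if others else 0.0


# Hypothetical example: the path 0 - 1 - 2 - 3 - 4.
path_graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
closeness_middle = mean_geodesic_distance(path_graph, 2)
closeness_end = mean_geodesic_distance(path_graph, 0)
```

With this definition a smaller value means a more central vertex, so the middle of the path scores better than either end.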

Betweenness The betweenness of a vertex i is defined to be the fraction of shortest paths between pairs of vertices in a network that pass through i: $b_i = \frac{\sum_{s<t} g_i^{(st)} / n_{st}}{\frac{1}{2} n(n-1)}$, where n = total number of vertices in the network, $g_i^{(st)}$ = number of geodesic paths from vertex s to vertex t that pass through i, and $n_{st}$ = total number of geodesic paths from s to t.

Shortest-path Betweenness A measure of the extent to which an actor has control over information flowing between others. In a network in which flow is entirely or at least mostly along geodesic paths, the betweenness of a vertex measures how much flow will pass through that particular vertex. Betweenness can be calculated for all vertices in time $O(mn)$, where m = number of edges and n = number of vertices.
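The betweenness formula can be evaluated directly by counting geodesic paths. The sketch below is a straightforward path-counting version, not the $O(mn)$ algorithm mentioned on the slide, and it assumes that the endpoints s and t count as lying on their own paths (an assumption chosen to match the convention $I_s = I_t = 1$ used later for current flow). All function names and the example graphs are hypothetical.

```python
from collections import deque
from itertools import combinations


def bfs_counts(graph, source):
    """Distances from `source` and the number of geodesic paths to each vertex."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in graph[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                sigma[w] = 0
                queue.append(w)
            if dist[w] == dist[u] + 1:
                sigma[w] += sigma[u]
    return dist, sigma


def shortest_path_betweenness(graph):
    """b_i = sum over s<t of g_i(st) / n_st, divided by n(n-1)/2."""
    nodes = list(graph)
    n = len(nodes)
    info = {v: bfs_counts(graph, v) for v in nodes}
    total = {v: 0.0 for v in nodes}
    for s, t in combinations(nodes, 2):
        dist_s, sigma_s = info[s]
        dist_t, sigma_t = info[t]
        if t not in dist_s:
            continue  # s and t are in different components
        for i in nodes:
            on_geodesic = (
                i in dist_s and i in dist_t
                and dist_s[i] + dist_t[i] == dist_s[t]
            )
            if on_geodesic:
                # g_i(st) = (paths s->i) * (paths i->t); n_st = sigma_s[t]
                total[i] += sigma_s[i] * sigma_t[i] / sigma_s[t]
    pairs = n * (n - 1) / 2
    return {v: total[v] / pairs for v in nodes}


# Hypothetical examples: a 3-vertex path and a 4-cycle.
path3 = {0: [1], 1: [0, 2], 2: [1]}
cycle4 = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
b_path = shortest_path_betweenness(path3)
b_cycle = shortest_path_betweenness(cycle4)
```

In the 4-cycle, opposite vertices are joined by two geodesics, so each intermediate vertex receives a contribution of one half from that pair.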

Example of Shortest-path Betweenness Vertices A and B will have high (shortest-path) betweenness in this configuration, while vertex C will not.

Drawbacks of Shortest-path Betweenness Does information flow only along geodesic paths? A piece of news, a rumor, a fad, or a message does not know the ideal route to get from one place to another; more likely it wanders around more randomly, encountering who it will. Certainly it is possible for information to flow between two individuals via a third mutual acquaintance, even when the two individuals in question are themselves well acquainted. A realistic betweenness measure should include non-geodesic paths in addition to geodesic ones.

Flow betweenness Flow betweenness of a vertex i is defined as the amount of flow through vertex i when the maximum flow is transmitted from s to t, averaged over all s and t. Flow betweenness can be thought of as measuring the betweenness of vertices in a network in which a maximal amount of information is continuously pumped between all sources and targets. Maximum flow from a given s to all reachable targets t can be calculated in worst-case time $O(m^2)$, and hence the flow betweenness for all vertices can be calculated in time $O(m^2 n)$.

Example of Flow Betweenness While calculating flow betweenness, vertices A and B will get high scores while vertex C will not

Drawbacks of Flow Betweenness Does information “know” the ideal route (or one of the ideal routes) from each source to each target, in order to realize the maximum flow? Although flow betweenness does take account of paths other than the shortest path, this still seems unrealistic for many practical situations. Flow betweenness suffers from some of the same drawbacks as shortest-path betweenness, in that real flow often does not take any sort of ideal path from source to target, be it the shortest path, the maximum flow path, or another kind of ideal path.

Random Walk Betweenness Random-walk betweenness of a vertex i is equal to the number of times that a random walk starting at s and ending at t passes through i along the way, averaged over all s and t. Random-walk betweenness can be calculated for all vertices in a network in worst-case time $O((m + n) n^2)$ using matrix methods. This measure is appropriate to a network in which information wanders about essentially at random until it finds its target.

Basic Matrix Notations Adjacency Matrices

Basic Matrix Notations Continued… Rank of a matrix A = maximal number of linearly independent rows (or columns). A square n × n matrix A is invertible if and only if rank A = n. Eigenvalue equation: Ax = λx, where A = matrix, x = non-zero eigenvector, λ = eigenvalue. Product of the eigenvalues of a matrix = determinant of the matrix. Singular matrix = determinant is zero = matrix is non-invertible.

Current Flow Analogy Current-flow betweenness of a vertex i is the amount of current that flows through i, averaged over all source and target points. Kirchhoff's current law: the net current flowing into or out of any vertex is zero (for example, $i = i_1 + i_2$ at a junction). Current along an edge: $i_{xy} = [V(x) - V(y)] / R_{xy}$. Injected current = 1 unit at the source, extracted current = 1 unit at the target, resistance = 1 unit on every edge.

Current flow betweenness By Kirchhoff's law of current conservation, the voltages satisfy $\sum_j A_{ij} (V_i - V_j) = \delta_{is} - \delta_{it}$, where $A_{ij}$ is an element of the adjacency matrix ($A_{ij} = 1$ if there is an edge between i and j, 0 otherwise) and $\delta_{ij}$ is the Kronecker delta ($\delta_{ij} = 1$ if i = j, 0 otherwise).

Current flow betweenness Continued… Since $\sum_j A_{ij} = k_i$, the degree of vertex i, the voltage equation becomes $(D - A) \cdot V = s$, where D is the diagonal matrix with elements $D_{ii} = k_i$ and s is the source vector with elements $s_i = +1$ for i = s, $-1$ for i = t, and 0 otherwise.
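Every row of D − A sums to zero, which makes the matrix singular, while deleting one row and the matching column leaves an invertible matrix for a connected graph. The following sketch checks this on a small example using exact rational arithmetic; the helper names `laplacian` and `determinant` and the three-vertex path graph are assumptions for illustration.

```python
from fractions import Fraction


def laplacian(adj):
    """D - A for an undirected graph given as a 0/1 adjacency matrix."""
    n = len(adj)
    return [
        [(sum(adj[i]) if i == j else 0) - adj[i][j] for j in range(n)]
        for i in range(n)
    ]


def determinant(matrix):
    """Determinant by Gaussian elimination over exact fractions."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n = len(a)
    det = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)  # no pivot in this column: singular
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


# Hypothetical example: the path 0 - 1 - 2.
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
lap = laplacian(adj)
reduced = [row[:-1] for row in lap[:-1]]  # drop the last row and column
det_full = determinant(lap)
det_reduced = determinant(reduced)
```

For this path graph the full matrix has determinant zero and the reduced matrix does not, so only the reduced system can be solved by inversion.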

Current flow betweenness Continued… To obtain V we cannot simply invert D − A, because it is singular. Instead, fix $V_v = 0$ for one vertex v, so that voltages are measured relative to that vertex, and remove the v-th row and v-th column of D − A. The result, $D_v - A_v$, is an $(n-1) \times (n-1)$ square matrix that is invertible for a connected network, giving $V = (D_v - A_v)^{-1} \cdot s$. Add the row and column for the missing vertex v back in, with all entries equal to zero, and call the resulting matrix T. The voltage at i is then $V_i^{(st)} = T_{is} - T_{it}$.
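A minimal sketch of this construction follows, assuming a connected, unweighted graph and choosing the last vertex as the reference vertex v. The names `invert`, `potential_matrix`, and `voltages`, and the four-vertex example graph, are hypothetical.

```python
from fractions import Fraction


def invert(matrix):
    """Gauss-Jordan inverse over exact fractions (assumes it is invertible)."""
    n = len(matrix)
    aug = [
        [Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
        for i, row in enumerate(matrix)
    ]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def potential_matrix(adj):
    """T: inverse of D - A without its last row/column, padded with zeros."""
    n = len(adj)
    lap = [
        [(sum(adj[i]) if i == j else 0) - adj[i][j] for j in range(n)]
        for i in range(n)
    ]
    inv = invert([row[:n - 1] for row in lap[:n - 1]])
    return [row + [Fraction(0)] for row in inv] + [[Fraction(0)] * n]


def voltages(T, s, t):
    """V_i(st) = T_is - T_it for every vertex i."""
    return [T[i][s] - T[i][t] for i in range(len(T))]


# Hypothetical example: triangle 0-1-2 with a pendant vertex 3 attached to 2.
adj = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]
T = potential_matrix(adj)
v_03 = voltages(T, 0, 3)  # unit current injected at 0, extracted at 3
```

Choosing a different reference vertex shifts every voltage by the same constant, which leaves the voltage differences, and hence the currents, unchanged.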

Current flow betweenness Continued… Current through i = half of the sum of the absolute values of the currents flowing along the edges incident on that vertex: $I_i^{(st)} = \frac{1}{2} \sum_j A_{ij} |V_i^{(st)} - V_j^{(st)}| = \frac{1}{2} \sum_j A_{ij} |T_{is} - T_{it} - T_{js} + T_{jt}|$ for $i \neq s, t$, with $I_s^{(st)} = 1$ and $I_t^{(st)} = 1$. Betweenness = average of the current flow over all $\frac{1}{2} n(n-1)$ source-target pairs: $b_i = \frac{\sum_{s<t} I_i^{(st)}}{\frac{1}{2} n(n-1)}$. * For a graph with more than one component, betweenness is calculated separately for each component.

Time of Calculation Inversion of the matrix takes $O(n^3)$. Evaluating the betweenness equation takes $O(mn)$ for each vertex, or $O(mn^2)$ for all of them. Total running time to calculate current-flow betweenness for all vertices = $O((m+n) n^2)$, or $O(n^3)$ for a sparse graph.

Random Walks [Figure: a message m wandering at random through the network from source s to target t.]

Random-walk Betweenness Definition: for a given source s and target t, count the number of times a message passes through vertex i on its journey, averaged over a large number of trials of the random walk. This value, averaged over all possible source/target pairs s, t, is the random-walk betweenness of i. Only the net number of times a walk passes through i is counted, so a walk that passes back and forth through a vertex does not inflate its score.

Calculating Random-walk Betweenness Consider an absorbing random walk: a walk that starts at vertex s and makes random moves around the network until it finds itself at vertex t, and then stops. If at some point in this walk we find ourselves at vertex j, then the probability that we will find ourselves at i on the next step is given by the matrix element $M_{ij} = A_{ij} / k_j$ for $j \neq t$.
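An absorbing random walk of this kind is easy to simulate. The sketch below assumes a connected, unweighted graph stored as adjacency lists, so that each neighbour of the current vertex j is chosen with probability $1/k_j$; the function name, the example graph, and the seed are illustrative choices.

```python
import random


def absorbing_random_walk(graph, s, t, rng):
    """Walk from s, moving to a uniformly random neighbour until t is reached."""
    path = [s]
    current = s
    while current != t:
        current = rng.choice(graph[current])
        path.append(current)
    return path


# Hypothetical example: triangle 0-1-2 with a pendant vertex 3 attached to 2.
graph = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
rng = random.Random(0)  # fixed seed so the run is repeatable
walk = absorbing_random_walk(graph, 0, 3, rng)
```

Each call returns the full sequence of vertices visited, so repeated calls can be used to estimate how often a walk passes through a given vertex.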

Calculating Random-walk Betweenness Continued… $A_{ij}$ is an element of the adjacency matrix and $k_j = \sum_i A_{ij}$ is the degree of vertex j. In matrix notation, $M = A \cdot D^{-1}$, where D is the diagonal matrix with elements $D_{ii} = k_i$. Because the walk is absorbed at t, $M_{it} = 0$ for all i. Removing row and column t gives $M_t = A_t \cdot D_t^{-1}$.

Calculating Random-walk Betweenness Continued… For a walk starting at s, the probability of being at vertex j after r steps is $[M_t^r]_{js}$, and the probability that the next step is then taken to an adjacent vertex i is $k_j^{-1} [M_t^r]_{js}$. Summing over all values of r from 0 to ∞ gives $k_j^{-1} [(I - M_t)^{-1}]_{js}$, the total number of times we go from j to i, averaged over all possible walks. In matrix notation this is an element of the vector $V = D_t^{-1} \cdot (I - M_t)^{-1} \cdot s = (D_t - A_t)^{-1} \cdot s$.
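The geometric-series step can be checked numerically by summing powers of $M_t$ directly. This sketch truncates the infinite sum at a fixed number of terms, which is an approximation that works because the powers of $M_t$ shrink rapidly on a small connected graph; the function names, the truncation length, and the path-graph example are assumptions.

```python
def mat_mul(a, b):
    """Plain matrix product of two square lists-of-lists."""
    size = len(a)
    return [
        [sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
        for i in range(size)
    ]


def expected_visits(adj, t, terms=200):
    """Approximate sum_{r>=0} M_t^r, i.e. (I - M_t)^{-1}, by a truncated series."""
    n = len(adj)
    keep = [v for v in range(n) if v != t]      # drop row and column t
    degree = [sum(row) for row in adj]          # full degrees k_j
    m_t = [[adj[i][j] / degree[j] for j in keep] for i in keep]
    size = len(keep)
    power = [[float(i == j) for j in range(size)] for i in range(size)]
    total = [[0.0] * size for _ in range(size)]
    for _ in range(terms):
        total = [
            [total[i][j] + power[i][j] for j in range(size)]
            for i in range(size)
        ]
        power = mat_mul(m_t, power)
    return keep, degree, total


# Hypothetical example: the path 0 - 1 - 2 with target t = 2 and source s = 0.
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
keep, degree, visits = expected_visits(adj, t=2)
s_index = keep.index(0)
# V_j = k_j^{-1} [(I - M_t)^{-1}]_{js}
V = [visits[row][s_index] / degree[j] for row, j in enumerate(keep)]
```

For this example the result agrees with solving $(D_t - A_t) V = s$ by hand, where $D_t - A_t$ is the 2 × 2 matrix with rows (1, −1) and (−1, 2).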

Calculating Random-walk Betweenness Continued… $|V_i - V_j|$ is the net flow of the random walk along the edge from j to i. Betweenness: $b_i = \frac{\sum_{s<t} I_i^{(st)}}{\frac{1}{2} n(n-1)}$, the net flow of random walks through vertex i, averaged over all source/target pairs.

Summarization (1) Construct the matrix D − A, where D is the diagonal matrix of vertex degrees and A is the adjacency matrix. (2) Remove any single row and the corresponding column; for example, one could remove the last row and column. (3) Invert the resulting matrix, then add back in a new row and column consisting of all zeros in the position from which the row and column were previously removed (e.g., the last row and column). Call the resulting matrix T, with elements $T_{ij}$. (4) Calculate the betweenness $b_i$ from its defining equation, using the values of $I_i^{(st)}$ obtained from T.
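The four steps can be sketched end to end as follows. This is an illustrative implementation under stated assumptions, not the presenter's code: it assumes a connected, unweighted, undirected graph given as a 0/1 adjacency matrix, removes the last row and column, uses exact fractions, and makes no attempt to reach the running time quoted earlier. The names `invert` and `random_walk_betweenness` and the example graph are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def invert(matrix):
    """Gauss-Jordan inverse over exact fractions (assumes it is invertible)."""
    n = len(matrix)
    aug = [
        [Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
        for i, row in enumerate(matrix)
    ]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def random_walk_betweenness(adj):
    n = len(adj)
    # Step 1: construct D - A.
    lap = [
        [(sum(adj[i]) if i == j else 0) - adj[i][j] for j in range(n)]
        for i in range(n)
    ]
    # Steps 2 and 3: remove the last row/column, invert, pad with zeros.
    inv = invert([row[:n - 1] for row in lap[:n - 1]])
    T = [row + [Fraction(0)] for row in inv] + [[Fraction(0)] * n]
    # Step 4: accumulate I_i(st) over all pairs s < t and average.
    total = [Fraction(0)] * n
    for s, t in combinations(range(n), 2):
        for i in range(n):
            if i == s or i == t:
                total[i] += 1  # I_s(st) = I_t(st) = 1
            else:
                flow = sum(
                    adj[i][j] * abs(T[i][s] - T[i][t] - T[j][s] + T[j][t])
                    for j in range(n)
                )
                total[i] += flow / 2
    pairs = Fraction(n * (n - 1), 2)
    return [x / pairs for x in total]


# Hypothetical example: triangle 0-1-2 with a pendant vertex 3 attached to 2.
kite = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]
scores = random_walk_betweenness(kite)
```

In this example vertex 1 lies on no geodesic between vertices 0 and 3, yet it still carries part of the flow between them through the triangle, which is the kind of contribution that shortest-path betweenness ignores.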

Comparison In each network, we intuitively expect vertex C to have betweenness lower than that of vertices A and B, but higher than that of vertices X and Y.

Comparison Continued… Shortest path betweenness fails to give a higher score to vertex C in the first network than to any of the other vertices within the two communities, while flow betweenness has the same problem with vertex C in the second network. Random-walk measure orders the vertices correctly in each case.

Correlation With Other Measures Shortest-path betweenness is known to be strongly correlated with vertex degree in most networks. If the two are strongly correlated, then why calculate betweenness, when degree is almost the same and much easier to calculate? Because there are usually a small number of vertices in a network for which betweenness and degree are very different, and betweenness is needed to identify these vertices.

Example on Correlation With Other Measures Vertices with higher degree or higher shortest-path betweenness tend also to have higher random-walk betweenness. This correlation, however, misses the real point of interest: there are a few vertices that have random-walk betweenness values quite different from their scores on the other two measures.

Example continued… The largest component of a network of sexual contacts between high-risk actors in the city of Colorado Springs. The size of the vertices increases linearly with their random-walk betweenness. The highlighted vertices are those for which the random-walk betweenness is substantially greater than shortest-path betweenness (a factor of two or more).

Example Applications Example 1: The network of intermarriage relations between the 15th century Florentine families.

Ranking on random-walk betweenness Medici come out well ahead of the competition, and they easily best their arch-rivals, the Strozzi. It is suggested that it was in part the Medici’s skillful manipulation of this marriage network that led to their eventual dominance of the Florentine political landscape.

Example 2: The largest component of the co-authorship network of scientists working on networks. Increasing size of vertices: score on the random-walk betweenness measure. A: brokers who establish connections between different groups. B: vertices lying on paths where there are two (or more) paths to an outlying group of vertices.

Thank You