A measure of betweenness centrality based on random walks Author: M. E. J. Newman Presented by: Amruta Hingane Department of Computer Science Kent State University

Overview
Introduction
Centrality Measures
Types of Betweenness
Random-walk Betweenness
– A current flow analogy
– Random walks
Comparison of Different Betweenness Measures
Correlation With Other Measures
Examples
Applications

Centrality Measures
Degree: the simplest centrality measure.
The number of edges incident on a vertex in a network.
In social network parlance, the number of ties an actor has.
In some sense, a measure of the popularity of an actor.
[Figure: example network with vertices a, b, c] Degree of b = ?
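The slide's figure is not recoverable, so the following is only a minimal sketch under an assumption: the three vertices form a path a - b - c. Under that assumption, the degree is the row sum of the adjacency matrix.

```python
import numpy as np

# ASSUMPTION: the example network is the path a - b - c.
# The original figure is lost, so this edge list is hypothetical.
labels = ["a", "b", "c"]
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]])

# Degree of a vertex = number of incident edges = row sum of A.
degree = A.sum(axis=1)
degree_by_label = dict(zip(labels, degree.tolist()))
print(degree_by_label)
```

With this assumed edge list, b has two ties and a and c have one each.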

Centrality Measures Continued…
Closeness: the mean geodesic (shortest-path) distance between a vertex and all other vertices reachable from it.
A measure of how long it will take information to spread from a given vertex to the others in the network.
Betweenness: a measure of the extent to which a vertex lies on the paths between others.
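The closeness definition above can be sketched with a breadth-first search on an unweighted graph. The function name `mean_geodesic_distance` and the three-vertex path used as input are illustrative assumptions, not part of the slides.

```python
from collections import deque

def mean_geodesic_distance(adj, source):
    """Mean shortest-path distance from `source` to every other reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:          # first visit gives the geodesic distance
                dist[w] = dist[u] + 1
                queue.append(w)
    others = [d for v, d in dist.items() if v != source]
    return sum(others) / len(others) if others else 0.0

# Hypothetical path graph a - b - c, stored as adjacency lists.
adj = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
closeness = {v: mean_geodesic_distance(adj, v) for v in adj}
print(closeness)
```

A lower value means the vertex is closer, on average, to the rest of the network. Here the middle vertex b has the smallest mean distance.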

Betweenness
The betweenness of a vertex i is defined to be the fraction of shortest paths between pairs of vertices in a network that pass through i.
$$b_i = \frac{\sum_{s<t} g_i^{(st)} / n_{st}}{\tfrac{1}{2} n(n-1)}$$
n = total number of vertices in the network
$g_i^{(st)}$ = number of geodesic paths from vertex s to vertex t that pass through i
$n_{st}$ = total number of geodesic paths from s to t

Shortest-path Betweenness
A measure of the extent to which an actor has control over information flowing between others.
In a network in which flow is entirely, or at least mostly, along geodesic paths, the betweenness of a vertex measures how much flow will pass through that vertex.
Betweenness can be calculated for all vertices in time O(mn), where m is the number of edges and n is the number of vertices.
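The slides do not say which algorithm achieves the O(mn) bound. One standard choice for unweighted graphs is Brandes' algorithm, sketched below. Two assumptions are made here: the end points s and t are not counted as lying "between" themselves, and the result is normalised by the ½ n(n − 1) vertex pairs. The function name and test graph are hypothetical.

```python
from collections import deque

def shortest_path_betweenness(adj):
    """Brandes-style betweenness for an unweighted, undirected graph.

    ASSUMPTION: end points are excluded, and the sum over s < t is
    divided by the n(n-1)/2 vertex pairs.
    """
    n = len(adj)
    total = {v: 0.0 for v in adj}
    for s in adj:
        # BFS from s: count shortest paths (sigma) and record predecessors.
        sigma = {v: 0 for v in adj}
        sigma[s] = 1
        dist = {s: 0}
        preds = {v: [] for v in adj}
        order = []
        queue = deque([s])
        while queue:
            u = queue.popleft()
            order.append(u)
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
                if dist[w] == dist[u] + 1:
                    sigma[w] += sigma[u]
                    preds[w].append(u)
        # Accumulate pair dependencies from the farthest vertices inwards.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for u in preds[w]:
                delta[u] += sigma[u] / sigma[w] * (1.0 + delta[w])
            if w != s:
                total[w] += delta[w]
    pairs = n * (n - 1) / 2
    # Each unordered pair {s, t} was visited twice (once from each end).
    return {v: (x / 2.0) / pairs for v, x in total.items()}

adj = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
sp_betweenness = shortest_path_betweenness(adj)
print(sp_betweenness)
```

Each of the n breadth-first searches costs O(m), which is where the O(mn) total comes from.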

Example of Shortest-path Betweenness
[Figure: shortest-path betweenness]
Vertices A and B will have high (shortest-path) betweenness in this configuration, while vertex C will not.

Drawbacks of Shortest-path Betweenness
Does information flow only along geodesic paths?
News, a rumor, a fad, a message: does it know the ideal route to get from one place to another? More likely, a message wanders around more randomly, encountering whom it will.
Information can certainly flow between two individuals via a third mutual acquaintance, even when the two individuals in question are themselves well acquainted.
A realistic betweenness measure should include non-geodesic paths in addition to geodesic ones.

Flow betweenness
The flow betweenness of a vertex i is defined as the amount of flow through vertex i when the maximum flow is transmitted from s to t, averaged over all s and t.
Flow betweenness can be thought of as measuring the betweenness of vertices in a network in which a maximal amount of information is continuously pumped between all sources and targets.
The maximum flow from a given s to all reachable targets t can be calculated in worst-case time $O(m^2)$, and hence the flow betweenness for all vertices can be calculated in time $O(m^2 n)$.

Example of Flow Betweenness
When flow betweenness is calculated, vertices A and B will get high scores while vertex C will not.

Drawbacks of Flow Betweenness
Does information “know” the ideal route (or one of the ideal routes) from each source to each target, in order to realize the maximum flow?
Although flow betweenness does take account of paths other than the shortest path, this still seems unrealistic for many practical situations.
Flow betweenness suffers from some of the same drawbacks as shortest-path betweenness: in practice, flow often does not take any sort of ideal path from source to target, be it the shortest path, the maximum-flow path, or another kind of ideal path.

Random Walk Betweenness
The random-walk betweenness of a vertex i is equal to the number of times that a random walk starting at s and ending at t passes through i along the way, averaged over all s and t.
Random-walk betweenness can be calculated for all vertices in a network in worst-case time $O((m + n)n^2)$ using matrix methods.
This measure is appropriate to a network in which information wanders about essentially at random until it finds its target.

Basic Matrix Notations
[Figure: example graphs on vertices 1 to 4 and their adjacency matrices]

Basic Matrix Notations Continued…
Rank of a matrix A = maximal number of linearly independent rows (or columns).
A square n × n matrix A is invertible if and only if rank A = n.
Product of the eigenvalues of a matrix = determinant of the matrix.
Ax = λx, where A is the matrix, x is a non-zero eigenvector, and λ is the corresponding eigenvalue.
Singular matrix: the determinant is zero, so the matrix is non-invertible.
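These facts can be checked numerically. The sketch below uses NumPy on two small matrices chosen purely for illustration: one invertible and one singular.

```python
import numpy as np

# Illustrative symmetric matrix with eigenvalues 1 and 3.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
eigenvalues = np.linalg.eigvalsh(A)      # eigvalsh: eigenvalues of a symmetric matrix
det_A = np.linalg.det(A)
rank_A = np.linalg.matrix_rank(A)

# Illustrative singular matrix: the second row is twice the first.
S = np.array([[1.0, 2.0],
              [2.0, 4.0]])
det_S = np.linalg.det(S)
rank_S = np.linalg.matrix_rank(S)

print("product of eigenvalues:", np.prod(eigenvalues), "determinant:", det_A)
print("rank A:", rank_A, "rank S:", rank_S, "det S:", det_S)
```

A has full rank and a non-zero determinant, so it is invertible. S has rank 1 and zero determinant, so it is singular.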

Current Flow Analogy
The current-flow betweenness of a vertex i is the amount of current that flows through i, averaged over all source and target points.
Kirchhoff's current law: the total current flowing into any vertex equals the total current flowing out of it, so the net current at any vertex is zero, e.g. $i = i_1 + i_2$.
Ohm's law for the edge between x and y: $i_{xy} = [V(x) - V(y)] / R_{xy}$
Injected current = 1 unit
Extracted current = 1 unit
Resistance = 1 unit

Current flow betweenness
By Kirchhoff's law of current conservation, the voltages satisfy:
$$\sum_j A_{ij} (V_i - V_j) = \delta_{is} - \delta_{it}$$
$A_{ij}$ is an element of the adjacency matrix: $A_{ij} = 1$ if there is an edge between i and j, and 0 otherwise.
$\delta_{ij}$ is the Kronecker δ: $\delta_{ij} = 1$ if i = j, and 0 otherwise.

Current flow betweenness Continued…
Since $\sum_j A_{ij} = k_i$, the degree of vertex i, Kirchhoff's equation for the voltages can be written as
$$(D - A) \cdot V = s$$
D = diagonal matrix with elements $D_{ii} = k_i$
s = source vector with elements $s_i = +1$ for i = s, $-1$ for i = t, and 0 otherwise
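A minimal sketch of these objects in NumPy follows. It assumes a hypothetical four-vertex "diamond" graph (edges 0-1, 0-2, 1-2, 1-3, 2-3) that is not taken from the slides, and the helper name `source_vector` is likewise illustrative. It also shows why D − A cannot be inverted directly: every row sums to zero.

```python
import numpy as np

# ASSUMPTION: hypothetical diamond graph with edges 0-1, 0-2, 1-2, 1-3, 2-3.
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 1],
              [1, 1, 0, 1],
              [0, 1, 1, 0]], dtype=float)
n = A.shape[0]

k = A.sum(axis=1)        # k_i = sum_j A_ij, the vertex degrees
D = np.diag(k)           # diagonal degree matrix
L = D - A                # the matrix D - A

def source_vector(n, s, t):
    """Vector with +1 at the source s, -1 at the target t, and 0 elsewhere."""
    vec = np.zeros(n)
    vec[s] = 1.0
    vec[t] = -1.0
    return vec

s_vec = source_vector(n, s=0, t=3)

row_sums = L.sum(axis=1)             # every row of D - A sums to zero
rank_L = np.linalg.matrix_rank(L)    # n - 1 for a connected graph
print(row_sums, rank_L)
```

Because the rank is n − 1 rather than n, one row and column have to be removed before inverting.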

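Current flow betweenness Continued…
To obtain V we cannot simply invert the matrix D − A, because it is singular.
The n equations are not independent, so any one of them can be dropped: remove the v-th row of D − A.
Voltage is only defined relative to a reference, so measure it with respect to vertex v and set $V_v = 0$; this allows the v-th column to be removed as well.
What remains is $D_v - A_v$, a square $(n-1) \times (n-1)$ matrix, and
$$V = (D_v - A_v)^{-1} \cdot s$$
Add back the missing row and column for vertex v with all values equal to zero, and call the resulting matrix T.
Voltage at i: $V_i^{(st)} = T_{is} - T_{it}$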
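The construction of T can be sketched as follows. The function name `grounded_inverse` and the four-vertex diamond graph are assumptions made for illustration. The check at the end confirms that the recovered voltages satisfy the full system (D − A)·V = s, including the row that was removed.

```python
import numpy as np

def grounded_inverse(A, v=-1):
    """Invert D - A with row/column v removed, then pad row/column v with zeros."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    v = v % n                                   # allow negative indices such as -1
    L = np.diag(A.sum(axis=1)) - A
    keep = [i for i in range(n) if i != v]
    T = np.zeros((n, n))
    T[np.ix_(keep, keep)] = np.linalg.inv(L[np.ix_(keep, keep)])
    return T

# ASSUMPTION: hypothetical diamond graph with edges 0-1, 0-2, 1-2, 1-3, 2-3.
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 1],
              [1, 1, 0, 1],
              [0, 1, 1, 0]], dtype=float)
L = np.diag(A.sum(axis=1)) - A

T = grounded_inverse(A)          # vertex 3 is the voltage reference
s, t = 0, 3
V = T[:, s] - T[:, t]            # V_i = T_is - T_it

residual = L @ V                 # should equal +1 at s, -1 at t, 0 elsewhere
print(V, residual)
```

Choosing a different reference vertex shifts every voltage by the same constant. The voltage differences, and therefore the currents, are unchanged.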

Current flow betweenness Continued…
Current through i = half of the sum of the absolute values of the currents flowing along the edges incident on that vertex:
$$I_i^{(st)} = \tfrac{1}{2} \sum_j A_{ij} \left| V_i^{(st)} - V_j^{(st)} \right| = \tfrac{1}{2} \sum_j A_{ij} \left| T_{is} - T_{it} - T_{js} + T_{jt} \right| \quad \text{for } i \neq s, t$$
$$I_s^{(st)} = 1, \qquad I_t^{(st)} = 1$$
Betweenness is the average of the current flow over all source-target pairs:
$$b_i = \frac{\sum_{s<t} I_i^{(st)}}{\tfrac{1}{2} n(n-1)}$$
* For a graph with more than one component, the betweenness is calculated separately for each component.
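As a hand check of these formulas, consider a hypothetical three-vertex path a - b - c (an illustrative graph, not one from the slides), with vertex c as the voltage reference:

$$D - A = \begin{pmatrix} 1 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 1 \end{pmatrix}, \qquad (D_c - A_c)^{-1} = \begin{pmatrix} 1 & -1 \\ -1 & 2 \end{pmatrix}^{-1} = \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix}, \qquad T = \begin{pmatrix} 2 & 1 & 0 \\ 1 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix}$$

For s = a, t = c the voltages are $V = (2, 1, 0)$, so $I_b^{(ac)} = \tfrac{1}{2}(|1 - 2| + |1 - 0|) = 1$. For s = b, t = c the voltages are $V = (1, 1, 0)$, so $I_a^{(bc)} = \tfrac{1}{2}|1 - 1| = 0$. For s = a, t = b the voltages are $V = (1, 0, 0)$, so $I_c^{(ab)} = \tfrac{1}{2}|0 - 0| = 0$. With $I_s = I_t = 1$ for the end points and $\tfrac{1}{2} n(n-1) = 3$ pairs:

$$b_a = \frac{1 + 1 + 0}{3} = \frac{2}{3}, \qquad b_b = \frac{1 + 1 + 1}{3} = 1, \qquad b_c = \frac{0 + 1 + 1}{3} = \frac{2}{3}$$

No current flows into a dead end, which is why a and c score only through the pairs in which they are themselves the source or target.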

Time of Calculation
Inversion of the matrix takes $O(n^3)$.
Evaluating the betweenness equation takes O(mn) for each vertex, or $O(mn^2)$ for all of them.
Total running time to calculate current-flow betweenness for all vertices = $O((m+n)n^2)$, or $O(n^3)$ for a sparse graph.

Random Walks
[Figure: a message m wandering at random from source s to target t; s = source, t = target, m = message]

Random-walk Betweenness
Definition: the betweenness measure for a vertex i is the number of times a message passes through i on its journey from s to t, averaged over a large number of trials of the random walk. This value, averaged over all possible source/target pairs s, t, is the random-walk betweenness.
The betweenness of vertex i counts the net number of times a walk passes through i: if a walk passes through a vertex and later passes back through it in the opposite direction, the two passes cancel out.

Calculating Random-walk Betweenness
Consider an absorbing random walk: a walk that starts at vertex s and makes random moves around the network until it finds itself at vertex t, and then stops.
If at some point in this walk we find ourselves at vertex j, then the probability that we will find ourselves at i on the next step is given by the matrix element
$$M_{ij} = \frac{A_{ij}}{k_j} \quad \text{for } j \neq t$$
[Figure: network with vertices s, t, i, j]

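Calculating Random-walk Betweenness Continued…
$A_{ij}$ is an element of the adjacency matrix, and $k_j = \sum_i A_{ij}$ is the degree of vertex j.
In matrix notation, $M = A \cdot D^{-1}$, where D is the diagonal matrix with elements $D_{ii} = k_i$.
Because the walk is absorbed at t and never leaves it, $M_{it} = 0$ for all i.
$M_t = A_t \cdot D_t^{-1}$ is the matrix obtained after removing row and column t.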
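A short NumPy sketch of the transition matrices follows. It again assumes a hypothetical four-vertex diamond graph with target t = 3. Note that $D_t$ keeps the original degrees of the remaining vertices, not their degrees in the reduced graph.

```python
import numpy as np

# ASSUMPTION: hypothetical diamond graph with edges 0-1, 0-2, 1-2, 1-3, 2-3.
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 1],
              [1, 1, 0, 1],
              [0, 1, 1, 0]], dtype=float)
k = A.sum(axis=0)                 # k_j = sum_i A_ij

# Broadcasting divides column j by k_j, so M[i, j] = A[i, j] / k[j] = (A D^-1)[i, j].
M = A / k

t = 3                             # absorbing target vertex
# Dropping row and column t removes the absorbing state from the chain.
M_t = np.delete(np.delete(M, t, axis=0), t, axis=1)
A_t = np.delete(np.delete(A, t, axis=0), t, axis=1)
D_t = np.diag(np.delete(k, t))    # original degrees, with vertex t removed

column_sums = M.sum(axis=0)       # each column of M is a probability distribution
spectral_radius = max(abs(np.linalg.eigvals(M_t)))
print(column_sums, M_t.sum(axis=0), spectral_radius)
```

The columns of $M_t$ that belong to neighbours of t sum to less than 1. The missing probability is the chance of stepping into t and stopping, which is why the powers of $M_t$ decay.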

Calculating Random-walk Betweenness Continued…
For a walk starting at s, the probability of being at vertex j after r steps is $[M_t^r]_{js}$.
The probability that the walk then takes a step to an adjacent vertex i is $k_j^{-1} [M_t^r]_{js}$.
Summing over all values of r from 0 to ∞ gives $k_j^{-1} [(I - M_t)^{-1}]_{js}$, the total number of times we go from j to i, averaged over all possible walks.
In matrix notation we can write this as an element of the vector
$$V = D_t^{-1} \cdot (I - M_t)^{-1} \cdot s = (D_t - A_t)^{-1} \cdot s$$
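The two steps used above can be spelled out. The sum over r is a geometric series of matrices. It converges because, on a connected graph, the walk is eventually absorbed at t, so $M_t^r \to 0$:

$$\sum_{r=0}^{\infty} M_t^r = (I - M_t)^{-1}$$

Substituting $M_t = A_t D_t^{-1}$ and using $(XY)^{-1} = Y^{-1} X^{-1}$ then gives the final equality:

$$D_t^{-1} (I - M_t)^{-1} = D_t^{-1} \left( I - A_t D_t^{-1} \right)^{-1} = \left[ \left( I - A_t D_t^{-1} \right) D_t \right]^{-1} = (D_t - A_t)^{-1}$$

This is the same reduced matrix that appears in the current-flow calculation, which is why the two measures coincide.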

Calculating Random-walk Betweenness Continued…
$|V_i - V_j|$ is the net flow of the random walk along the edge from j to i.
Betweenness:
$$b_i = \frac{\sum_{s<t} I_i^{(st)}}{\tfrac{1}{2} n(n-1)}$$
where $I_i^{(st)}$ is the final net flow of random walks through vertex i.
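The "net flow" idea can also be illustrated by direct simulation. The sketch below is not how the measure would be computed in practice. It runs absorbing random walks on a hypothetical four-vertex diamond graph (source 0, target 3) and averages the signed number of traversals of each edge. By symmetry the unit flow splits evenly between vertices 1 and 2, so the estimate for vertex 1 should come out close to 0.5. The function names, trial count, and seed are arbitrary assumptions.

```python
import random

# ASSUMPTION: hypothetical diamond graph; 0 is the source and 3 is the target.
adj = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1, 3], 3: [1, 2]}

def estimate_net_flow(adj, s, t, trials, seed=0):
    """Average signed traversal count per edge over many absorbing random walks."""
    rng = random.Random(seed)
    net = {}                      # key (a, b) with a < b; positive means flow a -> b
    for _ in range(trials):
        j = s
        while j != t:             # the walk stops as soon as it reaches t
            i = rng.choice(adj[j])
            if j < i:
                net[(j, i)] = net.get((j, i), 0) + 1
            else:
                net[(i, j)] = net.get((i, j), 0) - 1
            j = i
    return {edge: count / trials for edge, count in net.items()}

def vertex_flow(net, v):
    """Half the sum of absolute net flows on edges incident on v (for v != s, t)."""
    return 0.5 * sum(abs(f) for edge, f in net.items() if v in edge)

net = estimate_net_flow(adj, s=0, t=3, trials=20000)
flow_through_1 = vertex_flow(net, 1)
print(net, flow_through_1)
```

Back-and-forth moves along the edge between vertices 1 and 2 cancel on average, so that edge contributes almost nothing even though walks cross it often.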

Summary
1. Construct the matrix D − A, where D is the diagonal matrix of vertex degrees and A is the adjacency matrix.
2. Remove any single row and the corresponding column. For example, one could remove the last row and column.
3. Invert the resulting matrix, then add back a new row and column consisting of all zeros in the position from which the row and column were previously removed (e.g., the last row and column). Call the resulting matrix T, with elements $T_{ij}$.
4. Calculate the betweenness $b_i$ from its defining equation, using the values of $I_i^{(st)}$ obtained from T.
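The steps above can be sketched in NumPy as follows. This is a minimal dense-matrix illustration and makes no attempt to reach the $O((m+n)n^2)$ bound. It assumes a connected, undirected, unweighted graph. The function name `random_walk_betweenness` and the test graph are hypothetical.

```python
import numpy as np

def random_walk_betweenness(A):
    """Random-walk (current-flow) betweenness; assumes a connected undirected graph."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]

    # Step 1: construct D - A.
    L = np.diag(A.sum(axis=1)) - A

    # Steps 2-3: drop the last row and column, invert, pad back with zeros.
    T = np.zeros((n, n))
    T[:-1, :-1] = np.linalg.inv(L[:-1, :-1])

    # Step 4: accumulate I_i^(st) over all pairs s < t.
    b = np.zeros(n)
    for s in range(n):
        for t in range(s + 1, n):
            V = T[:, s] - T[:, t]                       # V_i = T_is - T_it
            edge_currents = A * np.abs(V[:, None] - V[None, :])
            I = 0.5 * edge_currents.sum(axis=1)         # half the incident current
            I[s] = 1.0                                  # end points carry the full unit
            I[t] = 1.0
            b += I
    return b / (0.5 * n * (n - 1))

# Hypothetical three-vertex path a - b - c.
path = np.array([[0, 1, 0],
                 [1, 0, 1],
                 [0, 1, 0]])
b_path = random_walk_betweenness(path)
print(b_path)
```

For the path graph this gives 2/3, 1, 2/3. A graph with several components would need the function applied to each component separately.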

Comparison
In each network, we intuitively expect vertex C to have betweenness lower than that of vertices A and B, but higher than that of vertices X and Y.

Comparison Continued…
Shortest-path betweenness fails to give a higher score to vertex C in the first network than to any of the other vertices within the two communities, while flow betweenness has the same problem with vertex C in the second network.
The random-walk measure orders the vertices correctly in each case.

Correlation With Other Measures
Shortest-path betweenness is known to be strongly correlated with vertex degree in most networks.
If the two are strongly correlated, then why calculate betweenness, when degree is almost the same and much easier to calculate?
There are usually a small number of vertices in a network for which betweenness and degree are very different, and betweenness is needed to identify these vertices.

Example on Correlation With Other Measures
Vertices with higher degree or higher shortest-path betweenness tend also to have higher random-walk betweenness.
This misses the real point of interest: there are a few vertices whose random-walk betweenness values are quite different from their scores on the other two measures.

Example continued…
[Figure: the largest component of a network of sexual contacts between high-risk actors in the city of Colorado Springs]
The size of the vertices increases linearly with their random-walk betweenness.
The highlighted vertices are those for which the random-walk betweenness is substantially greater than the shortest-path betweenness (by a factor of two or more).

Example Applications
Example 1
[Figure: the network of intermarriage relations between the 15th century Florentine families]

Ranking on random-walk betweenness
The Medici come out well ahead of the competition, and they easily best their arch-rivals, the Strozzi.
It is suggested that it was in part the Medici’s skillful manipulation of this marriage network that led to their eventual dominance of the Florentine political landscape.

Example 2
[Figure: the largest component of the co-authorship network of scientists working on networks]
Increasing size of vertices: score on the random-walk betweenness measure.
A: brokers who establish connections between different groups.
B: vertices lying on paths where there are two (or more) paths to an outlying group of vertices.

Thank You
