Random Walk on Graph
Presentation transcript:

Random Walk on a Graph
Start from a given node at time t = 0. At each step, choose a neighbor uniformly at random (the previous node is allowed) and move there. Repeat until time t = n.
Q1. Where does this converge to as n → ∞?
Q2. How fast does it converge?
Q3. What are the implications for different applications?
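A minimal simulation sketch of the walk just described (Python; the 5-node adjacency list is a made-up example, not the graph on the slide):

```python
import random

def random_walk(adj, start, n_steps):
    """Simulate a simple random walk on an undirected graph.

    adj: dict mapping each node to a list of its neighbors
    start: node occupied at time t = 0
    n_steps: number of steps n
    Returns the sequence of visited nodes (length n_steps + 1).
    """
    path = [start]
    node = start
    for _ in range(n_steps):
        node = random.choice(adj[node])  # each neighbor chosen with prob 1/k_i
        path.append(node)
    return path

# Hypothetical 5-node graph
adj = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1, 4], 3: [1, 4], 4: [2, 3]}
print(random_walk(adj, start=0, n_steps=10))
```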

Random Walks on Graphs
If node i has degree k_i, the walk moves to any neighbor with probability 1/k_i. This is a Markov chain! Its transition matrix A has A_ij = 1/k_i if (i, j) is an edge, and 0 otherwise.
Start at node i → p(0) = (0, 0, …, 1, …, 0, 0)
p(n) = p(0) A^n
π = π A, where π = lim_{n→∞} p(n)
Q: What is π for a random walk on a graph?
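A short numerical sketch of these equations (Python/NumPy; the edge list is a hypothetical example, analogous to the matrix shown on the slide):

```python
import numpy as np

# Build the transition matrix A: A[i, j] = 1/k_i if (i, j) is an edge.
edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 4), (3, 4)]
n = 5
A = np.zeros((n, n))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0
A = A / A.sum(axis=1, keepdims=True)   # each row sums to 1

p = np.zeros(n)
p[0] = 1.0                             # p(0): start at node 0
for _ in range(100):                   # p(n) = p(0) A^n via repeated multiplication
    p = p @ A
print(p)                               # approaches the stationary distribution π
```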

Random Walks on Undirected Graphs
Stationarity: π(z) = Σ_x π(x) p(x,z), with p(x,y) = 1/k_x on edges.
We could try to solve these (or the global balance equations) directly. Not easy!
Define N(z) = {neighbors of z}. Then
Σ_{x ∈ N(z)} k_x · p(x,z) = Σ_{x ∈ N(z)} k_x · (1/k_x) = Σ_{x ∈ N(z)} 1 = k_z
Normalize by dividing both sides by Σ_x k_x = 2|E| (|E| = m = number of edges):
Σ_{x ∈ N(z)} (k_x / 2|E|) · p(x,z) = k_z / 2|E|
So π(x) = k_x / 2|E| satisfies the stationarity equation π = πP: it is the stationary distribution.
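A quick numerical check of the claim π(x) = k_x / 2|E| (self-contained NumPy sketch on a made-up graph):

```python
import numpy as np

# Verify that π(x) = k_x / (2|E|) satisfies π = πP on a small example graph.
edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 4), (3, 4)]
n = 5
adj = np.zeros((n, n))
for i, j in edges:
    adj[i, j] = adj[j, i] = 1.0
k = adj.sum(axis=1)                # node degrees k_x
P = adj / k[:, None]               # p(x, y) = 1/k_x on edges
pi = k / k.sum()                   # k_x / 2|E|  (the degree sum equals 2|E|)
print(np.allclose(pi @ P, pi))     # True: π is stationary
```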

What about Random Walks on Directed Graphs?
[Figure: a small directed graph with example centrality values 1/8, 4/13, 2/13, 1/13]
Assign each node centrality 1/n (for n nodes).

A Problematic Graph
Q: What is the problem with this graph?
A: All centrality "points" will eventually flow to F and G.
Solution: when at node i,
- with probability β, jump to any of the N nodes (chosen uniformly);
- with probability 1 − β, jump to a random neighbor of i.
Q: Does this remind you of something?
A: The PageRank algorithm! The PageRank of node i is the stationary probability of a random walk on this (modified) directed graph. The factor β in the PageRank function avoids the problem by "leaking" a small amount of centrality from each node to all other nodes.

PageRank as a Random Walk
A (bored) web surfer either
- follows a linked webpage with probability 1 − β, or
- jumps to a random page (e.g. a new search) with probability β.
The probability of ending up at page X after a large enough time = the PageRank of page X!
PageRank can be generalized with a per-node β = (β1, β2, …, βn).
Undirected network: removing β reduces PageRank to degree centrality.
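A power-iteration sketch of this random-surfer model (Python; the toy link structure and the parameter values are made up, and β is the jump probability, as on the slide):

```python
import numpy as np

def pagerank(out_links, beta=0.15, iters=100):
    """PageRank via power iteration: jump with prob beta, follow a link with 1 - beta."""
    nodes = sorted(out_links)
    n = len(nodes)
    idx = {v: i for i, v in enumerate(nodes)}
    pr = np.full(n, 1.0 / n)                     # start with centrality 1/n per node
    for _ in range(iters):
        new = np.full(n, beta / n)               # centrality "leaked" to every node
        for v in nodes:
            targets = out_links[v]
            if targets:                          # spread the rest over outgoing links
                for u in targets:
                    new[idx[u]] += (1 - beta) * pr[idx[v]] / len(targets)
            else:                                # dangling node: spread uniformly
                new += (1 - beta) * pr[idx[v]] / n
        pr = new
    return dict(zip(nodes, pr))

# Hypothetical toy web graph
print(pagerank({'A': ['B', 'C'], 'B': ['C'], 'C': ['A'], 'D': ['C']}))
```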

Applications of RW: Measuring Large Networks
We are interested in studying the properties (degree distribution, path lengths, clustering, connectivity, etc.) of many real networks (Internet, Facebook, YouTube, Flickr, etc.), as these contain a lot of important ($$$) information.
E.g. to plot the degree distribution, we would need to crawl the whole network and obtain a degree value for each node. These networks might contain millions of nodes!

Online Social Networks (OSNs)
[Table: OSN user counts (500 million, 200 million, 130 million, 100 million, 75 million, …) and traffic ranks; the site names appeared as logos]
> 1 billion OSN users in October 2010 (over 15% of the world's population, and over 50% of the world's Internet users!)

Measuring Facebook
Facebook: 500+M users, 130 friends each (on average), 8 bytes (64 bits) per user ID.
The raw connectivity data, with no attributes: 500M × 130 × 8 B = 520 GB.
To get this data, one would have to download 100+ TB of (uncompressed) HTML data!
This is neither feasible nor practical. Solution: sampling!

Measuring Large Networks (for mere mortals)
Obtaining the complete dataset is difficult:
- companies are usually unwilling to share data, for privacy and performance reasons (e.g. Facebook will ban accounts if it sees extensive crawling);
- there is tremendous overhead to measure everything (~100 TB for Facebook).
Representative samples are desirable, to study properties and to test algorithms.

Sampling
What: topology? nodes?
How: directly? by exploration?

(1) Breadth-First Search (BFS)
Starting from a seed, explore all neighbor nodes; the process continues iteratively, without replacement.
BFS leads to a bias towards high-degree nodes [Lee et al., "Statistical properties of sampled networks", Phys. Review E, 2006].
Early measurement studies of OSNs used BFS as their primary sampling technique, e.g. [Mislove et al.], [Ahn et al.], [Wilson et al.].
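A minimal BFS-crawl sketch (Python; in a real crawler the neighbor lists would come from an API, here they are just a dict, and the budget parameter is an assumption):

```python
from collections import deque

def bfs_sample(adj, seed, budget):
    """BFS sampling: explore neighbors iteratively, without replacement."""
    visited = {seed}
    queue = deque([seed])
    sample = []
    while queue and len(sample) < budget:
        node = queue.popleft()
        sample.append(node)
        for nbr in adj[node]:
            if nbr not in visited:      # never re-visit a node (no replacement)
                visited.add(nbr)
                queue.append(nbr)
    return sample

adj = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1, 4], 3: [1, 4], 4: [2, 3]}
print(bfs_sample(adj, seed=0, budget=4))
```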

(2) Random Walk (RW)
Explores the graph one node at a time, with replacement.
Restart from different seeds, or run multiple seeds in parallel.
Does this lead to a good sample?

Implications for Random Walk Sampling
Say we collect a small part of the Facebook graph using a RW. There is a higher chance of visiting high-degree nodes, so high-degree nodes are over-represented and low-degree nodes are under-represented.
[Figure: real degree distribution vs. candidate sampled degree distributions]
[1] M. Gjoka, M. Kurant, C. T. Butts and A. Markopoulou, "Walking in Facebook: A Case Study of Unbiased Sampling of OSNs", INFOCOM 2010.

Random Walk Sampling of Facebook
[Figure: sampled vs. real degree distribution]
Real average node degree: 94. Observed (sampled) average node degree: 338.
Q: How can we fix this?
A: Intuition → we need to reduce (increase) the probability of visiting high-degree (low-degree) nodes.

Markov Chain Monte Carlo (MCMC)
Q: How should we modify the Random Walk?
A: Markov Chain Monte Carlo theory.
Original chain: move x → y with probability Q(x,y); stationary distribution π(x).
Desired chain: stationary distribution w(x) (for uniform sampling: w(x) = 1/N).
New transition probabilities: P(x,y) = Q(x,y)·a(x,y) for y ≠ x, where a(x,y) is an acceptance probability.

MCMC (2)
a(x,y): probability of accepting the proposed move x → y.
Q: How should we choose a(x,y) so as to converge to the desired stationary distribution w(x)?
A: It suffices that w(x) satisfies the local balance (time-reversibility) equations: w(x)P(x,y) = w(y)P(y,x) for all x, y.
Substituting P(x,y) = Q(x,y)a(x,y):
w(x)Q(x,y)a(x,y) = w(y)Q(y,x)a(y,x)   (denote both sides by b(x,y) = b(y,x))
Since a(x,y) ≤ 1 (it is a probability), b(x,y) ≤ w(x)Q(x,y), and b(x,y) = b(y,x) ≤ w(y)Q(y,x).
Choosing b(x,y) as large as possible gives the Metropolis-Hastings rule: a(x,y) = min{1, w(y)Q(y,x) / (w(x)Q(x,y))}.

MCMC for Uniform Sampling
w(x) = w(y) (= 1/n, but the constant does not matter), and Q(y,x)/Q(x,y) = k_x/k_y, so a(x,y) = min{1, k_x/k_y}.
Metropolis-Hastings random walk:
- a move to a lower-degree node → always accepted;
- a move to a higher-degree node → rejected with a probability related to the degree ratio.

Metropolis-Hastings (MH) Random Walk
Explores the graph one node at a time, with replacement.
In the stationary distribution, every node is equally likely (uniform sampling).
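A sketch of the MH random walk with a uniform target (Python; the acceptance rule min(1, k_x/k_y) comes from the previous slides, the rest is an assumed minimal implementation):

```python
import random

def metropolis_hastings_rw(adj, start, n_steps):
    """Metropolis-Hastings random walk targeting the uniform distribution."""
    samples = [start]
    x = start
    for _ in range(n_steps):
        y = random.choice(adj[x])                 # proposal: Q(x, y) = 1/k_x
        if random.random() < min(1.0, len(adj[x]) / len(adj[y])):
            x = y                                 # accept the move
        samples.append(x)                         # on rejection, stay at x (self-loop)
    return samples

adj = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1, 4], 3: [1, 4], 4: [2, 3]}
print(metropolis_hastings_rw(adj, start=0, n_steps=10))
```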

Degree Distribution of Facebook with MHRW
The sampled degree distribution is almost identical to the real one.
MCMC methods have MANY other applications: sampling, optimization, …

Node Importance: Who is most “central”?

Node Centrality: Depends on the Application
- Influence: which social network nodes should I pick to advertise / spread a video, product, or opinion?
- Resilience: which node(s) should I attack to disconnect the network?
- Malware/virus infection: which nodes should I immunize (e.g. upload a patch to) to stop a given Internet "worm" from spreading quickly?
- Performance: which nodes are the bottleneck in a network?
- Search engines: which nodes contain the most relevant information?
A centrality measure implicitly solves some optimization problem.

Centrality: Importance Based on Network Position
In each of the following networks, X has higher centrality than Y according to a particular measure (in-degree, out-degree, betweenness, closeness).

Degree Centrality
"He who has many friends is most important."
When is the number of connections the best centrality measure?
- people who will do favors for you
- people you can talk to (influence set, information access, …)
- influence of an article in terms of citations (using in-degree)

Normalized Degree Centrality: divide by the maximum possible degree, i.e. (N − 1).

Betweenness Centrality: Definition
The betweenness of vertex i counts the shortest paths between j and k that pass through i, relative to all shortest paths between j and k:
C_B(i) = Σ_{j<k} g_jk(i) / g_jk
where g_jk is the number of shortest paths connecting j and k, and g_jk(i) is the number of those that node i is on.
Usually normalized by the number of pairs of vertices not including i: (N − 1)(N − 2) / 2.

Betweenness on toy networks (non-normalized version)
[Figure: example graphs, including one where a single bridge lies on all paths between the two sides]
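A sketch comparing degree and betweenness centrality on a toy "bridge" graph (assumes the networkx library is available; the graph itself is made up):

```python
import networkx as nx

# Two triangles joined by a single edge: the endpoints of the bridge lie on every
# shortest path between the two sides, so they get high betweenness.
G = nx.Graph([(0, 1), (1, 2), (2, 0), (3, 4), (4, 5), (5, 3), (2, 3)])
print(nx.degree_centrality(G))        # degree / (N - 1)
print(nx.betweenness_centrality(G))   # normalized shortest-path betweenness
```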

Betweenness vs. Degree Centrality
Nodes are sized by degree and colored by betweenness. Can you spot nodes with high betweenness but relatively low degree? What about high degree but relatively low betweenness?

Why is Betweenness Centrality Important? Connectivity
Compare: remove a random node, remove a high-degree node, remove a high-betweenness node.

Why is Betweenness Centrality Important?
The network below is a wireless network (e.g. a sensor network).
- Nodes run on battery → total energy Emax.
- Each node picks a destination randomly and sends data at a constant rate.
- Every packet going through a node consumes E of its energy.
Q: How long would it take until the first node runs out of battery?
[Figure: a wireless network with source/destination pairs S1→D1, S2→D2]

How About in This Network?

Why is Betweenness Centrality Important? Monitoring
Where would you place a traffic monitor in order to track the maximum number of packets (if this were your university network)? Where would you place traffic cameras if it were a street network?

Why is Betweenness Centrality Important? Traffic Flow
Each link has capacity 1.
Q: What is the maximum throughput between S and D?
A: Max-Flow / Min-Cut theorem → the max flow equals the minimum number of links that must be removed to disconnect S from D → here the S-D throughput is 1.
[Figure: an S-D network with a single bottleneck link]
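A small max-flow / min-cut check (assumes networkx; the graph is a hypothetical example with every link capacity set to 1):

```python
import networkx as nx

# S -> a -> b -> D and S -> c -> b: the single edge b -> D is the bottleneck.
G = nx.DiGraph()
G.add_edges_from([('S', 'a'), ('a', 'b'), ('b', 'D'), ('S', 'c'), ('c', 'b')], capacity=1)
flow_value, _ = nx.maximum_flow(G, 'S', 'D')
cut_value, _ = nx.minimum_cut(G, 'S', 'D')
print(flow_value, cut_value)   # equal (here both 1), by the max-flow min-cut theorem
```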

Spectral Analysis of (Ergodic) Markov Chains
If a Markov chain (defined by transition matrix P) is ergodic (irreducible, aperiodic, and positive recurrent), then P^(n)_ik → π_k as n → ∞, where π = [π1, π2, …, πn].
Q: But how fast does the chain converge? E.g. how many steps until we are "close enough" to π?
A: This depends on the eigenvalues of P. The convergence time is also called the mixing time.

Eigenvalues and Eigenvectors of Matrix P
Left eigenvectors: a row vector π is a left eigenvector for eigenvalue λ of matrix P iff πP = λπ, i.e. Σ_k π_k p_ki = λπ_i.
Right eigenvectors: a column vector v is a right eigenvector for eigenvalue λ of matrix P iff Pv = λv, i.e. Σ_k p_ik v_k = λv_i.
Q: What eigenvalues and eigenvectors can we guess already?
A: λ = 1 is a left eigenvalue, with eigenvector π, the stationary distribution; λ = 1 is also a right eigenvalue, with eigenvector v = 1 (all ones).

Eigenvalues and Eigenvectors for 2-State Chains
Both sets of equations have non-zero solutions iff (P − λI) is singular, i.e. there exists v ≠ 0 such that (P − λI)v = 0, i.e. the determinant |P − λI| = 0:
(p11 − λ)(p22 − λ) − p12 p21 = 0
λ1 = 1, λ2 = 1 − p12 − p21 (substitute above and confirm with some algebra)
|λ2| < 1. The eigenvectors are normalized so that π^(1) is a stationary distribution and v^(i)·π^(i) = 1 for all i.

Diagonalization
Eigenvalue decomposition: P = U Λ U^-1.
Q: What is P^(n)? A: P^(n) = U Λ^n U^-1, so every term except the λ1 = 1 term decays with n.
Q: How fast does the chain converge to the stationary distribution?
A: It converges exponentially fast in n, as (λ2)^n.

Generalization to M-State Markov Chains
We'll assume there are M distinct eigenvalues (see the notes for repeated ones).
Matrix P is stochastic → all eigenvalues satisfy |λ_i| ≤ 1.
Q: Why? A: Rows of P are non-negative and sum to 1, so for any eigenpair Pv = λv, taking the index i where |v_i| is largest gives |λ||v_i| = |Σ_k p_ik v_k| ≤ max_k |v_k| = |v_i|, hence |λ| ≤ 1.
Q: How fast does an (ergodic) chain converge to the stationary distribution?
A: Exponentially, with rate given by the second-largest eigenvalue (in modulus), |λ2|.
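A numerical sketch of the convergence rate (NumPy; the 3-state transition matrix is made up for illustration):

```python
import numpy as np

# Distance of p(n) from π shrinks roughly like |λ2|^n for an ergodic chain.
P = np.array([[0.5, 0.3, 0.2],
              [0.2, 0.6, 0.2],
              [0.3, 0.3, 0.4]])

w, V = np.linalg.eig(P.T)                      # left eigen-problem of P
lam2 = sorted(np.abs(w), reverse=True)[1]      # second-largest |eigenvalue|
pi = np.real(V[:, np.argmax(np.isclose(w, 1))])
pi /= pi.sum()                                 # stationary distribution (eigenvalue 1)

p = np.array([1.0, 0.0, 0.0])                  # start in state 0
for n in range(1, 11):
    p = p @ P
    print(n, np.abs(p - pi).sum(), lam2 ** n)  # both columns decay at a similar rate
```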

Speed of Sampling on this Network?
λ2 (the 2nd-largest eigenvalue) is related to the (balanced) min-cut of the graph. The more "partitioned" a graph is into clusters with few links between them → the longer the convergence time of the respective Markov chain → the slower the random-walk search.

Community Detection - Clustering

Laplacian (Faloutsos, Tong)
L = D − A, where D is the diagonal degree matrix (d_ii = d_i) and A is the adjacency matrix.
[Figure: a 4-node example graph and its Laplacian]

Weighted Laplacian (Faloutsos, Tong)
[Figure: a 4-node example graph with edge weights (e.g. 10, 4, 0.3, 2)]

Laplacian: Fast Facts
Zero is always an eigenvalue of L; if the graph has k connected components, the eigenvalue 0 has multiplicity k.
The second-smallest eigenvalue, λ2, is what Fiedler ('73) called the "algebraic connectivity of a graph": the further it is from 0, the more connected the graph.

Connected Components (Faloutsos, Tong)
[Figure: a 7-node graph G(V,E) with two connected components, its Laplacian L, and eig(L)]
The number of zero eigenvalues equals the number of connected components.

Connected Components (continued)
[Figure: the same graph with a weak link added between the two components; eig(L) now has one zero and one very small eigenvalue, 0.01]
The number of zero eigenvalues equals the number of components; an eigenvalue close to zero indicates a "good cut".
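A self-contained sketch of the "#zeros = #components" fact (NumPy; the two-triangle graph is a made-up example):

```python
import numpy as np

# Two triangles with no edge between them: the Laplacian has two zero eigenvalues.
edges = [(0, 1), (1, 2), (2, 0), (3, 4), (4, 5), (5, 3)]
n = 6
A = np.zeros((n, n))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0
L = np.diag(A.sum(axis=1)) - A               # L = D - A
print(np.round(np.linalg.eigvalsh(L), 3))    # two (near-)zero eigenvalues = two components
```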

Spectral Image Segmentation (Shi-Malik ‘00)

The second eigenvector

Second Eigenvector’s sparsest cut
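A minimal spectral-partitioning sketch in the spirit of these slides (NumPy; the bridged two-triangle graph and the sign-based split are illustrative assumptions, not the Shi-Malik formulation):

```python
import numpy as np

# Partition by the sign of the Fiedler vector (eigenvector of the 2nd-smallest
# Laplacian eigenvalue): on this graph it cuts the single bridge edge.
edges = [(0, 1), (1, 2), (2, 0), (3, 4), (4, 5), (5, 3), (2, 3)]
n = 6
A = np.zeros((n, n))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0
L = np.diag(A.sum(axis=1)) - A
vals, vecs = np.linalg.eigh(L)      # eigenvalues in ascending order for symmetric L
fiedler = vecs[:, 1]
print(np.sign(fiedler))             # one sign for {0,1,2}, the other for {3,4,5}
```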