Computation on Graphs. Graphs and Sparse Matrices 1 1 1 2 1 1 1 3 1 1 1 4 1 1 5 1 1 6 1 1 1 2 3 4 5 6 3 6 2 1 5 4 Sparse matrix is a representation of.

Slides:



Advertisements
Similar presentations
Great Theoretical Ideas in Computer Science
Advertisements

Ananth Grama, Anshul Gupta, George Karypis, and Vipin Kumar
Graph Algorithms Carl Tropper Department of Computer Science McGill University.
Midwestern State University Department of Computer Science Dr. Ranette Halverson CMPS 2433 – CHAPTER 4 GRAPHS 1.
Markov Models.
Matrices, Digraphs, Markov Chains & Their Use by Google Leslie Hogben Iowa State University and American Institute of Mathematics Leslie Hogben Iowa State.
Information Networks Link Analysis Ranking Lecture 8.
Graphs, Node importance, Link Analysis Ranking, Random walks
CS345 Data Mining Link Analysis Algorithms Page Rank Anand Rajaraman, Jeffrey D. Ullman.
Link Analysis: PageRank
Matrices, Digraphs, Markov Chains & Their Use. Introduction to Matrices  A matrix is a rectangular array of numbers  Matrices are used to solve systems.
More on Rankings. Query-independent LAR Have an a-priori ordering of the web pages Q: Set of pages that contain the keywords in the query q Present the.
Experiments with MATLAB Experiments with MATLAB Google PageRank Roger Jang ( 張智星 ) CSIE Dept, National Taiwan University, Taiwan
DATA MINING LECTURE 12 Link Analysis Ranking Random walks.
1 Representing Graphs. 2 Adjacency Matrix Suppose we have a graph G with n nodes. The adjacency matrix is the n x n matrix A=[a ij ] with: a ij = 1 if.
Link Analysis Ranking. How do search engines decide how to rank your query results? Guess why Google ranks the query results the way it does How would.
Introduction to PageRank Algorithm and Programming Assignment 1 CSC4170 Web Intelligence and Social Computing Tutorial 4 Tutor: Tom Chao Zhou
Applied Discrete Mathematics Week 12: Trees
Graph Algorithms: Minimum Spanning Tree We are given a weighted, undirected graph G = (V, E), with weight function w:
Lecture 16 Graphs and Matrices in Practice Eigenvalue and Eigenvector Shang-Hua Teng.
Chapter 9 Graph algorithms. Sample Graph Problems Path problems. Connectedness problems. Spanning tree problems.
Zdravko Markov and Daniel T. Larose, Data Mining the Web: Uncovering Patterns in Web Content, Structure, and Usage, Wiley, Slides for Chapter 1:
Chapter 9 Graph algorithms Lec 21 Dec 1, Sample Graph Problems Path problems. Connectedness problems. Spanning tree problems.
Link Analysis, PageRank and Search Engines on the Web
CS240A: Computation on Graphs. Graphs and Sparse Matrices Sparse matrix is a representation.
1 Matrix Addition, C = A + B Add corresponding elements of each matrix to form elements of result matrix. Given elements of A as a i,j and elements of.
Graph Operations And Representation. Sample Graph Problems Path problems. Connectedness problems. Spanning tree problems.
CS246 Link-Based Ranking. Problems of TFIDF Vector  Works well on small controlled corpus, but not on the Web  Top result for “American Airlines” query:
Stochastic Approach for Link Structure Analysis (SALSA) Presented by Adam Simkins.
L21: “Irregular” Graph Algorithms November 11, 2010.
Computer Science 112 Fundamentals of Programming II Introduction to Graphs.
Distributed Coloring Discrete Mathematics and Algorithms Seminar Melih Onus November
Graph Algorithms. Definitions and Representation An undirected graph G is a pair (V,E), where V is a finite set of points called vertices and E is a finite.
Computation on meshes, sparse matrices, and graphs Some slides are from David Culler, Jim Demmel, Bob Lucas, Horst Simon, Kathy Yelick, et al., UCB CS267.
Overview of Web Ranking Algorithms: HITS and PageRank
A graph problem: Maximal Independent Set Graph with vertices V = {1,2,…,n} A set S of vertices is independent if no two vertices in S are.
PageRank. s1s1 p 12 p 21 s2s2 s3s3 p 31 s4s4 p 41 p 34 p 42 p 13 x 1 = p 21 p 34 p 41 + p 34 p 42 p 21 + p 21 p 31 p 41 + p 31 p 42 p 21 / Σ x 2 = p 31.
1 HEINZ NIXDORF INSTITUTE University of Paderborn Algorithms and Complexity Christian Schindelhauer Search Algorithms Winter Semester 2004/ Nov.
Week 11 - Monday.  What did we talk about last time?  Binomial theorem and Pascal's triangle  Conditional probability  Bayes’ theorem.
CompSci 100E 3.1 Random Walks “A drunk man wil l find his way home, but a drunk bird may get lost forever”  – Shizuo Kakutani Suppose you proceed randomly.
CS774. Markov Random Field : Theory and Application Lecture 02
Data Structures & Algorithms Graphs
Ranking Link-based Ranking (2° generation) Reading 21.
CS240A: Computation on Graphs. Graphs and Sparse Matrices Sparse matrix is a representation.
Data Structures and Algorithms in Parallel Computing Lecture 2.
Data Structures and Algorithms in Parallel Computing Lecture 3.
Graph Algorithms Gayathri R To accompany the text ``Introduction to Parallel Computing'', Addison Wesley, 2003.
GRAPHS. Graph Graph terminology: vertex, edge, adjacent, incident, degree, cycle, path, connected component, spanning tree Types of graphs: undirected,
Graph Theory. undirected graph node: a, b, c, d, e, f edge: (a, b), (a, c), (b, c), (b, e), (c, d), (c, f), (d, e), (d, f), (e, f) subgraph.
CompSci 100E 4.1 Google’s PageRank web site xxx web site yyyy web site a b c d e f g web site pdq pdq.. web site yyyy web site a b c d e f g web site xxx.
Link Analysis Algorithms Page Rank Slides from Stanford CS345, slightly modified.
Ljiljana Rajačić. Page Rank Web as a directed graph  Nodes: Web pages  Edges: Hyperlinks 2 / 25 Ljiljana Rajačić.
Class 2: Graph Theory IST402.
COMPSCI 102 Introduction to Discrete Mathematics.
Google’s means to provide better search results Qi-Yuan Gou.
Week 11 - Wednesday.  What did we talk about last time?  Graphs  Paths and circuits.
CS 140: Sparse Matrix-Vector Multiplication and Graph Partitioning
Web Mining Link Analysis Algorithms Page Rank. Ranking web pages  Web pages are not equally “important” v  Inlinks.
Parallel Graph Algorithms
15. Directed graphs and networks
CS 290N / 219: Sparse Matrix Algorithms
DTMC Applications Ranking Web Pages & Slotted ALOHA
Computation on meshes, sparse matrices, and graphs
Computational meshes, matrices, conjugate gradients, and mesh partitioning Some slides are from David Culler, Jim Demmel, Bob Lucas, Horst Simon, Kathy.
Iterative Aggregation Disaggregation
Prof. Paolo Ferragina, Algoritmi per "Information Retrieval"
Prof. Paolo Ferragina, Algoritmi per "Information Retrieval"
CS 584 Project Write up Poster session for final Due on day of final
Adjacency Matrices and PageRank
Presentation transcript:

Computation on Graphs

Graphs and Sparse Matrices Sparse matrix is a representation of a (sparse) graph Matrix entries are edge weights Number of nonzeros per row is the vertex degree Edges represent data dependencies in matrix-vector multiplication

3 Graph partitioning Assigns subgraphs to processors Determines parallelism and locality. Tries to make subgraphs all same size (load balance) Tries to minimize edge crossings (communication). Exact minimization is NP-complete. edge crossings = 6 edge crossings = 10

Link analysis of the web Web page = vertex Link = directed edge Link matrix: A ij = 1 if page i links to page j

Web graph: PageRank (Google) Web graph: PageRank (Google) [Brin, Page] Markov process: follow a random link most of the time; otherwise, go to any page at random. Importance = stationary distribution of Markov process. Transition matrix is p*A + (1-p)*ones(size(A)), scaled so each column sums to 1. Importance of page i is the i-th entry in the principal eigenvector of the transition matrix. But the matrix is 1,000,000,000,000 by 1,000,000,000,000. An important page is one that many important pages point to.

A Page Rank Matrix Importance ranking of web pages Stationary distribution of a Markov chain Power method: matvec and vector arithmetic Matlab*P page ranking demo (from SC’03) on a web crawl of mit.edu (170,000 pages)

Social Network Analysis in Matlab: 1993 Co-author graph from 1993 Householder symposium

Social Network Analysis in Matlab: 1993 Which author has the most collaborators? >>[count,author] = max(sum(A)) count = 32 author = 1 >>name(author,:) ans = Golub Sparse Adjacency Matrix

Social Network Analysis in Matlab: 1993 Have Gene Golub and Cleve Moler ever been coauthors? >> A(Golub,Moler) ans = 0 No. But how many coauthors do they have in common? >> AA = A^2; >> AA(Golub,Moler) ans = 2 And who are those common coauthors? >> name( find ( A(:,Golub).* A(:,Moler) ), :) ans = Wilkinson VanLoan

A graph problem: Maximal Independent Set Graph with vertices V = {1,2,…,n} A set S of vertices is independent if no two vertices in S are neighbors. An independent set S is maximal if it is impossible to add another vertex and stay independent An independent set S is maximum if no other independent set has more vertices Finding a maximum independent set is intractably difficult (NP-hard) Finding a maximal independent set is easy, at least on one processor. The set of red vertices S = {4, 5} is independent and is maximal but not maximum

Sequential Maximal Independent Set Algorithm S = empty set; 2.for vertex v = 1 to n { 3. if (v has no neighbor in S) { 4. add v to S 5. } 6.} S = { }

Sequential Maximal Independent Set Algorithm S = empty set; 2.for vertex v = 1 to n { 3. if (v has no neighbor in S) { 4. add v to S 5. } 6.} S = { 1 }

Sequential Maximal Independent Set Algorithm S = empty set; 2.for vertex v = 1 to n { 3. if (v has no neighbor in S) { 4. add v to S 5. } 6.} S = { 1, 5 }

Sequential Maximal Independent Set Algorithm S = empty set; 2.for vertex v = 1 to n { 3. if (v has no neighbor in S) { 4. add v to S 5. } 6.} S = { 1, 5, 6 } work ~ O(n), but span ~O(n)

Parallel, Randomized MIS Algorithm [Luby] S = empty set; C = V; 2.while C is not empty { 3. label each v in C with a random r(v); 4. for all v in C in parallel { 5. if r(v) < min( r(neighbors of v) ) { 6. move v from C to S; 7. remove neighbors of v from C; 8. } 9. } 10.} S = { } C = { 1, 2, 3, 4, 5, 6, 7, 8 }

Parallel, Randomized MIS Algorithm [Luby] S = empty set; C = V; 2.while C is not empty { 3. label each v in C with a random r(v); 4. for all v in C in parallel { 5. if r(v) < min( r(neighbors of v) ) { 6. move v from C to S; 7. remove neighbors of v from C; 8. } 9. } 10.} S = { } C = { 1, 2, 3, 4, 5, 6, 7, 8 }

Parallel, Randomized MIS Algorithm [Luby] S = empty set; C = V; 2.while C is not empty { 3. label each v in C with a random r(v); 4. for all v in C in parallel { 5. if r(v) < min( r(neighbors of v) ) { 6. move v from C to S; 7. remove neighbors of v from C; 8. } 9. } 10.} S = { } C = { 1, 2, 3, 4, 5, 6, 7, 8 }

Parallel, Randomized MIS Algorithm [Luby] S = empty set; C = V; 2.while C is not empty { 3. label each v in C with a random r(v); 4. for all v in C in parallel { 5. if r(v) < min( r(neighbors of v) ) { 6. move v from C to S; 7. remove neighbors of v from C; 8. } 9. } 10.} S = { 1, 5 } C = { 6, 8 }

Parallel, Randomized MIS Algorithm [Luby] S = empty set; C = V; 2.while C is not empty { 3. label each v in C with a random r(v); 4. for all v in C in parallel { 5. if r(v) < min( r(neighbors of v) ) { 6. move v from C to S; 7. remove neighbors of v from C; 8. } 9. } 10.} S = { 1, 5 } C = { 6, 8 }

Parallel, Randomized MIS Algorithm [Luby] S = empty set; C = V; 2.while C is not empty { 3. label each v in C with a random r(v); 4. for all v in C in parallel { 5. if r(v) < min( r(neighbors of v) ) { 6. move v from C to S; 7. remove neighbors of v from C; 8. } 9. } 10.} S = { 1, 5, 8 } C = { }

Parallel, Randomized MIS Algorithm [Luby] S = empty set; C = V; 2.while C is not empty { 3. label each v in C with a random r(v); 4. for all v in C in parallel { 5. if r(v) < min( r(neighbors of v) ) { 6. move v from C to S; 7. remove neighbors of v from C; 8. } 9. } 10.} Theorem: This algorithm “very probably” finishes within O(log n) rounds. work ~ O(n log n), but span ~O(log n)