Why Social Graphs Are Different Communities Finding Triangles

Slides:



Advertisements
Similar presentations
Complex Networks Luis Miguel Varela COST meeting, Lisbon March 27 th 2013.
Advertisements

Lecture 5 Graph Theory. Graphs Graphs are the most useful model with computer science such as logical design, formal languages, communication network,
22C:19 Discrete Math Graphs Fall 2014 Sukumar Ghosh.
CSE 5243 (AU 14) Graph Basics and a Gentle Introduction to PageRank 1.
Analysis and Modeling of Social Networks Foudalis Ilias.
Social Network Analysis and Its Applications By Paul Rossman Indiana University of Pennsylvania.
Graph A graph, G = (V, E), is a data structure where: V is a set of vertices (aka nodes) E is a set of edges We use graphs to represent relationships among.
Prof. Swarat Chaudhuri COMP 482: Design and Analysis of Algorithms Spring 2013 Lecture 4.
Advanced Topics in Data Mining Special focus: Social Networks.
Networks. Graphs (undirected, unweighted) has a set of vertices V has a set of undirected, unweighted edges E graph G = (V, E), where.
Peer-to-Peer and Grid Computing Exercise Session 3 (TUD Student Use Only) ‏
CSE 321 Discrete Structures Winter 2008 Lecture 25 Graph Theory.
How is this going to make us 100K Applications of Graph Theory.
Chapter 11 Graphs and Trees This handout: Terminology of Graphs Eulerian Cycles.
The Shortest Path Problem
CS8803-NS Network Science Fall 2013
Network Measures Social Media Mining. 2 Measures and Metrics 2 Social Media Mining Network Measures Klout.
Cloud and Big Data Summer School, Stockholm, Aug., 2015 Jeffrey D. Ullman.
Section 8 – Ec1818 Jeremy Barofsky March 31 st and April 1 st, 2010.
Graph Theory in Computer Science
Mining Social Network Graphs Debapriyo Majumdar Data Mining – Fall 2014 Indian Statistical Institute Kolkata November 13, 17, 2014.
7.1 and 7.2: Spanning Trees. A network is a graph that is connected –The network must be a sub-graph of the original graph (its edges must come from the.
COLOR TEST COLOR TEST. Social Networks: Structure and Impact N ICOLE I MMORLICA, N ORTHWESTERN U.
Social Network Basics CS315 – Web Search and Data Mining.
Professor Yashar Ganjali Department of Computer Science University of Toronto
ITEC 2620A Introduction to Data Structures Instructor: Prof. Z. Yang Course Website: 2620a.htm Office: TEL 3049.
L – Modelling and Simulating Social Systems with MATLAB Lesson 6 – Graphs (Networks) Anders Johansson and Wenjian Yu (with S. Lozano.
Minimal Spanning Tree Problems in What is a minimal spanning tree An MST is a tree (set of edges) that connects all nodes in a graph, using.
Introduction to Graph Theory
The Structure of the Web. Getting to knowing the Web How big is the web and how do you measure it? How many people use the web? How many use search engines?
Comp. Genomics Recitation 10 Clustering and analysis of microarrays.
Graph theory and networks. Basic definitions  A graph consists of points called vertices (or nodes) and lines called edges (or arcs). Each edge joins.
Class 2: Graph Theory IST402. Can one walk across the seven bridges and never cross the same bridge twice? Network Science: Graph Theory THE BRIDGES OF.
Class 2: Graph Theory IST402.
Network Partition –Finding modules of the network. Graph Clustering –Partition graphs according to the connectivity. –Nodes within a cluster is highly.
Graphs Definition: a graph is an abstract representation of a set of objects where some pairs of the objects are connected by links. The interconnected.
Importance Measures on Nodes Lecture 2 Srinivasan Parthasarathy 1.
Jeffrey D. Ullman Stanford University.  Different algorithms for the same problem can be parallelized to different degrees.  The same activity can (sometimes)
Lecture 23: Structure of Networks
Hiroki Sayama NECSI Summer School 2008 Week 2: Complex Systems Modeling and Networks Network Models Hiroki Sayama
Graphs and Social Networks
School of Computing Clemson University Fall, 2012
Social Networks Analysis
Structural Properties of Networks: Introduction
CSE373: Data Structures & Algorithms Lecture 16: Introduction to Graphs Linda Shapiro Winter 2015.
B. Jayalakshmi and Alok Singh 2015
Network analysis.
Community detection in graphs
Lecture 23: Structure of Networks
Structural Properties of Networks: Introduction
Network Science: A Short Introduction i3 Workshop
CSE373: Data Structures & Algorithms Lecture 16: Introduction to Graphs Linda Shapiro Spring 2016.
Section 8.6 of Newman’s book: Clustering Coefficients
Betweeness Counting Triangles Transitive Closure
Counting Triangles Transitive Closure
Graphs Chapter 11 Objectives Upon completion you will be able to:
Peer-to-Peer and Social Networks Fall 2017
Graph Operations And Representation
Clustering Coefficients
Network Graph Chuck Alice Vertices Nodes Edges Links Bob.
Lecture 23: Structure of Networks
ITEC 2620M Introduction to Data Structures
EMIS 8374 Search Algorithms Updated 9 February 2004
Warm Up – Tuesday Find the critical times for each vertex.
Network Models Michael Goodrich Some slides adapted from:
Advanced Topics in Data Mining Special focus: Social Networks
Graphs G = (V,E) V is the vertex set.
EMIS 8374 Search Algorithms Updated 12 February 2008
Chapter 9 Graph algorithms
GRAPHS.
Presentation transcript:

Why Social Graphs Are Different Communities Finding Triangles Graphs and Social Networks Why Social Graphs Are Different Communities Finding Triangles Jeffrey D. Ullman Stanford University/Infolab

Social Graphs Graphs can be either directed or undirected. Example: The Facebook “friends” graph (undirected). Nodes = people; edges between friends. Example: Twitter followers (directed). Nodes = people; arcs from a person to one they follow. Example: Phonecalls (directed, but could be considered undirected as well). Nodes = phone numbers; arc from caller to callee, or edge between both.

Properties of Social Graphs Locality (edges are not randomly chosen, but tend to cluster in “communities”). Small-world property (low diameter = maximum distance from any node to any other).

Locality A graph exhibits locality if when there is an edge from x to y and an edge from y to z, then the probability of an edge from x to z is higher than one would expect given the number of nodes and edges in the graph. Example: On Facebook, if y is friends with x and z, then there is a good chance x and z are friends. Community = set of nodes with an unusually high density of edges.

It’s a Small World After All Many very large graphs have small diameter (maximum distance between two nodes). Called the small world property. Example: 6 degrees of Kevin Bacon. Example: “Erdos numbers.” Example: Most pairs of Web pages are within 12 links of one another. But study at Google found pairs of pages whose shortest path has a length about a thousand.

Heavy Hitters Two Kinds of Triangles Optimal Algorithm Finding Triangles Heavy Hitters Two Kinds of Triangles Optimal Algorithm

Counting Triangles Why Care? Density of triangles measures maturity of a community. As communities age, their members tend to connect. The algorithm is actually an example of a recent and powerful theory of optimal join computation.

Needed Data Structures Assume that in O(1) time we can answer the question “is there an edge between nodes x and y?” Question for thought: What data structure works? Assume that if a node x has degree d, then in O(d) time we can find all the nodes adjacent to x.

First Observations Let the undirected graph have N nodes and M edges. N < M < N2. One approach: Consider all N-choose-3 sets of nodes, and see if there are edges connecting all 3. An O(N3) algorithm. Another approach: consider all edges e and all nodes u and see if both ends of e have edges to u. An O(MN) algorithm. Note that can’t be worse than O(N3).

Heavy Hitters To find a better algorithm, we need to use the concept of a heavy hitter – a node with degree at least M. Note: there can be no more than 2M heavy hitters, or the sum of the degrees of all nodes exceeds 2M. Remember: sum of node degrees = 2 times the number of edges. A heavy-hitter triangle is one whose three nodes are all heavy hitters.

Finding Heavy-Hitter Triangles Consider all triples of heavy hitters and see if there are edges between each pair of the three. Takes time O(M1.5), since there is a limit of 2M on the number of heavy hitters.

Finding Other Triangles At least one node is not a heavy hitter. Consider each edge e. If both ends are heavy hitters, ignore. Otherwise, let end node u not be a heavy hitter. For each of the at most M nodes v connected to u, see whether v is connected to the other end of e. Takes time O(M1.5). M edges, and at most O(M) work with each.

Optimality of This Algorithm Both parts take O(M1.5) time and together find any triangle in the graph. For any N and M, you can find a graph with N nodes, M edges, and (M1.5) triangles, so no algorithm can do significantly better. Note that M1.5 can never be greater than the running times of the two obvious algorithms with which we began: N3 and MN. And if M is strictly between N and N2, then M1.5 is strictly better than either.