CS 590 Term Project Epidemic model on Facebook

Presentation transcript:

CS 590 Term Project Epidemic model on Facebook ChoungRyeol LEE, Shubham Agrawal, Ashwin Jiwane

Facebook (partial) Network
Source: Facebook ego network, Stanford Network Analysis Project

Data Limitation and Processing
- It is infeasible for us to access (and handle) the complete Facebook data.
- The analysis is done on a partial dataset obtained from the Stanford Network Analysis Project.
- The original data is the directed ego-network (without the ego) of 10 nodes, which we had to reconstruct, i.e. make it undirected and add the ego-edges.
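The reconstruction step on this slide can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' code: the function name `rebuild_ego_network`, the edge-list input format, and the toy node IDs are all hypothetical.

```python
def rebuild_ego_network(ego, directed_edges):
    """Return an undirected adjacency dict that includes the ego node.

    Assumption: every alter appears in at least one directed edge;
    isolated alters would have to be added separately.
    """
    adj = {ego: set()}
    # Step 1: drop edge direction by storing each edge both ways.
    for u, v in directed_edges:
        if u == v:
            continue  # ignore self-loops
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    # Step 2: add the ego-edges, since the ego is a friend of every alter.
    for alter in list(adj):
        if alter != ego:
            adj[alter].add(ego)
            adj[ego].add(alter)
    return adj


# Toy example: ego 0 with three alters and two directed alter-alter edges.
toy = rebuild_ego_network(0, [(1, 2), (3, 2)])
print(toy)
```

The adjacency-dict representation (node mapped to a set of neighbors) is chosen only because it keeps the sketch free of external libraries.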

What is an Ego Network?
Source: Slides by Giorgos Cheliotis, National University of Singapore

Network Characteristics
Characteristic                         Value
Number of nodes (n)                    3963
Number of edges (m)                    88156
Number of clusters (c)                 1
Minimum degree (dmin)                  2
Maximum degree (dmax)                  1034
Average degree (d)                     22.245
Average path length (l)                3.776
Diameter (D)                           8
Global clustering coefficient (cc)     0.5212
Maximum clique size                    57
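Several of these characteristics can be computed directly from an adjacency structure. The sketch below is an illustration on a toy path graph, assuming a connected undirected graph with at least two nodes; the names `bfs_distances` and `basic_stats` are hypothetical. It reports both 2m/n (the usual mean degree of an undirected graph) and m/n, because the slide's value of 22.245 matches m/n for the listed n and m, whereas 2m/n would be about 44.49.

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop distances from source to every reachable node (unweighted BFS)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def basic_stats(adj):
    """Degree and path statistics for a connected undirected graph."""
    n = len(adj)
    degrees = [len(neighbors) for neighbors in adj.values()]
    m = sum(degrees) // 2  # each undirected edge is counted at both ends
    total = pairs = diameter = 0
    for s in adj:
        for t, d in bfs_distances(adj, s).items():
            if t != s:
                total += d
                pairs += 1
                diameter = max(diameter, d)
    return {
        "n": n,
        "m": m,
        "min_degree": min(degrees),
        "max_degree": max(degrees),
        "mean_degree": 2 * m / n,
        "edges_per_node": m / n,
        "avg_path_length": total / pairs,
        "diameter": diameter,
    }


# Toy path graph 0 - 1 - 2 - 3.
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
stats = basic_stats(path)
print(stats)
```

Running one BFS per node is quadratic in the worst case, which is acceptable for a graph of a few thousand nodes but would need sampling on much larger networks.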

Centrality Measures
Weight         Centrality Measure               Node
Weighted       Eigenvector                      2160
Weighted       Pagerank                         1641
Non-Weighted   Eigenvector                      1868
Non-Weighted   Pagerank                         3381
               Closeness, Betweenness, Degree   100

Basic Analysis
Centrality Measure   Facebook Interpretation
Pagerank             A random surfer starting from anyone else's profile is very likely to visit this person's profile
Eigenvector          This person has 'influential' (or social) friends
Betweenness          This person is an important connection between different people
Closeness            This person needs the fewest 'mutual friends' links to connect to anyone else
Degree               This person has the maximum number of friends
Observations: The graph follows the "Small World Phenomenon", as the average path length is 3.776, but it is not a "Scale-Free" network, since its degree distribution does not follow a power law.
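The "random surfing" reading of Pagerank can be made concrete with a short power-iteration sketch. This is an illustration only: the function name `pagerank`, the damping factor of 0.85, the fixed iteration count, and the toy star graph are assumptions, and the code assumes an undirected graph in which every node has at least one neighbor.

```python
def pagerank(adj, damping=0.85, iterations=100):
    """Power-iteration PageRank on an undirected adjacency dict.

    Assumption: every node has at least one neighbor, so no special
    handling of dangling nodes is needed.
    """
    n = len(adj)
    rank = {u: 1.0 / n for u in adj}
    for _ in range(iterations):
        # Teleportation: the surfer jumps to a random profile.
        new_rank = {u: (1.0 - damping) / n for u in adj}
        # Link following: the surfer clicks a random friend of u.
        for u, neighbors in adj.items():
            share = damping * rank[u] / len(neighbors)
            for v in neighbors:
                new_rank[v] += share
        rank = new_rank
    return rank


# Toy star graph: node 0 is friends with everyone else.
star = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}
ranks = pagerank(star)
most_visited = max(ranks, key=ranks.get)
print(most_visited, ranks)
```

In the star graph the hub collects rank from every leaf, which mirrors the slide's interpretation that a high-Pagerank profile is the one a random surfer is most likely to reach.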

Power-Law

Friendship Strength
In FB, possible ways to measure friendship:
- Mutual friends
- Common biography (location, education, etc.)
- Mutual interests (pages, likes, etc.)
- Common social groups
Due to the limitation of the data, we considered only Mutual Friends as the weighting measure.

Cosine Similarity
- Cosine similarity measures the normalized number of common friends.
- The basic principle is to take the cosine of the angle between the vectors (rows) of the adjacency matrix.
- In the study network: maximum cosine value = 0.961454, minimum cosine value = 0.003408.
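For a 0/1 adjacency matrix, the dot product of two rows is the number of common friends and the norm of a row is the square root of the node's degree, so the cosine reduces to common friends divided by the square root of the product of the two degrees. The sketch below illustrates this on a toy graph; the function name `cosine_similarity` and the node IDs are hypothetical.

```python
import math


def cosine_similarity(adj, u, v):
    """Cosine of the adjacency-matrix rows of u and v (binary, no self-loops)."""
    if not adj[u] or not adj[v]:
        return 0.0  # an isolated node has a zero row vector
    common = len(adj[u] & adj[v])  # dot product = number of mutual friends
    return common / math.sqrt(len(adj[u]) * len(adj[v]))


# Toy graph: 0 is friends with 1, 2, 3; 1 and 2 are also friends.
toy = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}
for a, b in [(1, 2), (1, 3), (0, 3)]:
    print(a, b, cosine_similarity(toy, a, b))
```

Because the rows exclude the node itself, two friends with no other mutual friend get similarity 0, as for nodes 0 and 3 in the toy graph.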

Epidemic Models
SI and SIR Model:
- A node is susceptible to infection from an infected node with a certain probability (you repost/share from friends)
SI Model:
- Once a node is infected, it remains infected (the post remains active on the wall)
SIR Model:
- Once a node is infected, it remains infected for a certain time period (the post becomes inactive after a certain time period)

Model Simulation
- Simulated the epidemic model on the graph
- Pre-infected a particular node
- Compared the results for different nodes of importance
- Checked the number of time steps required for a complete cascade in the SI model
- Checked the number of time steps required to reach a stable condition in the SIR model
- A stable condition means that no more nodes are getting infected because the infected nodes have 'died'

Model Simulation
Model assumptions:
- Probability of infection
- Discrete time intervals
Assumed two scenarios of probability:
- Function of weight (similar to 'Top News' posts)
- Independent of weight (similar to 'Most Recent' posts)
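A discrete-time simulation along these lines might look like the sketch below. It is an illustration under stated assumptions, not the authors' implementation: the function name `simulate`, the parameter names, and in particular the weighted rule `p * weight(u, v)` are hypothetical, since the slides do not say how the probability depends on the weight. Passing `recovery_steps=None` gives the SI model; an integer gives the SIR model, with nodes infectious for that many steps.

```python
import random


def simulate(adj, seed_node, p=0.2, weight=None, recovery_steps=None,
             rng=None, max_steps=1000):
    """Run one SI (recovery_steps=None) or SIR cascade from seed_node.

    Returns (newly infected count per time step, total nodes ever infected).
    Assumption: with a weight function, the infection probability of the
    edge (u, v) is p * weight(u, v).
    """
    rng = rng or random.Random(0)
    infected_at = {seed_node: 0}   # node -> time step of infection
    infectious = {seed_node}       # nodes that can still spread
    new_per_step = []
    step = 0
    while infectious and len(infected_at) < len(adj) and step < max_steps:
        step += 1
        newly = set()
        for u in sorted(infectious):
            for v in sorted(adj[u]):
                if v in infected_at or v in newly:
                    continue  # only susceptible nodes can be infected
                prob = p if weight is None else p * weight(u, v)
                if rng.random() < prob:
                    newly.add(v)
        for v in newly:
            infected_at[v] = step
        infectious |= newly
        if recovery_steps is not None:
            # SIR: nodes stop spreading after recovery_steps time steps.
            infectious = {u for u in infectious
                          if step - infected_at[u] < recovery_steps}
        new_per_step.append(len(newly))
    return new_per_step, len(infected_at)


# Toy path graph 0 - 1 - 2 - 3, seeded at node 0.
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
print(simulate(path, 0, p=0.2))                    # SI
print(simulate(path, 0, p=0.2, recovery_steps=2))  # SIR
```

The loop ends on a complete cascade (SI) or when no infectious node is left (SIR), which correspond to the two stopping conditions described for the simulation; `max_steps` is a safety bound added for the sketch.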

SIR Model Results
Model: SI
Function     W      W      W       0.2    0.2    0.2
Importance   Eigen  Page   Degree  Eigen  Page   Degree
Node         2160   1641   100     1868   3381   100
TimeStep     Freq
1
2            167    172    219     168    120    233
3            254    476    693     491    276    679
4            269    133    403     463    225    669
5            41     338    789     872    464    925
6            46     598    723     482    771    507
7            435    352    427     663    766    309
8            629    702    249     412    906    227
9            549    631    164     96     257    114
10           633    169    109     78     38     16
Later rows as transcribed (columns not aligned): 11 273 93 55 117 15 12 213 110 30 13 123 14 116 21 28 17 18 19 20

SIR Model Results

SIR Interpretation
Unweighted graph (p=0.2):
- Degree: steepest curve, infects fewer people
- EigenVector: steep curve, infects the most people
- Pagerank: grows slowest, infects more people
Weighted graph (p=0.2):
- Degree: steepest curve, infects fewer people
- EigenVector: grows slowest, infects the most people
- Pagerank: grows slowly, infects more people, better than eigenvector due to the weights

SI Model Results
Legend: 2160 – Weig, 1641 – Wpage, 100 – Wdegree; 1868 – Eig, 3381 – Page, 100 – Degree

Currently working on..
Quarantine Strategy:
- Choose the nodes to quarantine at a certain time interval such that they don't affect others
- Example: account blocked (reported as spam)
Vaccination Strategy:
- Choose the nodes to vaccinate, i.e. make them safe from certain viral content, such that the epidemic doesn't flow through them
- Example: spam filter
The objective is to minimize the cost of prevention and/or precaution with the aim of 'curing' the epidemic.
- Quarantine: we pick people based on the time they have been infected and their importance in the network
- Vaccination: we protect people based on their number of infected neighbors and their importance in the network
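One way to turn the vaccination criterion into a selection rule is sketched below. The slides name the two ingredients (number of infected neighbors and importance) but not how they are combined, so the multiplicative score, the function name `pick_vaccination_targets`, and the `budget` parameter are assumptions made purely for illustration.

```python
def pick_vaccination_targets(adj, infected, importance, budget):
    """Rank susceptible nodes by (infected neighbors * importance).

    Assumption: the product of the two criteria is used as the score;
    the slides do not specify the actual combination.
    """
    candidates = []
    for u in adj:
        if u in infected:
            continue  # only susceptible nodes can be vaccinated
        exposed = sum(1 for v in adj[u] if v in infected)
        if exposed:
            candidates.append((exposed * importance.get(u, 0.0), u))
    # Highest score first; ties broken by node ID for reproducibility.
    candidates.sort(key=lambda item: (-item[0], item[1]))
    return [u for _, u in candidates[:budget]]


# Toy graph with node 0 infected and a hypothetical importance score per node.
toy = {0: {1, 2}, 1: {0, 3}, 2: {0, 3}, 3: {1, 2, 4}, 4: {3}}
scores = {1: 0.5, 2: 1.0, 3: 2.0, 4: 1.0}
print(pick_vaccination_targets(toy, infected={0}, importance=scores, budget=1))
```

Any centrality measure from the earlier slides could supply the `importance` values; a quarantine rule could be sketched the same way by ranking infected nodes on time since infection instead of exposure.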

Communities in Facebook network
- Held together by some common interests and ideas of a large group of people in Facebook
- Any one person may be part of many communities, which form overlapping and nested structures
- Groups within social networks might correspond closely to social units or communities in reality
- A community is a subset of Facebook users within the graph such that connections between those users are denser than connections with the rest of the network
- One person has only one community

Reviews of Community Detection
Two methods for discovering groups in networks:
- Graph partitioning: splits the graph into a pre-fixed number of parts by minimizing the "cut edges"; heavy computational load (NP-hard)
- Community structure detection: suitable for the structure of large-scale network data; provides information on the topology of the network
The two approaches address the same question by somewhat different means.

Community Detection using igraph
Algorithm in igraph (optimal communities): basic framework
- Infomap (92 communities): compresses the description of information flows on the network
- Leading Eigenvector (18 communities): calculates the leading non-negative eigenvector of the modularity matrix of the graph and distributes the vertices by the sign of the eigenvector elements
- Label Propagation (57 communities): gives each vertex a unique label and updates it by majority voting among the neighbors of the vertex
- Multilevel (17 communities): reassigns nodes sequentially according to their contribution to modularity
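To illustrate the majority-voting idea behind label propagation and the modularity score that the leading-eigenvector and multilevel methods work with, here is a small standard-library sketch. It is not igraph's implementation: the function names, the tie-breaking rule (keep the current label when it is among the most frequent), and the toy graph are assumptions for the example.

```python
import random
from collections import Counter


def label_propagation(adj, rng=None, max_rounds=100):
    """Asynchronous label propagation with majority voting among neighbors."""
    rng = rng or random.Random(0)
    labels = {u: u for u in adj}  # every vertex starts with a unique label
    nodes = sorted(adj)
    for _ in range(max_rounds):
        rng.shuffle(nodes)
        changed = False
        for u in nodes:
            if not adj[u]:
                continue  # isolated vertices keep their own label
            counts = Counter(labels[v] for v in adj[u])
            top = max(counts.values())
            best = sorted(l for l, c in counts.items() if c == top)
            if labels[u] not in best:
                labels[u] = rng.choice(best)  # random tie-break
                changed = True
        if not changed:
            break
    return labels


def modularity(adj, labels):
    """Newman modularity of a partition of an undirected graph."""
    two_m = sum(len(neighbors) for neighbors in adj.values())
    internal = Counter()  # internal edge endpoints (each edge counted twice)
    degree = Counter()    # total degree per community
    for u, neighbors in adj.items():
        degree[labels[u]] += len(neighbors)
        internal[labels[u]] += sum(1 for v in neighbors
                                   if labels[v] == labels[u])
    return sum(internal[c] / two_m - (degree[c] / two_m) ** 2
               for c in degree)


# Toy graph: two triangles joined by the bridge 2 - 3.
bridged = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
           3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
found = label_propagation(bridged)
print(found, modularity(bridged, found))
```

Label propagation is randomized, so different seeds can return different partitions on the same graph; comparing the modularity of the results is one hedged way to choose among them.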

Thank You