1 Patterns of Cascading Behavior in Large Blog Graphs Jure Leskoves, Mary McGlohon, Christos Faloutsos, Natalie Glance, Matthew Hurst SDM 2007 Date:2008/8/21.

Slides:



Advertisements
Similar presentations
1 Dynamics of Real-world Networks Jure Leskovec Machine Learning Department Carnegie Mellon University
Advertisements

Modeling Blog Dynamics Speaker: Michaela Götz Joint work with: Jure Leskovec, Mary McGlohon, Christos Faloutsos Cornell University Carnegie Mellon University.
Toyota InfoTechnology Center U.S.A, Inc. 1 Mixture Models of End-host Network Traffic John Mark Agosta, Jaideep Chandrashekar, Mark Crovella, Nina Taft.
Lecture 21 Network evolution Slides are modified from Jurij Leskovec, Jon Kleinberg and Christos Faloutsos.
Patterns of Influence in a Recommendation Network Jure Leskovec, CMU Ajit Singh, CMU Jon Kleinberg, Cornell School of Computer Science Carnegie Mellon.
Finding Self-similarity in People Opportunistic Networks Ling-Jyh Chen, Yung-Chih Chen, Paruvelli Sreedevi, Kuan-Ta Chen Chen-Hung Yu, Hao Chu.
Power Laws: Rich-Get-Richer Phenomena
Advanced Topics in Data Mining Special focus: Social Networks.
On Power-Law Relationships of the Internet Topology Michalis Faloutsos Petros Faloutsos Christos Faloutsos.
Farnoush Banaei-Kashani and Cyrus Shahabi Criticality-based Analysis and Design of Unstructured P2P Networks as “ Complex Systems ” Mohammad Al-Rifai.
Web Graph Characteristics Kira Radinsky All of the following slides are courtesy of Ronny Lempel (Yahoo!)
4. PREFERENTIAL ATTACHMENT The rich gets richer. Empirical evidences Many large networks are scale free The degree distribution has a power-law behavior.
Masters Thesis Defense Amit Karandikar Advisor: Dr. Anupam Joshi Committee: Dr. Finin, Dr. Yesha, Dr. Oates Date: 1 st May 2007 Time: 9:30 am Place: ITE.
DYNAMICS OF PREFIX USAGE AT AN EDGE ROUTER Kaustubh Gadkari, Dan Massey and Christos Papadopoulos 1.
CMPT 855Module Network Traffic Self-Similarity Carey Williamson Department of Computer Science University of Saskatchewan.
On the Self-Similar Nature of Ethernet Traffic - Leland, et. Al Presented by Sumitra Ganesh.
Cascading Behavior in Large Blog Graphs Patterns and a Model Leskovec et al. (SDM 2007)
Alon Arad Alon Arad Hurst Exponent of Complex Networks.
Web as Graph – Empirical Studies The Structure and Dynamics of Networks.
CS Lecture 6 Generative Graph Models Part II.
Blogosphere  What is blogosphere?  Why do we need to study Blog-space or Blogosphere?
On Power-Law Relationships of the Internet Topology CSCI 780, Fall 2005.
Cascading Behavior in Large Blog Graphs: Patterns and a model offence.
Analysis of the Internet Topology Michalis Faloutsos, U.C. Riverside (PI) Christos Faloutsos, CMU (sub- contract, co-PI) DARPA NMS, no
A Measurement -driven Analysis of Information Propagation in the Flickr Social Network Offense April 29.
Decoding the Structure of the WWW : A Comparative Analysis of Web Crawls AUTHORS: M.Angeles Serrano Ana Maguitman Marian Boguna Santo Fortunato Alessandro.
Those who don’t know statistics are condemned to reinvent it… David Freedman.
Computer Science Sampling Biases in IP Topology Measurements John Byers with Anukool Lakhina, Mark Crovella and Peng Xie Department of Computer Science.
Analysis of Social Information Networks Thursday January 27 th, Lecture 3: Popularity-Power law 1.
Graphs and Topology Yao Zhao. Background of Graph A graph is a pair G =(V,E) –Undirected graph and directed graph –Weighted graph and unweighted graph.
1 Exploring Blog Networks Patterns and a Model for Information Propagation Mary McGlohon In collaboration with Jure Leskovec, Christos Faloutsos Natalie.
Computer Science 1 Web as a graph Anna Karpovsky.
A Measurement-driven Analysis of Information Propagation in the Flickr Social Network WWW09 报告人: 徐波.
On Power-Law Relationships of the Internet Topology.
Topic 13 Network Models Credits: C. Faloutsos and J. Leskovec Tutorial
Patterns And A Generative Model Jan 24, 2014 Authors: Jianwei Niu, Wanjiun Liao, Jing Peng, Chao Tong Presenter: Guoming Wang Published: Performance Computing.
Weighted Graphs and Disconnected Components Patterns and a Generator IDB Lab 현근수 In KDD 08. Mary McGlohon, Leman Akoglu, Christos Faloutsos.
1 Computing with Social Networks on the Web (2008 slide deck) Jennifer Golbeck University of Maryland, College Park Jim Hendler Rensselaer Polytechnic.
Information Diffusion Mary McGlohon CMU /23/10.
Data Analysis in YouTube. Introduction Social network + a video sharing media – Potential environment to propagate an influence. Friendship network and.
NOTES The Normal Distribution. In earlier courses, you have explored data in the following ways: By plotting data (histogram, stemplot, bar graph, etc.)
COLOR TEST COLOR TEST. Social Networks: Structure and Impact N ICOLE I MMORLICA, N ORTHWESTERN U.
A Graph-based Friend Recommendation System Using Genetic Algorithm
Selfishness, Altruism and Message Spreading in Mobile Social Networks September 2012 In-Seok Kang
Challenges and Opportunities Posed by Power Laws in Network Analysis Bruno Ribeiro UMass Amherst MURI REVIEW MEETING Berkeley, 26 th Oct 2011.
NTU Natural Language Processing Lab. 1 Investment and Attention in the Weblog Community Advisor: Hsin-Hsi Chen Speaker: Sheng-Chung Yen.
1 Graph mining techniques applied to blogs Mary McGlohon Seminar on Social Media Analysis- Oct
Sampling Techniques for Large, Dynamic Graphs Daniel Stutzbach – University of Oregon Reza Rejaie – University of Oregon Nick Duffield – AT&T Labs—Research.
LOGO Identifying Opinion Leaders in the Blogosphere Xiaodan Song, Yun Chi, Koji Hino, Belle L. Tseng CIKM 2007 Advisor : Dr. Koh Jia-Ling Speaker : Tu.
Du, Faloutsos, Wang, Akoglu Large Human Communication Networks Patterns and a Utility-Driven Generator Nan Du 1,2, Christos Faloutsos 2, Bai Wang 1, Leman.
R-MAT: A Recursive Model for Graph Mining Deepayan Chakrabarti Yiping Zhan Christos Faloutsos.
Mobility Models for Wireless Ad Hoc Network Research EECS 600 Advanced Network Research, Spring 2005 Instructor: Shudong Jin March 28, 2005.
RTM: Laws and a Recursive Generator for Weighted Time-Evolving Graphs Leman Akoglu, Mary McGlohon, Christos Faloutsos Carnegie Mellon University School.
1 Friends and Neighbors on the Web Presentation for Web Information Retrieval Bruno Lepri.
1 Finding Spread Blockers in Dynamic Networks (SNAKDD08)Habiba, Yintao Yu, Tanya Y., Berger-Wolf, Jared Saia Speaker: Hsu, Yu-wen Advisor: Dr. Koh, Jia-Ling.
Speaker : Yu-Hui Chen Authors : Dinuka A. Soysa, Denis Guangyin Chen, Oscar C. Au, and Amine Bermak From : 2013 IEEE Symposium on Computational Intelligence.
Near repeat burglary chains: describing the physical and network properties of a network of close burglary pairs. Dr Michael Townsley, UCL Jill Dando Institute.
1 Dr. Michael D. Featherstone Introduction to e-Commerce Network Theory 101.
The World Wine Web: The Structure of the Global Wine Trade
Topics In Social Computing (67810)
NetMine: Mining Tools for Large Graphs
Generative Model To Construct Blog and Post Networks In Blogosphere
Part 1: Graph Mining – patterns
R-MAT: A Recursive Model for Graph Mining
The likelihood of linking to a popular website is higher
Discovery of Blog Communities based on Mutual Awareness
Discovering Important Nodes through Graph Entropy
Advanced Topics in Data Mining Special focus: Social Networks
CPSC 641: Network Traffic Self-Similarity
Heterogeneous Graph Attention Network
Presentation transcript:

1 Patterns of Cascading Behavior in Large Blog Graphs Jure Leskoves, Mary McGlohon, Christos Faloutsos, Natalie Glance, Matthew Hurst SDM 2007 Date:2008/8/21 Advisor: Dr. Koh, Jia Ling Speaker: Li, Huei Jyun

2 Outline Introduction Preliminaries Experimental setup Observations, patterns and laws Temporal dynamics of posts and links Blog network and Post network topology Patterns in the cascades Conclusions

3 Introduction By examining linking patterns from one blog post to another, we can infer the way information spreads through a social network over the web Does traffic in the network exhibit bursty and/or periodic behavior? After a topic becomes popular, how does interest die off? – linearly, or exponentially?

4 Introduction We would also like to discover topological patterns in information propagation graphs (cascades) Do graphs of information cascades have common shapes? What are their properties? What are characteristic in-link patterns for different nodes in a cascade What can we say about the size distribution of cascades?

5 Preliminaries Blogs (weblogs) are web sites that are updated on a regular basis Blogs are composed of posts that typically have room for comments by readers Blogs and posts typically link each other, as well as other resources on the web

6 Preliminaries

7 From the Post network, we extract information cascades

8 Preliminaries Cascades form two main shapes, which we refer to as stars and chains A star occurs when a single center post is linked by several other posts, but the links do not propagate further This produces a wide, shallow tree

Preliminaries A chain occurs when a root is linked by a single post, which in turn is linked by another post This creates a deep tree that has little breadth 9

10 Experimental setup We are interested in blogs and posts that actively participate in discussions, so we biased our dataset towards the more active part of the blogsphere Focused on the most-cited blogs and traced forward and backward conversation trees containing these blogs This process produced a dataset of 2,422,704 posts from 44,362 blogs gathered over the two-month period(August and September 2005) There are 245,404 links among the posts of our dataset

11 Experimental setup Reduced time resolution to one day Removed edges pointing to webpages outside the dataset and to posts supposedly written in the future Removed links where a post pointed to itself (although a link to a previous post in the same blog was allowed)

12 Observations, patterns, and laws Temporal dynamics of posts and links Traffic in the blogsphere is not uniform Posting and blog-to-blog linking patterns tend to have a weekend effect, with frequency sharply dropping off at weekends

Observations, patterns, and laws Examine how a post’s popularity grows and declines over time Collect all in-links to a post and plot the number of links occurring after each day following the post 13

Observations, patterns, and laws The weekend effect creates abnormalities Smooth the in-link plots by applying a weighting parameter to the plots separated by day of week For each delay △ on the horizontal axis, we estimate the corresponding day of week d, and we prorate the count for △ by dividing it by l(d)  l(d) is the percent of blog links occurring on day of week d 14

Observations, patterns, and laws Fit the power-law distribution with a cut-off in the tail Fit on 30 days of data, as most posts in the graph have complete in-link patterns for the 30 days following publication Found a stable power-law exponent of around

Observations, patterns, and laws Blog network and Post network topology Blog network’s topology 16

Observations, patterns, and laws Blog network and Post network topology Blog network’s topology 17

Observations, patterns, and laws Blog network and Post network topology Post network’s topology 18

Observations, patterns, and laws Patterns in the cascades Found all cascade initiator nodes, i.e., nodes that have zero out-degree, and started following their in-links This process gives us a directed acyclic graph with a single root node 19

Observations, patterns, and laws Common cascade shapes A node represents a post and the influence flows from the top to the bottom Cascades tend to be wide and not too deep – stars and shallow bursty cascades are the most common type of cascades 20

Observations, patterns, and laws Cascade topological properties What is the common topological pattern in the cascades? From the Post network we extract all the cascades and measure the overall degree distribution 21

Observations, patterns, and laws The in-degree exponent is stable and does not change much given level L in the cascade A node is at level L if it is L hops away from the cascade initiator Posts still attract attention even if they are some what late in the cascade and appear towards the bottom of it 22

Observations, patterns, and laws What distribution do cascade sizes follow? Does the probability of observing a cascade on n nodes decreases exponentially with n? Examine the Cascade Size Distributions over the bag of cascades extracted from the Post network 23

Observations, patterns, and laws All follow a heavy-tailed distribution, with slopes ≒ -2 overall The probability of observing a cascade on n nodes follows a Zipf distribution: Stars have the power-law exponent ≒ -3.1 Chains are small and rare and decay with exponent ≒

Conclusion Trying to find how blogs behave and how information propagates through the blogsphere Temporal patterns: The decline of a post’s popularity follows a power-law, rather than a exponential dropoff as might be expected 25

Conclusion Topological patterns: Almost any metric we examined follows a power law: size of cascades, size of blogs, in- and out-degrees Stars and chains are basic components of cascades, with stars being more common 26