Social Network Analysis


Social Network Analysis
- Social Network Introduction
- Statistics and Probability Theory
- Models of Social Network Generation
- Networks in Biological System
November 21, 2018 Data Mining: Concepts and Techniques

Six Degrees of Separation
- Society
  - Nodes: individuals
  - Links: social relationships (family/work/friendship/etc.)
- S. Milgram (1967)
- “Six Degrees of Separation”, John Guare
- Social networks: many individuals with diverse social interactions between them.

Communication networks
- The Earth is developing an electronic nervous system, a network whose diverse nodes and links are:
  - computers
  - routers
  - satellites
  - phone lines
  - TV cables
  - EM waves
- Communication networks: many non-identical components with diverse connections between them.

Complex systems
New York Times
- Made of many non-identical elements connected by diverse interactions.
- NETWORK

“Natural” Networks and Universality
- Consider many kinds of networks: social, technological, business, economic, content, …
- These networks tend to share certain informal properties:
  - large scale; continual growth
  - distributed, organic growth: vertices “decide” who to link to
  - interaction restricted to links
  - mixture of local and long-distance connections
  - abstract notions of distance: geographical, content, social, …
- Do natural networks share more quantitative universals?
- What would these “universals” be?
- How can we make them precise and measure them?
- How can we explain their universality?
- This is the domain of social network theory, sometimes also referred to as link analysis.

Some Interesting Quantities
- Connected components:
  - how many, and how large?
- Network diameter:
  - maximum (worst-case) or average?
  - exclude infinite distances? (disconnected components)
  - the small-world phenomenon
- Clustering:
  - to what extent do links tend to cluster “locally”?
  - what is the balance between local and long-distance connections?
  - what roles do the two types of links play?
- Degree distribution:
  - what is the typical degree in the network?
  - what is the overall distribution?
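The quantities listed above can be computed directly on a small graph. The following is a minimal sketch, assuming an undirected graph stored as a dictionary of neighbor sets; the function names are illustrative choices and not part of the original slides. Infinite distances between disconnected components are excluded, which is one of the options the slide raises.

```python
from collections import Counter, deque


def bfs_distances(adj, source):
    """Return hop distances from source to every vertex reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def connected_components(adj):
    """Return the connected components as a list of vertex sets."""
    seen, components = set(), []
    for u in adj:
        if u not in seen:
            component = set(bfs_distances(adj, u))
            seen |= component
            components.append(component)
    return components


def diameter_and_average_distance(adj):
    """Worst-case and average distance over pairs at finite distance only."""
    finite = [d for u in adj
              for v, d in bfs_distances(adj, u).items() if v != u]
    if not finite:  # graph without edges: no finite distances to report
        return 0, 0.0
    return max(finite), sum(finite) / len(finite)


def degree_distribution(adj):
    """Map each degree value to the number of vertices having that degree."""
    return Counter(len(nbrs) for nbrs in adj.values())


if __name__ == "__main__":
    # Toy graph (an assumption for illustration): a path 0-1-2-3 plus an edge 4-5.
    toy = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}, 4: {5}, 5: {4}}
    print(len(connected_components(toy)))
    print(diameter_and_average_distance(toy))
    print(degree_distribution(toy))
```

Running one breadth-first search per vertex is only practical for small graphs; for large networks the diameter is usually estimated from a sample of source vertices.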

A “Canonical” Natural Network has…
- Few connected components:
  - often only 1 or a small number, independent of network size
- Small diameter:
  - often a constant independent of network size (like 6)
  - or perhaps growing only logarithmically with network size, or even shrinking?
  - typically exclude infinite distances
- A high degree of clustering:
  - considerably more so than for a random network
  - in tension with small diameter
- A heavy-tailed degree distribution:
  - a small but reliable number of high-degree vertices
  - often of power law form

Probabilistic Models of Networks
- All of the network generation models we will study are probabilistic or statistical in nature
- They can generate networks of any size
- They often have various parameters that can be set:
  - size of network generated
  - average degree of a vertex
  - fraction of long-distance connections
- The models generate a distribution over networks
- Statements are always statistical in nature:
  - with high probability, diameter is small
  - on average, degree distribution has heavy tail
- Thus, we’re going to need some basic statistics and probability theory

Zipf’s Law
- Look at the frequency of English words:
  - “the” is the most common, followed by “of”, “to”, etc.
  - claim: frequency of the n-th most common word ~ 1/n (power law, α = 1)
- General theme:
  - rank events by their frequency of occurrence
  - resulting distribution often is a power law!
- Other examples:
  - North American city sizes
  - personal income
  - file sizes
  - genus sizes (number of species)
- People seem to dither over the exact form of these distributions (e.g. the value of α), but not over their heavy tails
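The rank-frequency idea can be tried on any token list. The sketch below ranks tokens by frequency and estimates α with an ordinary least-squares fit of log frequency against log rank. The fitting method and the function names are assumptions made for illustration; the slides do not prescribe an estimator, and a log-log least-squares fit is known to be a crude one.

```python
import math
from collections import Counter


def rank_frequencies(tokens):
    """Return token counts sorted from most to least frequent."""
    return [count for _, count in Counter(tokens).most_common()]


def estimate_alpha(frequencies):
    """Estimate alpha in frequency ~ rank**(-alpha) by a log-log line fit.

    Needs at least two strictly positive frequencies, listed in rank order.
    """
    xs = [math.log(rank) for rank in range(1, len(frequencies) + 1)]
    ys = [math.log(f) for f in frequencies]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    return -covariance / variance  # the fitted slope is -alpha


if __name__ == "__main__":
    # Synthetic frequencies following the claimed 1/n law for 300 ranks.
    ideal = [1000.0 / n for n in range(1, 301)]
    print(estimate_alpha(ideal))
    print(rank_frequencies("the of the to the of".split()))
```

On real text the estimate depends on how many low-frequency ranks are included, which is one reason the exact value of α is debated while the heavy tail is not.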

Zipf’s Law
- The same data plotted on linear and logarithmic scales. Both plots show a Zipf distribution with 300 datapoints.
- [Figure panels: linear scales on both axes; logarithmic scales on both axes]
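A short derivation shows why the logarithmic plot is the informative one. Here C is a normalizing constant introduced for illustration; f(n) is the frequency of the n-th ranked item and α is the power-law exponent from the previous slide.

```latex
\[
  f(n) = C\,n^{-\alpha}
  \quad\Longrightarrow\quad
  \log f(n) = \log C - \alpha \log n
\]
```

On logarithmic axes a power law therefore appears as a straight line with slope -α, while on linear axes the same data hug both axes.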

Social Network Analysis
- Social Network Introduction
- Statistics and Probability Theory
- Models of Social Network Generation
- Networks in Biological System
- Summary

Some Models of Network Generation
- Random graphs (Erdős-Rényi models):
  - give few components and small diameter
  - do not give high clustering or heavy-tailed degree distributions
  - are the mathematically most well-studied and understood models
- Watts-Strogatz models:
  - give few components, small diameter and high clustering
  - do not give heavy-tailed degree distributions
- Scale-free networks:
  - give few components, small diameter and heavy-tailed distribution
  - do not give high clustering
- Hierarchical networks:
  - few components, small diameter, high clustering, heavy-tailed
- Affiliation networks:
  - model group-actor formation

The Clustering Coefficient of a Network
- Let nbr(u) denote the set of neighbors of u in a graph
  - all vertices v such that the edge (u,v) is in the graph
- The clustering coefficient of u:
  - let k = |nbr(u)| (i.e., number of neighbors of u)
  - choose(k,2): max possible # of edges between vertices in nbr(u)
  - c(u) = (actual # of edges between vertices in nbr(u))/choose(k,2)
  - 0 <= c(u) <= 1; measure of cliquishness of u’s neighborhood
- Clustering coefficient of a graph:
  - average of c(u) over all vertices u
- Example: k = 4, choose(k,2) = 6, c(u) = 4/6 = 0.666…
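The definition above translates almost line by line into code. This is a minimal sketch, assuming an undirected graph stored as a dictionary of neighbor sets. Setting c(u) = 0 for vertices with fewer than two neighbors is a convention assumed here, since choose(k,2) is zero in that case and the slide does not cover it. The example graph is a hypothetical one built to reproduce the slide's k = 4 case with 4 edges among the neighbors.

```python
from itertools import combinations


def clustering_coefficient(adj, u):
    """c(u): fraction of pairs of u's neighbors that are themselves linked."""
    nbrs = adj[u]
    k = len(nbrs)
    if k < 2:
        return 0.0  # assumed convention: choose(k, 2) is 0, so define c(u) = 0
    links = sum(1 for v, w in combinations(nbrs, 2) if w in adj[v])
    return links / (k * (k - 1) / 2)


def average_clustering(adj):
    """Clustering coefficient of the graph: mean of c(u) over all vertices."""
    return sum(clustering_coefficient(adj, u) for u in adj) / len(adj)


if __name__ == "__main__":
    # u has neighbors a, b, c, d; the neighbors form the cycle a-b-c-d-a.
    example = {
        "u": {"a", "b", "c", "d"},
        "a": {"u", "b", "d"},
        "b": {"u", "a", "c"},
        "c": {"u", "b", "d"},
        "d": {"u", "a", "c"},
    }
    print(clustering_coefficient(example, "u"))
    print(average_clustering(example))
```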

The Clustering Coefficient of a Network
- Clustering: My friends will likely know each other!
- Probability to be connected: C ≫ p
- C = (# of links between the 1, 2, …, n neighbors) / (n(n-1)/2)
- Networks are clustered [large C(p)] but have a small characteristic path length [small L(p)].

Erdős-Rényi: Clustering Coefficient
- Generate a network G according to G(N,p)
- Examine a “typical” vertex u in G
  - choose u at random among all vertices in G
  - what do we expect c(u) to be?
- Answer: exactly p!
- In G(N,m), expect c(u) to be 2m/(N(N-1))
- Both cases: c(u) entirely determined by overall density
- Baseline for comparison with “more clustered” models
  - Erdős-Rényi has no bias towards clustered or local edges
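The claim that the expected clustering coefficient equals p can be checked empirically. The sketch below samples one graph from G(N,p) and averages c(u) over all vertices. The function names, the seed, and the parameter values N = 200, p = 0.1 are assumptions chosen for illustration, and vertices with fewer than two neighbors are counted as c(u) = 0 by convention.

```python
import random
from itertools import combinations


def gnp_random_graph(n, p, seed=None):
    """Sample G(N,p): each of the n*(n-1)/2 possible edges appears with prob. p."""
    rng = random.Random(seed)
    adj = {u: set() for u in range(n)}
    for u, v in combinations(range(n), 2):
        if rng.random() < p:
            adj[u].add(v)
            adj[v].add(u)
    return adj


def average_clustering(adj):
    """Mean of c(u) over all vertices; c(u) = 0 when u has fewer than 2 neighbors."""
    total = 0.0
    for u, nbrs in adj.items():
        k = len(nbrs)
        if k >= 2:
            links = sum(1 for v, w in combinations(nbrs, 2) if w in adj[v])
            total += links / (k * (k - 1) / 2)
    return total / len(adj)


if __name__ == "__main__":
    n, p = 200, 0.1
    graph = gnp_random_graph(n, p, seed=0)
    edges = sum(len(nbrs) for nbrs in graph.values()) // 2
    print("edge density:", 2 * edges / (n * (n - 1)))
    print("average clustering:", average_clustering(graph))
```

Both printed values should be close to p, which is what makes Erdős-Rényi a baseline: any clustering well above the edge density indicates a bias towards local edges.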

Scale-free Networks
- The number of nodes (N) is not fixed
  - Networks continuously expand by the addition of new nodes
    - WWW: addition of new nodes
    - Citation: publication of new papers
- The attachment is not uniform
  - A node is linked with higher probability to a node that already has a large number of links
    - WWW: new documents link to well-known sites (CNN, Yahoo, Google)
    - Citation: well-cited papers are more likely to be cited again

Scale-Free Networks
- Start with (say) two vertices connected by an edge
- For i = 3 to N:
  - for each 1 <= j < i, d(j) = degree of vertex j so far
  - let Z = Σ d(j) (sum of all degrees so far)
  - add new vertex i with k edges back to {1, …, i-1}:
    - i is connected back to j with probability d(j)/Z
- Vertices j with high degree are likely to get more links!
- “Rich get richer”
- Natural model for many processes:
  - hyperlinks on the web
  - new business and social contacts
  - transportation networks
- Generates a power law distribution of degrees
  - exponent depends on value of k
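The growth procedure above can be sketched in a few lines. The slide does not say how the k edges of a new vertex are drawn, so this version assumes k independent draws from the degree-proportional distribution, with repeated targets collapsed into a single edge; a new vertex may therefore end up with fewer than k edges. The function name and the parameters used in the demo are illustrative.

```python
import random
from collections import Counter


def preferential_attachment(n, k, seed=None):
    """Grow a graph on vertices 1..n by degree-proportional attachment."""
    rng = random.Random(seed)
    adj = {1: {2}, 2: {1}}  # start with two vertices connected by an edge
    # Vertex j appears d(j) times in this list, so a uniform draw from it
    # selects j with probability d(j)/Z.
    endpoints = [1, 2]
    for i in range(3, n + 1):
        # All k draws use the degrees "so far", before vertex i is added.
        targets = {rng.choice(endpoints) for _ in range(k)}
        adj[i] = set()
        for j in sorted(targets):
            adj[i].add(j)
            adj[j].add(i)
            endpoints.extend((i, j))
    return adj


if __name__ == "__main__":
    graph = preferential_attachment(2000, 2, seed=42)
    histogram = Counter(len(nbrs) for nbrs in graph.values())
    for degree in sorted(histogram)[:10]:
        print(degree, histogram[degree])
    print("max degree:", max(histogram))
```

Keeping a list of edge endpoints avoids recomputing Z and the individual probabilities d(j)/Z at every step, at the cost of memory proportional to the number of edges.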

Scale-Free Networks
- Preferential attachment explains
  - heavy-tailed degree distributions
  - small diameter (~log(N), via “hubs”)
- Will not generate high clustering coefficient
  - no bias towards local connectivity, but towards hubs

Social Network Analysis
- Social Network Introduction
- Statistics and Probability Theory
- Models of Social Network Generation
- Networks in Biological System
- Mining on Social Network
- Summary

Bio-Map
[Figure: layers of cellular networks]
- GENOME: protein-gene interactions
- PROTEOME: protein-protein interactions
- METABOLISM: bio-chemical reactions (Citrate Cycle)

Metabolic Network
- METABOLISM: bio-chemical reactions (Citrate Cycle)

Boehringer-Mannheim

Metabolic Network
- Nodes: chemicals (substrates)
- Links: bio-chemical reactions

Metabolic Network: P(k)
- [Figure panels: Archaea, Bacteria, Eukaryotes]
- The metabolic networks of organisms from all three domains of life are scale-free!
- H. Jeong, B. Tombor, R. Albert, Z.N. Oltvai, and A.L. Barabasi, Nature 407, 651 (2000)

Bio-Map
[Figure: layers of cellular networks]
- GENOME: protein-gene interactions
- PROTEOME: protein-protein interactions
- METABOLISM: bio-chemical reactions (Citrate Cycle)

Protein Network
- PROTEOME: protein-protein interactions

Yeast Protein Network: interaction map
- Nodes: proteins
- Links: physical interactions (binding)
- P. Uetz, et al. Nature 403, 623-7 (2000).

Topology of the Protein Network: P(k)
- H. Jeong, S.P. Mason, A.-L. Barabasi, Z.N. Oltvai, Nature 411, 41-42 (2001)

p53 Network
- Nature 408, 307 (2000)
- … “One way to understand the p53 network is to compare it to the Internet. The cell, like the Internet, appears to be a ‘scale-free network’.”

p53 Network (mammals): P(k)