Small World Networks. Jean Vaucher, IFT6802, April 2005.
Contents: pertinence of the topic; characterization of networks (regular, random or natural); properties of networks (diameter, clustering coefficient); Watts's network models (alpha and beta); power-law networks; clustered networks with short paths; can these short paths be found?
Duncan J. Watts, Six Degrees: The Science of a Connected Age, W.W. Norton, 2003. "I read somewhere that everybody on this planet is separated by only six other people. Six degrees of separation between us and everybody on this planet." (Six Degrees of Separation, John Guare)
Networks: networks are everywhere (the Internet, neurons in brains, social networks, transportation). Networks have been studied for a long time: Euler (1736), the bridges of Königsberg. Graph theory is now a major (and difficult, or almost obvious) branch of mathematics.
So what is new? Global interconnections: the Internet, power grids, mass travel, mass culture. Failures: computer viruses, power blackouts, epidemics. Modeling and analysis.
Milgram's experiment: found short chains of acquaintances linking pairs of people in the USA who did not know each other. Source person in Nebraska, target person in Massachusetts. Each participant forwarded the message to someone they knew personally who should be closer to the target. The average length of the completed chains was between 5 and 6 steps: the "six degrees of separation" principle.
The correct question: WHY are there short chains of acquaintances linking arbitrary pairs of strangers? Or: why is this surprising?
Random networks: in a random network, if everybody has 100 friends distributed randomly in the world population, this isn't strange. In 6 hops you can reach 100^6 = 10^12 people: a million million > 6,000 million (world population). BUT our social networks tend to be clustered.
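The arithmetic on this slide can be checked with a short sketch. It assumes, as the slide does, that no two friend lists overlap, so the counts are upper bounds; the function names are illustrative only.

```python
def reachable(friends: int, hops: int) -> int:
    """Upper bound on people reachable in `hops` steps when everyone has
    `friends` acquaintances and no two friend lists overlap (an assumption)."""
    return friends ** hops


def hops_needed(friends: int, population: int) -> int:
    """Smallest number of hops whose upper bound covers `population`."""
    hops = 0
    while reachable(friends, hops) < population:
        hops += 1
    return hops


WORLD_POP = 6_000_000_000  # about 6,000 million, as on the slide

print(reachable(100, 6))            # 1000000000000
print(hops_needed(100, WORLD_POP))  # 5
```

Under this no-overlap assumption even 5 hops would suffice; clustering is exactly what breaks the assumption.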
Social networks: not random but clustered. Most of our friends come from our geographical or professional neighbourhood, and our friends tend to have the same friends. BUT in spite of this clustering, there seem to exist short paths between any two random nodes.
Social network research: devise various classes of networks and study their properties.
Network parameters: network type (regular, random, natural); size (number of nodes); number of connections (average and distribution); selection of neighbours.
Regular network topologies (figure): star, tree, grid, bus, ring.
Connectivity in random graphs: nodes are connected by links in a purely random fashion. How large is the largest connected component, as a fraction of all nodes? It depends on the number of links per node (Erdős and Rényi, 1959).
Connecting nodes (figure sequence, random network stages 1 to 4): as random links are added, isolated nodes first form paths, then trees, then networks, until the graph is fully connected.
Connectivity of a random graph (figure): fraction of all nodes in the largest component (0 to 1) against the average number of links per node, with a disconnected phase below 1 link per node and a connected phase above it.
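The transition between the two phases can be reproduced with a small simulation. This is a minimal sketch, not the construction used for the slides: links are drawn between uniformly random pairs (so occasional self-links and duplicates are tolerated), and the function name and parameters are illustrative.

```python
import random


def largest_component_fraction(n: int, mean_degree: float, seed: int = 0) -> float:
    """Fraction of nodes in the largest connected component of a random
    graph with n nodes and about n * mean_degree / 2 random links."""
    rng = random.Random(seed)
    parent = list(range(n))  # union-find forest, one tree per component

    def find(x: int) -> int:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for _ in range(int(n * mean_degree / 2)):
        a, b = rng.randrange(n), rng.randrange(n)
        parent[find(a)] = find(b)  # merge the two components

    sizes = {}
    for node in range(n):
        root = find(node)
        sizes[root] = sizes.get(root, 0) + 1
    return max(sizes.values()) / n


for z in (0.5, 1.0, 2.0, 4.0):
    print(z, round(largest_component_fraction(5000, z), 3))
```

Below one link per node the largest component stays a tiny fraction of the graph; above it the fraction rises quickly toward 1.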
Regular or ordered network (figure).
Network measures: connectivity is not the main measure. Characteristic path length (L): the average length of the shortest path connecting each pair of agents (nodes). Clustering coefficient (C): a measure of local interconnection. If agent i has k_i immediate neighbours, C_i is the fraction of the k_i (k_i - 1) / 2 possible connections between i's neighbours that are realized; C is the average of the C_i. Diameter: the maximum shortest-path length.
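The three measures can be computed directly from their definitions. The sketch below assumes an undirected graph stored as a dictionary mapping each node to the set of its neighbours; that representation and the function names are choices made for this example.

```python
from collections import deque


def clustering_coefficient(adj: dict) -> float:
    """Average over nodes of C_i, the fraction of the k_i*(k_i-1)/2 possible
    links among i's neighbours that exist (C_i counts as 0 when k_i < 2)."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue
        nb = list(nbrs)
        links = sum(1 for a in range(k) for b in range(a + 1, k)
                    if nb[b] in adj[nb[a]])
        total += links / (k * (k - 1) / 2)
    return total / len(adj)


def path_lengths(adj: dict):
    """Return (L, diameter) over all connected pairs, by BFS from every node."""
    total = pairs = diameter = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
        diameter = max(diameter, max(dist.values()))
    return total / pairs, diameter


# Hypothetical example: a ring of 12 nodes, each linked to its 2 nearest
# neighbours on either side (k = 4).
n = 12
ring = {i: {(i + d) % n for d in (-2, -1, 1, 2)} for i in range(n)}
print(clustering_coefficient(ring), path_lengths(ring))
```

Running BFS from every node is fine for small graphs; it costs on the order of (nodes x links) steps.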
Regular vs random networks:
                                                 Regular          Random
Average number of connections per node           few, clustered   fewer, spread
Diameter                                         large            moderate
Number of connections needed to fully connect    many             fewer (<2/3)
Natural networks: between regular grids and totally random graphs. Need for parametrized models: regular -> natural -> random. Watts's alpha model (not intuitive); beta rewiring model.
Clustering: clustering measures the fraction of a node's neighbours that are themselves connected. Regular graphs have a high clustering coefficient but also a high diameter; random graphs have a low clustering coefficient but a low diameter. Neither model matches the properties expected from real networks! Random graph (k = 4): short path length, L ~ log_k N; almost no clustering, C ~ k/n. Regular graph (k = 4): long paths, L ~ n/(2k); highly clustered, C ~ 3/4 (the large-k limit). The base network is a ring.
Small-world networks: random rewiring of a regular graph (Watts and Strogatz). With probability p (or β), rewire each link in a regular graph to a randomly selected node. The resulting graph has properties of both regular and random graphs: high clustering and short path length. FreeNet has been shown to result in small-world graphs.
Example: 4096-node ring. Regular graph (n nodes, k nearest neighbours): path length ~ n/(2k), here 4096/16 = 256 (that is, k = 8). Random graph: path length ~ log(n)/log(k) ~ 4. Rewired graph (1% of nodes): path length ~ random graph, clustering ~ regular graph. Figure: small-world graph with k = 4.
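The rewiring experiment can be tried on a smaller ring. This is a minimal sketch in the spirit of the Watts and Strogatz procedure, not their exact algorithm: a rewiring attempt that would create a self-link or a duplicate link is simply skipped, and the sizes (400 nodes, k = 8, p = 0.05) are arbitrary choices for a quick run.

```python
import random
from collections import deque


def ring_lattice(n, k):
    """Ring of n nodes, each linked to its k nearest neighbours (k even)."""
    return {i: {(i + d) % n for d in range(-k // 2, k // 2 + 1) if d != 0}
            for i in range(n)}


def rewire(adj, p, rng):
    """With probability p, move one end of each link to a random node."""
    new = {i: set(nbrs) for i, nbrs in adj.items()}
    nodes = list(new)
    for i, j in [(i, j) for i in adj for j in adj[i] if i < j]:
        if rng.random() < p:
            t = rng.choice(nodes)
            if t != i and t not in new[i]:  # skip invalid targets
                new[i].discard(j)
                new[j].discard(i)
                new[i].add(t)
                new[t].add(i)
    return new


def mean_path_length(adj):
    """Characteristic path length L over connected pairs (BFS from each node)."""
    total = pairs = 0
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs


def clustering(adj):
    """Clustering coefficient C: average fraction of linked neighbour pairs."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k > 1:
            links = sum(len(adj[a] & nbrs) for a in nbrs) / 2
            total += links / (k * (k - 1) / 2)
    return total / len(adj)


rng = random.Random(1)
regular = ring_lattice(400, 8)
small_world = rewire(regular, 0.05, rng)
for name, g in (("regular", regular), ("rewired", small_world)):
    print(name, round(mean_path_length(g), 2), round(clustering(g), 3))
```

With only a few percent of the links rewired, L should fall sharply while C stays close to its value on the regular ring.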
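Small-world networks, beta network (figure): path length L and clustering coefficient C against the rewiring probability.
More exactly (figure, p = β): small-world behaviour appears where L has already dropped while C is still high.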
Effect of short-cuts: a few short-cuts have a huge effect. The first 5 rewirings reduce the path length by half, regardless of the size of the network; a further 50% gain requires 50 more short-cuts.
The strength of weak ties. Granovetter (1973): effective social coordination does not arise from densely interlocking strong ties but derives from occasional weak ties, because valuable information comes through these relations (it is valuable because it is not available to the other individuals in your immediate network).
Two ways of constructing (figure).
Alpha model: Watts's first model (1999), inspired by Asimov's Robot novels (R. Daneel Olivaw, Elijah Baley): the Caves of Steel (Earth) and Solaria.
Two extreme types of social networks. Caveman's world: people live in isolated communities; the probability of meeting a random person is high if you have mutual friends and very low if you don't. Solaria: people live isolated from each other but with supreme communication capabilities; your social history is irrelevant to your future.
Alpha network: alpha (α) is the distance parameter. α = 0: if A and B have a friend in common, they know each other (caveman world). α = ∞: A and B don't know each other, no matter how many common friends they have (Solarian world).
Figure: likelihood that A meets B against the number of mutual friends shared by A and B, with curves for α = 0 (caveman world), α = 1 and α = ∞ (Solaria world).
Alpha network (figure): path length L and clustering coefficient C against α, with a region of fragmented networks and a region of small-world networks around the critical α. L drops because we only count nodes that are connected.
How about real networks? All nodes in alpha and beta networks are equal, in the sense that the number of connections of each node is not very far from the average (Watts and Strogatz had used a normal distribution). The real world is not like that: sizes of cities, wealth of individuals in the USA, hubs in transportation systems. Barabási and Albert (1999): scale-free networks, whose connectivity is defined by a power-law distribution.
Random networks: each node is connected to a few other nodes. The number of connections per node follows a Poisson distribution, with a small average number of connections per node. (This and the three following graphics are from Linked: The New Science of Networks, Albert-László Barabási, 2002.)
Scale-free networks: each node is connected to at least one other; most are connected to only one, while a few are connected to many. The number of connections per node follows a hyperbolic distribution, with no meaningful average number of connections per node.
Random vs scale-free (figure). Scale-free networks are associated with networks that grow by "natural" processes, in which the number of nodes increases with time, not just the number of connections.
Power-law phenomena: average and median are far apart (whales and minnows). The average comes from a few large nodes; the median is governed by the majority of small nodes.
Performance: real power-law networks also have short distances, owing to a central backbone of highly connected hub nodes. Similar phenomena have been noted in linguistics (Zipf) and economics (Pareto).
Zipf's law (linguistics): Zipf, a Harvard linguistics professor, sought to determine the frequency of use of the 3rd, 8th or 100th most common word in English text. Zipf's law states that the frequency y of a word is inversely proportional to its rank r: y ~ r^(-b), with b close to unity.
The Pareto income distribution: the Pareto distribution gives the probability that a person's income is greater than or equal to x and is expressed as P(X >= x) = (m/x)^k for x >= m, where m > 0 is the minimum income and k > 0 is the exponent.
Vilfredo Pareto, Italian economist: born in Paris; Polytechnic Institute in Turin, 1869; worked for the railroads. Pareto did not study economics seriously until he was 42. In 1893 he succeeded his mentor, Walras, as chair of economics at the University of Lausanne.
Pareto's contributions. Pareto optimality: a Pareto-optimal allocation of resources is achieved when it is not possible to make anyone better off without making someone else worse off. Pareto's law of income distribution: in 1906, Pareto created a mathematical formula to describe the unequal distribution of wealth in his country, observing that 20% of the people owned 80% of the wealth.
Pareto distribution, m = 10000, k = 1 (figure: log-log plot). The Pareto distribution is said to be scale-free because it lacks a characteristic length scale.
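What "no characteristic scale" means can be shown numerically. The sketch below assumes the standard Pareto form P(X >= x) = (m/x)^k for x >= m, with the slide's values m = 10000 and k = 1 as defaults; the function name is illustrative.

```python
def pareto_survival(x: float, m: float = 10000.0, k: float = 1.0) -> float:
    """P(X >= x) for a Pareto distribution with minimum m and exponent k
    (standard form, assumed here)."""
    return 1.0 if x < m else (m / x) ** k


# Doubling the income divides the probability by 2**k whatever the starting
# income, which is why the log-log plot is a straight line.
for x in (20_000, 200_000, 2_000_000):
    print(x, pareto_survival(2 * x) / pareto_survival(x))
```

A distribution with a characteristic scale, such as an exponential, would give a ratio that depends on x.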
Building power-law networks: it is easy to create power-law (PL) networks. Build the network node by node, connecting each new node to an existing node with probability proportional to that node's number of links. The rich get richer; the poor get poorer.
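The growth rule fits in a few lines. This is a minimal sketch in which each new node brings exactly one link and the network starts from two linked nodes; both choices are assumptions made for the example.

```python
import random
from collections import Counter


def preferential_attachment(n: int, seed: int = 0) -> list:
    """Grow a network node by node; each new node links to one existing node
    chosen with probability proportional to that node's number of links.
    Returns the list of node degrees."""
    rng = random.Random(seed)
    degree = [1, 1]     # start from two nodes joined by one link (assumption)
    endpoints = [0, 1]  # each node appears once per link it has
    for new in range(2, n):
        target = rng.choice(endpoints)  # degree-proportional choice
        degree[target] += 1
        degree.append(1)
        endpoints += [target, new]
    return degree


degree = preferential_attachment(20000)
counts = Counter(degree)
print("max degree:", max(degree))
print("share of nodes with one link:", counts[1] / len(degree))
```

Picking uniformly from the list of link endpoints is what makes the choice proportional to degree: a node with ten links appears ten times in the list.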
Structure and dynamics, the case of centrality: centers are in networks by design (central control, dictatorship) or by non-design (unnoticed critical resources, informal groups), or they emerge as a consequence of certain events ("he was at the right place at the right time"; clapping in unison).
Further applications. Search in networks: short paths are not enough. Epidemics (medical and software): danger of short-cuts; paths + infectiousness. Infection by ideas: fads and economic bubbles; individual rationality; peer pressure.
Getting practical, search in networks: a node may be linked to another node via a short path, but what does it matter if you cannot find the path? In alpha and beta networks there is no notion of distance, so directed searches cannot recognize shortcuts. Kleinberg's (gamma) networks (2000).
Kleinberg's small-world model: embed the graph in a grid (2D in the examples). Each node has a constant number p of short-range links (its neighbourhood) and q long-range links, chosen such that the probability of a long-range contact at distance d is proportional to 1/d^r. Importance of r! Decentralized (greedy) routing performs best iff r = dimension of the space (here r = 2).
Influence of "r" (1): each peer u has a link to peer v with probability proportional to d(u,v)^(-r), where d(u,v) is the distance between u and v. Optimal value: r = dim, the dimension of the space. If r < dim, we tend to choose more far-away neighbours: the decentralized algorithm quickly approaches the neighbourhood of the target but then slows down until it finally reaches the target itself. If r > dim, we tend to choose closer neighbours: the algorithm quickly finds a target in its neighbourhood but reaches it slowly if it is far away. When r = 0, long-range contacts are chosen uniformly: random graph theory proves that short paths exist between every pair of vertices, BUT no decentralized algorithm is capable of finding these paths.
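Greedy routing on such a grid can be sketched as follows. The example assumes a 2D grid without wrap-around, Manhattan distance, four short-range links and one long-range contact per node; those choices and all names are illustrative. On a grid this small the advantage of r = 2 is weak, since Kleinberg's result concerns how delivery time grows with grid size.

```python
import random


def manhattan(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def build_contacts(side, r, rng):
    """Give every node of a side x side grid one long-range contact v,
    chosen with probability proportional to d(u, v) ** -r."""
    nodes = [(x, y) for x in range(side) for y in range(side)]
    contacts = {}
    for u in nodes:
        others = [v for v in nodes if v != u]
        weights = [manhattan(u, v) ** -r for v in others]
        contacts[u] = rng.choices(others, weights=weights)[0]
    return contacts


def greedy_route(source, target, side, contacts):
    """Forward to whichever known contact is closest to the target;
    return the number of steps taken."""
    current, steps = source, 0
    while current != target:
        x, y = current
        options = [(x + dx, y + dy)
                   for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))
                   if 0 <= x + dx < side and 0 <= y + dy < side]
        options.append(contacts[current])  # the long-range contact
        current = min(options, key=lambda v: manhattan(v, target))
        steps += 1
    return steps


rng = random.Random(0)
side = 20
for r in (0, 2, 4):
    contacts = build_contacts(side, r, rng)
    trials = [greedy_route((rng.randrange(side), rng.randrange(side)),
                           (rng.randrange(side), rng.randrange(side)),
                           side, contacts)
              for _ in range(500)]
    print(r, sum(trials) / len(trials))
```

The search always terminates because some grid neighbour is one step closer to the target, so the distance shrinks at every hop.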
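Figures: p(r) against r, both on log scales, for increasing γ starting from γ = 0; typical length of a directed search against γ, with a minimum at γ = 2 (below it short paths cannot be found, above it there are no short paths).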
Influence of "r": given a node u, partition the remaining peers into sets A_1, A_2, A_3, ..., A_logN, where A_i consists of all nodes whose distance from u is between 2^i and 2^(i+1), i = 0..logN-1. Then, when r = dim, each long-range contact of u is nearly equally likely to belong to any of the sets A_i.
The New Yorker view: when gamma is at its critical value of two, the resulting network has the peculiar property that nodes possess the same number of ties at all length scales (in a 2D world).
DHTs (distributed hash tables) and the Kleinberg model (figure): P-Grid's model; Kleinberg's model; balanced n-ary search.
More hierarchy: Kleinberg's model has only one distance measure, geographical (2D). In human society, social distance is multidimensional: if A is close to B, and C is close to B but in a different dimension, then A and C can be very far from each other (a "violation of the triangle inequality"). But multidimensionality may enable messages to be transmitted in networks very efficiently.
Watts et al. (2002), search in social networks (figure: region of searchable networks, with the Kleinberg condition marked). Homophily: the tendency of like to associate with like. H = number of dimensions along which individuals measure similarity.
Small worlds and epidemic diseases: nodes are living entities; a link is a contact. 3 states: uninfected, infected, recovered (or dead).
Epidemic diseases: the level of infectiousness needed to start an epidemic varies with the presence of shortcuts. In a regular grid, a disease may die out for lack of victims; in a small world, pandemics are facilitated. Examples: SARS; mad cow disease in England.
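The effect of shortcuts on an outbreak can be explored with a toy three-state model. This is a minimal sketch under strong assumptions: a ring where each node touches its two neighbours, random shortcuts added on top, and each infected node getting a single chance to infect each uninfected neighbour before recovering. The names and parameter values are illustrative.

```python
import random


def ring_with_shortcuts(n, shortcuts, rng):
    """Ring of n nodes plus up to `shortcuts` random long-range links."""
    adj = {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}
    for _ in range(shortcuts):
        a, b = rng.randrange(n), rng.randrange(n)
        if a != b:
            adj[a].add(b)
            adj[b].add(a)
    return adj


def epidemic_size(adj, infectiousness, rng, start=0):
    """Run one outbreak; return the number of nodes ever infected.
    States: U (uninfected), I (infected), R (recovered)."""
    state = {node: "U" for node in adj}
    state[start] = "I"
    infected = [start]
    while infected:
        newly = []
        for u in infected:
            for v in adj[u]:
                if state[v] == "U" and rng.random() < infectiousness:
                    state[v] = "I"
                    newly.append(v)
            state[u] = "R"  # recovers after one round
        infected = newly
    return sum(1 for s in state.values() if s == "R")


rng = random.Random(0)
n, runs = 1000, 100
for shortcuts in (0, 100):
    sizes = [epidemic_size(ring_with_shortcuts(n, shortcuts, rng), 0.9, rng,
                           start=rng.randrange(n))
             for _ in range(runs)]
    print(shortcuts, sum(sizes) / runs)
```

On the plain ring an outbreak stops as soon as transmission fails once in each direction; shortcuts give it new places to restart.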
Failures in networks: fault propagation or viruses. Scale-free networks are far more resistant to random failures than ordinary random networks, because most nodes are leaves. But failure of hubs can be catastrophic, and hubs are natural targets, which may make scale-free networks more vulnerable to deliberate attacks. Cascades of failures.
Back to Social Networks
Spread of ideas: messages in social networks. Fads and fashions: body piercing, baseball caps, Harry Potter, Amélie Poulain. Innovation and scientific revolutions: the Sun-centred universe, plate tectonics. Is it like the spread of disease?
Effect of peers and pundits: people's decisions are affected by what others do and think. Pressure to conform? An efficient strategy when knowledge or expertise is insufficient. Example: picking a restaurant.
Economic models: selfish agents, individual rationality, markets, equilibrium??? Many agents are trend followers: speculation, crashes.
Social experiments: factors which affect decisions. Milgram; Asch.
Stanley Milgram: controversial social psychologist (Yale and Harvard). Small-world experiment (degrees of separation); obedience to authority.
Validity of Milgram's experiment: global connectivity? US only: Omaha to a Boston stockbroker. Only 96 valid subjects (out of 300): 100 from Boston, 100 big investors, 96 picked at random in Nebraska. Success? 18 out of 96. Other experiments: 3 out of 60. Worse….
Conformity: see other presentation.
Threshold models of decisions (figures). Standard disease-spreading model: probability of infection (0 to 1) against the number of infected neighbours. Social decision making: probability of choosing option A (0 to 1) against the fraction of neighbours choosing A over B, with a critical threshold.
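The social-decision curve can be turned into a simple cascade rule. In this sketch a node switches to option A as soon as the fraction of its neighbours choosing A reaches a fixed threshold; a hard step in place of the smooth curve, the same threshold for everyone, and updates taking effect immediately within a pass are all simplifying assumptions.

```python
def threshold_cascade(adj, seeds, threshold):
    """Nodes switch to option A once the fraction of their neighbours
    choosing A reaches `threshold`; repeat until nothing changes."""
    chose_a = set(seeds)
    changed = True
    while changed:
        changed = False
        for node, nbrs in adj.items():
            if node not in chose_a:
                share = len(nbrs & chose_a) / len(nbrs)
                if share >= threshold:
                    chose_a.add(node)
                    changed = True
    return chose_a


# Hypothetical example: ring of 20 nodes, each linked to its 2 nearest
# neighbours on either side, with two adjacent early adopters.
n = 20
ring = {i: {(i + d) % n for d in (-2, -1, 1, 2)} for i in range(n)}
print(len(threshold_cascade(ring, {0, 1}, 0.25)))  # 20
print(len(threshold_cascade(ring, {0, 1}, 0.6)))   # 2
```

Unlike the disease model, what matters here is the fraction of neighbours, not their number, so a well-connected node is harder to convert.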