School of Information University of Michigan SI 614 Small Worlds Lecture 5 Instructor: Lada Adamic.

1 School of Information University of Michigan SI 614 Small Worlds Lecture 5 Instructor: Lada Adamic

2 Outline: Milgram’s small world experiment; Watts & Strogatz small world model; Kleinberg small world model; Watts, Dodds & Newman community model; network models: a few examples. Things for Lada to remember: online survey for short profiles; email in groups by Monday, Feb. 6th; Cormen chapter now available as PDF on cTools; PS 1 graded; PS 3 available by tomorrow.

3 Small world experiments then. Milgram’s experiment (1960s): given a target individual and a particular property, pass the message to a person you correspond with who is “closest” to the target. [Figure: map marking NE and MA]

4 Milgram’s small world experiment: Target person worked in Boston as a stockbroker. 296 senders from Boston and Omaha. 20% of senders reached the target. Average chain length = 6.5. “Six degrees of separation.”

5 Small world experiments now: email experiment (Dodds, Muhamad, Watts, Science 301 (2003)). 18 targets in 13 different countries; 60,000+ participants; 24,163 message chains; 384 reached their targets; average path length 4.0. Image by Stephen G. Eick http://www.bell-labs.com/user/eick/index.html (unrelated to small world experiment…)

6 Targets for the small world experiment at Columbia: a professor at an Ivy League university, an archival inspector in Estonia, a technology consultant in India, a policeman in Australia, a veterinarian in the Norwegian army. No chain reached the target in Croatia.

7 Accounting for attrition: Approximately 37% participation rate. Probability of a chain of length 10 getting through: $0.37^{10} \approx 5 \times 10^{-5}$, so only one out of 20,000 chains would make it. Actual number of completed chains: 384 (1.6% of all chains). Small changes in attrition rates lead to large changes in completion rates, e.g., a 15% decrease in attrition rate would lead to an 800% increase in completion rate.
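The attrition arithmetic on this slide can be checked in a few lines. This is a minimal sketch, assuming each hop passes the message on independently with the same probability. The function name `completion_probability` is made up for illustration. Reading the "15% decrease" as a relative decrease in the attrition rate is also an assumption, and the resulting factor depends on the chain length chosen.

```python
# Sketch of the attrition arithmetic; assumes a fixed, independent per-hop pass-on rate.

def completion_probability(pass_rate: float, hops: int) -> float:
    """Probability that a chain survives `hops` independent forwarding steps."""
    return pass_rate ** hops

base = completion_probability(0.37, 10)
print(f"P(chain of length 10 completes) = {base:.2e}")
print(f"about 1 in {round(1 / base):,} chains")

# Sensitivity (assumption: the 15% decrease is relative to the attrition rate 1 - 0.37).
attrition = 1 - 0.37
improved_pass = 1 - 0.85 * attrition
improved = completion_probability(improved_pass, 10)
print(f"completion probability grows by a factor of {improved / base:.1f}")
```

Because the pass-on rate is raised to the power of the chain length, a modest change per hop compounds into a large change in the completion rate.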

8 Estimating ‘recovered’ chain lengths for uncompleted chains: $\langle L \rangle = 4.05$ for all completed chains. $L^*$ = estimated ‘true’ median chain length. Intra-country chains: $L^* = 5$. Inter-country chains: $L^* = 7$. All chains: $L^* = 7$. Milgram: $L^* \approx$ 8-9 hops.

9 Attrition rate stays approx. constant throughout. $r_L$ = probability of not passing on the message at distance L from the source. [Plot legend: average; 95% confidence interval]

10 Estimated ‘recovered’ chain lengths. [Figure: histogram of path lengths, observed vs. ‘recovered’ chain lengths, for inter-country and intra-country chains]

11 Small world experiment at Columbia: Successful chains disproportionately used weak ties (Granovetter); professional ties (34% vs. 13%); ties originating at work/college; target's work (65% vs. 40%)... and disproportionately avoided hubs (8% vs. 1%) (+ no evidence of funnels); family/friendship ties (60% vs. 83%). Strategy: Geography -> Work.

12 How many hops actually separate any two individuals in the world? Participants are not perfect in routing messages; they use only local information. “The accuracy of small world chains in social networks”, Peter D. Killworth, Chris McCarty, H. Russell Bernard & Mark House: Analyze 10920 shortest path connections between 105 members of an interviewing bureau, together with the equivalent conceptual, or ‘small world’ routes, which use individuals’ selections of intermediaries. This permits the first study of the impact of accuracy within small world chains. The mean small world path length (3.23) is 40% longer than the mean of the actual shortest paths (2.30). Model suggests that people make a less than optimal small world choice more than half the time.

13 Why study small world phenomena? Curiosity: Why is the world small? How are people able to route messages? Social networking as a business: Friendster, Orkut, MySpace, LinkedIn, Spoke, VisiblePath.

14 Six degrees of separation - to be expected. Pool and Kochen (1978): average person has 500-1500 acquaintances. Ignoring clustering (the probability that my friend’s friend is not someone unknown to me, but is actually my friend…): ~$10^3$ first neighbors, $10^6$ second neighbors, $10^9$ third neighbors. Since the number of neighbors grows exponentially with distance (measured in hops traversed in a breadth-first search), connected random networks have short average path lengths: $d_{AB} \sim \log(N)$, where N = population size and $d_{AB}$ = distance between nodes A and B. But: social networks aren't random…
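The claim that random networks have path lengths that grow like log(N) can be tried out with a breadth-first search. This is a minimal sketch, assuming an Erdos-Renyi style random graph. The sizes N = 500 and mean degree 10 are arbitrary illustration values, and log(N)/log(k) is used as the usual rough estimate.

```python
import math
import random
from collections import deque

def random_graph(n, mean_degree, seed=0):
    """Random graph: each pair of nodes is linked with probability mean_degree / (n - 1)."""
    rng = random.Random(seed)
    p = mean_degree / (n - 1)
    adj = {v: set() for v in range(n)}
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].add(v)
                adj[v].add(u)
    return adj

def bfs_distances(adj, source):
    """Hop distance from source to every reachable node (breadth-first search)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def average_path_length(adj):
    """Mean hop distance over all ordered pairs of distinct, mutually reachable nodes."""
    total = pairs = 0
    for s in adj:
        d = bfs_distances(adj, s)
        total += sum(d.values())
        pairs += len(d) - 1
    return total / pairs

n, k = 500, 10  # illustration values, not from the slides
measured = average_path_length(random_graph(n, k))
estimate = math.log(n) / math.log(k)
print(f"measured: {measured:.2f}, log(N)/log(k): {estimate:.2f}")
```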

15 Reverse small world experiment, Killworth & Bernard (1978): Given hypothetical targets (name, occupation, location, hobbies, religion…), participants choose an acquaintance for each target. Acquaintance chosen based on (most often) occupation, geography; only 7% because they “know a lot of people”. Simple greedy algorithm: most similar acquaintance; two-step strategy rare.

16 The small world model. High clustering: my friends’ friends tend to be my friends. Watts & Strogatz (1998): a few random links in an otherwise clustered graph give an average shortest path close to that of a random graph.

17 Networks in nature (empirical observations) neural network of C. elegans, semantic networks of languages, actor collaboration graph, food webs.

18 Model proposed: crossover from regular lattices to random graphs. Tunable. Small world network with (simultaneously): small average shortest path; large clustering coefficient (not obeyed by random graphs).

19 Two ways of constructing a small world graph. As in many network generating algorithms: disallow self-edges; disallow multiple edges. (1) Select a fraction p of edges and reposition one of their endpoints. (2) Add a fraction p of additional edges, leaving the underlying lattice intact.

20 Original model: Each node has K>=4 nearest neighbors (local). Probability p of rewiring to randomly chosen nodes. p small: regular lattice. p large: classical random graph.
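The rewiring construction can be sketched in plain Python. This is a minimal sketch under assumptions, not the authors' code: the names `ring_lattice` and `rewire` are made up here, the graph is stored as a dict of neighbor sets, and it assumes N is larger than K so that all lattice edges are distinct.

```python
import random

def ring_lattice(n, k):
    """Ring of n nodes, each linked to its k nearest neighbors (k/2 on each side)."""
    adj = {v: set() for v in range(n)}
    for v in range(n):
        for step in range(1, k // 2 + 1):
            w = (v + step) % n
            adj[v].add(w)
            adj[w].add(v)
    return adj

def rewire(adj, k, p, seed=0):
    """Visit each lattice edge once; with probability p move its far endpoint to a
    uniformly chosen node, disallowing self-edges and multiple edges."""
    rng = random.Random(seed)
    n = len(adj)
    for step in range(1, k // 2 + 1):
        for v in range(n):
            w = (v + step) % n
            if w not in adj[v] or rng.random() >= p:
                continue  # edge is kept in place
            candidates = [u for u in range(n) if u != v and u not in adj[v]]
            if not candidates:
                continue
            new = rng.choice(candidates)
            adj[v].discard(w)
            adj[w].discard(v)
            adj[v].add(new)
            adj[new].add(v)
    return adj

g = rewire(ring_lattice(20, 4), k=4, p=0.2, seed=1)
print(sorted(g[0]))  # neighbors of node 0 after rewiring
```

Rewiring moves edges rather than adding them, so the total number of edges stays at N*K/2 for any p.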

21 p=0: ordered lattice. Compute the clustering coefficient as follows: each node is connected to K neighbors, who can have K*(K-1)/2 pairwise connections between them; some of the connections between them are present in the lattice. If K = 4 (connected to two closest neighbors on each side), $C = \frac{3}{4 \cdot 3 / 2} = \frac{1}{2}$. Caution: sometimes the lattice will be specified as “each node connects to K closest neighbors”, sometimes as “each node connects to all neighbors within distance k” (k = K/2).

22 Clustering coefficient for regular lattice: In general, can have any K. A neighbor K/2 hops away from i can connect to (K/2 - 1) of i’s neighbors; a neighbor K/2 - 1 hops away can connect to (1 + K/2 - 1) neighbors; K/2 - 2 hops away: (2 + K/2 - 1) neighbors; 1 hop away: 2*(K/2 - 1). Sum this up, multiply by a factor of 2 because i has neighbors on both sides, and divide by a factor of 2 because edges are undirected.

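23 Clustering coefficient for regular lattice: The number of connections between neighbors is given by $\sum_{j=1}^{K/2} \left[ (K/2 - 1) + (K/2 - j) \right] = \frac{3}{8} K (K - 2)$, where j is the number of hops from i. The maximum number of connections is K*(K-1)/2, so the clustering coefficient is $C = \frac{3(K-2)}{4(K-1)}$.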

24 Average shortest path, regular lattice: Average node is N/4 hops away (a quarter of the way around the ring), and you can hop over K/2 nodes at a time, so $l \approx \frac{N/4}{K/2} = \frac{N}{2K}$.
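The two lattice results can be checked numerically. Counting links among neighbors as described on the earlier slides leads to the closed form C = 3(K-2)/(4(K-1)), and the quarter-ring argument gives an average path length of roughly N/(2K). This is a minimal sketch; the helper names and the choice N = 100, K = 4 are illustration values only.

```python
from collections import deque

def ring_lattice(n, k):
    """Ring of n nodes, each linked to its k nearest neighbors (k/2 on each side)."""
    adj = {v: set() for v in range(n)}
    for v in range(n):
        for step in range(1, k // 2 + 1):
            w = (v + step) % n
            adj[v].add(w)
            adj[w].add(v)
    return adj

def clustering(adj):
    """Average over nodes of (links among neighbors) / (possible links among neighbors)."""
    total = 0.0
    for v, nbrs in adj.items():
        d = len(nbrs)
        if d < 2:
            continue
        # each link among neighbors is seen from both of its ends, hence // 2
        links = sum(1 for a in nbrs for b in adj[a] if b in nbrs) // 2
        total += links / (d * (d - 1) / 2)
    return total / len(adj)

def average_path_length(adj):
    """Mean hop distance over all ordered pairs of distinct reachable nodes (BFS)."""
    total = pairs = 0
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs

n, k = 100, 4  # illustration values
lattice = ring_lattice(n, k)
print("C measured:", clustering(lattice), " formula:", 3 * (k - 2) / (4 * (k - 1)))
print("l measured:", average_path_length(lattice), " N/(2K):", n / (2 * k))
```

The path-length estimate is only approximate, since hops cover whole steps of K/2 positions and the rounding adds a small excess.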

25 p=1: random graph (we’ll talk more about this next week). There are an average of K links per node. The probability that any two nodes are connected is K/N. The probability that two nodes which share a neighbor in common are connected themselves is the same as for any two random nodes: K/N (actually (K-1)/N, because they have already expended one edge on their common neighbor).

26 What happens in between? Small shortest path means small clustering? Large shortest path means large clustering? Through numerical simulation: as we increase p from 0 to 1, fast decrease of mean distance, slow decrease in clustering.

27 Change in clustering coefficient and average path length as a function of the proportion of rewired edges. [Plot: l(p)/l(0) and C(p)/C(0) versus p, with markers at 1% and 10% of links rewired; annotations: “No exact analytical solution”, “Exact analytical solution”]

28 Clustering coefficient for SW model with rewiring: The probability that a connected triple stays connected after rewiring = probability that none of the 3 edges were rewired, $(1-p)^3$, plus the probability that edges were rewired back to each other (very small, can ignore). Clustering coefficient: $C(p) = C(p=0) \cdot (1-p)^3$. [Plot: C(p)/C(0) versus p]
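The (1-p)^3 estimate can be compared with a simulation. This is a minimal sketch under several assumptions. The names `small_world` and `clustering` are made up here. Clustering is averaged over vertices, which need not match the definition behind the analytical estimate exactly. A rewiring attempt that would create a self-edge or duplicate edge is simply skipped, so the effective rewiring rate is slightly below p. N = 1000, K = 10 and p = 0.1 are illustration values.

```python
import random

def small_world(n, k, p, seed=0):
    """Ring lattice (k neighbors per node) with each edge rewired with probability p."""
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    for v in range(n):
        for step in range(1, k // 2 + 1):
            adj[v].add((v + step) % n)
            adj[(v + step) % n].add(v)
    for step in range(1, k // 2 + 1):
        for v in range(n):
            w = (v + step) % n
            if rng.random() < p:
                new = rng.randrange(n)
                # skip the attempt if it would give a self-edge or a multiple edge
                if new != v and new not in adj[v]:
                    adj[v].remove(w)
                    adj[w].remove(v)
                    adj[v].add(new)
                    adj[new].add(v)
    return adj

def clustering(adj):
    """Average over nodes of (links among neighbors) / (possible links among neighbors)."""
    total = 0.0
    for v, nbrs in adj.items():
        d = len(nbrs)
        if d < 2:
            continue
        links = sum(1 for a in nbrs for b in adj[a] if b in nbrs) // 2
        total += links / (d * (d - 1) / 2)
    return total / len(adj)

n, k, p = 1000, 10, 0.1  # illustration values
c0 = clustering(small_world(n, k, 0.0))
cp = clustering(small_world(n, k, p))
print(f"simulated C(p)/C(0) = {cp / c0:.3f}, (1-p)^3 = {(1 - p) ** 3:.3f}")
```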

29 Clustering coefficient: addition of random edges. How does C depend on p? $C'(p) = 3 \times$ (number of triangles) / (number of connected triples). C’(p) computed analytically for the small world model without rewiring. [Plot: C’(p) versus p]

30 Degree distribution: p=0: delta-function. p>0 broadens the distribution. Edges left in place with probability (1-p). Edges rewired towards i with probability 1/N.

31 Model: small world with probability p of rewiring; visit nodes sequentially and rewire links. Exponential decay: all nodes have similar number of links. [Figure labels: 1000 vertices; random network with average connectivity K] Why does each node keep at least K/2 links? Even at p = 1, the graph is not a purely random graph.

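32 Some examples for real networks (in averages)
Columns: network | size | vertex degree | shortest path | shortest path in fitted random graph | clustering (# triangles) | clustering (averaged over vertices) | clustering in random graph
Film actors | 225,226 | 61 | 3.65 | 2.99 | 0.20 | 0.79 | 0.00027
MEDLINE co-authorship | 1,520,251 | 18.1 | 4.6 | 4.91 | 0.45 | 0.56 | 1.8 x 10^-4
E. coli substrate graph | 282 | 7.35 | 2.9 | 3.04 | 0.32 (single clustering value; its column is not recoverable from the transcript) | 0.026
C. elegans | 282 | 14 | 2.65 | 2.25 | 0.28 (single clustering value; its column is not recoverable from the transcript) | 0.05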

33 What if long range links depend on distance? “The geographic movement of the [message] from Nebraska to Massachusetts is striking. There is a progressive closing in on the target area as each new person is added to the chain.” S. Milgram, ‘The small world problem’, Psychology Today 1, 61 (1967). [Figure: map marking NE and MA]

34 Kleinberg’s geographical small world model: nodes are placed on a lattice and connect to nearest neighbors; additional links placed with $p_{uv} \sim d_{uv}^{-r}$, where $d_{uv}$ is the lattice distance between u and v. Kleinberg, ‘The Small World Phenomenon, An Algorithmic Perspective’ (Nature 2000)

35 When r=0, links are randomly distributed: average shortest path (ASP) ~ log(n), where n is the size of the grid; no locality.

36 Links highly localized links on a lattice

37 Links balanced between long and short range
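The effect of the exponent r on decentralized routing can be explored with a small simulation. This is a minimal sketch, assuming a square grid with Manhattan distance and one long-range link per node, drawn with probability proportional to distance to the power -r. The function names are made up, the grid is small, and the values r = 0, 2, 4 are only example settings. On a grid this small the differences between exponents may be modest.

```python
import random

def dist(u, v):
    """Manhattan (lattice) distance between grid points u and v."""
    return abs(u[0] - v[0]) + abs(u[1] - v[1])

def kleinberg_links(n, r, seed=0):
    """One long-range link per node of an n x n grid, chosen with
    probability proportional to dist(u, v) ** -r."""
    rng = random.Random(seed)
    nodes = [(x, y) for x in range(n) for y in range(n)]
    links = {}
    for u in nodes:
        others = [v for v in nodes if v != u]
        weights = [dist(u, v) ** -r for v in others]
        links[u] = rng.choices(others, weights=weights)[0]
    return links

def greedy_route(n, links, s, t):
    """Forward to whichever contact (grid neighbor or long-range link) is closest
    to the target; returns the number of steps taken."""
    cur, steps = s, 0
    while cur != t:
        x, y = cur
        contacts = [(x + dx, y + dy)
                    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))
                    if 0 <= x + dx < n and 0 <= y + dy < n]
        contacts.append(links[cur])
        cur = min(contacts, key=lambda v: dist(v, t))
        steps += 1
    return steps

n = 12  # small illustration grid
rng = random.Random(42)
pairs = [((rng.randrange(n), rng.randrange(n)), (rng.randrange(n), rng.randrange(n)))
         for _ in range(200)]
for r in (0, 2, 4):
    links = kleinberg_links(n, r)
    mean_steps = sum(greedy_route(n, links, s, t) for s, t in pairs) / len(pairs)
    print(f"r = {r}: mean greedy steps = {mean_steps:.2f}")
```

Greedy routing always terminates here because some grid neighbor is always one step closer to the target, so the long-range links can only shorten the route.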

38 How the small world phenomenon arises. [Figure: source S, target T, and regions R and R’] |R| < |R’| < (factor lost in extraction) |R|; k = c log 2 n; calculate probability that s fails to have a link in R’.

39 Kleinberg, ‘Small-World Phenomena and the Dynamics of Information’, NIPS 14, 2001. Hierarchical network models: individuals classified into a hierarchy; $h_{ij}$ = height of the least common ancestor. Group structure models: individuals belong to nested groups; q = size of smallest group that v, w belong to; $f(q) \sim q^{-\alpha}$. E.g. state-county-city-neighborhood, industry-corporation-division-group. [Figure: hierarchy of height h with branching b=3]

40 Identity and search in social networks, Watts, Dodds, Newman (Science, 2001): individuals belong to hierarchically nested groups; multiple independent hierarchies h=1,2,..,H coexist, corresponding to occupation, geography, hobbies, religion… $p_{ij} \sim \exp(-\alpha x)$

41

42 Other generative models: Assign properties to nodes (e.g. spatial location, group membership). Add or rewire links according to some rule: optimize for a particular property (simulated annealing); add links with probability depending on property of existing nodes, edges (preferential attachment, link copying); simulate nodes as agents ‘deciding’ whether to rewire or add links.

43 Example: trade-off between wiring and connectivity. E is the ‘energy’ cost we are trying to minimize; L is the average shortest path in ‘hops’; W is the total length of wire used. ‘Small worlds: How and Why’, Nisha Mathias and Venkatesh Gopal.

44 Network configuration rewire using simulated annealing sequence is shown in order of increasing

45 Small worlds: the how and the why same networks, but the vertices are allowed to move using a spring layout algorithm wiring cost associated with the physical distance between nodes

46 Shape and efficiency in spatial distribution networks Michael Gastner & Mark Newman (a) Commuter rail network in the Boston area. The arrow marks the assumed root of the network. (b) Star graph. (c) Minimum spanning tree. (d) The model of Eq. (3) applied to the same set of stations.

47 Assign an effective cost to each edge: l incorporates a person’s preference for short distances or a small number of hops. Car travel: short distance. Airplane travel: small number of hops, sometimes at the expense of total distance. Construct network using simulated annealing. [Figure labels: physical distance; number of hops]

48 slide by Mark Newman

49

50

51 Roads; Air routes (slide by Mark Newman)

52 How do networks become navigable? Aaron Clauset and Christopher Moore, arxiv.org/abs/cond-mat/0309415. Start with a 1-D lattice (a ring); each node is connected to its 2 nearest neighbors and has one long range link (initially just a self-loop). We start going from x to y, but go no more than a certain threshold number of steps, which represents how many hops we think getting to y should take. If we give up, we rewire x’s long range link to the last node we reached. In the limit N -> ∞, the long range link distribution becomes 1/r, where r = lattice distance between nodes, and search time starts scaling as log(N).
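The rewiring-by-search rule on this slide can be sketched as follows. This is a minimal sketch under assumptions, not the authors' procedure. It uses a single fixed step threshold `max_steps` for every search, and picks x and y uniformly at random. The function names are made up for illustration.

```python
import random

def ring_dist(a, b, n):
    """Lattice distance between nodes a and b on a ring of n nodes."""
    d = abs(a - b)
    return min(d, n - d)

def greedy_walk(x, y, link, n, max_steps):
    """Greedy routing from x toward y over ring neighbors and long-range links.
    Stops at y or after max_steps steps; returns (last node reached, steps taken)."""
    cur, steps = x, 0
    while cur != y and steps < max_steps:
        contacts = [(cur - 1) % n, (cur + 1) % n, link[cur]]
        cur = min(contacts, key=lambda v: ring_dist(v, y, n))
        steps += 1
    return cur, steps

def rewire_by_search(n, rounds, max_steps, seed=0):
    """Repeatedly search from a random x to a random y; when the search gives up,
    point x's long-range link at the last node reached."""
    rng = random.Random(seed)
    link = list(range(n))  # every long-range link starts as a self-loop
    for _ in range(rounds):
        x, y = rng.randrange(n), rng.randrange(n)
        last, _ = greedy_walk(x, y, link, n, max_steps)
        if last != y:
            link[x] = last
    return link

n = 200
link = rewire_by_search(n, rounds=20000, max_steps=8)
lengths = [ring_dist(v, link[v], n) for v in range(n)]
print("mean long-range link length:", sum(lengths) / n)
```

With a threshold of at least N/2 steps every search succeeds along the ring and no link is ever rewired, so the threshold has to be short for the long-range links to evolve.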

53 Summary The world is small! Watts & Strogatz came up with a simple model to explain why Later, more sophisticated models of social structure were developed There are many, many more models that can be thought up and that give useful insights

