By Jon Kleinberg Bo Young Kim Applied Algorithm Lab
Research on large-scale network structure
Importance: the limit of reductionism
Spans mathematics, computer science, social science and biological science:
- Computer science: Internet, WWW
- Social science: social networks
- Biological science: interactions in the pathways of a cell's metabolism; neurology (e.g. neural burst modeling)
Euler (1736): graph theory
New problem: How is a network created? What rules govern its topology and structure?
1. Observe a property of a real-world network.
2. Model it (a network produced by a random mechanism).
3. Reproduce other properties from the model (they may then be observed in the real-world network).
We can explain and predict!
Erdős, Rényi (1959): random graph theory
The fact that different systems follow different rules is intentionally ignored; each pair of nodes is connected at random.
Giant component (phase transition): G(n,p) where p = c/n (c: const.)
1) c < 1: a.a.s. G(n,p) consists of small components, all of which have O(log n) vertices
2) c > 1: a.a.s. G(n,p) has a unique large component with Θ(n) vertices
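The phase transition can be checked empirically. The following is a minimal sketch, not taken from the slides: it samples G(n, p) with p = c/n for one value of c below 1 and one above, and reports the component sizes. The function name `component_sizes`, the seed, and the choices n = 1000, c = 0.5 and c = 2 are illustrative assumptions.

```python
import random
from collections import Counter


def component_sizes(n, p, seed=0):
    """Sample G(n, p) and return its component sizes, largest first."""
    rng = random.Random(seed)
    parent = list(range(n))  # union-find forest over the n nodes

    def find(x):
        # Follow parent pointers to the root, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Each of the n(n-1)/2 possible edges is present independently with probability p.
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                ru, rv = find(u), find(v)
                if ru != rv:
                    parent[ru] = rv

    sizes = Counter(find(x) for x in range(n))
    return sorted(sizes.values(), reverse=True)


n = 1000
subcritical = component_sizes(n, 0.5 / n)    # c = 0.5 < 1
supercritical = component_sizes(n, 2.0 / n)  # c = 2 > 1
print("c = 0.5, largest component:", subcritical[0])
print("c = 2.0, largest component:", supercritical[0])
```

For c < 1 the largest component should stay small relative to n, while for c > 1 a single component should contain a constant fraction of the nodes.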
Six degrees of separation
"I read somewhere that everybody on this planet is separated by only six other people. Six degrees of separation between us and everyone else on this planet. The President of the United States, a gondolier in Venice, just fill in the names." by Guare (1991) or by Karinthy (1929)
Yuna Kim and you: just fill in the names.
Stanley Milgram's experiment (1967)
Goal: find the distance between two people in America.
Target person: a stockbroker in Boston.
Starters: chosen fairly randomly (Wichita, Kansas; Omaha, Nebraska).
Each starter is given personal information about the target and forwards the letter to a person known on a first-name basis.
The median length among the completed paths (42 of 160) was 5.5.
We are living in a small world!
Examples: social networks, the Web, biology…
Barabási (1998), Web: 19 degrees of separation, d = 0.35 + 2 log N (d: average distance, N: number of web pages)
A general phenomenon observed in many networks.
Caution: it doesn't mean we can find someone or something easily. (We don't know the shortest path.)
1. Such short chains are ubiquitous. 2. Individuals operating with purely local information are very adept at finding these chains. (using analysis) Length of the shortest path 6
Thm (Bollobás, de la Vega, 1982) Fix a constant k ≥ 3. If we choose a graph u.a.r. from the set of all n-node graphs in which each node has degree exactly k, then with high probability every pair of nodes will be joined by a path of length O(log n).
Thm (Bollobás, Chung, 1988) Consider a graph G formed by adding a random matching to an n-node cycle (assume n is even, pair up the nodes on the cycle u.a.r. and add edges between each of these node pairs). With high probability, every pair of nodes will be joined by a path of length O(log n).
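The construction in this theorem is easy to simulate. The sketch below is an illustration under assumed parameters (n = 1024, a fixed seed, hypothetical function names): it builds the cycle plus a random perfect matching and measures the diameter by breadth-first search, to compare against the diameter n/2 of the plain cycle.

```python
import random
from collections import deque


def cycle_with_matching(n, seed=0):
    """Adjacency sets for an n-node cycle plus a uniformly random perfect matching."""
    assert n % 2 == 0, "n must be even so the nodes can be paired up"
    rng = random.Random(seed)
    adj = [{(v - 1) % n, (v + 1) % n} for v in range(n)]  # cycle edges
    nodes = list(range(n))
    rng.shuffle(nodes)
    # Consecutive entries of the shuffled list form the matched pairs.
    for i in range(0, n, 2):
        u, w = nodes[i], nodes[i + 1]
        adj[u].add(w)
        adj[w].add(u)
    return adj


def eccentricity(adj, s):
    """Largest BFS distance from s to any node."""
    dist = {s: 0}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return max(dist.values())


def diameter(adj):
    return max(eccentricity(adj, s) for s in range(len(adj)))


n = 1024
d_cycle = n // 2  # diameter of the plain n-node cycle
d_match = diameter(cycle_with_matching(n))
print("cycle:", d_cycle, "cycle + matching:", d_match)
```

The matching adds only one edge per node, yet the diameter should drop from linear in n to roughly logarithmic.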
Granovetter (1972): existence of clusters
Real networks are highly clustered (Erdős number); the Erdős-Rényi model has no clusters, so it needs to be modified!
Watts-Strogatz model (1998, Nature)
n × n grid-based model: each node v has its grid neighbors as local contacts, plus one extra directed edge to some other node w chosen u.a.r. (w: long-range contact).
Superposition of structured and random links.
Trade-off: clustering vs. no clustering, large world vs. small world.
Two-dimensional grid with a single random shortcut superimposed.
Two-dimensional grid with many random shortcuts superimposed (as in the Watts-Strogatz model).
Decentralized search algorithm
An algorithm that finds efficient paths to a destination using purely local information, i.e. an algorithm that searches for the shortest path under the following rule: at each step, the holder of the message must pass it across one of its connections. (In the grid model, the current holder doesn't know the long-range connections of nodes that have not touched the message.)
Thm (Kleinberg, 2000) The delivery time of any decentralized algorithm in the grid-based model is Ω(n^{2/3}).
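One simple rule that respects this constraint is greedy forwarding: the holder passes the message to whichever of its own contacts is closest to the target in grid distance. The sketch below is only an illustration, assuming an n × n grid where each node has one long-range contact chosen u.a.r.; the names `grid_distance` and `greedy_route` are hypothetical. The long-range contact of a node is sampled only when that node holds the message, which mirrors the rule that unvisited nodes' long-range links are unknown.

```python
import random


def grid_distance(a, b):
    """Lattice (Manhattan) distance between two grid points."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def greedy_route(n, s, t, seed=0):
    """Number of steps greedy forwarding takes from s to t on an n x n grid."""
    rng = random.Random(seed)
    cur, steps = s, 0
    while cur != t:
        # Reveal the holder's long-range contact (uniform over the other nodes).
        # Distance to t strictly decreases, so each node holds the message at most once.
        contact = cur
        while contact == cur:
            contact = (rng.randrange(n), rng.randrange(n))
        x, y = cur
        options = [(x + dx, y + dy)
                   for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))
                   if 0 <= x + dx < n and 0 <= y + dy < n]  # local contacts
        options.append(contact)
        # Purely local decision: pick the known contact closest to the target.
        cur = min(options, key=lambda u: grid_distance(u, t))
        steps += 1
    return steps


n = 50
source, target = (0, 0), (n - 1, n - 1)
steps = greedy_route(n, source, target)
print("greedy delivery took", steps, "steps")
```

Because some grid neighbor always lies one step closer to the target, greedy forwarding never needs more steps than the grid distance between source and target; the theorem above says that with uniformly random shortcuts no decentralized rule can do dramatically better.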
Extended model (Kleinberg, 2000): the Watts-Strogatz model has no decentralized algorithm that finds short paths.
A parameter α ≥ 0 controls how strongly the long-range links are correlated with the geometry of the underlying grid.
Grid distance: ρ(v,w)
The long-range contact w of v is chosen with probability proportional to ρ(v,w)^{-α}.
α = 0: Watts-Strogatz model (w chosen u.a.r.)
α small: long-range links are too random
α large: long-range links are not random enough
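To make the role of α concrete, the following sketch computes the normalized distribution of the long-range contact for one node on a small grid. It is a direct, unoptimized reading of "probability proportional to ρ(v,w)^{-α}"; the function name `long_range_probs` and the grid size are illustrative assumptions.

```python
import random


def long_range_probs(n, v, alpha):
    """Map each node w != v of the n x n grid to Pr[w is v's long-range contact]."""
    weights = {}
    for i in range(n):
        for j in range(n):
            w = (i, j)
            if w != v:
                rho = abs(i - v[0]) + abs(j - v[1])  # grid distance rho(v, w)
                weights[w] = rho ** (-alpha)
    total = sum(weights.values())  # normalizing constant
    return {w: x / total for w, x in weights.items()}


n, v = 10, (0, 0)
uniform = long_range_probs(n, v, 0)   # alpha = 0: every node equally likely
inverse_sq = long_range_probs(n, v, 2)  # alpha = 2: the searchable case

# Drawing one long-range contact from the alpha = 2 distribution:
rng = random.Random(0)
nodes = list(inverse_sq)
contact = rng.choices(nodes, weights=[inverse_sq[w] for w in nodes], k=1)[0]
print("sampled long-range contact of", v, "is", contact)
```

With α = 0 the distribution is uniform; with α = 2 a node at distance 1 is four times as likely as a node at distance 2, so links concentrate near v while still reaching every distance scale.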
Thm (Kleinberg, 2000)
1. 0 ≤ α < 2: the delivery time of any decentralized algorithm in the grid-based model is Ω(n^{(2-α)/3}).
2. α = 2: there is a decentralized algorithm with delivery time O(log^2 n).
3. α > 2: the delivery time of any decentralized algorithm in the grid-based model is Ω(n^{(α-2)/(α-1)}).
A node with several random shortcuts spanning different distance scales.
Network is embedded in a hierarchy: nodes reside at the leaves of a complete b-ary tree.
Natural variations: Milgram's experiment, Web pages
Arts > Music > Opera > Verdi's Aida
Science > Biology > Genetics > Yeast genome
Def (b-ary tree): a tree with no more than b children for each node.
Def (depth of a node): the distance from the node to the root of the tree.
Def (complete b-ary tree): a b-ary tree in which all leaf nodes are at the same depth and all internal nodes have b children. …
Natural assumption: the density of links is lower for node pairs that are more widely separated in the underlying hierarchy.
Hierarchical model with exponent β (β ≥ 0):
Complete b-ary tree with n leaves (height h = log_b n).
Tree distance h(v,w) = the height of the lowest common ancestor of v and w.
Define a random graph G on the set V of leaves: each node v has k out-edges, and w is chosen as the endpoint of the i-th edge independently with probability proportional to b^{-βh(v,w)}.
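The hierarchical model can be sketched in a few lines. The code below is an illustration under assumptions not in the slides: leaves are numbered 0 to n-1 from left to right (so the ancestor of a leaf one level up is its index divided by b), and the names `lca_height` and `sample_out_edges` are hypothetical.

```python
import random


def lca_height(v, w, b):
    """Height of the lowest common ancestor of leaves v and w in a complete b-ary tree."""
    height = 0
    while v != w:
        v //= b  # move both leaves up one level
        w //= b
        height += 1
    return height


def sample_out_edges(v, n, b, beta, k, rng):
    """Draw k endpoints for v independently, with Pr[w] proportional to b^(-beta * h(v, w))."""
    others = [w for w in range(n) if w != v]
    weights = [b ** (-beta * lca_height(v, w, b)) for w in others]
    return rng.choices(others, weights=weights, k=k)


b, height = 2, 4
n = b ** height  # 16 leaves
rng = random.Random(0)
edges = sample_out_edges(0, n, b, beta=1.0, k=5, rng=rng)
print("out-edges of leaf 0:", edges)
```

Endpoints are drawn with replacement, matching the statement that each of the k edges is chosen independently; leaves in the same small subtree as v are the most likely endpoints.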
Starting node s, target node t: the algorithm must construct a path from s to t.
It knows only the edges out of nodes that it explicitly visits.
Caution: G may not contain a path from s to t.
Def (delivery time f(n)): a decentralized algorithm has delivery time f(n) if, on a randomly generated n-node network with s and t chosen u.a.r., it produces a path of length O(f(n)) with probability at least 1 - ε(n), where ε(n) → 0 as n → ∞.
Thm (Kleinberg, 2001)
(a) In the hierarchical model with exponent β = 1 and out-degree k = c log^2 n, for a sufficiently large constant c, there is a decentralized algorithm with polylogarithmic delivery time.
(b) For every β ≠ 1 and every polylogarithmic function k(n), there is no decentralized algorithm (in the hierarchical model with exponent β and out-degree k(n)) that achieves polylogarithmic delivery time.
Watts, Dodds and Newman (2002) independently proposed a similar model.
Napster and music file sharing (1999): centralized index vs. decentralized algorithm
Focused web crawler vs. standard web search engine
(Adamic, Adar, 2005) E-mail network observation: links scale as g(v,w)^{-3/4}, compared with g(v,w)^{-1}.
(Liben-Nowell, 2005) LiveJournal observation: rank-based friendship.
Thm (Liben-Nowell, 2005) For an arbitrary population density on a grid, the expected delivery time of the decentralized greedy algorithm in the rank-based friendship model is O(log^3 n).
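The slide names rank-based friendship without defining it. As commonly stated, v links to w with probability proportional to 1/rank_v(w), where rank_v(w) counts how many people live at least as close to v as w does. The sketch below assumes that definition, uses grid distance, and breaks ties by sort order; the function name `rank_based_probs` and the sample points are hypothetical.

```python
def rank_based_probs(points, v):
    """Map each index w != v to Pr[v links to w], proportional to 1 / rank_v(w)."""
    vx, vy = points[v]

    def dist(i):
        return abs(points[i][0] - vx) + abs(points[i][1] - vy)

    # Sort the other people by distance from v; position in this order is the rank.
    others = sorted((i for i in range(len(points)) if i != v), key=dist)
    weights = {w: 1.0 / rank for rank, w in enumerate(others, start=1)}
    total = sum(weights.values())
    return {w: x / total for w, x in weights.items()}


# Four people at assumed locations; person 0 is the one choosing a friend.
points = [(0, 0), (1, 0), (5, 5), (0, 3)]
probs = rank_based_probs(points, 0)
for w, p in sorted(probs.items()):
    print("person", w, "probability", round(p, 3))
```

Using rank instead of raw distance makes the link probability adapt to population density: in a sparse region a far-away person can still have a low rank.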
An experiment in the social sciences highlights a fundamental and non-obvious property of networks (efficient searchability in this case).
This leads to random graph modeling and analysis, and to measurements on large-scale data.
These in turn lead to further results and questions in algorithms, graph theory and discrete probability.
J. Kleinberg. Navigation in a Small World. Nature 406 (2000), 845.
J. Kleinberg. The Small-World Phenomenon and Decentralized Search. A short essay as part of Math Awareness Month 2004, appearing in SIAM News 37(3), April 2004.
J. Kleinberg. Complex Networks and Decentralized Search Algorithms. Proceedings of the International Congress of Mathematicians (ICM), 2006.
Albert-László Barabási. Linked: How Everything Is Connected to Everything Else and What It Means (2002).
Noga Alon, Joel H. Spencer. The Probabilistic Method, 2nd Edition (2000).