CSE 522 – Algorithmic and Economic Aspects of the Internet Instructors: Nicole Immorlica Mohammad Mahdian
Previously in this class
- Some common properties of social networks
- Need for generative models
- Several generative models for power-law distributions: optimization, multiplicative processes, preferential growth
- Power-law graph models: preferential attachment
This lecture
- Analysis of the degree sequence of preferential attachment graphs
- Other power-law graph models: the copying model, heuristically optimized tradeoffs
- Models for small-world networks (time permitting)
Preferential attachment, recap. Start with a graph with one node. Vertices arrive one by one. When a vertex arrives, it connects itself to one (m, in general) of the previous vertices, with probability proportional to their degrees.
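The growth process above can be sketched in a few lines of Python for the case m = 1. This is a minimal illustration, not code from the lecture: the function name `preferential_attachment` is hypothetical, and it assumes the second vertex simply attaches to the first (the single starting node has degree 0, so "proportional to degree" is undefined at that step).

```python
import random


def preferential_attachment(n, seed=None):
    """Grow an n-vertex tree (m = 1) and return the list of vertex degrees."""
    if n < 2:
        raise ValueError("need at least two vertices")
    rng = random.Random(seed)
    # Assumption: vertex 1 attaches to vertex 0 deterministically.
    endpoints = [0, 1]  # every edge contributes both of its endpoints
    degree = [1, 1]
    for v in range(2, n):
        # A uniform draw from the endpoint list picks an old vertex
        # with probability proportional to its current degree.
        target = rng.choice(endpoints)
        endpoints.extend((v, target))
        degree.append(1)
        degree[target] += 1
    return degree


if __name__ == "__main__":
    degrees = preferential_attachment(10_000, seed=0)
    print("max degree:", max(degrees))
```

Keeping a flat list of edge endpoints avoids recomputing the degree-weighted distribution at every step, at the cost of storing two entries per edge.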
Preferential attachment
- Heuristic analysis (Barabasi-Albert): the degree distribution follows a power law with exponent -3.
- Theorem (Bollobas, Riordan, Spencer, Tusnady). For $d < n^{1/16}$, the fraction of vertices that have degree d is almost surely around [formula missing in transcript].
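The heuristic argument behind the exponent -3 can be sketched as follows. This is the usual continuous (mean-field) approximation, reconstructed here rather than taken from the slides; the symbols $k_i(t)$ (degree of vertex i at time t) and $t_i$ (arrival time of vertex i) are introduced for illustration, and arrival times are assumed uniform on $[0, t]$.

```latex
\begin{align*}
\frac{\partial k_i}{\partial t} &= m \cdot \frac{k_i}{2mt} = \frac{k_i}{2t},
  \qquad k_i(t_i) = m
  \;\Longrightarrow\; k_i(t) = m \sqrt{t / t_i}, \\
\Pr[k_i(t) < k] &= \Pr\!\left[ t_i > \frac{m^2 t}{k^2} \right]
  = 1 - \frac{m^2}{k^2}, \\
P(k) &= \frac{d}{dk} \Pr[k_i(t) < k] = \frac{2 m^2}{k^3}.
\end{align*}
```

The first line uses the fact that the total degree at time t is about 2mt, so each of the m new edges hits vertex i with probability roughly $k_i / 2mt$.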
Copying models (Kleinberg et al. 1999 and Kumar et al. 2000)
- Vertices join one by one, and each new vertex connects to m old vertices (picked as follows).
- A new vertex picks an old vertex uniformly at random as its prototype.
- For each link on the prototype, the new vertex copies the link with probability p, or replaces the link by a link to a randomly selected vertex with probability 1-p.
- Captures the power law, as well as the “locally dense, globally sparse” features of the web.
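A minimal sketch of the copying step described above, in Python. The function name `copying_model` and the seed graph are assumptions made for this example: the slide does not say how the process starts, so the sketch begins with m + 1 vertices that each link to the m others. Repeated links to the same target are allowed here.

```python
import random


def copying_model(n, m, p, seed=None):
    """Return out-link lists for an n-vertex graph grown by the copying rule."""
    if n < m + 1:
        raise ValueError("need n >= m + 1 for the assumed seed graph")
    rng = random.Random(seed)
    # Assumed seed graph: m + 1 vertices, each pointing to the m others.
    out = [[u for u in range(m + 1) if u != v] for v in range(m + 1)]
    for v in range(m + 1, n):
        prototype = rng.randrange(v)  # uniform over the old vertices
        links = []
        for target in out[prototype]:
            if rng.random() < p:
                links.append(target)            # copy the prototype's link
            else:
                links.append(rng.randrange(v))  # link to a random old vertex
        out.append(links)
    return out


if __name__ == "__main__":
    graph = copying_model(1000, m=3, p=0.5, seed=0)
    indegree = [0] * len(graph)
    for links in graph:
        for t in links:
            indegree[t] += 1
    print("max in-degree:", max(indegree))
```

Copying a link gives popular targets another chance to be chosen, which is how the rule reproduces a rich-get-richer effect without looking at degrees directly.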
Heuristically Optimized Tradeoff (Fabrikant, Koutsoupias, Papadimitriou, 2002)
- Each node is a point in the unit square.
- Nodes arrive one by one.
- Upon arrival, node i connects to a node j that minimizes $\alpha \cdot d_{ij} + h_j$, where $d_{ij}$ is the Euclidean distance between i and j, $h_j$ is the graph distance between j and node 1 (the center), and $\alpha$ is a parameter that weights distance against centrality.
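The arrival rule above can be sketched as follows. This is an illustration under stated assumptions: points are drawn uniformly at random from the unit square, the weight on the distance term is called `alpha` (the symbol is not legible in the transcript), the center is indexed 0 rather than 1, and the function name `hot_tree` is hypothetical.

```python
import math
import random


def hot_tree(n, alpha, seed=None):
    """Grow a tree where each new node trades off distance against hop count."""
    rng = random.Random(seed)
    # Assumption: node positions are uniform in the unit square.
    points = [(rng.random(), rng.random()) for _ in range(n)]
    parent = [None]  # node 0 is the center (the slide's node 1)
    hops = [0]       # graph distance from each node to the center
    for i in range(1, n):
        # Pick the earlier node j minimizing alpha * d_ij + h_j.
        best = min(
            range(i),
            key=lambda j: alpha * math.dist(points[i], points[j]) + hops[j],
        )
        parent.append(best)
        hops.append(hops[best] + 1)
    return points, parent, hops


if __name__ == "__main__":
    _, parent, hops = hot_tree(500, alpha=4.0, seed=0)
    print("deepest node is", max(hops), "hops from the center")
```

With `alpha = 0` only the hop count matters and every node attaches to the center, giving a star; a very large `alpha` makes each node attach to its nearest neighbor.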
Small World Networks
- Low average distance L. Definition: the average distance L of a network is the number of edges in the shortest path between two vertices, averaged over all pairs of vertices.
- High clustering coefficient C. Definition: the clustering coefficient C of a network is the probability that two neighbors of a random vertex are connected by a single edge.
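Both definitions can be computed directly on a small graph. The sketch below assumes an undirected graph stored as a dict mapping each vertex to a set of neighbors (a representation chosen for this example). Two conventions are also assumptions: unreachable pairs are left out of the average distance, and vertices with fewer than two neighbors are left out of the clustering average.

```python
from collections import deque
from itertools import combinations


def average_distance(adj):
    """Mean shortest-path length over all connected pairs of vertices."""
    total = pairs = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from source
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0


def clustering_coefficient(adj):
    """Average, over vertices, of the fraction of neighbor pairs that are linked."""
    values = []
    for nbrs in adj.values():
        if len(nbrs) < 2:
            continue  # no neighbor pair to test
        neighbor_pairs = list(combinations(nbrs, 2))
        linked = sum(1 for a, b in neighbor_pairs if b in adj[a])
        values.append(linked / len(neighbor_pairs))
    return sum(values) / len(values) if values else 0.0


if __name__ == "__main__":
    triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
    print(average_distance(triangle), clustering_coefficient(triangle))
```

Each pair is visited once from each endpoint in `average_distance`, which leaves the average unchanged for an undirected graph.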
Small World Networks: many examples
- Film actors: an edge means the actors appeared in a film together
- Power grid: an edge represents a high-voltage transmission line between generators, transformers, or substations
- Neural network of the worm C. elegans: two neurons are joined by an edge if connected by a synapse or gap junction

              L actual   L random   C actual   C random
Film actors   3.65       2.99       0.79       0.00027
Power grid    18.7       12.4       0.080      0.005
C. elegans    2.65       2.25       0.28       0.05

Data from Watts-Strogatz
Models
- Regular network, e.g. $C_n^k$: high clustering coefficient ($C \approx 3/4$), high average distance ($L \approx n/2k$)
- Random network, e.g. $G(n, k/n)$: low average distance ($L \approx \ln(n)/\ln(k)$), low clustering coefficient ($C \approx k/n$)
Watts & Strogatz Model
- Add a small amount of random noise.
- Start with a regular graph, e.g. $C_n^k$.
- Randomly “rewire” each edge with probability p.
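The rewiring procedure can be sketched as below. Several details are assumptions made for this example rather than statements from the slides: the regular graph is taken to be a ring in which each vertex is joined to its k/2 nearest neighbors on each side (k even, n > k), a rewired edge keeps one endpoint and moves the other to a uniformly random vertex, and self-loops and duplicate edges are avoided. The function names are hypothetical.

```python
import random


def ring_lattice(n, k):
    """Ring of n vertices, each joined to k // 2 neighbors on each side."""
    if k % 2 or not 0 < k < n:
        raise ValueError("need k even and 0 < k < n")
    adj = {v: set() for v in range(n)}
    for v in range(n):
        for step in range(1, k // 2 + 1):
            w = (v + step) % n
            adj[v].add(w)
            adj[w].add(v)
    return adj


def watts_strogatz(n, k, p, seed=None):
    """Rewire each lattice edge independently with probability p."""
    rng = random.Random(seed)
    adj = ring_lattice(n, k)
    for step in range(1, k // 2 + 1):
        for v in range(n):
            w = (v + step) % n
            if rng.random() >= p:
                continue  # keep this lattice edge
            # Candidate new endpoints: not v itself, not already adjacent.
            candidates = [u for u in range(n) if u != v and u not in adj[v]]
            if not candidates:
                continue
            new = rng.choice(candidates)
            adj[v].remove(w)
            adj[w].remove(v)
            adj[v].add(new)
            adj[new].add(v)
    return adj


if __name__ == "__main__":
    graph = watts_strogatz(1000, 10, 0.01, seed=0)
    print("edges:", sum(len(nbrs) for nbrs in graph.values()) // 2)
```

Rewiring preserves the number of edges, so p = 0 returns the regular lattice unchanged and p = 1 gives a graph with the same edge count but mostly random connections.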