1 On the Eigenvalue Power Law. Milena Mihail (Georgia Tech) & Christos Papadimitriou (U.C. Berkeley)
2 Internet Measurement and Models. Network and application studies need properties and models of Internet graphs and Internet traffic. Shift of networking paradigm: open, decentralized, dynamic. Intense measurement efforts. Intense modeling efforts. [Figure labels: Routers, WWW, P2P.]
4 Real Internet Graphs. CAIDA, http://www.caida.org. Average degree = constant. A few degrees VERY LARGE. Degrees not sharply concentrated around their mean.
5 Degree-Frequency Power Law. [Plot: frequency vs. degree.] WWW measurement: Kumar et al 99. Internet measurement: Faloutsos et al 99. E[d] = const., but no sharp concentration.
6 Degree-Frequency Power Law. [Plot: frequency vs. degree.] E[d] = const., but no sharp concentration. Erdős-Rényi: sharp concentration. Models by Kumar et al 00, Bollobas et al 01, Fabrikant et al 02.
7 Rank-Degree Power Law. [Plot: degree vs. rank. Labeled points: UUNET, Sprint, C&W USA, AT&T, BBN.] Internet measurement: Faloutsos et al 99.
8 Eigenvalue Power Law. [Plot: eigenvalue vs. rank.] Internet measurement: Faloutsos et al 99.
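A power law y = c * x^a is a straight line with slope a on log-log axes, which is how the degree-frequency, rank-degree, and rank-eigenvalue plots are usually read. The sketch below estimates that slope by ordinary least squares on the logarithms. The function name and the sample data are hypothetical, and a least-squares fit is only a simple (statistically crude) estimator, not necessarily the procedure used in the cited measurements.

```python
import math

def loglog_slope(xs, ys):
    """Least-squares slope and intercept of log(y) against log(x).

    For data following y = c * x**a this returns (a, log(c)).
    All xs and ys must be strictly positive.
    """
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((a - mean_x) ** 2 for a in lx)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Synthetic rank-degree data (made up for illustration): degree = 100 * rank^(-1/2).
ranks = list(range(1, 11))
degrees = [100.0 * r ** -0.5 for r in ranks]
slope, intercept = loglog_slope(ranks, degrees)
print("estimated exponent:", slope)
```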
14 Main Result of the Paper: The largest eigenvalues of the adjacency matrix of a graph whose large degrees are power-law distributed (Zipf) are also power-law distributed. Explains Internet measurements. Negative implications for the spectral filtering method in information retrieval.
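As a worked instance of this statement, assume the i-th largest degree follows a Zipf-type law with a hypothetical exponent β and constant c (neither is given on the slides), and assume the i-th largest eigenvalue tracks the square root of the i-th largest degree, as it does exactly for a single star. Under those assumptions:

```latex
d_i \approx c\, i^{-\beta}
\quad\Longrightarrow\quad
\lambda_i \approx \sqrt{d_i} \approx \sqrt{c}\; i^{-\beta/2}
```

so the eigenvalues would again follow a power law in the rank i, with half the exponent of the degrees.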
15 Random Graph Model. Let [model definition missing from transcript]. Connectivity analyzed by Chung & Lu ’01.
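The model definition did not survive in the transcript. The sketch below assumes it is the expected-degree random graph usually associated with Chung & Lu: each vertex i has a target weight w_i, and the edge {i, j} is included independently with probability min(1, w_i * w_j / sum_k w_k). The function name, the seed, and the Zipf-like weights are hypothetical choices for illustration only.

```python
import math
import random

def expected_degree_graph(weights, seed=0):
    """Sample a simple undirected graph as a list of neighbor sets.

    ASSUMPTION: edge {i, j}, i != j, is present independently with
    probability min(1, w_i * w_j / sum(w)). No self-loops are generated,
    so realized degrees are only approximately the target weights.
    """
    rng = random.Random(seed)
    n = len(weights)
    total = sum(weights)
    adj = [set() for _ in range(n)]
    if total <= 0:
        return adj
    for i in range(n):
        for j in range(i + 1, n):
            p = min(1.0, weights[i] * weights[j] / total)
            if rng.random() < p:
                adj[i].add(j)
                adj[j].add(i)
    return adj

# Hypothetical Zipf-like target weights: w_rank = 30 / sqrt(rank).
weights = [30.0 / math.sqrt(rank) for rank in range(1, 301)]
graph = expected_degree_graph(weights)
degrees = sorted((len(nb) for nb in graph), reverse=True)
print("largest realized degrees:", degrees[:5])
print("largest target weights:  ", [round(w, 1) for w in weights[:5]])
```

The quadratic loop over vertex pairs is fine for a few hundred vertices; larger experiments would need a faster sampler.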
20 Proof: Step 2: Vertex Disjoint Stars. The degree of each vertex-disjoint star is sharply concentrated around its mean d_i. Hence its principal eigenvalue is sharply concentrated around sqrt(d_i).
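The link between a star's degree and its principal eigenvalue can be checked numerically: a star with d leaves has adjacency eigenvalues +sqrt(d) and -sqrt(d) (and zeros). The sketch below uses a shifted power iteration written with the standard library only; the function names, the shift, and the iteration count are illustrative choices, not taken from the slides.

```python
import math

def star_adjacency(d):
    """Adjacency lists of a star: vertex 0 is the center, 1..d are leaves."""
    adj = [[] for _ in range(d + 1)]
    for leaf in range(1, d + 1):
        adj[0].append(leaf)
        adj[leaf].append(0)
    return adj

def principal_eigenvalue(adj, shift=1.0, iters=500):
    """Largest adjacency eigenvalue via power iteration on A + shift*I.

    A star is bipartite, so its spectrum is symmetric about zero and plain
    power iteration would oscillate; the positive shift breaks that tie.
    """
    n = len(adj)
    x = [1.0] * n
    lam = 0.0
    for _ in range(iters):
        y = [shift * x[v] + sum(x[u] for u in adj[v]) for v in range(n)]
        norm = math.sqrt(sum(t * t for t in y))
        x = [t / norm for t in y]
        # Rayleigh quotient x^T A x of the unshifted matrix at the unit vector x.
        lam = sum(x[v] * sum(x[u] for u in adj[v]) for v in range(n))
    return lam

for d in (4, 25, 100):
    print(d, principal_eigenvalue(star_adjacency(d)), math.sqrt(d))
```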
21 Proof: Step 3: LL, RR, LR-extra. Bounded quantities (the bounds themselves are missing from the transcript): number of edges of LL; max degree of RR; max degree of LR-extra.
23 Proof: Step 4: Matrix Perturbation Theory. Vertex-disjoint stars have principal eigenvalues [value missing from transcript]. All other parts have max eigenvalue [bound missing from transcript]. QED
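The specific bounds are missing from the transcript, but the standard perturbation fact that this step presumably relies on is Weyl's inequality for symmetric matrices. Writing the adjacency matrix as A = S + E, with S the vertex-disjoint stars and E everything else (the symbols S, E, Δ, and m are notation introduced here, not taken from the slides):

```latex
\bigl|\lambda_k(S + E) - \lambda_k(S)\bigr| \;\le\; \|E\|_2 ,
\qquad
\|E\|_2 \;\le\; \min\bigl\{\, \Delta(E),\ \sqrt{2\, m(E)} \,\bigr\}
```

Here Δ(E) is the maximum degree and m(E) the number of edges of the graph whose adjacency matrix is E; the first bound holds because the spectral radius of an adjacency matrix is at most the maximum degree, the second because the spectral norm is at most the Frobenius norm. This would explain why Step 3 controls a maximum degree for some parts and an edge count for another: either quantity bounds the largest eigenvalue of that part.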
25 Implication for Info Retrieval. Term-Norm Distribution Problem: spectral filtering, without preprocessing, reveals only the large degrees. Local information. No “latent semantics”.
26 Implication for Information Retrieval. Term-Norm Distribution Problem: application-specific preprocessing (normalization of degrees) reveals clusters. WWW: related to searching, Kleinberg 97. IR, collaborative filtering, … Internet: related to congestion, Gkantsidis et al 02. Open: formalize “preprocessing”.
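The slides leave the form of the preprocessing open. One common choice, assumed here purely for illustration, is the symmetrically normalized adjacency matrix N = D^(-1/2) A D^(-1/2), whose entry for an edge {i, j} is 1 / sqrt(d_i * d_j). For a graph with no isolated vertices, the vector of square-rooted degrees is an eigenvector of N with eigenvalue 1, so a single large degree no longer inflates the top of the spectrum. The function names and the small example graph are hypothetical.

```python
import math

def normalized_adjacency(adj):
    """Dense matrix N = D^(-1/2) A D^(-1/2) from adjacency lists.

    ASSUMPTION: this particular normalization is one possible reading of
    "normalization of degrees"; the slides do not specify a formula.
    """
    n = len(adj)
    deg = [len(nb) for nb in adj]
    N = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in adj[i]:
            N[i][j] = 1.0 / math.sqrt(deg[i] * deg[j])
    return N

def matvec(M, x):
    """Matrix-vector product for a dense list-of-rows matrix."""
    return [sum(a * b for a, b in zip(row, x)) for row in M]

# A hub (vertex 0) with four leaves: unnormalized top eigenvalue is sqrt(4) = 2.
star = [[1, 2, 3, 4], [0], [0], [0], [0]]
N = normalized_adjacency(star)
v = [math.sqrt(len(nb)) for nb in star]
print("N v:", matvec(N, v))
print("v:  ", v)
```

Which normalization is appropriate, and what it reveals, depends on the application; the check above only illustrates that the top eigenvalue is pinned at 1.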