Statistical perturbation theory for spectral clustering Harrachov, 2007 A. Spence and Z. Stoyanov.


1 Statistical perturbation theory for spectral clustering Harrachov, 2007 A. Spence and Z. Stoyanov

2 Plan of the Talk A. Clustering (Brief overview). B. Deterministic Perturbation Theory. C. Statistical Perturbation Theory.

3 Graph Clustering [Figure: graph on nodes 1 to 7]

4 [Figure: graph on nodes 1 to 7]

5 Graph Clustering + Perturbation [Figure: graph on nodes 1 to 7, marked with "?"]

6 Gene Expression Data Clustering: An Application. There are over 10 000 genes expressed in any one tissue; DNA arrays typically produce very noisy data. Issues: 1. Do genes in the same cluster behave similarly? 2. Do genes in different clusters behave differently?

7 Bi-partite Graphs [Figure: bipartite graph with nodes 1 to 4 on one side and nodes 1 to 3 on the other]

8 Matrix Form

9 A Real Data Matrix (Leukemia)

10 Spectral Clustering: General Idea. Discrete Optimisation Problem (NP-Hard); Real Optimisation Problem (Tractable); Approximation. Exact: Impractical. Heuristic: Practical.

11 Discrete Optimisation → SVD. [Figure labels: Active, Inactive, Active] Solution: Singular Value Decomposition of W scaled
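The slide body did not survive in this transcript, so the following is only a minimal sketch of what "SVD of W scaled" could look like. It assumes the scaling $D_r^{-1/2} W D_c^{-1/2}$ (row and column degree normalisation, as in standard bipartite spectral co-clustering) and a split by the sign of the second singular vectors. The function name `bipartite_spectral_split` and both of these choices are assumptions, not taken from the talk.

```python
import numpy as np

def bipartite_spectral_split(W):
    """Split rows and columns of a nonnegative matrix W into two groups each.

    Assumption: W has no all-zero row or column.
    """
    W = np.asarray(W, dtype=float)
    d_rows = W.sum(axis=1)  # row degrees
    d_cols = W.sum(axis=0)  # column degrees
    # Assumed scaling: W_scaled = D_r^{-1/2} W D_c^{-1/2}
    W_scaled = W / np.sqrt(np.outer(d_rows, d_cols))
    U, s, Vt = np.linalg.svd(W_scaled, full_matrices=False)
    # The leading singular pair is trivial (singular value 1),
    # so the split uses the second pair, rescaled by the degrees.
    row_vec = U[:, 1] / np.sqrt(d_rows)
    col_vec = Vt[1, :] / np.sqrt(d_cols)
    return row_vec >= 0, col_vec >= 0, s

# Toy data: rows 0-1 pair with columns 0-1, rows 2-3 pair with column 2.
W = np.array([[5.0, 5.0, 0.1],
              [5.0, 5.0, 0.1],
              [0.1, 0.1, 5.0],
              [0.1, 0.1, 5.0]])
rows, cols, s = bipartite_spectral_split(W)
print(rows, cols)
```

Rows and columns that receive the same boolean label are placed in the same co-cluster; which group is labelled `True` is arbitrary, because singular vectors are only defined up to sign.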

12 Clustering Algorithm: Summary ACTIVE INACTIVE

13 Literature

14 Types of Graph Matrices

15 How we Cluster

16 Leukemia Data

17 Clustered Leukemia Data

18 Inaccuracies in the Data (Perturbation Theory)

19 Perturbation Theory (Deterministic Noise)

20 Deterministic Perturbation (Symmetric Matrix)
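The formulas on this slide were not captured in the transcript. As a hedged reconstruction of the standard first-order theory, assume a symmetric matrix $A$ with a simple eigenpair $(\lambda, v)$, $v^T v = 1$, and a perturbed family $A(\varepsilon) = A + \varepsilon E$ with $E$ symmetric. The symbols $A$, $E$, $\varepsilon$, $\lambda$, $v$ are assumptions of this sketch, not the talk's notation. Differentiating the eigenvalue equation and the normalisation at $\varepsilon = 0$ gives:

```latex
\begin{aligned}
A(\varepsilon)\,v(\varepsilon) &= \lambda(\varepsilon)\,v(\varepsilon),
\qquad v(\varepsilon)^T v(\varepsilon) = 1,
\qquad A(\varepsilon) = A + \varepsilon E, \\
\lambda'(0) &= v^T E\, v, \\
(A - \lambda I)\,v'(0) &= -\bigl(E - \lambda'(0)\, I\bigr)\,v,
\qquad v^T v'(0) = 0 .
\end{aligned}
```

The second line follows from multiplying the differentiated equation $E v + A v'(0) = \lambda'(0) v + \lambda v'(0)$ on the left by $v^T$ and using $v^T (A - \lambda I) = 0$.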

21 Linear Solve
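Only the slide title is available here, so the following is a sketch of one common way to obtain eigenpair derivatives from a single linear solve, assuming the first-order setting $A(\varepsilon) = A + \varepsilon E$ with symmetric $A$, $E$ and a simple eigenvalue. The bordered system and the helper name `eigenpair_derivative` are assumptions of this example, not necessarily the authors' formulation.

```python
import numpy as np

def eigenpair_derivative(A, E, k=-1):
    """Return (lambda'(0), v'(0), lambda, v) for the k-th eigenpair of A + eps*E.

    Assumes A and E are symmetric and the k-th eigenvalue of A is simple.
    """
    A = np.asarray(A, dtype=float)
    E = np.asarray(E, dtype=float)
    n = A.shape[0]
    eigvals, eigvecs = np.linalg.eigh(A)  # eigenvalues in ascending order
    lam, v = eigvals[k], eigvecs[:, k]
    # Bordered system:
    #   [A - lam*I   v] [v']   [-E v]
    #   [   v^T      0] [mu] = [  0 ]     with mu = -lambda'(0).
    # It is nonsingular when lam is a simple eigenvalue.
    M = np.zeros((n + 1, n + 1))
    M[:n, :n] = A - lam * np.eye(n)
    M[:n, n] = v
    M[n, :n] = v
    rhs = np.concatenate([-E @ v, [0.0]])
    sol = np.linalg.solve(M, rhs)
    return -sol[n], sol[:n], lam, v

A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 1.0]])
E = np.array([[0.5, -1.0, 0.2],
              [-1.0, 0.3, 0.7],
              [0.2, 0.7, -0.4]])
dlam, dv, lam, v = eigenpair_derivative(A, E)
print(dlam, dv)
```

The last row of the system enforces $v^T v'(0) = 0$, which is what removes the singularity of $A - \lambda I$.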

22 Taylor Expansions

23 Rectangular Case → Symmetric
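The reduction itself is not shown in the transcript. One standard way to turn a rectangular problem into a symmetric one, assumed here, is the augmented matrix $B = \begin{pmatrix} 0 & W \\ W^T & 0 \end{pmatrix}$, whose nonzero eigenvalues are $\pm\sigma_i(W)$. The helper name `augmented_symmetric` is hypothetical.

```python
import numpy as np

def augmented_symmetric(W):
    """Embed an m x n matrix W in the symmetric (m+n) x (m+n) matrix [[0, W], [W^T, 0]]."""
    W = np.asarray(W, dtype=float)
    m, n = W.shape
    return np.block([[np.zeros((m, m)), W],
                     [W.T, np.zeros((n, n))]])

W = np.array([[3.0, 1.0, 0.0],
              [1.0, 2.0, 1.0]])
B = augmented_symmetric(W)
singular_values = np.linalg.svd(W, compute_uv=False)  # descending
eigenvalues = np.linalg.eigvalsh(B)                   # ascending
print(singular_values, eigenvalues)
```

With this embedding, symmetric eigenvalue perturbation results applied to $B$ carry over to the singular values and singular vectors of $W$.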

24 Random Perturbations (plan) The Model Issues with the Theory A Possible Solution via Simulations? Experiments

25 The Model [Figure: graph on nodes 1 to 7]

26 Difficulties with Random Matrix Theory (RMT)

27 Deterministic Perturbation → Stochastic Perturbation (simple eigenvector)

28 Deterministic Perturbation → Stochastic Perturbation (simple eigenvalues)

29 PP Plot: Test for Normality (Largest eigenvalue of a Symmetric Matrix)

30 Simulated Random Perturbation (Largest eigenvalue of a Symmetric Matrix)
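The simulation setup is not given in the transcript. The sketch below assumes a symmetric Gaussian noise matrix $E$ (independent $N(0, \sigma^2)$ entries on and above the diagonal, mirrored below) and compares the exact largest eigenvalue of $A + \varepsilon E$ with the first-order model $\lambda + \varepsilon\, v^T E v$. Under this assumed noise model the first-order term is exactly normal with standard deviation $\varepsilon \sigma \sqrt{2 - \sum_i v_i^4}$. The function name and all parameters are hypothetical.

```python
import numpy as np

def simulate_largest_eigenvalue(A, eps, sigma, n_samples, seed=0):
    """Monte Carlo samples of the largest eigenvalue of A + eps*E.

    Assumed noise model: E symmetric, entries on and above the diagonal
    independent N(0, sigma^2).
    """
    rng = np.random.default_rng(seed)
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    lam, V = np.linalg.eigh(A)
    v = V[:, -1]  # unit eigenvector of the largest eigenvalue
    exact = np.empty(n_samples)
    linear = np.empty(n_samples)
    for i in range(n_samples):
        G = rng.normal(0.0, sigma, size=(n, n))
        E = np.triu(G) + np.triu(G, 1).T  # mirror the strict upper triangle
        exact[i] = np.linalg.eigvalsh(A + eps * E)[-1]
        linear[i] = lam[-1] + eps * (v @ E @ v)  # first-order prediction
    # Standard deviation of eps * v^T E v under the assumed noise model.
    predicted_std = eps * sigma * np.sqrt(2.0 - np.sum(v**4))
    return exact, linear, predicted_std

A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 1.0]])
exact, linear, predicted_std = simulate_largest_eigenvalue(
    A, eps=1e-3, sigma=1.0, n_samples=2000)
print(exact.std(), linear.std(), predicted_std)
```

A histogram or PP plot of `exact` against a normal distribution with standard deviation `predicted_std` is one way to check how well the first-order normal approximation holds for a given noise level.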

31 Deterministic Perturbation → Stochastic Perturbation (simple eigenvectors)

32 Results for Laplacian Matrices

33 Functional of the Eigenvector

34 Results for $h^T v_2$

35 PP Plot of $h^T v'(0)$: Test for Normality ($h = e_j$)

36 Histogram of $h^T v'(0)$: Simulations ($h = e_j$)

37 PP Plot of Simulated v [j] (  ) (Distribution close to Normal)

38 Histogram of Simulated v [j] (  ) (Distribution close to Normal)

39 Extension to the Rectangular Case

40 Probability of “Wrong Clustering”
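The slide's formula is missing from the transcript. A plausible first-order reading, stated here as an assumption, is that node $j$ is assigned by the sign of the eigenvector entry $v_j$, and that the perturbed entry is approximately $v_j + \varepsilon Z$ with $Z \sim N(0, \tau^2)$ (the normality suggested by the PP plots of $h^T v'(0)$ with $h = e_j$). The probability of a sign flip is then $\Phi\bigl(-|v_j| / (\varepsilon \tau)\bigr)$. The symbol $\tau$ and the function name are hypothetical.

```python
import math

def sign_flip_probability(v_j, eps, tau):
    """P(sign(v_j + eps*Z) != sign(v_j)) for Z ~ N(0, tau^2), first-order model."""
    if eps <= 0 or tau <= 0:
        raise ValueError("eps and tau must be positive")
    z = abs(v_j) / (eps * tau)
    # Standard normal CDF evaluated at -z, written with the complementary error function.
    return 0.5 * math.erfc(z / math.sqrt(2.0))

p_small_entry = sign_flip_probability(v_j=0.05, eps=0.1, tau=1.0)
p_large_entry = sign_flip_probability(v_j=0.50, eps=0.1, tau=1.0)
print(p_small_entry, p_large_entry)
```

Entries of the eigenvector that are close to zero relative to $\varepsilon \tau$ are the ones most likely to change cluster under noise, which is what the comparison of the two calls illustrates.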

41 Issues with Numerics

42 Efficient Simulations

43 Solution via Simulations?

44 Solution via Simulations? (Algorithm)

45 Comparing: Direct Calculation Vs. Repeated Linear Solve

