Characterization of Linkage-Based Algorithms Margareta Ackerman Joint work with Shai Ben-David and David Loker University of Waterloo To appear in COLT.


1 Characterization of Linkage-Based Algorithms Margareta Ackerman Joint work with Shai Ben-David and David Loker University of Waterloo To appear in COLT 2010

2 Motivation: There is a wide variety of clustering algorithms, which often produce very different clusterings. How can we distinguish between clustering algorithms? How should a user decide which algorithm to use for a given application?

3 Our approach for clustering algorithm selection: We propose a framework that lets a user utilize prior knowledge to select an algorithm. We identify properties that distinguish between different clustering paradigms. The properties should be: 1) intuitive and “user-friendly”; 2) useful for classifying clustering algorithms.

4 Previous work: Kleinberg proposes abstract properties (“axioms”) of clustering functions (NIPS, 2002). Bosagh Zadeh and Ben-David provide a set of properties that characterize single-linkage clustering (UAI, 2009).

5 Our contributions: Propose a set of intuitive properties that uniquely identify linkage-based clustering algorithms. Construct a taxonomy of clustering algorithms based on the properties.

6 Outline: Define linkage-based clustering; our new clustering properties; main result; sketch of proof; a taxonomy of common clustering algorithms using clustering properties; conclusions.

7 Formal setup: For a finite domain set X, let d be a dissimilarity function over the members of X. A clustering function F maps an input (X,d) and k > 0 to an output: a k-partition (clustering) of X.

8 Linkage-based algorithm, an informal definition: Start with the clustering of singletons. Merge the closest pair of clusters. Repeat until only k clusters remain. Examples: single linkage, average linkage, complete linkage. Informally, a linkage function is an extension of the between-point distance that applies to subsets of the domain. The choice of the linkage function distinguishes between different linkage-based algorithms.
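The informal procedure on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names (`linkage_cluster`, `single_linkage`, and so on), the representation of d as a callable, and the toy data are all assumptions made for the example. Ties are broken by whichever pair is enumerated first.

```python
from itertools import combinations

# Three common linkage functions: each extends the point distance d
# to a distance between two clusters a and b.
def single_linkage(a, b, d):
    return min(d(p, q) for p in a for q in b)

def average_linkage(a, b, d):
    return sum(d(p, q) for p in a for q in b) / (len(a) * len(b))

def complete_linkage(a, b, d):
    return max(d(p, q) for p in a for q in b)

def linkage_cluster(points, d, k, linkage):
    """Start from singletons and merge the closest pair until k clusters remain."""
    clusters = [frozenset([p]) for p in points]
    while len(clusters) > k:
        a, b = min(combinations(clusters, 2),
                   key=lambda pair: linkage(pair[0], pair[1], d))
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return set(clusters)

# Toy data (hypothetical): points on a line with absolute difference as d.
points = [0, 1, 2, 10, 11, 25]
d = lambda p, q: abs(p - q)
result = linkage_cluster(points, d, 3, single_linkage)
```

Swapping `single_linkage` for `average_linkage` or `complete_linkage` changes only the linkage function, which is the sense in which the choice of linkage function distinguishes the algorithms.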

10 Hierarchical clustering: A clustering C is a refinement of a clustering C’ if every cluster in C’ is a union of some clusters in C. A clustering function F is hierarchical if for every $1 \le k \le k' \le |X|$ and every (X,d), F(X,d,k’) is a refinement of F(X,d,k).
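The refinement relation and the hierarchical property can be checked mechanically on small inputs. The sketch below is an illustration under stated assumptions: `is_refinement` and `is_hierarchical` are hypothetical helper names, and the two toy clustering functions (`gap_clustering`, `even_split`) are invented for the example and are not from the talk. Because refinement is transitive, it is enough to compare consecutive values of k.

```python
def is_refinement(fine, coarse):
    """True if every cluster in `coarse` is a union of clusters in `fine`."""
    fine = [frozenset(c) for c in fine]
    coarse = [frozenset(c) for c in coarse]
    # Both clusterings must partition the same domain.
    if frozenset().union(*fine) != frozenset().union(*coarse):
        return False
    # For partitions of one domain, this is equivalent to the union condition.
    return all(any(f <= c for c in coarse) for f in fine)

def is_hierarchical(F, X, d):
    """Check that F(X, d, k + 1) refines F(X, d, k) for every k on this one input."""
    return all(is_refinement(F(X, d, k + 1), F(X, d, k))
               for k in range(1, len(X)))

def gap_clustering(X, d, k):
    """Toy function: cut the sorted points at the k - 1 largest gaps."""
    xs = sorted(X)
    gaps = sorted(range(1, len(xs)),
                  key=lambda i: d(xs[i - 1], xs[i]), reverse=True)[:k - 1]
    cuts = [0] + sorted(gaps) + [len(xs)]
    return {frozenset(xs[a:b]) for a, b in zip(cuts, cuts[1:])}

def even_split(X, d, k):
    """Toy function: split the sorted points into k nearly equal blocks."""
    xs = sorted(X)
    bounds = [i * len(xs) // k for i in range(k + 1)]
    return {frozenset(xs[a:b]) for a, b in zip(bounds, bounds[1:])}

d = lambda p, q: abs(p - q)
gap_ok = is_hierarchical(gap_clustering, [0, 1, 3, 7, 15], d)
even_ok = is_hierarchical(even_split, [0, 1, 2, 3, 4, 5], d)
```

On these inputs the gap-based function passes, while the even split fails because its 3-clustering contains a block that straddles the two halves of its 2-clustering. A check on one input can refute the property but cannot establish it in general.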

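11 Locality: F is local if for any clustering C output by F and any subset of clusters $C' \subseteq C$, running F on the union of the clusters in C’ (with d restricted to that union) and k = |C’| outputs C’.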

12 Which paradigms satisfy locality? Many clustering algorithms are local: K-means, K-median, single-linkage, average-linkage, complete-linkage. Notably, some clustering algorithms fail locality: ratio cut, normalized cut.

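13 Outer consistency (based on Kleinberg, 2002): Let C = F(X,d,k). If d’ equals d, except for increasing distances between points that lie in different clusters of C, then F(X,d,k) = F(X,d’,k), for all d, X, and k. [Figure: d and d’, with the clusterings F(X,d,3) and F(X,d’,3).]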
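As a rough illustration of outer consistency, the sketch below stretches the between-cluster distances of one clustering and checks that single linkage returns the same output. The helper names (`single_linkage`, `stretch_between`), the stretch factor, and the data are assumptions for this example only, and a single passing check is not a proof of the property.

```python
from itertools import combinations

def single_linkage(X, d, k):
    """Single-linkage clustering: merge the pair with the smallest minimum distance."""
    clusters = [frozenset([x]) for x in X]
    while len(clusters) > k:
        a, b = min(combinations(clusters, 2),
                   key=lambda ab: min(d(p, q) for p in ab[0] for q in ab[1]))
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return set(clusters)

def stretch_between(d, clustering, factor):
    """Return d' equal to d within clusters and factor * d between clusters (factor >= 1)."""
    label = {x: i for i, c in enumerate(clustering) for x in c}
    def d_prime(p, q):
        return d(p, q) if label[p] == label[q] else factor * d(p, q)
    return d_prime

X = [0, 1, 2, 10, 11, 25]
d = lambda p, q: abs(p - q)
C = single_linkage(X, d, 3)
d_prime = stretch_between(d, C, factor=2.5)
unchanged = single_linkage(X, d_prime, 3) == C
```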

14 Which paradigms satisfy outer-consistency? K-means, K-median, single-linkage, average-linkage, complete-linkage. Not all clustering algorithms are outer-consistent: ratio cut, normalized cut.

15 Extended Richness

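17 Extended Richness: F satisfies extended richness if for any set of domains $\{(X_1,d_1), \ldots, (X_k,d_k)\}$ there is a d over $\bigcup_i X_i$ that extends each of the $d_i$ so that $F(\bigcup_i X_i, d, k) = \{X_1, \ldots, X_k\}$.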

18 Many clustering algorithms satisfy extended richness: K-means, K-median, single-linkage, average-linkage, complete-linkage, ratio cut, normalized cut.

20 Our main result: Theorem: A clustering function is linkage-based if and only if it is hierarchical and satisfies outer consistency, locality, and extended richness.

21 Easy direction of proof: Every linkage-based clustering function is hierarchical, local, outer-consistent, and satisfies extended richness. The proof is quite straightforward.

22 Interesting direction of proof: If F is hierarchical and satisfies outer consistency, locality, and extended richness, then F is linkage-based. To prove this direction, we first need to formalize linkage-based clustering by formally defining what a linkage function is.

23 A linkage function is a function $l : \{(X_1, X_2, d) : d \text{ is a distance function over } X_1 \cup X_2\} \to \mathbb{R}^{+}$ that satisfies the following. What do we expect from a linkage function? 1) Representation independent: it doesn’t change if we re-label the data. 2) Monotonic: if we increase edges that go between $X_1$ and $X_2$, then l doesn’t decrease. 3) Any pair of clusters can be made arbitrarily distant: by increasing edges that go between $X_1$ and $X_2$, we can make l reach any value in the range of l.

24 Sketch of proof: Recall the direction: if F is a hierarchical function that satisfies outer consistency, locality, and extended richness, then F is linkage-based. Goal: define a linkage function l so that the linkage-based clustering based on l outputs F(X,d,k) (for every X, d, and k).

25 Sketch of proof (continued): Define an operator $<_F$: $(A,B,d_1) <_F (C,D,d_2)$ if, when we run F on $(A \cup B \cup C \cup D, d)$, where d extends $d_1$ and $d_2$, A and B are merged before C and D.

27 Sketch of proof (continued): Prove that $<_F$ can be extended to a partial ordering. Use the ordering to define l. Here $<_F$ is the operator from slide 25: $(A,B,d_1) <_F (C,D,d_2)$ if, when we run F on $(A \cup B \cup C \cup D, d)$, where d extends $d_1$ and $d_2$, A and B are merged before C and D.

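28 Sketch of proof (continued): Show that $<_F$ is a partial ordering. We show that $<_F$ is cycle-free. Lemma: Given a function F that is hierarchical, local, outer-consistent and satisfies extended richness, there are no $(A_1,B_1,d_1), \ldots, (A_n,B_n,d_n)$ so that $(A_1,B_1,d_1) <_F \cdots <_F (A_n,B_n,d_n)$ and $(A_1,B_1,d_1) = (A_n,B_n,d_n)$.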

29 Sketch of proof (continued): By the above lemma, the transitive closure of $<_F$ is a partial ordering. This implies that there exists an order-preserving function l that maps pairs of data sets to $\mathbb{R}$. It can be shown that l satisfies the properties of a linkage function.
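The step from a cycle-free relation to an order-preserving real-valued function can be illustrated on a finite toy relation. This is only a sketch of the idea, not the construction used in the proof, which works over all pairs of data sets. The labels "a" to "d" are hypothetical stand-ins for triples such as (A,B,d_1), and `order_preserving_map` is a name invented here. It relies on `graphlib` from the Python standard library (Python 3.9 or later), which raises `CycleError` if the relation contains a cycle.

```python
from graphlib import TopologicalSorter

def order_preserving_map(relation):
    """Map each element to a number so that x < y in `relation` implies l[x] < l[y]."""
    # Predecessor sets: preds[y] holds every x with (x, y) in the relation.
    preds = {}
    for lo, hi in relation:
        preds.setdefault(hi, set()).add(lo)
        preds.setdefault(lo, set())
    level = {}
    # static_order() yields each element after all of its predecessors.
    for node in TopologicalSorter(preds).static_order():
        level[node] = 1 + max((level[p] for p in preds[node]), default=-1)
    return level

# Hypothetical cycle-free relation: a < b, b < c, a < d.
relation = [("a", "b"), ("b", "c"), ("a", "d")]
l = order_preserving_map(relation)
```

Each element receives the length of the longest chain below it, so the map respects the transitive closure of the relation as well as the listed pairs.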

31 Taxonomy of clustering algorithms: [Table of algorithms against properties; the cell marks are not preserved in this transcript. Columns: Local, Outer Con., Inner Con., Hierarchical, Path Dist., Order Inv., Extended Rich., Scale Inv., Iso. Inv. Rows: single linkage, average linkage, complete linkage, K-means, K-median, Min-Sum, ratio-cut, normalized-cut.]

32 Characterization of Linkage-Based Algorithms: [Same table as slide 31.]

33 Characterization of Single-Linkage, by Bosagh Zadeh and Ben-David (UAI, 09): [Same table as slide 31.]

34 Distinguishing among Linkage-Based Algorithms: [Same table as slide 31.]

35 Distinguishing among Linkage-Based Algorithms: A function F is order invariant if, for all d and d’ such that d(p,q) < d(r,s) iff d’(p,q) < d’(r,s) for all points p, q, r, s, we have F(X,d) = F(X,d’). [Table restricted to the rows single linkage, average linkage, and complete linkage, with the same property columns as slide 31.]
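Order invariance can be probed by replacing d with an order-preserving transform such as its square. In the sketch below, the function names and the data set are assumptions chosen for illustration. On this particular input, single linkage returns the same 3-clustering under d and under d squared, while average linkage does not, because averaging is sensitive to the magnitudes of the distances and not only to their order.

```python
from itertools import combinations

def linkage_cluster(X, d, k, linkage):
    """Generic linkage-based clustering: merge the closest pair until k clusters remain."""
    clusters = [frozenset([x]) for x in X]
    while len(clusters) > k:
        a, b = min(combinations(clusters, 2),
                   key=lambda ab: linkage(ab[0], ab[1], d))
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return set(clusters)

def single(a, b, d):
    return min(d(p, q) for p in a for q in b)

def average(a, b, d):
    return sum(d(p, q) for p in a for q in b) / (len(a) * len(b))

# Hypothetical data: squaring preserves the order of all pairwise distances.
X = [0, 20, 50, 1000, 1041]
d = lambda p, q: abs(p - q)
d_sq = lambda p, q: abs(p - q) ** 2

single_same = linkage_cluster(X, d, 3, single) == linkage_cluster(X, d_sq, 3, single)
average_same = linkage_cluster(X, d, 3, average) == linkage_cluster(X, d_sq, 3, average)
```

Under d, average linkage merges 50 into {0, 20} (average distance 40, against 41 for the pair 1000 and 1041). Under the squared distances the comparison flips (1700 against 1681), so 1000 and 1041 are merged instead.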

36 When “natural” properties are not satisfied: [Same table as slide 31.]

37 Properties and axioms: [Same table as slide 31.]

38 Advantages of the framework: Using this framework, clustering users can utilize prior knowledge to determine which properties make sense for their application. The goal is to construct a property-based taxonomy for many useful clustering algorithms. Using this approach, a user will be able to find a suitable algorithm without the overhead of executing many clustering algorithms.

39 Conclusions: We introduced new properties of clustering algorithms. We used these properties to provide a characterization of linkage-based algorithms. We classified common clustering algorithms using these properties.

40 Axioms of clustering: Kleinberg (NIPS, 02) proposed 3 “axioms” of clustering functions, which he showed to be inconsistent. Ackerman and Ben-David (NIPS, 08) showed that these properties are consistent in the setting of clustering quality measures. Goal: find a consistent set of axioms of clustering functions.

41 Axioms vs. properties: An axiom is a property that is satisfied by all members of a class. A complete set of axioms of clustering functions would be satisfied by all clustering functions, and only by clustering functions. Our goal is to find a complete set of axioms of clustering. We use Kleinberg’s axioms as a starting point. If we fix k, Kleinberg’s axioms are consistent.

42 Kleinberg’s axioms for fixed k: Scale invariance: The output of the function doesn’t change if the data is scaled uniformly. Satisfied by common clustering algorithms. Richness: For every k-clustering C of X, there exists a distance function d over X so that F(X,d,k) = C. Richness is implied by extended richness. Satisfied by common clustering algorithms. Consistency: If d’ equals d, except for increasing between-cluster distances and decreasing within-cluster distances, then F(X,d,k) = F(X,d’,k). Not satisfied by some common clustering algorithms. Relaxations of this property, inner and outer consistency, are also not satisfied by some common algorithms.

43 Towards axioms of clustering functions: We propose using the following as axioms of clustering: scale invariance, isomorphism invariance, extended richness. Are there natural clustering functions that fail any of these properties? Are these axioms sufficient?

44 Properties and axioms: [Table of algorithms against properties; the cell marks are not preserved in this transcript. Columns: Local, Outer Con., Inner Con., Hierarchical, Path Dist., Order Inv., Extended Rich., Scale Inv., Iso. Inv. Rows: single linkage, average linkage, complete linkage, K-means, K-median, Min-Sum, ratio-cut, normalized-cut.]

