1 “Clustering by Passing Messages Between Data Points”
Brendan J. Frey and Delbert Dueck

2 The Problem - Clustering
We want to assign our data points to categories, based on similarity
Common approach - define clusters by “representative” points, assign each data point to its best representative
Most formulations are NP-hard and must be approximated heuristically

3 Affinity Propagation - Informal Description
Our data points are nodes in a graph
Each iteration, every node gets messages from all other nodes: “How similar is the candidate? How much do other nodes support it?”
Messages estimate the suitability of nodes to serve as exemplars for others
Messages from previous rounds are used to compute the new set of messages

4 [image-only slide]

5 Affinity Propagation - Message Passing
Input: similarity matrix S (one way to construct it is sketched after this slide)
We iteratively build a “Responsibility” matrix R and an “Availability” matrix A
R(i,k) represents how much i favors k as an exemplar candidate (“I want to vote for you” / “I don’t want to vote for you”)
A(i,k) represents how much k favors being an exemplar for i (“You should vote for me” / “No, don’t vote for me”)
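As a concrete starting point, here is a minimal sketch of building S in NumPy, assuming the paper's choice of negative squared Euclidean distance as the similarity (the function name is ours):

    import numpy as np

    def similarity_matrix(X):
        # s(i,k) = -||x_i - x_k||^2, the similarity used in the paper's examples
        diff = X[:, None, :] - X[None, :, :]   # shape (n, n, d) of pairwise differences
        return -(diff ** 2).sum(axis=-1)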

6 Affinity Propagation - Complete Procedure
Initialize A, R to all 0
Update responsibilities: R(i,k) = S(i,k) - max{ A(i,k') + S(i,k') } over all k' != k
Update availabilities: A(i,k) = min{ 0, R(k,k) + SUM over i' not in {i,k} of max{ 0, R(i',k) } }
Self-availability: A(k,k) = SUM over i' != k of max{ 0, R(i',k) }
Iterate until convergence
E = A + R; the exemplar for node i is the node k with largest E(i,k) (a runnable sketch of the full loop follows)
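Below is a minimal NumPy sketch of this procedure, not the authors' reference code. One assumption beyond the slide: messages are damped (new = 0.5 * old + 0.5 * update), which the paper uses to avoid oscillations; the fixed iteration count stands in for a real convergence test.

    import numpy as np

    def affinity_propagation(S, damping=0.5, max_iter=200):
        # S: (n, n) similarity matrix, with the preferences on its diagonal
        n = S.shape[0]
        I = np.arange(n)
        R = np.zeros((n, n))   # responsibilities
        A = np.zeros((n, n))   # availabilities

        for _ in range(max_iter):
            # Responsibility: R(i,k) = S(i,k) - max over k' != k of [A(i,k') + S(i,k')]
            AS = A + S
            top = AS.argmax(axis=1)
            first = AS[I, top].copy()
            AS[I, top] = -np.inf               # mask each row's maximum...
            second = AS.max(axis=1)            # ...to find the runner-up
            max_others = np.repeat(first[:, None], n, axis=1)
            max_others[I, top] = second        # at the argmax column, the "max excluding k" is the runner-up
            R = damping * R + (1 - damping) * (S - max_others)

            # Availability: A(i,k) = min(0, R(k,k) + SUM over i' not in {i,k} of max(0, R(i',k)))
            #               A(k,k) = SUM over i' != k of max(0, R(i',k))
            Rp = np.maximum(R, 0)
            np.fill_diagonal(Rp, np.diag(R))   # keep R(k,k) itself unclipped
            cols = Rp.sum(axis=0)              # R(k,k) + all positive off-diagonal R(.,k)
            A_new = cols[None, :] - Rp         # subtract each row's own contribution (i' != i)
            diag = np.diag(A_new).copy()       # diagonal already equals the sum over i' != k
            A_new = np.minimum(0, A_new)
            np.fill_diagonal(A_new, diag)      # self-availability is not clipped at 0
            A = damping * A + (1 - damping) * A_new

        E = A + R
        return E.argmax(axis=1)   # exemplar index chosen by each point

Points that pick the same k form one cluster; any point whose best k is itself is acting as an exemplar.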

7 Nuance - Diagonal of Similarity Matrix
Diagonal values S(i,i) are called “preferences”
Govern how much node i wants to vote for itself
A parameter of the algorithm; greatly affects clustering behavior
Higher values - more, smaller clusters
Lower values - fewer, larger clusters
Typically set to median(S) (see the example below)
Can be used to include/exclude candidates based on prior information
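scikit-learn ships an implementation that accepts a precomputed similarity matrix, which makes the preference effect easy to see; the data set and the two preference values below are just for illustration:

    import numpy as np
    from sklearn.cluster import AffinityPropagation
    from sklearn.datasets import make_blobs

    X, _ = make_blobs(n_samples=150, centers=3, random_state=0)
    S = -((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)

    # Higher preference -> more, smaller clusters; lower -> fewer, larger ones
    for pref in (np.median(S), S.min()):
        ap = AffinityPropagation(affinity="precomputed", preference=pref,
                                 random_state=0).fit(S)
        print(f"preference={pref:.0f}: {len(ap.cluster_centers_indices_)} clusters")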

8 [image-only slide]

9 Advantages
Doesn’t depend on geometric distances, symmetry, or centroids
Doesn’t require any initial guesstimate of cluster exemplars
Desired number of clusters doesn’t need to be fixed up front
Similarity diagonal (a priori suitability of exemplars) can be flexibly tuned
Easy to implement
Matrix-based implementation yields good runtime

10 Disadvantages
Quadratic space and runtime complexity
Entire similarity matrix must be computed up front
Desired number of clusters can’t be specified directly
Not always clear how best to tune the similarity diagonal
Evident preference for many small clusters, not always desirable

11 Images stolen from Frey, B. J., & Dueck, D. (2007). Clustering by passing messages between data points. Science, 315(5814), 972-976.

