Clustering.


1 Clustering

2 Revision of Yesterday's Algorithm

3 K-Means Algorithm
Each cluster is represented by the mean value of the objects in the cluster.
Input: set of n objects, number of clusters k
Output: set of k clusters
Algorithm:
Randomly select k samples and mark them as the initial clusters.
Repeat:
Assign (or reassign) each sample to the cluster to which it is most similar, based on the mean of the cluster.
Update each cluster's mean.
Until no change.

4 K-Means (graph)
Step 1: Form k centroids, randomly.
Step 2: Calculate the distance between the centroids and each object. Use the Euclidean distance to determine the minimum distance: $d(A,B) = \sqrt{(x_2-x_1)^2 + (y_2-y_1)^2}$
Step 3: Assign objects to the k clusters based on minimum distance.
Step 4: Calculate the centroid of each cluster using $C = \left(\frac{x_1+x_2+\dots+x_n}{n},\ \frac{y_1+y_2+\dots+y_n}{n}\right)$
Go to Step 2. Repeat until there is no change in the centroids.
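The four steps can be sketched in plain Python as follows. This is a minimal illustration for 2-D points, not a reference implementation: the function names, the `seed` and `max_iter` parameters, and the choice to keep the old centroid when a cluster ends up empty are all assumptions made for the example.

```python
import math
import random

def euclidean(a, b):
    """Straight-line distance between two 2-D points."""
    return math.sqrt((b[0] - a[0]) ** 2 + (b[1] - a[1]) ** 2)

def centroid(points):
    """Mean of the x-coordinates and mean of the y-coordinates."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def k_means(points, k, seed=0, max_iter=100):
    rng = random.Random(seed)
    # Step 1: pick k objects at random as the initial centroids.
    centroids = rng.sample(points, k)
    for _ in range(max_iter):
        # Steps 2-3: assign each object to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: euclidean(p, centroids[i]))
            clusters[nearest].append(p)
        # Step 4: recompute each centroid (assumption: an empty cluster keeps its old one).
        new_centroids = [centroid(c) if c else centroids[i]
                         for i, c in enumerate(clusters)]
        if new_centroids == centroids:   # no change in centroids: stop
            break
        centroids = new_centroids
    return centroids, clusters
```

Calling `k_means(points, 2)` on a list of `(x, y)` tuples returns the final centroids and the list of clusters. Because the start is random, different seeds can give different results on less clearly separated data.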

5 K-Medoid (PAM)
Also called Partitioning Around Medoids.
Step 1: Choose k medoids.
Step 2: Assign all points to the closest medoid.
Step 3: Form a distance matrix for each cluster and choose the next best medoid, i.e., the point closest to all other points in the cluster.
Go to Step 2. Repeat until there is no change in any medoid.
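A rough sketch of these steps in Python is given below. It follows the simplified alternating scheme described on the slide (assign, then re-pick the medoid inside each cluster), which is not the full swap-based PAM procedure. The function name, the random initial choice, and the `seed` and `max_iter` parameters are assumptions for illustration; `math.dist` requires Python 3.8 or later.

```python
import math
import random

def k_medoids(points, k, seed=0, max_iter=100):
    rng = random.Random(seed)
    # Step 1: choose k medoids (assumption: chosen at random from the data).
    medoids = rng.sample(points, k)
    for _ in range(max_iter):
        # Step 2: assign every point to its closest medoid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, medoids[i]))
            clusters[nearest].append(p)
        # Step 3: within each cluster, the next medoid is the point with the
        # smallest total distance to all other points in that cluster.
        new_medoids = [
            min(c, key=lambda cand: sum(math.dist(cand, q) for q in c)) if c else medoids[i]
            for i, c in enumerate(clusters)
        ]
        if new_medoids == medoids:   # no change in any medoid: stop
            break
        medoids = new_medoids
    return medoids, clusters
```

Unlike a k-means centroid, each medoid returned here is always one of the original data points.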

6 What are Agglomerative Algorithms?
Bottom-up approach
Simple
Outputs a hierarchy
Structure is more informative
No need to specify the number of clusters

7 Dendrogram

8 Euclidean Distance

9 Distance Matrix

10 Agglomerative Algorithm
Step 1: Make each object a cluster.
Step 2: Calculate the Euclidean distance from every point to every other point, i.e., construct a distance matrix.
Step 3: Identify the two clusters with the shortest distance and merge them.
Go to Step 2. Repeat until all objects are in one cluster.
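The loop above can be sketched as a naive Python function. This version assumes the single-link rule (the distance between two clusters is the distance between their closest members) and recomputes distances from the raw points on every pass instead of updating a stored matrix. The function name and the shape of the returned merge history are choices made for this example.

```python
import math

def single_link_agglomerative(points):
    """Repeatedly merge the two closest clusters; return the merge history."""
    # Step 1: every object starts as its own cluster (stored as point indices).
    clusters = [[i] for i in range(len(points))]
    merges = []
    while len(clusters) > 1:
        best = None
        # Steps 2-3: find the pair of clusters with the shortest distance.
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = min(math.dist(points[i], points[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((clusters[a], clusters[b], d))
        merged = clusters[a] + clusters[b]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (a, b)] + [merged]
    return merges
```

Each entry of the returned list records which two clusters were merged and at what distance, which is the information a dendrogram displays.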

11 Agglomerative Algorithm Approaches
Single Link
Complete Link
Average Link
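The three approaches differ only in how the distance between two clusters is defined. A small sketch, using the standard definitions (minimum, maximum, and mean of all pairwise point distances); the function names are illustrative:

```python
import math
from itertools import product

def pair_distances(c1, c2):
    """All Euclidean distances between a point of c1 and a point of c2."""
    return [math.dist(p, q) for p, q in product(c1, c2)]

def single_link(c1, c2):
    # Distance between the closest pair of points.
    return min(pair_distances(c1, c2))

def complete_link(c1, c2):
    # Distance between the farthest pair of points.
    return max(pair_distances(c1, c2))

def average_link(c1, c2):
    # Mean of all pairwise distances.
    d = pair_distances(c1, c2)
    return sum(d) / len(d)
```

For the clusters `[(0, 0), (1, 0)]` and `[(3, 0), (5, 0)]` the pairwise distances are 3, 5, 2 and 4, so the single-link distance is 2, the complete-link distance is 5, and the average-link distance is 3.5.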

12 Simple Example
Item: E, A, C, B, D
1, 2, 3, 5, 6

13 Another Example
Use the single-link technique to find clusters in the given data.
Point  X     Y
1      0.40  0.53
2      0.22  0.38
3      0.35  0.32
4      0.26  0.19
5      0.08  0.41
6      0.45  0.30

14 Plot given data

15 Construct a distance matrix
     1     2     3     4     5     6
1    0
2    0.24  0
3    0.22  0.15  0
4    0.37  0.20  0.15  0
5    0.34  0.14  0.28  0.29  0
6    0.23  0.25  0.11  0.22  0.39  0
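As a cross-check, the matrix can be computed directly from the six points of the example. The sketch below is one way to do it; the dictionary layout and function name are assumptions. Values computed this way may differ from the slide's figures by about 0.01 in places, since the slide's numbers appear to have been rounded differently.

```python
import math

# The six (X, Y) points from the example, keyed by point number.
points = {1: (0.40, 0.53), 2: (0.22, 0.38), 3: (0.35, 0.32),
          4: (0.26, 0.19), 5: (0.08, 0.41), 6: (0.45, 0.30)}

def distance_matrix(pts):
    """Euclidean distance between every pair of points, keyed by (i, j)."""
    labels = sorted(pts)
    return {(i, j): math.dist(pts[i], pts[j]) for i in labels for j in labels}

dm = distance_matrix(points)

# Print the lower triangle, one row per point, to two decimals.
for i in sorted(points):
    print(i, " ".join(f"{dm[i, j]:.2f}" for j in sorted(points) if j <= i))

# The two nearest points are the first candidates for merging.
closest = min(((i, j) for (i, j) in dm if i < j), key=dm.get)
```

Here `closest` holds the pair of point numbers with the smallest distance, which is the pair a single-link run would merge first.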

16 Identify two nearest clusters

17 Repeat the process until all objects are in the same cluster

18 Average Link
Average distance matrix

19 Use the data below to draw single-link, complete-link and average-link dendrograms.
Object X Y A 2 B 3 C 1 D E 1.5 0.5

