# Cluster Algorithms Adriano Joaquim de O Cruz ©2006 UFRJ

## Presentation on theme: "Cluster Algorithms Adriano Joaquim de O Cruz ©2006 UFRJ"— Presentation transcript:

Cluster Algorithms Adriano Joaquim de O Cruz ©2006 UFRJ adriano@nce.ufrj.br

K-means

*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 3 K-means algorithm Based on the Euclidean distances among elements of the cluster Based on the Euclidean distances among elements of the cluster Centre of the cluster is the mean value of the objects in the cluster. Centre of the cluster is the mean value of the objects in the cluster. Classifies objects in a hard way. Each object belongs to a single cluster. Classifies objects in a hard way. Each object belongs to a single cluster.

*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 4 K-means algorithm Consider n (X={x 1, x 2,..., x n }) objects and k clusters. Consider n (X={x 1, x 2,..., x n }) objects and k clusters. Each object x i is defined by l characteristics x i =(x i1, x i2,..., x il ). Each object x i is defined by l characteristics x i =(x i1, x i2,..., x il ). Consider A a set of k clusters (A={A 1, A 2,..., A k }). Consider A a set of k clusters (A={A 1, A 2,..., A k }).

*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 5 K-means properties The union of all clusters makes the Universe The union of all clusters makes the Universe No element belongs to more than one cluster No element belongs to more than one cluster There is no empty cluster There is no empty cluster

*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 6 Membership function

*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 7 Membership matrix U Matrix containing the values of inclusion of each element into each cluster (0 or 1). Matrix containing the values of inclusion of each element into each cluster (0 or 1). Matrix has c (clusters) lines and n (elements) columns. Matrix has c (clusters) lines and n (elements) columns. The sum of all elements in the column must be equal to one (element belongs only to one cluster The sum of all elements in the column must be equal to one (element belongs only to one cluster The sum of each line must be less than n e grater than 0. No empty cluster, or cluster containing all elements. The sum of each line must be less than n e grater than 0. No empty cluster, or cluster containing all elements.

*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 8 Matrix examples X1X2X3 X4X5X6 Two examples of clustering. What do the clusters represent?

*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 9 Matrix examples cont. X1X2X3 X4X5X6 U1 and U2 are the same matrices.

*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 10 How many clusters? The cardinality of any hard k-partition of n elements is The cardinality of any hard k-partition of n elements is

*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 11 How many clusters (example)? Consider the matrix U2 (k=3, n=6) Consider the matrix U2 (k=3, n=6)

*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 12 K-means inputs and outputs Inputs: the number of clusters c and a database containing n objects with l characteristics each. Inputs: the number of clusters c and a database containing n objects with l characteristics each. Output: A set of k clusters that minimises the square-error criterion. Output: A set of k clusters that minimises the square-error criterion.

*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 13 Number of Clusters

*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 14 K-means algorithm v1 Arbitrarily assigns each object to a cluster (matrix U). Repeat Update the cluster centres; Update the cluster centres; Reassign objects to the clusters to which the objects are most similar; Reassign objects to the clusters to which the objects are most similar; Until no change;

*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 15 K-means algorithm v2 Arbitrarily choose c objects as the initial cluster centres. Repeat Reassign objects to the clusters to which the objects are most similar. Reassign objects to the clusters to which the objects are most similar. Update the cluster centres. Until no change

*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 16 Algorithm details The algorithm tries to minimise the function The algorithm tries to minimise the function d ie is the distance between the element x e (m characteristics) and the centre of the cluster i (v i ) d ie is the distance between the element x e (m characteristics) and the centre of the cluster i (v i )

*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 17 Cluster Centre The centre of the cluster i (v i ) is an l characteristics vector. The centre of the cluster i (v i ) is an l characteristics vector. The jth co-ordinate is calculated as The jth co-ordinate is calculated as

*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 18 Detailed Algorithm Choose c (number of clusters). Choose c (number of clusters). Set error ( > 0) and step (r=0). Set error ( > 0) and step (r=0). Arbitrarily set matrix U (r). Do not forget, each element belongs to a single cluster, no empty cluster and no cluster has all elements. Arbitrarily set matrix U (r). Do not forget, each element belongs to a single cluster, no empty cluster and no cluster has all elements.

*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 19 Detailed Algorithm cont. Repeat Calculate the centre of the clusters v i (r) Calculate the distance d i (r) of each point to the centre of the clusters Generate U (r+1) recalculating all characteristic functions using the equations Until ||U (r+1) -U (r) || < Until ||U (r+1) -U (r) || <

*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 20 Matrix norms Consider a matrix U of n lines and n columns: Consider a matrix U of n lines and n columns: Column norm Column norm Line norm Line norm

*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 21 K-means problems? Suitable when clusters are compact clouds well separated. Suitable when clusters are compact clouds well separated. Scalable because computational complexity is O(nkr). Scalable because computational complexity is O(nkr). Necessity of choosing c is disadvantage. Necessity of choosing c is disadvantage. Not suitable for nonconvex shapes. Not suitable for nonconvex shapes. It is sensitive to noise and outliers because they influence the means. It is sensitive to noise and outliers because they influence the means. Depends on the initial allocation. Depends on the initial allocation.

*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 22 Examples of results

*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 23 K-means: Actual Data

*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 24 K-means: Results

*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 25 K-medoids

*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 26 K-medoids methods K-means is sensitive to outliers since an object with an extremely large value may distort the distribution of data. K-means is sensitive to outliers since an object with an extremely large value may distort the distribution of data. Instead of taking the mean value the most centrally object (medoid) is used as reference point. Instead of taking the mean value the most centrally object (medoid) is used as reference point. The algorithm minimizes the sum of dissimilarities between each object and the medoid (similar to k-means) The algorithm minimizes the sum of dissimilarities between each object and the medoid (similar to k-means)

*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 27 K-medoids strategies Find k-medoids arbitrarily. Find k-medoids arbitrarily. Each remaining object is clustered with the medoid to which is the most similar. Each remaining object is clustered with the medoid to which is the most similar. Then iteratively replaces one of the medoids by a non-medoid as long as the quality of the clustering is improved. Then iteratively replaces one of the medoids by a non-medoid as long as the quality of the clustering is improved. The quality is measured using a cost function that measures the average dissimilarity between the objects and the medoid of its cluster. The quality is measured using a cost function that measures the average dissimilarity between the objects and the medoid of its cluster.

*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 28 Reassignment costs Each time a reassignment occurs a difference in square-error J is contributed. Each time a reassignment occurs a difference in square-error J is contributed. The cost function J calculates the total cost of replacing a current medoid by a non-medoid. The cost function J calculates the total cost of replacing a current medoid by a non-medoid. If the total cost is negative then m j is replaced by m random, otherwise the replacement is not accepted. If the total cost is negative then m j is replaced by m random, otherwise the replacement is not accepted.

*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 29 Replacing medoids case 1 Object p belongs to medoid m j. If m j is replaced by m random and p is closest to one of m i (i<>j), then reassigns p to m i Object p belongs to medoid m j. If m j is replaced by m random and p is closest to one of m i (i<>j), then reassigns p to m i mimi mjmj m random p

*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 30 m random Replacing medoids case 2 Object p belongs to medoid m j. If m j is replaced by m random and p is closest to m random, then reassigns p to m random Object p belongs to medoid m j. If m j is replaced by m random and p is closest to m random, then reassigns p to m random mimi mjmj p

*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 31 m random Replacing medoids case 3 Object p belongs to medoid m i (i<>j). If m j is replaced by m random and p is still close to m i, then does not change. Object p belongs to medoid m i (i<>j). If m j is replaced by m random and p is still close to m i, then does not change. mimi mjmj p

*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 32 m random Replacing medoids case 4 Object p belongs to medoid m i (i<>j). If m j is replaced by m random and p is closest to m random,then reassigns p to m random. Object p belongs to medoid m i (i<>j). If m j is replaced by m random and p is closest to m random,then reassigns p to m random. mimi mjmj p

*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 33 K-medoid algorithm Arbitrarily choose k objects as the initial medoids. Repeat Assign each remaining object to the cluster with the nearest medoid; Randomly select a nonmedoid object, m random ; Compute the total cost J of swapping m j with m random ; If J<0 then swap o j with o random ; Until no change

*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 34Comparisons? K-medoids is more robust than k-means in presence of noise and outliers. K-medoids is more robust than k-means in presence of noise and outliers. K-means is less costly in terms of processing time. K-means is less costly in terms of processing time.

*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 35 Fuzzy C-means

*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 36 Fuzzy C-means Fuzzy version of K-means Fuzzy version of K-means Elements may belong to more than one cluster Elements may belong to more than one cluster Values of characteristic function range from 0 to 1. Values of characteristic function range from 0 to 1. It is interpreted as the degree of membership of an element to a cluster relative to all other clusters. It is interpreted as the degree of membership of an element to a cluster relative to all other clusters.

*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 37 Fuzzy C-means setup Consider n (X={x 1, x 2,..., x n }) objects and c clusters. Consider n (X={x 1, x 2,..., x n }) objects and c clusters. Each object x i is defined by l characteristics x i =(x i1, x i2,..., x il ). Each object x i is defined by l characteristics x i =(x i1, x i2,..., x il ). Consider A a set of k clusters (A={A 1, A 2,..., A k }). Consider A a set of k clusters (A={A 1, A 2,..., A k }).

*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 38 Fuzzy C-means properties The union of all clusters makes the Universe The union of all clusters makes the Universe There is no empty cluster There is no empty cluster

*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 39 Membership function

*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 40 Problems of probabilistic clusters Points representing circle lines (C1 e C2) Points representing circle lines (C1 e C2) Due to normalization strange results may emerge Due to normalization strange results may emerge C1 C2 {c1=0.5,c2= 0.5} C1 C2 {c1=0.5,c2= 0.5} {c1=0.7,c2= 0.3} {c1=0.3,c2= 0.7}

*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 41 Membership matrix U Matrix containing the values of inclusion of each element into each cluster [0,1]. Matrix containing the values of inclusion of each element into each cluster [0,1]. Matrix has c (clusters) lines and n (elements) columns. Matrix has c (clusters) lines and n (elements) columns. The sum of all elements in the column must be equal to one. The sum of all elements in the column must be equal to one. The sum of each line must be less than n e grater than 0. No empty cluster, or cluster containing all elements. The sum of each line must be less than n e grater than 0. No empty cluster, or cluster containing all elements.

*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 42 Matrix examples X1X2X3 X4X5X6 Two examples of clustering. What do the clusters represent?

*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 43 Fuzzy C-means algorithm v1 Arbitrarily assigns each object to a cluster (matrix U). Repeat Update the cluster centres; Update the cluster centres; Reassign objects to the clusters to which the objects are most similar; Reassign objects to the clusters to which the objects are most similar; Until no change;

*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 44 Fuzzy C-means algorithm v2 Arbitrarily choose c objects as the initial cluster centres. Repeat Reassign objects to the clusters to which the objects are most similar. Reassign objects to the clusters to which the objects are most similar. Update the cluster centres. Until no change

*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 45 Algorithm details The algorithm tries to minimise the function, m is the nebulisation factor. The algorithm tries to minimise the function, m is the nebulisation factor. d ie is the distance between the element x e (l characteristics) and the centre of the cluster i (v i ) d ie is the distance between the element x e (l characteristics) and the centre of the cluster i (v i )

*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 46 Nebulisation factor m is the nebulisation factor. m is the nebulisation factor. This value has a range [1, ) This value has a range [1, ) If m=1 the the system is crisp. If m=1 the the system is crisp. If m the all the membership values tend to 1/c If m the all the membership values tend to 1/c The most common values are 1.25 and 2.0 The most common values are 1.25 and 2.0

*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 47 Cluster Centre The centre of the cluster i (v i ) is a l characteristics vector. The centre of the cluster i (v i ) is a l characteristics vector. The jth co-ordinate is calculated as The jth co-ordinate is calculated as

*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 48 Detailed Algorithm Choose c (number of clusters). Choose c (number of clusters). Set error ( > 0), nebulisation factor (m) and step (r=0). Set error ( > 0), nebulisation factor (m) and step (r=0). Arbitrarily set matrix U (r). Do not forget, each element belongs to a single cluster, no empty cluster and no cluster has all elements. Arbitrarily set matrix U (r). Do not forget, each element belongs to a single cluster, no empty cluster and no cluster has all elements.

*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 49 Detailed Algorithm cont. Repeat Calculate the centre of the clusters v i (r) Calculate the distance d i (r) of each point to the centre of the clusters Generate U (r+1) recalculating all characteristic functions(How?) Until ||U (r+1) -U (r) || < Until ||U (r+1) -U (r) || <

*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 50 How to recalculate? If there is any distance greater than zero then membership grade is the weighted average of the distances to all centers. If there is any distance greater than zero then membership grade is the weighted average of the distances to all centers. else the element belongs to this cluster and no one else.

*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 51 Example of clustering result

*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 52 Example of clustering result

*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 53 Possibilistic Clustering

*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 54 Membership function

*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 55 Membership function The membership degree is the representativity of typicality of the datum x for the cluster i. The membership degree is the representativity of typicality of the datum x for the cluster i.

*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 56 Algorithm details The algorithm tries to minimise the function, m is the nebulisation factor. The algorithm tries to minimise the function, m is the nebulisation factor. The first sum is the usual and the second rewards high memberships. The first sum is the usual and the second rewards high memberships.

*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 57 How to calculate?

*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 58 Detailed Algorithm Choose c (number of clusters) and m. Choose c (number of clusters) and m. Set error ( > 0) and step (r=0). Set error ( > 0) and step (r=0). Execute FCM Execute FCM

*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 59 Detailed Algorithm cont. For 2 times Initialize U (0) and the centre of the clusters v i (0) with previous results Initialize k and r=0 Repeat Calculate the distance d i (r) of each point to the centre of the clusters Generate U (r+1) recalculating all characteristic functions using the equations Until ||U (r+1) -U (r) || < Until ||U (r+1) -U (r) || < End FOR

*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 60 Gustafson-Kessel Algorithm

*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 61 Gustafson-Kessel method This method (GK) is fuzzy clustering method similar to the Fuzzy C-means (FCM). This method (GK) is fuzzy clustering method similar to the Fuzzy C-means (FCM). The difference is the way the distance is calculated. The difference is the way the distance is calculated. FCM uses Euclidean distances FCM uses Euclidean distances GK uses Mahalanobis distances GK uses Mahalanobis distances

*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 62 Gustafson-Kessel method Mahalobis distance is calculated as Mahalobis distance is calculated as The matrices A i are given by The matrices A i are given by

*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 63 Gustafson-Kessel method The Fuzzy Covariance Matrix is The Fuzzy Covariance Matrix is

*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 64 GK comments The clusters are hyperellipsoids on the l. The clusters are hyperellipsoids on the l. The hyperellipsoids have aproximately the same size. The hyperellipsoids have aproximately the same size. In order to be possible to calculate S -1 the number of samples n must be at least equal to the number of dimensions l plus 1. In order to be possible to calculate S -1 the number of samples n must be at least equal to the number of dimensions l plus 1.

*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 65 Results of GK

*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 66 Gath-Geva method It is also known as Gaussian Mixture Decomposition. It is also known as Gaussian Mixture Decomposition. It is similar to the FCM method It is similar to the FCM method The Gauss distance is used instead of Euclidean distance. The Gauss distance is used instead of Euclidean distance. The clusters do not have a definite shape anymore and have various sizes. The clusters do not have a definite shape anymore and have various sizes.

*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 67 Gath-Geva Method Gauss distance is given by Gauss distance is given by A i =S i -1 A i =S i -1

*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 68 Gath-Geva Method The term P i is the probability of a sample belong to a cluster. The term P i is the probability of a sample belong to a cluster.

*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 69 Gath-Geva Comments Pi is a parameter that influences the size of a cluster. Pi is a parameter that influences the size of a cluster. Bigger clusters attract more elements. Bigger clusters attract more elements. The exponential term makes more difficult to avoid local minima. The exponential term makes more difficult to avoid local minima. Usually another clustering method is used to initialise the partition matrix U. Usually another clustering method is used to initialise the partition matrix U.

*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 70 GG Results – Random Centers

*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 71 GG Results – Centers FCM

*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 72 Clustering based on Equivalence Relations A relation crisp R on a universe X can be thought as a relation from X to X A relation crisp R on a universe X can be thought as a relation from X to X R is an equivalence relation if it has the following three properties: R is an equivalence relation if it has the following three properties: Reflexivity (x i, x i ) R Reflexivity (x i, x i ) R Symmetry (x i, x j ) R (x j, x i ) R Symmetry (x i, x j ) R (x j, x i ) R Transitivity (x i, x j ) R and (x j, x k ) R (x i, x k ) R Transitivity (x i, x j ) R and (x j, x k ) R (x i, x k ) R

*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 73 Crisp tolerance relation R is a tolerance relation if it has the following two properties: R is a tolerance relation if it has the following two properties: Reflexivity (x i, x i ) R Reflexivity (x i, x i ) R Symmetry (x i, x j ) R (x j, x i ) R Symmetry (x i, x j ) R (x j, x i ) R

*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 74 Composition of Relations XYZ RS T=R°S

*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 75 Composition of Crisp Relations The operation ° is similar to matrix multiplication.

*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 76 Transforming Relations A tolerance relation can be transformed into a equivalence relation by at most (n-1) compositions with itself. A tolerance relation can be transformed into a equivalence relation by at most (n-1) compositions with itself. n is the cardinality of the set R. n is the cardinality of the set R.

*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 77 Example of crisp classification Let X={1,2,3,4,5,6,7,8,9,10} Let X={1,2,3,4,5,6,7,8,9,10} Let R be defined as the relation for the identical remainder after dividing each element by 3. Let R be defined as the relation for the identical remainder after dividing each element by 3. This relation is an equivalence relation This relation is an equivalence relation

*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 78 Relation Matrix

*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 79 Crisp Classification Consider equivalent columns. Consider equivalent columns. It is possible to group the elements in the following classes It is possible to group the elements in the following classes R 0 = {3, 6, 9} R 0 = {3, 6, 9} R 1 = {1, 4, 7, 10} R 1 = {1, 4, 7, 10} R 2 = {2, 5, 8} R 2 = {2, 5, 8}

*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 80 Clustering and Fuzzy Equivalence Relations A relation fuzzy R on a universe X can be thought as a relation from X to X A relation fuzzy R on a universe X can be thought as a relation from X to X R is an equivalence relation if it has the following three properties: R is an equivalence relation if it has the following three properties: Reflexivity: (x i, x i ) R or (x i, x i ) = 1 Reflexivity: (x i, x i ) R or (x i, x i ) = 1 Symmetry: (x i, x j ) R (x j, x i ) R or Symmetry: (x i, x j ) R (x j, x i ) R or (x i, x j ) = (x j, x i ) (x i, x j ) = (x j, x i ) Transitivity: (x i, x j ) and (x j, x k ) R (x i, x k ) R or if (x i, x j ) = 1 and (x j, x k ) = 2 then (x i, x k ) = and >=min( 1, 2 ) Transitivity: (x i, x j ) and (x j, x k ) R (x i, x k ) R or if (x i, x j ) = 1 and (x j, x k ) = 2 then (x i, x k ) = and >=min( 1, 2 )

*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 81 Fuzzy tolerance relation R is a tolerance relation if it has the following two properties: R is a tolerance relation if it has the following two properties: Reflexivity (x i, x i ) R Reflexivity (x i, x i ) R Symmetry (x i, x j ) R (x j, x i ) R Symmetry (x i, x j ) R (x j, x i ) R

*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 82 Composition of Fuzzy Relations The operation ° is similar to matrix multiplication.

*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 83 Distance Relation Let X be a set of data on l. Let X be a set of data on l. The distance function is a tolerance relation that can be transformed into a equivalence. The distance function is a tolerance relation that can be transformed into a equivalence. The relation R can be defined by the Minkowski distance formula. The relation R can be defined by the Minkowski distance formula. is a constant that ensures that R [0,1] and is equal to the inverse of the largest distance in X. is a constant that ensures that R [0,1] and is equal to the inverse of the largest distance in X.

*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 84 Example of Fuzzy classification Let X={(0,0),(1,1),(2,3),(3,1),(4,0)} be a set of points in 2. Let X={(0,0),(1,1),(2,3),(3,1),(4,0)} be a set of points in 2. Set q=2, Euclidean distances. Set q=2, Euclidean distances. The largest distance is 4 ( x 1, x 5 ), so =0.25. The largest distance is 4 ( x 1, x 5 ), so =0.25. The relation R can be calculated by the equation The relation R can be calculated by the equation

*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 85 Points to be classified

*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 86 Tolerance matrix The matrix calculated by the equation is The matrix calculated by the equation is The is a tolerance relation that needs to be transformed into a equivalence relation The is a tolerance relation that needs to be transformed into a equivalence relation

*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 87 Equivalence matrix The matrix transformed is The matrix transformed is

*@2006 Adriano Cruz *NCE e IM - UFRJ Cluster 88 Results of clustering Taking -cuts of fuzzy equivalent relation at various values of =0.44, 0.5, 0.65 and 1.0 we get the following classes: Taking -cuts of fuzzy equivalent relation at various values of =0.44, 0.5, 0.65 and 1.0 we get the following classes: R.44 =[{x 1,x 2,x 3,x 4,x 5 }] R.44 =[{x 1,x 2,x 3,x 4,x 5 }] R.55 =[{x 1,x 2,x 4,x 5 }{x 3 }] R.55 =[{x 1,x 2,x 4,x 5 }{x 3 }] R.65 =[{x 1,x 2 },{x 3 },{x 4,x 5 }] R.65 =[{x 1,x 2 },{x 3 },{x 4,x 5 }] R 1.0 =[{x 1 },{x 2 },{x 3 },{x 4 },{x 5 }] R 1.0 =[{x 1 },{x 2 },{x 3 },{x 4 },{x 5 }]