CLUSTERING DENSITY-BASED METHODS Elsayed Hemayed Data Mining Course.


1 CLUSTERING DENSITY-BASED METHODS Elsayed Hemayed Data Mining Course

2 Outline
• Density-Based Clustering Methods
• Density-Based Clustering Background
• Terminology
• How does DBSCAN find clusters?
• DBSCAN

3 Clustering Methods
• Partitioning methods
  - K-Means
• Hierarchical methods
  - Agglomerative hierarchical clustering
  - Divisive hierarchical clustering
• Density-based methods
  - DBSCAN: Density-Based Spatial Clustering of Applications with Noise
• Grid-based methods
  - STING: A Statistical Information Grid Approach to Spatial Data Mining
• Model-based methods
  - Expectation-Maximization
  - Neural network approach
• High-dimensional data clustering
  - CLIQUE: A Dimension-Growth Subspace Clustering Method

4 DBSCAN

5 Density-Based Clustering Methods
• Clusters are defined by density, for example as sets of density-connected points, rather than by distance to a cluster center or to another cluster.
• Cluster = set of “density-connected” points.
• Major features:
  - Discover clusters of arbitrary shape
  - Handle noise
  - Need “density parameters” as a termination condition (growth stops when no new objects can be added to the cluster)
• Examples:
  - DBSCAN (Ester et al., 1996)
  - OPTICS (Ankerst et al., 1999)
  - DENCLUE (Hinneburg & Keim, 1998)

6 Density-Based Clustering: Background
• Eps-neighborhood: the neighborhood within a radius Eps of a given object.
• MinPts: the minimum number of points required in the Eps-neighborhood of an object.
• Core object: an object whose Eps-neighborhood contains at least MinPts points.
• Directly density-reachable: a point p is directly density-reachable from a point q with respect to Eps and MinPts if
  1) p is within the Eps-neighborhood of q, and
  2) q is a core object.
[Figure: points p and q, with MinPts = 5 and Eps = 1]
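The definitions on this slide can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function names, the toy coordinates, and the choice of Euclidean distance are assumptions, and the Eps-neighborhood is assumed to include the point itself.

```python
import math

def eps_neighborhood(points, i, eps):
    """Indices of all points within distance eps of points[i] (i itself included)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

def is_core(points, i, eps, min_pts):
    """A core object has at least min_pts points in its Eps-neighborhood."""
    return len(eps_neighborhood(points, i, eps)) >= min_pts

def directly_density_reachable(points, p, q, eps, min_pts):
    """True if points[p] is directly density-reachable from points[q]:
    p lies in the Eps-neighborhood of q, and q is a core object."""
    return p in eps_neighborhood(points, q, eps) and is_core(points, q, eps, min_pts)

# Hypothetical toy data: a small dense group plus one point on its edge (index 5).
points = [(0.0, 0.0), (0.5, 0.0), (0.0, 0.5), (-0.5, 0.0), (0.0, -0.5), (1.4, 0.0)]
EPS, MIN_PTS = 1.0, 5

print(is_core(points, 0, EPS, MIN_PTS))                        # True
print(is_core(points, 5, EPS, MIN_PTS))                        # False
print(directly_density_reachable(points, 5, 1, EPS, MIN_PTS))  # True
print(directly_density_reachable(points, 1, 5, EPS, MIN_PTS))  # False
```

The last two lines show that the relation is not symmetric: the edge point is directly density-reachable from a core point, but not the other way round, because the edge point is not itself a core object.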

7 Density Reachability and Density Connectivity
• M, P, O and R are core objects, since the Eps-neighborhood of each contains at least 3 points.
[Figure: MinPts = 3; Eps = radius of the circles]

8 Directly density-reachable
• Q is directly density-reachable from M.
• M is directly density-reachable from P, and vice versa.

9 Indirectly density-reachable
• Q is indirectly density-reachable from P, since Q is directly density-reachable from M and M is directly density-reachable from P.
• However, P is not density-reachable from Q, since Q is not a core object.
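The chain argument above can be checked with a small breadth-first search. This is a sketch under stated assumptions: the function names are hypothetical, Euclidean distance is assumed, and the four collinear points are an invented layout that only mimics the roles of P, M and Q; they are not the coordinates of the slide's figure.

```python
import math
from collections import deque

def neighbors(points, i, eps):
    """Indices of points within eps of points[i] (i itself included)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

def density_reachable_from(points, start, eps, min_pts):
    """Set of indices density-reachable from points[start].

    The chain is extended only through core objects, which is why
    the relation is not symmetric.
    """
    reached = set()
    queue = deque([start])
    while queue:
        q = queue.popleft()
        nbrs = neighbors(points, q, eps)
        if len(nbrs) < min_pts:
            continue  # q is not a core object, so nothing is reachable through it
        for p in nbrs:
            if p not in reached:
                reached.add(p)
                queue.append(p)
    return reached

# Hypothetical layout: index 1 plays the role of P, 2 of M, 3 of Q.
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
P, M, Q = 1, 2, 3
EPS, MIN_PTS = 1.1, 3

print(Q in density_reachable_from(points, P, EPS, MIN_PTS))  # True
print(P in density_reachable_from(points, Q, EPS, MIN_PTS))  # False
```

In this layout P and M each have three points in their neighborhood and are core objects, while Q has only two, so the search started at Q stops immediately.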

10 Core, border, and noise points
• DBSCAN stands for Density-Based Spatial Clustering of Applications with Noise.
• Density = number of points within a specified radius (Eps).
• A point is a core point if it has at least a specified number of points (MinPts) within Eps. Core points lie in the interior of a cluster.
• A border point has fewer than MinPts points within Eps, but lies in the neighborhood of a core point.
• A noise point is any point that is neither a core point nor a border point.
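The three point types can be computed directly from these definitions. The sketch below is illustrative only: `classify_points` is a hypothetical helper, Euclidean distance is assumed, each point is counted as part of its own neighborhood, and the data are invented.

```python
import math

def classify_points(points, eps, min_pts):
    """Label every point as 'core', 'border', or 'noise'."""
    # Eps-neighborhood of every point (brute force, each point includes itself).
    nbrs = [[j for j, q in enumerate(points) if math.dist(p, q) <= eps]
            for p in points]
    core = {i for i, n in enumerate(nbrs) if len(n) >= min_pts}
    labels = []
    for i, n in enumerate(nbrs):
        if i in core:
            labels.append("core")
        elif any(j in core for j in n):
            labels.append("border")  # not dense itself, but next to a core point
        else:
            labels.append("noise")
    return labels

# Hypothetical data: four points in a row and one isolated point.
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (10.0, 10.0)]
print(classify_points(points, eps=1.1, min_pts=3))
# ['border', 'core', 'core', 'border', 'noise']
```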

11 How does DBSCAN find clusters?
• DBSCAN searches for clusters by checking the Eps-neighborhood of each point in the database.
• If the Eps-neighborhood of a point p contains at least MinPts points, a new cluster with p as a core object is created.
• DBSCAN then iteratively collects directly density-reachable objects from these core objects, which may involve merging a few density-reachable clusters.
• The process terminates when no new point can be added to any cluster.

12 DBSCAN Algorithm
• Arbitrarily select a point p.
• Retrieve all points density-reachable from p with respect to Eps and MinPts.
• If p is a core point, a cluster is formed.
• If p is a border point, no points are density-reachable from p, and DBSCAN visits the next point of the database.
• Continue the process until all of the points have been processed.
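The steps above can be sketched as a compact, brute-force implementation. This is an illustrative sketch rather than the original authors' code: the names `dbscan` and `region_query`, the label conventions (`-1` for noise, `0, 1, ...` for clusters), the Euclidean distance, and the sample data are all assumptions.

```python
import math

NOISE = -1

def dbscan(points, eps, min_pts):
    """Return one label per point: a cluster id (0, 1, ...) or NOISE."""

    def region_query(i):
        # Eps-neighborhood of points[i], including i itself (O(n) scan).
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    labels = [None] * len(points)  # None means "not yet processed"
    cluster_id = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = region_query(i)
        if len(seeds) < min_pts:
            # Not a core point: mark as noise for now; it may later be
            # claimed as a border point by a cluster that reaches it.
            labels[i] = NOISE
            continue
        # i is a core point: start a new cluster and grow it.
        labels[i] = cluster_id
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point: joins, but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster_id
            j_nbrs = region_query(j)
            if len(j_nbrs) >= min_pts:
                queue.extend(j_nbrs)  # j is also core: keep expanding through it
        cluster_id += 1
    return labels

# Hypothetical data: two small squares and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (5, 5)]
print(dbscan(points, eps=1.5, min_pts=3))
# [0, 0, 0, 0, 1, 1, 1, 1, -1]
```

The neighborhood query here scans every point, so the whole run is quadratic in the number of points; a spatial index would typically replace `region_query` for larger data sets.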

13 DBSCAN Summary
• DBSCAN is a density-based clustering method based on connected regions with sufficiently high density.
• The algorithm grows regions with sufficiently high density into clusters and discovers clusters of arbitrary shape in spatial databases with noise.
• It defines a cluster as a maximal set of density-connected points. Clusters are therefore formed by density-connectivity rather than by inter-cluster distance, as in hierarchical methods (a distance function is still needed to define the Eps-neighborhood).

14 Summary
• Density-Based Clustering Methods
• Density-Based Clustering Background
• Terminology
• How does DBSCAN find clusters?
• DBSCAN

