College of Science & Technology
Dep. of Computer Science & IT
BSc of Information Technology
Data Mining
Chapter 6: Clustering Methods
Prepared by: Mahmoud Rafeek Al-Farra
2013
Course Outline
Introduction
Data Preparation and Preprocessing
Data Representation
Classification Methods
Evaluation
Clustering Methods
Mid Exam
Association Rules
Knowledge Representation
Special Case Study: Document Clustering
Discussion of Case Studies by Students
Outline
Definition of Clustering
Why clustering?
Where to use clustering?
Next: Types of Data in Cluster Analysis
Next: A Categorization of Major Clustering Methods
Definition of Clustering
Clustering can be considered the most important unsupervised learning problem. Like every other problem of this kind, it deals with finding structure in a collection of unlabeled data. Clustering is “the process of organizing objects into groups whose members are similar in some way”. A cluster is therefore a collection of objects that are “similar” to one another and “dissimilar” to the objects belonging to other clusters.
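The slide does not say how “similar” is measured. As a minimal sketch, assuming Euclidean distance as the similarity measure and two hand-labeled toy groups (both are illustrative assumptions, not part of the original slides), the idea can be checked numerically: distances inside a cluster should be small compared with distances between clusters.

```python
from itertools import combinations, product
from math import dist  # Euclidean distance, available since Python 3.8

# Hypothetical toy data: two hand-labeled groups of 2-D points.
cluster_a = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5)]
cluster_b = [(8.0, 8.0), (8.5, 9.0), (9.0, 8.5)]


def mean_intra_distance(cluster):
    """Average distance between all pairs of points inside one cluster."""
    pairs = list(combinations(cluster, 2))
    return sum(dist(p, q) for p, q in pairs) / len(pairs)


def mean_inter_distance(first, second):
    """Average distance between points taken from two different clusters."""
    pairs = list(product(first, second))
    return sum(dist(p, q) for p, q in pairs) / len(pairs)


intra_a = mean_intra_distance(cluster_a)
intra_b = mean_intra_distance(cluster_b)
inter_ab = mean_inter_distance(cluster_a, cluster_b)

# A sensible grouping has small intra-cluster and large inter-cluster distances.
print(f"intra A: {intra_a:.2f}, intra B: {intra_b:.2f}, inter A-B: {inter_ab:.2f}")
```

Other similarity measures may suit other attribute types better; Euclidean distance is used here only because it is easy to read.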
Definition of Clustering
Cluster: a collection of data objects
  Similar to one another within the same cluster
  Dissimilar to the objects in other clusters
Cluster analysis
  Grouping a set of data objects into clusters
Clustering is unsupervised classification: no predefined classes
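To make "grouping a set of data objects into clusters" with "no predefined classes" concrete, here is a minimal sketch of k-means (Lloyd's iteration). The slides do not name a specific algorithm at this point, so k-means is chosen here purely for illustration. The function names, the toy points, and the hand-picked initial centroids are all assumptions.

```python
from math import dist  # Euclidean distance, available since Python 3.8


def assign(points, centroids):
    """Put each point into the cluster of its nearest centroid."""
    clusters = [[] for _ in centroids]
    for point in points:
        nearest = min(range(len(centroids)), key=lambda i: dist(point, centroids[i]))
        clusters[nearest].append(point)
    return clusters


def k_means(points, initial_centroids, max_iterations=100):
    """Alternate assignment and centroid update until the centroids stop moving."""
    centroids = [tuple(c) for c in initial_centroids]
    for _ in range(max_iterations):
        clusters = assign(points, centroids)
        updated = []
        for centroid, members in zip(centroids, clusters):
            if members:
                # New centroid is the coordinate-wise mean of the members.
                updated.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                updated.append(centroid)  # an empty cluster keeps its centroid
        if updated == centroids:
            break
        centroids = updated
    return centroids, assign(points, centroids)


# Hypothetical unlabeled data; no class labels are supplied anywhere.
points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5), (8.0, 8.0), (8.5, 9.0), (9.0, 8.5)]
centroids, clusters = k_means(points, initial_centroids=[points[0], points[3]])
print(centroids)
print(clusters)
```

The groups come out of the data alone, which is the sense in which clustering is unsupervised. In practice the number of clusters and the starting centroids have to be chosen somehow, and different choices can give different groupings.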
Learning
Why clustering?
Simplification
Pattern detection
Useful in data concept construction
Unsupervised learning process
Where to use clustering?
Data mining
Information retrieval
Text mining
Web analysis
Marketing
Medical diagnostics
Which method should I use?
Type of attributes in the data
Scalability to larger datasets
Ability to work with irregular data
Time cost complexity
Data order dependency
Result presentation
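Of the criteria above, data order dependency is perhaps the least obvious. The sketch below illustrates it with a simple single-pass "leader" clustering, in which each point joins the first existing cluster whose leader lies within a distance threshold. The algorithm, the threshold value, and the toy points are all assumptions chosen for illustration; none of them come from the slides.

```python
from math import dist  # Euclidean distance, available since Python 3.8


def leader_clustering(points, threshold):
    """Single pass: a point joins the first leader within threshold, else leads a new cluster."""
    leaders = []   # first point of each cluster
    clusters = []  # clusters[i] holds the points assigned to leaders[i]
    for point in points:
        for index, leader in enumerate(leaders):
            if dist(point, leader) <= threshold:
                clusters[index].append(point)
                break
        else:
            # No leader is close enough, so this point starts a new cluster.
            leaders.append(point)
            clusters.append([point])
    return clusters


# Hypothetical 1-D data, written as 1-tuples so that math.dist applies.
a, b, c = (0.0,), (1.0,), (2.0,)

order_one = leader_clustering([a, b, c], threshold=1.5)
order_two = leader_clustering([b, a, c], threshold=1.5)
print(len(order_one), "clusters for order a, b, c")
print(len(order_two), "clusters for order b, a, c")
```

The same three points produce two clusters in one order and a single cluster in the other, so a method of this kind is order dependent. A method whose result does not change when the input is shuffled would be preferable when the record order carries no meaning.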
Thanks