Intelligent Database Systems Lab. Advisor: Dr. Hsu. Graduate: Chien-Ming Hsiao. Authors: Bing Liu, Yiyuan Xia, Philip S. Yu. 國立雲林科技大學 National Yunlin University of Science and Technology.

1 Clustering Through Decision Tree Construction. Authors: Bing Liu, Yiyuan Xia, Philip S. Yu. Pattern Recognition, 35 (2002), pp. 2783-2790. Advisor: Dr. Hsu. Graduate: Chien-Ming Hsiao. Intelligent Database Systems Lab, 國立雲林科技大學 National Yunlin University of Science and Technology.

2 Outline: Motivation; Objective; Introduction; Building Cluster Trees; Experimental results; Conclusion; Personal Opinion. N.Y.U.S.T. I.M.

3 Motivation: A decision tree algorithm uses a purity function to partition the data space into different class regions, but the technique is not directly applicable to clustering.

4 Objective: Propose a novel clustering technique, called CLTree (CLustering based on decision Trees), which is based on a supervised learning method called decision tree construction.

5 Introduction (1/3): Decision tree
- Uses a purity function to partition the data space into different class regions.
- The technique is not directly applicable to clustering.

6 Introduction (2/3): If there are clusters in the data, the data points cannot be uniformly distributed in the entire space.

7 Introduction (3/3): The CLTree technique consists of two steps:
- Cluster tree construction: use a modified decision tree algorithm with a new purity function.
- Cluster tree pruning: find meaningful/useful clusters.

8 Building Cluster Trees

9 Decision Tree

10 Introducing N points (1/2): We do not physically add these N points to the original data, but only assume their existence. We now determine how many N points to add.

11 Introducing N points (2/2): We increase the number of N points of a node if it has more inherited Y points than N points.
- This avoids the situation where too few N points are left after some cuts or splits.
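
The rule on slides 10 and 11 can be sketched in a few lines. The slides only say that the N count is increased when a node has more inherited Y points than N points; raising it to exactly the Y count is an assumption made here, and the function name `n_points_for_node` is hypothetical.

```python
def n_points_for_node(y_count: int, inherited_n: int) -> int:
    """Return the number of assumed (never materialized) N points for a node.

    Assumption: when the node has more Y points than inherited N points,
    the N count is raised to match the Y count; otherwise the inherited
    count is kept unchanged.
    """
    if y_count < 0 or inherited_n < 0:
        raise ValueError("point counts must be non-negative")
    return max(y_count, inherited_n)


if __name__ == "__main__":
    # A node with many Y points but few inherited N points gets topped up.
    print(n_points_for_node(y_count=20, inherited_n=5))
    # A node that already has enough N points keeps the inherited count.
    print(n_points_for_node(y_count=5, inherited_n=12))
```

Because the N points are only counted and never stored, the whole step reduces to integer bookkeeping per node.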

12 Two modifications to the Decision Tree Algorithm:
- Compute the number of N points on the fly.
- Evaluate on both sides of data points.
Figure labels: The space has 25 data (Y) points and N points; N = 25 * 4/10 = 10; N = 15-10.
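
Computing N points on the fly might look like the sketch below. It assumes the N points are spread uniformly along the dimension being cut, so each side of a cut receives a share proportional to its length. The total of 25 N points and the range 0 to 10 with a cut at 4 are assumptions chosen to reproduce the slide's arithmetic 25 * 4/10 = 10; the function name `split_n_points` is hypothetical.

```python
def split_n_points(n_total: float, lo: float, hi: float, cut: float) -> tuple[float, float]:
    """Split uniformly distributed N points at `cut` along one dimension.

    Returns (n_left, n_right). No N point is ever created; only the
    counts on each side of the cut are computed.
    """
    if hi <= lo:
        raise ValueError("the range must have positive length")
    if not lo <= cut <= hi:
        raise ValueError("the cut must lie inside the range")
    n_left = n_total * (cut - lo) / (hi - lo)
    return n_left, n_total - n_left


if __name__ == "__main__":
    # Assumed setup: 25 N points over [0, 10], cut at 4.
    left, right = split_n_points(25, 0, 10, 4)
    print(left, right)
```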

13 The New Criterion: There are two main problems with the gain criterion:
- The cut with the best information gain tends to cut into clusters. This results in severe loss of data points in clusters.
- The gain criterion does not look ahead in deciding the best cut.
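
For reference, the gain criterion discussed on slide 13 is the standard entropy-based information gain, here applied to the two classes Y (data points) and N (assumed points). The sketch below is the textbook formula, not the modified criterion of the paper; the function names and the counts in the example are illustrative assumptions.

```python
import math


def entropy(y: float, n: float) -> float:
    """Binary entropy (in bits) of a region holding y Y points and n N points."""
    total = y + n
    if total == 0:
        return 0.0
    h = 0.0
    for count in (y, n):
        if count > 0:
            p = count / total
            h -= p * math.log2(p)
    return h


def information_gain(left: tuple[float, float], right: tuple[float, float]) -> float:
    """Information gain of a cut, given (Y, N) counts on each side of it."""
    (yl, nl), (yr, nr) = left, right
    total = yl + nl + yr + nr
    if total == 0:
        return 0.0
    parent = entropy(yl + yr, nl + nr)
    children = ((yl + nl) / total) * entropy(yl, nl) \
        + ((yr + nr) / total) * entropy(yr, nr)
    return parent - children


if __name__ == "__main__":
    # Illustrative counts only: a cut that separates Y from N fairly well.
    print(information_gain(left=(3, 10), right=(22, 15)))
```

The gain is computed from one cut in isolation, which is why the criterion cannot look ahead to later cuts.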

14 The New Criterion:
- Find the initial cuts.
- Look ahead to find better cuts.
- Select the overall best cut.

16 User-Oriented Pruning of Clustering Trees:
- Browsing: the user simply explores the tree himself/herself to find meaningful clusters.
- User-oriented pruning: the tree is pruned using two user-specified parameters, min_y and min_rd.

18 Merging adjacent Y regions and simplifying clusters

19 Experiments

21 Conclusion & Personal Opinion: The authors proposed a novel clustering technique, called CLTree, which is based on decision trees in classification research.
- It partitions the data space into dense and sparse regions.
The results show that it is both effective and efficient.

22 Review

