
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology Exploiting Data Topology in Visualization and Clustering.


1 Intelligent Database Systems Lab, 國立雲林科技大學 National Yunlin University of Science and Technology
Exploiting Data Topology in Visualization and Clustering of Self-Organizing Maps
Kadim Taşdemir and Erzsébet Merényi
TNN, Vol. 20, No. 4, 2009, pp. 549-562.
Presenter: Wei-Shen Tai, 2009/5/21

2 Outline
- Introduction
- Previous work on visualization of SOM knowledge
- Topology visualization through connectivity matrix of SOM prototypes
- Clustering through CONNvis
- Discussions and conclusion
- Comments

3 Motivation
- Exploit an underutilized component of the SOM's knowledge: data topology.
  - Including data topology in the SOM visualization provides more sophisticated clues to cluster structure than existing SOM visualization approaches.

4 Objective
- Integrate the data topology into the visualization of the SOM.
  - This can improve cluster extraction from the SOM via a "connectivity matrix" and its specific rendering over the SOM.

5 Visualization for SOM
- SOM is a topology-preserving mapping
  - Ideally, prototypes (neurons) that are neighbors in the SOM are also neighbors (centroids of neighboring Voronoi polyhedra) in data space, and vice versa.
- Growing SOM
  - It appears less robust than the Kohonen SOM because of the large number of parameters needing adjustment.
- ViSOM
  - It requires a relatively large number of prototypes even for small data sets.

6 Topology visualization through connectivity matrix of SOM prototypes
- Induced Delaunay triangulation
  - It can be determined from the relationships between the best matching units (BMUs) and the second BMUs of the data vectors.
- CONN
  - It is a weighted analog of A, the adjacency matrix of the induced Delaunay triangulation; the weights indicate the density distribution of the input data among the prototypes adjacent in the data manifold M.
  - CONN(i, j) = |RF_ij| + |RF_ji|, where RF_ij is the set of data vectors for which w_i is the BMU and w_j is the second BMU.
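As a rough illustration of the slide above, the connectivity matrix can be sketched by counting, for every data vector, its BMU and second BMU. The function name `conn_matrix`, the use of Euclidean distance, and the toy data are assumptions made for this sketch, not details taken from the paper.

```python
import numpy as np

def conn_matrix(data, prototypes):
    """Sketch of CONN: count (BMU, second BMU) pairs and symmetrize.

    data:       (n_samples, dim) array of data vectors
    prototypes: (n_units, dim) array of SOM prototype vectors w_i
    Assumes Euclidean distance decides the BMU and the second BMU.
    """
    # Squared Euclidean distance from every data vector to every prototype.
    dist = ((data[:, None, :] - prototypes[None, :, :]) ** 2).sum(axis=2)
    order = np.argsort(dist, axis=1)
    bmu, second = order[:, 0], order[:, 1]

    n_units = prototypes.shape[0]
    rf = np.zeros((n_units, n_units), dtype=int)
    # rf[i, j] = number of data vectors with BMU w_i and second BMU w_j.
    np.add.at(rf, (bmu, second), 1)
    # CONN(i, j) adds the counts from both directions.
    return rf + rf.T

# Toy example: three prototypes on a line, four data vectors.
prototypes = np.array([[0.0, 0.0], [1.0, 0.0], [5.0, 0.0]])
data = np.array([[0.1, 0.0], [0.9, 0.0], [0.2, 0.0], [4.0, 0.0]])
conn = conn_matrix(data, prototypes)
print(conn)
```

In this toy case the first two prototypes share three data vectors between them, so their connection is the strongest, while the first and third prototypes are never a BMU pair and stay unconnected.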

7 CONNvis: visualization of the connectivity matrix
- Line width
  - Shows the strength of the connection and reflects the density distribution among the connected units.
- Line colors
  - Show a ranking of the connectivity strengths of w_i.
  - Reveal the most-to-least dense regions local to w_i in data space.
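The two visual channels can be sketched as two quantities derived from a connectivity matrix: a global, normalized strength for line width and a per-prototype rank for line color. The helper names `line_widths` and `rank_connections`, the normalization by the global maximum, and the sample matrix are assumptions of this sketch; the actual color and width binning used in the paper is not reproduced here.

```python
import numpy as np

def line_widths(conn):
    """Relative line width: connection strength scaled by the strongest one."""
    return conn / conn.max()

def rank_connections(conn):
    """Local ranking: ranks[i, j] = 1 for the strongest connection of w_i,
    2 for the next one, and so on; 0 means w_i and w_j are not connected."""
    ranks = np.zeros(conn.shape, dtype=int)
    for i in range(conn.shape[0]):
        neighbors = np.flatnonzero(conn[i])
        # Sort the neighbors of w_i from strongest to weakest connection.
        ordered = neighbors[np.argsort(-conn[i, neighbors], kind="stable")]
        ranks[i, ordered] = np.arange(1, len(ordered) + 1)
    return ranks

# Hypothetical connectivity matrix for three prototypes.
conn = np.array([[0, 3, 0],
                 [3, 0, 1],
                 [0, 1, 0]])
print(line_widths(conn))
print(rank_connections(conn))
```

Note that the ranking is local and therefore not symmetric: the connection between the second and third prototypes is only the second strongest for the second prototype, but it is the strongest (and only) one for the third.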

8 Assessment of topology preservation with CONNvis
- Topology violations
  - Connected neural units that are not immediate neighbors in the map (forward topology violations).
  - Unconnected neural units that are immediate neighbors in the map (backward topology violations).
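The two kinds of violations can be checked mechanically once a connectivity matrix and the lattice positions of the units are available. The sketch below assumes a rectangular lattice where "immediate neighbors" means the 8-neighborhood (Chebyshev distance 1); that neighborhood definition, the function name `topology_violations`, and the sample values are assumptions for illustration.

```python
import numpy as np

def topology_violations(conn, grid_pos):
    """Return (forward, backward) lists of unit index pairs (i, j), i < j.

    conn:     (n_units, n_units) symmetric connectivity matrix
    grid_pos: (n_units, 2) integer lattice coordinates of each unit
    Assumes immediate lattice neighbors are units at Chebyshev distance 1.
    """
    cheb = np.abs(grid_pos[:, None, :] - grid_pos[None, :, :]).max(axis=2)
    lattice_neighbors = cheb == 1
    connected = conn > 0
    # Look at each unordered pair only once.
    upper = np.triu(np.ones(conn.shape, dtype=bool), k=1)

    forward = connected & ~lattice_neighbors & upper
    backward = ~connected & lattice_neighbors & upper

    def to_pairs(mask):
        return [(int(i), int(j)) for i, j in np.argwhere(mask)]

    return to_pairs(forward), to_pairs(backward)

# Hypothetical 1 x 3 map: units 0, 1, 2 sit side by side on the lattice.
grid_pos = np.array([[0, 0], [0, 1], [0, 2]])
conn = np.array([[0, 0, 2],
                 [0, 0, 1],
                 [2, 1, 0]])
forward, backward = topology_violations(conn, grid_pos)
print("forward:", forward)
print("backward:", backward)
```

Here units 0 and 2 are connected without being lattice neighbors (a forward violation), while units 0 and 1 are lattice neighbors without any connection (a backward violation).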

9 Clustering through CONNvis
- Remove weak connections that link any two coarse clusters X and Y at their boundary
  - Step 1) Remove all weak connections to cluster X if the number of weak connections to X is less than the number of weak connections to the other cluster Y.
  - Step 2) Remove the weakest connection if the connections of the prototype to the two clusters have different widths.
  - Step 3) Remove the lowest ranking connection if the number of weak connections to both clusters is the same and all connections at the boundary of these clusters are weak.
  - Step 4) Repeat Steps 1)-3) until this prototype has been disconnected from one of the clusters.
  - Step 5) Repeat Steps 1)-4) for all prototypes at this boundary.

10 A Real-Data Application
- A real remote sensing spectral image of Ocean City

11 Comparison with U-matrix and ISOMAP

12 Discussion and conclusions
- CONNvis
  - Integrates data distribution into the customary Delaunay triangulation.
  - Shows both forward and backward topology violations on the SOM grid.
  - Makes cluster extraction more efficient.

13 Comments
- Advantage
  - The proposed method improves SOM visualization by combining the induced Delaunay triangulation with connection strength.
  - It keeps the conventional SOM training process but renders the resulting map through the connections between neurons, after removing weak connections and boundary neurons.
- Drawback
  - Much of the terminology in this paper differs from the terms generally used for SOMs, for example "data vectors".
  - If a connection between two neurons of the same cluster crosses over an unrelated neuron (one that the proposed method does not remove because it is not a boundary neuron of this cluster), the user may be confused about the relation among these three neurons.
- Application
  - Data clustering.

