Intelligent Database Systems Lab
Optimized bi-dimensional data projection for clustering visualization
Authors: Rodrigo T. Peres, Claus Aranha, Carlos E. Pedreira (2013, InfSci)
Presenter: Chuang, Kai-Ting
Outline
– Motivation
– Objectives
– Methodology
– Experiments
– Conclusions
– Comments
Motivation
The problem of data visualization consists of generating a bi-dimensional projection of a high-dimensional data set.
Objectives
We propose a new method to project n-dimensional data onto two dimensions for visualization purposes. We apply Differential Evolution as a meta-heuristic to optimize a divergence measure of the projected data. This divergence measure is based on the Cauchy-Schwartz divergence, extended to multiple classes.
Methodology-Framework
– Cauchy-Schwartz divergence measure
– Differential Evolution
– Data transformation
Methodology
Methodology-Cauchy-Schwartz divergence measure
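A minimal sketch of the two-class Cauchy-Schwartz divergence, D_CS(p, q) = -log((∫pq)² / (∫p² ∫q²)), may help make the measure concrete. It assumes the densities are estimated with Gaussian Parzen windows of a single width `sigma`, as is common in Information Theoretic Learning. The function names, the default `sigma`, and the restriction to two classes are illustrative assumptions; the multi-class extension used in the paper is not reproduced here.

```python
import math


def gaussian_kernel(u, v, sigma):
    """Isotropic Gaussian kernel between two points of equal dimension."""
    d = len(u)
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    norm = (2.0 * math.pi * sigma ** 2) ** (d / 2.0)
    return math.exp(-sq_dist / (2.0 * sigma ** 2)) / norm


def cross_information_potential(xs, ys, sigma):
    """Parzen estimate of the integral of p(x) * q(x).

    Convolving two Gaussian windows of width sigma yields a Gaussian
    of width sigma * sqrt(2), hence the rescaled kernel below.
    """
    s = sigma * math.sqrt(2.0)
    total = sum(gaussian_kernel(x, y, s) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def cs_divergence(xs, ys, sigma=1.0):
    """Two-class Cauchy-Schwartz divergence (assumed Parzen estimator)."""
    cross = cross_information_potential(xs, ys, sigma)
    self_x = cross_information_potential(xs, xs, sigma)
    self_y = cross_information_potential(ys, ys, sigma)
    return -math.log(cross ** 2 / (self_x * self_y))
```

The value is zero when both samples are identical and grows as the two classes move apart, which is why a projection that maximizes it tends to separate clusters. For very distant classes the cross term can underflow to zero, so a practical version would likely work in the log domain.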
Methodology-Information Theoretic Learning (ITL)
Methodology-Computational complexity of the D_CS
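As a rough guide, assume the divergence is estimated from plain pairwise Parzen kernel sums over the projected points (an assumption, not a formula taken from the slides). With C classes of sizes N_1, ..., N_C and N points in total, every pair of classes (a, b), including a = b, needs N_a N_b kernel evaluations. The symbols P (population size) and G (number of generations) below are likewise introduced only for illustration.

```latex
\sum_{a=1}^{C}\sum_{b=1}^{C} N_a N_b
  = \Bigl(\sum_{c=1}^{C} N_c\Bigr)^{2}
  = N^{2}
\quad\Longrightarrow\quad
\text{cost per fitness evaluation} = O(N^{2}),
\qquad
\text{cost of a full run} = O(G \, P \, N^{2})
```

Under these assumptions the cost grows quadratically with the number of cases, independently of the original dimension n once the data have been projected.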
Methodology-Differential Evolution
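For readers unfamiliar with Differential Evolution, the following is a minimal sketch of the classic DE/rand/1/bin scheme. The variant, the parameter names (`f_weight`, `cr`, `pop_size`) and their default values are assumptions chosen for illustration; the slides do not state which settings the authors used. The sketch minimizes, so maximizing a divergence would mean passing its negative as `objective`.

```python
import random


def differential_evolution(objective, dim, bounds=(-1.0, 1.0), pop_size=20,
                           f_weight=0.8, cr=0.9, generations=100, seed=0):
    """Minimise `objective` with DE/rand/1/bin (illustrative settings)."""
    rng = random.Random(seed)
    lo, hi = bounds
    # Random initial population inside the bounds (trials are not clipped later).
    pop = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(pop_size)]
    scores = [objective(ind) for ind in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Three distinct individuals, all different from the target i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            forced = rng.randrange(dim)  # at least one gene comes from the mutant
            trial = []
            for k in range(dim):
                if rng.random() < cr or k == forced:
                    # Mutation: base vector plus a scaled difference vector.
                    trial.append(pop[a][k] + f_weight * (pop[b][k] - pop[c][k]))
                else:
                    trial.append(pop[i][k])
            trial_score = objective(trial)
            # Greedy selection: keep the trial only if it is no worse.
            if trial_score <= scores[i]:
                pop[i], scores[i] = trial, trial_score
    best = min(range(pop_size), key=scores.__getitem__)
    return pop[best], scores[best]
```

A call such as `differential_evolution(lambda v: sum(x * x for x in v), dim=3)` searches for the minimum of a simple sphere function. In the projection setting each individual would instead encode the parameters of the data transformation.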
Methodology-Data transformation
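A minimal sketch of one possible data transformation, assuming a linear map from n dimensions to two given by a 2 x n matrix whose entries are the parameters being optimized. The linear form and the names `project` and `matrix_from_vector` are illustrative assumptions, not taken from the slides.

```python
def matrix_from_vector(vec, n_dims):
    """Reshape a flat vector of length 2 * n_dims into a 2 x n_dims matrix.

    The flat vector stands for one candidate solution of the optimizer
    (hypothetical encoding: first row, then second row).
    """
    if len(vec) != 2 * n_dims:
        raise ValueError("expected a vector of length 2 * n_dims")
    return [list(vec[:n_dims]), list(vec[n_dims:])]


def project(points, matrix):
    """Map each n-dimensional point to 2-D as (row0 . p, row1 . p)."""
    return [tuple(sum(w * x for w, x in zip(row, p)) for row in matrix)
            for p in points]
```

With this encoding, a candidate vector of length 2n is reshaped once and applied to every data point, and the resulting 2-D points are what the divergence measure would be evaluated on.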
Experiment setup
Synthetic data sets
– Initial conditions.
– Robustness of the method to very noisy dimensions.
Real world data sets
– Pen Digits
– Lung Cancer
– Compares monocytes-related dendritic cells, plasmacytoid dendritic cells and B-lymphocytes.
– Compares monocytes and neutrophils.
– Compares plasmacytoid dendritic cells and neutrophils.
Experiment-Kernel width
Experiment-Synthetic data sets 1
Experiment-Synthetic data sets 2
Experiment-Real world data sets
Conclusions
The method produces bi-dimensional visualizations of high-dimensional data sets with optimized cluster separation.
Comments
Advantages
– The method performed well on the synthetic and real world data sets tested.
Disadvantages
– It may be slow on data sets with a large number of cases.
Applications
– Visualization.