Structure-Based Distance Metric for High-Dimensional Space Exploration with Multi-Dimensional Scaling. Jenny Hyunjung Lee, Kevin T. McDonnell, Alla Zelenyuk.


1 Structure-Based Distance Metric for High-Dimensional Space Exploration with Multi-Dimensional Scaling
Jenny Hyunjung Lee, Kevin T. McDonnell, Alla Zelenyuk, Dan Imre, and Klaus Mueller
Visual Analytics and Imaging Lab, Computer Science Department, Stony Brook University and SUNY Korea
Computer Science and Mathematics Department, Dowling College
Chemical and Material Sciences Division, Pacific Northwest National Lab

2 MDS and Parallel Coordinates
Often used in conjunction
- MDS gives the overview
- parallel coordinates allow an inspection of the raw data
[Figure panels: MDS, Parallel Coordinates]

3 Cognition-Equivalent Mapping (CEM)
[Figure panels: MDS, Parallel Coordinates. Caption: Not a cognition-equivalent mapping]

4 Cognition-Equivalent Mapping (CEM)
[Figure panels: MDS, Parallel Coordinates. Captions: Cognition-equivalent mapping; Not a cognition-equivalent mapping]

5 CEM In Practice – Same or Different? Different

6 CEM In Practice – Same or Different? Not Different

7 Our Method – Same or Different? Different

8 What’s the Magic? A New, Perceptual Distance Metric

9 Distance in High-D Space
MDS optimization function:
- Euclidean distance
- measures point-pair error
- sums over all point pairs
[Equation annotations: distance in ND, distance in 2D]
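The optimization function itself did not survive in the transcript. As a minimal sketch, assuming the common raw-stress form (the squared difference between the N-D distance and the 2-D distance, summed over all point pairs), the quantity could be computed as follows. The function names `euclidean` and `raw_stress` are illustrative and not taken from the slides.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two equal-length coordinate sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def raw_stress(high_d, low_d, dist=euclidean):
    """Sum of squared point-pair errors between N-D and 2-D distances.

    high_d: list of N-D points, low_d: list of matching 2-D layout points.
    dist:   metric used in N-D (Euclidean here; an assumption of this sketch
            is that another metric could be passed in its place).
    """
    total = 0.0
    n = len(high_d)
    for i in range(n):
        for j in range(i + 1, n):
            delta = dist(high_d[i], high_d[j])   # distance in ND
            d = euclidean(low_d[i], low_d[j])    # distance in 2D
            total += (delta - d) ** 2            # point-pair error
    return total


if __name__ == "__main__":
    points_nd = [(0, 0, 0), (3, 4, 0), (0, 0, 5)]
    layout_2d = [(0, 0), (5, 0), (0, 5)]
    print(raw_stress(points_nd, layout_2d))
```

Passing the N-D metric as a parameter keeps the layout objective unchanged while the notion of distance in the data space is swapped.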

10 Why Is the Euclidean Distance Less Ideal?
Perceptual (dis)similarity is not gauged by a Euclidean metric
- our cognitive faculties look for pattern similarity
- polylines with a similar pattern signature are deemed closer
- the equivalent points also need to be closer in the MDS plot
- need a new perceptual distance metric that gauges this pattern similarity

11 Structural Similarity Index (SSIM)
[Equation components: luminance, contrast, structure; compared signals: x, y]

12 Structural Similarity Index (SSIM)
- frequently pooled over an 11×11 sliding window
[Equation components: luminance, contrast, structure]
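The formulas on these two slides were not preserved in the transcript. For reference, the three components are commonly written as below in the image-quality literature. This is the standard SSIM definition and is assumed, not confirmed, to match what the slides showed. Here $\mu$ is the mean, $\sigma$ the standard deviation, $\sigma_{xy}$ the covariance of $x$ and $y$, and $C_1, C_2, C_3$ are small stabilizing constants.

```latex
\begin{align}
  l(x,y) &= \frac{2\mu_x\mu_y + C_1}{\mu_x^2 + \mu_y^2 + C_1}
      && \text{(luminance)} \\
  c(x,y) &= \frac{2\sigma_x\sigma_y + C_2}{\sigma_x^2 + \sigma_y^2 + C_2}
      && \text{(contrast)} \\
  s(x,y) &= \frac{\sigma_{xy} + C_3}{\sigma_x\sigma_y + C_3}
      && \text{(structure)} \\
  \mathrm{SSIM}(x,y) &= l(x,y)\, c(x,y)\, s(x,y)
\end{align}
```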

13 Example from Image Processing
The distorted images have the same MSE but different SSIM
- SSIM = 0.91, 0.71, 0.77
- the Euclidean distance expression is similar to that of MSE
- hence it can be expected that SSIM will do better when ported to N-D data
[Figure panels: original, blurred, salt + pepper, contrast stretched]

14 Analogy of SSIM to ND Distance
Just like images, polylines have
- luminance
  - mean
- contrast
- structure
  - evaluates the structural similarity after the differences in mean and contrast have been accounted for
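A minimal sketch of how this analogy might look in code, treating each N-D data point as a polyline over its dimensions. It assumes the standard SSIM component formulas, values normalized to [0, 1], and the usual constants for that range. The conversion `1 - SSIM` used to turn the similarity into a distance is a hypothetical choice for illustration; the slides do not state how the distance is derived.

```python
import math


def ssim_polyline(x, y, c1=1e-4, c2=9e-4):
    """SSIM between two polylines (equal-length sequences of values in [0, 1]).

    c1 and c2 are the customary stabilizing constants (0.01**2 and 0.03**2
    for a dynamic range of 1); c3 = c2 / 2 as in the standard definition.
    """
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    sd_x, sd_y = math.sqrt(var_x), math.sqrt(var_y)
    c3 = c2 / 2

    luminance = (2 * mu_x * mu_y + c1) / (mu_x ** 2 + mu_y ** 2 + c1)  # mean
    contrast = (2 * sd_x * sd_y + c2) / (var_x + var_y + c2)           # spread
    structure = (cov + c3) / (sd_x * sd_y + c3)                        # pattern
    return luminance * contrast * structure


def ssim_distance(x, y):
    """Hypothetical dissimilarity: 0 for identical polylines, larger otherwise."""
    return 1.0 - ssim_polyline(x, y)


if __name__ == "__main__":
    a = [0.1, 0.5, 0.9, 0.3]
    shifted = [v + 0.05 for v in a]      # same pattern, different mean
    mirrored = [1.0 - v for v in a]      # opposite pattern
    print(ssim_distance(a, shifted), ssim_distance(a, mirrored))
```

In this sketch a polyline with the same shape but a small offset stays close, while a mirrored pattern yields a negative structure term and therefore a large distance.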

15 Cases

16 Effect of Windowing
Procedure
- order dimensions such that the sum of pairwise correlations is maximized
- use an 11-point window
Observations
- purple cluster has higher local variance than blue
- with windowing it has an equivalent spread in the MDS plot
- without windowing this effect is averaged out, and likewise in MDS
[Figure panels: not windowed, windowed, clusters in G1, local variance]
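The two procedure steps can be sketched as follows. Finding the ordering that truly maximizes the sum of correlations is a hard combinatorial problem, so this sketch swaps in a simple greedy heuristic that repeatedly appends the dimension most correlated with the last one chosen. That heuristic, the mean pooling over windows, and all function names are assumptions of the sketch, not the authors' stated procedure.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if one is constant)."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    va = sum((u - ma) ** 2 for u in a)
    vb = sum((v - mb) ** 2 for v in b)
    if va == 0 or vb == 0:
        return 0.0
    return cov / math.sqrt(va * vb)


def greedy_dimension_order(rows):
    """Greedy ordering: start at dimension 0, then always append the remaining
    dimension with the highest correlation to the last one placed."""
    m = len(rows[0])
    cols = [[r[k] for r in rows] for k in range(m)]
    order, remaining = [0], set(range(1, m))
    while remaining:
        last = cols[order[-1]]
        nxt = max(sorted(remaining), key=lambda k: pearson(last, cols[k]))
        order.append(nxt)
        remaining.remove(nxt)
    return order


def local_ssim(x, y, c1=1e-4, c2=9e-4):
    """Standard SSIM on one window of two polylines (values assumed in [0, 1])."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    vx = sum((a - mx) ** 2 for a in x) / n
    vy = sum((b - my) ** 2 for b in y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    sx, sy, c3 = math.sqrt(vx), math.sqrt(vy), c2 / 2
    lum = (2 * mx * my + c1) / (mx * mx + my * my + c1)
    con = (2 * sx * sy + c2) / (vx + vy + c2)
    struct = (cov + c3) / (sx * sy + c3)
    return lum * con * struct


def windowed_ssim(x, y, window=11):
    """Mean of local SSIM values over a sliding window along the dimensions."""
    if len(x) <= window:
        return local_ssim(x, y)
    scores = [local_ssim(x[i:i + window], y[i:i + window])
              for i in range(len(x) - window + 1)]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    data = [(0, 3, 0), (1, 2, 2), (2, 1, 4), (3, 0, 6)]
    print(greedy_dimension_order(data))
```

Ordering first matters because a sliding window only compares neighboring dimensions, so correlated dimensions need to sit next to each other for local patterns to show up inside a window.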

17 Achieves Better Cluster Separability
[Figure panels: Euclidean, SSIM; axis: # dimensions; data: 8 clusters, 800 data points]

18 Mass Spectra Data
Mass Spectra of Aerosol Particles, 450 D, 2,000 data points
[Figure panels: Euclidean, SSIM, Parallel Coordinates; annotation: noisier (more skew)]

19 Mass Spectra Data
Mass Spectra of Aerosol Particles, 450 D, 2,000 data points
[Figure panels: Euclidean, SSIM, Parallel Coordinates; annotation: third peak)]

20 In-Layout Manual Clustering
SSIM-based layout clustering has better neighborhood definition
[Figure panels: Euclidean, SSIM, Parallel Coordinates]

21 Cluster Compression
SSIM slightly compresses local neighborhoods
- but much less than LDA
Solution:
- use Euclidean at the local level
[Figure panels: Euclidean, SSIM, LDA]

22 Bi-Scale Layout
Bi-scale layout algorithm
- lay out cluster centers via SSIM
- then lay out cluster points via Euclidean
- plot onto tiles
- (partially) remove tile overlaps using a proximity stress algorithm
[Figure panels: original, 20% overlap, 5% overlap, tiles resized]

23 Effect on Curse of Dimensionality
Relative contrast metric:
Terms:
- dist_max: maximum distance in a given N-D data distribution
- dist_min: minimum distance in this N-D data distribution
- m: number of dimensions
Curse of dimensionality
- as m increases, the distances between pairs of data points become increasingly indistinguishable
- this adversely affects the MDS layout
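The formula for the relative contrast metric is missing from the transcript. A minimal sketch, assuming the commonly used definition (dist_max - dist_min) / dist_min taken over all pairwise distances, shows how the value can shrink as m grows for uniformly random data. The function names, the uniform test data, and the choice of pairwise rather than query-based distances are assumptions of this sketch.

```python
import math
import random


def euclidean(p, q):
    """Euclidean distance between two equal-length coordinate sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def relative_contrast(points, dist=euclidean):
    """(dist_max - dist_min) / dist_min over all pairwise distances."""
    dists = [dist(points[i], points[j])
             for i in range(len(points))
             for j in range(i + 1, len(points))]
    d_max, d_min = max(dists), min(dists)
    if d_min == 0:
        return math.inf  # duplicate points: contrast is unbounded
    return (d_max - d_min) / d_min


def uniform_points(n, m, seed=0):
    """n points drawn uniformly from the m-dimensional unit cube."""
    rng = random.Random(seed)
    return [[rng.random() for _ in range(m)] for _ in range(n)]


if __name__ == "__main__":
    for m in (2, 20, 200):
        contrast = relative_contrast(uniform_points(60, m))
        print(f"m={m:4d}  relative contrast={contrast:.3f}")
```

Because the metric is passed in as `dist`, the same routine could be used to compare the Euclidean distance against another dissimilarity on identical data.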

24 Effect on Curse of Dimensionality
SSIM pushes back the curse of dimensionality
- relative contrast consistently and significantly higher
- wider spread of distinct distances
- better separates inter-cluster distances from intra-cluster distances
- can distinguish clusters more easily and more accurately
[Figure panels: Euclidean, SSIM; data sets: one Gaussian with 1,000 points, 5 Gaussians with 250 points each]

25 Future Work
- Use SSIM in cluster analysis applications, such as k-means
- Incorporate into multi-scale and multi-resolution analysis to compare patterns at different levels of scale
- Confirm our currently more empirical successes with rigorous psycho-physical experiments

26 Questions?
Funding provided by:
- NSF grants , , and
- US Department of Energy (DOE) Office of Basic Energy Sciences, Division of Chemical Sciences, Geosciences, and Biosciences
- The IT Consilience Creative Project through the Ministry of Knowledge Economy, Republic of Korea

