CATA 2010 March 2010 Jewels, Himalayas and Fireworks, Extending Methods for Visualizing N Dimensional Clustering W. Jockheck Dept. of Computer Science.


1 CATA 2010 March 2010 Jewels, Himalayas and Fireworks, Extending Methods for Visualizing N Dimensional Clustering W. Jockheck Dept. of Computer Science North Dakota State University Fargo, North Dakota 58105 William.Jockheck@ndsu.nodak.edu Dr. William Perrizo Dept. of Computer Science North Dakota State University Fargo, North Dakota 58105 William.Perrizo@ndsu.nodak.edu

2 Overview: Visualization of high-dimensional data is often difficult and in some cases leads to incorrect conclusions. However, the human mind is the most sophisticated and effective pattern recognizer available for two dimensions (and possibly three). This paper considers transformations of high-dimensional data into two-dimensional space so that the human brain can serve as a pattern recognition engine that yields quick, partial results. Other methods can then be used for further drill-down.

3 Visualizing N-Dimensional Data: Chernoff faces; parallel coordinates; jewel diagram.

4 Table Columns as Dimensions: A table of n numeric columns can be considered as a set of points in n-dimensional space (e.g., the first four columns of the IRIS dataset).
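
As a minimal sketch of the idea on this slide, the rows of a numeric table can be treated as tuples, i.e. points in n-dimensional space. The three rows below are illustrative measurements in the style of the IRIS dataset, and the min-max rescaling step and the function names are assumptions added for the example, not something the slides specify.

```python
import math

# Illustrative 4-column table: each row is one sample, each column one dimension.
rows = [
    (5.1, 3.5, 1.4, 0.2),
    (7.0, 3.2, 4.7, 1.4),
    (6.3, 3.3, 6.0, 2.5),
]

def min_max_normalize(table):
    """Rescale every column to [0, 1] so all dimensions share one range (assumed step)."""
    columns = list(zip(*table))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        tuple((v - lo) / (hi - lo) if hi > lo else 0.0
              for v, lo, hi in zip(row, lows, highs))
        for row in table
    ]

def distance(p, q):
    """Euclidean distance between two n-dimensional points."""
    return math.dist(p, q)

points = min_max_normalize(rows)
print(len(points[0]), "dimensions per point")
print(distance(points[0], points[1]))
```

Once every row is a point, the later slides only need a rule that maps each n-tuple to a 2-D position.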

5 Projecting Hypercube into 2D: When projecting, an arbitrary vector can be used for each dimensional axis (typically, horizontal is used for the first dimension and vertical for the second). [Figure: five-dimensional hypercube projected using the axes indicated.]
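
The projection described here can be sketched as a sum of 2-D axis vectors, each scaled by the matching coordinate. The slide's figure does not survive in this transcript, so the particular axis layout below (unit vectors fanned evenly over a half circle) is an assumption chosen only for illustration, as are the function names.

```python
import itertools
import math

def axis_vectors(n):
    """One 2-D unit vector per dimension, fanned over a half circle (assumed layout)."""
    return [(math.cos(math.pi * k / n), math.sin(math.pi * k / n)) for k in range(n)]

def project(point, axes):
    """Map an n-D point to 2-D: sum of axis vectors scaled by each coordinate."""
    x = sum(c * ax for c, (ax, _) in zip(point, axes))
    y = sum(c * ay for c, (_, ay) in zip(point, axes))
    return (x, y)

def hypercube_vertices(n):
    """All 2**n corners of the unit hypercube."""
    return list(itertools.product((0, 1), repeat=n))

axes = axis_vectors(5)
projected = [project(v, axes) for v in hypercube_vertices(5)]
print(len(projected), "projected vertices")
```

With the conventional axes `[(1, 0), (0, 1)]` the same `project` function reduces to an ordinary x-y plot, which is the "typical" case the slide mentions.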

6 Jewel Diagram and Projections: Each dimension is laid out as one side of a regular n-gon. A data point is drawn by connecting its positions on each side with a line (just as is done in parallel coordinates). The mean of those positions is displayed as the 2-D projection of the n-D data point. However, because the data is “wrapped around” the polygon, information can cancel out, possibly unnecessarily.
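
A rough sketch of the jewel projection follows. It assumes attribute values already normalized to [0, 1], a polygon inscribed in the unit circle, and values increasing counterclockwise along each side; none of these conventions is stated on the slide, and the function names are hypothetical.

```python
import math

def polygon_vertices(n):
    """Vertices of a regular n-gon inscribed in the unit circle."""
    return [(math.cos(2 * math.pi * k / n), math.sin(2 * math.pi * k / n))
            for k in range(n)]

def side_points(sample, vertices):
    """Place value t of dimension k on side k, between vertex k and vertex k+1."""
    n = len(vertices)
    points = []
    for k, t in enumerate(sample):
        (x0, y0), (x1, y1) = vertices[k], vertices[(k + 1) % n]
        points.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return points

def jewel_projection(sample):
    """2-D projection of one sample: the mean of its positions on the n sides."""
    points = side_points(sample, polygon_vertices(len(sample)))
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)

# Two very different samples whose values are constant across attributes.
print(jewel_projection((0.2, 0.2, 0.2, 0.2)))
print(jewel_projection((0.9, 0.9, 0.9, 0.9)))
```

Under these assumptions, any sample with the same value in every attribute lands at the center of the polygon, which is one concrete form of the wrap-around cancellation the slide warns about.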

7 Other methods of projecting points –Parallel coordinates –Himalayan variation [Figure with axis labels A1 and A5.]

8 Himalayan versions

9 Adjustments and Fixes: Variations on the arrangement and directionality of the axes were explored. –In some variations, pairs of attributes offset each other (i.e., there is cancellation due to wrap-around). This led to placing all axes in a single quadrant (so that they don’t cancel out each other’s effects).
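
The effect of moving all axes into one quadrant can be illustrated with a small comparison. This is only a sketch: it assumes normalized attribute values, unit-length axis vectors, uniform angular spacing, and a projection defined as the mean of the value-scaled axis vectors. The slides do not give the exact construction, so the function names and these choices are hypothetical.

```python
import math

def full_circle_axes(n):
    """Axes spread over the whole circle, so opposite axes point against each other."""
    return [(math.cos(2 * math.pi * k / n), math.sin(2 * math.pi * k / n))
            for k in range(n)]

def quadrant_axes(n):
    """Axes spread uniformly between 0 and 90 degrees (single quadrant)."""
    step = (math.pi / 2) / (n - 1) if n > 1 else 0.0
    return [(math.cos(k * step), math.sin(k * step)) for k in range(n)]

def project(sample, axes):
    """Mean of the axis vectors, each scaled by the sample's value on that axis."""
    n = len(sample)
    x = sum(t * ax for t, (ax, _) in zip(sample, axes)) / n
    y = sum(t * ay for t, (_, ay) in zip(sample, axes)) / n
    return (x, y)

low, high = (0.2, 0.2, 0.2, 0.2), (0.9, 0.9, 0.9, 0.9)
for name, axes in (("full circle", full_circle_axes(4)), ("single quadrant", quadrant_axes(4))):
    gap = math.dist(project(low, axes), project(high, axes))
    print(name, "distance between the two projections:", gap)
```

In the full-circle layout the two samples collapse onto the same point because opposing axes cancel; in the single-quadrant layout every axis has non-negative components, so larger values always push the projected point further from the origin.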

10 Single Quadrant Variations (IRIS data): Himalayan axis spacing (doubling angle); uniform angular spacing.

11 Fireworks

12 Injection of Noise: The methods are, of course, sensitive to noise. When additional random attributes were added to, e.g., the IRIS data, the clarity was diminished, as is to be expected.
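
The loss of clarity can be reproduced on synthetic data. The sketch below does not use the IRIS data or the authors' implementation: it assumes two artificial clusters in four attributes, a single-quadrant projection with uniform angular spacing, and a made-up separation score (distance between projected cluster centroids divided by the average distance of points to their own centroid). All names and parameters are hypothetical.

```python
import math
import random

def quadrant_axes(n):
    """Axes spread uniformly between 0 and 90 degrees (assumed layout)."""
    step = (math.pi / 2) / (n - 1) if n > 1 else 0.0
    return [(math.cos(k * step), math.sin(k * step)) for k in range(n)]

def project(sample):
    """Mean of the axis vectors, each scaled by the sample's value on that axis."""
    axes = quadrant_axes(len(sample))
    n = len(sample)
    return (sum(t * ax for t, (ax, _) in zip(sample, axes)) / n,
            sum(t * ay for t, (_, ay) in zip(sample, axes)) / n)

def centroid(points):
    return (sum(x for x, _ in points) / len(points),
            sum(y for _, y in points) / len(points))

def separation(cluster_a, cluster_b):
    """Centroid distance divided by mean within-cluster spread, in the 2-D projection."""
    pa = [project(s) for s in cluster_a]
    pb = [project(s) for s in cluster_b]
    ca, cb = centroid(pa), centroid(pb)
    spread = (sum(math.dist(p, ca) for p in pa) +
              sum(math.dist(p, cb) for p in pb)) / (len(pa) + len(pb))
    return math.dist(ca, cb) / spread

def make_cluster(rng, center, dims=4, size=30, jitter=0.05):
    """Synthetic cluster: every attribute close to `center`."""
    return [tuple(center + rng.uniform(-jitter, jitter) for _ in range(dims))
            for _ in range(size)]

def add_noise(rng, cluster, extra):
    """Append `extra` uniformly random attributes to every sample."""
    return [s + tuple(rng.random() for _ in range(extra)) for s in cluster]

rng = random.Random(0)
a, b = make_cluster(rng, 0.2), make_cluster(rng, 0.8)
clean = separation(a, b)
noisy = separation(add_noise(rng, a, 8), add_noise(rng, b, 8))
print("separation without noise attributes:", clean)
print("separation with 8 noise attributes: ", noisy)
```

The random attributes both dilute the contribution of the informative axes and add scatter of their own, so the score drops once they are included.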

13 Variations with Weighting and Noise (Iris data): All; Weighted; With Noise.

14 Why not 3D instead of 2D? VRML, CAVE, or other stereo-optics based viewing is possible, but does it pay dividends given the increase in complexity and cost? We tried a “3D” implementation with samples slightly offset along the Z axis. 2D seems to be the winner. –Humans only see in 2D. The retina, while curved, only captures a 2D projection. Perspective and stereo-optics provide the third dimension. –Computer displays are 2D. –Printed outputs are 2D. –3D display devices and technologies have largely been a failure in research (witness the fact that almost nothing has been written about the CAVE in 5 years). –Even though there is a slight uptick in interest in 3D movies, there have been upticks before and they have always fizzled. A thought: perspective provides all the 3D we need, and any stereo-optic enhancement adds very little. –Stereo-optics fills the forward space between the screen and the eyes. –Perspective already fills the space from the screen backward to infinity, and that seems to be enough.

15 Problems: As n increases –The jewel diagram polygon approaches a circle. –Fireworks axes become indistinguishable. –Projected points tend to overlap and obscure one another. –The possibility of coincident points increases. The sequence of attributes (sides/axes) –Alters the projected point distribution. –Conveys different information.
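
The dependence on attribute sequence can be checked directly. The sketch below reuses an assumed jewel-style projection (normalized values, unit-circle polygon, counterclockwise sides, projection as the mean of the side positions) and projects the same sample under two attribute orderings. The `reorder` helper and all other names are hypothetical.

```python
import math

def jewel_projection(sample):
    """Mean of the sample's positions on the sides of a regular n-gon (assumed conventions)."""
    n = len(sample)
    vertices = [(math.cos(2 * math.pi * k / n), math.sin(2 * math.pi * k / n))
                for k in range(n)]
    xs, ys = [], []
    for k, t in enumerate(sample):
        (x0, y0), (x1, y1) = vertices[k], vertices[(k + 1) % n]
        xs.append(x0 + t * (x1 - x0))
        ys.append(y0 + t * (y1 - y0))
    return (sum(xs) / n, sum(ys) / n)

def reorder(sample, order):
    """Return the sample with its attributes assigned to sides in a new sequence."""
    return tuple(sample[i] for i in order)

sample = (0.9, 0.1, 0.5, 0.3)
original = jewel_projection(sample)
swapped = jewel_projection(reorder(sample, (0, 2, 1, 3)))  # swap the 2nd and 3rd attributes
print(original, swapped, math.dist(original, swapped))
```

The sample itself is unchanged, yet its projected point moves, so two plots of the same data with different attribute orderings should not be compared point for point.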

16 Problem Mitigation: The reality of points and lines. –A point has no dimensions, only location. –A line has no width. The reality of displays. –Points have to have dimension to be visible. –Lines have to have width to be visible. Scaling as a solution.

17 Contribution of the Visualization: A single projected point represents a sample and shows its relationship to the rest of the data set. The methods display the distribution of each attribute and may provide the big picture for the user.

18 Summary –These methods provide visualization of high-dimensional data. –They provide a single-point projection for each sample (tuple). –Computationally simple. –Very modifiable and adaptable: colors, sequences, weighting, scaling.


