HDLSS Discrimination Mean Difference (Centroid) Method Same Data, Movie over dim’s.

HDLSS Discrimination
Mean Difference (Centroid) Method
Far more stable over dimensions, because it is the likelihood ratio solution (for Gaussians with known variance)
Doesn't feel the HDLSS boundary
Eventually becomes too good?!? Widening gap between clusters?!?
Careful: the angle to the optimal direction grows, so generalizability is lost (since the noise increases)
HDLSS data present some odd effects…
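The mean difference (centroid) rule is simple to state concretely: project onto the difference of the class means and split at the midpoint between the centroids. Below is a minimal numpy sketch (the function name and toy data are illustrative, not from the slides):

```python
import numpy as np

def mean_difference_classifier(X_plus, X_minus):
    """Fit the mean difference (centroid) rule: project onto the
    difference of the class means, split at the midpoint."""
    m_plus = X_plus.mean(axis=0)
    m_minus = X_minus.mean(axis=0)
    w = m_plus - m_minus               # normal direction: mean difference
    b = (m_plus + m_minus) / 2         # midpoint between the two centroids
    return lambda X: np.sign((X - b) @ w).astype(int)   # +1 / -1 labels

# Toy HDLSS-flavored data: d = 100 dimensions, 20 points per class
rng = np.random.default_rng(0)
d = 100
shift = np.zeros(d)
shift[0] = 3.0                         # classes differ in one coordinate
X_plus = rng.standard_normal((20, d)) + shift
X_minus = rng.standard_normal((20, d))

classify = mean_difference_classifier(X_plus, X_minus)
print((classify(X_plus) == 1).mean(), (classify(X_minus) == -1).mean())
```

Because the rule ignores covariance entirely, nothing degenerates as the dimension grows past the sample size, which is the stability noted in the slide above.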

Maximal Data Piling
Strange FLD effect at the HDLSS boundary, Data Piling: for each class, all of the data project to a single value

Maximal Data Piling
What is happening? It is hard to imagine, since our intuition is 3-dimensional (it came from our ancestors…)
Try to understand data piling with some simple examples

Maximal Data Piling
Simple example (Ahn & Marron 2009):
Let H₊ be the hyperplane generated by Class +1, which has dimension 1, i.e. the line containing the 2 points
Similarly, let H₋ be the hyperplane generated by Class -1

Maximal Data Piling
Simple example: n₊ = n₋ = 2 in ℝ³
Let H̃₊, H̃₋ be parallel shifts of H₊, H₋, so that they pass through the origin
They still have dimension 1, but now are subspaces

Maximal Data Piling
Simple example: n₊ = n₋ = 2 in ℝ³
[Figure: dppic2_2.pdf]

Maximal Data Piling
Simple example: n₊ = n₋ = 2 in ℝ³
Construction 1: Let H̃₊ + H̃₋ be the subspace generated by H̃₊ & H̃₋
It is two dimensional, shown as the cyan plane

Maximal Data Piling
Simple example: n₊ = n₋ = 2 in ℝ³
Construction 1 (cont.): Let v_MDP be the direction orthogonal to H̃₊ & H̃₋
It is one dimensional, and makes the Class +1 data project to one point, and the Class -1 data project to one point
Called the Maximal Data Piling direction
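Construction 1 is easy to verify numerically. A small sketch with arbitrarily chosen toy points: the null space of the matrix whose rows are the within-class difference vectors is exactly the direction orthogonal to both shifted lines, and each class piles onto a single projected value:

```python
import numpy as np

# Two points per class in R^3 (arbitrary generic choices)
a1, a2 = np.array([1.0, 0.0, 0.0]), np.array([2.0, 1.0, 0.0])
b1, b2 = np.array([0.0, 0.0, 1.0]), np.array([1.0, 1.0, 3.0])

# Shifted to the origin, the class lines are spanned by within-class differences
M = np.vstack([a2 - a1, b2 - b1])      # 2 x 3, rank 2

# v_MDP spans the 1-d null space of M: orthogonal to both shifted lines
_, _, Vt = np.linalg.svd(M)
v_mdp = Vt[-1]                         # right singular vector for sigma = 0

# Data piling: each class projects to a single value
print(a1 @ v_mdp, a2 @ v_mdp)          # equal
print(b1 @ v_mdp, b2 @ v_mdp)          # equal, but different from Class +1
```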

Maximal Data Piling
Simple example: n₊ = n₋ = 2 in ℝ³
[Figure: dppic2_2.pdf]

Maximal Data Piling
Simple example: n₊ = n₋ = 2 in ℝ³
Construction 2: Let H̃₊⊥ & H̃₋⊥ be the subspaces orthogonal to H̃₊ & H̃₋ (respectively)
Projection onto H̃₊⊥ collapses Class +1; projection onto H̃₋⊥ collapses Class -1
Both are 2-d (planes)

Maximal Data Piling
Simple example: n₊ = n₋ = 2 in ℝ³
Construction 2 (cont.): Let the intersection of H̃₊⊥ & H̃₋⊥ be v_MDP, the same Maximal Data Piling direction
Projection onto it collapses both Class +1 and Class -1
The intersection of the two 2-d planes is a 1-d direction
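Construction 2 can be carried out literally in numpy, and it recovers the same line as Construction 1. A sketch (toy points arbitrary): build the two 2-d orthogonal-complement planes and intersect them by finding a common vector U a = V b:

```python
import numpy as np

# Toy setup: two points per class in R^3
a1, a2 = np.array([1.0, 0.0, 0.0]), np.array([2.0, 1.0, 0.0])
b1, b2 = np.array([0.0, 0.0, 1.0]), np.array([1.0, 1.0, 3.0])

def orth_complement(w):
    """Orthonormal basis (as columns) of the plane orthogonal to w in R^3."""
    _, _, Vt = np.linalg.svd(w.reshape(1, -1))
    return Vt[1:].T                    # 3 x 2

U = orth_complement(a2 - a1)           # plane whose directions collapse Class +1
V = orth_complement(b2 - b1)           # plane whose directions collapse Class -1

# Intersect the two planes: find x = U a = V b via the null space of [U, -V]
_, _, Wt = np.linalg.svd(np.hstack([U, -V]))
coeffs = Wt[-1]                        # 1-d null space, as expected for two planes
v_mdp = U @ coeffs[:2]
v_mdp /= np.linalg.norm(v_mdp)
print(v_mdp)                           # spans the same line as Construction 1
```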

Maximal Data Piling
General case: n₊, n₋ points in ℝᵈ with d ≥ n₊ + n₋
Let H₊ & H₋ be the hyperplanes generated by the classes, of dimensions n₊ − 1, n₋ − 1 (resp.)
Let H̃₊ & H̃₋ be the parallel subspaces, i.e. the shifts to the origin, also of dimensions n₊ − 1, n₋ − 1 (resp.)

Maximal Data Piling
General case: n₊, n₋ points in ℝᵈ with d ≥ n₊ + n₋
Let H̃₊⊥ & H̃₋⊥ be the orthogonal subspaces, of dimensions d − n₊ + 1, d − n₋ + 1 (resp.)
Projection in H̃₊⊥ directions collapses Class +1; projection in H̃₋⊥ directions collapses Class -1
Expect an intersection of dimension d − n₊ − n₋ + 2
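The dimension count is easy to check on random data. In this sketch (sizes illustrative), the null space of the stacked within-class-centered data is the set of directions that collapse both classes at once, and generically it has dimension d − n₊ − n₋ + 2:

```python
import numpy as np

rng = np.random.default_rng(1)
d, n_plus, n_minus = 10, 4, 3          # HDLSS side: d >= n+ + n-
X_plus = rng.standard_normal((n_plus, d))
X_minus = rng.standard_normal((n_minus, d))

# Stack the within-class-centered data: a direction collapses both classes
# exactly when it lies in the null space of this matrix
A = X_plus - X_plus.mean(axis=0)       # spans the shifted H+, rank n+ - 1
B = X_minus - X_minus.mean(axis=0)     # spans the shifted H-, rank n- - 1
M = np.vstack([A, B])

null_dim = d - np.linalg.matrix_rank(M)
print(null_dim, d - n_plus - n_minus + 2)   # both 5
```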

Maximal Data Piling
General case: n₊, n₋ points in ℝᵈ with d ≥ n₊ + n₋
Can show (Ahn & Marron 2009): most directions in the intersection collapse all of the data to 0, but there is a great circle of directions (within the subspace generated by the data)

Maximal Data Piling
General case: n₊, n₋ points in ℝᵈ with d ≥ n₊ + n₋
Can show (Ahn & Marron 2009): most directions in the intersection collapse all of the data to 0, but there is a great circle of directions where the classes collapse to different points (the gap changes along the great circle, including sign flips)

Maximal Data Piling
General case: n₊, n₋ points in ℝᵈ with d ≥ n₊ + n₋
Can show (Ahn & Marron 2009): most directions in the intersection collapse all of the data to 0, but there is a great circle of directions where the classes collapse to different points
The distance between the points is maximized in 2 directions, ±v_MDP, called the Maximal Data Piling direction(s)
Unique (up to ±) in the subspace generated by the data
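One way to pick out ±v_MDP from this description (a sketch of the geometry, not the closed-form solution of Ahn & Marron 2009): the gap along a unit direction v in the intersection is v · (x̄₊ − x̄₋), so by Cauchy–Schwarz it is maximized by the normalized projection of the mean difference onto the null space of the within-class-centered data:

```python
import numpy as np

rng = np.random.default_rng(2)
d, n_plus, n_minus = 20, 5, 5          # HDLSS: d >= n+ + n-
X_plus = rng.standard_normal((n_plus, d)) + 1.0   # shifted class
X_minus = rng.standard_normal((n_minus, d))

# Null space of the within-class-centered data = all joint piling directions
A = X_plus - X_plus.mean(axis=0)
B = X_minus - X_minus.mean(axis=0)
M = np.vstack([A, B])
_, s, Vt = np.linalg.svd(M)
rank = int(np.sum(s > 1e-10))
N = Vt[rank:].T                        # d x (d - rank) orthonormal basis

# The gap along a unit v in the null space is v . (mean+ - mean-), so the
# maximizer is the normalized projection of the mean difference onto it
delta = X_plus.mean(axis=0) - X_minus.mean(axis=0)
v_mdp = N @ (N.T @ delta)
v_mdp /= np.linalg.norm(v_mdp)

p_plus, p_minus = X_plus @ v_mdp, X_minus @ v_mdp
print(np.ptp(p_plus), np.ptp(p_minus))   # both ~0: complete piling
print(p_plus[0] - p_minus[0])            # the maximal gap
```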

Maximal Data Piling Movie Through Increasing Dimensions