III. Multi-Dimensional Random Variables and Application in Vector Quantization

Matrix Notation for Representing Vectors
Assume X is a two-dimensional vector; in matrix notation it is represented as X = [x1 x2]^T. The norm of any vector X, written ||X||, can be computed using the dot product as ||X|| = sqrt(X^T X) = sqrt(x1^2 + x2^2). For any vector X, the unit vector U_X is defined as U_X = X / ||X||.
[Figure: the vector X in the (x1, x2) plane together with its unit vector U_X.]
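
A minimal NumPy sketch of these two computations (the vector values are invented for illustration):

import numpy as np

X = np.array([3.0, 4.0])      # a two-dimensional vector [x1, x2]

norm_X = np.sqrt(X @ X)       # ||X|| = sqrt(X^T X) = 5.0
U_X = X / norm_X              # unit vector in the direction of X, ||U_X|| = 1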

Projection and Dot Product
We would like to evaluate the projection of a vector Y onto a vector X using matrix notation, i.e., we would like to compute the projection length m. Using the dot product, m = (X^T Y) / ||X|| = U_X^T Y. (For the proof, see the lecture notes.)
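
A short NumPy sketch of this projection (vector values invented for illustration):

import numpy as np

X = np.array([1.0, 0.0])      # vector we project onto
Y = np.array([2.0, 3.0])      # vector being projected
U_X = X / np.sqrt(X @ X)      # unit vector along X

m = U_X @ Y                   # projection length m = X^T Y / ||X|| = 2.0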

Principal Component Analysis
Suppose we have a number of samples in a two-dimensional space and we would like to identify a unit vector U through the origin to which these samples are closest. This can be achieved by finding the vector for which the sum of squared distances from the samples to the line along U is minimized (i.e., find U that minimizes d1^2 + d2^2 + d3^2 + d4^2 + ..., where di is the perpendicular distance from sample Xi to that line).

Principal Component Analysis
From the Pythagorean theorem, finding the vector closest to the observed samples is equivalent to finding the vector onto which the sum of squared projections of the sample points is maximized: each sample Xi has a fixed length li = ||Xi||, and li^2 = di^2 + mi^2, where mi is the projection of Xi onto U, so for constant li, minimizing the sum of the di^2 is the same as maximizing the sum of the mi^2.
[Figure: a sample point of length l at distance d from the line along U, with projection length m.]

Principal Component Analysis
Define the data matrix X by stacking the sample points as its rows, X = [X1^T; X2^T; X3^T; X4^T], so that the vector of projections of all samples onto U is XU and the sum of squared projections is (XU)^T (XU) = U^T X^T X U.

Principal Component Analysis
Therefore the unit vector U (i.e., U^T U = 1) that is closest to the sample points X1, X2, X3, X4 is the one that maximizes U^T X^T X U. Suppose the maximum value is λ; at the maximum, X^T X U = λ U. The equation above is an eigenvector problem for the matrix X^T X, which means that the maximum we are seeking, λ, must solve this equation and is therefore one of the eigenvalues (the maximum eigenvalue) of the square matrix X^T X, with U the corresponding eigenvector.
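
A NumPy sketch of this eigenvector formulation, assuming the sample points are stacked as the rows of the data matrix X (the sample values are invented for illustration):

import numpy as np

# Each row is one sample point (invented values for X1, X2, X3, X4).
X = np.array([[ 2.0,  1.9],
              [-1.0, -1.1],
              [ 3.0,  2.8],
              [-2.0, -2.2]])

S = X.T @ X                           # the square 2x2 matrix X^T X
eigvals, eigvecs = np.linalg.eigh(S)  # eigh: S is symmetric

lam = eigvals[-1]                     # maximum eigenvalue (eigh sorts ascending)
U = eigvecs[:, -1]                    # corresponding unit eigenvector

# U maximizes the sum of squared projections, which equals lam:
print(lam, (X @ U) @ (X @ U))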

Two-Dimensional Random Variables
Assume X is a two-dimensional random variable; in matrix notation, X = [X1 X2].
What does it mean that X is a two-dimensional random variable? It means that there is an experiment or phenomenon that can be expressed in terms of two random variables X1 and X2.
Example: weather (W) can be treated as a two-dimensional random variable if we characterize it by temperature (T) and humidity (H), i.e., W = [T, H].

Parameters of a Two-Dimensional R.V.
Mean of either of the random variables: E[Xi], the expected value of Xi (i = 1, 2).
Variance of either of the random variables: Var(Xi) = E[(Xi - E[Xi])^2] = E[Xi^2] - (E[Xi])^2.

Parameters for the Relation Between R.V.s
Correlation: E[X1 X2].
Covariance: Cov(X1, X2) = E[(X1 - E[X1])(X2 - E[X2])] = E[X1 X2] - E[X1] E[X2].

Covariance Matrix
Treating X = [X1 X2]^T as a column vector, the covariance matrix collects the variances and covariances into one symmetric matrix:
E[(X - E[X])(X - E[X])^T] = [ Var(X1)       Cov(X1, X2) ]
                            [ Cov(X2, X1)   Var(X2)     ]
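
A short NumPy sketch estimating this matrix from samples (synthetic data, for illustration only):

import numpy as np

rng = np.random.default_rng(0)
samples = rng.normal(size=(1000, 2))        # 1000 samples of X = [X1, X2]

mean = samples.mean(axis=0)                 # [E[X1], E[X2]]
centered = samples - mean                   # X' = X - E[X]
R = (centered.T @ centered) / len(samples)  # estimate of E[(X - E[X])(X - E[X])^T]

# np.cov computes the same matrix up to the 1/(n-1) versus 1/n normalization.
print(R)
print(np.cov(samples, rowvar=False))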

2-D R.V. & Principal Component Analysis
Assume X is a two-dimensional random variable; in matrix notation, X = [X1 X2]. Define the centered two-dimensional random variable X' = [X'1 X'2], where X'1 = X1 - E[X1] and X'2 = X2 - E[X2]. The vector U that is closest to the sample points of the two-dimensional random variable X is the eigenvector that corresponds to the maximum eigenvalue of the covariance matrix R_X', i.e., U solves R_X' U = λ_max U. (Proof in the lecture notes.)
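
A minimal NumPy sketch of this result: center the samples, estimate the covariance matrix, and take the eigenvector of its largest eigenvalue (synthetic data, invented for illustration):

import numpy as np

rng = np.random.default_rng(1)
# Synthetic correlated samples of X = [X1, X2], dominant direction roughly (1, 0.8).
X1 = rng.normal(size=2000)
X2 = 0.8 * X1 + 0.2 * rng.normal(size=2000)
samples = np.column_stack([X1, X2])

centered = samples - samples.mean(axis=0)    # samples of X' = X - E[X]
R = (centered.T @ centered) / len(centered)  # estimate of the covariance matrix R_X'

eigvals, eigvecs = np.linalg.eigh(R)
U = eigvecs[:, -1]   # eigenvector of the maximum eigenvalue: the first principal component
print(U)             # up to sign, roughly the unit vector along (1, 0.8)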