Multidimensional Scaling. Agenda Multidimensional Scaling Goodness of fit measures Nosofsky, 1986.

Presentation transcript:

Multidimensional Scaling

Agenda: Multidimensional Scaling; Goodness of fit measures; Nosofsky, 1986

Proximities [Table: pairwise proximities among Amherst, Belchertown, Hadley, Leverett, Pelham, Shutesbury, and Sunderland, with 0 on the diagonal; values not recovered] Highlighted entry: $p_{\text{Amherst},\text{Hadley}}$

Configuration (in 2-D) [Figure: each object plotted as a point $x_i$]

Configuration (in 1-D)

Formal MDS Definition $f: p_{ij} \to d_{ij}(X)$ MDS is a mapping from proximities to corresponding distances in MDS space. After a transformation $f$, the proximities are equal to distances in $X$. [Table: pairwise proximities among the seven towns; values not recovered]

Distances, $d_{ij}$: $d_{\text{Amherst},\text{Hadley}}(X)$

Distances, $d_{ij}$

$d_{\text{Amherst},\text{Hadley}}(X) = 4.32$
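The distances $d_{ij}(X)$ can be read off a configuration directly. The sketch below assumes Euclidean distance, which the slides do not state explicitly, and uses hypothetical coordinates rather than the town configuration from the figure. The function name `pairwise_distances` is illustrative.

```python
import numpy as np

def pairwise_distances(X):
    """Return the n x n matrix of Euclidean distances between the rows of X."""
    X = np.asarray(X, dtype=float)
    diff = X[:, None, :] - X[None, :, :]      # shape (n, n, dims)
    return np.sqrt((diff ** 2).sum(axis=-1))

# Hypothetical 2-D configuration for three objects (not the slide's values).
X = np.array([[0.0, 0.0],
              [3.0, 4.0],
              [6.0, 8.0]])

D = pairwise_distances(X)
print(D[0, 1])   # distance between object 0 and object 1
```

The result is symmetric with zeros on the diagonal, which matches the layout of the distance table on the slides.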

Proximities and Distances [Two tables side by side over the seven towns (Amherst, Belchertown, Hadley, Leverett, Pelham, Shutesbury, Sunderland): Proximities and Distances; values not recovered]

The Role of f $f$ relates the proximities to the distances. $f(p_{ij}) = d_{ij}(X)$

The Role of f
$f$ can be linear, exponential, etc.
In psychological data, $f$ is usually assumed to be any monotonic function.
–That is, if $p_{ij} < p_{kl}$ then $d_{ij}(X) \le d_{kl}(X)$.
–Most psychological data is on an ordinal scale, e.g., rating scales.
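The monotonicity condition can be checked by counting pairs whose order in the proximities is reversed in the distances. This is a minimal sketch, assuming the proximities are dissimilarities (larger means farther apart); the matrices are hypothetical and the function name `count_order_violations` is illustrative.

```python
import numpy as np
from itertools import combinations

def count_order_violations(P, D):
    """Count pairs of pairs with p_ij < p_kl but d_ij > d_kl."""
    P = np.asarray(P, dtype=float)
    D = np.asarray(D, dtype=float)
    iu = np.triu_indices_from(P, k=1)         # each unordered pair once
    p, d = P[iu], D[iu]
    violations = 0
    for a, b in combinations(range(len(p)), 2):
        if (p[a] < p[b] and d[a] > d[b]) or (p[b] < p[a] and d[b] > d[a]):
            violations += 1
    return violations

# Hypothetical proximities (dissimilarities) for three objects.
P = np.array([[0, 1, 3],
              [1, 0, 2],
              [3, 2, 0]])

# Distances that respect the order of P, and distances that break it once.
D_ok = np.array([[0.0, 1.0, 2.5],
                 [1.0, 0.0, 2.0],
                 [2.5, 2.0, 0.0]])
D_bad = np.array([[0.0, 2.2, 2.5],
                  [2.2, 0.0, 2.0],
                  [2.5, 2.0, 0.0]])

print(count_order_violations(P, D_ok), count_order_violations(P, D_bad))
```

A count of zero means some monotonic $f$ can map the proximities onto the distances exactly.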

Looking at Ordinal Relations [Two tables side by side over the seven towns: Proximities and Distances; values not recovered]

Stress It is not always possible to perfectly satisfy this mapping. Stress is a measure of how closely the model came. Stress is essentially the scaled sum of squared error between $f(p_{ij})$ and $d_{ij}(X)$.
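One common way to make "scaled sum of squared error" concrete is Kruskal's Stress-1, which divides the squared error by the sum of squared distances and takes the square root. The slides do not say which scaling they use, so that normalization is an assumption here, as are Euclidean distance, the example values, and the name `stress1`.

```python
import numpy as np

def stress1(disparities, X):
    """Kruskal's Stress-1 between targets f(p_ij) and distances d_ij(X)."""
    X = np.asarray(X, dtype=float)
    F = np.asarray(disparities, dtype=float)
    diff = X[:, None, :] - X[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=-1))     # Euclidean d_ij(X)
    iu = np.triu_indices_from(D, k=1)         # count each pair once
    sse = ((F[iu] - D[iu]) ** 2).sum()
    return np.sqrt(sse / (D[iu] ** 2).sum())

# Hypothetical 1-D configuration: three points at 0, 1, and 3.
X = np.array([[0.0], [1.0], [3.0]])

# Targets that the configuration reproduces exactly.
F_exact = np.array([[0.0, 1.0, 3.0],
                    [1.0, 0.0, 2.0],
                    [3.0, 2.0, 0.0]])

# Targets where one pair (0, 2) is off by 1.
F_off = np.array([[0.0, 1.0, 2.0],
                  [1.0, 0.0, 2.0],
                  [2.0, 2.0, 0.0]])

print(stress1(F_exact, X), stress1(F_off, X))
```

A perfect fit gives a stress of zero; larger values mean the configuration reproduces the transformed proximities less well.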

Stress [Figure: stress plotted against the number of dimensions, with the “correct” dimensionality marked]

Distance Invariant Transformations
Scaling (all of X doubled in size, or flipped)
Rotation (X rotated 20 degrees left)
Translation (X moved 2 to the right)
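These transformations can be checked numerically. The sketch below uses a random hypothetical configuration and Euclidean distance (an assumption). Rotation, translation, and flipping leave every distance unchanged; doubling multiplies every distance by the same factor, so the rank order of the distances is unchanged.

```python
import numpy as np

def pairwise_distances(X):
    """Return the n x n matrix of Euclidean distances between the rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

rng = np.random.default_rng(0)
X = rng.normal(size=(7, 2))                   # hypothetical 2-D configuration

theta = np.deg2rad(20)                        # rotate 20 degrees
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

D = pairwise_distances(X)
D_rot = pairwise_distances(X @ R.T)                       # rotation
D_shift = pairwise_distances(X + np.array([2.0, 0.0]))    # moved 2 to the right
D_flip = pairwise_distances(X * np.array([-1.0, 1.0]))    # flipped left-right
D_scaled = pairwise_distances(2.0 * X)                    # doubled in size

print(np.allclose(D, D_rot), np.allclose(D, D_shift), np.allclose(D, D_flip))
print(np.allclose(D_scaled, 2.0 * D))
```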

Configuration (in 2-D)

Rotated Configuration (in 2-D)

Uses of MDS Visually look for structure in data. Discover the dimensions that underlie data. Psychological model that explains similarity judgments in terms of distance in MDS space.

Simple Goodness of Fit Measures
Sum-of-squared error (SSE)
Chi-Square
Proportion of variance accounted for (PVAF)
$R^2$
Maximum likelihood (ML)

Sum of Squared Error [Table: columns Data, Prediction, $(\text{Data} - \text{Prediction})^2$; row values not recovered] SSE = 7.97
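The SSE column arithmetic can be sketched as follows. The data and predictions are hypothetical, since the slide's table values did not survive.

```python
import numpy as np

# Hypothetical observations and model predictions (not the slide's values).
data = np.array([2.0, 4.0, 5.0, 7.0])
prediction = np.array([2.5, 3.5, 5.5, 6.0])

squared_error = (data - prediction) ** 2      # one entry per table row
sse = squared_error.sum()
print(sse)
```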

Chi-Square [Table: columns Data, Prediction, $(\text{Data} - \text{Prediction})^2$, $(\text{Data} - \text{Prediction})^2 / \text{Prediction}$; row values not recovered] Chi-Square = 1.70
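The chi-square measure divides each squared error by its prediction before summing, as the last table column indicates. A minimal sketch with hypothetical values (predictions must be nonzero):

```python
import numpy as np

# Hypothetical observations and model predictions (not the slide's values).
data = np.array([2.0, 4.0, 5.0, 7.0])
prediction = np.array([2.5, 3.5, 5.5, 6.0])

# Each squared error is weighted by 1 / prediction.
chi_square = ((data - prediction) ** 2 / prediction).sum()
print(chi_square)
```

Compared with SSE, an error of a given size counts for more when the predicted value is small.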

Proportion of Variance Accounted for [Table: columns Data, Mean Prediction, Model Prediction, Mean Error, Error$^2$, Prediction Error, Error$^2$; row values not recovered] SST = 34, SSE = 7.96 $(SST - SSE)/SST = (34 - 7.96)/34 = .77$
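PVAF compares the model's error with the error of always predicting the mean. The sketch below first reproduces the slide's arithmetic from SST = 34 and SSE = 7.96, then computes both sums from hypothetical raw data. The function name `pvaf` is illustrative.

```python
import numpy as np

def pvaf(sst, sse):
    """Proportion of variance accounted for: (SST - SSE) / SST."""
    return (sst - sse) / sst

# Totals taken from the slide.
print(round(pvaf(34.0, 7.96), 2))

# Hypothetical raw data (not the slide's values).
data = np.array([2.0, 4.0, 5.0, 7.0])
prediction = np.array([2.5, 3.5, 5.5, 6.0])

sst = ((data - data.mean()) ** 2).sum()       # error of the mean prediction
sse = ((data - prediction) ** 2).sum()        # error of the model prediction
print(pvaf(sst, sse))
```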

$R^2$ $R^2$ is PVAF, but… [Table: columns Data, Mean Prediction, Model Prediction, Mean Error, Error$^2$, Prediction Error, Error$^2$; row values not recovered] SST = 34, SSE = … $(SST - SSE)/SST = (\,\ldots\,)/34 = \ldots$

Maximum Likelihood Assume we are sampling from a population with probability $f(Y; \theta)$. Here $Y$ is an observation and $\theta$ are the model parameters. [Figure: normal density with $\mu = [0]$ and the observation $Y$ marked] $N(-1.7;\ \mu = 0) = 0.094$

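Maximum Likelihood With independent observations, $Y_1 \ldots Y_n$, the joint probability of the sample observations is: $f(Y_1, \ldots, Y_n; \theta) = \prod_{i=1}^{n} f(Y_i; \theta)$ [Figure: normal density with $\mu = [0]$ and the observations $Y_1$, $Y_2$, $Y_3$ marked] … × … × .3605 = .0090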

Maximum Likelihood Expressed as a function of the parameters, we have the likelihood function: $L(\theta; Y_1, \ldots, Y_n) = \prod_{i=1}^{n} f(Y_i; \theta)$ The goal is to maximize $L$ with respect to the parameters, $\theta$.

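Maximum Likelihood [Figure: two normal densities with the observations $Y_1$, $Y_2$, $Y_3$ marked] With $\mu = [0]$: … × … × .3605 = .0090. With $\mu = [\,\ldots\,]$: … × … × .3398 = .0425. (Assuming $\sigma = 1$)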
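The comparison of two candidate means can be sketched as a product of normal densities with $\sigma = 1$. Only the observation -1.7 appears in the slides; the other two observations below are hypothetical values chosen for illustration, and the second candidate mean is taken to be the sample mean, which is also an assumption.

```python
import math

def normal_pdf(y, mu, sigma=1.0):
    """Density of a normal distribution with mean mu and s.d. sigma at y."""
    z = (y - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def likelihood(ys, mu, sigma=1.0):
    """Joint density of independent observations: product of the densities."""
    return math.prod(normal_pdf(y, mu, sigma) for y in ys)

# -1.7 is from the slides; -0.9 and -0.45 are hypothetical observations.
ys = [-1.7, -0.9, -0.45]

L_zero = likelihood(ys, mu=0.0)               # candidate mean 0
mu_hat = sum(ys) / len(ys)                    # candidate mean = sample mean
L_mean = likelihood(ys, mu=mu_hat)

print(L_zero, L_mean)
```

The likelihood is larger at the sample mean than at 0, which is the sense in which that parameter value is preferred.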

Maximum Likelihood
Preferred to other methods.
–Has very nice mathematical properties.
–Easier to interpret.
–We’ll see specifics in a few weeks.
Often harder (or impossible?) to calculate than other methods.
Often presented as log likelihood, ln(ML).
–Easier to compute (sums, not products).
–Better numerical resolution.
Sometimes equivalent to other methods.
–E.g., same as minimizing SSE when estimating the mean of a normal distribution.
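The last two points can be illustrated together: the log likelihood is a sum rather than a product, and for a normal distribution with fixed $\sigma$ it is a constant minus SSE$/(2\sigma^2)$, so the mean that maximizes it also minimizes SSE. This is a sketch using a simple grid search over candidate means; the observations and the grid are hypothetical.

```python
import numpy as np

def log_likelihood(mu, ys, sigma=1.0):
    """Normal log likelihood: a sum over observations, not a product."""
    n = len(ys)
    return (-0.5 * n * np.log(2.0 * np.pi * sigma ** 2)
            - ((ys - mu) ** 2).sum() / (2.0 * sigma ** 2))

ys = np.array([-1.7, -0.9, -0.45])            # hypothetical observations
grid = np.linspace(-3.0, 1.0, 4001)           # candidate means, step 0.001

loglik = np.array([log_likelihood(m, ys) for m in grid])
sse = np.array([((ys - m) ** 2).sum() for m in grid])

mu_ml = grid[loglik.argmax()]                 # maximum likelihood estimate
mu_sse = grid[sse.argmin()]                   # least-squares estimate
print(mu_ml, mu_sse, ys.mean())
```

Both searches land on the same grid point, the one closest to the sample mean.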