Multidimensional Scaling. Agenda Multidimensional Scaling Goodness of fit measures Nosofsky, 1986.

Presentation transcript:

Multidimensional Scaling

Agenda: Multidimensional Scaling; Goodness of fit measures; Nosofsky, 1986

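Proximities: a table of proximities p_ij among seven towns (Amherst, Belchertown, Hadley, Leverett, Pelham, Shutesbury, Sunderland); the table values are not preserved in this transcript. The entry for the pair Amherst and Hadley is written p_{Amherst,Hadley}.

Configuration (in 2-D), with points x_i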

Configuration (in 1-D)

Formal MDS Definition: f: p_ij → d_ij(X). MDS is a mapping from proximities to corresponding distances in MDS space. After a transformation f, the proximities are equal to distances in X.

Distances, d_ij: d_{Amherst,Hadley}(X)

Distances, d_ij

d_{Amherst,Hadley}(X) = 4.32
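As a minimal sketch of how a distance d_ij(X) is read off a configuration: the code below assumes Euclidean distance (the slides do not name the metric) and uses made-up coordinates, since the actual configuration is not preserved in the transcript. The names `X` and `distance` are hypothetical.

```python
import math

# Hypothetical 2-D coordinates; the slide's real configuration is not preserved.
X = {
    "Amherst": (0.0, 0.0),
    "Hadley": (3.0, 4.0),
    "Pelham": (-1.0, 2.0),
}

def distance(X, i, j):
    """Euclidean distance d_ij(X) between points i and j (assumed metric)."""
    return math.dist(X[i], X[j])

d_amherst_hadley = distance(X, "Amherst", "Hadley")
print(d_amherst_hadley)  # 5.0 for these made-up coordinates
```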

Proximities and Distances: the table of proximities p_ij shown together with the table of distances d_ij(X) for the same seven towns (values not preserved in this transcript).

The Role of f: f relates the proximities to the distances: f(p_ij) = d_ij(X).

The Role of f: f can be linear, exponential, etc. In psychological data, f is usually assumed to be any monotonic function. That is, if p_ij < p_kl then d_ij(X) ≤ d_kl(X). Most psychological data are on an ordinal scale, e.g., rating scales.

Looking at Ordinal Relations: the proximity and distance tables for the seven towns, compared by rank order (values not preserved in this transcript).
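The monotonicity condition can be checked directly on a set of pairs. The sketch below is illustrative only: the proximities and distances are invented, and `ordinal_violations` is a hypothetical helper that lists every pair of pairs where the proximity order is reversed in the distances.

```python
from itertools import combinations

def ordinal_violations(prox, dist):
    """Return (lower, higher) pairs where p_ij < p_kl but d_ij > d_kl."""
    violations = []
    for a, b in combinations(prox, 2):
        # Order the two pairs so that `lo` has the smaller proximity.
        lo, hi = (a, b) if prox[a] < prox[b] else (b, a)
        if prox[lo] < prox[hi] and dist[lo] > dist[hi]:
            violations.append((lo, hi))
    return violations

# Hypothetical proximities and two candidate sets of distances.
prox = {("A", "B"): 1, ("A", "C"): 3, ("B", "C"): 2}
dist_ok = {("A", "B"): 0.5, ("A", "C"): 2.0, ("B", "C"): 1.1}
dist_bad = {("A", "B"): 1.5, ("A", "C"): 2.0, ("B", "C"): 1.1}

print(ordinal_violations(prox, dist_ok))   # no violations
print(ordinal_violations(prox, dist_bad))  # (A,B) vs (B,C) is out of order
```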

Stress: It is not always possible to perfectly satisfy this mapping. Stress is a measure of how closely the model came. Stress is essentially the scaled sum of squared errors between f(p_ij) and d_ij(X).

Stress and dimensionality: [Figure: stress plotted against the number of dimensions, with the “correct” dimensionality marked.]
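The slide does not say how the sum of squared errors is scaled. One common choice is Kruskal's Stress-1, which divides by the sum of squared distances and takes the square root; the sketch below assumes that normalization. The function name `stress1` and the sample values are hypothetical.

```python
import math

def stress1(transformed_prox, distances):
    """Kruskal's Stress-1 (assumed scaling): sqrt(sum((f(p) - d)^2) / sum(d^2))."""
    squared_error = sum((f - d) ** 2 for f, d in zip(transformed_prox, distances))
    scale = sum(d ** 2 for d in distances)
    return math.sqrt(squared_error / scale)

# Hypothetical values of f(p_ij) and d_ij(X), listed pair by pair.
perfect = stress1([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
imperfect = stress1([1.0, 2.0, 3.0], [1.0, 2.0, 4.0])
print(perfect, imperfect)
```

A stress of zero means the mapping is satisfied exactly; larger values mean a worse fit.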

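Distance Invariant Transformations: Scaling (all of X doubled in size, or flipped); Rotation (X rotated 20 degrees left); Translation (X moved 2 to the right). Flipping, rotation, and translation leave every distance unchanged; doubling multiplies every distance by 2, so distance ratios and rank order are preserved.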
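These invariances are easy to verify numerically. The sketch below uses a made-up 2-D configuration and Euclidean distance (an assumption); the helper names are hypothetical. It rotates X by 20 degrees, moves it 2 to the right, and separately doubles it.

```python
import math
from itertools import combinations

def pairwise_distances(X):
    """All Euclidean distances between pairs of points in X."""
    return [math.dist(a, b) for a, b in combinations(X, 2)]

def rotate(X, degrees):
    """Rotate every point counter-clockwise about the origin."""
    t = math.radians(degrees)
    c, s = math.cos(t), math.sin(t)
    return [(c * x - s * y, s * x + c * y) for x, y in X]

def translate(X, dx, dy):
    return [(x + dx, y + dy) for x, y in X]

def scale(X, k):
    return [(k * x, k * y) for x, y in X]

X = [(0.0, 0.0), (3.0, 4.0), (-1.0, 2.0)]  # hypothetical configuration
base = pairwise_distances(X)
moved = pairwise_distances(translate(rotate(X, 20), 2.0, 0.0))
doubled = pairwise_distances(scale(X, 2.0))

# Rotation plus translation: every distance is unchanged.
same_after_rigid_motion = all(math.isclose(a, b) for a, b in zip(base, moved))
# Doubling: every distance is multiplied by the same factor.
ratios_after_doubling = [b / a for a, b in zip(base, doubled)]
print(same_after_rigid_motion, ratios_after_doubling)
```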

Configuration (in 2-D)

Rotated Configuration (in 2-D)

Uses of MDS: Visually look for structure in data. Discover the dimensions that underlie data. Psychological model that explains similarity judgments in terms of distance in MDS space.

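Simple Goodness of Fit Measures: Sum-of-squared error (SSE); Chi-square; Proportion of variance accounted for (PVAF); R²; Maximum likelihood (ML)

Sum of Squared Error: table with columns Data, Prediction, and the error (Data − Prediction); the rows are not preserved in this transcript. SSE, the sum of the squared errors, is 7.97.

Chi-Square: table with columns Data, Prediction, (Data − Prediction)², and (Data − Prediction)²/Prediction; the rows are not preserved in this transcript. Chi-square = 1.70.

Proportion of Variance Accounted for: table with columns for the data, the mean's prediction, the model's prediction, and the error and squared error of each prediction; the rows are not preserved in this transcript. SST = 34, SSE = 7.96. (SST − SSE)/SST = (34 − 7.96)/34 = .77

R²: R² is PVAF, but… The same table layout is shown, with SST = 34; the SSE value and the result of (SST − SSE)/SST = ( … )/34 are not preserved in this transcript.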
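The three measures above can be sketched in a few lines. The data and predictions here are invented (the slide's table rows are not preserved), and the function names are hypothetical. SST is taken as the squared error of predicting every observation with the data mean, which matches the "mean prediction" columns on the slide.

```python
def sse(data, pred):
    """Sum of squared errors between data and predictions."""
    return sum((d - p) ** 2 for d, p in zip(data, pred))

def chi_square(data, pred):
    """Sum of squared errors, each divided by its prediction."""
    return sum((d - p) ** 2 / p for d, p in zip(data, pred))

def pvaf(data, pred):
    """Proportion of variance accounted for: (SST - SSE) / SST."""
    mean = sum(data) / len(data)
    sst = sse(data, [mean] * len(data))  # error of the mean-only prediction
    return (sst - sse(data, pred)) / sst

# Hypothetical data and model predictions.
data = [2.0, 4.0, 6.0, 8.0]
pred = [3.0, 4.0, 5.0, 8.0]

print(sse(data, pred))         # 2.0
print(chi_square(data, pred))  # 1/3 + 1/5
print(pvaf(data, pred))        # (20 - 2) / 20 = 0.9
```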

Maximum Likelihood: Assume we are sampling from a population with probability f(Y; θ). Y is an observation and θ are the model parameters. Example: for Y = −1.7 and μ = 0, N(−1.7; μ = 0) = 0.094.

Maximum Likelihood: With independent observations Y_1, …, Y_n, the joint probability of the sample observations is f(Y_1, …, Y_n; θ) = ∏_{i=1}^{n} f(Y_i; θ). Example with three observations Y_1, Y_2, Y_3 and μ = 0: … × … × .3605 = .0090 (the first two factors are not preserved in this transcript).

Maximum Likelihood: Expressed as a function of the parameters, we have the likelihood function: L(θ; Y_1, …, Y_n) = ∏_{i=1}^{n} f(Y_i; θ). The goal is to maximize L with respect to the parameters, θ.

Maximum Likelihood: For the observations Y_1, Y_2, Y_3 with μ = 0: … × … × .3605 = .0090. With μ = [value not preserved]: … × … × .3398 = .0425. (Assuming σ = 1.)
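The comparison on this slide can be sketched as follows, assuming a normal density with σ = 1 as the slide states. Only the single value N(−1.7; μ = 0) = 0.094 comes from the slides; the three observations in `ys` are invented, and the function names are hypothetical.

```python
import math

def normal_pdf(y, mu, sigma=1.0):
    """Density of a normal distribution with mean mu and std. dev. sigma at y."""
    z = (y - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def likelihood(ys, mu, sigma=1.0):
    """Product of the densities of independent observations."""
    return math.prod(normal_pdf(y, mu, sigma) for y in ys)

print(round(normal_pdf(-1.7, 0.0), 3))  # 0.094, as on the earlier slide

ys = [0.2, 1.1, 1.7]            # hypothetical observations
sample_mean = sum(ys) / len(ys)
l_at_zero = likelihood(ys, 0.0)
l_at_mean = likelihood(ys, sample_mean)
print(l_at_zero, l_at_mean)     # the second value is the larger one
```

Moving μ from 0 toward the observations raises the likelihood, which is the pattern the two products on the slide illustrate.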

Maximum Likelihood: Preferred to other methods: it has very nice mathematical properties and is easier to interpret (we’ll see specifics in a few weeks). Often harder (or impossible?) to calculate than other methods. Often presented as log likelihood, ln(ML), which is easier to compute (sums, not products) and has better numerical resolution. Sometimes equivalent to other methods, e.g., same as SSE when calculating the mean of a distribution.
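Two of these points can be illustrated with a short sketch, assuming a normal model with σ = 1 and invented data. A simple grid search (a stand-in for a real optimizer) shows that the mean maximizing the log likelihood is the same one minimizing SSE, and a long sample shows the raw product underflowing to zero while the log likelihood stays finite. All names are hypothetical.

```python
import math

def log_likelihood(ys, mu, sigma=1.0):
    """Sum of log normal densities (a sum, not a product)."""
    const = -0.5 * math.log(2.0 * math.pi * sigma ** 2)
    return sum(const - (y - mu) ** 2 / (2.0 * sigma ** 2) for y in ys)

def sse(ys, mu):
    return sum((y - mu) ** 2 for y in ys)

ys = [1.0, 2.0, 4.0, 5.0]                  # hypothetical data, mean 3.0
grid = [i / 100 for i in range(0, 601)]    # candidate means 0.00 .. 6.00
best_ml = max(grid, key=lambda m: log_likelihood(ys, m))
best_sse = min(grid, key=lambda m: sse(ys, m))
print(best_ml, best_sse)                   # both pick the sample mean

# Numerical resolution: 2000 observations, each with density about 0.399.
many = [0.0] * 2000
raw_product = math.prod(math.exp(log_likelihood([y], 0.0)) for y in many)
log_sum = log_likelihood(many, 0.0)
print(raw_product, log_sum)                # product underflows; sum does not
```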