Multidimensional Scaling
Agenda: Multidimensional Scaling; Goodness of fit measures; Nosofsky, 1986
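Proximities [Table: 7 x 7 matrix of proximities p_ij among Amherst, Belchertown, Hadley, Leverett, Pelham, Shutesbury, and Sunderland; cell values lost in extraction except a single 0. Highlighted entry: p_Amherst,Hadley]
Configuration (in 2-D) [Figure: each town i plotted as a point x_i]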
Configuration (in 1-D)
Formal MDS Definition f: p_ij → d_ij(X) MDS is a mapping from proximities to corresponding distances in MDS space. After a transformation f, the proximities are equal to distances in X. [Table: town-by-town matrix for Amherst, Belchertown, Hadley, Leverett, Pelham, Shutesbury, and Sunderland; cell values lost in extraction except a single 0]
Distances, d_ij d_Amherst,Hadley(X)
Distances, d_ij
d_Amherst,Hadley(X) = 4.32
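A distance d_ij(X) is simply the distance between points i and j in the configuration X. The sketch below assumes Euclidean distance and uses made-up coordinates (the coordinates behind the 4.32 on the slide are not available), so it illustrates the computation only and does not reproduce the slide's value.

```python
import math

# Hypothetical 2-D coordinates; NOT the configuration from the slides.
config = {
    "Amherst": (0.0, 0.0),
    "Hadley": (3.0, 4.0),
}

def distance(config, i, j):
    """Euclidean distance d_ij(X) between items i and j in configuration X."""
    return math.dist(config[i], config[j])

d_amherst_hadley = distance(config, "Amherst", "Hadley")
print(d_amherst_hadley)
```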
Proximities and Distances [Tables: the 7 x 7 proximity matrix and the corresponding 7 x 7 distance matrix for the same seven towns; cell values lost in extraction]
The Role of f f relates the proximities to the distances: f(p_ij) = d_ij(X)
The Role of f f can be linear, exponential, etc. In psychological data, f is usually assumed to be any monotonic function. –That is, if p_ij < p_kl then d_ij(X) ≤ d_kl(X). –Most psychological data are on an ordinal scale, e.g., rating scales.
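The monotonicity requirement can be checked mechanically: for every two pairs of items, a smaller proximity must not come with a larger distance. The sketch below is an illustration with hypothetical items A, B, C and invented values; the function name is not from the source.

```python
from itertools import combinations

def monotonicity_violations(prox, dist):
    """List (pair1, pair2) where p_pair1 < p_pair2 but d_pair1 > d_pair2."""
    violations = []
    for a, b in combinations(list(prox), 2):
        # Check the implication in both orderings of the two pairs.
        for first, second in ((a, b), (b, a)):
            if prox[first] < prox[second] and dist[first] > dist[second]:
                violations.append((first, second))
    return violations

# Hypothetical proximities (dissimilarities) and two candidate distance sets.
prox = {("A", "B"): 1, ("A", "C"): 3, ("B", "C"): 2}
dist_ok = {("A", "B"): 0.5, ("A", "C"): 2.0, ("B", "C"): 1.1}
dist_bad = {("A", "B"): 3.0, ("A", "C"): 2.0, ("B", "C"): 1.1}

print(monotonicity_violations(prox, dist_ok))
print(monotonicity_violations(prox, dist_bad))
```

An empty list means the distances respect the ordinal relations in the proximities.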
Looking at Ordinal Relations [Tables: the 7 x 7 proximity matrix and the 7 x 7 distance matrix for the same seven towns, shown for comparing rank order; cell values lost in extraction]
Stress It is not always possible to satisfy this mapping perfectly. Stress is a measure of how closely the model came. Stress is essentially the scaled sum of squared errors between f(p_ij) and d_ij(X).
Stress [Figure: stress (y-axis) plotted against number of dimensions (x-axis), with the “correct” dimensionality marked]
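The slides do not say which scaling is used. One common choice is Kruskal's Stress-1, which divides the sum of squared errors by the sum of squared distances and takes the square root. The sketch below assumes that form and takes the transformed proximities f(p_ij) as already computed; the numbers are invented.

```python
import math

def stress1(transformed_prox, distances):
    """Kruskal's Stress-1: sqrt( sum (f(p_ij) - d_ij)^2 / sum d_ij^2 )."""
    numerator = sum((fp - d) ** 2 for fp, d in zip(transformed_prox, distances))
    denominator = sum(d ** 2 for d in distances)
    return math.sqrt(numerator / denominator)

# Hypothetical values, one entry per pair (i, j).
perfect = stress1([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
imperfect = stress1([1.0, 2.0, 3.0], [1.0, 2.0, 4.0])
print(perfect, imperfect)
```

A stress of 0 means the mapping is satisfied exactly; larger values mean a worse fit.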
Distance Invariant Transformations Scaling (all of X doubled in size, or flipped) Rotation (X rotated 20 degrees left) Translation (X moved 2 to the right)
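These transformations can be checked numerically on a small configuration. The sketch below uses invented 2-D points and assumes Euclidean distance. Rotation, translation, and flipping leave every pairwise distance unchanged; doubling multiplies every distance by 2, so what is preserved in that case is the ratios and rank order of the distances, which is what matters under a monotonic f.

```python
import math
from itertools import combinations

def pairwise(points):
    """All pairwise Euclidean distances, in a fixed pair order."""
    return [math.dist(a, b) for a, b in combinations(points, 2)]

def rotate(points, degrees):
    """Rotate counterclockwise ("left") about the origin."""
    t = math.radians(degrees)
    c, s = math.cos(t), math.sin(t)
    return [(c * x - s * y, s * x + c * y) for x, y in points]

def translate(points, dx, dy):
    return [(x + dx, y + dy) for x, y in points]

def scale(points, k):
    return [(k * x, k * y) for x, y in points]

def flip(points):
    """Reflect across the vertical axis."""
    return [(-x, y) for x, y in points]

def same(a, b, tol=1e-9):
    return all(abs(x - y) < tol for x, y in zip(a, b))

X = [(0.0, 0.0), (1.0, 2.0), (3.0, 0.5), (-2.0, 0.5)]  # hypothetical configuration
base = pairwise(X)
rotated_d = pairwise(rotate(X, 20))
translated_d = pairwise(translate(X, 2, 0))
flipped_d = pairwise(flip(X))
doubled_d = pairwise(scale(X, 2))

print(same(base, rotated_d), same(base, translated_d), same(base, flipped_d))
print(same([2 * d for d in base], doubled_d))
```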
Configuration (in 2-D)
Rotated Configuration (in 2-D)
Uses of MDS Visually look for structure in data. Discover the dimensions that underlie data. Psychological model that explains similarity judgments in terms of distance in MDS space.
Simple Goodness of Fit Measures Sum-of-squared error (SSE); Chi-Square; Proportion of variance accounted for (PVAF); R^2; Maximum likelihood (ML)
Sum of Squared Error [Table columns: Data, Prediction, (Data - Prediction); cell values lost in extraction] SSE = 7.97
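As a minimal sketch, SSE is the sum over observations of the squared difference between data and prediction. The data below are invented for illustration; the values behind the slide's SSE are not available.

```python
def sum_squared_error(data, predictions):
    """SSE = sum over observations of (data - prediction)^2."""
    return sum((y - p) ** 2 for y, p in zip(data, predictions))

# Hypothetical data and model predictions.
data = [2.0, 4.0, 6.0]
predictions = [2.5, 3.5, 6.5]
sse = sum_squared_error(data, predictions)
print(sse)
```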
Chi-Square [Table columns: Data, Prediction, (Data - Prediction)^2, (Data - Prediction)^2 / Prediction; cell values lost in extraction] Chi-Square = 1.70
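Following the column headings on this slide, the chi-square measure divides each squared error by the prediction before summing. The sketch below uses invented counts and assumes all predictions are positive.

```python
def chi_square(data, predictions):
    """Sum of (data - prediction)^2 / prediction; predictions must be > 0."""
    return sum((y - p) ** 2 / p for y, p in zip(data, predictions))

# Hypothetical observed values and model predictions.
data = [10.0, 20.0, 30.0]
predictions = [12.0, 18.0, 30.0]
chi2 = chi_square(data, predictions)
print(chi2)
```

Unlike SSE, the same absolute error counts for more when the predicted value is small.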
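Proportion of Variance Accounted for [Table columns: Data; Mean Prediction (Mean, Error, Error^2); Model Prediction (Prediction, Error, Error^2); cell values lost in extraction] SST = 34, SSE = 7.96; (SST - SSE)/SST = (34 - 7.96)/34 = .77
R^2 R^2 is PVAF, but… [Table columns: Data; Mean Prediction (Mean, Error, Error^2); Model Prediction (Prediction, Error, Error^2); cell values lost in extraction] SST = 34, SSE = ( ); (SST - SSE)/SST = ( )/34 = ( )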
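PVAF compares the model's squared error (SSE) with the squared error of simply predicting the mean (SST). The sketch below computes it from invented data, then re-checks the slide's arithmetic from its stated SST of 34 and SSE of 7.96. The function names are illustrative.

```python
def pvaf(data, predictions):
    """Proportion of variance accounted for: (SST - SSE) / SST."""
    mean = sum(data) / len(data)
    sst = sum((y - mean) ** 2 for y in data)                  # error of the mean
    sse = sum((y - p) ** 2 for y, p in zip(data, predictions))  # error of the model
    return (sst - sse) / sst

def pvaf_from_sums(sst, sse):
    return (sst - sse) / sst

# Hypothetical data and predictions.
data = [1.0, 2.0, 3.0, 4.0, 5.0]
predictions = [1.5, 2.0, 3.0, 4.0, 4.5]
example = pvaf(data, predictions)

# The sums stated on the slide.
slide_value = pvaf_from_sums(34, 7.96)
print(example, round(slide_value, 2))
```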
Maximum Likelihood Assume we are sampling from a population with probability f(Y; θ), where Y is an observation and θ are the model parameters. Example with θ = [μ = 0]: N(-1.7; [μ = 0]) = 0.094
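The 0.094 on this slide can be reproduced by evaluating a normal density at -1.7 with mean 0. The sketch below assumes a standard deviation of 1, which the slide does not state at this point.

```python
import math

def normal_pdf(y, mu=0.0, sigma=1.0):
    """Density of a normal distribution N(mu, sigma^2) at y."""
    z = (y - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

density = normal_pdf(-1.7, mu=0.0, sigma=1.0)  # sigma = 1 is an assumption
print(round(density, 3))
```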
Maximum Likelihood With independent observations Y_1, …, Y_n, the joint probability of the sample observations is: f(Y_1, …, Y_n; θ) = f(Y_1; θ) × … × f(Y_n; θ) [Figure: densities at Y_1, Y_2, Y_3 with θ = [0]] ( ) × ( ) × .3605 = .0090
Maximum Likelihood Expressed as a function of the parameters, we have the likelihood function: L(θ; Y_1, …, Y_n) = f(Y_1; θ) × … × f(Y_n; θ) The goal is to maximize L with respect to the parameters, θ.
Maximum Likelihood [Figure: densities at Y_1, Y_2, Y_3 under two parameter settings] θ = [0]: ( ) × ( ) × .3605 = .0090 θ = [ ]: ( ) × ( ) × .3398 = .0425 (Assuming σ = 1)
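The comparison on this slide can be sketched as a product of normal densities evaluated under two candidate means. The three observations are not recoverable from the slides; the values below are assumptions, chosen because with σ = 1 they give products that round to the .0090 and .0425 shown. The second candidate mean used here, the sample mean, is likewise an assumption.

```python
import math

def normal_pdf(y, mu, sigma=1.0):
    z = (y - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def likelihood(mu, observations, sigma=1.0):
    """Joint density of independent observations: product of the densities."""
    result = 1.0
    for y in observations:
        result *= normal_pdf(y, mu, sigma)
    return result

observations = [-1.7, -0.9, -0.45]  # assumed values, not from the source
sample_mean = sum(observations) / len(observations)

l_at_zero = likelihood(0.0, observations)
l_at_mean = likelihood(sample_mean, observations)
print(round(l_at_zero, 4), round(l_at_mean, 4))
```

Moving the mean toward the data raises the likelihood, which is the sense in which one parameter setting is preferred over the other.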
Maximum Likelihood Preferred to other methods –Has very nice mathematical properties. –Easier to interpret. –We’ll see specifics in a few weeks. Often harder (or impossible?) to calculate than other methods. Often presented as log likelihood, ln(ML). –Easier to compute (sums, not products). –Better numerical resolution. Sometimes equivalent to other methods. –E.g., same as SSE when calculating mean of a distribution.
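The last two points can be illustrated with a small sketch: the log likelihood is a sum of per-observation terms, and when estimating the mean of a normal distribution with fixed σ, the mean that maximizes it also minimizes SSE. The observations are invented, σ = 1 is assumed, and the grid search is a crude stand-in for a proper optimizer.

```python
import math

def log_likelihood(mu, observations, sigma=1.0):
    """Sum of log normal densities (a sum, not a product)."""
    const = -0.5 * math.log(2.0 * math.pi * sigma ** 2)
    return sum(const - (y - mu) ** 2 / (2.0 * sigma ** 2) for y in observations)

def sse(mu, observations):
    return sum((y - mu) ** 2 for y in observations)

observations = [-1.7, -0.9, -0.45]               # hypothetical data
grid = [i / 1000.0 for i in range(-3000, 3001)]  # candidate means, step 0.001

best_by_ll = max(grid, key=lambda m: log_likelihood(m, observations))
best_by_sse = min(grid, key=lambda m: sse(m, observations))
sample_mean = sum(observations) / len(observations)
print(best_by_ll, best_by_sse, sample_mean)
```

Both searches land on the grid point closest to the sample mean, because the log likelihood here is a constant minus SSE/(2σ^2).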