Class 2: Tues., Sept. 14th Correlation (2.2) Introduction to Measurement Theory: –Reliability of measurements and correlation –Example that demonstrates.


1 Class 2: Tues., Sept. 14th Correlation (2.2) Introduction to Measurement Theory: –Reliability of measurements and correlation –Example that demonstrates reliability is not the same as accuracy –Validity

2 Strength of association Strength of the association: A measure of how strong the positive or negative association is. Statistical associations are overall tendencies, not ironclad rules. If there is a strong association between two variables, then knowing one helps a lot in predicting the other. But when there is a weak association, information about one variable does not help much in guessing the other.

3 For which data set, (A) or (B), is X more strongly associated with Y?

4 Correlation Motivation: We would like a numerical measure of association. Correlation (r): A numerical measure of how close the (X, Y) points are to a straight line (the straight line that best summarizes the relationship of X and Y in the scatterplot). Formula: you do not need to use it; we use JMP to calculate r. Correlation is always between –1 and 1. Correlations near –1 indicate strong negative association (X and Y close to a downward sloping line), correlations near 0 indicate little linear association, and correlations near 1 indicate strong positive association (X and Y close to an upward sloping line).
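For reference, the standard textbook definition of the sample correlation is given here; it is presumably the formula the slide displayed, although that cannot be confirmed from the transcript. Here n is the number of observations, x̄ and ȳ are the sample means, and s_x and s_y are the sample standard deviations.

```latex
r = \frac{1}{n-1} \sum_{i=1}^{n}
    \left( \frac{x_i - \bar{x}}{s_x} \right)
    \left( \frac{y_i - \bar{y}}{s_y} \right)
```

Each term is the product of the standardized X and Y values for one observation, so r is positive when above-average X tends to go with above-average Y.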

5

6 Computing Correlation in JMP Click Analyze, Multivariate Methods, Multivariate. Put the variables for which you want to compute the correlation into Y, Columns. This produces a scatterplot matrix with a matrix of correlations above it. In the scatterplot matrix, each ellipse contains approximately 95% of the points.

7

8 Properties of Correlation 1. Correlation makes no use of the distinction between response and explanatory variables. It makes no difference what we call X and what we call Y. 2. Correlation requires both variables to be quantitative. We can’t compute the correlation between religion and alcohol use. 3. The correlation is dimensionless. The correlation does not change if we change the units of measurement (e.g., change inches to feet) of X or Y. 4. The correlation also does not change if we add a constant to X or Y.
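Properties 1, 3 and 4 can be checked numerically. The sketch below uses a small hand-written helper, `pearson_r`, and made-up height and weight values; both are illustrative assumptions, not part of the lecture.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Sample correlation of two equal-length lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data: heights in inches and weights in pounds.
height_in = [62, 65, 68, 70, 73]
weight_lb = [120, 140, 155, 160, 190]

r = pearson_r(height_in, weight_lb)
r_swapped = pearson_r(weight_lb, height_in)                    # property 1: swap X and Y
r_feet = pearson_r([h / 12 for h in height_in], weight_lb)     # property 3: inches to feet
r_shifted = pearson_r([h + 10 for h in height_in], weight_lb)  # property 4: add a constant

# All four values agree up to floating-point rounding.
print(r, r_swapped, r_feet, r_shifted)
```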

9 To summarize the relationship between X and Y numerically, we need to know the means and standard deviations of X and Y in addition to the correlation. Example 2.9: Competitive divers are scored on their form by a panel of judges who use a scale from 1 to 10. We have the scores awarded by two judges, Ivan and George, on a large number of dives. How well do they agree? We do some calculation and find that the correlation between their scores is r=0.9. But the mean of Ivan’s scores is 3 points lower than George’s mean. A high correlation therefore does not mean the judges give the same scores: Ivan’s scores rise and fall with George’s, but they are consistently lower.
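A small numerical sketch of the diving example follows. The six scores per judge are invented for illustration (the lecture's actual data are not in the transcript), and `pearson_r` is a hypothetical helper.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Sample correlation of two equal-length lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Made-up scores for six dives; Ivan tracks George but scores lower.
george = [7, 8, 6, 9, 7.5, 8.5]
ivan = [4, 5.5, 3, 6, 4, 5.5]

r = pearson_r(george, ivan)
mean_gap = sum(george) / len(george) - sum(ivan) / len(ivan)

# The correlation is high even though the judges never give the same score.
print(f"r = {r:.2f}, George's mean minus Ivan's mean = {mean_gap:.1f}")
```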

10 Missing Data Missing data is a fact of life in most real data sets. In surveys, people often refuse or don’t bother to answer certain questions.

11 How JMP handles missing data for correlations In Analyze, Multivariate, the Correlations Multivariate option that comes up by default uses only units for which all variables listed in Y, Columns are recorded. To obtain pairwise correlations that use all the units for which both of the two variables being considered are recorded, click on the red triangle next to Multivariate and click Pairwise Correlations after obtaining the scatterplot matrix.
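The two rules described above can be sketched in Python. This only mirrors the row-selection logic as the slide describes it, not JMP's implementation; the survey data, the column names and the helpers `pearson_r` and `correlation` are all hypothetical.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Sample correlation of two equal-length lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Made-up survey with three variables; None marks a missing answer.
rows = [
    {"age": 25, "income": 30, "hours": 40},
    {"age": 32, "income": 45, "hours": None},
    {"age": 41, "income": 50, "hours": 38},
    {"age": 50, "income": None, "hours": 45},
    {"age": 58, "income": 70, "hours": 42},
    {"age": 63, "income": 72, "hours": None},
]

def correlation(rows, a, b, complete_rows_only):
    """Return (rows used, correlation of columns a and b)."""
    if complete_rows_only:
        # Default rule: keep only units with every listed variable recorded.
        kept = [row for row in rows if all(v is not None for v in row.values())]
    else:
        # Pairwise rule: keep every unit with both a and b recorded.
        kept = [row for row in rows if row[a] is not None and row[b] is not None]
    return len(kept), pearson_r([row[a] for row in kept], [row[b] for row in kept])

n_default, r_default = correlation(rows, "age", "income", complete_rows_only=True)
n_pairwise, r_pairwise = correlation(rows, "age", "income", complete_rows_only=False)
print(n_default, r_default)    # uses 3 rows
print(n_pairwise, r_pairwise)  # uses 5 rows
```

The pairwise rule uses more of the data for each pair, at the cost that different entries of the correlation matrix may be based on different sets of units.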

12

13 More Properties of Correlation When looking for relationships between variables, what size of correlation should we get excited about? Unfortunately, there is no absolute answer. Time series of two variables are usually highly correlated, often above 0.95. In social science research on the relationship between people’s attitudes and their characteristics (e.g., income level), researchers are often excited by a correlation of 0.25. Resistant statistic: Statistic that is not strongly affected by a few outliers. Correlation is not resistant.
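To see why correlation is not resistant, consider the sketch below, which uses invented data and a hypothetical `pearson_r` helper. Five points with no linear association acquire a correlation above 0.9 once a single outlier is added.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Sample correlation of two equal-length lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Made-up data with no linear association.
x = [1, 2, 3, 4, 5]
y = [2, 1, 3, 1, 2]
r_without = pearson_r(x, y)

# Add one point far from the rest.
r_with = pearson_r(x + [20], y + [20])

print(f"without outlier: {r_without:.3f}, with outlier: {r_with:.3f}")
```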

14 Nonlinear relationships Correlation measures how close the observations in a scatterplot are to a straight line. Correlation is only a good measure of the association between two variables if the mean of Y given X roughly follows a straight line as X increases.

15 Correlation and Nonlinear Relationships Correlation = -0.005, but strong association. Moral: Don’t use correlation to summarize association when the relationship is nonlinear.
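A minimal sketch of the same phenomenon, using an invented data set rather than the one plotted on the slide: Y is determined exactly by X through Y = X², yet the correlation is zero because the relationship is symmetric rather than linear. The helper `pearson_r` is hypothetical.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Sample correlation of two equal-length lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Y is a perfect (but nonlinear) function of X.
x = [-3, -2, -1, 0, 1, 2, 3]
y = [v ** 2 for v in x]

r = pearson_r(x, y)
print(r)  # zero: no linear association despite perfect dependence
```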

16 Correlation and Reliability of Measurements Measurement theory: Branch of applied statistics that attempts to describe, categorize, evaluate and improve the quality of measurements. Measurement theory for psychological attributes such as intellectual ability or personality is called psychometrics. Reliability of a measurement: The degree of consistency with which a trait or attribute is measured. A perfectly reliable measurement will produce the same value each time assuming the trait or attribute remains constant. Validity of a measurement: The degree to which a measurement measures what it purports to measure. The reliability of a measurement is often determined by the correlation between repeated measurements of the same trait/attribute.

17 Reliability of Pulse Measurement Correlation = 0.9021

18 When is a Measurement Reliable Enough? It is often said that a reliability (correlation) of greater than 0.90 is high. For example, a reliability of 0.90 for educational tests has been considered adequate to assure the quality of standardized tests and large scale assessment programs.

19 Shoe Shopping and the Reliability Coefficient Citation: This example was developed by David Rogosa of Stanford University. Dedicated to Al Bundy, a man who cares as much about good measurement as he does about his own children.

20 What reliability would you assign this man?

21 Reliability and Accuracy Try this on: 1. A population of male and female shoe-shoppers who have true shoe sizes between size 5 and size 15 (e.g., the small sizes are female feet translated to the male shoe-size scale). 2. Mr. Bundy measures each shopper’s shoe size as either too large or too small with equal probability. On a good day Mr. Bundy misses the correct shoe size by one-half size too big or one-half size too small. On other days Mr. Bundy misses the correct shoe size by a full size too big or a full size too small. In each case the shoe size measurement error has mean 0 (overall and at each level of shoe size) and is uncorrelated with actual shoe size.

22 The accuracy of shoe fitting on the good day is poor (as most wearers would notice a half-size misfitting), and on the other days the accuracy is totally unacceptable (as a full-size misfitting would presumably be unwearable). The reliability coefficient for Al Bundy on the good day is .973 (better than any standardized test, even though accuracy is poor). The reliability coefficient for Al Bundy making errors of a full shoe size is .902 (comparable to many standardized tests, even though accuracy is unacceptable). Moral: Reliability is not the same as accuracy. We’ll discuss more about this and how to measure accuracy on Thursday.
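The two reliability coefficients can be reproduced with a short calculation. The sketch rests on one assumption not stated in the slides: true shoe sizes are equally likely on the half-size grid 5, 5.5, ..., 15. Reliability is computed as the correlation between two independent measurements of the same foot, over every equally likely case. The function name `reliability` is hypothetical.

```python
from itertools import product
from math import sqrt

def reliability(error_size):
    """Correlation between two independent measurements of the same foot."""
    # Assumption: true sizes are equally likely on the grid 5, 5.5, ..., 15.
    true_sizes = [5 + 0.5 * k for k in range(21)]
    # Each measurement is too small or too big with equal probability.
    errors = [-error_size, error_size]
    # Enumerate every equally likely (true size, first error, second error).
    pairs = [(t + e1, t + e2) for t, e1, e2 in product(true_sizes, errors, errors)]
    n = len(pairs)
    mean_a = sum(a for a, _ in pairs) / n
    mean_b = sum(b for _, b in pairs) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in pairs) / n
    var_a = sum((a - mean_a) ** 2 for a, _ in pairs) / n
    var_b = sum((b - mean_b) ** 2 for _, b in pairs) / n
    return cov / sqrt(var_a * var_b)

good_day = reliability(0.5)  # always off by half a size
bad_day = reliability(1.0)   # always off by a full size
print(round(good_day, 3))    # 0.973
print(round(bad_day, 3))     # 0.902
```

The reliability stays high because the spread of true sizes (variance about 9.17 under the assumed grid) dwarfs the error variance (0.25 or 1), even though every single measurement is wrong.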

23 Validity Validity of a measurement: The degree to which a measurement measures what it purports to measure. To what extent is pulse a valid measure of one’s general state of health? What time did you go to bed last night? –Is it reliable? –Is it valid as a measure of what time you actually went to bed? –Is it valid as a measure of time spent studying?

24 Summary Correlation: –Use: Measure of a certain type of association, how close points are to a straight line. –Caveat: Check in the scatterplot that the mean of Y given X is roughly a straight line as X increases. Otherwise, correlation is not a good measure of association. –Not good for measuring association that is nonlinear. Measurement Theory: –Reliability: The correlation between repeated measurements of the same trait; quantifies how consistent the measurement is. Not the same as accuracy. –Validity: Does the measurement measure what it purports to measure?

