
1 Chapter 6 Norm-Referenced Measurement

2 Topics for Discussion
  Reliability: consistency, repeatability
  Validity: truthfulness
  Objectivity: inter-rater reliability

3 Observed, Error, and True Scores
  Observed Score = True Score + Error Score

4 Reliability Reliability is that proportion of observed score variance that is true score variance

5 Table 6-1 Systolic Blood Pressure Recordings for 10 Subjects

  Subject         Observed BP  =  True BP  +  Error BP
  1                  103            105          -2
  2                  117            115          +2
  3                  116            120          -4
  4                  123            125          -2
  5                  127            125          +2
  6                  125            125           0
  7                  135            125         +10
  8                  126            130          -4
  9                  133            135          -2
  10                 145            145           0
  Sum (Σ)           1250           1250           0
  Mean (M)           125.0          125.0         0
  Variance (S²)      133.6          116.7        16.9
  S                   11.6           10.8         4.1
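A minimal Python sketch (not part of the original slides) that reproduces the Table 6-1 numbers: for these data the observed-score variance equals the true-score variance plus the error-score variance, so reliability works out to 116.7 / 133.6 ≈ .87.

```python
# Reproduce the Table 6-1 decomposition (sketch; data taken from the slide).
from statistics import variance  # sample variance, n - 1 denominator

observed = [103, 117, 116, 123, 127, 125, 135, 126, 133, 145]
true_bp  = [105, 115, 120, 125, 125, 125, 125, 130, 135, 145]
error    = [o - t for o, t in zip(observed, true_bp)]

s2_obs, s2_true, s2_err = variance(observed), variance(true_bp), variance(error)
print(round(s2_obs, 1), round(s2_true, 1), round(s2_err, 1))  # 133.6 116.7 16.9
print(round(s2_true + s2_err, 1))   # 133.6 -- equals the observed variance here
print(round(s2_true / s2_obs, 2))   # 0.87 -- reliability (true / observed variance)
```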

6 Interclass Reliability
  Pearson Product Moment
  Test-Retest
  Equivalence
  Split Halves

7 Table 6-2 Sit-up Performance for 10 Subjects

  Subject        Trial 1   Trial 2
  1                 45        49
  2                 38        36
  3                 54        50
  4                 38        38
  5                 47        49
  6                 39        38
  7                 39        43
  8                 42        43
  9                 29        30
  10                42        42
  Sum (Σ)          413       418
  Mean              41.3      41.8
  S                  6.6       6.5
  Variance (S²)     43.6      41.7
  r_xx' = .927
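A short sketch (not from the slides; assumes Python 3.10+ for statistics.correlation) computing the interclass test-retest coefficient for the sit-up data:

```python
# Test-retest (interclass) reliability for Table 6-2: Pearson r between trials.
from statistics import correlation  # Pearson r; requires Python 3.10+

trial1 = [45, 38, 54, 38, 47, 39, 39, 42, 29, 42]
trial2 = [49, 36, 50, 38, 49, 38, 43, 43, 30, 42]
print(round(correlation(trial1, trial2), 3))  # 0.927
```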

8 Spearman-Brown Prophecy Formula
  k = the number of items I WANT to estimate the reliability for divided by the number of items I HAVE reliability for
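The slide names the formula but the transcript does not show it; the standard Spearman-Brown prophecy formula is

$$ r_{kk} = \frac{k\, r_{11}}{1 + (k - 1)\, r_{11}} $$

where r_11 is the reliability you already have and k is the factor by which test length changes.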

9 Table 6-3 Odd and Even Scores for 10 Subjects

  Subject        Odd   Even
  1               12    13
  2                9    11
  3               10     8
  4                9     6
  5               11     8
  6                7    10
  7                9     9
  8               12    10
  9                5     4
  10               8     7
  Sum (Σ)         92    86
  Mean             9.2   8.6
  S                2.2   2.6
  Variance (S²)    4.8   6.7
  r_xx' = .639
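A sketch tying Tables 6-3 and 6-4 together (illustrative, not from the slides; spearman_brown is a hypothetical helper name): the odd-even correlation (.639) is the reliability of a half-length test, so stepping it up with k = 2 estimates the full-length reliability at about .78.

```python
# Split-halves reliability for Table 6-3, stepped up to full test length
# with the Spearman-Brown prophecy formula.
from statistics import correlation  # requires Python 3.10+

def spearman_brown(r11, k):
    """Estimated reliability when test length changes by a factor of k."""
    return k * r11 / (1 + (k - 1) * r11)

odd  = [12, 9, 10, 9, 11, 7, 9, 12, 5, 8]
even = [13, 11, 8, 6, 8, 10, 9, 10, 4, 7]

r_half = correlation(odd, even)             # ~0.639 (half-length reliability)
print(round(spearman_brown(r_half, 2), 2))  # ~0.78 estimated full-length reliability
```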

10 Table 6-4 Values of r_kk From the Spearman-Brown Prophecy Formula

           K (change in test length)
  r_11    .25    .50    1.5    2.0    3.0    4.0    5.0
  .10     .03    .05    .14    .18    .25    .31    .36
  .22     .07    .12    .30    .36    .46    .53    .59
  .40     .14    .25    .50    .57    .67    .73    .77
  .50     .20    .33    .60    .67    .75    .80    .83
  .60     .27    .43    .69    .75    .82    .86    .88
  .68     .35    .52    .76    .81    .86    .89    .91
  .80     .50    .67    .86    .89    .92    .94    .95
  .92     .74    .85    .95    .96    .97    .98    .98
  .96     .86    .92    .97    .98    .99    .99    .99
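As a quick check (illustrative, not from the slides), the prophecy formula reproduces the r_11 = .40 row of the table directly:

```python
# Reproduce the r11 = .40 row of Table 6-4 from the prophecy formula.
r11 = 0.40
row = [round(k * r11 / (1 + (k - 1) * r11), 2) for k in (0.25, 0.5, 1.5, 2, 3, 4, 5)]
print(row)  # [0.14, 0.25, 0.5, 0.57, 0.67, 0.73, 0.77]
```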

11 Table 6-5 Effect of a Constant Change in Measures

  Subject        Trial 1   Trial 2
  1                 15        25
  2                 17        27
  3                 10        20
  4                 20        30
  5                 23        33
  6                 26        36
  7                 27        37
  8                 30        40
  9                 32        42
  10                33        43
  Sum (Σ)          233       333
  Mean              23.3      33.3
  S                  7.7       7.7
  Variance (S²)     59.1      59.1
  r_xx' = 1.00

12 Intraclass Reliability
  ANOVA Model
  Cronbach's Alpha Coefficient

13 Intraclass (ANOVA) Reliabilities
  Common terms you will encounter:
  Alpha Reliability
  Kuder-Richardson Formula 20 (KR20)
  Kuder-Richardson Formula 21 (KR21)
  ANOVA reliabilities

14 Table 6-6 Calculating the Alpha Coefficient

  Subject    Trial 1   Trial 2   Trial 3   Total
  1             3         5         3        11
  2             2         2         2         6
  3             6         5         3        14
  4             5         3         5        13
  5             3         4         4        11
  ΣX           19        19        17        55
  ΣX²          83        79        63       643
  S²            2.70      1.70      1.30      9.50

15 Calculating the Alpha Coefficient
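The transcript does not include the slide's formula; a sketch using the usual variance form of Cronbach's alpha, α = [k/(k−1)]·(1 − ΣS²_items / S²_total), applied to the Table 6-6 data gives α = (3/2)(1 − 5.70/9.50) = .60:

```python
# Cronbach's alpha for the Table 6-6 data (sketch; variance form of alpha assumed).
from statistics import variance  # sample variance, n - 1 denominator

trials = [
    [3, 2, 6, 5, 3],   # Trial 1
    [5, 2, 5, 3, 4],   # Trial 2
    [3, 2, 3, 5, 4],   # Trial 3
]
k = len(trials)
totals = [sum(subject) for subject in zip(*trials)]   # 11, 6, 14, 13, 11

item_var_sum = sum(variance(t) for t in trials)       # 2.70 + 1.70 + 1.30 = 5.70
alpha = (k / (k - 1)) * (1 - item_var_sum / variance(totals))
print(round(alpha, 2))                                # 0.60
```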

16 Index of Reliability The theoretical correlation between observed scores and true scores
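The slide gives only the verbal definition; the expression usually given for the index of reliability (assumed here) is the square root of the reliability coefficient:

$$ \text{Index of reliability} = \sqrt{r_{xx'}} $$

For example, for the sit-up data in Table 6-2, √.927 ≈ .96.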

17 Table 6-7 Student Scores on a 10-Item Multiple-Choice Quiz

                        Items
  Subject   1  2  3  4  5  6  7  8  9  10   Total
  1         1  1  1  1  1  1  0  1  0  1      8
  2         0  1  0  1  1  0  1  0  1  1      6
  3         0  1  1  0  1  1  0  1  0  0      5
  4         1  0  0  0  1  0  1  1  0  0      4
  5         0  0  0  1  0  1  0  1  0  0      3
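A sketch applying KR-20 (one of the formulas listed on slide 13) to the Table 6-7 quiz data. The result depends on the variance convention; an n − 1 denominator for the total-score variance is assumed here to match Table 6-6, giving roughly .46 — a worked value using a different convention would differ somewhat.

```python
# KR-20 for the Table 6-7 quiz data (sketch; n - 1 total-score variance assumed).
from statistics import variance

scores = [                       # rows = subjects, columns = items 1-10
    [1, 1, 1, 1, 1, 1, 0, 1, 0, 1],
    [0, 1, 0, 1, 1, 0, 1, 0, 1, 1],
    [0, 1, 1, 0, 1, 1, 0, 1, 0, 0],
    [1, 0, 0, 0, 1, 0, 1, 1, 0, 0],
    [0, 0, 0, 1, 0, 1, 0, 1, 0, 0],
]
n_items = len(scores[0])
totals = [sum(row) for row in scores]                  # 8, 6, 5, 4, 3
p = [sum(col) / len(scores) for col in zip(*scores)]   # proportion passing each item
pq_sum = sum(pi * (1 - pi) for pi in p)                # ~2.16

kr20 = (n_items / (n_items - 1)) * (1 - pq_sum / variance(totals))
print(round(kr20, 2))                                  # ~0.46 under these assumptions
```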

18 Standard Error of Measurement Reflects the degree to which a person's observed score fluctuates as a result of errors of measurement
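The slide states the idea only; the standard formula (assumed here, not shown on the slide) is

$$ SEM = S\,\sqrt{1 - r_{xx'}} $$

Using the Table 6-1 data, SEM = 11.6·√(1 − .87) ≈ 4.2, close to the standard deviation of the error scores (4.1).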

19 Factors Affecting Test Reliability
  1) Fatigue
  2) Practice
  3) Subject variability
  4) Time between testing
  5) Circumstances surrounding the testing periods
  6) Appropriate difficulty for testing subjects
  7) Precision of measurement
  8) Environmental conditions

20 Decline in Reliability for the Harvard Alumni Activity Survey as the Time Between Testing Periods Increases
  [Figure: reliability plotted against months between test and retest]

21 Validity Types
  Content-Related Validity
  Criterion-Related Validity (statistical or correlational)
    concurrent
    predictive
  Construct-Related Validity

22 Standard Error of Estimate
  Standard Error
  Standard Error of Prediction

23 Standard Errors SE of Measurement SE of Estimate
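For contrast, the two standard formulas (assumed here; the slide lists only the names) are

$$ SEM = S_x\,\sqrt{1 - r_{xx'}} \qquad SEE = S_y\,\sqrt{1 - r_{xy}^2} $$

where r_xx' is a reliability coefficient and r_xy is the validity (criterion) correlation.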

24 Methods of Obtaining a Criterion Measure
  Actual participation (e.g., golf, archery)
  Perform the criterion: a known valid criterion (e.g., treadmill performance)
  Expert judges: panel of judges
  Tournament participation: round robin
  Known valid test

25 Table 6-8 Correlation Matrix for Development of a Golf Skill Test (From Green et al., 1987)

                         Playing  Long   Chip   Pitch  Middle    Drive
                         golf     putt   shot   shot   distance
  Playing golf            1.00
  Long putt                .59    1.00
  Chip shot                .58     .47   1.00
  Pitch shot               .54     .37    .35   1.00
  Middle distance shot     .66     .55    .61    .40    1.00
  Drive                   -.65    -.62   -.48   -.52    -.79     1.00

  What are these? Concurrent validity coefficients

26 Table 6-9 Concurrent Validity Coefficients for Golf Test
  2-item battery: Middle distance shot, Pitch shot                         .72
  3-item battery: Middle distance shot, Pitch shot, Long putt              .76
  4-item battery: Middle distance shot, Pitch shot, Long putt, Chip shot   .77

27 Correlations Between IQs of Related or Unrelated Children as a Function of Genetic Similarity and Similarity of Environment
  Identical twins - reared together   .88
  Identical twins - reared apart      .75
  Fraternal twins - same sex          .53
  Fraternal twins - opposite sex      .53
  Siblings - reared together          .49
  Siblings - reared apart             .46
  Parent with child                   .52
  Foster parent with child            .19
  Unrelated - reared together         .16
  From Glass & Stanley, 1970, p. 119

28 Figure 6.1 Diagram of Validity and Reliability Terms

29 Interpreting the “r” you obtain

30 Concurrent Validity This square represents variance in performance in a skill (e.g., golf)

31 Concurrent Validity The different colors and patterns represent different parts of a skills test battery to measure the criterion (e.g., golf)

32 Concurrent Validity The orange color represents ERROR or unexplained variance in the criterion (e.g., golf)

33 Concurrent Validity [Figure: four possible skills test batteries, labeled A, B, C, D] Consider the concurrent validity of these four possible skills test batteries

34 Concurrent Validity Which test battery would you be LEAST likely to use? Why? D – it has the MOST error and requires 4 tests to be administered

35 Concurrent Validity Which test battery would you be MOST likely to use? Why? C – it has the LEAST error, but it requires 3 tests to be administered

36 Concurrent Validity Which test battery would you use if you are limited in time? A or B – requires 1 or 2 tests to be administered, but you lose some validity

37 Interpret these correlations

                      Actual  Putting  Putting  Driving  Driving  Obs. 1  Obs. 2
                      golf    T1       T2       T1       T2
  Actual golf score    1.00
  Putting Trial 1       .78    1.00
  Putting Trial 2       .74     .83     1.00
  Driving Trial 1       .58     .21      .25     1.00
  Driving Trial 2       .68     .25      .30      .70     1.00
  Observer 1            .48     .34      .40      .43      .38     1.00
  Observer 2            .39     .30      .41      .47      .35      .50    1.00

  Criterion: actual golf score. What are these? Concurrent validity coefficients

38 Interpret these correlations (same correlation matrix as slide 37) What are these (e.g., Putting Trial 1 with Putting Trial 2 = .83, Driving Trial 1 with Driving Trial 2 = .70)? Reliability coefficients

39 Interpret these correlations (same correlation matrix as slide 37) What is this (Observer 1 with Observer 2 = .50)? Objectivity coefficient

40 Example of a Reliability Study (Rikli et al., RQES, 1992)

  Distance   Gender    Grade:   K      1      2      3      4
  1/2 mile     M               .77    .74    .75    .74    .79
               F               .73    .77    .76    .67    .47
  3/4 mile     M               .48    .54    .83    .89    .85
               F               .58    .64    .68    .83    .80
  1 mile       M               .53    .56    .70    .84    .87
               F               .39    .54    .71    .90    .85

41 SPSS Examples

