Reliability REVIEW Inferential Infer sample findings to entire population Chi Square (2 nominal variables) t-test (1 nominal variable for 2 groups, 1 continuous)


1 Reliability
REVIEW: Inferential statistics
  Infer sample findings to the entire population
  Chi-square (2 nominal variables)
  t-test (1 nominal variable with 2 groups, 1 continuous variable)
  ANOVA (1 nominal variable with 3+ groups, 1 continuous variable)

2 Variance
Standard deviation
Correlation
  Are two variables related?
  What happens to Y when X changes?
  Linear relationship between two variables
  Quantifies the RELIABILITY & VALIDITY of a test or measurement

3 Reliability (range 0 to 1; goal: .80 or higher)
All scores: observed = true + error
$r_{xx} = S_t^2 / S_o^2$ (proportion of observed score variance that is true score variance)
Interclass reliability coefficients (correlate 2 trials)
  Test/retest: affected by time, fatigue, practice effect
  Equivalent forms
  Split-halves: reduces test length by 50%
Index of Reliability: Tells you what? Related to the coefficient of determination (C of D) how?
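
The variance ratio on this slide can be sketched in a few lines of Python. The variances used here are made-up illustration values, not data from the presentation, and the function names are hypothetical. The index of reliability is computed under the usual textbook definition (the square root of the reliability coefficient), which the slide itself does not spell out.

```python
import math


def reliability(true_variance: float, observed_variance: float) -> float:
    """Reliability as the share of observed score variance that is true score variance."""
    if observed_variance <= 0:
        raise ValueError("observed variance must be positive")
    return true_variance / observed_variance


def index_of_reliability(r_xx: float) -> float:
    """Assumed textbook definition: square root of the reliability coefficient."""
    return math.sqrt(r_xx)


# Illustration values only (assumption): true variance 64, observed variance 80.
r_xx = reliability(64.0, 80.0)
print(round(r_xx, 2))                        # 0.8
print(round(index_of_reliability(r_xx), 3))  # 0.894
```

Under that definition, squaring the index of reliability gives back the reliability coefficient, in the same way that squaring a correlation gives the coefficient of determination.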

4 Standard Error of Measurement (SEM)
RELIABILITY MEASURE
Reflects the degree to which a person's observed score fluctuates as a result of measurement errors
$SEM = S\sqrt{1 - r_{xx'}}$
S = standard deviation of the test
$r_{xx'}$ = reliability of the test

5 EXAMPLE: Test standard deviation = 100, r = .84
$SEM = 100\sqrt{1 - .84} = 100\sqrt{.16} = 100(.4) = 40$
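
The arithmetic in this example follows the usual SEM formula: the test standard deviation times the square root of one minus the reliability. A minimal Python sketch, with a hypothetical function name, reproduces it:

```python
import math


def standard_error_of_measurement(test_sd: float, reliability: float) -> float:
    """SEM = S * sqrt(1 - r), where S is the test SD and r its reliability."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    return test_sd * math.sqrt(1.0 - reliability)


# Values from the slide: standard deviation 100, reliability .84
sem = standard_error_of_measurement(100.0, 0.84)
print(round(sem, 2))  # 40.0
```

A perfectly reliable test (r = 1) gives an SEM of 0, and a test with no reliability (r = 0) gives an SEM equal to the test's standard deviation.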

6 SEM is the standard deviation of the measurement errors around an observed score
EXAMPLE: Test score = 500, SEM = 40
68% of all scores should fall between 460 and 540 (500 ± 40)
95% of all scores range between: ? 420 and 580 (500 ± 2 × 40)
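
The two score bands in this example are the observed score plus or minus one SEM and plus or minus two SEMs. A short Python sketch (the function name is hypothetical) makes the pattern explicit:

```python
def score_band(observed_score: int, sem: int, n_sems: int) -> tuple[int, int]:
    """Return (low, high) for observed_score plus or minus n_sems * sem."""
    margin = n_sems * sem
    return (observed_score - margin, observed_score + margin)


# Values from the slide: test score 500, SEM 40
print(score_band(500, 40, 1))  # (460, 540), the 68% band
print(score_band(500, 40, 2))  # (420, 580), the approximate 95% band
```

The 95% band uses 2 SEMs as a rounding of the normal-curve value 1.96, which matches the 420 to 580 range on the slide.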

7 Factors Affecting Test Reliability
1) Fatigue: ↓
2) Practice: ↑
3) Subject variability: homogeneous ↓, heterogeneous ↑
4) Time between testing: more time = ↓
5) Circumstances surrounding the testing periods: change = ↓
6) Test difficulty: too hard/easy = ↓
7) Precision of measurement: precise = ↑
8) Environmental conditions: change = ↓
SO WHAT? A test must first be reliable to be valid

8 Validity Types (THIS SLIDE IS HUGE!!!!)
Content-Related Validity (a.k.a. face validity)
  Should represent knowledge to be learned
  Criterion for content validity rests w/ interpreter
  Use “experts” to establish
Criterion-Related Validity
  Test has a statistical relationship w/ trait measured
  Alternative measures validated w/ criterion measure
  Concurrent: criterion/alternate measured at the same time
  Predictive: criterion measured in the future
Construct-Related Validity
  Validates theoretical measures that are unobservable

9 Standard Error of Estimate (SEE)
(reflects accuracy of estimating a score on the criterion measure)
VALIDITY MEASURE
Also called: Standard Error, Standard Error of Prediction
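
The formula for this slide did not survive in the transcript. The sketch below assumes the standard textbook form, the criterion's standard deviation times the square root of one minus the squared validity coefficient. The function name and the example values are illustrative assumptions, not numbers from the presentation.

```python
import math


def standard_error_of_estimate(criterion_sd: float, validity_r: float) -> float:
    """Assumed standard form: SEE = s_y * sqrt(1 - r_xy ** 2)."""
    if not -1.0 <= validity_r <= 1.0:
        raise ValueError("a correlation must lie between -1 and 1")
    return criterion_sd * math.sqrt(1.0 - validity_r ** 2)


# Hypothetical values: criterion SD of 10 and a validity coefficient of .80
see = standard_error_of_estimate(10.0, 0.80)
print(round(see, 2))  # 6.0
```

The structure mirrors the SEM: the SEM uses a reliability coefficient, while the SEE uses the squared correlation between the test and the criterion.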

10 Standard Errors SE of Measurement SE of Estimate

11 Methods of Obtaining a Criterion Measure
Actual participation
  Play the game over multiple trials
Perform the criterion
  Known valid criterion (e.g., treadmill performance)
Expert judges
Tournament participation
  Round robin (to identify best player/team)
Known valid test (may be too long/time consuming)

12 Interpreting the “r” you obtain THIS IS HUGE!!!!

13 Table 6-8 Correlation Matrix for Development of a Golf Skill Test (From Green et al., 1987)

                      Playing golf  Long putt  Chip shot  Pitch shot  Middle distance shot  Drive
Playing golf          1.00
Long putt             .59           1.00
Chip shot             .58           .47        1.00
Pitch shot            .54           .37        .35        1.00
Middle distance shot  .66           .55        .61        .40         1.00
Drive                 -.65          -.62       -.48       -.52        -.79                  1.00

What are these? Concurrent validity coefficients (each skill test correlated with the criterion, playing golf)
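
One way to read the correlations with playing golf in Table 6-8 is to square each one, which gives the proportion of variance a single skill test shares with the criterion. The Python sketch below uses the values from the table; the variable and function names are hypothetical.

```python
# Correlation of each skill test with the criterion (playing golf), from Table 6-8
validity_r = {
    "Long putt": 0.59,
    "Chip shot": 0.58,
    "Pitch shot": 0.54,
    "Middle distance shot": 0.66,
    "Drive": -0.65,
}


def coefficient_of_determination(r: float) -> float:
    """r squared: proportion of criterion variance shared with the test."""
    return r * r


shared_variance = {
    test: round(coefficient_of_determination(r), 2) for test, r in validity_r.items()
}

# The strongest single predictor is the one with the largest absolute correlation.
best_test = max(validity_r, key=lambda test: abs(validity_r[test]))

for test, r_squared in shared_variance.items():
    print(f"{test}: r = {validity_r[test]:+.2f}, r^2 = {r_squared:.2f}")
print("Strongest single test:", best_test)
```

The sign of a correlation gives only the direction of the relationship, so the negative Drive coefficient is compared by its absolute size.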

14 Interpret these correlations

                   Actual golf score  Putting T1  Putting T2  Driving T1  Driving T2  Observer 1  Observer 2
Actual golf score  1.00
Putting T1         .78                1.00
Putting T2         .74                .83         1.00
Driving T1         .58                .21         .25         1.00
Driving T2         .68                .25         .30         .70         1.00
Observer 1         .48                .34         .40         .43         .38         1.00
Observer 2         .39                .30         .41         .47         .35         .50         1.00

What are these? Concurrent validity coefficients (the Actual golf score column; actual golf score is the criterion)

15 Interpret these correlations (same matrix as slide 14)
What are these? Reliability coefficients (correlations between two trials of the same test: Putting T1 with Putting T2 = .83, Driving T1 with Driving T2 = .70)

16 Interpret these correlations (same matrix as slide 14)
What is this? Objectivity coefficient (correlation between the two observers: Observer 1 with Observer 2 = .50)
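
The three readings of this matrix (validity, reliability, objectivity) depend only on which pair of measures is correlated. The Python sketch below stores the lower triangle from slides 14 to 16 and pulls out each kind of coefficient; the helper name `r` and the dictionary layout are illustrative choices.

```python
labels = [
    "Actual golf score", "Putting T1", "Putting T2",
    "Driving T1", "Driving T2", "Observer 1", "Observer 2",
]

# Lower triangle of the correlation matrix, one row per label
lower = [
    [1.00],
    [0.78, 1.00],
    [0.74, 0.83, 1.00],
    [0.58, 0.21, 0.25, 1.00],
    [0.68, 0.25, 0.30, 0.70, 1.00],
    [0.48, 0.34, 0.40, 0.43, 0.38, 1.00],
    [0.39, 0.30, 0.41, 0.47, 0.35, 0.50, 1.00],
]


def r(a: str, b: str) -> float:
    """Look up the correlation between two measures (the matrix is symmetric)."""
    i, j = labels.index(a), labels.index(b)
    if i < j:
        i, j = j, i
    return lower[i][j]


# Concurrent validity: each measure against the criterion (actual golf score)
validity = {name: r(name, "Actual golf score") for name in labels[1:]}

# Reliability: two trials of the same test
reliability = {
    "Putting": r("Putting T1", "Putting T2"),
    "Driving": r("Driving T1", "Driving T2"),
}

# Objectivity: agreement between the two observers
objectivity = r("Observer 1", "Observer 2")

print(validity)
print(reliability)
print(objectivity)
```

Measured against the .80 reliability goal stated earlier in the deck, only the putting test (.83) reaches it; driving (.70) and the observers' agreement (.50) fall short.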

17 Concurrent Validity This square represents variance in performance in a skill (e.g., golf)

18 Concurrent Validity The different colors and patterns represent different parts of a skills test battery to measure the criterion (e.g., golf)

19 Concurrent Validity The orange color represents ERROR, or unexplained variance, in the criterion (e.g., golf)

20 Concurrent Validity [Figure: four possible skills test batteries, labeled A, B, C, and D] Consider the concurrent validity of the 4 possible skills test batteries

21 Concurrent Validity Which test battery would you be LEAST likely to use? Why? D: it has the MOST error and requires 4 tests to be administered

22 Concurrent Validity Which test battery would you be MOST likely to use? Why? C: it has the LEAST error, but it requires 3 tests to be administered

23 Concurrent Validity Which test battery would you use if you are limited in time? A or B: each requires only 1 or 2 tests to be administered, but you lose some validity

