REVIEW I Reliability Index of Reliability Theoretical correlation between observed & true scores Standard Error of Measurement Reliability measure Degree.


1 REVIEW I
Reliability
- Index of Reliability: theoretical correlation between observed & true scores
- Standard Error of Measurement: reliability measure; degree to which an observed score fluctuates due to measurement errors
- Factors affecting reliability
- A test must be RELIABLE to be VALID
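The slide names the index of reliability and the standard error of measurement (SEM) without giving formulas. A minimal sketch, assuming the standard textbook forms (index = sqrt(r_xx), SEM = s * sqrt(1 - r_xx)); the function names and the numbers are hypothetical and do not come from the presentation:

```python
import math


def index_of_reliability(r_xx: float) -> float:
    """Theoretical correlation between observed and true scores: sqrt(r_xx)."""
    return math.sqrt(r_xx)


def standard_error_of_measurement(sd: float, r_xx: float) -> float:
    """SEM = s * sqrt(1 - r_xx), expressed in the units of the test score."""
    return sd * math.sqrt(1.0 - r_xx)


# Hypothetical values (not from the slides): score SD and reliability coefficient.
sd_scores = 10.0
reliability = 0.84

index = index_of_reliability(reliability)
sem = standard_error_of_measurement(sd_scores, reliability)

print(f"index of reliability = {index:.3f}")
print(f"SEM = {sem:.2f} score units")
```

A higher reliability coefficient shrinks the SEM, meaning an observed score fluctuates less because of measurement error.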

2 REVIEW II
Types of validity
- Content-related (face): represents important/necessary knowledge; use “experts” to establish
- Criterion-related: evidence of a statistical relationship w/ trait being measured; alternative measures must be validated w/ criterion measure
- Construct-related: validates unobservable theoretical measures

3 REVIEW III
- Standard Error of Estimate: validity measure; degree of error in estimating a score based on the criterion
- Methods of obtaining a criterion measure: actual participation; perform criterion; predictive measures
- Interpreting “r”
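The standard error of estimate (SEE) can be sketched the same way. This assumes the usual textbook form SEE = s_y * sqrt(1 - r^2), where s_y is the standard deviation of the criterion and r is the validity coefficient; the values below are hypothetical, not from the presentation:

```python
import math


def standard_error_of_estimate(sd_criterion: float, r: float) -> float:
    """SEE = s_y * sqrt(1 - r^2), in the units of the criterion measure."""
    return sd_criterion * math.sqrt(1.0 - r ** 2)


# Hypothetical values (not from the slides).
sd_criterion = 5.0   # e.g. SD of measured VO2max
validity_r = 0.80    # correlation between field test and criterion

see = standard_error_of_estimate(sd_criterion, validity_r)
print(f"SEE = {see:.2f} criterion units")
```

A larger validity coefficient gives a smaller SEE, so predictions of the criterion score carry less error.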

4 Criterion-Referenced Measurement
Poor / Sufficient / Better
It’s all about me: did I get ‘there’ or not?

5 Criterion-Referenced Testing (aka Mastery Learning)
Standard development
- Judgmental: use experts; typical in human performance
- Normative: theoretically accepted criteria
- Empirical: cutoff based on available data
- Combination: expert & norms typically combined

6 Advantages of Criterion-Referenced Measurement
- Represents specific, desired performance levels linked to a criterion
- Independent of the % of the population that meets the standard
- If the standard is not met, specific diagnostic evaluations can be made
- Degree of performance is not important; reaching the standard is
- Performance is linked to specific outcomes
- Individuals know exactly what is expected of them

7 Limitations of Criterion-Referenced Measurement
- Cutoff scores always involve subjective judgment
- Misclassifications can be severe
- Motivation can be affected: individuals may become frustrated or bored

8 Setting a Cholesterol “Cut-Off”
[Figure: N of deaths by cholesterol level (mg/dl)]

9 Setting a Cholesterol “Cut-Off”
[Figure: N of deaths by cholesterol level (mg/dl)]

10 Statistical Analysis of CRTs
- Nominal data (categorical: major, gender, pass/fail, etc.)
- Contingency table development (2x2 chi-square)
- Chi-square analysis (used w/ categorical variables)
- Proportion of agreement (see next slide)
- Phi coefficient (correlation for dichotomous (y/n) variables)
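The chi-square statistic and phi coefficient for a 2x2 contingency table can be sketched as follows. This assumes the uncorrected (no continuity correction) chi-square and the standard phi formula; the cell labels a, b, c, d and the counts are hypothetical and not taken from the slides:

```python
import math


def phi_coefficient(a: int, b: int, c: int, d: int) -> float:
    """Phi for the 2x2 table [[a, b], [c, d]]: (ad - bc) / sqrt(product of margins)."""
    margins = (a + b) * (c + d) * (a + c) * (b + d)
    if margins == 0:
        raise ValueError("phi is undefined when a row or column total is zero")
    return (a * d - b * c) / math.sqrt(margins)


def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Uncorrected chi-square for a 2x2 table; equals n * phi^2."""
    n = a + b + c + d
    return n * phi_coefficient(a, b, c, d) ** 2


# Hypothetical pass/fail counts (not from the slides).
a, b, c, d = 20, 5, 10, 15

phi = phi_coefficient(a, b, c, d)
chi2 = chi_square_2x2(a, b, c, d)
print(f"phi = {phi:.3f}, chi-square = {chi2:.3f} (df = 1)")
```

Because chi-square equals n times phi squared in the 2x2 case, the two statistics carry the same information about association; phi is simply scaled like a correlation.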

11 Proportion of Agreement (P)
Sum the correctly classified cells and divide by the total:
P = (n1 + n4)/(n1 + n2 + n3 + n4)
Examples on board
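Since the board examples are not part of the transcript, here is a minimal sketch of the calculation. It assumes n1 and n4 are the two correctly classified cells, as in the formula on this slide; the counts are hypothetical:

```python
def proportion_of_agreement(n1: int, n2: int, n3: int, n4: int) -> float:
    """P = (n1 + n4) / (n1 + n2 + n3 + n4), with n1 and n4 the agreement cells."""
    total = n1 + n2 + n3 + n4
    if total == 0:
        raise ValueError("the contingency table is empty")
    return (n1 + n4) / total


# Hypothetical counts (not from the slides): 30 and 55 classified consistently,
# 5 and 10 classified inconsistently.
p = proportion_of_agreement(30, 5, 10, 55)
print(f"P = {p:.2f}")
```

P ranges from 0 (no cases classified consistently) to 1 (every case classified consistently).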

12 Considerations with CRT
The same as norm-referenced testing
- Reliability (consistency)
  - Equivalence: is the PACER equivalent to 1-mi run/walk?
  - Stability: does same test result in consistent findings?
- Validity (truthfulness of measurement)
  - Criterion-related: concurrent or predictive
  - Construct-related: establish cut scores (see Fig. 7.3)

13 Meeting Criterion-Referenced Standards: Possible Decisions

                            Truly Below Criterion    Truly Above Criterion
Did not achieve standard    Correct Decision         False Positive
Did achieve standard        False Negative           Correct Decision

14 CRT Reliability: Test/Retest of a single measure

              Day 2 Fail    Day 2 Pass
Day 1 Fail    n1            n2
Day 1 Pass    n3            n4

P = (n1 + n4)/(n1 + n2 + n3 + n4)

15 CRT Validity: Use of a field test and criterion measure

                  Field Test Fail    Field Test Pass
Criterion Fail    n1                 n2
Criterion Pass    n3                 n4

16 Example 1: FITNESSGRAM Standards (1987)
Run/walk test (did not achieve vs. did achieve the standard) against criterion VO2max (below vs. above)
Cell counts, in slide order: 24 (4%), 21 (4%), 64 (11%), 472 (81%)
Correctly classified cells: 24 and 472
P = (24 + 472)/(24 + 21 + 64 + 472) = 496/581 = 85%

17 Example 2: AAHPERD Standards (1988)
Run/walk test (did not achieve vs. did achieve the standard) against criterion VO2max (below vs. above)
Cell counts, in slide order: 130 (22%), 23 (4%), 201 (35%), 227 (39%)
Correctly classified cells: 130 and 227
P = (130 + 227)/(130 + 23 + 201 + 227) = 357/581 = 61%
Compare Examples 1-2: FITNESSGRAM standards (81%; P = 85%) are a better predictor of VO2max than AAHPERD standards (39%; P = 61%)
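The comparison between the two examples can be reproduced from the counts on slides 16 and 17. In this sketch the dictionary layout and key names are illustrative choices; only the counts, and which cells count as agreement, come from the slides:

```python
# Counts from slides 16 and 17. "agree" holds the two correctly classified cells,
# "disagree" the two misclassified cells; P does not depend on which
# misclassified cell is which.
examples = {
    "FITNESSGRAM (1987)": {"agree": (24, 472), "disagree": (21, 64)},
    "AAHPERD (1988)": {"agree": (130, 227), "disagree": (23, 201)},
}

results = {}
for name, cells in examples.items():
    correct = sum(cells["agree"])
    total = correct + sum(cells["disagree"])
    results[name] = correct / total
    print(f"{name}: P = {correct}/{total} = {results[name]:.0%}")

# The standard with the higher P classifies more participants consistently
# with the criterion VO2max.
best = max(results, key=results.get)
print(f"Higher proportion of agreement: {best}")
```

Both examples use the same 581 participants, so the difference in P reflects the cut scores rather than the sample.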

18 Criterion-Referenced Measurement
Find a friend: explain one thing that you learned today and share WHY IT MATTERS to you as a future professional

