Reliability Definition: The stability or consistency of a test. Assumption: True score = obtained score +/- error Domain Sampling Model Item Domain Test.

1 Reliability
Definition: The stability or consistency of a test.
Assumption: True score = obtained score ± error
Domain Sampling Model [diagram: Item Domain, Test]

2 Overview of Reliability Techniques
1. Test-retest
2. Parallel/Alternate Forms
3. Split-half
4. Internal Consistency (K-R-20, Coefficient Alpha)
[Diagrams for techniques 1 to 3 not reproduced]

3 Test-retest method
[Error is due to changes occurring with the passage of time]
Some issues:
- The length of time between test administrations is crucial (generally, the longer the interval, the lower the reliability)
- Memory
- Stability of the construct being assessed
- Speed tests, sensory discrimination, psychomotor tests (possible fatigue factor)

4 Parallel/Alternate Forms
[Error due to test content and perhaps the passage of time]
Two types:
1) Immediate (back-to-back administrations)
2) Delayed (a time interval between administrations)
Some issues:
- Need the same number and type of items on each test
- Item difficulty must be the same on each test
- Variability of scores must be the same on each test
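Test-retest and alternate-forms reliability are both conventionally estimated as the Pearson correlation between two sets of scores from the same people. The sketch below is only an illustration: the function name `pearson_r` and the two score lists are hypothetical and do not come from the slides.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two score lists of equal length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("scores must vary for a correlation to be defined")
    return sxy / sqrt(sxx * syy)

# Hypothetical scores for five people on two administrations (or two forms)
time1 = [82, 75, 91, 68, 77]
time2 = [80, 78, 89, 70, 74]
print(round(pearson_r(time1, time2), 2))
```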

5 Split-half reliability
[Error due to differences in item content between the halves of the test]
- Typically, responses on odd versus even items are employed
- Correlate total scores on odd items with total scores on even items
- Need to use the Spearman-Brown correction formula:

$$r_{ttc} = \frac{n\,r_{12}}{1 + (n - 1)\,r_{12}}$$

where $r_{ttc}$ = corrected r for the total test, $n$ = # of times the test is lengthened, and $r_{12}$ = correlation between both parts of the test.

Person  Odd  Even
1       36   43
2       44   40
3       42   37
4       33   40
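The Spearman-Brown correction can be sketched as a one-line function. The name `spearman_brown` and the half-test correlation of .70 are assumptions chosen for illustration; the four rows of odd/even scores on the slide are too few to give a meaningful correlation, so they are not used here.

```python
def spearman_brown(r12, n=2):
    """Corrected reliability when a test is lengthened n times.

    r12 is the correlation between the two parts of the test;
    n = 2 corresponds to stepping a half-test up to full length.
    """
    return (n * r12) / (1 + (n - 1) * r12)

# Hypothetical correlation between odd-item and even-item totals
half_test_r = 0.70
full_test_r = spearman_brown(half_test_r)
print(round(full_test_r, 3))
```

With n = 2 the correction always moves a positive half-test correlation upward, which is why the uncorrected odd/even correlation understates the reliability of the full-length test.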

6 KR-20 and Coefficient Alpha
[Error due to item similarity]
- KR-20 is used with scales that have right and wrong responses (e.g., achievement tests)
- Alpha is used for scales that have a range of response options where there are no right or wrong responses (e.g., 7-point Likert-type scales)

KR-20:

$$R_{tt} = \frac{k}{k - 1}\left(1 - \frac{\sum p_i (1 - p_i)}{\sigma_y^2}\right)$$

where $k$ = # of items, $p_i$ = proportion of those getting item i correct, and $\sigma_y^2$ = variance of test scores.

Alpha:

$$\alpha = \frac{k}{k - 1}\left(1 - \frac{\sum \sigma_i^2}{\sigma_y^2}\right)$$

where $k$ = # of items, $\sigma_i^2$ = variance of scores on each item, and $\sigma_y^2$ = variance of test scores.
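A minimal sketch of both coefficients follows, assuming responses are stored as one list of item scores per person and that population variances are used throughout (with 0/1 items, p(1 - p) is the population variance of an item, so the two formulas then agree). The function names and the small data set are hypothetical.

```python
from statistics import pvariance

def kr20(responses):
    """KR-20 for right/wrong items; responses is a list of per-person 0/1 lists."""
    n_people = len(responses)
    k = len(responses[0])
    var_total = pvariance([sum(person) for person in responses])
    # p_i: proportion of people answering item i correctly
    p = [sum(person[i] for person in responses) / n_people for i in range(k)]
    sum_pq = sum(p_i * (1 - p_i) for p_i in p)
    return (k / (k - 1)) * (1 - sum_pq / var_total)

def coefficient_alpha(responses):
    """Coefficient alpha; responses is a list of per-person item-score lists."""
    k = len(responses[0])
    var_total = pvariance([sum(person) for person in responses])
    item_vars = [pvariance([person[i] for person in responses]) for i in range(k)]
    return (k / (k - 1)) * (1 - sum(item_vars) / var_total)

# Hypothetical answers of five people to four right/wrong items
answers = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(round(kr20(answers), 3), round(coefficient_alpha(answers), 3))
```

For Likert-type data only `coefficient_alpha` applies, since `kr20` assumes every item score is 0 or 1.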

7 Possible problem with choosing test items based on their correlations with a criterion
[Scatterplot not reproduced. Axes: correlation of items with total test scores (.00 to .50) and correlation with criteria (.00 to .50); a "Selection zone" is marked on the plot]

8 Factors Affecting Reliability
1) Variability of scores (generally, the more variability, the higher the reliability)
2) Number of items (the more questions, the higher the reliability)
3) Item difficulty (moderately difficult items lead to higher reliability, e.g., p-values of .40 to .60)
4) Homogeneity/similarity of item content (e.g., item x total score correlation; the more homogeneity, the higher the reliability)
5) Scale format/number of response options (the more options, the higher the reliability)

9 Standard Error of Measurement
[Error that exists in an individual's test score]

$$SEM = \sigma\sqrt{1 - r}$$

where $\sigma$ = standard deviation and $r$ = reliability.

Examples:
σ = 10; r = .90: SEM = 3.16
σ = 10; r = .60: SEM = 6.32
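The two examples can be checked with a short function. This is a sketch only; the name `sem` and the range check on the reliability are assumptions, not part of the slides.

```python
from math import sqrt

def sem(sd, reliability):
    """Standard error of measurement: sd * sqrt(1 - reliability)."""
    if not 0 <= reliability <= 1:
        raise ValueError("reliability is expected to lie between 0 and 1")
    return sd * sqrt(1 - reliability)

# The slide's two examples (standard deviation of 10 in both)
print(round(sem(10, 0.90), 2))  # 3.16
print(round(sem(10, 0.60), 2))  # 6.32
```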

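10 Normal Curve
[Figure: normal curve with the horizontal axis marked from -4σ to +4σ around the mean; about 68% of scores fall within ±1σ, 95% within ±2σ, and 99% within ±3σ]
Actual z-score for 95% = 1.96; actual z-score for 99% = 2.58
3.16 x 1.96 = 6.19 (95% confidence)
3.16 x 2.58 = 8.15 (99% confidence)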
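These margins are typically used to place a confidence band around an individual's obtained score. The sketch below assumes a hypothetical obtained score of 100 together with the slide's values (standard deviation 10, reliability .90, z of 1.96 or 2.58); the function name `score_band` is illustrative. Because it does not round the SEM to 3.16 first, the margins differ from the slide's 6.19 and 8.15 in the second decimal place.

```python
from math import sqrt

def score_band(observed, sd, reliability, z):
    """Return (low, high) bounds: observed score plus or minus z times the SEM."""
    margin = z * sd * sqrt(1 - reliability)
    return observed - margin, observed + margin

# Hypothetical obtained score of 100; sd and reliability taken from the slide
low95, high95 = score_band(100, 10, 0.90, 1.96)
low99, high99 = score_band(100, 10, 0.90, 2.58)
print(round(low95, 1), round(high95, 1))
print(round(low99, 1), round(high99, 1))
```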

11 Other Standard Errors
Standard error of the mean:

$$S_{\bar{X}} = \frac{S}{\sqrt{N}}$$

where $S$ = standard deviation and $N$ = # of observations or sample size.

Standard error of proportion:

$$SEP = \sqrt{\frac{p(1 - p)}{N}}$$

where $p$ = proportion and $N$ = sample size.

Standard error of difference in proportions: [formula missing from the transcript]

Standard error of estimate (validity coefficient):

$$\sigma_{y'} = \sigma_y\sqrt{1 - r_{xy}^2}$$

where $\sigma_y$ = standard deviation of y (criterion) and $r_{xy}^2$ = correlation between x and y, squared.
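Three of these standard errors can be sketched directly. The function names and the numeric inputs are hypothetical, and the standard error of proportion is written in its usual square-root form. The standard error of the difference in proportions is left out because its formula does not survive in the transcript.

```python
from math import sqrt

def se_mean(sd, n):
    """Standard error of the mean: sd / sqrt(N)."""
    return sd / sqrt(n)

def se_proportion(p, n):
    """Standard error of a proportion: sqrt(p * (1 - p) / N)."""
    return sqrt(p * (1 - p) / n)

def se_estimate(sd_y, r_xy):
    """Standard error of estimate: sd of the criterion * sqrt(1 - r_xy squared)."""
    return sd_y * sqrt(1 - r_xy ** 2)

# Hypothetical inputs chosen only to illustrate the calls
print(se_mean(10, 100))
print(se_proportion(0.5, 100))
print(round(se_estimate(10, 0.6), 2))
```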
