The Design of Statistical Specifications for a Test Mark D. Reckase Michigan State University.


1 The Design of Statistical Specifications for a Test Mark D. Reckase Michigan State University

2 Procedures for Test Design Test design has been considered to be a subjective, artistic endeavor. But, with the development of item response theory, test design has become more scientific. Lord suggested that tests be constructed to match a target information function. Very sophisticated methods have been developed to select items to match target information functions. Little work has been done on the design of test information functions.

3 Purposes for this Paper Present a methodology for designing target information functions or item difficulty distributions for a test. Demonstrate that methodology for several common testing situations, including: measuring all examinees from a normal distribution of the trait to a desired level of precision, and measuring a range of a trait to a desired level of precision.

4 Basic Concepts If an examinee's θ is known, the optimal test should contain a set of items that provide the required information at that θ. Information from an item covers a range, so items that are optimal for one person supply some information for other persons. The general approach is to randomly select a person from the target population, then select optimal items for that person. For each additional person, select only the additional items that are needed to reach the information target.

5 Example Suppose the target examinee population is N(0,1). Randomly select an examinee. Information equivalent to reliability .90 is 10. Select items until information of 10 is reached, assuming the Rasch model (b = θ). Randomly select additional examinees. Select items for those examinees until a test length of 50 is reached.
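
The example can be sketched as a simple greedy loop. This is one possible reading of the procedure under stated assumptions, not the author's implementation: the names `info_at` and `design_one_sample` are hypothetical, D = 1.7 is assumed, every new item is placed exactly at the sampled examinee's θ (b = θ), and `max_draws` is a safety cap added for this sketch.

```python
import math

import numpy as np

D = 1.7  # ASSUMPTION: scaling constant, not stated on the slides
TARGET_INFO = 10.0
TEST_LENGTH = 50


def info_at(theta, difficulties):
    """Rasch test information at a single theta for the items chosen so far."""
    if len(difficulties) == 0:
        return 0.0
    b = np.asarray(difficulties, dtype=float)
    p = 1.0 / (1.0 + np.exp(-D * (theta - b)))
    return float(np.sum(D ** 2 * p * (1.0 - p)))


def design_one_sample(rng, target=TARGET_INFO, length=TEST_LENGTH, max_draws=10000):
    """Draw examinees from N(0,1) and add only the items each one still needs."""
    difficulties = []
    per_item = 0.25 * D ** 2  # information of one item located at b = theta
    for _ in range(max_draws):  # hypothetical safety cap
        if len(difficulties) >= length:
            break
        theta = rng.normal(0.0, 1.0)
        shortfall = target - info_at(theta, difficulties)
        if shortfall <= 0:
            continue  # existing items already cover this examinee
        n_new = math.ceil(shortfall / per_item)
        n_new = min(n_new, length - len(difficulties))
        difficulties.extend([theta] * n_new)
    return np.array(difficulties)


rng = np.random.default_rng(0)
items = design_one_sample(rng)
```

The first examinee always receives a full block of items because nothing has been selected yet; later examinees receive only the shortfall.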

6 Results -- Comments Results are from one sample of 6 examinees randomly selected. 14 items needed for first examinee. Other examinees need fewer additional items because of overlap of information functions. Need to consider the effects of sampling variation.
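
The figure of 14 items for the first examinee can be checked with a little arithmetic. The sketch below assumes a scaling constant of D = 1.7 (not stated on the slides) and, for the reliability line, the convention reliability = 1 − SE²/σ² with unit trait variance; both are assumptions made here for illustration.

```python
import math

D = 1.7          # ASSUMPTION: scaling constant, not stated on the slides
target = 10.0    # information target from the example

# A Rasch item located exactly at the examinee's theta has P = 0.5.
max_item_info = D ** 2 * 0.25

# Items needed when all of them sit at the examinee's theta.
items_needed = math.ceil(target / max_item_info)

# For comparison: with D = 1 each item gives at most 0.25.
items_needed_unscaled = math.ceil(target / 0.25)

# ASSUMED convention: reliability = 1 - SE^2 / trait variance, variance = 1.
error_variance = 1.0 / target
reliability = 1.0 - error_variance / 1.0
```

Under these assumptions `items_needed` is 14, matching the slide, whereas the unscaled model would call for 40 items.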

7 Information from One Item

8 Results – Selected Items

9 Results – Information Function

10 The Complete Process Create an ideal set of items for a sample. Replicate the process many times (500 seems to work well). Average the information functions from the samples. Average the number of items in 0.2-unit bins to determine the difficulty spread. Check specifications against target.
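
The replicate-and-average steps above can be sketched as follows. This is a hedged illustration, not the author's program: the function names, the θ grid from -4 to 4, the seed, and D = 1.7 are all assumptions, and items that fall outside the grid are simply not counted in the bins.

```python
import math

import numpy as np

D = 1.7  # ASSUMPTION: scaling constant, not stated on the slides


def item_info(theta, b):
    """Rasch item information; works elementwise on arrays."""
    p = 1.0 / (1.0 + np.exp(-D * (theta - b)))
    return D ** 2 * p * (1.0 - p)


def design_one_sample(rng, target=10.0, length=50, max_draws=10000):
    """One replication: add items at each sampled theta until the test is full."""
    b = np.empty(0)
    for _ in range(max_draws):  # hypothetical safety cap
        if b.size >= length:
            break
        theta = rng.normal()
        shortfall = target - item_info(theta, b).sum()
        if shortfall <= 0:
            continue
        n_new = min(math.ceil(shortfall / (0.25 * D ** 2)), length - b.size)
        b = np.concatenate([b, np.full(n_new, theta)])
    return b


def average_design(n_reps=500, seed=0):
    """Average information functions and 0.2-unit difficulty counts over replications."""
    rng = np.random.default_rng(seed)
    grid = np.linspace(-4.0, 4.0, 81)    # theta points for the information function
    edges = np.linspace(-4.0, 4.0, 41)   # 40 bins, each 0.2 units wide
    info_sum = np.zeros_like(grid)
    count_sum = np.zeros(edges.size - 1)
    for _ in range(n_reps):
        b = design_one_sample(rng)
        info_sum += item_info(grid[:, None], b[None, :]).sum(axis=1)
        count_sum += np.histogram(b, bins=edges)[0]
    return grid, info_sum / n_reps, edges, count_sum / n_reps
```

The averaged bin counts are generally fractional, so turning them into whole items requires rounding; checking the rounded specification against the target is the final step listed on the slide.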

11 Conditions for Rasch-based Design N(0,1) trait distribution; 50-item test; Rasch model; 500 replications; minimum information 10.

12 Average Test Information

13 Item Difficulty Distribution

14 Match of Test to Target

15 Comments Minimum information requirement met from -2.3 to 2.3. Information accumulates to higher values in the middle of the distribution. Difficulty distribution is essentially rectangular. Test information exceeds the target because item numbers are rounded upward in many cases.

16 Process Can Help Select Test Length Run process for different test lengths. Also can consider forcing selection of first examinee at 0.0. What test length allows criteria to be met?

17 Effect of Test Length

18 Results – Test Length As test length increases, the information function widens and increases in height. A test length of 15 is too short to meet the requirements unless it is focused at 0.0. Forcing the first examinee at 0.0 makes the information function narrower and more peaked. 75 items is the maximum number of items that makes sense for the criteria specified here.

19 Test Designed to Measure with Precision over a Range Brian Junker suggested the following procedure. Select a range. Pick items at the extremes of the range. Fill in with items between the extremes to yield a flat information function. Continue until the information criterion is reached over the entire range.
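
One way to read this procedure is as a greedy fill: start with an item at each extreme, then repeatedly add an item wherever information is currently lowest. The sketch below is an interpretation, not the procedure as Junker or the author implemented it. The function name `fill_range`, the 0.1 grid step, the `max_items` cap, the information target of 10, and D = 1.7 are all assumptions.

```python
import numpy as np

D = 1.7  # ASSUMPTION: scaling constant, not stated on the slides


def item_info(theta, b):
    """Rasch item information at theta for an item of difficulty b."""
    p = 1.0 / (1.0 + np.exp(-D * (theta - b)))
    return D ** 2 * p * (1.0 - p)


def fill_range(lo=-2.0, hi=2.0, target=10.0, step=0.1, max_items=500):
    """Greedy fill: add an item at the weakest grid point until the target holds."""
    n_points = int(round((hi - lo) / step)) + 1
    grid = np.linspace(lo, hi, n_points)
    difficulties = [lo, hi]  # start with one item at each extreme
    info = item_info(grid, lo) + item_info(grid, hi)
    while info.min() < target and len(difficulties) < max_items:
        weakest = grid[np.argmin(info)]  # point with the least information so far
        difficulties.append(weakest)
        info += item_info(grid, weakest)
    return np.sort(np.array(difficulties)), grid, info


difficulties, grid, info = fill_range()
```

Because points near the ends of the range receive information from only one side, a fill of this kind tends to place extra items there, which is consistent with the U-shaped difficulty distribution mentioned in the conclusions.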

20 Increment of Information with Each Added Item

21 Target Information Function for Range from -2 to 2

22 Items that Match Target

23 Specifications Counter to Traditional Specifications Most tests have normal distributions of difficulties. These results seem very odd compared to traditional results. Need to investigate further. What is distribution of scores? What is distribution of p-values?

24 Number-Correct Score Distribution

25 P-value Distribution

26 Odd Results Distribution of scores is near normal. Distribution of p-values mirrors b-parameter distribution. Extreme item difficulties are .08 and .92. Surprising that these items yield normal distribution of scores. Look at test characteristic curve.

27 Test Characteristic Curve

28 Test Characteristic Curve Test characteristic curve is virtually linear from -2 to 2. When the curve is linear, the form of the distribution of θ is mapped to the estimated true score scale. In this case, since the θ distribution was normal, so is the number-correct score distribution.
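
The linearity claim can be illustrated by computing a test characteristic curve and fitting a straight line to it. The item set below is hypothetical (50 evenly spaced difficulties from -3 to 3), not the items from the slides, and D = 1.7 is again an assumption; the point is only to show how near-linearity over -2 to 2 could be checked.

```python
import numpy as np

D = 1.7  # ASSUMPTION: scaling constant, not stated on the slides


def tcc(theta, difficulties):
    """Expected number-correct score: sum of Rasch probabilities over items."""
    theta = np.atleast_1d(np.asarray(theta, dtype=float))
    b = np.asarray(difficulties, dtype=float)
    p = 1.0 / (1.0 + np.exp(-D * (theta[:, None] - b[None, :])))
    return p.sum(axis=1)


# HYPOTHETICAL item set for illustration only.
difficulties = np.linspace(-3.0, 3.0, 50)
theta = np.linspace(-2.0, 2.0, 41)
scores = tcc(theta, difficulties)

# Fit a straight line and measure how far the curve departs from it.
slope, intercept = np.polyfit(theta, scores, 1)
max_deviation = np.max(np.abs(scores - (slope * theta + intercept)))
linear_r = np.corrcoef(theta, scores)[0, 1]
```

When `linear_r` is close to 1, the expected score is approximately a linear function of θ, and a linear transformation of a normal variable is itself normal, which is the argument made on the slide.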

29 Test Information Function for Test with c = .16

30 Items that Match Target

31 Conclusions A process has been developed for designing target information functions and item difficulty distributions for tests. The process suggests that either a rectangular or a U-shaped distribution is appropriate if it is desired to measure with equal precision over a range. The number of items needed is related to the range of the scale that needs to be measured. The U-shaped item difficulty distribution works best if it is desired to recover the underlying θ distribution. The results are quite different from those of traditional test development procedures.

