Topic 4B Test Construction
The test construction process Defining the test; selecting a scaling method; constructing the items; testing the items; revising the test; publishing the test
Defining the Test Explain the test's purpose explicitly and propose a fresh focus for what the test intends to measure, for example, intelligence. Example: K-ABC
Selecting Scaling Method The immediate purpose of psychological testing is to assign numbers to responses on a test so that the examinee can be judged to have more or less of the characteristic measured. Levels of Measurement: Stevens (1946) Nominal scale, ordinal scale, interval scale, ratio scale
Selecting Scaling Method Nominal scale: allows for categorizing. Ordinal scale: allows for ranking. Interval scale: uses equal intervals. Ratio scale: possesses a true zero point.
Representative scaling methods Expert rankings; method of equal-appearing intervals; Likert scales; method of empirical keying; rational scale construction (internal consistency)
Constructing the Items Should item content be homogeneous or varied? What range of difficulty should the items cover? How many initial items should be constructed? Which cognitive processes and item domains should be tapped? What kind of test item should be used?
Testing the Items In conducting a thorough item analysis, the test developer might make use of the item-difficulty index, item-reliability index, item-validity index, item-characteristic curve, and an index of item discrimination.
Item-difficulty index Generally, item difficulties that hover around 0.5, ranging between 0.3 and 0.7, maximize the information the test provides about differences between examinees. However, this rule of thumb is subject to one important qualification and one very significant exception.
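As a minimal sketch of the item-difficulty index, assuming items are scored 1 (pass) or 0 (fail): the index is simply the proportion of examinees who pass the item. The function name and the response data below are hypothetical and serve only as illustration.

```python
def item_difficulty(responses):
    """Proportion of examinees who passed the item (1 = pass, 0 = fail)."""
    if not responses:
        raise ValueError("responses must not be empty")
    return sum(responses) / len(responses)

# Hypothetical responses of 10 examinees to one item.
item_scores = [1, 1, 0, 1, 0, 1, 0, 0, 1, 1]

p = item_difficulty(item_scores)

# Check the item against the 0.3 to 0.7 rule of thumb from the slide.
in_recommended_band = 0.3 <= p <= 0.7
print(f"difficulty p = {p:.2f}, within 0.3-0.7: {in_recommended_band}")
```

Note that a higher value of p means an easier item, since more examinees pass it.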
Item-reliability index The item-reliability index is the product of an item's standard deviation (its dispersion) and the point-biserial correlation between the item score and the total test score.
Item-validity index The point-biserial correlation between the item score and the score on the criterion variable is computed first. The item-validity index is then the product of the item's standard deviation and this point-biserial correlation.
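The item-reliability and item-validity indices share the same form: the item's standard deviation multiplied by a point-biserial correlation, taken with the total test score for reliability and with the criterion score for validity. The sketch below assumes a 0/1 item and uses the population standard deviation, which for such an item equals the square root of p(1 - p). All data and function names are hypothetical.

```python
from statistics import mean, pstdev

def point_biserial(item, scores):
    """Pearson correlation between a 0/1 item and a continuous score.

    For a dichotomous item this is the point-biserial correlation.
    """
    mean_item, mean_score = mean(item), mean(scores)
    cov = sum((x - mean_item) * (y - mean_score)
              for x, y in zip(item, scores)) / len(item)
    return cov / (pstdev(item) * pstdev(scores))

def item_index(item, scores):
    """Item standard deviation times the point-biserial with `scores`."""
    return pstdev(item) * point_biserial(item, scores)

# Hypothetical data for 10 examinees.
item = [1, 1, 0, 1, 0, 1, 0, 0, 1, 1]                 # one item, 1 = pass
total_scores = [18, 16, 9, 15, 11, 17, 8, 10, 14, 12]  # total test scores
criterion = [3.6, 3.1, 2.4, 3.3, 2.9, 3.8, 2.0, 2.6, 2.8, 3.0]  # external criterion

reliability_index = item_index(item, total_scores)  # item vs. total score
validity_index = item_index(item, criterion)        # item vs. criterion

print(f"item-reliability index: {reliability_index:.3f}")
print(f"item-validity index:    {validity_index:.3f}")
```

Because a 0/1 item's standard deviation cannot exceed 0.5, both indices are bounded by 0.5 in absolute value, and items of middling difficulty can reach the largest values.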
Item-characteristic curve Figure 4.8
Item-discrimination index An ideal test item is one that most of the high scorers pass and most of the low scorers fail.
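One common way to quantify this idea, not spelled out on the slide, is d = (U - L) / N, where U and L are the numbers of examinees in the upper and lower scoring groups who pass the item and N is the size of each group. The sketch below assumes that form and the frequently used convention of taking the top and bottom 27% of total scores. The function name and data are hypothetical.

```python
def discrimination_index(item, totals, fraction=0.27):
    """d = (U - L) / N using upper and lower groups by total score.

    `fraction` is the share of examinees placed in each extreme group
    (0.27 is a common convention and an assumption here).
    """
    n_group = max(1, round(len(totals) * fraction))
    # Indices of examinees ordered from lowest to highest total score.
    order = sorted(range(len(totals)), key=lambda i: totals[i])
    lower, upper = order[:n_group], order[-n_group:]
    passed_upper = sum(item[i] for i in upper)  # U
    passed_lower = sum(item[i] for i in lower)  # L
    return (passed_upper - passed_lower) / n_group

# Hypothetical data for 10 examinees.
item = [1, 1, 0, 1, 0, 1, 0, 0, 1, 1]
totals = [18, 16, 9, 15, 11, 17, 8, 10, 14, 12]

d = discrimination_index(item, totals)
print(f"discrimination index d = {d:.2f}")
```

Under this definition d ranges from -1 to +1. Values near +1 match the ideal item described above, while a negative value signals an item that low scorers pass more often than high scorers.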
Revising the Test Cross validation; validity shrinkage; feedback from examinees
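Cross validation means re-checking a test's validity in a new sample, and validity shrinkage is the drop in validity that tends to appear when this is done. The simulation below is a sketch under strong assumptions: the items are pure noise with no real relation to the criterion, and the five "best" items are chosen by their item-criterion correlation in a derivation sample. All names, sample sizes, and data are hypothetical.

```python
import random
from statistics import mean, pstdev

def pearson(x, y):
    """Pearson correlation; returns 0.0 if either variable has no variance."""
    sx, sy = pstdev(x), pstdev(y)
    if sx == 0 or sy == 0:
        return 0.0
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) * sx * sy)

def simulate_sample(rng, n_people, n_items):
    """Random 0/1 items plus a criterion that is unrelated to them."""
    items = [[int(rng.random() < 0.5) for _ in range(n_people)]
             for _ in range(n_items)]
    criterion = [rng.gauss(0, 1) for _ in range(n_people)]
    return items, criterion

def scale_validity(items, criterion, keep):
    """Correlation between the summed score on the kept items and the criterion."""
    totals = [sum(items[i][p] for i in keep) for p in range(len(criterion))]
    return pearson(totals, criterion)

rng = random.Random(0)
deriv_items, deriv_crit = simulate_sample(rng, n_people=50, n_items=40)
cross_items, cross_crit = simulate_sample(rng, n_people=50, n_items=40)

# Select the 5 items that look most valid in the derivation sample only.
ranked = sorted(range(40),
                key=lambda i: pearson(deriv_items[i], deriv_crit),
                reverse=True)
keep = ranked[:5]

r_deriv = scale_validity(deriv_items, deriv_crit, keep)  # capitalizes on chance
r_cross = scale_validity(cross_items, cross_crit, keep)  # same items, new sample
shrinkage = r_deriv - r_cross

print(f"derivation r = {r_deriv:.2f}, cross-validation r = {r_cross:.2f}, "
      f"shrinkage = {shrinkage:.2f}")
```

Because the items were picked for their chance correlations in the derivation sample, the validity estimated there is inflated, which is why a test is usually cross-validated on fresh examinees before publication.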
Publishing the Test Production of testing materials; technical manual and user's manual; testing is big business