# Topic 4B: Test Construction

The test construction process
1. Defining the test
2. Selecting a scaling method
3. Constructing the items
4. Testing the items
5. Revising the test
6. Publishing the test

Defining the Test
Explain the purpose of the test explicitly and propose a fresh focus for what the test intends to measure, for example, intelligence. Example: the K-ABC (Kaufman Assessment Battery for Children).

Selecting Scaling Method
The immediate purpose of psychological testing is to assign numbers to responses on a test so that the examinee can be judged to have more or less of the characteristic measured. Levels of measurement (Stevens, 1946): nominal scale, ordinal scale, interval scale, and ratio scale.

Selecting Scaling Method
- Nominal scale: allows for categorizing
- Ordinal scale: allows for ranking
- Interval scale: uses equal intervals
- Ratio scale: possesses a true zero point

Representative scaling methods
- Expert rankings
- Method of equal-appearing intervals
- Likert scales
- Method of empirical keying
- Rational scale construction (internal consistency)

Constructing the Items
- Should item content be homogeneous or varied?
- What range of difficulty should the items cover?
- How many initial items should be constructed?
- Which cognitive processes and item domains should be tapped?
- What kind of test item should be used?

Testing the Items
In conducting a thorough item analysis, the test developer might make use of the item-difficulty index, the item-reliability index, the item-validity index, the item-characteristic curve, and an index of item discrimination.

Item-difficulty index
Generally, item difficulties that hover around 0.5, ranging between 0.3 and 0.7, maximize the information the test provides about differences between examinees. However, this rule of thumb is subject to one important qualification and one very significant exception.
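
As a rough illustration of the rule of thumb above, the item-difficulty index can be computed as the proportion of examinees who pass an item. The sketch below assumes dichotomous scoring (1 = pass, 0 = fail); the response data and the helper names `item_difficulty` and `in_target_range` are hypothetical, not taken from the presentation.

```python
def item_difficulty(responses):
    """Proportion of examinees who passed the item (1 = pass, 0 = fail)."""
    return sum(responses) / len(responses)


def in_target_range(p, low=0.3, high=0.7):
    """True if the difficulty falls inside the 0.3 to 0.7 rule-of-thumb band."""
    return low <= p <= high


# Hypothetical responses of ten examinees to three items.
items = {
    "item_1": [1, 1, 1, 1, 1, 1, 1, 1, 1, 0],  # passed by 9 of 10
    "item_2": [1, 0, 1, 1, 0, 1, 0, 1, 0, 0],  # passed by 5 of 10
    "item_3": [0, 0, 1, 0, 0, 0, 0, 1, 0, 0],  # passed by 2 of 10
}

difficulties = {name: item_difficulty(r) for name, r in items.items()}
flags = {name: in_target_range(p) for name, p in difficulties.items()}

for name in items:
    print(name, difficulties[name], "keep" if flags[name] else "review")
```

Here only `item_2` falls inside the band; the other two are flagged for review rather than dropped automatically, since the rule of thumb has the qualification and exception noted above.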

Item-reliability index
The item-reliability index is the product of an item's standard deviation (its dispersion) and the point-biserial correlation between the item score and the total test score.
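
A minimal sketch of this product, assuming a dichotomously scored item (so its standard deviation is the square root of p(1 - p)) and assuming the correlation is taken between the item score and the total test score. The point-biserial correlation is computed here as a Pearson correlation with a 0/1 variable. The data and function names are illustrative only.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation; with a 0/1 variable this is the point-biserial r."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def item_sd(item):
    """Standard deviation of a 0/1 item: sqrt(p * (1 - p))."""
    p = sum(item) / len(item)
    return sqrt(p * (1 - p))


def item_reliability_index(item, total_scores):
    """Item standard deviation times the item-total point-biserial correlation."""
    return item_sd(item) * pearson(item, total_scores)


# Hypothetical data: six examinees, one item, and their total test scores.
item = [1, 1, 1, 0, 0, 0]
total_scores = [9, 8, 7, 4, 3, 2]

reliability_index = item_reliability_index(item, total_scores)
print(round(reliability_index, 3))
```

Because the standard deviation of a dichotomous item cannot exceed 0.5, the index is largest for items of middling difficulty that also correlate strongly with the total score.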

Item-validity index
The point-biserial correlation between the item score and the score on the criterion variable is computed first. The item-validity index is then the product of the item's standard deviation and this point-biserial correlation.
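
The two steps can be sketched as follows, using the textbook point-biserial formula (difference between the criterion means of passers and failers, divided by the criterion standard deviation, times the square root of pq). The item responses and criterion scores are hypothetical, and the sketch assumes at least one examinee passes and at least one fails the item.

```python
from math import sqrt
from statistics import mean, pstdev


def point_biserial(item, criterion):
    """Point-biserial r between a 0/1 item and a continuous criterion.

    Assumes both the pass group and the fail group are non-empty.
    """
    passed = [c for i, c in zip(item, criterion) if i == 1]
    failed = [c for i, c in zip(item, criterion) if i == 0]
    p = len(passed) / len(item)
    q = 1 - p
    return (mean(passed) - mean(failed)) / pstdev(criterion) * sqrt(p * q)


def item_validity_index(item, criterion):
    """Item standard deviation times the item-criterion point-biserial r."""
    p = sum(item) / len(item)
    return sqrt(p * (1 - p)) * point_biserial(item, criterion)


# Hypothetical data: eight examinees, one item, and an external criterion score.
item = [1, 1, 0, 1, 0, 0, 1, 0]
criterion = [8, 7, 5, 6, 4, 3, 9, 6]

r_pb = point_biserial(item, criterion)
validity_index = item_validity_index(item, criterion)
print(round(r_pb, 3), round(validity_index, 3))
```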

Item-characteristic curve
Figure 4.8
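
The figure itself is not reproduced in this transcript. As a stand-in, the sketch below traces one common form of item-characteristic curve, a two-parameter logistic function giving the probability of passing an item as a function of ability. The logistic form and the parameter names `a` (discrimination) and `b` (difficulty) are assumptions for illustration and may not match the curve shown in the original figure.

```python
from math import exp


def icc(theta, a=1.0, b=0.0):
    """Probability of a correct response at ability theta (logistic model).

    a: slope (discrimination); b: ability level at which P = 0.5 (difficulty).
    """
    return 1.0 / (1.0 + exp(-a * (theta - b)))


# Hypothetical ability levels, in standard-score units.
abilities = [-3, -2, -1, 0, 1, 2, 3]

easy_item = [icc(t, a=1.0, b=-1.0) for t in abilities]
hard_item = [icc(t, a=1.0, b=1.0) for t in abilities]

for t, p_easy, p_hard in zip(abilities, easy_item, hard_item):
    print(f"{t:+d}  easy={p_easy:.2f}  hard={p_hard:.2f}")
```

Both curves rise with ability; the harder item's curve is shifted to the right, so at every ability level it is passed with lower probability.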

Item-discrimination index
An ideal test item is one that most high scorers pass and most low scorers fail.
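
One simple way to quantify this idea is to compare pass rates in the upper and lower scoring groups: d = (U - L) / N, where U and L are the numbers of upper-group and lower-group examinees passing the item and N is the size of each group. The sketch below assumes the groups are the top and bottom 27% of total scores; that fraction, the data, and the function name are illustrative assumptions.

```python
def item_discrimination(item, total_scores, fraction=0.27):
    """Upper-lower discrimination index d = (U - L) / N for a 0/1 item."""
    n = len(item)
    group_size = max(1, round(n * fraction))
    # Examinee indices ordered from lowest to highest total score.
    order = sorted(range(n), key=lambda i: total_scores[i])
    lower_group = order[:group_size]
    upper_group = order[-group_size:]
    upper_pass = sum(item[i] for i in upper_group)
    lower_pass = sum(item[i] for i in lower_group)
    return (upper_pass - lower_pass) / group_size


# Hypothetical data: ten examinees, their total scores, and two items.
total_scores = [2, 3, 4, 5, 5, 6, 7, 8, 9, 10]
typical_item = [0, 0, 1, 0, 1, 0, 1, 1, 1, 1]
ideal_item = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

d_typical = item_discrimination(typical_item, total_scores)
d_ideal = item_discrimination(ideal_item, total_scores)
print(round(d_typical, 2), d_ideal)
```

The index ranges from -1 to +1. A value near +1 corresponds to the ideal item described above, while a negative value signals an item that low scorers pass more often than high scorers.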

Revising the Test
- Cross-validation
- Validity shrinkage
- Feedback from examinees

Publishing the Test
- Production of testing materials
- Technical manual and user's manual
- Testing is big business