Test Development Nicole Williams, MSN, RN-BC Content Manager Sarah Hagge, PhD Psychometrician.

1 Test Development Nicole Williams, MSN, RN-BC Content Manager Sarah Hagge, PhD Psychometrician

2 Objectives: Identify the negative impact of examination bias. Discuss the impact of enemy items on test validity.


4 Because of the high-stakes nature of the NCLEX®, numerous processes are in place to ensure that the exam is psychometrically sound, valid and legally defensible. One such process is regular review of the NCLEX for potential biases.

5 Detecting Bias: Bias exists when the test construct measured in one group differs from the construct measured in another group taking the same exam. For example, bias would exist in the NCLEX if it measured nursing knowledge in one group of candidates and another construct, such as reading comprehension, in another.

6 Consequence of Bias: The goal of the NCLEX is to classify candidates into two groups: those who have adequate knowledge, skills and ability to practice entry-level nursing safely, and those who do not. If bias occurs, the construct of entry-level nursing knowledge may not be measured accurately for some groups of candidates.

7 Methods to Detect and Minimize Bias: Item development: writing; review (editorial, SME, sensitivity). Analyses: differential item functioning (DIF); readability.

8 What is DIF? DIF analysis investigates bias at the individual item level. DIF exists when two groups of candidates with similar ability perform differently on an item. In short, one may consider whether a candidate's response to the item depends on the group to which he/she belongs.

9 DIF Analyses: Statistical analyses are conducted on a focal vs. reference group. Focal: the group of interest (generally the minority). Reference: the group with which the focal group is compared (generally the majority). Method: Rasch separate calibration t-test, which compares the difference in difficulty of an item for the focal and reference groups.
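The slide names the method but not the formula. As a minimal sketch, the separate calibration t-test is commonly written as the difference between the two groups' Rasch difficulty estimates divided by the pooled standard error. The function name, the example values and the assumption that both calibrations are already on a common logit scale are illustrative and not taken from the presentation.

```python
import math


def separate_calibration_t(d_focal, se_focal, d_ref, se_ref):
    """t statistic for the difference in Rasch item difficulty between groups.

    d_focal, d_ref: item difficulty (logits) estimated separately in each group,
                    assumed here to be expressed on a common scale.
    se_focal, se_ref: standard errors of those two estimates.
    A positive value means the item was harder for the focal group.
    """
    return (d_focal - d_ref) / math.sqrt(se_focal ** 2 + se_ref ** 2)


# Hypothetical item: difficulty 0.80 (SE 0.15) for the focal group,
# 0.20 (SE 0.05) for the reference group.
t = separate_calibration_t(0.80, 0.15, 0.20, 0.05)
print(round(t, 2))  # 3.79
```

The focal group's standard error usually dominates the denominator because the focal group is smaller, which is one reason minimum sample sizes matter for this test.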

10 NCLEX DIF Procedure: Routine DIF analyses are conducted semiannually. Data include all U.S.-educated candidates.

11 Focal and Reference Groups: Gender: reference group is Female; focal group is Male. Ethnicity: reference group is Caucasian; focal groups are African American, Hispanic, Asian Other, Asian Indian, Native American and Pacific Islander.

12 2010 U.S.-Educated NCLEX Candidates [1]: Reported gender: 78,222 PN candidates and 164,175 RN candidates. Reported ethnicity: 74,147 PN candidates and 152,069 RN candidates. [1] 22,008 candidates did not provide information regarding ethnicity; 5,827 candidates did not provide information on gender.

13 NCLEX DIF Procedure Continued: Analyses are conducted on all pretest and operational items. Minimum sample size requirements: 50 focal group candidates and 400 reference group candidates. Item difficulty is estimated separately for the two groups of candidates.
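The sample size rule above can be sketched as a simple eligibility check applied before an item is analyzed for a given focal/reference comparison. The thresholds come from the slide; the function and constant names are hypothetical.

```python
MIN_FOCAL = 50        # minimum focal group candidates (from the slide)
MIN_REFERENCE = 400   # minimum reference group candidates (from the slide)


def eligible_for_dif(n_focal, n_reference):
    """Return True when both groups meet the minimum sample sizes for an item."""
    return n_focal >= MIN_FOCAL and n_reference >= MIN_REFERENCE


# Hypothetical counts of candidates who responded to one item.
print(eligible_for_dif(62, 1450))   # True
print(eligible_for_dif(31, 1450))   # False: too few focal group candidates
```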

14 Content Review: Items with large differences in difficulty are flagged for content review. Items displaying statistical DIF may still be content appropriate and valid, because the item content may be within the scope of entry-level nurse practice (for example, obstetrics and gynecology, or operating medical equipment).

15 Content Review Panel: A panel of subject matter experts (SMEs) is convened to review items displaying statistical DIF. Panel composition must include at least: five members; three ethnic focal groups; one male; one member with a background in linguistics; one licensed RN.

16 Content Review Panel Continued: The panel reviews all items flagged for statistical DIF in the past six months for potential bias and for content relevance to entry-level nursing. Items identified for bias are forwarded to the NCLEX Examination Committee. Content-irrelevant items are removed from operational use.

17 Sample Item #1: The nursing care plan for a 74-year-old resident of a long-term care facility includes actions to promote the quality and duration of the client's nighttime sleep. Which of the following behaviors, if exhibited by the client, would indicate an appropriate action? 1. The client does mild calisthenics 1 hour before bedtime. 2. The client takes walks in the halls primarily in the afternoon. 3. The client takes naps from mid- to late afternoon. 4. The client drinks warm tea before bedtime.

18 Sample Item #2: The nurse is caring for a 9-year-old client with bronchial asthma who was admitted with pneumonia. The client is on bed rest. Which of the following would be most appropriate to offer the client? 1. Coloring book and crayons 2. A toy stethoscope and syringe with needle 3. Beads and thread for making jewelry 4. A radio and telephone

19 Conclusion: The goal of the NCLEX is to ensure public safety by classifying candidates based on whether they can practice entry-level nursing safely and effectively. Analyses such as DIF are conducted to ensure that all candidates receive an examination that accurately measures their entry-level nursing knowledge.


21 Effective item sampling from a specified test plan is essential to ensure that the exam is psychometrically sound, valid and legally defensible. One process that assists in this endeavor is assessing and eliminating item duplication, or enemy item pairs.

22 How are Enemy Pairs Developed? Random: occurs coincidentally in the normal process of item development. Direct intent: items similar in nature are purposefully developed.

23 What is an Enemy Item Pair? Two or more items with very similar content that are not placed on the same exam, because doing so would impair: content validity; face validity; measurement precision.

24 Content Validity: The consistency with which the content is represented on the exam may be impacted. The content domain may be considered oversampled. The impact is large on a standardized exam, as a specific number of items is allocated to that content domain.

25 Face Validity: Item duplication may cause the candidate to question exam validity. The candidate's response may be altered due to the perception that the item is redundant. The candidate may become distracted, believing that it is a trick.

26 Measurement Precision: Item duplication may result in what is called conditional dependence. The two or more items are most likely correlated. Two dependent areas are being sampled, which may lead to errors in ability estimates.

27 Types of Enemy Item Pairs: Duplicate items; duplicate stems; duplicate options; duplicate stimuli; overlapping content.

28 Duplicate Items: All item components are virtually identical. True duplicates are the same item except for punctuation or other small differences.

29 Duplicate Stems: Identical item stem and varying options. May occur as a result of developing items used as variants. Less likely to occur when developing authentic items from scratch.

30 Duplicate Options: Similar stem and nearly identical item options. This is considered a cost-effective strategy used by test developers to increase item development productivity. With response options so similar, candidates may become confused.

31 Duplicate Stimuli: Identical exam stimulus, such as graphics, exhibits or case scenarios. Using the same stimuli across exam items may create candidate confusion. Candidate exposure to the same stimuli multiple times may introduce fatigue.

32 Overlapping Content: Similar content exists in the items (stem or options), but the verbiage is different: the same concept, phrased differently. These pairs are difficult to detect, so a deliberate effort should be made to seek them out. They can occur across differing item formats, e.g. multiple-choice and multiple response.

33 Management of Enemy Item Pairs: Item development process; test publishing efforts; post exam administration.

34 Item Development Enemy Management: Efforts are placed at the beginning of item development to identify and label enemy pairs. Automated software is now available which can isolate potential enemy pairs. Subject matter experts (SMEs) then review potential enemy item pairs, making identification more precise.
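The slide does not say how the automated software works. One simple way such a screen could be approximated is word-overlap (Jaccard) similarity between item texts, with high-scoring pairs sent to SMEs for review. This is an illustrative stand-in, not the tool the presenters refer to; the function names, the sample items and the 0.6 threshold are all assumptions.

```python
import re
from itertools import combinations


def tokens(text):
    """Lowercase word set, ignoring punctuation."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Share of distinct words that two item texts have in common (0 to 1)."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def candidate_enemy_pairs(items, threshold):
    """Return (id_a, id_b, similarity) for item pairs at or above the threshold.

    items: dict mapping a hypothetical item id to its full text.
    """
    flagged = []
    for (id_a, text_a), (id_b, text_b) in combinations(items.items(), 2):
        score = jaccard(text_a, text_b)
        if score >= threshold:
            flagged.append((id_a, id_b, score))
    return flagged


# Hypothetical item bank; the threshold is arbitrary and would need tuning.
bank = {
    "I-101": "Which action should the nurse take first for a client with chest pain?",
    "I-102": "Which action should the nurse take first for a client with chest pain!",
    "I-203": "Which food is highest in potassium?",
}
for pair in candidate_enemy_pairs(bank, threshold=0.6):
    print(pair)
```

A word-overlap screen like this would catch duplicate items and stems but not overlapping content phrased differently, which is where SME review remains necessary.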

35 Test Publishing Enemy Management Once one or more enemy items are labeled, test developers can activate test driver specifications to prohibit the inclusion of an enemy item once one item in the enemy set has been selected
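The exclusion rule described above can be sketched as follows: once an item is selected, every other member of its enemy set becomes ineligible for the rest of the exam. This is a simplified illustration, not an actual test driver specification; the function name, item ids and fixed selection order are hypothetical.

```python
def select_items(candidates, enemy_sets, n_items):
    """Pick items in order, skipping any enemy of an already selected item.

    candidates: item ids in the order the driver would consider them.
    enemy_sets: iterable of sets of item ids labeled as enemies of one another.
    n_items: number of items to select.
    """
    # Map each item to the set of items it must never appear with.
    enemies = {}
    for group in enemy_sets:
        for item in group:
            enemies.setdefault(item, set()).update(set(group) - {item})

    selected = []
    blocked = set()
    for item in candidates:
        if item in blocked:
            continue
        selected.append(item)
        blocked |= enemies.get(item, set())
        if len(selected) == n_items:
            break
    return selected


# Hypothetical pool in which A and B were labeled as an enemy pair.
print(select_items(["A", "B", "C", "D"], [{"A", "B"}], 3))  # ['A', 'C', 'D']
```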

36 Post Administration Enemy Management: Test developers may analyze item intercorrelations. High intercorrelations may indicate potential enemy pairs. This method may capture the most obscure enemy pairs: those not immediately identifiable and least likely to impact test validity and measurement.
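As a rough sketch of this kind of screen, the code below computes the Pearson correlation between scored (0/1) responses for every pair of items and flags pairs above a cutoff. The data, the names and the 0.8 cutoff are hypothetical. Raw correlations also reflect candidate ability, so an operational analysis would likely examine residual correlations after fitting the measurement model instead.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length score lists (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def flag_correlated_pairs(responses, threshold):
    """Return (item_a, item_b, r) for item pairs whose correlation meets the threshold.

    responses: dict mapping item id to a list of 0/1 scores, one per candidate,
               with candidates in the same order for every item.
    """
    flagged = []
    for a, b in combinations(list(responses), 2):
        r = pearson(responses[a], responses[b])
        if r >= threshold:
            flagged.append((a, b, r))
    return flagged


# Hypothetical scored responses from six candidates who saw all three items.
scores = {
    "Q1": [1, 1, 0, 0, 1, 0],
    "Q2": [1, 1, 0, 0, 1, 0],
    "Q3": [1, 0, 1, 0, 0, 1],
}
print(flag_correlated_pairs(scores, threshold=0.8))  # [('Q1', 'Q2', 1.0)]
```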


38 Future Research: DIF: investigate DIF using different reference/focal groups. Enemy item management: the impact of various enemy pairs on test validity. Does one type of enemy pair have a stronger or lesser impact on test validity and measurement?

39 References: Exam publications: "Ensuring Validity of NCLEX® With Differential Item Functioning Analysis"; "Understanding the Impact of Enemy Items on Test Validity and Measurement Precision".
