
1 Dan Thompson Oklahoma State University Center for Health Science Evaluating Assessments: Utilizing ExamSoft’s item-analysis to better understand student performance and our exams

2 Session Objectives
At the completion of this session, participants will be able to:
- Identify the purpose of each item-analysis measure
- Evaluate individual exam items and overall exams using the appropriate statistics
- Improve exam items based on prior statistical performance
- Utilize exam data to construct well-balanced exams

3 Item-Analysis
Indicates:
- Quality of the exam as a whole
- Question difficulty
- The results of high and low performers
- How each answer choice performs
- The correlation of exam takers' performance on each item with their overall exam results

4 Exam Performance
KR20
- Scale of 0.00 to 1.00
- Evaluates the performance and quality of the overall exam
- As scores increase, the exam is considered more consistent and reliable
- Licensure exams are expected to maintain KR20 scores > 0.8
- For our class sizes, the goal should be KR20 scores > 0.7
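The KR20 statistic above can be sketched in a few lines of Python. This is the standard Kuder-Richardson 20 formula for dichotomously scored (1 = correct, 0 = incorrect) items; the sample data is invented for illustration, not taken from the presentation:

```python
def kr20(scores):
    """KR-20 reliability for a score matrix: rows = students, columns = items,
    entries 1 (correct) or 0 (incorrect)."""
    n_items = len(scores[0])
    totals = [sum(row) for row in scores]
    mean_total = sum(totals) / len(totals)
    # Population variance of total exam scores.
    var_total = sum((t - mean_total) ** 2 for t in totals) / len(totals)
    # Sum of p*q over items, where p = proportion correct, q = 1 - p.
    pq = 0.0
    for j in range(n_items):
        p = sum(row[j] for row in scores) / len(scores)
        pq += p * (1 - p)
    return (n_items / (n_items - 1)) * (1 - pq / var_total)
```

A perfectly consistent exam (high performers get everything right, low performers everything wrong) pushes the value toward 1.0; noisy, inconsistent items pull it down.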

5 Question Performance
Diff (p)
- Item difficulty (p-value); scale of 0.0 to 1.0
- For a general bell curve, you want these items to be in the 0.4 to 0.6 range (items in this range should have higher discriminators)
- 0.3 and below is a very difficult question
- 0.8 and above is considered a very easy question
Upper/Lower (27%)
- P-value of the upper and lower 27% of the class on that exam item
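Both measures on this slide are straightforward to compute. A minimal sketch, assuming the same 1/0 score matrix as before (rows = students, columns = items):

```python
def difficulty(item_responses):
    """Item p-value: proportion of examinees who answered correctly."""
    return sum(item_responses) / len(item_responses)

def upper_lower_p(scores, item, frac=0.27):
    """P-value of one item within the top and bottom 27% of examinees,
    ranked by total exam score."""
    ranked = sorted(scores, key=sum, reverse=True)
    n = max(1, round(len(ranked) * frac))
    p_upper = sum(row[item] for row in ranked[:n]) / n
    p_lower = sum(row[item] for row in ranked[-n:]) / n
    return p_upper, p_lower
```

The 27% cutoff is the conventional choice for upper/lower group analysis; with small classes the groups get tiny, which is worth keeping in mind when interpreting the numbers.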

6 Question Performance
Discrimination Index
- Comparative analysis of the upper and lower 27%: Disc. Index = upper 27% p-value - lower 27% p-value
- Scale of -1.00 to 1.00 (the higher the better)
- 0.3 and above are good discriminators
- 0.1 to 0.29 are fair, but may need review
- 0 = no discrimination (e.g., all exam takers selected the correct answer)
- Any negative value is considered a flawed item and should be removed or revised; it indicates the lower performers scored better than the high performers on that item
- A low disc. index is appropriate for mastery questions
- Above 0.4 realistically indicates the difference between top and bottom performers, illustrating what the higher performers know that the lower performers do not
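The Disc. Index formula above is just a subtraction once the two groups are formed. A self-contained sketch, again assuming a 1/0 score matrix:

```python
def discrimination_index(scores, item, frac=0.27):
    """Disc. Index = (item p-value in top 27%) - (item p-value in bottom 27%),
    with examinees ranked by total exam score."""
    ranked = sorted(scores, key=sum, reverse=True)
    n = max(1, round(len(ranked) * frac))
    p_upper = sum(row[item] for row in ranked[:n]) / n
    p_lower = sum(row[item] for row in ranked[-n:]) / n
    return p_upper - p_lower
```

An item every top performer gets right and every bottom performer misses yields 1.0; a negative value means the bottom group outperformed the top group on that item, the flawed-item case flagged on the slide.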

7 Question Performance
Point Biserial
- Measures the correlation between exam takers' responses on an individual question and their performance on the overall exam
- Scale of -1.00 to 1.00
- A higher biserial indicates that exam takers who performed well on the exam also performed well on that specific item, and exam takers who performed poorly on the exam also performed poorly on that item
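The point biserial is equivalent to a Pearson correlation between the dichotomous item score and the total exam score. A minimal sketch (this version correlates against the full total including the item itself; some tools report a "corrected" version that excludes the item from the total):

```python
def point_biserial(scores, item):
    """Pearson correlation between one item's 0/1 responses and total scores."""
    x = [row[item] for row in scores]      # item responses (0/1)
    y = [sum(row) for row in scores]       # total exam scores
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    sx = (sum((a - mx) ** 2 for a in x) / n) ** 0.5
    sy = (sum((b - my) ** 2 for b in y) / n) ** 0.5
    return cov / (sx * sy)
```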

8 Question Performance
Point Biserial continued:
- A negative biserial indicates negative correlation; this question should definitely be reviewed
- When the biserial is low (near 0), there is little correlation between performance on this item and on the overall exam
- Mastery items lead to low biserials, as most students answer correctly
Average Answer Time
- The time students spent answering this specific question
- Guideline: 72 seconds per question

9 Response Distribution
Distractor Analysis
- We can identify which distractors are doing what we intend
Helpful guidelines:
- Quality over quantity: use as many plausible distractors as can be written for that question
- Item-analysis is not affected much by the number of answer choices as long as there are at least three plausible options in total

10 Response Distribution
Helpful guidelines continued:
- Check the item distribution of the top and bottom 27%
- Start with items that have a high discrimination index
- This will indicate which distractors were most confusing to each group of students
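The top/bottom-27% distractor check above can be sketched as a pair of frequency tables. The input format here (one letter per examinee for a single item, plus a parallel list of total scores) is an assumption for illustration, not ExamSoft's export format:

```python
from collections import Counter

def distractor_table(responses, totals, frac=0.27):
    """Answer-choice counts for one item, split into the top and bottom 27%
    of examinees ranked by total exam score.

    responses: list of chosen options (e.g. 'A'..'D'), one per examinee
    totals:    matching list of overall exam scores
    """
    ranked = sorted(range(len(totals)), key=lambda i: totals[i], reverse=True)
    n = max(1, round(len(ranked) * frac))
    upper = Counter(responses[i] for i in ranked[:n])
    lower = Counter(responses[i] for i in ranked[-n:])
    return upper, lower
```

A distractor that draws many lower-group but few upper-group examinees is working as intended; one that draws no one at all is dead weight and a candidate for rewriting.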

11 Question Intention
It is important to factor in the intention of each question when reviewing the item-analysis.
Mastery questions should have:
- Low biserials
- Low discrimination indexes
- High p-values
Discrimination questions should have:
- High biserials
- High discrimination indexes
- Lower p-values
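These intention profiles can be turned into a rough triage rule. The cutoffs below are assumptions lifted from the thresholds quoted elsewhere in this deck (disc ≥ 0.3, biserial ≥ 0.2, p ≥ 0.8 for easy/mastery items), not an ExamSoft feature:

```python
def classify_item(p, disc, biserial):
    """Heuristic triage of one item using the deck's rough thresholds
    (assumed cutoffs, for illustration only)."""
    if biserial < 0 or disc < 0:
        return "flawed: review/remove"          # negative stats are red flags
    if p >= 0.8 and disc < 0.1:
        return "mastery-style"                  # easy, non-discriminating
    if disc >= 0.3 and biserial >= 0.2:
        return "good discriminator"
    return "fair: consider review"
```

The point of the slide stands either way: the same numbers are acceptable or alarming depending on whether the item was meant to test mastery or to discriminate.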

12 Examples (item-analysis screenshots not included in transcript): Mastery Item; Review Needed

13 Examples (item-analysis screenshots not included in transcript): Removal/Revision Needed; Discrimination Question

14 Conclusion
- KR20 > 0.8
- Discrimination Index > 0.3
- Biserial: the higher the better; hopefully > 0.2
- P-value depends on the intention of the question
- Check distractor stats to make sure they are doing what they are intended to do
- Utilize previous exam item-analysis when constructing new exams; this helps create well-balanced exams

15 References
Dewald, Aaron. An Introduction to Item Analysis. Video retrieved from http://learn.examsoft.com/video/numbers-everywhere-introduction-item-analysis
Haladyna, T. M., & Downing, S. M. (1989). Validity of a taxonomy of multiple-choice item-writing rules. Applied Measurement in Education, 2(1), 51-78.
Starks, Jason. Exam Quality Through Use of a Psychometric Analysis – A Primer.

16 Thank you!!! Questions/Comments?? Dan.thompson@okstate.edu

