Confidence-based assessment
Tony Gardner-Medwin - Physiology, UCL
- context
- confidence assessment as a study tool
- confidence assessment in exams
More info:- web site :
INTROSPECTION AND ACTIVE LEARNING IN BIOMEDICAL STUDY
Tony Gardner-Medwin
In the CRUCIFORM
The problems:
- Fewer staff, more students, less small group & practical teaching
- Rote learning: students focus on information, not understanding
- Poor introspection, concept manipulation, numeracy
Some ways computers can help:
- Confidence-based marking to develop introspection - LAPT
- Interactive simulations to develop visual intuition - LABVIEW
- Thinking in parallel - TALK (cf. DISCOURSE - see separate demo)
- Life & Times of guess-who - an illustrative QUIZ
TALK & PAGER
PAGER - Pops up messages onto students' screens on the network
TALK - Shows simultaneous student responses to the tutor/s (cf. DISCOURSE - commercial package)
PAGE - Any new version of a text file pops up on top of students' work.
WATCH - Up to 80 text messages visible simultaneously within a few secs.
NETWORK - Everyone sees all messages within a few secs.
What is Knowledge?
Knowledge depends on degree of belief, or confidence:
- knowledge (nescience = 0)
- uncertainty
- ignorance (nescience = 1)
- misconception
- delusion (nescience >> 1)
Nescience increases down this list: nescience = -log2(confidence*) for the truth of a true proposition.
Measurement of knowledge requires the eliciting of confidence (or *subjective probability) for the truth of correct statements. This requires a proper scheme of incentives.
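The nescience measure on this slide can be tried out numerically. The following is a minimal sketch: the function name and the example confidence values (0.5 for a pure true/false guess, 0.001 for a firmly held wrong belief) are illustrative assumptions, not figures from the talk.

```python
import math

def nescience(confidence: float) -> float:
    """Nescience in bits: -log2 of the confidence placed in a true proposition."""
    if not 0.0 < confidence <= 1.0:
        raise ValueError("confidence must be in (0, 1]")
    # log2(1 / c) equals -log2(c) and avoids printing "-0.00" for c = 1.
    return math.log2(1.0 / confidence)

# Illustrative (assumed) confidence values for three states of knowledge.
for label, c in [("knowledge", 1.0), ("ignorance", 0.5), ("delusion", 0.001)]:
    print(f"{label:10s} confidence={c:<6} nescience={nescience(c):.2f} bits")
```

Full confidence in a true statement gives 0 bits, a 50:50 guess gives 1 bit, and near-zero confidence in the truth gives a value much larger than 1.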
LAPT confidence-based scoring scheme
Table columns: Confidence Level, Score if Correct, Score if incorrect, P(correct), Odds
P(correct): 67%, >80%
Odds: 2:1, >4:1
[Figure: Subjective Expectation of Score against Subjective Probability (0% to 100%), one line each for C=1, C=2 and C=3]
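The incentive behind the scheme can be sketched as an expected-mark calculation. The mark values themselves are not legible in this transcript, so the table below is an assumption: 1/0, 2/-2 and 3/-6 for C=1, 2, 3. That assumption does reproduce the 67% and 80% break points (odds 2:1 and 4:1) quoted on the slide, and the -6 penalty mentioned later in the talk. The function names are illustrative.

```python
# ASSUMED mark table: confidence level -> (mark if correct, mark if incorrect).
MARKS = {1: (1, 0), 2: (2, -2), 3: (3, -6)}

def expected_mark(level: int, p_correct: float) -> float:
    """Subjective expectation of the mark at a given confidence level."""
    right, wrong = MARKS[level]
    return p_correct * right + (1.0 - p_correct) * wrong

def best_level(p_correct: float) -> int:
    """Confidence level with the highest expected mark for this probability."""
    return max(MARKS, key=lambda level: expected_mark(level, p_correct))

for p in (0.5, 0.6, 0.7, 0.75, 0.9, 1.0):
    marks = ", ".join(f"C={c}: {expected_mark(c, p):+.2f}" for c in MARKS)
    print(f"P(correct)={p:.0%}  {marks}  -> best C={best_level(p)}")
```

Under these assumed marks, a student maximises the expected score only by reporting the confidence level that matches their real subjective probability, which is what makes the scheme a proper incentive.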
- evaluation next - basic principle
UCL LAPT usage (on campus only)
Evaluation study (with K. Issroff): 136 replies (/210) after 1st yr medical course
"How useful was confidence assessment?"
[Bar chart: percentage of replies (0% to 50%) from Very Useful to Not useful at all, plus No Reply]
"How useful were the explanations?"
[Bar chart: percentage of replies (0% to 60%) from Very Useful to Not useful at all, plus No Reply]
"I think about confidence assessment"
[Bar chart: percentage of replies (0% to 50%) for Every Time, Most of the time, Rarely, Never, No reply]
"I sometimes change my answer while thinking about confidence assessment"
[Bar chart: percentage of replies on a scale from 1 (Disagree) to 5 (Agree)]
Principles that students seem readily to understand :-
- both under- and over-confidence are impediments to learning
- confident errors are far worse than acknowledged ignorance and are a wake-up call (-6!) to pay attention to explanations
- expressing uncertainty when you are uncertain is a good thing
- thinking about the basis and reliability of answers can help tie bits of knowledge together (to form understanding)
- checking an answer and rereading the question are worthwhile
- sound confidence judgement is a valued intellectual skill in every context, and one they can improve
- analysis of exam data - student evaluation
A problem with conventional scoring:
- many answers are based on partial and uncertain knowledge
- these contribute relatively little to the credit, but a lot to the variance
- this is statistically inefficient
Since we can identify the uncertain answers, we can assess the magnitude of this problem under exam conditions.
students, 500 True/False Questions
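Why uncertain answers add little credit but much variance can be seen from a single true/false answer. This is a minimal sketch assuming the answer is marked +1 if correct and -1 if incorrect (the simple-score convention used in this talk) and that chance variation is binomial. The function name and the probabilities are illustrative, not exam data.

```python
def credit_and_variance(p_correct: float):
    """Expected mark and chance variance of one answer marked +1 / -1."""
    credit = 2.0 * p_correct - 1.0   # p*(+1) + (1 - p)*(-1)
    variance = 1.0 - credit ** 2     # E[mark^2] is always 1 for marks of +1 or -1
    return credit, variance

# Illustrative probabilities: a guess, a shaky answer, a near-certain answer.
for p in (0.5, 0.6, 0.95):
    credit, variance = credit_and_variance(p)
    print(f"P(correct)={p:.0%}  expected credit={credit:.2f}  variance={variance:.2f}")
```

An answer that is 60% likely to be right earns an expected 0.2 marks but carries a variance of 0.96, whereas one that is 95% likely to be right earns 0.9 with a variance of about 0.19.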
[Figure: confidence-based score against simple score, both axes 0% to 100%. Marked points: A (50% correct), d, a, c, b. Curves: equality (only expected for a pure mix of certain knowledge and total guesses); scores if uncertainty is homogeneous and correctly reported; theoretical scores for homogeneous uncertainty, based on an information theoretic measure; y = x^1.67]
Simple scores (scaled conventional scores): scaled so that chance gives 0% and total knowledge gives 100% (equivalent to +1 for correct, -1 for incorrect, 0 for omission).
Breakdown of credit and variance due to uncertainty: 65% of the variance came from answers at C=1, but only 18% of the credit.
Confidence scores: these give less weight to uncertain answers; the variance due to uncertainty is then more in proportion to credit, and was reduced by 46% (relative to the variation of student marks).
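The scaling of the simple score can be sketched directly from the +1 / -1 / 0 rule above. The function name and the answer lists are hypothetical examples, not data from the exam.

```python
def simple_scaled_score(answers) -> float:
    """Percentage score with +1 for correct, -1 for incorrect, 0 for omission.

    answers: list of True (correct), False (incorrect) or None (omitted).
    """
    total = sum(0 if a is None else (1 if a else -1) for a in answers)
    return 100.0 * total / len(answers)

# Hypothetical 500-question true/false papers.
pure_guessing = [True] * 250 + [False] * 250
three_quarters_right = [True] * 375 + [False] * 125
print(simple_scaled_score(pure_guessing))         # chance level
print(simple_scaled_score(three_quarters_right))  # halfway between chance and full marks
```

A paper answered at chance scores 0%, and 75% correct on true/false questions scales to 50%.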
Exam marks are determined by:
1. the student's knowledge and skills in the subject area
2. the level of difficulty of the questions
3. chance factors in the way questions relate to details of the student's knowledge
4. chance factors in the way uncertainties are resolved (luck)
(1) = signal (its measurement is the object of the exam)
(3,4) = noise (random factors obscuring the signal)
Confidence-based marks improve the signal-to-noise ratio.
The most convincing test of this is to compare marks on one set of questions with marks for the same student on a different set. A good correlation means we are measuring something about the student, not just noise.
The correlation, across students, between scores on one set of questions and another is higher for confidence scores than for simple scores.
But perhaps they are just measuring ability to handle confidence? No. Confidence scores are better than simple scores at predicting even the conventional scores on a different set of questions. This can only be because they are a statistically more efficient measure of knowledge.
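The comparison between two sets of questions can be sketched as a split-half correlation. How the questions were actually divided is not stated in the transcript, so the odd/even split here is an assumption, and the marks are hypothetical.

```python
import math

def pearson(xs, ys) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def split_half_correlation(marks_by_student) -> float:
    """Correlate each student's total on odd-numbered vs even-numbered questions.

    marks_by_student: one list of per-question marks per student.
    """
    odd_totals = [sum(marks[0::2]) for marks in marks_by_student]
    even_totals = [sum(marks[1::2]) for marks in marks_by_student]
    return pearson(odd_totals, even_totals)

# Hypothetical per-question marks for four students on six questions.
marks = [
    [3, 2, 3, 3, 1, 2],
    [1, 0, -2, 1, 1, 0],
    [2, 2, 1, 3, -2, 1],
    [0, 1, 1, -2, 0, 1],
]
print(f"split-half correlation: {split_half_correlation(marks):.2f}")
```

Running the same function on the simple marks and on the confidence-based marks for the same students gives the two correlations being compared; the higher one indicates the better signal-to-noise ratio.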
How should one handle students with poor calibration?
Significantly overconfident: 2 students (1%), e.g. 50%
Significantly underconfident: 41 students (14%), e.g. 83%
Maybe one shouldn't penalise such students.
Adjusted confidence-based score: mark the set of answers at each C level as if they were entered at the C level that gives the highest score.
Mean benefit = 1.5% ± 2.1% (median 0.6%)
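The adjustment rule can be sketched as follows. The mark table (1/0, 2/-2, 3/-6) is an assumption, since the values are not legible in this transcript. The function names and the tallies are hypothetical, chosen to show an underconfident student who gets 50 of 60 answers (83%) right at C=1.

```python
# ASSUMED mark table: confidence level -> (mark if correct, mark if incorrect).
MARKS = {1: (1, 0), 2: (2, -2), 3: (3, -6)}

def total_at(level: int, n_correct: int, n_incorrect: int) -> int:
    """Total mark for a group of answers if marked at the given level."""
    right, wrong = MARKS[level]
    return n_correct * right + n_incorrect * wrong

def raw_score(tallies) -> int:
    """Score with every answer marked at the level the student entered."""
    return sum(total_at(level, nc, ni) for level, (nc, ni) in tallies.items())

def adjusted_score(tallies) -> int:
    """Score with each group re-marked at whichever level gives it the most."""
    return sum(
        max(total_at(level, nc, ni) for level in MARKS)
        for nc, ni in tallies.values()
    )

# Hypothetical tallies: entered level -> (number correct, number incorrect).
tallies = {1: (50, 10), 2: (20, 5), 3: (14, 1)}
print("raw:", raw_score(tallies), "adjusted:", adjusted_score(tallies))
```

Here the C=1 group would have earned more if entered at C=3, so the adjusted score credits it at that level. The adjusted score can never be lower than the raw score.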
[Figure: confidence-based score against simple scaled score, both axes 0% to 100%. Marked points: A (50% correct), d, a, c, b, (100% correct). Curves: equality (only expected for a pure mix of certain knowledge and total guesses); scores if uncertainty is homogeneous and correctly reported; theoretical scores for homogeneous uncertainty, based on an information theoretic measure; y = x^1.67]
Score type: simple | conf | conf (adj)
Signal / noise variance ratio:
Savings in no. of Qs required: - | 48% | 35%
SUMMARY CONCLUSIONS
- Adjusted confidence scores seem the best scores to use (they don't discriminate on the basis of the calibration of a person's confidence judgements, and are also the best predictors of performance on a separate set of questions).
- Reliable discrimination of student knowledge can be achieved with one third fewer questions, compared with conventional scoring.
- Confidence scoring is not only fundamentally fairer (rewarding students who can correctly identify which answers are uncertain) but also more efficient at measuring performance.