1 Evaluating Outcomes
Marshall (Mark) Smith, MD, PhD
Director, Simulation and Innovation, Banner Health
2 Acknowledgement and Appreciation
Today's lecture is adapted from a presentation given at the last Laerdal SUN meeting by Geoffrey T. Miller, and some of today's slides are used or modified from that presentation.
Geoffrey T. Miller
Associate Director, Research and Curriculum Development
Division of Prehospital and Emergency Healthcare
Gordon Center for Research in Medical Education
University of Miami Miller School of Medicine
3 The wisdom of a group is greater than that of a few experts…
4 Session Aims
- Discuss the importance and use of outcomes evaluation and challenges to traditional assessments
- Discuss some learning models that facilitate developing assessments
- Discuss the importance of validity, reliability and feasibility as it relates to assessment
- Discuss types of assessments and their application in healthcare education
5 Evaluation
Systematic determination of the merit, worth, and significance of something or someone, using criteria against a set of standards. (Wikipedia, 2009)
6 Assessment
Educational assessment is the process of documenting, usually in measurable terms, knowledge, skills, attitudes and beliefs. Assessment can focus on the individual learner… (Wikipedia, 2009)
7 Assessment vs Evaluation
- Assessment is about the progress and achievements of the individual learners
- Evaluation is about the learning program as a whole
(Tovey, 1997)
9 Measurement
- What is measured improves
- You can't manage what you don't measure
- You tend to improve what you measure
For years I have seen instructors who want to “teach” what they think is important, but without any concept of what is really “learned”!
10 Measurement
- Promotes learning
- Allows evaluation of individuals and learning programs
- Basis of outcomes- or competency-based education
- Documentation of competencies
11 Measurements in the Future
- Credentialing
- Privileging
- Licensure
- Board certification
- High-stakes assessments for practitioners
All involve assessment of competence
12 What are the challenges today of traditional methods of measurement/assessment for healthcare providers?
13 Challenges in traditional assessments
Using actual (sick) patients for evaluation of skills:
- Cannot predict nor schedule clinical training events
- Compromise of quality of patient care and safety
- Privacy concerns
- Patient modesty
- Cultural issues
- Prolongation of care (longer procedures, etc.)
14 Challenges in traditional assessments
Challenges with other models:
- Cadaveric tissue models
- Animal models / labs
15 Challenges in traditional assessments
- Feasibility issues for large-scale examinations
- Standardization and perceived-fairness issues in high-stakes settings
- Standardized patients (SPs) improve reliability, but validity issues exist: they cannot mimic many physical findings
16 Challenges in traditional assessments
- Wide range of clinical problems, including rare and critical events
- Availability
- Financial cost
- Adequate resources
- Reliability, validity, feasibility
17 Kirkpatrick's Four Levels of Evaluation
Reaction, Learning, Performance, Results (Kirkpatrick, 1994)
18 Kirkpatrick's Four Levels of Evaluation
1. Reaction
- Measures only one thing: the learner's perception
- Not indicative of any skills or performance
- A positive reaction is critical to the success of the program
- Relevance to the learner is important
19 Kirkpatrick's Four Levels of Evaluation
2. Learning
- This is where the learner changes
- Requires pre- and post-testing
- Evaluation at this step is through learner assessment
- First level to measure change in the learner!
20 Kirkpatrick's Four Levels of Evaluation
3. Performance (Behavior)
- The action that is performed; the consequence of behavior is performance
- Traditionally involves measurement in the workplace
- Transfer of learning from classroom to work environment
21 Kirkpatrick's Four Levels of Evaluation
4. Results
- Clinical and quality outcomes
- Difficult to measure in healthcare; perhaps easier in team training
- Often the return on investment (ROI) that management wants
22 Kirkpatrick's Four Levels of Evaluation
Moving from Reaction through Learning and Performance to Results:
- Increasing complexity
- Increasing difficulty to measure; more time-consuming
- Increasing value!
- The first three levels yield soft measures; sometimes management wants soft data, e.g., for retention
- The second level is where knowledge, skills, and attitudes (KSA) are really measured
24 How do we start to develop outcome measurements?
27 Development of Curricula (ADDIE)
- Analysis: clearly define and clarify desired outcomes*
- Design
- Development
- Implementation
- Evaluation
28 Defining Assessments
- Outcomes are general; objectives are specific and support outcomes
- If objectives are clearly defined and written, questions and assessments nearly write themselves
29 Defining Outcomes
- Learners are more likely to achieve competency and mastery of skills if the outcomes are well defined and appropriate for the level of skill training
- Define clear benchmarks for learners to achieve
- Clear goals with tangible, measurable objectives
- Start with the end goal and the assessment metrics in mind; the content will then begin to develop itself
30 Role of Assessment in Curricula Design
A continuous cycle: design the course → teaching and learning → assess learners → assessment and evaluation → refine learner and course outcomes → modify curricula/assessments
31 Use of assessments in healthcare simulation
Information, Demonstration, Practice, Feedback, Remediation, Measurement, Diagnosis
Rosen MA, et al. Measuring Team Performance in Simulation-Based Training: Adopting Best Practices for Healthcare. Simulation in Healthcare. 2008;3:33–41.
32 Preparing assessments
What should be assessed?
- Any part of the curriculum considered essential and/or given significant designated teaching time
- Should be consistent with the learning outcomes established as the competencies for learners
- Consider weighted assessments
33 Clinical competence and performance
- Competent performance requires acquisition of basic knowledge, skills & attitudes
- Competence = application of specific KSAs
- Performance = translation of competence into action
34 Three Types of Learning (Learning Domains)
Bloom's Taxonomy:
- Cognitive: mental skills (Knowledge)
- Psychomotor: manual or physical skills (Skills)
- Affective: growth in feelings or emotional areas (Attitude)
Developed in 1956 by a committee of college examiners led by Benjamin Bloom
35 Three Types of Learning
Bloom's Taxonomy:
- Cognitive = Knowledge (K)
- Psychomotor = Skills (S)
- Affective = Attitude (A)
48 Possible Outcome Competencies (GME Based)
- Patient care
- Medical knowledge
- Practice-based learning and improvement
- Interpersonal and communication skills
- Professionalism
- Systems-based practice
Each spans knowledge, skills, and attitudes
49 So we know what we want to measure, but how do we do that?
52 Miller’s Pyramid
- The top two cells of the pyramid, in the domain of action or performance, reflect clinical reality
- The professionalism and motivation required to continuously apply these in the real setting must be observed during actual patient care
53 Miller’s Pyramid
- The top two levels are the most difficult to measure
- Quality of assessment in the clinical setting lags far behind
- In-training evaluation reports and Likert scales have little value as formative (feedback) instruments that might contribute to the learner's education
54 Miller’s Pyramid of competence for learning and assessment
55 Miller’s Pyramid of Competence
From top to bottom: Does, Shows (behavior); Knows How, Knows (cognition)
Miller GE. The Assessment of Clinical Skills/Competence/Performance. Academic Medicine. 1990;65(9):S63–S67.
56 Teaching and Learning - “Knows”
Learning opportunities:
- Reading / independent study
- Lecture
- Computer-based
- Colleagues / peers
57 Assessment of “Knows”
Factual tests
59 Teaching and Learning - “Knows How”
Learning opportunities:
- Problem-based exercises
- Tabletop exercises
- Direct observation
- Mentors
60 Assessment of “Knows How”
Clinical context-based tests
61 The Tools of “Knows How”
- Multiple-choice question
- Essay
- Short answer
- Oral interview
62 Teaching and Learning - “Shows”
Learning opportunities:
- Skill-based exercises
- Repetitive practice
- Small group
- Role playing
63 Assessment of “Shows”
Performance assessment
64 The Tools of “Shows”
- Objective Structured Clinical Examination (OSCE)
- Standardized patient-based
65 Variables in Clinical Assessment
Examiner, student, and patient all vary around the clinical assessment.
Control variables as much as possible…
66 Teaching and Learning - “Does”
Learning opportunity:
- Experience
67 Assessment of “Does”
Performance assessment
68 The Tools of “Does”
- Undercover / stealth / incognito standardized patient-based
- Videos of performance
- Portfolio of learner
- Patient satisfaction surveys
69 Influences on clinical performance
Cambridge Model for delineating performance and competence: what a clinician actually does (“Does”) is built on competence, shaped by system-related and individual-related influences.
Rethans JJ, et al. The relationship between competence and performance: implications for assessing practice performance. Medical Education. 2002;36:901–909.
72 Assessments - Reliability
Does the test consistently measure what it is supposed to measure?
Types of reliability:
- Inter-rater (consistency over raters)
- Test-retest (consistency over time)
- Internal consistency (over different items/forms)
73 Inter-rater Reliability
- Multiple judges code independently using the same criteria
- Reliability = raters code the same observations into the same classification
Examples:
- Medical record reviews
- Clinical skills
- Oral examinations
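Inter-rater agreement of this kind is commonly summarized with percent agreement and a chance-corrected statistic such as Cohen's kappa. Below is a minimal Python sketch; the two examiners' checklist codings are hypothetical illustrations, not data from any study.

```python
# Minimal sketch: Cohen's kappa for two raters who independently coded
# the same checklist items. Ratings are hypothetical.
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters over the same items."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance if each rater coded at their own base rates.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Two examiners scoring the same 8 checklist items (Y = done, N = not done).
examiner_1 = ["Y", "Y", "N", "Y", "Y", "N", "Y", "Y"]
examiner_2 = ["Y", "Y", "N", "Y", "N", "N", "Y", "Y"]
print(f"kappa = {cohens_kappa(examiner_1, examiner_2):.2f}")  # kappa = 0.71
```

A kappa near 1 means the examiners classify observations the same way well beyond chance; much lower values suggest the criteria or rater training need work.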
74 Factors Influencing Reliability
- Test length: longer tests give more reliable scores
- Group homogeneity: the more heterogeneous the group, the higher the reliability
- Objectivity of scoring: the more objective the scoring, the higher the reliability
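The test-length effect in the first bullet is often estimated with the Spearman-Brown prophecy formula, which predicts reliability when a test is lengthened by a factor k, assuming the added items are comparable to the existing ones. A short sketch with hypothetical numbers:

```python
# Spearman-Brown prophecy formula: predicted reliability of a test
# lengthened by factor k, assuming added items match the originals.
def spearman_brown(reliability, k):
    return (k * reliability) / (1 + (k - 1) * reliability)

# Doubling a test whose current reliability is 0.70:
print(f"{spearman_brown(0.70, 2):.2f}")  # 0.82
```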
75 Assessments - Validity
Are we measuring what we are supposed to be measuring?
- Use the appropriate instrument for the knowledge, skill, or attitude you are testing
- The major types of validity should be considered: face, content, construct
76 Validity is accuracy
- Both archers are equally reliable: Archer 1 hits the bull's-eye every time; Archer 2 hits the same spot on the outer ring every time
- Validity = quality of the archer's hits
- The problem in educational measurement is that we can seldom “see” the target; attitudes, knowledge, and professionalism are “invisible targets,” and validity claims are based on inferences and extrapolations from indirect evidence that we have hit the target
77 Reliability and Validity
Three possibilities: reliable and valid; reliable but not valid; neither reliable nor valid
78 Improving reliability and validity
- Base assessment on outcomes/objectives: event triggers, observable behaviors, behavioral ratings
- Define low, medium, and high performance
- Use a rubric or rating metric
- Use (video) training examples of performance
- Employ a quality assurance/improvement system
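One way to make the low/medium/high definitions concrete is a behaviorally anchored rubric. The sketch below is a hypothetical illustration; the behaviors, anchors, and point values are assumptions, not a validated instrument.

```python
# Minimal sketch of a behaviorally anchored rubric: each observable behavior
# has defined low/medium/high anchors that raters score against.
RUBRIC = {
    "Calls for help": {
        "low": "Does not call for help",
        "medium": "Calls for help after prompting",
        "high": "Calls for help promptly and unprompted",
    },
    "Closed-loop communication": {
        "low": "Orders given without acknowledgment",
        "medium": "Some orders acknowledged",
        "high": "All orders acknowledged and repeated back",
    },
}
LEVEL_POINTS = {"low": 1, "medium": 2, "high": 3}

def rubric_score(observations):
    """Sum points over rated behaviors, e.g., {'Calls for help': 'high'}."""
    return sum(LEVEL_POINTS[level] for level in observations.values())

print(rubric_score({"Calls for help": "high",
                    "Closed-loop communication": "medium"}))  # 5
```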
79 Assessment types
Choose the appropriate assessment method:
- Formative
- Summative
- Self
- Peer
80 Assessment
Formative assessment:
- Lower stakes
- One of several, over the time of the course or program
- May be evaluative, diagnostic, or prescriptive
- Often results in remediation or progression to the next level
Summative assessment:
- Higher stakes
- Generally at the end of the course or program
- Primary purpose is performance measurement
- Often results in a “go / no-go” outcome
81 Assessments - Self
- Encourages responsibility for the learning process and fosters skill in judging whether work is of an acceptable standard; it improves performance
- Most forms of assessment can be adapted to a self-assessment format (MCQs, OSCEs, and short answers)
- Students must be aware of the standards required for competent performance
82 Assessments - Peer
- Enables learners to hone their ability to work with others and their professional insight
- Enables faculty to obtain a view of students they do not see
- An important part of peer assessment is for students to justify the marks they award to others
- Justification can also be used as a component when faculty evaluate attitudes and professionalism
83 Assessments – Setting Standards
- Standards should be set to determine competence
- Enables certification to be documented, accountable and defensible
- Appropriately set standards for an assessment will pass those students who are truly competent
- Standards should be neither too low, passing students who are incompetent (false positives), nor too high, failing students who are competent (false negatives)
84 Assessments – Setting Standards
Those responsible for setting standards must also have a direct role in teaching students at the level being examined and assist in providing examination material
85 Assessments – Setting Standards
Standards should be set around a core curriculum that includes the knowledge, skills and attitudes required of all students. When setting a standard, consider:
- What is assessed must reflect the core curriculum
- Students should be expected to reach a high standard in the core components of the curriculum (for instance, an 80-90% pass mark for the important core and 60-80% for the less important aspects)
- Students should be required to demonstrate mastery of the core in one phase of the curriculum before moving on to the next part
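A two-tier standard like the one sketched in the percentages above can be expressed directly as a pass rule: the core and non-core components each have their own cut score, and mastery of the core cannot be offset by strength elsewhere. The thresholds below are hypothetical choices within the stated ranges:

```python
# Minimal sketch of a two-tier pass standard: separate cut scores for core
# and non-core components. Thresholds are hypothetical, within the ranges above.
CORE_PASS = 0.85      # within the suggested 80-90% band for core components
NON_CORE_PASS = 0.70  # within the suggested 60-80% band for other components

def passes(core_score, non_core_score):
    """Both cut scores must be met; core mastery cannot be traded away."""
    return core_score >= CORE_PASS and non_core_score >= NON_CORE_PASS

print(passes(core_score=0.90, non_core_score=0.65))  # False: non-core too low
print(passes(core_score=0.88, non_core_score=0.75))  # True
```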
86 Assessments - Feasibility
Is the administration and taking of the assessment instrument feasible in terms of time and resources?
The following questions should be considered:
- How long will it take to construct the instrument?
- How much time will be involved with the scoring process?
- Will it be relatively easy to interpret the scores and produce the results?
- Is it practical in terms of organization?
- Can quality feedback result from the instrument?
- Will the instrument indicate to the students the important elements within the course?
- Will the assessment have a beneficial effect in terms of student motivation, good study habits and positive career aspirations?
87 Practicality
- Number of students to be assessed
- Time available for the assessment
- Number of staff available
- Resources/equipment available
- Special accommodations
89 Assessments - Instruments
- Be aware of the types of assessment instruments available as well as the advantages and disadvantages of each
- Use more than one assessment instrument, and more than one assessor if possible, when looking at skills and attitudes
90 Choosing appropriate assessment methods
When choosing the assessment instrument, the following should be answered:
- Is it valid?
- Is it reliable?
- Is it feasible?
95 Assessment Metrics
- Procedural or checklist assessment
- Global rating assessment
96 Assessment Metrics
Procedural or checklist assessment, e.g., BCLS:
- Open airway (< 5 sec of LOC): Y / N
- Check breathing (< 5 sec of airway): Y / N
Rating score: +1 / -1 per item (* = assisted)
97 Assessment Metrics
Global rating assessment, e.g., Code Blue (pass/fail):
- CPR: rated 1 (low) to 5 (high) points
- ACLS: rated 1 (low) to 5 (high) points
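The two metric types can be contrasted in a few lines of code. This is a minimal sketch; the checklist items, the 1-5 anchors, and the pass cut-off of 3 are hypothetical illustrations, not the BCLS or Code Blue instruments themselves:

```python
# Minimal sketch contrasting a Y/N checklist score with a global rating.
# Items, anchors, and the cut-off are hypothetical examples.
CHECKLIST = {
    "Open airway (< 5 sec of LOC)": True,
    "Check breathing (< 5 sec of airway)": True,
    "Begin compressions": False,
}

def checklist_score(items):
    """Fraction of checklist items performed (each item scored Y/N)."""
    return sum(items.values()) / len(items)

def global_rating_pass(cpr_rating, acls_rating, cutoff=3):
    """Each dimension rated 1 (low) to 5 (high); pass if all meet the cutoff."""
    return cpr_rating >= cutoff and acls_rating >= cutoff

print(f"checklist: {checklist_score(CHECKLIST):.0%}")                     # 67%
print("global rating:", "Pass" if global_rating_pass(4, 3) else "Fail")  # Pass
```

Checklists reward completeness and tend to be easy to score consistently; global ratings let expert judges capture quality and integration that a binary list can miss.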
98 Review
- Assessment drives learning
- Clearly define the desired outcome; ensure that it can be measured
- Consider the effectiveness of the measurement
- Feedback to individual candidates
- Feedback to training programs
101 Competency
Competency is noted when a learner is observed performing a task or function that has been established as a standard by the profession. The achievement of professional competency requires the articulation of learning objectives as observable, measurable outcomes for a specific level of learner performance. Such specific detailing of performance expectations defines educational competencies. They are verified on the basis of evidence documenting learner achievement, and must be clearly communicated to learners, faculty, and institutional leaders prior to assessment.
Identified by members of the work group on competency-based women's health education at the APGO Interdisciplinary Women's Health Education Conference, September 1996, Chantilly, VA.