Evaluating Outcomes Marshall (Mark) Smith, MD, PhD Director, Simulation and Innovation Banner Health
Acknowledgement and Appreciation Today’s lecture is adapted from a presentation given by Geoffrey T. Miller at the last Laerdal SUN meeting, and some of today’s slides are used or modified from that presentation. Geoffrey T. Miller Associate Director, Research and Curriculum Development Division of Prehospital and Emergency Healthcare Gordon Center for Research in Medical Education University of Miami Miller School of Medicine
The wisdom of a group is greater than that of a few experts…
Session Aims Discuss the importance and use of outcomes evaluation and the challenges to traditional assessments Discuss some learning models that facilitate developing assessments Discuss the importance of validity, reliability and feasibility as they relate to assessment Discuss types of assessments and their application in healthcare education
Evaluation Systematic determination of merit, worth, and significance of something or someone using criteria against a set of standards. – Wikipedia, 2009
Assessment Educational assessment is the process of documenting, usually in measurable terms, knowledge, skills, attitudes and beliefs. Assessment can focus on the individual learner… – Wikipedia, 2009
Assessment vs Evaluation Assessment is about the progress and achievements of the individual learners Evaluation is about the learning program as a whole Tovey, 1997
So why measure anything? Why not just teach?
Measurement What is measured improves You can’t manage what you don’t measure You tend to improve what you measure
Measurement Promotes learning Allows evaluation of individuals and learning programs Basis of outcomes- or competency-based education Documentation of competencies
Measurements in the Future Credentialing Privileging Licensure Board certification High-stakes assessments for practitioners All involve assessment of competence
What are the challenges today of traditional methods of measurement/assessment for healthcare providers?
Challenges in traditional assessments Using actual (sick) patients for evaluation of skills – Cannot predict nor schedule clinical training events – Compromise of quality of patient care, safety – Privacy concerns – Patient modesty – Cultural issues – Prolongation of care (longer procedures, etc)
Challenges in traditional assessments Challenges with other models – Cadaveric tissue models – Animal models / labs
Challenges in traditional assessments Feasibility issues for large-scale examinations Standardization and perceived-fairness issues in high-stakes settings Standardized patients (SPs) improve reliability, but validity issues remain: they cannot mimic many physical findings
Challenges in traditional assessments Wide range of clinical problems, including rare and critical events Availability Financial cost Adequate resources Reliability, validity, feasibility
Kirkpatrick's Four Levels of Evaluation Reaction Learning Performance Results
Kirkpatrick's Four Levels of Evaluation 1. Reaction Measures only one thing: the learners’ perception Not indicative of any skills or performance Success at this level is critical to the success of the program Relevance to the learner is important
Kirkpatrick's Four Levels of Evaluation 2. Learning This is where the learner changes Requires pre- and post-testing Evaluation at this step is through learner assessment The first level to measure change in the learner!
Kirkpatrick's Four Levels of Evaluation 3. Performance (Behavior) The action that is performed; performance is the consequence of behavior Traditionally involves measurement in the workplace Transfer of learning from the classroom to the work environment
Kirkpatrick's Four Levels of Evaluation 4. Results Clinical and quality outcomes Difficult to measure in healthcare Perhaps better in team training Often ROI that management wants
Kirkpatrick's Four Levels of Evaluation Reaction Learning Performance Results Moving up the levels: 1. Increasing complexity 2. Increasing difficulty and time to measure 3. Increasing value!
How do we start to develop outcome measurements?
Development of Curricula ADDIE: Analysis (clearly define and clarify desired outcomes), Design, Development, Implementation, Evaluation
Defining Assessments Outcomes are general, objectives are specific and support outcomes If objectives are clearly defined and written, questions and assessments nearly write themselves
Defining Outcomes Learners are more likely to achieve competency and mastery of skills if the outcomes are well defined and appropriate for the level of skill training Define clear benchmarks for learners to achieve Clear goals with tangible, measurable objectives Start with the end goal and the assessment metrics in mind, and the content will begin to develop itself
Role of Assessment in Curricula Design
Use of assessments in healthcare simulation Information Demonstration Practice Feedback Remediation Measurement Diagnosis Rosen MA, et al. Measuring Team Performance in Simulation-Based Training: Adopting Best Practices for Healthcare. Simulation in Healthcare 2008;3:33–41.
Preparing assessments What should be assessed? – Any part of curriculum considered essential and/or has significant designated teaching time – Should be consistent with learning outcomes that are established as the competencies for learners – Consider weighted assessments
Clinical competence and performance Competent performance requires acquisition of basic knowledge, skills & attitudes Competence = application of specific KSAs Performance = translation of competence into action
Three Types of Learning (Learning Domains) Bloom's Taxonomy Cognitive: mental skills (Knowledge) Psychomotor: manual or physical skills (Skills) Affective: growth in feelings or emotional areas (Attitude)
Three Types of Learning Bloom's Taxonomy Cognitive = Knowledge (K) Psychomotor = Skills (S) Affective = Attitude (A)
Bloom’s Taxonomy – Knowledge
The Anti-Blooms…
Bloom’s Taxonomy – Skills Bloom’s committee did not publish a model of the psychomotor domain, but others have since.
Bloom’s Taxonomy – Attitude Five Major Categories Receiving phenomena Responding to phenomena Valuing Organization Internalizing values
Possible Outcome Competencies (GME-based) Patient care Medical knowledge Practice-based learning and improvement Interpersonal and communication skills Professionalism Systems-based practice (spanning Knowledge, Skills, and Attitudes)
So we know what we want to measure, but how do we do that?
Miller’s Pyramid The top two levels of the pyramid, in the domains of action, or performance, reflect clinical reality The professionalism and motivation required to continuously apply these in the real setting must be observed during actual patient care.
Miller’s Pyramid The top two levels are the most difficult to measure The quality of assessment in the clinical setting lags far behind In-training evaluation reports and Likert scales have little value as formative, or feedback, instruments that might contribute to the learner’s education
Miller’s Pyramid of competence for learning and assessment
Miller’s Pyramid of Competence Levels, from base to apex: Knows, Knows How, Shows, Does (spanning cognition at the base to behavior at the apex). Miller GE. The Assessment of Clinical Skills/Competence/Performance. Academic Medicine 1990;65(9):S63–S67.
Teaching and Learning – “Knows” Learning opportunities: Reading / independent study, Lecture, Computer-based, Colleagues / peers
Assessment of “Knows” Factual tests
Assessment Tools for “Knows” Multiple Choice Questions (MCQs) Short Answer True / False Matching (extended) Constructed Response Questions
Teaching and Learning – “Knows How” Learning opportunities: Problem-based (e.g., tabletop exercises), Direct observation, Mentors
Assessment of “Knows How” Clinical context-based tests
The Tools of “Knows How” Multiple-choice questions Essay Short answer Oral interview
Teaching and Learning – “Shows” Learning opportunities: Skill-based exercises, Repetitive practice, Small group role playing
Assessment of “Shows” Performance assessment
The Tools of “Shows” Objective Structured Clinical Examination (OSCE) Standardized patient-based assessment
Variables in Clinical Assessment Examiner, Patient, Student CONTROL VARIABLES as much as possible…
Teaching and Learning – “Does” Learning opportunity: Experience
Assessment of “Does” Performance assessment
The Tools of “Does” Undercover / stealth / incognito standardized patient-based encounters Videos of performance Portfolio of learner Patient satisfaction surveys
Influences on clinical performance Cambridge Model for delineating performance and competence: “does” (performance) builds on competence, shaped by system-related and individual-related influences. Rethans JJ, et al. The relationship between competence and performance: implications for assessing practice performance. Medical Education 2002;36:901–909.
Reliability and Validation
Assessments - Reliability Does the test produce consistent, reproducible results? – Types of reliability: Inter-rater (consistency across raters) Test-retest (consistency over time) Internal consistency (consistency across different items/forms)
Inter-rater Reliability Multiple judges code independently using the same criteria Reliability = raters code same observations into same classification Examples Medical record reviews Clinical skills Oral examinations
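To make the "raters code the same observations into the same classification" idea concrete, here is a minimal Python sketch of Cohen's kappa, a standard chance-corrected agreement statistic for two raters. The statistic itself is well established, but the function, variable names, and example data below are hypothetical illustrations, not material from this presentation.

```python
# Minimal sketch: Cohen's kappa for two raters scoring the same items.
# Example data are hypothetical (1 = item performed, 0 = not performed).
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters over the same items."""
    n = len(rater_a)
    # Observed agreement: proportion of items both raters scored identically.
    p_obs = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected chance agreement, from each rater's marginal frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_exp = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    return (p_obs - p_exp) / (1 - p_exp)

# Two examiners rating the same ten checklist items:
examiner_1 = [1, 1, 0, 1, 1, 0, 1, 1, 1, 0]
examiner_2 = [1, 1, 0, 1, 0, 0, 1, 1, 1, 1]
print(f"kappa = {cohens_kappa(examiner_1, examiner_2):.2f}")  # kappa = 0.47
```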
Factors Influencing Reliability Test length: longer tests give more reliable scores (see the Spearman-Brown sketch below) Group heterogeneity: the more heterogeneous the group, the higher the reliability Objectivity of scoring: the more objective the scoring, the higher the reliability
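The test-length point has a classical quantitative form, the Spearman-Brown prophecy formula. A small sketch follows; the formula is standard, but the example reliability and lengthening factor are hypothetical.

```python
# Spearman-Brown prophecy formula: predicted reliability when a test is
# lengthened by a factor k (k = 2 means doubling the number of items).
def spearman_brown(reliability, k):
    return (k * reliability) / (1 + (k - 1) * reliability)

# Hypothetical example: a 20-item test with reliability 0.60, doubled to 40 items:
print(f"{spearman_brown(0.60, 2):.2f}")  # 0.75
```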
Assessments - Validity Are we measuring what we are supposed to be measuring? Use the appropriate instrument for the knowledge, skill, or attitude you are testing The major types of validity should be considered – Face – Content – Construct
Reliability and validity, by analogy: Archer 1 hits the bullseye every time; Archer 2 hits the same spot on the outer ring every time. Both archers are equally reliable; validity is the quality (accuracy) of the archer’s hits, so only Archer 1 is valid. Validity is accuracy.
Reliability and Validity Reliable and valid Reliable, not valid Not reliable, not valid
Improving reliability and validity Base assessment on a chain: outcome/objectives → event triggers → observable behavior → behavioral rating → assessment (see the rubric sketch below) Define: – Low / medium / high performance – Use of a rubric or rating metric – Use of (video) training examples of performance – Employ a quality assurance/improvement system
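As an illustration of the outcome → trigger → observable behavior → rating chain, here is a sketch of one behaviorally anchored rubric entry. All wording in it is a hypothetical example, not content from the presentation.

```python
# Hypothetical rubric entry following the chain described above:
# outcome/objective -> event trigger -> observable behavior -> rating anchors.
rubric_item = {
    "objective": "Recognize and respond to patient deterioration",
    "event_trigger": "Simulated patient becomes unresponsive",
    "observable_behavior": "Calls for help and begins assessment",
    "ratings": {
        "low": "Does not recognize the event or act within 60 seconds",
        "medium": "Recognizes the event; actions are delayed or incomplete",
        "high": "Recognizes the event and responds promptly and completely",
    },
}
```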
Assessment Formative Assessment – Lower stakes – One of several, over time of course or program – May be evaluative, diagnostic, or prescriptive – Often results in remediation or progression to next level Summative Assessment – Higher stakes – Generally final of course or program – Primary purpose is performance measurement – Often results in a “Go, No-Go” outcome
Assessments - self Encourages responsibility for the learning process and fosters skill in judging whether work is of an acceptable standard – it improves performance. Most forms of assessment can be adapted to a self-assessment format (MCQs, OSCEs, and short answers) Students must be aware of the standards required for competent performance.
Assessments - peer Enables learners to hone their ability to work with others and their professional insight Enables faculty to obtain a view of students they do not see An important part of peer assessment is for students to justify the marks they award to others Justification can also be used as a component when faculty evaluate attitudes and professionalism.
Assessments – Setting Standards Should be set to determine competence Enables certification to be documented, accountable and defensible Appropriately set standards for an assessment will pass those students who are truly competent Standards should be neither too low (false positives: passing those who are incompetent) nor too high (false negatives: failing those who are competent).
Assessments – Setting Standards Those responsible for setting standards must also have a direct role in teaching students at the level being examined and assist in providing examination material
Assessments – Setting Standards Standards should be set around a core curriculum that includes the knowledge, skills and attitudes required of all students When setting a standard the following should be considered: – What is assessed must reflect the core curriculum – Students should be expected to reach a high standard in the core components of the curriculum (for instance, an 80–90% pass mark for the important core and 60–80% for the less important aspects; a sketch follows below) – Students should be required to demonstrate mastery of the core in one phase of the curriculum before moving on to the next part of the curriculum
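A minimal sketch of the two-tier standard just described, in which a learner must meet the core cut score and the less-important-content cut score separately. The cut scores sit inside the slide's illustrative 80–90% and 60–80% ranges; the function name and example numbers are otherwise hypothetical.

```python
# Two-tier pass standard: mastery of the core cannot be offset by
# strong performance on less important material.
def meets_standard(core_pct, noncore_pct, core_cut=85.0, noncore_cut=70.0):
    return core_pct >= core_cut and noncore_pct >= noncore_cut

print(meets_standard(88.0, 72.0))  # True: both standards met
print(meets_standard(78.0, 95.0))  # False: core standard not met
```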
Assessments - Feasibility Is the administration and taking of the assessment instrument feasible in terms of time and resources? The following questions should be considered: – How long will it take to construct the instrument? – How much time will be involved with the scoring process? – Will it be relatively easy to interpret the scores and produce the results? – Is it practical in terms of organization? – Can quality feedback result from the instrument? – Will the instrument indicate to the students the important elements within the course? – Will the assessment have a beneficial effect in terms of student motivation, good study habits and positive career aspirations?
Practicality Number of students to be assessed Time available for the assessment Number of staff available Resources/equipment available Special accommodations
Assessments - Instruments Be aware of the types of assessment instruments available as well as the advantages and disadvantages of each Use more than one assessment instrument and more than one assessor if possible when looking at skills and attitudes
Choosing appropriate assessment methods When choosing the assessment instrument, the following should be answered: – Is it valid? – Is it reliable? – Is it feasible?
Assessment Metrics Procedural or Check List assessment Global Rating assessment
Assessment Metrics Procedural or checklist assessment (example): a BCLS checklist with items such as “Open airway (< 5 sec of loss of consciousness)” and “Check breathing (< 5 sec of opening airway)”, each marked Y/N, with a rating score of +1 if performed and 0 if not (*assisted performance flagged separately).
Assessment Metrics Global rating assessment (example): a Code Blue scenario covering CPR and ACLS, rated holistically with a points-based rating score rather than item-by-item checks. A sketch contrasting the two metric types follows below.
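To contrast the two metric types, here is a short sketch placing a binary procedural checklist score next to a single anchored global rating. The items, anchors, and scores are hypothetical examples, not the presentation's actual instruments.

```python
# Procedural checklist: each observed behavior scores +1 if performed
# within its criterion, 0 if not. Items below are hypothetical.
checklist = {
    "Opens airway (< 5 sec of loss of consciousness)": 1,
    "Checks breathing (< 5 sec of opening airway)": 1,
    "Starts chest compressions": 0,
}
checklist_score = sum(checklist.values())
print(f"checklist: {checklist_score}/{len(checklist)}")  # checklist: 2/3

# Global rating: one holistic judgment on an anchored scale,
# e.g. 1 = unsafe ... 5 = performs independently.
global_rating = 4
print(f"global rating: {global_rating}/5")
```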
Review Assessment drives learning Clearly define the desired outcome, ensure that it can be measured Consider the effectiveness of the measurement Feedback to individual candidates Feedback to training programs
Questions and discussion
Competency Competency is noted when a learner is observed performing a task or function that has been established as a standard by the profession. The achievement of professional competency requires the articulation of learning objectives as observable, measurable outcomes for a specific level of learner performance. Such specific detailing of performance expectations defines educational competencies. They are verified on the basis of evidence documenting learner achievement, and must be clearly communicated to learners, faculty, and institutional leaders prior to assessment. ____________ Identified by members of the work group on competency-based women’s health education at the APGO Interdisciplinary Women’s Health Education Conference, September 1996, Chantilly, VA.