Evaluating Outcomes Marshall (Mark) Smith, MD, PhD

1 Evaluating Outcomes Marshall (Mark) Smith, MD, PhD
Director, Simulation and Innovation Banner Health

2 Acknowledgement and Appreciation
Today's lecture is adapted from a presentation given at the last Laerdal SUN meeting by Geoffrey T. Miller, and some of today's slides are used or modified from that presentation. Geoffrey T. Miller, Associate Director, Research and Curriculum Development, Division of Prehospital and Emergency Healthcare, Gordon Center for Research in Medical Education, University of Miami Miller School of Medicine

3 The wisdom of a group is greater than that of a few experts…

4 Session Aims Discuss the importance and use of outcomes evaluation and challenges to traditional assessments Discuss some learning models that facilitate developing assessments Discuss the importance of validity, reliability and feasibility as it relates to assessment Discuss types of assessments and their application in healthcare education

5 Evaluation Systematic determination of merit, worth, and significance of something or someone using criteria against a set of standards. Wikipedia, 2009

6 Assessment Educational Assessment is the process of documenting, usually in measurable terms, knowledge, skills, attitudes and beliefs. Assessment can focus on the individual learner,…… Wikipedia, 2009

7 Assessment vs Evaluation
Assessment is about the progress and achievements of the individual learners Evaluation is about the learning program as a whole Tovey, 1997

8 So why measure anything?
Why not just teach?

9 Measurement What is measured, improves
You Can't Manage What You Don't Measure You tend to improve what you measure For years I have seen instructors that want to “teach” what they think is important, but without any concept of what is really “learned”!

10 Measurement Promotes learning
Allows evaluation of individuals and learning programs Basis of outcomes- or competency-based education Documentation of competencies

11 Measurements in Future
Credentialing Privileging Licensure Board certification High stake assessments for practitioners All involve assessment of competence

12 What are the challenges today of traditional methods of measurement/assessment for healthcare providers?

13 Challenges in traditional assessments
Using actual (sick) patients for evaluation of skills Cannot predict nor schedule clinical training events Compromise of quality of patient care, safety Privacy concerns Patient modesty Cultural issues Prolongation of care (longer procedures, etc)

14 Challenges in traditional assessments
Challenges with other models Cadaveric tissue models Animal models / labs

15 Challenges in traditional assessments
Feasibility issues for large-scale examinations Standardized, perceived fairness issues in high-stakes settings Standardized patients (SPs) improve reliability, but validity issues exist: cannot mimic many physical findings

16 Challenges in traditional assessments
Wide range of clinical problems, including rare and critical events Availability Financial cost Adequate resources Reliability, validity, feasibility

17 Kirkpatrick's Four Levels of Evaluation
Reaction Learning Performance Results 1994 reaction

18 Kirkpatrick's Four Levels of Evaluation
1. Reaction Measures only one thing – the learner's perception Not indicative of any skills, performance Success is critical to success of program Relevance to learner important 1994 reaction

19 Kirkpatrick's Four Levels of Evaluation
2. Learning This is where learner changes Requires pre and post testing Evaluation at this step is through learner assessment First level to measure change in learner! 1994 reaction

20 Kirkpatrick's Four Levels of Evaluation
3. Performance (Behavior) Action that is performed Consequence of behavior is performance Traditionally involves measurement in the workplace Transfer of learning from classroom to work environment 1994 reaction

21 Kirkpatrick's Four Levels of Evaluation
4. Results Clinical and quality outcomes Difficult to measure in healthcare Perhaps better in team training Often ROI that management wants 1994 reaction

22 Kirkpatrick's Four Levels of Evaluation
Reaction Learning Performance Results Increasing complexity Increasing difficulty to measure, time consuming Increasing value! First three measures soft Sometimes management wants soft data…for retention Second level really measures KSA

23 Kirkpatrick's Four Levels of Evaluation
Reaction Learning Performance Results 1994 reaction

24 How do we start to develop outcome measurements


26 Learning Outcomes Goals Course Objectives Projected Outcomes
Expected Outcomes

27 Development of Curricula
Analysis Clearly define and clarify desired outcomes* Design Development Implementation Evaluation ADDIE

28 Defining Assessments Outcomes are general, objectives are specific and support outcomes If objectives are clearly defined and written, questions and assessments nearly write themselves

29 Defining Outcomes Learners are more likely to achieve competency and mastery of skills if the outcomes are well defined and appropriate for the level of skill training Define clear benchmarks for learners to achieve Clear goals with tangible, measurable objectives Start with the end-goal in mind and the assessment metrics, then the content will begin to develop itself

30 Role of Assessment in Curricula Design
Course Teaching and learning Assessment and evaluation Refine Learner and Course Outcomes Modify curricula/assessments Assess learners

31 Use of assessments in healthcare simulation
Information Demonstration Practice Feedback Remediation Measurement Diagnosis Rosen, MA et al. Measuring Team Performance in Simulation-Based Training: Adopting Best Practices for Healthcare. Simulation in Healthcare 3:2008;33–41.

32 Preparing assessments
What should be assessed? Any part of curriculum considered essential and/or has significant designated teaching time Should be consistent with learning outcomes that are established as the competencies for learners Consider weighted assessments

33 Clinical competence and performance
Competent performance requires acquisition of basic knowledge, skills & attitudes Competence = Application of specific KSAs Performance = Translation of competence into action

34 Three Types of Learning (Learning Domains)
Bloom's Taxonomy Cognitive: mental skills (Knowledge) Psychomotor: manual or physical skills (Skills) Affective: growth in feelings or emotional areas (Attitude) Committee of colleges in 1956, led by Benjamin Bloom

35 Three Types of Learning
Bloom's Taxonomy Cognitive = Knowledge K Psychomotor = Skills S Affective = Attitude A

36 Bloom’s Taxonomy – Knowledge

37 Bloom’s Taxonomy – Knowledge

38 Bloom’s Taxonomy – Knowledge

39 Bloom’s Taxonomy – Knowledge

40 Bloom’s Taxonomy – Knowledge

41 The Anti – Blooms…

42 Bloom’s Taxonomy – Skills
Bloom’s committee did not propose a compilation of the psychomotor domain model, but others have since.

43 Bloom’s Taxonomy – Attitude
Five Major Categories Receiving phenomena Responding to Phenomena Valuing Organization Internalizing values

44 Knowledge Competencies
Cognitive knowledge (factual) Recall Comprehension Application Analysis Synthesis Evaluation

45 Skill competencies Skills Knowledge Communication Physical Exam
Procedures Informatics Self Learning Time Management Problem Solving Skills

46 Attitude competencies
Knowledge Attitudes Behavior Teamwork Professionalism Key Personal Qualities Motivation Attitudes Skills

47 Continuous process Knowledge Attitudes Skills

48 Possible Outcome Competencies (GME Based)
Patient care Medical knowledge Practice-based learning and improvement Interpersonal and communication skills Professionalism Systems-Based Practice Knowledge Skills Attitudes

49 So we know what we want to measure, but how do we do that?


51 Miller’s Pyramid of Competence
Moving Up the Pyramid: Assessing Performance in the Clinic
More than 18 years ago, George Miller introduced a framework for the assessment of medical students and residents, “Miller’s Pyramid”1 (Figure 1). In the accompanying address to the Association of American Medical Colleges, he advocated the evaluation of learners for their skills and abilities in the 2 top cells of the pyramid, in the domains of action, or performance, reflecting clinical reality. Miller argued that the demonstration of competence in these higher domains strongly implies that a student has already acquired the prerequisite knowledge, or Knows, and the ability to apply that knowledge, or Knows How, that make up the base of the pyramid. Basic clinical skills (Shows How) are those that can be measured in an examination situation such as an objective structured clinical examination (OSCE). However, the professionalism and motivation required to continuously apply these in the real setting (Does) must be observed during actual patient care.
Figure 1. Miller’s pyramid1. From Academic Medicine 1990;65:S63–7; with permission from Wolters Kluwer Health.
The component that Miller argued is the most vital aspect of measurement, what the learner does in clinical practice, has been the most difficult to capture. Almost 2 decades later, we are still struggling with the need to develop reliable and valid methods of assessing learners in the clinical setting. In the meantime, there have been many advances in the lower echelons of the pyramid. In the domains of Know and Know How, the Medical Council of Canada2 and the National Board of Medical Examiners3 have made great strides in the art of the multiple choice examination, the Key Feature examination, and computer-based, adaptive examination. In a pair of landmark publications, Tamblyn, et al have provided good evidence of the predictive validity of these assessments to outcomes in clinical practice4,5.
The OSCE examination has become so ubiquitous that it has been claimed to define the expectations of practice itself6. In the name of reliability, the standardized patient has overtaken the real patient for the purposes of certifying examinations7. However, as outlined in the article by Susan Humphrey-Murto, et al in this issue of The Journal, the quality of assessment in the clinical setting lags far behind8. They state that the most frequently used instrument, the Intraining Evaluation Report (ITER), is completed by the resident or clerkship director, who may have had little personal experience of the learner, and at a time removed from many of those observations. This leads to a migration towards the center of the ubiquitous Likert scale, as the director is reluctant to label the student as being either exceptional or substandard in any specific item. Given that these forms are usually completed at the end of the rotation, and lack information clearly anchored to the performance of the learner, they have little value as a formative, or feedback, instrument that might contribute to the student’s education. Norcini, J. J BMJ 2003;326: Copyright ©2003 BMJ Publishing Group Ltd.

52 Miller’s Pyramid Top two cells of the pyramid, in the domains of action, or performance, reflect clinical reality The professionalism and motivation required to continuously apply these in the real setting must be observed during actual patient care.

53 Miller’s Pyramid Top two levels most difficult to measure
Quality of assessment in the clinical setting lags far behind In-training Evaluation Reports (ITERs) and Likert scales Little value as formative, or feedback, instrument that might contribute to the learner’s education

54 Miller’s Pyramid of competence for learning and assessment

55 Miller’s Pyramid of Competence
Does Shows Knows How Knows Behavior Cognition Miller GE. The Assessment of Clinical Skills / Competence / Performance, Academic Medicine, 65:9, S63-S67.

56 Teaching and Learning “Knows”
Opportunity Reading / Independent Study Lecture Computer-based Colleagues / Peers Does Shows Knows How Knows Miller GE. The Assessment of Clinical Skills / Competence / Performance, Academic Medicine, 65:9, S63-S67.

57 Assessment of “Knows” Factual Tests Does Shows Knows How Knows
Miller GE. The Assessment of Clinical Skills / Competence / Performance

58 Assessment Tools for “Knows”
Multiple Choice Questions (MCQs) Short Answer True / False Matching (extended) Constructed Response Questions

59 Teaching and Learning - “Knows How”
Opportunity Problem-based Ex. Tabletop Exercises Direct Observation Mentors Does Shows Knows How Knows Miller GE. The Assessment of Clinical Skills / Competence / Performance

60 Assessment of “Knows how”
Does Shows Knows How Knows Clinical Context Based Tests Miller GE. The Assessment of Clinical Skills / Competence / Performance

61 The Tools of “Knows How”
Multiple-choice question Essay Short answer Oral interview

62 Teaching and Learning - “Shows”
Opportunity Skill-based Exercises Repetitive practice Small Group Role Playing Does Shows Knows How Knows Miller GE. The Assessment of Clinical Skills / Competence / Performance

63 Assessment of “Shows” Performance Assessment Does Shows Knows How
Miller GE. The Assessment of Clinical Skills / Competence / Performance

64 The Tools of “Shows” Objective Structured Clinical Examination (OSCE)
Standardized Patient-based

65 Variables in Clinical Assessment
Examiner Student Clinical Assessment Patient CONTROL VARIABLES as much as possible…

66 Teaching and Learning - “Does”
Shows Knows How Knows Learning Opportunity Experience Miller GE. The Assessment of Clinical Skills / Competence / Performance

67 Assessment of “Does” Performance Assessment Does Shows Knows How Knows
Miller GE. The Assessment of Clinical Skills / Competence / Performance

68 The Tools of “Does” Undercover / Stealth / Incognito Standardized Patient-based Videos of performance Portfolio of learner Patient Satisfaction Surveys

69 Influences on clinical performance
Does System related Individual related Competence Cambridge Model for delineating performance and competence Rethans JJ, et al. The relationship between competence and performance: implications for assessing practice performance, Medical Education, 36:

70 Reliability and Validation


72 Assessments - Reliability
Does the test consistently measure what it is supposed to be measuring? Types of reliability: Inter-rater (consistency over raters) Test-retest (consistency over time) Internal consistency (over different items/forms)

73 Inter-rater Reliability
Multiple judges code independently using the same criteria Reliability = raters code same observations into same classification Examples Medical record reviews Clinical skills Oral examinations
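One common way to put a number on inter-rater agreement is Cohen's kappa, which corrects the raw agreement rate for the agreement two raters would reach by chance. The slides do not name a statistic, so this is a minimal sketch under that assumption; the function name and the pass/fail codes from two examiners are hypothetical illustration data.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who classified the same observations."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must code the same non-empty set of observations")
    n = len(rater_a)
    # Observed agreement: share of observations both raters put in the same class.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement, computed from each rater's marginal class frequencies.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical class throughout
    return (observed - expected) / (1 - expected)


# Hypothetical pass/fail codes from two examiners on ten skill stations.
examiner_1 = ["pass", "pass", "fail", "pass", "fail", "pass", "pass", "fail", "pass", "pass"]
examiner_2 = ["pass", "pass", "fail", "fail", "fail", "pass", "pass", "pass", "pass", "pass"]
kappa = cohens_kappa(examiner_1, examiner_2)
print(round(kappa, 2))
```

Here the examiners agree on 8 of 10 stations, but because both pass most candidates, a large part of that agreement is expected by chance, so kappa is noticeably lower than the raw 80%.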

74 Factors Influencing Reliability
Test length Longer tests give more reliable scores Group homogeneity The more heterogeneous the group, the higher the reliability Objectivity of scoring The more objective the scoring, the higher the reliability
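The effect of test length can be illustrated with the Spearman-Brown prediction formula, a standard psychometric result that the slides do not cite; it is brought in here only as an illustration. It assumes the added items are comparable to the existing ones. The function name and the example reliability of 0.60 are hypothetical.

```python
def spearman_brown(reliability, length_factor):
    """Predicted reliability after changing test length by length_factor.

    Assumes the added (or removed) items are comparable to the existing ones.
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    if length_factor <= 0:
        raise ValueError("length_factor must be positive")
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)


# Hypothetical example: a test with reliability 0.60 is doubled in length.
doubled = spearman_brown(0.60, 2)
print(round(doubled, 2))
```

Under these assumptions, doubling the test raises the predicted reliability from 0.60 to 0.75, while halving it (a factor of 0.5) lowers it.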

75 Assessments - Validity
Are we measuring what we are supposed to be measuring? Use the appropriate instrument for the knowledge, skill, or attitude you are testing The major types of validity should be considered Face Content Construct

76 Validity is accuracy
Archer 1 hits the bull's-eye every time. Archer 2 hits the outer ring in the same spot every time. Both archers are equally reliable. Validity = quality of the archer’s hits. The problem in educational measurement is that we can seldom “see” the target: attitudes, knowledge, and professionalism are “invisible targets,” and validity claims are based on inferences and extrapolations from indirect evidence that we have hit the target.

77 Reliability and Validity
Reliable and Valid Reliable, not valid Not reliable, not valid

78 Improving reliability and validity
Base assessment on outcome/objectives- event triggers- observable behavior- behavioral rating-assess Define: Low-medium-high performance Use of rubric or rating metric Use (video) training examples of performance Employ quality assurance/improvement system

79 Assessment types
Choose the appropriate assessment method: Formative, Summative, Self, Peer

80 Assessment
Formative Assessment: Lower stakes. One of several, over time of course or program. May be evaluative, diagnostic, or prescriptive. Often results in remediation or progression to next level.
Summative Assessment: Higher stakes. Generally final of course or program. Primary purpose is performance measurement. Often results in a “Go, No-Go” outcome.

81 Assessments - self Encourages responsibility for the learning process, fosters skills in making judgments as to whether work is of an acceptable standard – it improves performance. Most forms of assessment can be adapted to a self-assessment format (MCQs, OSCEs, and short answers) Students must be aware of standards required for competent performance.

82 Assessments - peer Enables learners to hone their skills in their ability to work with others and professional insight Enables faculty to obtain a view of students they do not see An important part of peer assessment is for students to justify the marks they award to others Justification can also be used as a component when faculty evaluates attitudes and professionalism.

83 Assessments – Setting Standards
Should be set to determine competence Enables certification to be documented, accountable and defensible Appropriately set standards for an assessment will pass those students who are truly competent Standards should be neither so low that incompetent students pass (false positives) nor so high that competent students fail (false negatives).

84 Assessments – Setting Standards
Those responsible for setting standards must also have a direct role in teaching students at the level being examined and assist in providing examination material

85 Assessments – Setting Standards
Standards should be set around a core curriculum that includes the knowledge, skills and attitudes required of all students When setting a standard the following should be considered: What is assessed must reflect the core curriculum Students should be expected to reach a high standard in the core components of the curriculum (for instance, an 80-90% pass mark for the important core and 60-80% for the less important aspects) Students should be required to demonstrate mastery of the core in one phase of the curriculum before moving on to the next part of the curriculum
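The two-tier pass mark above can be sketched as a simple decision rule. This is only an illustration: the function name is hypothetical, and the default cut-offs (80% for core, 60% for non-core) are assumptions taken from the lower ends of the ranges on the slide, not values the presentation prescribes.

```python
def meets_standard(core_score, other_score, core_pass_mark=0.80, other_pass_mark=0.60):
    """Return True only if both the core and the non-core pass marks are met.

    Scores and pass marks are fractions between 0 and 1.
    """
    for value in (core_score, other_score, core_pass_mark, other_pass_mark):
        if not 0.0 <= value <= 1.0:
            raise ValueError("scores and pass marks must be fractions between 0 and 1")
    return core_score >= core_pass_mark and other_score >= other_pass_mark


# Hypothetical learners: a strong non-core score does not offset a weak core score.
learner_a = meets_standard(core_score=0.85, other_score=0.65)
learner_b = meets_standard(core_score=0.75, other_score=0.95)
print(learner_a, learner_b)
```

Treating the two marks as separate hurdles, rather than averaging them, reflects the requirement that mastery of the core be demonstrated before the learner moves on.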

86 Assessments - Feasibility
Is the administration and taking of the assessment instrument feasible in terms of time and resources? The following questions should be considered: How long will it take to construct the instrument? How much time will be involved with the scoring process? Will it be relatively easy to interpret the scores and produce the results? Is it practical in terms of organization? Can quality feedback result from the instrument? Will the instrument indicate to the students the important elements within the course? Will the assessment have a beneficial effect in terms of student motivation, good study habits and positive career aspirations?

87 Practicality Number of students to be assessed
Time available for the assessment Number of staff available Resources/equipment available Special accommodations

88 Assessment Instruments

89 Assessments - Instruments
Be aware of the types of assessment instruments available as well as the advantages and disadvantages of each Use more than one assessment instrument and more than one assessor if possible when looking at skills and attitudes

90 Choosing appropriate assessment methods
When choosing the assessment instrument, the following should be answered: Is it valid Is it reliable Is it feasible

91 Assessments – Knowledge Instruments
Objective tests (short answer, true/false, matching, multiple choice) Objective Structured Clinical Evaluations (OSCEs) Constructed response questions Rating scales (used on clerkships)

92 Assessments – Skill Instruments
Objective tests (Simulation based) OSCEs Constructed response questions Critical reading papers (interpreting literature) Checklists Rating Scales Portfolios (self-evaluation, time management)

93 Weighted Checklists List of items to measure Set of weights of each
Summary score
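The three parts of a weighted checklist (items, weights, summary score) can be sketched as follows. The function name, the item weights, and the "Begin compressions" item are hypothetical; the summary score here is the earned weight as a fraction of the total weight, which is one reasonable choice rather than the scoring rule used in the presentation.

```python
def weighted_checklist_score(weights, observed):
    """Summary score: earned weight divided by total possible weight.

    weights  -- dict mapping checklist item to its weight
    observed -- dict mapping checklist item to True if the learner performed it
    Items missing from `observed` count as not performed.
    """
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("checklist must have a positive total weight")
    earned = sum(w for item, w in weights.items() if observed.get(item, False))
    return earned / total_weight


# Hypothetical weights; a more critical step carries more weight.
weights = {"Open airway": 3, "Check breathing": 2, "Begin compressions": 5}
observed = {"Open airway": True, "Check breathing": False, "Begin compressions": True}
score = weighted_checklist_score(weights, observed)
print(score)
```

With these assumed weights, missing "Check breathing" costs 2 of 10 points, so the learner earns 8 of 10.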

94 Assessments – Attitude Instruments
Portfolios Essays / Modified essay questions OSCEs Checklists Rating scales Patient management problems Short/long case assessments

95 Assessment Metrics Procedural or Check List assessment
Global Rating assessment

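96 Assessment Metrics Procedural or Check List assessment
[Slide figure: three versions of a BCLS checklist with the items Open Airway and Check Breathing. Version 1: each item scored Y / N. Version 2: each item scored Y / N against a time criterion (Open Airway < 5 sec of LOC; Check Breathing < 5 sec of Airway). Version 3: each item scored Y / N / A with a rating score (+1, -1, *Assist).]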

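97 Assessment Metrics Global Rating assessment
[Slide figure: three versions of a Code Blue global rating. Version 1: CPR and ACLS scored together as P / F. Version 2: CPR and ACLS each scored in points from 1 (low) to 5 (high). Version 3: CPR and ACLS each rated H / M / L with a rating score (+1, -1).]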

98 Review Assessment drives learning
Clearly define the desired outcome, ensure that it can be measured Consider the effectiveness of the measurement Feedback to individual candidates Feedback to training programs

99 Questions and discussion

100 Good Luck 

101 Competency Competency is noted when a learner is observed performing a task or function that has been established as a standard by the profession. The achievement of professional competency requires the articulation of learning objectives as observable, measurable outcomes for a specific level of learner performance. Such specific detailing of performance expectations defines educational competencies. They are verified on the basis of evidence documenting learner achievement, and must be clearly communicated to learners, faculty, and institutional leaders prior to assessment. ____________ Identified by members of work group on competency-based women’s health education at APGO Interdisciplinary Women’s Health Education Conference in September, 1996, Chantilly, VA.
