
1 A Reflective Practice Model for Examining the Validity of Performance Assessments
Cynthia Conn, PhD Assistant Vice Provost, Professional Education Programs Kathy Bohan, EdD Associate Dean, College of Education Sue Pieper, PhD Assessment Coordinator, Office of Curriculum, Learning Design, & Academic Assessment

2 Workshop Objectives
Review the purpose of the Validity Inquiry Process Model and the strategies and instruments developed to examine the validity of performance assessments
Discuss the implementation of the Validity Inquiry Process Model at NAU
Participate in a model Validity Inquiry Process meeting
Identify lessons learned from the implementation of the Validity Inquiry Process Model at NAU

3 The Purpose of the Validity Inquiry Process (VIP) Model
The purpose of the Validity Inquiry Process (VIP) Model instruments is to assist in examining and gathering evidence to build a validity argument for the interpretation and use of data from locally developed or faculty-developed performance assessment instruments.

4

5 The Approach
Theory to practice: Utilized the existing validity and performance assessment literature to develop practical guidelines and instruments for examining performance assessment in relation to validity criteria (Kane, 2013; Linn, Baker, & Dunbar, 1991; Messick, 1994)
Qualitative and reflective: Process and instruments guide the review and facilitate discussion regarding performance assessments and rubrics
Efficient: Some steps require documenting foundational information, one provides a survey completed by students, and the other steps involve faculty discussion and review

6 Validity Inquiry Process (VIP) Model Criteria
Domain coverage
Content quality
Cognitive complexity
Meaningfulness
Generalizability
Consequences
Fairness
Cost and efficiency
(Linn, Baker, & Dunbar, 1991; Messick, 1994)

7 Implementing Validity Inquiry Process Model
Timeline for Implementation
Identified target programs and faculty-developed performance assessments (1 semester in advance)
Identified lead faculty member(s) for each performance assessment (1 month in advance)
Provided brief announcement and description at department faculty meeting (1 month in advance)
Associate Dean scheduled meeting with lead faculty including (at least 2 to 3 weeks in advance):
Introduction letter describing purpose (CAEP Standard 5.2) and what to expect (sample letter available through website)
Attached copy of Validity Inquiry Form and Metarubric
Verified most recent copy of performance assessment to be reviewed
Requested preliminary review of Validity Inquiry Form and Metarubric prior to the meeting

8 Performance Assessment Review Meeting
Logistics for Meeting
Individual review meetings were scheduled for 2 hours
Skype was utilized for connecting with faculty at statewide campuses
Participants included 2 to 3 lead faculty members, facilitators (Associate Dean & Assessment Coordinator), and a part-time employee or Graduate Assistant to take notes

9 Model Meeting Performance Assessment Review Meeting Agenda
Introduction to Validity Inquiry Process Model and relation to CAEP Standard 5
Purpose of performance assessment (Activity #1)
Validity Inquiry Form (Activity #2)
Metarubric (Activity #3)
Feedback on the meeting and overview of next steps

10 Introduction to Validity Inquiry Process Model and Relation to CAEP
Standard 5: Provider Quality Assurance and Continuous Improvement
Quality and Strategic Evaluation
5.2 The provider’s quality assurance system relies on relevant, verifiable, representative, cumulative and actionable measures, and produces empirical evidence that interpretations of data are valid and consistent.

11

12 Activity #1: Using the Validity Inquiry Form
Discuss in small groups the stated purpose of this performance assessment and whether it is an effective purpose statement.
Course Prefix, Number & Name of Performance Assessment: CI 356: Curriculum and Assessment Presentation
Purpose of Performance Assessment: The purpose of this assignment is to demonstrate successful attainment of the course learning outcomes aligned with the SPA standards.

13 Activity #1: Small Group Discussion
What are the results of your small group discussion?

14 Activity 1: Question Prompts to Promote Deep Discussion
Why are you asking candidates to prepare and deliver a curriculum and assessment presentation? Why is it important? How does this assignment apply to candidates’ future professional practice? How does this assignment fit with the rest of your course? How does this assignment fit with the rest of your program curriculum?

15 Activity #2: Using the Validity Inquiry Form
Discuss in small groups questions 2 and 3 on the Validity Inquiry Form and how you would rate the assignment.
Questions 2 & 3:
Content Quality: Q2: Does the performance assessment evaluate process or application skills as well as content knowledge?
Cognitive Complexity: Q3: Analyze the performance assessment using the Rigor/Relevance Framework (see Daggett, 2014) to provide evidence of cognitive complexity: identify the quadrant that the assessment falls into and provide a justification for this determination.

16 Activity #2: Using the Validity Inquiry Form
Cognitive Complexity
Daggett, W. R. (2014). Rigor/relevance framework®: A guide to focusing resources to increase student performance. International Center for Leadership in Education. Retrieved from

17 Activity #2: Discussion
What are the results of your small group discussion?

18 Activity #2: Question Prompts to Promote Deep Discussion
How well does the assignment or performance assessment evaluate content knowledge?
How well does the assignment or performance assessment evaluate process or application skills?
Using the Rigor/Relevance Framework®:
What quadrant did the assessment fall into and why?
How did your group establish consensus on the quadrant determination?

19

20 Activity #3: Using the Metarubric
As a large group, we will go through the following process for Question 2 (Q2) on the Metarubric: Read the example assignment rubric provided as well as the Metarubric questions. Criteria: Q2: Does each rubric criterion align directly with the assignment instructions? (Pieper, 2012)

21 Activity #3: Using the Metarubric
With the person(s) sitting next to you, complete the process again for the following questions: Descriptions: Q8: “Are the descriptions clear and different from each other?” (Stevens & Levi, 2005, p. 94) Overall Qualities: Q11: Do the assignment instructions “encourage students to use the rubric for self- and peer assessment?” (Pieper, 2012) And any other questions you wish to discuss…

22 Activity #3: Discussion
What are the results of your discussion?

23

24 Feedback on Meeting and Overview of Next Steps
Notes from meetings will be consolidated
Assessment Coordinator will develop a one-page document outlining:
Who participated
Strengths
Areas for improvement
Next steps
Initial follow-up documentation will be utilized to develop the Validity Argument (CAEP Evidence Guide, 2015):
“To what extent does the evaluation measure what it claims to measure? (construct validity)”
“Are the right attributes being measured in the right balance? (content validity)”
“Is a measure subjectively viewed as being important and relevant? (face validity)”
Documentation for CAEP Standard 5

25 Validity Argument
Candidate Work Sample (CWS) (Student Teaching; EPP-wide assessment)
After a Validity Inquiry Process review of the original CWS rubric, the leadership team determined it was insufficient for collecting meaningful data related to impact on student learning as required by the new CAEP standards. To determine appropriate, distinct criteria and draft detailed level descriptions for the rubric, feedback from University Supervisors (content experts who implement the performance assessment) was collected during their Summer 2014 University Supervisor Annual Meeting. The original CWS instrument was discussed with respect to the various evidence University Supervisors look for when making their evaluations, and these descriptions of evidence were documented. Based on this feedback and continued consultation with a small committee of content and assessment experts, including a University Supervisor, an instructional specialist, the director of the Office of Field Experiences/Student Teaching, and the assistant vice provost for the NAU Professional Education Programs, the instrument was significantly revised. The purpose of the instrument was articulated, the criteria were expanded to include discrete objectives, and the descriptions for the rubric were based on the expert review of University Supervisors, professionals with extensive K-12 experience. A validity inquiry was completed based on the revised instrument.

26 Validity Argument
Candidate Work Sample (CWS) (Student Teaching; EPP-wide assessment)
Justification for the validity of the instrument is presented in terms of the following performance assessment criteria:
Domain Coverage (content validity): The instrument criteria are comprehensive and based on best practices. The criteria are explicitly aligned to current InTASC standards. The revised scoring guide was appropriately expanded to allow for distinct evaluations related to various aspects of planning, implementation, and assessment.
Content Quality (content validity): This summative assessment is focused on application of content and pedagogical knowledge. Various aspects of the work product require candidates to demonstrate both content knowledge (i.e., description of instruction and materials; formative and summative assessments) and application skills (i.e., reflection regarding implementation and necessary revisions or re-teaching of material).
Cognitive Complexity (content validity): This performance assessment falls in Quadrant C or D of the Rigor/Relevance Framework. Candidates need to accomplish work related to the higher levels of Bloom’s Taxonomy (i.e., Application, Analysis, Synthesis, & Evaluation), sometimes within one discipline but often across disciplines, and in unpredictable real-world situations.

27 Validity Argument
Meaningfulness (face validity): The performance assessment is highly authentic, requiring the actual implementation of multiple lessons that match the class or grade-level curriculum map and are completed as part of student teaching. The purpose of the assessment is to demonstrate impact on student learning, and there are multiple requirements asking candidates to document formative and summative assessment data and how the data were used to modify instruction.
Consequences: This is a summative assessment for student teaching. Candidates are required to pass this assessment in order to pass student teaching. This is a significant consequence but appropriate for student teaching, the culminating course in the program of study.
Fairness: It is a substantial assignment, and university supervisors work closely with candidates to monitor and scaffold progress related to this assessment. Some programs implement a similar assignment prior to student teaching, and other programs are considering this approach as well to ensure all candidates have adequate preparation to successfully complete this assessment.
Efficiency: Expectations are communicated in detail through the instructions. The type of assessment is appropriate for student teaching, and efforts to allow candidates to practice completing a similar assignment prior to student teaching will improve efficiency.

28 Documentation for CAEP Standard 5
Creation of a bundled PDF file with the completed Validity Inquiry and Metarubric forms (see the illustrative bundling sketch below)
The cover sheet of the PDF should contain the validity argument and information regarding the experts involved in the review process
Store files in a web-based, collaborative program for easy access by leadership, faculty, and site visit team members
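For illustration only, here is a minimal sketch of how the bundling step could be scripted, assuming the Python pypdf library is available; the file names are hypothetical placeholders rather than the actual NAU documents.

```python
# Hypothetical sketch: merge the cover sheet, Validity Inquiry Form, and
# Metarubric into a single evidence bundle using pypdf (assumed installed).
from pypdf import PdfWriter

parts = [
    "cover_sheet_validity_argument.pdf",  # hypothetical file names
    "validity_inquiry_form.pdf",
    "metarubric.pdf",
]

writer = PdfWriter()
for path in parts:
    writer.append(path)  # appends all pages of each source PDF, in order

with open("validity_evidence_bundle.pdf", "wb") as out:
    writer.write(out)
```

A script of this kind keeps the bundle reproducible whenever an instrument or validity argument is revised, rather than re-assembling the PDF by hand.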

29 Inter-Rater Reliability
“The heart of the discussion focuses on articulating criteria for each grade level [or scale level] that will help graders maintain consistency with one another and from paper to paper.” Center for Innovative Teaching and Learning (CITL) Writing Program, Indiana University Bloomington

30 Inter-Rater Reliability
Candidate Work Sample (CWS) (Student Teaching; EPP-wide assessment)
The revised instrument was piloted in Fall 2014 and fully implemented in Spring 2015. In Summer 2015, an inter-rater reliability and norming session was conducted as part of the University Supervisor meeting.

31 Inter-Rater Reliability
Norming Steps
University Supervisors reviewed a sample assignment in advance of the annual meeting and submitted scores
Percentages of agreement were calculated (see the illustrative sketch below)
Significant areas of disagreement and trends were identified
During the meeting, a large-group discussion was conducted regarding: What evidence is there that supports your ratings?
Musumeci, M., & Bohan, K. (2015). Candidate Work Sample (CWS) Norming Session. Presentation at the NAU University Supervisor Annual Meeting, Phoenix, AZ.
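As a hedged illustration of the pre-meeting calculation, the sketch below flags rubric criteria where submitted scores diverge from the modal score; the 70% threshold, criterion names, and scores are hypothetical and not taken from the NAU data.

```python
# Hypothetical sketch of the pre-meeting norming calculation: for each rubric
# criterion, compute the percentage of University Supervisors whose submitted
# score matches the modal score, and flag criteria with notable disagreement.
from collections import Counter

submitted_scores = {            # criterion -> scores submitted in advance (1-4 scale)
    "Planning": [3, 3, 3, 2, 3],
    "Implementation": [2, 4, 3, 1, 4],
    "Assessment": [4, 4, 4, 4, 3],
}
THRESHOLD = 70                  # percent agreement below which a criterion is flagged

for criterion, scores in submitted_scores.items():
    modal_score, count = Counter(scores).most_common(1)[0]
    pct_agreement = 100 * count / len(scores)
    flag = "  <- significant disagreement; discuss at meeting" if pct_agreement < THRESHOLD else ""
    print(f"{criterion}: modal score {modal_score}, {pct_agreement:.0f}% agreement{flag}")
```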

32 Inter-Rater Reliability
Norming Steps
Small group discussion: Re-examine the sample paper and rubric in your group.
What evidence supports the pre-normed score rating?
Is the pre-normed score reasonable and fair based on evidence? Do you agree with it based on the evidence from the sample paper?
Do the members of your group disagree with the pre-normed scores? If you disagree, can you reach consensus with those scores?
Are your rubric row scores within +/- 1 point of the pre-normed scores?
Whole group discussion: Representatives from various groups relay the group’s discussion

33 Inter-Rater Reliability
Summary of Inter-Rater Reliability Data
Summary of Supervisor Agreements
Number of score pair agreements: 36
Number of raters: 47
% Score pair agreement: 76.60%
Average % Perfect Agreement: 38.52%
Average % Adjacent Agreement: 46.47%
Overall Average Agreement (Perfect + Adjacent): 84.99%
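To make the reported figures concrete, here is a minimal sketch of how perfect, adjacent (within one scale point), and overall agreement percentages of this kind can be computed; the rater and pre-normed scores in the example are hypothetical, and the function is an illustration rather than the exact procedure used at NAU.

```python
# Hypothetical sketch: compare each rater's rubric-row scores with the
# pre-normed (consensus) scores and summarize perfect, adjacent (+/- 1 point),
# and overall (perfect + adjacent) agreement as percentages.

def agreement_summary(rater_scores, prenormed):
    perfect = adjacent = total = 0
    for scores in rater_scores:                     # one list of rubric-row scores per rater
        for given, expected in zip(scores, prenormed):
            total += 1
            if given == expected:
                perfect += 1
            elif abs(given - expected) == 1:        # within one scale point
                adjacent += 1
    return {
        "% perfect agreement": round(100 * perfect / total, 2),
        "% adjacent agreement": round(100 * adjacent / total, 2),
        "% overall agreement": round(100 * (perfect + adjacent) / total, 2),
    }

# Hypothetical example: three raters scoring four rubric rows on a 1-4 scale.
prenormed = [3, 2, 4, 3]
raters = [[3, 2, 4, 3], [3, 1, 3, 3], [2, 2, 4, 1]]
print(agreement_summary(raters, prenormed))
```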

34 Faculty Feedback Regarding Process
“I wanted to thank you all for providing a really productive venue to discuss the progress and continuing issues with our assessment work. I left the meeting feeling very optimistic about where we have come and where we are going. Thank you.” –Associate Professor, Elementary Education
“Thanks for your facilitation and leadership in this process. It is so valuable from many different perspectives, especially related to continuous improvement! Thanks for giving us permission to use the validity tools as we continue to discuss our courses with our peers. I continue to learn and grow...” –Assistant Clinical Professor, Special Education

35 Improving the Process
Timing the process and meetings so work concludes by Spring Break
Encouraging chairs to be involved in the process to understand and allocate appropriate department faculty meeting time (videotape the meeting for review, or include the chair with instructions to be an observer rather than a participant)
Building capacity and sustaining the process
Value of small group meetings (faculty felt listened to and the process appeared to improve faculty morale; faculty felt safe to discuss ideas)
As a university with a large number of programs and over 200 faculty-developed instruments, is there any way to retain the value of small meetings through a more efficient process?

36 Resources & Contact Information
Website:
Contact Information:
Cynthia Conn, PhD, Assistant Vice Provost, Professional Education Programs
Kathy Bohan, EdD, Associate Dean, College of Education
Sue Pieper, PhD, Assessment Coordinator, Office of Curriculum, Learning Design, & Academic Assessment

37 Definitions
Performance Assessment: An assessment tool that requires test takers to perform—develop a product or demonstrate a process—so that the observer can assign a score or value to that performance. A science project, an essay, a persuasive speech, a mathematics problem solution, and a woodworking project are examples. (See also authentic assessment.)
Validity: The degree to which the evidence obtained through validation supports the score interpretations and uses to be made of the scores from a certain test administered to a certain person or group on a specific occasion. Sometimes the evidence shows why competing interpretations or uses are inappropriate, or less appropriate, than the proposed ones.
Reliability: Scores that are highly reliable are accurate, reproducible, and consistent from one testing occasion to another. That is, if the testing process were repeated with a group of test takers, essentially the same results would be obtained.
(National Council on Measurement in Education. (2014). Glossary of important assessment and measurement terms. Retrieved from:

38 References
Center for Innovative Teaching & Learning. (2005). Norming sessions ensure consistent paper grading in large course. Retrieved from
Daggett, W. R. (2014). Rigor/relevance framework®: A guide to focusing resources to increase student performance. International Center for Leadership in Education. Retrieved from
Gall, M. D., Borg, W. R., & Gall, J. P. (1996). Educational research: An introduction (6th ed.). White Plains, NY: Longman Publishers.
Kane, M. (2013). The argument-based approach to validation. School Psychology Review, 42(4).
Linn, R. L., Baker, E. L., & Dunbar, S. B. (1991). Complex, performance-based assessment: Expectations and validation criteria. Educational Researcher, 20(8).
Messick, S. (1994). The interplay of evidence and consequences in the validation of performance assessments. Educational Researcher, 23(2).
Pieper, S. (2012, May 21). Evaluating descriptive rubrics checklist. Retrieved from
Stevens, D. D., & Levi, A. J. (2005). Introduction to rubrics: An assessment tool to save grading time, convey effective feedback and promote student learning. Sterling, VA: Stylus Publishing, LLC.

