1 Assessment Use Argument Nancy Powers Chief of English Testing Section SHAPE, Mons, Belgium Sept 2013.

Presentation transcript:

2 Introduction “Validity is an integrated evaluative judgment of the degree to which empirical evidence and theoretical rationales support the adequacy and appropriateness of inferences and actions based on test scores” (Messick 1989: 13)

3 Assessment Use Argument. Based on Toulmin’s (2003) approach to practical reasoning. Justification. Accountability.

4 According to Bachman & Palmer (2010, p. 430): Assessment development consists of two parallel processes that serve two purposes. 1. The assessment production process. 2. The assessment justification process.

5 Therefore… An AUA is a theoretical framework that provides a rationale and set of procedures for justifying the intended uses of the assessment.

6 The nitty-gritty of an AUA. It comprises four parts. 1. Claims about: (a) the beneficial consequences of an assessment; (b) the decisions that are made; (c) the interpretations that are made; (d) the assessment records. 2. Warrants: statements that elaborate the claims.

7 The nitty-gritty of an AUA (cont’d). Not everyone will agree with us. 3. Rebuttal: a counterclaim. 4. Backing: evidence supporting the warrants, including feedback from stakeholders gathered through questionnaires, verbal protocols, observations and interviews, as well as previous research and statistical analyses.
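The four parts listed on slides 6 and 7 can be pictured as a small data structure. The following is only an illustrative sketch, not part of the original presentation: the class names, the field names and the rule that a rebutted warrant needs at least one piece of backing are all assumptions made for the example. The sample statements are paraphrased from the video listening test example on the later slides, and the second warrant is deliberately left without backing to show the check.

```python
from dataclasses import dataclass, field


@dataclass
class Warrant:
    """A statement that elaborates a claim (hypothetical model)."""
    statement: str
    rebuttals: list[str] = field(default_factory=list)  # counterclaims
    backing: list[str] = field(default_factory=list)    # evidence for the warrant

    def is_supported(self) -> bool:
        # Assumed rule: a warrant that has been rebutted needs some backing.
        return not self.rebuttals or bool(self.backing)


@dataclass
class Claim:
    """One of the four AUA claims: consequences, decisions, interpretations, records."""
    kind: str
    statement: str
    warrants: list[Warrant] = field(default_factory=list)

    def unsupported(self) -> list[Warrant]:
        # Warrants that still need evidence before the claim can be justified.
        return [w for w in self.warrants if not w.is_supported()]


claim = Claim(
    kind="consequences",
    statement="The consequences of using a video listening test are beneficial.",
    warrants=[
        Warrant(
            statement="Students can use the visual cues to help with comprehension.",
            rebuttals=["Videos will be distracting."],
            backing=["Trial feedback: the video gave focus and helped listening."],
        ),
        Warrant(
            # Backing left empty on purpose, only to demonstrate the check.
            statement="The context will be clear, reducing student anxiety.",
            rebuttals=["Attending to multiple sources of stimulation is more tiring."],
        ),
    ],
)

for warrant in claim.unsupported():
    print("Needs backing:", warrant.statement)
```

Laying the argument out this way makes it easy to see which warrants have been challenged and which of those still lack evidence.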

8 An AUA at work Lots of theory… Concrete example: Justifying the inclusion of videos in a listening test

9 Claim 1: The consequences are beneficial. I make the claim that the consequences of using a video listening test (VLT) are beneficial to the test developers and to the students. So, what does this mean? I need to elaborate.

10 Warrant: The consequences of using the VLT that are specific to the test developers and to the students will be beneficial. The test developers will develop tests that are more authentic and better reflect the target language use (TLU) domain. Students can use the visual cues to help with comprehension. The context will be clear, thereby reducing student anxiety.

11 Rebuttals: I disagree with you! The consequences of using the VLT that are specific to the test developers and to the students will NOT be beneficial. Videos will be distracting. Attending to multiple sources of stimulation is more tiring and demanding.

12 Backing: Collection of evidence that justifies your claims. 1. The students who trialled the test reported that: “The video aspect helped to ground the task, making it more authentic than just an audio test.” “It gave focus to me, therefore allowing me to listen. Often, when listening to audio-only, my mind wanders, i.e. I think of something else, therefore missing the listening text.” “They [the videos] were relaxing; therefore there was no mental block to listening because of nervousness.” 2. The use of videos can be theoretically justified in that it introduces construct-relevant variance (Wagner, 2002, 2007).

13 Backing (cont’d). 3. Wagner (2010b) found that student performance on a listening test that included videos increased by 6.5%. 4. If test task characteristics are similar to the TLU characteristics, then the test can be seen as having construct validity (Bachman & Palmer, 1996).

14 Claim 2: Decisions made. The decisions to award a proficiency level reflect existing educational and societal values and the content/task/accuracy (C/T/A) statements in the NATO STANAG 6001 Language Proficiency Levels, and are equitable for students who are placed at different proficiency levels. These decisions are made by the test developers and concern the proficiency level to which each student belongs. The individuals affected by these decisions are the students and the teachers of the MTCP program.

15 Warrant: Values sensitivity Relevant educational values of CDA are carefully considered in the proficiency level decisions that are made. Rebuttal: Relevant educational values of CDA are NOT carefully considered in the proficiency level decisions that are made.

16 Backing: CDA is governed by two documents: the Qualification Standard and the Foreign National Training Plan. The VLT respects the C/T/A statements for each proficiency level in STANAG 6001.

17 Warrant: Equitability. Test takers and teachers are fully informed about how the decision will be made. Rebuttal: Test takers and teachers are NOT fully informed about how the decision will be made.

18 Backing: The testing section conducts information sessions with teachers and testers when introducing new testing methods. Candidate’s Guide

19 Claim 3: Interpretations. The interpretations about the students’ ability to utilize verbal and non-verbal behaviour to comprehend the main idea, explicitly stated information and implicit information are meaningful in terms of the construct definition of listening comprehension, impartial to all groups of test takers, generalizable to tasks that resemble the TLU, and relevant to and sufficient for the proficiency level decisions that are to be made.

20 Warrant: Meaningful. The claim is meaningful in terms of listening to and comprehending general English with respect to the construct definition. Rebuttal: The claim is NOT meaningful in terms of listening to and comprehending general English with respect to the construct definition.

21 Backing: Meaningful The construct definition is based on research on listening comprehension. The items were developed according to the NATO STANAG 6001 Proficiency levels. Item specs

22 Warrant: Impartiality Test takers are treated impartially during all aspects of the administration of the assessment Rebuttal: Test takers are NOT treated impartially during all aspects of the administration of the assessment

23 Backing: Impartiality. Candidate’s Guide; all sessions are administered in the same way every time. All students are given the same test with the same instructions, regardless of their country of origin, rank, gender, etc. Generalizable. Relevant. Sufficient.

24 Claim 4: Assessment Records. The scores from the video listening test are consistent across different forms and administrations of the test, across students from different military trades, and across groups with different nationalities and first languages.

25 Warrant: Inter/intra-rater reliability. The test is scored the same way across administrations. Rebuttal: no rebuttal. Backing: this is a multiple-choice, computer-delivered test, so no inter/intra-rater reliability check is needed; scoring is done by an internal algorithm in the computer program.
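The point of slide 25 is that key-based machine scoring involves no human judgment, so the same responses always yield the same score. The slides do not describe the actual scoring algorithm; the sketch below simply assumes one point per item that matches an answer key, and the function name, item labels and responses are all hypothetical.

```python
def score_responses(responses: dict[str, str], answer_key: dict[str, str]) -> int:
    """Count items whose selected option matches the key; unanswered items score 0."""
    return sum(1 for item, key in answer_key.items() if responses.get(item) == key)


# Hypothetical three-item key and one candidate's responses.
answer_key = {"item1": "B", "item2": "D", "item3": "A"}
responses = {"item1": "B", "item2": "C", "item3": "A"}

# Scoring the same responses twice gives the same result: there is no rater
# whose judgment could vary between administrations.
first = score_responses(responses, answer_key)
second = score_responses(responses, answer_key)
print(first, first == second)  # prints "2 True"
```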

26 Conclusion: in a nutshell. Basically, you are saying something about the test that you have designed. You make these claims clear by elaborating on what you mean. Then you address any perspective that goes against what you have claimed and gather evidence that supports your point of view.

27 Questions? Thank you

28 References
Bachman, L. F., & Palmer, A. S. (1996). Language testing in practice. Oxford: Oxford University Press.
Bachman, L. F., & Palmer, A. S. (2010). Language assessment in practice. Oxford: Oxford University Press.
Hostetter, A. B. (2011). When do gestures communicate? A meta-analysis. Psychological Bulletin, 137(2),
Kellerman, S. (1990). Lip service: The contribution of the visual modality to speech perception and its relevance to the teaching and testing of foreign language listening comprehension. Applied Linguistics, 11(3),
Kellerman, S. (1992). “I see what you mean”: The role of kinesic behaviour in listening and implications for foreign and second language learning. Applied Linguistics, 13,

29 References (cont’d)
Ockey, G. (2007). Construct implications of including still image or video in computer-based listening tests. Language Testing, 24,
Toulmin, S. E. (2003). The uses of argument (updated edn). Cambridge: Cambridge University Press.
Wagner, E. (2002). Video listening tests: A pilot study. Working Papers in TESOL & Applied Linguistics, Teachers College, Columbia University, 2(1). Retrieved from the Internet on August 20,
Wagner, E. (2007). Are they watching? Test-taker viewing behaviour during an L2 video listening test. Language Learning & Technology, 11,
Wagner, E. (2010b). The effect of the use of video texts on ESL listening test-taker performance. Language Testing, 27,