Nick Saville, Bridging the gap between theory and practice, EALTA, Krakow, May 2006: Investigating the impact of language assessment systems within a state educational context.

Presentation transcript:

Nick Saville Bridging the gap between theory and practice EALTA Krakow May 2006 Investigating the impact of language assessment systems within a state educational context


Outline
- Background - a personal perspective: the 1980s; Bachman, early 1990s
- The literature on washback/impact: early work and recent progress; gaps? where next?
- Analysis of three case studies: what can be learnt?
- Towards a comprehensive model of impact
- Applying the model in a state educational context: the Asset Languages Project

Background
The 1980s - a personal perspective:
- assessment in Italian universities
- entrance exams in Japan
- the influence of TOEIC/TOEFL, e.g. in Japan/Korea
- developing Cambridge exams
Tests affect individuals and society! How can this be managed better? What is needed to "do a better job"?

Background – : Japan
Considerations in developing fair tests: the art of the possible.
[Diagram: the Test, with Validity (V), Reliability (R) and Practicality (?)]

"Practicality in Language Testing: an educational management model"
Saville (1990), University of Reading - based on a test development project, Japan.
Main argument: test development is a form of educational innovation - and needs to be managed as such.
"... achieving a balance between the purpose of the test, its validity for the purpose, the required reliability for the purpose and the constraints imposed by the context is essentially the task facing the test designer ...."

Aspects of Practicality within a context and educational setting:
- Acceptability
- Applicability
- Availability
- Difficulty
- Economy
- Interpretability
- Relevance
- Replicability
"... a principled approach to Practicality should provide the test designer with the means of approaching test development so that a suitable balance can be achieved without overlooking factors which cause possible solutions to fall down in practice".

Putting the test into context
The aim ... is not only to encourage good testing practice, but to prevent bad tests being produced: a bad test is not only one with low reliability and dubious validity but also one which has a damaging backwash on the curriculum. (Saville 1990: 11-13)
A logical consequence ... is that ethicality will be achieved as a result, because any test which is produced should be appropriate to the educational context in which it is to be used, and the effect on learners and institutions will be a major consideration.

Putting the test into context
[Diagram: the test (V, R, P) placed within its wider context]

Impact Ripples
[Diagram: the test (V, R, P) at the centre of impact ripples]

Impact Ripples: Local Impact ("micro" level)
[Diagram: impact (I) rippling out from the test (V, R, P)]

Impact Ripples: Wider Impact ("macro" level)
[Diagram: successive impact ripples (I) spreading outwards from the test (V, R, P)]

Usefulness as overall validity
U = V + R + I + P (Bachman - Cambridge, 1990/91)

Usefulness as overall validity
U = V + R + I + P
Bachman and Palmer (1996): U = Cv + A + I + R + I + P (construct validity, authenticity, interactiveness, reliability, impact, practicality)
Developing "useful tests", fit for purpose: balancing the test qualities.
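The additive shorthand above is a checklist rather than arithmetic: a test is "useful" only when no quality is neglected. As a purely illustrative sketch (the function, the 0-5 rating scale and the threshold are my own assumptions, not part of the talk or of Bachman and Palmer's framework), the balance of the six qualities might be tracked like this:

```python
# Hypothetical sketch: Bachman & Palmer's (1996) test qualities as ratings,
# with a check that no single quality falls below an acceptable floor.
QUALITIES = [
    "construct_validity",  # Cv
    "authenticity",        # A
    "interactiveness",     # I
    "reliability",         # R
    "impact",              # I
    "practicality",        # P
]

def usefulness(ratings, floor=2):
    """Return (total, weak): the summed ratings and any quality rated
    below `floor` on a hypothetical 0-5 scale. The point of the check is
    that a high score on one quality must not mask a weakness in another."""
    missing = [q for q in QUALITIES if q not in ratings]
    if missing:
        raise ValueError(f"unrated qualities: {missing}")
    weak = [q for q in QUALITIES if ratings[q] < floor]
    return sum(ratings.values()), weak

total, weak = usefulness({
    "construct_validity": 4, "authenticity": 3, "interactiveness": 3,
    "reliability": 5, "impact": 1, "practicality": 4,
})
print(total, weak)  # prints: 20 ['impact']
```

The design choice mirrors the slide's point: "balancing the test qualities" means flagging the neglected quality (here, impact), not just maximising the total.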

Starting to develop a model
1993 – 1995: using VRIP to develop and revise exams, e.g. IELTS
1995: the IELTS impact project

The literature on washback/impact
Readings in the language testing literature:
- Hamp-Lyons (1989)
- Wall and Alderson (1993) - Does washback exist? etc.
- Bailey (1996)
- Hamp-Lyons (1997)
- Watanabe (1997)
- Cheng and Watanabe (eds) (2004)
Recent PhD studies and subsequent books based on research conducted in the 1990s:
- Cheng (SILT )
- Wall (SILT )
- Green (2004; SILT forthcoming 2007)
- Hawkey (SILT 24, forthcoming)
Current work in Lancaster, ETS, UCLA, Cambridge etc.

The literature on washback/impact
So impact is relatively new in the field of language assessment - an extension of the notion of washback, and related to ethicality.
- It is now considered to be of growing importance.
- It is part of a validity argument, and evidence needs to be provided.
- Broadly speaking, there is consensus that:
  - impact deals with wider influences and includes the "macro contexts" - tests and examinations in society;
  - washback is an aspect of impact related to the "micro contexts" of the classroom and the school.
BUT the dynamics between the micro and macro contexts mean that this is a complex rather than a simple relationship - a "complex dynamic system".

The literature on washback/impact
And currently:
- there is no comprehensive model of test or examination impact within educational contexts;
- impact has not yet been fully integrated into an approach to test development and validation in a systematic way.

Three case studies – 1995 to 2004
Case 1 - the world-wide survey of the impact of IELTS: a starting point for the work and the original model for what has followed; a conceptualisation of impact and the design/validation of suitable instruments to investigate it.
Case 2 - the Italian PL2000 project: an application of the model within a macro educational context; an initial attempt at applying the approach on a limited basis within a state educational context.
Case 3 - the Florence Learning Gains Project: an extension and re-application of the model within a single school context at the micro level, focusing on individual stakeholders within a single language teaching institution.

Learning from the case studies
What can be learned by using these specific impact projects as meta-data?

Learning from the case studies
Three key factors of contemporary educational systems need to be accounted for:
1. the nature of complex dynamic systems
2. the roles that stakeholders play within such systems
3. the need to see assessment projects as educational innovations within the systems, and to manage change effectively

1. The nature of complex dynamic systems

2. The roles that stakeholders play

3. The need to see assessment projects as educational innovations, and to manage change effectively
See Wall (2005) - a case study using insights from testing and innovation theory, e.g. Henrichsen (1989).
Hybrid Model of the Diffusion/Implementation Process: Antecedents → Process → Consequences

Learning from the case studies
When applied to language assessment, two key factors also need to be accounted for:
1. the nature of language itself as a socio-cognitive phenomenon (the latest views on validity)
2. the nature of the test development and validation process, from conception to routine data collection and analysis
Impact research, therefore, is no different from any other kind of validation activity.

1. A SOCIO-COGNITIVE FRAMEWORK
Messick, Bachman, Kane, Mislevy, Weir, etc.

A SOCIO-COGNITIVE FRAMEWORK
[Diagram: the testing system and its construct]

The contexts:
- learning contexts
- testing contexts
- use-of-results contexts

Impact

2. Model of the Test Development Process

Model of the Test Development Process
- Identifying stakeholders and their needs
- Linking these needs to the requirements of test usefulness, including predicted impact (theoretical and practical)
- Long-term, iterative processes - a key feature of validation

Involvement of the stakeholder constituency
E.g. during test design and development:
- presentation and consultation to do with specifications and detailed syllabus designs
- professional support programmes for institutions and individual teachers/students etc. who plan to use the examinations
- training and employment of suitable personnel within the field to work on all aspects of the examination cycle: to be question/item writers, to act as examiners, etc.

After an examination becomes operational
Procedures also need to be in place to routinely collect data which allow impact to be estimated, e.g.:
- who is taking the examination (i.e. a profile of the candidates)
- who is using the examination results, and for what purpose
- who is teaching towards the examination, and under what circumstances
- what kinds of courses and materials are being designed and used to prepare candidates
- what effect the examination has on public perceptions generally (e.g. regarding educational standards)
- how the examination is viewed by those directly involved in educational processes (e.g. by students, examination takers, teachers, parents, etc.)
- how the examination is viewed by members of society outside education (e.g. by politicians, business people, etc.)
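The monitoring questions above map naturally onto a simple record per examination session. The sketch below is hypothetical (the field names and example values are my own assumptions, not Cambridge ESOL's actual data model); it only illustrates how such routine impact data might be structured for collection:

```python
# Hypothetical schema for routine impact data, following the questions
# listed on the slide. One record per examination session.
from dataclasses import dataclass, field

@dataclass
class ImpactRecord:
    session: str                                           # e.g. "2006-06"
    candidate_profile: dict                                # who is taking the examination
    result_users: list = field(default_factory=list)       # who uses the results, and why
    preparation_context: str = ""                          # who teaches towards it, circumstances
    materials_in_use: list = field(default_factory=list)   # preparation courses/materials
    stakeholder_views: dict = field(default_factory=dict)  # students, teachers, parents, wider society

record = ImpactRecord(
    session="2006-06",
    candidate_profile={"age_range": "16-18", "first_language": "Italian"},
    result_users=[("university admissions", "entry requirement")],
)
```

Keeping the slide's questions as explicit fields makes the point concrete: impact estimation depends on data that is collected routinely, not reconstructed after the fact.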

Towards a comprehensive model  How can these considerations be combined to produce a comprehensive, integrated model?

Next phase: applying the model  Asset Languages within the UK educational context

Contacts: