Quality Control in Evaluation and Assessment


Quality Control in Evaluation and Assessment
J Charles Alderson, Department of Linguistics and Modern English Language, Lancaster University

“Assessment is central to language learning, in order to establish where learners are at present, what level they have achieved, to give learners feedback on their learning, to diagnose their needs for further development, and to enable the planning of curricula, materials and activities.”

Outline
· Current practice
· Assessment for certification
· Tradition one: teacher-centred, school-based
· Tradition two: central, quality controlled
· Basic parameters
· What is needed to ensure the parameters are met

Current practice
· The quality of important examinations is not monitored
· There is no obligation to show that exams are relevant, fair, unbiased and reliable, and that they measure relevant skills
· A university degree in a foreign language qualifies one to examine language competence, despite a lack of training in language testing
· In many circumstances, merely being a native speaker qualifies one to assess language competence
· Teachers assess students’ ability without having been trained

First tradition
· Teacher-centred
· School/university-based assessment
· Teacher develops the questions
· Teacher’s opinion the only one that counts
· Teacher-examiners have no explicit marking criteria
· Assumption that, by virtue of being a teacher and having taught the student being examined, the teacher-examiner makes reliable and valid judgements
· Authority, professionalism, reliability and validity of teacher rarely questioned
· Rare for students to fail

Second tradition
· Tests externally developed and administered
· National or regional agencies responsible for development, following accepted standards
· Tests centrally constructed, piloted and revised
· Difficulty levels empirically determined
· Externally trained assessors
· Empirical equating to known standards or levels of proficiency

Basic parameters
· Validity
· Reliability
· Practicality
· Authenticity
· Washback
· Impact
· Currency

“Validity in general refers to the appropriateness of a given test or any of its component parts as a measure of what it is purported to measure. A test is said to be valid to the extent that it measures what it is supposed to measure. It follows that the term valid when used to describe a test should usually be accompanied by the preposition for. Any test may then be valid for some purposes, but not for others.” (Henning, 1987)

Validity
· Rational, empirical, construct
· Internal and external validity
· Face, content, construct
· Concurrent, predictive
· Construct

How can validity be established?
· My parents think the test looks good.
· The test measures what I have been taught.
· My teachers tell me that the test is communicative and authentic.
· If I take the Rigo utca test instead of the FCE, I will get the same result.
· I got a good English test result, and I had no difficulty studying in English at university.

How can validity be established?
· Does the test look valid to the general public?
· Does the test match the curriculum, or its specifications?
· Is the test based adequately on a relevant and acceptable theory?

How can validity be established?
· Does the test yield results similar to those from a test known to be valid for the same audience and purpose?
· Does the test predict a learner’s future achievements?
· Note: a test that is not reliable cannot, by definition, be valid.
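The note that an unreliable test cannot be valid has a classical quantitative form in test theory: a validity (correlation) coefficient can never exceed the square root of the product of the reliabilities of the test and of the criterion measure. A minimal sketch in Python; the reliability figures below are invented for illustration:

```python
from math import sqrt

def validity_ceiling(rel_test, rel_criterion):
    """Upper bound on a validity coefficient, given the reliability of
    the test and of the criterion measure (classical test theory)."""
    return sqrt(rel_test * rel_criterion)

# A test with reliability 0.49 can correlate at most ~0.66 with even a
# highly reliable (0.90) criterion measure:
ceiling = validity_ceiling(0.49, 0.90)
```

In other words, improving reliability is a precondition for being able to demonstrate validity at all.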

How can validity be established?
· A test’s items should work well: they should be of suitable difficulty, and good students should get them right, whilst weak students are expected to get them wrong.
· All tests should be piloted, and the results analysed to see whether the test performed as predicted.
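The expectation that strong candidates answer an item correctly while weak candidates do not can be checked during piloting with two classical item statistics: the facility (difficulty) value and the discrimination index. A minimal sketch in Python, assuming a simple matrix of 0/1 item responses (the function name and data layout are illustrative, not from the original talk):

```python
def item_analysis(responses):
    """Classical item analysis on a matrix of 0/1 responses.

    responses: one list per candidate, with a 0/1 entry per item.
    Returns (facility, discrimination) lists, one value per item.
    """
    n_items = len(responses[0])
    # Rank candidates by total score, then compare top and bottom thirds.
    ranked = sorted(responses, key=sum, reverse=True)
    third = max(1, len(ranked) // 3)
    top, bottom = ranked[:third], ranked[-third:]

    facility = []        # proportion of all candidates answering correctly
    discrimination = []  # top-group facility minus bottom-group facility
    for i in range(n_items):
        facility.append(sum(r[i] for r in responses) / len(responses))
        p_top = sum(r[i] for r in top) / len(top)
        p_bottom = sum(r[i] for r in bottom) / len(bottom)
        discrimination.append(p_top - p_bottom)
    return facility, discrimination
```

Items whose discrimination index is near zero, or negative, fail the prediction above and are candidates for revision or removal before the live test.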

Factors affecting validity
· Unclear or non-existent theory
· Lack of specifications
· Lack of training of item/test writers
· Lack of, or unclear, criteria for marking
· Lack of piloting/pre-testing
· Lack of detailed analysis of items/tasks
· Lack of standard setting to the CEF
· Lack of feedback to candidates and teachers

Reliability
· If I take the test again tomorrow, will I get the same result?
· If I take a different version of the test, will I get the same result?
· If the test had had different items, would I have got the same result?
· Do all markers agree on the mark I got?
· If a marker marks my test again tomorrow, will I get the same result?

Reliability
· Over time: test–retest
· Over different forms: parallel
· Over different samples: homogeneity
· Over different markers: inter-rater
· Within one rater over time: intra-rater
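In practice, most of these reliability estimates reduce to a correlation between two sets of scores from the same candidates: two occasions (test–retest), two forms (parallel), or two markers (inter-rater). A minimal Pearson-correlation sketch in Python; the rater scores are invented for illustration:

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# The same five candidates marked by two raters; a coefficient close
# to 1.0 indicates the raters rank and score candidates consistently.
rater_1 = [12, 15, 9, 18, 14]
rater_2 = [11, 16, 10, 17, 13]
inter_rater = pearson(rater_1, rater_2)
```

The same routine applied to two administrations of the test estimates test–retest reliability, and applied to two forms estimates parallel-forms reliability.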

Factors affecting reliability
· Poor administration conditions – noise, lighting, cheating
· Lack of information beforehand
· Lack of specifications
· Lack of marker training
· Lack of standardisation
· Lack of monitoring

Practicality
· Number of tests to be produced
· Length of test in time
· Cost of test
· Cost of training
· Cost of monitoring
· Difficulty in piloting/pre-testing
· Time to report results

Factors affecting practicality
· Awareness of complexity and cost
· Time to do the job: ‘quick and dirty’ remains dirty
· Funding to support development, monitoring and further development
· Recognition of the need for training – of testers and of teachers

Authenticity
· Genuineness of text
· Naturalness of task
· Naturalness of learners’ response
· Suitability of test for purpose
· Match of test to learners’ needs (if known)
· Face validity
· Expectations of stakeholders and culture

Factors affecting ‘authenticity’
· A test is a test is a test
· Availability of resources
· Training of test developers/item writers
· Relative importance of reliability over validity
· Purpose of test: proficiency versus progress or diagnosis

Washback
· Test can have positive or negative effects
· Test can affect content of teaching
· Test can affect method of teaching
· Test can affect attitudes and motivation
· Test can affect all teachers and students in the same way, or individuals differently
· Importance of test will affect washback

Factors affecting washback
· Extent to which teachers know the nature of the test
· Extent to which teachers understand the rationale of the test
· Extent to which teachers consider how best to prepare learners for the test
· Nature of teachers’ beliefs about teaching
· Effort teachers are willing to make
· Difficulty of test

Impact
· Effect of test on society
· Effect of test on stakeholders: employers, higher education, parents, politicians
· Intended and unintended
· Beneficial or detrimental

Factors affecting impact
· Extent to which purpose of test is understood and accepted
· Currency of test
· Face validity of test
· Stakes of test
· Availability of information
· Education of stakeholders regarding the complexity of testing

Currency of test
· Extent to which the test is valued by stakeholders
· Different stakeholders may have different perspectives: university vs employer, parents vs teachers, teachers vs principals, politicians vs professionals

Factors affecting currency
· Consequences of passing or failing – stakes
· Extent to which stakeholders take results seriously into consideration
· Beliefs about the value of tests in general
· Extent to which the test matches expectations about tests in general, or language tests in particular
· Difficulty of test
· Institution offering the test

General Issues
· Teacher-based assessment vs central quality control
· Internal vs external assessment
· Quality control of exams (and the associated cost)
· Piloting and pre-testing
· Test analysis and the role of the expert
· The existence of test specifications
· Guidance and training for test developers and markers

General Issues (continued)
· Feedback to candidates
· Pass/fail rates
· The currency of the old and the new traditions
· The relationship with other languages and countries
· The standards of the local exams in terms of "Europe"

Constraints on testing
· Time – much less than for teaching
· Sample – inevitably limited
· Resources always limited – money, infrastructure, trained personnel
· Assessment culture/tradition
· Lack of awareness of problems and solutions

BUT WASHBACK
· Testing is too important to be left to the teacher
· Testing is too important to be left to the tester
· Both are needed, to reflect and influence teaching, validly and reliably

“Assessment is central to language learning, in order to establish where learners are at present, what level they have achieved, to give learners feedback on their learning, to diagnose their needs for further development, and to enable the planning of curricula, materials and activities.”