How could training of language examiners be related to the Common European Framework? A case study based on the experience of the Hungarian Examinations.


1 How could training of language examiners be related to the Common European Framework? A case study based on the experience of the Hungarian Examinations Reform Teacher Support Project of the British Council. Ildikó Csépes, University of Debrecen, Hungary. Inaugural Conference of EALTA, Kranjska Gora, Slovenia, 14-16 May 2004.

2 Assessing speaking skills → subjective assessment. The Common European Framework of Reference for Languages (2001, p. 188): subjectivity in assessment can be reduced, and validity and reliability thus increased, by taking steps such as: adopting standard procedures governing how the assessments should be carried out (Guideline 1); basing judgements in direct tests on specific defined criteria (Guideline 2); using pooled judgements to rate performances (Guideline 3); undertaking appropriate training in relation to assessment guidelines (Guideline 4).

3 The training of language examiners has become an important issue in English language education in Europe. There is an increased interest in QUALITY CONTROL.

4 In this presentation, some important aspects of quality control will be highlighted in relation to training oral examiners: the use of an Interlocutor Frame to conduct the speaking exam; the role of benchmarking in assessor training.

5 A set of suggested training procedures for oral examiner training will also be presented. The model speaking examination and the interlocutor/assessor training model have been developed and piloted by the Hungarian Examinations Reform Teacher Support Project of the British Council. The original aim of the Project was to develop a new English school-leaving examination; at present only a model exam and related training courses are available. The training model can easily be adapted to other contexts.

6 According to the CEF (Guideline 1), standard procedures should be adopted to carry out the assessments. One way of standardising the elicitation of oral performances is to use an Interlocutor Frame, which helps to conduct the exam in a standard manner, following a standard procedure.

7 The Interlocutor Frame developed by the Project describes in detail how the exam should be conducted and gives standardised wording for: beginning the examination; giving instructions; providing transitions from one part of the examination to the next; intervening; rounding off the examination.

8 Overview of the Model Speaking Examination

9 Part 1: focuses on candidates' general interactional skills and ability to use English for social purposes. Part 2: candidates demonstrate their ability to produce transactional long turns by comparing and contrasting visual prompts, and to answer scripted supplementary questions asked by the interlocutor. In Parts 1 and 2, the interlocutor's contributions (questions and instructions) are carefully guided and described in as much detail as possible in the Interlocutor Frame.

10 Part 3: candidates produce both transactional and interactional short turns The interlocutor and the candidate interact with each other in order to reach a decision about a problem that is posed by the interlocutor. The candidate has a small number of prompts to work with while the interlocutor has specific guidelines for contributing to the exchange. In Part 3, the interlocutor’s contributions are also carefully guided but the interlocutor has more freedom to express him or herself when participating in the simulated discussion task.

11 According to the CEF (Guideline 2), assessors' judgements should be based on specific defined criteria. Performances are rated by the assessor according to set criteria: communicative impact; grammar and coherence; vocabulary; sound, stress and intonation.

12 The Analytic Rating Scale consists of 8 bands: 5 of these bands (0, 1, 3, 5, 7) are defined by band descriptors; 3 of them (2, 4, 6) are empty bands, provided for evaluating performances which are better than the level below but worse than the level above.
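The band structure described above can be sketched as a simple lookup. This is only an illustration of the defined/empty alternation; the descriptor texts below are invented placeholders, not the Project's actual scale wording:

```python
# Sketch of the 8-band analytic scale: bands 0, 1, 3, 5, 7 carry
# descriptors; bands 2, 4, 6 are deliberately empty and are used for
# performances falling between two described levels.
# Descriptor texts here are illustrative placeholders only.
DESCRIBED = {
    0: "no rateable language",
    1: "placeholder descriptor for band 1",
    3: "placeholder descriptor for band 3",
    5: "placeholder descriptor for band 5",
    7: "placeholder descriptor for band 7",
}
EMPTY_BANDS = {2, 4, 6}

def describe_band(band: int) -> str:
    """Return the descriptor for a band, or explain an empty band."""
    if band in DESCRIBED:
        return DESCRIBED[band]
    if band in EMPTY_BANDS:
        return (f"between band {band - 1} and band {band + 1}: better than "
                f"the level below, worse than the level above")
    raise ValueError("band must be an integer from 0 to 7")

print(describe_band(4))
```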

13 According to the CEF (Guideline 3), pooled judgements should be used to rate performances. Pooled judgements are represented as benchmarks for sample performances. Benchmarked performances can enhance and ensure the reliability of subjective marking, and can illustrate band descriptors (different levels of achievement). Without benchmarked performances, assessors may interpret candidates' performances in their own terms.

14 The benchmarking procedures were designed by Charles Alderson (the advisor of the Project). In the Hungarian context they consisted of four main phases: 1. selecting sample performances and judges; 2. home marking by judges; 3. live benchmarking; 4. editing and standardising justifications.

15 Phase 1: Selecting Sample Performances and Judges. The Assessor-Interlocutor Training Team members selected a wide range of oral performance samples (12) that had been videoed during pilot examinations; these were subsequently used for the benchmarking exercise. 15 experts were invited, who were thought to have particular expertise in and experience of the assessment of oral performances in English, at both secondary and tertiary level, and who were expected to have some familiarity with the CEF.

16 Phase 2: Home Marking. Judges were asked to: study the documents of the Benchmarking Pack carefully; view the videoed performances on tape and mark them according to the appropriate rating scale, using the mark sheets provided; view the videos again once all performances had been marked, and make any necessary adjustments to the marks.

17 Phase 2: Home Marking (continued). Judges were also asked to: note down any features of each performance that justified the mark for each criterion, always referring to the band descriptors in the scale; make a list of examples of candidate language, which would contribute to the final list to be compiled after the benchmarking exercise and used for training assessors in the future.

18 Mark sheets and notes were sent in electronic format to Györgyi Együd, the coordinator of the benchmarking exercise, who collated all the marks and notes for each performance sample and assigned an ID number to each judge. For each candidate this produced a table of results by criterion and judge, and a table of justifications.

19 Phase 3: The Live Benchmarking. STEP 1: Judges viewed and marked each video again without the notes they had made previously. However, they were encouraged to take notes and underline the relevant aspects of the scales that led them to their decisions. STEP 2: Judges were asked to reveal their marks after each video sample.

20 Phase 3: The Live Benchmarking (continued). STEP 3: Judges looked at the table of marks given in the preparation phase together with the collated justifications; in the meantime the marks were recorded so that first and second marks could be compared (intra-rater reliability). STEP 4: The candidate's performance was then discussed with reference to the justifications and the current rating session.

21 Phase 3: The Live Benchmarking (continued). STEP 5: Judges voted for the final benchmarks. STEP 6: The individual mark sheets were handed in for central recording after the performance sample had been benchmarked. STEP 7: Judges discussed major and minor errors in relation to the benchmarked performance.
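The pooling of the judges' marks into a final benchmark (Step 5) can be sketched with a simple majority rule. The Project used a live vote; the marks below and the choice of the mode as the aggregate are illustrative assumptions only:

```python
# Pooled-judgement sketch: 15 judges' marks for one performance on one
# criterion of the 0-7 analytic scale (invented data).
from statistics import mode, median

judge_marks = [5, 5, 4, 5, 6, 5, 5, 4, 5, 6, 5, 5, 4, 5, 5]

# The most frequent mark stands in for the voted benchmark; the median
# is printed alongside as a check that the distribution is not skewed.
benchmark = mode(judge_marks)
print(f"pooled benchmark = {benchmark}, median = {median(judge_marks)}")
```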

22 The main purpose of the benchmarking workshop: to reach agreement on grades using the Project's scales. Relating the performances to the Common European Framework could only be a supplementary exercise. For this purpose the 9-point scale (Overall Spoken Interaction) on page 74 of the Framework was used: after each video sample, judges had to indicate which of the 9 levels best described the candidate.

23 Phase 4: Editing and Standardising Justifications. Reasons for editing and standardising the justifications: the justifications had to be worded in harmony with the wording of the Speaking Assessment Scales as much as possible in order to make the assessor training more effective. Participants seemed more ready to accept the benchmarks when they saw that the justifications used the same terms (printed in bold) as the band descriptors in the scales.

24 Phase 4: Editing and Standardising Justifications (continued). The examples of minor and major mistakes, included in the justifications for support and illustration, had to be selected from the list of examples of candidate language that had been agreed on by all the judges. The justifications and notes produced by the individual expert judges in the home-marking phase varied considerably in both content and format, so they had to be collated and standardised in terms of layout in order to produce the final justifications for each candidate.

25 The Use of Benchmarked Performances in the Training of Assessors. The benchmarks and justifications produced by the judges in the benchmarking sessions are used to support the pre-course tasks and the face-to-face assessor training course. Benchmarked performance samples illustrate candidate performance at different levels of the scales.

26 The Use of Benchmarked Performances in the Training of Assessors When the wording of the assessment scales contains expressions such as ‘major and minor mistakes’, or ‘wide and limited range of vocabulary’, only benchmarked performance samples on video together with standardised, written justifications can help future assessors to come to an agreement about what level of performance the band descriptors actually refer to.

27 In the face-to-face training phase, the benchmarks and justifications are revealed to course participants in different ways at different stages of the training.

28 Stage 1 (diagram): Step 1: individual assessor's decision; Step 2: justifications; Step 3: benchmarks.

29 Stage 2 (diagram): Step 1: individual assessor's decision; Step 2: group decision; Step 3: justifications; Step 4: benchmarks.

30 Stage 3 (diagram): Step 1: individual assessor's decision; Step 2: group decision (revealed); Step 3: justifications; Step 4: benchmarks.

31 Stage 4 (diagram): Step 1: individual assessor's decision (revealed) + taking notes; Step 2: groups write justifications; Step 3: justifications; Step 4: benchmarks.

32 A Colour-coded Overview of the Techniques (diagram).

33 According to the CEF (Guideline 4), future oral examiners should undertake appropriate training. The training procedures developed by the Project have the following aims: to provide participants with sufficient information about the model speaking examination they are going to be trained for (outline, task types, mode); to familiarise participants with standard interlocutor behaviour; to familiarise participants with the main principles and procedures of assessing speaking performances.

34 Further aims: to introduce the idea and practice of using analytic rating scales for assessing oral performances; to enable participants to develop the necessary interlocuting and assessing skills; to ensure valid and reliable assessment of live performances through standardisation; to equip trainees with transferable skills (there is a special need for this in Hungary).

35 The Outline of the Training Model. Stage 1: pre-course distance learning: self-study of an Introductory Training Pack with a pre-course video; accomplishing the pre-course tasks (analysing and marking sample video performances).

36 The Introductory Training Pack contains: an overview of the speaking examination; guidelines for interlocutor behaviour; guidelines for assessor behaviour; pre-course tasks; self-assessment questions; appendices (e.g. Benchmarks & Justifications for the Sample Speaking Tests, Examples of Candidate Language, CEF Scales, Glossary).

37 The Outline of the Training Model. Stage 2A: live interlocutor training course (a series of workshop sessions, Day 1): discussing the experiences of the distance phase; analysing video samples of both standard and non-standard interlocutor behaviour; standardisation of the administration procedure through simulated examination situations (role plays).

38 The Outline of the Training Model. Stage 2B: live assessor training course (a series of workshop sessions, Day 2): discussing the experiences of the distance phase; introduction to assessing oral performances: modes and techniques of assessment; familiarisation with the analytic rating scale; standardisation of the assessment procedure; comparing performances at different levels.

39 The Outline of the Training Model. Stage 3: a distance phase: practical application of the acquired skills in mock speaking tests. Participants do the mock exams in co-operation with another course participant, so each takes the role of both interlocutor and assessor. They can observe each other and share their experiences, and they have to report on their experience in detail.

40 Sample Materials from the Interlocutor Training Model. Sample 1: analysing non-standard interlocutor behaviour. After seeing and discussing standard interlocutor behaviour, participants are asked to compare it with non-standard performances. They have to identify instances where the interlocutor's behaviour deviates from the Interlocutor Frame and the suggested guidelines.

41 Sample Materials from the Interlocutor Training Model. Sample 2: simulating difficult examination situations. Participants role-play difficult examination situations in groups of three: an observer, the candidate, and the interlocutor. For each part of the model speaking exam there are three role-play tasks, so all participants will have experienced all three roles by the end of the training.

42 Role-play Cards for Part 1 (The Interview). Candidate: You are a shy, not very talkative candidate who tends to wait for guiding questions. You often reply with one or two short sentences only. Interlocutor: You are the interlocutor who asks the questions in the first part of the speaking test. You have to elicit as much speech from the candidate as possible. Please remember to ask the questions listed in the Interlocutor Frame.

43 Conclusions. It is impossible to become a trained interlocutor and assessor without formal training. Training should involve both distance and face-to-face elements to ensure that future interlocutors and assessors go through each and every phase of the difficult and complex standardisation process. One training course is not enough: only further practice and monitoring of interlocutor and assessor behaviour can ensure that candidates' speaking ability is assessed in a standard manner and that the assessments are valid and reliable.

44 INTO EUROPE. Series Editor: J. Charles Alderson. The Speaking Handbook, by Ildikó Csépes & Györgyi Együd. The Handbook is accompanied by a 5-hour DVD. Published by Teleki László Foundation & The British Council. Distributor: Libro Trade. Info: books@librotrade.hu. Email: icsepes@delfin.unideb.hu

