
1 Using the IRT and Many-Facet Rasch Analysis for Test Improvement
“ALIGNING TRAINING AND TESTING IN SUPPORT OF INTEROPERABILITY”
Desislava Dimitrova, Dimitar Atanasov
New Bulgarian University
BILC Seminar, 10-15 October 2010, Varna

2 Outline
- Examination procedure
- Main concepts and observations
- Socio-cognitive test validation framework (Cyril Weir, 2005) and criteria
- Scoring validity for the listening and reading parts of the test
- Scoring validity for the essay

3 Test structure
1. Listening paper: two tasks
- 15 MCQ items
2. Reading paper: five tasks
- 6 items, matching response format
- 10 items, banked-cloze response format
- 10 items, open-cloze response format
- 16 items, short-answer response format
- 2 open-ended questions
- 5 MCQ items
3. Essay: 180-220 words

4 Too much?
- The concept of communicative language ability (CEFR)
- The concept of test usefulness (Bachman)
- The concept of justifying the use of language assessments in the real world (Bachman)
- The concept of validity
- The Code of Practice (ALTE*, for example)
* Association of Language Testers in Europe

5 Statements
- The NBU exam is high-stakes.
- The NBU exam is criterion-oriented.
- The NBU exam is ‘independent’.
- Evidence for test validation had not been established, BUT there was a routine practice for test development and test administration.

6 The socio-cognitive framework for test validation (Cyril Weir, 2005)
Test takers’ characteristics and:
- Context validity
- Theory-based validity
- Scoring validity
- Consequential validity
- Criterion-related validity

7 “Before the test event”:
- Context validity
- Theory-based validity
“After the test event”:
- Scoring validity
- Consequential validity
- Criterion-related validity

8 Scoring validity for the listening and reading parts of the test is established by:
- Item analysis
- Internal consistency
- Error of measurement
- Marker reliability
Not just looking at them! Investigate, discuss, learn and make decisions!
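As a minimal sketch of two of these checks, internal consistency (Cronbach’s alpha) and the standard error of measurement can be computed from a matrix of scored item responses. The data below are invented for illustration and are not from the NBU exam:

```python
import math

# Hypothetical dichotomous responses (1 = correct) for 6 examinees
# on a 5-item section; illustrative data only.
responses = [
    [1, 1, 0, 1, 1],
    [1, 0, 0, 1, 0],
    [0, 1, 1, 1, 1],
    [1, 1, 1, 1, 1],
    [0, 0, 0, 1, 0],
    [1, 1, 0, 0, 1],
]

def variance(xs):
    """Sample variance (n - 1 in the denominator)."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def cronbach_alpha(data):
    """Internal-consistency estimate from item and total-score variances."""
    k = len(data[0])  # number of items
    item_vars = [variance([row[i] for row in data]) for i in range(k)]
    total_scores = [sum(row) for row in data]
    return (k / (k - 1)) * (1 - sum(item_vars) / variance(total_scores))

alpha = cronbach_alpha(responses)
totals = [sum(r) for r in responses]
# Standard error of measurement: SD of total scores times sqrt(1 - alpha)
sem = math.sqrt(variance(totals)) * math.sqrt(1 - alpha)
print(f"alpha = {alpha:.2f}, SEM = {sem:.2f}")
```

A low alpha or a large SEM for a section would then be investigated and discussed, as the slide urges, rather than merely reported.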

9 Analysis: 3-parameter IRT model
Advantages:
- Item parameter estimates are independent of the group of examinees used
- Test-taker ability estimates are independent of the particular set of items used
- Degree-of-difficulty and discrimination estimates help to specify the content
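The 3-parameter logistic model named above can be sketched in a few lines; the parameter values here are illustrative, not the actual NBU item estimates:

```python
import math

def p_correct(theta, a, b, c):
    """3-parameter logistic IRT model: probability that an examinee of
    ability theta answers an item correctly, given discrimination a,
    difficulty b, and pseudo-guessing (lower asymptote) c."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

# Illustrative parameters: an item of moderate difficulty on a
# 4-option MCQ, so guessing is taken as roughly 0.25.
for theta in (-2.0, 0.0, 2.0):
    p = p_correct(theta, a=1.2, b=0.5, c=0.25)
    print(f"theta={theta:+.1f}  P(correct)={p:.2f}")
```

At theta = b the probability is exactly halfway between the guessing floor c and 1, which is why b is read as the item’s difficulty on the ability scale.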

10 [figure]

11 [figure]

12 Summer session, 2010

13 Item difficulty values by test version

Item  Version 1  Version 2  Version 3  Version 4
  1     -1.7       -1.2        1.6       -0.7
  2     -1.5       -1.2        1.9       -2.2
  3     -1.7       -2.9        2.6       -0.4
  4     -0.5       -2.4       -0.9       -0.2
  5     -3.0       -0.1        2.6       -1.4
  6     -0.7       -0.1       -0.3       -0.2
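A quick check of version comparability is to compare the mean difficulty per version, using the values from the table above; on these estimates, Version 3 stands out as markedly harder than the other three:

```python
# Item difficulty estimates per test version (values from the table above).
difficulties = {
    "Version 1": [-1.7, -1.5, -1.7, -0.5, -3.0, -0.7],
    "Version 2": [-1.2, -1.2, -2.9, -2.4, -0.1, -0.1],
    "Version 3": [1.6, 1.9, 2.6, -0.9, 2.6, -0.3],
    "Version 4": [-0.7, -2.2, -0.4, -0.2, -1.4, -0.2],
}

# Mean difficulty per version; a large gap between versions suggests
# the forms are not interchangeable without equating.
means = {v: sum(b) / len(b) for v, b in difficulties.items()}
for version, mean_b in means.items():
    print(f"{version}: mean difficulty = {mean_b:+.2f}")
```

Such a gap would feed directly into the “possible decisions” on the next slide: remedial procedures or restricting how scores from the harder version are used.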

14 Possible decisions
- Remedial procedures
- Classroom assessment
- Only a certification decision

15 Scoring validity for writing is established by:
- Criteria / rating scale
- Rating procedures:
  - Rater training
  - Standardization
  - Rating conditions
  - Rating
  - Moderation
- Statistical analysis
- Raters
- Grading

16 [figure]

17 Conclusions for the essay
Good:
- Two raters
- Analytic writing scale
- Rubrics and input
Negative:
- The score depends on the raters
- No task-specific scale
- No standardization

18 It is now a fact that we will continue our work on:
- item writers’ training
- content and statistical specification of the items
- test review and test revision

19 Sharing:
- Investigation (small steps towards “strong” validity)
- Comparison (language ability of the same population at the same level)
- Cooperation (in research projects)

20 Thank you New Bulgarian University www.nbu.bg

