1 A Century of Testing: Ideas on Solving Enduring Accountability and Assessment Problems UCLA, Los Angeles 8-9 September 2005 Barry McGaw Director for.


1 The 2005 CRESST Conference: Celebrating 20 years of Research on Educational Measurement
A Century of Testing: Ideas on Solving Enduring Accountability and Assessment Problems
UCLA, Los Angeles, 8-9 September 2005
Barry McGaw, Director for Education, Organisation for Economic Co-operation and Development

2 Where to focus…

3 …so much has happened…
- Advancing the link to teaching and learning
- Refining system monitoring
  - effectiveness (quality)
  - efficiency (value for money)
  - equity
- Taking an international perspective
  - IEA surveys such as TIMSS, PIRLS
  - OECD Programme for International Student Assessment (PISA)
    - different national slopes of achievement on social background
    - Google on PISA
- Somewhere else? (given what else was on the programme)

4 One key problem to be resolved

5 Point of reference for judging individuals
- Abandoning hope of an external measure
  - Psychophysics
    - comparing judgements (such as brightness of light) with a measure
    - requiring judgements of differences, not absolute values
  - Psychological phenomena
    - developed in the context of differential psychology
    - individual performance judged in relation to others’ performance
    - in particular, in relation to the average performance of others
    - norm-referenced (Want to look better? Choose other company.)
- In search of an external criterion
  - Separating scale construction and measurement
    - Thurstone
    - criterion-referenced measurement
  - Simultaneous scale construction and measurement
    - item-response models (person-response-to-item models)
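The contrast on this slide can be sketched in a few lines of Python. This is an illustration only, not anything shown in the presentation: the function names, the two cohorts and the cut score of 50 are all hypothetical, and the Rasch model is used here as one common example of an item-response model, not necessarily the one the speaker had in mind.

```python
import math


def percentile_rank(score, cohort):
    """Norm-referenced view: percentage of the cohort scoring strictly below `score`."""
    below = sum(1 for s in cohort if s < score)
    return 100.0 * below / len(cohort)


def meets_criterion(score, cut_score):
    """Criterion-referenced view: compare the score with a fixed cut score."""
    return score >= cut_score


def rasch_probability(ability, difficulty):
    """Rasch model (assumed example of an item-response model): probability that
    a person of given ability answers an item of given difficulty correctly.
    Person and item are located on the same scale."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


# Hypothetical cohorts: the same score looks weak in one and strong in the other.
strong_cohort = [55, 62, 70, 78, 85, 91]
weak_cohort = [30, 38, 45, 52, 58, 61]
score = 60

rank_in_strong = percentile_rank(score, strong_cohort)
rank_in_weak = percentile_rank(score, weak_cohort)
passes = meets_criterion(score, cut_score=50)  # hypothetical cut score
```

With these made-up numbers the same score of 60 ranks low in the strong cohort and high in the weak one ("choose other company"), while the criterion-referenced judgement does not depend on the cohort at all.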

6 Application in a high-stakes arena

7 Public examinations
- High-stakes assessments based on curriculum
  - secondary certification and university entrance
  - selection for highly competitive courses (top 1½ per cent)
  - need a common curriculum across schools
- The comparability-over-time problem…
  - Grade distributions used to monitor standards
    - failure rate used as a measure of ‘standards’
    - claim that if participation rates grow, grades should decline to ensure that an ‘A’ is still an ‘A’, etc.
    - do enough students fail?
  - Criterion (standards) and norm (cohort)-referencing
    - ‘standards’ were never absent (in curriculum, examination)
    - ‘standards’ were ignored in the norm-based award of results
    - cannot use link items over time, because the whole test must become public
    - marrying criterion- and norm-referencing with judgments

8 Marrying criterion- and norm-referencing
- England
  - use of criteria defined for some grade boundaries
  - review of previous years’ scripts at grade boundaries
  - reference to prior grade distributions
  - reference to evidence of change in student cohort to justify shifts in grade distributions between years
- Australia (New South Wales)
  - development of band descriptors
  - ‘consistent’ definition of bands over years
  - reporting with norm- and criterion-referencing

9 The Suite of Documents

10
- All HSC courses listed with Assessment Mark, Examination Mark, HSC Mark and Performance Band
- All Preliminary courses listed

11
- Descriptions in bands: summary of what students know and can do
- Minimum standard expected (50)
- Graph of distribution of results to show how all students performed
- Student’s HSC Mark
- Mark Range 0–100
- Examination Mark
- School Assessment Mark
- Number of candidates

12 How they got there…
- Review and recommendations for change
  - New NSW Higher School Certificate
    - McGaw, B. (1997). Shaping their future: Recommendations for reform of the Higher School Certificate. Sydney: Department of Training and Education Co-ordination
  - Scaling process
    - standards-referencing to curriculum and over time
    - Bennett, J. (2001). Standards-setting and the NSW Higher School Certificate. www.boardofstudies.nsw.edu.au/manuals/pdf_doc/bennett.pdf
- Developing grade descriptors
  - Used past examinations
    - experienced examiners for each subject
    - reviewed examination papers and students’ marked papers
  - Developing band descriptors
    - described performance for Bands 6 to 2; low Band 1 not described

13 Using grade descriptors
- Stage 1
  - examiners independently form ‘image of band’
  - set cut mark for each band boundary on each question
- Stage 2
  - examiners work together to reach agreement on boundary locations for bands on each question
  - boundary locations for total scores also established
- Stage 3
  - Student work at boundaries on total scores reviewed
  - Cut points reviewed and determined
  - Boundaries located on mark scale
    - 5/6 boundary set to 90
    - 4/5 boundary set to 80
    - …
    - 1/2 boundary set to 50
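The last step of Stage 3, locating the agreed boundaries on the mark scale, can be sketched as a mapping from raw examination totals to reported marks. This is a minimal sketch under stated assumptions, not the Board's documented procedure: the slide gives only the 5/6, 4/5 and 1/2 boundaries, so the 60 and 70 used below for the elided boundaries are assumed, the raw cut marks are invented, and linear interpolation between boundaries is an assumed way of placing the marks in between.

```python
from bisect import bisect_right

# Reported marks at the band boundaries. 50, 80 and 90 come from the slide;
# 60 and 70 are ASSUMED for the boundaries the slide elides with "…".
REPORTED_POINTS = [0, 50, 60, 70, 80, 90, 100]


def align_mark(raw, raw_cuts, max_raw=100):
    """Map a raw total to the reported scale by linear interpolation (an assumption)
    between the examiners' cut marks for the 1/2, 2/3, 3/4, 4/5 and 5/6 boundaries."""
    if not 0 <= raw <= max_raw:
        raise ValueError("raw mark out of range")
    raw_points = [0] + list(raw_cuts) + [max_raw]
    # Index of the segment containing `raw`; the top mark belongs to the last segment.
    i = min(bisect_right(raw_points, raw) - 1, len(raw_points) - 2)
    lo_raw, hi_raw = raw_points[i], raw_points[i + 1]
    lo_rep, hi_rep = REPORTED_POINTS[i], REPORTED_POINTS[i + 1]
    return lo_rep + (raw - lo_raw) * (hi_rep - lo_rep) / (hi_raw - lo_raw)


def band(reported):
    """Performance band for a reported mark: Band 1 below 50, Band 6 from 90 up."""
    if reported < 50:
        return 1
    return min(6, 2 + int((reported - 50) // 10))


# Hypothetical raw cut marks agreed by examiners for one paper.
raw_cuts = [42, 55, 66, 76, 87]

at_pass_boundary = align_mark(42, raw_cuts)  # raw 1/2 cut maps to reported 50
at_top_boundary = align_mark(87, raw_cuts)   # raw 5/6 cut maps to reported 90
mid_band_five = align_mark(81, raw_cuts)     # falls between reported 80 and 90
```

Under these assumptions a paper with harder questions would simply have lower raw cut marks, while the reported boundaries stay fixed from year to year.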

14 but it does not always change debate…

15 Debate isn’t always changed
- Federal Minister
  - found an English paper awarded a pass despite some inadequate expression within it
  - concluded too few students were being failed
- Nature of debate
  - became again a debate about desirable failure rates
  - important for such debates to be reconstructed as a debate about the nature of performance judged inadequate

16 Thank you
OECD education website: www.oecd.org/edu
Contact: Barry.McGaw@oecd.org

