Choosing appropriate summative tests. Presented by Philip Holmes-Smith School Research Evaluation and Measurement Services.

1 Choosing appropriate summative tests. Presented by Philip Holmes-Smith School Research Evaluation and Measurement Services

2 Overview of this module: Choosing Appropriate Summative Tests. The reliability of summative (standardised) tests. Choosing appropriate summative tests. When should you administer summative tests?

3 The Reliability of Summative Tests

4 Three Questions 1. Do you believe that your students’ NAPLAN and/or On-Demand and/or PAT results accurately reflect their level of performance?

5 Three Questions 1. Do you believe that your students’ NAPLAN and/or On-Demand and/or PAT results accurately reflect their level of performance? 2. If we acknowledge that the odd student will have a lucky guessing day or a horror day, what about the majority? – Do your weakest students usually receive low scores? – Do your average students usually receive scores at about the expected level? – Do your best students usually receive high scores?

6 Three Questions 1. Do you believe that your students’ NAPLAN and/or On-Demand and/or PAT results accurately reflect their level of performance? 2. If we acknowledge that the odd student will have a lucky guessing day or a horror day, what about the majority? – Do your weakest students usually receive low scores? – Do your average students usually receive scores at about the expected level? – Do your best students usually receive high scores? 3. However, think about your students who received high and low scores: – Are your low scores too low? (i.e. indicatively correct but too low) – Are your high scores too high? (i.e. indicatively correct but too high)

7 Examples of high highs and low lows. Is this reading score reliable? This high is probably too high. Is this reading score reliable? This low is probably too low.

8 Item difficulties for a typical test (A test pitched at average year level standard does not have enough easy or hard questions to reliably or accurately reflect low or high scores)

9 Summary Statements about Scores. Low scores (i.e. more than a year below expected) indicate poor performance, but the actual values should be considered as indicative only (i.e. such scores are associated with high levels of measurement error). High scores (i.e. more than a year above expected) indicate good performance, but the actual values should be considered as indicative only (i.e. such scores are associated with high levels of measurement error). Average scores indicate roughly expected levels of performance, and the actual values are more reliable (i.e. such scores are associated with lower levels of measurement error).
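The link between extreme scores and higher measurement error can be illustrated numerically. The transcript does not say which measurement model underlies these tests, so the sketch below assumes a Rasch model, in which the standard error of an ability estimate is 1 / sqrt(sum of p(1 - p)) over the items. The function name, the logit scale and the 35 evenly spread item difficulties are all hypothetical and chosen only for illustration.

```python
import math


def rasch_standard_error(ability, item_difficulties):
    """Standard error of an ability estimate under an assumed Rasch model.

    Ability and difficulties are in logits. Each item contributes
    p * (1 - p) of information, which is largest when the item's
    difficulty is close to the student's ability.
    """
    information = 0.0
    for difficulty in item_difficulties:
        p = 1.0 / (1.0 + math.exp(-(ability - difficulty)))
        information += p * (1.0 - p)
    return 1.0 / math.sqrt(information)


# Hypothetical 35-item test pitched at "average" level:
# difficulties spread evenly from -1 to +1 logits.
items = [-1.0 + 2.0 * i / 34 for i in range(35)]

# Compare a weak, an average and a strong student on the same test.
for ability in (-3.0, 0.0, 3.0):
    se = rasch_standard_error(ability, items)
    print(f"ability {ability:+.1f} logits: standard error {se:.2f}")
```

Under these assumptions the standard error is smallest for the student whose ability sits in the middle of the item difficulties and grows for students far below or above them, which is the pattern the summary statements describe.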

10 Summative (Standardised) Testing. Summative testing is essential to monitor the effectiveness of your teaching, but: – NAPLAN is not reliable for all students. Furthermore, if used incorrectly, the other summative tests you administer (e.g. On-Demand, PAT, etc.) may also be unreliable. – More importantly, if NAPLAN is the only summative data used in your school, you are not gathering enough information to monitor the effectiveness of your teaching at all year levels. What about Prep, Yr1, Yr2, Yr4, Yr6, Yr8 and Yr10? For example: Year 3 NAPLAN reflects the effectiveness of your Prep-Yr2 teaching, but what about the Prep teaching vs. the Yr1 teaching vs. the Yr2 teaching? Year 9 NAPLAN reflects the effectiveness of your Yr7-Yr8 teaching, but what about the Yr7 teaching vs. the Yr8 teaching?

11 Summative (Standardised) Testing. We need to maximise the reliability of the tests we use to monitor the effectiveness of our teaching (by better matching the difficulty of the items to the ability of the students). We need to choose appropriate summative tests to monitor the effectiveness of our teaching at all year levels from Prep – Yr10!

12 Choosing appropriate summative tests

13 Item Difficulties for Booklet 6 on the PAT-R (Comprehension) scale. For whom is this test most appropriate? Prep? Yr4? Yr10? Relative to the average item difficulty, the test is about right for the average Yr4 student, too easy for the average Yr10 student and too hard for the average Prep student.

14 Converting raw test scores to PAT-R (Comprehension) scale scores. A Yr10 student of ability 144 who answers every question correctly (35/35) would be falsely placed at ability (i.e. an unreliable high high). A Yr4 student of ability 120 who answers approximately half the questions correctly (18/35) would be accurately placed at ability. A Prep student of ability 79 who answers every question incorrectly (0/35) would be falsely placed at ability 67.4 (i.e. an unreliable low low).

15 Test difficulties of the PAT-R (Comprehension) Tests on the PAT-R score scale, together with Year Level mean scores

16 Item difficulties of the PAT-R (Comprehension) Tests on the PAT-R score scale, together with Year Level mean scores. Test Booklet 2 would be a good test to give to a typical Yr 1 student because its typical item difficulties are close to the ability level of typical Yr 1 students.

17 Different norm tables for different tests

18 Test difficulties of the PAT-Maths Tests on the PATM scale, together with Year Level mean scores (Year 1, Year 2, Year 3, Year 4, Year 5, Year 6&7, Year 8&9, Year 10). Which is the best test for an average Year 4 student? Source: ACER, 2006.

19 Test difficulties of the PAT-Maths Tests on the PATM scale, together with Year Level mean scores (Year 1, Year 2, Year 3, Year 4, Year 5, Year 6&7, Year 8&9, Year 10). The best test for an average Year 4 student is probably Test 5 (or perhaps Test 4). Source: ACER, 2006.
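The selection rule used on these slides, giving each student the test whose item difficulties sit closest to that student's ability, can be sketched in a few lines. This is only an illustration: the booklet names and mean difficulties below are hypothetical placeholders, not values from the ACER charts, and the function name is an assumption.

```python
def best_matched_test(student_ability, test_mean_difficulties):
    """Return the test whose mean item difficulty is closest to the
    student's ability, with both expressed on the same scale."""
    return min(
        test_mean_difficulties,
        key=lambda name: abs(test_mean_difficulties[name] - student_ability),
    )


# Hypothetical mean item difficulties (scale score units) for four booklets.
booklets = {
    "Booklet A": 35.0,
    "Booklet B": 42.0,
    "Booklet C": 49.0,
    "Booklet D": 56.0,
}

# Hypothetical student whose current scale score is about 48.
print(best_matched_test(48.0, booklets))
```

In practice the published charts and norm tables for each test should be used to read off the real difficulties and year-level means. The sketch only shows the "closest match" idea.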

20 Things to look for in a summative test. Needs to have a single developmental scale that shows increasing levels of achievement over all the year levels at your school. Needs to have “norms” or expected levels for each year level (e.g. the national “norm” for Yr 3 students on TORCH is an average of 34.7). Needs to be able to demonstrate growth from one year to the next (e.g. during Yr 4, the average student grows from a score of 34.7 in Yr 3 to an expected score of 41.4 in Yr 4 – that is 6.7 score points). As a bonus, the test could also provide diagnostic information.
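The growth arithmetic in the TORCH example can be written out as a short calculation. The two norm values (34.7 and 41.4) come from the slide. The function names and the individual student's scores are hypothetical and used only to show how a student's growth might be compared with the expected growth.

```python
def expected_growth(norm_start, norm_end):
    """Expected growth between two year-level norms on the same scale."""
    return norm_end - norm_start


def growth_relative_to_norm(score_start, score_end, norm_start, norm_end):
    """A student's growth minus the expected growth.

    A positive value means the student grew more than the norm group.
    """
    return (score_end - score_start) - expected_growth(norm_start, norm_end)


# TORCH norms quoted on the slide: Yr 3 average 34.7, Yr 4 average 41.4.
yr3_norm, yr4_norm = 34.7, 41.4
print(f"expected growth: {expected_growth(yr3_norm, yr4_norm):.1f} score points")

# Hypothetical student who scored 30.0 at the end of Yr 3 and 38.5 at the end of Yr 4.
extra = growth_relative_to_norm(30.0, 38.5, yr3_norm, yr4_norm)
print(f"growth relative to the norm: {extra:+.1f} score points")
```

This comparison is only meaningful because both scores sit on a single developmental scale, which is the first requirement listed on the slide.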

21 N.B. Don’t expect growth to be linear (growth in the early and later years is more rapid than in the middle years). TORCH norms: 10th, 50th and 90th percentiles.

22 My Recommended Summative Tests (Pen & Paper). Reading Comprehension: – Progressive Achievement Test - Reading (Comprehension) (PAT-R, 4th Edition) – TORCH (2nd Ed.) and TORCH plus. Mathematics: – Progressive Achievement Test - Mathematics (PAT-Maths, 3rd Edition) combined with I Can Do Maths. Spelling: – South Australian Spelling (use Test A and Test B alternately) – Single Word Spelling Test (SWST).

23 Selecting the correct PAT-R (Comprehension) Tests

24 Selecting the correct PAT-Math/ICDM Test

25 Selecting the correct TORCH Test

26 My Recommended Summative Tests (On-Line). On-Demand - Reading Comprehension: – The 30-item “On-Demand” Adaptive Reading test (Yr3 – Yr10). On-Demand - Spelling: – The 30-item “On-Demand” Adaptive Spelling test (Yr3 – Yr10). On-Demand - Writing Conventions: – The 30-item “On-Demand” Adaptive Writing Conventions test (Yr3 – Yr10). On-Demand - General English (Comprehension, Spelling & Writing Conventions) (Yr3 – Yr10): – The 60-item “On-Demand” Adaptive General English test. English Online (Victorian Gov. Schools): – Prep-Yr2 individual interview. On-Demand - Number: – The 30-item “On-Demand” Adaptive Number test (Yr3 – Yr10). On-Demand - Measurement, Chance & Data: – The 30-item “On-Demand” Adaptive Measurement, Chance & Data test (Yr3 – Yr10). On-Demand - Space: – The 30-item “On-Demand” Adaptive Space test (Yr3 – Yr10). On-Demand - Structure: – The 30-item “On-Demand” Adaptive Structure test (Yr3 – Yr10). On-Demand - Mathematics (Number, Measurement, Chance & Data and Space) (Yr3 – Yr10): – The 60-item “On-Demand” Adaptive General Mathematics test. PAT-Maths Plus: – 10 tests from Yr1 to Yr10.

27 Available “Adaptive” ENGLISH Tests (Choosing the right starting point is still important)

28 Available “Adaptive” MATHEMATICS Tests (Choosing the right starting point is still important)

29 Choosing the right starting point for “Adaptive” Tests

30 Summative Testing and Triangulation. Even if you give the right test to the right student, sometimes the test score does not reflect the true ability of the student – every measurement is associated with some error. To overcome this we should aim to get at least three independent measures – what researchers call TRIANGULATION. This may include: – Teacher judgment – NAPLAN results – Other pen & paper summative tests (e.g. TORCH, PAT-R, PAT-Maths, I Can Do Maths) – On-line summative tests (e.g. On-Demand ‘Adaptive’ testing, PAT-Maths Plus, English Online)
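One possible way to work with three independent measures is sketched below. The presentation does not prescribe a combination rule, so taking the median and flagging measures that sit far from it is an assumption made here for illustration. The function name, the tolerance and the example values (all expressed as hypothetical year-level equivalents on a common scale) are also assumptions.

```python
from statistics import median


def triangulate(measures, tolerance):
    """Summarise several independent measures of one student.

    measures:  dict mapping a source name to a value on a common scale.
    tolerance: how far a measure may sit from the median before it is
               flagged for a second look.
    Returns the median value and a dict of the flagged measures.
    """
    central = median(measures.values())
    flagged = {
        name: value
        for name, value in measures.items()
        if abs(value - central) > tolerance
    }
    return central, flagged


# Hypothetical Yr 4 student, three measures in year-level equivalents.
student = {"teacher_judgment": 4.0, "naplan": 4.2, "on_demand": 5.6}

central, flagged = triangulate(student, tolerance=0.5)
print("central estimate:", central)
print("measures to re-check:", flagged)
```

A flagged measure is not necessarily wrong. It simply marks a score that disagrees with the other two sources and may reflect a lucky or horror day, or a poorly matched test.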

31 Summative Testing and Triangulation. BUT remember, more summative testing does not lead to improved learning outcomes, so keep the summative testing to a minimum.

32 When should you administer summative tests?

33 Timing for Summative Testing. Should be done at a time when teachers are trying to triangulate on each student’s level of performance (i.e. mid-year and end-of-year reporting time). Should be done at a time that enables teachers to monitor growth – say, every six months (i.e. from the beginning of the year to the middle of the year and from the middle of the year to the end of the year).

34 Suggested timing. For Year 1 – Year 6 and Year 8 – Year 10: – Late May/Early June (for mid-year reporting and six-monthly growth*) – Late October/Early November (for end-of-year reporting and six-monthly growth). For Prep and Year 7 and new students at other levels: – Beginning of the year (for base-line data) – but record as November the year before – Late May/Early June (for mid-year reporting and six-monthly growth) – Late October/Early November (for end-of-year reporting and six-monthly growth). * November results from the year before form the base-line data for the current year (i.e. February testing is not required for Year 1 – Year 6 or for Year 8 – Year 10).

