Using Summative Data to Monitor Student Performance: Choosing appropriate summative tests. Presented by Philip Holmes-Smith School Research Evaluation.


1 Using Summative Data to Monitor Student Performance: Choosing appropriate summative tests. Presented by Philip Holmes-Smith School Research Evaluation and Measurement Services

2 Overview of the session
1. Diagnostic vs. Summative Testing
2. Choosing Appropriate Summative Tests
– The reliability of summative (standardised) tests.
– Choosing appropriate summative tests.
– When should you administer summative tests?

3 1. Overview of Diagnostic vs. Summative Testing

4 Examples of Diagnostic Testing
Assessment tools such as:
– Marie Clay Inventory,
– English Online Assessment (Government schools),
– Maths Online Assessment (Government schools),
– SINE (CEO schools),
– On-Demand Linear Tests,
– Probe.
Teacher assessments such as:
– teacher questioning in class,
– teacher observations,
– student work (including portfolios).

5 Diagnostic Testing
Research shows that our most effective teachers (in terms of improving the learning outcomes of students) constantly use feedback, including diagnostic information, to inform their teaching. Hattie (2003, 2009)* shows that using feedback (including using diagnostic information about what each student can and can't do to inform teaching) has one of the biggest impacts on improving student learning outcomes.
* 2003: http://www.acer.edu.au/documents/RC2003_Hattie_TeachersMakeADifference.pdf
* 2009: Hattie, John. (2009). Visible Learning: A synthesis of over 800 meta-analyses relating to achievement. NY: Routledge.

6 Examples of Summative (Standardised) Testing
Government sponsored assessment tools such as:
– NAPLAN,
– English Online Assessment (Government schools),
– On-Demand Adaptive Tests.
Other commercial tests such as:
– TORCH,
– PAT-R,
– PAT-Maths (together with I Can Do Maths).

7 Summative (Standardised) Testing
Summative testing is essential to monitor the effectiveness of your teaching. But research shows that summative tests do not lead to improved learning outcomes. As the saying goes: "You don't fatten a pig by weighing it." So, although it is essential, keep summative testing to a minimum.

8 2. Summative Tests

9 Summative (Standardised) Testing
Summative testing is essential to monitor the effectiveness of our teaching, but:
– Is NAPLAN reliable for all students?
– Are the other summative tests you administer reliable for all students?
We need to maximise the reliability of the tests we use to monitor the effectiveness of our teaching.

10 Summative (Standardised) Testing
Summative testing is essential to monitor the effectiveness of our teaching, but do we currently gather enough information to monitor the effectiveness of our teaching of ALL students? For example:
– Year 3 NAPLAN reflects the effectiveness of your Prep–Yr 2 teaching, but what about the Prep teaching vs. the Yr 1 teaching vs. the Yr 2 teaching?
– Year 9 NAPLAN reflects the effectiveness of your Yr 7–Yr 8 teaching, but what about the Yr 7 teaching vs. the Yr 8 teaching?
We need to choose appropriate summative tests to monitor the effectiveness of our teaching at all year levels from Prep – Yr 10!

11 The Reliability of Summative Tests

12 Three Questions
1. Do you believe that your students' NAPLAN and/or On-Demand results accurately reflect their level of performance?

13 Three Questions
1. Do you believe that your students' NAPLAN and/or On-Demand results accurately reflect their level of performance?
2. If we acknowledge that the odd student will have a lucky guessing day or a horror day, what about the majority?
– Have your weakest students received a low score?
– Have your average students received a score at about expected level?
– Have your best students received a high score?

14 Three Questions
1. Do you believe that your students' NAPLAN and/or On-Demand results accurately reflect their level of performance?
2. If we acknowledge that the odd student will have a lucky guessing day or a horror day, what about the majority?
– Have your weakest students received a low score?
– Have your average students received a score at about expected level?
– Have your best students received a high score?
3. Think about your students who received high and low scores:
– Are your low scores too low?
– Are your high scores too high?

15

16 Is this reading score reliable? High highs and Low lows

17 Item difficulties for a typical test

18 Summary Statements about Scores
– Low scores (i.e. more than 0.5 VELS levels below expected) indicate poor performance, but the actual values should be considered indicative only (i.e. such scores are associated with high levels of measurement error).
– High scores (i.e. more than 0.5 VELS levels above expected) indicate good performance, but the actual values should be considered indicative only (i.e. such scores are associated with high levels of measurement error).
– Average scores indicate roughly expected levels of performance, and the actual values are more reliable (i.e. such scores are associated with lower levels of measurement error).
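The pattern described above can be illustrated with a small sketch. This assumes a simple Rasch (one-parameter IRT) model and made-up item difficulties; it is not how NAPLAN, TORCH or On-Demand actually compute their error bands, just a demonstration of why error grows at the extremes when a test's items are pitched at the average student.

```python
import math

# Hypothetical item difficulties for a typical test, centred on the
# average student (Rasch model assumed purely for illustration).
item_difficulties = [-1.5, -1.0, -0.5, 0.0, 0.0, 0.5, 1.0, 1.5]

def sem(ability, difficulties):
    """Standard error of measurement at a given ability level.

    Under the Rasch model, test information is the sum of p*(1-p)
    over items, where p is the probability of a correct answer,
    and SEM = 1 / sqrt(information).
    """
    info = 0.0
    for d in difficulties:
        p = 1.0 / (1.0 + math.exp(-(ability - d)))
        info += p * (1.0 - p)
    return 1.0 / math.sqrt(info)

# SEM is smallest for students near the middle of the test's range
# and grows for very high and very low scorers.
print(sem(0.0, item_difficulties))   # average student: small error
print(sem(3.0, item_difficulties))   # very high scorer: large error
print(sem(-3.0, item_difficulties))  # very low scorer: large error
```

This is why the slide treats extreme scores as indicative only: the test simply contains few items that discriminate well at those levels.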

19 Choosing appropriate summative tests

20 Item difficulties for Booklet 6 on the PAT-R (Comprehension) score scale, with the average item difficulty marked

21 Converting raw test scores for Booklet 6 to the PAT-R (Comprehension) scale score

22 PAT-R (Comprehension) Tests: test difficulties of the PAT-R (Comprehension) tests on the TORCH score scale, together with year-level mean scores

23 Different norm tables for different tests

24 Test difficulties of the PAT-Maths tests on the PATM score scale, together with year-level mean scores (Year 1 to Year 10). Which is the best test for an average Year 4 student? (Source: ACER, 2006)

25 Test difficulties of the PAT-Maths tests on the PATM score scale, together with year-level mean scores (Year 1 to Year 10). The best test for an average Year 4 student is probably Test 4 or 5. (Source: ACER, 2006)

26 Things to look for in a summative test
– Needs to have a single developmental scale that shows increasing levels of achievement over all the year levels at your school.
– Needs to have "norms" or expected levels for each year level (e.g. the national "norm" for Yr 3 students on TORCH is an average of 34.7).
– Needs to be able to demonstrate growth from one year to the next (e.g. during Yr 4, the average student grows from a score of 34.7 in Yr 3 to an expected score of 41.4 in Yr 4 – that is, 6.7 score points).
– As a bonus, the test could also provide diagnostic information.
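The growth example above is simple arithmetic on the norm table. As a minimal sketch (the dictionary and function names are my own; the two norm values are the TORCH figures quoted on the slide):

```python
# Year-level norms on a single developmental scale.
# Only the Yr 3 and Yr 4 TORCH means from the slide are used here;
# a real norm table would cover every year level the school teaches.
torch_norms = {3: 34.7, 4: 41.4}

def expected_growth(norms, from_year, to_year):
    """Expected score-point growth between two year-level norms."""
    return round(norms[to_year] - norms[from_year], 1)

growth = expected_growth(torch_norms, 3, 4)
print(growth)  # 6.7 score points expected during Year 4
```

Comparing a student's actual year-on-year change against this expected growth is what lets a single developmental scale monitor the effectiveness of teaching at each year level, not just overall attainment.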

27 Norms for Year 3 to Year 10 on the TORCH scale (10th, 50th and 90th percentiles)

28 My Recommended Summative Tests (Pen & Paper)
Reading Comprehension:
– Progressive Achievement Test - Reading (Comprehension) (PAT-R, 4th Edition)
– TORCH and TORCH Plus
Mathematics:
– Progressive Achievement Test - Mathematics (PAT-Maths, 3rd Edition) combined with I Can Do Maths

29 Selecting the correct PAT-C Test

30 Selecting the correct TORCH Test

31 Selecting the correct PAT-Math/ICDM Test

32 My Recommended Summative Tests (On-Line)
– On-Demand Reading Comprehension: the 30-item "On-Demand" Adaptive Reading test
– On-Demand Spelling: the 30-item "On-Demand" Adaptive Spelling test
– On-Demand Writing Conventions: the 30-item "On-Demand" Adaptive Writing test
– On-Demand General English (Comprehension, Spelling & Writing Conventions): the 60-item "On-Demand" Adaptive General English test
– On-Demand Mathematics (Number, Measurement, Chance & Data and Space): the 60-item "On-Demand" Adaptive General Mathematics test
– On-Demand Number: the 30-item "On-Demand" Adaptive Number test
– On-Demand Measurement, Chance & Data: the 30-item "On-Demand" Adaptive Measurement, Chance & Data test
– On-Demand Space: the 30-item "On-Demand" Adaptive Space test

33 Choosing the right starting point is still important (even for “Adaptive” Tests)

34

35 Summative Testing and Triangulation
Even if you give the right test to the right student, sometimes the test score does not reflect the true ability of the student – every measurement is associated with some error. To overcome this we should aim to get at least three independent measures – what researchers call TRIANGULATION. This may include:
– Teacher judgment
– NAPLAN results
– Other pen & paper summative tests (e.g. TORCH, PAT-R, PAT-Maths, I Can Do Maths)
– On-line summative tests (e.g. On-Demand 'Adaptive' testing, English Online)
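As a minimal sketch of what triangulation could look like in practice, assuming all three measures have already been placed on a common scale (the measure names, scores and the review threshold below are all hypothetical, not part of the presentation):

```python
# Three independent measures of the same student, expressed on a
# common scale. The names and values here are invented for illustration.
measures = {
    "teacher_judgment": 41.0,
    "naplan_equivalent": 44.0,
    "pat_r_equivalent": 35.0,
}

values = list(measures.values())

# Triangulated estimate: a simple average of the independent measures.
estimate = sum(values) / len(values)

# Spread between measures: a large spread suggests at least one
# measure is unreliable for this student and is worth investigating
# before reporting. The 5-point threshold is arbitrary.
spread = max(values) - min(values)
needs_review = spread > 5.0

print(estimate)      # combined estimate of the student's level
print(needs_review)  # True: the three measures disagree noticeably
```

The point of triangulation is exactly this kind of cross-check: one discrepant score (a "horror day" or lucky guessing day) stands out against the other two instead of silently distorting the picture.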

36 Summative Testing and Triangulation
BUT remember, more summative testing does not lead to improved learning outcomes, so keep the summative testing to a minimum.

37 When should you administer summative tests?

38 Timing for Summative Testing
– Should be done at a time when teachers are trying to triangulate on each student's level of performance (i.e. mid-year and end-of-year reporting time).
– Should be done at a time that enables teachers to monitor growth – say, every six months (i.e. from the beginning of the year to the middle of the year, and from the middle of the year to the end of the year).

39 Suggested timing
For Year 1 – Year 6 and Year 8 – Year 10:
– Early June (for mid-year reporting and six-monthly growth*)
– Early November (for end-of-year reporting and six-monthly growth)
For Prep, Year 7 and new students at other levels:
– Beginning of the year (for base-line data)
– Early June (for mid-year reporting and six-monthly growth)
– Early November (for end-of-year reporting and six-monthly growth)
* November results from the year before form the base-line data for the current year (i.e. February testing is not required for Year 1 – Year 6 or for Year 8 – Year 10).

