1 Hands On Session With (Real) Data
2 CELPT Testing Language proficiency written test Singapore based Ngee Ann Polytechnic Students with varied cultural backgrounds Profile reporting by language category Calibrated bank of some 1500 questions
3 Two is four Each CELPT test becomes two tests Limited attempt to tailor questions Match difficulty of questions And candidate ability
4 How does it work? Each test has 220 questions Starter test of 20 questions Branching on 20-question test score Easier test More difficult test Only a single branch point
5 In a little more detail Questions 1 to 20: Starter Test; Questions 21 to 60: Part 1A Test; Questions 61 to 100: Part 1B Test; Questions 101 to 160: Part 2A Test; Questions 161 to 220: Part 2B Test
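The question ranges above can be checked with a little arithmetic. The sketch below is illustrative only: the ranges come from the slide, but the pairing of the "A" parts into one branch and the "B" parts into the other is an assumption, made because it reproduces the 120 questions per test quoted later.

```python
# Question ranges as listed on the slide (inclusive, 1-based).
PARTS = {
    "Starter": (1, 20),
    "Part 1A": (21, 60),
    "Part 1B": (61, 100),
    "Part 2A": (101, 160),
    "Part 2B": (161, 220),
}


def part_size(name):
    """Number of questions in a named part."""
    first, last = PARTS[name]
    return last - first + 1


# ASSUMPTION: each branch is the starter plus the matching "A" or "B" parts.
BRANCHES = {
    "A": ["Starter", "Part 1A", "Part 2A"],
    "B": ["Starter", "Part 1B", "Part 2B"],
}

total_questions = sum(part_size(name) for name in PARTS)
branch_sizes = {
    branch: sum(part_size(name) for name in parts)
    for branch, parts in BRANCHES.items()
}

print(total_questions)  # 220
print(branch_sizes)     # {'A': 120, 'B': 120}
```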
6 Hands On – The basic task Have two tests 21001 and 21002 Break down into four tests 21001 and 71001 21002 and 71002 Wish to build a single item bank
7 Hands On – The basic task Need to calibrate all questions Into a single bank How many questions? 2 x 220 = 440? 4 x 120 = 480? Some common questions! Starter and others…
8 How should the task be tackled? Four different tests – 4 x 120 questions Four different sets of students Key is the common questions Use questions to link during analysis But how? Many ways to do this!
9 Hands on soon! Task for the afternoon Does it matter how to do this? Could bank one test at a time Adding tests one by one Could all tests be banked together?
10 IBS – Item Banking System Takes in data from many tests Finds common questions Includes existing bank if appropriate Analyses to find optimum statistics Banks the end results
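As a rough illustration of "finds common questions", the sketch below intersects the item lists of each pair of tests and checks that the tests are all linked, directly or through other tests. It is not the IBS algorithm; the item IDs are hypothetical and only the four test numbers come from the slides.

```python
from itertools import combinations

# Hypothetical item lists; only the test numbers come from the slides.
tests = {
    "21001": {"CE0001", "CE0002", "CE0003", "CE0004"},
    "71001": {"CE0001", "CE0002", "CE0005", "CE0006"},
    "21002": {"CE0002", "CE0007", "CE0008"},
    "71002": {"CE0007", "CE0009", "CE0010"},
}


def common_questions(tests):
    """Return the shared item IDs for every pair of tests that overlaps."""
    links = {}
    for a, b in combinations(sorted(tests), 2):
        shared = tests[a] & tests[b]
        if shared:
            links[(a, b)] = shared
    return links


def is_connected(tests):
    """True if every test can be reached from any other via common items."""
    links = common_questions(tests)
    names = sorted(tests)
    reached = {names[0]}
    frontier = [names[0]]
    while frontier:
        current = frontier.pop()
        for a, b in links:
            if current in (a, b):
                other = b if current == a else a
                if other not in reached:
                    reached.add(other)
                    frontier.append(other)
    return reached == set(names)


links = common_questions(tests)
connected = is_connected(tests)
print(sorted(links))
print(connected)
```

If `connected` were false, the tests could not be placed on one scale in a single analysis, because at least one test would share no questions with the rest.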
11 Hands On – for ‘Test’ 21002 IBS analysis of test 21001 IBS analysis of test 71001 and bank One group Plot difficulty estimates For common questions in tests One set from test one from bank
12 Hands On – for ‘Test’ 21002 Also a further IBS analysis Tests 21001 and 71001 Second group Plot difficulty estimates For common questions in banks For each type of analysis
13 Task again Test 21001 → Bank 1; Test 71001 + Bank 1 → Bank 2; Test 21001 + Test 71001 → Bank 3 Group 1: 71001 and 21001 (Bank 1) Group 2: Bank 2; Bank 3
14 Let’s look at FIT in 21002/71002 CE0888 – Link fit = 4.22 In 21002 - δ/σ = 1.57/0.21; fit -3.78 In 71002 - δ/σ = -1.21/0.31; fit -0.52 In 21002, item is 57/60 in Part 1A in 71002, item is 39/60 in Part 1B Omits high in 1A Question is probably OK
15 Question CE0888 – 57/1A/I/A Teenagers today are driven by _____________ to do a lot of things they would otherwise not do. A. peer group pressure B. a peer group pressure C. the peer group pressure D. some peer group pressure
16 Let’s look at FIT in 21002/71002 CE7414 – Link fit = -5.27 In 21002 - δ/σ = 1.87/0.21; fit 5.22 In 71002 - δ/σ = 2.01/0.14; fit 5.07 In 21002, item is 15/60 in Starter in 71002, item is 15/60 in Starter Bad question; unstable difficulty
17 Question CE7514 – 15/ST/H/C There are many reasons for our losses. They include the following :- __________________________________________; The raw materials we bought were not equal to those specified. A. Violation of set procedures by staff in the assembly plant B. The violation of set procedures in the assembly plant by staff C. Set procedures were violated by staff in the assembly plant D. Violating set procedures being common in the assembly plant
18 Still need to bank four tests Test 21001 → Bank 1; Test 71001 + Bank 1 → Bank 2; Test 21002 + Bank 2 → Bank 4; Test 71002 + Bank 4 → Bank 5; 21001, 71001, 21002, 71002 → Bank 6
19 Hands On – for all two/four tests Now have a bank built in steps - Bank 5 Bank has questions from all four tests Also have a bank built in one pass Bank 6 also has all questions?
20 Hands On – for all tests together Plot difficulty estimates Bank 5 vs bank 6 Four groups for plotting CE0000 - CE2999, CE3000 - CE4999 CE5000 - CE6999, CE7000 - CE7999 Plot any questions found common
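Before plotting, the two banks have to be paired up item by item. A minimal sketch of that step follows; the difficulty values are invented for illustration, and the second ID range is read as ending at CE4999. Points that sit far from the identity line in the resulting plot are the ones worth a closer look.

```python
from statistics import mean

# Hypothetical difficulty estimates (logits) keyed by item ID.
bank5 = {"CE0888": 0.42, "CE3100": -0.75, "CE5200": 1.10, "CE7100": 1.95}
bank6 = {"CE0888": 0.30, "CE3100": -0.80, "CE5200": 1.25, "CE7100": 1.90,
         "CE7999": 0.10}

# The four plotting groups (second range read as ending at CE4999).
GROUPS = [(0, 2999), (3000, 4999), (5000, 6999), (7000, 7999)]


def group_of(item_id):
    """Map an item ID such as 'CE0888' to its plotting group label."""
    number = int(item_id[2:])
    for low, high in GROUPS:
        if low <= number <= high:
            return f"CE{low:04d}-CE{high:04d}"
    return None


def paired_estimates(a, b):
    """Collect (item, estimate_a, estimate_b) for items in both banks."""
    pairs = {}
    for item in sorted(a.keys() & b.keys()):
        pairs.setdefault(group_of(item), []).append((item, a[item], b[item]))
    return pairs


pairs = paired_estimates(bank5, bank6)

# Average difference between the banks over the common items.
mean_shift = mean(y - x for rows in pairs.values() for _, x, y in rows)

for label, rows in sorted(pairs.items()):
    print(label, rows)
print(mean_shift)
```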
21 Hands On – for all tests together Statistical estimation – how robust? What of fit? Fit within is important Fit between is also very important Different groups of students Fit will be a reflection of many factors
22 More general points Joint approach to calibration is preferable Balances (smoothes) lumpy data Gives a better overall idea of ‘reality’ Helps to identify real problems Measurement model makes this possible
23 The analysis of many tests Analysis has to be possible – connectivity Design needs consideration Subtests – questions are only in one subtest Largest possible group of questions that occurs in unique grouping of tests Q1, Q4 and Q6 are in tests X and Y
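One way to picture a subtest is to group questions by the exact set of tests in which they appear. The sketch below does this for a hypothetical layout; the test names X, Y and Z and all question placements other than Q1, Q4 and Q6 are invented for illustration.

```python
from collections import defaultdict

# Hypothetical layout; Q1, Q4 and Q6 sit in tests X and Y, as on the slide.
tests = {
    "X": {"Q1", "Q2", "Q4", "Q6"},
    "Y": {"Q1", "Q3", "Q4", "Q6"},
    "Z": {"Q3", "Q5"},
}


def find_subtests(tests):
    """Group questions by the unique set of tests that contain them."""
    membership = defaultdict(set)
    for test, questions in tests.items():
        for question in questions:
            membership[question].add(test)

    subtests = defaultdict(set)
    for question, in_tests in membership.items():
        subtests[frozenset(in_tests)].add(question)
    return dict(subtests)


subtests = find_subtests(tests)
for grouping, questions in subtests.items():
    print(sorted(grouping), sorted(questions))
```

Each question lands in exactly one subtest, and each subtest is as large as it can be for its grouping of tests.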
24 Linking With four tests maximum is: 4 C 1 + 4 C 2 + 4 C 3 + 4 C 4 Or 4 + 6 + 4 + 1 = 15 All 15 are found in this analysis Provides test by test linking Some subtests very small
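The count of 15 can be confirmed by enumerating every non-empty grouping of the four tests. This is a small check using the four test numbers from the earlier slides; it equals 2^4 - 1.

```python
from itertools import combinations
from math import comb

test_ids = ["21001", "71001", "21002", "71002"]
n = len(test_ids)

# 4C1, 4C2, 4C3, 4C4
counts = [comb(n, k) for k in range(1, n + 1)]

# Every non-empty grouping of tests a subtest could belong to.
groupings = [
    group
    for k in range(1, n + 1)
    for group in combinations(test_ids, k)
]

print(counts)          # [4, 6, 4, 1]
print(len(groupings))  # 15
```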
25 Fit again Misfitting people Misfitting questions All due to question/person interactions Not independent Examples
26 Where have we been today? Seeking to learn more about what we do Wishing to measure not just report Looking at inconsistent behaviour Trying to understand our data Aiming to build better tests
27 An Impossibility! How can the Rasch Model apply fully? The formulation is so strict! But it is a system to provide measurement If it fails, then at least we can know about it But hang on, the model failing? More like the data not fitting….
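For reference, the strictness comes from how little the model allows: in the dichotomous Rasch model the probability of a correct answer depends only on the difference between person ability and question difficulty, both in logits. A minimal sketch, with function and parameter names chosen here for illustration:

```python
import math


def rasch_probability(ability, difficulty):
    """Probability of a correct response under the dichotomous Rasch model.

    P = exp(ability - difficulty) / (1 + exp(ability - difficulty))
    """
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


# A person whose ability equals the question difficulty has an even chance.
print(rasch_probability(1.57, 1.57))  # 0.5

# The same question is harder for a less able person.
print(rasch_probability(0.0, 1.57) < 0.5)  # True
```

Anything else that affects responses, such as position in the test or omitted answers, shows up as misfit rather than being absorbed by extra parameters.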
28 So here is the rub… The test constructor needs to be aware Of what is required and what has happened Where the data have come from Which students were used, what questions The model will help to interpret the data In the end, it all depends on the user
29 In Conclusion - 1 It has been a romp And there is much more to say and do Plenty of books Plenty of analysis programs Rasch community
30 In Conclusion - 2 Bond, T.G. and Fox, C.M. Applying the Rasch Model: Fundamental Measurement in the Human Sciences. Lawrence Erlbaum Associates. ISBN: 0-8058-4252-7