
Determining the Significance of Item Order In Randomized Problem Sets Zachary A. Pardos, Neil T. Heffernan Worcester Polytechnic Institute Department of.


1 Determining the Significance of Item Order In Randomized Problem Sets
Zachary A. Pardos, Neil T. Heffernan
Worcester Polytechnic Institute, Department of Computer Science

2 The Problem
In which order should we present tutor content to students?
Many problem sets in intelligent tutoring systems (ITS) present their items in a random order.
Item order is mostly randomized when there is no obvious ordering of items that would benefit learning.
Can we data mine user responses to infer orderings that are reliably more beneficial to learning than others?
Pardos, Z. A., Heffernan, N. T. In Press (2009) Detecting the Learning Value of Items In a Randomized Problem Set. In Proceedings of the 14th International Conference on Artificial Intelligence in Education. Brighton, UK. IOS Press.

3 Solution Approach
Possible approach: evaluate each sequence for learning value.
seq1: Probability of learning: 0.13
seq2: Probability of learning: 0.19
In this paper we evaluate the learning rates of ordered item pairs, such as whether Q1 should go before Q2 or Q2 before Q1.
Item pair (1,2): Probability of learning: 0.09
Item pair (2,1): Probability of learning: 0.14
If multiple reliable orderings are found, a full sequence could be determined to be best for learning.

4 Solution Application Example
[Figure: item pairs compared by learning rate. Labels shown: 0.09 and 0.14; 0.17 and 0.15; 0.17. The item identities are not recoverable from the transcript.]

5 Model
Modeling or measuring learning requires modeling knowledge.
Knowledge Tracing is used to model learning.
Observables (question answers): incorrect / correct
Latent (skill knowledge, dichotomous): S
Parameters:
P(Skill: 0 → 1) (probability of learning)
P(correct | Skill = 0) (guess)
P(incorrect | Skill = 1) (slip)
Parameters can be learned with the EM algorithm!
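The knowledge tracing update behind this slide can be sketched in a few lines. This is a minimal illustration of the standard update for a two-state skill, not the authors' code: the function name `bkt_update` and the starting belief of 0.5 are assumptions, while the learn, guess and slip values are borrowed from numbers that appear elsewhere in the slides purely as sample inputs.

```python
def bkt_update(p_known, correct, learn, guess, slip):
    """Return P(skill known) after one response and one learning opportunity.

    p_known: current belief that the skill is known, P(Skill = 1)
    correct: True if the observed answer was correct
    learn:   P(Skill: 0 -> 1), the probability of learning
    guess:   P(correct | Skill = 0)
    slip:    P(incorrect | Skill = 1)
    """
    if correct:
        evidence_known = p_known * (1.0 - slip)
        evidence_unknown = (1.0 - p_known) * guess
    else:
        evidence_known = p_known * slip
        evidence_unknown = (1.0 - p_known) * (1.0 - guess)
    total = evidence_known + evidence_unknown
    # If the response is impossible under the parameters, keep the prior belief.
    posterior = evidence_known / total if total > 0 else p_known
    # Learning transition: an unknown skill becomes known with probability `learn`.
    return posterior + (1.0 - posterior) * learn


if __name__ == "__main__":
    p = 0.5  # hypothetical starting belief, not taken from the slides
    for answer in [False, True, True]:
        p = bkt_update(p, answer, learn=0.09, guess=0.17, slip=0.18)
        print(round(p, 3))
```

The sketch only tracks the belief for known parameters; fitting the parameters themselves (the EM step mentioned on the slide) is not shown.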

6 Model
The six sequence permutations are modeled with shared Bayesian parameters, also known as equivalence classes of CPTs (conditional probability tables).
Novel contribution of paper: harnessing the power of randomization to help estimate accurate parameters using all response data.

7 Reliability measure
Data for a problem set was randomly split into 10 equal-size bins by student.
Each bin was evaluated separately by the model.
A binomial test was used to estimate the probability of the null hypothesis, that each ordering is equally likely to have the highest learning rate, i.e.: binopdf(best_choice_mode,20,0.25)

Ordered pair learning rates
           (3,2)    (2,1)    (3,1)    (1,2)    (2,3)    (1,3)
Split 1    0.0732   0.0267   0.0837   0.0701   0.0379   0.642
...
Split 10   0.0849   0.0512   0.0550   0.0710   0.0768   0.0824
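The binopdf call on this slide is a binomial probability mass function, which can be reproduced with the standard library alone. The sketch below is illustrative: the helper names `binom_pmf` and `binom_upper_tail` are assumptions, and the null probability of 0.5 in the usage lines is an assumed value for comparing one ordering against its reverse, not a figure from the slides.

```python
from math import comb


def binom_pmf(k, n, p):
    """P(exactly k successes in n independent trials with success probability p)."""
    return comb(n, k) * p**k * (1.0 - p) ** (n - k)


def binom_upper_tail(k, n, p):
    """P(k or more successes in n trials), a one-sided tail probability."""
    return sum(binom_pmf(i, n, p) for i in range(k, n + 1))


if __name__ == "__main__":
    # Assumed scenario: one ordering beats its reverse in every one of 10 bins,
    # under a null hypothesis where either ordering wins a bin with p = 0.5.
    print(binom_pmf(10, 10, 0.5))
    print(binom_upper_tail(9, 10, 0.5))
```

The point probability matches what a binopdf-style call returns; the tail sum is included because a significance test usually asks for a result at least as extreme as the one observed.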

8 Dataset
[Figure labels: main problem, hint]
Student main problem responses (correct/incorrect) to 5 problem sets of 3 questions each
Questions within a problem set relate to the same skill
295-674 students completed each problem set in the 2006-2007 school year data
Questions in the problem sets were presented in a randomized order (required for this analysis)

9 Confound
[Figure labels: main problem, hint]
Since only main question responses are being analyzed, the learning from the main question is confounded with the learning from the scaffolding and hints of the problem.
In an item pair, learning could be attributed to:
The immediate feedback to the main problem of question 1
The scaffolding of question 1
Applying concepts from question 1 on question 2's main problem

10 Results
Of the 5 problem sets evaluated, two returned statistically reliable orderings.

Learning probabilities of Item Pairs
Problem Set   Users   (3,2)    (2,1)    (3,1)    (1,2)    (2,3)    (1,3)    Reliable Rules
24            403     0.1620   0.0948   0.0793   0.0850   0.0754   0.0896   (3,2) > (2,3)
36            419     0.1507   0.1679   0.0685   0.1179   0.1274   0.1371   (1,3) > (3,1)

Other item relationships could be tested.
In Problem Set 36: (2,1) > (3,1) in 10 out of 10 of the bins

11 Results
Guess and Slip values per question
              Problem Set 24     Problem Set 36
Question #    Guess    Slip      Guess    Slip
1             0.17     0.18      0.33     0.13
2             0.31     0.08      0.31     0.10
3             0.23     0.17      0.20     0.08

Values are within a reasonable range (< 0.50).
The same problem sets were run with the AIED and sequence models, and the same guess and slip values were returned.
This indicates high stability in parameter estimation among methods.

12 Simulation Validation
Since the ground truth of learning rates in the real world is impossible to know, a simulation study was run.
The simulation set a variety of values for the prior, guess/slip and learning rate parameters and then simulated user responses.
These responses were then analyzed with the same technique that was used on the real data.
160 simulations were run using different combinations of parameters.
Parameters for the simulation were drawn from a distribution fit to a previous year's analysis of ASSISTment data.

Parameter type   Mean    Std     Beta dist α   Beta dist β
Learning rate    0.086   0.063   0.0652        0.6738
Guess            0.144   0.383   0.0170        0.5909
Slip             0.090   0.031   0.0170        0.6499
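Simulating user responses from a knowledge tracing model, as this slide describes, might look like the following sketch. It is an assumption-laden illustration rather than the authors' simulator: the function name `simulate_student`, the per-item dictionaries, and all numeric values in the usage lines are hypothetical. In the study the parameters were drawn from fitted distributions; here they are simply passed in.

```python
import random


def simulate_student(order, prior, learn, guess, slip, rng):
    """Simulate one student's correct/incorrect responses to items in `order`.

    order: list of item ids in the order they are presented
    prior: probability the student knows the skill before the first item
    learn, guess, slip: dicts mapping item id -> probability
    rng:   a random.Random instance, so that runs are reproducible
    """
    known = rng.random() < prior
    responses = []
    for item in order:
        if known:
            correct = rng.random() >= slip[item]  # answers correctly unless a slip occurs
        else:
            correct = rng.random() < guess[item]  # answers correctly only by guessing
        responses.append((item, correct))
        # Each item is a learning opportunity with its own learning rate.
        if not known and rng.random() < learn[item]:
            known = True
    return responses


if __name__ == "__main__":
    rng = random.Random(0)
    items = [1, 2, 3]
    # Hypothetical parameter values chosen only for illustration.
    learn = {1: 0.05, 2: 0.15, 3: 0.10}
    guess = {1: 0.20, 2: 0.20, 3: 0.20}
    slip = {1: 0.10, 2: 0.10, 3: 0.10}
    for _ in range(3):
        order = rng.sample(items, len(items))  # randomized item order per student
        print(simulate_student(order, 0.3, learn, guess, slip, rng))
```

Because the generating learning rates are known in a simulation, the rules recovered by the analysis can be checked against them, which is what makes a false positive rate measurable.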

13 Simulation Results
More data leads to more reliable rules being found.
The rate of false positives remains low, independent of the number of users.
The average false positive rate is 6.3%, very close to the 5% p-value cutoff of our reliability estimator.
The simulation suggests that the results are trustworthy.

14 Limitations
Only problem sets of five questions or fewer can be reasonably evaluated
– Larger problem sets become intractable to compute due to the exponential increase in nodes and permutations as question count increases
  for a four question set: (4+4) * 24 = 192 nodes
  for a five question set: (5+5) * 120 = 1,200 nodes
– A possible optimization is to model only the sequences for which there is data
Randomization of question order must be present to control for factors including problem difficulty and to allow for detecting learning rates of all item pairs in the problem set
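The node counts on this slide follow a simple pattern that can be checked directly. The sketch below assumes a reading of the two examples in which each sequence contributes one latent and one observed node per question, and there is one sequence per permutation of the questions; the function name `sequence_model_nodes` is hypothetical.

```python
from math import factorial


def sequence_model_nodes(n_questions):
    """Node count when every permutation of the questions is modeled.

    Assumed reading of the slide: (n + n) nodes per sequence, times n! sequences.
    """
    nodes_per_sequence = n_questions + n_questions  # latent + observed node per question
    return nodes_per_sequence * factorial(n_questions)


if __name__ == "__main__":
    for n in range(3, 8):
        print(n, sequence_model_nodes(n))
```

The factorial term dominates the growth, which is why modeling only the sequences that actually occur in the data is a natural optimization.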

15 Conclusions & Future Work
We think that this method, and ones built off of it, will facilitate better tutoring systems.
Randomization gives many of the properties of a randomized controlled experiment (RCE). This method can perform a similar function, but in the form of data mining.
Best orderings might exist for a variety of reasons. Applying this method to investigate those reasons could inform content authors and scientists on best practices in much the same way as randomized controlled experiments do, but through the far more economical means of data mining.

