Choosing Sample Size for Knowledge Tracing Models DERRICK COETZEE.


1 Choosing Sample Size for Knowledge Tracing Models DERRICK COETZEE

2 Motivation ◦BKT parameters are inferred from data ◦But the best solution for a given data set may not quite match the parameters that actually generated it (sampling error) ◦Example data set (5 students, 5 problems each, 25 bits of data): 0,0,0,0,0 0,1,1,0,1 0,1,0,0,0 0,0,1,1,0 ◦Generating parameters (4 parameters, 3 decimal digits each, 39.9 bits of data): prior = 0.205, learning = 0.010, guess = 0.142, slip = 0.031 ◦Not even possible for all parameter sets to be represented!
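The bit counts on this slide can be checked with a small sketch. It assumes the standard BKT generative process without forgetting (each response is drawn from the current knowledge state, then a learning transition may occur); the function name `simulate_bkt` and the seed are illustrative and do not come from the slides.

```python
import math
import random

def simulate_bkt(n_students, n_items, prior, learn, guess, slip, seed=0):
    """Generate 0/1 responses from a standard BKT model (no forgetting).

    Illustrative helper; the name and seed handling are assumptions.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n_students):
        knows = rng.random() < prior          # initial knowledge state
        row = []
        for _ in range(n_items):
            p_correct = (1 - slip) if knows else guess
            row.append(1 if rng.random() < p_correct else 0)
            if not knows and rng.random() < learn:
                knows = True                  # learning transition after the item
        data.append(row)
    return data

# Parameter values quoted on the slide.
responses = simulate_bkt(5, 5, prior=0.205, learn=0.010, guess=0.142, slip=0.031)

data_bits = sum(len(row) for row in responses)   # one bit per response
param_bits = 4 * 3 * math.log2(10)               # 12 decimal digits expressed in bits

print(data_bits, round(param_bits, 1))
```

With 25 binary responses there are at most 2^25 distinct data sets, which is fewer than the 10^12 parameter settings at three decimal digits each, so some parameter sets cannot be recovered exactly.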

3 Questions ◦So how much data is needed for accurate estimates? ◦And do the parameter values affect how much you need? ◦Can we give confidence intervals for parameters?

4

5 Normal distribution over samples ◦Mean is almost always near true generating value ◦Standard deviation can be used to describe variation of estimates ◦Can use 68–95–99.7 rule for confidence intervals
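A minimal sketch of how the 68–95–99.7 rule could turn repeated estimates into an interval, assuming the estimates of one parameter are roughly normally distributed as the slide describes. The function name `normal_interval` and the sample values are hypothetical.

```python
import statistics

def normal_interval(estimates, k=2):
    """Return (mean - k*sd, mean + k*sd) for a list of parameter estimates.

    k=1, 2, 3 correspond to roughly 68%, 95%, 99.7% coverage
    under the normality assumption.
    """
    mean = statistics.mean(estimates)
    sd = statistics.stdev(estimates)   # sample standard deviation
    return mean - k * sd, mean + k * sd

# Hypothetical estimates of one parameter from repeated synthetic data sets.
guess_estimates = [0.10, 0.20, 0.30]
low, high = normal_interval(guess_estimates, k=2)
print(low, high)
```

Near the boundaries 0 and 1 the interval may extend outside the valid range for a probability, so clipping it to [0, 1] may be appropriate.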

6

7 Variation does depend on parameter values ◦Each parameter behaves differently ◦Best estimates for parameters near zero/one, worst in 0.5–0.8 range

8

9 There are interactions between parameter values ◦Can’t just precompute a table of stddevs for each parameter ◦Complex relationship, analytical approach probably infeasible ◦But at least there is continuity with small rates of change

10

11 Sample size recommendations ◦Stddev proportional to 1/sqrt(n) ◦Must increase sample size by factor of 4 to improve error by factor of 2 ◦Small data sets (<1000 students) will not give even one significant figure in all parameters ◦Be skeptical of systems based on small classes!
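The 1/sqrt(n) relationship can be written as two small helpers. This is a sketch under the slide's stated proportionality; the function names and the example numbers are illustrative, not measurements from the talk.

```python
import math

def scaled_stddev(sd_ref, n_ref, n_new):
    """Stddev expected at n_new students, given sd_ref observed at n_ref."""
    return sd_ref * math.sqrt(n_ref / n_new)

def required_n(sd_ref, n_ref, sd_target):
    """Smallest sample size expected to reach sd_target."""
    return math.ceil(n_ref * (sd_ref / sd_target) ** 2)

# Hypothetical reference point: stddev 0.5 observed with 1000 students.
print(scaled_stddev(0.5, 1000, 4000))   # quadrupling n halves the stddev
print(required_n(0.5, 1000, 0.25))      # halving the stddev needs 4x the students
```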

12

13 No interaction between sample size and parameters ◦Change sample size without changing parameters → predictable variation in error ◦Gives an approach to estimate error on real-world data sets: ◦Take samples with replacement, infer parameters for each, compute stddev ◦Scale using 1/sqrt(n) to estimate stddevs at other sample sizes
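The resampling procedure on this slide might look like the sketch below. The `fit` argument stands in for whatever routine infers a parameter from a data set (for example, an EM fit of BKT); it is a hypothetical callable, and the toy estimator used here is only a placeholder so the example runs on its own.

```python
import math
import random
import statistics

def bootstrap_stddev(students, fit, n_resamples=200, seed=0):
    """Estimate the stddev of a fitted parameter by resampling students.

    students: list of per-student response sequences
    fit: hypothetical callable mapping a list of students to one estimate
    """
    rng = random.Random(seed)
    n = len(students)
    estimates = []
    for _ in range(n_resamples):
        # Sample n students with replacement, then refit.
        resample = [students[rng.randrange(n)] for _ in range(n)]
        estimates.append(fit(resample))
    return statistics.stdev(estimates)

# Placeholder estimator (NOT a BKT fit): fraction of correct first responses.
def first_item_accuracy(students):
    return sum(s[0] for s in students) / len(students)

toy_students = [[1]] * 50 + [[0]] * 50
sd_100 = bootstrap_stddev(toy_students, first_item_accuracy)

# Scale with 1/sqrt(n) to estimate the stddev at 400 students.
sd_400 = sd_100 * math.sqrt(100 / 400)
print(sd_100, sd_400)
```

Resampling whole students, rather than individual responses, keeps each student's response sequence intact, which matters because BKT treats the sequence as the unit of observation.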

14 Knowledge Tracing for Interacting Student Pairs DERRICK COETZEE

15 Motivation ◦Standard Bayesian knowledge tracing uses fixed learning rate parameter to capture all learning

16 Motivation ◦One way to improve: use information on course materials viewed

17 Motivation ◦What about peer interaction (e.g. forums/chat)? ◦Not fixed/static like instructional materials ◦The level of knowledge of the other student is important ◦Use our BKT model of the other student’s knowledge!

18 Pair interaction scenario ◦Simple case of student interaction ◦Two students are paired and always interact between each item (no interactions with others) ◦Sequence: Do exercise → Learn independently → Interact with partner → Do exercise → Learn independently

19 Pair interaction scenario ◦Model independent learning and interaction stages

20 Pair interaction scenario ◦Model independent learning and interaction stages ◦New parameters: teach, mislead ◦Probability the student knows after interaction, by (Knows, Other student knows): (No, No) → 0; (Yes, Yes) → 1; (No, Yes) → teach; (Yes, No) → 1−mislead
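The interaction table can be read as a single update on the probability that a student knows the skill. The sketch below assumes the two students' knowledge states are independent when combining them, which the slides do not state, and it reads the first two table rows as (No, No) → 0 and (Yes, Yes) → 1; the function name `interact` is illustrative.

```python
def interact(p_self, p_other, teach, mislead=0.0):
    """Probability a student knows the skill after one interaction.

    p_self, p_other: probabilities that each student knows beforehand.
    Assumes the two knowledge states are independent (an assumption
    made for this sketch, not taken from the slides).
    """
    both = p_self * p_other               # (Yes, Yes) -> 1
    only_other = (1 - p_self) * p_other   # (No, Yes)  -> teach
    only_self = p_self * (1 - p_other)    # (Yes, No)  -> 1 - mislead
    # (No, No) contributes 0.
    return both + only_other * teach + only_self * (1 - mislead)

# A student who does not know, paired with one who does, teach = 0.9:
print(interact(0.0, 1.0, teach=0.9))
```

With teach = 0 and mislead = 0 the update leaves `p_self` unchanged, which matches the later observation that the model behaves like the classic system when teach = 0.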

21 Results: Preliminary simulations ◦5-parameter system (prior, learn, guess, slip, teach) ◦forget, mislead parameters fixed at zero ◦Generate synthetic data, run EM from generating values ◦Same behavior as classic system when teach = 0 ◦Unstable when teach > 0 ◦Converges to trivial solution prior=learn=teach=1, slip=proportion incorrect responses ◦Occurs for both small and large teach parameters

22 Results: Preliminary simulations ◦4-parameter system (learn, guess, slip, teach) ◦forget, mislead, prior fixed at zero ◦For small teach values (e.g. 0.05), teach converges to zero ◦Yields nontrivial solutions for large teach values, but other parameters absorb some of the teach: ◦learn=0.0900, guess=0.1400, slip=0.0900, teach=0.9000, 100 students → learn=0.1586, guess=0.1648, slip=0.0856, teach=0.6481 ◦learn=0.0900, guess=0.1400, slip=0.0900, teach=0.9000, 1000 students → learn=0.1643, guess=0.1940, slip=0.1102, teach=0.7225

23 Results: Preliminary simulations ◦4-parameter system (learn, guess, slip, teach) with 10000 students and high teach ◦prior=0.0000, learn=0.0900, guess=0.1400, slip=0.0900, teach=0.9000 → prior=0.2184, learn=0.0841, guess=0.1239, slip=0.2658, teach=0.8793 ◦prior and slip have high error, but learning/guess/teach are good ◦teach accuracy increases dramatically with sample size

24 Possible solutions ◦Answer items between independent learning and interaction (more observed data) ◦Mentor/mentee model: knowledge flows in only one direction ◦Eliminate different parameters, or combine parameters to create lower-dimensional space

25 Future work ◦Determine whether interaction model produces better predictions on synthetic data ◦Gather real-world pair interaction data using MOOCchat tool ◦Determine whether pair interaction produces better predictions ◦Typical values, appropriate interpretations for teach and mislead parameters? ◦Generalize to more complex interactions

