Choosing Sample Size for Knowledge Tracing Models DERRICK COETZEE.

Presentation transcript:

Choosing Sample Size for Knowledge Tracing Models DERRICK COETZEE

Motivation
◦ BKT parameters are inferred from data
◦ But the best solution for a given data set may not quite match the parameters that actually generated it (sampling error)
◦ Example data: 0,0,0,0,0 / 0,1,1,0,1 / 0,1,0,0,0 / 0,0,1,1,0 (5 students, 5 problems each: 25 bits of data)
◦ Four parameters (prior, learning, guess, slip), 3 decimal digits each: 39.9 bits of data
◦ Not even possible for all parameter sets to be represented!
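
The bit counts on this slide can be checked with a short calculation. This is only a sketch: it assumes each response is one binary outcome and that each of the four parameters (prior, learning, guess, slip) is stored to three decimal digits, i.e. 1000 distinguishable values. The variable names are illustrative.

```python
import math

n_students = 5
n_problems = 5
# One bit per correct/incorrect response.
data_bits = n_students * n_problems

n_params = 4                # prior, learning, guess, slip
values_per_param = 10 ** 3  # three decimal digits (assumption stated above)
param_bits = n_params * math.log2(values_per_param)

print(data_bits)             # 25
print(round(param_bits, 1))  # 39.9
```

Since 25 bits can distinguish at most 2^25 data sets, while the parameter grid has 1000^4 (about 2^39.9) points, most parameter sets cannot be the best fit of any data set of this size.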

Questions
◦ So how much data is needed for accurate estimates?
◦ And do the parameter values affect how much you need?
◦ Can we give confidence intervals for parameters?

Normal distribution over samples
◦ Mean is almost always near true generating value
◦ Standard deviation can be used to describe variation of estimates
◦ Can use 68–95–99.7 rule for confidence intervals
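
As a rough sketch of how the 68–95–99.7 rule could be applied here, suppose the estimates of one parameter across repeated samples are approximately normal. The numbers below are made up for illustration and are not taken from the talk.

```python
import statistics

# Hypothetical estimates of one parameter (e.g. guess) from repeated samples.
estimates = [0.138, 0.145, 0.131, 0.152, 0.140, 0.136, 0.147, 0.141]

mean = statistics.mean(estimates)
stddev = statistics.stdev(estimates)  # sample standard deviation

# Under a normal approximation, about 95% of estimates lie within 2 stddevs.
ci95 = (mean - 2 * stddev, mean + 2 * stddev)
print(f"mean={mean:.4f} stddev={stddev:.4f} "
      f"95% interval=({ci95[0]:.4f}, {ci95[1]:.4f})")
```

Using 1 or 3 standard deviations instead of 2 gives the approximate 68% and 99.7% intervals.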

Variation does depend on parameter values
◦ Each parameter behaves differently
◦ Best estimates for parameters near zero/one, worst in range

There are interactions between parameter values
◦ Can’t just precompute a table of stddevs for each parameter
◦ Complex relationship, analytical approach probably infeasible
◦ But at least there is continuity with small rates of change

Sample size recommendations
◦ Stddev proportional to 1/sqrt(n)
◦ Must increase sample size by factor of 4 to improve error by factor of 2
◦ Small data sets (<1000 students) will not give even one sigfig in all parameters
◦ Question systems based on small classes!
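
The 1/sqrt(n) relationship can be turned into a small calculator. This is a minimal sketch that assumes the proportionality holds exactly; the function names and the example numbers are hypothetical.

```python
import math

def scaled_stddev(stddev_at_n, n, new_n):
    """Scale a stddev measured at sample size n to sample size new_n,
    assuming stddev is proportional to 1/sqrt(n)."""
    return stddev_at_n * math.sqrt(n / new_n)

def required_n(stddev_at_n, n, target_stddev):
    """Sample size needed to reach target_stddev under the same assumption."""
    return math.ceil(n * (stddev_at_n / target_stddev) ** 2)

# Hypothetical numbers: a stddev of 0.04 observed with 1000 students.
print(scaled_stddev(0.04, 1000, 4000))  # quadrupling n halves the error
print(required_n(0.04, 1000, 0.02))     # halving the error needs 4x the students
```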

No interaction between sample size and parameters
◦ Change sample size without changing parameters → predictable variation in error
◦ Gives an approach to estimate error on real-world data sets:
  ◦ Take samples with replacement, infer parameters for each, compute stddev
  ◦ Scale using 1/sqrt(n) to estimate stddevs at other sample sizes
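
The resampling approach on this slide resembles a bootstrap, and could be sketched as follows. To keep the example self-contained, the placeholder estimator `fraction_correct` stands in for actual BKT parameter inference (e.g. via EM), which is not shown. The function names, the synthetic data, and the number of resamples are all assumptions.

```python
import random
import statistics

def bootstrap_stddev(students, fit, n_resamples=200, seed=0):
    """Estimate the stddev of fit(sample) by resampling students with
    replacement. `fit` is a hypothetical stand-in for parameter inference."""
    rng = random.Random(seed)
    n = len(students)
    estimates = []
    for _ in range(n_resamples):
        resample = [students[rng.randrange(n)] for _ in range(n)]
        estimates.append(fit(resample))
    return statistics.stdev(estimates)

def fraction_correct(students):
    """Placeholder estimator: overall fraction of correct responses.
    A real analysis would fit BKT parameters here instead."""
    total = sum(len(s) for s in students)
    return sum(sum(s) for s in students) / total

# Synthetic placeholder data: 100 students, 5 binary responses each.
data_rng = random.Random(1)
students = [[int(data_rng.random() < 0.6) for _ in range(5)]
            for _ in range(100)]

sd_100 = bootstrap_stddev(students, fraction_correct)
# Extrapolate to 400 students, assuming stddev is proportional to 1/sqrt(n).
sd_400 = sd_100 * (100 / 400) ** 0.5
print(sd_100, sd_400)
```

Resampling whole students, rather than individual responses, keeps each student's response sequence intact, which matters because BKT models the sequence.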

Knowledge Tracing for Interacting Student Pairs DERRICK COETZEE

Motivation
◦ Standard Bayesian knowledge tracing uses fixed learning rate parameter to capture all learning

Motivation
◦ One way to improve: use information on course materials viewed

Motivation
◦ What about peer interaction (e.g. forums/chat)?
◦ Not fixed/static like instructional materials
◦ The level of knowledge of the other student is important
◦ Use our BKT model of the other student’s knowledge!

Pair interaction scenario
◦ Simple case of student interaction
◦ Two students are paired and always interact between each item (no interactions with others)
[Diagram: Do exercise → Learn independently → Interact with partner → Do exercise → Learn independently]

Pair interaction scenario
◦ Model independent learning and interaction stages

Pair interaction scenario
◦ Model independent learning and interaction stages
◦ New parameters: teach, mislead

Knows   Other student knows   Probability knows after interaction
No      No                    0
Yes     Yes                   1
No      Yes                   teach
Yes     No                    1−mislead
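
The interaction table on this slide can be sketched as a small function. The second function goes beyond the slide: it marginalizes over both students' knowledge states, assuming the two states are independent before the interaction, which is an assumption of this sketch. All function and argument names are illustrative.

```python
def p_knows_after_interaction(knows, other_knows, teach, mislead):
    """Probability that a student knows the skill after interacting,
    given both students' (binary) knowledge states beforehand."""
    if knows and other_knows:
        return 1.0
    if not knows and not other_knows:
        return 0.0
    if not knows and other_knows:
        return teach
    return 1.0 - mislead  # student knows, partner does not

def expected_knows_after(p_self, p_other, teach, mislead):
    """Marginal probability of knowing after interaction, assuming the two
    students' knowledge states are independent (assumption of this sketch)."""
    total = 0.0
    for knows in (False, True):
        for other in (False, True):
            weight = ((p_self if knows else 1 - p_self)
                      * (p_other if other else 1 - p_other))
            total += weight * p_knows_after_interaction(
                knows, other, teach, mislead)
    return total

print(expected_knows_after(0.5, 0.5, teach=0.9, mislead=0.0))
```

With teach = 0 and mislead = 0, the interaction stage leaves each student's knowledge unchanged.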

Results: Preliminary simulations
◦ 5-parameter system (prior, learn, guess, slip, teach)
◦ forget, mislead parameters fixed at zero
◦ Generate synthetic data, run EM from generating values
◦ Same behavior as classic system when teach = 0
◦ Unstable when teach > 0
◦ Converges to trivial solution prior=learn=teach=1, slip=proportion incorrect responses
◦ Occurs for both small and large teach parameters
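
Generating synthetic data for the pair model might look roughly like the sketch below. It assumes the stage order answer, learn independently, interact, with no forgetting, and it does not show the EM fitting step. The function name `simulate_pair` and the prior of 0.2 are arbitrary choices for illustration; the learn, guess, slip and teach values match generating values that appear elsewhere in this transcript.

```python
import random

def simulate_pair(n_items, prior, learn, guess, slip, teach,
                  mislead=0.0, rng=random):
    """Simulate binary responses for one pair of students.
    Each step: both answer an item, learn independently, then interact."""
    knows = [rng.random() < prior, rng.random() < prior]
    responses = [[], []]
    for _ in range(n_items):
        # Do exercise: correct with prob 1 - slip if known, else guess.
        for i in (0, 1):
            p_correct = 1.0 - slip if knows[i] else guess
            responses[i].append(int(rng.random() < p_correct))
        # Learn independently (no forgetting in this sketch).
        for i in (0, 1):
            if not knows[i] and rng.random() < learn:
                knows[i] = True
        # Interact with partner, based on both pre-interaction states.
        before = list(knows)
        for i in (0, 1):
            other = before[1 - i]
            if not before[i] and other:
                knows[i] = rng.random() < teach
            elif before[i] and not other:
                knows[i] = rng.random() >= mislead
    return responses

rng = random.Random(0)
pairs = [simulate_pair(5, prior=0.2, learn=0.09, guess=0.14, slip=0.09,
                       teach=0.9, rng=rng) for _ in range(50)]
print(pairs[0])
```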

Results: Preliminary simulations
◦ 4-parameter system (learn, guess, slip, teach)
◦ forget, mislead, prior fixed at zero
◦ For small teach values (e.g. 0.05), teach converges to zero
◦ Yields nontrivial solutions for large teach values, but other parameters absorb some of the teach:
  ◦ learn=0.0900, guess=0.1400, slip=0.0900, teach=0.9000, 100 students → learn=0.1586, guess=0.1648, slip=0.0856, teach=
  ◦ learn=0.0900, guess=0.1400, slip=0.0900, teach=0.9000, 1000 students → learn=0.1643, guess=0.1940, slip=0.1102, teach=0.7225

Results: Preliminary simulations
◦ 4-parameter system (learn, guess, slip, teach) with students and high teach
◦ prior=0.0000, learn=0.0900, guess=0.1400, slip=0.0900, teach= → prior=0.2184, learn=0.0841, guess=0.1239, slip=0.2658, teach=
◦ prior and slip have high error, but learning/guess/teach are good
◦ teach accuracy increases dramatically with sample size

Possible solutions
◦ Answer items between independent learning and interaction (more observed data)
◦ Mentor/mentee model: knowledge flows in only one direction
◦ Eliminate different parameters, or combine parameters to create lower-dimensional space

Future work
◦ Determine whether interaction model produces better predictions on synthetic data
◦ Gather real-world pair interaction data using MOOCchat tool
◦ Determine whether pair interaction produces better predictions
◦ Typical values, appropriate interpretations for teach and mislead parameters?
◦ Generalize to more complex interactions