
1 Classical and Bayesian Computerized Adaptive Testing Algorithms
Richard J. Swartz, Department of Biostatistics (rswartz@mdanderson.org)

2 Outline
– Principle of computerized adaptive testing
– Basic statistical concepts and notation
– Trait estimation methods
– Item selection methods
– Comparisons between methods
– Current CAT research topics

3 Computerized Adaptive Tests (CAT)
First developed for assessment testing
Test tailored to an individual
– Only questions relevant to the individual's trait level
– Shorter tests
Sequential adaptive selection problem
Requires an item bank
– Fit with IRT models
– Extensive initial development before CAT implementation

4 Item Bank Development I
Qualitative item development
– Content experts
– Response categories
Test model fit
– Likelihood ratio based methods
– Model fit indices

5 Item Bank Development II
Test assumption: Unidimensionality
– Factor analysis
– Confirmatory factor analysis
– Multidimensional IRT models
Test assumption: Local independence
– Residual correlation after the 1st factor is removed
– Multidimensional IRT models

6 Item Bank Development III
Test assumption: Invariance
– DIF = differential item functioning
– Over time
– Across groups (e.g., men vs. women)
– Many different methods (the logistic regression method, area between response curves, and others)

7 CAT Implementation
(Figure: items from the bank, numbered 1-15, arrayed along the latent continuum from low to high depression, with item parameter labels shown for a few selected items.)

8 CAT Item Selection
– Estimate latent trait
– Greedily find "best" item
– Administer item and wait for response
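The slides contain no code; the sketch below is a minimal illustration of this estimate-select-administer-update cycle, assuming a dichotomous 2PL item bank and EAP scoring on a quadrature grid (the talk's actual bank is polytomous and fit with the graded response model). All item parameters, the grid, and the simulated respondent are made up for the example.

```python
import numpy as np

# Illustrative 2PL item bank: one row per item, columns (a, b).
bank = np.array([[1.2, -1.0],
                 [0.8,  0.0],
                 [1.5,  0.5],
                 [1.1,  1.2],
                 [0.9, -0.5]])

grid = np.linspace(-4, 4, 81)          # quadrature grid for theta
posterior = np.exp(-0.5 * grid**2)     # standard normal prior (unnormalized)

def p_correct(theta, a, b):
    """2PL probability of a positive response at theta."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def fisher_info(theta, a, b):
    """2PL item information at theta."""
    p = p_correct(theta, a, b)
    return a**2 * p * (1.0 - p)

rng = np.random.default_rng(0)
true_theta = 0.7                       # simulated respondent
administered, responses = [], []

for step in range(3):                  # fixed-length 3-item CAT, for illustration
    # 1. Estimate the latent trait: EAP under the current (unnormalized) posterior.
    theta_hat = np.sum(grid * posterior) / np.sum(posterior)

    # 2. Greedily pick the remaining item with maximum Fisher information at theta_hat.
    remaining = [i for i in range(len(bank)) if i not in administered]
    best = max(remaining, key=lambda i: fisher_info(theta_hat, *bank[i]))

    # 3. "Administer" the item: here the response is simulated from the 2PL model.
    u = rng.random() < p_correct(true_theta, *bank[best])
    administered.append(best)
    responses.append(int(u))

    # 4. Update the posterior with the new item's likelihood contribution.
    p = p_correct(grid, *bank[best])
    posterior = posterior * (p if u else 1.0 - p)

print("items:", administered, "responses:", responses,
      "final EAP:", round(float(np.sum(grid * posterior) / np.sum(posterior)), 3))
```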

9 Basic Concepts / Notation

10 Basic Concepts / Notation II
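The equations and notation on slides 9 and 10 did not survive transcription. The later slides appear to rely on a standard IRT setup, roughly the following (a reconstruction, not the author's exact notation):

– \theta: latent trait (e.g., depression severity) of the person being tested
– u_i \in \{1, \dots, m_i\}: observed response to item i
– P_{ix}(\theta) = \Pr(U_i = x \mid \theta): category response probability under the IRT model (here the graded response model, with discrimination a_i and thresholds b_{i1} < \dots < b_{i,m_i-1})
– u_{k-1} = (u_{i_1}, \dots, u_{i_{k-1}}): responses to the k-1 items already administered
– R_k: set of items remaining in the bank at stage k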

11 TRAIT ESTIMATION

12 Estimating Traits
Assumes item parameters are known
Represent the individual's ability
Done sequentially in CAT
Estimate is updated after each additional response
– Maximum likelihood estimator
– Bayesian estimators

13 Likelihood
Model describing a person's response pattern:
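The model equation itself did not survive transcription; under local independence the likelihood of a response pattern is usually written as (a reconstruction in the notation sketched after slide 10):

L(\theta \mid u_{k-1}) = \prod_{j=1}^{k-1} P_{i_j,\, u_{i_j}}(\theta)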

14 Maximum Likelihood Estimate
Frequentist: the "likely" value to have generated the responses
Consistency and efficiency depend on the selection methods and item bank used
Does not always exist
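Written out (standard form, not the slide's own equation):

\hat{\theta}_{ML} = \arg\max_{\theta}\, L(\theta \mid u_{k-1})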

15 Bayesian Framework
θ is a random variable
A distribution on θ describes knowledge prior to data collection (prior distribution)
Update information about θ (the trait) as data is collected (posterior distribution)
Describes a distribution of θ values instead of a point estimate

16 Bayes Rule
Combines information about θ (prior) with information from the data (likelihood)
Posterior ∝ Likelihood × Prior
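In symbols (standard form, not the slide's own equation):

\pi(\theta \mid u_{k-1}) = \frac{L(\theta \mid u_{k-1})\, \pi(\theta)}{\int L(t \mid u_{k-1})\, \pi(t)\, dt} \;\propto\; L(\theta \mid u_{k-1})\, \pi(\theta)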

17 Maximum A Posteriori (MAP) Estimate
Properties:
– With a uniform prior, equivalent to the MLE over the support of the prior
– For some prior/likelihood combinations, the posterior can be multimodal
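Written out (standard form, not the slide's own equation):

\hat{\theta}_{MAP} = \arg\max_{\theta}\, \pi(\theta \mid u_{k-1})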

18 Expected A Posteriori (EAP) Estimate
Properties:
– Always exists for a proper prior
– Easy to calculate with numerical integration techniques
– Prior influences the estimate
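Written out (standard form): the EAP is the posterior mean, in practice evaluated on a quadrature grid.

\hat{\theta}_{EAP} = E(\theta \mid u_{k-1}) = \int \theta\, \pi(\theta \mid u_{k-1})\, d\theta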

19 Posterior Variance
Describes variability of θ
Can be used as the conditional standard error of measurement (SEM) for a given response pattern
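Written out (standard form, not the slide's own equation):

Var(\theta \mid u_{k-1}) = \int \big(\theta - \hat{\theta}_{EAP}\big)^2\, \pi(\theta \mid u_{k-1})\, d\theta, \qquad SEM = \sqrt{Var(\theta \mid u_{k-1})}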

20 ITEM SELECTION

21 Item Selection Algorithms
Choose the item that is "best" for the individual being tested
Define "best":
– Most information about the trait estimate
– Greatest reduction in expected variability of the trait estimate

22 Fisher's Information
Information of a given item at a trait value
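The information function itself was not transcribed. For a dichotomous item with response function P_i(\theta), the standard form is

I_i(\theta) = E\left[-\frac{\partial^2}{\partial \theta^2} \log L_i(\theta)\right] = \frac{[P_i'(\theta)]^2}{P_i(\theta)\,[1 - P_i(\theta)]},

which for the 2PL reduces to a_i^2\, P_i(\theta)\,[1 - P_i(\theta)]; the graded response model used in this talk sums an analogous term over the item's response categories.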

23 Maximum Fisher's Information
Myopic algorithm
Pick the item i_k at stage k (i_k ∈ R_k) that maximizes Fisher's information at the current trait estimate (classically the MLE):
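In symbols (reconstruction): i_k = \arg\max_{i \in R_k} I_i(\hat{\theta}_{k-1}), where \hat{\theta}_{k-1} is the current trait estimate after k-1 responses.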

24 MFI - Selection

25 Minimum Expected Posterior Variance (MEPV)
Selects the item that yields the minimum predicted posterior variance given previous responses
Uses the predictive distribution
A myopic Bayesian decision-theoretic approach (minimizes Bayes risk)
First described by Owen (1969, 1975)

26 Predictive Distribution
Predict the probability of a response to an item given previous responses
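In the notation above (reconstruction): the model probability of each possible response, averaged over the current posterior.

P(U_i = x \mid u_{k-1}) = \int P_{ix}(\theta)\, \pi(\theta \mid u_{k-1})\, d\theta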

27 Bayesian Decision Theory
Dictates optimal (sequential adaptive) decisions
In addition to the prior and likelihood, specify a loss function (squared error loss):

28 Bayesian Decision Theory: Item Selection
The optimal estimator for squared-error loss is the posterior mean (EAP)
Select the item that minimizes the Bayes risk:
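The loss and risk expressions were not transcribed; in standard form (a reconstruction), squared-error loss is L(\hat{\theta}, \theta) = (\hat{\theta} - \theta)^2, its Bayes-optimal estimator is the posterior mean, and the selection rule is

i_k = \arg\min_{i \in R_k}\; E_{U_i \mid u_{k-1}}\!\big[\, Var(\theta \mid u_{k-1}, U_i)\, \big],

which is exactly the MEPV criterion restated on the next slide.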

29 Minimum Expected Posterior Variance (MEPV)
Pick the item i_k remaining in the bank at stage k (i_k ∈ R_k) that minimizes the expected posterior variance (with respect to the predictive distribution):
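A hedged numerical sketch of this criterion for dichotomous 2PL items (the talk's bank is polytomous graded-response; the polytomous version averages over all m response categories of the predictive distribution instead of two). All parameter values and item names are made up.

```python
import numpy as np

grid = np.linspace(-4, 4, 81)              # quadrature grid for theta
posterior = np.exp(-0.5 * grid**2)         # current (unnormalized) posterior; here just the prior

def p_correct(theta, a, b):
    """2PL probability of a positive response at theta."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def variance_on_grid(weights):
    """Variance of theta under an unnormalized density evaluated on the grid."""
    w = weights / weights.sum()
    mean = np.sum(grid * w)
    return np.sum((grid - mean) ** 2 * w)

def expected_posterior_variance(post, a, b):
    """Average the candidate item's two possible updated posterior variances,
    weighted by the predictive probability of each response."""
    p = p_correct(grid, a, b)
    w = post / post.sum()
    pred_pos = np.sum(p * w)                        # predictive P(U = 1 | previous responses)
    var_pos = variance_on_grid(post * p)            # posterior variance if the response is 1
    var_neg = variance_on_grid(post * (1.0 - p))    # posterior variance if the response is 0
    return pred_pos * var_pos + (1.0 - pred_pos) * var_neg

# Candidate (a, b) pairs remaining in the bank; pick the one with smallest expected variance.
candidates = {"item_3": (1.2, -1.0), "item_8": (0.8, 0.0), "item_12": (1.5, 0.5)}
best = min(candidates, key=lambda k: expected_posterior_variance(posterior, *candidates[k]))
print("MEPV selects:", best)
```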

30 Other Information Measures
Weighted measures
– Maximum Likelihood Weighted Fisher's Information (MLWI)
– Maximum Posterior Weighted Fisher's Information (MPWI)
Kullback-Leibler information: a global information measure
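The formulas on this slide were not transcribed; the commonly cited forms are roughly the following (stated as an assumption, not the slide's own equations):

MLWI: i_k = \arg\max_{i \in R_k} \int I_i(\theta)\, L(\theta \mid u_{k-1})\, d\theta
MPWI: i_k = \arg\max_{i \in R_k} \int I_i(\theta)\, \pi(\theta \mid u_{k-1})\, d\theta
KL index: K_i(\hat{\theta}) = \int_{\hat{\theta}-\delta}^{\hat{\theta}+\delta} \sum_{x} P_{ix}(\hat{\theta}) \log \frac{P_{ix}(\hat{\theta})}{P_{ix}(\theta)}\, d\theta,

where the KL index measures how well item i discriminates \hat{\theta} from nearby trait values over an interval, rather than information at a single point.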

31 Hybrid Algorithms
Maximum Expected Information (MEI)
– Use observed information
– Predict information for the next item
Maximum Expected Posterior Weighted Information (MEPWI)
– Use observed information
– Predict information for the next item
– Weight with the posterior
– MEPWI ≈ MPWI

32 Mix-N-Match
MAP with a uniform prior to approximate the MLE
MFI using the EAP instead of the MLE (any point information function)
Use EAP for item selection, but MLE for the final trait estimate

33 COMPARISONS

34 Study Design
Real item bank
– Depressive symptom items (62)
– 4 response categories (fit with the graded response IRT model)
Peaked bank: items have "narrow" coverage
Flat bank: items have "wider" coverage
Fixed-length 5-, 10-, and 20-item CATs

35 Datasets Used
Post hoc simulation using real data:
– 730 patients and caregivers at MDA
– Real bank only
Simulated data:
– θ grid: -3 to 3 by 0.5
– 500 "simulees" per θ value
– Simulated and real banks

36 Real Item Bank Characteristics

37 Real Bank, Real Data, 5 Items

38 Real Bank, Real Data, 5 Items

Selection Criterion    Mean SE²   RMSD     CORR
MFI                    0.1463     0.3763   0.9069
MLWI                   0.1432     0.3736   0.9094
MPWI                   0.1396     0.3738   0.9080
MEPV                   0.1388     0.3598   0.9149
MEI (Fisher's)         0.1388     0.3632   0.9134
MEI (Observed)         0.1388     0.3616   0.9139
Random                 0.2369     0.4567   0.8565

39 Peaked Bank, Sim. Data, 5 Items

40 Peaked Bank, Sim. Data, 5 Items

Selection Criterion    BIAS       RMSE     CORR
MFI                    0.0283     0.3923   0.9822
MLWI                   0.0678     0.4798   0.9724
MPWI                   0.0261     0.3898   0.9822
MEPV                   0.0232     0.3871   0.9822
MEI (Fisher's)         0.0299     0.3903   0.9824
MEI (Observed)         0.0283     0.3911   0.9823
Random                 0.0095     0.8378   0.9233

41 Summary
Polytomous items (Choi and Swartz, in press)
– Classic MFI with the MLE, and MLWI, did not perform as well as the other criteria
– MFI with the EAP and all other criteria performed essentially the same
Dichotomous items (van der Linden, 1998)
– MFI with the MLE not as good as all the others
– The difference is more pronounced for shorter tests

42 Adaptations / Active Research Areas
– Constrained adaptive tests / content balancing
– Exposure control
– A-stratified adaptive testing
– Item selection including burden
– Cheating detection
– Response times

43 Thank You!

44 References and Further Reading
Choi SW, Swartz RJ (in press). Comparison of CAT item selection criteria for polytomous items. Applied Psychological Measurement.
Owen RJ (1969). A Bayesian approach to tailored testing (Research Report 69-92). Princeton, NJ: Educational Testing Service.
Owen RJ (1975). A Bayesian sequential procedure for quantal response in the context of adaptive mental testing. Journal of the American Statistical Association, 70, 351-356.
van der Linden WJ (1998). Bayesian item selection criteria for adaptive testing. Psychometrika, 63(2), 201-216.
van der Linden WJ, Glas CAW (Eds.) (2000). Computerized Adaptive Testing: Theory and Practice. Dordrecht; Boston: Kluwer Academic.


46 MLE Properties
Usually has desirable asymptotic properties
Consistency and efficiency depend on the selection criteria and item bank
A finite estimate does not exist when all responses fall in the extreme categories (1 or m)

