Accounting for Individual Differences in Bradley-Terry Models by Means of Recursive Partitioning Carolin Strobl Florian Wickelmaier Achim Zeileis
Three Models Thurstone's law of comparative judgment: the attractiveness of a stimulus is normally distributed in the population (uses the normal ogive). Bradley-Terry-Luce (BTL) model: the probability of choosing an alternative depends on the ratio of the attractiveness of that alternative to the sum of the attractiveness values of all alternatives (uses the logistic function). Rasch model: one attractiveness is fixed at 1, the other is exp(theta - beta).
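The BTL ratio rule can be illustrated with a few lines of code; the worth (attractiveness) values below are made-up for the example:

```python
def btl_choice_prob(item, alternatives, worth):
    """BTL rule: P(choose item) = worth of item / sum of worths of all alternatives."""
    return worth[item] / sum(worth[a] for a in alternatives)

# Hypothetical worth (attractiveness) values for three stimuli.
worth = {"A": 4.0, "B": 2.0, "C": 2.0}
p_a = btl_choice_prob("A", ["A", "B", "C"], worth)  # 4 / (4 + 2 + 2) = 0.5
```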
Research Question The preference scaling may be heterogeneous across a group of subjects, but related to stimulus characteristics and/or person characteristics. Previous approaches: fit separate BT models to different groups of subjects (McGuire & Davison, 1991; Kissler & Bäuml, 2001); add covariates explicitly to the model (Böckenholt, 2001). This work proposes a new model-based recursive partitioning method to incorporate subject covariates in the BT model.
Procedure of Recursive Partitioning 1. Fit a BT model to the paired comparisons of all subjects in the current subsample, starting with the full sample 2. Assess the stability of the BT model parameters with respect to each available covariate 3. If there is significant instability, split the sample along the covariate with the strongest instability and use the cutpoint with the highest improvement of the model fit 4. Repeat steps 1-3 recursively in the resulting subsamples until there are no more significant instabilities (or the subsample is too small)
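The four steps above can be sketched as a recursive function. This is a structural sketch, not the paper's implementation: the model fit, the instability test, and the cutpoint search are passed in as plain callables, and the toy ingredients below (a mean "model", a crude median-split instability check) only illustrate the control flow.

```python
def partition(subjects, covariates, fit, instability_pvalue, best_cutpoint,
              alpha=0.05, min_size=4):
    """Model-based recursive partitioning skeleton (steps 1-4)."""
    model = fit(subjects)                                      # step 1: fit on current subsample
    if len(subjects) < min_size:
        return {"model": model, "n": len(subjects)}
    pvals = {c: instability_pvalue(subjects, model, c)         # step 2: stability per covariate
             for c in covariates}
    cov, p = min(pvals.items(), key=lambda kv: kv[1])
    if p >= alpha:                                             # no significant instability: leaf
        return {"model": model, "n": len(subjects)}
    cut = best_cutpoint(subjects, cov)                         # step 3: best cutpoint
    left = [s for s in subjects if s[cov] <= cut]
    right = [s for s in subjects if s[cov] > cut]
    return {"split": (cov, cut),                               # step 4: recurse into subsamples
            "left": partition(left, covariates, fit, instability_pvalue,
                              best_cutpoint, alpha, min_size),
            "right": partition(right, covariates, fit, instability_pvalue,
                               best_cutpoint, alpha, min_size)}

# Toy ingredients (illustrative only): the "model" is just a mean outcome.
def fit_mean(subs):
    return sum(s["y"] for s in subs) / len(subs)

def crude_pvalue(subs, model, cov):
    # Pretend p-value: small when outcomes differ strongly across a median split.
    med = sorted(s[cov] for s in subs)[len(subs) // 2]
    lo = [s["y"] for s in subs if s[cov] <= med]
    hi = [s["y"] for s in subs if s[cov] > med]
    if not lo or not hi:
        return 1.0
    return 0.01 if abs(sum(lo) / len(lo) - sum(hi) / len(hi)) > 0.5 else 1.0

def median_cutpoint(subs, cov):
    return sorted(s[cov] for s in subs)[len(subs) // 2 - 1]

# Subjects whose outcomes flip with age should trigger exactly one split.
subjects = [{"age": a, "y": 0.0} for a in range(10)] + \
           [{"age": a, "y": 1.0} for a in range(10, 20)]
tree = partition(subjects, ["age"], fit_mean, crude_pvalue, median_cutpoint)
```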
Fitting BT models π_j are stimulus-specific parameters (called worth parameters, or merits); ν is a discrimination constant (Böckenholt added covariates to model these parameters). For a comparison of stimuli j and j', the three response probabilities are: P(prefer j over j') = π_j / (π_j + π_j' + ν·sqrt(π_j·π_j')); P(prefer j' over j) = π_j' / (π_j + π_j' + ν·sqrt(π_j·π_j')); P(undecided, a tie) = ν·sqrt(π_j·π_j') / (π_j + π_j' + ν·sqrt(π_j·π_j')).
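These three probabilities can be computed directly; a minimal sketch, assuming the Davidson-style tie term (setting ν = 0 recovers the ordinary Bradley-Terry model without ties):

```python
from math import sqrt

def bt_probs(pi_j, pi_k, nu=0.0):
    """Response probabilities for one paired comparison of stimuli j and k:
    prefer j, prefer k, or undecided (tie), with discrimination constant nu."""
    denom = pi_j + pi_k + nu * sqrt(pi_j * pi_k)
    return {"prefer_j": pi_j / denom,
            "prefer_k": pi_k / denom,
            "tie": nu * sqrt(pi_j * pi_k) / denom}

probs = bt_probs(0.6, 0.3, nu=0.5)   # three probabilities summing to 1
```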
Assessing parameter instability in BT models Fluctuation test: computes subject-wise model deviations that should fluctuate randomly around zero under the null hypothesis of parameter stability
Subject-wise score (estimating) function: the derivative of each subject's log-likelihood contribution with respect to the parameter vector. These derivatives are cumulatively aggregated along each of the covariates.
Test statistics for systematic deviations: a supLM-type statistic for numeric covariates and a chi-square-type statistic for categorical covariates. The covariate with the smallest p-value is used to determine the split in the recursive partition (the sequence of splits).
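The idea behind the fluctuation test can be illustrated with a scalar toy version: order the subject-wise score contributions by a covariate, cumulate them, and look at how far the cumulative sum drifts from zero. This sketch omits the paper's actual supLM/chi-square machinery, and the score values below are made-up:

```python
def cusum_process(scores, covariate):
    """Scaled cumulative sum of centered subject contributions, ordered by a
    covariate; under parameter stability it fluctuates randomly around zero."""
    n = len(scores)
    ordered = [s for _, s in sorted(zip(covariate, scores))]
    mean = sum(ordered) / n
    sd = (sum((s - mean) ** 2 for s in ordered) / n) ** 0.5 or 1.0
    out, total = [], 0.0
    for s in ordered:
        total += (s - mean) / (sd * n ** 0.5)
        out.append(total)
    return out

def max_fluctuation(scores, covariate):
    return max(abs(v) for v in cusum_process(scores, covariate))

ages = list(range(20))
stable = [(-1.0) ** i for i in range(20)]    # contributions alternate: no drift
unstable = [-1.0] * 10 + [1.0] * 10          # sign flips with age: strong drift
```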
Age has the smallest p-value and is thus used as the first splitting variable in Figure 2.
Cutpoint selection in BT models After the l-th covariate has been chosen for splitting, the cutpoint is selected by maximizing the partitioned likelihood.
For numeric and ordered covariates, every possible cutpoint is evaluated. For unordered covariates, the Q categories of the unordered categorical covariate can be split into any two groups. The partition with the maximal likelihood is chosen as the cutpoint.
Application Example 1. Germany's Next Topmodel 2007 data
The worth parameter estimates show the preference for each candidate (in other words, the attractiveness of the candidate to the subjects). For subjects aged over 52, the probability of choosing Barbara over Anni follows from the estimated worth parameters as π_Barbara / (π_Barbara + π_Anni). [Figure: the tree yields four subgroups: age <= 52 with q2 = yes; age <= 52, q2 = no, male; age <= 52, q2 = no, female; and age > 52.]
In the second application example, the first split is by Italian skill, the second by Spanish/French skill, and then, within the French group, the third split is by field of study.
Discussion The recursive partitioning approach, being nonparametric (data-driven), is more flexible in detecting nonlinear and interaction effects of covariates. Its treatment of numeric and categorical covariates is more natural: the split of a numeric covariate is selected automatically in a data-driven way. By contrast, a fully parametric approach requires not only an active selection of the covariates but also an explicit choice of the functional form in which the covariates enter the model.
Latent class approach vs. recursive partitioning approach; application of the recursive partitioning approach to the Rasch model in future studies
Comments A fundamental problem in this paper: the equal-mean-difficulty (EMD) constraint was used in this study. Consequently, the magnitudes of the deviations from the mean sum to zero across the persons in Figure 1. Without the EMD constraint, it would be possible to use one person's estimate as an anchor. Moreover, it would be desirable to incorporate a purification procedure into the recursive partitioning approach.