Evaluating Non-EU Models Michael H. Birnbaum Fullerton, California, USA.


Outline This talk will review tests between Cumulative Prospect Theory (CPT) and Transfer of Attention eXchange (TAX) models. Emphasis will be on experimental design; i.e., how we select the choices we present to the participants. How not to design a study will be contrasted with how one should devise diagnostic tests.

Cumulative Prospect Theory/ Rank-Dependent Utility (RDU)

Nested Models

Testing Nested Models Because EV is a special case of EU, there is no way to refute EU in favor of EV. Because EU is a special case of CPT, there is no way to refute CPT in favor of EU. We can do significance tests and cross-validation. Are deviations significant? Do we improve prediction by estimating additional parameters (cross-validation)? (It can easily occur that CPT fits significantly better but does worse than EU on cross-validation.)

Indices of Fit have little Value in Comparing Models Indices of fit such as percentage of correct predictions or correlations between theory and data are often insensitive and can be misleading when comparing non-nested models. In particular, problems of measurement, parameters, functional forms, and “error” can make a worse model achieve higher values of the index.

Individual Differences If some individuals are best fit by EV, some by EU, and some by CPT, we would say CPT is the “best” model because all participants can be fit by the same model. But with non-nested models with “errors” it is likely that some individuals will appear “best” fit by a “wrong” model.

“Prior” TAX Model Assumptions:

TAX Parameters For 0 < x < \$150, u(x) = x gives a decent approximation. Risk aversion produced by

Non-nested Models Special TAX and CPT are both special cases of a more general rank-affected configural weight model, and both have EU as a special case, but neither of these models is nested in the other. Both can account for Allais paradoxes but do so in different ways.

How not to test among the models Choices of form: (x, p; y, q; z) versus (x, p; y’, q; z) EV, EU, CPT, and TAX as well as other models all agree for such choices. Furthermore, picking x, y, z, y’, p, and q randomly will not help.

Non-nested Models

CPT and TAX nearly identical inside the prob. simplex

How not to test non-EU models Tests of Allais types 1, 2, 3 do not distinguish TAX and CPT. No point in fitting these models to such non-diagnostic data. Choosing random levels of the gamble features does not add anything.

Testing CPT TAX: Violations of: Coalescing; Stochastic Dominance; Lower Cumulative Independence; Upper Cumulative Independence; Upper Tail Independence; Gain-Loss Separability.

Testing TAX Model CPT: Violations of: 4-Distribution Independence; 3-Lower Distribution Independence; 3-2 Lower Distribution Independence; 3-Upper Distribution Independence.

Allais Paradox 80% prefer R = (\$100, 0.1; \$7) over S = (\$50, 0.15; \$7). 20% prefer R’ = (\$100, 0.9; \$7) over S’ = (\$100, 0.8; \$50). This reversal violates the “Sure Thing” Axiom. Is it due to violation of coalescing, restricted branch independence, or transitivity?

Decision Theories and Allais Paradox

                    Branch Independence
Coalescing      Satisfied           Violated
Satisfied       EU, CPT*, OPT*      RDU, CPT*
Violated        SWU, OPT*           RAM, TAX, GDU

Stochastic Dominance This choice does test between CPT and TAX: (x, p; y, q; z) vs. (x, p – q; y’, q; z). Note that this recipe uses 4 distinct values of consequences. It falls outside the probability simplex defined on three consequences.
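To make the notion of dominance concrete, here is a minimal Python sketch of a first-order stochastic dominance check. The function names and the two example gambles are illustrative assumptions chosen for this sketch; they are not taken from the slides. The example uses four distinct consequences, so, like the recipe above, it lies outside a three-consequence probability simplex.

```python
from fractions import Fraction as F

def cdf_at(gamble, t):
    """P(outcome <= t) for a gamble given as [(outcome, probability), ...]."""
    return sum(p for x, p in gamble if x <= t)

def dominates(g, f):
    """True if g first-order stochastically dominates f.

    g dominates f when g's cumulative distribution is never above f's
    and is strictly below it for at least one consequence.
    """
    points = sorted({x for x, _ in g} | {x for x, _ in f})
    g_cdf = [cdf_at(g, t) for t in points]
    f_cdf = [cdf_at(f, t) for t in points]
    never_worse = all(a <= b for a, b in zip(g_cdf, f_cdf))
    somewhere_better = any(a < b for a, b in zip(g_cdf, f_cdf))
    return never_worse and somewhere_better

# Hypothetical gambles (exact fractions avoid floating-point ties).
g_plus = [(96, F(90, 100)), (14, F(5, 100)), (12, F(5, 100))]
g_minus = [(96, F(85, 100)), (90, F(5, 100)), (12, F(10, 100))]

print(dominates(g_plus, g_minus))
print(dominates(g_minus, g_plus))
```

A participant who chooses `g_minus` over `g_plus` in this sketch would be violating stochastic dominance, which is the kind of response such a test is designed to detect.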

Basic Assumptions Each choice in an experiment has a true choice probability, p, and an error rate, e. The error rate is estimated from (and is the “reason” given for) inconsistency of response to the same choice by the same person over repetitions.

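One Choice, Two Repetitions [Table: response on the first repetition (A or B) cross-tabulated with response on the second repetition (A or B)]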

Solution for e The proportion of preference reversals between repetitions allows an estimate of e. Both off-diagonal entries should be equal, and are equal to:
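The cell probabilities can be sketched as follows, assuming that p denotes the true probability of preferring B, that errors occur independently on each repetition, and that the error rate e is the same in both directions. These assumptions are a reading of the Basic Assumptions slide rather than a formula shown in the transcript.

```latex
\begin{align*}
P(AA) &= (1-p)(1-e)^2 + p\,e^2 \\
P(AB) = P(BA) &= (1-p)(1-e)\,e + p\,e\,(1-e) = e(1-e) \\
P(BB) &= p\,(1-e)^2 + (1-p)\,e^2
\end{align*}
```

Under these assumptions the off-diagonal entries do not depend on p, which is why the proportion of reversals identifies e.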

Estimating e

Estimating p

Testing if p = 0

Ex: Stochastic Dominance 122 Undergrads: 59% repeated violations (BB); 28% preference reversals (AB or BA). Estimates: e = 0.19; p = 0.85. 170 Experts: 35% repeated violations; 31% reversals. Estimates: e = 0.196; p = 0.50. Chi-squared tests reject H0: p < 0.4.
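As a rough illustration of how such estimates can be obtained, the following Python sketch solves the model equations directly from the two observed proportions. It assumes independent errors with a common rate e, so that the total reversal rate is 2e(1 - e) and P(BB) = p(1 - e)^2 + (1 - p)e^2. The function names are hypothetical, and this simple moment-matching approach is not necessarily the procedure used to produce the estimates on the slide, so its results may differ somewhat from them.

```python
import math

def estimate_error_rate(reversal_rate):
    """Solve 2e(1 - e) = reversal_rate for e, taking the root below 1/2."""
    if not 0 <= reversal_rate < 0.5:
        raise ValueError("reversal rate must lie in [0, 0.5)")
    return (1 - math.sqrt(1 - 2 * reversal_rate)) / 2

def estimate_true_probability(repeated_rate, e):
    """Solve P(BB) = p(1 - e)^2 + (1 - p)e^2 for p, given e < 1/2."""
    # (1 - e)^2 - e^2 simplifies to 1 - 2e, which is positive for e < 1/2.
    return (repeated_rate - e ** 2) / (1 - 2 * e)

# Observed proportions from the slide: (label, repeated violations, reversals).
samples = [("undergraduates", 0.59, 0.28), ("experts", 0.35, 0.31)]

for label, repeated, reversals in samples:
    e_hat = estimate_error_rate(reversals)
    p_hat = estimate_true_probability(repeated, e_hat)
    print(f"{label}: e = {e_hat:.3f}, p = {p_hat:.3f}")
```

Testing a hypothesis such as p = 0 would then amount to comparing the fit of the model with p fixed against the fit with p free, which this sketch does not attempt.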

Results: CPT makes wrong predictions for all 12 tests. Can CPT be saved by using different participants? Not yet. Can CPT be saved by using different formats for presentation? More than a dozen formats have been tested. Violations of coalescing, stochastic dominance, lower and upper cumulative independence replicated with 14 different formats and thousands of participants.

Implications Results are quite clear: neither PT nor CPT is descriptive of risky decision making. TAX correctly predicts the violations of CPT; several predictions were made in advance of experiments. However, it might be a series of lucky coincidences that TAX has been successful. Perhaps some other theory would be more accurate than TAX. Luce and Marley are working with GDU, a family of models that violate coalescing.