Decision Analysis Lecture 2


1 Decision Analysis Lecture 2
Tony Cox. My course web site:

2 Toward higher-value analytics
Reorientation: from solving well-posed problems to discovering how to act more effectively
Descriptive analytics: What's happening?
Predictive analytics: What's (probably) coming next?
Causal analytics: What can we do about it?
Prescriptive analytics: What should we do?
Evaluation analytics: How well is it working?
Learning analytics: How to do better?
Collaboration: How to do better together?

3 Agenda
Problem set 1 solutions
Normal form (decision table) analysis
Assignment 2
Decision tables and decision trees
Why use utility functions and EU?
Risk profiles
Stochastic dominance

4 Class 1, Problem #1: Which is the better choice, A or B?
A gives probabilities (0.1, 0.2, 0.6, 0.1) of values (20, 10, 0, -10)
B gives probabilities (0.7, 0.3) of values (5, -1)
Which has the greater expected value? (Assume for now that the goal is to maximize expected value.)

5 Solution to Problem #1
For choice A, EMV(A) = sum(c(0.1, 0.2, 0.6, 0.1)*c(20, 10, 0, -10)) = 3
For choice B, EMV(B) = sum(c(0.7, 0.3)*c(5, -1)) = 3.2
So, B is preferable (higher EMV)
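The same sums as a short runnable R check (nothing here beyond the numbers above):

p_A <- c(0.1, 0.2, 0.6, 0.1)   # probabilities for choice A
v_A <- c(20, 10, 0, -10)       # values for choice A
p_B <- c(0.7, 0.3)             # probabilities for choice B
v_B <- c(5, -1)                # values for choice B
sum(p_A * v_A)                 # EMV(A) = 3
sum(p_B * v_B)                 # EMV(B) = 3.2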

6 Class 1, Problem #2: Risk attitude and insurance decisions
House is worth $1,200,000
Probability of loss in any year = 0.05
Can buy full insurance against loss for a cost of $100,000 per year
Should the owner buy the insurance?
Solve for a risk-neutral owner, u(x) = x: draw the decision tree and identify the decision with maximum expected monetary value (EMV)
Solve if u(x) = ln(x), where x = final wealth

7 Solution via decision tree: Part 1

8 Solution to Problem #2, Part 1, for the risk-neutral owner
For a risk-neutral owner, utility of final wealth = final wealth: u(x) = x, where x = final wealth
If he insures, expected utility is EU(Insure) = 1,300,000 – 100,000 = 1,200,000
If he does not insure, expected utility is EU(Do not insure) = 0.05*(100,000) + 0.95*(1,300,000) = 1,240,000
Since 1,240,000 > 1,200,000, he should not insure.

9 Normal form analysis (decision table)
        State 1   State 2   EU
Act 1   c11       c12       ?
Act 2   c21       c22       ?
Prob:   p1        p2
Normal form analysis: Assess the utility of the consequence for each act-state pair. Choose the act with maximum expected utility.

10 Solution method: Normal form analysis (decision table)
        State 1   State 2   EU
Act 1   c11       c12       p1*u(c11) + p2*u(c12)
Act 2   c21       c22       p1*u(c21) + p2*u(c22)
Prob:   p1        p2
Normal form analysis: Assess the utility of the consequence for each act-state pair. Choose the act with maximum expected utility.

11 Solution for risk-neutral owner using normal form analysis
               Loss (fire)   No loss     EU
Insure         1,200,000     1,200,000   1,200,000
Do not insure  100,000       1,300,000   0.05*100,000 + 0.95*1,300,000 = 1,240,000
Prob:          0.05          0.95
Normal form analysis: Assess the utility of the consequence for each act-state pair. Choose the act with maximum expected utility (here, Do not insure, since 1,240,000 > 1,200,000).

12 Decision tree for Part 2 (risk-averse owner)

13 Solution to problem #2, Part 2 for risk-averse owner
For the risk-averse owner, utility of final wealth is u(x) = log(x), where x = final wealth (log = ln)
If he insures, expected utility is EU(Insure) = log(1,200,000) = 14.0
If he does not insure, expected utility is EU(Do not insure) = 0.05*log(100,000) + 0.95*log(1,300,000) = 13.95
Since 14.0 > 13.95, he should insure.
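Both parts of the problem fit in a few lines of R (initial wealth of $1,300,000 and the payoffs as given in the slides):

p <- c(0.05, 0.95)              # Pr(loss), Pr(no loss)
insure <- c(1200000, 1200000)   # final wealth if insured (premium paid either way)
no_ins <- c(100000, 1300000)    # final wealth if not insured
sum(p * insure)                 # EMV = 1,200,000
sum(p * no_ins)                 # EMV = 1,240,000 -> the risk-neutral owner does not insure
sum(p * log(insure))            # EU = 14.0 -> the log-utility owner insures
sum(p * log(no_ins))            # EU = 13.95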

14 Normal form analysis for risk-averse owner
               Loss (e.g., fire)   No loss     EU
Insure         1,200,000           1,200,000   log(1,200,000) = 14.0
Do not insure  100,000             1,300,000   0.05*log(100,000) + 0.95*log(1,300,000) = 13.95
Prob:          0.05                0.95
Normal form analysis: Assess the utility of the consequence for each act-state pair. Choose the act with maximum expected utility (here, Insure, since 14.0 > 13.95).

15 Lessons from Problem #2
What the owner should do (to maximize EU) depends on his risk aversion
There is no correct answer in EU theory to "How risk-averse should I be?" Utilities for outcomes are inputs, not outputs
But preferences in EU theory are not arbitrary: they must be coherent. Raising the cost of insurance would not increase its EU to the owner

16 Class 1, Problem # 3: Decision tree analysis
Which decision (install scrubbers, order new cleaner coal, or install new transmission line to hydroplant) maximizes EMV (expected monetary value)? For what range of scrubber prices (shown here as $3M) is “Install scrubbers” the optimal (EMV-maximizing) decision?

17 EMV of different choices
EMV(Install scrubbers) = -3M
If a strike occurs: EMV(do not intervene) = -(0.7*3 + 0.3*5) = -3.6M > -4.5M, so do not intervene if the chance to do so arises
EMV(cleaner coal) = -(0.5*3.6 + 0.5*1.5) = -2.55M
EMV(new line) = 0.1*(-3.5M) + 0.9*(-2.25M) = -(0.1*3.5 + 0.9*2.25)M = -2.375M
So choose the new line. "Install scrubbers" is optimal for cost < 2.375M.
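The same arithmetic, transcribed into R (costs in $M, probabilities as given in the tree):

p_quick <- 0.7    # Pr(quick strike | strike)
p_strike <- 0.5   # Pr(strike)
p_fail <- 0.1     # Pr(new transmission line fails)
emv_no_intervene <- -(p_quick*3 + (1 - p_quick)*5)   # -3.6M, beats -4.5M for intervening
emv_coal <- -(p_strike*3.6 + (1 - p_strike)*1.5)     # -2.55M
emv_line <- -(p_fail*3.5 + (1 - p_fail)*2.25)        # -2.375M, the best option
c(emv_no_intervene, emv_coal, emv_line)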

18 Decision table representation
Strategy                              Works,     Works,        Works,       Fails,     Fails,        Fails,
                                      no strike  quick strike  slow strike  no strike  quick strike  slow strike
Install scrubbers                     -3         -3            -3           -3         -3            -3
Order new coal, intervene if strike   -1.5       -4.5          -4.5         -1.5       -4.5          -4.5
Order new coal, do not intervene      -1.5       -3            -5           -1.5       -3            -5
Install new line                      -2.25      -2.25         -2.25        -3.5       -3.5          -3.5
Probabilities                         0.9*0.5    0.9*0.5*0.7   0.9*0.5*0.3  0.1*0.5    0.1*0.5*0.7   0.1*0.5*0.3

19 Homework #2 (Due by 4:00 PM, January 31)
Problems: Investment decision: Options vs. stock
Readings: Schoemaker 1982, on EU theory
Optional: Kranton notes on EU
Optional: Kirkwood Chapter 1, Decision Trees
Optional: Kranton notes on continuous distributions and on first-order stochastic dominance (FSD), pp.

20 Problem #1: Options vs. stock
Investor has $500 to invest. Stock price is now $28.50. It will be either:
$33.50 with probability 25% (if Apricot wins)
$25.75 with probability 75% (if Apricot loses)
Choice A: Buy $500 of the stock at $28.50
Choice B: Pay $500 for an option to buy 1000 shares of stock for $30,000 (= $30/share)
Choice C: Buy neither; make 8% on the $500

21 Problem #1: Options vs. stock
Draw cumulative distribution functions (“risk profiles”) for A, B, and C, assuming 25% probability that stock price will increase to $33.50 (else will fall to $25.75) Which choice should the investor make? Please submit 3 numbers: EMV(A), EMV(B), EMV(C) How large must the probability be that stock price will increase to $33.50 to make buying the option optimal (EMV-maximizing)? Please submit one number

22 Problem #1: Options vs. stock
Stock price is now $28.50. It will be either $33.50 with probability 25% (if Apricot wins) or $25.75 with probability 75% (if Apricot loses).
Choice A: Buy $500 of the stock at $28.50
EMV(A) = (500/28.50)*33.50*0.25 + (500/28.50)*25.75*0.75

23 Skill 4: Drawing and interpreting risk profiles Choosing by stochastic dominance

24 Challenge: Decision trees can be infinitely large
Example 1: When to quit "triple or nothing"? Flip a fair coin repeatedly: get $3^n if you stop after n heads; get $0 and the game ends if a tail comes up.
St. Petersburg Paradox

25 Challenge: Decision trees can be infinitely large
Example 2: Choose between:
A: $100 with certainty
B: A return that is equally likely to be anywhere between $90 and $120 (uniform probability distribution)
What should an EMV decision-maker do? What to do if u(x) = ln(x)?

26 R to the rescue! Drawing a risk profile (using the "p" prefix before the distribution name) for a uniform distribution ("unif") going from 90 to 120:
> x = c(0:150)
> y = punif(x, 90, 120)
> plot(x, y)

27 Simulation: EMV = about 105
Calculating the mean of a random sample ("r" prefix) for a uniform distribution ("unif") going from 90 to 120:
runif(100) samples 100 times from U[0, 1], the uniform distribution (or density) from 0 to 1
90 + 30*runif(100) gives 100 samples uniformly distributed between 90 and 120
> mean(90 + 30*runif(100))
(output: a random value near 105)

28 Simulation: E[ln(x)] = about 4.65 (Exceeds ln(100) = 4.61)
> log(100)
[1] 4.60517
> mean(log(90 + 30*runif(100)))
> mean(log(90 + 30*runif(1000)))
(each run returns a random value near 4.65)
So, this d.m. should take the risk (Choice B) instead of the $100 with certainty.
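Simulation noise can be avoided here: base R's integrate() gives the exact expectations for the U[90, 120] density (a quick check, not part of the original slides):

f_mean <- function(x) x * dunif(x, 90, 120)        # integrand for E[X]
f_logu <- function(x) log(x) * dunif(x, 90, 120)   # integrand for E[ln(X)]
integrate(f_mean, 90, 120)   # 105: the exact EMV
integrate(f_logu, 90, 120)   # about 4.6507 > log(100) = 4.6052, so take Choice B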

29 Notation: Risk profile or CDF (cumulative distribution function)
Pr(X < x), where X is a random variable (uncertain quantity) and x is a number

30 What is Pr(X < 1000)? What is Pr(X = 1000)?
Pr(X < x), where X is a random variable (uncertain quantity) and x is a number

31 Pr(X < 1000) = 0.6 (point A) Pr(X = 1000) = 0.6 - 0.25 = 0.35
Pr(X < x), where X is a random variable (uncertain quantity) and x is a number

32 Deterministic dominance: Which choice is better, A or B?

33 Stochastic dominance: Which choice is better, C or D?

34 Answers
Whenever more of x is preferred to less, the right-most cumulative distribution function (CDF) is preferred (A and C), because it gives a higher probability of achieving at least any given x value
This is also implied by EU theory: all EU decision-makers with u(x) increasing in x prefer the right-most CDF, if there is one
Fancy name for this concept: first-order stochastic dominance (FSD)

35 Why? Why use EU? Why use N-M utility function?

36 Why Maximize EU? Q: Why not consider variance of utility, probability of gain, probability of obtaining a target level, minimizing maximum loss, etc.?
Math answer: EU is the unique decision principle satisfying these axioms:
Reduction: Only consequence probabilities matter
Weak order: the "is at least as preferred as" relation R is transitive, reflexive, and complete
Independence: aRb iff (a p c)R(b p c), 0 < p < 1, where (a p c) means get a with probability p, else get c
Continuity: (a P b) & (b P c) implies (a p c) is indifferent to b for some p

37 And, furthermore (math answer)…
EU is the unique decision principle satisfying dynamic consistency axioms… … and it is the only procedure satisfying various other (coherent updating, substitution-of-equivalents, etc.) axioms. It is also one of the few procedures that is guaranteed to avoid dominated decisions, sensitivity to logically irrelevant information, and various types of incoherence. Need EU (or something very similar) to evaluate alternative risk profiles.

38 Intuition behind the theory (beyond the math answer)
Expected value of a probability is a probability
Example: Let P = success probability. Then a 0.6 probability that P = 0.2 and a 0.4 probability that P = 0.8 gives a total probability of success of E(P) = 0.6*0.2 + 0.4*0.8 = 0.12 + 0.32 = 0.44
Key idea: Define N-M utility as a probability
A clever way to measure preferences as probabilities, explained next
Then maximizing preferences implies maximizing utility (or maximizing expected utility, for acts).

39 Hypothetical lottery conceptual definition of N-M utility
Let x be a consequence whose utility we want to evaluate
Define u(most-preferred consequence) = 1 and u(least-preferred consequence) = 0
For x, by continuity, there is a p between 0 and 1 such that x is indifferent to a probability p of the most-preferred outcome, else the least-preferred outcome. Define this value of p as the N-M utility of x.
In symbols, u(x) = p*1 + (1 – p)*0 = p = EU of the "canonical lottery" that is indifferent to x.

40 Wrap-up on why to maximize EU
u(x) = the indifference probability that makes a prospect giving the best outcome with some probability, else the worst outcome, exactly indifferent to x with certainty.
A prospect that gives multiple possible outcomes can be reduced (via repeated substitutions) to an equivalent (indifferent) prospect giving only the best or worst outcome.
Because all utility numbers can be interpreted as probabilities, the correct reduction formula is the expected value.
Defining N-M utilities as indifference probabilities guarantees that expected values represent preferences and that the d.m. should maximize expected utility.

41 What kind of measurement scale is utility measured on?
u(x) represents preferences for consequences Preferred consequences have higher utility numbers. (So, the measurement scale is at least ordinal.) EU(X) represents preferences for random consequences. (Interpret acts as r.v.s) u(x) also reflects risk attitude (willingness to accept a risk) – more on this soon

42 What kind of measurement scale is utility measured on?
If u(x) is a utility function representing a d.m.'s preferences and risk attitude, and if w(x) = a*u(x) + b for any constants a > 0 and b, then w(x) also represents the d.m.'s preferences and risk attitude
We can choose the origin and scale, just as with temperature; thus, utility is measured on an interval (cardinal) scale
Choosing 0 and 1 as the two endpoints of the utility scale makes u(x) unique.

43 Decision psychology: Descriptive vs. prescriptive decision science

44 Normative vs. descriptive decision theories
EU is a normative decision theory: it prescribes how we should make decisions
It is not a descriptive theory: there are many violations of EU maximization in practice, as documented by Allais, Ellsberg, Tversky and Kahneman, Slovic, Thaler, and many others
Prospect Theory is a descriptive theory: not perfect, but pretty good for many purposes

45 Concurrent choice experiment
Pick one of A or B, and pick one of C or D
A: Gain $240 for sure
B: 25% chance to gain $1000, else gain 0
C: Sure loss of $750
D: 75% chance to lose $1000, else lose 0
So, your four possible choices are: A & C, A & D, B & C, B & D. Which pair do you choose?

46 Usual Concurrent Choices
Pick one of A or B, and pick one of C or D
A: Gain $240 for sure [84%] (Tversky & Kahneman, '81)
B: 25% chance to gain $1000, else gain 0 [16%]
C: Sure loss of $750 [13%]
D: 75% chance to lose $1000, else lose 0 [87%]
But the usual choice (A & D) is objectively worse than the unusual choice (B & C). Why?
(A and D) is equivalent to a 75% chance to lose $760 (= $240 - $1000), else gain $240.
(B and C) is equivalent to a 75% chance to lose $750, else gain $250.

47 Stochastic dominance
(A and D) gives a 75% chance to lose $760, 25% chance to gain $240. (B and C) gives a 75% chance to lose $750, 25% chance to gain $250.
(B & C) stochastically dominates (A & D). First-order stochastic dominance (FSD): probabilities of preferred outcomes are greater for (B & C) than for (A & D).
DA identifies undominated choices
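The portfolio comparison is easy to verify in R (outcome pairs exactly as stated above):

p <- c(0.75, 0.25)          # Pr(lose), Pr(win) in the combined gambles
AD <- c(240 - 1000, 240)    # A & D: -760 or +240
BC <- c(-750, -750 + 1000)  # B & C: -750 or +250
sum(p * AD)                 # EMV = -510
sum(p * BC)                 # EMV = -500
all(BC >= AD)               # TRUE: B & C first-order dominates A & D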

48 Conclusions on concurrent choice
Humans tend to consider each choice in isolation rather than looking at portfolios of outcomes and their probabilities for different combinations of choices This leads us to make predictably sub-optimal choices. (So do many other psychological barriers to rational decision-making.) Decision analysis (DA) can overcome such sub-optimal decision-making.

49 Framing: Asian Flu, year 1
Policy maker must choose between vaccines against Asian Flu. No vaccine: 600 people die Choose between the following: Vaccine A: Save 200 people with certainty Vaccine B: 1/3 probability save 600 people; 2/3 probability save no one

50 Framing: Asian Flu, year 2
Policy maker must choose between vaccines against Asian Flu. No vaccine: 600 people die Year 2: Choose between the following: Vaccine C: 400 people die with certainty Vaccine D: 1/3 probability nobody dies; 2/3 probability 600 people die

51 Asian Flu example: Results
No vaccine: 600 people die Choose between the following: Vaccine A: Save 200 people with certainty (72%) Vaccine B: 1/3 probability save 600 people; 2/3 probability save no one (28%) Vaccine C: 400 people die with certainty (22%) Vaccine D: 1/3 probability nobody dies; 2/3 probability 600 people die (78%)

52 Framing/presentation affects preferences

53 Conclusions on framing
How prospects are framed (gain vs. loss, positive vs. negative) and how reference points are set can greatly affect choices. Reference point may respond to cues and information that are logically irrelevant Yet, framing has no effect on a rational decision-maker (“Econ” in Richard Thaler’s term, or homo economicus) Real people are not entirely rational

54 Decision psychology: Much room to improve naturalistic decision-making
Q: Why bother with normative / prescriptive decision analysis?
A: Because decision psychology leads to predictable (and correctable) sub-optimal decisions and lower-than-necessary rewards:
Concurrent decisions – Framing – Sunk cost fallacy – Endowment effect – Overconfidence – Confirmation bias – Planning fallacy – Etc.

55 Heuristics and Biases (Tversky and Kahneman; Slovic; Thaler's Misbehaving)

56 Application: Decision psychology in marketing

57 Decision psychology is used to shape preferences and behaviors

58 Use of reference points and framing in marketing

59 Manipulating reference point

60 Multi-criteria reference points: The decoy effect in marketing
Which is better, A or B? The answer is not obvious. Decoy option C shifts preferences toward A; decoy option D shifts preferences toward B.

MP3 player   A       B
price        $400    $300
storage      30 GB   20 GB

MP3 player   A       B       C
price        $400    $300    $450
storage      30 GB   20 GB   25 GB

MP3 player   A       B       D
price        $400    $300    $350
storage      30 GB   20 GB   15 GB

61 Neuromarketing: Investigate decision-making as stimulus-response in brain
Coke vs. Pepsi example: Brand affects taste!

62 Explaining heuristics and biases: Prospect theory
Perceived value of a risky prospect is not a sum of probability*consequence-value terms, but a sum of w(p)v(x) terms:
w(p) = probability weighting function; it overweights small probabilities and underweights large probabilities (certainty effect)
v(x) = value of outcome x relative to some reference point
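For illustration, a sketch of this calculation using the functional forms and median parameter estimates from Tversky and Kahneman's 1992 cumulative prospect theory paper (gamma = 0.61, alpha = 0.88, lambda = 2.25 are their estimates, not values from this lecture):

w <- function(p, gamma = 0.61)                   # probability weighting function
  p^gamma / (p^gamma + (1 - p)^gamma)^(1/gamma)
v <- function(x, alpha = 0.88, lambda = 2.25)    # value function with loss aversion
  ifelse(x >= 0, x^alpha, -lambda * (-x)^alpha)
w(0.25) * v(1000)   # perceived value of "25% chance of $1000": weight ~0.29, not 0.25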

63 Prospect Theory: Four-fold pattern of preferences
Reference point: no stable values, sunk costs; diminishing sensitivity
Loss aversion: a loss is about twice as painful as a gain of the same size
Risk-averse for medium to large probability gains; risk-seeking for small-probability gains
Risk-seeking for medium to large probability losses; risk-averse for small-probability losses (insurance)

64 Prospect theory: Probability weighting (increases with stress)

65 Prospect theory: Probability weighting (increases with stress)

66 Loss aversion, endowment effect and status quo bias
We tend to prefer whatever we have. Selling prices ("willingness-to-sell") may exceed buying prices ("willingness-to-pay") by large amounts.

67 Example: Asian Flu risk management (Tversky & Kahneman, 1981)
"Imagine that the U.S. is preparing for an outbreak of an unusual Asian disease, which is expected to kill 600 people. Two alternative programs to combat the disease are proposed. Assume that the exact scientific estimates of the consequences of the programs are as follows: If program A is adopted, 200 people will be saved. If program B is adopted, there is a 1/3 probability that 600 people will be saved and 2/3 probability that no people will be saved." 72% choose option A instead of B

68 Example: Asian Flu risk management (Tversky & Kahneman, 1981)
"Imagine that the U.S. is preparing for an outbreak of an unusual Asian disease, which is expected to kill 600 people. Two alternative programs to combat the disease are proposed. Assume that the exact scientific estimates of the consequences of the programs are as follows: If program C is adopted, 400 people will die. If program D is adopted, there is a 1/3 probability that no one will die and 2/3 probability that 600 people will die. 78% choose option D (logically equivalent to B)

69 Skill 4: Decision Tables (“Normal form” analysis)

70 Example: Buying a used car

71 One-shot decision: Buying used car
Ingredients: acts, states, consequences; preferences, values, utilities for consequences; probabilities of states. The optimal act maximizes expected utility.

        Good 0.8   Bad 0.2
Buy     1          0
Don't   0          1

"Normal form" decision table (Raiffa, 1968)

72 One-shot decision: Buying used car
Ingredients: acts, states, consequences; preferences, values, utilities for consequences; probabilities of states. The optimal act maximizes expected utility.
Q: What are the acts, states, utilities?

        Good 0.8   Bad 0.2
Buy     1          0
Don't   0          1

"Normal form" decision table (Raiffa, 1968)

75 One-shot decision: Buying used car
Ingredients: acts, states, consequences; preferences, values, utilities for consequences; probabilities of states. The optimal act maximizes expected utility.
Q: Find the expected utility of each act.

        Good 0.8   Bad 0.2   EU
Buy     1          0         0.8*1 + 0.2*0 = 0.8
Don't   0          1         0.8*0 + 0.2*1 = 0.2

"Normal form" decision table (Raiffa, 1968)

76 One-shot decision: Buying used car
Ingredients: acts, states, consequences; preferences, values, utilities for consequences; probabilities of states. The optimal act maximizes expected utility.
Select the act with maximum EU (in this case, Buy).

        Good 0.8   Bad 0.2   EU
Buy     1          0         0.8
Don't   0          1         0.2

"Normal form" decision table (Raiffa, 1968)

77 Decision tables
Key idea: Each possible decision ("act") yields a set of probabilities for consequences, (x, p).
Display acts as rows of a table
Display uncertain states as columns. State = resolution of all uncertainties that, together with the act, determine the outcome
Enter in each cell the N-M utility of the outcome for that row (act) and state (column)
Assess probabilities for the states (columns)
Choose the row to maximize EU(x, p)

78 Example: Ticket-buying decision
A ticket has a 45% chance of winning $10; otherwise it wins nothing. Should a risk-neutral d.m. be willing to pay $4.00 for it?
Decision table analysis: Change in EMV > 0, so yes, the ticket is worth more than $4.00. A risk-neutral d.m. need only consider the change in EMV.

                    State 1 = ticket wins   State 2 = ticket loses   Change in EMV
Act 1 = Do not buy  0                       0                        0
Act 2 = Buy         10 – 4 = 6              0 – 4 = -4               0.45*6 – 0.55*4 = 0.5
State probability   0.45                    0.55

79 Generalization: Payoff matrix
Let p = (p1, p2, ..., pn) = probabilities of states 1, 2, …, n (the n columns in the table).
Let ui = (ui1, ui2, ..., uin) be the N-M utilities if the d.m. chooses act (row) i and state (column) j = 1, 2, …, n occurs. In R notation, ui = U[i, ].
U = m x n "payoff matrix" with rows ui; Uij = utility of the outcome from act i, state j.

80 Creating payoff matrices in R
Command syntax for creating a matrix: matrix(data, nrow, ncol, byrow = T)
Example: Create the following matrix in R:

                    State 1 = ticket wins   State 2 = ticket loses
Act 1 = Do not buy  0                       0
Act 2 = Buy         6                       -4

Solution script:
R: U <- matrix(c(0, 0, 6, -4), 2, 2, byrow = T)
R: colnames(U) <- c('State 1', 'State 2')
R: rownames(U) <- c('Act 1', 'Act 2')
R: U
      State 1 State 2
Act 1       0       0
Act 2       6      -4

81 Solving a payoff matrix for the best decision
Given a payoff matrix U and state probabilities p, the expected utility of row/act/decision i is just sum(p*U[i,]). In algebraic notation, EU[i] = Σj p[j]U[i, j] = expected utility of act i.
R provides the matrix-vector (dot) product operator %*%, which lets us calculate EU for all rows at once and identify the row with the largest EU value. This is the recommended decision.
R: p <- c(0.45, 0.55)
R: EU <- U %*% p
R: colnames(EU) <- 'EU'
R: EU
       EU
Act 1 0.0
Act 2 0.5
R: best_act <- which(EU == max(EU))
R: best_act
[1] 2

82 Summary of decision table analysis R script
Create a payoff matrix:              R: U <- matrix(c(0, 0, 6, -4), 2, 2, byrow = T)
Name the columns:                    R: colnames(U) <- c('State 1', 'State 2')
Name the rows:                       R: rownames(U) <- c('Act 1', 'Act 2')
Display the payoff matrix:           R: U
Specify column (state) probs:        R: p <- c(0.45, 0.55)
Compute EU for each row (act):       R: EU <- U %*% p
Name the vector of EU values "EU":   R: colnames(EU) <- 'EU'
Display the EU values:               R: EU
Find the row (act) with greatest EU: R: best_act <- which(EU == max(EU))
Print number of EU-maximizing row:   R: best_act
Create and print the decision table: R: table <- rbind(cbind(U, EU), p)
                                     R: table[nrow(table), ncol(table)] <- NA
                                     R: table

83 Interim decisions may include collecting more information
Making an observation before acting can change probabilities of outcomes, and hence the EUs of different acts. Information has value if it might change the EU-maximizing act. Later, we shall consider how to quantify value of information (VOI) and how to use information to update probabilities (via Bayes’ Rule and conditional probability) Shrobe and Davis, 2006,

84 A larger decision tree example
Which decision (install scrubbers, order new cleaner coal, or install new transmission line to hydroplant) maximizes EMV (expected monetary value)? For what range of scrubber prices (shown here as $3M) is “Install scrubbers” the optimal (EMV-maximizing) decision?

85 Decision table representation
Strategy                              Works,     Works,        Works,       Fails,     Fails,        Fails,
                                      no strike  quick strike  slow strike  no strike  quick strike  slow strike
Install scrubbers                     -3         -3            -3           -3         -3            -3
Order new coal, intervene if strike   -1.5       -4.5          -4.5         -1.5       -4.5          -4.5
Order new coal, do not intervene      -1.5       -3            -5           -1.5       -3            -5
Install new line                      -2.25      -2.25         -2.25        -3.5       -3.5          -3.5
Probabilities                         0.9*0.5    0.9*0.5*0.7   0.9*0.5*0.3  0.1*0.5    0.1*0.5*0.7   0.1*0.5*0.3

86 Automated solution DAT_EU(daTable,daProb)
Expected utilities for each strategy in decreasing order:
                                        ExpectedUtility  CertaintyEquivalents
Install_new_line
Order_new_coal___do_not_intervene
Install_scrubbers
Order_new_coal___intervene_if_strike

87 Setting up decision tables (Normal form analysis)
Represent the decision problem by a payoff matrix
Rows = acts (or strategies, decision rules); columns = states
"State" = all uncontrolled factors that, with act a, determine the consequence
Each state has a probability
Consequence for act a, state s: c(a, s); utility for a, s: u(a, s) = u[c(a, s)]

88 Normal form analysis
Represent the decision problem by a payoff matrix u(a, s) and state probabilities, p(s)
State probabilities p(s) = Pr(s) represent beliefs. State = all determinants of the consequence, except the act
Utilities represent preferences and risk attitude
Eliminate dominated rows (acts): those with lower payoffs in every state
Identify the row with the highest EU value, where EU(a) = Σs u(a, s)p(s)

89 One-shot decisions: Summary
Solution method: Choose the act with greatest EU.
Calculation: EU(a) = sum over s of Pr(s)u[c(a, s)]
c(a, s) is the consequence of taking act a if the state is s
u[c(a, s)] = utility of consequence c(a, s) (between 0 and 1)
Pr(s) = probability of state s
EU(a) = expected utility of act a = Σs Pr(s)u[c(a, s)]
Why it works: Axioms relate preferences (utilities) for consequences to preferences for acts.
Inputs: Preferences for consequences, u(c); beliefs, Pr(s); consequence model, c(a, s)
Outputs: Recommended a; Pr(c | a) (risk profile)

90 Example: Airline overbooking
Strategic (normal) form model of profits: find the expected-value-maximizing decision.

Decision      s = number of passenger no-shows
              s = 0    s = 1    s = 2
Overbook 0
Overbook 1    -75      200      200
Overbook 2    -150     125      400
p(s)          0.25     0.40     0.35

91 Solution in R
> p = c(0.25, 0.40, 0.35)
> x = c(-75, 200, 200)
> y = c(-150, 125, 400)
> EU1 = sum(p*x)
> EU2 = sum(p*y)
> EU1
[1] 131.25
> EU2
[1] 152.5

92 Filling in the normal form matrix
Assessing utilities: utility elicitation; single-attribute utility theory; multi-attribute utility theory
Assessing probabilities: eliciting well-calibrated probabilities; deriving probabilities from models; estimating probabilities from data

93 Decision trees vs. normal form

94 Decision tree ingredients
Three types of nodes Choice nodes (squares) Chance nodes (circles) Terminal nodes / value nodes Arcs show how decisions and chance events can unfold over time Uncertainties are resolved as time passes and choices are made

95 Solving decision trees
“Backward induction” “Stochastic dynamic programming” “Average out and roll back” Procedure: Start at tips of tree, work backward Compute expected value at each chance node “Averaging out” Choose maximum expected value at each choice node

96 Obtaining Pr(s) from decision trees (http://www.eogogics)
Decision 1: Develop or Do Not Develop
Development Successful + Development Unsuccessful: (70% x $172,000) + (30% x (-$500,000)) = $120,400 + (-$150,000)


98 What happened to act a and state s? (http://www.eogogics)
Decision 1: Develop or Do Not Develop
Development Successful + Development Unsuccessful: (70% x $172,000) + (30% x (-$500,000)) = $120,400 + (-$150,000)


100 What happened to act a and state s? (http://www.eogogics)
What are the 3 possible acts in this tree?

101 What happened to act a and state s?
What are the 3 possible acts in this tree? (a) Don't develop; (b) Develop, then rebuild if successful; (c) Develop, then new line if successful.

102 What happened to act a and state s?
Optimize decisions! What are the 3 possible acts in this tree? (a) Don't develop; (b) Develop, then rebuild if successful; (c) Develop, then new line if successful.

103 Key points
Solving decision trees (with decisions) requires embedded optimization: make future decisions optimally, given the information available when they are made
Event trees = decision trees with no decisions. They can be solved, to find outcome probabilities, by forward Monte Carlo simulation, or by multiplication and addition
In general, sequential decision-making cannot be modeled well using event trees: one must include (optimal choice | information)

104 What happened to state s? (http://www.eogogics)
What are the 4 possible states?

105 What happened to state s?
What are the 4 possible states? C1 can succeed or not; C2 can be high or low demand.

106 Acts and states cause consequences (http://www.eogogics)

107 Key theoretical insight
A complex decision model can be viewed as a (possibly large) simple c(a, s) model:
s = selection of a branch at each chance node
a = selection of a branch at each choice node
c = outcome at the terminal node for (a, s)
Other complex decision models can also be interpreted as c(a, s) or Pr(c | a, s) models:
s = system state & information signal
a = decision rule (information → act)
c may include changes in s and in the possible a.

108 Other complex decision models
Markov decision process (MDP) Pr(next state | current state, current act) Reward(current state, current act) Partially observable MDP (POMDP) Pr(signal | current state) Dynamic Bayesian Network (DBN) Choose acts at choice nodes to maximize EU Stochastic optimal control system a = feedback control law

109 Real decision trees can quickly become “bushy messes” (Raiffa, 1968) with many duplicated sub-trees


111 Obtaining Pr(s): Influence Diagrams (http://en.wikipedia)
Often much more compact than decision trees

112 Limitations of decision trees
Combinatorial explosion. Example: Searching for a prize in one of N boxes or locations involves a tree with N! = N(N – 1)*…*2*1 possible inspection orders.
Infinite trees; continuous variables
When to stop growing a tree? How to evaluate utilities and probabilities?

113 Optimization formulations of decision problems
Example: The prize is in location j with prior probability p(j), j = 1, 2, …, N. It costs c(j) to inspect location j. What search strategy minimizes the expected cost of finding the prize?
What is a strategy? An order in which to inspect. How many are there? N!

114 With two locations, 1 and 2
Strategy 1: Inspect 1, then 2 if needed. Expected cost: c1 + (1 – p1)c2 = c1 + c2 – p1c2
Strategy 2: Inspect 2, then 1 if needed. Expected cost: c2 + (1 – p2)c1 = c1 + c2 – p2c1
Strategy 1 has lower expected cost if p1c2 > p2c1, i.e., if p1/c1 > p2/c2
So, look first at the location with the highest success probability per unit cost

115 With N locations Optimal decision rule: Always inspect next the (as-yet uninspected) location with the greatest success probability-to-cost ratio Example of an “index policy,” “Gittins index” If M players take turns, competing to find prize, each should still use this rule. A decision table or tree can be unwieldy even for such simple optimization problems
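A small R sketch of this index policy (the p and cost vectors are hypothetical, chosen only to illustrate; the function sorts by p/cost and accumulates expected cost):

search_order <- function(p, cost) order(p / cost, decreasing = TRUE)

expected_cost <- function(p, cost, ord) {
  p <- p[ord]; cost <- cost[ord]
  # Pr(prize not yet found before step k): 1, 1 - p[1], 1 - p[1] - p[2], ...
  still_searching <- c(1, 1 - cumsum(p)[-length(p)])
  sum(still_searching * cost)
}

p    <- c(0.5, 0.3, 0.2)        # hypothetical prior probabilities
cost <- c(4, 1, 1)              # hypothetical inspection costs
ord  <- search_order(p, cost)   # here: inspect 2, then 3, then 1
expected_cost(p, cost, ord)     # 3.7, versus 4.7 for the naive order 1, 2, 3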

116 Other optimization formulations
max over a in A of EU(a), where EU(a) = Σc Pr(c | a)u(c) and Pr(c | a) = Σs Pr(c | a, s)p(s)
Typically a is a vector and A is the feasible set, which may be defined by constraints g(a) ≤ 0
More generally, a is a strategy/policy/decision rule and A is the choice set of feasible strategies
In the previous example, A = the set of permutations (inspection orders)

117 Advanced decision tree analysis
Game trees Different decision-makers Monte Carlo tree search (MCTS) in games with risk and uncertainty Generating trees Apply rules to expand and evaluate nodes Learning trees from data Sequential testing

118 Summary on decision trees
Decision trees show sequences of choices, chance nodes, observations, and final consequences. They mix observations, acts, optimization, and causality.
Good for very small problems; less good for medium-sized problems; unwieldy for large problems → use IDs instead
Can view decision trees and other decision models as simple c(a, s) models
But we need good optimization solvers!

119 Skill 5: Calculating certainty equivalents (CEs)

120 Recall EU calculation with u(x) = log(x)
Define the utility function as follows:
u <- function(x){
  value <- log(x)
  return(value)
}
Application: Calculate expected utility.
R: x <- c(10, 100)
R: p <- c(0.4, 0.6)
R: EU <- sum(p*u(x))
R: EU
[1] 3.684136
Result: EU(X) = 3.68

121 Certainty equivalents
Problem for communicating results to the client/customer/employer: knowing that EU(X) = 3.684136 tells us nothing useful!

122 Certainty equivalents
Problem for communicating results to the client/customer/employer: knowing that EU(X) = 3.684136 tells us nothing useful!
Better approach: Answer the decision question, "What is X worth?" What is the least we should be willing to sell it for? The answer is the selling price or certainty equivalent (CE) of X.

123 Defining certainty equivalents (CEs) of monetary r.v.s
Challenge: What is the definition of CE(X) in terms of the utility function u() and the expected utility function EU()? Utility function u() maps outcomes (“consequences” of choice) to numbers, often scaled to run from 0 for least-preferred possible outcome to 1 for most-preferred possible outcome. Expected utility function (or functional) maps random variables X to numbers (given u()).

124 Defining certainty equivalents (CEs) of monetary r.v.s
Challenge: What is the definition of CE(X) in terms of the utility function u() and the expected utility function EU()?
Answer: u(CE(X)) = EU(X)
Interpretation: The decision maker (d.m.) is indifferent between receiving the amount CE(X) for sure and receiving the outcome of the r.v. X; both have the same EU
Mathematical implication: CE(X) = u^(-1)(EU(X)), where u^(-1)() is the inverse function of the utility function u()

125 Example: Find CE(X), given that u(x) = log(x) and EU(X) = 3.684136
Solution: u(CE(X)) = EU(X)
log(CE(X)) = 3.684136
CE(X) = u^(-1)(EU(X)) = exp(3.684136) = 39.81 (answer)
Interpretation: A prospect that pays $100 with probability 0.6, else $10, is worth $39.81 to this d.m. Its EMV is 0.6*100 + 0.4*10 = $64.

126 A solver for CEs Definition of CE(X): u(CE(X)) = EU(X)
Represent X by vectors x = (x1, x2, …, xn) and p = (p1, p2, …, pn), listing the possible values of X and their respective probabilities.
CE(X) is the value of C that solves u(C) = EU(x, p), or u(C) - EU(x, p) = 0. Here, EU(x, p) = sum(p*u(x)).
We can use a root-finding search algorithm to solve u(C) - EU(x, p) = 0. The following R script (in CAT) solves for CE(X):
R: EU <- function(x, p){EU <- sum(p*u(x)); return(EU)}
R: f <- function(C) u(C) - EU(x, p)
R: str(xmin <- uniroot(f, c(1, 100)))
(The search interval's lower end is 1 rather than 0 because log(0) is not finite.)
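Putting the pieces together for the running example (u(x) = log(x), X paying $10 or $100), a runnable sketch:

u  <- function(x) log(x)            # the d.m.'s utility function
EU <- function(x, p) sum(p * u(x))  # expected utility of (x, p)
x <- c(10, 100); p <- c(0.4, 0.6)
f <- function(C) u(C) - EU(x, p)    # root of f is the CE
uniroot(f, c(1, 100))$root          # about 39.81, the CE found earlier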

127 Theory: Risk aversion = concave (downward-bent) utility function
EMV(X) – CE(X) = Risk Premium (RP)
For a risk-averse d.m., CE(X) < EMV(X)
W = wealth, an r.v.; W1 = maximum possible and W0 = minimum possible wealth after uncertainty is resolved; U() = utility function; RP = risk premium
Risk aversion (concave U): U(pW1 + (1 - p)W0) > pU(W1) + (1 - p)U(W0)

128 How does CE change if d.m. starts with $10,000?
With this initial wealth, CE(X) is $63.90 (final-wealth CE of 10,063.90 minus the initial 10,000), very close to the EMV of $64.00. Thus, greater initial wealth reduces the risk premium (and risk aversion) for a relatively small but uncertain (r.v.) change in wealth.
# Calculation in CAT *
R: x <- 10000 + c(10, 100)
R: x
[1] 10010 10100
R: EU(x, p)
[1] 9.21671
R: CE <- exp(EU(x, p))
R: CE
[1] 10063.9
* Recall that we defined the function EU(x, p) = p1u(x1) + p2u(x2) + … + pnu(xn) in R as follows: EU <- function(x, p){ EU <- sum(p*u(x)); return(EU)}

129 Important special case
Let CE(w, X) = the CE of X if initial wealth is w.
Theorem: If CE(w, X) does not depend on w and the d.m. is risk-averse, then the utility function is u(x) = 1 - e^(-kx), where k is the coefficient of (constant) absolute risk aversion (e.g., Abbas, 2007, equation 19).
Theorem: If X is normally distributed and u(x) = 1 - e^(-kx), then CE(X) = E(X) – (k/2)Var(X) (e.g., Myerson).
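A quick simulation check of the second theorem (k and the normal parameters below are arbitrary illustrative choices, not values from the lecture):

set.seed(1)
k <- 0.01; m <- 100; s <- 30
X <- rnorm(1e6, m, s)             # a normally distributed payoff
EU_X <- mean(1 - exp(-k * X))     # expected utility under u(x) = 1 - exp(-k*x)
CE_sim <- -log(1 - EU_X) / k      # invert u() to recover the certainty equivalent
c(CE_sim, m - (k/2) * s^2)        # both approximately 95.5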

130 Wrap-up on certainty equivalents
Certainty equivalents (CEs) express the values of money random variables in natural units (dollars) that are more easily communicated and interpreted than expected utilities (EUs) CE(X) = selling price of X CEs can be calculated from same information as EUs (by solving u(CE(X)) = EU(X)) x = (x1, x2, …, xn), p = (p1, p2, …, pn), u(x) CE(W + X) and EU(W + X) typically get closer as initial wealth W becomes larger W = initial wealth, X = change in wealth

131 Is the CE well-defined for real people?
Endowment effect: CE for something jumps up when we own it.

132 Endowment effect suggests that owning something increases its utility
u(x) at the time of decision ≠ u(x) experienced after the decision is made

133 Prospect theory explains endowment effect by loss aversion
Pain of loss is roughly twice pleasure of gain. Aversion to loss explains the effect.

134 Sizes of endowment effects

135 Endowment effect applications
Holding on to underperforming stocks Clinging to opinions Status quo bias Adoption Sales

136 Conclusions on endowment effect
Shows that the assumption that people have stable, well-defined preferences for outcomes does not describe some real preferences Normative/prescriptive vs. descriptive models Real people often care about changes relative to a reference point rather than final outcomes This leads to logical inconsistencies and possibilities for manipulation that marketers, politicians, and others exploit

137 Is a unified theory of decisions possible (and desirable)?

138 What is a decision? Any one of a small number of available alternative acts or choices ("rows" in a decision table)
A choice of an act, based on information available when the choice must be made
A choice of an act at each node in a decision tree or influence diagram
A choice of values for decision variables (optimization models, open-loop control)
139 What is a decision? Part of a set or sequence of actions (a “plan”) for achieving a goal Artificial intelligence (AI) An if-then rule (“decision rule” or “policy”) mapping observations to actions Stochastic optimal control, learning, evolution Can map observations to choice probabilities Choice between probability distributions Fundamental framework for decision analysis

140 Types of decisions One-shot vs. sequential
Discrete vs. continuous choice set Explicit vs. implicit (e.g., all feasible allocations) Deterministic vs. stochastic (random) consequences Consequences: Reward, next state Known vs. uncertain probabilities Learning how to act in uncertain environments Known vs. uncertain preferences

141 Types of decisions One vs. many objectives or criteria
Objectives, attributes, or criteria for comparing and evaluating outcomes
Single-attribute vs. multi-attribute decisions
Multi-criteria decision making (MCDM)
One, two, few, or many decision makers (teams, games, coalitions, organizations)
Distributed/decentralized vs. centralized

142 Some alternative ideas about how to decide
Choose whatever seems best/feels right Trust our intuition and experience (“System 1”) Choose whatever usually works best Explore and exploit, learn and adapt, experiment, evolve Imitate others (social norms, learning) Choose best bet (“System 2”) How to define and calculate “best” bet?

143 Yet, a unified theory of decisions is possible (and useful!)

144 The basics of normative theory: Expected utility theory

145 Key normative idea: Maximize expected utility!

146 Techniques for normative decision theory
Single-person decision theory
Normal form: decision tables
Extensive form: decision trees, influence diagrams (IDs), Markov decision processes (MDPs), stochastic optimal control
We will often use the normal form; it is also crucial for understanding modern methods and advanced research topics

147 Context: One person, one decision Choose a decision rule, mapping what you see (events) to what you do (actions)

148 Pre-decision structuring

149 Questions?

150 Conditioning

151 Notation: Conditioning
Pr(A | B) = conditional probability of A, given B
A and B are events; events are subsets of a "sample space"
We will sometimes condition on acts/decisions
"|" is the sign for "conditioned on" or "given"
Example: For a fair die, what is Pr(3)? What is Pr(3 | odd)?

152 Conditioning in a data set
Record  Gender  Age  Smoker?  COPD?
1       M       31   Yes      No
2       F       41
3               59
4               26
5               53
6               58
For a randomly sampled record, what is…
Pr(smoker), Pr(COPD | smoker), Pr(smoker | COPD),
Pr(COPD | male smoker), Pr(male smoker | COPD & age > 50),
Pr(smoker & COPD)?

153 Conditioning in a data set
Record  Gender  Age  Smoker?  COPD?
1       M       31   Yes      No
2       F       41
3               59
4               26
5       53
6       58
For a randomly sampled record:
Pr(smoker) = 3/6 = 1/2; Pr(COPD | smoker) = 2/3
Pr(smoker | COPD) = 1; Pr(COPD | male smoker) = 1/2
Pr(male smoker | COPD & age > 50) = 1/2; Pr(smoker & COPD) = 1/3

154 Why does conditioning matter?
Learning (or “updating of beliefs”) takes place by conditioning, in traditional DA Value of information is determined by how much it increases the conditional expected utility of the best decision Causality: Suggests (but does not prove) how changing some choices (e.g., smoking) might change the probabilities of consequences (e.g., COPD)

155 DA strategy Describe real-world decision problem in terms of acts, states, and consequences Assess utility for each consequence, u(c) Assess probability of each consequence, for each choice, Pr(c | a) Pr(c | a) is given by a (causal) risk model Choose act with greatest expected utility, EU(a) = sum over all c of Pr(c | a)u(c)

156 How to calculate Pr(c | a)
Inputs: Causal model, c(a, s) = consequence of act a if the state is s (from the normal form table); state probabilities, Pr(s). If you have information I, use Pr(s | I).
Calculation: Pr(c | a) = sum of Pr(s) over all states s for which c(a, s) = c. This summing "marginalizes out" s.

157 Generalization to stochastic causal model
Inputs: Probabilistic causal model, Pr(c | a, s); state probabilities, Pr(s). If you have information I, use Pr(s | I).
Calculation: Pr(c | a) = sum over s of Pr(c | a, s)*Pr(s)
Similar generalizations apply if choices affect state probabilities, or states affect utilities.
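In R, this marginalizing out of s is a single matrix-vector product; here is a sketch with a hypothetical two-state, two-consequence model for one fixed act a:

P_c_as <- matrix(c(0.9, 0.3,    # Pr(c | a, s): rows = consequences,
                   0.1, 0.7),   # columns = states (hypothetical numbers)
                 2, 2, byrow = TRUE)
p_s <- c(0.6, 0.4)              # state probabilities Pr(s)
P_c_as %*% p_s                  # Pr(c | a) = 0.66 and 0.34, summing to 1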

158 Example: Decision to smoke
Pr(COPD | smoke) = 0.17
Pr(COPD | no smoke) = 0.01
u(smoke and COPD) = 0.1
u(no smoke and COPD) = 0
u(no smoke and no COPD) = 0.5
u(smoke and no COPD) = 1
Should this person smoke?

159 Example: Decision to smoke
              COPD       No COPD    EU
Smoke         0.17*0.1   0.83*1     0.17*0.1 + 0.83*1 = 0.847
Do not smoke  0.01*0     0.99*0.5   0.495
Pr(COPD | smoke) = 0.17; Pr(COPD | no smoke) = 0.01
u(smoke and COPD) = 0.1; u(no smoke and COPD) = 0; u(no smoke and no COPD) = 0.5; u(smoke and no COPD) = 1

160 Example: Decision to smoke
              COPD       No COPD    EU
Smoke         0.17*0.1   0.83*1     0.17*0.1 + 0.83*1 = 0.847
Do not smoke  0.01*0     0.99*0.5   0.495
In this model, EU theory implies that the decision maker should smoke. (The model does not capture future regrets, social approval, etc.)
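Unlike the earlier tables, here the state probabilities depend on the act, so each row uses its own probability vector; a direct R calculation with the numbers from the slide:

u_copd <- c(0.1, 0.0)     # u(act & COPD) for Smoke, Do not smoke
u_no   <- c(1.0, 0.5)     # u(act & no COPD)
p_copd <- c(0.17, 0.01)   # Pr(COPD | act)
EU <- p_copd * u_copd + (1 - p_copd) * u_no
names(EU) <- c("Smoke", "Do not smoke")
EU                        # 0.847 vs 0.495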

161 Solution via decision table (reusing previous script)
Decision               State 1: large increase   State 2: small increase   State 3: large decrease
1: Risky investment    500                       100                       -600
2: CD                  50                        50                        50
Probabilities          0.1                       0.65                      0.25

R: U <- matrix(c(500, 100, -600, 50, 50, 50), 2, 3, byrow = T)
R: colnames(U) <- c("State 1", "State 2", "State 3")
R: rownames(U) <- c("Act 1", "Act 2")
R: p <- c(0.10, 0.65, 0.25)
R: EU <- U %*% p
R: colnames(EU) <- 'EU'
R: best_act <- which(EU == max(EU))
R: best_act
[1] 2
R: table <- rbind(cbind(U, EU), p)
R: table[nrow(table), ncol(table)] <- NA
R: table
      State 1 State 2 State 3  EU
Act 1     500    100     -600 -35
Act 2      50     50       50  50
p         0.1    0.65    0.25  NA

162 Influence diagram Rectangle = choice/decision Ellipse = chance node
Hexagon = value node

163 Response surface (possibly unknown)

164 Reinforcement learning: Adaptive trial-and-error learning

165 Dynamic decisions: Markov Decision Processes (MDPs)
Elements: States Actions (for each state) State transition probabilities Rewards Policy: What to do in each state

166 Reinforcement learning for pole-balancing (deterministic control)

167 Decision models to maximize EU
Trees, influence diagrams, tables: c(a, s) or Pr(c | a, s)
Response functions: design of experiments, optimization via adaptive learning, simulation-optimization
Optimization models: max over a in A of EU(a)
Dynamic models: Markov decision processes; optimal control and learning (stochastic, robust, adaptive, low-regret)

168 Summary: Three main questions addressed in course
How should we make decisions? Normative decision theories Techniques: tables, trees, IDs, simulation How do we make decisions? Descriptive decision theories, psychology How can we make better decisions? Prescriptive decision theories Practical techniques and decision aids

169 Wrap-up on course overview
Course teaches techniques and theory to improve practical decision-making Applying these techniques requires overcoming psychological traps that impede good decisions Use data and models (learned from data) to inform decisions Q: When are better-informed decisions also better decisions? A: When the additional information increases the expected utility of the best decision compared to not having it. (The difference reflects value of information.)

170 Wrap-up on course overview
Course teaches techniques and theory to improve practical decision-making Applying these techniques requires overcoming psychological traps that impede good decisions Use data and models (learned from data) to inform decisions If trustworthy models are not yet available, learning-based techniques can improve decisions over time


Download ppt "Decision Analysis Lecture 2"

Similar presentations


Ads by Google