Published by Christian Risden. Modified over 2 years ago.

1
(c) Stephen Senn 2010
You may believe you are a Bayesian
But you are probably wrong
Stephen Senn

2
Outline
The four systems of statistical inference
–An example of where it is good to be Bayesian
–Fisher’s argument against the Neyman-Pearson approach
Examples of experts applying ‘the Bayesian’ approach
–Adrian Smith and colleagues 1987
–Lindley, 1993
–Howson and Urbach, 1989
Some theoretical reasons for hesitation
Conclusion
–Why I shall (probably) still be using mongrel statistics after this conference

3
Warning
This talk should not be taken as an attack on the subjective Bayesian approach to statistical inference
I do not claim it is a bad approach
I do claim it can be very difficult and perhaps dangerous to rely on it as the only approach

4
Four systems (Barnard)
Fisherian
Neyman-Pearson
Jeffreys
Bayesian (Ramsey-De Finetti-Savage)
George Barnard’s advice was to be familiar with all four

5
A two dimensional view of the four systems
Direct probability, inferences: Fisher (significance tests, likelihood, fiducial inference)
Direct probability, decisions: Neyman, Pearson
Inverse probability, inferences: Jeffreys (use of semi-objective prior distributions to produce inverse probabilities)
Inverse probability, decisions: De Finetti (use of subjective expectation via utility)

6
TGN1412
A monoclonal antibody
First-in-man study on 13 March 2006 carried out by Parexel on behalf of TeGenero
In first cohort 8 volunteers
Six allocated TGN1412 and two allocated placebo
All six given TGN1412 suffered a cytokine storm

7
See: Senn SJ. Lessons from TGN1412. Applied Clinical Trials 2007;16(6):18-22.

8
A Conventional Analysis
FISHER'S EXACT TEST
Statistic based on the observed 2 by 2 table (x):
  P(X) = Hypergeometric Prob. of the table = 0.0357
  FI(X) = Fisher statistic = 6.095
Asymptotic p-value (based on Chi-Square distribution with 1 df):
  Two-sided: Pr{FI(X) .GE. 6.095} = 0.0136
  One-sided: 0.5 * Two-sided = 0.0068
Exact p-value and point probabilities:
  Two-sided: Pr{FI(X) .GE. 6.095} = Pr{P(X) .LE. 0.0357} = 0.0357
             Pr{FI(X) .EQ. 6.095} = Pr{P(X) .EQ. 0.0357} = 0.0357
  One-sided: Let y be the value in Row 1 and Column 1
             y = 6   min(Y) = 4   max(Y) = 6   mean(Y) = 4.500   std(Y) = 0.5669
             Pr{Y .GE. 6} = 0.0357
             Pr{Y .EQ. 6} = 0.0357
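The exact figures in this output can be checked by hand, since with both margins fixed the only free cell follows a hypergeometric distribution. The following is a minimal sketch using only the Python standard library; the function names and the layout of the table (row 1 = TGN1412, column 1 = cytokine storm) are assumptions made for illustration and are not taken from the StatXact run.

```python
from math import comb

def hypergeom_pmf(y, row1, col1, total):
    """P(Y = y) for the top-left cell of a 2x2 table with fixed margins."""
    return comb(col1, y) * comb(total - col1, row1 - y) / comb(total, row1)

def fisher_exact(y_obs, row1, col1, total):
    """Point probability, one-sided and two-sided exact p-values."""
    lo = max(0, row1 + col1 - total)   # smallest feasible cell value
    hi = min(row1, col1)               # largest feasible cell value
    pmf = {y: hypergeom_pmf(y, row1, col1, total) for y in range(lo, hi + 1)}
    p_obs = pmf[y_obs]
    one_sided = sum(p for y, p in pmf.items() if y >= y_obs)
    # Two-sided: all tables no more probable than the observed one.
    two_sided = sum(p for p in pmf.values() if p <= p_obs * (1 + 1e-9))
    return p_obs, one_sided, two_sided

# Assumed layout: 6 of 6 on TGN1412 and 0 of 2 on placebo had a storm.
p_table, p_one, p_two = fisher_exact(y_obs=6, row1=6, col1=6, total=8)
print(round(p_table, 4), round(p_one, 4), round(p_two, 4))  # 0.0357 0.0357 0.0357
```

All three values equal 1/28 because the observed table is the single most extreme table the margins allow.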

9
A Slightly Less Conventional Analysis
Datafile: C:\Program Files\Numerical\StatXact-4.0.1\Files\Research\TGN1412.cy3
BARNARD'S UNCONDITIONAL TEST FOR DIFFERENCE OF TWO BINOMIAL PROPORTIONS
Statistic based on the observed 2 by 2 table:
  Binomial proportion for column: pi_1 = 1.000
  Binomial proportion for column: pi_2 = 0.0000
  Difference of binomial proportions: Delta = pi_2 - pi_1 = -1.000
  Standardized difference of binomial proportions: Delta/Stdev = -2.828
Results:
-------------------------------------------------------------------------
Method    P-value (1-sided)      P-value (2-sided)
-------------------------------------------------------------------------
Asymp     0.0023 (Left Tail)     0.0047
Exact     0.0111 (Left Tail)     0.0113
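The one-sided exact figure can be reproduced approximately with a small grid search over the common nuisance proportion. This is a sketch only: it assumes the standardised difference uses a pooled variance (which is consistent with the reported -2.828), it uses hypothetical function names, and it is not the StatXact algorithm.

```python
from math import comb, sqrt

def z_stat(x1, n1, x2, n2):
    """Pooled-variance standardised difference p2 - p1 (0.0 if undefined)."""
    pooled = (x1 + x2) / (n1 + n2)
    var = pooled * (1 - pooled) * (1 / n1 + 1 / n2)
    if var == 0:
        return 0.0
    return (x2 / n2 - x1 / n1) / sqrt(var)

def barnard_left_tail(x1, n1, x2, n2, grid=2001):
    """Maximise, over a grid of common proportions pi, the probability of
    tables at least as extreme (in the left tail) as the observed one."""
    z_obs = z_stat(x1, n1, x2, n2)
    extreme = [(a, b) for a in range(n1 + 1) for b in range(n2 + 1)
               if z_stat(a, n1, b, n2) <= z_obs + 1e-12]
    best = 0.0
    for i in range(grid):
        pi = i / (grid - 1)
        p = sum(comb(n1, a) * pi**a * (1 - pi)**(n1 - a)
                * comb(n2, b) * pi**b * (1 - pi)**(n2 - b)
                for a, b in extreme)
        best = max(best, p)
    return z_obs, best

# Group 1 = TGN1412 (6 of 6 with a storm), group 2 = placebo (0 of 2).
z_obs, p_left = barnard_left_tail(6, 6, 0, 2)
print(round(z_obs, 3), round(p_left, 4))  # -2.828 0.0111
```

Only the observed table is this extreme, so the tail probability is pi^6 (1 - pi)^2, which peaks at pi = 0.75.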

10
Conclusions
“If you need statistics to prove it, I don’t believe it”
Here the problem is the reverse
You can’t prove it with statistics but everybody believes it
So does this mean statistics is irrelevant?
Not if you look more closely…

11
Further information
Timing of adverse events
Increasing interest in using this feature in epidemiological studies
–Case series methodology, Farrington and Whitaker (2006)
Also if we use background knowledge of risk of cytokine storm we come to quite different conclusions
But this is to be rather Bayesian

13
Fisher on Neyman-Pearson
‘Their method only leads to definite results when mathematical postulates are introduced which could only be justified as a result of extensive experience.’
Fisher to Chester Bliss, 6 October 1938 (published in Bennett, 1990)
What Fisher is pointing to here is that although a null hypothesis may be more primitive than a test statistic, the same is not true of an alternative hypothesis. Thus the alternative hypothesis cannot be made the justification for choosing the test statistic.

14
Three Examples Provided by Expert Bayesians
Two involve choice of prior distribution followed by formal Bayesian updating
–Racine-Poon, Grieve, Fluehler and Smith 1987
–Lindley 1993
One involves an intuitive assertion of the posterior result, which is claimed to be Bayesian
–Howson and Urbach

15
Racine et al
This is a fine paper with many examples of how the Bayesian approach can be applied in drug development
I shall just look at one of these
The analysis of the Martin and Browning (1985) data on metoprolol
–Actually, this paper is not cited by Racine et al but this is the relevant citation

16
Design
Run in: 4 weeks
Randomisation to sequences of 100 mg once daily and 200 mg once daily
Period 1: 6 weeks
Period 2: 6 weeks
31 patients aged 65+ with diastolic blood pressure in excess of 100 mmHg randomised to these sequences.
DBP measured after 6 weeks and 4-8 hours after last dose.

17
Carry-over Problem
The period 2 values could still be influenced by the period 1 treatment
Hence a comparison of period 1 and period 2 results would provide a biased measurement of the effect of treatment
However, if we knew what the magnitude of the carry-over was we could take account of it
Hence carry-over is a nuisance parameter and a prime candidate for the Bayesian approach

18
Unfortunately
None of the authors noted that the carry-over effect has to last for six weeks
–Nor did any of the discussants, whether Bayesian or frequentist
However the treatment effect only has to last for 4-8 hours
The ratio of one to the other is at least 126
You cannot use an uninformative prior for carry-over and be coherent
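The figure of 126 appears to follow from comparing six weeks with the upper end of the 4-8 hour window; this reading is an assumption, but the arithmetic works out as follows:

```latex
\frac{6\ \text{weeks}}{8\ \text{hours}}
  = \frac{6 \times 7 \times 24\ \text{hours}}{8\ \text{hours}}
  = \frac{1008}{8}
  = 126
```

Taking the lower end of the window (4 hours) instead would give 252, which is why 126 is a lower bound.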

19
Anyone who is not shocked by quantum theory has not understood a single word.
Niels Bohr
Anyone who is not shocked by the Bayesian theory of statistical inference has not understood a single word.
Stephen Senn

20
A Bayesian Lady Tasting Wine
Paper by Lindley: Lindley, D. The Analysis of Experimental Data, Teaching Statistics, 15, 22-25 (1993)
“The lady is a wine expert, testified by her being a Master (sic) of Wine, MW. She was given 6 pairs of glasses (not cups). One member of each pair contained some French claret. The other had a Californian Cabernet Sauvignon Merlot Blend.”
See also Lindley, D. A Bayesian lady tasting tea. In Statistics an Appreciation, David and David (ed), Iowa State University Press (1984).

21
Lindley’s Prior for Wine Tasting

22
‘At this point I can only speak for myself though I hope many will agree with me. You may freely disagree and still be sensible.’ Lindley
I do disagree
Either the Lady knows something about wine or she hasn’t a clue.
If she has, I think that she can repeat the trick of correct identification with high probability.
If she is a charlatan, there is a small probability that she may have a fine palate

23
The Difference between Mathematical and Applied Statistics
Mathematical statistics is full of lemmas whereas applied statistics is full of dilemmas.

24
Senn’s Prior for Wine Tasting

25
Place Your Bets
Imagine the lady has to distinguish between 20 pairs of glasses.
You are given £100,000 to place at evens either for or against the following:
The lady will choose correctly in 12, 13, 14, 15 or 16 pairs.
How do you choose?
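How one bets depends on the prior over the lady's probability of a correct identification. The sketch below computes the predictive probability of 12 to 16 correct out of 20 for a discrete prior. Both priors in it are hypothetical stand-ins chosen for illustration (one smooth over the range 0.5 to 1, one split into "hasn't a clue" and "real expert" camps); they are not the priors shown on the Lindley and Senn slides, which did not survive in this transcript.

```python
from math import comb

def prob_in_range(prior, n=20, lo=12, hi=16):
    """P(lo <= number correct <= hi) for a discrete prior [(theta, weight), ...]."""
    total_w = sum(w for _, w in prior)
    return sum(
        (w / total_w)
        * sum(comb(n, k) * t**k * (1 - t)**(n - k) for k in range(lo, hi + 1))
        for t, w in prior
    )

# Hypothetical smooth prior: 50 equally weighted points spread over (0.5, 1).
smooth = [(0.5 + 0.5 * (i + 0.5) / 50, 1.0) for i in range(50)]

# Hypothetical two-camp prior: pure guessing (0.5) or a reliable expert (0.95).
two_camp = [(0.5, 0.5), (0.95, 0.5)]

p_smooth = prob_in_range(smooth)
p_two_camp = prob_in_range(two_camp)
print(p_smooth, p_two_camp)
```

Under these assumed priors the two-camp view puts noticeably less probability on the middling range 12 to 16 than the smooth view does, which is the kind of difference the bet is designed to expose.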

26
An Example of Howson and Urbach’s
Consider example of die rolled 600 times
Results are
–100 each of sixes, fives, fours and threes
–123 twos
–77 ones
Pearson-Fisher chi-square statistic is 10.58
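The statistic is easy to verify: only the ones and twos depart from the expected count of 100, each by 23. A minimal sketch follows; the variable names are illustrative, and the comparison value 11.07 is the standard 5% critical value of chi-square on 5 degrees of freedom, which is not quoted on the slide.

```python
# Observed counts for faces 1..6 from the slide.
counts = {1: 77, 2: 123, 3: 100, 4: 100, 5: 100, 6: 100}

expected = sum(counts.values()) / len(counts)  # 600 / 6 = 100 per face
chi_sq = sum((obs - expected) ** 2 / expected for obs in counts.values())

# Standard 5% critical value for chi-square with 5 degrees of freedom.
CRITICAL_5PCT_5DF = 11.07

print(round(chi_sq, 2))            # 10.58
print(chi_sq < CRITICAL_5PCT_5DF)  # True: not significant at the 5% level
```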

27
Howson and Urbach’s Conclusion
“...one is, therefore, under no obligation to reject the null hypothesis, even though that hypothesis has pretty clearly got it badly wrong, in particular, in regard to the outcomes two and one” (p136, my italics).
From the second edition

29
An Analysis Using Good’s Approach
Lump of probability on fair die
Symmetric Dirichlet prior over alternative
Do not commit yourself to a particular value of k (the Dirichlet parameter)
Instead plot Bayes factor as function of k
This is a sort of Type II likelihood
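A rough sketch of this kind of calculation is given below. It assumes the Bayes factor is taken as alternative over null (the slide does not say which way round it was plotted), uses the die counts from the Howson and Urbach example, and relies on the closed-form Dirichlet-multinomial marginal likelihood; the multinomial coefficient cancels in the ratio and is omitted. Function and variable names are illustrative.

```python
from math import exp, lgamma, log

# Counts for faces 1..6 from the Howson and Urbach example.
COUNTS = [77, 123, 100, 100, 100, 100]

def log_bayes_factor(k, counts=COUNTS):
    """log[ P(data | symmetric Dirichlet(k) alternative) / P(data | fair die) ]."""
    n = sum(counts)
    m = len(counts)
    # Dirichlet-multinomial marginal likelihood, without the multinomial coefficient.
    log_alt = (lgamma(m * k) - lgamma(m * k + n)
               + sum(lgamma(k + c) - lgamma(k) for c in counts))
    # Fair die: every face has probability 1/m.
    log_null = n * log(1 / m)
    return log_alt - log_null

# Tabulate rather than plot, to stay within the standard library.
for k in (0.1, 1, 10, 100, 1000):
    print(k, exp(log_bayes_factor(k)))
```

Very large k concentrates the alternative on the fair die, so the Bayes factor tends to 1 there; small k spreads the alternative thinly and the fair die is favoured.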

30
Bayes factor as function of prior
[Figure: Bayes factor plotted against the parameter of the symmetric Dirichlet]

31
Conclusion
If you had witnessed the die being rolled you would not necessarily conclude it was unfair
If you were asked to decide whether these were results from a real die or one some philosophers had written down in a book you might decide on the latter
This is because the Dirichlet distribution could not model your prior distribution
It is somewhat unfair of H&U to claim that the frequentist approach has pretty clearly got it badly wrong
I think that they would have great difficulty honestly specifying a prior distribution that allowed them to ‘get it right’ for this example and not look foolish for others

32
Am I being unfair?
Yes, if my aim is to claim that Bayesian methods are particularly bad
–They are not
–We all make errors in our search for errors and I am no exception
No, if my aim is to counter the claim that Bayesian statistics is uniquely good
In particular if the argument is that the only requirement for inference is coherence

33
Perfection and Goodness
The De Finetti theory is a theory of how to remain perfect
You have a prior probability for all possible sequences of events
As events unfold you strike out the sequences that did not occur and renormalise
This is not, however, a theory of how to become good
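The "strike out and renormalise" step can be made concrete with a toy example. The sketch below assumes a hypothetical exchangeable prior over all binary sequences of length 3 (the one implied by a uniform prior on the success proportion); neither the prior nor the function names come from the talk.

```python
from itertools import product
from math import comb

def exchangeable_prior(n):
    """Hypothetical prior: P(seq) = 1 / ((n + 1) * C(n, s)), s = number of ones."""
    return {seq: 1 / ((n + 1) * comb(n, sum(seq)))
            for seq in product((0, 1), repeat=n)}

def strike_and_renormalise(prior, observed):
    """Drop sequences inconsistent with the observed prefix, then rescale."""
    observed = tuple(observed)
    kept = {seq: p for seq, p in prior.items()
            if seq[:len(observed)] == observed}
    total = sum(kept.values())  # assumes the observed prefix had prior mass
    return {seq: p / total for seq, p in kept.items()}

prior = exchangeable_prior(3)
posterior = strike_and_renormalise(prior, [1])  # first event was a success

# Probability that the second event is also a success.
p_next_is_one = sum(p for seq, p in posterior.items() if seq[1] == 1)
print(p_next_is_one)
```

Nothing is learned beyond what the prior over sequences already encoded, which is the sense in which the theory describes remaining coherent rather than becoming so.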

34
If You are not Already a Bayesian
You have a collection of priors which do not form a coherent set.
You can only become Bayesian by trashing some of the priors until those that are left are coherent.
But if this is a legitimate thing to do, it seems to me that it must remain a legitimate thing to do in the future.
This is then a license to continue not being Bayesian.
This then means that the Dutch book argument loses much of its force.

35
The Date of Information Problem
Statistician: Here is the result of the analysis of the trial you asked me to look at. I have added the likelihood to your prior. This is the posterior distribution.
Physician: Excellent! Now could you please take the results of the previous trials and do a meta-analysis?
Statistician (after a pause): There is no need. The result I gave you is the meta-analysis. The previous trials are in your prior.
Physician (after a pause): If the previous trials are in my prior, they got into my prior without your help at all. Why did I need you to help with producing the posterior?

36
The Bayesian Meta-Analyst’s Dilemma
In general $P_{n-1} + D_n \to P_n$
Step 1: $P_0 + D_1 \to P_1$
Step 2: $P_1 + D_2 \to P_2$
Or equivalently, $P_0 + D_1 + D_2 \to P_2$
But suppose $P_0$ already includes $D_1$; then this analysis would be illegitimate (like analysing 50 values using a chi-square on the percentages).

37
The Dilemma Continued
So use step 2 only.
But suppose that $P_1$ does not include $D_1$. This would be equivalent to analysing a contingency table of 200 observations using a chi-square on the percentages.
Then the principle of total information has been violated.
(Note, however, that according to David Miller the principle of total information seems to be an independent principle which cannot be derived from maximising expected posterior utility except by imposing very artificial additional conditions.)
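Both horns of the dilemma can be illustrated with conjugate Beta-binomial updating, where a prior Beta(a, b) plus data (successes, failures) gives Beta(a + successes, b + failures). The numbers and names below are hypothetical and serve only to show the bookkeeping: counting the first data set twice gives a posterior that is too tight, and leaving it out gives one that is too loose.

```python
def update(prior, data):
    """Beta(a, b) prior plus (successes, failures) gives the Beta posterior."""
    a, b = prior
    s, f = data
    return (a + s, b + f)

def beta_variance(dist):
    a, b = dist
    return a * b / ((a + b) ** 2 * (a + b + 1))

# Hypothetical starting prior and two hypothetical data sets.
START = (1, 1)
D1 = (12, 8)
D2 = (15, 5)

# Sequential updating agrees with using all the data at once.
correct = update(update(START, D1), D2)
batch = update(START, (D1[0] + D2[0], D1[1] + D2[1]))

# Horn 1: the supplied prior already contains D1, and D1 is added again.
p0_with_d1 = update(START, D1)
double_counted = update(update(p0_with_d1, D1), D2)

# Horn 2: only step 2 is used, but the supplied prior never contained D1.
d1_omitted = update(START, D2)

print(correct, batch)
print(beta_variance(double_counted), beta_variance(correct), beta_variance(d1_omitted))
```

The calculation itself cannot tell which of the three situations applies; that depends on knowing what information the prior already reflects.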

38
The Two Faces of (subjective) Bayes
Theory
–Elegant development based on coherence
–Claim that it is the only way to behave
–Claim to integrate all sources of information
–Requires (in my view) perfect temporal coherence
Practice
–A rag-bag of computational tools
–Use of Bayes theorem but not therefore coherent
–Often surprisingly poor treatment of prior distributions
–Back to the drawing board allowed

39
My conclusion
It is highly doubtful that the strong claims for Bayesian theory are a justification for Bayesian practice
This does not mean that Bayesian statistics as practised is not useful
–The applied statistician needs a method that is useful in practice and not just in theory
I remain sceptical of its claims to be the only useful statistical approach, not least because admitting this to be true would still leave you sorely puzzled as to what to do in practice

40
The Robot Turtle in the Corner
When the robot gets stuck the scientist gets up and gives it a kick
The robot does not know it is stuck
To avoid being stuck in the inferential corner it is useful for us to have different ways of making inferences
Where they disagree there is a warning that it is time to do some creative thinking

41
Where this leaves me
Bayesian approach is excellent when you have to make decisions
If you are going to use frequentist approaches to decision making you may need to use stopping rules
However, stopping rule adjustments are not a good way to summarise evidence
And the same is true of Bayesian analyses
I like randomisation but don’t make a fetish of it
I like the likelihood principle but don’t make a fetish of it
No (current) single approach to statistical inference seems to fit my needs as a jobbing statistician
I like (to the limits of my lesser ability) following George Barnard’s advice of being prepared to consider four

42
Finally
Frequentists think that it is the thought that counts whereas Bayesians count the thoughts.
