
1 MSc Methods part II: Bayesian analysis
Dr. Mathias (Mat) Disney, UCL Geography
Office: 113, Pearson Building
Tel: 7670 0592
Email: mdisney@ucl.geog.ac.uk
www.geog.ucl.ac.uk/~mdisney

2 Lecture outline
Intro to Bayes’ Theorem
– Science and scientific thinking
– Probability & Bayes’ Theorem – why is it important?
– Frequentist v Bayesian
– Background, rationale
– Methods: MCMC …
– Advantages / disadvantages
Applications:
– Parameter estimation, uncertainty
– Practical – basic Bayesian estimation

3 Reading and browsing
Gauch, H. (2002) Scientific Method in Practice, CUP.
Sivia, D. S. with Skilling, J. (2008) Data Analysis: A Bayesian Tutorial, 2nd ed., OUP, Oxford.
Monteith and Unsworth, Computational Numerical Methods in C (XXXX)
Flake, W. G. (2000) The Computational Beauty of Nature, MIT Press.
Gershenfeld, N. (2002) The Nature of Mathematical Modelling, CUP.
Mathematical texts
– Blah
Kalman filters
– Welch and Bishop
– Maybeck
Papers

4 So how do we do science?
Carry out experiments? Collect observations? Test hypotheses (models)? Generate “understanding”? Objective knowledge?? Induction? Deduction?

5 Induction and deduction
Deduction
– Inference, by reasoning, from the general to the particular
– E.g. premises: i) every mammal has a heart; ii) every horse is a mammal
– Conclusion: every horse has a heart
– The argument is valid if the truth of the premises guarantees the truth of the conclusion, and invalid otherwise
– The conclusion is either true or false

6 Induction and deduction
Induction
– The process of inferring general principles from observation of particular cases
– E.g. premise: every horse that has ever been observed has a heart
– Conclusion: every horse has a heart
– The conclusion goes beyond the information present, even implicitly, in the premises
– Conclusions have a degree of strength (weak -> near certain)


8 Aside: sound argument v fallacy
If plants lack nitrogen, they become yellowish:
– The plants are yellowish, therefore they lack N
– The plants do not lack N, so they do not become yellowish
– The plants lack N, so they become yellowish
– The plants are not yellowish, so they do not lack N
Affirming the antecedent: p → q; p, ∴ q ✓
Denying the consequent: p → q; ~q, ∴ ~p ✓
Affirming the consequent: p → q; q, ∴ p ✗
Denying the antecedent: p → q; ~p, ∴ ~q ✗
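A quick way to check which of these four forms are valid is to brute-force the truth table for p → q. A minimal sketch (not from the slides; purely illustrative):

    import itertools

    def implies(p, q):
        return (not p) or q  # material implication: p -> q

    # Each argument form: (premises, conclusion) as functions of (p, q)
    forms = {
        "affirming the antecedent (p, therefore q)": (lambda p, q: implies(p, q) and p,     lambda p, q: q),
        "denying the consequent (~q, therefore ~p)": (lambda p, q: implies(p, q) and not q, lambda p, q: not p),
        "affirming the consequent (q, therefore p)": (lambda p, q: implies(p, q) and q,     lambda p, q: p),
        "denying the antecedent (~p, therefore ~q)": (lambda p, q: implies(p, q) and not p, lambda p, q: not q),
    }

    for name, (premises, conclusion) in forms.items():
        # Valid iff the conclusion holds in every case where all premises hold
        valid = all(conclusion(p, q)
                    for p, q in itertools.product([True, False], repeat=2)
                    if premises(p, q))
        print(f"{name}: {'valid' if valid else 'fallacy'}")

The first two forms print ‘valid’ and the last two ‘fallacy’, matching the ticks and crosses above.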

9 Aside: sound argument v fallacy
Fallacies can be hard to spot in longer, more detailed arguments:
– Fallacies of composition; ambiguity; false dilemmas; circular reasoning; genetic fallacies (ad hominem)
Gauch (2003) notes:
– For an argument to be accepted by any audience as proof, the audience MUST accept the premises and the validity
– That is: part of the responsibility for rational dialogue falls to the audience
– If the audience’s data are lacking and/or its logic is weak, then a valid argument may be incorrectly rejected (or vice versa)

10 Gauch (2006): “Seven pillars of Science”
1. Realism: the physical world is real
2. Presuppositions: the world is orderly and comprehensible
3. Evidence: science demands evidence
4. Logic: science uses standard, settled logic to connect evidence and assumptions with conclusions
5. Limits: many matters cannot usefully be examined by science
6. Universality: science is public and inclusive
7. Worldview: science must contribute to a meaningful worldview

11 What’s this got to do with methods?
The fundamental laws of probability can be derived from statements of logic, BUT there are different ways to apply them
Two key ways:
– Frequentist
– Bayesian – after Rev. Thomas Bayes (1702-1761)

12 Bayes: see Gauch (2003) ch 5
Informally, the Bayesian Q is:
– “What is the probability (P) that a hypothesis (H) is true, given the data and any prior knowledge?”
– Weighs hypotheses (different models) in the light of data
The frequentist Q is:
– “How reliable is an inference procedure, by virtue of not rejecting a true hypothesis or accepting a false hypothesis?”
– Weighs procedures (different sets of data) in the light of hypotheses

13 Bayes: see Gauch (2003) ch 5
Prior knowledge?
– What is known beyond the particular experiment at hand, which may be substantial or negligible
We all have priors: assumptions, experience, other pieces of evidence
The Bayesian approach explicitly requires you to assign a probability to your prior (somehow)
Bayesian view – probability as degree of belief, rather than as a frequency of occurrence (in the long run…)

14 Bayes’ Theorem
The “chief rule involved in the process of learning from experience” (Jefferys, 1983)
Formally:
P(H|D) = P(D|H) × P(H) / P(D)
– P(H|D) = Posterior, i.e. the probability of hypothesis (model) H being true, given data D
– P(D|H) = Likelihood, i.e. the probability of data D being observed if H is true
– P(H) = Prior, i.e. the probability of the hypothesis being true before measurement of D
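As a concrete illustration of the theorem for a discrete set of hypotheses (a minimal sketch; the coin example and all numbers are invented, not from the lecture):

    def posterior(priors, likelihoods):
        """Bayes' Theorem over a discrete, exhaustive set of hypotheses.

        priors[h]      = P(H=h), before seeing the data
        likelihoods[h] = P(D|H=h), probability of the observed data under h
        Returns P(H=h|D) for every h.
        """
        # Evidence P(D): total probability of the data over all hypotheses
        evidence = sum(priors[h] * likelihoods[h] for h in priors)
        return {h: priors[h] * likelihoods[h] / evidence for h in priors}

    # Hypothetical example: a coin is either fair or double-headed,
    # and we observe three heads in a row
    priors = {"fair": 0.99, "double-headed": 0.01}
    likelihoods = {"fair": 0.5 ** 3, "double-headed": 1.0}
    print(posterior(priors, likelihoods))
    # {'fair': 0.925..., 'double-headed': 0.074...}

Three heads shift some belief towards the trick coin, but the strong prior still favours ‘fair’: exactly the weighing of hypotheses in the light of data described on slide 12.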

15 Bayes’ Theorem
Importance? P(H|D) appears on the left-hand side of BT
It solves the inverse (inductive) problem – the probability of a hypothesis given some data
This is how we do science in practice!
We do not have access to infinite repetitions of experiments (the ‘long-run frequency’ view)

16 Bayes’ Theorem
I is ‘background information’, as there is ‘no such thing as absolute probability’ (see S & S p 5)
P(rain today) will depend on clouds this morning, whether we saw the forecast, etc. – I is usually left out, but …
With I made explicit: P(H|D,I) ∝ P(D|H,I) × P(H|I)
Power of Bayes’ Theorem
– It relates the quantity of interest, i.e. the P of H being true given D, to the one we might estimate in practice, i.e. the P of observing D given that H is correct

17 Bayes’ Theorem
To go from ∝ to = we need to divide by P(D|I):
P(H|D,I) = P(D|H,I) × P(H|I) / P(D|I)
where P(D|I) is known as the Evidence
– A normalisation constant, which can be left out for parameter estimation as it is independent of H
– But it is required e.g. in model selection, where the amount of data may be critical

18 Bayes’ Theorem & marginalisation
The Evidence P(D|I) is obtained by marginalising over an exhaustive, mutually exclusive set of hypotheses:
P(D|I) = Σi P(D|Hi,I) × P(Hi|I)
As above, it can be ignored for parameter estimation, but it is required for e.g. model selection
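To see why the Evidence can be dropped for parameter estimation (an illustrative sketch; the Gaussian model, flat prior and numbers are all invented for the example), note that P(D|I) rescales every hypothesis by the same factor, so it cannot change which parameter value the posterior favours:

    import math

    # Hypothetical: estimate a mean mu from one observation y = 2.0,
    # flat prior, Gaussian likelihood with sigma = 1
    y, sigma = 2.0, 1.0
    grid = [i / 10 for i in range(41)]  # candidate mu values, 0.0 .. 4.0

    # Unnormalised posterior: likelihood x prior (prior is constant here)
    unnorm = [math.exp(-0.5 * ((y - mu) / sigma) ** 2) for mu in grid]

    # Dividing by the evidence (here, a sum over the grid) rescales everything...
    evidence = sum(unnorm)
    norm = [u / evidence for u in unnorm]

    # ...but the most probable mu is the same either way
    print(grid[unnorm.index(max(unnorm))])  # 2.0
    print(grid[norm.index(max(norm))])      # 2.0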

19 Bayes’ Theorem
For two mutually exclusive hypotheses H1, H2, i.e. P(H2|D) = 1 – P(H1|D), we can express Bayes’ Theorem in ratio or ‘odds’ form:
P(H1|D) / P(H2|D) = [P(D|H1) / P(D|H2)] × [P(H1) / P(H2)]
i.e. posterior odds = likelihood odds × prior odds
E.g. if the prior odds P(H1)/P(H2) are 3:1 and the new data give likelihood odds P(D|H1)/P(D|H2) of 1:27, then the posterior odds = 1:9, i.e. H2 is now favoured over H1
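Checking that arithmetic (the 1:27 likelihood odds are back-calculated from the slide’s 3:1 prior odds and 1:9 posterior odds):

    from fractions import Fraction

    prior_odds = Fraction(3, 1)        # P(H1)/P(H2) = 3:1
    likelihood_odds = Fraction(1, 27)  # P(D|H1)/P(D|H2) = 1:27
    posterior_odds = likelihood_odds * prior_odds
    print(posterior_odds)  # 1/9, i.e. 1:9 -- H2 now favoured over H1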

20 Bayes: examples, implications
Ignored priors & rare diseases
– A disease affects 1 in 100,000 people, at random
– If you have it, the test will correctly say so with P = 0.95
– The test gives an incorrect positive diagnosis (a false positive) with P = 0.005
– If the test is positive, what is the P that the diagnosis is correct?

21 Bayes: examples, implications
Use Bayes’ Theorem, two-hypothesis case: H1 = you have the disease, H2 = you do not
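Plugging the slide-20 numbers into the two-hypothesis form (a sketch of the working the slide points at):

    # Two hypotheses: H1 = "has the disease", H2 = "does not"
    p_h1 = 1 / 100_000          # prior: disease prevalence
    p_h2 = 1 - p_h1
    p_pos_given_h1 = 0.95       # true positive rate
    p_pos_given_h2 = 0.005      # false positive rate

    # Evidence: total probability of a positive test
    p_pos = p_pos_given_h1 * p_h1 + p_pos_given_h2 * p_h2

    # Posterior: P(disease | positive test)
    p_h1_given_pos = p_pos_given_h1 * p_h1 / p_pos
    print(f"P(disease | positive) = {p_h1_given_pos:.4f}")  # ~0.0019

Despite the 95% hit rate, a positive result implies less than a 0.2% chance of actually having the disease: the tiny prior dominates, which is exactly the ‘ignored priors’ warning in the slide title.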

