Estimation, Variation and Uncertainty Simon French


Aims of Session
– gain a greater understanding of the estimation of parameters and variables
– gain an appreciation of point estimation
– gain an appreciation of how to assess the uncertainty and confidence levels in estimates

Cynefin and statistics
[Diagram: repeatable events – estimation and confirmatory analysis; unique events – exploratory analyses; events?]

Frequentist Statistics Key point: probability represents a long-run frequency of occurrence.

Frequentist Statistics
– The scientific method is based upon the repeatability of experiments
– Parameters in a (scientific) model or theory are fixed, so we cannot talk of the probability of an objective quantity or parameter value
– Data come from repeatable experiments, so we can talk of the probability of a data value

Measurement and Variation of Objective Quantities
– Ideally we would simply perform an experiment and measure the quantities that interest us
– But variation and experimental error mean that we cannot simply do this
– So we need to make multiple measurements, learn about the variation and estimate the quantity of interest

Estimation
Try to find a function of the data that is tightly distributed about the quantity of interest.
[Figure: the distribution of individual data points about the quantity of interest θ, compared with the much tighter distribution of the data mean about θ.]
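A minimal simulation of this idea (the normal errors, true value θ = 10 and sample size 25 are illustrative assumptions, not from the slides): the mean of 25 measurements is far more tightly distributed about θ than any single measurement.

```python
import numpy as np

rng = np.random.default_rng(0)
theta, sigma = 10.0, 2.0        # assumed true value and experimental error

# 10,000 repetitions of an experiment with 25 measurements each.
samples = rng.normal(theta, sigma, size=(10_000, 25))
means = samples.mean(axis=1)

print(f"spread of single measurements: {samples[:, 0].std():.3f}")  # ~ sigma = 2
print(f"spread of the mean of 25:      {means.std():.3f}")          # ~ sigma/5 = 0.4
```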

Confidence intervals
Intervals defined from the data. For 95% confidence intervals: calculate the interval for each of 100 data sets; about 95 of them will contain θ.
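The coverage claim can be checked directly by simulation. This sketch assumes normal data and uses the usual t-based interval; the parameter values are illustrative.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
theta, sigma, n = 10.0, 2.0, 25    # illustrative values

covered = 0
for _ in range(100):               # 100 data sets, as on the slide
    x = rng.normal(theta, sigma, n)
    half = stats.t.ppf(0.975, n - 1) * x.std(ddof=1) / np.sqrt(n)
    if x.mean() - half <= theta <= x.mean() + half:
        covered += 1

print(covered, "of 100 intervals contain theta")   # typically about 95
```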

Uncertainty
– But there is more uncertainty in what we do than just variation and experimental error
– We do our calculations in a statistical model, but the model is not the real world
– So there is modelling error – which covers a multitude of sins!

Uncertainty So a 95% confidence interval may represent a much greater uncertainty! Studies have shown that the uncertainty bounds given by scientists (and others!) are often overconfident by a factor of 10.

Estimation of model parameters
Sometimes the quantities that we wish to estimate do not exist! Parameters may only have existence within a model:
– transfer coefficients
– release height in atmospheric dispersion
– risk aversion

Why do we want estimates? [Remember our exhortations that you should be clear on your research objectives or questions.]
– To measure 'something out there'
– To find the parameter to use for some purpose in a model: evaluation of systems; prediction of some effect
– We may use estimates of parameters and their uncertainty to predict how a complex system may evolve, e.g. through Monte Carlo methods (see the sketch below).
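A sketch of Monte Carlo propagation with a hypothetical two-parameter toy model (not one from the session): draws representing the parameter estimates and their uncertainty are pushed through the model to give an uncertainty band on a prediction.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical toy model: effect = a * exp(-b * t). The normal distributions
# below stand in for the estimated parameters and their uncertainty.
a = rng.normal(5.0, 0.5, 10_000)
b = rng.normal(0.3, 0.05, 10_000)

effect = a * np.exp(-b * 5.0)      # push each parameter draw through the model
lo, hi = np.percentile(effect, [2.5, 97.5])
print(f"predicted effect at t = 5: 95% band ({lo:.2f}, {hi:.2f})")
```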

Independence
Many estimation methods assume that each error is probabilistically independent of the other errors… and often they are far from independent.
– 1700 ≈ 2 'independent' samples
– IPCC work on climate change
Dependence in the data changes – increases! – the uncertainty in the estimates.
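A sketch of how dependence inflates uncertainty, assuming equicorrelated errors (all numbers are illustrative): treating 100 correlated samples as independent badly understates the standard error; with pairwise correlation 0.5 the effective sample size, n / (1 + (n−1)ρ), is about 2.

```python
import numpy as np

rng = np.random.default_rng(3)
n, rho, sigma = 100, 0.5, 1.0    # illustrative values

# Every pair of the n 'independent' samples actually shares correlation rho.
cov = sigma**2 * (rho * np.ones((n, n)) + (1 - rho) * np.eye(n))
x = rng.multivariate_normal(np.zeros(n), cov, size=5_000)

print(f"sd of the mean if truly independent: {sigma / np.sqrt(n):.3f}")
print(f"sd of the mean actually observed:    {x.mean(axis=1).std():.3f}")
print(f"effective sample size: {n / (1 + (n - 1) * rho):.1f}")   # ~ 2, not 100
```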

Bayesian Statistics

Rev. Thomas Bayes, 1701?–1761. Main work published posthumously: T. Bayes (1763) An essay towards solving a problem in the doctrine of chances. Phil. Trans. Roy. Soc. 53, 370–418. Bayes' Theorem – inverse probability.

Bayes theorem
Posterior probability ∝ likelihood × prior probability
p(θ | x) ∝ p(x | θ) × p(θ)

Bayes theorem
There is a normalising constant, but it is 'easy' to find, as the probability adds (integrates) to one.
Posterior probability ∝ likelihood × prior probability
p(θ | x) ∝ p(x | θ) × p(θ)

Bayes theorem
Prior: the probability distribution of the parameters, p(θ).
Posterior probability ∝ likelihood × prior probability
p(θ | x) ∝ p(x | θ) × p(θ)

Bayes theorem
Likelihood: the probability of the data given the parameters, p(x | θ).
Posterior probability ∝ likelihood × prior probability
p(θ | x) ∝ p(x | θ) × p(θ)

Bayes theorem
Posterior: the probability distribution of the parameters given the data, p(θ | x).
Posterior probability ∝ likelihood × prior probability
p(θ | x) ∝ p(x | θ) × p(θ)
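A grid-approximation sketch of the theorem for a normal-mean problem (the data, prior and error scale are illustrative assumptions): multiply prior by likelihood pointwise, then normalise so the posterior integrates to one.

```python
import numpy as np
from scipy import stats

theta = np.linspace(-5, 15, 2001)                 # grid of parameter values
prior = stats.norm.pdf(theta, loc=0, scale=5)     # p(theta), assumed
x = np.array([9.2, 10.1, 8.7, 11.3])              # illustrative data
like = np.prod(stats.norm.pdf(x[:, None], loc=theta, scale=2.0), axis=0)

post = prior * like                               # p(theta | x), unnormalised
post /= post.sum() * (theta[1] - theta[0])        # 'easy' constant: integrate to one
print(f"posterior mean: {np.sum(theta * post) * (theta[1] - theta[0]):.3f}")
```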

On the treatment of negative intensity measurements Simon French

Crystallography data
Roughly, X-rays shone at a crystal diffract into many rays radiating out in a fixed pattern from the crystal. The intensities of these diffracted rays are related to the moduli of the coefficients in the Fourier expansion of the electron density of the molecule. So getting hold of the intensities gives structural information.

Intensity measurement
– Measure the X-ray intensity in a diffracted ray and subtract the background 'near to it': measured intensity I = ray strength − background
– But in protein crystallography most intensities are small relative to background, so some are 'measured' as negative
– And theory says they are non-negative…
– Approaches in the early 1970s simply set negative measurements to zero… and got biased data sets

A Bayesian approach
– Good reason to think the likelihood for intensity measurements is near normal: difference of Poissons ('counting statistics'); further 'corrections'
– Theory gives the prior: 'Wilson's statistics' (A. J. C. Wilson, 1949)
– Estimate with the posterior mean
[Figure: normal likelihood combined with the Wilson's-statistics prior.]

Simon French and Keith Wilson (1978) On the treatment of negative intensity measurements. Acta Crystallographica A34, 517–525.
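A schematic sketch of the idea, not the published French–Wilson procedure: a normal likelihood centred on the (possibly negative) measurement, combined with a prior supported on non-negative intensities. Here an exponential prior stands in for Wilson's statistics (the acentric case); all numbers are illustrative.

```python
import numpy as np

def posterior_mean_intensity(i_meas, sd, sigma_wilson, npts=4001):
    """Posterior mean of a true intensity J >= 0, given a possibly negative
    measurement i_meas with near-normal error sd and an exponential
    (Wilson, acentric) prior with mean sigma_wilson."""
    j = np.linspace(0.0, max(i_meas + 8 * sd, 8 * sigma_wilson), npts)
    prior = np.exp(-j / sigma_wilson)                  # Wilson's statistics
    like = np.exp(-0.5 * ((i_meas - j) / sd) ** 2)     # near-normal likelihood
    post = prior * like
    return float((j * post).sum() / post.sum())

# A negative measurement yields a small positive estimate rather than zero,
# avoiding the bias of the early-1970s truncation approach:
print(posterior_mean_intensity(i_meas=-3.0, sd=5.0, sigma_wilson=20.0))
```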

Bayes theorem
Toss a biased coin 12 times; obtain 9 heads.
[Figure: prior and posterior distributions for the probability of heads.]

Bayesian Estimation
Toss a biased coin 12 times; obtain 9 heads. Take the mean, median or mode of the posterior.
[Figure: prior and posterior distributions for the probability of heads.]

Bayesian confidence interval
Toss a biased coin 12 times; obtain 9 heads. Take the highest posterior density 95% interval.
[Figure: prior and posterior distributions, with the highest-density region marked.]
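A conjugate sketch of the three coin slides, with a Beta(2, 2) prior standing in for the slide's (unspecified) prior: 9 heads in 12 tosses gives a Beta(11, 5) posterior, from which the point estimates and an interval follow. (The interval below is equal-tailed; the slide's highest-density interval would differ slightly for this skewed posterior.)

```python
from scipy import stats

a0, b0, heads, n = 2, 2, 9, 12                  # prior is an assumption
post = stats.beta(a0 + heads, b0 + n - heads)   # conjugate update: Beta(11, 5)

print(f"mean   {post.mean():.3f}")
print(f"median {post.median():.3f}")
print(f"mode   {(a0 + heads - 1) / (a0 + b0 + n - 2):.3f}")
print("95% interval:", post.interval(0.95))
```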

But why do any of these? Just report the posterior. It encodes all that is known about θ.