Modelling data and curve fitting


Modelling data and curve fitting: least squares, maximum likelihood, chi squared, confidence limits, general linear fits (Numerical Recipes, Press et al., Chapter 15).

Best fit straight line: Assume we measure a parameter $y$ for a set of $x$ values, giving a dataset $[x_i]$, $[y_i]$. We want to model the data using the linear relation $y(x_i) = a + b x_i$.

Best fit straight line: How do we find the coefficients $a$ and $b$ that give the best fit to the data? Given a pair of values for $a$ and $b$, we need to define a measure of the 'goodness of fit'; we then choose the $a$ and $b$ values that give the best fit.

Least squares fit: For each data point $x_i$, calculate the difference between the measured $y_i$ and the model prediction, $\Delta y_i = y_i - a - b x_i$. Note that $\Delta y_i$ can be positive or negative, so $\sum_i \Delta y_i$ can be zero even for a poor fit; minimizing the sum of the squared residuals, $S = \sum_i (\Delta y_i)^2$, gives a good overall fit instead. Computationally, try a range of values for $a$ and $b$, calculate $S$ for each pair, and take the pair that gives the smallest $S$ as the best fit, as in the sketch below.
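
A minimal Python sketch of this brute-force grid search (the data values and grid ranges here are invented for illustration):

```python
# Evaluate S = sum((y_i - a - b*x_i)^2) on a grid of (a, b) values and
# keep the pair with the smallest S.
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 7.1, 8.8])   # roughly y = 1 + 2x with noise

a_grid = np.linspace(-5, 5, 201)
b_grid = np.linspace(-5, 5, 201)

best = (np.inf, None, None)
for a in a_grid:
    for b in b_grid:
        S = np.sum((y - a - b * x) ** 2)   # sum of squared residuals
        if S < best[0]:
            best = (S, a, b)

S_min, a_best, b_best = best
print(f"best fit: a = {a_best:.2f}, b = {b_best:.2f}, S = {S_min:.3f}")
```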

Maximum likelihood: It can be shown that the parameters that minimize the sum of the squares are the most likely ones, given the measured data. Assume the $x$ values are exact, and the measurement errors on the $y$ values are Gaussian with mean zero and standard deviation $\sigma$, so that $y_i = y_{\mathrm{true}}(x_i) + \epsilon_i$, where $\epsilon_i$ is a random variable drawn from a Gaussian distribution.

Example: Gaussian distribution (figure).

If the true values of $a$ and $b$ are $a_0$ and $b_0$, then $\epsilon_i = y_i - a_0 - b_0 x_i$. So the probability of observing $y_i$ is (assuming $\sigma$ is the same for all measurements) $P(y_i) \propto \exp\!\left[-\frac{(y_i - a_0 - b_0 x_i)^2}{2\sigma^2}\right]$.

And the probability of observing the whole dataset $[y_i]$ is the product $P([y_i]) \propto \prod_i \exp\!\left[-\frac{(y_i - a_0 - b_0 x_i)^2}{2\sigma^2}\right]$. We can use Bayes' theorem to relate this to how likely it is that the model parameters are $a$ and $b$.

Bayes' theorem: Given two events $A$ and $B$, the conditional probabilities are related by $P(A|B)\,P(B) = P(B|A)\,P(A)$. Here $P(A|B)$ is the probability of $A$ happening given that $B$ has happened, and $P(A)$ is the probability of $A$ happening, independent of $B$.
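
A quick numerical illustration of the theorem (the probabilities are invented):

```latex
% Suppose P(A) = 0.01, P(B|A) = 0.9 and P(B) = 0.05. Then
P(A \mid B) \;=\; \frac{P(B \mid A)\,P(A)}{P(B)}
            \;=\; \frac{0.9 \times 0.01}{0.05} \;=\; 0.18 .
% Even though B is very likely when A holds, A remains improbable after
% observing B, because A was rare to begin with.
```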

Application of Bayes' theorem: Consider a model $M$ and some data $D$. Bayes' theorem tells you the probability that the model is right, given the data that you have observed: $P(M|D) = P(D|M)\,P(M)/P(D)$. So the probability of a particular model, given the data, depends on the probability of observing your data given the model; the most probable model is the one for which the observed data are most likely. Vary $a$ and $b$ to find the maximum of $P(M(a,b)|D)$; assuming a uniform prior $P(M)$, this is the same as maximizing the likelihood $P([y_i])$ defined earlier.

Maximizing the likelihood means minimizing the sum of squares: maximizing $P([y_i]) \propto \exp\!\left[-\frac{1}{2\sigma^2}\sum_i (y_i - a - b x_i)^2\right]$ is the same as minimizing $\sum_i (y_i - a - b x_i)^2$. So for uniform Gaussian errors, maximum likelihood is the same as least squares.
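
Spelling out the step explicitly (standard algebra, not from the slides):

```latex
% Assuming independent Gaussian errors with common width sigma:
P([y_i] \mid a,b) \;\propto\; \prod_i \exp\!\left[-\frac{(y_i - a - b x_i)^2}{2\sigma^2}\right]
                  \;=\; \exp\!\left[-\frac{1}{2\sigma^2}\sum_i (y_i - a - b x_i)^2\right],
% so, taking the negative logarithm,
-\ln P \;=\; \frac{1}{2\sigma^2}\sum_i (y_i - a - b x_i)^2 \;+\; \text{const},
% and maximizing P is the same as minimizing the sum of squares S.
```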

Non-Gaussian errors: Sometimes you know the errors are not Gaussian, so least squares may not be the best method. Minimizing the sum of the moduli of the residuals, $\sum_i |y_i - y(x_i)|$, is very robust; it is equivalent to using the median instead of the mean. In general, use M-estimates: maximum likelihood based on the actual non-Gaussian error distribution. A sketch follows below.
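
A minimal sketch of such a robust fit, minimizing the sum of absolute residuals with a generic simplex search via scipy.optimize.minimize (the data, including the deliberate outlier, are invented):

```python
# Minimize sum(|y_i - a - b*x_i|) instead of the sum of squares; the
# result is barely pulled by the outlier, unlike a least-squares fit.
import numpy as np
from scipy.optimize import minimize

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 7.1, 30.0])   # last point is an outlier

def sum_abs_residuals(params):
    a, b = params
    return np.sum(np.abs(y - a - b * x))

# Nelder-Mead copes with the non-smooth |.| objective
result = minimize(sum_abs_residuals, x0=[0.0, 1.0], method="Nelder-Mead")
a_fit, b_fit = result.x
print(f"L1 fit: a = {a_fit:.2f}, b = {b_fit:.2f}")
```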

$\chi^2$ (chi squared): If the uncertainty $\sigma_i$ is different for each measurement, then define the quantity $\chi^2 = \sum_i \left(\frac{y_i - y(x_i)}{\sigma_i}\right)^2$. If the errors are Gaussian, then minimizing $\chi^2$ will give the maximum likelihood values of the parameters.
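
A minimal helper matching this definition (names are illustrative):

```python
# Residuals are weighted by the per-point uncertainty sigma_i before squaring.
import numpy as np

def chi_squared(y, y_model, sigma):
    """chi^2 = sum_i ((y_i - y_model_i) / sigma_i)^2"""
    return np.sum(((y - y_model) / sigma) ** 2)
```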

Example of a $\chi^2$ minimum (figure).

Finding the minimum of $\chi^2$ (numerically): Calculate $\chi^2$ for a grid of $a$ and $b$ values and pick the point that gives the minimum, as in the grid-search sketch above (with $S$ replaced by $\chi^2$).

Finding the minimum of $\chi^2$ (analytically): Differentiate $\chi^2$ with respect to $a$ and $b$, and set $\partial\chi^2/\partial a = 0$ and $\partial\chi^2/\partial b = 0$. Defining the sums $S = \sum_i 1/\sigma_i^2$, $S_x = \sum_i x_i/\sigma_i^2$, $S_y = \sum_i y_i/\sigma_i^2$, $S_{xx} = \sum_i x_i^2/\sigma_i^2$ and $S_{xy} = \sum_i x_i y_i/\sigma_i^2$, this leads to $a = (S_{xx} S_y - S_x S_{xy})/\Delta$ and $b = (S\,S_{xy} - S_x S_y)/\Delta$, with $\Delta = S\,S_{xx} - S_x^2$.
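
A sketch of this closed-form solution in Python, using the sums defined above (as in Numerical Recipes, Chapter 15):

```python
import numpy as np

def fit_line(x, y, sigma):
    """Weighted least-squares straight line y = a + b*x."""
    w = 1.0 / sigma**2
    S, Sx, Sy = np.sum(w), np.sum(w * x), np.sum(w * y)
    Sxx, Sxy = np.sum(w * x * x), np.sum(w * x * y)
    delta = S * Sxx - Sx**2
    a = (Sxx * Sy - Sx * Sxy) / delta   # intercept
    b = (S * Sxy - Sx * Sy) / delta     # slope
    return a, b
```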

Confidence intervals: The distribution of $\chi^2_{\min}$ has a chi-square distribution with $N - M$ degrees of freedom ($N$ data points, $M$ parameters). The distribution of $\Delta\chi^2 = \chi^2 - \chi^2_{\min}$ has a chi-square distribution with $M$ degrees of freedom. The probability of a given parameter value being the true value is given by the probability of getting the observed $\Delta\chi^2$ for that value. For one parameter, $\Delta\chi^2 = 1$ corresponds to 68%, i.e. $1\sigma$.
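
A sketch of reading off a 68% interval by scanning $\Delta\chi^2 \le 1$; to keep it short this assumes a hypothetical one-parameter model (slope only, intercept fixed at zero), with invented data:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.1, 3.9, 6.2, 7.8])
sigma = np.array([0.3, 0.3, 0.3, 0.3])

# chi^2 as a function of the single parameter b
b_grid = np.linspace(1.0, 3.0, 2001)
chi2 = np.array([np.sum(((y - b * x) / sigma) ** 2) for b in b_grid])

chi2_min = chi2.min()
inside = b_grid[chi2 <= chi2_min + 1.0]   # delta chi^2 <= 1  ->  68% region
b_best = b_grid[chi2.argmin()]
print(f"b = {b_best:.3f}, 68% interval: [{inside.min():.3f}, {inside.max():.3f}]")
```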

The value of $\chi^2_{\min}$: The value of $\chi^2_{\min}$ tells you more about the model and the data. If $\chi^2_{\min}$ is greater than the number of degrees of freedom, either the real errors are greater than the $\sigma_i$ that you used, or the model is not good. If $\chi^2_{\min}$ is less than the number of degrees of freedom, either the real errors are smaller than the $\sigma_i$ that you used, or the model has too many parameters.

General linear models: Express your model as a sum of basis functions with linear coefficients, $y(x) = \sum_{k=1}^{M} a_k X_k(x)$. The functions $X_k(x)$ can be arbitrary, but are fixed; 'linear' refers to the coefficients. A common example is a polynomial fit, where the basis functions are powers of $x$: $X_k(x) = x^{k-1}$.

Finding the minimum of $\chi^2$ (analytically, for a general model): Differentiate $\chi^2$ with respect to each parameter $a_k$ and set the derivatives to zero. Define a matrix $\alpha$ and a vector $\beta$: $\alpha_{kj} = \sum_i \frac{X_j(x_i)\,X_k(x_i)}{\sigma_i^2}$ and $\beta_k = \sum_i \frac{y_i\,X_k(x_i)}{\sigma_i^2}$. $\alpha$ is called the curvature matrix.

Then the equations can be written in matrix form as $\alpha \cdot a = \beta$, and the solutions are given by $a = C \cdot \beta$, where $C = \alpha^{-1}$ is the inverse of the curvature matrix. $C$ is also called the covariance matrix: its diagonal elements give the variances of the fitted parameters. A sketch follows below.
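
A sketch of the general linear fit in matrix form, here with polynomial basis functions and invented data:

```python
# Build the curvature matrix alpha and vector beta from the basis
# functions, solve alpha a = beta, and read errors off C = alpha^-1.
import numpy as np

x = np.array([0.0, 0.5, 1.0, 1.5, 2.0])
y = np.array([1.0, 1.3, 2.1, 3.2, 5.1])
sigma = np.full_like(x, 0.2)

# design matrix: column k holds X_k(x_i) = x_i**k (quadratic fit)
X = np.vander(x, N=3, increasing=True)

w = 1.0 / sigma**2
alpha = X.T @ (w[:, None] * X)   # alpha_kj = sum_i X_j X_k / sigma_i^2
beta = X.T @ (w * y)             # beta_k   = sum_i y_i X_k / sigma_i^2

C = np.linalg.inv(alpha)         # covariance matrix
a = C @ beta                     # best-fit coefficients
print("coefficients:", a)
print("1-sigma errors:", np.sqrt(np.diag(C)))
```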

Non-linear fits: The easiest approach is to make the model linear, for example by taking logs: $y = A e^{bx}$ becomes $\ln y = \ln A + b x$. Otherwise, use a direct parameter search for the minimum $\chi^2$.
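
A sketch of the log-linearization for an assumed exponential model $y = A e^{bx}$ (invented data; note that taking logs also changes the effective error weighting):

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.2, 3.1, 8.4, 22.0])   # all y must be positive to take logs

# straight-line fit in log space: ln y = ln A + b*x
b, ln_A = np.polyfit(x, np.log(y), 1)
A = np.exp(ln_A)
print(f"y ≈ {A:.2f} * exp({b:.2f} x)")
```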

Workshop: Least squares straight-line fit, and interpreting the resulting chi square. Non-linear fit using a simple search for the minimum chi square.