5.2 Least-Squares Fit to a Straight Line

Topics:
- total probability and chi-square
- minimizing chi-square for a straight line
- solving the determinants
- example least-squares fit
- weighted least-squares with an example
- least-squares with counting errors
- counting example with an iterative solution

Probability for yi = a0 + a1xi

The dependent variable, y, is related to the independent variable, x, by an equation representing a straight line. If the true values of the coefficients were known (assume x is error-free), the mean of y would be given by the following equation.

μ = a0 + a1x

To test hypotheses, estimates of the equation coefficients, a0 and a1, are needed. To obtain these estimates, N values of x will be chosen and the corresponding values of y measured. It is assumed that the measured value of y has error described by a normal pdf and that the magnitude of this error, σ, can vary with x. For any one chosen value of x = xi, the probability that some particular value, yi, would be measured is given by the normal pdf with mean a0 + a1xi and standard deviation σi,

P(yi) = [1 / (σi √(2π))] exp[-(yi - a0 - a1xi)² / (2σi²)]

Total Probability and Chi-Square

Now consider the case where N measurements are made. The method of maximum likelihood will be applied to the total probability of the N measurements, in the belief that this procedure will provide the most probable estimates of the coefficients. Only the exponential factor contains a0 and a1. As a result, only this factor matters when maximizing ptotal with respect to the values of a0 and a1. Maximization is achieved by minimizing the summation within the exponent (written out below), a procedure called least-squares. This particular summation is so common that it is given its own name and symbol: chi-square, χ².
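Writing out the product of the N individual probabilities (a standard rearrangement):

ptotal = Π P(yi) = [ Π 1/(σi √(2π)) ] exp[ -(1/2) Σ ((yi - a0 - a1xi)/σi)² ]

The summation inside the exponent is chi-square:

χ² = Σ [(yi - a0 - a1xi) / σi]²   (sums run over i = 1, ..., N)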

Minimizing Chi-Square

For a straight line, chi-square has the form given above. The method of maximum likelihood requires that chi-square be minimized with respect to the two coefficients. This is done by taking the partial derivative with respect to each coefficient and setting the result equal to zero.
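Written out, these are the standard minimization conditions for a straight-line model:

∂χ²/∂a0 = -2 Σ (yi - a0 - a1xi)/σi² = 0
∂χ²/∂a1 = -2 Σ xi (yi - a0 - a1xi)/σi² = 0

Rearranging each condition gives

a0 Σ(1/σi²) + a1 Σ(xi/σi²) = Σ(yi/σi²)
a0 Σ(xi/σi²) + a1 Σ(xi²/σi²) = Σ(xi yi/σi²)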

Minimizing Chi-Square

And so, we have a system of two equations in two unknowns (a0, a1). All of the summations are known constants calculable from the experimental data set. The two equations above can be rewritten in matrix notation.
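In matrix form (the same equations, standard arrangement):

| Σ(1/σi²)   Σ(xi/σi²)  | | a0 |   | Σ(yi/σi²)    |
| Σ(xi/σi²)  Σ(xi²/σi²) | | a1 | = | Σ(xi yi/σi²) |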

Minimizing Chi-Square

Now we just need to invert the equation to isolate and solve for the most probable coefficients, a. For a 2×2 matrix, the inverse is straightforward to calculate. Substituting back in the explicit expressions for the elements of the different vectors and matrices yields the following.
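For the 2×2 matrix above, writing Δ = Σ(1/σi²) Σ(xi²/σi²) - [Σ(xi/σi²)]² for its determinant, the standard result is

a0 = [Σ(xi²/σi²) Σ(yi/σi²) - Σ(xi/σi²) Σ(xi yi/σi²)] / Δ
a1 = [Σ(1/σi²) Σ(xi yi/σi²) - Σ(xi/σi²) Σ(yi/σi²)] / Δ

The same computation as a minimal numerical sketch (the function name is illustrative):

```python
import numpy as np

def weighted_line_fit(x, y, sigma):
    """Solve the 2x2 normal equations for y = a0 + a1*x with per-point sigmas."""
    x, y, sigma = (np.asarray(v, dtype=float) for v in (x, y, sigma))
    w = 1.0 / sigma**2
    # Matrix and right-hand-side vector from the previous slide.
    M = np.array([[w.sum(),       (w * x).sum()],
                  [(w * x).sum(), (w * x**2).sum()]])
    b = np.array([(w * y).sum(), (w * x * y).sum()])
    a0, a1 = np.linalg.solve(M, b)   # equivalent to applying the 2x2 inverse
    return a0, a1
```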

Solving the Determinant (2)

In most experiments, the noise on y does not depend upon x, and all σi = σ. The unweighted least-squares solution is then given by the following, where the subscripts have been left off to simplify writing. Note that the units of D are now x², giving a0 units of y and a1 units of y/x.
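With equal σ the common factor cancels, and the standard unweighted formulas are

D = N Σx² - (Σx)²
a0 = (Σx² Σy - Σx Σxy) / D
a1 = (N Σxy - Σx Σy) / D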

Example Unweighted Least-Squares

N = 5 values of x were selected and the value of y measured:
(1,16.88) (2,9.42) (3,2.41) (4,-5.28) (5,-13.23)

The summations were computed as: Σx = 15, Σx² = 55, Σy = 10.20, Σxy = -44.33. The graph shows the raw data plus the regression line, y = 24.51 - 7.49x.
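As a quick check, these sums can be plugged into the unweighted formulas above (a short sketch; the small difference from the quoted intercept comes from rounding in the listed sums):

```python
# Unweighted straight-line fit from the quoted summations.
N, Sx, Sxx, Sy, Sxy = 5, 15.0, 55.0, 10.20, -44.33
D  = N * Sxx - Sx**2              # 50.0, units of x^2
a0 = (Sxx * Sy - Sx * Sxy) / D    # ~24.5
a1 = (N * Sxy - Sx * Sy) / D      # ~-7.49
print(a0, a1)
```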

Example Chi-Square Minimum

The graphs at the right show how the total probability is maximized, and how chi-square is minimized, at the least-squares coefficients. Since these were synthetic data, the true coefficients are known: μ0 = 24.53, μ1 = -7.22. The true values give neither the maximum probability nor the minimum chi-square!

Weighted Least-Squares

In a weighted least-squares fit, the inverse of the variance is defined as the weight, wi = 1/σi²: when the variance is large, the weight is small, and so on. The least-squares equations can be rewritten using this new variable, as shown below. With normally distributed noise it is uncommon to have a separate weight for each value of x. It is more common to have ranges of the independent variable with different weights. Also, the weights are often whole numbers, since only the relative values of σ may be known. For the example on the next slide there is a change in sensitivity at x = 5: σ = 5 for x ≤ 5, and σ = 1 for x > 5. The corresponding weights would be w = 1 for x ≤ 5, and w = 25 for x > 5.
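With wi in place of 1/σi², the standard weighted solution reads

Δ = Σw Σwx² - (Σwx)²
a0 = (Σwx² Σwy - Σwx Σwxy) / Δ
a1 = (Σw Σwxy - Σwx Σwy) / Δ

where, for example, Σwxy denotes Σ wi xi yi.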

Example Weighted Least-Squares (1)

N = 11 values of x were selected and the value of y measured:
(0,4.75) (1,11.14) (2,21.74) (3,37.13) (4,37.43) (5,45.53)
(6,47.38) (7,54.36) (8,61.77) (9,68.51) (10,74.99)

The first row of values had a sensitivity 5 times larger than the second row. The higher sensitivity made the noise 5 times larger. The first 6 values were weighted 1, and the second 5 were weighted 25.

Example Weighted Least-Squares (2)

The experimental data and regression lines are shown below. The error in the first six points was purposely made large enough that the difference in standard deviation could be seen visually. The blue line is the weighted least-squares fit, while the magenta line is the unweighted least-squares fit. Note how the unweighted least-squares incorrectly fits the points for x = 6 through 10.
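The comparison can be reproduced numerically with a short sketch using the tabulated data and the relative weights from the previous slide (variable and function names are illustrative; the two printed coefficient pairs correspond to the two lines described above):

```python
import numpy as np

x = np.arange(11, dtype=float)
y = np.array([4.75, 11.14, 21.74, 37.13, 37.43, 45.53,
              47.38, 54.36, 61.77, 68.51, 74.99])
w = np.array([1.0] * 6 + [25.0] * 5)   # relative weights, proportional to 1/sigma^2

def line_fit(x, y, w):
    """Weighted straight-line fit using the summation formulas above."""
    Sw, Swx, Swxx = w.sum(), (w * x).sum(), (w * x**2).sum()
    Swy, Swxy = (w * y).sum(), (w * x * y).sum()
    D = Sw * Swxx - Swx**2
    return (Swxx * Swy - Swx * Swxy) / D, (Sw * Swxy - Swx * Swy) / D

print("weighted:  ", line_fit(x, y, w))
print("unweighted:", line_fit(x, y, np.ones_like(x)))
```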

Weighting with Counting Errors (1)

When the y-axis involves counts with a Poisson pdf, each value of y will have a unique σ. To the extent that the Poisson pdf approximates a normal pdf, the least-squares equations can be used. The equations are obtained by substituting μi for σi². With counting experiments y, μ, and σ have no units. Thus D has units of x², making a0 have no units and a1 have units of x⁻¹.
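Concretely, the Poisson substitution replaces the weights 1/σi² by 1/μi:

χ² = Σ (yi - a0 - a1xi)² / μi,   with μi = a0 + a1xi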

Weighting with Counting Errors (2)

Since the μi are unknown, an iterative procedure must be used:
(1) For the first step, estimate μi with the experimental yi and perform the fit.
(2) Use the a0 and a1 obtained in step (1) to compute a better estimate of the means, μi = a0 + a1xi.
(3) Repeat step (2) using the revised estimates of μi.
(4) Continue repeating until the values of a0 and a1 are stable to the desired precision.

The procedure gives the same coefficient values as the method of maximum likelihood applied to the Poisson pdf.
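A minimal sketch of this loop (the function name and convergence tolerance are illustrative, not from the slides):

```python
import numpy as np

def counting_line_fit(x, y, max_iter=20, tol=1e-3):
    """Iteratively weighted straight-line fit for counting (Poisson) data."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    mu = y.copy()                           # step (1): estimate the means with the data
    a0_prev = a1_prev = np.inf
    for it in range(1, max_iter + 1):
        w = 1.0 / mu                        # Poisson weights: 1/sigma_i^2 = 1/mu_i
        Sw, Swx, Swxx = w.sum(), (w * x).sum(), (w * x**2).sum()
        Swy, Swxy = (w * y).sum(), (w * x * y).sum()
        D = Sw * Swxx - Swx**2
        a0 = (Swxx * Swy - Swx * Swxy) / D
        a1 = (Sw * Swxy - Swx * Swy) / D
        print(f"iteration {it}: a0 = {a0:.3f}, a1 = {a1:.3f}")
        if abs(a0 - a0_prev) < tol and abs(a1 - a1_prev) < tol:
            break                           # stable to the desired precision
        a0_prev, a1_prev = a0, a1
        mu = a0 + a1 * x                    # step (2): improved estimate of the means
    return a0, a1
```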

Example with Counting Errors

N = 11 values of x were selected and the value of y measured:
(0,20) (1,31) (2,36) (3,41) (4,37) (5,50)
(6,58) (7,52) (8,58) (9,54) (10,56)

iteration    a0        a1
1            25.352    3.775
2            26.052    3.753
3            26.083    3.747
4            26.086    3.746
MML

The blue regression line is the initial iteration; the magenta line is the Poisson MML solution.
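Continuing the sketch above, applying counting_line_fit to these data prints the per-iteration coefficients, which should track the tabulated values up to rounding:

```python
x = list(range(11))
y = [20, 31, 36, 41, 37, 50, 58, 52, 58, 54, 56]
a0, a1 = counting_line_fit(x, y)    # prints a0, a1 for each iteration
```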