Presentation is loading. Please wait.

Presentation is loading. Please wait.

LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 1 Lecture 5 Slide 1 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall.

Similar presentations


Presentation on theme: "LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 1 Lecture 5 Slide 1 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall."— Presentation transcript:

1 LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 1 Lecture 5 Slide 1 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Intermediate Lab PHYS 3870 Lecture 4 Comparing Data and Models— Quantitatively Linear Regression References: Taylor Ch. 8 and 9 Also refer to “Glossary of Important Terms in Error Analysis”

2 LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 2 Lecture 5 Slide 2 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Intermediate Lab PHYS 3870 Errors in Measurements and Models A Review of What We Know

3 LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 3 Lecture 5 Slide 3 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Quantifying Precision and Random (Statistical) Errors

4 LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 4 Lecture 5 Slide 4 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Single Measurement: Comparison with Other Data Comparison of precision or accuracy?

5 LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 5 Lecture 5 Slide 5 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Single Measurement: Direct Comparison with Standard Comparison of precision or accuracy?

6 LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 6 Lecture 5 Slide 6 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Multiple Measurements of the Same Quantity

7 LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 7 Lecture 5 Slide 7 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Standard Deviation Multiple Measurements of the Same Quantity

8 LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 8 Lecture 5 Slide 8 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Standard Deviation of the Mean Multiple Sets of Measurements of the Same Quantity

9 LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 9 Lecture 5 Slide 9 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Errors Propagation—Error in Models or Derived Quantities

10 LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 10 Lecture 5 Slide 10 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Specific Rules for Error Propagation (Worst Case)

11 LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 11 Lecture 5 Slide 11 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 General Formula for Error Propagation

12 LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 12 Lecture 5 Slide 12 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 General Formula for Multiple Variables Worst case Best case

13 LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 13 Lecture 5 Slide 13 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Error for a function of Two Variables: Addition in Quadrature Consider the arbitrary derived quantity q(x,y) of two independent random variables x and y. Expand q(x,y) in a Taylor series about the expected values of x and y (i.e., at points near X and Y). Error Propagation: General Case Fixed, shifts peak of distribution FixedDistribution centered at X with width σ X Best case

14 LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 14 Lecture 5 Slide 14 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Independent (Random) Uncertaities and Gaussian Distributions

15 LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 15 Lecture 5 Slide 15 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Gaussian Distribution Function Width of Distribution (standard deviation) Center of Distribution (mean) Distribution Function Independent Variable Gaussian Distribution Function Normalization Constant

16 LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 16 Lecture 5 Slide 16 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Standard Deviation of Gaussian Distribution Area under curve (probability that –σ<x<+σ) is 68% 5% or ~2σ “Significant” 1% or ~3σ “Highly Significant” 1σ “Within errors” 0.5 ppm or ~5σ “Valid for HEP” See Sec. 10.6: Testing of Hypotheses

17 LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 17 Lecture 5 Slide 17 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Mean of Gaussian Distribution as “Best Estimate” Principle of Maximum Likelihood Best Estimate of X is from maximum probability or minimum summation Consider a data set Each randomly distributed with {x 1, x 2, x 3 …x N } To find the most likely value of the mean (the best estimate of ẋ ), find X that yields the highest probability for the data set. The combined probability for the full data set is the product Minimize Sum Solve for derivative set to 0 Best estimate of X

18 LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 18 Lecture 5 Slide 18 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Uncertaity of “Best Estimates” of Gaussian Distribution Principle of Maximum Likelihood Best Estimate of X is from maximum probability or minimum summation Consider a data set{x 1, x 2, x 3 …x N } To find the most likely value of the mean (the best estimate of ẋ ), find X that yields the highest probability for the data set. The combined probability for the full data set is the product Minimize Sum Solve for derivative wrst X set to 0 Best estimate of X Best Estimate of σ is from maximum probability or minimum summation Minimize Sum Best estimate of σ Solve for derivative wrst σ set to 0 See Prob. 5.26

19 LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 19 Lecture 5 Slide 19 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Weighted Averages Question: How can we properly combine two or more separate independent measurements of the same randomly distributed quantity to determine a best combined value with uncertainty?

20 LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 20 Lecture 5 Slide 20 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Weighted Averages Best Estimate of χ is from maximum probibility or minimum summation Minimize SumSolve for best estimate of χ Solve for derivative wrst χ set to 0 Note: If w 1 =w 2, we recover the standard result X wavg = (1/2) (x 1 +x 2 )

21 LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 21 Lecture 5 Slide 21 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Intermediate Lab PHYS 3870 Comparing Measurements to Models Linear Regression

22 LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 22 Lecture 5 Slide 22 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Motivating Regression Analysis Question: Consider now what happens to the output of a nearly ideal experiment, if we vary how hard we poke the system (vary the input). SYSTEM Input Output Uncertainties in Observations The Universe The simplest model of response (short of the trivial constant response) is a linear response model y(x) = A + B x

23 LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 23 Lecture 5 Slide 23 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Questions for Regression Analysis Two principle questions: What are the best values of A and B, for: (see Taylor Ch. 8) A perfect data set, where A and B are exact? A data set with uncertainties? What confidence can we place in how well a linear model fits the data? (see Taylor Ch. 9) The simplest model of response (short of the trivial constant response) is a linear response model y(x) = A + B x

24 LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 24 Lecture 5 Slide 24 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Intermediate Lab PHYS 3870 Graphical Analysis

25 LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 25 Lecture 5 Slide 25 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Graphical Analysis An “old School” approach to linear fits. Rough plot of data Estimate of uncertainties with error bars A “best” linear fit with a straight edge Estimates of uncertainties in slope and intercept from the error bars This is a great practice to get into as you are developing an experiment!

26 LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 26 Lecture 5 Slide 26 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Is it Linear? Adding 2D error bars is sometimes helpful. A simple model is a linear model You know it when you see it (qualitatively) Tested with a straight edge Error bar are a first step in gauging the “goodness of fit”

27 LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 27 Lecture 5 Slide 27 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Making It Linear or Linearization A simple trick for many models is to linearize the model in the independent variable. Refer to Baird Ch.5 and the associated homework problems.

28 LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 28 Lecture 5 Slide 28 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Special Graph Paper Linear Semilog Log-Log “Old School” graph paper is still a useful tool, especially for reality checking during the experimental design process. Semi-log paper tests for exponential models. Log-log paper tests for power law models. Both Semi-log and log -log paper are handy for displaying details of data spread over many orders of magnitude.

29 LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 29 Lecture 5 Slide 29 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Intermediate Lab PHYS 3870 Linear Regression

30 LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 30 Lecture 5 Slide 30 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Basic Assumptions for Regression Analysis We will (initially) assume: The errors in how hard you poke something (in the input) are negligible compared with errors in the response (see discussion in Taylor Sec. 9.3) The errors in y are constant (see Problem 8.9 for weighted errors analysis) The measurements of y i are governed by a Gaussian distribution with constant width σ y

31 LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 31 Lecture 5 Slide 31 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Question 1: What is the Best Linear Fit (A and B)? Best Estimate of intercept, A, and slope, B, for Linear Regression or Least Squares- Fit for Line

32 LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 32 Lecture 5 Slide 32 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Best Estimates of A and B are from maximum probibility or minimum summation Minimize SumBest estimate of A and B Solve for derivative wrst A and B set to 0 “Best Estimates” of Linear Fit

33 LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 33 Lecture 5 Slide 33 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Best Estimates of A and B are from maximum probibility or minimum summation Minimize SumBest estimate of A and B Solve for derivative wrst A and B set to 0 “Best Estimates” of Linear Fit In a linear algebraic formThis is a standard eigenvalue problemWith solutions is the uncertainty in the measurement of y, or the rms deviation of the measured to predicted value of y

34 LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 34 Lecture 5 Slide 34 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Least Squares Fits to Other Curves

35 LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 35 Lecture 5 Slide 35 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Least Squares Fits to Other Curves

36 LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 36 Lecture 5 Slide 36 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Least Squares Fits to Other Curves

37 LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 37 Lecture 5 Slide 37 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Fitting a Polynomial This leads to a standard 3x3 eigenvalue problem, Which can easily be generalized to any order polynomial Extend the linear solution to include on more tem, for a second order polynomial This looks just like our linear problem, with the deviation in the summation replace by

38 LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 38 Lecture 5 Slide 38 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Least Squares Fits to Other Curves

39 LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 39 Lecture 5 Slide 39 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Least Squares Fits to Other Curves

40 LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 40 Lecture 5 Slide 40 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Least Squares Fits to Other Curves

41 LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 41 Lecture 5 Slide 41 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Problem 8.24

42 LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 42 Lecture 5 Slide 42 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Problem 8.24

43 LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 43 Lecture 5 Slide 43 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Problem 8.24

44 LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 44 Lecture 5 Slide 44 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Problem 8.24

45 LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 45 Lecture 5 Slide 45 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Problem 8.24

46 LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 46 Lecture 5 Slide 46 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Intermediate Lab PHYS 3870 Correlations

47 LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 47 Lecture 5 Slide 47 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Uncertaities in a Function of Variables Consider an arbitrary function q with variables x,y,z and others. Expanding the uncertainty in y in terms of partial derivatives, we have If x,y,z and others are independent and random variables, we have If x,y,z and others are independent and random variables governed by normal distributions, we have We now consider the case when x,y,z and others are not independent and random variables governed by normal distributions.

48 LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 48 Lecture 5 Slide 48 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Covariance of a Function of Variables The standard deviation of the N values of q i is We then find the simple result for the mean of q Note partial derivatives are all taken at X or Y and are hence the same for each i We now consider the case when x and y are not independent and random variables governed by normal distributions. Assume we measure N pairs of data (x i,y i ), with small uncertaities so that all x i and y i are close to their mean values X and Y. Expanding in a Taylor series about the means, the value q i for (x i,y i ), 00

49 LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 49 Lecture 5 Slide 49 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Covariance of a Function of Variables The standard deviation of the N values of q i is If x and y are independent

50 LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 50 Lecture 5 Slide 50 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Schwartz Inequality Show that See problem 9.7

51 LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 51 Lecture 5 Slide 51 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Schwartz Inequality Combining the Schwartz inequality With the definition of the covariance yields Then completing the squares And taking the square root of the equation, we finally have At last, the upper bound of errors is And for independent and random variables

52 LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 52 Lecture 5 Slide 52 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Another Useful Relation

53 LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 53 Lecture 5 Slide 53 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Question 2: Is it Linear? Consider the limiting cases for: r=0 (no correlation) [for any x, the sum over y-Y yields zero] r=±1 (perfect correlation). [Substitute yi-Y=B(xi-X) to get r=B/|B|=±1] y(x) = A + B x

54 LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 54 Lecture 5 Slide 54 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Tabulated Correlation Coefficient Consider the limiting cases for: r=0 (no correlation) r=±1 (perfect correlation). To gauge the confidence imparted by intermediate r values consult the table in Appendix C. r value N data points Probability that analysis of N=70 data points with a correlation coefficient of r=0.5 is not modeled well by a linear relationship is 3.7%. Therefore, it is very probably that y is linearly related to x. If Prob N (|r|>r o )<32%  it is probably that y is linearly related to x Prob N (|r|>r o )<5%  it is very probably that y is linearly related to x Prob N (|r|>r o )<1%  it is highly probably that y is linearly related to x

55 LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 55 Lecture 5 Slide 55 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall 2013 Uncertainties in Slope and Intercept Taylor: Relation to R 2 value:


Download ppt "LINEAR REGRESSION Introduction Section 0 Lecture 1 Slide 1 Lecture 5 Slide 1 INTRODUCTION TO Modern Physics PHYX 2710 Fall 2004 Intermediate 3870 Fall."

Similar presentations


Ads by Google