Linear regression
Petter Mostad, mostad@chalmers.se

Relationships between variables We want to understand the relationship between x and y!

What to do with a fitted line: interpolation, extrapolation, and interpretation of the line's parameters.

How to define the "best fitting" line? The line for which the sum of squares of the "errors", or residuals, is minimized: this is the least squares method. Note: other criteria could be optimized instead.

How to compute the least squares line? Let (x1, y1), (x2, y2), ..., (xn, yn) be the data points. Find a and b such that y = a + bx fits the points, by minimizing

S(a, b) = Σᵢ (yᵢ − a − b·xᵢ)²

Solution:

b = Σᵢ (xᵢ − x̄)(yᵢ − ȳ) / Σᵢ (xᵢ − x̄)²,   a = ȳ − b·x̄

where x̄ = (1/n) Σᵢ xᵢ and ȳ = (1/n) Σᵢ yᵢ, and all sums are over i = 1, ..., n.
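The formulas above translate directly into code. A minimal sketch (the function name is illustrative):

```python
# Least squares fit of y = a + b*x, implementing the formulas above.

def least_squares_line(xs, ys):
    """Return (a, b) minimizing sum((y_i - (a + b * x_i)) ** 2)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # b = S_xy / S_xx, a = y_bar - b * x_bar
    b = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) \
        / sum((x - x_bar) ** 2 for x in xs)
    a = y_bar - b * x_bar
    return a, b

# Points lying exactly on y = 1 + 2x are recovered exactly:
a, b = least_squares_line([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
# a ≈ 1, b ≈ 2
```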

How do you obtain these formulas? Differentiate S with respect to a and b, and set the results to zero:

∂S/∂a = −2 Σᵢ (yᵢ − a − b·xᵢ) = 0
∂S/∂b = −2 Σᵢ xᵢ (yᵢ − a − b·xᵢ) = 0

These are two linear equations in the two unknowns a and b, and the solution is the answer above.
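The two equations can be checked numerically: at the least squares (a, b), both partial derivatives of S vanish. A small sketch with illustrative data:

```python
# Numerical check of the normal equations at the least squares solution.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 2.9, 4.2, 4.8]

n = len(xs)
x_bar, y_bar = sum(xs) / n, sum(ys) / n
b = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) \
    / sum((x - x_bar) ** 2 for x in xs)
a = y_bar - b * x_bar

# Both derivatives are zero (up to floating point) at the fitted (a, b):
dS_da = -2 * sum(y - a - b * x for x, y in zip(xs, ys))
dS_db = -2 * sum(x * (y - a - b * x) for x, y in zip(xs, ys))
```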

Example. Crickets make sound by rubbing their wings together. There is a correlation between the temperature and the number of wing movements per second, unique to each species. Here are some data for Nemobius fasciatus fasciatus:

Movements/sec   Temperature
20.0            31.4
16.0            22.0
19.8            34.1
18.4            29.1
15.5            24.0
14.7            21.0
17.1            27.7
15.4            20.7
16.2            28.5
15.0            26.4
17.2            28.1
17.0            28.6
14.4            24.6

If you measure 18 movements per second, what is the estimated temperature? Data from Pierce, GW. The Songs of Insects. Cambridge, Mass.: Harvard University Press, 1949, pp. 12-21.

Example (cont.) Computations: with x = movements/sec and y = temperature, we get x̄ ≈ 16.67 and ȳ ≈ 26.63, so

b = Σ(xᵢ − x̄)(yᵢ − ȳ) / Σ(xᵢ − x̄)² ≈ 74.34 / 39.93 ≈ 1.86
a = ȳ − b·x̄ ≈ −4.41

Answer: the estimated temperature at 18 movements per second is a + b·18 ≈ 29.1.
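The computation can be reproduced directly from the table (data from Pierce, 1949):

```python
# Least squares fit of temperature on movements/sec for the cricket data.
movements = [20.0, 16.0, 19.8, 18.4, 15.5, 14.7, 17.1,
             15.4, 16.2, 15.0, 17.2, 17.0, 14.4]
temperature = [31.4, 22.0, 34.1, 29.1, 24.0, 21.0, 27.7,
               20.7, 28.5, 26.4, 28.1, 28.6, 24.6]

n = len(movements)
x_bar = sum(movements) / n
y_bar = sum(temperature) / n
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(movements, temperature))
s_xx = sum((x - x_bar) ** 2 for x in movements)
b = s_xy / s_xx               # slope, about 1.86
a = y_bar - b * x_bar         # intercept, about -4.41
estimate = a + b * 18         # estimated temperature, about 29.1
```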

What about the uncertainty in the prediction? Temperature is not perfectly predicted! We assume: a linear model relates the number of wing movements to a mean predicted temperature; the actual temperature has a normal distribution around that mean prediction, with variance σ². In order to make a prediction with uncertainty, we must: first estimate the parameters of the line from data; then find the predicted temperature at the given number of wing movements, with its uncertainty; and finally add the "random error", with the estimated variance.
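The three steps above can be sketched in code. This uses the standard prediction standard error formula for simple linear regression, se_pred = s·sqrt(1 + 1/n + (x0 − x̄)²/Sxx) with s² = SSE/(n − 2); the data here are illustrative, not from the slides:

```python
import math

# Uncertainty of a prediction at a new point x0 in simple linear regression.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.2, 2.8, 3.9, 4.1, 5.2, 5.8]

n = len(xs)
x_bar, y_bar = sum(xs) / n, sum(ys) / n
s_xx = sum((x - x_bar) ** 2 for x in xs)
b = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / s_xx
a = y_bar - b * x_bar

sse = sum((y - a - b * x) ** 2 for x, y in zip(xs, ys))
s = math.sqrt(sse / (n - 2))      # estimated residual standard deviation

x0 = 4.5
# Uncertainty of the fitted line itself at x0:
se_mean = s * math.sqrt(1 / n + (x0 - x_bar) ** 2 / s_xx)
# Adding the random error around the line gives the prediction uncertainty:
se_pred = s * math.sqrt(1 + 1 / n + (x0 - x_bar) ** 2 / s_xx)
```

Note that se_pred is always larger than both se_mean and s: predicting a single new observation is harder than estimating the mean of the line.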

More examples. A model for prediction of y claims that every unit increase in a variable x increases the expected value of y by 1.4, and that y is normally distributed around this expectation with some fixed variance. How can we test this model? Or: you have a choice between two models relating y to x, either y = ax + b + error or y = ax + error. How can you choose?

Procedure for answering such questions: We set up a linear model for our observations. We estimate its parameters, with uncertainty. We then either draw conclusions from these estimates, with uncertainty, or use the estimates, with their uncertainties, to make predictions, with uncertainties. We will return to this soon, but first something more about the basic estimation…

y against x ≠ x against y. Linear regression of y against x does not give the same result as regression of x against y: the first minimizes residuals in the y-direction (vertical distances), the second residuals in the x-direction (horizontal distances). Try it with the cricket example: regressing temperature on movements/sec and movements/sec on temperature gives two different lines.
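One way to see the asymmetry concretely: the product of the two slopes equals r², which is 1 only when the points lie exactly on a line. A small sketch with illustrative data:

```python
# The regressions of y on x and of x on y have different slopes;
# their product equals the squared correlation r^2.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 1.0, 4.0, 3.0, 5.0]

n = len(xs)
x_bar, y_bar = sum(xs) / n, sum(ys) / n
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
s_xx = sum((x - x_bar) ** 2 for x in xs)
s_yy = sum((y - y_bar) ** 2 for y in ys)

b_yx = s_xy / s_xx   # slope of the regression of y on x
b_xy = s_xy / s_yy   # slope of the regression of x on y
r_squared = s_xy ** 2 / (s_xx * s_yy)
# b_yx * b_xy == r_squared < 1, so the two lines differ.
```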

Centering the variables. Assume we subtract the averages from the x- and y-values, i.e., work with xᵢ' = xᵢ − x̄ and yᵢ' = yᵢ − ȳ. We then get a = 0 and b = Σᵢ xᵢ'yᵢ' / Σᵢ (xᵢ')². From the definitions of correlation and standard deviation it follows that b = r·s_y/s_x (even in the uncentered case). Note: the residuals sum to 0. Note how subtracting the averages corresponds to translating the coordinate system, which does not change the regression line.
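These claims are easy to verify numerically: centering leaves the slope unchanged, makes the intercept 0, and the residuals of the original fit sum to 0. A small sketch:

```python
# Centering subtracts the averages, i.e. translates the coordinate system.
xs = [1.0, 2.0, 4.0, 5.0]
ys = [2.0, 3.0, 7.0, 9.0]

n = len(xs)
x_bar, y_bar = sum(xs) / n, sum(ys) / n

def fit(us, vs):
    """Least squares fit; returns (intercept, slope)."""
    u_bar, v_bar = sum(us) / len(us), sum(vs) / len(vs)
    b = sum((u - u_bar) * (v - v_bar) for u, v in zip(us, vs)) \
        / sum((u - u_bar) ** 2 for u in us)
    return v_bar - b * u_bar, b

a, b = fit(xs, ys)
a_c, b_c = fit([x - x_bar for x in xs], [y - y_bar for y in ys])

# Residuals of the original fit sum to 0:
residual_sum = sum(y - a - b * x for x, y in zip(xs, ys))
```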

Example: transformed variables. The relationship between the variables is not always linear. For example, the natural model may be y = a·e^(bx). We still want to find a and b such that the curve approximates the data as well as possible.

Example (cont.) When y = a·e^(bx), then log(y) = log(a) + b·x. Use the standard formulas on the pairs (x1, log(y1)), (x2, log(y2)), ..., (xn, log(yn)). We get estimates for log(a) and b, and thus for a and b.
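The log transformation reduces this to the simple linear regression already covered. In this sketch the data are generated exactly from a = 2, b = 0.5, so the fit recovers those values:

```python
import math

# Fitting y = a * exp(b * x) by regressing log(y) on x.
a_true, b_true = 2.0, 0.5
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [a_true * math.exp(b_true * x) for x in xs]

log_ys = [math.log(y) for y in ys]
n = len(xs)
x_bar = sum(xs) / n
l_bar = sum(log_ys) / n
# Ordinary least squares on (x_i, log(y_i)):
b = sum((x - x_bar) * (l - l_bar) for x, l in zip(xs, log_ys)) \
    / sum((x - x_bar) ** 2 for x in xs)
a = math.exp(l_bar - b * x_bar)   # back-transform the intercept log(a)
```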

Another example with transformed variables. Another natural model may be y = a·x^b. We then get log(y) = log(a) + b·log(x). Use the standard formulas on the pairs (log(x1), log(y1)), (log(x2), log(y2)), ..., (log(xn), log(yn)). Note: in this model the fitted curve always goes through the origin (0, 0) (when b > 0).
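This works the same way, but now both variables are log-transformed. Here the data come exactly from a = 3, b = 2, so the estimates are recovered:

```python
import math

# Fitting y = a * x**b by regressing log(y) on log(x).
a_true, b_true = 3.0, 2.0
xs = [1.0, 2.0, 4.0, 8.0, 16.0]
ys = [a_true * x ** b_true for x in xs]

lx = [math.log(x) for x in xs]
ly = [math.log(y) for y in ys]
n = len(lx)
lx_bar, ly_bar = sum(lx) / n, sum(ly) / n
# Ordinary least squares on (log(x_i), log(y_i)):
b = sum((u - lx_bar) * (v - ly_bar) for u, v in zip(lx, ly)) \
    / sum((u - lx_bar) ** 2 for u in lx)
a = math.exp(ly_bar - b * lx_bar)
```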

Several explanatory variables. Assume our data are of the type (x11, x12, x13, y1), (x21, x22, x23, y2), .... We can try to predict or "explain" y from the x-values with a model y = a + b·x1 + c·x2 + d·x3 + error. Exactly as before we can derive formulas for a, b, c, d minimizing the sum of squares of the "errors", or residuals. Here x1, x2, x3 can be transformations of different variables, or even of the same variable. The number of explanatory variables is arbitrary, as long as the number of unknown coefficients is smaller than the number of observations.
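With several explanatory variables the least squares solution is most conveniently written with a design matrix. A sketch using NumPy's least squares solver; the responses are generated exactly (no error term) from y = 1 + 2·x1 − 1·x2 + 0.5·x3, so the fit recovers those coefficients:

```python
import numpy as np

# Multiple regression via the design matrix and least squares.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))                   # 20 observations, 3 variables
y = 1.0 + 2.0 * X[:, 0] - 1.0 * X[:, 1] + 0.5 * X[:, 2]

A = np.column_stack([np.ones(len(X)), X])      # column of ones for the intercept
coef, *_ = np.linalg.lstsq(A, y, rcond=None)   # coef = [a, b, c, d]
```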

Example: fitting a polynomial. Assume the data (x1, y1), ..., (xn, yn) seem to follow the curve of a third-degree polynomial y = a + b·x + c·x² + d·x³. We use the theory above on (x1, x1², x1³, y1), (x2, x2², x2³, y2), .... We estimate a, b, c, d, and thus a third-degree polynomial.
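Polynomial fitting is thus just multiple regression with x, x², x³ as explanatory variables. In this sketch the data come exactly from y = 1 − x + 2x² + 0.5x³, so the coefficients are recovered:

```python
import numpy as np

# Fitting a third-degree polynomial as a linear model in x, x^2, x^3.
xs = np.array([-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0])
ys = 1.0 - xs + 2.0 * xs ** 2 + 0.5 * xs ** 3

# Design matrix with columns 1, x, x^2, x^3:
A = np.column_stack([np.ones_like(xs), xs, xs ** 2, xs ** 3])
coef, *_ = np.linalg.lstsq(A, ys, rcond=None)  # coef = [a, b, c, d]
```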

Regression as a linear model. Responses are modelled as a linear function of "explanatory variables", with unknown coefficients, plus a random error. The random error is normally distributed with zero mean and a fixed variance for all observations. With these assumptions, values for the unknown coefficients can be estimated using least squares, and we can find an estimate for the variance. This can be used to obtain confidence intervals, and confidence regions, for the unknown coefficients.