HRP 223 – 2007 lm.ppt - Linear Models Copyright © 1999-2007 Leland Stanford Junior University. All rights reserved. Warning: This presentation is protected.


HRP 223 – 2007 lm.ppt - Linear Models. Copyright © Leland Stanford Junior University. All rights reserved. Warning: This presentation is protected by copyright law and international treaties. Unauthorized reproduction of this presentation, or any portion of it, may result in severe civil and criminal penalties and will be prosecuted to the maximum extent possible under the law.

ANOVA as a Model

- At this point, you have seen how you can build a model to describe the mean level of an outcome variable when the predictor is categorical.
  - You predict the outcome at a baseline level and then add something (a constant) if you are not in the baseline group.
- After the prediction is made, you are left with some unexplained variability. This extra variability is assumed to be approximately normally distributed, with the peak of the bell-shaped curve centered on the mean you guessed.

ANOVA as a Model

- The model can be written as:

  y = α + β₁x₁ + β₂x₂ + … + ε

- or, to keep it simpler, with just the baseline and a 2nd group:

  y = α + β₁x₁ + ε

- You can think of this as a baseline amount (call it α) plus a change (call it β₁) for every one-unit change in the group membership indicator for group 1.
- Begin to visualize the data as a bell-shaped histogram centered on the mean for group 0. You then shift the histograms for the other groups to the right or left by the amount specified by the β value.
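The baseline-plus-shift idea can be made concrete with a short sketch. The course itself uses SAS; this is a minimal Python illustration with made-up outcome values, not the class data.

```python
# A two-group ANOVA written as a regression model:
#   y = alpha + beta1 * x1,  where x1 is 0 for the baseline group, 1 otherwise.
# The data values below are hypothetical illustration values.
import statistics

baseline = [10.1, 9.8, 10.4, 9.9]    # outcomes for group 0 (x1 = 0)
group1   = [12.2, 11.9, 12.5, 12.0]  # outcomes for group 1 (x1 = 1)

alpha = statistics.mean(baseline)         # baseline mean
beta1 = statistics.mean(group1) - alpha   # shift added for group 1

def predicted_mean(x1):
    """Best guess at the outcome mean for a 0/1 group indicator."""
    return alpha + beta1 * x1

print(round(predicted_mean(0), 2))  # 10.05
print(round(predicted_mean(1), 2))  # 12.15
```

Note that the fitted β₁ is exactly the difference between the two group means, which is why a two-group ANOVA and this regression give the same answer.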

From the Last Slide

- "You can think of this as being a baseline amount (call it α) plus a change (call it β₁) for every one-unit change in the group membership indicator (for group 1)."
- Instead of a binary group membership indicator, put in a predictor variable that can take on any integer value between 0 and 10. What happens?
  - You shift the bell-shaped curve up (or down if the β is negative).

Regression!

- If you change the predictor to allow values of 0 to 10, the formula is just as simple.
- Conceptually, you scoot the histogram up a bit for every one-unit increase in the predictor.
- Remember high school? y = mx + b.

Continuous Predictors

- If you allow your predictor to take on any value, and you are comfortable saying you are moving a bell-shaped distribution up or down, you can model the outcome with a line!
- Again, the idea is that you are just shifting your best guess at the outcome mean up by some amount (the β) for every one-unit increase in the predictor.

Mortality Rates

- Say you want to look at the relationship between mortality caused by malignant melanoma and exposure to sun (as measured by the proxy of latitude).
- The outcome is mortality, so you will be shifting the distribution of mortality down as latitude increases (moving north).

Plot First, of Course

- A scatter plot shows the relationship between two measures on the same subject. The outcome goes on the y axis.

A Line?

- There is something like a linear relationship here. You can easily ask SAS to overlay its best guess at a line.

Think About That Line

- If the best guess at the mean of the outcome does not need to be shifted up or down as the predictor changes, what will the line look like?
  - FLAT.
  - Your best guess at the outcome is just some baseline amount.

Therefore…

- The test for the impact of a predictor in a linear model becomes a test of whether the β is close enough to 0 to call it "zero slope".

That Line

- The formulas to get the line are really easy. You just solve two simultaneous equations, for which there is a closed-form solution.
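The closed-form solution can be sketched in a few lines of Python (the course uses SAS; the latitude/mortality-style numbers below are hypothetical illustration values, not the class data set):

```python
# Closed-form least-squares line. Solving the two normal equations gives:
#   slope:     b1 = sum((x - xbar) * (y - ybar)) / sum((x - xbar) ** 2)
#   intercept: b0 = ybar - b1 * xbar
# The (x, y) pairs below are hypothetical latitude/mortality-style values.
xs = [33.0, 35.0, 39.0, 42.0, 45.0, 47.0]
ys = [219.0, 199.0, 177.0, 141.0, 119.0, 116.0]

xbar = sum(xs) / len(xs)
ybar = sum(ys) / len(ys)

b1 = (sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
      / sum((x - xbar) ** 2 for x in xs))
b0 = ybar - b1 * xbar

print(b1, b0)  # negative slope: mortality falls as latitude rises
```

Two properties of the least-squares line are worth remembering: it always passes through the point (x̄, ȳ), and its residuals sum to zero.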

What's Going On?

- If you don't like math, put a tack on the plot at the mean of the predictor and the mean of the outcome. Then put a ruler on the plot (touching the tack) and wiggle the ruler around until it is as close as possible to all the data points.

Minimizing Errors

Residuals

- What you are doing unconsciously when you wiggle the ruler around is minimizing the errors between the line and the dots (measured up and down). These errors are called residuals.
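A residual is simply the observed value minus the value the line predicts. A tiny sketch (the line coefficients here are arbitrary illustration values, not a real fit):

```python
# Residuals: the up-and-down errors between each point and the line.
# The line y = 471.4 - 7.7 * x is an arbitrary illustration, not a fitted model.
b0, b1 = 471.4, -7.7
points = [(33.0, 219.0), (39.0, 177.0), (45.0, 119.0)]  # hypothetical (x, y) pairs

residuals = [y - (b0 + b1 * x) for x, y in points]
print([round(r, 1) for r in residuals])  # [1.7, 5.9, -5.9]
```

A positive residual means the point sits above the line; a negative one means it sits below.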

Guess How You Measure Error

- Just like every other time you have seen measurements of error, it is expressed as a variance. Model fitting is just the process of making the line as compatible as possible with the data.

Quality of an ANOVA Model

- Remember that the ANOVA model compares the variance across groups with the variance within groups.
- Essentially, it asks: do you reduce the variance significantly by using a different mean line for each subgroup of the data, relative to using a single overall mean?

Quality of a Regression Model

- Here you are testing to see if the variance is reduced significantly by using a sloped line rather than a flat one.
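The sloped-versus-flat comparison can be sketched directly: compare the squared error left by the overall mean (a flat line) with the squared error left by the least-squares line. The data are the same hypothetical values as before.

```python
# Flat line (overall mean) vs. sloped least-squares line, hypothetical data.
xs = [33.0, 35.0, 39.0, 42.0, 45.0, 47.0]
ys = [219.0, 199.0, 177.0, 141.0, 119.0, 116.0]

xbar, ybar = sum(xs) / len(xs), sum(ys) / len(ys)
b1 = (sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
      / sum((x - xbar) ** 2 for x in xs))
b0 = ybar - b1 * xbar

ss_flat   = sum((y - ybar) ** 2 for y in ys)                       # total SS
ss_sloped = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))  # error SS

r_squared = 1 - ss_sloped / ss_flat  # fraction of variance the line explains
print(round(r_squared, 3))
```

If the sloped line barely beats the flat one, r_squared is near 0 and the slope is probably indistinguishable from zero.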

If You Like Math…

- The SAS Enterprise Guide project on the class website has a data file called parts, which shows how the totals accumulate for the Σ notation: the squared differences, with a running total.


Hypothesis Testing

- The test of the slope can be thought of as a t statistic:

  t = β̂₁ / SE(β̂₁)

- For me it is more intuitive to look at it with an ANOVA table.
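The slope test can be sketched with the usual textbook formula, SE(b₁) = sqrt(MSE / Sxx) with MSE = SSE / (n − 2). The data are hypothetical, the same style as the latitude/mortality example.

```python
import math

# t statistic for the slope: t = b1 / SE(b1), where
#   SE(b1) = sqrt(MSE / Sxx)  and  MSE = SSE / (n - 2).
# Hypothetical latitude/mortality-style data.
xs = [33.0, 35.0, 39.0, 42.0, 45.0, 47.0]
ys = [219.0, 199.0, 177.0, 141.0, 119.0, 116.0]
n = len(xs)

xbar, ybar = sum(xs) / n, sum(ys) / n
sxx = sum((x - xbar) ** 2 for x in xs)
b1 = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / sxx
b0 = ybar - b1 * xbar

sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
mse = sse / (n - 2)               # mean squared error (residual variance)
t = b1 / math.sqrt(mse / sxx)     # compare to a t distribution with n - 2 df
print(round(t, 2))
```

For simple linear regression this t statistic and the ANOVA F test are equivalent: F = t², with the same p-value.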

Hypothesis Testing

- You partition the sum of squares (SS) between each data point and the overall mean into two parts:
  - the SS between the regression line and the overall mean, and
  - the SS between each point and the regression line.

Partitioning the Variance

- Total SS: Σ = 53,637.3
- Regression (model) SS: Σ = 36,464.2
- Error (residual) SS: Σ = 17,173.1
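Assuming the first sum is the total and the other two are the model and error pieces (the arithmetic supports this reading), the three totals obey the partition from the previous slide:

```python
# Checking the slide's sums against SS(total) = SS(model) + SS(error).
ss_total, ss_model, ss_error = 53637.3, 36464.2, 17173.1

gap = ss_total - (ss_model + ss_error)
print(abs(gap) < 0.05)                # True: the two pieces add back to the total
print(round(ss_model / ss_total, 2))  # share of variance the line explains (R^2)
```

The model piece is the larger of the two, which is exactly the "variance reduced significantly by a sloped line" situation described earlier.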