Simple Linear Regression

Correlation and Simple Linear Regression
© Scott Evans, Ph.D.

Correlation
- Used to assess the relationship between two continuous variables. Example: weight and number of hours of exercise per week.
- Scatterplots are useful graphical tools for visualizing the relationship between two continuous variables: a plot of y vs. x.
- Plots are generally recommended: they are informative, easy to understand, and cost nothing in a statistical sense.

Correlation Coefficient
- Ranges from −1 (perfect negative correlation) to 1 (perfect positive correlation).
- A correlation of −1 means that all of the data lie on a straight line with negative (but otherwise unknown) slope.
- A correlation of 1 means that all of the data lie on a straight line with positive (but otherwise unknown) slope.
- A correlation of 0 means that the variables are not correlated; a plot of the data would reveal a cloud with no discernible pattern.

Correlation Coefficient
- ρ denotes the population correlation coefficient (a parameter).
- r denotes the sample correlation coefficient (a statistic).
- Measures the strength of a linear relationship.
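As a worked sketch (not part of the original slides), the sample correlation coefficient r can be computed directly from its definition: the sum of cross-deviations, scaled by the product of the standard deviations of x and y.

```python
import math

def pearson_r(x, y):
    """Sample correlation coefficient r: cross-deviations of x and y,
    scaled by the product of their standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# A perfect positive linear relationship gives r = 1
print(pearson_r([1, 2, 3, 4, 5], [2, 4, 6, 8, 10]))  # -> 1.0
```

Reversing the direction of the relationship (e.g., y = [10, 8, 6, 4, 2]) flips the sign of r to −1.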

Correlation Coefficient
- Hypothesis tests for ρ can be conducted using the t distribution.
- H0: ρ = 0 is often tested (though other null values are possible).
- CIs for ρ may also be obtained.
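The test of H0: ρ = 0 mentioned above uses the standard t statistic t = r·√(n−2)/√(1−r²), compared against a t distribution with n − 2 degrees of freedom. A minimal sketch:

```python
import math

def t_stat_for_r(r, n):
    """t statistic for H0: rho = 0, based on a sample correlation r
    from n pairs; compare to a t distribution with n - 2 df."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# With r = 0.8 from n = 5 pairs:
t = t_stat_for_r(0.8, 5)
print(round(t, 3))  # -> 2.309
```

Obtaining the p-value then only requires the t distribution's tail probability (e.g., via `scipy.stats`, which also reports this p-value directly from `pearsonr`).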

Correlation Coefficient
Two major correlation coefficients:
- Pearson correlation: parametric; sensitive to extreme observations.
- Spearman correlation: nonparametric; based on ranks; robust to extreme observations, and thus recommended particularly when "outliers" are present.
- I use Spearman almost exclusively.
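The robustness contrast can be demonstrated concretely. Since the Spearman correlation is the Pearson correlation computed on ranks, one extreme observation weakens Pearson but leaves Spearman untouched as long as the ordering is monotone (a sketch; `ranks` assumes no ties for simplicity):

```python
import math

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ranks(v):
    """Ranks 1..n (no ties handled, for simplicity)."""
    order = sorted(range(len(v)), key=lambda i: v[i])
    r = [0] * len(v)
    for rank, i in enumerate(order, start=1):
        r[i] = rank
    return r

def spearman_r(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson_r(ranks(x), ranks(y))

# One extreme x value weakens Pearson but not Spearman,
# because the ranking is still perfectly monotone.
x = [1, 2, 3, 4, 100]
y = [1, 2, 3, 4, 5]
print(spearman_r(x, y))  # -> 1.0
print(pearson_r(x, y))   # noticeably below 1
```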

Correlation
Warnings:
- Correlation does NOT necessarily imply causation.
- Correlation does NOT measure the magnitude of the regression slope.

Simple Linear Regression
- Used to describe and estimate the relationship between two continuous variables.
- We attempt to characterize this relationship with two parameters: (1) an intercept and (2) a slope.
- Has several assumptions.

Simple Linear Regression
The model: yi = β0 + β1xi
- yi is the dependent variable.
- xi is the independent variable (predictor).
- β0 is the y-intercept.
- β1 is the slope of the line: it describes how steep the line is and which way it leans. It is also the "effect" (e.g., a treatment effect) on y of a 1-unit change in x.
- Note that if there were no association between x and y, then β1 would be close to 0 (i.e., x has no effect on y).
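The interpretation of the slope as the "effect" of a 1-unit change in x can be verified mechanically. Using hypothetical coefficients (chosen only for illustration), the predicted change in y for a 1-unit change in x is exactly β1, no matter where on the line we start:

```python
# Hypothetical coefficients, purely for illustration.
b0, b1 = 2.0, 0.5  # intercept and slope

def predict(x):
    """Predicted y under the fitted line y = b0 + b1 * x."""
    return b0 + b1 * x

# A 1-unit change in x changes the prediction by exactly b1.
print(predict(11) - predict(10))  # -> 0.5
```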

Simple Linear Regression
- Plot the data first: scatterplots.
- This enables you to see the relationship between the two variables.
- If the relationship is non-linear (as evidenced by the plot), then a simple linear regression model is the wrong approach.

Simple Linear Regression
- Uses the method of least squares: identifies the line that minimizes the sum of squared deviations between observed and predicted values.
- Produces an ANOVA table with an F test. Note that F simplifies to t when the numerator d.f. = 1.
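For simple linear regression the least-squares line has a closed form: β1 = Sxy/Sxx and β0 = ȳ − β1·x̄. A minimal sketch using those formulas:

```python
def least_squares(x, y):
    """Closed-form least-squares estimates for y = b0 + b1 * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    b1 = sxy / sxx
    b0 = my - b1 * mx
    return b0, b1

# Data generated exactly by the line y = 1 + 2x:
b0, b1 = least_squares([1, 2, 3, 4], [3, 5, 7, 9])
print(b0, b1)  # -> 1.0 2.0
```

In practice a statistical package reports these estimates along with the ANOVA table and F (equivalently, t) test described above.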

Simple Linear Regression
- Can be used to estimate the correlation between x and y.
- Hypothesis tests and CIs may be obtained for β1.
- One may make predictions of the effect of changes in x on y.

Simple Linear Regression
- We want the model to capture all of the structure in the data (the systematic component), leaving only random errors.
- Thus, if we plotted the errors, we would hope that no discernible pattern exists.
- If a pattern exists, then we have not captured all of the systematic structure in the data, and we should look for a better model.
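To see what a residual pattern looks like, consider fitting a straight line to data that are actually quadratic (a sketch; `least_squares` here is the usual closed-form simple-regression fit):

```python
def least_squares(x, y):
    """Closed-form least-squares fit for y = b0 + b1 * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    b1 = sxy / sxx
    return my - b1 * mx, b1

# Fit a straight line to data generated by y = x^2.
x = [0, 1, 2, 3, 4]
y = [xi ** 2 for xi in x]
b0, b1 = least_squares(x, y)
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
print(residuals)  # U-shaped: positive at the ends, negative in the middle
```

The U-shaped residual pattern is exactly the signal that systematic structure (here, curvature) remains and a better model is needed.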

Multiple Regression
- Can incorporate (and control for) many variables.
- A single (continuous) dependent variable.
- Multiple independent variables (predictors); these may be of any scale (continuous, nominal, or ordinal).
- Outcome = function of many variables (e.g., sex, age, race, smoking status, exercise, education, treatment, genetic factors, etc.).

Multiple Regression
- Multiple regression can estimate the effect of each of these variables while controlling for (adjusting for) the effects of the other (potentially confounding) variables in the model.
- Confounding occurs when the effect of a variable of interest is distorted because you did not control for the effect of another, "confounding" variable.
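Mechanically, the multiple-regression estimates come from solving the normal equations (XᵀX)b = Xᵀy. The sketch below (illustrative only; a real analysis would use a statistics package) solves them with plain Gaussian elimination:

```python
def solve(A, b):
    """Gaussian elimination with partial pivoting for A x = b."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def multiple_regression(X, y):
    """Least squares via the normal equations (X^T X) b = X^T y.
    X is a list of predictor rows; a leading 1 is added for the intercept."""
    Z = [[1.0] + list(row) for row in X]
    p = len(Z[0])
    XtX = [[sum(Z[i][a] * Z[i][b] for i in range(len(Z))) for b in range(p)]
           for a in range(p)]
    Xty = [sum(Z[i][a] * y[i] for i in range(len(Z))) for a in range(p)]
    return solve(XtX, Xty)

# Data generated exactly by y = 1 + 2*x1 + 3*x2:
X = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)]
y = [1, 3, 4, 6, 8]
coefs = multiple_regression(X, y)
print([round(c, 6) for c in coefs])  # -> [1.0, 2.0, 3.0]
```

The coefficient on each predictor is its effect with the other predictors held fixed, which is precisely the "adjusting for" described above.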

Multiple Regression
- Interactions (effect modification) may be investigated: the effect of one variable depends on the level of another variable.
- Example: the effect of treatment may depend on whether you are male or female.
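An interaction is modeled by including the product of the two variables. With hypothetical coefficients (for illustration only), the treatment effect then differs by sex:

```python
# Hypothetical coefficients for a model with an interaction term:
# y = b0 + b1*treat + b2*female + b3*(treat * female)
b0, b1, b2, b3 = 10.0, 2.0, 1.0, 3.0

def predict(treat, female):
    """Prediction including the treat-by-sex interaction term."""
    return b0 + b1 * treat + b2 * female + b3 * treat * female

# Effect of treatment (treat: 0 -> 1) depends on the level of 'female':
effect_male = predict(1, 0) - predict(0, 0)    # b1
effect_female = predict(1, 1) - predict(0, 1)  # b1 + b3
print(effect_male, effect_female)  # -> 2.0 5.0
```

A nonzero interaction coefficient (b3 here) is exactly what "the effect of one variable depends on the level of another" means.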

Multiple Regression
- Indicator variables are created and used for categorical variables.
- Selection procedures can help a researcher choose a final model from a shopping list of potential independent variables: backwards, forwards, stepwise, and best subsets.
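Indicator (dummy) coding turns a categorical variable with k levels into k − 1 binary columns, with one level held out as the reference. A minimal sketch:

```python
def dummy_code(values, reference):
    """Encode a categorical variable as 0/1 indicator columns,
    one per non-reference level."""
    levels = [v for v in sorted(set(values)) if v != reference]
    return levels, [[1 if v == lev else 0 for lev in levels] for v in values]

levels, cols = dummy_code(["A", "B", "C", "B"], reference="A")
print(levels)  # -> ['B', 'C']
print(cols)    # -> [[0, 0], [1, 0], [0, 1], [1, 0]]
```

Each indicator's coefficient is then interpreted as the difference between that level and the reference level.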

Multiple Regression
- Models require an assessment of model adequacy and goodness of fit.
- Examination of residuals (comparison of observed vs. predicted values).

Other Regression Models
Generalized Linear Models (GLMs): a function of the dependent variable is a linear function of the covariates.
- Multiple regression (link = identity).
- Logistic regression: used when the dependent variable is binary. Very common in public health/medical studies (e.g., disease vs. no disease).
- Poisson regression: used when the dependent variable is a count.
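In logistic regression the link is the logit: log(p/(1−p)) = β0 + β1x, so predicted probabilities come from the inverse logit. A sketch with hypothetical coefficients (for illustration only):

```python
import math

def logit_inverse(z):
    """Inverse of the logit link: maps a linear predictor in
    (-inf, inf) to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical coefficients, purely for illustration.
b0, b1 = -1.0, 0.5

def prob_disease(x):
    # Linear on the logit scale: log(p / (1 - p)) = b0 + b1 * x
    return logit_inverse(b0 + b1 * x)

print(logit_inverse(0.0))   # -> 0.5
print(prob_disease(2))      # linear predictor 0 -> probability 0.5
```

Poisson regression follows the same pattern with a log link: log(E[y]) = β0 + β1x, so predicted counts are exp(β0 + β1x).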

Other Regression Models
- (Cox) Proportional Hazards (PH) regression: used when the dependent variable is an "event time" with censoring.