Nazmus Saquib, PhD Head of Research Sulaiman AlRajhi Colleges


Linear Regression
Nazmus Saquib, PhD, Head of Research, Sulaiman AlRajhi Colleges

Choice of Statistical Test

Continuous outcome (e.g., body mass index, blood pressure)
- Independent observations: t-test, ANOVA, linear correlation, linear regression
- Correlated observations: paired t-test, repeated-measures ANOVA, mixed models/GEE modeling
- Assumptions: outcome is normally distributed (important for small samples); outcome and predictor have a linear relationship

Binary or categorical outcome (e.g., fracture yes/no)
- Independent observations: difference in proportions, relative risks, chi-square test, logistic regression
- Correlated observations: McNemar's test, conditional logistic regression, GEE modeling
- Assumptions: the chi-square test assumes sufficient numbers in each cell (at least 5)

Time-to-event outcome (e.g., time to fracture)
- Independent observations: Kaplan-Meier statistics, Cox regression
- Correlated observations: n/a
- Assumptions: Cox regression assumes proportional hazards between groups

Tests for Continuous Outcomes
Outcome variable: continuous (e.g., body mass index, blood pressure)

Independent observations:
- T-test: compares means between two independent groups
- ANOVA: compares means among more than two independent groups
- Pearson's correlation coefficient (linear correlation): measures linear correlation between two continuous variables
- Linear regression: multivariable regression technique used when the outcome is continuous; gives slopes

Correlated observations:
- Paired t-test: compares means between two related groups (e.g., the same subjects before and after)
- Repeated-measures ANOVA: compares changes over time in the means of two or more groups (repeated measurements)
- Mixed models/GEE modeling: multivariable regression techniques to compare changes over time between two or more groups; give the rate of change over time

Non-parametric alternatives (if the normality assumption is violated and the sample is small):
- Wilcoxon signed-rank test: alternative to the paired t-test
- Wilcoxon rank-sum test (= Mann-Whitney U test): alternative to the t-test
- Kruskal-Wallis test: alternative to ANOVA
- Spearman rank correlation coefficient: alternative to Pearson's correlation coefficient
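As a minimal sketch of the parametric/non-parametric pairing above, the following compares two independent groups of a continuous outcome with a t-test and with its non-parametric alternative, the Mann-Whitney U test (the group values are invented for illustration only):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Two hypothetical independent groups of a continuous outcome (e.g., BMI)
group_a = rng.normal(25, 3, size=40)
group_b = rng.normal(27, 3, size=40)

# Parametric: compares the means of two independent groups
t_stat, t_p = stats.ttest_ind(group_a, group_b)

# Non-parametric alternative when normality is doubtful
u_stat, u_p = stats.mannwhitneyu(group_a, group_b)
```

With small samples and a skewed outcome, the two tests can disagree; the non-parametric p-value is usually the safer one to report.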

Correlation vs. Regression
- Correlation: assesses the strength of the relationship only
- Regression: assesses the relationship, finds the best-fitting line, and allows prediction/estimation
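The distinction can be seen concretely: correlation returns only a strength measure (r), while regression also returns a fitted line that can be used to predict new values. A sketch on toy data (values invented for illustration):

```python
import numpy as np
from scipy import stats

# Toy data (illustrative values only)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.1, 5.9, 8.2, 9.8])

# Correlation: strength and direction of the linear relationship only
r, p = stats.pearsonr(x, y)

# Regression: also yields an intercept and slope usable for prediction
fit = stats.linregress(x, y)
prediction_at_6 = fit.intercept + fit.slope * 6.0
```

Note that `linregress` reports the same r (as `fit.rvalue`) that `pearsonr` gives; regression adds the fitted line on top of it.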

General Idea
Simple regression considers the relation between a single explanatory variable (X) and a response variable (Y).
- Y is also called the dependent, outcome, response, or explained variable, or the regressand.
- X is also called the independent, predictor, covariate, or explanatory variable, or the regressor.

General Idea Multiple regression simultaneously considers the influence of multiple explanatory variables on a response variable Y. The intent is to examine the independent effect of each variable while "adjusting out" the influence of potential confounders.

Regression Modeling A simple regression model (one independent variable) fits a regression line in 2- dimensional space A multiple regression model with two explanatory variables fits a regression plane in 3-dimensional space

Simple Regression Model Regression coefficients are estimated by minimizing ∑residuals² (i.e., the sum of the squared residuals) to derive this model:

ŷ = a + bx

The standard error of the regression (sY|x) is based on the squared residuals:

sY|x = √( ∑residuals² / (n − 2) )
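A minimal sketch of these estimates, computed directly from the least-squares formulas on small invented data (the variable names are illustrative only):

```python
import numpy as np

# Hypothetical data (illustrative values only)
x = np.array([3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0])
y = np.array([1.2, 1.4, 1.9, 2.0, 2.5, 2.6, 3.1])

n = len(x)
# Least-squares slope and intercept minimize the sum of squared residuals
b = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
a = y.mean() - b * x.mean()

residuals = y - (a + b * x)
# Standard error of the regression: sqrt(SSE / (n - 2))
s_yx = np.sqrt(np.sum(residuals ** 2) / (n - 2))
```

The denominator n − 2 reflects the two estimated parameters (intercept and slope).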

Multiple Regression Model Estimates of the multiple slope coefficients are derived by minimizing ∑residuals² to obtain this multiple regression model:

ŷ = a + b₁x₁ + b₂x₂

The standard error of the regression is again based on ∑residuals²:

sY|x = √( ∑residuals² / (n − k − 1) ), where k is the number of explanatory variables
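The same minimization with two explanatory variables can be sketched with a design matrix and ordinary least squares (data values are invented for illustration):

```python
import numpy as np

# Hypothetical data with two explanatory variables (illustrative values only)
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
x2 = np.array([0.0, 1.0, 0.0, 1.0, 0.0, 1.0])
y  = np.array([2.1, 3.9, 4.2, 6.1, 6.0, 8.2])

# Design matrix with an intercept column; lstsq minimizes the sum of squared residuals
X = np.column_stack([np.ones_like(x1), x1, x2])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
a, b1, b2 = coef

residuals = y - X @ coef
n, k = len(y), 2
# Standard error of the regression: sqrt(SSE / (n - k - 1))
s = np.sqrt(np.sum(residuals ** 2) / (n - k - 1))
```

With an intercept in the model, the residuals sum to (numerically) zero, which is a quick sanity check on the fit.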

Multiple Regression Model
- The intercept α predicts where the regression plane crosses the Y axis.
- The slope for variable X1 (β1) predicts the change in Y per unit change in X1, holding X2 constant.
- The slope for variable X2 (β2) predicts the change in Y per unit change in X2, holding X1 constant.

Multiple Regression Model A multiple regression model with k independent variables fits a regression "surface" in (k + 1)-dimensional space, which cannot be visualized when k > 2.

Understanding LR Regression is the attempt to explain the variation in a dependent variable using the variation in independent variables. Regression quantifies association; on its own it does not establish causation. If the independent variable(s) sufficiently explain the variation in the dependent variable, the model can be used for prediction.

Understanding LR The output of a regression is a function that predicts the dependent variable based upon values of the independent variables. Simple regression fits a straight line to the data.

Understanding LR The function will make a prediction for each observed data point. The observation is denoted by y and the prediction is denoted by ŷ (y-hat).


Understanding LR A least squares regression selects the line with the lowest total sum of squared prediction errors. This value is called the Sum of Squares of Error, or SSE.

Understanding LR The Sum of Squares Regression (SSR) is the sum of the squared differences between the prediction for each observation and the mean of the observed values (ȳ).


Understanding LR The proportion of total variation (SST) that is explained by the regression (SSR) is known as the Coefficient of Determination, R². The value of R² can range between 0 and 1; the higher its value, the more of the variation in the outcome the model explains. It is often expressed as a percentage.
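The SSE/SSR/SST decomposition above can be verified directly on toy data (values invented for illustration); with an intercept in the model, SSE + SSR = SST holds exactly:

```python
import numpy as np

# Toy data (illustrative values only); fit a simple least-squares line
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.1, 5.9, 8.2, 9.8])

b = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
a = y.mean() - b * x.mean()
y_hat = a + b * x

sse = np.sum((y - y_hat) ** 2)         # unexplained variation (prediction errors)
ssr = np.sum((y_hat - y.mean()) ** 2)  # variation explained by the line
sst = np.sum((y - y.mean()) ** 2)      # total variation

r_squared = ssr / sst  # coefficient of determination
```

Equivalently, R² = 1 − SSE/SST, which is the form most software reports.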

LR Interpretation The slope coefficient for SMOKE is −0.206, suggesting that smokers have, on average, 0.206 units lower FEV than non-smokers (after adjusting for age). The slope coefficient for AGE is 0.231, suggesting that each additional year of age is associated with an average increase of 0.231 FEV units (after adjusting for SMOKE).
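A hedged sketch of how such a model would be fit; the FEV, AGE, and SMOKE values below are invented for illustration and will not reproduce the slide's coefficients:

```python
import numpy as np

# Hypothetical FEV-style data (all values invented for illustration only)
age   = np.array([9, 10, 11, 12, 13, 14, 15, 9, 11, 13], dtype=float)
smoke = np.array([0, 0, 0, 1, 1, 1, 1, 0, 1, 0], dtype=float)  # 1 = smoker
fev   = np.array([2.0, 2.3, 2.5, 2.4, 2.7, 2.9, 3.0, 2.1, 2.5, 2.9])

# Ordinary least squares with both predictors plus an intercept
X = np.column_stack([np.ones_like(age), smoke, age])
(a, b_smoke, b_age), *_ = np.linalg.lstsq(X, fev, rcond=None)

# b_smoke: mean FEV difference for smokers vs non-smokers at the same age
# b_age:   mean FEV change per additional year of age, holding smoking status fixed
residuals = fev - X @ np.array([a, b_smoke, b_age])
```

Each coefficient is read as the adjusted effect of its variable with the other held constant, exactly as in the interpretation above.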

LR Assumptions: Linearity

LR Assumptions: The outcome variable is continuous (e.g., weight)

LR Assumptions: Zero mean error

LR Assumptions: Equal variance

LR Assumptions: Uncorrelated Errors

LR Presentation in Journals