Introduction to Biostatistics and Bioinformatics Regression and Correlation.

Presentation transcript:

Introduction to Biostatistics and Bioinformatics Regression and Correlation

Learning Objectives
Regression – estimation of the relationship between variables
  Linear regression
  Assessing the assumptions
  Non-linear regression
Correlation
  Correlation coefficient quantifies the association strength
  Sensitivity to the distribution

Relationships (figure panels: Relationship; No Relationship)

Relationships (figure panels: Linear Relationships; Non-Linear Relationship)

Relationships (figure panels: Linear, Strong; Linear, Weak)

Linear Regression (figure panels: Linear, Strong; Linear, Weak; Non-Linear)

Linear Regression - Residuals (figure panels: Linear, Strong; Linear, Weak; Non-Linear; Residuals)

Linear Regression Model (equation labels: Dependent Variable; Intercept; Slope; Independent Variable; Random Error; Linear component; Random Error component)
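The labels on this slide match the usual simple linear regression model. The slide's equation image did not survive, so the symbols below are an assumption (conventional textbook notation), not necessarily those used in the lecture:

```latex
Y_i = \underbrace{\beta_0 + \beta_1 X_i}_{\text{linear component}}
      + \underbrace{\varepsilon_i}_{\text{random error component}}
```

Here Y_i is the dependent variable, X_i the independent variable, \beta_0 the intercept, \beta_1 the slope, and \varepsilon_i the random error.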

Linear Regression Assumptions
The relationship between the variables is linear.
Errors are independent and normally distributed, with mean zero and constant variance.

Linear Regression Assumptions (figure panels: Linear; Non-Linear; Residuals)

Linear Regression Assumptions (figure panels: Constant Variance; Variable Variance; Residuals)
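The residual plots on these slides are a visual check of the assumptions. A minimal numerical sketch of the same idea follows, using synthetic data. The Shapiro-Wilk test and the split-half comparison of residual spread are illustrative choices made here, not methods named in the lecture.

```python
import numpy as np
from scipy import stats

# Synthetic data for illustration only: linear trend plus Gaussian noise.
rng = np.random.default_rng(1)
x = np.linspace(-1, 1, 100)
y = x + 0.2 * rng.normal(size=x.size)

# Fit the line and compute residuals (observed minus fitted values).
fit = stats.linregress(x, y)
residuals = y - (fit.slope * x + fit.intercept)

# Mean of the residuals: zero by construction for a least-squares fit
# that includes an intercept.
residual_mean = residuals.mean()

# Normality of the residuals (Shapiro-Wilk; a small p-value suggests non-normality).
shapiro_stat, shapiro_p = stats.shapiro(residuals)

# Crude constant-variance check: compare residual spread in the lower
# and upper halves of the x range.
half = x.size // 2
spread_low = residuals[:half].std(ddof=1)
spread_high = residuals[half:].std(ddof=1)

print(f"mean={residual_mean:.2e}, Shapiro p={shapiro_p:.3f}")
print(f"spread (low x)={spread_low:.3f}, spread (high x)={spread_high:.3f}")
```

A systematic curve in the residuals, or a spread that grows with x, would point to a violated linearity or constant-variance assumption.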

Linear Regression – Estimating the Line (equation labels: Estimated Value; Estimated Intercept; Estimated Slope; Independent Variable)
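In conventional notation, the estimated line that these labels describe can be written as below. The symbols are an assumption, since the slide's own equation was lost in extraction:

```latex
\hat{Y}_i = b_0 + b_1 X_i
```

Here \hat{Y}_i is the estimated value, b_0 the estimated intercept, b_1 the estimated slope, and X_i the independent variable.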

Least Squares Method
Find the slope and intercept, given measurements (X_i, Y_i), i = 1..N, that minimize the sum of the squares of the residuals.
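For a straight line, this minimization has a well-known closed-form solution: the slope is the sum of cross-deviations divided by the sum of squared deviations of X, and the intercept follows from the means. The sketch below applies that textbook formula to synthetic data and compares it with `scipy.stats.linregress`. The helper name `least_squares_line` is made up for this example.

```python
import numpy as np
from scipy import stats

def least_squares_line(x, y):
    """Return (slope, intercept) minimizing the sum of squared residuals."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = x.mean(), y.mean()
    # slope = sum((x - x_mean) * (y - y_mean)) / sum((x - x_mean)^2)
    slope = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
    # The fitted line always passes through the point of means.
    intercept = y_mean - slope * x_mean
    return slope, intercept

# Synthetic data for illustration only (fixed seed for reproducibility).
rng = np.random.default_rng(0)
x = np.linspace(-1, 1, 50)
y = x + 0.1 * rng.normal(size=x.size)

slope, intercept = least_squares_line(x, y)
result = stats.linregress(x, y)
print(slope, result.slope)
print(intercept, result.intercept)
```

The two pairs of numbers should agree to floating-point precision, since `linregress` solves the same least-squares problem.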

Linear Regression in Python
    import scipy.stats as stats
    slope, intercept, r_value, p_value, std_err = stats.linregress(x, y)

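Linear Regression Example (figure panels: Linear, Strong; Residuals)
    import numpy as np
    import matplotlib.pyplot as plt
    import scipy.stats as stats
    points = 50  # assumed value; the slide does not define `points`
    # Strong linear relationship: y = x plus small Gaussian noise
    x = np.linspace(-1, 1, points)
    y = x + 0.1 * np.random.normal(size=points)
    slope, intercept, r_value, p_value, std_err = stats.linregress(x, y)
    y_line = slope * x + intercept
    # Scatter plot of the data with the fitted line
    fig, ax1 = plt.subplots(1, figsize=(4, 4))
    ax1.scatter(x, y, color='#4D0132', lw=0, s=60)
    ax1.set_xlim([-1.5, 1.5])
    ax1.set_ylim([-1.5, 1.5])
    ax1.plot(x, y_line, color='red', lw=2)
    fig.savefig('linear.png')
    # Residuals (observed minus fitted values) against x
    fig, ax1 = plt.subplots(1, figsize=(4, 4))
    ax1.scatter(x, y - y_line, color='#963725', lw=0, s=60)
    ax1.set_xlim([-1.5, 1.5])
    ax1.set_ylim([-1.5, 1.5])
    fig.savefig('linear-residuals.png')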

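Linear Regression Example (figure panels: Linear, Weak; Residuals)
    import numpy as np
    import matplotlib.pyplot as plt
    import scipy.stats as stats
    points = 50  # assumed value; the slide does not define `points`
    # Weak linear relationship: y = x plus larger Gaussian noise
    x = np.linspace(-1, 1, points)
    y = x + 0.4 * np.random.normal(size=points)
    slope, intercept, r_value, p_value, std_err = stats.linregress(x, y)
    y_line = slope * x + intercept
    # Scatter plot of the data with the fitted line
    fig, ax1 = plt.subplots(1, figsize=(4, 4))
    ax1.scatter(x, y, color='#4D0132', lw=0, s=60)
    ax1.set_xlim([-1.5, 1.5])
    ax1.set_ylim([-1.5, 1.5])
    ax1.plot(x, y_line, color='red', lw=2)
    fig.savefig('linear-weak.png')
    # Residuals (observed minus fitted values) against x
    fig, ax1 = plt.subplots(1, figsize=(4, 4))
    ax1.scatter(x, y - y_line, color='#963725', lw=0, s=60)
    ax1.set_xlim([-1.5, 1.5])
    ax1.set_ylim([-1.5, 1.5])
    fig.savefig('linear-weak-residuals.png')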

Linear Regression Example Outlier
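The slide's figure is not recoverable, but the effect it illustrates can be sketched with synthetic data: a single extreme point can pull the least-squares line well away from the trend of the remaining points. The outlier location (1.0, -5.0) below is a hypothetical value chosen for illustration.

```python
import numpy as np
from scipy import stats

# Synthetic data for illustration only.
rng = np.random.default_rng(2)
x = np.linspace(-1, 1, 30)
y = x + 0.1 * rng.normal(size=x.size)

# Fit without the outlier.
clean = stats.linregress(x, y)

# Add one hypothetical outlier far below the trend and refit.
x_out = np.append(x, 1.0)
y_out = np.append(y, -5.0)
with_outlier = stats.linregress(x_out, y_out)

print(f"slope without outlier: {clean.slope:.2f}, r = {clean.rvalue:.2f}")
print(f"slope with outlier:    {with_outlier.slope:.2f}, r = {with_outlier.rvalue:.2f}")
```

Because residuals are squared, a point far from the line contributes disproportionately to the quantity being minimized, which is why one measurement can dominate the fit.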

Regression – Non-linear data
Solution 1: Transformation
Solution 2: Non-linear Regression
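Both solutions can be sketched on synthetic exponential data. The exponential model, its parameters (a = 2, b = 1.5) and the noise level are assumptions made for this example, and the lecture may have used a different data set. Solution 1 log-transforms y so that an ordinary linear fit applies. Solution 2 fits the curve directly with `scipy.optimize.curve_fit`.

```python
import numpy as np
from scipy import stats
from scipy.optimize import curve_fit

# Synthetic data: y = a * exp(b * x) with multiplicative noise (a=2, b=1.5 assumed).
rng = np.random.default_rng(3)
x = np.linspace(0, 2, 40)
y = 2.0 * np.exp(1.5 * x) * np.exp(0.05 * rng.normal(size=x.size))

# Solution 1: transformation. log(y) = log(a) + b * x is linear in x.
log_fit = stats.linregress(x, np.log(y))
a_transformed = np.exp(log_fit.intercept)
b_transformed = log_fit.slope

# Solution 2: non-linear regression on the original scale.
def exponential(x, a, b):
    return a * np.exp(b * x)

(a_nonlinear, b_nonlinear), _ = curve_fit(exponential, x, y, p0=(1.0, 1.0))

print(f"transformation:        a={a_transformed:.2f}, b={b_transformed:.2f}")
print(f"non-linear regression: a={a_nonlinear:.2f}, b={b_nonlinear:.2f}")
```

The two approaches weight the data differently: the log transform treats relative errors equally, while the direct fit is dominated by the largest y values.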

Correlation Coefficient
A measure of the correlation between the two variables
Quantifies the association strength
Pearson correlation coefficient:
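The slide's formula did not survive extraction. The standard definition of the Pearson coefficient is the sum of products of deviations from the means, divided by the square root of the product of the two sums of squared deviations. A minimal sketch, with a hypothetical helper `pearson_r` checked against `scipy.stats.pearsonr` on synthetic data:

```python
import numpy as np
from scipy import stats

def pearson_r(x, y):
    """Pearson r = sum(dx*dy) / sqrt(sum(dx^2) * sum(dy^2)), with dx = x - mean(x)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()
    dy = y - y.mean()
    return np.sum(dx * dy) / np.sqrt(np.sum(dx ** 2) * np.sum(dy ** 2))

# Synthetic data for illustration only.
rng = np.random.default_rng(4)
x = np.linspace(-1, 1, 50)
y = x + 0.4 * rng.normal(size=x.size)

r_manual = pearson_r(x, y)
r_scipy, p_value = stats.pearsonr(x, y)
print(r_manual, r_scipy)
```

The coefficient always lies between -1 and 1, with values near the extremes indicating a strong linear association.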

Correlation Coefficient

Source: Wikipedia

Coefficient of Variation (formula labels: Variance; Sample Mean; Coefficient of Variation (CV))
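The coefficient of variation is conventionally the standard deviation (the square root of the variance) divided by the sample mean. The slide's own formula was lost, so this sketch assumes that standard definition with the sample (n - 1) variance. The measurement values are hypothetical.

```python
import numpy as np

def coefficient_of_variation(values):
    """CV = sample standard deviation / sample mean."""
    values = np.asarray(values, dtype=float)
    return values.std(ddof=1) / values.mean()

# Hypothetical repeated measurements, for illustration only.
measurements = np.array([9.8, 10.1, 10.0, 10.3, 9.8])
cv = coefficient_of_variation(measurements)
print(f"CV = {cv:.3%}")
```

Because it is a ratio, the CV is dimensionless and allows the spread of measurements on different scales to be compared.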

Correlation Coefficient and CV (figure panels: Uniform distribution; Normal distribution; Lognormal distribution)

Correlation Coefficient - Outliers Outlier

Correlation Coefficient – Non-linear
Solutions:
  Transformation
  Rank correlation (Spearman, r=0.93)
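A sketch of both solutions on noise-free synthetic data with a monotonic but non-linear relationship. The data are not the slide's (whose r = 0.93 refers to the lecture's own example). They only illustrate why rank correlation and transformation help when Pearson's coefficient understates the association.

```python
import numpy as np
from scipy import stats

# Synthetic, noise-free exponential relationship (illustration only).
x = np.linspace(0.1, 5, 50)
y = np.exp(x)

# Pearson measures linear association, so it is below 1 here.
pearson_r, _ = stats.pearsonr(x, y)

# Spearman works on ranks, so any strictly increasing relationship gives 1.
spearman_rho, _ = stats.spearmanr(x, y)

# Transformation: log(y) is exactly linear in x for this data.
log_r, _ = stats.pearsonr(x, np.log(y))

print(f"Pearson: {pearson_r:.2f}, Spearman: {spearman_rho:.2f}, Pearson on log(y): {log_r:.2f}")
```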

Correlation Coefficient and p-value
Hypothesis: Is there a correlation?
(figure annotations: r and p values)
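One way to make the hypothesis test concrete is a permutation test: shuffling y breaks any association with x, so the shuffled correlations show what values of r arise by chance alone. This is an illustrative sketch on synthetic data, not necessarily the method used in the lecture. The analytic p-value from `scipy.stats.pearsonr` is shown alongside for comparison.

```python
import numpy as np
from scipy import stats

# Synthetic data with a moderate true association (illustration only).
rng = np.random.default_rng(5)
x = rng.normal(size=30)
y = 0.5 * x + rng.normal(size=30)

# Observed correlation and the analytic two-sided p-value.
r_observed, p_analytic = stats.pearsonr(x, y)

# Null distribution of r: correlate x with randomly shuffled copies of y.
n_permutations = 2000
r_permuted = np.array([
    np.corrcoef(x, rng.permutation(y))[0, 1] for _ in range(n_permutations)
])

# Two-sided permutation p-value: the fraction of shuffles at least as extreme
# as the observed r (the +1 terms avoid reporting a p-value of exactly zero).
p_permutation = (np.sum(np.abs(r_permuted) >= abs(r_observed)) + 1) / (n_permutations + 1)

print(f"r = {r_observed:.2f}, analytic p = {p_analytic:.4f}, permutation p = {p_permutation:.4f}")
```

A small p-value says the observed r would be unusual if there were no correlation. It does not measure how strong the association is.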

Application: Analytical Measurements (figure axes: Theoretical Concentration; Measured Concentration)

A Few Characteristics of Analytical Measurements
Accuracy: Closeness of agreement between a test result and an accepted reference value.
Precision: Closeness of agreement between independent test results.
Robustness: Test precision given small, deliberate changes in test conditions (preanalytic delays, variations in storage temperature).
Lower limit of detection: The lowest amount of analyte that is statistically distinguishable from background or a negative control.
Limit of quantification: Lowest and highest concentrations of analyte that can be quantitatively determined with suitable precision and accuracy.
Linearity: The ability of the test to return values that are directly proportional to the concentration of the analyte in the sample.
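Three of these characteristics can be estimated from replicate measurements at known concentrations. The sketch below uses entirely synthetic data (the concentration levels, replicate count and 5% noise are hypothetical) and makes illustrative choices: relative bias for accuracy, the coefficient of variation for precision, and a straight-line fit for linearity.

```python
import numpy as np
from scipy import stats

# Hypothetical theoretical concentrations and synthetic replicate measurements
# (proportional response with 5% relative noise); illustration only.
rng = np.random.default_rng(6)
theoretical = np.array([1.0, 2.0, 5.0, 10.0, 20.0])
replicates = 4
noise = 0.05 * rng.normal(size=(theoretical.size, replicates))
measured = theoretical[:, None] * (1 + noise)

# Accuracy: relative bias of the mean result at each concentration level.
level_mean = measured.mean(axis=1)
relative_bias = (level_mean - theoretical) / theoretical

# Precision: coefficient of variation of the replicates at each level.
cv = measured.std(axis=1, ddof=1) / level_mean

# Linearity: regress measured on theoretical concentration over all replicates.
fit = stats.linregress(np.repeat(theoretical, replicates), measured.ravel())

print("relative bias per level:", np.round(relative_bias, 3))
print("CV per level:           ", np.round(cv, 3))
print(f"slope = {fit.slope:.3f}, intercept = {fit.intercept:.3f}, r = {fit.rvalue:.4f}")
```

For a test that is directly proportional to concentration, the slope should be close to 1 and the intercept close to 0 when both axes share the same units.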

Limit of Detection and Linearity (figure axes: Theoretical Concentration; Measured Concentration)

Precision and Accuracy (figure axes: Theoretical Concentration; Measured Concentration)

Summary - Regression Source:

Summary - Correlation

Next Lecture: Experimental Design & Analysis Experimental Design by Christine Ambrosino