What is Regression Analysis?


What is Regression Analysis?  Regression analysis is used to model the relationship between a dependent variable and one or more independent variables.

Linear Regression models the dependent variable as a linear function of one or more independent variables, with coefficients estimated by ordinary least squares.
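A minimal sketch of a simple linear regression fit, using synthetic noise-free data (an assumption for illustration) and numpy's least-squares solver:

```python
import numpy as np

# Simple linear regression y = b0 + b1*x fitted by ordinary least squares.
# The data are synthetic (an assumption): generated exactly as y = 2 + 3x.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 + 3.0 * x

X = np.column_stack([np.ones_like(x), x])     # design matrix with intercept
coef, *_ = np.linalg.lstsq(X, y, rcond=None)  # least-squares solution
b0, b1 = coef                                 # recovers intercept 2, slope 3
```

Because the data are noise-free, the fit recovers the generating coefficients exactly; with real data the estimates would only approximate them.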

Polynomial Regression When the relationship between the dependent and independent variable appears non-linear (in the slide's figure, the red curve fits the data better than the green one), we can deploy a polynomial regression model.
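A polynomial regression is still linear in its coefficients, so it can be fitted by the same least-squares machinery; a sketch with synthetic quadratic data (an assumption):

```python
import numpy as np

# Fit a degree-2 polynomial by least squares; the data are synthetic
# (an assumption): generated exactly as y = 1 + 0.5x - 2x^2.
x = np.linspace(-2.0, 2.0, 21)
y = 1.0 + 0.5 * x - 2.0 * x**2

coefs = np.polyfit(x, y, deg=2)   # returns [a2, a1, a0], highest degree first
```

In practice the degree is a modelling choice: too low underfits (the "green curve"), too high overfits the noise.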

Quantile (percentile) Regression Generally used when outliers, high skewness, or heteroscedasticity exist in the data. It aims to estimate the conditional median or other quantiles of the response variable: we estimate a chosen quantile of the dependent variable given the values of the X's.
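Quantile regression minimizes the pinball (check) loss rather than squared error. A rough sketch of median regression (tau = 0.5) via subgradient descent on synthetic data (an assumption; production code would use a linear-programming or IRLS solver):

```python
import numpy as np

# Median regression (tau = 0.5) by subgradient descent on the pinball loss.
# Synthetic noise-free data (an assumption): y = 1 + 2x.
tau = 0.5
x = np.linspace(0.0, 1.0, 50)
y = 1.0 + 2.0 * x
X = np.column_stack([np.ones_like(x), x])

b = np.zeros(2)
best_b, best_loss = b.copy(), np.inf
for t in range(20000):
    r = y - X @ b
    loss = np.mean(np.maximum(tau * r, (tau - 1.0) * r))  # pinball loss
    if loss < best_loss:
        best_loss, best_b = loss, b.copy()
    # Subgradient of the pinball loss w.r.t. predictions:
    # -tau where the residual is positive, (1 - tau) where it is negative.
    d = np.where(r >= 0, -tau, 1.0 - tau)
    b -= (1.0 / np.sqrt(t + 1.0)) * (X.T @ d) / len(y)
```

Setting `tau` to 0.9 would instead estimate the conditional 90th percentile of y given x.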

Logistic Regression The dependent variable is binary: y follows a binomial distribution and hence is not normal, and the error terms are not normally distributed.
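Because the response is binary, the model predicts a probability through the sigmoid function and is fitted by maximizing the likelihood (equivalently, minimizing the log-loss). A minimal gradient-descent sketch on toy data (an assumption):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy binary data (an assumption): class 1 whenever x > 0.
x = np.array([-3.0, -2.0, -1.0, 1.0, 2.0, 3.0])
y = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])
X = np.column_stack([np.ones_like(x), x])     # intercept + slope

b = np.zeros(2)
for _ in range(500):
    p = sigmoid(X @ b)                        # predicted probabilities
    b -= 0.1 * X.T @ (p - y) / len(y)         # gradient of the average log-loss

preds = (sigmoid(X @ b) > 0.5).astype(float)  # classify at the 0.5 threshold
```

Real implementations add regularization or stopping rules, since perfectly separable data like this toy set would otherwise drive the coefficients toward infinity.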

Cox Regression (survival analysis; the proportional hazards model) Investigates the effect of several variables on the time a specified event takes to happen, i.e., time-to-event data, e.g., the time from a first heart attack to a second. Dual targets are set for the survival model: 1. A continuous variable representing the time to the event. 2. A binary status variable representing whether the event occurred or not.

Ordinal Regression The dependent variable takes ordinal (ranked) values. Examples of ordinal variables: survey responses (1-to-6 scale), patient reaction to a drug dose (none, mild, severe). Ordinal regression can be performed using a generalized linear model (GLM) that fits both a coefficient vector and a set of thresholds to the dataset.

Poisson Regression (log-linear model) The dependent variable is count data and must meet the following conditions: 1) It has a Poisson distribution. 2) Counts cannot be negative. 3) The method is not suitable for non-whole numbers.
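The log-linear model sets E[y | x] = exp(b0 + b1·x) and is fitted by maximum likelihood. A sketch via gradient descent on the negative Poisson log-likelihood, using synthetic counts (an assumption) that double with each unit of x:

```python
import numpy as np

# Poisson regression with a log link, fitted by gradient descent on the
# negative log-likelihood. Synthetic counts (an assumption): y = 2**x,
# so the true coefficients are b0 = 0 and b1 = ln 2.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.0, 2.0, 4.0, 8.0])
X = np.column_stack([np.ones_like(x), x])

b = np.zeros(2)
for _ in range(5000):
    mu = np.exp(X @ b)                   # modelled Poisson mean
    b -= 0.02 * X.T @ (mu - y) / len(y)  # score equation: X'(mu - y) = 0 at MLE
```

Standard software fits the same model with iteratively reweighted least squares, which converges far faster than plain gradient descent.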

Negative Binomial Regression Deals with count data, but does not assume that the variance of the counts equals their mean, so it can deal with overdispersion.

Quasi-Poisson Regression An alternative to negative binomial regression, also used for overdispersed count data. Although both algorithms give similar results, they differ in estimating the effects of covariates: the variance of a quasi-Poisson model is a linear function of the mean, while the variance of a negative binomial model is a quadratic function of the mean. Quasi-Poisson regression can handle both over-dispersion and under-dispersion.

Principal Components Regression (PCR) Based on principal component analysis (PCA): calculate the principal components, then use some of these components as predictors in a linear regression model fitted by the usual least-squares procedure. Provides dimensionality reduction and removal of multicollinearity.
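The two stages (PCA, then least squares on the component scores) can be sketched with numpy. The data are synthetic (an assumption): three collinear predictors driven by two latent factors, so two components capture everything:

```python
import numpy as np

# Synthetic rank-2 data (an assumption): three collinear predictors
# built from two latent factors, and a response linear in those factors.
f1 = np.arange(8.0)
f2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 5.0, 8.0, 7.0])
X = np.column_stack([f1 + f2, f1 - f2, 2.0 * f1])
y = f1 + 2.0 * f2 + 1.0

# Stage 1: PCA on the centered predictors via SVD.
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
k = 2                                   # keep the top two components
scores = Xc @ Vt[:k].T                  # projections onto the top-k PCs

# Stage 2: ordinary least squares of (centered) y on the component scores.
beta, *_ = np.linalg.lstsq(scores, y - y.mean(), rcond=None)
y_hat = y.mean() + scores @ beta        # fitted values from the PCR model
```

Because the component scores are orthogonal by construction, the second-stage regression is free of multicollinearity.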

Partial Least Squares (PLS) Regression An alternative to principal component regression when the independent variables are highly correlated; it is also useful when there are a large number of independent variables. It finds a linear regression model by projecting the predicted variables and the observable variables into a new space.

Ridge Regression A technique for analyzing multiple regression data that suffer from multicollinearity, controlled by a regularization parameter. When multicollinearity occurs, least-squares estimates are unbiased, but their variances are large, so they may be far from the true values.
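Ridge regression has the closed form beta = (X'X + lambda·I)⁻¹X'y; the L2 penalty stabilizes the near-singular X'X matrix that multicollinearity produces. A sketch on two nearly collinear synthetic predictors (an assumption):

```python
import numpy as np

# Two nearly collinear predictors (synthetic, an assumption): the second
# column is the first plus a tiny deterministic perturbation.
x = np.linspace(0.0, 1.0, 20)
X = np.column_stack([x, x + 1e-3 * np.sin(20.0 * x)])
y = X @ np.array([1.0, 1.0])

def ridge(X, y, lam):
    """Closed-form ridge estimate: (X'X + lam*I)^{-1} X'y."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

beta_ols = ridge(X, y, 0.0)     # lam = 0 reduces to ordinary least squares
beta_ridge = ridge(X, y, 0.1)   # the penalty shrinks the coefficient vector
```

Note the intercept is omitted here for brevity; in practice one centers the data or leaves the intercept unpenalized.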

Lasso Regression (Least Absolute Shrinkage and Selection Operator) Performs both variable selection and regularization in order to enhance the prediction accuracy and interpretability of the statistical model it produces. An L1 regularization technique: it minimizes the objective function by adding a penalty term proportional to the sum of the absolute values of the coefficients.
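The L1 penalty is what makes the lasso select variables: coordinate descent with soft-thresholding drives irrelevant coefficients exactly to zero. A sketch on tiny orthogonal synthetic data (an assumption):

```python
import numpy as np

def soft_threshold(a, t):
    """Soft-thresholding operator induced by the L1 penalty."""
    return np.sign(a) * np.maximum(np.abs(a) - t, 0.0)

def lasso_cd(X, y, lam, n_iter=100):
    """Coordinate descent for (1/2n)||y - Xb||^2 + lam * ||b||_1."""
    n, p = X.shape
    b = np.zeros(p)
    for _ in range(n_iter):
        for j in range(p):
            r = y - X @ b + X[:, j] * b[j]   # residual excluding feature j
            rho = X[:, j] @ r / n
            b[j] = soft_threshold(rho, lam) / (X[:, j] @ X[:, j] / n)
    return b

# Synthetic data (an assumption): y depends only on the first column,
# so the lasso should zero out the second coefficient entirely.
X = np.array([[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]])
y = np.array([2.0, 2.0, -2.0, -2.0])
b = lasso_cd(X, y, lam=0.1)
```

The surviving coefficient is shrunk from 2 to 1.9 by the penalty, the price paid for the exact zero on the irrelevant feature.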

Elastic Net Regression A regularized regression method that linearly combines the L1 and L2 penalties of the lasso and ridge methods. Elastic net regression is preferred over both ridge and lasso when dealing with highly correlated independent variables.
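The combined penalty only changes the coordinate-descent update slightly: the L2 part inflates the denominator while the L1 part still soft-thresholds. A sketch with a glmnet-style mixing parameter `alpha` (1 = pure lasso, 0 = pure ridge), on the same kind of toy orthogonal data (an assumption):

```python
import numpy as np

def soft_threshold(a, t):
    return np.sign(a) * np.maximum(np.abs(a) - t, 0.0)

def elastic_net_cd(X, y, lam, alpha, n_iter=100):
    """Coordinate descent for
    (1/2n)||y - Xb||^2 + lam * (alpha*||b||_1 + (1-alpha)/2 * ||b||_2^2)."""
    n, p = X.shape
    b = np.zeros(p)
    for _ in range(n_iter):
        for j in range(p):
            r = y - X @ b + X[:, j] * b[j]          # residual excluding j
            rho = X[:, j] @ r / n
            denom = X[:, j] @ X[:, j] / n + lam * (1.0 - alpha)
            b[j] = soft_threshold(rho, lam * alpha) / denom
    return b

# Synthetic data (an assumption): y depends only on the first column.
X = np.array([[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]])
y = np.array([2.0, 2.0, -2.0, -2.0])
b = elastic_net_cd(X, y, lam=0.1, alpha=0.5)
```

With correlated predictors the L2 term encourages the lasso-like selection to keep groups of correlated variables together rather than arbitrarily picking one.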

Support Vector Regression / Support Vector Machines Can solve both linear and non-linear models. Non-parametric: uses non-linear kernel functions (such as polynomial kernels) to find the optimal solution for non-linear models.