Alcohol consumption and HDI story

Presentation transcript:

Alcohol consumption and HDI story

Country   Total   Beer   Wine   Spirits   Other   HDI     Lifetime span
Austria   13.24   6.7    4.1    1.6       0.4     0.755   80.119
Finland   12.52   4.59   2.24   2.82      0.31    0.800   79.724
Poland    13.25   4.72   3.26   1.56      0       0.715   75.976
Russia    15.76   3.65   0.1    6.88      0.34    0.644   67.260
Uganda    11.93   0.51   0      0.18      14.52   0.453   53.261

The Human Development Index (HDI) is a composite statistic of life expectancy, education, and income.

What is a CORRELATION
Correlation: a statistical procedure to measure and describe the relationship between two variables.

What is a CORRELATION
Do two variables covary?
Are two variables dependent or independent of one another?
Can one variable be predicted from another?

The world is full of COVARIATION

The IQ and brain size

Pearson's product-moment coefficient
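The transcript keeps only the title of this slide. As a minimal sketch of the coefficient it names, the function below computes Pearson's r as the sum of cross-products of deviations divided by the square root of the product of the two sums of squared deviations. The function name `pearson_r` and the study-hours data are hypothetical and are not taken from the slides.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations (numerator of r).
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations (under the square root in the denominator).
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: hours studied and test score for five students.
hours = [1, 2, 3, 4, 5]
score = [52, 55, 61, 64, 70]
r = pearson_r(hours, score)
print(round(r, 3))
```

The result always lies between -1 and +1; the sign gives the direction of the relationship and the absolute value its strength.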

Interpretation
.0 to .2    No relationship to very weak association
.2 to .4    Weak association
.4 to .6    Moderate association
.6 to .8    Strong association
.8 to 1.0   Very strong to perfect association
CAUTION!!! Test the null

Testing H0
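The slide body is not preserved, so the following is only a sketch of one common way to test H0 (no linear relationship in the population): convert r to a t statistic with n - 2 degrees of freedom and compare it with a tabled critical value. It uses the Total and HDI columns of the five-country table from the opening slide, with decimal commas written as points. The helper `pearson_r` and the variable names are assumptions introduced for illustration.

```python
import math

def pearson_r(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Austria, Finland, Poland, Russia, Uganda (values from the opening slide).
total_alcohol = [13.24, 12.52, 13.25, 15.76, 11.93]
hdi = [0.755, 0.800, 0.715, 0.644, 0.453]

n = len(hdi)
r = pearson_r(total_alcohol, hdi)

# t statistic for H0: rho = 0, with n - 2 degrees of freedom.
t = r * math.sqrt((n - 2) / (1 - r ** 2))

# Two-tailed critical t for df = 3 at alpha = .05, taken from a standard t table.
T_CRIT = 3.182
reject_h0 = abs(t) > T_CRIT

print(f"r = {r:.3f}, t = {t:.3f}, reject H0: {reject_h0}")
```

With only five countries, r is small and |t| stays well below the critical value, so H0 is not rejected on these data.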

Alcohol consumption and HDI story

Correlation and causation

Illogically inferring causation from correlation

B causes A (reverse causation)
  The more firemen fighting a fire, the bigger the fire is observed to be. Therefore firemen cause an increase in the size of a fire.
A causes B and B causes A (bidirectional causation)
  Increased pressure is associated with increased temperature. Therefore pressure causes temperature.
Third factor C (the common-causal variable) causes both A and B
  Sleeping with one's shoes on is strongly correlated with waking up with a headache. Therefore, sleeping with one's shoes on causes headache.
Coincidence
  With a decrease in the wearing of hats, there has been an increase in global warming over the same period. Therefore, global warming is caused by people abandoning the practice of wearing hats.
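The third-factor case can be illustrated with a small simulation. This is a sketch on made-up data, not an analysis from the slides: A and B are each generated from a common cause C plus independent noise, so neither influences the other, yet they correlate. All names and parameters are hypothetical.

```python
import math
import random

def pearson_r(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

random.seed(0)  # fixed seed so the sketch is reproducible
n = 2000
c = [random.gauss(0, 1) for _ in range(n)]   # common cause C
a = [ci + random.gauss(0, 1) for ci in c]    # A depends on C only
b = [ci + random.gauss(0, 1) for ci in c]    # B depends on C only

r_ab = pearson_r(a, b)

# Subtract C's contribution (its weight is 1 by construction) and
# correlate what is left of A and B.
a_rest = [ai - ci for ai, ci in zip(a, c)]
b_rest = [bi - ci for bi, ci in zip(b, c)]
r_ab_without_c = pearson_r(a_rest, b_rest)

print(f"r(A, B) = {r_ab:.2f}; after removing C: {r_ab_without_c:.2f}")
```

A and B correlate clearly in the raw data, and the correlation essentially disappears once C is taken out.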

Church of the Flying Spaghetti Monster

Alcohol consumption and HDI story

Scatterplot
Scatter plot of spousal ages, r = 0.97
Scatter plot of Grip Strength and Arm Strength, r = 0.63

Farnsworth favorite game

Anscombe’s quartet
Four datasets (I, II, III, IV), each with an x column and a y column.

Property                                     Value
Mean of x in each case                       9
Variance of x in each case                   11
Mean of y in each case                       7.50
Variance of y in each case                   ... or ...
Correlation between x and y in each case     0.816

Anscombe’s quartet: the same summary properties as on the previous slide.
CAUTION!!! Check the scatterplot

Anscombe’s quartet
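The data columns of the quartet did not survive in this transcript. The sketch below uses the values of Anscombe's published dataset as commonly reproduced (an assumption, since the slide's own numbers are lost) and recomputes the summary properties for each of the four sets. `pearson_r` is a helper defined here for illustration.

```python
import math
from statistics import mean, variance

def pearson_r(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Sets I to III share the same x values.
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
quartet = {
    "I": (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}

summary = {}
for name, (x, y) in quartet.items():
    # variance() is the sample variance (divides by n - 1).
    summary[name] = (mean(x), variance(x), mean(y), variance(y), pearson_r(x, y))
    mx, vx, my, vy, r = summary[name]
    print(f"{name:>3}: mean x={mx:.2f} var x={vx:.2f} "
          f"mean y={my:.2f} var y={vy:.3f} r={r:.3f}")
```

The four sets agree on every summary number while looking entirely different when plotted, which is the reason for the caution about checking the scatterplot.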

Problems

Problems: Outliers
r = 0.63
r = 0.23
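The figures behind the two r values on this slide are not in the transcript. As a separate illustration of the same problem, the sketch below takes set III of Anscombe's quartet (values as commonly published, an assumption here) and compares r with and without its single outlying point.

```python
import math

def pearson_r(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Anscombe's set III: ten points on a straight line plus one outlier.
x = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
y = [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]

r_all = pearson_r(x, y)

# Drop the outlying case (x = 13, y = 12.74) and recompute.
outlier = (13, 12.74)
kept = [(xi, yi) for xi, yi in zip(x, y) if (xi, yi) != outlier]
r_without = pearson_r([p[0] for p in kept], [p[1] for p in kept])

print(f"with outlier: r = {r_all:.3f}; without outlier: r = {r_without:.3f}")
```

One case out of eleven moves r from an almost perfect value down to roughly 0.8, so a single point can both inflate and deflate a correlation.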

Problems: Range restriction
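Range restriction can be shown with simulated data. This is a sketch under made-up assumptions (standard normal x, y equal to x plus unit noise, and an arbitrary cut-off of x > 1), not a reconstruction of the slide's figure.

```python
import math
import random

def pearson_r(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

random.seed(1)  # fixed seed so the sketch is reproducible
n = 5000
x = [random.gauss(0, 1) for _ in range(n)]
y = [xi + random.gauss(0, 1) for xi in x]

r_full = pearson_r(x, y)

# Keep only cases from the upper part of the x range.
kept = [(xi, yi) for xi, yi in zip(x, y) if xi > 1.0]
r_restricted = pearson_r([p[0] for p in kept], [p[1] for p in kept])

print(f"full range: r = {r_full:.2f}; x > 1 only: r = {r_restricted:.2f}")
```

The underlying relationship is identical in both samples; only the spread of x differs, and the narrower spread yields the weaker correlation.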

Coefficient of Determination (r²)
CoD = the proportion of variance or change in one variable that can be accounted for by another variable.
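As a worked example, take the correlation reported earlier for Grip Strength and Arm Strength, r = 0.63. Squaring it gives the coefficient of determination:

```latex
r = 0.63 \quad\Longrightarrow\quad r^{2} = 0.63^{2} = 0.3969 \approx 0.40
```

Read this way, about 40% of the variance in one strength measure can be accounted for by the other, and about 60% cannot.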

Problems: Range restriction

REGRESSION MODELS

MLR
Multiple linear regression (MLR) is a multivariate statistical technique for examining the linear correlations between two or more independent variables (IVs) and a single dependent variable (DV).

Poverty prediction

Name of region
Population change in 10 years
No. of persons employed in agriculture
Percent of families below poverty level
Residential and farm property tax rate
Percent residences with telephones
Percent rural population
Median age
Number of African Americans

MLR: Pre-analysis assumptions

Level of measurement
  IVs: two or more continuous (interval or ratio) or nominal variables (nominal variables require recoding into dummy variables)
  DV: one continuous (interval or ratio) variable
Sample size
  Total N is based on the ratio of cases to IVs:
  Minimum 5 cases per predictor (5:1)
  Ideally 20 cases per predictor (20:1)
Linearity
  Are the bivariate relationships linear?
  Check scatterplots and correlations between the DV (Y) and each of the IVs (Xs)
  Check for the influence of bivariate outliers
Multicollinearity
  Is there multicollinearity between the IVs (i.e., are they overly correlated, e.g., above .7)?
Homoscedasticity
  The variance of the error is constant across observations.
  Check scatterplots between Y and each of the Xs and/or check the scatterplot of the residuals (ZRESID) against the predicted values (ZPRED)
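Two of these checks are simple enough to script: the ratio of cases to predictors and the pairwise correlations between IVs. The sketch below uses made-up values for three hypothetical predictors; the variable names and numbers are for illustration only and are not the poverty data from the slides.

```python
import math
from itertools import combinations

def pearson_r(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical IVs measured on 15 cases.
ivs = {
    "pct_rural":  [10, 20, 30, 40, 50, 60, 70, 80, 90, 15, 35, 55, 75, 25, 65],
    "median_age": [30, 34, 29, 33, 31, 35, 28, 32, 30, 33, 29, 34, 31, 32, 30],
    "pct_phones": [95, 88, 80, 75, 66, 58, 52, 41, 35, 90, 78, 62, 45, 85, 55],
}

# Sample size: ratio of cases to predictors.
n_cases = len(next(iter(ivs.values())))
ratio = n_cases / len(ivs)
meets_minimum = ratio >= 5    # at least 5 cases per predictor
meets_ideal = ratio >= 20     # ideally 20 cases per predictor

# Multicollinearity: flag IV pairs whose correlation exceeds .7 in absolute value.
flagged = []
for name_a, name_b in combinations(ivs, 2):
    r = pearson_r(ivs[name_a], ivs[name_b])
    if abs(r) > 0.7:
        flagged.append((name_a, name_b, round(r, 2)))

print(f"cases per predictor: {ratio:.1f} (min ok: {meets_minimum}, ideal: {meets_ideal})")
print("overly correlated pairs:", flagged)
```

A flagged pair suggests that the two predictors carry largely the same information, so one of them may need to be dropped or the two combined.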

MLR: Dummy coding for nominal data
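Only the slide title survives here. A minimal sketch of dummy coding, assuming the usual scheme in which a nominal variable with k categories becomes k - 1 variables coded 0/1 and the omitted category serves as the reference. The function `dummy_code` and the region labels are hypothetical.

```python
def dummy_code(values, reference):
    """Recode a nominal variable into k - 1 dummy (0/1) variables.

    The reference category gets no column of its own: its cases are
    coded 0 on every dummy variable.
    """
    categories = sorted(set(values))
    if reference not in categories:
        raise ValueError(f"unknown reference category: {reference!r}")
    return {
        f"is_{cat}": [1 if v == cat else 0 for v in values]
        for cat in categories
        if cat != reference
    }

# Hypothetical nominal predictor with three categories.
region = ["north", "south", "west", "south", "north", "west"]
dummies = dummy_code(region, reference="north")

for name, column in dummies.items():
    print(name, column)
```

Each dummy coefficient in the regression is then read as the difference between that category and the reference category, holding the other predictors constant.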

MLR: Main Idea
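The body of this slide is not in the transcript. For orientation, the standard form of a multiple linear regression model with k predictors is shown below; the symbols are the conventional ones and are assumed here rather than taken from the slide.

```latex
Y = b_0 + b_1 X_1 + b_2 X_2 + \dots + b_k X_k + e
```

Here Y is the DV, X_1 to X_k are the IVs, b_0 is the Y-intercept, each b_j is the expected change in Y for a one-unit change in X_j with the other IVs held constant, and e is the residual (error) term.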

Poverty prediction

MLR: Post-analysis assumptions

Multivariate outliers
  Check whether there are influential multivariate outlying cases using Mahalanobis' Distance (MD) and Cook’s D (CD).
Normality of residuals
  Residuals are more likely to be normally distributed if each of the variables is normally distributed.
  Check histograms of all variables in an analysis.
  Normally distributed variables will enhance the MLR solution.
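Statistical packages report these diagnostics directly. To show what Cook's D measures, here is a sketch for the simplest case of a single predictor, where the leverage of each case has a closed form. It is a simplification of the multivariate check named above, and Mahalanobis' Distance is not covered. The data are set III of Anscombe's quartet as commonly published (an assumption), and the function name is hypothetical.

```python
def cooks_distance(x, y):
    """Cook's D for every case in a one-predictor linear regression."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    intercept = mean_y - slope * mean_x
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

    p = 2  # number of estimated coefficients: intercept and slope
    mse = sum(e ** 2 for e in residuals) / (n - p)

    distances = []
    for xi, e in zip(x, residuals):
        # Leverage of the case: how far its x value lies from the mean of x.
        h = 1 / n + (xi - mean_x) ** 2 / sxx
        distances.append((e ** 2 / (p * mse)) * h / (1 - h) ** 2)
    return distances

x = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
y = [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]

d = cooks_distance(x, y)
most_influential = max(range(len(d)), key=lambda i: d[i])
print(f"largest Cook's D: case {most_influential} "
      f"(x={x[most_influential]}, y={y[most_influential]}), D={d[most_influential]:.2f}")
```

A commonly used rule of thumb treats values of D above 1 as worth inspecting; the exact cut-off is a convention and varies between textbooks.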

MLR: Post-analysis assumptions

Poverty prediction

MLR: Types of MLR

Direct (or Standard)
  All IVs are entered simultaneously
Hierarchical
  IVs are entered in steps, i.e., some before others
  Interpret R² change
Forward
  The software enters IVs one by one until there are no more significant IVs to be entered
Backward
  The software removes IVs one by one until there are no more non-significant IVs to be removed
Stepwise
  A combination of Forward and Backward MLR

MLR: TOTAL
1. Conceptualise the model
2. Recode predictors (if necessary)
3. Check assumptions
4. Choose the type of MLR
5. Interpret statistical output and meaning of results
6. Depict the relationships in a path diagram or Venn diagram
7. Regression equation: if relevant and useful, interpret the Y-intercept and write a regression equation for predicting Y
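The last step, writing a regression equation and using it for prediction, can be sketched as follows. This is a bare-bones least-squares fit through the normal equations, written without external libraries; a statistics package would normally be used instead. The function `fit_ols` and the data are hypothetical, and the data are constructed so that the exact relationship is Y = 2 + 3·X1 - 1·X2.

```python
def fit_ols(predictors, y):
    """Least-squares coefficients [b0, b1, ..., bk] via the normal equations."""
    rows = [[1.0] + list(case) for case in predictors]  # leading 1 for the intercept
    k = len(rows[0])
    # Build X'X and X'y.
    xtx = [[sum(row[i] * row[j] for row in rows) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(rows, y)) for i in range(k)]
    # Solve (X'X) b = X'y by Gauss-Jordan elimination with partial pivoting.
    aug = [xtx[i] + [xty[i]] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda i: abs(aug[i][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        if abs(aug[col][col]) < 1e-12:
            raise ValueError("predictors are perfectly collinear")
        for i in range(k):
            if i != col:
                factor = aug[i][col] / aug[col][col]
                aug[i] = [a - factor * b for a, b in zip(aug[i], aug[col])]
    return [aug[i][k] / aug[i][i] for i in range(k)]

# Hypothetical cases: (X1, X2) pairs and the DV built as Y = 2 + 3*X1 - X2.
predictors = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 6), (6, 5)]
y = [3, 7, 7, 11, 11, 15]

b0, b1, b2 = fit_ols(predictors, y)
print(f"Y = {b0:.2f} + {b1:.2f}*X1 + {b2:.2f}*X2")

# Predict Y for a new case with X1 = 7 and X2 = 2.
predicted = b0 + b1 * 7 + b2 * 2
print(f"predicted Y: {predicted:.2f}")
```

The Y-intercept b0 is the predicted value of Y when every predictor equals zero, which is only meaningful if zero is a plausible value for all of them.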