Correlation and Simple Regression

Presentation transcript:

1 Correlation and Simple Regression

2 Introduction Interested in the relationships between variables. What will happen to one variable if another is changed? To what extent is it the case that increases in the interest rate reduce inflation? Might want to know how sensitive the relationship is, and if possible, what form it takes. Models needed.

3 Koop's Deforestation Data. Y: average annual forest loss, as % of total forested area. X: number of people per 1000 hectares. Data on 70 tropical countries (N=70).

4 Figure 1.1 Deforestation/Population Density Data with Line of Best Fit

5 Predicted Value of Forest Loss Given Population Density

6 X=2000 implies Y=2.3. If there are 2000 people per 1000 hectares, forest loss would be about 2.3%. Comments: i) Increased dispersion about the line as X increases; more uncertainty about predictions for higher population densities. ii) Ignores other impacts on deforestation.

7 Correlation. Objectives of correlation: To measure how close the relationship between two variables is to linearity (strength of linear association). Capture the sign of the relationship. Determine on a common scale for all cases: -1 to +1. The closer to zero, the weaker the correlation.

8 Sample Covariance X and Y vary about their mean values. To what extent is this variation aligned?

9 Scatter Plot of Forest Loss Against Population Density: Axes Crossing at Mean Points

10 Deviations from Mean (figure labels: "same sign", "opposite sign")

11 Sample Covariance Formula. Problem: the covariance varies with the scale of the data.
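
The scale problem can be seen with a few lines of code. This is a minimal sketch, not the slide's own calculation: the function name `sample_covariance`, the N-1 divisor, and the five data points are all assumptions made for illustration (they are not the deforestation data).

```python
def sample_covariance(xs, ys):
    """Sample covariance of two equal-length samples (N-1 divisor assumed)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sum of cross-products of the deviations from the two means.
    cross = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return cross / (n - 1)


# Made-up illustrative numbers, not the deforestation data.
density_per_1000ha = [100.0, 500.0, 1000.0, 1500.0, 2000.0]
forest_loss_pct = [0.2, 0.6, 1.1, 1.9, 2.2]

cov_per_1000ha = sample_covariance(density_per_1000ha, forest_loss_pct)

# Same information, but density measured per hectare rather than per 1000 hectares.
density_per_ha = [x / 1000.0 for x in density_per_1000ha]
cov_per_ha = sample_covariance(density_per_ha, forest_loss_pct)

print(cov_per_1000ha, cov_per_ha)
```

Changing the units of X by a factor of 1000 changes the covariance by the same factor, even though the relationship between the variables is unchanged.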

12 Sample Correlation I Standardise using sample standard deviations Sample variance: Sample standard deviation:

13 Sample Correlation II
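
The formulas on these slides were images and are not in the transcript. The standard textbook definitions are sketched below. The symbols $s_X^2$, $s_X$, $s_{XY}$ and $r$ are notation assumed here, and the $N-1$ divisor is one common convention; the slides may use a different divisor, which would not affect $r$ because it cancels.

```latex
\begin{align*}
s_X^2 &= \frac{1}{N-1}\sum_{i=1}^{N}\left(X_i-\bar{X}\right)^2,
\qquad s_X = \sqrt{s_X^2},\\
s_{XY} &= \frac{1}{N-1}\sum_{i=1}^{N}\left(X_i-\bar{X}\right)\left(Y_i-\bar{Y}\right),\\
r &= \frac{s_{XY}}{s_X\, s_Y}
   = \frac{\sum_{i=1}^{N}\left(X_i-\bar{X}\right)\left(Y_i-\bar{Y}\right)}
          {\sqrt{\sum_{i=1}^{N}\left(X_i-\bar{X}\right)^2\,\sum_{i=1}^{N}\left(Y_i-\bar{Y}\right)^2}}.
\end{align*}
```

Dividing the covariance by both standard deviations removes the units of X and Y, which is why $r$ always lies between -1 and +1.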

14 Calculations for Deforestation Data

15 Correlation and Causality. Must distinguish between causality and correlation. Correlation does not imply causality. A correlation gives no indication even of which way any causality should run (from X to Y or the other way round). Two trending time series variables may be spuriously correlated. Causality is judgmental.
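
The point about trending series can be illustrated with simulated data. This is a sketch under stated assumptions: the two series are invented, each is a straight-line trend plus its own independent random noise, and the helper `correlation` is a hypothetical name for the usual sample correlation.

```python
import random


def correlation(xs, ys):
    """Sample correlation coefficient of two equal-length samples."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


rng = random.Random(0)  # fixed seed so the run is repeatable

# Two invented series that share nothing except an upward trend over time.
x = [0.5 * t + rng.gauss(0, 1) for t in range(100)]
y = [3.0 + 0.2 * t + rng.gauss(0, 1) for t in range(100)]

r_levels = correlation(x, y)

# Period-to-period changes remove the trend; what is left is unrelated noise.
dx = [x[t + 1] - x[t] for t in range(len(x) - 1)]
dy = [y[t + 1] - y[t] for t in range(len(y) - 1)]
r_changes = correlation(dx, dy)

print(r_levels, r_changes)
```

The levels are strongly correlated only because both series trend upwards; once the trend is differenced away, the correlation is close to zero.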

16 Example: UK Aggregate Consumption and Income. Aggregate UK consumption and income over a period of years are highly correlated. Economists believe there is a relationship between these two variables. Take correlation to be evidence in favour of the existence of a causal relationship: income causes consumption.

17 Time Series Plot of UK Aggregate Consumption and Income

18 Scatter Plot of UK Aggregate Consumption Against Income

19 Another Example. Ratio of unemployment benefit to wages, X, and the unemployment rate, Y. Annual observations for the UK. Theory: X causes Y. Policy implication: r>0 implies cut benefits relative to wages to reduce unemployment.

20 Scatter Plot of Unemployment Against Wage/Benefit Ratio What happens to r if the following observation is not included? r =

21 Final Comments Correlation measures linear association on scale [-1,+1]. r=-1,+1 indicates PERFECT linear correlation (exact straight line). Only concerned with the relationship between TWO variables (bivariate). This measure is sensitive to outliers. Correlation may be taken as supportive evidence of a causal relationship, but correlation does not imply causality.
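
The sensitivity to outliers can be checked directly by computing r with and without a single extreme observation. The sketch below uses invented numbers (not the UK unemployment data), and `correlation` is a hypothetical helper implementing the usual sample correlation.

```python
def correlation(xs, ys):
    """Sample correlation coefficient of two equal-length samples."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


# Six invented points with almost no linear association.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.0, 1.0, 3.0, 2.5, 1.5, 2.0]
r_without = correlation(xs, ys)

# Add one extreme observation far from the rest of the data.
r_with = correlation(xs + [20.0], ys + [12.0])

print(r_without, r_with)
```

A single point moves r from near zero to near one, so a scatter plot should always be inspected alongside the coefficient.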

22 Bivariate regression. Correlation can: indicate the strength of a relationship. It cannot: contribute to an understanding of how the variables may be related; make predictions about Y based on knowledge of X. Regression analysis can: examine the nature of the relationship between X and Y; make predictions from that.

23 Figure 2.1 Deforestation/Population Density Data with Line of Best Fit

24 Introduction. What is the line of best fit? How can it be defined? What does it mean? Can place line by eye, but non-systematic.

25 UK consumption-income scatter plot gives a very strong indication of a linear relationship.

26 UK unemployment-benefit to wage ratio plot does not look linear.

27 Models. Simplest model: straight line. Too constrained: will never hold exactly. Allow for disturbances for each case, i=1,2,…,N. Properties of disturbances: on average zero, but they vary. They have mean zero, and variance denoted:
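
The equations on this slide did not survive in the transcript. A conventional way to write the model is sketched below. The symbols $\alpha$, $\beta$, $\varepsilon_i$ and $\sigma^2$ are assumed notation and may differ from the slides.

```latex
\begin{align*}
Y &= \alpha + \beta X
  && \text{(exact straight line)}\\
Y_i &= \alpha + \beta X_i + \varepsilon_i, \quad i = 1,2,\dots,N
  && \text{(straight line with disturbances)}\\
E(\varepsilon_i) &= 0, \qquad \operatorname{Var}(\varepsilon_i) = \sigma^2
  && \text{(properties of the disturbances)}
\end{align*}
```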

30 So what? We have a theory that allows us to think of there being an underlying linear relationship, but one that isn't exact. This fits with what we observe. It leads to a statistical theory of errors, the real-life equivalent of the theoretical disturbances, that eventually allows testing of various sorts.

31 Least Squares Line: Bivariate Linear Regression. Want the BEST LINEAR description of the way Y depends on X: deforestation on population density, or consumption on income, or unemployment on the benefit to wage ratio. Geometrically, we want the best fitting straight line to the data presented on a scatter plot. "Best" needs to be defined.

32 (Figure labels: "error"; "Lots of big errors, e_i"; "Errors smaller here")

33 Want to calculate the best values of the intercept and slope in the equation for each case, i=1,2,...,N.

34 Equation of the fitted line (note that subscripts are not used here). Predicted (fitted) value of Y_i given X_i.

35 (Figure labels: Y_i, X_i, and the data point (X_i, Y_i))

36 The Error. Also called the RESIDUAL. There are N of these, one for each i=1,2,…,N.

37 The Best Line. Actually, a best line: others can be defined. That which minimises the sum of the squares of the errors.
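
The least squares criterion can be sketched in code. The version below uses the standard closed-form solution for a straight line (slope = sum of cross-deviations divided by sum of squared X deviations, intercept chosen so the line passes through the point of means). The function names, the symbols `a` and `b`, and the five data points are assumptions for illustration and are not taken from the slides.

```python
def least_squares_line(xs, ys):
    """Return (a, b) minimising the sum over i of (Y_i - a - b*X_i)**2."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("X has no variation, so the slope is undefined")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx            # slope
    a = y_bar - b * x_bar    # intercept: the line passes through the means
    return a, b


def sum_squared_errors(a, b, xs, ys):
    """Sum of squared residuals e_i = Y_i - (a + b*X_i)."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


# Made-up illustrative data.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

a_hat, b_hat = least_squares_line(xs, ys)
residuals = [y - (a_hat + b_hat * x) for x, y in zip(xs, ys)]
best = sum_squared_errors(a_hat, b_hat, xs, ys)

# A slightly different line gives a larger sum of squared errors.
nudged = sum_squared_errors(a_hat + 0.1, b_hat - 0.05, xs, ys)

print(a_hat, b_hat, best, nudged)
```

Other criteria, such as minimising the sum of absolute errors, define other "best" lines; least squares is the one used here because it has a simple closed-form solution.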