Discovering and Describing Relationships


Discovering and Describing Relationships Farideh Dehkordi-Vakil

Exploring Relationships between Two Quantitative Variables Scatter plots Represent the relationship between two different continuous variables measured on the same subjects. Each point in the plot represents the values for one subject for the two variables.

Exploring Relationships between Two Quantitative Variables Example: Data reported by the Organisation for Economic Co-operation and Development (OECD) on its 29 member nations in 1998. Per capita gross domestic product is on the x-axis; per capita health care expenditure is on the y-axis.

Exploring Relationships between Two Quantitative Variables We can describe the overall pattern of a scatter plot by its form or shape, direction, and strength.

Exploring Relationships between Two Quantitative Variables Form or shape: the form shown by the scatter plot is linear if the points lie in a straight-line pattern. Strength: the relationship is strong if the points lie close to a line, with little scatter.

Exploring Relationships between Two Quantitative Variables Direction: positive and negative association. Two variables are positively associated when above-average values of one variable tend to occur in individuals with above-average values of the other, and below-average values of both also tend to occur together. Two variables are negatively associated when above-average values of one tend to occur in subjects with below-average values of the other, and vice versa.

Exploring Relationships between Two Quantitative Variables Per capita health care example: the “subjects” studied are countries. The form of the relationship is roughly linear. The direction is positive. The relationship is strong.

Correlation It is often useful to have a measure of the degree of association between two variables. For example, you may believe that sales are affected by expenditures on advertising, and want to measure the degree of association between sales and advertising. The correlation coefficient is a numeric measure of the direction and strength of the linear relationship between two continuous variables. The notation for the sample correlation coefficient is r.

Correlation There are several alternative ways to write the algebraic expression for the correlation coefficient. The following is one. X and Y represent the two variables of interest, for example advertising and sales, or per capita gross domestic product and per capita health care expenditure. n is the number of subjects in the sample. The notation for the population correlation coefficient is ρ.
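The formula itself did not survive in this transcript. One common way to write the sample correlation coefficient, using the symbols defined above, is sketched here. Which of the equivalent forms the original slide displayed is not known, so this particular form is an assumption; $\bar{X}$ and $\bar{Y}$ denote the sample means of X and Y.

```latex
r = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}
         {\sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2}\;\sqrt{\sum_{i=1}^{n} (Y_i - \bar{Y})^2}}
```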

Correlation Facts about the correlation coefficient: r has no unit. r > 0 indicates a positive association; r < 0 indicates a negative association. r is always between –1 and +1. Values of r near 0 imply a very weak linear relationship. Correlation measures only the strength of linear association.
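The facts above can be checked numerically with a minimal sketch. The function name `pearson_r` and the advertising and sales figures are made up for illustration; they are not data from the slides.

```python
import math

def pearson_r(xs, ys):
    """Sample correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up illustrative data (hypothetical, not from the slides).
advertising = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [2.1, 3.9, 6.2, 7.8, 10.1]

r = pearson_r(advertising, sales)
# r has no unit: rescaling one variable leaves it unchanged.
r_rescaled = pearson_r([1000 * x for x in advertising], sales)
# Reversing the direction of one variable flips the sign of r.
r_negated = pearson_r(advertising, [-y for y in sales])
print(round(r, 4), round(r_rescaled, 4), round(r_negated, 4))
```

Because r only measures linear association, a value near zero from this function does not rule out a strong curved relationship.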

Correlation We could perform a hypothesis test to determine whether the value of a sample correlation coefficient (r) gives us reason to believe that the population correlation (ρ) is significantly different from zero. The hypothesis test would be H0: ρ = 0, Ha: ρ ≠ 0.

Correlation The test statistic would be $t = \dfrac{r\sqrt{n-2}}{\sqrt{1-r^2}}$. The test statistic has a t-distribution with n − 2 degrees of freedom. Reject H0 if $|t| > t_{\alpha/2,\,n-2}$.

Example: Do wages rise with experience? Many factors affect the wages of workers: the industry they work in, their type of job, their education and experience, and changes in general levels of wages. We will look at a sample of 59 married women who hold customer service jobs in Indiana banks. The following table gives their weekly wages at a specific point in time and their length of service with their employer, in months. The size of the place of work is recorded simply as “large” (100 or more workers) or “small.” Because industry, job type, and the time of measurement are the same for all 59 subjects, we expect to see a clear relationship between wages and length of service.

Example: Do wages rise with experience?

Example: Do wages rise with experience? The correlation between wages and length of service for the 59 bank workers is r = 0.3535. We expect a positive correlation between length of service and wages in the population of all married female bank workers. Is the sample result convincing that this is true?

Example: Do wages rise with experience? To compute the correlation, the required sample quantities are substituted into the formula. We want to test H0: ρ = 0, Ha: ρ > 0. The test statistic is $t = \dfrac{r\sqrt{n-2}}{\sqrt{1-r^2}} = \dfrac{0.3535\sqrt{57}}{\sqrt{1-0.3535^2}} = 2.853$.

Example: Do wages rise with experience? Comparing t = 2.853 with critical values from the t table with n − 2 = 57 degrees of freedom helps us make our decision. Conclusion: Since P(t > 2.853) < .005, we reject H0. There is a positive correlation between wages and length of service.
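The arithmetic behind t = 2.853 can be reproduced with a short sketch. The function name `correlation_t_statistic` is illustrative. It assumes the usual t statistic for testing a zero population correlation, which has n − 2 degrees of freedom.

```python
import math

def correlation_t_statistic(r, n):
    """t statistic for H0: rho = 0, given sample correlation r and sample size n."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# Values from the bank workers example: r = 0.3535, n = 59.
t = correlation_t_statistic(0.3535, 59)
print(round(t, 3))  # 2.853
```

The p-value still has to come from a t table or a statistics library; the Python standard library does not provide the t distribution.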

Correlograms: An Alternative Method of Data Exploration In evaluating time series data, it is useful to look at the correlation between successive observations over time. This measure of correlation is called autocorrelation and may be calculated as follows:
$r_k = \dfrac{\sum_{t=k+1}^{n} (y_t - \bar{y})(y_{t-k} - \bar{y})}{\sum_{t=1}^{n} (y_t - \bar{y})^2}$
where $r_k$ = autocorrelation coefficient for a k-period lag, $\bar{y}$ = mean of the time series, $y_t$ = value of the time series at period t, and $y_{t-k}$ = value of the time series k periods before period t.
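A minimal sketch of this calculation follows. It assumes the common definition in which the lagged cross-products are divided by the total sum of squared deviations from the series mean. The function name `autocorrelation` and the five-point series are illustrative only.

```python
def autocorrelation(series, k):
    """Autocorrelation coefficient r_k of a time series at lag k."""
    n = len(series)
    if not 0 < k < n:
        raise ValueError("lag k must satisfy 0 < k < len(series)")
    mean = sum(series) / n
    # Numerator: products of deviations for pairs (y_t, y_{t-k}).
    numerator = sum(
        (series[t] - mean) * (series[t - k] - mean) for t in range(k, n)
    )
    # Denominator: total sum of squared deviations from the mean.
    denominator = sum((y - mean) ** 2 for y in series)
    return numerator / denominator

# Tiny made-up series (hypothetical, not data from the slides).
r1 = autocorrelation([1.0, 2.0, 3.0, 4.0, 5.0], 1)
print(round(r1, 3))  # 0.4
```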

Correlograms: An Alternative Method of Data Exploration Autocorrelation coefficients for different time lags can be used to answer the following questions about time series data. Are the data random? If so, the autocorrelations between yt and yt-k for any lag are close to zero, and the successive values of the time series are not related to each other.

Correlograms: An Alternative Method of Data Exploration Is there a trend? If the series has a trend, yt and yt-k are highly correlated. The autocorrelation coefficients are significantly different from zero for the first few lags and then gradually drop toward zero. The autocorrelation coefficient for lag 1 is often very large (close to 1). A series that contains a trend is said to be non-stationary.

Correlograms: An Alternative Method of Data Exploration Is there a seasonal pattern? If a series has a seasonal pattern, there will be a significant autocorrelation coefficient at the seasonal time lag or multiples of the seasonal lag. The seasonal lag is 4 for quarterly data and 12 for monthly data.
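The trend signature can be illustrated with a small sketch. The series below is a made-up straight-line trend of 50 points, not data from the slides. The cutoff 2/√n is the commonly used approximate 5% rule of thumb and is an assumption here.

```python
import math

def autocorrelation(series, k):
    """Autocorrelation coefficient r_k of a time series at lag k."""
    n = len(series)
    mean = sum(series) / n
    numerator = sum(
        (series[t] - mean) * (series[t - k] - mean) for t in range(k, n)
    )
    denominator = sum((y - mean) ** 2 for y in series)
    return numerator / denominator

# Hypothetical series with a steady upward trend: 1, 2, ..., 50.
trend_series = [float(t) for t in range(1, 51)]
n = len(trend_series)
lags = range(1, 11)
acf = [autocorrelation(trend_series, k) for k in lags]
critical = 2 / math.sqrt(n)  # assumed rule-of-thumb cutoff at about 5%

for k, r_k in zip(lags, acf):
    flag = "significant" if abs(r_k) > critical else "not significant"
    print(f"lag {k:2d}: r_k = {r_k:.3f} ({flag})")
```

For this series the lag 1 coefficient is about 0.94 and the coefficients shrink slowly as the lag grows, which is the pattern described for a trending, non-stationary series.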
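A seasonal spike can be seen in a small sketch. The quarterly figures below are invented (the same four values repeated for six years) and the 2/√n cutoff is the assumed rule of thumb at about 5%; neither comes from the slides.

```python
import math

def autocorrelation(series, k):
    """Autocorrelation coefficient r_k of a time series at lag k."""
    n = len(series)
    mean = sum(series) / n
    numerator = sum(
        (series[t] - mean) * (series[t - k] - mean) for t in range(k, n)
    )
    denominator = sum((y - mean) ** 2 for y in series)
    return numerator / denominator

# Hypothetical quarterly series: the same four-quarter pattern for six years.
quarterly = [10.0, 20.0, 30.0, 40.0] * 6
n = len(quarterly)
critical = 2 / math.sqrt(n)  # assumed rule-of-thumb cutoff at about 5%

r1 = autocorrelation(quarterly, 1)
r4 = autocorrelation(quarterly, 4)   # the seasonal lag for quarterly data
r8 = autocorrelation(quarterly, 8)   # a multiple of the seasonal lag
print(round(r1, 3), round(r4, 3), round(r8, 3), round(critical, 3))
```

Here the coefficients at lags 4 and 8 exceed the cutoff while the lag 1 coefficient does not, matching the description of a seasonal series.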

Correlograms: An Alternative Method of Data Exploration Is it stationary? A stationary time series is one whose basic statistical properties, such as the mean and variance, remain constant over time. Autocorrelation coefficients for a stationary series decline to zero fairly rapidly, generally after the second or third time lag.

Correlograms: An Alternative Method of Data Exploration To determine whether the autocorrelation at lag k is significantly different from zero, the following hypotheses and rule of thumb may be used. H0: ρk = 0, Ha: ρk ≠ 0. For any k, reject H0 if |rk| > 2/√n, where n is the number of observations. This rule of thumb is for α = 5%.

Correlograms: An Alternative Method of Data Exploration The hypothesis test developed to determine whether a particular autocorrelation coefficient is significantly different from zero is: Hypotheses H0: ρk = 0, Ha: ρk ≠ 0. Test statistic:
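The rule of thumb is easy to apply in code. This sketch assumes the widely used cutoff of 2/√n for an approximate 5% test; the function names are illustrative.

```python
import math

def rule_of_thumb_cutoff(n):
    """Approximate 5% critical value for an autocorrelation, given n observations."""
    return 2 / math.sqrt(n)

def is_significant(r_k, n):
    """True if the autocorrelation r_k exceeds the rule-of-thumb cutoff in size."""
    return abs(r_k) > rule_of_thumb_cutoff(n)

# With 12 observations the cutoff is about 0.577.
cutoff_12 = rule_of_thumb_cutoff(12)
print(round(cutoff_12, 3))        # 0.577
print(is_significant(0.60, 12))   # True
print(is_significant(-0.30, 12))  # False
```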

Correlograms: An Alternative Method of Data Exploration Reject H0 if
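The test statistic and rejection rule are missing from this transcript. One form that appears in forecasting textbooks treats $1/\sqrt{n-k}$ as the approximate standard error of $r_k$. Whether this is the form shown on the original slides is an assumption, as is the use of $n-k$ degrees of freedom for the critical value.

```latex
t = \frac{r_k - 0}{1/\sqrt{n-k}} = r_k \sqrt{n-k},
\qquad \text{reject } H_0 \text{ if } |t| > t_{\alpha/2,\,n-k}
```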

Correlograms: An Alternative Method of Data Exploration The plot of the autocorrelations versus time lag is called a correlogram. The horizontal axis is the time lag. The vertical axis is the autocorrelation coefficient. Patterns in a correlogram are used to analyze key features of the data.

Example: Mobile Home Shipments Correlogram for the mobile home shipments series. Note that these are quarterly data.

Example: Japanese Exchange Rate As the world’s economy becomes increasingly interdependent, various exchange rates between currencies have become important in making business decisions. For many U.S. businesses, the Japanese exchange rate (in yen per U.S. dollar) is an important decision variable. A time series plot of the Japanese-yen U.S.-dollar exchange rate is shown below. On the basis of this plot, would you say the data are stationary? Is there any seasonal component to this time series plot?
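Without a plotting library, a rough text correlogram can be printed. This sketch turns the plot on its side, with one row per lag and a bar whose length is proportional to the size of the coefficient. The function name `text_correlogram` and the input values are hypothetical.

```python
def text_correlogram(autocorrs, width=20):
    """Return one text line per lag (lag 1 first) with a bar of '#' marks."""
    lines = []
    for lag, r in enumerate(autocorrs, start=1):
        # Bar length is proportional to |r|; the printed sign gives the direction.
        bar = "#" * round(abs(r) * width)
        lines.append(f"lag {lag:2d}  {r:+.2f}  {bar}")
    return lines

# Made-up autocorrelation values for lags 1 to 3.
lines = text_correlogram([0.8, -0.5, 0.1])
for line in lines:
    print(line)
```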

Example: Japanese Exchange Rate [Time series plot of the Japanese yen per U.S. dollar exchange rate]

Example: Japanese Exchange Rate Here is the autocorrelation structure for EXRJ. With a sample size of 12, the critical value is 2/√12 ≈ 0.577. This is the approximate 95% critical value for rejecting the null hypothesis of zero autocorrelation at lag k.

Example: Japanese Exchange Rate The correlogram for EXRJ is given below.

Example: Japanese Exchange Rate Since the autocorrelation coefficients fall below the critical value after just two periods, we can conclude that there is no trend in the data.

Example: Japanese Exchange Rate To check for seasonality at α = .05, the hypotheses are H0: ρ12 = 0, Ha: ρ12 ≠ 0. The test statistic and rejection rule are those of the autocorrelation hypothesis test, evaluated at lag 12.

Example: Japanese Exchange Rate Since the test statistic does not fall in the rejection region, we do not reject H0; seasonality does not appear to be an attribute of the data.