17.1 INTRODUCTION Explanatory variable CHAPTER 17 FURTHER DATA ANALYSIS 2.


• Example
  - Given sample data from a random sample of students about their IQ and their height
    • Does the height of an individual student influence the student's IQ?
  - Given sample data from a random sample of people about their height and the height of their father
    • Does the height of a father influence the height of his son?
  - Given sample data about the number of newborn babies in the city per day and the number of wild geese flying over the city per day
    • Is the number of newborn babies in the city related to the number of wild geese flying over the city?

• Methodology of Data Analysis
  - Initial Data Analysis
    • Very strong evidence to support a link,
    • No evidence of any link,
    • The sample evidence is inconclusive, and further, more sophisticated data analysis is required.
  - Further Data Analysis
    • The sample evidence is consistent with either
      - no link between the response variable and the explanatory variable, or
      - a link between the response variable and the explanatory variable, in which case the nature of the relationship needs to be described.

17.2 WHAT IS MEANT BY A RELATIONSHIP
• Relationship between a measured response variable and a measured explanatory variable?
  - Y: the response variable
  - X: the explanatory variable
  - Conceptual graph of Y against X

Population

• Deterministic Relationship
  - Fig. 3 shows a deterministic relationship between Y and X
  - In the data-analysis context, a relationship would not be deterministic
  - For example, is there any connection between IQ and height, or between the height of a son and the height of his father?

• Statistical Relationship

• Graph 1
  - Perfect linear relationship
  - Determining intercept and gradient/slope
  - Response Y depends only on the variable X
• Graph 2
  - Statistical relationship/link
    • As the value of the explanatory variable X increases, the value of the response variable Y also tends to increase
    • The response Y may depend on a number of different variables, say X, U, V, W, Z
    • Y = f(X, U, V, W, Z, …)

• Y = f(X) + effect of all other variables, that is, Y = f(X) + e
  - e is the effect of all other variables
• The influence on Y comes from two parts
  - Variation in Y explained by changes in X (Explained Variation)
  - Variation in Y not explained by changes in X (Unexplained Variation)
• 'Total Variation in Y' = 'Explained Variation' + 'Unexplained Variation'
  - In Graph 1, Unexplained Variation is nil
  - In Graph 2, Explained Variation is large relative to the Unexplained Variation

• Graph 3
  - Y seems to be unrelated to X
  - Explained Variation is zero
  - All changes in Y are due to Unexplained Variation

• Graph 4
  - Similar to Graph 2
• Graph 5
  - Similar to Graph 1

• Summary for Graph 1 to Graph 5

• Model of relationship between Y and X
  - Y = f(X) + e
  - Total Variation in Y = Explained Variation + Unexplained Variation
  - Two issues
    • Can a model of the link be made?
    • Can the 'Total Variation in Y', the 'Explained Variation' and the 'Unexplained Variation' be measured?
  - For Graph 1 and Graph 5
    • Y = a + bX
    • a: intercept
    • b: gradient/slope
  - For Graphs 2, 3 & 4
    • Statistical model

17.3 DEVELOPING A STATISTICAL MODEL
• Simple numerical example
  - Using EXCEL
  - Fitting line by eye
    • Intercept?
    • Gradient?

  - Actual Y
  - Predicted $Y_p = 2 + 5X$
• Measures of the disagreement between the actual data Y and the fitted line (predicted $Y_p$)
  - $\sum (Y - Y_p)$
  - $\sum (Y - Y_p)^2$: a satisfactory measure of disagreement
• The line of best fit: $\sum (Y - Y_p)^2$ is as small as possible
• The Method of Least Squares: finding the intercept and the gradient of the line that minimise $\sum (Y - Y_p)^2$
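A minimal Python sketch of why the squared measure is preferred over the plain sum. The four data points are made up for illustration (the spreadsheet values are not part of this transcript); only the by-eye line $Y_p = 2 + 5X$ comes from the slide, and the function names are illustrative.

```python
def sum_errors(xs, ys, intercept, gradient):
    """Plain sum of (Y - Yp); positive and negative errors can cancel."""
    return sum(y - (intercept + gradient * x) for x, y in zip(xs, ys))


def sum_squared_errors(xs, ys, intercept, gradient):
    """Sum of (Y - Yp)^2; every miss adds to the total, so nothing cancels."""
    return sum((y - (intercept + gradient * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical data, not taken from the slides.
xs = [1, 2, 3, 4]
ys = [8, 11, 18, 21]

# Line fitted by eye (from the slide): Yp = 2 + 5X
by_eye_plain = sum_errors(xs, ys, 2, 5)
by_eye_squared = sum_squared_errors(xs, ys, 2, 5)

# A deliberately poor flat line through the mean of Y: Yp = 14.5
flat_plain = sum_errors(xs, ys, 14.5, 0)
flat_squared = sum_squared_errors(xs, ys, 14.5, 0)

print(by_eye_plain, by_eye_squared)
print(flat_plain, flat_squared)
```

For this data the plain sum is zero for both lines, so it cannot tell a good line from a poor one, whereas the squared measure is much larger for the flat line.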

• How to find the values of the intercept and the gradient
  - Trial and error in Excel
  - Using Excel SOLVER
  - Using MINITAB
• The Unexplained Variation
  - $\sum (Y - Y_p)^2$ is the measure of the 'Unexplained Variation'
• The Total Variation in Y
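The trial-and-error idea can be imitated with a coarse grid search and compared with the closed-form least-squares formulas. This is a sketch only: the data are hypothetical, the closed-form formulas are the standard textbook result rather than something shown on the slides, and it makes no claim about how Excel SOLVER or MINITAB work internally.

```python
def sum_squared_errors(xs, ys, intercept, gradient):
    """Sum of (Y - Yp)^2 for the line Yp = intercept + gradient * X."""
    return sum((y - (intercept + gradient * x)) ** 2 for x, y in zip(xs, ys))


def grid_search(xs, ys):
    """Trial and error: intercepts 0.0..6.0 and gradients 3.0..7.0, step 0.1."""
    best = None
    for i in range(0, 61):
        for g in range(30, 71):
            intercept, gradient = i / 10, g / 10
            sse = sum_squared_errors(xs, ys, intercept, gradient)
            if best is None or sse < best[0]:
                best = (sse, intercept, gradient)
    return best[1], best[2]


def least_squares_line(xs, ys):
    """Closed-form intercept and gradient that minimise the sum of squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    gradient = sxy / sxx
    intercept = mean_y - gradient * mean_x
    return intercept, gradient


# Hypothetical data, not taken from the slides.
xs = [1, 2, 3, 4]
ys = [8, 11, 18, 21]

searched = grid_search(xs, ys)
exact = least_squares_line(xs, ys)
print(searched, exact)
```

A grid search only finds the best line on its grid; the closed-form result gives the minimising intercept and gradient directly, and for this data it beats the by-eye line $Y_p = 2 + 5X$ on the sum of squares.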

• The Explained Variation

17.4 THE COEFFICIENT OF DETERMINATION $R^2$
• 'Total Variation in Y' = 'Explained Variation' + 'Unexplained Variation'
  - Example: =
• Definition of $R^2$
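These quantities can be written out as follows, assuming the standard least-squares definitions. The symbol $\bar{Y}$ (the sample mean of Y) is notation introduced here and does not appear in the transcript.

```latex
\begin{align*}
\text{Total Variation}       &= \sum (Y - \bar{Y})^2 \\
\text{Unexplained Variation} &= \sum (Y - Y_p)^2 \\
\text{Explained Variation}   &= \sum (Y_p - \bar{Y})^2 \\
R^2 &= \frac{\text{Explained Variation}}{\text{Total Variation}}
     = 1 - \frac{\text{Unexplained Variation}}{\text{Total Variation}}
\end{align*}
```

The additive split of the Total Variation holds when $Y_p$ comes from the least-squares line with an intercept; for a line fitted by eye it need not hold exactly.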

• Example
  - For Graphs 1 & 5
    • $R^2 = 1$ or 100%
  - For Graph 3
    • $R^2 = 0$
  - For Graphs 2 & 4
    • $R^2$ is between 0 and 1
• The interpretation of the ratio $R^2$
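The three cases can be checked numerically. In this minimal sketch the data sets are made up to mimic the shapes of the graphs (the plotted points are not in the transcript), $R^2$ is taken as Explained Variation divided by Total Variation for the least-squares line, and `r_squared` is an illustrative helper name.

```python
def r_squared(xs, ys):
    """R^2 = Explained Variation / Total Variation for the least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    gradient = sxy / sxx
    intercept = mean_y - gradient * mean_x
    predicted = [intercept + gradient * x for x in xs]
    explained = sum((yp - mean_y) ** 2 for yp in predicted)
    total = sum((y - mean_y) ** 2 for y in ys)
    return explained / total


# Hypothetical data, chosen only to mimic the shapes of the graphs.
perfect = r_squared([1, 2, 3, 4], [7, 12, 17, 22])       # exact line Y = 2 + 5X
unrelated = r_squared([1, 2, 3, 4, 5], [3, 1, 2, 1, 3])  # no linear trend in Y
noisy = r_squared([1, 2, 3, 4], [8, 11, 18, 21])         # upward trend with scatter

print(perfect, unrelated, noisy)
```

The perfect line gives 1, the data with no linear trend gives 0, and the scattered upward trend falls in between.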

17.5 USING SAMPLE DATA TO TRACK A CONNECTION: INTRODUCTION
• Example 1
  - To explore the relationship between the size of the engine, as measured by the cubic capacity in cubic centimetres (CC), and the petrol consumption, as measured in miles per gallon (M.P.G.).
  - Data table columns: Car, CC, M.P.G.

• Example 2: credit data
  - To investigate the connection between the response variable 'Amount Borrowed on Credit' (CREDIT) and the explanatory variable 'PAYOUT'.

17.6 THE INITIAL DATA ANALYSIS
• Example 1
  - Graph (axis label: ENGINE SIZE (CC))

- A spreadsheet model

  - Interpretation
    • By I.D.A. there is a clear link between M.P.G. and engine size: as the engine size increases, M.P.G. falls (the car travels fewer miles per gallon). This is confirmed by the value of $R^2$, which suggests that 89.54% of the changes in M.P.G. are explained by changes in engine size. The remaining 10.46% of the changes in M.P.G. are due to other variables.
    • Predict fuel consumption from engine size?

  - Using MINITAB
    • Graph > Scatterplot
    • Stat > Regression
    • Interpretation

• Example 2
  - Graph
  - Regression Analysis
  - Interpretation
    • The I.D.A. is inconclusive and further analysis is required. The regression equation is CREDIT = PAYOUT

17.7 THE FURTHER DATA ANALYSIS
• FDA four steps
  - Specify the hypotheses.
  - Defining the decision rule.
  - Examining the sample evidence.
  - Conclusions.

• Specify the hypotheses
  - $H_0$: $R^2 = 0$
    • There is no relationship between the response and the explanatory variable.
  - $H_1$: $R^2 > 0$
    • There is a relationship between the response and the explanatory variable.

• Defining the decision rule
  - If $F_{calc} > F_{Table}$ then favour $H_1$; otherwise the sample evidence is consistent with $H_0$

• Examining the sample evidence
  - MINITAB REGRESS output
• Conclusions
• Worked Example 2

  - $F_{calc}$ =
  - $F_{Table}$ = 3.85
  - Conclusions
    • Sample evidence favours $H_1$, so there is evidence of a connection between 'CREDIT' and 'PAYOUT'
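For a regression with one explanatory variable, the F statistic can be recovered from $R^2$ and the sample size n through the standard identity $F = R^2 (n - 2) / (1 - R^2)$. The sketch below applies the decision rule with the $R^2$ of 22.4% quoted for the credit data and the table value 3.85. The sample size is not given in this transcript, so `n = 100` is purely a placeholder, and the function names are illustrative.

```python
def f_statistic(r_squared, n):
    """F for simple linear regression: (R^2 / 1) / ((1 - R^2) / (n - 2))."""
    if n <= 2:
        raise ValueError("need more than two observations")
    if not 0 <= r_squared < 1:
        raise ValueError("R^2 must lie in [0, 1)")
    return r_squared * (n - 2) / (1 - r_squared)


def favours_h1(f_calc, f_table):
    """Decision rule: favour H1 when the calculated F exceeds the table value."""
    return f_calc > f_table


# R^2 = 22.4% and F Table = 3.85 come from the text.
# n = 100 is a hypothetical placeholder; the real sample size is not stated.
f_calc = f_statistic(0.224, n=100)
decision = favours_h1(f_calc, f_table=3.85)
print(round(f_calc, 2), decision)
```

The table value itself depends on n, so with real data both the calculated F and the critical value should be read from the regression output and the F tables for the actual sample size.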

17.8 DESCRIBING THE RELATIONSHIP
• The $R^2$ value can be interpreted as a measure of the quality of predictions made from the line of best fit according to the rule of thumb:

• Example 1
  - The regression equation is M.P.G. = cc
  - $R^2$ = 89.4%
  - Making predictions (table columns: CC, M.P.G.)

  - Inside the range from 1000cc to 3000cc, prediction is likely to be of good quality.
  - Outside the range from 1000cc to 3000cc, prediction is not likely to be very reliable.
    • When cc = 0, M.P.G. =
    • When cc = 8500, M.P.G. =
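The distinction between predicting inside and outside the observed range can be sketched as a prediction helper that also reports whether the input lies within 1000cc to 3000cc. The fitted coefficients are missing from this transcript, so the intercept and gradient below are hypothetical placeholders chosen only to show the effect; `predict_mpg` is an illustrative name.

```python
def predict_mpg(cc, intercept, gradient, cc_min=1000, cc_max=3000):
    """Return (predicted M.P.G., whether cc lies inside the observed range)."""
    prediction = intercept + gradient * cc
    within_range = cc_min <= cc <= cc_max
    return prediction, within_range


# Hypothetical coefficients, NOT the fitted values from the slides.
INTERCEPT = 60.0
GRADIENT = -0.015

inside = predict_mpg(2000, INTERCEPT, GRADIENT)
outside = predict_mpg(8500, INTERCEPT, GRADIENT)
print(inside, outside)
```

With these placeholder values the 8500cc prediction is a negative M.P.G., which is physically meaningless and shows why the fitted line should not be trusted far outside the data it was fitted to.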

• Example 2
  - CREDIT = PAYOUT
  - $R^2$ = 22.4%
  - When PAYOUT = £ 10, CREDIT = £ *10 = = £
  - This is within the range of values of PAYOUT in the data, so it is a valid prediction, but it is not a very reliable one since $R^2$ is only 22.4%