Chapter 11: Simple Linear Regression and Correlation

Chapter Outline
11-1 EMPIRICAL MODELS
11-2 SIMPLE LINEAR REGRESSION
11-3 PROPERTIES OF THE LEAST SQUARES ESTIMATORS
11-4 SOME COMMENTS ON USES OF REGRESSION (CD ONLY)
11-5 HYPOTHESIS TESTS IN SIMPLE LINEAR REGRESSION
    Use of t-Tests
    Analysis of Variance Approach to Test Significance of Regression
11-6 CONFIDENCE INTERVALS
    Confidence Intervals on the Slope and Intercept
    Confidence Interval on the Mean Response
11-7 PREDICTION OF NEW OBSERVATIONS
11-8 ADEQUACY OF THE REGRESSION MODEL
    Residual Analysis
    Coefficient of Determination (R²)
11-9 TRANSFORMATIONS TO A STRAIGHT LINE
11-11 CORRELATION

EMPIRICAL MODELS

Empirical: experimental, experiential (as in "they provided considerable empirical evidence to support their argument").

Many problems in engineering and science involve exploring the relationships between two or more variables. Regression analysis is a statistical technique that is very useful for these types of problems. For example, in a chemical process, suppose that the yield of the product is related to the process-operating temperature. Regression analysis can be used to build a model to predict yield at a given temperature level. This model can also be used for process optimization, such as finding the level of temperature that maximizes yield, or for process control purposes.

As an illustration, consider the data in the table, in which y is the purity of oxygen produced in a chemical distillation process and x is the percentage of hydrocarbons present in the main condenser of the distillation unit. Figure 11-1 presents a scatter diagram of these data.

Scatter diagram of the data in the table. This is just a graph on which each (xi, yi) pair is represented as a point plotted in a two-dimensional coordinate system. This scatter diagram was produced by Minitab, with an option selected that shows dot diagrams of the x and y variables along the top and right margins of the graph, respectively, making it easy to see the distributions of the individual variables (box plots or histograms could also be selected). Inspection of this scatter diagram indicates that, although no simple curve will pass exactly through all the points, there is a strong indication that the points lie scattered randomly around a straight line.

EMPIRICAL MODELS

Therefore, it is probably reasonable to assume that the mean of the random variable Y is related to x by the following straight-line relationship:

E(Y | x) = β0 + β1x

where the intercept β0 and the slope β1 of the line are called regression coefficients. While the mean of Y is a linear function of x, the actual observed value y does not fall exactly on a straight line. The appropriate way to generalize this to a probabilistic linear model is to assume that the expected value of Y is a linear function of x, but that for a fixed value of x the actual value of Y is determined by the mean value function (the linear model) plus a random error term:

Y = β0 + β1x + ε

where ε is the random error term. We will call this model the simple linear regression model, because it has only one independent variable, or regressor. Sometimes a model like this will arise from a theoretical relationship. At other times, we will have no theoretical knowledge of the relationship between x and y, and the choice of the model is based on inspection of a scatter diagram, such as we did with the oxygen purity data. We then think of the regression model as an empirical model.
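The probabilistic model can be illustrated with a short simulation. This is only a sketch: the coefficient values β0 = 2, β1 = 0.5, the error standard deviation, and the seed are invented for illustration, not taken from the chapter's data.

```python
import random

# Hypothetical regression coefficients (chosen for illustration only)
beta0, beta1 = 2.0, 0.5
sigma = 1.0  # standard deviation of the random error term

random.seed(0)  # fixed seed so the sketch is reproducible
xs = [i * 0.1 for i in range(200)]

# Each observed Y is the mean value function plus a random error:
ys = [beta0 + beta1 * xi + random.gauss(0.0, sigma) for xi in xs]

# The observations scatter randomly around the line E(Y|x) = beta0 + beta1*x,
# so the average deviation from that line is close to zero:
errors = [yi - (beta0 + beta1 * xi) for xi, yi in zip(xs, ys)]
mean_error = sum(errors) / len(errors)
```

Plotting (xs, ys) would reproduce the pattern seen in the scatter diagram: no curve passes exactly through all the points, but they cluster around a straight line.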

Derivation Suppose that the mean

Conclusion

SIMPLE LINEAR REGRESSION: Another Look

Figure 11-3 Deviations of the data from the estimated regression model.

Method Of Least Squares.

The residual ei = yi − ŷi describes the error in the fit of the model to the ith observation yi.

(Computation table: columns xi, yi, xi·yi, xi².)

Example (homework): add 2 to every value in column 1 and 2 to every value in column 2, refit the model, and compare the resulting equation with the original one.

SST = Σ(yi − ȳ)² is the total sum of squares of the response variable y.

Example
Data: a table of Y (wear) and X (viscosity) values.
a) Fit the simple linear regression model using the least squares method.
b) Find an estimator of σ².
c) Predict wear when viscosity x = 30.
d) Obtain the fitted value of y when x = 22 and calculate the corresponding residual.
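The x and y values for this example did not survive in the transcript, so the sketch below applies the least squares method to a small invented data set instead. The quantities computed (Sxx, Sxy, the slope and intercept estimates, the residuals, and σ̂² = SSE/(n − 2)) follow the standard formulas of this section.

```python
# Invented illustration data (not the wear/viscosity values from the example)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
n = len(x)

xbar = sum(x) / n
ybar = sum(y) / n

# Corrected sums of squares and cross products
Sxx = sum((xi - xbar) ** 2 for xi in x)
Sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))

# Least squares estimates of slope and intercept
b1 = Sxy / Sxx
b0 = ybar - b1 * xbar

# Fitted values, residuals, and the estimator of sigma^2
fitted = [b0 + b1 * xi for xi in x]
residuals = [yi - fi for yi, fi in zip(y, fitted)]
SSE = sum(e ** 2 for e in residuals)
sigma2_hat = SSE / (n - 2)
```

For this data set the fitted line is ŷ = 2.2 + 0.6x, and a fitted value at any x0 is obtained as b0 + b1 * x0, exactly as parts c) and d) of the example ask.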

Properties of the least squares estimators

Hypothesis test in simple linear regression

An important part of assessing the adequacy of a linear regression model is testing statistical hypotheses about the model parameters and constructing certain confidence intervals. Hypothesis testing in simple linear regression is discussed in this section, and Section 11-6 presents methods for constructing confidence intervals. To test hypotheses about the slope and intercept of the regression model, we must make the additional assumption that the error component in the model, ε, is normally distributed. Thus, the complete assumptions are that the errors are normally and independently distributed with mean zero and variance σ², abbreviated NID(0, σ²).

Hypothesis test in simple linear regression

We would reject the null hypothesis H0: β1 = β1,0 if the computed test statistic satisfies |t0| > tα/2,n−2.

Special case: H0: β1 = 0

Accepting this null hypothesis is equivalent to concluding that there is no linear relationship between x and y.

Reject the null hypothesis
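The t-test for significance of regression can be sketched on the invented data set used above (the critical value t0.025,3 = 3.182 is taken from a t table; the data are illustrative, not from the chapter).

```python
import math

# Same invented data as in the least squares sketch above
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
Sxx = sum((xi - xbar) ** 2 for xi in x)
Sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
b1 = Sxy / Sxx
b0 = ybar - b1 * xbar
SSE = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
sigma2_hat = SSE / (n - 2)

# Test statistic for H0: beta1 = 0
se_b1 = math.sqrt(sigma2_hat / Sxx)
t0 = b1 / se_b1

# Critical value t_{0.025, n-2} = t_{0.025, 3} from a t table
t_crit = 3.182
reject_h0 = abs(t0) > t_crit
```

Here t0 ≈ 2.12 < 3.182, so with only five observations this sample does not give significant evidence of a linear relationship at α = 0.05.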

Example

Another example

Regression analysis is used to investigate and model the relationship between a response variable and one or more predictors. Minitab provides least squares, nonlinear, orthogonal, partial least squares, and logistic regression procedures:

· Use least squares procedures when your response variable is continuous.
· Use nonlinear regression when you cannot adequately model the relationship with linear parameters.
· Use orthogonal regression when the response and predictor both contain measurement error.
· Use partial least squares regression when your predictors are highly correlated or outnumber your observations.
· Use logistic regression when your response variable is categorical.

Both least squares and logistic regression methods estimate parameters in the model so that the fit of the model is optimized. Least squares methods minimize the sum of squared errors to obtain parameter estimates, whereas Minitab's logistic regression obtains maximum likelihood estimates of the parameters. Partial least squares (PLS) extracts linear combinations of the predictors to minimize prediction error. See Partial Least Squares Overview for more information.

Use the table below to select a procedure:

Use... | To... | Response type | Estimation method
Regression | perform simple or multiple least squares regression | continuous | least squares
General Regression | perform simple, multiple, or polynomial least squares regression with continuous and categorical predictors, with no need to create indicator variables | continuous | least squares
Stepwise | perform stepwise, forward selection, or backward elimination to identify a useful subset of predictors | continuous | least squares
Best Subsets | identify subsets of the predictors based on the maximum R² criterion | continuous | least squares
Fitted Line Plot | perform linear and polynomial regression with a single predictor and plot a regression line through the data | continuous | least squares
Nonlinear Regression | perform simple or multiple regression using the nonlinear function of your choice | continuous | least squares
Orthogonal Regression | perform orthogonal regression with one response and one predictor | continuous | orthogonal
PLS | perform regression with ill-conditioned data | continuous | biased, non-least squares
Binary Logistic | perform logistic regression on a response with only two possible values, such as presence or absence | categorical | maximum likelihood
Ordinal Logistic | perform logistic regression on a response with three or more possible values that have a natural order, such as none, mild, or severe | categorical | maximum likelihood
Nominal Logistic | perform logistic regression on a response with three or more possible values that have no natural order, such as sweet, salty, or sour | categorical | maximum likelihood

New lecture

Analysis of Variance Approach to Test Significance of Regression

A method called the analysis of variance can be used to test for significance of regression. The procedure partitions the total variability in the response variable into meaningful components as the basis for the test. The analysis of variance identity is as follows:

Σ(yi − ȳ)² = Σ(ŷi − ȳ)² + Σ(yi − ŷi)²

Analysis of Variance Approach to Test Significance of Regression

A method called the analysis of variance can be used to test the significance of regression. The analysis of variance identity can be written as:

SST = SSR + SSE

where
SSR: regression sum of squares
SSE: error sum of squares
SST: total corrected sum of squares

Analysis of Variance Approach to Test Significance of Regression
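The ANOVA partition can be sketched on the invented data set used above. SSR is computed as β̂1·Sxy, SSE follows from the identity, and the F statistic compares the regression mean square with the error mean square.

```python
# Same invented data as in the least squares sketch above
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
Sxx = sum((xi - xbar) ** 2 for xi in x)
Sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
b1 = Sxy / Sxx

# Analysis of variance identity: SST = SSR + SSE
SST = sum((yi - ybar) ** 2 for yi in y)
SSR = b1 * Sxy          # regression sum of squares
SSE = SST - SSR         # error sum of squares

# F test for significance of regression
MSR = SSR / 1           # 1 degree of freedom for regression
MSE = SSE / (n - 2)     # n - 2 degrees of freedom for error
F0 = MSR / MSE
# Compare F0 with f_{alpha, 1, n-2}; for simple linear regression
# F0 equals the square of the t statistic for testing H0: beta1 = 0
```

For these data F0 = 4.5, which is exactly the square of the t statistic t0 ≈ 2.12 computed earlier, as the theory requires.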

Example

Find the ANOVA table.

Confidence Intervals on the Slope and Intercept

In addition to point estimates of the slope and intercept, it is possible to obtain confidence interval estimates of these parameters. The width of these confidence intervals is a measure of the overall quality of the regression line. If the error terms, εi, in the regression model are normally and independently distributed, a 100(1 − α)% confidence interval on the slope β1 is

β̂1 ± tα/2,n−2 √(σ̂²/Sxx)

and a 100(1 − α)% confidence interval on the intercept β0 is

β̂0 ± tα/2,n−2 √(σ̂² [1/n + x̄²/Sxx])
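These interval formulas can be sketched on the invented data set used above (again with t0.025,3 = 3.182 from a t table).

```python
import math

# Same invented data as in the least squares sketch above
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
Sxx = sum((xi - xbar) ** 2 for xi in x)
Sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
b1 = Sxy / Sxx
b0 = ybar - b1 * xbar
SSE = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
sigma2_hat = SSE / (n - 2)

t_crit = 3.182  # t_{0.025, n-2} = t_{0.025, 3} from a t table

# 95% confidence interval on the slope beta1
half_slope = t_crit * math.sqrt(sigma2_hat / Sxx)
ci_slope = (b1 - half_slope, b1 + half_slope)

# 95% confidence interval on the intercept beta0
half_int = t_crit * math.sqrt(sigma2_hat * (1 / n + xbar ** 2 / Sxx))
ci_intercept = (b0 - half_int, b0 + half_int)
```

The slope interval for these data, roughly (−0.30, 1.50), contains zero, which agrees with the earlier t-test that failed to reject H0: β1 = 0.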

Example

Confidence Intervals on the Mean Response

11-7 Prediction Of New Observations

Prediction of New Observations
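On the same invented data set, the sketch below contrasts a confidence interval on the mean response at x0 with a prediction interval on a single new observation at x0; the prediction interval is wider because it also accounts for the error in the future observation.

```python
import math

# Same invented data as in the least squares sketch above
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
Sxx = sum((xi - xbar) ** 2 for xi in x)
Sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
b1 = Sxy / Sxx
b0 = ybar - b1 * xbar
SSE = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
sigma2_hat = SSE / (n - 2)
t_crit = 3.182  # t_{0.025, 3} from a t table

x0 = 3.0
mu_hat = b0 + b1 * x0  # point estimate of the mean response at x0

# 95% confidence interval half-width on the mean response at x0
ci_half = t_crit * math.sqrt(sigma2_hat * (1 / n + (x0 - xbar) ** 2 / Sxx))

# 95% prediction interval half-width on a new observation at x0
# (the extra "1 +" term accounts for the error in the new observation)
pi_half = t_crit * math.sqrt(sigma2_hat * (1 + 1 / n + (x0 - xbar) ** 2 / Sxx))
```

Both intervals are narrowest at x0 = x̄ and widen as x0 moves away from the center of the data.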

11-8 ADEQUACY OF THE REGRESSION MODEL

Fitting a regression model requires several assumptions.
1. Estimation of the model parameters requires the assumption that the errors are uncorrelated random variables with mean zero and constant variance.
2. Tests of hypotheses and interval estimation require that the errors be normally distributed.
3. In addition, we assume that the order of the model is correct; that is, if we fit a simple linear regression model, we are assuming that the phenomenon actually behaves in a linear or first-order manner.

The analyst should always consider the validity of these assumptions to be doubtful and conduct analyses to examine the adequacy of the model that has been tentatively entertained. In this section we discuss methods useful in this respect.

Residual Analysis

The residuals from a regression model are ei = yi − ŷi, where yi is an actual observation and ŷi is the corresponding fitted value from the regression model. Analysis of the residuals is frequently helpful in checking the assumption that the errors are approximately normally distributed with constant variance, and in determining whether additional terms in the model would be useful. As an approximate check of normality, the experimenter can construct a frequency histogram of the residuals or a normal probability plot of residuals. Many computer programs will produce a normal probability plot of residuals, and since the sample sizes in regression are often too small for a histogram to be meaningful, the normal probability plotting method is preferred.

Patterns of residual plots:
a) Satisfactory.
b) Funnel: variance increases with x.
c) Double bow: inequality of variance.
d) Nonlinear: model inadequacy.
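Before plotting the patterns above, the residuals themselves are computed and often standardized; standardized residuals larger than about 2 in absolute value are examined as potential outliers. A sketch on the invented data set used above:

```python
import math

# Same invented data as in the least squares sketch above
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
Sxx = sum((xi - xbar) ** 2 for xi in x)
Sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
b1 = Sxy / Sxx
b0 = ybar - b1 * xbar

# Residuals e_i = y_i - yhat_i; they always sum to zero for a
# least squares fit with an intercept
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
sigma2_hat = sum(e ** 2 for e in residuals) / (n - 2)

# Standardized residuals; |d_i| > 2 would flag a potential outlier
std_res = [e / math.sqrt(sigma2_hat) for e in residuals]
```

Plotting `residuals` against the fitted values (or against x) and looking for the funnel, double-bow, or curved patterns listed above is the usual graphical adequacy check.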

example

For the oxygen purity regression model, R² = 0.877; the model accounts for 87.7% of the variability in the data.

Coefficient of Determination (R²)
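R² = SSR/SST = 1 − SSE/SST is the proportion of the variability in y explained by the regression. A sketch on the invented data set used above:

```python
# Same invented data as in the least squares sketch above
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
Sxx = sum((xi - xbar) ** 2 for xi in x)
Sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
b1 = Sxy / Sxx

SST = sum((yi - ybar) ** 2 for yi in y)  # total sum of squares
SSR = b1 * Sxy                           # regression sum of squares
R2 = SSR / SST  # proportion of variability explained by the model
```

For these data R² = 0.6: the fitted line accounts for 60% of the variability in y, well below the 87.7% of the oxygen purity model.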

11-9 TRANSFORMATIONS TO A STRAIGHT LINE We occasionally find that the straight-line regression model is inappropriate because the true regression function is nonlinear. Sometimes nonlinearity is visually determined from the scatter diagram, and sometimes, because of prior experience or underlying theory, we know in advance that the model is nonlinear. Occasionally, a scatter diagram will exhibit an apparent nonlinear relationship between Y and x. In some of these situations, a nonlinear function can be expressed as a straight line by using a suitable transformation. Such nonlinear models are called intrinsically linear.
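As a sketch of an intrinsically linear model: the exponential function Y = β0·e^(β1·x)·ε becomes a straight line after taking logarithms, ln Y = ln β0 + β1·x + ln ε. The data below are invented (noise-free, generated from β0 = 2 and β1 = 0.5) purely to illustrate the transformation.

```python
import math

# Invented, noise-free data generated from y = 2 * exp(0.5 * x)
x = [1.0, 2.0, 3.0, 4.0]
y = [2 * math.exp(0.5 * xi) for xi in x]

# Transform to a straight line: z = ln y = ln(beta0) + beta1 * x
z = [math.log(yi) for yi in y]

# Ordinary least squares on the transformed data
n = len(x)
xbar, zbar = sum(x) / n, sum(z) / n
Sxx = sum((xi - xbar) ** 2 for xi in x)
Sxz = sum((xi - xbar) * (zi - zbar) for xi, zi in zip(x, z))
b1 = Sxz / Sxx                   # estimates beta1 = 0.5
b0 = math.exp(zbar - b1 * xbar)  # estimates beta0 = 2 (undo the log)
```

Because the data here are noise-free, the transformed fit recovers the original parameters exactly; with real data the fit minimizes squared error on the log scale, not the original scale, which is part of the price of the transformation.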

11-11 CORRELATION