Chapter 11: Simple Linear Regression

1 Chapter 11: Simple Linear Regression

2 McClave: Statistics, 11th ed. Chapter 11: Simple Linear Regression
Where We’re Going
Introduce the straight-line linear regression model as a means of relating one quantitative variable to another quantitative variable.
Introduce the correlation coefficient as a means of relating one quantitative variable to another quantitative variable.
Assess how well the simple linear regression model fits the sample data.
Use the simple linear regression model to predict the value of one variable given the value of another variable.

3 11.1: Probabilistic Models
There may be a deterministic reality connecting two variables, y and x. But we may not know exactly what that reality is, or there may be an imprecise or random connection between the variables. The unknown or unknowable influence is referred to as the random error. So our probabilistic models refer to a specific connection between variables, as well as influences we can’t specify exactly in each case:
y = f(x) + random error

4 11.1: Probabilistic Models
The relationship between goals and attacks in soccer seems at first glance to be deterministic …

5 11.1: Probabilistic Models
But if you consider how many attackers actually score and how many shots the keeper saves, the rigid model becomes more variable.

6 11.1: Probabilistic Models
General Form of Probabilistic Models
y = Deterministic part + Random error
where y is the variable of interest, and the mean value of the random error is assumed to be 0.

8 11.1: Probabilistic Models
The goal of linear regression analysis is to find the straight line that comes closest to all of the points in the scatter plot simultaneously.

9 11.1: Probabilistic Models
A Linear Probabilistic Model
Y = β0 + β1X + random error
where
Y = dependent variable
X = independent variable
β0 + β1X = deterministic component
β0 = y-intercept
β1 = slope of the line

10 11.1: Probabilistic Models
β0, the y-intercept, and β1, the slope of the line, are population parameters, and are unknown. Regression analysis is designed to estimate these parameters.
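The linear probabilistic model can be illustrated by simulating data from it. The following is a minimal sketch, not part of the original slides: the parameter values, the normally distributed error, and the function name `simulate` are all assumptions chosen for illustration. The only property taken from the slides is that the random error has mean 0.

```python
import random

def simulate(beta0, beta1, xs, sigma, seed=0):
    """Generate (x, deterministic part, random error, y) rows from
    y = beta0 + beta1 * x + random error.

    Assumption: the random error is normal with mean 0 and standard
    deviation sigma. The slides only require the mean to be 0.
    """
    rng = random.Random(seed)
    rows = []
    for x in xs:
        deterministic = beta0 + beta1 * x   # the straight-line component
        error = rng.gauss(0.0, sigma)       # random error with mean 0
        rows.append((x, deterministic, error, deterministic + error))
    return rows

# Hypothetical population parameters. In practice these are unknown
# and must be estimated from sample data.
rows = simulate(beta0=2.0, beta1=0.5, xs=[1, 2, 3, 4, 5], sigma=1.0)
for x, det, err, y in rows:
    print(f"x={x}  deterministic={det:.2f}  error={err:+.2f}  y={y:.2f}")
```

Setting `sigma=0` removes the random error, and every simulated y then falls exactly on the line.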

11 11.2: Fitting the Model: The Least Squares Approach
Values on the line are the predicted values of total offerings given the average offering. The distances between the scattered dots and the line are the errors of prediction.

12 11.2: Fitting the Model: The Least Squares Approach
Values on the line are the predicted values of total offerings given the average offering. The line is estimated to minimize the sum of the squared errors of prediction, and the method of finding this line is called the method of least squares. The distances between the scattered dots and the line are the errors of prediction.

13 11.2: Fitting the Model: The Least Squares Approach
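Estimates: \(\hat{\beta}_1 = SS_{xy} / SS_{xx}\) and \(\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}\), where \(SS_{xy} = \sum (x_i - \bar{x})(y_i - \bar{y})\) and \(SS_{xx} = \sum (x_i - \bar{x})^2\)
Deviation or prediction error: \(y_i - \hat{y}_i\), where \(\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i\)
SSE: \(\sum (y_i - \hat{y}_i)^2\)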

14 11.2: Fitting the Model: The Least Squares Approach
The least squares line is the line that has the following two properties:
The sum of the errors (SE) equals 0.
The sum of squared errors (SSE) is smaller than that for any other straight-line model.
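The two properties of the least squares line can be checked numerically. The sketch below assumes the standard least squares formulas (slope = SS_xy / SS_xx, intercept = ȳ − slope · x̄). The five data points are made up for illustration and do not come from the slides. The helper names are likewise hypothetical.

```python
def least_squares(xs, ys):
    """Return (b0, b1), the least squares intercept and slope."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    ss_xx = sum((x - x_bar) ** 2 for x in xs)
    b1 = ss_xy / ss_xx
    b0 = y_bar - b1 * x_bar
    return b0, b1

def errors(xs, ys, b0, b1):
    """Errors of prediction: observed y minus the value on the line."""
    return [y - (b0 + b1 * x) for x, y in zip(xs, ys)]

def sse(xs, ys, b0, b1):
    """Sum of squared errors for the line y = b0 + b1 * x."""
    return sum(e ** 2 for e in errors(xs, ys, b0, b1))

# Illustrative data only (an assumption, not taken from the slides).
xs = [1, 2, 3, 4, 5]
ys = [1, 1, 2, 2, 4]

b0, b1 = least_squares(xs, ys)
print(f"intercept = {b0:.2f}, slope = {b1:.2f}")
print(f"sum of errors = {sum(errors(xs, ys, b0, b1)):.6f}")
print(f"SSE of least squares line = {sse(xs, ys, b0, b1):.2f}")
# Any other line, here one with the same slope but intercept 0, has a larger SSE.
print(f"SSE of alternative line   = {sse(xs, ys, 0.0, b1):.2f}")
```

For these points the fitted line is ŷ = −0.1 + 0.7x, the errors sum to 0 up to rounding, and the SSE is 1.10, compared with 1.15 for the alternative line.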

16 11.5: The Correlation Coefficient
The correlation coefficient, r, is a measure of the strength of the linear relationship between two variables. It is computed as follows:
\(r = SS_{xy} / \sqrt{SS_{xx} SS_{yy}}\), where \(SS_{xy} = \sum (x_i - \bar{x})(y_i - \bar{y})\), \(SS_{xx} = \sum (x_i - \bar{x})^2\), and \(SS_{yy} = \sum (y_i - \bar{y})^2\)
An r value close to zero suggests there may not be a linear relationship between the variables.

17 11.5: The Correlation Coefficient
[Figure: three scatter plots of y against x. Positive linear relationship: r → +1. No linear relationship: r → 0. Negative linear relationship: r → -1.]
Values of r equal to +1 or -1 require each point in the scatter plot to lie on a single straight line.
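The behavior of r at its extremes can be sketched in a few lines. The code assumes the usual definition r = SS_xy / sqrt(SS_xx · SS_yy). All three datasets are made up for illustration, and the function name `correlation` is hypothetical.

```python
import math

def correlation(xs, ys):
    """Correlation coefficient r = SS_xy / sqrt(SS_xx * SS_yy)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    ss_xx = sum((x - x_bar) ** 2 for x in xs)
    ss_yy = sum((y - y_bar) ** 2 for y in ys)
    return ss_xy / math.sqrt(ss_xx * ss_yy)

xs = [1, 2, 3, 4, 5]

# Illustrative data only (assumptions, not taken from the slides).
r_scatter = correlation(xs, [1, 1, 2, 2, 4])    # points scattered around a rising line
r_up = correlation(xs, [3, 5, 7, 9, 11])        # every point on y = 1 + 2x
r_down = correlation(xs, [11, 9, 7, 5, 3])      # every point on y = 13 - 2x

print(f"scattered points:       r = {r_scatter:.3f}")
print(f"points on rising line:  r = {r_up:.3f}")
print(f"points on falling line: r = {r_down:.3f}")
```

Only the two datasets whose points all lie on a single straight line reach +1 or -1. The scattered data give a value strictly between them.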

18 11.5: The Correlation Coefficient
High |r|: x provides important information about y. Predictions are more accurate based on the model.
Low |r|: Knowing values of x does not substantially improve predictions on y. There may be no relationship between x and y, or it may be more subtle than a linear relationship.
Predict values of y with the mean of y if no other information is available.
Predict values of y|x based on a hypothesized linear relationship.
Evaluate the power of x to predict values of y with the correlation coefficient.
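One way to see what x adds is to compare the squared prediction errors from two predictors: the mean of y alone, and a fitted straight line. This is a minimal sketch under assumptions. The data are made up for illustration, and the line is fitted with the standard least squares formulas rather than anything specific to the slides.

```python
# Illustrative data only (an assumption, not taken from the slides).
xs = [1, 2, 3, 4, 5]
ys = [1, 1, 2, 2, 4]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

# Least squares slope and intercept.
b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
    (x - x_bar) ** 2 for x in xs
)
b0 = y_bar - b1 * x_bar

# Predictor 1: ignore x and always predict the mean of y.
sse_mean = sum((y - y_bar) ** 2 for y in ys)

# Predictor 2: use x through the fitted line.
sse_line = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))

print(f"squared error predicting with the mean of y: {sse_mean:.2f}")
print(f"squared error predicting with the line:      {sse_line:.2f}")
```

Here the squared error drops from 6.00 to 1.10 once x is used, which is consistent with the fairly high |r| (about 0.90) for these points.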

19 11.7: A Complete Example
A fire insurance company wants to relate the amount of fire damage in major residential fires in a city to the distance to the nearest fire station. A sample of 15 recent fires in the city is selected. The amount of damage (y, measured in thousands of dollars) and the distance from the fire to the nearest fire station (x, measured in miles) are recorded for each fire.

20 11.7: A Complete Example
How does the proximity of a residential fire to the fire station (x) affect the damages (y) from the fire?
y = f(x)
y = β0 + β1x + error
The data (found in Table 11.7 pg 610) produce the following straight line.

22 11.7: A Complete Example
The data produce the following estimates (in thousands of dollars): \(\hat{\beta}_0 = 10.28\) and \(\hat{\beta}_1 = 4.92\)
So the estimated damages equal $10,280 + $4,920 for each mile from the nearest fire station, or \(\hat{y} = 10.28 + 4.92x\)
The coefficient of correlation, r, is r = 0.96

23 11.7: A Complete Example
Suppose we have a new residential fire case whose distance from the nearest fire station is 3.5 miles. We can estimate the damage with the model estimates:
\(\hat{y} = 10.28 + 4.92(3.5) = 27.5\)
The damage due to a residential fire 3.5 miles from the nearest station will be about $27,500.
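The prediction can be reproduced with a short calculation. This sketch uses the estimates stated earlier in the example ($10,280 plus $4,920 per mile, expressed in thousands of dollars). The constant and function names are hypothetical.

```python
# Estimates from the example, in thousands of dollars.
BETA0_HAT = 10.28   # estimated intercept
BETA1_HAT = 4.92    # estimated slope: added damage per mile from the station

def predict_damage(miles):
    """Estimated fire damage (thousands of dollars) at a given distance."""
    return BETA0_HAT + BETA1_HAT * miles

estimate = predict_damage(3.5)
print(f"Estimated damage at 3.5 miles: ${estimate * 1000:,.0f}")
```

The sample covers a limited range of distances, so predictions for distances far outside that range should be treated with caution.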

