ROHANA BINTI ABDUL HAMID, INSTITUTE FOR ENGINEERING MATHEMATICS (IMK), UNIVERSITI MALAYSIA PERLIS.


1 ROHANA BINTI ABDUL HAMID, INSTITUTE FOR ENGINEERING MATHEMATICS (IMK), UNIVERSITI MALAYSIA PERLIS

2 CHAPTER 5 INTRODUCTION TO LINEAR REGRESSION

3 CHAPTER OUTLINE
• INTRODUCTION
• SCATTER PLOT
• LINEAR REGRESSION MODEL
• LEAST SQUARES METHOD
• COEFFICIENT OF DETERMINATION
• CORRELATION
• TEST OF SIGNIFICANCE
• ANALYSIS OF VARIANCE (ANOVA)

4 5.1 INTRODUCTION TO REGRESSION
• Regression is a statistical procedure for establishing the relationship between two or more variables.
• This is done by fitting a linear equation to the observed data.
• The researcher then uses the regression line to see the trend and to predict values.
• There are two types of relationship:
  • Simple (2 variables)
  • Multiple (more than 2 variables)

5 THE SIMPLE LINEAR REGRESSION MODEL
• The simple linear regression model is an equation that describes a dependent variable (Y) in terms of an independent variable (X) plus a random error ε:
  $Y = \beta_0 + \beta_1 X + \varepsilon$
  where $\beta_0$ = intercept of the line with the Y-axis, $\beta_1$ = slope of the line, and $\varepsilon$ = random error.
• The random error ε is the difference between a data point and the deterministic value $\beta_0 + \beta_1 X$.
• The regression line is estimated from the collected data by fitting a straight line to the data set and obtaining the equation of that line, $\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$.
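To make the role of the random error concrete, the following Python sketch splits a single observation into its deterministic part and its error. The parameter values and the data point are made up for illustration and do not come from the slides.

```python
# Hypothetical parameters and data point (illustration only, not from the slides).
beta0 = 1.0  # intercept of the line with the Y-axis
beta1 = 2.0  # slope of the line


def deterministic_value(x):
    """Deterministic part of the model: beta0 + beta1 * x."""
    return beta0 + beta1 * x


x_obs, y_obs = 3.0, 7.5
# The random error is the gap between the observed y and the deterministic value.
epsilon = y_obs - deterministic_value(x_obs)
print(f"deterministic value = {deterministic_value(x_obs)}, error = {epsilon}")
```

In practice the true parameters are unknown, which is why the line has to be estimated from data.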

6 Example 5.1: Determine the independent variable, X, and the dependent variable, Y.
1) A nutritionist studying weight loss programs might want to find out whether reducing carbohydrate intake can help a person lose weight.
   a) X is the carbohydrate intake (independent variable).
   b) Y is the weight (dependent variable).
2) An entrepreneur might want to know whether increasing the cost of packaging his new product will affect the sales volume.
   a) X is the cost (independent variable).
   b) Y is the sales volume (dependent variable).

7 5.2 SCATTER PLOT
• A scatter plot is a graph of ordered pairs (x, y).
• Its purpose is to describe visually the nature of the relationship between the independent variable, X, and the dependent variable, Y.
• The independent variable, x, is plotted on the horizontal axis and the dependent variable, y, is plotted on the vertical axis.

8 SCATTER DIAGRAM: Positive linear relationship. [Figure: E(y) against x; regression line with intercept $\beta_0$; slope $\beta_1$ is positive.]

9 SCATTER DIAGRAM: Negative linear relationship. [Figure: E(y) against x; regression line with intercept $\beta_0$; slope $\beta_1$ is negative.]

10 SCATTER DIAGRAM: No relationship. [Figure: E(y) against x; regression line with intercept $\beta_0$; slope $\beta_1$ is 0.]

11 5.3 LINEAR REGRESSION MODEL
• A linear regression line can be developed from a freehand plot of the data.
Example 5.2: The given table contains values for two variables, X and Y. Plot the given data and draw a freehand estimate of the regression line.

12

13 5.4 LEAST SQUARES METHOD
• The least squares method is commonly used to determine the values of $\hat{\beta}_0$ and $\hat{\beta}_1$ that ensure the best fit of the estimated regression line to the sample data points.
• The straight line fitted to the data set is the line $\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$.

14 LEAST SQUARES METHOD
• y-intercept for the estimated regression equation:
  $\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$
  where $\bar{x}$ is the mean of x and $\bar{y}$ is the mean of y.

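15 LEAST SQUARES METHOD
• Slope for the estimated regression equation:
  $\hat{\beta}_1 = \dfrac{S_{xy}}{S_{xx}}$
  where $S_{xy} = \sum xy - \dfrac{(\sum x)(\sum y)}{n}$ and $S_{xx} = \sum x^2 - \dfrac{(\sum x)^2}{n}$.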

16 LEAST SQUARES METHOD

17 LEAST SQUARES METHOD
• Given any value of $x$, the predicted value of the dependent variable, $\hat{y}$, can be found by substituting $x$ into the equation $\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$.
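The least squares steps (slope, then intercept, then prediction) can be sketched in a few lines of Python. This is a minimal illustration that assumes the usual shortcut sums for S_xy and S_xx; the function names and the tiny data set are hypothetical, chosen so that the points lie exactly on y = 1 + 2x.

```python
def least_squares_fit(xs, ys):
    """Return (intercept, slope) of the least squares line through the points."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    s_xy = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n
    s_xx = sum(x * x for x in xs) - sum_x ** 2 / n
    slope = s_xy / s_xx
    intercept = sum_y / n - slope * sum_x / n  # y-bar minus slope times x-bar
    return intercept, slope


def predict(intercept, slope, x):
    """Substitute x into the fitted equation to get the predicted y."""
    return intercept + slope * x


# Hypothetical data lying exactly on y = 1 + 2x (not from the slides).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
b0, b1 = least_squares_fit(xs, ys)
y_hat = predict(b0, b1, 4.0)
print(b0, b1, y_hat)
```

Because the points are perfectly linear, the fit recovers the intercept 1 and slope 2, and the prediction at x = 4 is 9.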

18 Example 5.2: Suppose we take a sample of seven households from a low to moderate income neighborhood and collect information on their incomes and food expenditures for the past month. The information obtained (in hundreds of ringgit Malaysia) is given below. Find the least squares regression line of food expenditure (Y) on income (X).

Income | Food expenditure
35 | 9
49 | 15
21 | 7
39 | 11
15 | 5
28 | 8
25 | 9

19 Solution: To find the least squares regression line, you must know the following seven pieces of information:

20

21 The estimated regression model
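The summary quantities and the fitted line for the income and food expenditure data can be reproduced with a short Python sketch. Which seven quantities the original slide listed is not recoverable from the transcript, so the sums below are simply the ones the least squares formulas need; the variable names are illustrative.

```python
# Data from the household example (hundreds of ringgit Malaysia).
income = [35, 49, 21, 39, 15, 28, 25]  # x
food = [9, 15, 7, 11, 5, 8, 9]         # y

n = len(income)
sum_x = sum(income)
sum_y = sum(food)
sum_xy = sum(x * y for x, y in zip(income, food))
sum_x2 = sum(x * x for x in income)
sum_y2 = sum(y * y for y in food)
x_bar = sum_x / n
y_bar = sum_y / n

s_xy = sum_xy - sum_x * sum_y / n
s_xx = sum_x2 - sum_x ** 2 / n
slope = s_xy / s_xx
intercept = y_bar - slope * x_bar

print(f"n={n}, sum_x={sum_x}, sum_y={sum_y}, sum_xy={sum_xy}, "
      f"sum_x2={sum_x2}, sum_y2={sum_y2}")
print(f"y_hat = {intercept:.4f} + {slope:.4f} x")
```

The slope rounds to 0.2642, the value used in the ANOVA table for this example. The intercept printed here is computed from the unrounded slope, so it can differ in the last decimal places from a hand calculation that rounds the slope first.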

22 5.5 INFERENCES OF ESTIMATED PARAMETERS
• Simple linear regression involves two estimated parameters, $\beta_0$ and $\beta_1$.
• A hypothesis test is used to determine whether the independent variable is significant in explaining the dependent variable.
• The analysis of variance (ANOVA) method is one approach to testing the significance of the regression.

23 ANOVA table

Source of variation | Sum of squares | Degrees of freedom | Mean square | f test
Regression | SSR = $\hat{\beta}_1 S_{xy}$ | 1 | MSR = SSR/1 | f = MSR/MSE
Error | SSE = SST - SSR | n-2 | MSE = SSE/(n-2) |
Total | SST = $S_{yy}$ | n-1 | |

24 ANOVA table for Example 5.2

Source of variation | Sum of squares | Degrees of freedom | Mean square | f test
Regression | SSR = 0.2642(211.7143) = 55.9349 | 1 | MSR = 55.9349 | f = MSR/MSE = 55.9349/0.9844 = 56.8213
Error | SSE = 60.8571 - 55.9349 = 4.9222 | 7-2 = 5 | MSE = 4.9222/5 = 0.9844 |
Total | SST = 60.8571 | 7-1 = 6 | |
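The entries of this ANOVA table can be recomputed directly from the income and food expenditure data. The Python sketch below is one way to do it, assuming SSR is the slope times S_xy and SST is S_yy, as the table indicates. It keeps the slope unrounded, so its results (SSR about 55.93, f about 56.74) differ slightly from the table, which uses the slope rounded to 0.2642.

```python
# Data from the household example (hundreds of ringgit Malaysia).
income = [35, 49, 21, 39, 15, 28, 25]  # x
food = [9, 15, 7, 11, 5, 8, 9]         # y
n = len(income)

s_xy = sum(x * y for x, y in zip(income, food)) - sum(income) * sum(food) / n
s_xx = sum(x * x for x in income) - sum(income) ** 2 / n
s_yy = sum(y * y for y in food) - sum(food) ** 2 / n
slope = s_xy / s_xx

sst = s_yy             # total sum of squares, n - 1 degrees of freedom
ssr = slope * s_xy     # regression sum of squares, 1 degree of freedom
sse = sst - ssr        # error sum of squares, n - 2 degrees of freedom
msr = ssr / 1
mse = sse / (n - 2)
f_stat = msr / mse

print(f"SSR={ssr:.4f}  SSE={sse:.4f}  SST={sst:.4f}")
print(f"MSR={msr:.4f}  MSE={mse:.4f}  f={f_stat:.4f}")
```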

25 5.6 TEST OF SIGNIFICANCE
• To determine whether X provides information for predicting Y, we proceed with a hypothesis test.
• Two tests are commonly used:
  • t test
  • F test

26 t Test
1. Determine the hypotheses
2. Determine the rejection region
3. Compute the test statistic
4. Conclusion

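27 t Test
1. Determine the hypotheses
   $H_0: \beta_1 = 0$ (NO RELATIONSHIP)
   $H_1: \beta_1 \neq 0$ (THERE IS A RELATIONSHIP)
2. Determine the rejection region
   We reject $H_0$ if $|t| > t_{\alpha/2,\,n-2}$.
3. Compute the test statistic
   $t = \dfrac{\hat{\beta}_1}{\sqrt{MSE / S_{xx}}}$
4. Conclusion
   If we reject $H_0$, there is a significant relationship between variables X and Y.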
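As an illustration of the four steps, the Python sketch below applies a two-sided t test for the slope to the income and food expenditure data from Example 5.2. It assumes the standard test statistic (slope divided by its standard error, with n - 2 degrees of freedom) and a significance level of 0.05, which the slides do not state for this example. The critical value is hard-coded from a standard t table rather than computed.

```python
import math

# Data from the household example (hundreds of ringgit Malaysia).
income = [35, 49, 21, 39, 15, 28, 25]  # x
food = [9, 15, 7, 11, 5, 8, 9]         # y
n = len(income)

s_xy = sum(x * y for x, y in zip(income, food)) - sum(income) * sum(food) / n
s_xx = sum(x * x for x in income) - sum(income) ** 2 / n
s_yy = sum(y * y for y in food) - sum(food) ** 2 / n

slope = s_xy / s_xx
mse = (s_yy - slope * s_xy) / (n - 2)   # SSE / (n - 2)
t_stat = slope / math.sqrt(mse / s_xx)  # slope divided by its standard error

# Assumption: alpha = 0.05, so the table value is t_{0.025, 5} = 2.571.
T_CRITICAL = 2.571
reject_h0 = abs(t_stat) > T_CRITICAL

print(f"t = {t_stat:.3f}, reject H0: {reject_h0}")
```

Under these assumptions the statistic is far beyond the critical value, so the test indicates a significant linear relationship between income and food expenditure.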

28 F Test
1. Determine the hypotheses
2. Determine the rejection region
3. Compute the test statistic
4. Conclusion

29 F Test
1. Determine the hypotheses
   $H_0: \beta_1 = 0$ (NO RELATIONSHIP)
   $H_1: \beta_1 \neq 0$ (THERE IS A RELATIONSHIP)
2. Determine the rejection region
   We reject $H_0$ if $f > f_{\alpha,\,1,\,n-2}$.
3. Compute the test statistic
   $f = \dfrac{MSR}{MSE}$
4. Conclusion
   If we reject $H_0$, there is a significant relationship between variables X and Y.
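The decision step of the F test can be sketched in Python using the MSR and MSE values reported in the ANOVA table for Example 5.2 (slide 24). The significance level of 0.05 is an assumption, since the slides do not give one for this example, and the critical value is taken from a standard F table rather than computed.

```python
# Values reported in the ANOVA table for Example 5.2.
msr = 55.9349
mse = 0.9844
f_stat = msr / mse

# Assumption: alpha = 0.05 with (1, n - 2) = (1, 5) degrees of freedom,
# for which a standard F table gives about 6.61.
F_CRITICAL = 6.61
reject_h0 = f_stat > F_CRITICAL

print(f"f = {f_stat:.4f}, reject H0: {reject_h0}")
```

In simple linear regression the F statistic equals the square of the t statistic for the slope, so the two tests lead to the same conclusion.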

30 5.7 CORRELATION (r)
• Correlation measures the strength of a linear relationship between two variables.
• It is also known as Pearson's product moment coefficient of correlation.
• The symbol for the sample coefficient of correlation is r.
• Formula: $r = \dfrac{S_{xy}}{\sqrt{S_{xx} S_{yy}}}$

31 Values of r
• close to 1: strong positive linear relationship between x and y.
• close to -1: strong negative linear relationship between x and y.
• close to 0: little or no linear relationship between x and y.

32 5.8 COEFFICIENT OF DETERMINATION ($r^2$)
• The coefficient of determination measures the proportion of the variation in the dependent variable (Y) that is explained by the regression line and the independent variable (X).
• If r = 0.90, then $r^2$ = 0.81. This means that 81% of the variation in the dependent variable (Y) is accounted for by the variation in the independent variable (X).
• The rest of the variation, 0.19 or 19%, is unexplained; this quantity is called the coefficient of nondetermination.
• The formula for the coefficient of nondetermination is $1 - r^2$.
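For a concrete case, the Python sketch below computes r, the coefficient of determination, and the coefficient of nondetermination for the income and food expenditure data from Example 5.2. It assumes the usual Pearson formula, r = S_xy / sqrt(S_xx S_yy); the variable names are illustrative.

```python
import math

# Data from the household example (hundreds of ringgit Malaysia).
income = [35, 49, 21, 39, 15, 28, 25]  # x
food = [9, 15, 7, 11, 5, 8, 9]         # y
n = len(income)

s_xy = sum(x * y for x, y in zip(income, food)) - sum(income) * sum(food) / n
s_xx = sum(x * x for x in income) - sum(income) ** 2 / n
s_yy = sum(y * y for y in food) - sum(food) ** 2 / n

r = s_xy / math.sqrt(s_xx * s_yy)  # sample coefficient of correlation
r_squared = r ** 2                 # coefficient of determination
unexplained = 1 - r_squared        # coefficient of nondetermination

print(f"r = {r:.4f}, r^2 = {r_squared:.4f}, 1 - r^2 = {unexplained:.4f}")
```

Here r is about 0.96, so roughly 92% of the variation in food expenditure is accounted for by income and about 8% is left unexplained.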

33 Exercise 1
The following table gives the midterm scores, X, and final exam scores, Y, of seven students in a statistics class.

X | 79 | 95 | 81 | 66 | 87 | 94 | 59
Y | 85 | 97 | 78 | 76 | 94 | 84 | 67

1. Find the least squares regression line.
2. Calculate r and $r^2$, and explain the values.
3. Predict the final exam score of a student who got 60 marks on the midterm test.
4. Construct the ANOVA table. Do the data support the existence of a linear relationship between midterm and final exam scores? Test using α = 0.05.

34 Exercise 2
A research engineer is investigating the use of a windmill to generate electricity and has collected data on the DC output from this windmill and the corresponding wind velocity, as shown in the table below.

Observation number, i | Wind velocity (mph), x_i | DC output, y_i
1 | 5.00 | 1.582
2 | 6.00 | 1.822
3 | 3.40 | 1.057
4 | 2.70 | 0.500
5 | 10.00 | 2.236
6 | 9.70 | 2.386
7 | 9.55 | 2.294
8 | 3.05 | 0.558
9 | 8.15 | 2.166
10 | 6.20 | 1.866
11 | 2.90 | 0.653
12 | 6.35 | 1.930

35 (i) Find the least squares regression line.
(ii) Calculate r and $r^2$, and explain the results.
(iii) Do the data support the existence of a linear relationship between the wind velocity and the DC output? Test using a t-test at α = 0.01.

