
Linear Regression (Cont'd). Outline - Multiple Regression - Checking the Regression: Coefficient of Determination, Standard Error, Confidence Interval, Hypothesis Tests.




1 Linear Regression (Cont'd)

2 Outline - Multiple Regression - Checking the Regression: Coefficient of Determination, Standard Error, Confidence Interval, Hypothesis Tests (t-test, F-test) - Classical Assumption Tests: No Multicollinearity, Homoscedasticity, No Autocorrelation

3 Multiple Regression Is consumption affected only by income? There are other variables that could also be related to consumption. The simple regression should then be expanded by introducing new independent variables (I.V.s) into the model --> multiple regression.

4 The MODEL Yi = β0 + β1X1i + β2X2i + β3X3i + ... + βkXki + ui, i = 1, 2, 3, ..., N (observations). Example: Yi = β0 + β1X1 + β2X2 + β3X3 + ui, where Y: Consumption; X1: Income; X2: Number of Dependents; X3: Age.

5 Checking the Regression 1. Coefficient of Determination 2. Standard Error of Coefficients 3. Confidence Interval 4. Hypothesis Tests: t-test, F-test

6 2. Standard Error The principle of OLS --> minimizing error. Therefore the accuracy of the estimators is measured by their standard errors (s.e.). For the slope of a simple regression, s.e.(b1) = sqrt( σ̂² / Σ(Xi − X̄)² ), with σ̂² = Σûi² / (n − 2).

7 As ûi = Yi − Ŷi, then Σûi² = Σ(Yi − Ŷi)². The minimal standard error results from the smallest error.

8 How small must the s.e. be to be "good"? This is difficult to judge as an absolute number. It is more useful when combined with the corresponding regression coefficient: the coefficient-to-s.e. ratio. This ratio will be used for the t-test.

9 3. Confidence Interval of βj What is the confidence interval of a parameter? What is it for? Formula: bj ± tα/2 · s.e.(bj), or P( bj − tα/2 · s.e.(bj) ≤ βj ≤ bj + tα/2 · s.e.(bj) ) = 1 − α.

10 Example From the regression we get: b1 = 0.1022 and s.e.(b1) = 0.0092; observations (n) = 10; estimated parameters (k) = 2. Then degrees of freedom = 10 − 2 = 8 and significance level α = 5%. From the t-table, t(α/2, df) = t(0.025, 8) = 2.306. Therefore the confidence interval for β1 is 0.1022 ± 2.306 × 0.0092, or (0.0810 ; 0.1234). Interpretation: β1 lies in the interval (0.0810, 0.1234) with 95% confidence.
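The slide's interval arithmetic can be reproduced in a few lines; this is a sketch using the numbers on the slide (b1, its s.e., and the tabulated critical value):

```python
# Confidence-interval example from the slide (values taken from the slide).
b1 = 0.1022        # estimated coefficient
se_b1 = 0.0092     # its standard error
t_crit = 2.306     # t(0.025, df = 8) from the t-table

# CI = b1 +/- t_crit * s.e.(b1)
lower = b1 - t_crit * se_b1
upper = b1 + t_crit * se_b1
print(f"95% CI for beta_1: ({lower:.4f}, {upper:.4f})")
```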

11 4. Hypothesis Test: t-test An individual test for a coefficient of regression. H0: βj = 0; H1: βj ≠ 0; j = 0, 1, 2, ..., k, where βj is a regression coefficient. For simple regression: (1) H0: β0 = 0, H1: β0 ≠ 0; (2) H0: β1 = 0, H1: β1 ≠ 0. The t-statistic is defined as t = bj / s.e.(bj): a test of whether βj differs from 0.

12 The computed t is compared to the t-table. If |t| > t(α/2, df), the t value is in the rejection area. Thus the null hypothesis (βj = 0) is rejected with confidence level (1 − α) × 100%. In other words, βj is statistically significant.
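A sketch of this decision rule, applied to the example coefficient from slide 10 (b1 = 0.1022, s.e. = 0.0092, critical value 2.306):

```python
# t-test decision rule for the slide's example coefficient.
b1, se_b1 = 0.1022, 0.0092
t_crit = 2.306                     # t(0.025, df = 8)

t_stat = b1 / se_b1                # coefficient-to-s.e. ratio (slide 8)
reject_h0 = abs(t_stat) > t_crit   # rejection region: |t| > t(alpha/2, df)
print(f"t = {t_stat:.2f}, reject H0: {reject_h0}")
```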

13 Hypothesis Test: F-Test To find out whether the model as a whole is statistically significant, i.e. a hypothesis test on all the coefficients together. H0: β1 = β2 = ... = βk = 0; H1: at least one βj ≠ 0, where k is the number of I.V.s.

14 The F-test can be explained further by ANOVA. Observation: Yi = β0 + β1Xi + ei. Regression: Ŷi = b0 + b1Xi. Subtract Ȳ from both sides, then square and sum: Σ(Yi − Ȳ)² = Σ(Ŷi − Ȳ)² + Σ(Yi − Ŷi)², i.e. SST = SSR + SSE.

15 ANOVA Table
Source      | Sum of Squares | df    | Mean Squares        | F-stat
Regression  | SSR            | k     | MSR = SSR/k         | F = MSR/MSE
Error       | SSE            | n-k-1 | MSE = SSE/(n-k-1)   |
Total       | SST            | n-1   |                     |
Compare the F-stat with Fα(k, n-k-1) (F-table).
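The quantities in the ANOVA table can be sketched for a small simple regression; the data here are made up for illustration, only the formulas come from the slides:

```python
# ANOVA decomposition and F-stat for a simple OLS regression (made-up data).
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
n, k = len(x), 1                                  # k = number of regressors

# OLS estimates for Yhat_i = b0 + b1*X_i
xbar, ybar = sum(x) / n, sum(y) / n
b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) \
     / sum((xi - xbar) ** 2 for xi in x)
b0 = ybar - b1 * xbar
yhat = [b0 + b1 * xi for xi in x]

sst = sum((yi - ybar) ** 2 for yi in y)                   # total
ssr = sum((yh - ybar) ** 2 for yh in yhat)                # regression (explained)
sse = sum((yi - yh) ** 2 for yi, yh in zip(y, yhat))      # error (residual)

f_stat = (ssr / k) / (sse / (n - k - 1))                  # F = MSR / MSE
print(f"SST = {sst:.3f}, SSR + SSE = {ssr + sse:.3f}, F = {f_stat:.1f}")
```

The printout confirms the identity SST = SSR + SSE from slide 14.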

16 Classical Assumptions of OLS The OLS estimator should be BLUE (Best Linear Unbiased Estimator). 3 main requirements: No Multicollinearity; No Heteroskedasticity; No Autocorrelation.

17 Multicollinearity Multicollinearity is a linear relation between I.V.s. For two regressors X1 and X2: if X1 = λX2, there is collinearity. But this is not the case if, for example, X1 = X2² or X1 = log X2.

18 Example Yi = β0 + β1X1 + β2X2 + β3X3 + ui, where Y: Consumption; X1: Total Income; X2: Wage Income; X3: Non-wage Income. There is multicollinearity --> it can even be perfect multicollinearity. Why? (Total income is the sum of wage and non-wage income, so X1 = X2 + X3.)

19 Data with Perfect Multicollinearity
X1 | X2  | X3
29 | 116 | 118
23 | 92  | 96
19 | 76  | 82
16 | 64  | 65
12 | 48  | 51
X2 = 4X1 --> a perfect multicollinearity relation.
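The perfect collinearity in this table can be verified numerically. A minimal sketch (pure Python, no libraries) showing that the X'X matrix for the columns [1, X1, X2] is singular, which is why OLS cannot be estimated:

```python
# X1 and X2 from the slide's table; X2 = 4*X1 exactly.
x1 = [29, 23, 19, 16, 12]
x2 = [116, 92, 76, 64, 48]
assert all(b == 4 * a for a, b in zip(x1, x2))   # exact linear relation

# Build X'X for the design matrix with columns [1, X1, X2].
cols = [[1.0] * len(x1), [float(v) for v in x1], [float(v) for v in x2]]
xtx = [[sum(a * b for a, b in zip(ci, cj)) for cj in cols] for ci in cols]

# 3x3 determinant by cofactor expansion: zero, since the columns are dependent.
det = (xtx[0][0] * (xtx[1][1] * xtx[2][2] - xtx[1][2] * xtx[2][1])
       - xtx[0][1] * (xtx[1][0] * xtx[2][2] - xtx[1][2] * xtx[2][0])
       + xtx[0][2] * (xtx[1][0] * xtx[2][1] - xtx[1][1] * xtx[2][0]))
print(f"det(X'X) = {det}")
```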

20 Impact of Multicollinearity High variance (of the OLS estimates); Wide confidence intervals; High R² yet many insignificant coefficients in the t-test; The sign of a coefficient can be misleading.

21
Asset (X2) | Income (X1) | Consumption (Y)
2267       | 220         | 160
2129       | 210         | 140
1954       | 190         | 135
1456       | 140         | 110
1234       | 120         | 100
1023       | 100         | 85
1136       | 110         | 90
856        | 80          | 65
659        | 65          | 50
500        | 50          | 40

22 Model: Y = 12.8 − 1.414X1 + 0.202X2; s.e. (4.696) (1.199) (0.117); t (2.726) (−1.179) (1.721); R² = 0.982. R² is very high, 98.2%. What does that mean? The t-tests are not significant. What does that mean? The coefficient of X1 is negative. What does that mean?
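One way to see where these symptoms come from is the correlation between the two regressors. A sketch, assuming the Income and Asset values from the table above:

```python
# Correlation between Income (X1) and Asset (X2) from the slide's table.
import math

x1 = [220, 210, 190, 140, 120, 100, 110, 80, 65, 50]              # Income
x2 = [2267, 2129, 1954, 1456, 1234, 1023, 1136, 856, 659, 500]    # Asset

n = len(x1)
m1, m2 = sum(x1) / n, sum(x2) / n
cov = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
r = cov / math.sqrt(sum((a - m1) ** 2 for a in x1)
                    * sum((b - m2) ** 2 for b in x2))
print(f"corr(X1, X2) = {r:.4f}")
```

A correlation this close to 1 is the near-multicollinearity that inflates the standard errors and flips the sign of b1.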

23 Detecting Multicollinearity 1. Comparing R² and the t-stats 2. Using the correlation matrix of the I.V.s

24 3. VIF (Variance Inflation Factor) and Tolerance Value (TOL) --> available in SPSS. VIFj = 1 / (1 − Rj²) and TOLj = 1 / VIFj; j = 1, 2, ..., k, where Rj² comes from regressing Xj on the other I.V.s. A common rule of thumb flags collinearity when VIF exceeds the threshold (often 10; stricter rules use 5), i.e. when TOL falls below it.
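A minimal sketch of these formulas; the auxiliary R² value below is made up for illustration:

```python
# VIF and TOL as functions of the auxiliary regression's R-squared.
def vif(r_squared_j):
    """Variance inflation factor for regressor j: 1 / (1 - R_j^2)."""
    return 1.0 / (1.0 - r_squared_j)

def tol(r_squared_j):
    """Tolerance for regressor j: 1 / VIF_j = 1 - R_j^2."""
    return 1.0 - r_squared_j

r2_j = 0.95    # hypothetical R² of X_j regressed on the other I.V.s
print(f"VIF = {vif(r2_j):.1f}, TOL = {tol(r2_j):.2f}")
```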

25 Solving Multicollinearity Use relevant information (theory or previous research); Combine cross-section and time-series data; Eliminate the affected variables – commonly used – but be careful --> specification bias; Transform the variables: first-difference method; Add more sample data.

26 Heteroskedasticity The variance of the error is not constant. It generally occurs in cross-sectional data, e.g. consumption and income at the province level. Violating homoskedasticity still keeps the estimator unbiased, but it is no longer efficient.

27 [Scatter plot] The Pattern of Heteroskedasticity

28 Checking for Heteroskedasticity 1. Graphical Method Analyze the pattern of the relationship between the squared residuals (ûi²) and the predicted Ŷi.

29 [Plot: squared residuals ûi² against predicted Ŷi]

30 [Plots: squared residuals ûi² against predicted Ŷi]

31 Solving Heteroskedasticity 1. Transform into a logarithmic model: Ln Yj = β0 + β1 Ln Xj + uj

32 Autocorrelation Correlation between the error terms themselves, across different times or individual sample observations. It generally occurs in time-series data. E(ui uj) is no longer equal to 0 for i ≠ j. The estimator becomes inefficient.

33 [Scatter plots of residuals ui against time/X, showing autocorrelation patterns]

34 Detecting Autocorrelation Durbin-Watson Test: compare the d-stat to the d-table values (dL and dU).
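The d statistic itself is not written out on the slide; the standard Durbin-Watson formula is d = Σ(e_t − e_{t−1})² / Σ e_t². A sketch with made-up residual series (d near 0 signals positive autocorrelation, near 2 no autocorrelation, near 4 negative autocorrelation):

```python
# Standard Durbin-Watson statistic on a residual series.
def durbin_watson(e):
    """d = sum of squared first differences / sum of squared residuals."""
    num = sum((e[t] - e[t - 1]) ** 2 for t in range(1, len(e)))
    return num / sum(x ** 2 for x in e)

smooth = [1.0, 1.2, 1.1, 0.9, 1.0, 1.1]          # positively autocorrelated
alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]  # negatively autocorrelated
print(f"d(smooth) = {durbin_watson(smooth):.2f}")
print(f"d(alternating) = {durbin_watson(alternating):.2f}")
```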

35 Rules of the Game (Durbin-Watson decision regions):
0 to dL: positive autocorrelation
dL to dU: inconclusive
dU to 4−dU: no autocorrelation
4−dU to 4−dL: inconclusive
4−dL to 4: negative autocorrelation

36 THANK YOU


