Published by Lucas Dilly. Modified over 2 years ago.

1
Part 24: Hypothesis Tests 24-1/33 Statistics and Data Analysis Professor William Greene Stern School of Business IOMS Department Department of Economics

2
Statistics and Data Analysis Part 24 – Hypothesis Tests

3
Hypothesis Tests
Hypothesis Tests in the Regression Model
Tests of Independence of Random Variables

4
Application: Monet Paintings
Does the size of the painting really explain the sale prices of Monet's paintings?
Investigate: Compute the regression.
Hypothesis: The slope is actually zero.
Rejection region: Slope estimates that are very far from zero.
The hypothesis that β = 0 is rejected.

5
Regression Analysis
Investigate: Is the coefficient in a regression model really nonzero?
Testing procedure:
Model: y = α + βx + ε
Hypothesis: H0: β = 0.
Rejection region: Least squares coefficient is far from zero.
Test: α level for the test = 0.05 as usual.
Compute t = b/StandardError.
Reject H0 if |t| is above the critical value: 1.96 for a large sample, or the value from the t table for a small sample.
Equivalently, reject H0 if the reported P value is less than the α level.
Degrees of freedom for the t statistic: N-2.
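The testing procedure above can be sketched in a few lines of plain Python. This is only an illustration: the data are made up, the function name `slope_t_test` is hypothetical, and the small-sample critical value 2.447 (two-sided, 5%, 6 degrees of freedom) is read from a standard t table.

```python
import math

def slope_t_test(x, y, critical_value=1.96):
    """Fit y = a + b*x by least squares and test H0: beta = 0 (two-sided)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx                      # least squares slope
    a = y_bar - b * x_bar              # least squares intercept
    sse = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    df = n - 2                         # degrees of freedom for the t statistic
    se_b = math.sqrt(sse / df / sxx)   # standard error of the slope
    t = b / se_b
    return {"b": b, "se": se_b, "t": t, "df": df,
            "reject": abs(t) > critical_value}

# Made-up illustrative data (8 observations, so df = 6).
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [2.1, 2.9, 4.2, 4.8, 6.1, 6.9, 8.2, 8.8]

# Small sample: use the t-table value for 6 df instead of 1.96.
result = slope_t_test(x, y, critical_value=2.447)
print(result)
```

With a large sample the default 1.96 would be used instead of the table value.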

6
An Equivalent Test
Is there a relationship?
H0: No correlation.
Rejection region: Large R².
Test: F = [R²/1] / [(1-R²)/(N-2)]. Reject H0 if F > 4.
Math result: F = t².
Degrees of freedom for the F statistic are 1 and N-2.
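The math result F = t² can be checked numerically. The sketch below assumes the usual form of the statistic for a one-predictor regression, F = [R²/1] / [(1-R²)/(N-2)]; the data are made up and the function name is hypothetical.

```python
def r2_t_and_f(x, y):
    """Return R-squared, the slope t ratio, and the F statistic for y = a + b*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx
    r2 = sxy ** 2 / (sxx * syy)          # squared correlation = R-squared
    sse = syy * (1.0 - r2)               # residual sum of squares
    se_b = (sse / (n - 2) / sxx) ** 0.5  # standard error of the slope
    t = b / se_b
    f = (r2 / 1.0) / ((1.0 - r2) / (n - 2))
    return r2, t, f

# Made-up illustrative data.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.2, 1.9, 3.4, 3.6, 5.3, 5.8]

r2, t, f = r2_t_and_f(x, y)
print(f"R2 = {r2:.4f}, t = {t:.4f}, t squared = {t * t:.4f}, F = {f:.4f}")
```

The two printed values t squared and F agree up to floating point rounding, which is why the t test and the F test always give the same decision in the one-predictor model.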

7
Partial Effect
Hypothesis: If we include the signature effect, size does not explain the sale prices of Monet paintings.
Test: Compute the multiple regression; then test H0: β1 = 0.
α level for the test = 0.05 as usual.
Rejection region: Large value of b1 (coefficient).
Test based on t = b1/StandardError.
Regression Analysis: ln (US$) versus ln (SurfaceArea), Signed
Predictors: Constant, ln (SurfaceArea), Signed. Columns: Coef, SE Coef, T, P. [numeric entries not preserved]
R-Sq = 46.2%  R-Sq(adj) = 46.0%
Reject H0.
Degrees of freedom for the t statistic: N-3 = N - (number of predictors) - 1.

8
Testing the Regression
Degrees of freedom for the F statistic are K and N-K-1.

9
n1 = Number of predictors
n2 = Sample size - number of predictors - 1
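A minimal sketch of how n1 and n2 enter the test of the whole regression, assuming the usual R² form of the statistic, F = [R²/K] / [(1-R²)/(N-K-1)]. The function name and the example numbers are hypothetical.

```python
def overall_f(r_squared, n_obs, n_predictors):
    """F statistic for H0: all slope coefficients are zero, with its degrees of freedom."""
    n1 = n_predictors                # numerator degrees of freedom (K)
    n2 = n_obs - n_predictors - 1    # denominator degrees of freedom (N-K-1)
    f = (r_squared / n1) / ((1.0 - r_squared) / n2)
    return f, n1, n2

# Hypothetical example: R-squared of 0.5 from 103 observations and 2 predictors.
f, n1, n2 = overall_f(0.5, 103, 2)
print(f"F = {f:.2f} with {n1} and {n2} degrees of freedom")
```

The resulting F is compared with the critical value found in the F table at row n2 and column n1.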

10
Cost Function Regression
The regression is significant. F is huge.
Which variables are significant?
Which variables are not significant?

11
Application: Part of a Regression Model
Regression model includes variables x1, x2,… I am sure of these variables.
Maybe variables z1, z2,… I am not sure of these.
Model: y = α + β1 x1 + β2 x2 + δ1 z1 + δ2 z2 + ε
Hypothesis: δ1 = 0 and δ2 = 0.
Strategy: Start with the model including x1 and x2. Compute R². Compute a new model that also includes z1 and z2.
Rejection region: R² increases a lot.

12
Test Statistic

13
Gasoline Market

14
Gasoline Market
Regression Analysis: logG versus logIncome, logPG
Predictors: Constant, logIncome, logPG. Columns: Coef, SE Coef, T, P. [numeric entries not preserved]
R-Sq = 93.6%  R-Sq(adj) = 93.4%
Analysis of Variance: rows Regression, Residual Error, Total. Columns: Source, DF, SS, MS, F, P. [numeric entries not preserved]
R² = (Regression SS) / (Total SS) = 0.936

15
Gasoline Market
Regression Analysis: logG versus logIncome, logPG, ...
Predictors: Constant, logIncome, logPG, logPNC, logPUC, logPPT. Columns: Coef, SE Coef, T, P. [numeric entries not preserved]
R-Sq = 96.0%  R-Sq(adj) = 95.6%
Analysis of Variance: rows Regression, Residual Error, Total. Columns: Source, DF, SS, MS, F, P. [numeric entries not preserved]
Now, R² = (Regression SS) / (Total SS) = 0.960
Previously, R² = (Regression SS) / (Total SS) = 0.936

16
Improvement in R²
Inverse Cumulative Distribution Function
F distribution with 3 DF in numerator and 46 DF in denominator
P(X <= x) = 0.95, x = [value not preserved]
The null hypothesis is rejected.
Notice that none of the three added variables is individually significant, but the three of them together are.
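The size of the improvement can be approximated from the two reported R² values (93.6% and 96.0%), 3 added variables, and 46 denominator degrees of freedom. The sketch assumes the standard R²-change form, F = [(R1² - R0²)/J] / [(1-R1²)/DF]. Because the R² values are rounded to three digits, the result is only approximate. The critical value 2.81 is an assumption read from a standard F table, not a number taken from the slides.

```python
def r2_improvement_f(r2_small, r2_large, n_added, denom_df):
    """F statistic for H0: the coefficients on the added variables are all zero."""
    numerator = (r2_large - r2_small) / n_added
    denominator = (1.0 - r2_large) / denom_df
    return numerator / denominator

# Rounded R-squared values reported for the two gasoline regressions.
R2_SMALL = 0.936   # logIncome, logPG
R2_LARGE = 0.960   # adds logPNC, logPUC, logPPT

# Assumption: approximate 5% critical value of F(3, 46) from a standard F table.
F_CRITICAL_3_46 = 2.81

f = r2_improvement_f(R2_SMALL, R2_LARGE, n_added=3, denom_df=46)
print(f"F = {f:.2f}, reject H0: {f > F_CRITICAL_3_46}")
```

An F far above the critical value is consistent with the rejection stated above, even though no single added variable is significant on its own.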

17
Application
Health satisfaction depends on many factors: Age, Income, Children, Education, Marital Status.
Do these factors figure differently in a model for women compared to one for men?
Investigation: Multiple regression.
Null hypothesis: The regressions are the same.
Rejection region: Estimated regressions that are very different.

18
Equal Regressions
Setting: Two groups of observations (men/women, countries, two different periods, firms, etc.)
Regression Model: y = α + β1 x1 + β2 x2 + … + ε
Hypothesis: The same model applies to both groups.
Rejection region: Large values of F.

19
Procedure: Equal Regressions
There are N1 observations in Group 1 and N2 in Group 2.
There are K variables and the constant term in the model.
This test requires you to compute three regressions and retain the sum of squared residuals from each:
SS1 = sum of squares from N1 observations in group 1
SS2 = sum of squares from N2 observations in group 2
SSALL = sum of squares from NALL = N1+N2 observations when the two groups are pooled.
F = [(SSALL - SS1 - SS2)/(K+1)] / [(SS1 + SS2)/(NALL-2K-2)]
The hypothesis of equal regressions is rejected if F is larger than the critical value from the F table (K+1 numerator and NALL-2K-2 denominator degrees of freedom; pooling forces the K slopes and the constant to be equal across groups).
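The three-regression procedure reduces to a short calculation once the sums of squared residuals are in hand. The sketch below uses the standard form of this statistic (often called a Chow test). The numerator degrees of freedom count every coefficient that pooling forces to be equal across groups, constant included, hence K+1. All input numbers are made up and the function name is hypothetical.

```python
def equal_regressions_f(ss1, ss2, ss_all, n1, n2, k):
    """F statistic for H0: the same regression applies to both groups.

    ss1, ss2 : sums of squared residuals from the two separate regressions
    ss_all   : sum of squared residuals from the pooled regression
    k        : number of variables, not counting the constant term
    """
    num_df = k + 1                   # coefficients restricted to be equal
    den_df = n1 + n2 - 2 * k - 2     # NALL minus parameters in both group models
    f = ((ss_all - ss1 - ss2) / num_df) / ((ss1 + ss2) / den_df)
    return f, num_df, den_df

# Made-up sums of squares and sample sizes, purely for illustration.
f, num_df, den_df = equal_regressions_f(
    ss1=120.0, ss2=150.0, ss_all=300.0, n1=60, n2=70, k=2
)
print(f"F = {f:.3f} with {num_df} and {den_df} degrees of freedom")
```

The pooled sum of squares can never be smaller than SS1 + SS2, so F is nonnegative; the question is only whether it is large relative to the F table.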

20
Health Satisfaction Models: Men vs. Women
German survey data over 7 years, 1984 to 1991 (with a gap). 27,326 observations on Health Satisfaction and several covariates.
Three regressions are reported, each with columns Variable, Coefficient, Standard Error, T, P value, Mean of X and rows Constant, AGE, EDUC, HHNINC, HHKIDS, MARRIED:
Women (NW = 13083)
Men (NM = 14243)
Both (NALL = 27326)
[numeric entries not preserved]

21
Computing the F Statistic
Summary table with columns Women, Men, All and rows:
HEALTH: Mean, Standard deviation, Number of observations
Model size: Parameters, Degrees of freedom
Residuals: Sum of squares, Standard error of e
Fit: R-squared
Model test: F (P value), with P values (.000), (.000), (.0000)
[other numeric entries not preserved]

22
A Test of Independence
In the credit card example, are Own/Rent and Accept/Reject independent?
Hypothesis: Prob(Ownership) and Prob(Acceptance) are independent.
Formal hypothesis, based only on the laws of probability: Prob(Own,Accept) = Prob(Own)Prob(Accept) (and likewise for the other three possibilities).
Rejection region: Joint frequencies that do not look like the products of the marginal frequencies.

23
A Contingency Table Analysis

24
Independence Test
Step 2: Expected proportions assuming independence: If the factors are independent, then the joint proportions should equal the product of the marginal proportions.
[Rent,Reject] = Prob(Rent) x Prob(Reject)
[Rent,Accept] = Prob(Rent) x Prob(Accept)
[Own,Reject] = Prob(Own) x Prob(Reject)
[Own,Accept] = Prob(Own) x Prob(Accept)
[numeric values not preserved]

25
Comparing Actual to Expected

26
When is Chi Squared Large?
For a 2x2 table, the critical chi squared value for α = 0.05 is 3.84. (Not a coincidence: 3.84 = 1.96².)
Our chi squared value is large, so the hypothesis of independence between the acceptance decision and the own/rent status is rejected.
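The independence test can be sketched as follows, assuming the usual Pearson form of the statistic: the sum over cells of (observed - expected)² / expected, where each expected count is row total x column total / N. The counts below are hypothetical, not the credit card data, and the function name is an illustration only.

```python
def chi_squared_independence(table):
    """Pearson chi-squared statistic and degrees of freedom for an R by C table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the row and column factors are independent.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df

# Hypothetical 2x2 counts: rows = Rent, Own; columns = Reject, Accept.
counts = [[30, 20],
          [10, 40]]

CRITICAL_2X2 = 3.84  # 5% critical value for 1 degree of freedom (1.96 squared)

chi2, df = chi_squared_independence(counts)
print(f"chi squared = {chi2:.3f}, df = {df}, reject independence: {chi2 > CRITICAL_2X2}")
```

For tables larger than 2x2 the same function applies, but the critical value must be looked up for (R-1)(C-1) degrees of freedom.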

27
Computing the Critical Value
Calc > Probability Distributions > Chi-Square
The value reported is [value not preserved]
For an R by C table, D.F. = (R-1)(C-1)

28
Analyzing Default
Do renters default more often (at a different rate) than owners?
To investigate, we study the cardholders (only).
We have the raw observations in the data set.
Tabulation of DEFAULT by OWNRENT (categories 0, 1, All) [counts not preserved]

29
Hypothesis Test

30
Treatment Effects in Clinical Trials
Does Phenogyrabluthefentanoel (Zorgrab) work?
Investigate: Carry out a clinical trial.
N+0 = The placebo effect
N+T - N+0 = The treatment effect
Is N+T > N+0 (significantly)?

                   Placebo   Drug Treatment
No Effect          N00       N0T
Positive Effect    N+0       N+T
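One common way to ask whether N+T exceeds N+0 significantly is a one-sided test of two proportions. This is a swapped-in standard technique, offered as a sketch and not necessarily the calculation used in the lecture. The counts are hypothetical and the function name is an illustration only. For a 2x2 table, the square of this z statistic equals the Pearson chi-squared statistic.

```python
import math

def treatment_z(n00, n0t, n_pos0, n_post):
    """z statistic for H0: equal positive-effect rates under placebo and treatment.

    n00, n_pos0 : placebo group counts (no effect, positive effect)
    n0t, n_post : treatment group counts (no effect, positive effect)
    """
    n_placebo = n00 + n_pos0
    n_treated = n0t + n_post
    p_placebo = n_pos0 / n_placebo
    p_treated = n_post / n_treated
    # Pooled positive-effect rate under the null hypothesis of no difference.
    p_pooled = (n_pos0 + n_post) / (n_placebo + n_treated)
    se = math.sqrt(p_pooled * (1.0 - p_pooled) * (1.0 / n_placebo + 1.0 / n_treated))
    return (p_treated - p_placebo) / se

# Hypothetical trial: 100 patients per arm.
z = treatment_z(n00=70, n0t=50, n_pos0=30, n_post=50)

ONE_SIDED_5_PERCENT = 1.645  # standard normal critical value, one-sided test
print(f"z = {z:.3f}, treatment effect significant: {z > ONE_SIDED_5_PERCENT}")
```

The one-sided critical value is used because the question is whether the drug does better than the placebo, not merely whether the two differ.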

32
Confounding Effects

33
What About Confounding Effects?
Two-way layout: Normal Weight vs. Obese, by Nonsmoker vs. Smoker.
Age and Sex are usually relevant as well.
How can all these factors be accounted for at the same time?
