1 Power 16. 2 Review Post-Midterm Cumulative 3 Projects.

1 1 Power 16

2 2 Review Post-Midterm Cumulative

3 3 Projects

4 4 Logistics Put the PowerPoint slide show on a high-density floppy disk for a WINTEL machine. Email the slide show to Llad@econ.ucsb.edu as a PowerPoint attachment.

5 5 Assignments 1. Project choice 2. Data Retrieval 3. Statistical Analysis 4. PowerPoint Presentation 5. Executive Summary 6. Technical Appendix 7. Graphics Power_13

6 6 PowerPoint Presentations: Member 4 1. Introduction: Members 1,2, 3 –What –Why –How 2. Executive Summary: Member 5 3. Exploratory Data Analysis: Member 3 4. Descriptive Statistics: Member 3 5. Statistical Analysis: Member 3 6. Conclusions: Members 3 & 5 7. Technical Appendix: Table of Contents, Member 6

7 7 Executive Summary and Technical Appendix

9 9 Technical Appendix Table of Contents. Spreadsheet of data used and sources or, if extensive, a subsample of the data. Descriptive statistics and histograms for the variables in the study. If time series data, a plot of each variable against time. If relevant, a plot of the dependent variable vs. each of the explanatory variables.

10 10 Technical Appendix (Cont.) Statistical results, for example regression. Plot of the actual, fitted, and error values, plus other diagnostics. Brief summary of the conclusions and meanings drawn from the exploratory, descriptive, and statistical analysis.

11 11 Post-Midterm Review Project I: Power 16 Contingency Table Analysis: Power 14, Lab 8 ANOVA: Power 15, Lab 9 Survival Analysis: Power 12, Power 11, Lab 7 Multi-variate Regression: Power 11, Lab 6

12 12 Slide Show Challenger disaster

13 13 Project I
Number of O-Rings Failing On Launch i: y_i(#) = a + b*temp_i + e_i
–Biased because of zeros, even if the equation is divided by 6
Two Ways to Proceed
–Tobit, non-linear estimation: y_i(#) = a + b*temp_i + e_i
–Bernoulli variable: probability models
Probability Models: y_i(0,1) = a + b*temp_i + e_i

14 14 Project I (Cont.)
Probability Models: y_i(0,1) = a + b*temp_i + e_i
–OLS, Linear Probability Model, linear approximation to the sigmoid
–Probit, non-linear estimate of the sigmoid
–Logit, non-linear estimate of the sigmoid
Significant Dependence on Temperature
–t-test (or z-test) on slope, H_0: b = 0
–F-test
–Wald test
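The three probability models above differ only in the function that maps temperature to a probability. A minimal Python sketch of those three functions follows. The coefficients are hypothetical values chosen for illustration, not estimates from the launch data, and the function names are likewise made up for this sketch.

```python
import math

def linear_probability(a, b, temp):
    """Linear probability model: the fitted value is a straight line in temp."""
    return a + b * temp

def logit_probability(a, b, temp):
    """Logit: logistic CDF applied to the linear index a + b*temp."""
    return 1.0 / (1.0 + math.exp(-(a + b * temp)))

def probit_probability(a, b, temp):
    """Probit: standard normal CDF applied to the linear index a + b*temp."""
    index = a + b * temp
    return 0.5 * (1.0 + math.erf(index / math.sqrt(2.0)))

# HYPOTHETICAL coefficients (not estimated from the Challenger data).
LPM_A, LPM_B = 2.9, -0.04        # straight line
INDEX_A, INDEX_B = 6.5, -0.1     # index equals 0 at 65 degrees

for temp in (31, 53, 65, 81):
    lpm = linear_probability(LPM_A, LPM_B, temp)
    logit = logit_probability(INDEX_A, INDEX_B, temp)
    probit = probit_probability(INDEX_A, INDEX_B, temp)
    print(f"temp={temp}: LPM={lpm:.2f}  logit={logit:.3f}  probit={probit:.3f}")
```

With these made-up numbers the straight line leaves the [0, 1] range at the extremes, while the logit and probit values stay strictly between 0 and 1, which is the sense in which the linear model is only an approximation to the sigmoid.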

15 15 Project I (Cont.) Plots of Number or Probability Vs Temp. –Label the axes Answer all parts, a-f –The most frequent sins: did not explicitly address significance; did not answer b, 66°F: all launches at lower temperatures had one or more o-ring failures; did not execute c, estimate linear probability model

16 16 Challenger Disaster Failure of O-rings that sealed grooves on the booster rockets Was there any relationship between o-ring failure and temperature? Engineers knew that the rubber o-rings hardened and were less flexible at low temperatures But was there launch data that showed a problem?

17 17 Challenger Disaster What: Was there a relationship between launch temperature and o-ring failure prior to the Challenger disaster? Why: Should the launch have proceeded? How: Analyze the relationship between launch temperature and o-ring failure

18 18 Launches Before Challenger Data –number of o-rings that failed –launch temperature

22 22 Exploratory Analysis Launches where there was a problem

23 23
Orings  temperature
1       58
1       57
1       70
1       63
1       70
2       75
3       53

25 25 Exploratory Analysis: All Launches. A plot of failures per observation versus temperature range shows temperature dependence. The mean temperature for the 7 launches with o-ring failures was lower, 63.7, than for the 17 launches without o-ring failures, 72.6. –Contingency table analysis
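The 63.7 figure can be checked from the slide-23 entries, reading them as (o-rings failed, temperature) pairs. The sketch below does only that; the 17 temperatures for launches without failures are not listed in this transcript, so their mean is taken as quoted.

```python
# (o-rings failed, launch temperature) pairs, read from the slide-23 entries.
failure_launches = [(1, 58), (1, 57), (1, 70), (1, 63), (1, 70), (2, 75), (3, 53)]

temps = [temp for _, temp in failure_launches]
mean_failure_temp = sum(temps) / len(temps)

# Quoted on the slide; the underlying 17 temperatures are not available here.
reported_no_failure_mean = 72.6
gap = reported_no_failure_mean - mean_failure_temp

print(f"launches with failures: {len(temps)}")
print(f"mean temperature, failures: {mean_failure_temp:.1f}")
print(f"difference from no-failure mean: {gap:.1f}")
```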

26 26 Launches and O-Ring Failures (Yes/No)

27 27 Launches and O-Ring Failures (Yes/No) Expected/Observed

28 28 Launches and O-Ring Failures Chi-square, 2 dof = 9.08; critical value (α = 0.05) = 5.99
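The comparison of 9.08 with the 5% critical value can be reproduced without statistical tables. With exactly 2 degrees of freedom, the chi-square upper-tail probability is exp(-x/2). The sketch below relies on that special case only and is not a general chi-square routine; the function names are made up for illustration.

```python
import math

def chi2_upper_tail_2dof(x):
    """P(X > x) for a chi-square variable with exactly 2 degrees of freedom."""
    return math.exp(-x / 2.0)

def chi2_critical_2dof(alpha):
    """Value c with P(X > c) = alpha, again for 2 degrees of freedom only."""
    return -2.0 * math.log(alpha)

statistic = 9.08                      # from the slide
critical = chi2_critical_2dof(0.05)   # about 5.99
p_value = chi2_upper_tail_2dof(statistic)
reject_independence = statistic > critical

print(f"critical value = {critical:.2f}, p-value = {p_value:.4f}, reject = {reject_independence}")
```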

29 Number of O-ring Failures Vs. Temperature

30 30 Logit Extrapolated to 31F: Aren Probit extrapolated to 31F: Jeffrey, Nathan, Hamid, many more

31 31 Extrapolating OLS to 31F: OLS: Carl, Will, Jong, Yana & more Tobit: Zhimin, Nathan, Sarab, Ufook

32 32 Conclusions From extrapolating the probability models (Linear Probability, Probit, or Logit) to 31 F, there was a high probability of one or more o-rings failing. From extrapolating the number of o-rings failing (OLS or Tobit) to 31 F, 3 or more o-rings would fail. There had been only one launch out of 24 where as many as 3 o-rings had failed. Decision theory argument: expected cost/benefit ratio:

33 33 Conclusions Decision theory argument: expected cost/benefit ratio:

34 34 Ways to Analyze Challenger Difference in mean temperatures for failures and successes. Difference in probability of one or more o-ring failures for high and low temperature ranges. Probability models: LPM (OLS), probit, logit. Number of o-ring failures per launch vs. temperature: OLS, Tobit. Contingency table analysis. ANOVA.

35 35 Contingency Table Analysis Challenger example Group 2 example

36 36 Launches and O-Ring Failures (Yes/No)

37 Technical – Alternate Approach Utilize contingency tables

38 38 ANOVA and O-Rings Probability one or more o-rings fail –Low temp: 53-62 degrees –Medium temp: 63-71 degrees –High temp: 72-81 degrees Average number of o-rings failing per launch –Low temp: 53-62 degrees –Medium temp: 63-71 degrees –High temp: 72-81 degrees

39 39 Probability one or more o-rings fails

40 40 Number of o-rings failing per launch

42 42 Outline ANOVA and Regression (Non-Parametric Statistics) (Goodman Log-Linear Model)

43 43 ANOVA and Regression: One-Way
Salesaj = c(1)*convenience + c(2)*quality + c(3)*price + e
E[Salesaj | convenience=1, quality=0, price=0] = c(1) = mean for city(1)
–c(1) = mean for city(1) (convenience)
–c(2) = mean for city(2) (quality)
–c(3) = mean for city(3) (price)
–Test the null hypothesis that the means are equal using a Wald test: c(1) = c(2) = c(3)

44 44 One-Way ANOVA and Regression Regression Coefficients are the City Means; F statistic

45 45 ANOVA and Regression: One-Way, Alternative Specification
Salesaj = c(1) + c(2)*convenience + c(3)*quality + e
E[Salesaj | convenience=0, quality=0] = c(1) = mean for city(3) (price, the omitted one)
E[Salesaj | convenience=1, quality=0] = c(1) + c(2) = mean for city(1) (convenience)
–c(1) = mean for city(3), the omitted city
–c(2) = mean for city(1) minus mean for city(3)
–Test that the mean for city(1) = mean for city(3)
–Using the t-statistic for c(2)

46 46 ANOVA and Regression: One-Way, Alternative Specification
Salesaj = c(1) + c(2)*convenience + c(3)*price + e
E[Salesaj | convenience=0, price=0] = c(1) = mean for city(2) (quality, the omitted one)
E[Salesaj | convenience=1, price=0] = c(1) + c(2) = mean for city(1) (convenience)
–c(1) = mean for city(2), the omitted city
–c(2) = mean for city(1) minus mean for city(2)
–Test that the mean for city(1) = mean for city(2)
–Using the t-statistic for c(2)
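The link between the dummy-variable coefficients on slides 43, 45, and 46 and the city means can be checked numerically. The sketch below uses a small made-up sales sample (hypothetical numbers, not the apple juice data). Rather than running a regression package, it verifies that the claimed coefficients satisfy the OLS normal equations X'(y - Xb) = 0, which characterize the least-squares solution.

```python
# HYPOTHETICAL sales figures per city, for illustration only.
sales = {
    "convenience": [520.0, 610.0, 580.0, 570.0],
    "quality": [650.0, 700.0, 630.0, 660.0],
    "price": [600.0, 590.0, 640.0, 610.0],
}
means = {city: sum(v) / len(v) for city, v in sales.items()}

# One row per observation: (y, convenience dummy, quality dummy, price dummy).
rows = [
    (y, int(city == "convenience"), int(city == "quality"), int(city == "price"))
    for city, values in sales.items()
    for y in values
]

def normal_equation_gaps(rows, design, coefs):
    """Return X'(y - Xb); every entry is zero when b is the OLS estimate."""
    gaps = [0.0] * len(coefs)
    for row in rows:
        x = design(row)
        resid = row[0] - sum(c * xi for c, xi in zip(coefs, x))
        for j, xj in enumerate(x):
            gaps[j] += xj * resid
    return gaps

# Slide 43: no intercept, three dummies; coefficients are the city means.
spec_43 = (lambda r: (r[1], r[2], r[3]),
           [means["convenience"], means["quality"], means["price"]])
# Slide 45: intercept, convenience, quality; price is the omitted city.
spec_45 = (lambda r: (1, r[1], r[2]),
           [means["price"], means["convenience"] - means["price"],
            means["quality"] - means["price"]])
# Slide 46: intercept, convenience, price; quality is the omitted city.
spec_46 = (lambda r: (1, r[1], r[3]),
           [means["quality"], means["convenience"] - means["quality"],
            means["price"] - means["quality"]])

results = {}
for label, (design, coefs) in (("slide 43", spec_43), ("slide 45", spec_45), ("slide 46", spec_46)):
    results[label] = max(abs(g) for g in normal_equation_gaps(rows, design, coefs))
    print(label, coefs, "largest normal-equation gap:", results[label])
```

In each specification the gap is zero, so the intercept is the omitted city's mean and each slope is a difference in means, as the slides state.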

47 47 ANOVA and Regression: Two-Way
Series of Regressions; Compare to Table 11, Lecture 15
Salesaj = c(1) + c(2)*convenience + c(3)*quality + c(4)*television + c(5)*convenience*television + c(6)*quality*television + e, SSR = 501,136.7
Salesaj = c(1) + c(2)*convenience + c(3)*quality + c(4)*television + e, SSR = 502,746.3
Test for interaction effect: F(2, 54) = [(502746.3 - 501136.7)/2]/(501136.7/54) = (1609.6/2)/9280.3 = 0.09

48 Table of Two-Way ANOVA for Apple Juice Sales

49 49 ANOVA and Regression: Two-Way
Series of Regressions
Salesaj = c(1) + c(2)*convenience + c(3)*quality + e, SSR = 515,918.3
Test for media effect: F(1, 54) = [(515918.3 - 502746.3)/1]/(501136.7/54) = 13172/9280.3 = 1.42
Salesaj = c(1) + e, SSR = 614,757
Test for strategy effect: F(2, 54) = [(614757 - 515918.3)/2]/(501136.7/54) = (98838.7/2)/9280.3 = 5.32
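The F statistics from the series of regressions can be recomputed from the four sums of squared residuals quoted on slides 47 and 49. The sketch below follows the slides' arithmetic: each numerator is the increase in SSR from dropping a set of terms, and the denominator is always the mean squared error of the full model with 54 degrees of freedom. The function and constant names are made up for illustration.

```python
# Sums of squared residuals quoted on the slides.
SSR_FULL = 501136.7            # strategy, media, and interaction terms
SSR_NO_INTERACTION = 502746.3  # strategy and media terms
SSR_STRATEGY_ONLY = 515918.3   # strategy terms only
SSR_CONSTANT_ONLY = 614757.0   # constant only
DF_ERROR = 54                  # error degrees of freedom of the full model

def f_statistic(ssr_smaller_model, ssr_larger_model, num_terms_dropped, mse_full):
    """Increase in SSR per dropped term, relative to the full-model MSE."""
    return ((ssr_smaller_model - ssr_larger_model) / num_terms_dropped) / mse_full

mse_full = SSR_FULL / DF_ERROR
f_interaction = f_statistic(SSR_NO_INTERACTION, SSR_FULL, 2, mse_full)
f_media = f_statistic(SSR_STRATEGY_ONLY, SSR_NO_INTERACTION, 1, mse_full)
f_strategy = f_statistic(SSR_CONSTANT_ONLY, SSR_STRATEGY_ONLY, 2, mse_full)

print(f"MSE (full model) = {mse_full:.1f}")
print(f"F interaction (2, 54) = {f_interaction:.2f}")
print(f"F media (1, 54) = {f_media:.2f}")
print(f"F strategy (2, 54) = {f_strategy:.2f}")
```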

50 50 Survival Analysis Density, f(t) Cumulative distribution function, CDF, F(t) –Probability you failed up to time t* =F(t*) Survivor Function, S(t) = 1-F(t) –Probability you survived longer than t*, S(t*) –Kaplan-Meier estimates: (#at risk- # ending)/# at risk Applications –Testing a new drug

51 51 Chemotherapy Drug Taxol Current standard for ovarian cancer is taxol and a platinate such as cisplatin. Previous standard was cyclophosphamide and cisplatin. Kaplan-Meier survival curves comparing the two regimens –Lab 7: (# at risk - # ending)/# at risk

52 52 Taxol ( Bristol-Myers Squibb) interrupts cell division (mitosis) It is a cyclical hydrocarbon

53 53 Top Panel: European Canadian and Scottish, 342 at risk for Tc, 292 survived 1 year. Bottom Panel: Gynecological Oncology Group, 196 at risk for Tc, 168 survived 1 year.
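The Lab 7 ratio (# at risk - # ending)/# at risk can be applied to the two panels. The sketch below treats the first year as a single period with no censoring, which is a simplifying assumption; the published curves are built from many shorter intervals. The function name is made up for illustration.

```python
def kaplan_meier(periods):
    """Survivor function after each period.

    periods: list of (number_at_risk, number_ending) tuples in time order.
    Each period multiplies the running estimate by (at_risk - ending) / at_risk.
    """
    survival = 1.0
    curve = []
    for at_risk, ending in periods:
        survival *= (at_risk - ending) / at_risk
        curve.append(survival)
    return curve

# One-year figures from the slide; "ending" = at risk minus survivors.
top_panel = kaplan_meier([(342, 342 - 292)])
bottom_panel = kaplan_meier([(196, 196 - 168)])

print(f"Top panel, 1-year survival: {top_panel[-1]:.3f}")
print(f"Bottom panel, 1-year survival: {bottom_panel[-1]:.3f}")
```

Passing several periods instead of one would chain the ratios, giving the step-shaped curve seen in Kaplan-Meier plots.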

54 54 Multi-Variate Regression Group 4: Bush dependent on rural conditional on income Group 1: simultaneity Group 2: censored data? Group 5: pooling time series and cross-section

55 55 Group 4

56 56 Group 1

57 57 Technical – Regression Effects of budget & genre variables –Drama appears most significant Group 2

58 58 Group 2

59 Data Analysis University of California Santa Barbara Econ240a Fall 2004 Group #5 >> Income vs Education Level | Caucasian Group 5

60 60 Data Analysis University of California Santa Barbara Econ240a Fall 2004 Group #5 >> Income vs Education Level | African American

61 61 2003 Final

62 62 Nonparametric Statistics What to do when the sample of observations is not distributed normally?

63 63 3 Nonparametric Techniques Wilcoxon Rank Sum Test for Independent Samples –Data Analysis Plus Sign Test for Matched Pairs: Rated Data –Eviews, Descriptive Statistics Wilcoxon Signed Rank Sum Test for Matched Pairs: Quantitative Data –Eviews

64 64 Wilcoxon Rank Sum Test for Independent Samples Testing the difference between the means of two populations when they are non-normal A New Painkiller Vs. Aspirin, Xm17-02

65 65 Rating scheme

66 66 Ratings

67 67 Rank the 30 Ratings: 30 total ratings for both samples; 3 ratings of 1, 5 ratings of 2, etc.
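Tied ratings share the average of the ranks they occupy. A minimal Python sketch of that rule follows, using only the two counts given on this slide (3 ratings of 1 and 5 ratings of 2). The counts for the remaining ratings are not stated here, so they are left out, and the function name is made up for illustration.

```python
def average_ranks_by_group(counts):
    """Average rank shared by each group of tied observations.

    counts[k] is the number of tied observations in the k-th smallest group.
    """
    ranks = []
    first = 1
    for count in counts:
        last = first + count - 1
        ranks.append((first + last) / 2)  # mean of the ranks first..last
        first = last + 1
    return ranks

# 3 ratings of 1 occupy ranks 1-3; 5 ratings of 2 occupy ranks 4-8.
print(average_ranks_by_group([3, 5]))
```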

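68 68 [Ranking table; surviving entries: 3, 15, 12]

69 69 [Ranking table, continued; surviving entries: 5, 30, 27]

70 70 [Ranking table, continued; surviving entries: 4, 19.5; 5, 27] Rank sums: 276.5 and 188.5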

71 71 Rank Sum, T
E(T) = n_1(n_1 + n_2 + 1)/2 = 15*31/2 = 232.5
VAR(T) = n_1*n_2(n_1 + n_2 + 1)/12 = 15*15*31/12 = 581.25, σ_T = 24.1
For sample sizes larger than 10, T is approximately normal
Z = [T - E(T)]/σ_T = (276.5 - 232.5)/24.1 = 1.83
Null hypothesis: the central tendency for the two drugs is the same
Alternative hypothesis: central tendency for the new drug is greater than for aspirin: 1-tailed test

72 One-tailed critical value at the 5% level: z = 1.645. Since Z = 1.83 > 1.645, reject the null hypothesis.
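The rank sum test statistic on slide 71 can be recomputed from the rank sum T = 276.5 and the two sample sizes of 15. The sketch below uses the normal approximation stated on that slide and the standard library's error function for the tail probability. The function names are made up for illustration.

```python
import math

def rank_sum_z(t, n1, n2):
    """Mean, standard deviation, and z-score of the rank sum T under the null."""
    expected = n1 * (n1 + n2 + 1) / 2
    variance = n1 * n2 * (n1 + n2 + 1) / 12
    sigma = math.sqrt(variance)
    return expected, sigma, (t - expected) / sigma

def upper_tail_normal(z):
    """P(Z > z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

expected, sigma, z = rank_sum_z(276.5, 15, 15)
p_one_tailed = upper_tail_normal(z)
reject_null = z > 1.645  # one-tailed test at the 5% level

print(f"E(T) = {expected}, sigma_T = {sigma:.1f}, Z = {z:.2f}")
print(f"one-tailed p-value = {p_one_tailed:.3f}, reject = {reject_null}")
```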

