DEPARTMENT OF STATISTICS Stats Questions We Are Often Asked.

1 DEPARTMENT OF STATISTICS Stats Questions We Are Often Asked

2 Stats questions we are often asked
• When can I use r and R²?
• When can I make a ‘causal-type’ claim?
• Why should I be careful with a media-reported margin of error?
• When can I say a confidence interval gives support to a claim?

4 r – little r – what is it?
• r is the correlation coefficient between y and x
• r measures the strength of a linear relationship
• r is a multiple of the slope of the least-squares line: slope = r × (s_y / s_x), where s_x and s_y are the sample standard deviations of x and y
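
As a minimal sketch of the last bullet, the code below computes r by hand and checks that the least-squares slope equals r multiplied by the ratio of the sample standard deviations of y and x. The data and the function names (`pearson_r`, `least_squares_slope`, `sample_sd`) are made up for illustration and do not come from the slides.

```python
import math

def pearson_r(xs, ys):
    """Sample correlation coefficient r between paired values xs and ys."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def least_squares_slope(xs, ys):
    """Slope of the least-squares line of y on x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

def sample_sd(values):
    """Sample standard deviation (divisor n - 1)."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))

# Made-up, roughly linear data (illustration only).
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1]

r = pearson_r(xs, ys)
slope = least_squares_slope(xs, ys)
slope_from_r = r * sample_sd(ys) / sample_sd(xs)

print(f"r = {r:.4f}")
print(f"slope = {slope:.4f}, r * s_y / s_x = {slope_from_r:.4f}")
```

Because s_y / s_x is always positive, r and the slope always share the same sign.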

5 r – when can it be used?
• Only use r if the scatter plot is linear
• Don’t use r if the scatter plot is non-linear!
[Scatter plot of y against x, r = 0.99]
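
A small sketch of why the plot should be checked before quoting r. Both data sets below are made up, and in both y is an exact function of x, yet neither plot is a straight line. The steadily rising curve still gives r close to 1, while the U-shaped curve gives r of zero, so r on its own describes neither relationship properly.

```python
import math

def pearson_r(xs, ys):
    """Sample correlation coefficient r between paired values xs and ys."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up data (illustration only): y = x^2 in both cases.
xs_rising = list(range(1, 11))                  # x = 1, ..., 10
ys_rising = [x ** 2 for x in xs_rising]         # steadily rising curve
xs_symmetric = list(range(-5, 6))               # x = -5, ..., 5
ys_symmetric = [x ** 2 for x in xs_symmetric]   # U-shaped curve

r_rising = pearson_r(xs_rising, ys_rising)
r_symmetric = pearson_r(xs_symmetric, ys_symmetric)

print(f"rising curve:   r = {r_rising:.3f}")
print(f"U-shaped curve: r = {r_symmetric:.3f}")
```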

6 r – what does it tell you?
• How close the points in the scatter plot come to lying on the line
[Two scatter plots of y against x: r = 0.99 and r = 0.57]

7 R² – big R² – what is it?
• R² is the coefficient of determination
• Measures how close the points in the scatter plot come to lying on the fitted line or curve
[Two scatter plots of y against x]

8 R² – big R² – when can it be used?
• When the scatter plot of y versus x is linear or non-linear
[Two scatter plots of y against x]

9 R² – what does it tell you?
[Scatter plot of y against x, with dotplots of the y’s and of the fitted values ŷ]
Dotplot of the y’s: shows the variation in the y’s
Dotplot of the ŷ’s: shows the variation in the ŷ’s

10 R² – what does it tell you?
Variation in the ŷ’s: this amount of variation can be explained by the model.
Variation in the y’s: we see some additional variation in the y’s. The excess is not explained by the model.
$$R^2 = \frac{\text{Variation in fitted values}}{\text{Variation in } y \text{ values}} = \frac{\text{Variation in } \hat{y}\text{'s}}{\text{Variation in } y\text{'s}}$$
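
The ratio on this slide can be sketched numerically. The example below assumes that "variation" means the sum of squared deviations from the mean of y, and that the model is a least-squares straight line; the slide specifies neither, and the data and the helper `fit_line` are made up for illustration.

```python
def fit_line(xs, ys):
    """Least-squares intercept and slope for the line y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Made-up data (illustration only).
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [3.2, 4.1, 6.8, 7.1, 9.9, 10.2, 13.5, 13.9]

intercept, slope = fit_line(xs, ys)
fitted = [intercept + slope * x for x in xs]
mean_y = sum(ys) / len(ys)

# Variation measured as sums of squares about the mean of y (an assumption).
variation_in_fitted = sum((f - mean_y) ** 2 for f in fitted)
variation_in_y = sum((y - mean_y) ** 2 for y in ys)
unexplained = sum((y - f) ** 2 for y, f in zip(ys, fitted))

r_squared = variation_in_fitted / variation_in_y
print(f"R^2 = {r_squared:.3f}")
print(f"explained + unexplained = {variation_in_fitted + unexplained:.3f}")
print(f"total variation in y    = {variation_in_y:.3f}")
```

For a least-squares fit with an intercept, the explained and unexplained parts add up to the total variation in the y’s, which is why the ratio lies between 0 and 1.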

11 R² – what does it tell you?
• When expressed as a percentage, R² is the percentage of the variation in Y that our regression model can explain
• R² near 100% → model fits well
• R² near 0% → model doesn’t fit well

12 R² – what does it tell you?
• 90% of the variation in Y is explained by our regression model.
[Scatter plot of y against x, R² = 90%]

13 R² – pearls of wisdom!
• R² and r² have the same value ONLY when using a linear model
• DON’T use R² to pick your model
• Use your eyes!
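
A hedged sketch of the first pearl, using made-up data where y = x² exactly. For the least-squares straight line, R² (variation in fitted values over variation in y) agrees with r². For a curve that passes through every point (assumed here to be ŷ = x², not a model from the slides), R² is 1 while r² is unchanged, so the two no longer agree.

```python
import math

def pearson_r(xs, ys):
    """Sample correlation coefficient r between paired values xs and ys."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def r_squared(ys, fitted):
    """Variation in the fitted values divided by variation in the y values."""
    mean_y = sum(ys) / len(ys)
    return (sum((f - mean_y) ** 2 for f in fitted)
            / sum((y - mean_y) ** 2 for y in ys))

# Made-up data (illustration only): an exact curve, y = x^2.
xs = list(range(1, 11))
ys = [x ** 2 for x in xs]

# Least-squares straight line.
n = len(xs)
mean_x, mean_y = sum(xs) / n, sum(ys) / n
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
line_fitted = [mean_y + slope * (x - mean_x) for x in xs]

# Assumed curve model that passes through every point.
curve_fitted = [x ** 2 for x in xs]

r = pearson_r(xs, ys)
r2_line = r_squared(ys, line_fitted)
r2_curve = r_squared(ys, curve_fitted)

print(f"r^2 = {r ** 2:.4f}")
print(f"R^2 (straight line) = {r2_line:.4f}")
print(f"R^2 (curve)         = {r2_curve:.4f}")
```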

14 R² and Excel & Graphics Calculators

15 Damaged for life by too much TV

16 Damaged for life by too much TV
NZ Herald (04/10/2005)

18 Damaged for life by too much TV
[Scatter plot of Health Score against TV watching, r = −0.93]
Causal relationship?

21 Causal relationships
• Two general types of studies: experiments and observational studies
• In an experiment, the experimenter determines which experimental units receive which treatments.
• In an observational study, we simply compare units that happen to have received different levels of the factor of interest.

22 Causal relationships
• Only well-designed and carefully executed experiments can reliably demonstrate causation.
• An observational study is often useful for identifying possible causes of effects, but it cannot reliably establish causation.

23 Causal relationships - Summary
• In observational studies, strong relationships are not necessarily causal relationships.
• Correlation does not imply causation.
• Be aware of the possibility of lurking variables.
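
A small simulation can make the lurking-variable point concrete. Everything in it is hypothetical: the lurking variable, the noise levels and the variable names are invented for illustration and say nothing about the actual TV study. Neither simulated variable affects the other, yet they end up strongly negatively correlated because both depend on the same lurking variable.

```python
import math
import random

def pearson_r(xs, ys):
    """Sample correlation coefficient r between paired values xs and ys."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

random.seed(1)  # fixed seed so the run is repeatable
n = 500

# Hypothetical lurking variable: some background factor, in arbitrary units.
lurking = [random.random() for _ in range(n)]

# Both observed variables depend only on the lurking variable plus noise.
# Neither is computed from the other, so there is no causal link between them.
tv_watching = [z + random.gauss(0, 0.1) for z in lurking]
health_score = [-z + random.gauss(0, 0.1) for z in lurking]

r = pearson_r(tv_watching, health_score)
print(f"r between tv_watching and health_score = {r:.2f}")
```

In an experiment, the experimenter would assign the level of TV watching, which breaks its link to any lurking variable; an observational study cannot do that.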

24 Damaged for life by too much TV

25 Margin of Error
Sunday Star Times (n = 540, margin of error 4.4%): National 44%, Labour 37.2%, NZ First 4.7%
Herald on Sunday (n = 400, margin of error 4.9%): Labour 42%, National 38.5%, NZ First 5.5%

27 Margin of Error
Herald on Sunday (n = 400, margin of error 4.9%): Labour 42%, National 38.5%, NZ First 5.5%
Confidence interval: estimate ± margin of error

28 Margin of Error
Survey errors: nonsampling errors, sampling error

29 Margin of Error
Survey errors: nonsampling errors, sampling error
Sampling error:
• caused by the act of sampling
• has the potential to be bigger in smaller samples
• we can determine how large it can be: the margin of error
• unavoidable (the price of sampling)

30 Margin of Error
Survey errors: nonsampling errors, sampling error
Nonsampling errors:
• e.g., nonresponse bias, behavioural, ...
• can be much larger than sampling errors
• impossible to correct for after completion of the survey
• impossible to determine how badly they affect results

32 Margin of Error
Approx. 95% confidence interval for p: [formula not preserved in the transcript]

33 Margin of Error
Margin of error (single proportion): [formula not preserved in the transcript]
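
The formulas from slides 32 and 33 are missing from the transcript. As a hedged reconstruction, these are the standard large-sample forms for a single proportion, which may differ in detail from what the slides showed. Here $\hat{p}$ is the sample proportion and $n$ is the sample size:

```latex
\hat{p} \;\pm\; 1.96\sqrt{\frac{\hat{p}\,(1-\hat{p})}{n}},
\qquad
\text{margin of error} \;=\; 1.96\sqrt{\frac{\hat{p}\,(1-\hat{p})}{n}}
\;\le\; 1.96\sqrt{\frac{0.25}{n}} \;\approx\; \frac{1}{\sqrt{n}}
```

With n = 400 the upper bound is 1.96 × 0.025 = 0.049, which matches the 4.9% quoted for the Herald on Sunday poll.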

34 Margin of Error
Herald on Sunday (n = 400, margin of error 4.9%): Labour 42%, National 38.5%, NZ First 5.5%
Sunday Star Times (n = 540, margin of error 4.4%): National 44%, Labour 37.2%, NZ First 4.7%
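
A minimal sketch of the arithmetic behind a quoted margin of error, assuming the usual large-sample formula 1.96 × sqrt(p(1 − p)/n) for a single proportion from a simple random sample. The function name `margin_of_error` is made up here, and the papers' own calculations are not given in the slides. The quoted 4.9% for n = 400 matches the worst case p = 0.5; a small party such as NZ First, at 5.5%, has a noticeably smaller margin under the same formula.

```python
import math

def margin_of_error(p_hat, n, z=1.96):
    """Approximate 95% margin of error for one proportion (simple random sample assumed)."""
    return z * math.sqrt(p_hat * (1 - p_hat) / n)

n_herald = 400

# Worst case (p = 0.5): the single figure newspapers usually quote.
max_moe = margin_of_error(0.5, n_herald)

# Confidence interval for Labour using the quoted margin: estimate +/- margin of error.
labour = 0.42
labour_ci = (labour - max_moe, labour + max_moe)

# Margin of error specific to a small party (NZ First, 5.5%).
nz_first_moe = margin_of_error(0.055, n_herald)

print(f"maximum margin of error, n = 400: {max_moe:.3f}")
print(f"Labour 95% CI: {labour_ci[0]:.3f} to {labour_ci[1]:.3f}")
print(f"NZ First margin of error: {nz_first_moe:.3f}")
```

These figures cover sampling error only. Nonsampling errors are not reflected in any margin of error.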

35 Bank Dissatisfaction Scores – 95% CIs
μ_C − μ_A: 0.5 to 20.7
μ_A − μ_W: −9.8 to 6.6

36 Bank Dissatisfaction Scores – 95% CIs
μ_C − μ_A: 0.5 to 20.7
μ_A − μ_W: −9.8 to 6.6
With 95% confidence, the mean dissatisfaction score for Canterbury customers is somewhere between 0.5 and 20.7 larger than the mean dissatisfaction score for Auckland customers.

38 Bank Dissatisfaction Scores – 95% CIs
μ_C − μ_A: 0.5 to 20.7
μ_A − μ_W: −9.8 to 6.6
With 95% confidence, the mean dissatisfaction score for Auckland customers is somewhere between 9.8 less than and 6.6 greater than the mean dissatisfaction score for Wellington customers.

40 Bank Dissatisfaction Scores – 95% CIs
μ_C − μ_A: 0.5 to 20.7
μ_A − μ_W: −9.8 to 6.6
Does the confidence interval for μ_A − μ_W support the proposition that there is a difference between the two population means? Supports μ_A − μ_W ≠ 0?
No, it doesn’t support the proposition. Since 0 is in the confidence interval, 0 is a believable value for the difference. There could be no difference between the two means.
μ_A − μ_W = 0 (no diff); μ_A − μ_W ≠ 0 (a diff)

41 Bank Dissatisfaction Scores – 95% CIs
μ_C − μ_A: 0.5 to 20.7
μ_A − μ_W: −9.8 to 6.6
Does the confidence interval for μ_A − μ_W support the proposition that there is NO difference between the two population means? Supports μ_A − μ_W = 0?
No, it doesn’t support the proposition. Since there are non-zero numbers in the interval, μ_A − μ_W could be non-zero: there could be a difference.
μ_A − μ_W = 0 (no diff); μ_A − μ_W ≠ 0 (a diff)

42 Bank Dissatisfaction Scores – 95% CIs
μ_C − μ_A: 0.5 to 20.7
μ_A − μ_W: −9.8 to 6.6
Does the confidence interval for μ_C − μ_A support the proposition that there is a difference between the two population means? Supports μ_C − μ_A ≠ 0?
Yes, it does support the proposition. Since zero is not in the interval, it is not believable that the difference is zero. No difference between the means is not believable.
μ_C − μ_A = 0 (no diff); μ_C − μ_A ≠ 0 (a diff)

43 Bank Dissatisfaction Scores – 95% CIs
μ_C − μ_A: 0.5 to 20.7
μ_A − μ_W: −9.8 to 6.6
Does the confidence interval for μ_C − μ_A support the proposition that there is NO difference between the two population means? Supports μ_C − μ_A = 0?
No, it doesn’t support the proposition. In fact, it provides evidence against it. 0 is not in the interval. No difference between the means is not believable.
μ_C − μ_A = 0 (no diff); μ_C − μ_A ≠ 0 (a diff)
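
The reasoning on the last four slides can be summarised as a small decision rule. This is only a sketch of that logic; the function names are invented for illustration. An interval for a difference supports "there is a difference" only when 0 lies outside it, and it does not support "there is no difference" in either case.

```python
def contains_zero(lower, upper):
    """True if 0 is a believable value for the difference."""
    return lower <= 0 <= upper

def supports_a_difference(lower, upper):
    """The interval supports 'the means differ' only if 0 is outside it."""
    return not contains_zero(lower, upper)

def supports_no_difference(lower, upper):
    """Per the slides, an interval does not support 'no difference'.

    If 0 is inside, non-zero values are believable too; if 0 is outside,
    the interval is evidence against 'no difference'.
    """
    return False

# The two 95% confidence intervals from the slides.
intervals = {
    "Canterbury - Auckland": (0.5, 20.7),
    "Auckland - Wellington": (-9.8, 6.6),
}

for label, (lower, upper) in intervals.items():
    print(f"{label}: {lower} to {upper}")
    print(f"  supports a difference?  {supports_a_difference(lower, upper)}")
    print(f"  supports no difference? {supports_no_difference(lower, upper)}")
```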

