DEPARTMENT OF STATISTICS Stats Questions We Are Often Asked.

Stats questions we are often asked
- When can I use r and R²?
- When can I make a 'causal-type' claim?
- Why should I be careful with a media-reported margin of error?
- When can I say a confidence interval gives support to a claim?

r (little r): what is it?
- r is the correlation coefficient between y and x.
- r measures the strength of a linear relationship.
- r is a multiple of the slope of the least-squares line (slope = r × s_y / s_x, where s_x and s_y are the standard deviations of x and y).

r: when can it be used?
- Only use r if the scatter plot is linear.
- Don't use r if the scatter plot is non-linear!
[Figure: scatter plot of y versus x, r = 0.99]

r: what does it tell you?
- How close the points in the scatter plot come to lying on the line.
[Figures: two scatter plots of y versus x, one with r = 0.99 and one with r = 0.57]
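The following is a minimal sketch of how r could be computed and how it reacts to scatter around a line. The data sets `tight` and `loose` are made up for illustration and are not the data behind the plots on the slides. The last lines check the usual least-squares relation between r and the slope (slope = r × s_y / s_x).

```python
import numpy as np

def pearson_r(x, y):
    """Sample correlation coefficient between x and y."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()
    dy = y - y.mean()
    return float((dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum()))

# Hypothetical data: the same underlying line, with small and large scatter.
x = np.arange(1, 11)
tight = 2 * x + np.array([0.3, -0.2, 0.1, -0.4, 0.2, -0.1, 0.4, -0.3, 0.1, -0.1])
loose = 2 * x + np.array([6, -7, 5, -8, 9, -6, 7, -9, 4, -5])

r_tight = pearson_r(x, tight)   # points lie close to the line
r_loose = pearson_r(x, loose)   # points are widely scattered about the line
print(f"r (tight) = {r_tight:.2f}, r (loose) = {r_loose:.2f}")

# r is a multiple of the least-squares slope: slope = r * s_y / s_x.
slope = np.polyfit(x, tight, 1)[0]
slope_from_r = r_tight * tight.std(ddof=1) / x.std(ddof=1)
```

The closer the points come to lying on the line, the closer r is to 1 (or to -1 for a decreasing line).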

R² (big R²): what is it?
- R² is the coefficient of determination.
- It measures how close the points in the scatter plot come to lying on the fitted line or curve.
[Figures: two scatter plots of y versus x]

R² (big R²): when can it be used?
- When the scatter plot of y versus x is linear or non-linear.
[Figures: two scatter plots of y versus x]

R²: what does it tell you?
[Figure: a fitted model with two dotplots alongside. The dotplot of the y's shows the variation in the observed y values; the dotplot of the ŷ's shows the variation in the fitted values.]

R²: what does it tell you?
- The variation in the ŷ's (the fitted values) is the amount of variation that can be explained by the model.
- We see some additional variation in the y's. The excess is not explained by the model.
$$R^2 = \frac{\text{Variation in } \hat{y}\text{'s}}{\text{Variation in } y\text{'s}} = \frac{\text{Variation in fitted values}}{\text{Variation in } y \text{ values}}$$

R²: what does it tell you?
- When expressed as a percentage, R² is the percentage of the variation in Y that our regression model can explain.
- R² near 100% ⇒ model fits well.
- R² near 0% ⇒ model doesn't fit well.

R²: what does it tell you?
- 90% of the variation in Y is explained by our regression model.
[Figure: scatter plot of y versus x, R² = 90%]
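As a rough sketch, R² can be computed directly from the ratio described above: variation in the fitted values divided by variation in the observed y values. The data here are invented for illustration, "variation" is taken to mean the sum of squared deviations from the mean of y, and the model is a least-squares line fitted with `numpy.polyfit`. For a least-squares fit with an intercept this ratio agrees with the more familiar form 1 - SSE/SST.

```python
import numpy as np

def r_squared(y, fitted):
    """Variation in the fitted values divided by variation in the y values."""
    y = np.asarray(y, dtype=float)
    fitted = np.asarray(fitted, dtype=float)
    variation_fitted = ((fitted - y.mean()) ** 2).sum()
    variation_y = ((y - y.mean()) ** 2).sum()
    return float(variation_fitted / variation_y)

# Hypothetical data scattered about a straight line.
x = np.arange(1, 11)
y = 3 + 2 * x + np.array([1.5, -2.0, 0.5, 2.5, -1.0, -2.5, 1.0, 2.0, -1.5, -0.5])

# Least-squares line and its fitted values.
slope, intercept = np.polyfit(x, y, 1)
fitted = intercept + slope * x

r2 = r_squared(y, fitted)
print(f"{100 * r2:.0f}% of the variation in y is explained by the model")

# Cross-check against 1 - SSE/SST (valid for least squares with an intercept).
sse = ((y - fitted) ** 2).sum()
sst = ((y - y.mean()) ** 2).sum()
r2_check = 1 - sse / sst
```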

R²: pearls of wisdom!
- R² and r² have the same value ONLY when using a linear model.
- DON'T use R² to pick your model.
- Use your eyes!
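The first point can be checked numerically. The sketch below uses invented, clearly curved data and assumes least-squares fits from `numpy.polyfit`. For the straight-line model, R² matches r²; for the quadratic model, R² is a different number. This only illustrates that the two quantities differ once the model is non-linear. It is not a recipe for choosing a model, which should start from looking at the scatter plot.

```python
import numpy as np

def r_squared(y, fitted):
    """Variation in the fitted values divided by variation in the y values."""
    variation_fitted = ((fitted - y.mean()) ** 2).sum()
    variation_y = ((y - y.mean()) ** 2).sum()
    return float(variation_fitted / variation_y)

# Hypothetical U-shaped data: a non-linear scatter plot.
x = np.arange(0, 11, dtype=float)
noise = np.array([0.5, -0.4, 0.3, -0.2, 0.1, 0.0, -0.1, 0.2, -0.3, 0.4, -0.5])
y = (x - 5) ** 2 + noise

r = np.corrcoef(x, y)[0, 1]

# Straight-line model: R^2 coincides with r^2.
linear_fit = np.polyval(np.polyfit(x, y, 1), x)
r2_linear = r_squared(y, linear_fit)

# Quadratic model: R^2 is no longer r^2.
quadratic_fit = np.polyval(np.polyfit(x, y, 2), x)
r2_quadratic = r_squared(y, quadratic_fit)

print(f"r^2 = {r ** 2:.3f}")
print(f"R^2 (line) = {r2_linear:.3f}, R^2 (quadratic) = {r2_quadratic:.3f}")
```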

R² and Excel & Graphics Calculators

Damaged for life by too much TV
NZ Herald (04/10/2005)

Damaged for life by too much TV
[Figure: scatter plot of Health Score and TV watching, r = -0.93]
Causal relationship?

Causal relationships
- Two general types of studies: experiments and observational studies.
- In an experiment, the experimenter determines which experimental units receive which treatments.
- In an observational study, we simply compare units that happen to have received different levels of the factor of interest.

Causal relationships
- Only well-designed and carefully executed experiments can reliably demonstrate causation.
- An observational study is often useful for identifying possible causes of effects, but it cannot reliably establish causation.

Causal relationships: summary
- In observational studies, strong relationships are not necessarily causal relationships.
- Correlation does not imply causation.
- Be aware of the possibility of lurking variables.
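A small simulation can make the lurking-variable point concrete. Everything below is simulated and hypothetical: `tv_hours` and `health` are both driven by an unobserved variable `lurking`, and neither has any effect on the other. The names echo the TV example but the numbers have nothing to do with the study reported in the newspaper.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000

# Hypothetical lurking variable that pushes TV hours up and health down.
lurking = rng.normal(size=n)
tv_hours = lurking + rng.normal(scale=0.5, size=n)
health = -lurking + rng.normal(scale=0.5, size=n)

# Strong negative correlation, although tv_hours never enters health.
r_observed = np.corrcoef(tv_hours, health)[0, 1]

def residuals(values, explanatory):
    """What is left of `values` after removing a least-squares line on `explanatory`."""
    slope, intercept = np.polyfit(explanatory, values, 1)
    return values - (intercept + slope * explanatory)

# Once the lurking variable is accounted for, the relationship largely vanishes.
r_adjusted = np.corrcoef(residuals(tv_hours, lurking),
                         residuals(health, lurking))[0, 1]

print(f"observed r = {r_observed:.2f}, r after adjusting = {r_adjusted:.2f}")
```

In a real observational study the lurking variable is usually not measured, so this adjustment is not available. That is why such a study cannot reliably establish causation.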

Margin of Error
- Sunday Star Times: National 44%, Labour 37.2%, NZ First 4.7%; margin of error: 4.4% (n = 540)
- Herald on Sunday: Labour 42%, National 38.5%, NZ First 5.5%; margin of error: 4.9% (n = 400)

Margin of Error
- Herald on Sunday: Labour 42%, National 38.5%, NZ First 5.5%; margin of error: 4.9% (n = 400)
Confidence interval: estimate ± margin of error

Margin of Error: survey errors
Survey errors are of two kinds: nonsampling errors and sampling error.
Sampling error:
- is caused by the act of sampling
- has the potential to be bigger in smaller samples
- has a size we can put a bound on (the margin of error)
- is unavoidable (the price of sampling)

Margin of Error: survey errors
Nonsampling errors:
- e.g., nonresponse bias, behavioural, ...
- can be much larger than sampling errors
- are impossible to correct for after completion of the survey
- have effects on the results that are impossible to quantify

Margin of Error: approx. 95% confidence interval for p

Margin of Error: margin of error (single proportion)
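The formulas on these two slides did not survive in the transcript. As a sketch only, the code below assumes the standard large-sample interval for a single proportion, p̂ ± 1.96 × sqrt(p̂(1 - p̂)/n), so that the margin of error is the 1.96 × sqrt(p̂(1 - p̂)/n) part. The function names are made up for this example. Under that assumption, n = 400 with p̂ = 0.5 gives 0.049, which is consistent with the 4.9% quoted for the Herald on Sunday poll.

```python
import math

def margin_of_error(p_hat, n, z=1.96):
    """Approximate 95% margin of error for a single proportion (assumed formula)."""
    return z * math.sqrt(p_hat * (1 - p_hat) / n)

def confidence_interval(p_hat, n, z=1.96):
    """Estimate plus or minus the margin of error."""
    moe = margin_of_error(p_hat, n, z)
    return p_hat - moe, p_hat + moe

n = 400  # Herald on Sunday sample size

# The margin is largest at p_hat = 0.5; this is the figure that matches 4.9%.
max_moe = margin_of_error(0.5, n)

# Margins for the individual parties are smaller, markedly so for a small share.
labour_moe = margin_of_error(0.42, n)
nz_first_moe = margin_of_error(0.055, n)

labour_ci = confidence_interval(0.42, n)

print(f"maximum margin of error: {100 * max_moe:.1f}%")
print(f"Labour: {100 * labour_moe:.1f}%, NZ First: {100 * nz_first_moe:.1f}%")
```

One reason for care with a media-reported margin of error follows from this: a single quoted figure behaves like the worst case for one proportion near 50%, so it overstates the sampling error for a small party's share, and it says nothing about nonsampling errors.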

Bank Dissatisfaction Scores: 95% CIs
- μ_C - μ_A: 0.5 to 20.7
- μ_A - μ_W: -9.8 to 6.6

Bank Dissatisfaction Scores: 95% CIs
Interval for μ_C - μ_A: 0.5 to 20.7
With 95% confidence, the mean dissatisfaction score for Canterbury customers is somewhere between 0.5 and 20.7 larger than the mean dissatisfaction score for Auckland customers.

Bank Dissatisfaction Scores: 95% CIs
Interval for μ_A - μ_W: -9.8 to 6.6
With 95% confidence, the mean dissatisfaction score for Auckland customers is somewhere between 9.8 less than and 6.6 greater than the mean dissatisfaction score for Wellington customers.

Bank Dissatisfaction Scores: 95% CIs
Interval for μ_A - μ_W: -9.8 to 6.6
Does this confidence interval support the proposition that there is a difference between the two population means (μ_A - μ_W ≠ 0)?
No, it doesn't support the proposition. Since 0 is in the confidence interval, 0 is a believable value for the difference. There could be no difference between the two means.

Bank Dissatisfaction Scores: 95% CIs
Interval for μ_A - μ_W: -9.8 to 6.6
Does this confidence interval support the proposition that there is NO difference between the two population means (μ_A - μ_W = 0)?
No, it doesn't support the proposition. Since there are non-zero numbers in the interval, μ_A - μ_W could be non-zero. There could be a difference.

Bank Dissatisfaction Scores: 95% CIs
Interval for μ_C - μ_A: 0.5 to 20.7
Does this confidence interval support the proposition that there is a difference between the two population means (μ_C - μ_A ≠ 0)?
Yes, it does support the proposition. Since zero is not in the interval, it is not believable that the difference is zero.

Bank Dissatisfaction Scores: 95% CIs
Interval for μ_C - μ_A: 0.5 to 20.7
Does this confidence interval support the proposition that there is NO difference between the two population means (μ_C - μ_A = 0)?
No, it doesn't support the proposition. In fact, it provides evidence against it. Since 0 is not in the interval, no difference between the means is not believable.
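The reasoning on these slides reduces to one check: is 0 inside the interval? The sketch below encodes that rule. The function name and the returned keys are invented for this example. Note that "supports no difference" is never true: even when 0 is in the interval, so are non-zero values.

```python
def interpret_difference_ci(lower, upper):
    """Apply the slides' rule to a confidence interval for a difference in means."""
    zero_is_believable = lower <= 0 <= upper
    return {
        # A difference is supported only when 0 is not a believable value.
        "supports_a_difference": not zero_is_believable,
        # An interval never supports "no difference": it always contains
        # non-zero values, so a difference could still exist.
        "supports_no_difference": False,
        # When 0 is outside the interval, there is evidence against "no difference".
        "evidence_against_no_difference": not zero_is_believable,
    }

canterbury_vs_auckland = interpret_difference_ci(0.5, 20.7)
auckland_vs_wellington = interpret_difference_ci(-9.8, 6.6)

print("Canterbury - Auckland:", canterbury_vs_auckland)
print("Auckland - Wellington:", auckland_vs_wellington)
```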
