with Confidence Intervals

1 Estimation with Confidence Intervals
Bakr M. Bin Sadiq, Pediatric Specialist, Senior Clinical and Field Epidemiologist, Certified in International Research Ethics

2 Definition of Probability
The probability of a given outcome is the number of times that outcome occurs divided by the total number of trials.
Pr(A) = (number of times A does occur) / (number of times A can occur)

3 Hypothesis testing end
Diagram of the scale of probabilities, from impossible (0%) to absolutely certain (100%), with 50% at the midpoint. Marked on the scale: probability of tossing 5 heads in a row, 3%; hypothesis testing end, 5%; probability of tossing 4 heads in a row, 6%; probability of recovery from skin cancer, 95%; estimation end, 95% CI.
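The two coin-toss figures on the scale can be checked with a few lines of arithmetic. This is a minimal sketch assuming a fair coin and independent tosses; the function name `prob_heads_in_a_row` is made up for illustration.

```python
def prob_heads_in_a_row(k, p_heads=0.5):
    """Probability of k heads in k independent tosses (fair coin assumed)."""
    return p_heads ** k

p5 = prob_heads_in_a_row(5)   # 0.5 ** 5
p4 = prob_heads_in_a_row(4)   # 0.5 ** 4

print(f"5 heads in a row: {p5:.1%}")
print(f"4 heads in a row: {p4:.1%}")
```

The exact values are 3.125% and 6.25%, which the diagram rounds to 3% and 6%.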

4 Statistical tests All statistical tests and methods are used to determine: Confidence interval or P-value

5 Variance, SD
No. of Positive LNDs   Raw Deviation   Absolute Deviation   Squared Deviation
 1                     -8              8                    64
 3                     -6              6                    36
 4                     -5              5                    25
 7                     -2              2                     4
 9                      0              0                     0
 9                      0              0                     0
11                      2              2                     4
12                      3              3                     9
16                      7              7                    49
18                      9              9                    81
Total 90                0              42                   272
Variance = Σ(x − x̄)²/n = 272/10 = 27.2
SD = square root of variance = √27.2 = 5.22
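The table arithmetic can be reproduced as a short sketch. The ten counts below are an assumption: the two values of 9 are inferred from n = 10 and the column total of 90, since the transcript shows only nine distinct values. The variance divides by n, as on this slide.

```python
import math

# Lymph-node counts; the two 9s are inferred from n = 10 and the total of 90
counts = [1, 3, 4, 7, 9, 9, 11, 12, 16, 18]

n = len(counts)
mean = sum(counts) / n                            # 90 / 10 = 9
squared_devs = [(x - mean) ** 2 for x in counts]  # squared deviation column
variance = sum(squared_devs) / n                  # divides by n, as on the slide
sd = math.sqrt(variance)

print(f"sum of squared deviations = {sum(squared_devs):.0f}")
print(f"variance = {variance:.1f}, SD = {sd:.2f}")
```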

6 Measures of Variability
The variance: average of squared deviations from the mean.
Variance (s²) = Σ(x − x̄)²/(n − 1)
Example: 6, 8, 10, 4
x̄ = 28/4 = 7
Σ(x − x̄)² = 20
Variance (s²) = Σ(x − x̄)²/(n − 1) = 20/3 = 6.7

7 Measures of Variability
Standard deviation: measures the dispersion of the data around the mean.
Standard deviation (s) = √variance
s = √[Σ(xᵢ − x̄)²/(n − 1)]
Standard deviation (s) = √6.7 = 2.6

8 Measures of Variability
Figure: the sample statistic ("point estimate") from each of Samples 1 to 5, shown against the value of the population parameter, e.g., prevalence of Down's syndrome in babies born to mothers over 35 years of age.

9 The Standard Error of the Mean
The standard error (SE) is related to the standard deviation, but it measures a different thing: the variability of the sample mean around the true population mean.
SE = SD/√n
SE = 2.6/√4 = 1.3
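The running example (6, 8, 10, 4) can be followed from variance to standard deviation to standard error in one place. This is a minimal sketch using Python's standard `statistics` module, whose `variance` and `stdev` functions divide by n − 1.

```python
import math
import statistics

data = [6, 8, 10, 4]

s2 = statistics.variance(data)   # sample variance, divides by n - 1
s = statistics.stdev(data)       # square root of the sample variance
se = s / math.sqrt(len(data))    # standard error of the mean

print(f"variance = {s2:.1f}, SD = {s:.1f}, SE = {se:.1f}")
```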

10 Estimation Example The mean diastolic blood pressure in 100 individuals was 77 mm Hg, with a standard deviation of 12 mm Hg. Estimate the mean diastolic blood pressure in the population.

11 Estimation with confidence intervals (for the mean)
The general equation of a confidence interval (CI):
CI = statistic ± confidence coefficient * SE = (point estimate) ± (coefficient for the level of confidence) * SE
95% CI = mean ± z value (1.96) * SE
SE = 12/√100 = 1.2
95% CI = 77 ± 1.96 * 1.2
The lower limit = 74.6 mm Hg
The upper limit = 79.4 mm Hg
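The blood pressure calculation can be sketched as follows. The helper name `z_interval` is hypothetical; the multiplier comes from the standard normal distribution via `statistics.NormalDist`, which gives about 1.96 for 95% confidence.

```python
import math
from statistics import NormalDist

def z_interval(mean, sd, n, confidence=0.95):
    """Large-sample CI for a mean: mean +/- z * sd / sqrt(n)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)   # about 1.96 for 95%
    se = sd / math.sqrt(n)                           # standard error of the mean
    margin = z * se
    return mean - margin, mean + margin

lower, upper = z_interval(77, 12, 100)
print(f"95% CI: {lower:.1f} to {upper:.1f} mm Hg")
```

Passing a different `confidence` value (for example 0.99) widens the interval, because the multiplier grows while the standard error stays the same.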

12 DBP solution Interpretation
This result means that we are 95% confident that the population mean falls within these limits (74.6 to 79.4). Or, if we took 100 samples of 100 individuals each from the same population and computed an interval from each, about 95 of the 100 intervals would contain the population mean.
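The repeated-sampling reading of "95% confident" can be illustrated with a small simulation. This is a sketch under stated assumptions: the population is taken to be normal with mean 77 and SD 12 (hypothetical values borrowed from the example), and 1,000 samples of 100 are drawn with a fixed seed.

```python
import math
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

TRUE_MEAN, TRUE_SD = 77, 12   # hypothetical population values
N, REPEATS = 100, 1000

hits = 0
for _ in range(REPEATS):
    sample = [random.gauss(TRUE_MEAN, TRUE_SD) for _ in range(N)]
    m = statistics.mean(sample)
    se = statistics.stdev(sample) / math.sqrt(N)
    # Does this sample's 95% interval contain the population mean?
    if m - 1.96 * se <= TRUE_MEAN <= m + 1.96 * se:
        hits += 1

coverage = hits / REPEATS
print(f"{coverage:.1%} of the intervals contain the population mean")
```

The printed share should land near 95%, with some run-to-run variation if the seed is changed.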

13 The applications of the t test
The mean blood sodium concentration of 18 cases with a certain disease was 115 mmol/l, with a standard deviation of 12 mmol/l. Assuming that the blood sodium concentration is normally distributed, what is the 95% confidence interval within which the population mean of such cases may be expected to lie?
95% CI = mean ± t value * SE
95% CI = 115 ± 2.11 * 12/√18 = 109 to 121
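The sodium example can be worked through in code. This sketch takes the t value (2.11, for 17 degrees of freedom) from the slide rather than computing it; the function name `t_interval` is made up for illustration.

```python
import math

def t_interval(mean, sd, n, t_value):
    """Small-sample CI for a mean: mean +/- t * sd / sqrt(n)."""
    se = sd / math.sqrt(n)
    margin = t_value * se
    return mean - margin, mean + margin

# t value for n - 1 = 17 degrees of freedom, as given on the slide
lower, upper = t_interval(115, 12, 18, t_value=2.11)
print(f"95% CI: {lower:.0f} to {upper:.0f} mmol/l")
```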

14 Confidence Interval A confidence interval is a measure of how much trust we can place in an estimate derived from a sample, taking into account the scope for chance variation from one sample to another. Bigger sample sizes give narrower CIs.

15 Binomial distribution
The binomial distribution is used when:
We are concerned with the results of n trials.
Each trial has exactly two possible outcomes (success-failure, yes-no, true-false).
Each trial is independent of the other trials.
Each trial has the same probability of success (p) or failure (q).
The mean of the binomial distribution = np
The SD of the binomial distribution = √(npq)

16 Binomial distribution
Example: If a penny is tossed 100 times, what is the theoretical standard deviation?
Ans: n = 100, p = 0.5, q = 0.5
Mean = np = 100 * 0.5 = 50
SD = √(npq) = √(100 * 0.5 * 0.5) = 5
This means that the number of heads varies around the expected value, 50, with an SD of 5.
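The penny example amounts to two formulas, mean = np and SD = √(npq). A minimal sketch, with `binomial_mean_sd` as a hypothetical helper name:

```python
import math

def binomial_mean_sd(n, p):
    """Mean and SD of the number of successes in n independent trials."""
    q = 1 - p                       # probability of failure
    return n * p, math.sqrt(n * p * q)

mean, sd = binomial_mean_sd(100, 0.5)
print(f"mean = {mean:.0f} heads, SD = {sd:.0f} heads")
```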

17 Estimation with confidence intervals (for the proportion)
The sampling distribution of a proportion follows a binomial distribution. If the sample size is large, then the sampling distribution of the proportion is approximately normal.
95% CI = proportion ± Z value (1.96) * SE
SE = √[p(1 − p)/n]

18 Estimation with confidence intervals (for the proportion)
Of the 64 women included in the study, 27 reported that they experienced bleeding gums at least once a week. Estimate the proportion of bleeding gums in the population.
Sample proportion = 27/64 = 0.422
95% CI = 0.422 ± 1.96 * √[0.422 * (1 − 0.422)/64]
95% CI = 0.422 ± 1.96 * 0.0617 = (0.301, 0.543)
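The bleeding gums interval can be reproduced with the large-sample formula for a proportion. This is a sketch; the function name `proportion_interval` is an assumption, and the normal approximation is only reasonable when the sample is large.

```python
import math

def proportion_interval(successes, n, z=1.96):
    """Large-sample CI for a proportion: p +/- z * sqrt(p(1 - p)/n)."""
    p = successes / n
    se = math.sqrt(p * (1 - p) / n)
    return p - z * se, p + z * se

lower, upper = proportion_interval(27, 64)
print(f"95% CI: ({lower:.3f}, {upper:.3f})")
```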

19 Concepts of Confidence Intervals

20 Confidence Interval A range of reasonable guesses at a population value, a mean for instance. Confidence level = chance that the range of guesses captures the population value. Most common confidence level is 95%.

21 General Format of a Confidence Interval
estimate +/- margin of error

22 Accuracy of a mean A sample of n=36 college women has mean pulse = 75.3. The SD of these pulse rates = 8 . How well does this sample mean estimate the population mean ?

23 Standard Error of Mean SEM = SD of sample / square root of n
SEM = 8 / square root ( 36) = 8 / 6 = 1.33 Margin of error of mean = 2 x SEM Margin of Error = 2.66 , about 2.7

24 Interpretation 95% chance the sample mean is within 2.7 (pulse beats) of the population mean. A 95% confidence interval for the population mean is sample mean +/- margin of error: 75.3 +/- 2.7, or 72.6 to 78.0.

25 For Large Population of Women
Could the mean pulse be 72 ? Maybe, but our interval doesn't include 72. It's likely that population mean is above 72.

26 C.I. for mean pulse of men n=49 sample mean=70.3, SD = 8
SEM = 8 / square root(49) = 1.1. Margin of error = 2 x 1.1 = 2.2. Interval is 70.3 +/- 2.2, or 68.1 to 72.5.

27 Do men and women differ in mean pulse?
C.I. for women is 72.6 to 78.0 C.I. for men is 68.1 to 72.5 No overlap between intervals Looks safe to say that population means differ
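The comparison of the two pulse intervals can be sketched in code. The helper names are hypothetical, and the multiplier 2 is the slides' rounded stand-in for 1.96.

```python
import math

def approx_95_interval(mean, sd, n):
    """Approximate 95% CI: mean +/- 2 * SEM, as on the slides."""
    sem = sd / math.sqrt(n)
    margin = 2 * sem
    return mean - margin, mean + margin

def intervals_overlap(a, b):
    """True if intervals a and b share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]

women = approx_95_interval(75.3, 8, 36)
men = approx_95_interval(70.3, 8, 49)

print(f"women: {women[0]:.1f} to {women[1]:.1f}")
print(f"men:   {men[0]:.1f} to {men[1]:.1f}")
print("overlap:", intervals_overlap(women, men))
```

Without rounding the SEM to 1.1 first, the men's interval comes out as about 68.0 to 72.6 rather than 68.1 to 72.5. The two intervals still do not overlap, but only narrowly.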

28 Thought Question Study compares weight loss of men who only diet compared to those who only exercise. 95% confidence intervals for mean weight loss: Diet only: 13.4 to 18.0. Exercise only: 6.4 to 11.2.

29 Part A Do you think this means that 95% of men who diet will lose between 13.4 and 18.0 pounds ? Answer : No. The interval is a range of guesses at the population mean. This interval doesn't describe individual variation.

30 Part B Can we conclude that there's a difference between mean weight losses of the two programs ? This is a reasonable conclusion. The two confidence intervals don't overlap. It seems the population means are different.

31 Direct look at the difference
For diet, mean weight loss = 15.8 pounds For exercise, mean weight loss = 8.8 pounds Difference = 7 pounds more loss by diet

32 Confidence Interval for Difference
95% confidence interval for the difference in mean weight loss is 3.5 to 10.5 pounds. Don't worry about the calculations. This interval is entirely above 0. This rules out "no difference"; a difference of 0 would mean no difference.

33 Thought Question Compares risk of heart attack for bald men to risk for men with no hair loss A reported 95% confidence interval for relative risk is 1.1 to 8.2 Is it reasonable to conclude that bald men generally have a greater risk?

34 Do Bald Men Have Greater Risk?
This question refers to populations. What's the relative risk, if two groups have the same risk? Rel. Risk = 1 means that risks are same Rel. Risk = ratio of risks The CI is entirely above 1 (but barely) so it appears that the groups don't have the same risk.

35 Lack of Validity Variable doesn't measure what it's supposed to
For example, questions on a test of depression might not really measure depression. An IQ test might not really measure intelligence, so it may not be valid.

36 Summary Key value for looking at the difference between two means is 0
A difference of 0 means that the means are the same. Key value for looking at a relative risk is 1; a relative risk of 1 means the risks are the same.
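The "key value" rule can be written as one small check: see whether the interval contains the value that would mean "no difference". This is a sketch with a hypothetical function name, applied to the two intervals quoted in the thought questions.

```python
def excludes_null(interval, null_value):
    """True if the confidence interval does not contain the 'no difference' value."""
    lower, upper = interval
    return not (lower <= null_value <= upper)

# Difference in mean weight loss (pounds): key value is 0
diff_excluded = excludes_null((3.5, 10.5), null_value=0)

# Relative risk of heart attack, bald vs. not bald: key value is 1
rr_excluded = excludes_null((1.1, 8.2), null_value=1)

print("difference interval excludes 0:", diff_excluded)
print("relative risk interval excludes 1:", rr_excluded)
```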

37 Confidence Interval for a Mean
when you have a “small” sample...

38 As long as you have a “large” sample….
A confidence interval for a population mean is: x̄ ± Z * s/√n, where the average (x̄), standard deviation (s), and n depend on the sample, and Z depends on the confidence level.

39 Example Random sample of 59 students spent an average of $ on Spring 1998 textbooks. Sample standard deviation was $94.40. We can be 95% confident that the average amount spent by all students was between $ and $

40 What happens if you can only take a “small” sample?
Random sample of 15 students slept an average of 6.4 hours last night with standard deviation of 1 hour. What is the average amount all students slept last night?

41 If you have a “small” sample...
Replace the Z value with a t value to get: x̄ ± t * s/√n, where “t” comes from Student’s t distribution, and depends on the sample size through the degrees of freedom “n-1”.

42 Student’s t distribution versus Normal Z distribution

43 T distribution Very similar to standard normal distribution, except:
t depends on the degrees of freedom “n-1”; more likely to get extreme t values than extreme Z values.

44 Let’s compare t and Z values
For small samples, the t value is larger than the Z value. So, the t interval is made to be longer than the Z interval.

45 OK, enough theorizing! Let’s get back to our example!
Sample of 15 students slept an average of 6.4 hours last night with standard deviation of 1 hour. Need t with n-1 = 15-1 = 14 d.f. For 95% confidence, t14 = 2.145

46 That is... We can be 95% confident that average amount slept last night by all students is between 5.85 and 6.95 hours. Hmmm! Adults need 8 hours of sleep each night. Logical conclusion: Students need more sleep. (Just don’t get it in this class!)
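The sleep example also shows why the t interval is longer than the Z interval for a small sample. The sketch below uses the t value from the slide (2.145 for 14 degrees of freedom) next to the usual Z value of 1.96; the variable names are made up for illustration.

```python
import math

n, mean, sd = 15, 6.4, 1.0
se = sd / math.sqrt(n)   # standard error of the mean

t_14 = 2.145             # t value for 14 d.f., from the slide
z = 1.96                 # Z value for 95% confidence

t_limits = (mean - t_14 * se, mean + t_14 * se)
z_limits = (mean - z * se, mean + z * se)

print(f"t interval: {t_limits[0]:.2f} to {t_limits[1]:.2f} hours")
print(f"Z interval: {z_limits[0]:.2f} to {z_limits[1]:.2f} hours")
```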

47 What happens as sample gets larger?

48 What happens to CI as sample gets larger?
For large samples: Z and t values become almost identical, so CIs are almost identical.

49 Example Random sample of 64 students spent an average of 3.8 hours on homework last night with a sample standard deviation of 3.1 hours.
Z Confidence Intervals
The assumed sigma = 3.10
Variable N Mean StDev % CI
Homework (3.037, )
T Confidence Intervals
Variable N Mean StDev % CI
Homework (3.022, )

50 One not-so-small problem!
It is only OK to use the t interval for small samples if your original measurements are normally distributed. We’ll learn how to check for normality in a minute.

51 Strategy for deciding how to analyze
If you have a large sample of, say, 60 or more measurements, then don’t worry about normality, and use the t-interval. If you have a small sample and your data are normally distributed, then use the t-interval. If you have a small sample and your data are not normally distributed, then do not use the t-interval, and stay tuned for what to do.
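The strategy above can be written as a small decision function. This is only a sketch: the function name and return strings are made up, the cutoff of 60 is the slide's "say, 60 or more", and whether the data look normal is left as a judgment supplied by the caller.

```python
def choose_interval(n, data_look_normal, large_sample=60):
    """Return which interval the slide's strategy suggests."""
    if n >= large_sample:
        return "t-interval"             # large sample: normality is not a concern
    if data_look_normal:
        return "t-interval"             # small sample, normally distributed data
    return "do not use t-interval"      # small sample, non-normal data

print(choose_interval(64, data_look_normal=False))
print(choose_interval(15, data_look_normal=True))
print(choose_interval(15, data_look_normal=False))
```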

52 Using Minitab, let’s go practice this strategy …

53 Z interval in Minitab Find sample standard deviation by Stat>> Basic Statistics >> Display Descriptive Statistics Ask for Z interval by Stat >> Basic Statistics >> 1-Sample Z…. Select desired variable. Type sample stand. deviation in Sigma box. Specify desired confidence level. OK.

54 t interval in Minitab Ask for t interval by Stat >> Basic Statistics >> 1-Sample t…. Select desired variable. Specify desired confidence level. Say OK.

55 Confidence Intervals Proportions

56 The Tattoo Data On first day 19.3% of you said you have a tattoo
n = 228 responded What percent of all PSU students have a tattoo ? Is 19.3% the answer ?

57 General Objective Use sample data to estimate the population proportion

58 95% Confidence Interval Estimate
An interval showing our best guesses at the population value Example statement: "we think population proportion is between 0.14 and 0.24"

59 Confidence Level for Interval
Chance that interval really "captures" the population value 95% is most commonly used for confidence level although, any level could be used.

60 Interval for a proportion
Rule of sample proportions paves the way.
Sample proportions are described by a bell curve centered at true p.
95% of the time, the sample p is within 2 SDs of true p.
Interval is sample p +/- 2 x SD.

61 Formula For SD Sqrt [ p x ( 1- p) / n ] p is supposed to be true p
Do we know true p? No! We'll use sample p in SD formula

62 For the tattoo data sample p = 0.193 and n=228
estimated SD = sqr. root of [(0.193)(1 - 0.193)/228] = 0.026. 2 SDs = 2 x 0.026 = 0.052. Interval is 0.193 +/- 0.052. 95% confident that PSU proportion is between 0.141 and 0.245.

63 Same Survey Last Spring
sample p = 0.15 and n= 210 Can we conclude that tattoo percentage has increased since last spring ?

64 Confidence Interval Based on Last Spring
SD = sqrt[0.15*(1-0.15)/210] = 0.025. 2 SD = 2 x 0.025 = 0.05. 95% interval is 0.15 +/- 0.05. 95% sure that PSU proportion is between 0.10 and 0.20.

65 Comparison Intervals from Spring and Fall overlap
Can't conclude that the population proportions differ.
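The fall and spring tattoo intervals can be computed side by side. This is a minimal sketch using the slides' "sample p +/- 2 SDs" rule; the function name `two_sd_interval` is hypothetical.

```python
import math

def two_sd_interval(p, n):
    """Approximate 95% CI for a proportion: p +/- 2 * sqrt(p(1 - p)/n)."""
    sd = math.sqrt(p * (1 - p) / n)
    return p - 2 * sd, p + 2 * sd

fall = two_sd_interval(0.193, 228)
spring = two_sd_interval(0.15, 210)

# The intervals overlap if each one starts before the other ends
overlap = fall[0] <= spring[1] and spring[0] <= fall[1]

print(f"fall:   {fall[0]:.3f} to {fall[1]:.3f}")
print(f"spring: {spring[0]:.2f} to {spring[1]:.2f}")
print("overlap:", overlap)
```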

66 Margin of Error
95% margin of error = 2 SDs. Earlier in the term, we said margin of error = 1 / sqr. root(n). When p = 0.5, 2 SD = 1/sqr. root(n).

67 Gallup Poll Question Do you approve of radio personality Howard Stern?
Results were 70% did not, with n = 900; margin of error given as 3%. Determine 95% conf. interval for population percent: sample estimate - and + margin of error is 67% to 73%.
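The link between "2 SDs" and the 1/√n shortcut can be checked numerically for the poll. This is a sketch with hypothetical function names: at p = 0.5 the two formulas agree exactly, and for any other p the 2 SD margin is smaller, so 1/√n is the conservative choice.

```python
import math

def conservative_margin(n):
    """Shortcut margin of error: 1 / sqrt(n)."""
    return 1 / math.sqrt(n)

def two_sd(p, n):
    """Margin of error as 2 SDs of a sample proportion."""
    return 2 * math.sqrt(p * (1 - p) / n)

n = 900
margin = conservative_margin(n)             # 1/30, reported as 3%
lower, upper = 0.70 - 0.03, 0.70 + 0.03     # using the reported 3% margin

print(f"1/sqrt(n) = {margin:.3f}")
print(f"2 SDs at p = 0.5: {two_sd(0.5, n):.3f}")
print(f"2 SDs at p = 0.7: {two_sd(0.7, n):.3f}")
print(f"interval: {lower:.0%} to {upper:.0%}")
```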

68 Interpretation Based on the sample, we are 95% sure that between 67 and 73% of all Americans disapprove of Howard Stern

69 Some Warnings Sample should be random sample - not volunteer sample
Don't need confidence interval if whole population is sampled (rarely done)

