
1 Chapter 5 Inferences Regarding Population Central Values

2 Inferential Methods for Parameters
Parameter: Numeric description of a population
Statistic: Numeric description of a sample
Statistical Inference: Use of observed statistics to make statements regarding parameters
–Estimation: Predicting the unknown parameter based on sample data. Can be either a single number (point estimate) or a range (interval estimate)
–Testing: Using sample data to see whether we can rule out specific values of an unknown parameter with a certain level of confidence

3 Estimating with Confidence
Goal: Estimate a population mean based on the sample mean
Unknown: Parameter (μ)
Known: Approximate sampling distribution of the statistic
Recall: For a random variable that is normally distributed, the probability that it will fall within 2 standard deviations of its mean is approximately 0.95

4 Estimating with Confidence
Although the parameter is unknown, it is highly likely that our sample mean (estimate) will lie within 2 standard deviations (aka standard errors) of the population mean (parameter)
Margin of Error: Measure of the upper bound on sampling error at a fixed level of confidence (we will typically use 95%). That corresponds to 2 standard errors: E = 2σ/√n

5 Confidence Interval for a Mean μ
Confidence Coefficient (1-α): Probability (based on repeated samples and construction of intervals) that a confidence interval will contain the true mean μ
Interval: ȳ ± z_{α/2} σ/√n
Common choices of 1-α and the resulting intervals:
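As a supplement (not part of the original deck), here is a minimal Python sketch of the large-sample z interval ȳ ± z_{α/2} σ/√n described above; it assumes SciPy is available, and the function name and example numbers are purely illustrative.

```python
# Sketch: z-based confidence interval for a population mean,
# assuming sigma is known (or n is large and s is used in its place).
from math import sqrt
from scipy.stats import norm

def z_interval(ybar, sigma, n, conf=0.95):
    """Return (lower, upper) for a (1-alpha) CI: ybar +/- z_{alpha/2} * sigma/sqrt(n)."""
    alpha = 1 - conf
    z = norm.ppf(1 - alpha / 2)      # z_{alpha/2}, e.g. 1.96 for 95%
    margin = z * sigma / sqrt(n)     # margin of error
    return ybar - margin, ybar + margin

# Illustrative numbers (not from the slides): ybar = 41.5, sigma = 7.5, n = 20
print(z_interval(41.5, 7.5, 20))
```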

6 Common choices of 1-α and z_{α/2}:
1-α = 0.90: z_{.05} = 1.645
1-α = 0.95: z_{.025} = 1.960
1-α = 0.99: z_{.005} = 2.576


8 Philadelphia Monthly Rainfall (1825-1869)

9 4 Random Samples of Size n=20, 95% CI’s

10 Factors Affecting Confidence Interval Width
Goal: Have precise (narrow) confidence intervals
–Confidence Level (1-α): Increasing 1-α increases the probability that an interval contains the parameter, which implies a wider confidence interval. Reducing 1-α will shorten the interval (at a cost in confidence)
–Sample size (n): Increasing n decreases the standard error of the estimate, the margin of error, and the width of the interval (quadrupling n cuts the width in half)
–Standard Deviation (σ): The more variable the individual measurements, the wider the interval. Potential ways to reduce σ are to focus on a more precise target population or use a more precise measuring instrument. Often nothing can be done, as nature determines σ

11 Precautions
Data should be a simple random sample from the population (or at least should be treatable as independent observations)
More complex sampling designs require adjustments to the formulas (see texts such as Elementary Survey Sampling by Scheaffer, Mendenhall, and Ott)
Biased sampling designs give meaningless results
Small samples from nonnormal distributions will have coverage probabilities typically below the nominal level (1-α)
Typically σ is unknown. Replacing it with the sample standard deviation s works as a good approximation in large samples

12 Selecting the Sample Size
Before collecting sample data, we usually have a goal for how large the margin of error should be in order to have a useful estimate of the unknown parameter (particularly when comparing two populations)
Let E be the desired margin of error and σ be the standard deviation of the population of measurements (typically unknown; it must be estimated from previous research or a pilot study)
The sample size giving this margin of error is: n = (z_{α/2} σ / E)²
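A short Python sketch (added here, not from the deck) of that sample-size formula, rounding up to a whole number of observations; SciPy is assumed and the example values are illustrative.

```python
# Sketch: sample size for a target margin of error E at confidence 1 - alpha,
# using n = (z_{alpha/2} * sigma / E)^2, rounded up.
from math import ceil
from scipy.stats import norm

def n_for_margin(sigma, E, conf=0.95):
    z = norm.ppf(1 - (1 - conf) / 2)
    return ceil((z * sigma / E) ** 2)

# Example: sigma guessed at 8 from a pilot study, target margin E = 2
print(n_for_margin(sigma=8, E=2))   # 62 with z = 1.96
```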

13 Hypothesis Tests
Method of using sample (observed) data to challenge a hypothesis regarding a state of nature (represented as particular parameter value(s))
Begin by stating a research hypothesis that challenges a statement of the "status quo" (or the equality of 2 populations)
State the current state, or "status quo," as a statement regarding population parameter(s)
Obtain sample data and see to what extent it agrees/disagrees with the "status quo"
Conclude that the "status quo" is not true if the observed data would be highly unlikely (low probability) if it were true

14 Elements of a Hypothesis Test (I)
Null hypothesis (H0): Statement or theory being tested. Stated in terms of parameter(s) and contains an equality. The test is set up under the assumption of its truth.
Alternative hypothesis (Ha): Statement contradicting H0. Stated in terms of parameter(s) and contains an inequality. Accepted only if strong evidence from the sample data refutes H0. May be 1-sided or 2-sided, depending on the theory being tested.
Test Statistic (T.S.): Quantity measuring the discrepancy between the sample statistic (estimate) and the parameter value under H0
Rejection Region (R.R.): Values of the test statistic for which we reject H0 in favor of Ha
P-value: Probability (assuming H0 is true) that we would observe sample data (a test statistic) this extreme or more extreme in favor of the alternative hypothesis (Ha)

15 Example: Interference Effect
Does the way items are presented affect task time?
–Subjects are shown a list of color names printed in 2 ways: a different color vs. black
–y_i is the difference in times to read the lists for subject i: different - black
–H0: No interference effect: mean difference is 0 (μ = 0)
–Ha: Interference effect exists: mean difference > 0 (μ > 0)
–Assume the standard deviation of the differences is σ = 8 (unrealistic*)
–Experiment to be based on n = 70 subjects
How likely are we to observe a sample mean difference ≥ 2.39 if μ = 0?

16 [Figure: sampling distribution of the sample mean under H0 (centered at 0), with the observed mean 2.39 marked and the P-value as the shaded upper-tail area]

17 Elements of a Hypothesis Test (II)
Type I Error: Test resulting in rejection of H0 in favor of Ha when H0 is in fact true
–P(Type I error) = α (typically .10, .05, or .01)
Type II Error: Test resulting in failure to reject H0 in favor of Ha when in fact Ha is true (H0 is false)
–P(Type II error) = β (depends on the true parameter value)
1-Tailed Test: Test where the alternative hypothesis states specifically that the parameter is strictly above (or below) the null value
2-Tailed Test: Test where the alternative hypothesis is that the parameter is not equal to the null value (simultaneously tests "greater than" and "less than")

18 Test Statistic
Parameter: Population mean (μ); its value under H0 is μ0
Statistic (Estimator): Sample mean ȳ obtained from the sample measurements
Standard Error of the Estimator: σ/√n
Sampling Distribution of the Estimator:
–Normal if the distribution of the individual measurements is normal
–Approximately normal regardless of shape for large samples
Test Statistic (labeled simply as z in the text): z_obs = (ȳ - μ0) / (σ/√n)
Note: Typically σ is unknown and is replaced by s in large samples
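A small Python sketch (added here, not from the deck) of this z statistic; the numbers plug in the interference-example values quoted on slide 15 (ȳ = 2.39, σ = 8, n = 70).

```python
# Sketch: large-sample z test statistic for H0: mu = mu0,
# z_obs = (ybar - mu0) / (sigma / sqrt(n)); s replaces sigma when sigma is unknown.
from math import sqrt

def z_statistic(ybar, mu0, sigma, n):
    return (ybar - mu0) / (sigma / sqrt(n))

# Interference example from slide 15: ybar = 2.39, mu0 = 0, sigma = 8, n = 70
print(z_statistic(2.39, 0, 8, 70))   # about 2.50
```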

19 Decision Rules and Rejection Regions
Once a significance level (α) has been chosen, a decision rule can be stated based on a critical value:
2-sided tests: H0: μ = μ0 vs Ha: μ ≠ μ0
–If the test statistic z_obs > z_{α/2}, reject H0 and conclude μ > μ0
–If z_obs < -z_{α/2}, reject H0 and conclude μ < μ0
–If -z_{α/2} < z_obs < z_{α/2}, do not reject H0: μ = μ0
1-sided tests (Upper Tail): H0: μ ≤ μ0 vs Ha: μ > μ0
–If z_obs > z_α, reject H0 and conclude μ > μ0
–If z_obs < z_α, do not reject H0: μ ≤ μ0
1-sided tests (Lower Tail): H0: μ ≥ μ0 vs Ha: μ < μ0
–If z_obs < -z_α, reject H0 and conclude μ < μ0
–If z_obs > -z_α, do not reject H0: μ ≥ μ0

20 Computing the P-Value
2-sided tests (H0: μ = μ0 vs Ha: μ ≠ μ0): How likely is it to observe a sample mean as far or farther from the parameter value under the null hypothesis? After obtaining the sample data, compute the mean, convert it to a z-score (z_obs), and find the area above |z_obs| plus the area below -|z_obs| from the standard normal (z) table
1-sided tests: Obtain the area above z_obs for upper-tail tests (Ha: μ > μ0) or below z_obs for lower-tail tests (Ha: μ < μ0)
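A Python sketch (added, not from the deck) of the three P-value computations described above, using the standard normal distribution from SciPy; the z value 2.50 continues the interference example.

```python
# Sketch: P-values from an observed z statistic using the standard normal distribution.
from scipy.stats import norm

def p_value(z_obs, alternative="two-sided"):
    if alternative == "two-sided":        # area above |z| plus area below -|z|
        return 2 * norm.sf(abs(z_obs))
    elif alternative == "greater":        # upper-tail test
        return norm.sf(z_obs)
    else:                                 # "less": lower-tail test
        return norm.cdf(z_obs)

print(p_value(2.50, "greater"))      # about 0.0062
print(p_value(2.50, "two-sided"))    # about 0.0124
```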

21 Interference Effect (1-sided Test)
Testing whether the population mean time to read a list of color names is higher when each name is printed in a different color
Data: y_i = difference score for subject i (Different - Black)
Null hypothesis (H0): No interference effect (H0: μ ≤ 0)
Alternative hypothesis (Ha): Interference effect (Ha: μ > 0)
n = 70 subjects in the experiment, a reasonably large sample
Conclusion: there is evidence of an interference effect (μ > 0)

22 Interference Effect (2-sided Test)
Testing whether the population mean time to read a list of color names is affected (higher or lower) when each name is printed in a different color
Data: y_i = difference score for subject i (Different - Black)
Null hypothesis (H0): No interference effect (H0: μ = 0)
Alternative hypothesis (Ha): Interference effect, + or - (Ha: μ ≠ 0)
Again, evidence of an interference effect (μ > 0)

23 Equivalence of 2-sided Tests and CIs
For a given α, a 2-sided test conducted at the α significance level gives results equivalent to a (1-α) level confidence interval:
–If the entire interval is above μ0: P-value < α, z_obs > z_{α/2} (conclude μ > μ0)
–If the entire interval is below μ0: P-value < α, z_obs < -z_{α/2} (conclude μ < μ0)
–If the interval contains μ0: P-value > α, -z_{α/2} < z_obs < z_{α/2} (do not conclude μ ≠ μ0)
The confidence interval is the set of parameter values for which we would fail to reject the null hypothesis (based on a 2-sided test)

24 Power of a Test
Power: Probability a test rejects H0 (depends on the true value of μ)
–H0 true: Power = P(Type I error) = α
–H0 false: Power = 1 - P(Type II error) = 1 - β
Example (using the context of the interference data):
–H0: μ = 0, HA: μ > 0, α = 0.05, σ = 8, n = 16
–Decision Rule: Reject H0 (at the α = 0.05 significance level) if: ȳ > 1.645(σ/√n) = 1.645(8/√16) = 3.29

25 Power of a Test
Now suppose that in reality μ = 3.0 (HA is true)
Power now refers to the probability we (correctly) reject the null hypothesis. Note that the sampling distribution of the sample mean is approximately normal with mean 3.0 and standard deviation (standard error) 2.0.
Decision Rule (from the last slide): Conclude the population mean interference effect is positive (greater than 0) if the sample mean difference score is above 3.29
Power for this case can be computed as: P(ȳ > 3.29 | μ = 3.0) = P(Z > (3.29 - 3.0)/2.0) = P(Z > 0.145) ≈ 0.4424
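This power calculation can be reproduced with a few lines of Python (added here, not from the deck; SciPy assumed):

```python
# Sketch: power of the one-sided z test for the interference example
# (H0: mu = 0, Ha: mu > 0, sigma = 8, n = 16, alpha = 0.05, true mu = 3.0).
from math import sqrt
from scipy.stats import norm

sigma, n, alpha, mu_true = 8, 16, 0.05, 3.0
se = sigma / sqrt(n)                      # standard error = 2.0
cutoff = norm.ppf(1 - alpha) * se         # reject H0 if ybar > 3.29
power = norm.sf((cutoff - mu_true) / se)  # P(ybar > cutoff | mu = 3.0)
print(cutoff, power)                      # about 3.29 and 0.4424
```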

26 Power of a Test
All else being equal:
–As the sample size increases, power increases
–As the population variance decreases, power increases
–As the true mean gets further from μ0, power increases

27 [Figure: sampling distributions under H0 and HA with the rejection cutoff marked; shaded areas: α = .05 and 1-α = .95 under H0, power = .4424 and β = .5576 under HA]

28 Power curves for sample sizes of 16, 32, 64, and 80, with true values of μ ranging from 0 to 5 and σ = 8
–For a given μ, power increases with sample size
–For a given sample size, power increases with μ

29 Sample Size Calculations for Fixed Power
Goal: Choose a sample size to have a favorable chance of detecting an important difference from μ0 in the 2-sided test H0: μ = μ0 vs Ha: μ ≠ μ0
Step 1: Define an important difference δ = |μ - μ0| to be detected
–Case 1: σ approximated from prior experience or a pilot study; the difference can be stated in units of the data
–Case 2: σ unknown; the difference must be stated in units of standard deviations of the data
Step 2: Choose the desired power 1-β to detect the important difference (typically at least .80). For a 2-sided test: n = (z_{α/2} + z_β)² σ² / δ²

30 Example - Interference Data
2-sided test: H0: μ = 0 vs Ha: μ ≠ 0
Set α = P(Type I error) = 0.05
Choose an important difference of |μ - μ0| = δ = 2.0
Choose Power = P(Reject H0 | μ = 2.0) = .90
Set β = P(Type II error) = 1 - Power = 1 - .90 = .10
From the study, we know σ ≈ 8
n = (1.96 + 1.282)²(8)²/(2)² ≈ 168.2, so we would need 169 subjects to have a .90 probability of detecting the effect
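A Python sketch (added, not from the deck) of the fixed-power sample-size formula from slide 29; plugging in the example values reproduces the 169 subjects quoted above.

```python
# Sketch: sample size for a 2-sided z test with power 1 - beta against a shift delta,
# n = ((z_{alpha/2} + z_beta) * sigma / delta)^2, rounded up.
from math import ceil
from scipy.stats import norm

def n_for_power(sigma, delta, alpha=0.05, power=0.90):
    z_a = norm.ppf(1 - alpha / 2)
    z_b = norm.ppf(power)
    return ceil(((z_a + z_b) * sigma / delta) ** 2)

print(n_for_power(sigma=8, delta=2.0))   # 169, matching the slide
```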

31 Potential for Abuse of Tests
Choose a significance level (α) in advance and report the test conclusion (significant/nonsignificant) as well as the P-value. A significance level of 0.05 is widely used in the academic literature
Very large sample sizes can detect very small differences in a parameter value. A clinically meaningful effect should be determined, and a confidence interval reported when possible
A nonsignificant test result does not imply there is no effect (that H0 is true)
Many studies test many variables simultaneously, which can increase the overall Type I error rate

32 Family of t-distributions
Symmetric, mound-shaped, and centered at 0, like the standard normal (z) distribution
Indexed by degrees of freedom (df): the number of independent observations (deviations) comprising the estimated standard deviation. For one-sample problems, df = n-1
Heavier tails (more probability over extreme ranges) than the z-distribution
Converges to the z-distribution as df gets large
Tables of critical values for certain upper-tail probabilities are available (Table 3, p. 679)
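As an added illustration (not from the deck), the convergence of t critical values to the z value can be checked with SciPy:

```python
# Sketch: t critical values shrink toward the normal value 1.96 as df grows.
from scipy.stats import t, norm

for df in (5, 15, 30, 120):
    print(df, round(t.ppf(0.975, df), 3))   # upper .025 critical value of t(df)
print("z", round(norm.ppf(0.975), 3))       # 1.96
```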

33 Inference for a Population Mean
Practical problem: The sample mean has a sampling distribution that is normal with mean μ and standard deviation σ/√n (exactly when the data are normal, approximately so for large samples), but σ is unknown
We have an estimate of σ, the sample standard deviation s. The estimated standard error of the sample mean is: s/√n
When the sample is an SRS from N(μ, σ), the t-statistic (the same as z but with the estimated standard deviation), t = (ȳ - μ)/(s/√n), is distributed t with n-1 degrees of freedom

34 [Table: t critical values indexed by upper-tail probability (columns) and degrees of freedom (rows)]


36 One-Sample Confidence Interval for μ
An SRS from a population with mean μ is obtained, and the sample mean and sample standard deviation are computed
Degrees of freedom df = n-1 and confidence level (1-α) are selected
The level (1-α) confidence interval has the form: ȳ ± t_{α/2, n-1} s/√n
The procedure is theoretically derived for normally distributed data, but has been found to work well regardless for large n
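A minimal Python sketch (added, not from the deck) of this one-sample t interval; the function name and example numbers are illustrative placeholders, not the deck's data.

```python
# Sketch: one-sample t confidence interval, ybar +/- t_{alpha/2, n-1} * s/sqrt(n).
from math import sqrt
from scipy.stats import t

def t_interval(ybar, s, n, conf=0.95):
    tcrit = t.ppf(1 - (1 - conf) / 2, df=n - 1)   # t critical value with n-1 df
    margin = tcrit * s / sqrt(n)
    return ybar - margin, ybar + margin

# Illustrative numbers: ybar = 100, s = 15, n = 25
print(t_interval(100, 15, 25))
```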

37 1-Sample t-test (2-tailed alternative)
2-sided test: H0: μ = μ0 vs Ha: μ ≠ μ0
Decision Rule (t_{α/2} such that P(t(n-1) ≥ t_{α/2}) = α/2):
–Conclude μ > μ0 if the test statistic (t_obs) is greater than t_{α/2}
–Conclude μ < μ0 if the test statistic (t_obs) is less than -t_{α/2}
–Do not conclude μ ≠ μ0 otherwise
P-value: 2P(t(n-1) ≥ |t_obs|)
Test Statistic: t_obs = (ȳ - μ0)/(s/√n)
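When raw data are available, SciPy's ttest_1samp carries out this 2-tailed test directly; the sketch below (added, not from the deck) uses simulated difference scores purely for illustration.

```python
# Sketch: two-tailed one-sample t test from raw data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
y = rng.normal(loc=1.0, scale=2.0, size=20)   # simulated difference scores
t_obs, p_two_sided = stats.ttest_1samp(y, popmean=0)
print(t_obs, p_two_sided)
```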

38 [Figure: t(n-1) density with the 2-tailed P-value as the shaded area below -|t_obs| and above |t_obs|]

39 1-Sample t-test (1-tailed (upper) alternative)
1-sided test: H0: μ = μ0 vs Ha: μ > μ0
Decision Rule (t_α such that P(t(n-1) ≥ t_α) = α):
–Conclude μ > μ0 if the test statistic (t_obs) is greater than t_α
–Do not conclude μ > μ0 otherwise
P-value: P(t(n-1) ≥ t_obs)
Test Statistic: t_obs = (ȳ - μ0)/(s/√n)

40 P-value (Upper Tail Test)

41 1-Sample t-test (1-tailed (lower) alternative)
1-sided test: H0: μ = μ0 vs Ha: μ < μ0
Decision Rule (t_α such that P(t(n-1) ≥ t_α) = α):
–Conclude μ < μ0 if the test statistic (t_obs) is less than -t_α
–Do not conclude μ < μ0 otherwise
P-value: P(t(n-1) ≤ t_obs)
Test Statistic: t_obs = (ȳ - μ0)/(s/√n)

42 P-value (Lower Tail Test)

43 Example: Mean Flight Time ATL/Honolulu
Scheduled flight time: 580 minutes
Sample: n = 31 flights in 10/2004 (treated as an SRS from all possible flights)
Test whether the population mean flight time differs from the scheduled time
H0: μ = 580 vs Ha: μ ≠ 580
Critical value (2-sided test, α = 0.05, n-1 = 30 df): t_.025 = 2.042
Sample data, test statistic, and P-value:

44 Inference on a Population Median
Median: "middle" of a distribution (50th percentile)
–Equal to the mean for a symmetric distribution
–Below the mean for a right-skewed distribution
–Above the mean for a left-skewed distribution
Confidence interval for a population median:
–Sort the observations from smallest to largest (y_(1) ≤ ... ≤ y_(n))
–Obtain lower (L_{α/2}) and upper (U_{α/2}) bounds on the ranks
–Small samples: Obtain C_{α(2),n} from Table 5 (p. 682)
–Large samples: C_{α(2),n} can be obtained without the table from the binomial(n, 1/2) distribution (see the sketch after this slide)
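A Python sketch (added, not from the deck) of one common way to get the rank bounds for the median CI, using the binomial(n, 1/2) distribution of the number of observations that fall below the median; for n = 31 it returns ranks (10, 22), consistent with the table value C_.05(2),31 = 9 on the next slide.

```python
# Sketch: distribution-free CI for a median from order statistics.
from scipy.stats import binom

def median_ci_ranks(n, conf=0.95):
    """Return 1-based ranks (L, U) so that (y_(L), y_(U)) covers the median
    with probability at least conf."""
    alpha = 1 - conf
    c = 0
    # largest c with P(Binomial(n, 0.5) <= c) <= alpha/2; plays the role of C_{alpha(2),n}
    while binom.cdf(c + 1, n, 0.5) <= alpha / 2:
        c += 1
    return c + 1, n - c       # L = C + 1, U = n - C

print(median_ci_ranks(31))    # (10, 22) for n = 31
```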

45 Example - ATL/HNL Flight Times
n = 31; small-sample table value: C_{.05(2),31} = 9
Large-sample approximation:

