Presentation on theme: "PARAMETRIC STATISTICAL INFERENCE"— Presentation transcript:
1 PARAMETRIC STATISTICAL INFERENCE
Methodologies that allow us to draw conclusions about population parameters from sample statistics. These methods are based on statistical relationships between samples and populations.
TYPES OF INFERENCE:
· Estimation
· Hypothesis testing
POINT ESTIMATION: estimation of a parameter from a sample statistic (for the mean, standard deviation, etc.).
INTERVAL ESTIMATION: using a sample to identify an interval within which the population parameter is thought to lie, with a certain probability.
2 ESTIMATION OF POPULATION MEAN
· The sample mean $\bar{x}$ is only an estimate of the population mean $\mu$; the parameter value is not known.
· Due to sampling variability, no two samples will produce exactly the same sample mean.
· Can we estimate how the sample mean would vary if we took many large samples from the same population?
Remember:
· Sample means from large samples have an approximately normal distribution.
· The mean of the sampling distribution is the same as the unknown parameter $\mu$.
· The standard deviation of $\bar{x}$ for an SRS of size n is $\sigma/\sqrt{n}$.
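The behaviour of the sample mean across repeated samples can be checked with a small simulation. This is a minimal sketch, assuming a normal population; the values of `mu`, `sigma`, and `n` are hypothetical and chosen only for illustration.

```python
import math
import random
import statistics


def simulate_sample_means(mu, sigma, n, num_samples, seed=0):
    """Draw num_samples random samples of size n from N(mu, sigma); return their means."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        for _ in range(num_samples)
    ]


# Hypothetical population values (not taken from the slides).
mu, sigma, n = 12.0, 2.5, 100
means = simulate_sample_means(mu, sigma, n, num_samples=2000)

mean_of_means = statistics.fmean(means)
empirical_sd = statistics.stdev(means)
theoretical_sd = sigma / math.sqrt(n)

print(f"mean of sample means: {mean_of_means:.3f} (population mean {mu})")
print(f"SD of sample means:   {empirical_sd:.3f} (sigma/sqrt(n) = {theoretical_sd:.3f})")
```

The empirical standard deviation of the simulated means should land close to $\sigma/\sqrt{n}$, and their average close to $\mu$.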
3 PARAMETRIC STATISTICAL INFERENCE: ESTIMATION
Example: A random sample of 350 male college students was asked for the number of units they were taking. The mean was 12.3 units, with a standard deviation of 2.50 units.
What can we say about the mean number of units of all male students at the university? How will the estimated value of the parameter vary from one sample to another, at a given confidence level such as 95%?
Here $\bar{x} = 12.3$ and $s = 2.50$; because the sample is large, assume $\sigma \approx s$.
4 PARAMETRIC STATISTICAL INFERENCE: ESTIMATION
Statistical confidence
Remember the 68-95-99.7 rule:
· In 95% of all samples, the sample mean $\bar{x}$ will lie within 2 standard deviations of $\bar{x}$ from the population mean $\mu$.
· The standard deviation of $\bar{x}$ is $\sigma/\sqrt{n}$. With $s = 2.50$ and $n = 350$, this is $2.50/\sqrt{350} \approx 0.134$, so 2 standard deviations is about 0.27 units.
· In 95% of samples, $\mu$ will lie within 0.27 units of the observed sample mean $\bar{x}$.
· In 95% of all samples, $\bar{x} - 0.27 < \mu < \bar{x} + 0.27$.
Thus, for our sample, the interval runs from $12.3 - 0.27 = 12.03$ to $12.3 + 0.27 = 12.57$.
5 PARAMETRIC STATISTICAL INFERENCE: ESTIMATION
Rephrasing:
We are 95% confident that the interval from 12.03 to 12.57 contains $\mu$.
We have just assigned statistical confidence to our estimate of the parameter. We call this estimated interval a CONFIDENCE INTERVAL for the mean.
6 PARAMETRIC STATISTICAL INFERENCE: ESTIMATION
· But there is still some chance that the true parameter value will not lie in the identified interval, e.g. the SRS chosen was one of the few samples for which $\bar{x}$ is not within 0.27 units of the true mean. 5% of samples will give these incorrect results.
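The statement that roughly 5% of samples yield an interval that misses the true mean can be checked by simulation. The sketch below assumes a normal population with a known standard deviation; `mu`, `sigma`, `n`, and the helper name `coverage_rate` are hypothetical choices for illustration.

```python
import math
import random
import statistics


def coverage_rate(mu, sigma, n, num_samples, z_star=1.96, seed=1):
    """Fraction of simulated 95% intervals (xbar +/- z* sigma/sqrt(n)) that contain mu."""
    rng = random.Random(seed)
    margin = z_star * sigma / math.sqrt(n)
    hits = 0
    for _ in range(num_samples):
        xbar = statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        if xbar - margin < mu < xbar + margin:
            hits += 1
    return hits / num_samples


# Hypothetical population values (not taken from the slides).
rate = coverage_rate(mu=12.0, sigma=2.5, n=50, num_samples=4000)
print(f"fraction of intervals capturing mu: {rate:.3f}")
```

The printed fraction should be close to 0.95; the remaining intervals are the "unlucky" samples described above.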
7 PARAMETRIC STATISTICAL INFERENCE: ESTIMATION
· CONFIDENCE INTERVAL – formal definition
A level C confidence interval for a parameter is defined as
estimate ± margin of error
and gives an interval that will capture the true parameter value in repeated samples with probability C.
· Confidence levels usually range between 90% and 99.9%.
8 PARAMETRIC STATISTICAL INFERENCE: ESTIMATION
BUILDING CONFIDENCE INTERVALS
If we know the parameters $\mu$ and $\sigma$, we can standardize the sample mean $\bar{x}$. The result is the ONE-SAMPLE Z STATISTIC:
$$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$$
The z statistic tells us how far the observed $\bar{x}$ is from $\mu$, in units of the standard deviation of $\bar{x}$. Because $\bar{x}$ has a normal distribution, z has the standard normal distribution N(0,1).
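As a small illustration of the standardization, the sketch below computes a one-sample z statistic. The function name `one_sample_z` and the population mean of 12.0 are assumptions made for the example; the sample mean, standard deviation, and sample size echo the college-units example.

```python
import math


def one_sample_z(xbar, mu, sigma, n):
    """Distance of xbar from mu in units of the standard deviation of xbar."""
    return (xbar - mu) / (sigma / math.sqrt(n))


# mu = 12.0 is a hypothetical population mean used only for illustration.
z = one_sample_z(xbar=12.3, mu=12.0, sigma=2.5, n=350)
print(f"z = {z:.3f}")
```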
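9 PARAMETRIC STATISTICAL INFERENCE: ESTIMATION
· Constructing confidence intervals
When we construct a 95% confidence interval, we are looking for two values, computed from the sample, that will bracket the population mean in 95% of samples. So,
$$P(\text{Low} < \mu < \text{High}) = 0.95$$
Thus,
$$0.95 = P(-1.96 < z < 1.96) = P\left(-1.96 < \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} < 1.96\right) = P\left(\bar{x} - 1.96\,\frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + 1.96\,\frac{\sigma}{\sqrt{n}}\right)$$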
10 PARAMETRIC STATISTICAL INFERENCE: ESTIMATION
· Draw an SRS of size n from a population having unknown mean $\mu$ and known standard deviation $\sigma$. A level C confidence interval for $\mu$ is
$$\bar{x} \pm z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$$
where C is the confidence level and $\alpha = 1 - C$ is the probability that the interval will not capture the true parameter value in repeated samples.
This interval is exact when the population distribution is normal and is approximately correct for large n in other cases.
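The level C interval for a mean with known standard deviation can be sketched in a few lines. The function name `z_confidence_interval` is hypothetical; the usage line applies it to the college-units example from slide 3, treating s = 2.50 as the known standard deviation because the sample is large.

```python
import math
from statistics import NormalDist


def z_confidence_interval(xbar, sigma, n, confidence):
    """Return (low, high) for xbar +/- z_{alpha/2} * sigma / sqrt(n)."""
    alpha = 1.0 - confidence
    z_star = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # upper alpha/2 critical value
    margin = z_star * sigma / math.sqrt(n)
    return xbar - margin, xbar + margin


low, high = z_confidence_interval(xbar=12.3, sigma=2.5, n=350, confidence=0.95)
print(f"95% interval: {low:.2f} to {high:.2f}")
```

Using the exact critical value (about 1.96) rather than the rule-of-thumb value 2 gives a slightly narrower interval.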
11 Confidence intervals and confidence levels on the standard normal curve N(0,1)
Figure 6.5 and Figure 6.6
· $z^* = z_{\alpha/2}$
· C = chosen confidence level: the probability that the interval will capture the parameter
· $(1 - C)/2 = \alpha/2$ = probability that the parameter will lie above the upper confidence limit, and likewise the probability that it will lie below the lower confidence limit
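The critical value for any confidence level can be read off the standard normal curve. A minimal sketch using Python's standard library; the helper name `z_critical` is an assumption for illustration.

```python
from statistics import NormalDist


def z_critical(confidence):
    """Critical value z* = z_{alpha/2} that leaves (1 - C)/2 in each tail of N(0,1)."""
    alpha = 1.0 - confidence
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


for c in (0.90, 0.95, 0.99, 0.999):
    tail = (1.0 - c) / 2.0
    print(f"C = {c:.3f}  tail area = {tail:.4f}  z* = {z_critical(c):.3f}")
```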
12 PARAMETRIC STATISTICAL INFERENCE: ESTIMATION
· Example:
A manufacturer of pharmaceutical products analyzes a specimen from each batch of a product to verify the concentration of the active ingredient. The chemical analysis is not perfectly precise: repeated measurements on the same specimen give slightly different results. The results of repeated measurements follow a normal distribution. The analysis procedure has no bias, so the mean $\mu$ of the population of all measurements is the true concentration in the specimen. The standard deviation of this distribution is known to be $\sigma$ = […] g/l. Three analyses of one specimen give the following concentrations: […]. Calculate the 99% confidence interval for the true concentration.
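The calculation asked for here can be sketched as follows. The transcript does not preserve the slide's numbers, so the standard deviation and the three readings below are assumed values used only to make the sketch runnable; substitute the real figures where available.

```python
import math
import statistics
from statistics import NormalDist

# ASSUMED illustrative values (the slide's own numbers are not in the transcript).
sigma = 0.0068                       # known measurement SD, g/l
readings = [0.8403, 0.8363, 0.8447]  # three analyses of one specimen, g/l

confidence = 0.99
z_star = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)

xbar = statistics.fmean(readings)
margin = z_star * sigma / math.sqrt(len(readings))
low, high = xbar - margin, xbar + margin

print(f"mean = {xbar:.4f} g/l, margin = {margin:.4f} g/l")
print(f"99% interval: {low:.4f} to {high:.4f} g/l")
```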
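13 PARAMETRIC STATISTICAL INFERENCE: ESTIMATION
· INTERVAL ESTIMATION OF $\mu$ WITH UNKNOWN $\sigma$
· $\sigma$ is replaced with its estimate s, which introduces more uncertainty
· The standardized mean then follows STUDENT’S T-DISTRIBUTION with n - 1 degrees of freedom, not the standard normal curve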
14 PARAMETRIC STATISTICAL INFERENCE: ESTIMATION
INTERVAL ESTIMATION OF $\mu$ WITH UNKNOWN $\sigma$
· Intervals derived from the t-distribution are wider than those found with the z-distribution.
· For large samples (n ≥ 30), it makes little practical difference which distribution we use to estimate the confidence interval.
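How much wider the t-based interval is can be seen by comparing critical values. Python's standard library has no t-distribution quantile function, so this sketch hard-codes a few two-sided 95% values as they appear in standard t tables; the dictionary name `T_STAR_95` is an assumption for illustration.

```python
from statistics import NormalDist

# Two-sided 95% critical values t* from a standard t table,
# keyed by degrees of freedom (n - 1).
T_STAR_95 = {2: 4.303, 9: 2.262, 29: 2.045, 99: 1.984}

z_star = NormalDist().inv_cdf(0.975)  # about 1.960

# For the same s and n, the interval width is proportional to the critical value,
# so t*/z* is the factor by which the t interval is wider.
ratios = {df: t_star / z_star for df, t_star in T_STAR_95.items()}

for df, ratio in ratios.items():
    print(f"n = {df + 1:3d}  t* = {T_STAR_95[df]:.3f}  t*/z* = {ratio:.3f}")
```

With n = 3 the t interval is more than twice as wide; by n = 30 the difference is under 5%.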
15 PARAMETRIC STATISTICAL INFERENCE: ESTIMATION
HOW CONFIDENCE INTERVALS BEHAVE
· Ideal situation: high confidence and small margin of error
· Margin of error: $E = z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$
· The smaller the margin of error, the more precise our estimate of $\mu$
16 PARAMETRIC STATISTICAL INFERENCE: ESTIMATION
· Properties of error
1. Error increases with smaller sample size. For any confidence level, larger samples reduce the margin of error.
2. Error increases with larger standard deviation. As variation among the individuals in the population increases, so does the error of our estimate.
3. Error increases with larger z values. There is a tradeoff between confidence level and margin of error.
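The three properties can be verified numerically by changing one input at a time. A minimal sketch; the function name `margin_of_error` and all numeric inputs are hypothetical.

```python
import math
from statistics import NormalDist


def margin_of_error(confidence, sigma, n):
    """E = z_{alpha/2} * sigma / sqrt(n) for a two-sided interval."""
    z_star = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    return z_star * sigma / math.sqrt(n)


# Hypothetical baseline, then vary one input at a time.
base = margin_of_error(0.95, sigma=2.5, n=100)
larger_n = margin_of_error(0.95, sigma=2.5, n=400)      # 1. larger sample
larger_sigma = margin_of_error(0.95, sigma=5.0, n=100)  # 2. more variable population
higher_conf = margin_of_error(0.99, sigma=2.5, n=100)   # 3. higher confidence level

print(f"baseline:      {base:.3f}")
print(f"n x 4:         {larger_n:.3f}")
print(f"sigma x 2:     {larger_sigma:.3f}")
print(f"99% not 95%:   {higher_conf:.3f}")
```

Because n enters through a square root, quadrupling the sample size only halves the margin of error.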
17 Figure 8-10 and 8-11
· Interval width (error) increases with increased confidence level: higher confidence levels have higher z values.
· Error is high in small samples.
18 PARAMETRIC STATISTICAL INFERENCE: ESTIMATION
Example:
Calculate the 99% confidence interval for a sample size of 1, with $\bar{x}$ = […] and $\sigma$ = […].
The 99% confidence interval for n = 3 was […] to […] g/l.
How do these compare in relation to the mean? Which one has the larger margin of error?
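The comparison between n = 1 and n = 3 does not depend on the missing numbers: at a fixed confidence level and standard deviation, the margin of error scales as $1/\sqrt{n}$. A minimal sketch, with the hypothetical helper `margin_ratio`:

```python
import math


def margin_ratio(n_small, n_large):
    """Ratio E(n_small) / E(n_large) at the same confidence level and sigma."""
    return math.sqrt(n_large / n_small)


ratio = margin_ratio(1, 3)
print(f"margin for n = 1 is {ratio:.3f} times the margin for n = 3")
```

Both intervals are centred on their sample mean, but the single-measurement interval is about 1.73 times as wide.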
19 CHOOSING SAMPLE SIZE
· Sometimes we wish to estimate the mean within a certain margin of error, and need to determine the sample size that achieves it. Here is how.
Remember:
$$E = z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$$
To obtain a desired value of E for a given confidence level, solve for n:
$$n = \left(\frac{z_{\alpha/2}\,\sigma}{E}\right)^2$$
· For a given $\sigma$ and confidence level, it is the sample size that determines the margin of error.
· The required sample size depends on the desired level of confidence.
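Solving the margin-of-error formula for n translates directly into code. This is a sketch; the function name `required_sample_size` and the numeric inputs are hypothetical. The result is rounded up, since a fractional sample is impossible and rounding down would exceed the target margin.

```python
import math
from statistics import NormalDist


def required_sample_size(sigma, margin, confidence):
    """Smallest n with z_{alpha/2} * sigma / sqrt(n) <= margin."""
    z_star = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    return math.ceil((z_star * sigma / margin) ** 2)


# Hypothetical inputs: sigma = 2.5 units, target margin 0.25 units, 95% confidence.
n = required_sample_size(sigma=2.5, margin=0.25, confidence=0.95)
print(f"required sample size: {n}")
```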
21 CHOOSING SAMPLE SIZE
Example: Management asks the pharmaceutical laboratory to produce results accurate to within 0.005 g/l with 95% confidence. How many measurements must be averaged to comply with this request?
E = 0.005 g/l
For the 95% confidence level, z = ?
$\sigma$ = […] g/l
22 CHOOSING SAMPLE SIZE
Example: Management asks the pharmaceutical laboratory to produce results accurate to within 0.005 g/l with 95% confidence. How many measurements must be averaged to comply with this request?
E = 0.005 g/l
For the 95% confidence level, z = 1.96
$\sigma$ = […] g/l
Is n = 7 or n = 8? Choose the one that gives the smaller margin of error. In which direction should we always round to meet the requirement?
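The choice between n = 7 and n = 8 can be checked by computing the margin of error for each. The transcript does not preserve the standard deviation, so the value of `sigma` below is an assumed figure used only for illustration.

```python
import math

z_star = 1.96    # 95% confidence
target = 0.005   # required margin of error, g/l
sigma = 0.0068   # ASSUMED measurement SD in g/l (not preserved in the transcript)

raw_n = (z_star * sigma / target) ** 2
margin_7 = z_star * sigma / math.sqrt(7)
margin_8 = z_star * sigma / math.sqrt(8)
chosen_n = math.ceil(raw_n)  # always round up

print(f"unrounded n = {raw_n:.2f}")
print(f"margin with n = 7: {margin_7:.5f} g/l")
print(f"margin with n = 8: {margin_8:.5f} g/l")
print(f"chosen n = {chosen_n}")
```

Under this assumption, rounding down to 7 leaves the margin slightly above 0.005 g/l, while rounding up to 8 meets the requirement.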
23 SUMMARY
All formulas for inference are correct only under certain conditions.
o Most inference methods have several assumptions attached to them that must be met if the outcomes they produce are to be reliable.
The confidence interval formula has the following assumptions:
1. The data must come from a simple random sample.
· Different methods exist for stratified and multistage samples.
· Undercoverage and non-response can add error.
2. $\bar{x}$ must be a normally distributed random variable.
3. There must be no outliers. Is the formula sensitive to outliers?
4. If the sample size is small (< 15) and/or $\sigma$ is not known, but the distribution of x is still normal, the t-distribution must be used to compute the interval.
5. When $\sigma$ is known, use the z-distribution.
For large sample sizes we can assume that $\sigma \approx s$ and use either the z or the t distribution.
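The question about outliers in assumption 3 can be explored with a small experiment. Because the interval is centred on the sample mean, a single extreme value shifts the whole interval. This is a sketch with made-up data and an assumed known standard deviation; the helper name `z_interval` is hypothetical.

```python
import math
import statistics
from statistics import NormalDist


def z_interval(data, sigma, confidence=0.95):
    """Interval xbar +/- z_{alpha/2} * sigma / sqrt(n) computed from raw data."""
    z_star = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    xbar = statistics.fmean(data)
    margin = z_star * sigma / math.sqrt(len(data))
    return xbar - margin, xbar + margin


# Made-up data for illustration; sigma is assumed known.
clean = [12.1, 11.8, 12.4, 12.0, 12.2]
with_outlier = clean + [30.0]
sigma = 0.25

low_clean, high_clean = z_interval(clean, sigma)
low_out, high_out = z_interval(with_outlier, sigma)

print(f"without outlier: {low_clean:.2f} to {high_clean:.2f}")
print(f"with outlier:    {low_out:.2f} to {high_out:.2f}")
```

With the outlier included, the interval no longer overlaps the interval from the clean data, which suggests the formula is indeed sensitive to outliers.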