Presentation on theme: "Estimation of Statistical Parameters"— Presentation transcript:
1 Estimation of Statistical Parameters
Estimation theory is a branch of statistics based on measured/empirical data that has a random component. An estimator attempts to approximate the unknown parameters using the measurements. In statistics, estimation refers to the process by which one makes inferences about a population, based on information obtained from a sample.
2 OUTLINE
Objectives:
- Describe the characteristics of the normal distribution in statistical terms
- Explain the concept of a confidence interval and how it relates to an estimated parameter
3 Point Estimate vs. Interval Estimate
Statisticians use sample statistics to estimate population parameters. For example:
- sample means are used to estimate population means;
- sample proportions are used to estimate population proportions.
4 Point Estimate vs. Interval Estimate
An estimate of a population parameter may be expressed in two ways:
- Point estimate. A point estimate of a population parameter is a single value of a statistic. For example, the sample mean x̄ is a point estimate of the population mean μ. Similarly, the sample proportion p is a point estimate of the population proportion P.
- Interval estimate. An interval estimate is defined by two numbers, between which a population parameter is said to lie. For example, a < μ < b is an interval estimate of the population mean μ. It indicates that the population mean is greater than a but less than b.
5 Confidence Intervals
Statisticians use a confidence interval to express the precision and uncertainty associated with a particular sampling method. A confidence interval consists of three parts:
- a confidence level,
- a statistic,
- a margin of error.
The confidence level describes the uncertainty of a sampling method. The statistic and the margin of error define an interval estimate that describes the precision of the method. The interval estimate of a confidence interval is defined by:
sample statistic ± margin of error
The probability part of a confidence interval is called the confidence level. It describes how strongly we believe that a particular sampling method will produce a confidence interval that includes the true population parameter.
6 Standard Error
To compute a confidence interval for a statistic, you need to know the standard deviation or the standard error of the statistic. This lesson describes how to find the standard deviation and standard error, and shows how the two measures are related.
7 Notation
The following notation is helpful when we talk about the standard deviation and the standard error.
Population parameter                          | Sample statistic
N: number of observations in the population   | n: number of observations in the sample
μ: population mean                            | x̄: sample estimate of the population mean
σ: population standard deviation              | s: sample estimate of σ
8 Standard Deviation of Sample Estimates
Statisticians use sample statistics to estimate population parameters. Naturally, the value of a statistic may vary from one sample to the next. The variability of a statistic is measured by its standard deviation.
Statistic        | Standard deviation
Sample mean, x̄   | $\sigma_{\bar{x}} = \sigma / \sqrt{n}$
Statistic        | Standard error
Sample mean, x̄   | $SE_{\bar{x}} = s / \sqrt{n}$
The equations for the standard error are identical to the equations for the standard deviation, except for one thing: the standard error equations use statistics where the standard deviation equations use parameters. Specifically, the standard error equations use p in place of P, and s in place of σ.
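As a minimal sketch of the standard error of the mean, the snippet below computes s and s/√n with Python's standard library. The measurements and the function name `standard_error` are invented for illustration and do not come from the slides.

```python
import math
import statistics

def standard_error(sample):
    """Standard error of the sample mean: s / sqrt(n)."""
    n = len(sample)
    s = statistics.stdev(sample)  # sample standard deviation (n - 1 in the denominator)
    return s / math.sqrt(n)

# Hypothetical measurements, invented for illustration only.
sample = [98, 102, 105, 110, 95, 100, 104, 106, 99]
se = standard_error(sample)
print(f"n = {len(sample)}, mean = {statistics.mean(sample):.2f}, SE = {se:.3f}")
```

Because the population σ is rarely known, the sketch uses s, which is exactly the substitution that turns the standard deviation formula into the standard error formula.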
9 Central Limit Theorem
The distribution of sample means (the sampling distribution) from a population is approximately normal if the sample size is large.
1. The population distribution can be non-normal.
2. If the population has mean μ, then the mean of the sampling distribution is $\mu_{\bar{x}} = \mu$.
3. If the population has variance σ², the standard deviation of the sampling distribution, or the standard error (a measure of the amount of sampling error), is $\sigma_{\bar{x}} = \sigma / \sqrt{n}$.
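The three statements of the theorem can be illustrated with a small simulation. This is only a sketch: the exponential population, the sample size of 50, the number of repeats, and the helper name `sample_means` are all assumptions chosen for illustration.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

def sample_means(draw, n, repeats):
    """Draw `repeats` samples of size n and return the mean of each sample."""
    return [statistics.fmean([draw() for _ in range(n)]) for _ in range(repeats)]

# Hypothetical non-normal population: exponential with mean 1 and standard deviation 1.
mu, sigma, n = 1.0, 1.0, 50
means = sample_means(lambda: random.expovariate(1.0), n, repeats=5000)

mean_of_means = statistics.fmean(means)
sd_of_means = statistics.stdev(means)
print(f"mean of sample means: {mean_of_means:.3f} (theory: {mu})")
print(f"SD of sample means:   {sd_of_means:.3f} (theory: {sigma / n ** 0.5:.3f})")
```

Even though the population is strongly skewed, the simulated sample means should centre near μ with a spread near σ/√n.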
10 Estimation & Confidence Intervals
Normal distribution:
- Gaussian distribution
- Symmetric
- Not skewed
- Unimodal
- Described by two parameters
Probability density function:
$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$
- μ and σ are parameters
- μ = mean
- σ = standard deviation
- π, e = constants
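The normal density with mean μ and standard deviation σ can be written out directly in code. The sketch below uses a hypothetical helper, `normal_pdf`, and cross-checks it against `statistics.NormalDist` from the Python standard library.

```python
import math
from statistics import NormalDist

def normal_pdf(x, mu, sigma):
    """Density of the normal distribution with mean mu and standard deviation sigma."""
    coefficient = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    exponent = -((x - mu) ** 2) / (2.0 * sigma ** 2)
    return coefficient * math.exp(exponent)

# Cross-check against the standard library at a few points.
for x in (-1.0, 0.0, 2.5):
    assert math.isclose(normal_pdf(x, 0.0, 1.0), NormalDist(0.0, 1.0).pdf(x))

print(normal_pdf(0.0, 0.0, 1.0))  # peak of the standard normal curve, about 0.3989
```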
11 Estimation of Confidence Intervals
Normal distribution: why do we use it?
- Many biological variables follow a normal distribution
- The normal distribution is well understood mathematically
Point estimation:
- Is a single value that estimates a theoretical parameter
- m (sample mean) is a point estimate of μ (population mean)
- Is influenced by sampling fluctuations
- Could be very far from the real value of the estimated parameter
13 Why Confidence Intervals?
We are not only interested in finding the point estimate for the mean, but also in determining how accurate the point estimate is. The Central Limit Theorem plays a key role here. We assume that the sample standard deviation is close to the population standard deviation (which will almost always be true for large samples). Then the Central Limit Theorem tells us that the standard deviation of the sampling distribution is
$\sigma_{\bar{x}} = \sigma / \sqrt{n} \approx s / \sqrt{n}$
We will be interested in finding an interval around x̄ such that there is a large probability that the actual mean falls inside of this interval. This interval is called a confidence interval and the large probability is called the confidence level.
14 Definitions
A confidence interval is a range around the sample estimate in which the population parameter is expected to fall with a specified degree of confidence, usually 95% (a significance level of 5%).
P[lower critical value < parameter < upper critical value] = 1 − α
α = significance level
The range defined by the critical values will contain the population parameter with a probability of 1 − α.
It is applied when variables are normally distributed!
16 Confidence Intervals
95% confidence interval for μ:
Definition 1: You can be 95% sure that the true mean (μ) will fall within the upper and lower bounds.
Definition 2: 95% of the intervals constructed using sample means (x̄) will contain the true mean (μ).
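Definition 2 can be checked by simulation: build many intervals from repeated samples and count how often they contain μ. The sketch below is illustrative only. The population values (μ = 100, σ = 15), the sample size, and the helper name `covers` are assumptions, and σ is treated as known to keep the code short.

```python
import random
from statistics import fmean

random.seed(1)  # fixed seed so the sketch is repeatable

def covers(mu, sigma, n, z=1.96):
    """Draw one sample and report whether its 95% interval contains mu."""
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    x_bar = fmean(sample)
    margin = z * sigma / n ** 0.5  # sigma treated as known to keep the sketch simple
    return x_bar - margin <= mu <= x_bar + margin

# Hypothetical population values, chosen only for illustration.
trials = 4000
hits = sum(covers(mu=100.0, sigma=15.0, n=25) for _ in range(trials))
coverage = hits / trials
print(f"{coverage:.1%} of the intervals contain the true mean")
```

The printed share should land close to 95%, which is what the confidence level promises about the method rather than about any single interval.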
17 Confidence Intervals
A confidence interval is calculated taking into consideration:
- The sample or population size
- The type of investigated variable (qualitative or quantitative)
The formula has two parts:
- An estimator of the quality of the sample from which the population estimate was computed (standard error)
  - Standard error is a measure of how good our best guess is.
  - The bigger the sample, the smaller the standard error.
  - The standard error of the mean is always smaller than the standard deviation (for n > 1).
- Degree of confidence (Zα score)
A confidence interval can be calculated for any estimator, but it is most frequently used for the mean.
18 Confidence Intervals for Means
The standard error of the mean is equal to the standard deviation divided by the square root of the number of observations:
$SE_{\bar{x}} = s / \sqrt{n}$
- If the standard deviation is high, the chance of error in the estimator is high.
- If the sample size is large, the chance of error in the estimator is small.
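The effect of sample size on the standard error can be seen with a few numbers. In this sketch the value s = 6 and the sample sizes are hypothetical, as is the helper name `se_of_mean`.

```python
import math

def se_of_mean(s, n):
    """Standard error of the mean for sample standard deviation s and sample size n."""
    return s / math.sqrt(n)

s = 6.0  # hypothetical sample standard deviation
for n in (9, 36, 144):
    print(f"n = {n:3d}  SE = {se_of_mean(s, n):.2f}")
# Quadrupling n halves the standard error: 2.00, 1.00, 0.50
```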
19 Confidence Intervals for Means
- The lower confidence limit is smaller than the mean.
- The upper confidence limit is higher than the mean.
- For 95% confidence intervals: Z5% = 1.96
- For 99% confidence intervals: Z1% = 2.58
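The Z values 1.96 and 2.58 are two-sided quantiles of the standard normal distribution. As a sketch, they can be recovered with `statistics.NormalDist.inv_cdf`; the helper name `z_critical` is hypothetical.

```python
from statistics import NormalDist

def z_critical(confidence):
    """Two-sided critical value of the standard normal for a given confidence level."""
    alpha = 1.0 - confidence
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: z = {z_critical(level):.3f}")
```

The 90% value is about 1.645, which tables often round to 1.65.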
20 Confidence Interval for a Mean When the Population Standard Deviation is Unknown
When the population is normal or the sample size is large, the sampling distribution will also be normal, but using s to replace σ is not that accurate. The smaller the sample size, the worse the approximation will be. Hence we can expect that some adjustment will be made based on the sample size. The adjustment is that we do not use the normal curve for this approximation. Instead, we use the Student t distribution, which depends on the sample size. We proceed as before, but we change the table that we use. This distribution looks like the normal distribution, but as the sample size decreases it spreads out. For large n it nearly matches the normal curve. We say that the distribution has n − 1 degrees of freedom.
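21 Confidence Intervals
CI for μ if n > 120:
90% CI: $\bar{x} \pm 1.65 \, (s / \sqrt{n})$
95% CI: $\bar{x} \pm 1.96 \, (s / \sqrt{n})$
99% CI: $\bar{x} \pm 2.58 \, (s / \sqrt{n})$
CI for μ if n < 120:
90% CI: $\bar{x} \pm t_{\alpha, n-1} \, (s / \sqrt{n})$ with α = 0.10
95% CI: $\bar{x} \pm t_{\alpha, n-1} \, (s / \sqrt{n})$ with α = 0.05
99% CI: $\bar{x} \pm t_{\alpha, n-1} \, (s / \sqrt{n})$ with α = 0.01
where $t_{\alpha, n-1}$ is read from the t table at significance level α and n − 1 degrees of freedom.
The Excel function is T.INV.2T(probability, deg_freedom).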
23 Confidence Intervals for Means
The mean blood sugar concentration in a sample of 121 patients is 105 and the variance is 36. What is the confidence interval for the mean blood sugar concentration of the population from which the sample was drawn? Use a significance level of 5% (Z = 1.96). Blood sugar concentration is assumed to be normally distributed.
n = 121
s² = 36
s = 6
m = 105
$[105 - 1.96 \cdot 6/\sqrt{121};\ 105 + 1.96 \cdot 6/\sqrt{121}]$
[103.93; 106.07]
≈ [104; 106]
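The arithmetic of this example can be reproduced step by step. The variable names below are illustrative, and the numbers are the ones given on the slide.

```python
import math

n = 121          # sample size
x_bar = 105.0    # sample mean
variance = 36.0  # sample variance
z = 1.96         # critical value for a 5% significance level

s = math.sqrt(variance)   # sample standard deviation: 6.0
se = s / math.sqrt(n)     # standard error: 6 / 11
margin = z * se           # margin of error
lower, upper = x_bar - margin, x_bar + margin
print(f"95% CI: [{lower:.2f}; {upper:.2f}]")  # 95% CI: [103.93; 106.07]
```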
24 Comparing Means by Using Confidence Levels
[Figure: mean (x̄) and confidence interval (CI) of TAS (mmHg) for Treatment A, Treatment B, and Treatment C; axis ticks at 100 and 200.]
25 Problem
A fellow wanted to determine the average serum creatinine level among healthy elderly male subjects from Timisoara city. From the literature she could not find any information on μ or σ of serum creatinine among local healthy elderly males. She measured 15 healthy elderly male volunteers from Timisoara city; the sample mean sCr was 0.94 mg/dL with a sample standard deviation of 0.15 mg/dL. What is the 95% CI for μ?
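One way to work this problem is sketched below. Because σ is unknown and n = 15 is small, it uses the t-based interval x̄ ± t · s/√n. The critical value 2.145 (two-sided 95%, 14 degrees of freedom) is an assumption taken from a standard t table, not from the slides, and the helper name `t_interval` is hypothetical.

```python
import math

def t_interval(x_bar, s, n, t_crit):
    """Confidence interval for a mean when sigma is unknown: x_bar ± t * s / sqrt(n)."""
    margin = t_crit * s / math.sqrt(n)
    return x_bar - margin, x_bar + margin

# t_crit is an assumption taken from a standard t table:
# two-sided 95% with n - 1 = 14 degrees of freedom gives about 2.145.
lower, upper = t_interval(x_bar=0.94, s=0.15, n=15, t_crit=2.145)
print(f"95% CI: [{lower:.2f}; {upper:.2f}] mg/dL")
```

Using Z = 1.96 instead of the t value would give a slightly narrower interval, which understates the uncertainty for a sample this small.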
28 Example
Suppose a student measuring the boiling temperature of a certain liquid observes the readings (in degrees Celsius) 102.5, 101.7, 103.1, 100.9, 100.5, and […] on 6 different samples of the liquid. He calculates the sample mean to be […]. If he knows that the standard deviation for this procedure is 1.2 degrees, what is the confidence interval for the population mean at a 95% confidence level?
In other words, the student wishes to estimate the true mean boiling temperature of the liquid using the results of his measurements. If the measurements follow a normal distribution, then the sample mean will have the distribution N(μ, σ/√n). Since the sample size is 6, the standard deviation of the sample mean is equal to 1.2/sqrt(6) = 0.49.
29 Remember!
Correct estimation of a statistical parameter is done with confidence intervals (CI).
Confidence intervals depend on the sample size and the standard error.
The confidence interval is wider for:
- High values of standard error
- Small sample sizes