© 2003 All Rights Reserved, Robi Polikar, Rowan University, Dept. of Electrical and Computer Engineering Lecture 8 Engineering Statistics Part II: Estimation.


1 Lecture 8 Engineering Statistics Part II: Estimation EC 2 Polikar

2 Review of Basic Concepts of Statistics
- Statistics is used to make generalized decisions about a population by analyzing only a small sample drawn from that population.
- Parameter (describes the population) vs. statistic (computed from a sample)
- Important statistical quantities: mean, median, mode, standard deviation, variance
- Population mean and sample mean; population variance and sample variance; sample standard deviation
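The quantities listed above can be computed with Python's standard `statistics` module. This is only a sketch: the data values are hypothetical, chosen for illustration, and the slide's own formulas are not preserved in this transcript. The point it shows is the difference between the population variance (divide by N) and the sample variance (divide by N - 1).

```python
import statistics

# Hypothetical measurements, invented purely for illustration.
sample = [4.9, 5.1, 5.0, 5.3, 4.7, 5.0]

mean = statistics.mean(sample)            # arithmetic mean
median = statistics.median(sample)        # middle value of the sorted data
mode = statistics.mode(sample)            # most frequent value
pop_var = statistics.pvariance(sample)    # population variance: divides by N
samp_var = statistics.variance(sample)    # sample variance: divides by N - 1
samp_std = statistics.stdev(sample)       # square root of the sample variance

print(mean, median, mode, pop_var, samp_var, samp_std)
```

The sample variance is always at least as large as the population variance computed from the same numbers, because it divides by the smaller quantity N - 1.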

3 Normal (Gaussian) Distribution Function
[Figure: Statistical Distributions. Normalized distribution function plotted against the distribution variable x. The value of x at the peak is the mean μ, and the inflection point marks the standard deviation σ. About 68.2% of the area lies within ±σ of the mean, 95.4% within ±2σ, and 99.7% within ±3σ.]

4 The Gaussian Curve
With x measured from the mean:
- Area from -σ ≤ x ≤ +σ is about 68.2% of the total area (x1 = -σ; x2 = σ)
- Area from -2σ ≤ x ≤ +2σ is about 95.4% of the total area (x1 = -2σ; x2 = 2σ)
- Area from -3σ ≤ x ≤ +3σ is about 99.7% of the total area (x1 = -3σ; x2 = 3σ)
Distribution function:
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-\mu)^2 / (2\sigma^2)}$$
Area under the curve between $x_1$ and $x_2$:
$$A = \int_{x_1}^{x_2} f(x)\,dx$$
The area under the Gaussian curve has no closed-form expression, so it is difficult to compute analytically. Therefore, standardized tables generated for this particular purpose are used. The standardization assumes a mean of zero and a variance of 1.

5 Using Gaussian Tables
Normalization to use standard tables:
$$z = \frac{x - \mu}{\sigma}$$
- The area under the curve on each side of zero is 0.5. The curve is symmetric, and the total area is 1.
- Example: if z = 0.82, the area under the curve for [0, 0.82] is 0.294.
- Total area for (-∞, 0.82] = 0.5 + 0.294 = 0.794. This value is the probability that z < 0.82.
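The table lookup can be reproduced numerically. The sketch below is not part of the original slides; it assumes the standard identity relating the normal curve to the error function, and the function names are illustrative.

```python
import math

def standardize(x, mu, sigma):
    """Convert a raw value x to a z-score (mean 0, variance 1)."""
    return (x - mu) / sigma

def area_zero_to_z(z):
    """Area under the standard normal curve between 0 and z (what the table lists)."""
    return 0.5 * math.erf(z / math.sqrt(2.0))

def prob_less_than(z):
    """P(Z < z): the half-area to the left of zero plus the tabulated area."""
    return 0.5 + area_zero_to_z(z)

print(round(area_zero_to_z(0.82), 3))
print(round(prob_less_than(0.82), 3))
```

For z = 0.82 these values agree with the table entries quoted on the slide to three decimal places.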

6 Example
The chip manufacturing company Lentil ® produces its much anticipated chip Pantsium © XIX running at 66.666 THz. The rival company DAM © manufactures its chip Craplon © 66++, also running at 66.666 THz. However, DAM claims that Lentil's chip is flawed and cannot run any faster than 63 THz. Lentil, which manufactures 100,000 chips every day, decides to test its chips. They take a sample of 1% (1000 chips) and find that the mean speed of these chips is 65.980 THz with a std. dev. of 1.2 THz. Assuming that the chip speed is normally distributed, is Lentil's speed claim justifiable?
(a) Assume that the claim is justifiable if 95% of the chips lie within the speed limits of 65 to 67 THz.
(b) Now assume that the claim is justifiable if 90% of the chips run faster than 65.0 THz.
Standardizing the limits gives z = (65 - 65.98)/1.2 ≈ -0.82 and z = (67 - 65.98)/1.2 = +0.85, with table areas of 0.294 and 0.302, respectively.
(a) The probability that a Lentil chip has a speed in the range [65, 67] THz is 0.294 + 0.302 = 0.596. Thus only 59.6% of the chips satisfy the criterion, well short of the required 95%.
(b) The probability that a Lentil chip has a speed larger than 65 THz is 0.294 + 0.5 = 0.794. That is, roughly 80% of the chips satisfy the criterion, short of the required 90%.
In any case, however, Lentil does better than DAM's claim of 63 THz. What % of Lentil chips run over 63 THz? (Ans. 99.3%)
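The chip example can be checked numerically. The following is a sketch, not part of the original slides; it uses the error function in place of a printed table, and the helper names are illustrative. Because the slide rounds z to two decimals before the table lookup, the computed values differ from the slide's in the third decimal place.

```python
import math

def phi(z):
    """Cumulative probability P(Z < z) for the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Sample statistics quoted in the example (THz).
mean_speed = 65.980
std_speed = 1.2

def z_score(x):
    """Standardize a chip speed against the sample mean and std. dev."""
    return (x - mean_speed) / std_speed

p_65_to_67 = phi(z_score(67.0)) - phi(z_score(65.0))  # criterion (a)
p_above_65 = 1.0 - phi(z_score(65.0))                 # criterion (b)
p_above_63 = 1.0 - phi(z_score(63.0))                 # DAM's claimed limit

print(f"P(65 <= speed <= 67) = {p_65_to_67:.3f}")
print(f"P(speed > 65)        = {p_above_65:.3f}")
print(f"P(speed > 63)        = {p_above_63:.3f}")
```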

7 Estimation Theory & Confidence Intervals
- Point estimate vs. interval estimate
  - Bulb wattage: 60 W vs. 60 ± 5 W → 55 W ~ 65 W
  - Part length: 5.28 cm vs. 5.28 ± 0.03 cm → 5.25 ~ 5.31 cm
  - Flight time: 11 h vs. 11 h ± 15 min → 10 h 45 min ~ 11 h 15 min
  - Scientific polls: 59% will vote for XYZ (margin of error 4%)
- How confident can we be about such interval estimates?
  - Are we 75% sure? 90% sure? 95% sure? What does it mean to be 95% sure?
- Confidence level: the percentage of confidence
- Confidence interval: the interval in which we have a certain confidence that a value lies

8 Confidence Intervals
- Recall: a normally distributed quantity lies within one, two, or three standard deviations of its mean 68.27%, 95.45%, and 99.73% of the time, respectively.
- Example: Let's assume that the average height at Rowan is 176 cm, with a standard deviation of 5 cm.
  - 68.27% of Rowan students are within 176 ± 5 cm → 171 ~ 181 cm
  - 95.45% of Rowan students are within 176 ± 2×5 cm → 166 ~ 186 cm
  - 99.73% of Rowan students are within 176 ± 3×5 cm → 161 ~ 191 cm
- Thus, we are 95.45% sure that the height of a Rowan student is between 166 and 186 cm.
- Note that these numbers hold for variables that are normally distributed. In most practical scenarios, a statistic computed from a sample of size greater than 30 is approximately normally distributed.

9 How to Compute Confidence Intervals
If the statistic is the sample mean, then the confidence limits (end points of the interval) are given by
$$\bar{X} \pm z_c \frac{\sigma}{\sqrt{N}} \qquad (1)$$
$$\bar{X} \pm z_c \frac{\sigma}{\sqrt{N}} \sqrt{\frac{M-N}{M-1}} \qquad (2)$$
where $\bar{X}$ is the sample mean, $z_c$ is the critical value obtained from normal distribution tables based on the desired confidence, $\sigma$ is the population* std. dev., and $N$ is the sample size. Use Eq. (2) for finite populations of size M, and use Eq. (1) for infinite (very large) populations.

Confidence Level (%)    Critical Value z_c
99.73                   3.00
99                      2.58
98                      2.33
96                      2.05
95.45                   2.00
95                      1.96
90                      1.645
80                      1.28
68.27                   1.00
50                      0.675

* Since the population std. dev. is usually unknown, it is estimated by the sample std. dev.

10 How To…
Ex: 98% confidence means that the value we estimate must lie within the specified limits 98% of the time. Thus the area under the curve, taken over both sides of the mean, must be 0.98. Since the curve is symmetric, the area on each side of the mean is 0.49. The z_c value corresponding to an area of 0.49 is 2.33. For 93% confidence, the area on each side is 0.465 → z_c = 1.81
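The same lookup can be done in code instead of with a printed table. This is a minimal sketch, assuming Python 3.8 or later for `statistics.NormalDist`; the function name `critical_value` is illustrative and not from the slides.

```python
from statistics import NormalDist

def critical_value(confidence):
    """Return z_c such that the area between -z_c and +z_c equals `confidence`.

    Half of the confidence area lies on each side of the mean, so the
    cumulative probability up to +z_c is 0.5 + confidence / 2.
    """
    area_one_side = confidence / 2.0
    return NormalDist().inv_cdf(0.5 + area_one_side)

for level in (0.98, 0.95, 0.93):
    print(f"{level:.0%} confidence: z_c = {critical_value(level):.2f}")
```

Rounded to two decimals, the results match the table values quoted on the slides for these three levels.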

11 Example
Measurements of the diameters of a random sample of 200 ball bearings made by a certain machine have a mean of 0.824 in and a std. dev. of 0.042 in. What are the 95% and 99% confidence limits for the mean diameter of the ball bearings?
- 95% confidence limits: half the area under the curve = 0.475 → z_c = 1.96. The confidence limits are therefore 0.824 ± z_c σ/√N = 0.824 ± 1.96 × 0.042/√200 = 0.824 ± 0.0058 in.
- 99% confidence limits: half the area under the curve = 0.495 → z_c = 2.58. The confidence limits are therefore 0.824 ± z_c σ/√N = 0.824 ± 2.58 × 0.042/√200 = 0.824 ± 0.0077 in.
- Note 1: We use the sample std. dev. s as an estimate of the population std. dev. σ.
- Note 2: The 95% confidence interval has a width of 0.0116 in, which is narrower than the 0.0154 in width of the 99% confidence interval. This makes sense, because the interval that must contain the true value becomes wider as we demand a higher confidence.
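The arithmetic in this example can be sketched as a small function. The function name and the optional `population_size` argument are illustrative assumptions; the finite-population factor shown is the standard correction sqrt((M - N) / (M - 1)), which the ball bearing example itself does not use.

```python
import math

def mean_confidence_limits(sample_mean, std_dev, n, z_c, population_size=None):
    """Confidence limits for a mean: sample_mean +/- z_c * std_dev / sqrt(n).

    If population_size (M) is given, the standard finite-population
    correction sqrt((M - n) / (M - 1)) is applied to the half-width.
    """
    half_width = z_c * std_dev / math.sqrt(n)
    if population_size is not None:
        half_width *= math.sqrt((population_size - n) / (population_size - 1))
    return sample_mean - half_width, sample_mean + half_width

# Ball bearing data from the example: N = 200, mean 0.824 in, std. dev. 0.042 in.
low95, high95 = mean_confidence_limits(0.824, 0.042, 200, z_c=1.96)
low99, high99 = mean_confidence_limits(0.824, 0.042, 200, z_c=2.58)

print(f"95%: {low95:.4f} to {high95:.4f} in")
print(f"99%: {low99:.4f} to {high99:.4f} in")
```

Passing a hypothetical `population_size` shrinks the interval slightly, since sampling a noticeable fraction of a finite population leaves less uncertainty about its mean.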

12 For Populations
If the statistic to be estimated is a proportion of "successes", then the confidence limits for p, the proportion of success (the probability of success), are
$$P \pm z_c \sqrt{\frac{P(1-P)}{N}}$$
for infinite (very large) sample sizes, where N is the sample size. A second expression, for a sample size of M > 30, is not preserved in this transcript.
P is the sample probability of success, and p is the population probability of success. We will use the sample estimate P in place of the population value p in our calculations.

13 Example
- In an exit poll, a news network asks 300 people (from a state of 9 million) for whom they voted, and 55% say they voted for XYZ. Can the network declare the candidate XYZ the winner with 95% confidence?
- For 95% confidence, the confidence interval is $0.55 \pm 1.96\sqrt{0.55 \times 0.45 / 300}$ = 0.55 ± 0.056 → 0.494 ~ 0.606
- This means that the network can, at best, be 95% confident that the actual vote the candidate received is between 0.494 and 0.606. In other words, if 55% of 300 people said they voted for XYZ, then there is a 95% probability (or we can be 95% sure) that the actual vote the candidate received will lie between 49.4% and 60.6%. Since at least 50% is required to win the election and the interval extends below 50%, the network cannot claim XYZ as the winner.
- The natural question to ask is then: how many people do they need to ask so that they can claim XYZ's success with 95% confidence? Assuming again that 55% of N people (N is now unknown) said they voted for XYZ, and considering that XYZ needs at least 50% of the votes: $0.55 - 1.96\sqrt{0.55 \times 0.45 / N} > 0.50$ → N > 380. Thus if 55% of 380 people say they voted for XYZ, then the confidence interval will be
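Both calculations in this example can be sketched in a few lines. The function names are illustrative and not from the slides; the formula assumed is the large-population form P ± z_c·sqrt(P(1 - P)/N), which reproduces the 0.056 half-width quoted above.

```python
import math

def proportion_confidence_limits(p_hat, n, z_c):
    """Confidence limits for a proportion: p_hat +/- z_c * sqrt(p_hat * (1 - p_hat) / n)."""
    half_width = z_c * math.sqrt(p_hat * (1.0 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width

def min_sample_size(p_hat, threshold, z_c):
    """Smallest integer n for which the lower confidence limit exceeds `threshold`.

    Solves p_hat - z_c * sqrt(p_hat * (1 - p_hat) / n) > threshold for n.
    """
    margin = p_hat - threshold
    n_bound = (z_c / margin) ** 2 * p_hat * (1.0 - p_hat)
    return math.floor(n_bound) + 1

low, high = proportion_confidence_limits(0.55, 300, z_c=1.96)
print(f"95% interval with N = 300: {low:.3f} to {high:.3f}")

n_needed = min_sample_size(0.55, 0.50, z_c=1.96)
print(f"Smallest N with lower limit above 50%: {n_needed}")
```

The strict inequality N > 380 translates to a smallest whole-number sample of 381 under these assumptions.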

