Published by Antonio Holmes. Modified over 2 years ago.

1
Chapter 4. Elements of Statistics
# A brief introduction to some concepts of statistics
# Descriptive statistics vs. inductive statistics (statistical inference)
# Classification of the field of statistics
i) Sampling theory
ii) Estimation theory
iii) Hypothesis testing
iv) Curve fitting or regression
v) Analysis of variance

2
4.2 Sampling Theory: The Sample Mean
How many samples are required for a given degree of confidence in the result?
# Terminology
- population: $N$ (size of the population), very large or infinite
- (random) sample: $n$ (size of the sample)
# One of the most important quantities is the sample mean.
How close might the sample mean be to the average value of the population?

3
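Let the sample have the numerical values $x_1, x_2, \ldots, x_n$. Then the sample mean is given by $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$.
Note that we are interested in the statistical properties of arbitrary random samples rather than those of any particular sample. That is, the sample mean becomes a random variable. Therefore, it is appropriate to denote the sample mean as $\hat{\bar{X}} = \frac{1}{n}\sum_{i=1}^{n} X_i$, where the $X_i$ are random variables drawn from the population.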

4
We want the mean value of the sample mean to be close to the true mean value $\bar{X}$ of the population: $E[\hat{\bar{X}}] = \frac{1}{n}\sum_{i=1}^{n} E[X_i] = \bar{X}$.
That is, the mean value of the sample mean equals the true mean value of the population, so the sample mean is an unbiased estimate of the true mean. However, this alone is not sufficient to indicate whether the sample mean is a good estimator of the true population mean.

5
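What is the variance of the sample mean? Assume $N \gg n$ (the population is much larger than the sample). Then $\mathrm{Var}(\hat{\bar{X}}) = E[\hat{\bar{X}}^2] - \bar{X}^2$, that is, the mean square of the sample mean minus the square of its mean.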

6
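Since $X_i$ and $X_j$ are statistically independent for $i \neq j$, $E[X_i X_j] = \bar{X}^2$ for $i \neq j$, while $E[X_i^2] = \sigma^2 + \bar{X}^2$. Hence $E[\hat{\bar{X}}^2] = \frac{1}{n^2}\sum_{i=1}^{n}\sum_{j=1}^{n} E[X_i X_j] = \frac{1}{n^2}\left[n(\sigma^2 + \bar{X}^2) + (n^2 - n)\bar{X}^2\right] = \frac{\sigma^2}{n} + \bar{X}^2$, and therefore $\mathrm{Var}(\hat{\bar{X}}) = \frac{\sigma^2}{n}$ (eq. 4-4, an important result!).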

7
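where $\sigma^2$ is the true variance of the population. As $n \to \infty$, the variance of the sample mean goes to 0, which means that larger sample sizes lead to a better estimate.
* Note: 1) This result holds for $N \to \infty$, or for finite $N$ when sampling is done with replacement.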

8
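2) When $N$ is finite and sampling is done without replacement, the variance of the sample mean is $\frac{\sigma^2}{n}\cdot\frac{N-n}{N-1}$. As $N \to \infty$ this reduces to $\sigma^2/n$; when $N = n$ it is 0 (the whole population has been sampled!).
Two examples: pp. 163-165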
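The two cases above (sampling with and without replacement from a finite population) can be checked exactly on a tiny population by enumerating every possible sample. The following is a minimal sketch, not part of the original slides; the population values and the helper name `var_of_sample_mean` are assumptions chosen for illustration.

```python
from itertools import combinations, product
from statistics import fmean, pvariance

def var_of_sample_mean(population, n, replace):
    """Exact variance of the sample mean over all equally likely samples of size n."""
    if replace:
        samples = product(population, repeat=n)   # ordered draws, with replacement
    else:
        samples = combinations(population, n)     # distinct members, without replacement
    means = [fmean(sample) for sample in samples]
    return pvariance(means)

# Hypothetical population, small enough to enumerate exhaustively.
population = [2.0, 4.0, 4.0, 7.0, 9.0, 10.0]
N, n = len(population), 3
sigma2 = pvariance(population)                    # true population variance

with_repl = var_of_sample_mean(population, n, replace=True)
without_repl = var_of_sample_mean(population, n, replace=False)
whole_population = var_of_sample_mean(population, N, replace=False)

print("with replacement:   ", with_repl, "vs sigma^2/n =", sigma2 / n)
print("without replacement:", without_repl,
      "vs corrected =", sigma2 / n * (N - n) / (N - 1))
print("n = N:              ", whole_population)
```

Exhaustive enumeration is used instead of random simulation so that the agreement with the formulas is exact up to floating-point rounding.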

9
4.3 Sampling Theory: The Sample Variance
The population variance is needed to determine the sample size required to achieve a desired variance of the sample mean (see eq. 4-4).
Definition (sample variance): $S^2 = \frac{1}{n}\sum_{i=1}^{n}\left(X_i - \hat{\bar{X}}\right)^2$, where $\hat{\bar{X}}$ is the sample mean.
The expected value of the sample variance can be derived easily: $E[S^2] = \frac{n-1}{n}\sigma^2$. This is not the true variance $\sigma^2$; that is, $S^2$ is a biased estimate rather than an unbiased one.

10
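To obtain an unbiased estimate of the population variance, we redefine the sample variance: $\tilde{S}^2 = \frac{n}{n-1}S^2 = \frac{1}{n-1}\sum_{i=1}^{n}\left(X_i - \hat{\bar{X}}\right)^2$.
Note that these results hold for very large $N$, that is, $N = \infty$. What happens when the population size is not large?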
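The bias of the divide-by-$n$ sample variance, and its removal by dividing by $n-1$, can be verified exactly by averaging over every possible sample drawn with replacement. This is only an illustrative sketch: the population values and the function name `expected_sample_variance` are assumptions, not taken from the slides.

```python
from itertools import product
from statistics import fmean, pvariance

def expected_sample_variance(population, n, divisor):
    """Average of sum((x - sample_mean)^2) / divisor over all ordered samples
    of size n drawn with replacement (all samples equally likely)."""
    total = 0.0
    count = 0
    for sample in product(population, repeat=n):
        m = fmean(sample)
        total += sum((x - m) ** 2 for x in sample) / divisor
        count += 1
    return total / count

population = [1.0, 3.0, 4.0, 8.0]   # hypothetical values
n = 3
sigma2 = pvariance(population)      # true variance of the population

biased = expected_sample_variance(population, n, divisor=n)        # divide by n
unbiased = expected_sample_variance(population, n, divisor=n - 1)  # divide by n - 1

print("E[biased estimate]  =", biased, " (n-1)/n * sigma^2 =", (n - 1) / n * sigma2)
print("E[unbiased estimate] =", unbiased, " sigma^2 =", sigma2)
```

Sampling with replacement is used here because it corresponds to the very-large-population case; for a small population sampled without replacement the expected values differ.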

11
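# When $N$ is not large, the expected value of $S^2$ is given by $E[S^2] = \frac{N}{N-1}\cdot\frac{n-1}{n}\sigma^2$. To obtain an unbiased estimate, we redefine $\tilde{S}^2 = \frac{N-1}{N}\cdot\frac{n}{n-1}S^2$.
# The variance of the estimates of the variance (large $N$, approximately for large $n$):
- the variance of $S^2$: $\mathrm{Var}(S^2) \approx \frac{\mu_4 - \sigma^4}{n}$
- the variance of $\tilde{S}^2$: $\mathrm{Var}(\tilde{S}^2) \approx \frac{n(\mu_4 - \sigma^4)}{(n-1)^2}$
where $\mu_4$ is the 4th central moment of the population.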

12
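4.4 Sampling Distributions & Confidence Intervals
What is the probability that the estimates are within specified bounds? Answering this requires the p.d.f. of the estimate; here we consider the sample mean.
Normalized sample mean: $Z = \frac{\hat{\bar{X}} - \bar{X}}{\sigma/\sqrt{n}}$, where $\hat{\bar{X}}$ is the sample mean, $\bar{X}$ the true mean, and $\sigma^2$ the population variance.
If the $X_i$ are Gaussian and independent, then $Z$ is Gaussian with zero mean and unit variance, $N(0, 1)$.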

13
If the $X_i$ are not Gaussian, then as $n \to \infty$, $Z$ is asymptotically Gaussian by the central limit theorem ($n \geq 30$ is a rule of thumb).
H.W.) Solve the problems in Chap. 4: 4-2.1, 4-2.5, 4-3.1, 4-4.1, 4-5.1, 4-6.1

14
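When the population variance is unknown and is replaced by its unbiased estimate $\tilde{S}^2$, the normalized sample mean $T = \frac{\hat{\bar{X}} - \bar{X}}{\tilde{S}/\sqrt{n}}$ is no longer Gaussian. It follows Student's t distribution with $n-1$ degrees of freedom.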

15
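p.d.f. of Student's t distribution: $f_T(t) = \frac{\Gamma\left(\frac{v+1}{2}\right)}{\sqrt{\pi v}\,\Gamma\left(\frac{v}{2}\right)}\left(1 + \frac{t^2}{v}\right)^{-(v+1)/2}$
where $v = n - 1$ is the number of degrees of freedom and $\Gamma(\cdot)$ is the gamma function: $\Gamma(k+1) = k\,\Gamma(k)$ for any $k$, and $\Gamma(k+1) = k!$ for integer $k$.
The t distribution has heavier tails than the Gaussian; for large samples ($n \geq 30$) it is close to Gaussian.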
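The standard Student's t density can be evaluated directly with the gamma function, and critical values can then be found numerically instead of from a table. The sketch below is an illustration under stated assumptions: the function names `t_pdf`, `t_cdf`, and `t_critical` are hypothetical, the integration uses Simpson's rule, and `math.lgamma` is used in place of the gamma function itself only to avoid overflow for large $v$.

```python
import math

def t_pdf(t, v):
    """Student's t density with v degrees of freedom."""
    # log of Gamma((v+1)/2) / Gamma(v/2), computed via lgamma to avoid overflow
    log_ratio = math.lgamma((v + 1) / 2) - math.lgamma(v / 2)
    c = math.exp(log_ratio) / math.sqrt(math.pi * v)
    return c * (1 + t * t / v) ** (-(v + 1) / 2)

def t_cdf(t, v, steps=2000):
    """P(T <= t) using Simpson's rule on [0, |t|] and the symmetry of the density."""
    a = abs(t)
    h = a / steps                       # steps must be even for Simpson's rule
    s = t_pdf(0.0, v) + t_pdf(a, v)
    for i in range(1, steps):
        s += (4 if i % 2 else 2) * t_pdf(i * h, v)
    area = s * h / 3
    return 0.5 + area if t >= 0 else 0.5 - area

def t_critical(p, v):
    """Value t such that P(T <= t) = p, for 0.5 < p < 1, found by bisection."""
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, v) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# One-sided 99% and two-sided 95% critical values for v = 8 (n = 9 samples).
print(round(t_critical(0.99, 8), 3), round(t_critical(0.975, 8), 3))
# With v = 30 the two-sided 95% value is already close to the Gaussian 1.96.
print(round(t_critical(0.975, 30), 3))
```

Standard t tables list about 2.896 and 2.306 for the two $v = 8$ cases, which gives a quick way to sanity-check the numerical result.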

16
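What is a confidence interval? It is an interval estimate: a range of values rather than a single point estimate. The q-percent confidence interval is the interval within which the estimate lies with probability q/100.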

17
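The interval is written as $\bar{X} - \frac{k\sigma}{\sqrt{n}} \leq \hat{\bar{X}} \leq \bar{X} + \frac{k\sigma}{\sqrt{n}}$, where $\hat{\bar{X}}$ is the sample mean and $\bar{X}$ the true mean. The constant $k$ and the confidence level $q$ are related through the p.d.f. of the normalized sample mean $Z$: $\frac{q}{100} = \int_{-k}^{k} f_Z(z)\,dz$. A larger $q$ requires a larger $k$.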

18
Example) For a Gaussian sample mean, $q = 95\%$ gives $k = 1.96$ ($q = 99\%$ gives $k = 2.58$!).
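For a Gaussian sample mean, the multiplier $k$ that goes with a given confidence level $q$ can be computed from the inverse of the standard normal distribution function rather than read from a table. A minimal sketch using Python's standard library follows; the function names and the numbers passed to `confidence_interval` are hypothetical.

```python
import math
from statistics import NormalDist

def k_for_confidence(q_percent):
    """Two-sided multiplier k with P(-k <= Z <= k) = q/100 for Z ~ N(0, 1)."""
    tail = (1 - q_percent / 100) / 2          # probability left in each tail
    return NormalDist().inv_cdf(1 - tail)

def confidence_interval(center, sigma, n, q_percent):
    """Interval center -/+ k * sigma / sqrt(n) for the given confidence level."""
    half_width = k_for_confidence(q_percent) * sigma / math.sqrt(n)
    return center - half_width, center + half_width

k95 = k_for_confidence(95)
k99 = k_for_confidence(99)
print(round(k95, 2), round(k99, 2))

# Hypothetical numbers: mean 10, standard deviation 2, 100 samples, 95% level.
low, high = confidence_interval(10.0, 2.0, 100, 95)
print(low, high)
```

Raising $q$ from 95 to 99 widens the interval, which is the price of the higher confidence.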

19
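When the t distribution applies, the interval becomes $\bar{X} - \frac{t\tilde{S}}{\sqrt{n}} \leq \hat{\bar{X}} \leq \bar{X} + \frac{t\tilde{S}}{\sqrt{n}}$, where $\tilde{S}$ is the unbiased sample standard deviation and $t$ is obtained for a given $q$ from the probability distribution function $F_T(t)$ of Student's t distribution (see Appendix F, or Table 4-2 on page 172 for $v = 8$).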

20
4.5 Hypothesis Testing
The question arises: how does one decide to accept or reject a given hypothesis when the sample size and the confidence level are specified?

21
Two steps:
i) make some hypothesis about the population;
ii) determine whether the observed sample confirms or rejects this hypothesis.

22
Two kinds of test: one-sided or two-sided.
- One-sided: is the average lifetime of the light bulb $\geq$ 1000 hours?
- Two-sided: are the 100-ohm resistors too high or too low?

23
One-sided test
Example) A capacitor manufacturer claims that the mean value of the breakdown voltage is $\geq$ 300 V. A sample of 100 capacitors is tested, and a 99% confidence level is used.
Question) Is the manufacturer's claim valid?
Answer) We would reject the hypothesis!

24
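Normalized random variable: $Z = \frac{\hat{\bar{X}} - \bar{X}}{\tilde{S}/\sqrt{n}}$, where $\hat{\bar{X}}$ is the sample mean, $\bar{X}$ the claimed mean, and $\tilde{S}$ the sample standard deviation. For a one-sided test at the 99% confidence level the critical value is $z_c = -2.33$; the hypothesis is rejected when the observed $Z$ falls below $z_c$.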

25
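At the 99.5% confidence level, we would accept the hypothesis. Raising the confidence level moves the critical value farther into the tail, so the hypothesis is less likely to be rejected; the lower confidence level is the more severe requirement.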

26
Level of significance = 100% - (confidence level). A larger level of significance means a more severe test!
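The one-sided large-sample test can be sketched in a few lines. The sample mean (290 V) and sample standard deviation (40 V) used below are illustrative assumptions, since the measured values are not present in the slide text; only the claimed mean of 300 V and the sample size of 100 come from the example. The function name `one_sided_z_test` is also hypothetical.

```python
import math
from statistics import NormalDist

def one_sided_z_test(sample_mean, claimed_mean, sample_std, n, confidence):
    """Lower-tail test of the claim 'true mean >= claimed_mean'.

    Returns (z, z_c, accepted); the claim is accepted when z >= z_c.
    """
    z = (sample_mean - claimed_mean) / (sample_std / math.sqrt(n))
    z_c = NormalDist().inv_cdf(1 - confidence)   # negative critical value
    return z, z_c, z >= z_c

# Assumed measurements: sample mean 290 V, sample standard deviation 40 V.
z, z_c_99, accept_99 = one_sided_z_test(290.0, 300.0, 40.0, 100, 0.99)
_, z_c_995, accept_995 = one_sided_z_test(290.0, 300.0, 40.0, 100, 0.995)

print("z =", z)
print("99%   confidence: z_c =", round(z_c_99, 3), "accepted:", accept_99)
print("99.5% confidence: z_c =", round(z_c_995, 3), "accepted:", accept_995)
```

With these assumed numbers the claim is rejected at the 99% level but accepted at the 99.5% level, which illustrates why the lower confidence level (larger level of significance) is the more severe test.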

27
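Example) With a sample size of 9, the normalized variable is no longer Gaussian; use Student's t distribution with $v = n - 1 = 8$ degrees of freedom. At the 99% confidence level, we accept the hypothesis.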

28
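With a small sample size, the normalized value $t$ is smaller in magnitude, and the heavier tails of the t distribution push the critical value farther out. The observed $t$ is therefore more likely to exceed the (negative) critical value, and the hypothesis is more likely to be accepted. Small-sample tests are less reliable (less severe) than large-sample tests.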

29
Two-sided test
Example) A manufacturer of Zener diodes claims that the true mean breakdown voltage is 10 V.
Question) Hypothesis: the true mean is 10 V. Do we accept or reject it, given 100 samples and a 95% confidence level?

30
Answer) Rejected! $z$ is outside the interval $-1.96 \leq z \leq 1.96$.

31
Example) With 9 samples, $t$ is inside the interval $-2.306 \leq t \leq 2.306$ ($v = 8$), so the hypothesis is accepted! This is less severe than a large-sample test.
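The two-sided test, for both the large-sample (Gaussian) and the small-sample (Student's t) case, can be sketched as follows. The sample mean (10.3 V) and sample standard deviation (1.2 V) are illustrative assumptions, as the measured values do not appear in the slide text; the t critical value 2.306 for $v = 8$ at the 95% level is taken from standard tables and passed in as a constant. The function name `two_sided_test` is hypothetical.

```python
import math
from statistics import NormalDist

def two_sided_test(sample_mean, claimed_mean, sample_std, n, critical):
    """Two-sided test of 'true mean == claimed_mean'.

    Returns (statistic, accepted); accepted when |statistic| <= critical.
    """
    stat = (sample_mean - claimed_mean) / (sample_std / math.sqrt(n))
    return stat, abs(stat) <= critical

# Assumed measurements: sample mean 10.3 V, sample standard deviation 1.2 V.
z_c = NormalDist().inv_cdf(0.975)     # two-sided 95% Gaussian critical value
T_C_V8_95 = 2.306                     # two-sided 95% t value for v = 8 (table)

z, accept_large = two_sided_test(10.3, 10.0, 1.2, 100, z_c)
t, accept_small = two_sided_test(10.3, 10.0, 1.2, 9, T_C_V8_95)

print("n = 100: z =", round(z, 3), "accepted:", accept_large)
print("n = 9:   t =", round(t, 3), "accepted:", accept_small)
```

The same observed mean and standard deviation lead to rejection with 100 samples and acceptance with 9, because the statistic scales with $\sqrt{n}$ and the t critical value is larger than the Gaussian one.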

32
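4.6 Curve Fitting and Linear Regression
Given paired observations of two variables $x$ and $y$, two questions arise: 1) what (linear) relationship best describes $y$ in terms of $x$ (regression), and 2) how strongly are $x$ and $y$ related (correlation analysis)?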

33
- Scatter diagram: a plot of the data, $n$ sample pairs $(x_i, y_i)$.

34
- Curve fitting: finding a mathematical relationship between $x$ and $y$. The resulting curve is called the regression curve (equation).

35
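- What is the best fit? The best fit in a least-squares sense.
- Let $e_i$ be the errors between the regression curve and the points of the scatter diagram.
- Choose the curve so that $\sum_{i=1}^{n} e_i^2$ is a minimum.
- The type of equation to be fitted to the data must be chosen; with fewer parameters than the $n$ data points, the fit smooths the data.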

36
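Linear regression: $y = a + bx$. How are $a$ and $b$ determined? Minimizing $\sum_{i=1}^{n}(y_i - a - bx_i)^2$ with respect to $a$ and $b$ gives $b = \frac{n\sum x_i y_i - \sum x_i \sum y_i}{n\sum x_i^2 - \left(\sum x_i\right)^2}$ and $a = \frac{\sum y_i - b\sum x_i}{n}$.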
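The least-squares coefficients of a straight line $y = a + bx$ have a standard closed form in terms of sums over the data. The sketch below implements that closed form; the function name `linear_regression` and the data values are assumptions made for illustration.

```python
def linear_regression(xs, ys):
    """Least-squares intercept a and slope b of the line y = a + b*x."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    b = (n * sxy - sx * sy) / (n * sxx - sx ** 2)   # slope
    a = (sy - b * sx) / n                           # intercept
    return a, b

# Hypothetical scatter data lying roughly along a line.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.0]
a, b = linear_regression(xs, ys)
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]

print("a =", a, "b =", b)
print("sum of squared errors =", sum(e * e for e in residuals))
```

At the least-squares solution the residuals sum to zero and are uncorrelated with $x$, which is a convenient check on any implementation.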

38
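MATLAB built-in function: p = polyfit(x, y, n), which returns the coefficients p of the degree-n polynomial that best fits y as a function of x in the least-squares sense.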

39
A second-order regression: $y = a + bx + cx^2$ (p. 180, 4-3, 4-6)
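A second-order (quadratic) least-squares fit can be obtained by solving the 3-by-3 normal equations. The following is a minimal standard-library sketch, not the method shown on the slide (which is not recoverable); the function names and the test data are assumptions, and a small Gaussian-elimination routine stands in for a library solver.

```python
def solve_linear_system(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]     # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):                     # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def quadratic_regression(xs, ys):
    """Least-squares coefficients [c0, c1, c2] of y = c0 + c1*x + c2*x^2."""
    s = [sum(x ** k for x in xs) for k in range(5)]                   # sums of x^k
    t = [sum((x ** k) * y for x, y in zip(xs, ys)) for k in range(3)] # sums of x^k * y
    normal = [[s[i + j] for j in range(3)] for i in range(3)]
    return solve_linear_system(normal, t)

# Hypothetical data generated exactly from y = 1 + 2x + 3x^2.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
ys = [1 + 2 * x + 3 * x * x for x in xs]
coeffs = quadratic_regression(xs, ys)
print(coeffs)
```

Because the sample data lie exactly on a parabola, the fit should recover its coefficients up to rounding; with noisy data the same routine returns the least-squares parabola.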

40
4.7 Correlation between Two Sets of Data
Are two data sets correlated or not?

41
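Linear correlation coefficient (Pearson's r): $r = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i - \bar{x})^2 \sum_{i=1}^{n}(y_i - \bar{y})^2}}$, where $\bar{x}$ and $\bar{y}$ are the means of the two data sets.
Usage: useful in determining the sources of errors.
Example) In a point-to-point digital communication link, the BER (bit error rate) measures link quality. The BER may fluctuate randomly due to wind.
Question) Is wind the error source? A correlation test on 20 measurements of wind and the resulting BER gives $r = 0.891$: yes!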
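Pearson's r can be computed directly from the standard definition: the sum of products of deviations from the means, divided by the square root of the product of the two sums of squared deviations. The sketch below uses made-up wind and BER values purely for illustration; they are not the measurements behind the $r = 0.891$ result, and the function name `pearson_r` is hypothetical.

```python
import math
from statistics import fmean

def pearson_r(xs, ys):
    """Pearson linear correlation coefficient of two equal-length data sets."""
    mx, my = fmean(xs), fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical measurements: wind speed and resulting BER (in units of 1e-6).
wind = [0.0, 5.0, 10.0, 15.0, 20.0, 25.0]
ber = [1.0, 1.4, 2.1, 2.3, 3.2, 3.4]
r = pearson_r(wind, ber)
print("r =", r)
```

Values of r near +1 or -1 indicate a strong linear relationship, while values near 0 indicate little linear relationship; a high r alone does not establish causation.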


© 2016 SlidePlayer.com Inc.

All rights reserved.
