THE CENTRAL LIMIT THEOREM


1 If a random variable X has a normal distribution, its sample mean $\bar{X}$ will also have a normal distribution. This fact is useful for the construction of t statistics and confidence intervals if we are employing $\bar{X}$ as an estimator of the population mean. [Figure: n = 1; 10 million samples]

2 However, what happens if we are not able to assume that X has a normal distribution?

3 The standard response is to make use of a central limit theorem. Loosely speaking, a central limit theorem states that the distribution of $\bar{X}$ will approximate a normal distribution as the sample size becomes large, even when the distribution of X itself is not normal.

4 There are a number of central limit theorems, differing only in the assumptions that they make in order to obtain this result. Here we shall be content with using the simplest one, the Lindeberg–Levy central limit theorem.

5 It states that, provided that the $X_i$ in the sample are all drawn independently from the same distribution (the distribution of X), and provided that this distribution has finite population mean and variance, the distribution of $\bar{X}$ will converge on a normal distribution as n increases.

6 This means that our t statistics and confidence intervals will be approximately valid after all, provided that the sample size is large enough.

7 The figure shows the distribution of $\bar{X}$ for the case where X has a uniform distribution with range 0 to 1, for 10 million samples. A uniform distribution is one in which all values over a finite range are equally likely. [Figure: n = 1; 10 million samples]

8 For a sample of size 1, the distribution of $\bar{X}$ is the uniform distribution itself, and so it is a horizontal line. [Figure: n = 1; 10 million samples]

9 We now show the distribution of $\bar{X}$ for a sample of size 10, for 10 million samples. It can be seen that $\bar{X}$ has a distribution very close to a normal distribution even though the sample size is quite small. [Figure: curves for n = 1 and n = 10; 10 million samples]

10 Here is the distribution of $\bar{X}$ for samples of size 25. It is even closer to normal. [Figure: curves for n = 1, 10, and 25; 10 million samples]

11 Here is the distribution of $\bar{X}$ for sample size 100. It is visually indistinguishable from a normal distribution. [Figure: curves for n = 1, 10, 25, and 100; 10 million samples]
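The convergence shown in these figures can be checked numerically. The following is a minimal sketch, not the author's code: it assumes NumPy, uses 50,000 replications instead of the 10 million in the slides to keep memory modest, and summarises closeness to normality by the excess kurtosis (zero for a normal distribution). The function names are hypothetical.

```python
import numpy as np


def excess_kurtosis(values):
    """Sample excess kurtosis; it is 0 for a normal distribution."""
    centered = values - values.mean()
    return (centered**4).mean() / (centered**2).mean() ** 2 - 3.0


def uniform_mean_kurtosis(sample_sizes=(1, 10, 25, 100), replications=50_000, seed=0):
    """Excess kurtosis of the sample mean of n uniform(0, 1) draws, for each n.

    The replication count is an assumption chosen for speed, not the
    10 million samples used for the figures.
    """
    rng = np.random.default_rng(seed)
    result = {}
    for n in sample_sizes:
        # Each row is one sample of size n; each row mean is one draw of the sample mean.
        means = rng.random((replications, n)).mean(axis=1)
        result[n] = excess_kurtosis(means)
    return result


if __name__ == "__main__":
    for n, kurt in uniform_mean_kurtosis().items():
        print(f"n = {n:3d}: excess kurtosis of sample mean = {kurt:+.3f}")
```

In theory the excess kurtosis of the mean of n independent uniform draws is -1.2/n, so the printed values should shrink towards zero as n grows, up to simulation noise.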

12 If X had a different distribution, the sample size required for a good approximation would be different. The figure shows the case where X has a lognormal distribution. As you can see, it is heavily skewed. [Figure: n = 1]

13 Here is the distribution of $\bar{X}$ for sample size 10, for 10 million samples. It is still heavily skewed. [Figure: curves for n = 1 and n = 10; 10 million samples]

14 With sample size 25, the distribution is becoming less skewed. [Figure: curves for n = 1, 10, and 25; 10 million samples]

15 However, even with sample size 100, the distribution is only an approximation to a normal distribution. Notice the difference in the shapes of the tails. We need a larger value of n before we can say that the distribution is approximately normal. [Figure: curves for n = 1, 10, 25, and 100; 10 million samples]
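The slower convergence for a skewed distribution can be illustrated with a short simulation. This is a sketch under stated assumptions: the slides do not give the lognormal parameters, so it assumes ln X is standard normal, and it uses 50,000 replications, not 10 million. Skewness (zero for a normal distribution) is used as the summary measure, and the function names are hypothetical.

```python
import numpy as np


def skewness(values):
    """Sample skewness; it is 0 for any symmetric distribution, including the normal."""
    centered = values - values.mean()
    return (centered**3).mean() / (centered**2).mean() ** 1.5


def lognormal_mean_skewness(sample_sizes=(10, 25, 100), replications=50_000, seed=0):
    """Skewness of the sample mean of n lognormal draws, for each n.

    Assumption: ln X ~ N(0, 1); the slides do not state the parameters.
    """
    rng = np.random.default_rng(seed)
    result = {}
    for n in sample_sizes:
        draws = rng.lognormal(mean=0.0, sigma=1.0, size=(replications, n))
        result[n] = skewness(draws.mean(axis=1))
    return result


if __name__ == "__main__":
    for n, skew in lognormal_mean_skewness().items():
        print(f"n = {n:3d}: skewness of sample mean = {skew:.2f}")
```

Under this assumed parameterisation the skewness of X itself is about 6.2, and the skewness of the sample mean falls only in proportion to 1/√n, which is consistent with the distribution still being visibly skewed at n = 100.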

16 In asserting that the distribution of $\bar{X}$ tends to become normal as the sample size increases, we have glossed over an important technical point that needs to be addressed. The central limit theorem applies only in the limit, as the sample size tends to infinity.

17 However, as the sample size tends to infinity, the distribution of $\bar{X}$ degenerates to a spike located at the population mean. So how can we talk about the limiting distribution being normal?

18 The answer is to transform the estimator in an appropriate way so that the transformation does have a limiting distribution. Having established the limiting distribution of the transformation, we may be able to work backwards to the properties of the estimator.

19 If X has mean $\mu$ and variance $\sigma^2$, $\bar{X}$ has mean $\mu$ and variance $\sigma^2/n$. The mean is independent of n, but the variance tends to zero as n tends to infinity. [Slide header: X has mean $\mu$ and variance $\sigma^2$. $\bar{X}$ is an estimator of $\mu$. Search for a transformation of $\bar{X}$ that has a limiting distribution.] [Table row: $\bar{X}$: mean $\mu$, variance $\sigma^2/n$; properties as n increases: mean stable, but variance → 0]
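For reference, the mean and variance of the sample mean follow in two lines from the assumption that the $X_i$ are independent draws with mean $\mu$ and variance $\sigma^2$. Independence is what allows the variance of the sum to be written as the sum of the variances.

```latex
\begin{aligned}
E(\bar{X}) &= \frac{1}{n}\sum_{i=1}^{n} E(X_i) = \frac{1}{n}\, n\mu = \mu, \\
\operatorname{var}(\bar{X}) &= \frac{1}{n^2}\sum_{i=1}^{n} \operatorname{var}(X_i)
  = \frac{1}{n^2}\, n\sigma^2 = \frac{\sigma^2}{n}.
\end{aligned}
```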

20 We can deal with the vanishing variance problem by scaling the estimator by $\sqrt{n}$. This multiplies its variance by n, and so the variance becomes $\sigma^2$, which is independent of n. We are making progress in finding the appropriate transformation. [Table row: $\sqrt{n}\,\bar{X}$: mean $\sqrt{n}\,\mu$, variance $\sigma^2$; properties as n increases: variance stable, but mean increases]

21 However, we now have a problem with the mean. This is now $\sqrt{n}\,\mu$. It increases with n, so the statistic cannot have a limiting distribution.

22 To deal with this, we consider instead the statistic $\sqrt{n}\,(\bar{X} - \mu)$. This is what we need. Its mean is zero and its variance is unaffected. The mean and variance are both independent of n, and so this statistic can have a limiting distribution. [Table row: $\sqrt{n}\,(\bar{X} - \mu)$: mean 0, variance $\sigma^2$; properties as n increases: mean and variance both stable]
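The search for a suitable transformation can be reproduced numerically. The sketch below is illustrative only: it assumes NumPy, takes X to be uniform on 0 to 1 (so that the population mean is 0.5 and the variance is 1/12), and uses hypothetical function and key names. It estimates the mean and variance of the three candidate statistics for several values of n.

```python
import numpy as np


def statistic_moments(n, replications=20_000, seed=0):
    """Estimated (mean, variance) of three candidate statistics for sample size n.

    Assumption: X ~ uniform(0, 1), so mu = 0.5 and sigma^2 = 1/12.
    """
    rng = np.random.default_rng(seed)
    mu = 0.5
    means = rng.random((replications, n)).mean(axis=1)
    candidates = {
        "Xbar": means,                                     # variance shrinks like 1/n
        "sqrt(n)*Xbar": np.sqrt(n) * means,                # mean grows like sqrt(n)
        "sqrt(n)*(Xbar - mu)": np.sqrt(n) * (means - mu),  # both moments stable
    }
    return {name: (float(v.mean()), float(v.var())) for name, v in candidates.items()}


if __name__ == "__main__":
    for n in (10, 100, 400):
        print(f"n = {n}")
        for name, (mean, variance) in statistic_moments(n).items():
            print(f"  {name:20s} mean = {mean:8.4f}  variance = {variance:.5f}")
```

Only the third statistic should show both a mean near zero and a variance near 1/12 ≈ 0.0833 for every n, which is why it is the one that can have a limiting distribution.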

23 The Lindeberg–Levy central limit theorem states that, as n tends to infinity, the statistic $\sqrt{n}\,(\bar{X} - \mu)$ has a normal distribution with mean zero and variance $\sigma^2$.

24 Application of central limit theorem: $\sqrt{n}\,(\bar{X} - \mu) \xrightarrow{d} N(0, \sigma^2)$. The arrow with a d over it is mathematical shorthand that means ‘has limiting distribution as n tends to infinity’.
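Spelled out, the shorthand is a statement about cumulative distribution functions. The symbol $\Phi$, used here for the standard normal cumulative distribution function, does not appear in the slides and is introduced only for this statement; `\xrightarrow` assumes the amsmath package.

```latex
\sqrt{n}\,(\bar{X} - \mu) \xrightarrow{d} N(0, \sigma^2)
\quad\Longleftrightarrow\quad
\lim_{n \to \infty} P\!\left( \sqrt{n}\,(\bar{X} - \mu) \le z \right)
  = \Phi\!\left( \frac{z}{\sigma} \right)
\quad \text{for every real } z.
```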

25 The relationship $\sqrt{n}\,(\bar{X} - \mu) \xrightarrow{d} N(0, \sigma^2)$ is true only as n goes to infinity. However, from the limiting distribution, we can start working back tentatively to finite samples. We can say that, for large n, the relationship may hold approximately. (The symbol ~ means ‘is distributed as’.) Approximation for finite samples: $\sqrt{n}\,(\bar{X} - \mu) \sim N(0, \sigma^2)$.

26 Then, dividing the statistic by $\sqrt{n}$, we can say that, for sufficiently large n, the following is approximately true: $\bar{X} - \mu \sim N(0, \sigma^2/n)$.

27 This implies that, approximately, $\bar{X} \sim N(\mu, \sigma^2/n)$. We knew, from the beginning, that the sample mean was distributed with mean $\mu$ and variance $\sigma^2/n$.

28 What we have shown is that, irrespective of the distribution of X, the distribution of the sample mean is approximately normal in sufficiently large samples. This enables us to perform the usual tests.

29 Of course, this raises the question of what might be considered to be ‘sufficiently large n’. To answer this question, the analysis must be supplemented by simulation.

30 The figure shows the distribution of $\sqrt{n}\,(\bar{X} - \mu)$ for the uniform distribution when n = 1. It is, of course, just the uniform distribution itself, with the mean of 0.5 subtracted. [Figure: n = 1; 10 million samples]

31 Here is the distribution of $\sqrt{n}\,(\bar{X} - \mu)$ when n = 10. It looks very like a normal distribution. [Figure: n = 10; 10 million samples]

32 Here is the same figure with the theoretical limiting normal distribution, in red. It confirms that the distribution for the sample mean has virtually converged to normality with a sample size of only 10. [Figure: n = 10 with limiting normal distribution; 10 million samples]

33 The curve for n = 25 has been added. There is hardly any change because convergence has already been achieved. [Figure: n = 25 added; 10 million samples]

34 Of course, the curve for n = 100 also coincides. In this case, n = 25 was ‘sufficiently large’. Perhaps even n = 10. [Figure: n = 100 added; 10 million samples]

35 Now consider the example of the lognormal distribution. Here is the distribution of $\sqrt{n}\,(\bar{X} - \mu)$ for n = 1. It is just the lognormal distribution itself with the mean subtracted. [Figure: n = 1; 10 million samples]

36 Here is the distribution of $\sqrt{n}\,(\bar{X} - \mu)$ for n = 10. The theoretical limiting distribution is also shown. Clearly, n = 10 is far from being ‘sufficiently large’. [Figure: n = 10 with limiting normal distribution; 10 million samples]

37 Here is the distribution of $\sqrt{n}\,(\bar{X} - \mu)$ for n = 25. It is closer to the limiting distribution but there is still a long way to go. [Figure: n = 25 with limiting normal distribution; 10 million samples]

38 Here is the distribution of $\sqrt{n}\,(\bar{X} - \mu)$ for n = 100. It is closer still to the limiting distribution but convergence has not been achieved. In the case of the lognormal distribution, even a sample size of 100 is clearly not ‘sufficiently large’. We should try 200, perhaps 500. [Figure: n = 100 with limiting normal distribution; 10 million samples]
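One way to put a number on ‘sufficiently large’ is to measure the largest vertical gap between the simulated distribution function of $\sqrt{n}\,(\bar{X} - \mu)$ and that of the limiting normal distribution. The sketch below is one possible approach, not the method used for the figures. It assumes NumPy, 20,000 replications, a uniform distribution on 0 to 1, and a lognormal distribution with ln X standard normal (the slides do not give the lognormal parameters). All names are hypothetical.

```python
import math

import numpy as np


def normal_cdf(z, sigma):
    """Distribution function of N(0, sigma^2), evaluated elementwise."""
    erf = np.vectorize(math.erf)
    return 0.5 * (1.0 + erf(z / (sigma * math.sqrt(2.0))))


def cdf_gap(draw, mu, sigma, n, replications=20_000, seed=0):
    """Largest gap between the empirical CDF of sqrt(n)*(Xbar - mu) and N(0, sigma^2)."""
    rng = np.random.default_rng(seed)
    sample_means = draw(rng, (replications, n)).mean(axis=1)
    statistic = np.sort(np.sqrt(n) * (sample_means - mu))
    limit = normal_cdf(statistic, sigma)
    # The empirical CDF jumps from (i - 1)/N to i/N at the i-th sorted value.
    steps = np.arange(replications + 1) / replications
    return float(max(np.max(steps[1:] - limit), np.max(limit - steps[:-1])))


def draw_uniform(rng, size):
    return rng.random(size)


def draw_lognormal(rng, size):
    # Assumed parameterisation: ln X ~ N(0, 1).
    return rng.lognormal(mean=0.0, sigma=1.0, size=size)


# (sampler, population mean, population standard deviation)
UNIFORM = (draw_uniform, 0.5, math.sqrt(1.0 / 12.0))
LOGNORMAL = (draw_lognormal, math.exp(0.5), math.sqrt((math.e - 1.0) * math.e))

if __name__ == "__main__":
    for name, spec in (("uniform", UNIFORM), ("lognormal", LOGNORMAL)):
        for n in (10, 25, 100):
            print(f"{name:9s} n = {n:3d}: max CDF gap = {cdf_gap(*spec, n=n):.4f}")
```

With a finite number of replications the gap never reaches exactly zero, so it is best read comparatively: the lognormal gap at n = 100 would be expected to remain larger than the uniform gap at n = 10.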

39 Copyright Christopher Dougherty 2012. These slideshows may be downloaded by anyone, anywhere for personal use. Subject to respect for copyright and, where appropriate, attribution, they may be used as a resource for teaching an econometrics course. There is no need to refer to the author. The content of this slideshow comes from Section R.15 of C. Dougherty, Introduction to Econometrics, fourth edition 2011, Oxford University Press. Additional (free) resources for both students and instructors may be downloaded from the OUP Online Resource Centre http://www.oup.com/uk/orc/bin/9780199567089/. Individuals studying econometrics on their own who feel that they might benefit from participation in a formal course should consider the London School of Economics summer school course EC212 Introduction to Econometrics http://www2.lse.ac.uk/study/summerSchools/summerSchool/Home.aspx or the University of London International Programmes distance learning course EC2020 Elements of Econometrics www.londoninternational.ac.uk/lse. 2012.11.02

