 # Christopher Dougherty EC220 - Introduction to econometrics (review chapter) Slideshow: the central limit theorem Original citation: Dougherty, C. (2012)

## Presentation transcript

1 THE CENTRAL LIMIT THEOREM If a random variable X has a normal distribution, its sample mean $\bar{X}$ will also have a normal distribution. This fact is useful for the construction of t statistics and confidence intervals if we are employing $\bar{X}$ as an estimator of the population mean.

2 However, what happens if we are not able to assume that X is normally distributed?

3 The standard response is to make use of a central limit theorem. Loosely speaking, a central limit theorem states that the distribution of $\bar{X}$ will approximate a normal distribution as the sample size becomes large, even when the distribution of X itself is not normal.

4 There are a number of central limit theorems, differing only in the assumptions that they make in order to obtain this result. Here we shall be content with using the simplest one, the Lindeberg–Levy central limit theorem.

5 It states that, provided that the $X_i$ in the sample are all drawn independently from the same distribution (the distribution of X), and provided that this distribution has finite population mean and variance, the distribution of $\bar{X}$ will converge on a normal distribution.

6 This means that our t statistics and confidence intervals will be approximately valid after all, provided that the sample size is large enough.

7 The figure shows the distribution of $\bar{X}$ for the case where X has a uniform distribution with range 0 to 1, for 10,000,000 samples. A uniform distribution is one in which all values over a finite range are equally likely. [Figure: curve for n = 1]

8 For a sample of size 1, the distribution of $\bar{X}$ is the uniform distribution itself, and so it is a horizontal line. [Figure: curve for n = 1]

9 We now show the distribution of $\bar{X}$ for a sample of size 10, for 10,000,000 samples. It can be seen that $\bar{X}$ has a distribution very close to a normal distribution even when the sample size is quite small. [Figure: curves for n = 1 and n = 10]

10 Here is the distribution of $\bar{X}$ for 10,000,000 samples, each of size 25. It is even closer to normal. [Figure: curves for n = 1, 10 and 25]

11 Here is the distribution of $\bar{X}$ for 10,000,000 samples, each of size 100. It is indistinguishable from normal. [Figure: curves for n = 1, 10, 25 and 100]
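The uniform case on slides 7 to 11 can be reproduced approximately with a short simulation. The sketch below is an illustration, not the author's code. It assumes NumPy, uses 20,000 samples per sample size rather than the 10,000,000 of the slides, and summarizes each distribution by the share of sample means lying within one theoretical standard deviation of 0.5. That summary is a convenience chosen here: it is about 0.683 for a normal distribution and about 0.577 for the uniform distribution itself. The function name `share_within_one_sd` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed so the run is repeatable
REPS = 20_000                    # assumption: far fewer samples than the slides use


def share_within_one_sd(n, reps=REPS):
    """Share of sample means within one theoretical sd of the population mean 0.5."""
    means = rng.uniform(0.0, 1.0, size=(reps, n)).mean(axis=1)
    sd = np.sqrt(1.0 / (12.0 * n))   # sd of the mean of n independent U(0, 1) draws
    return float(np.mean(np.abs(means - 0.5) <= sd))


shares = {n: share_within_one_sd(n) for n in (1, 10, 25, 100)}
for n, share in shares.items():
    print(f"n = {n:3d}: share within one sd = {share:.3f}")
```

If the normal approximation is good, the shares for n = 10, 25 and 100 should sit near 0.683, while the n = 1 line should stay near 0.577.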

12 If X had a different distribution, the sample size required for a good approximation would be different. The figure shows the case where X has a lognormal distribution. As you can see, it is heavily skewed. [Figure: curve for n = 1]

13 Here is the distribution of $\bar{X}$ for sample size 10, for 10,000,000 samples. It is less skewed. [Figure: curves for n = 1 and n = 10]

14 With sample size 25, the distribution is increasingly symmetrical. [Figure: curves for n = 1, 10 and 25]

15 However, even with sample size 100, the distribution is only an approximation to a normal distribution. Notice the difference in the shapes of the tails. [Figure: curves for n = 1, 10, 25 and 100]
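The slower convergence in the lognormal case can be checked in the same way. The slides do not state the parameters of the lognormal distribution, so this sketch assumes X = exp(Z) with Z standard normal, and it again uses 20,000 samples per sample size. As a simple measure of skewness (chosen here, not taken from the slides) it reports the share of sample means that fall below the population mean. That share would be 0.5 for a symmetric distribution and is about 0.691 for this lognormal distribution itself.

```python
import numpy as np

rng = np.random.default_rng(1)   # fixed seed so the run is repeatable
REPS = 20_000                    # assumption: far fewer samples than the slides use
POP_MEAN = np.exp(0.5)           # mean of exp(Z) when Z ~ N(0, 1) (assumed parameters)


def share_below_mean(n, reps=REPS):
    """Share of sample means that lie below the population mean."""
    means = rng.lognormal(mean=0.0, sigma=1.0, size=(reps, n)).mean(axis=1)
    return float(np.mean(means < POP_MEAN))


below = {n: share_below_mean(n) for n in (1, 10, 25, 100)}
for n, share in below.items():
    print(f"n = {n:3d}: share of sample means below the population mean = {share:.3f}")
```

The share should drift towards 0.5 as n grows, but under these assumptions it would be expected to remain visibly above 0.5 even at n = 100.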

16 In asserting that the distribution of $\bar{X}$ tends to become normal as the sample size increases, we have glossed over an important technical point that needs to be addressed. The central limit theorem applies only in the limit, as the sample size tends to infinity.

17 However, as the sample size tends to infinity, the distribution of $\bar{X}$ degenerates to a spike located at the population mean. So how can we talk about the limiting distribution being normal?

18 The answer is to transform the estimator in an appropriate way so that the transformation does have a limiting distribution. Having established the limiting distribution of the transformation, we may be able to work backwards to the properties of the estimator.

19 Suppose that X has variance $\sigma^2$. Then, for sample size n, $\bar{X}$ has variance $\sigma^2/n$. It follows that $\sqrt{n}\,\bar{X}$ has variance $\sigma^2$. However, as the sample size becomes large, $\bar{X}$ tends to $\mu$, the population mean of X. Hence $\sqrt{n}\,\bar{X}$ tends to $\sqrt{n}\,\mu$. As a consequence, $\sqrt{n}\,\bar{X}$ does not have a limiting distribution. It increases indefinitely with n. To deal with this, we consider instead the statistic $\sqrt{n}(\bar{X} - \mu)$. The Lindeberg–Levy central limit theorem relates to this statistic, and not directly to $\bar{X}$. It states that $\sqrt{n}(\bar{X} - \mu) \xrightarrow{d} N(0, \sigma^2)$. Let us write the variance of X as $\sigma^2$. Then the variance of the sample mean is $\sigma^2/n$. It follows that $\sqrt{n}\,\bar{X}$ has variance $\sigma^2$, which is independent of n. We are making progress in finding the appropriate transformation.
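The variance step can be written out in full. This is a short worked derivation, assuming (as in the Lindeberg–Levy conditions on slide 5) that the $X_i$ are independent with common variance $\sigma^2$:

```latex
\operatorname{Var}(\bar{X})
  = \operatorname{Var}\!\Bigl(\frac{1}{n}\sum_{i=1}^{n} X_i\Bigr)
  = \frac{1}{n^2}\sum_{i=1}^{n}\operatorname{Var}(X_i)
  = \frac{\sigma^2}{n},
\qquad
\operatorname{Var}\bigl(\sqrt{n}\,\bar{X}\bigr)
  = n \cdot \frac{\sigma^2}{n}
  = \sigma^2 .
```

Independence is what allows the variance of the sum to be split into a sum of variances in the second equality.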

20 However, as the sample size becomes large, the sample mean tends to the population mean of X, which we will denote $\mu$. Thus $\sqrt{n}\,\bar{X}$ tends to $\sqrt{n}\,\mu$. This increases with n, so it cannot have a limiting distribution.

21 To deal with this, we consider instead the statistic $\sqrt{n}(\bar{X} - \mu)$. This is what we need. The Lindeberg–Levy central limit theorem states that, as n tends to infinity, this statistic tends to a normal distribution with mean zero and variance $\sigma^2$.
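The difference between the two candidate transformations can be illustrated numerically. The sketch below is an illustration under stated assumptions, not part of the original slideshow: it uses NumPy, draws from the uniform distribution on 0 to 1 (so $\mu = 0.5$ and $\sigma^2 = 1/12$), and takes 10,000 samples per sample size. It records the mean of $\sqrt{n}\,\bar{X}$, which should grow like $\sqrt{n}\,\mu$, alongside the mean and variance of $\sqrt{n}(\bar{X} - \mu)$, which should stay near 0 and $\sigma^2$.

```python
import numpy as np

rng = np.random.default_rng(2)   # fixed seed so the run is repeatable
REPS = 10_000                    # assumption: number of simulated samples per n
MU, SIGMA2 = 0.5, 1.0 / 12.0     # population mean and variance of U(0, 1)

summary = {}
for n in (4, 25, 100, 400):
    means = rng.uniform(0.0, 1.0, size=(REPS, n)).mean(axis=1)
    raw = np.sqrt(n) * means              # sqrt(n) * Xbar: its centre drifts upward with n
    centred = np.sqrt(n) * (means - MU)   # sqrt(n) * (Xbar - mu): its centre stays at 0
    summary[n] = (float(raw.mean()), float(centred.mean()), float(centred.var()))
    print(f"n = {n:3d}: mean of sqrt(n)*Xbar = {summary[n][0]:6.3f}, "
          f"mean of centred statistic = {summary[n][1]:7.4f}, "
          f"variance of centred statistic = {summary[n][2]:.4f}")
```

Only the centred statistic has a location and spread that do not depend on n, which is why it is the one that can have a limiting distribution.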

22 In the statement $\sqrt{n}(\bar{X} - \mu) \xrightarrow{d} N(0, \sigma^2)$, the arrow with a d over it is mathematical shorthand that means ‘has limiting distribution as n tends to infinity’.
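For readers who want the shorthand spelled out, the standard definition of convergence in distribution gives the following reading. This is a sketch, with $\Phi$ denoting the standard normal cumulative distribution function, a symbol not used in the slides:

```latex
\sqrt{n}\,(\bar{X} - \mu) \xrightarrow{d} N(0, \sigma^2)
\quad\Longleftrightarrow\quad
\lim_{n \to \infty} \Pr\bigl(\sqrt{n}\,(\bar{X} - \mu) \le z\bigr)
  = \Phi\!\left(\frac{z}{\sigma}\right)
  \quad \text{for every real } z .
```

The statement concerns the limit of the cumulative distribution functions, so by itself it says nothing about how good the approximation is at any particular finite n.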

23 So far we have talked of ‘the’ central limit theorem. In fact, there are numerous CLTs. They differ in the assumptions required for their use. The Lindeberg–Levy CLT is a particularly simple one and sufficient for the present analysis.

24 Now this relationship is true only as n goes to infinity. However, from the limiting distribution, we can start working back tentatively to finite samples. We can say that, for large n, the relationship may hold approximately. The statement $\sqrt{n}(\bar{X} - \mu) \xrightarrow{d} N(0, \sigma^2)$ is exact. However, having established it, we can say that, as an approximation, for sufficiently large n, $\bar{X} - \mu \sim N(0, \sigma^2/n)$, and hence that, as an approximation, for sufficiently large n, $\bar{X} \sim N(\mu, \sigma^2/n)$.

25 Then, dividing the statistic by $\sqrt{n}$, we can say that, for sufficiently large n, the second equation, $\bar{X} - \mu \sim N(0, \sigma^2/n)$, is approximately true. (The symbol ~ means ‘is distributed as’.)

26 This implies the third equation, $\bar{X} \sim N(\mu, \sigma^2/n)$. We knew that the sample mean was distributed with mean $\mu$ and variance $\sigma^2/n$. What we have shown is that its distribution is approximately normal in sufficiently large samples. This enables us to perform the usual tests.
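The chain of approximations on slides 24 to 26 can be laid out in one line. In this sketch the symbol $\overset{a}{\sim}$ is notation introduced here for ‘is approximately distributed as, for sufficiently large n’, and the final step, standardizing with a known $\sigma$, is an added illustration of what ‘the usual tests’ rely on rather than something stated on the slides:

```latex
\sqrt{n}\,(\bar{X} - \mu) \overset{a}{\sim} N(0, \sigma^2)
\;\Longrightarrow\;
\bar{X} - \mu \overset{a}{\sim} N\!\left(0, \frac{\sigma^2}{n}\right)
\;\Longrightarrow\;
\bar{X} \overset{a}{\sim} N\!\left(\mu, \frac{\sigma^2}{n}\right)
\;\Longrightarrow\;
\frac{\bar{X} - \mu}{\sigma / \sqrt{n}} \overset{a}{\sim} N(0, 1) .
```

Dividing by $\sqrt{n}$ divides the variance by n, and adding $\mu$ shifts the mean without changing the variance. Under the last approximation, $\bar{X} \pm 1.96\,\sigma/\sqrt{n}$ is an approximate 95 percent confidence interval for $\mu$ when $\sigma$ is known.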

27 Of course, this raises the question of what might be considered to be ‘sufficiently large n’. To answer this question, the analysis must be supplemented by simulation.

28 The figure shows the distribution of $\sqrt{n}(\bar{X} - \mu)$ for the uniform distribution when n = 1. It is, of course, just the uniform distribution itself, with the mean of 0.5 subtracted.

29 Here is the distribution of $\sqrt{n}(\bar{X} - \mu)$ when n = 10. It looks very like a normal distribution.

30 Here is the same figure with the theoretical limiting normal distribution superimposed (the dashed curve in red). It confirms that the distribution for the sample mean has virtually converged to normality with a sample size of only 10.
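The comparison with the superimposed limiting curve can be imitated numerically. The sketch below is an approximation of the idea, not a reconstruction of the author's figure: it assumes NumPy, 50,000 simulated samples, and a histogram with bins of width 0.1. It reports the largest gap between the histogram density of $\sqrt{n}(\bar{X} - \mu)$ and the $N(0, 1/12)$ density evaluated at the bin centres. The helper names `normal_pdf` and `max_density_gap` are hypothetical.

```python
import math

import numpy as np

rng = np.random.default_rng(3)               # fixed seed so the run is repeatable
REPS = 50_000                                # assumption: simulated samples per n
MU, SIGMA2 = 0.5, 1.0 / 12.0                 # population mean and variance of U(0, 1)
EDGES = np.linspace(-1.5, 1.5, 31)           # 30 histogram bins of width 0.1
CENTRES = 0.5 * (EDGES[:-1] + EDGES[1:])


def normal_pdf(x, variance):
    """Density of N(0, variance) evaluated at x."""
    return np.exp(-x**2 / (2.0 * variance)) / math.sqrt(2.0 * math.pi * variance)


def max_density_gap(n):
    """Largest gap between the simulated density and the limiting normal density."""
    means = rng.uniform(0.0, 1.0, size=(REPS, n)).mean(axis=1)
    stat = math.sqrt(n) * (means - MU)
    density, _ = np.histogram(stat, bins=EDGES, density=True)
    return float(np.max(np.abs(density - normal_pdf(CENTRES, SIGMA2))))


gaps = {n: max_density_gap(n) for n in (1, 10)}
for n, gap in gaps.items():
    print(f"n = {n:2d}: largest density gap = {gap:.3f}")
```

The gap for n = 1 should be large, since a flat density is being compared with a bell curve whose peak is about 1.38, while the gap for n = 10 should be small relative to that peak.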

31 The curve for n = 25 has been added. There is hardly any change because convergence has already been achieved.

32 Of course, the curve for n = 100 also coincides.

33 Now consider the example of the lognormal distribution. Here is the distribution of $\sqrt{n}(\bar{X} - \mu)$ for n = 1. It is just the lognormal distribution itself with the mean subtracted.

34 Here is the distribution of $\sqrt{n}(\bar{X} - \mu)$ for n = 10. The theoretical limiting distribution is also shown. Clearly, n = 10 is far from being ‘sufficiently large’.

35 Here is the distribution of $\sqrt{n}(\bar{X} - \mu)$ for n = 25. It is closer to the limiting distribution but there is still a long way to go.

36 Here is the distribution of $\sqrt{n}(\bar{X} - \mu)$ for n = 100. It is closer still to the limiting distribution but convergence has not been achieved. In the case of the lognormal distribution, a sample size of 100 is clearly not ‘sufficiently large’. We should try 200, perhaps 500.
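The suggestion to try larger samples can be followed up with a further simulation. The sketch below rests on assumptions not given in the slides: the lognormal variable is taken to be exp(Z) with Z standard normal, 20,000 samples are drawn per sample size (in batches, to limit memory use), and convergence is judged by how often $\sqrt{n}(\bar{X} - \mu)$ falls beyond $\pm 1.645\sigma$. Under the limiting normal distribution each tail would contain 5 percent of the samples.

```python
import numpy as np

rng = np.random.default_rng(4)            # fixed seed so the run is repeatable
REPS, BATCH = 20_000, 5_000               # assumption: 20,000 samples, drawn in batches
MU = np.exp(0.5)                          # mean of exp(Z) when Z ~ N(0, 1) (assumed)
SIGMA = np.sqrt((np.e - 1.0) * np.e)      # standard deviation of exp(Z)
CUT = 1.645 * SIGMA                       # 5 percent point of the limiting N(0, sigma^2)


def tail_shares(n):
    """Shares of sqrt(n) * (Xbar - mu) above +CUT and below -CUT."""
    upper = lower = 0
    for _ in range(REPS // BATCH):        # batches keep memory use modest
        means = rng.lognormal(0.0, 1.0, size=(BATCH, n)).mean(axis=1)
        stat = np.sqrt(n) * (means - MU)
        upper += int(np.sum(stat > CUT))
        lower += int(np.sum(stat < -CUT))
    return upper / REPS, lower / REPS


tails = {n: tail_shares(n) for n in (100, 200, 500)}
for n, (upper, lower) in tails.items():
    print(f"n = {n:3d}: upper tail = {upper:.3f}, lower tail = {lower:.3f} (normal: 0.050 each)")
```

If the two tail shares are still noticeably unequal at a given n, the skewness has not yet washed out and the normal approximation should be used with caution at that sample size.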

Copyright Christopher Dougherty 2011. These slideshows may be downloaded by anyone, anywhere for personal use. Subject to respect for copyright and, where appropriate, attribution, they may be used as a resource for teaching an econometrics course. There is no need to refer to the author. The content of this slideshow comes from Section R.15 of C. Dougherty, Introduction to Econometrics, fourth edition 2011, Oxford University Press. Additional (free) resources for both students and instructors may be downloaded from the OUP Online Resource Centre http://www.oup.com/uk/orc/bin/9780199567089/. Individuals studying econometrics on their own who feel that they might benefit from participation in a formal course should consider the London School of Economics summer school course EC212 Introduction to Econometrics http://www2.lse.ac.uk/study/summerSchools/summerSchool/Home.aspx or the University of London International Programmes distance learning course 20 Elements of Econometrics www.londoninternational.ac.uk/lse. 11.07.25
