
Limits and the Law of Large Numbers Lecture XIII

Almost Sure Convergence

Let ω represent the entire random sequence {Z_t}. As discussed last time, our interest typically centers on averages of this sequence:

b_n(ω) = (1/n) Σ_{t=1}^n Z_t.

Definition 2.9: Let {b_n(ω)} be a sequence of real-valued random variables. We say that b_n(ω) converges almost surely to b, written b_n(ω) →a.s. b, if and only if there exists a real number b such that

P{ω : b_n(ω) → b} = 1.

The probability measure P describes the distribution of ω and determines the joint distribution function for the entire sequence {Z_t}. Other common terminology is that b_n(ω) converges to b with probability 1 (w.p.1) or that b_n(ω) is strongly consistent for b.

Example 2.10: Let b_n(ω) = (1/n) Σ_{t=1}^n Z_t, where {Z_t} is a sequence of independently and identically distributed (i.i.d.) random variables with E(Z_t) = μ < ∞. Then b_n(ω) →a.s. μ by the Kolmogorov strong law of large numbers (Theorem 3.1).
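A quick simulation illustrates the strong law (a sketch, not part of the original lecture; the exponential distribution and its mean are arbitrary choices): along a single sample path, the running averages b_n settle down to μ as n grows.

```python
# Sketch of Example 2.10: sample means of i.i.d. draws converging to the
# population mean, as the Kolmogorov SLLN predicts. The Exponential(2)
# distribution is an arbitrary illustrative choice.
import numpy as np

rng = np.random.default_rng(0)
mu = 2.0                                    # E(Z_t) for Exponential(scale=2)
z = rng.exponential(scale=mu, size=100_000)

# b_n = (1/n) * sum_{t=1}^n Z_t for every n along one sample path
b_n = np.cumsum(z) / np.arange(1, z.size + 1)

for n in (10, 100, 1_000, 10_000, 100_000):
    print(f"n = {n:>6}:  b_n = {b_n[n - 1]:.4f}  (mu = {mu})")
```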

Proposition 2.11: Given g: R^k → R^l (k, l < ∞) and any sequence {b_n} such that b_n →a.s. b, where b_n and b are k × 1 vectors, if g is continuous at b, then g(b_n) →a.s. g(b).
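A minimal sketch of this continuous-mapping result, assuming exponential draws and g(x) = log x (both arbitrary illustrative choices): since b_n →a.s. 2 and log is continuous at 2, log(b_n) should approach log 2.

```python
# Sketch of Proposition 2.11: b_n -> 2 a.s. and g = log is continuous at 2,
# so g(b_n) -> log 2 a.s. along the sample path.
import numpy as np

rng = np.random.default_rng(1)
z = rng.exponential(scale=2.0, size=100_000)
b_n = np.cumsum(z) / np.arange(1, z.size + 1)

for n in (100, 10_000, 100_000):
    print(f"n = {n:>6}:  log(b_n) = {np.log(b_n[n - 1]):.4f}  "
          f"(log 2 = {np.log(2.0):.4f})")
```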

Theorem 2.12: Suppose
(i) y = Xβ_0 + ε;
(ii) X′ε/n →a.s. 0;
(iii) X′X/n →a.s. M, finite and positive definite.
Then β̂_n exists a.s. for all n sufficiently large, and β̂_n →a.s. β_0.

Proof: Since X′X/n →a.s. M, it follows from Proposition 2.11 that det(X′X/n) →a.s. det(M). Because M is positive definite by (iii), det(M) > 0. It follows that det(X′X/n) > 0 a.s. for all n sufficiently large, so (X′X/n)⁻¹ exists a.s. for all n sufficiently large. Hence β̂_n = (X′X/n)⁻¹(X′y/n) exists a.s. for all n sufficiently large.

In addition, substituting y = Xβ_0 + ε gives β̂_n = β_0 + (X′X/n)⁻¹(X′ε/n). It follows from Proposition 2.11 that (X′X/n)⁻¹(X′ε/n) →a.s. M⁻¹ · 0 = 0, and hence β̂_n →a.s. β_0.
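The theorem can be illustrated with a small simulation, a sketch under a design that satisfies the conditions by construction: a constant plus one N(0,1) regressor, with i.i.d. N(0,1) errors independent of X, so that X′ε/n →a.s. 0 and X′X/n →a.s. M hold.

```python
# Sketch of Theorem 2.12: the OLS estimator converging to beta_0 as n grows.
# The design (constant + one normal regressor, normal errors) is an
# arbitrary choice that satisfies conditions (i)-(iii).
import numpy as np

rng = np.random.default_rng(0)
beta0 = np.array([1.0, -0.5])              # true beta_0

for n in (50, 500, 5_000, 50_000):
    X = np.column_stack([np.ones(n), rng.normal(size=n)])
    y = X @ beta0 + rng.normal(size=n)     # y = X beta_0 + e
    beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
    print(f"n = {n:>6}:  beta_hat = {np.round(beta_hat, 4)}")
```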

Convergence in Probability

A weaker stochastic convergence concept is that of convergence in probability.

Definition 2.23: Let {b_n(ω)} be a sequence of real-valued random variables. If there exists a real number b such that for every ε > 0,

P{ω : |b_n(ω) − b| < ε} → 1

as n → ∞, then b_n(ω) converges in probability to b, written b_n(ω) →p b.

The almost sure measure of probability takes into account the joint distribution of the entire sequence {Z_t}, but with convergence in probability we only need to be concerned with the joint distribution of those elements that appear in b_n(ω). Convergence in probability is also referred to as weak consistency.
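A Monte Carlo sketch of Definition 2.23 (the Bernoulli distribution, ε, and replication count are arbitrary choices): for each n, estimate the complementary probability P(|b_n − p| ≥ ε) across independent replications and watch it shrink toward zero.

```python
# Sketch of convergence in probability: the sample proportion of Bernoulli
# trials concentrates around p, so P(|b_n - p| >= eps) -> 0 as n grows.
import numpy as np

rng = np.random.default_rng(0)
p, eps, reps = 0.5, 0.05, 10_000

for n in (10, 100, 1_000, 10_000):
    b_n = rng.binomial(n, p, size=reps) / n    # reps independent sample means
    tail = np.mean(np.abs(b_n - p) >= eps)
    print(f"n = {n:>5}:  P(|b_n - {p}| >= {eps}) ~ {tail:.4f}")
```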

Theorem 2.24: Let {b_n(ω)} be a sequence of random variables. If b_n converges in probability to b, then there exists a subsequence {b_{n_j}} such that b_{n_j} →a.s. b.

Convergence in the rth Mean

Definition 2.37: Let {b_n(ω)} be a sequence of real-valued random variables. If there exists a real number b such that

E|b_n(ω) − b|^r → 0

as n → ∞ for some r > 0, then b_n(ω) converges in the rth mean to b, written b_n(ω) →r.m. b.
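For the sample mean of i.i.d. draws with finite variance, E|b_n − μ|² = σ²/n, so convergence in quadratic mean (r = 2) is easy to check numerically. The sketch below uses N(0,1) draws, an arbitrary choice.

```python
# Sketch of Definition 2.37 with r = 2: the mean-squared error of the
# sample mean should fall like sigma^2 / n.
import numpy as np

rng = np.random.default_rng(0)
mu, sigma, reps = 0.0, 1.0, 20_000

for n in (10, 100, 1_000):
    b_n = rng.normal(mu, sigma, size=(reps, n)).mean(axis=1)
    mse = np.mean((b_n - mu) ** 2)
    print(f"n = {n:>4}:  E|b_n - mu|^2 ~ {mse:.5f}  "
          f"(sigma^2/n = {sigma**2 / n:.5f})")
```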

Proposition 2.38 (Jensen's inequality): Let g: R¹ → R¹ be a convex function on an interval B ⊂ R¹ and let Z be a random variable such that P[Z ∈ B] = 1. Then g(E(Z)) ≤ E(g(Z)). If g is concave on B, then g(E(Z)) ≥ E(g(Z)).

Proposition 2.41 (Generalized Chebyshev Inequality): Let Z be a random variable such that E|Z|^r < ∞ for some r > 0. Then for every ε > 0,

P[|Z| ≥ ε] ≤ E|Z|^r / ε^r.
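A numeric check of the inequality (a sketch; the standard normal distribution and the (r, ε) pairs are arbitrary choices): the Monte Carlo tail probability should sit below the moment bound in every case.

```python
# Sketch of Proposition 2.41: compare the Monte Carlo estimate of
# P(|Z| >= eps) with the bound E|Z|^r / eps^r for a standard normal Z.
import numpy as np

rng = np.random.default_rng(0)
z = rng.normal(size=1_000_000)

for r, eps in [(1, 1.0), (2, 1.5), (4, 2.0)]:
    p_tail = np.mean(np.abs(z) >= eps)
    bound = np.mean(np.abs(z) ** r) / eps ** r
    print(f"r = {r}, eps = {eps}:  P(|Z| >= eps) = {p_tail:.4f} "
          f"<= bound = {bound:.4f}")
```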

Theorem 2.42: If b_n(ω) →r.m. b for some r > 0, then b_n(ω) →p b. This follows by applying the generalized Chebyshev inequality to b_n(ω) − b: P[|b_n(ω) − b| ≥ ε] ≤ E|b_n(ω) − b|^r / ε^r → 0.

Laws of Large Numbers

Proposition 3.0: Given restrictions on the dependence, heterogeneity, and moments of a sequence of random variables {Z_t},

Z̄_n − μ̄_n →a.s. 0,

where Z̄_n = (1/n) Σ_{t=1}^n Z_t and μ̄_n = E(Z̄_n).

Independent and Identically Distributed Observations

Theorem 3.1 (Kolmogorov): Let {Z_t} be a sequence of i.i.d. random variables. Then Z̄_n →a.s. μ if and only if E|Z_t| < ∞ and E(Z_t) = μ.

This result is consistent with Theorem (Khinchine): Let {X_i} be independent and identically distributed (i.i.d.) with E[X_i] = μ. Then X̄_n →p μ.

Proposition 3.4 (Hölder's Inequality): If p > 1 and 1/p + 1/q = 1, and if E|Y|^p < ∞ and E|Z|^q < ∞, then E|YZ| ≤ [E|Y|^p]^{1/p} [E|Z|^q]^{1/q}. If p = q = 2, we have the Cauchy-Schwarz inequality: E|YZ| ≤ [E(Y²)]^{1/2} [E(Z²)]^{1/2}.
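A numeric check of the p = q = 2 (Cauchy-Schwarz) case, using two correlated normal variables as an arbitrary example:

```python
# Sketch of the Cauchy-Schwarz inequality:
# E|YZ| <= sqrt(E Y^2) * sqrt(E Z^2).
import numpy as np

rng = np.random.default_rng(0)
y = rng.normal(size=1_000_000)
z = 0.6 * y + 0.8 * rng.normal(size=1_000_000)   # correlated with y

lhs = np.mean(np.abs(y * z))
rhs = np.sqrt(np.mean(y ** 2)) * np.sqrt(np.mean(z ** 2))
print(f"E|YZ| = {lhs:.4f} <= sqrt(E Y^2) * sqrt(E Z^2) = {rhs:.4f}")
```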

Asymptotic Normality

Under the traditional assumptions of the linear model (fixed regressors and normally distributed error terms), β̂_n is distributed multivariate normal with

E(β̂_n) = β_0 and Var(β̂_n) = σ²(X′X)⁻¹

for any sample size n.

However, when the sample size becomes large, the distribution of β̂_n is approximately normal under some general conditions.

Definition 4.1: Let {b_n} be a sequence of random finite-dimensional vectors with joint distribution functions {F_n}. If F_n(z) → F(z) as n → ∞ for every continuity point z, where F is the distribution function of a random variable Z, then b_n converges in distribution to the random variable Z, denoted b_n →d Z.

Other ways of stating this concept are that b_n converges in law to Z, written b_n →L Z, or that b_n is asymptotically distributed as F, written b_n ~A F. In this case, F is called the limiting distribution of b_n.

Example 4.3: Let {Z_t} be an i.i.d. sequence of random variables with mean μ and variance σ² < ∞. Define

b_n = √n (Z̄_n − μ) / σ.

Then b_n →d Z ~ N(0, 1) by the Lindeberg-Lévy central limit theorem (Theorem 6.2.2).
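A simulation sketch of this example, using Uniform(0,1) draws as an arbitrary non-normal distribution: for large n, the standardized mean b_n should have roughly zero mean, unit variance, and standard normal tail probabilities.

```python
# Sketch of Example 4.3: the standardized sample mean of Uniform(0,1)
# draws is approximately N(0,1) for large n.
import numpy as np

rng = np.random.default_rng(0)
n, reps = 500, 20_000
mu, sigma = 0.5, np.sqrt(1 / 12)                 # mean and sd of Uniform(0,1)

z_bar = rng.uniform(size=(reps, n)).mean(axis=1)
b_n = np.sqrt(n) * (z_bar - mu) / sigma

# Moments and a tail probability of b_n vs the N(0,1) benchmark
print("mean", b_n.mean(), "var", b_n.var())       # ~0 and ~1
print("P(b_n <= 1.96) =", np.mean(b_n <= 1.96))   # ~0.975
```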

Theorem 6.2.2 (Lindeberg-Lévy): Let {X_i} be i.i.d. with E[X_i] = μ and V(X_i) = σ². Then Z_n = √n (X̄_n − μ) / σ →d N(0, 1).

Definition 4.8: Let Z be a k × 1 random vector with distribution function F. The characteristic function of Z is defined as

f(λ) = E[exp(iλ′Z)],

where i² = −1 and λ is a k × 1 real vector.

Example 4.10: Let Z ~ N(μ, σ²). Then

f(λ) = exp(iλμ − λ²σ²/2).

This result follows from the derivation of the moment generating function in Lecture VII.

Specifically, note the similarity between the definition of the moment generating function and the characteristic function:

M(t) = E[exp(tZ)] versus f(λ) = E[exp(iλZ)].

Theorem 4.11 (Uniqueness Theorem): Two distribution functions are identical if and only if their characteristic functions are identical.
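The closed form in Example 4.10 can be checked against the empirical characteristic function (1/n) Σ_t exp(iλZ_t) of simulated normal draws; the parameter values below are arbitrary.

```python
# Sketch of Example 4.10: the empirical characteristic function of
# N(mu, sigma^2) draws should match exp(i*lam*mu - lam^2*sigma^2/2).
import numpy as np

rng = np.random.default_rng(0)
mu, sigma = 1.0, 2.0
z = rng.normal(mu, sigma, size=1_000_000)

for lam in (0.25, 0.5, 1.0):
    ecf = np.mean(np.exp(1j * lam * z))
    cf = np.exp(1j * lam * mu - lam ** 2 * sigma ** 2 / 2)
    print(f"lam = {lam}:  empirical {complex(ecf):.4f}  "
          f"vs  exact {complex(cf):.4f}")
```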

Note that we have a similar theorem for moment generating functions.

Proof of Lindeberg-Lévy: First define f(λ) as the characteristic function of Z_t − μ and let f_n(λ) be the characteristic function of

Z_n = √n (Z̄_n − μ) / σ.

By the structure of the characteristic function we have

f_n(λ) = [f(λ / (σ√n))]^n.

Taking a second-order Taylor series expansion of f(λ) around λ = 0 gives

f(λ) ≈ 1 − σ²λ²/2, since f(0) = 1, f′(0) = iE(Z_t − μ) = 0, and f″(0) = −σ².

Thus,

f_n(λ) = [1 − λ²/(2n) + o(1/n)]^n → exp(−λ²/2).

Thus, by the Uniqueness Theorem, the characteristic function of the standardized sample mean approaches the characteristic function of the standard normal, so Z_n →d N(0, 1).