ELEC 303 – Random Signals Lecture 18 – Classical Statistical Inference, Dr. Farinaz Koushanfar ECE Dept., Rice University Nov 4, 2010.

1 ELEC 303 – Random Signals Lecture 18 – Classical Statistical Inference, Dr. Farinaz Koushanfar ECE Dept., Rice University Nov 4, 2010

2 Lecture outline
Reading: Sections 9.1–9.2
Confidence intervals
– Central limit theorem
– Student t-distribution
Linear regression

3 Confidence interval
Consider an estimator $\hat{\Theta}_n$ of an unknown parameter $\theta$. We fix a confidence level $1-\alpha$. For every $\theta$, we replace the single point estimator with a lower estimator $\hat{\Theta}_n^-$ and an upper estimator $\hat{\Theta}_n^+$ such that
$$P_\theta\big(\hat{\Theta}_n^- \le \theta \le \hat{\Theta}_n^+\big) \ge 1-\alpha.$$
We call $[\hat{\Theta}_n^-, \hat{\Theta}_n^+]$ a $1-\alpha$ confidence interval.

4 Confidence interval - example
Observations $X_i$ are i.i.d. normal with unknown mean $\theta$ and known variance $\sigma^2$, so the sample mean estimator $\hat{\Theta}_n = \frac{1}{n}\sum_{i=1}^n X_i$ has variance $\sigma^2/n$. Let $\alpha = 0.05$. Find the 95% confidence interval. Since $\Phi(1.96) = 0.975$, the interval $[\hat{\Theta}_n - 1.96\,\sigma/\sqrt{n},\ \hat{\Theta}_n + 1.96\,\sigma/\sqrt{n}]$ covers $\theta$ with probability 0.95.
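A minimal sketch of this computation in Python (the sample values and $\sigma$ below are hypothetical; only the 1.96 quantile comes from the slide):

```python
import numpy as np
from scipy.stats import norm

# Hypothetical observations: i.i.d. normal with known sigma
sigma = 2.0                      # known standard deviation (assumed)
x = np.array([4.1, 5.3, 3.8, 4.9, 5.6, 4.4, 5.1, 4.7])
n = len(x)

alpha = 0.05
z = norm.ppf(1 - alpha / 2)      # 1.96 for alpha = 0.05
mean = x.mean()                  # point estimate of theta

# 95% confidence interval: mean +/- z * sigma / sqrt(n)
print(mean - z * sigma / np.sqrt(n), mean + z * sigma / np.sqrt(n))
```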

5 Confidence interval (CI)
Wrong: "the true parameter lies in the CI with 95% probability."
Correct: suppose that $\theta$ is fixed. We construct the CI many times, using the same statistical procedure: each time, obtain a collection of n observations and construct the corresponding CI. About 95% of these CIs will include $\theta$.
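This frequentist reading can be checked by simulation; a sketch with assumed values for $\theta$, $\sigma$, and n:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
theta, sigma, n, alpha = 5.0, 2.0, 20, 0.05   # assumed true values
z = norm.ppf(1 - alpha / 2)

covered, trials = 0, 10_000
for _ in range(trials):
    x = rng.normal(theta, sigma, n)           # fresh sample each trial
    half = z * sigma / np.sqrt(n)
    if x.mean() - half <= theta <= x.mean() + half:
        covered += 1

print(covered / trials)                       # close to 0.95
```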

6 A note on the Central Limit Theorem (CLT)
Let $X_1, X_2, \ldots, X_n$ be a sequence of n independent and identically distributed RVs with finite expectation $\mu$ and variance $\sigma^2 > 0$. CLT: as the sample size n increases, the PDF of the sample average of the RVs approaches $N(\mu, \sigma^2/n)$, irrespective of the shape of the original distribution.

7 CLT
[Figure: the PDF of a single variable and the densities of sums of two, three, and four variables, showing the density of the sum approaching a normal shape.]

8 CLT
Let the sum of n random variables be $S_n = X_1 + \cdots + X_n$. Then, defining a new RV
$$Z_n = \frac{S_n - n\mu}{\sigma\sqrt{n}},$$
the distribution of $Z_n$ converges towards $N(0,1)$ as n approaches $\infty$ (this is convergence in distribution), written as $Z_n \xrightarrow{d} N(0,1)$. In terms of the CDFs, $\lim_{n\to\infty} P(Z_n \le z) = \Phi(z)$ for every z.
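A quick simulation of this convergence, with (arbitrarily chosen) exponential summands so the original distribution is visibly non-normal:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
n, trials = 50, 100_000

# Exponential(1) summands: mu = 1, sigma = 1 (a deliberately skewed choice)
x = rng.exponential(1.0, size=(trials, n))
s = x.sum(axis=1)
z = (s - n * 1.0) / (1.0 * np.sqrt(n))        # standardized sums Z_n

# Compare the empirical CDF of Z_n to Phi at a few points
for t in (-1.0, 0.0, 1.0):
    print(t, (z <= t).mean(), norm.cdf(t))
```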

9 Confidence interval approximation
Suppose that the observations $X_i$ are i.i.d. with mean $\theta$ and variance $\sigma^2$ that are unknown. Estimate the mean and (unbiased) variance:
$$\hat{\Theta}_n = \frac{1}{n}\sum_{i=1}^n X_i, \qquad \hat{S}_n^2 = \frac{1}{n-1}\sum_{i=1}^n (X_i - \hat{\Theta}_n)^2.$$
We may estimate the variance $\sigma^2/n$ of the sample mean by $\hat{S}_n^2/n$. For any given $\alpha$, we may use the CLT to approximate the confidence interval in this case as
$$\Big[\hat{\Theta}_n - \frac{z\,\hat{S}_n}{\sqrt{n}},\ \hat{\Theta}_n + \frac{z\,\hat{S}_n}{\sqrt{n}}\Big],$$
where $z = \Phi^{-1}(1-\alpha/2)$ from the normal table.
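A sketch of the CLT-based approximate CI, with the variance estimated from a hypothetical sample:

```python
import numpy as np
from scipy.stats import norm

x = np.array([4.1, 5.3, 3.8, 4.9, 5.6, 4.4, 5.1, 4.7])  # hypothetical sample
n = len(x)
alpha = 0.05

mean = x.mean()
s = x.std(ddof=1)                 # unbiased estimate S_n (divides by n-1)
z = norm.ppf(1 - alpha / 2)

half = z * s / np.sqrt(n)
print(mean - half, mean + half)   # approximate 95% CI
```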

10 Confidence interval approximation
Two different approximations are in effect:
– Treating the sum as if it were a normal RV
– Replacing the true variance by the estimated variance from the sample
Even in the special case where the $X_i$'s are i.i.d. normal, the variance is an estimate, and the RV $T_n$ below is not normally distributed:
$$T_n = \frac{\sqrt{n}\,(\hat{\Theta}_n - \theta)}{\hat{S}_n}$$

11 t-distribution
For normal $X_i$, it can be shown that the PDF of $T_n$ does not depend on $\theta$ and $\sigma$. This is called the t-distribution with n-1 degrees of freedom.

12 t-distribution
It is also symmetric and bell-shaped (like the normal). The probabilities of various intervals are available in tables. When the $X_i$'s are normal and n is relatively small, a more accurate CI is
$$\Big[\hat{\Theta}_n - \frac{z\,\hat{S}_n}{\sqrt{n}},\ \hat{\Theta}_n + \frac{z\,\hat{S}_n}{\sqrt{n}}\Big], \qquad z = \Psi_{n-1}^{-1}(1-\alpha/2),$$
where $\Psi_{n-1}$ is the CDF of the t-distribution with n-1 degrees of freedom.

13 Example
The weight of an object is measured 8 times using an electric scale that reports the true weight plus a random error $\sim N(0, \sigma^2)$. The measurements are: .5547, .5404, .6364, .6438, .4917, .5674, .5564, .6066. Compute the 95% confidence interval using the t-distribution.
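A sketch of this computation; scipy's t quantile stands in for the table lookup:

```python
import numpy as np
from scipy.stats import t

x = np.array([.5547, .5404, .6364, .6438, .4917, .5674, .5564, .6066])
n = len(x)
alpha = 0.05

mean = x.mean()
s = x.std(ddof=1)                     # unbiased standard deviation
z = t.ppf(1 - alpha / 2, df=n - 1)    # t quantile, n-1 degrees of freedom

half = z * s / np.sqrt(n)
print(mean - half, mean + half)       # 95% CI for the true weight
```

With n-1 = 7 degrees of freedom the quantile is about 2.365, wider than the normal 1.96, reflecting the extra uncertainty from estimating $\sigma$.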

14 Linear regression
Building a model of the relation between two or more variables of interest. Consider two variables x and y, based on a collection of data points $(x_i, y_i)$, $i=1,\ldots,n$. Assume that the scatter plot of these two variables shows a systematic, approximately linear relationship between $x_i$ and $y_i$. It is then natural to build a model: $y \approx \theta_0 + \theta_1 x$.

15 Linear regression
Often, we cannot build an exact model, but we can estimate the parameters, giving the predicted value $\hat{y}_i = \hat{\theta}_0 + \hat{\theta}_1 x_i$. The i-th residual is: $y_i - (\hat{\theta}_0 + \hat{\theta}_1 x_i)$.

16 Linear regression
The parameters are chosen to minimize the sum of squared residuals $\sum_{i=1}^n (y_i - \theta_0 - \theta_1 x_i)^2$. Always keep in mind that the postulated model may not be true. To perform the optimization, we set the partial derivatives with respect to $\theta_0$ and $\theta_1$ to zero.

17 Linear regression
Given n data pairs $(x_i, y_i)$, the estimates that minimize the sum of the squared residuals are
$$\hat{\theta}_1 = \frac{\sum_{i=1}^n (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^n (x_i - \bar{x})^2}, \qquad \hat{\theta}_0 = \bar{y} - \hat{\theta}_1 \bar{x},$$
where $\bar{x} = \frac{1}{n}\sum_{i=1}^n x_i$ and $\bar{y} = \frac{1}{n}\sum_{i=1}^n y_i$.
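These closed-form estimates translate directly into code; a sketch with made-up data points:

```python
import numpy as np

# Hypothetical data points (x_i, y_i)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

xbar, ybar = x.mean(), y.mean()
theta1 = np.sum((x - xbar) * (y - ybar)) / np.sum((x - xbar) ** 2)
theta0 = ybar - theta1 * xbar

print(theta0, theta1)                 # intercept and slope estimates
residuals = y - (theta0 + theta1 * x)
print(np.sum(residuals ** 2))         # minimized sum of squared residuals
```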

18 Example
The leaning tower of Pisa continuously tilts. Measurements were taken between 1975 and 1987. Find the linear regression.

19 Solution

20 Justification of least squares
– Maximum likelihood
– Approximation of Bayesian linear LMS estimation (under a possibly nonlinear model)
– Approximation of Bayesian LMS estimation (under a linear model)

21 Maximum likelihood justification
Assume that the $x_i$'s are given numbers and that the $y_i$'s are realizations of RVs $Y_i$ as below, where the $W_i$'s are i.i.d. $\sim N(0, \sigma^2)$:
$$Y_i = \theta_0 + \theta_1 x_i + W_i.$$
The likelihood function has the form
$$L(\theta_0, \theta_1) = \prod_{i=1}^n \frac{1}{\sqrt{2\pi}\,\sigma}\, \exp\Big(-\frac{(y_i - \theta_0 - \theta_1 x_i)^2}{2\sigma^2}\Big),$$
so ML is equivalent to minimizing the sum of squared residuals.
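Taking logarithms makes the equivalence explicit (a standard step, filled in here):
$$\log L(\theta_0, \theta_1) = -\frac{n}{2}\log(2\pi\sigma^2) - \frac{1}{2\sigma^2}\sum_{i=1}^{n}\big(y_i - \theta_0 - \theta_1 x_i\big)^2.$$
The first term does not involve $\theta_0, \theta_1$, so maximizing the log-likelihood for any fixed $\sigma$ is the same as minimizing $\sum_i (y_i - \theta_0 - \theta_1 x_i)^2$.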

22 Approximate Bayesian linear LMS
Assume $x_i$ and $y_i$ are realizations of RVs $X_i$ and $Y_i$, where the pairs $(X_i, Y_i)$ are i.i.d. with unknown joint PDF. Assume an additional independent pair $(X_0, Y_0)$. We observe $X_0$ and want to estimate $Y_0$ by a linear estimator, which is of the form
$$\hat{Y}_0 = \hat{\theta}_0 + \hat{\theta}_1 X_0.$$

23 Approximate Bayesian LMS
For the previous scenario, make the additional assumption of the linear model $Y_i = \theta_0 + \theta_1 X_i + W_i$, where the $W_i$'s are i.i.d. $\sim N(0, \sigma^2)$, independent of the $X_i$. We know that $E[Y_0 | X_0]$ minimizes the mean squared estimation error, and for this model $E[Y_0 | X_0] = \theta_0 + \theta_1 X_0$. As $n \to \infty$, the least-squares estimates $\hat{\theta}_0$ and $\hat{\theta}_1$ converge to $\theta_0$ and $\theta_1$.

24 Multiple linear regression
Many phenomena involve multiple underlying variables, also called explanatory variables; such models are called multiple regression. E.g., for a triplet of data points $(x_i, y_i, z_i)$ we wish to estimate the model $y \approx \theta_0 + \theta_1 x + \theta_2 z$, minimizing $\sum_i (y_i - \theta_0 - \theta_1 x_i - \theta_2 z_i)^2$. In general, we can consider the model $y \approx \theta_0 + \sum_j \theta_j h_j(x)$.
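In code, this becomes an ordinary least-squares solve against a design matrix; a sketch with made-up data:

```python
import numpy as np

# Hypothetical data triplets (x_i, y_i, z_i)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
z = np.array([0.5, 1.1, 0.9, 1.8, 2.2])
y = np.array([3.2, 5.4, 6.1, 8.9, 10.3])

# Design matrix: a column of ones for theta_0, then x and z
A = np.column_stack([np.ones_like(x), x, z])

# Least-squares solution minimizes ||A @ theta - y||^2
theta, *_ = np.linalg.lstsq(A, y, rcond=None)
print(theta)    # [theta_0, theta_1, theta_2]
```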

25 Nonlinear regression
Sometimes the expression is nonlinear in the unknown parameter $\theta$: the variables x and y obey the form $y \approx h(x; \theta)$, and we minimize $\sum_i (y_i - h(x_i; \theta))^2$. The minimization typically has no closed-form solution. Assuming $Y_i = h(x_i; \theta) + W_i$ with i.i.d. $W_i \sim N(0, \sigma^2)$, the ML function has the analogous form, so ML again amounts to minimizing the sum of squared residuals.
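The minimization is typically handed to an iterative solver; a sketch assuming, for illustration only, the exponential model $h(x; \theta) = \theta_1 e^{\theta_2 x}$:

```python
import numpy as np
from scipy.optimize import curve_fit

def h(x, a, b):
    """Hypothetical nonlinear model h(x; theta) with theta = (a, b)."""
    return a * np.exp(b * x)

# Made-up noisy observations of an exponential trend
x = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 2.5])
y = np.array([2.1, 2.5, 3.4, 4.2, 5.6, 6.9])

# curve_fit iteratively minimizes the sum of squared residuals
theta, _ = curve_fit(h, x, y, p0=[1.0, 1.0])
print(theta)    # fitted (a, b)
```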

26 Practical considerations
– Heteroskedasticity: the noise variance may differ across observations
– Nonlinearity: the true relationship may not be linear
– Multicollinearity: explanatory variables may be strongly correlated with each other
– Overfitting: too many parameters for the available data
– Causality: regression captures association, not necessarily causation

