Chapter 2 Minimum Variance Unbiased estimation

Introduction In this chapter we will begin our search for good estimators of unknown deterministic parameters. We will restrict our attention to estimators which on the average yield the true parameter value. Then, within this class of estimators the goal will be to find the one that exhibits the least variability. The estimator thus obtained will produce values close to the true value most of the time. The notion of a minimum variance unbiased estimator is examined within this chapter.

Unbiased Estimators For an estimator to be unbiased we mean that on the average the estimator will yield the true value of the unknown parameter. Since the parameter value may in general be anywhere in the interval $a < \theta < b$, unbiasedness asserts that no matter what the true value of $\theta$, our estimator will yield it on the average, i.e., $E(\hat{\theta}) = \theta$ for $a < \theta < b$. (2.1)

Example 2.1 (1/2) Consider the observations $x[n] = A + w[n]$, $n = 0, 1, \ldots, N-1$, where A is the parameter to be estimated and w[n] is WGN. The parameter A can take on any value in the interval $-\infty < A < \infty$. A reasonable estimator for the average value of x[n] is $\hat{A} = \frac{1}{N}\sum_{n=0}^{N-1} x[n]$, or the sample mean. (2.2)

Example 2.1 (2/2) Due to the linearity properties of the expectation operator, $E(\hat{A}) = E\left[\frac{1}{N}\sum_{n=0}^{N-1} x[n]\right] = \frac{1}{N}\sum_{n=0}^{N-1} E(x[n]) = \frac{1}{N}\sum_{n=0}^{N-1} A = A$ for all A. The sample mean estimator is unbiased.
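As a quick numerical illustration (not from the original slides), the following Python sketch simulates the model $x[n] = A + w[n]$ and checks that the sample mean averages to the true A over many trials; the values A = 3, sigma = 1, and N = 10 are arbitrary choices for the demonstration.

import numpy as np

# Illustrative check (assumed values): x[n] = A + w[n], w[n] ~ N(0, sigma^2)
rng = np.random.default_rng(0)
A, sigma, N = 3.0, 1.0, 10          # arbitrary example values
trials = 100_000

x = A + sigma * rng.standard_normal((trials, N))
A_hat = x.mean(axis=1)              # sample mean estimator for each trial

print("mean of A_hat:", A_hat.mean())   # close to A = 3, i.e. unbiased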

Unbiased Estimators The restriction that $E(\hat{\theta}) = \theta$ for all $\theta$ (2.3) is an important one. It is possible that (2.3) may hold for some values of $\theta$ and not others, in which case the estimator is not unbiased.

Example 2.2 Consider again Example 2.1 but with the modified sample mean estimator $\check{A} = \frac{1}{2N}\sum_{n=0}^{N-1} x[n]$. Then, $E(\check{A}) = \frac{1}{2}E(\hat{A}) = \frac{A}{2}$. It is seen that (2.3) holds for the modified estimator only for A = 0. Clearly, $\check{A}$ is a biased estimator.

Unbiased Estimators That an estimator is unbiased does not necessarily mean that it is a good estimator. It only guarantees that on the average it will attain the true value. A persistent bias, on the other hand, will always result in a poor estimator. The unbiased property has an important implication when several estimates of the same parameter, say $\hat{\theta}_1, \hat{\theta}_2, \ldots, \hat{\theta}_n$, are combined. A reasonable procedure is to combine these estimates into, hopefully, a better one by averaging them to form $\hat{\theta} = \frac{1}{n}\sum_{i=1}^{n} \hat{\theta}_i$.

Unbiased Estimators Assuming the estimators are unbiased, with the same variance, and uncorrelated with each other, $E(\hat{\theta}) = \theta$ and $\mathrm{var}(\hat{\theta}) = \frac{\mathrm{var}(\hat{\theta}_1)}{n}$, so that as more estimates are averaged, the variance will decrease.

Unbiased Estimators However, if the estimators are biased, say $E(\hat{\theta}_i) = \theta + b(\theta)$, then $E(\hat{\theta}) = \theta + b(\theta)$ and, no matter how many estimators are averaged, $\hat{\theta}$ will not converge to the true value. Note that, in general, $b(\theta) = E(\hat{\theta}) - \theta$ is defined as the bias of the estimator.
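The following hedged Python sketch (example values assumed, not from the slides) illustrates both points: averaging unbiased estimates drives the variance down as 1/n, while averaging biased estimates leaves the bias untouched.

import numpy as np

# Hedged illustration (assumed values): average n independent estimates of a
# DC level A. Unbiased estimates: variance shrinks as 1/n. Biased estimates
# (each off by b): the averaged estimate stays off by b.
rng = np.random.default_rng(1)
A, sigma, n, trials = 3.0, 1.0, 50, 20_000

theta_i = A + sigma * rng.standard_normal((trials, n))            # unbiased estimates
theta_avg = theta_i.mean(axis=1)
print("unbiased: mean", theta_avg.mean(), "var", theta_avg.var())  # var ~ sigma^2/n

b = 0.5                                                            # a persistent bias
biased_avg = (theta_i + b).mean(axis=1)
print("biased:   mean", biased_avg.mean())                         # stays near A + b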

Minimum Variance Criterion In searching for optimal estimators we need to adopt some optimality criterion. A natural one is the mean square error (MSE), defined as $\mathrm{mse}(\hat{\theta}) = E\left[(\hat{\theta} - \theta)^2\right]$. Unfortunately, adoption of this natural criterion leads to unrealizable estimators, ones that cannot be written solely as a function of the data.

Minimum Variance Criterion To understand the problem which arises we first rewrite the MSE as $\mathrm{mse}(\hat{\theta}) = E\left\{\left[(\hat{\theta} - E(\hat{\theta})) + (E(\hat{\theta}) - \theta)\right]^2\right\} = \mathrm{var}(\hat{\theta}) + b^2(\theta)$, where the cross term vanishes because $E[\hat{\theta} - E(\hat{\theta})] = 0$. This shows that the MSE is composed of errors due to the variance of the estimator as well as the bias. (2.6)

Minimum Variance Criterion As an example, for the problem in Example 2.1 consider the modified estimator $\check{A} = a\,\frac{1}{N}\sum_{n=0}^{N-1} x[n]$ for some constant a. We will attempt to find the a which results in the minimum MSE. Since $E(\check{A}) = aA$ and $\mathrm{var}(\check{A}) = \frac{a^2\sigma^2}{N}$, we have $\mathrm{mse}(\check{A}) = \frac{a^2\sigma^2}{N} + (a - 1)^2 A^2$.

Minimum Variance Criterion Differentiating the MSE with respect to a yields $\frac{d\,\mathrm{mse}(\check{A})}{da} = \frac{2a\sigma^2}{N} + 2(a - 1)A^2$, which upon setting to zero and solving yields the optimum value $a_{\mathrm{opt}} = \frac{A^2}{A^2 + \sigma^2/N}$. It is seen that the optimal value of a depends upon the unknown parameter A. The estimator is therefore not realizable.
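As a sketch of why the minimum-MSE estimator is unrealizable (the parameter values A = 3, sigma = 1, N = 10 are arbitrary assumptions), the following Python snippet evaluates the MSE over a grid of a and compares the minimizing value with the formula above; the optimum clearly depends on the unknown A.

import numpy as np

# Sketch (assumed values): MSE of the scaled sample mean a * mean(x) as a
# function of a, for the model x[n] = A + w[n]. The minimizing a depends on
# the unknown A, which is why the minimum-MSE estimator is not realizable.
A, sigma, N = 3.0, 1.0, 10
a = np.linspace(0.0, 1.5, 301)
mse = a**2 * sigma**2 / N + (a - 1.0)**2 * A**2   # var + bias^2, from (2.6)

a_opt_formula = A**2 / (A**2 + sigma**2 / N)
print("a minimizing MSE (grid):", a[np.argmin(mse)])
print("a_opt from formula:     ", a_opt_formula)   # depends on the unknown A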

Minimum Variance Criterion In retrospect the estimator depends upon A since the bias term in (2.6) is a function of A. It would seem that any criterion which depends on the bias will lead to an unrealizable estimator. From a practical viewpoint the minimum MSE estimator needs to be abandoned.

Minimum Variance Criterion An alternative approach is to constrain the bias to be zero and find the estimator which minimizes the variance. Such an estimator is termed the minimum variance unbiased (MVU) estimator. Note from (2.6) that the MSE of an unbiased estimator is just the variance. Minimizing the variance of an unbiased estimator also has the effect of concentrating the PDF of the estimation error about zero. The estimation error will therefore be less likely to be large.

Existence of the Minimum Variance Unbiased Estimator The question arises as to whether a MVU estimator exists, i.e., an unbiased estimator with minimum variance for all θ.

Example 2.3 (1/3) Assume that we have two independent observations x[0] and x[1] with PDF $x[0] \sim \mathcal{N}(\theta, 1)$ and $x[1] \sim \mathcal{N}(\theta, 1)$ if $\theta \ge 0$, $x[1] \sim \mathcal{N}(\theta, 2)$ if $\theta < 0$. The two estimators $\hat{\theta}_1 = \frac{1}{2}(x[0] + x[1])$ and $\hat{\theta}_2 = \frac{2}{3}x[0] + \frac{1}{3}x[1]$ can easily be shown to be unbiased.

Example 2.3 (2/3) To compute the variances we have that $\mathrm{var}(\hat{\theta}_1) = \frac{1}{4}\left[\mathrm{var}(x[0]) + \mathrm{var}(x[1])\right]$ and $\mathrm{var}(\hat{\theta}_2) = \frac{4}{9}\mathrm{var}(x[0]) + \frac{1}{9}\mathrm{var}(x[1])$, so that for $\theta \ge 0$, $\mathrm{var}(\hat{\theta}_1) = \frac{18}{36} < \mathrm{var}(\hat{\theta}_2) = \frac{20}{36}$, and for $\theta < 0$, $\mathrm{var}(\hat{\theta}_1) = \frac{27}{36} > \mathrm{var}(\hat{\theta}_2) = \frac{24}{36}$.

Example 2.3 (3/3) Clearly, between these two estimators no MVU estimator exists. For $\theta \ge 0$ the smaller variance is that of $\hat{\theta}_1$, while for $\theta < 0$ it is that of $\hat{\theta}_2$; no single estimator can have a variance uniformly less than or equal to these minima.
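A small Python check of the reconstructed variances in this example (the model details follow the reconstruction above, so treat the specific numbers as an assumption rather than the original slide content):

import numpy as np

# Hedged numerical check of the reconstructed Example 2.3: x[0] ~ N(theta, 1),
# x[1] ~ N(theta, 1) for theta >= 0 and N(theta, 2) for theta < 0.
# theta1_hat = (x[0]+x[1])/2, theta2_hat = (2x[0]+x[1])/3.
def variances(var0, var1):
    v1 = (var0 + var1) / 4.0          # var of (x[0] + x[1]) / 2
    v2 = (4.0 * var0 + var1) / 9.0    # var of (2x[0] + x[1]) / 3
    return v1, v2

print("theta >= 0:", variances(1.0, 1.0))   # (0.500, 0.556): theta1_hat better
print("theta <  0:", variances(1.0, 2.0))   # (0.750, 0.667): theta2_hat better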

Finding the Minimum Variance Unbiased Estimator Even if an MVU estimator exists, we may not be able to find it. In the next few chapters we shall discuss several possible approaches. They are: (1) determine the Cramer-Rao lower bound (CRLB) and check to see if some estimator satisfies it (Chapters 3 and 4); (2) apply the Rao-Blackwell-Lehmann-Scheffe (RBLS) theorem (Chapter 5); (3) further restrict the class of estimators to be not only unbiased but also linear, and then find the minimum variance estimator within this restricted class (Chapter 6).

Finding the Minimum Variance Unbiased Estimator The CRLB allows us to determine that for any unbiased estimator the variance must be greater than or equal to a given value. If an estimator exists whose variance equals the CRLB for each value of θ, then it must be the MVU estimator.

Extension to a Vector Parameter If $\boldsymbol{\theta} = [\theta_1\ \theta_2\ \cdots\ \theta_p]^T$ is a vector of unknown parameters, then we say that an estimator $\hat{\boldsymbol{\theta}}$ is unbiased if $E(\hat{\theta}_i) = \theta_i$ for $a_i < \theta_i < b_i$ and i = 1, 2, …, p. (2.7) By defining $E(\hat{\boldsymbol{\theta}}) = \left[E(\hat{\theta}_1)\ E(\hat{\theta}_2)\ \cdots\ E(\hat{\theta}_p)\right]^T$,

Extension to a Vector Parameter We can equivalently define an unbiased estimator to have the property $E(\hat{\boldsymbol{\theta}}) = \boldsymbol{\theta}$ for every $\boldsymbol{\theta}$ contained within the space defined in (2.7). A MVU estimator has the additional property that $\mathrm{var}(\hat{\theta}_i)$ for i = 1, 2, …, p is minimum among all unbiased estimators.

Chapter 3 Cramer-Rao Lower Bound

Introduction The CRLB places a lower bound on the variance of any unbiased estimator; if an estimator attains this bound for all θ, we may assert that it is the MVU estimator. Although many such variance bounds exist [McAulay and Hofstetter 1971, Kendall and Stuart 1979, Seidman 1970, Ziv and Zakai 1969], the Cramer-Rao lower bound (CRLB) is the easiest to determine.

3.3 Estimator Accuracy Considerations Consider the hidden factors that determine how well we can estimate a parameter. The more the PDF is influenced by the unknown parameter, the better we should be able to estimate it. Example 3.1 - PDF dependence on unknown parameter. A single sample is observed as $x[0] = A + w[0]$, where $w[0] \sim \mathcal{N}(0, \sigma^2)$, and it is desired to estimate A.

3.3 Estimator Accuracy Considerations Example 3.1 (cont.) A good unbiased estimator is $\hat{A} = x[0]$. The variance is $\mathrm{var}(\hat{A}) = \sigma^2$, so the estimator accuracy improves as $\sigma^2$ decreases. If we compare the likelihood function $p(x[0]; A)$ for a small $\sigma^2$ with that for a large $\sigma^2$,

3.3 Estimator Accuracy Considerations Example 3.1 (cont.) the latter exhibits a much weaker dependence on A, and hence A is more difficult to estimate accurately.

3.3 Estimator Accuracy Considerations The “sharpness” of the likelihood functions determines how accurately we can estimate the unknown parameter.

3.3 Estimator Accuracy Considerations For this example the second derivative of the log-likelihood, $-\frac{\partial^2 \ln p(x[0]; A)}{\partial A^2} = \frac{1}{\sigma^2}$, does not depend on $x[0]$. In general, a more appropriate measure of curvature is $-E\left[\frac{\partial^2 \ln p(x[0]; A)}{\partial A^2}\right]$,

3.3 Estimator Accuracy Considerations which measures the average curvature of the log-likelihood function. The expectation is taken with respect to $p(x[0]; A)$, resulting in a function of A only. The larger this quantity, the smaller the variance of the estimator.

3.4 Cramer-Rao Lower Bound Theorem 3.1 (CRLB – Scalar Parameter) It is assumed that the PDF $p(\mathbf{x}; \theta)$ satisfies the “regularity” condition $E\left[\frac{\partial \ln p(\mathbf{x}; \theta)}{\partial \theta}\right] = 0$ for all $\theta$, where the expectation is taken with respect to $p(\mathbf{x}; \theta)$. Then the variance of any unbiased estimator $\hat{\theta}$ must satisfy $\mathrm{var}(\hat{\theta}) \ge \frac{1}{-E\left[\frac{\partial^2 \ln p(\mathbf{x}; \theta)}{\partial \theta^2}\right]}$.

3.4 Cramer-Rao Lower Bound Theorem 3.1 (cont.) Furthermore, an unbiased estimator may be found that attains the bound for all $\theta$ if and only if $\frac{\partial \ln p(\mathbf{x}; \theta)}{\partial \theta} = I(\theta)\left(g(\mathbf{x}) - \theta\right)$ for some functions g and I. That estimator, which is the MVU estimator, is $\hat{\theta} = g(\mathbf{x})$, and the minimum variance is $\frac{1}{I(\theta)}$.

3.4 Cramer-Rao Lower Bound We now prove that when the CRLB is attained, $\mathrm{var}(\hat{\theta}) = \frac{1}{I(\theta)}$, where $I(\theta) = -E\left[\frac{\partial^2 \ln p(\mathbf{x}; \theta)}{\partial \theta^2}\right]$. Proof: Because the CRLB is attained, $\mathrm{var}(\hat{\theta}) = \frac{1}{-E\left[\frac{\partial^2 \ln p(\mathbf{x}; \theta)}{\partial \theta^2}\right]}$ and $\frac{\partial \ln p(\mathbf{x}; \theta)}{\partial \theta} = I(\theta)(\hat{\theta} - \theta)$.

3.4 Cramer-Rao Lower Bound Proof (cont.): Differentiating once more, we get $\frac{\partial^2 \ln p(\mathbf{x}; \theta)}{\partial \theta^2} = \frac{\partial I(\theta)}{\partial \theta}(\hat{\theta} - \theta) - I(\theta)$, and then, taking the negative expectation and using $E(\hat{\theta}) = \theta$, $-E\left[\frac{\partial^2 \ln p(\mathbf{x}; \theta)}{\partial \theta^2}\right] = I(\theta)$. Finally, $\mathrm{var}(\hat{\theta}) = \frac{1}{I(\theta)}$.

3.4 Cramer-Rao Lower Bound Regularity: the condition $E\left[\frac{\partial \ln p(\mathbf{x}; \theta)}{\partial \theta}\right] = 0$ for all $\theta$ holds whenever the order of differentiation with respect to $\theta$ and integration over $\mathbf{x}$ may be interchanged, which is the case for most PDFs of interest.

3.4 Cramer-Rao Lower Bound Example 3.2 – CRLB for Example 3.1. For the single observation $x[0] = A + w[0]$ with $w[0] \sim \mathcal{N}(0, \sigma^2)$, we have $-E\left[\frac{\partial^2 \ln p(x[0]; A)}{\partial A^2}\right] = \frac{1}{\sigma^2}$, so that $\mathrm{var}(\hat{A}) \ge \sigma^2$ for all A. The estimator $\hat{A} = x[0]$ attains this bound and is therefore the MVU estimator.

3.4 Cramer-Rao Lower Bound Example 3.3 – DC level in white Gaussian noise. Consider the multiple observations $x[n] = A + w[n]$, $n = 0, 1, \ldots, N-1$, where w[n] is WGN with variance $\sigma^2$. The PDF is $p(\mathbf{x}; A) = \frac{1}{(2\pi\sigma^2)^{N/2}} \exp\left[-\frac{1}{2\sigma^2}\sum_{n=0}^{N-1}(x[n] - A)^2\right]$.

3.4 Cramer-Rao Lower Bound Example 3.3 (cont.) Differentiating the log-likelihood gives $\frac{\partial \ln p(\mathbf{x}; A)}{\partial A} = \frac{1}{\sigma^2}\sum_{n=0}^{N-1}(x[n] - A) = \frac{N}{\sigma^2}(\bar{x} - A)$ and $\frac{\partial^2 \ln p(\mathbf{x}; A)}{\partial A^2} = -\frac{N}{\sigma^2}$, so the CRLB is $\mathrm{var}(\hat{A}) \ge \frac{\sigma^2}{N}$.

3.4 Cramer-Rao Lower Bound Example 3.3 (cont.) Since $\frac{\partial \ln p(\mathbf{x}; A)}{\partial A} = \frac{N}{\sigma^2}(\bar{x} - A)$ is exactly in the form $I(A)(g(\mathbf{x}) - A)$ with $g(\mathbf{x}) = \bar{x}$ and $I(A) = \frac{N}{\sigma^2}$, we see that the sample mean estimator attains the bound and must therefore be the MVU estimator.
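The following Monte Carlo sketch (with arbitrary assumed values A = 1, sigma = 2, N = 25) verifies numerically that the variance of the sample mean matches the CRLB $\sigma^2/N$:

import numpy as np

# Monte Carlo sketch (assumed values): the variance of the sample mean for
# x[n] = A + w[n], w[n] ~ N(0, sigma^2), should match the CRLB sigma^2/N.
rng = np.random.default_rng(2)
A, sigma, N, trials = 1.0, 2.0, 25, 200_000

x = A + sigma * rng.standard_normal((trials, N))
A_hat = x.mean(axis=1)

print("empirical var(A_hat):", A_hat.var())
print("CRLB sigma^2/N:      ", sigma**2 / N)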

3.4 Cramer-Rao Lower Bound Example 3.4 – Phase Estimator. Consider $x[n] = A\cos(2\pi f_0 n + \phi) + w[n]$, $n = 0, 1, \ldots, N-1$, where w[n] is WGN with variance $\sigma^2$. A and $f_0$ are assumed known, and we wish to estimate the phase $\phi$.

3.4 Cramer-Rao Lower Bound Example 3.4 (cont.) Differentiating the log-likelihood twice and taking the negative expectation, we get $-E\left[\frac{\partial^2 \ln p(\mathbf{x}; \phi)}{\partial \phi^2}\right] = \frac{A^2}{\sigma^2}\sum_{n=0}^{N-1}\sin^2(2\pi f_0 n + \phi) \approx \frac{NA^2}{2\sigma^2}$, so that $\mathrm{var}(\hat{\phi}) \ge \frac{2\sigma^2}{NA^2}$.

3.4 Cramer-Rao Lower Bound Example 3.4 (cont.) In this example the condition for the bound to be attained is not satisfied, since $\frac{\partial \ln p(\mathbf{x}; \phi)}{\partial \phi}$ cannot be put in the form $I(\phi)(g(\mathbf{x}) - \phi)$. Hence, a phase estimator that is unbiased and attains the CRLB does not exist. An MVU estimator, however, may still exist.

3.4 Cramer-Rao Lower Bound Efficiency vs. minimum variance: an unbiased estimator that attains the CRLB is said to be efficient. The MVU estimator may or may not be efficient; it has the smallest variance of all unbiased estimators, but that variance need not equal the CRLB.

3.4 Cramer-Rao Lower Bound The Fisher information $I(\theta) = -E\left[\frac{\partial^2 \ln p(\mathbf{x}; \theta)}{\partial \theta^2}\right]$ has the following properties: it is nonnegative, and it is additive for independent observations.

3.4 Cramer-Rao Lower Bound The latter property leads to the result that the CRLB for N IID observations is 1/N times that for one observation. For completely dependent samples, as when $x[0] = x[1] = \cdots = x[N-1]$, the information does not increase with the data record length, and the CRLB does not decrease.
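As a brief worked step (a standard argument, not shown explicitly on the slide), for IID observations the log-likelihood is a sum of per-sample terms, so the Fisher information adds:

$\ln p(\mathbf{x}; \theta) = \sum_{n=0}^{N-1} \ln p(x[n]; \theta) \;\Longrightarrow\; I(\theta) = \sum_{n=0}^{N-1} \left(-E\left[\frac{\partial^2 \ln p(x[n]; \theta)}{\partial \theta^2}\right]\right) = N\, i(\theta),$

where $i(\theta)$ is the information for a single observation, so the CRLB $\frac{1}{I(\theta)} = \frac{1}{N i(\theta)}$ is 1/N times the single-observation bound.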

3.5 General CRLB for Signals in White Gaussian Noise Consider a deterministic signal with an unknown parameter $\theta$ observed in WGN: $x[n] = s[n; \theta] + w[n]$, $n = 0, 1, \ldots, N-1$.

3.5 General CRLB for Signals in White Gaussian Noise Differentiating the log-likelihood twice and taking the negative expectation, we finally obtain $\mathrm{var}(\hat{\theta}) \ge \frac{\sigma^2}{\sum_{n=0}^{N-1}\left(\frac{\partial s[n; \theta]}{\partial \theta}\right)^2}$.

3.5 General CRLB for Signals in White Gaussian Noise Example 3.5 – Sinusoidal Frequency Estimation. Assume $s[n; f_0] = A\cos(2\pi f_0 n + \phi)$ with $0 < f_0 < \frac{1}{2}$, where A and the phase $\phi$ are known. Since $\frac{\partial s[n; f_0]}{\partial f_0} = -2\pi n A\sin(2\pi f_0 n + \phi)$, we get the CRLB $\mathrm{var}(\hat{f}_0) \ge \frac{\sigma^2}{A^2\sum_{n=0}^{N-1}\left[2\pi n\sin(2\pi f_0 n + \phi)\right]^2}$. If $f_0 \to 0$ or $f_0 \to \frac{1}{2}$, the CRLB goes to infinity.

3.5 General CRLB for Signals in White Gaussian Noise Example 3.5 (cont.) (Figure: CRLB versus frequency $f_0$, showing the bound growing without limit as $f_0$ approaches 0 or 1/2.)
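The following Python sketch (assumed values A = 1, sigma = 1, N = 10, phi = 0) evaluates the reconstructed frequency CRLB at a few frequencies and shows it growing rapidly as $f_0$ approaches 0 or 1/2:

import numpy as np

# Sketch (assumed values) of the reconstructed frequency CRLB:
# var(f0_hat) >= sigma^2 / (A^2 * sum_n [2*pi*n*sin(2*pi*f0*n + phi)]^2).
A, sigma, N, phi = 1.0, 1.0, 10, 0.0
n = np.arange(N)

def crlb_f0(f0):
    d = -2.0 * np.pi * n * A * np.sin(2.0 * np.pi * f0 * n + phi)  # ds/df0
    return sigma**2 / np.sum(d**2)

for f0 in (0.001, 0.1, 0.25, 0.4, 0.499):
    print(f"f0 = {f0:5.3f}  CRLB = {crlb_f0(f0):.3e}")   # blows up near 0 and 1/2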

3.6 Transformation of Parameters Usually, the parameter we wish to estimate is a function of some more fundamental parameter. In Example 3.3, we may wish to estimate $A^2$ rather than A. Knowing the CRLB for A, we can easily obtain it for $A^2$. As shown in Appendix 3A, if it is desired to estimate $\alpha = g(\theta)$, then the CRLB is $\mathrm{var}(\hat{\alpha}) \ge \frac{\left(\frac{\partial g}{\partial \theta}\right)^2}{-E\left[\frac{\partial^2 \ln p(\mathbf{x}; \theta)}{\partial \theta^2}\right]}$.

3.6 Transformation of Parameters For the present example, $\alpha = g(A) = A^2$ and $\frac{\partial g}{\partial A} = 2A$, so the CRLB becomes $\mathrm{var}(\widehat{A^2}) \ge \frac{(2A)^2}{N/\sigma^2} = \frac{4A^2\sigma^2}{N}$. In Example 3.3 the sample mean estimator $\bar{x}$ was efficient for A. It might be supposed that $\bar{x}^2$ is efficient for $A^2$. But actually, $\bar{x}^2$ is not even an unbiased estimator! Proof: since $\bar{x} \sim \mathcal{N}(A, \sigma^2/N)$, $E(\bar{x}^2) = \mathrm{var}(\bar{x}) + [E(\bar{x})]^2 = \frac{\sigma^2}{N} + A^2 \ne A^2$.
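A hedged Monte Carlo illustration of this bias (A = 2, sigma = 1, N = 10 are arbitrary assumed values):

import numpy as np

# Monte Carlo sketch (assumed values): E[(sample mean)^2] = A^2 + sigma^2/N,
# so the squared sample mean is a biased estimator of A^2.
rng = np.random.default_rng(3)
A, sigma, N, trials = 2.0, 1.0, 10, 200_000

x = A + sigma * rng.standard_normal((trials, N))
A2_hat = x.mean(axis=1) ** 2

print("mean of A2_hat: ", A2_hat.mean())
print("A^2 + sigma^2/N:", A**2 + sigma**2 / N)   # matches; bias = sigma^2/N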

3.6 Transformation of Parameters The efficiency of an estimator is, in general, destroyed by a nonlinear transformation, but it is maintained for a linear transformation. Proof: Assume that an efficient estimator $\hat{\theta}$ for $\theta$ exists. It is desired to estimate $g(\theta) = a\theta + b$. We choose $\widehat{g(\theta)} = a\hat{\theta} + b$. Then $E\left(\widehat{g(\theta)}\right) = aE(\hat{\theta}) + b = a\theta + b = g(\theta)$, so that $\widehat{g(\theta)}$ is unbiased.

3.6 Transformation of Parameters The CRLB for $g(\theta)$ is $\mathrm{var}\left(\widehat{g(\theta)}\right) \ge \frac{\left(\frac{\partial g}{\partial \theta}\right)^2}{I(\theta)} = \frac{a^2}{I(\theta)}$. But $\mathrm{var}\left(\widehat{g(\theta)}\right) = \mathrm{var}(a\hat{\theta} + b) = a^2\,\mathrm{var}(\hat{\theta}) = \frac{a^2}{I(\theta)}$, so that the CRLB is achieved. Hence, efficiency is maintained for linear transformations.

3.6 Transformation of Parameters Efficiency is approximately maintained over nonlinear transformations if the data record is large enough. Ex: the example of estimating $A^2$ by $\bar{x}^2$. Although $\bar{x}^2$ is biased, we note that $E(\bar{x}^2) = A^2 + \frac{\sigma^2}{N} \to A^2$ as $N \to \infty$, so $\bar{x}^2$ is asymptotically unbiased. Since $\bar{x} \sim \mathcal{N}(A, \sigma^2/N)$, we can evaluate the variance of $\bar{x}^2$ directly.

3.6 Transformation of Parameters Using the result that if $\xi \sim \mathcal{N}(\mu, \sigma^2)$ then $E(\xi^2) = \mu^2 + \sigma^2$ and $E(\xi^4) = \mu^4 + 6\mu^2\sigma^2 + 3\sigma^4$, we therefore have $\mathrm{var}(\xi^2) = E(\xi^4) - [E(\xi^2)]^2 = 4\mu^2\sigma^2 + 2\sigma^4$. For our problem, $\mathrm{var}(\bar{x}^2) = \frac{4A^2\sigma^2}{N} + \frac{2\sigma^4}{N^2}$. Hence, as $N \to \infty$, the variance approaches the CRLB $\frac{4A^2\sigma^2}{N}$, and $\bar{x}^2$ is an asymptotically efficient estimator of $A^2$.
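The asymptotic efficiency can also be checked numerically; in the sketch below (assumed values A = 2, sigma = 1), the empirical variance of $\bar{x}^2$ tracks the formula $\frac{4A^2\sigma^2}{N} + \frac{2\sigma^4}{N^2}$ and approaches the CRLB as N grows:

import numpy as np

# Monte Carlo sketch (assumed values): var(mean(x)^2) should be close to
# 4*A^2*sigma^2/N + 2*sigma^4/N^2, approaching the CRLB 4*A^2*sigma^2/N for large N.
rng = np.random.default_rng(4)
A, sigma, trials = 2.0, 1.0, 10_000

for N in (10, 100, 1000):
    x = A + sigma * rng.standard_normal((trials, N))
    v = (x.mean(axis=1) ** 2).var()
    exact = 4 * A**2 * sigma**2 / N + 2 * sigma**4 / N**2
    crlb = 4 * A**2 * sigma**2 / N
    print(f"N={N:4d}  var={v:.5f}  formula={exact:.5f}  CRLB={crlb:.5f}")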

3.6 Transformation of Parameters This situation occurs due to the statistical linearity of the transformation, as illustrated in the figure. As N increases, the PDF of $\bar{x}$ becomes more concentrated about the mean A, so that $\bar{x}$ falls within a small interval over which the transformation $g(\bar{x}) = \bar{x}^2$ is approximately linear.

3.6 Transformation of Parameters If we linearize g about A, we have the approximation $g(\bar{x}) \approx g(A) + \frac{dg(A)}{dA}(\bar{x} - A)$. Within this approximation, $E[g(\bar{x})] = g(A) = A^2$, so the estimator is unbiased (asymptotically). Also, $\mathrm{var}[g(\bar{x})] = \left(\frac{dg(A)}{dA}\right)^2 \mathrm{var}(\bar{x}) = \frac{4A^2\sigma^2}{N}$, so the estimator achieves the CRLB (asymptotically).

3.7 Extension to a Vector Parameter Now we extend the results to a vector parameter $\boldsymbol{\theta} = [\theta_1\ \theta_2\ \cdots\ \theta_p]^T$. As derived in Appendix 3B, the CRLB is found as the [i, i] element of the inverse of a matrix: $\mathrm{var}(\hat{\theta}_i) \ge \left[\mathbf{I}^{-1}(\boldsymbol{\theta})\right]_{ii}$, where $\mathbf{I}(\boldsymbol{\theta})$ is the $p \times p$ Fisher information matrix with elements $\left[\mathbf{I}(\boldsymbol{\theta})\right]_{ij} = -E\left[\frac{\partial^2 \ln p(\mathbf{x}; \boldsymbol{\theta})}{\partial \theta_i\,\partial \theta_j}\right]$ for $i = 1, \ldots, p$ and $j = 1, \ldots, p$.

3.7 Extension to a Vector Parameter Example 3.6 – DC Level in White Gaussian Noise. We now extend Example 3.3 to the case where, in addition to A, the noise variance $\sigma^2$ is also unknown. The parameter vector is $\boldsymbol{\theta} = [A\ \ \sigma^2]^T$, hence p = 2. The 2x2 Fisher information matrix is $\mathbf{I}(\boldsymbol{\theta}) = \begin{bmatrix} -E\left[\frac{\partial^2 \ln p}{\partial A^2}\right] & -E\left[\frac{\partial^2 \ln p}{\partial A\,\partial \sigma^2}\right] \\ -E\left[\frac{\partial^2 \ln p}{\partial \sigma^2\,\partial A}\right] & -E\left[\frac{\partial^2 \ln p}{\partial (\sigma^2)^2}\right] \end{bmatrix}$.

3.7 Extension to a Vector Parameter The log-likelihood function is $\ln p(\mathbf{x}; \boldsymbol{\theta}) = -\frac{N}{2}\ln 2\pi - \frac{N}{2}\ln \sigma^2 - \frac{1}{2\sigma^2}\sum_{n=0}^{N-1}(x[n] - A)^2$. The derivatives are easily found as $\frac{\partial \ln p}{\partial A} = \frac{1}{\sigma^2}\sum_{n=0}^{N-1}(x[n] - A)$, $\frac{\partial \ln p}{\partial \sigma^2} = -\frac{N}{2\sigma^2} + \frac{1}{2\sigma^4}\sum_{n=0}^{N-1}(x[n] - A)^2$, $\frac{\partial^2 \ln p}{\partial A^2} = -\frac{N}{\sigma^2}$, $\frac{\partial^2 \ln p}{\partial A\,\partial \sigma^2} = -\frac{1}{\sigma^4}\sum_{n=0}^{N-1}(x[n] - A)$, and $\frac{\partial^2 \ln p}{\partial (\sigma^2)^2} = \frac{N}{2\sigma^4} - \frac{1}{\sigma^6}\sum_{n=0}^{N-1}(x[n] - A)^2$.

3.7 Extension to a Vector Parameter Taking the negative expectations, the Fisher information matrix becomes $\mathbf{I}(\boldsymbol{\theta}) = \begin{bmatrix} \frac{N}{\sigma^2} & 0 \\ 0 & \frac{N}{2\sigma^4} \end{bmatrix}$. Although not true in general, for this example the Fisher information matrix is diagonal and hence easily inverted to yield $\mathrm{var}(\hat{A}) \ge \frac{\sigma^2}{N}$ and $\mathrm{var}(\widehat{\sigma^2}) \ge \frac{2\sigma^4}{N}$.
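As a numerical cross-check (assumed values A = 1, sigma^2 = 2, N = 20), the Fisher information matrix can be approximated by the Monte Carlo average of the outer product of the score vector and compared with the diagonal result above:

import numpy as np

# Monte Carlo sketch (assumed values): estimate the Fisher information matrix for
# theta = [A, sigma^2] via E[score * score^T] and compare with
# diag(N/sigma^2, N/(2*sigma^4)).
rng = np.random.default_rng(5)
A, var, N, trials = 1.0, 2.0, 20, 100_000

x = A + np.sqrt(var) * rng.standard_normal((trials, N))
s_A = (x - A).sum(axis=1) / var                                    # d ln p / dA
s_v = -N / (2 * var) + ((x - A) ** 2).sum(axis=1) / (2 * var**2)   # d ln p / d(sigma^2)

score = np.stack([s_A, s_v], axis=1)
I_mc = score.T @ score / trials
print("Monte Carlo FIM:\n", I_mc)
print("Analytical FIM:\n", np.diag([N / var, N / (2 * var**2)]))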