Minimaxity & Admissibility. Presenting: Slava Chernoi. Lehmann and Casella, Chapter 5, Sections 1-2 and 7.


2 Agenda Minimaxity Admissibility Completeness

3 The Model: Transmitter → Channel → Signal → Receiver

4 The Model
Find a “good” estimate of $\theta$ from a measurement $X$, where $X \sim p_\theta$.
Example: the linear Gaussian model, $X = \theta + w$ with $w \sim N(0, \sigma^2)$.
Example of an estimator: $\delta(X) = X$.

5 The Objective
Find an estimator $\delta$ that minimizes a given risk function, which is the expectation of a loss function: $R(\theta, \delta) = E_\theta[L(\theta, \delta(X))]$.
Example: the mean squared error (MSE) risk, $R(\theta, \delta) = E_\theta[(\delta(X) - \theta)^2]$.
In the previous example, $R(\theta, \delta) = E_\theta[(X - \theta)^2] = \sigma^2$.
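As a concrete illustration (not from the slides), here is a minimal Python sketch that approximates the MSE risk by Monte Carlo for the sample-mean estimator in a Gaussian model; the function names and parameter values are my own choices.

```python
import numpy as np

def mse_risk(estimator, theta, sigma=1.0, n=10, reps=200_000, seed=0):
    """Monte Carlo approximation of R(theta, delta) = E_theta[(delta(X) - theta)^2]."""
    rng = np.random.default_rng(seed)
    # Draw `reps` samples of size n from N(theta, sigma^2).
    x = theta + sigma * rng.standard_normal((reps, n))
    return np.mean((estimator(x) - theta) ** 2)

sample_mean = lambda x: x.mean(axis=1)
print(mse_risk(sample_mean, theta=2.0))  # close to sigma^2 / n = 0.1
```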

6 The Problem
In general, $R(\theta, \delta)$ depends on $\theta$ and cannot be minimized for all $\theta$ simultaneously.
Example: the constant estimator $\delta(X) = \theta_0$ is optimal if indeed $\theta = \theta_0$, but performs badly if it is not.
Additional criteria need to be added for us to be able to solve the problem.

7 Main Definitions
An estimator $\delta^*$ is minimax with respect to a risk function if it minimizes the maximum risk: $\sup_\theta R(\theta, \delta^*) = \inf_\delta \sup_\theta R(\theta, \delta)$.
An estimator $\delta_1$ is said to dominate an estimator $\delta_2$ if $R(\theta, \delta_1) \le R(\theta, \delta_2)$ for all $\theta$, with strict inequality for some $\theta$.

8 Main Definitions
An estimator is admissible with respect to a risk function if no other estimator dominates it.
A class of estimators $C$ is complete if every estimator outside $C$ is dominated by some estimator in $C$; in particular, $C$ contains all admissible estimators.
An estimator is called uniformly minimum variance unbiased (UMVU) if it has minimum variance among all unbiased estimators, uniformly in $\theta$.

9 Main Definitions – Example
In the case of squared error loss, $R(\theta, \delta) = \mathrm{Var}_\theta(\delta) + \mathrm{Bias}_\theta^2(\delta)$.
So if we restrict attention to unbiased estimators, the UMVU is optimal.
Note: allowing a bias may significantly decrease the variance.

10 Main Definitions
Theorem (Lehmann–Scheffé): If an estimator is an unbiased function of a complete sufficient statistic, it is UMVU.
An estimator $\delta_\Lambda$ is the Bayes estimator with respect to a prior distribution $\Lambda$ if it minimizes the Bayes risk $r(\Lambda, \delta) = \int R(\theta, \delta)\, d\Lambda(\theta)$.
Note: in the case of squared error loss, the Bayes estimator is the conditional mean, $\delta_\Lambda(X) = E[\theta \mid X]$.

11 Example – Gaussian Case
Continuing the basic example, also assume $\theta \sim N(\mu, b^2)$.
Assuming squared error loss, the Bayes estimator becomes the posterior mean
$\delta_\Lambda(X) = \frac{b^2}{b^2 + \sigma^2} X + \frac{\sigma^2}{b^2 + \sigma^2} \mu.$
We end up with a linear Bayes estimator (we will come back to it later).
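A minimal sketch of this linear Bayes estimator, assuming a single observation and illustrative values for $\mu$, $b^2$, and $\sigma^2$ (all my own choices):

```python
def bayes_estimate(x, mu=0.0, b2=4.0, sigma2=1.0):
    """Posterior mean for X ~ N(theta, sigma2) with prior theta ~ N(mu, b2)."""
    w = b2 / (b2 + sigma2)         # weight on the data; shrinks to 0 as b2 -> 0
    return w * x + (1.0 - w) * mu  # linear in x: shrinkage toward the prior mean

print(bayes_estimate(3.0))  # 0.8 * 3.0 + 0.2 * 0.0 = 2.4
```

Note how the estimate shrinks the observation toward the prior mean; the weight on the data grows with the prior variance $b^2$.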

12 Examples
(Plot: the MSE of three estimators, with dimension $n = 10$.)
Which are minimax, admissible, inadmissible?

13 Agenda Minimaxity Admissibility Completeness

14 Minimaxity
What kind of estimator is minimax?
Minimax estimators are best in the worst case.
So, intuitively, a minimax estimator should be the best (Bayes) estimator for the worst possible prior distribution of $\theta$.

15 Minimaxity – Conditions
Definition: A prior distribution $\Lambda$ is least favorable if its Bayes risk satisfies $r_\Lambda \ge r_{\Lambda'}$ for any other distribution $\Lambda'$.
Theorem: If $r_\Lambda = \sup_\theta R(\theta, \delta_\Lambda)$, then
1. $\delta_\Lambda$ is minimax;
2. if $\delta_\Lambda$ is the unique Bayes estimator, it is the unique minimax estimator;
3. $\Lambda$ is least favorable.

16 Minimaxity
Conclusion 1: If a Bayes estimator has constant risk, it is minimax.
Conclusion 2: If a Bayes estimator attains its maximum risk with probability 1 under the prior, it is minimax.
Conclusion 3: Unique minimaxity does not imply uniqueness of the least favorable distribution.

17 Example 1 – X~b(p,n)
Flip an unfair coin $n$ times and estimate $p$.
Is the UMVU, $X/n$, minimax with squared error loss?
The risk is $R(p, X/n) = \frac{p(1-p)}{n}$, which is maximized at $p = 1/2$.

18 Example 1 – X~b(p,n)
The minimax estimator is actually given by $\delta(X) = \frac{X + \sqrt{n}/2}{n + \sqrt{n}}$, the Bayes estimator for the $\mathrm{Beta}(\sqrt{n}/2, \sqrt{n}/2)$ prior.
And the corresponding constant risk is $\frac{1}{4(1 + \sqrt{n})^2}$.
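The constant-risk claim is easy to verify numerically; the following sketch (my own check, assuming scipy is available) computes both risks exactly from the binomial pmf for $n = 10$:

```python
import numpy as np
from scipy.stats import binom

n = 10
xs = np.arange(n + 1)
minimax = (xs + np.sqrt(n) / 2) / (n + np.sqrt(n))

for p in [0.1, 0.3, 0.5, 0.9]:
    pmf = binom.pmf(xs, n, p)
    risk_umvu = np.sum(pmf * (xs / n - p) ** 2)  # equals p(1-p)/n
    risk_mm = np.sum(pmf * (minimax - p) ** 2)   # equals 1/(4 (1+sqrt(n))^2)
    print(f"p={p}: X/n risk {risk_umvu:.5f}, minimax risk {risk_mm:.5f}")
```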

19 Example 1 – X~b(p,n) Is the minimax estimator any good in this case?

20 Example 1 – X~b(p,n)  Note that for a slightly different risk function  So we have a Bayes estimator (for the uniform prior) whose risk is constant, so X/n is actually minimax

21 Example 2 – Randomized estimator
Suppose the loss function is non-convex, e.g. a 0–1 loss that penalizes estimates far from $p$.
A deterministic estimator of $p$ takes at most $n + 1$ distinct values, so its maximum risk is 1!
Consider instead an estimator $d$ that chooses $p$ at random; its maximum risk can be strictly smaller than 1.

22 Example 2 – Randomized estimator
Lemma: If the loss function is convex, positive, and minimal at $d = \theta$, then the class of all deterministic estimators is complete.
So if the loss is convex, there is no need to explore randomized estimators.

23 What happens when no least favorable distribution exists?
Definition: A sequence of prior distributions $\{\Lambda_n\}$ is least favorable if $r_{\Lambda_n} \to r$ and $r \ge r_{\Lambda'}$ for any prior $\Lambda'$.
Definition: An estimator which is the limit of Bayes estimators, $\delta = \lim_n \delta_{\Lambda_n}$, is called limit Bayes.
Theorem: If $\sup_\theta R(\theta, \delta) = \lim_n r_{\Lambda_n}$, then $\{\Lambda_n\}$ is a least favorable sequence and $\delta$ is minimax.
Comment: unlike the previous theorem, no uniqueness is guaranteed here.

24 Example 3 – What happens when no least favorable distribution exists?
Estimating $\theta$ from a single observation $X \sim N(\theta, \sigma^2)$. Is $X$, the UMVU, minimax?
Consider the sequence of priors $\Lambda_n = N(0, b_n^2)$ with $b_n \to \infty$.
As we have seen before, the Bayes estimator is $\frac{b_n^2}{b_n^2 + \sigma^2} X$, and the Bayes risk is $r_{\Lambda_n} = \frac{b_n^2 \sigma^2}{b_n^2 + \sigma^2}$.
The limit of these Bayes estimators is the UMVU, $X$.
We have an estimator with $\sup_\theta R(\theta, X) = \sigma^2 = \lim_n r_{\Lambda_n}$, so it is minimax and the sequence is least favorable!
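A short sketch of the limiting argument, with $\sigma^2 = 1$ assumed for concreteness:

```python
# Bayes risk under the prior N(0, b^2) for one observation X ~ N(theta, sigma^2);
# it climbs to sigma^2, the constant risk of X, as b^2 -> infinity.
sigma2 = 1.0
for b2 in [1.0, 10.0, 100.0, 1000.0]:
    bayes_risk = b2 * sigma2 / (b2 + sigma2)
    print(f"b^2 = {b2:g}: Bayes risk = {bayes_risk:.4f}")  # -> sigma^2 = 1
```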

25 Minimaxity – Summary
If an estimator is Bayes (or limit Bayes) with constant risk, it is minimax.
A minimax estimator does not have to have constant risk.
In the limit case, uniqueness of the Bayes limit does not imply uniqueness of the minimax estimator.

26 Agenda Minimaxity Admissibility Completeness

27 Admissibility
What kind of estimator is admissible?
An estimator is admissible if no other estimator is uniformly better.
Is the UMVU admissible? Not necessarily: biased estimators may dominate the UMVU.

28 Admissibility
What kind of estimator is admissible?
Theorem: If the parameter is known to satisfy $\theta \ge a$, the loss function is zero at $d = \theta$ and increasing as $d$ moves away from $\theta$, and $P_\theta(\delta(X) < a) > 0$, then the estimator $\delta$ is inadmissible.
Proof: it is dominated by the truncated estimator $\max(\delta(X), a)$.

29 Example 1 – Exponential X
Given $X_1, \dots, X_n$ i.i.d. exponential with mean $\theta$.
Assume it is known that $\theta \ge a$.
The unbiased estimator is $\bar{X}$.
Its improvement is the maximum likelihood estimator, $\max(\bar{X}, a)$.
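A simulation sketch of this dominance (the sample size, bound, and $\theta$ values below are my own choices):

```python
import numpy as np

# For X_1..X_n i.i.d. exponential with mean theta >= a, the truncated
# estimator max(Xbar, a) never does worse than the unbiased Xbar.
rng = np.random.default_rng(1)
n, a, reps = 5, 1.0, 200_000
for theta in [1.0, 1.5, 3.0]:
    xbar = rng.exponential(theta, size=(reps, n)).mean(axis=1)
    mse_unbiased = np.mean((xbar - theta) ** 2)
    mse_truncated = np.mean((np.maximum(xbar, a) - theta) ** 2)
    print(f"theta={theta}: unbiased {mse_unbiased:.4f}, truncated {mse_truncated:.4f}")
```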

30 Example 1 – Exponential X
(Plot: the MSE of the unbiased and truncated estimators as a function of $\theta$, for a fixed bound $a$.)

31 Admissibility
What kind of estimator is admissible?
Theorem: If $\delta$ is a unique Bayes estimator, it is admissible.
Example: any feasible constant estimator (a constant inside the parameter space) is admissible.
Note: this does not hold in the case of limit Bayes estimators.

32 Example 2 – Admissibility of linear estimators
Estimating $\theta$ from $X \sim N(\theta, \sigma^2)$, with a prior as before, the resulting estimator is linear: $\delta(X) = aX + b$.
The MSE is $R(\theta, aX + b) = a^2 \sigma^2 + ((1 - a)\theta - b)^2$.
For $0 < a < 1$ we have a unique Bayes (hence admissible) estimator.
What if $a = 0$, $a = 1$, $a > 1$, or $a < 0$?

33 Example 2 – Admissibility of linear estimators
Case 1, $a > 1$: $R(\theta, aX + b) \ge a^2 \sigma^2 > \sigma^2 = R(\theta, X)$, so $X$ dominates $aX + b$.
Case 2, $a < 0$: $((1 - a)\theta - b)^2 \ge (\theta - \frac{b}{1 - a})^2$, so $aX + b$ is dominated by the constant $\frac{b}{1 - a}$.
Case 3, $a = 1$: the best choice of the bias is $b = 0$, and the estimator $X$ is admissible.
Case 4, $a = 0$: in this case we have a constant estimator.
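The four cases can be read off the exact risk formula; a small sketch, with $\sigma^2 = 1$ assumed:

```python
import numpy as np

def risk(theta, a, b, sigma2=1.0):
    """Exact MSE of aX + b for X ~ N(theta, sigma2)."""
    return a**2 * sigma2 + ((1 - a) * theta - b) ** 2

thetas = np.linspace(-3, 3, 7)
cases = [(1.2, 0.0, "a>1 (dominated by X)"),
         (-0.5, 0.0, "a<0 (dominated by a constant)"),
         (1.0, 0.0, "a=1, b=0 (X itself)"),
         (0.0, 0.5, "a=0 (constant estimator)")]
for a, b, label in cases:
    print(label, np.round(risk(thetas, a, b), 2))
```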

34 Example 2 – Admissibility of linear estimators
(Plot: the MSE for different choices of $a$ and $b$.)

35 Example 3 – Bounded estimation
Continuing the previous example, assume now $|\theta| \le m$. Is the estimator $X$ still admissible? Minimax?
Admissible? No, it is dominated by the maximum likelihood estimator, the projection onto $[-m, m]$: $\delta_{MLE}(X) = \max(-m, \min(X, m))$.
But is the MLE minimax? Answer: No…

36 Example 3 – Bounded estimation
The actual minimax estimator when $m$ is small (without proof) is given by $\delta(X) = m \tanh\!\left(\frac{mX}{\sigma^2}\right)$, the Bayes estimator for the two-point prior on $\{-m, m\}$.
There is no closed-form solution for general $m$.
If we restrict attention to linear estimators, it is easy to show that the linear minimax estimator is $\delta(X) = \frac{m^2}{m^2 + \sigma^2} X$.
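A simulation sketch comparing the four estimators discussed here, assuming $\sigma = 1$ and $m = 1$ (both my choices):

```python
import numpy as np

rng = np.random.default_rng(2)
m, reps = 1.0, 200_000
for theta in [0.0, 0.5, 1.0]:
    x = theta + rng.standard_normal(reps)  # X ~ N(theta, 1), |theta| <= m
    estimators = {
        "X": x,                                  # unbiased
        "MLE": np.clip(x, -m, m),                # projection onto [-m, m]
        "lin-minimax": (m**2 / (m**2 + 1)) * x,  # linear minimax
        "m*tanh(mX)": m * np.tanh(m * x),        # minimax for small m
    }
    row = ", ".join(f"{name}: {np.mean((est - theta) ** 2):.3f}"
                    for name, est in estimators.items())
    print(f"theta={theta}: {row}")
```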

37 Example 3 – Bounded estimation
(Plot: for a fixed bound $m$, the MSE of the minimax, linear minimax, MLE, and unbiased estimators.)

38 Admissibility – Karlin’s Theorem
Theorem: If $X$ has a one-parameter exponential family density $p_\theta(x) = \beta(\theta) e^{\theta x} h(x)$ with natural parameter space $(\underline{\theta}, \bar{\theta})$, then the linear estimator of the mean, $\delta(x) = \frac{x}{1 + \lambda} + \frac{\gamma \lambda}{1 + \lambda}$, is admissible with squared error loss if and only if for all $\underline{\theta} < \theta_0 < \theta_1 < \bar{\theta}$ both of the following integrals diverge:
$\int_{\theta_1}^{\bar{\theta}} e^{-\gamma \lambda \theta} \beta(\theta)^{-\lambda}\, d\theta$ and $\int_{\underline{\theta}}^{\theta_0} e^{-\gamma \lambda \theta} \beta(\theta)^{-\lambda}\, d\theta$.

39 Example 4 – Binomial case
Here $X \sim b(p, n)$; in exponential family form $\theta = \log\frac{p}{1 - p}$, $\beta(\theta) = (1 + e^\theta)^{-n}$, and the natural parameter space is $(-\infty, \infty)$.
Choose $\lambda \ge 0$ and $\gamma$, and estimate the mean $np$ using a linear estimator of the form $\delta(x) = \frac{x}{1 + \lambda} + \frac{\gamma \lambda}{1 + \lambda}$.
So the divergence integral is $\int e^{-\gamma \lambda \theta} (1 + e^\theta)^{n\lambda}\, d\theta$.

40 Example 4 – Binomial case
The integral fails to converge at both ends whenever $0 \le \gamma \le n$ (with $\lambda \ge 0$), so exactly these linear estimators are admissible.
(Plot: the MSE of the unbiased, minimax, and two other admissible linear estimators.)
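A numeric illustration of the divergence check (my own, assuming scipy): the partial upper-tail integral keeps growing with the cutoff when $\gamma$ lies inside $[0, n]$ and levels off when it lies outside:

```python
import numpy as np
from scipy.integrate import quad

n, lam = 10, 1.0

def partial_integral(gamma, T):
    # Integrand of the Karlin check for the binomial, integrated over [0, T].
    f = lambda t: np.exp(-gamma * lam * t) * (1.0 + np.exp(t)) ** (n * lam)
    return quad(f, 0.0, T)[0]

for gamma in [5.0, 12.0]:  # inside vs. outside the admissible range [0, n]
    vals = [partial_integral(gamma, T) for T in (5.0, 10.0, 15.0)]
    print(f"gamma={gamma}:", ["{:.3g}".format(v) for v in vals])
```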

41 Admissibility – Summary
Corollary: If the natural parameter space is the whole real line, then $\delta(x) = x$ is an admissible estimator of its mean under squared error loss.
Proof: just insert $\lambda = 0$ into both integrals of Karlin’s theorem; the integrands become 1 and both integrals diverge.
Lemma: A constant-risk, admissible estimator is minimax.
Lemma: A unique minimax estimator is admissible.
Lemma: A unique Bayes estimator is admissible.

42 Admissibility – Summary
A limit Bayes estimator does not have to be admissible.
An estimator that is admissible with no prior information may be inadmissible when such information is present.
The same applies to minimaxity.

43 Agenda Minimaxity Admissibility Completeness

44 Completeness
Theorem: Under regularity conditions, if the loss function is continuous and strictly convex, then for any admissible estimator $\delta$ there exists a sequence of prior distributions $\{\Lambda_n\}$ such that $\delta = \lim_n \delta_{\Lambda_n}$.
Corollary: Under the assumptions of the theorem, the class of all limit Bayes estimators is complete.

45 Completeness – Linear estimators
When estimating the weighted sum $\sum_i c_i \theta_i$ from the observation $X \sim N(\theta, \sigma^2 I)$, under squared loss only certain linear estimators $\sum_i a_i x_i + b$ are admissible.
For estimating $\theta$ itself from $X \sim N(\theta, I)$, the linear estimator $AX + b$, where $A$ is a symmetric matrix, is admissible if and only if all eigenvalues of $A$ lie between 0 and 1, and at most 2 of them are equal to 1!

46 Thank you for your attention