Charles University (founded 1348). Johann Kepler University of Linz. FSV UK, STAKAN III. Institute of Economic Studies, Faculty of Social Sciences, Charles University.

Johann Kepler University of Linz. FSV UK, STAKAN III. Institute of Economic Studies, Faculty of Social Sciences, Charles University Prague. Jan Ámos Víšek. ROBUST STATISTICS - BASIC IDEAS. Austria, Linz 16. –

Schedule of today's talk
- A motivation for robust studies
- Huber’s versus Hampel’s approach
- Prokhorov distance - qualitative robustness
- Influence function - quantitative robustness: gross-error sensitivity, local shift sensitivity, rejection point
- Breakdown point
- Recalling the linear regression model
- Scale and regression equivariance

Schedule of today's talk (continued): introducing robust estimators
- Maximum likelihood(-like) estimators: M-estimators
- Other types of estimators: L-estimators, R-estimators, minimum distance, minimum volume
- Advanced requirements on point estimators

AN EXAMPLE FROM READING THE MATH. Having explained what a limit is, an example was presented: To be sure that the students really understood what was in question, they were asked to solve the exercise: The answer was as follows:

Why should robust methods also be used? Fisher, R. A. (1922): On the mathematical foundations of theoretical statistics. Philos. Trans. Roy. Soc. London Ser. A 222, pp

Why should robust methods also be used? (continued) ! is asymptotically infinitely larger than

Standard normal density. Student density with 5 degrees of freedom. Is it easy to distinguish between the normal and the Student density?

Why should robust methods also be used? (continued) Huber, P. J. (1981): Robust Statistics. New York: J. Wiley & Sons.

Why should robust methods also be used? (continued) So, only 5% of contamination makes two times better than. Is 5% contamination a lot or a little? E.g., Switzerland has 6% of errors in mortality tables, see Hampel et al. Hampel, F. R., E. M. Ronchetti, P. J. Rousseeuw, W. A. Stahel (1986): Robust Statistics - The Approach Based on Influence Functions. New York: J. Wiley & Sons.

Conclusion: We have developed efficient single-seaters which, however, work only on special F1 circuits. What about using, if necessary, a comfortable sedan? It can “survive” even ordinary roads. A proposal: Let us use both. If both work, thank God: we are on an F1 circuit. If not, let us try to learn why.

Huber’s approach. One possible framework for statistical problems is to consider a parameterized family of distribution functions. Huber’s proposal: Let us keep the same structure of the parameter space, but instead of each distribution function let us consider a whole neighborhood of d.f.s. Finally, let us employ the usual statistical technique for solving the problem in question.

Huber’s approach (continued): an example. Let us look for an (unbiased, consistent, etc.) estimator of location with minimal (asymptotic) variance for the family. For each let us define, i.e. consider instead of a single d.f. the family. Let us then look for an (unbiased, consistent, etc.) estimator of location with minimal (asymptotic) variance for the family of families. Finally, solve the same problem as at the beginning of the task.

Hampel’s approach. The information in the data is the same as the information in the empirical d.f. An estimate of a parameter of the d.f. can then be considered as a functional. This functional frequently has a (theoretical) counterpart. An example:

Hampel’s approach (continued). Expanding the functional at in direction to, we obtain: where the linear term is e.g. the Fréchet derivative (details below). Message: Hampel’s approach is an infinitesimal one, employing “differential calculus” for functionals. Local properties of the functional can be studied through the properties of its derivative.

Qualitative robustness. Let us consider a sequence of “green” d.f.s which coincide with the red one up to the distance from the Y-axis. Does the “green” sequence converge to the red d.f.?

Qualitative robustness (continued). Let us consider the Kolmogorov-Smirnov distance, i.e. the supremum of the absolute difference between the two d.f.s. The K-S distance of any “green” d.f. from the red one is equal to the length of the yellow segment, unfortunately independently of n. CONCLUSION: The “green” sequence does not converge to the red d.f. in the K-S metric!

Qualitative robustness (continued): Prokhorov distance. In words: We look for the minimal length by which we have to move the green d.f., to the left and up, for it to be above the red one. CONCLUSION: Now the sequence of green d.f.s converges to the red one.
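The contrast between the two metrics can be checked numerically. The sketch below assumes a concrete instance of the picture: the red d.f. is a point mass at 0 and the n-th green d.f. is a point mass at 1/n. It also swaps in the Lévy distance, the one-dimensional metric that matches the "move left and up" description, in place of the Prokhorov distance; both choices are assumptions of the example, and the values are accurate only up to the grid resolution.

```python
def step_cdf(jump):
    """D.f. of a point mass located at `jump`."""
    return lambda x: 1.0 if x >= jump else 0.0

# Evaluation grid on [-2, 2] with step 0.001 (illustrative choice).
GRID = [i / 1000 for i in range(-2000, 2001)]

def ks_distance(F, G):
    """Kolmogorov-Smirnov distance: sup over x of |F(x) - G(x)| (on the grid)."""
    return max(abs(F(x) - G(x)) for x in GRID)

def levy_distance(F, G, tol=1e-4):
    """Levy distance: smallest eps with F(x - eps) - eps <= G(x) <= F(x + eps) + eps
    for all x, found by bisection and checked on the grid only."""
    def close_enough(eps):
        return all(F(x - eps) - eps <= G(x) <= F(x + eps) + eps for x in GRID)
    lo, hi = 0.0, 1.0  # the Levy distance never exceeds 1
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if close_enough(mid):
            hi = mid
        else:
            lo = mid
    return hi

red = step_cdf(0.0)
for n in (2, 10, 100):
    green = step_cdf(1 / n)
    print(n, ks_distance(red, green), round(levy_distance(red, green), 3))
```

In this instance the K-S distance stays at 1 for every n, while the Lévy distance shrinks roughly like 1/n.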

Qualitative robustness. DEFINITION (i.i.d. observations): In words: Qualitative robustness is continuity with respect to the Prokhorov distance. E.g., the arithmetic mean is qualitatively robust at the normal d.f. !?! Conclusion: For practical purposes we need something “stronger” than qualitative robustness.

Quantitative robustness: influence function. The influence function is defined where the limit exists.
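The formula itself did not survive extraction. For reference, the standard definition from Hampel et al. (1986) is given here with symbols that are assumptions of this note: $T$ is the functional defining the estimator, $F$ the underlying d.f. and $\delta_x$ the point mass at $x$.

```latex
\mathrm{IF}(x; T, F) \;=\; \lim_{h \to 0^{+}}
  \frac{T\big((1-h)\,F + h\,\delta_x\big) - T(F)}{h}
```

As a quick check, for the mean $T(F) = \int y \, dF(y)$ we get $T((1-h)F + h\delta_x) = (1-h)\,T(F) + h\,x$, so $\mathrm{IF}(x; T, F) = x - T(F)$, which is unbounded in $x$.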

Quantitative robustness (continued). Characteristics derived from the influence function: gross-error sensitivity, local shift sensitivity, rejection point.
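The formulas for the three characteristics were lost in extraction. Their usual definitions in Hampel et al. (1986) are listed below; the symbols $\gamma^{*}$, $\lambda^{*}$, $\rho^{*}$ and the notation $\mathrm{IF}(x; T, F)$ for the influence function of the functional $T$ at the d.f. $F$ are assumed here.

```latex
\begin{aligned}
\gamma^{*} &= \sup_{x} \big|\mathrm{IF}(x; T, F)\big|
  && \text{(gross-error sensitivity)} \\
\lambda^{*} &= \sup_{x \neq y}
  \frac{\big|\mathrm{IF}(y; T, F) - \mathrm{IF}(x; T, F)\big|}{|y - x|}
  && \text{(local shift sensitivity)} \\
\rho^{*} &= \inf\big\{ r > 0 : \mathrm{IF}(x; T, F) = 0 \text{ for all } |x| > r \big\}
  && \text{(rejection point)}
\end{aligned}
```

A finite $\gamma^{*}$ bounds the effect of a single gross error, a finite $\lambda^{*}$ bounds the effect of rounding or small shifts, and a finite $\rho^{*}$ means that observations far enough away are ignored completely.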

Breakdown point. Definition: (The definition is here only to show that the description of breakdown given below has a good mathematical basis.) In words: the breakdown point is the smallest (asymptotic) ratio of contamination which can destroy the estimate; please don’t read this only in the sense that the estimate tends (in absolute value) to infinity or to zero. Breakdown point obsession (especially in regression; discussion below).
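A small experiment illustrates the idea on a finite sample. The sketch below replaces m of n = 11 observations by an arbitrarily bad value and compares the mean with the median; the helper name `contaminate` and the bad value 1e9 are hypothetical choices made for this example.

```python
import statistics

def contaminate(data, m, bad_value=1e9):
    """Replace the m largest observations by an arbitrarily bad value."""
    kept = sorted(data)[: len(data) - m]
    return kept + [bad_value] * m

clean = [float(i) for i in range(1, 12)]  # 1, 2, ..., 11; mean = median = 6

for m in range(0, 7):
    bad = contaminate(clean, m)
    print(f"m = {m}: mean = {statistics.mean(bad):.3g}, "
          f"median = {statistics.median(bad):.3g}")
```

A single bad observation already carries the mean away, whereas the median stays at 6 until more than half of the sample (here 6 of 11 points) has been replaced.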

Robust estimators of parameters. An introduction - motivation. Let us have a family and data. Of course, we want to estimate. Maximum likelihood estimators: What can cause a problem?

Robust estimators of parameters. An example. Consider the normal family with unit variance: (notice that does not depend on ). So we solve the extremal problem
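The formulas of this example were lost in extraction. A standard way to write the step, with the symbols $\theta$ for the location parameter and $x_1, \dots, x_n$ for the data assumed here, is:

```latex
\begin{aligned}
f(x, \theta) &= \frac{1}{\sqrt{2\pi}} \exp\!\Big(-\tfrac{1}{2}(x - \theta)^{2}\Big), \\
\hat{\theta}^{(ML)} &= \arg\max_{\theta} \prod_{i=1}^{n} f(x_i, \theta)
  = \arg\min_{\theta} \sum_{i=1}^{n} (x_i - \theta)^{2}
  = \frac{1}{n} \sum_{i=1}^{n} x_i .
\end{aligned}
```

The normalizing constant $1/\sqrt{2\pi}$ does not depend on $\theta$, so it drops out of the maximization. Because each residual enters squared, a single distant observation can dominate the sum.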

Robust estimators of parameters. A proposal of a new estimator. Once again: What caused the problem in the previous example? So what about maximum likelihood-like estimators:

Robust estimators of parameters. [Figure labels: quadratic part, linear part.]
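An objective function with a quadratic part and a linear part is what is usually called Huber's function. The following minimal sketch assumes that this is the function on the slide and computes the corresponding location estimate by iteratively reweighted averaging. The tuning constant k = 1.345, the unit scale and all names are assumptions of the example, not taken from the slides.

```python
import statistics

def huber_rho(r, k=1.345):
    """Huber's objective: quadratic for |r| <= k, linear beyond."""
    a = abs(r)
    return 0.5 * r * r if a <= k else k * a - 0.5 * k * k

def huber_location(xs, k=1.345, tol=1e-9, max_iter=200):
    """M-estimate of location minimizing sum(huber_rho(x - mu)),
    computed by iteratively reweighted means (scale assumed to be 1)."""
    mu = statistics.median(xs)  # robust starting value
    for _ in range(max_iter):
        # Weight 1 inside the quadratic part, k / |residual| in the linear part.
        w = [1.0 if abs(x - mu) <= k else k / abs(x - mu) for x in xs]
        new_mu = sum(wi * x for wi, x in zip(w, xs)) / sum(w)
        if abs(new_mu - mu) < tol:
            return new_mu
        mu = new_mu
    return mu

data = [-1.2, -0.4, 0.1, 0.3, 0.9, 50.0]  # the last value is a gross error
print("mean:  ", statistics.mean(data))
print("median:", statistics.median(data))
print("Huber: ", huber_location(data))
```

Observations in the linear part pull on the estimate with a bounded force, so the gross error at 50 shifts the Huber estimate only slightly, while it drags the mean far from the bulk of the data.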

Robust estimators of parameters. The most popular estimators:
- M-estimators: maximum likelihood-like estimators
- L-estimators: based on order statistics
- R-estimators: based on rank statistics
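As an illustration of the last two classes, the sketch below computes a trimmed mean (an L-estimator, a linear combination of order statistics) and the Hodges-Lehmann estimator (an R-estimator derived from the Wilcoxon signed-rank test). These two are chosen here as familiar representatives; the slides do not name specific members of each class, and the trimming fraction 0.2 is arbitrary.

```python
import statistics
from itertools import combinations_with_replacement

def trimmed_mean(data, trim=0.2):
    """L-estimator: drop the fraction `trim` of the sample from each end
    of the ordered data, then average the rest."""
    xs = sorted(data)
    k = int(trim * len(xs))
    return statistics.mean(xs[k: len(xs) - k])

def hodges_lehmann(data):
    """R-estimator: median of all pairwise (Walsh) averages (x_i + x_j) / 2, i <= j."""
    walsh = [(a + b) / 2 for a, b in combinations_with_replacement(data, 2)]
    return statistics.median(walsh)

data = [1.0, 2.0, 3.0, 4.0, 100.0]  # the last value is a gross error
print("mean:          ", statistics.mean(data))
print("trimmed mean:  ", trimmed_mean(data))
print("Hodges-Lehmann:", hodges_lehmann(data))
```

Both estimators return 3.0 on this sample, while the mean is 22.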

Robust estimators of parameters. The less popular, but still well known, estimators:
- Minimal distance estimators: based on minimizing the distance between the empirical d.f. and the theoretical one.
- Minimal volume estimators: based on minimizing the volume containing a given part of the data and applying a “classical” (robust) method.

Robust estimators of parameters. Algorithms for evaluating robust estimators. The classical estimator, e.g. the ML-estimator, typically has a formula to be employed for evaluating it. Extremal problems (by which robust estimators are defined) typically do not have a solution in the form of a closed formula. Firstly: to find an algorithm for evaluating an approximation to the precise solution. Secondly: to find a trick for verifying that the approximation is tight to the precise solution.

High breakdown point obsession (especially in regression; discussion below). Hereafter let us keep in mind that we speak implicitly about

Linear regression model: recalling the model. Put ( if intercept ),. and where.

Linear regression model: recalling the model graphically. So we look for a model “reasonably” explaining the data.

Linear regression model: recalling the model graphically. This is a leverage point and this is an outlier.
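The different effect of the two kinds of atypical points can be seen on a small artificial data set. The sketch below fits an ordinary least squares line to ten points lying exactly on y = 1 + 2x, then adds either an outlier in the response at a central x or a leverage point far away in x. The data, the two added points and the function name are invented for illustration.

```python
def ols_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope

xs = [float(i) for i in range(10)]
ys = [1.0 + 2.0 * x for x in xs]          # points exactly on y = 1 + 2x

clean = ols_line(xs, ys)
with_outlier = ols_line(xs + [4.5], ys + [30.0])    # atypical response, central x
with_leverage = ols_line(xs + [50.0], ys + [0.0])   # atypical, distant x

for label, (b0, b1) in [("clean", clean),
                        ("outlier", with_outlier),
                        ("leverage point", with_leverage)]:
    print(f"{label:15s} intercept = {b0:7.3f}, slope = {b1:7.3f}")
```

In this example the outlier shifts only the intercept, while the single leverage point reverses the sign of the slope.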

Equivariance of regression estimators. We will probably easily agree that the estimates of the parameters of the model should not depend on the system of coordinates. Formally it means: Equivariance in scale (scale equivariant): If for data the estimate is, then for data the estimate is. Equivariance in regression (affine equivariant): If for data the estimate is, then for data the estimate is.
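The data and estimates in these statements were lost in extraction. The standard definitions from the robust regression literature are given below; the notation $\hat{\beta}(Y, X)$ for the estimate computed from response vector $Y$ and design matrix $X$ is assumed here, and the slide's own formulation may differ in detail.

```latex
\begin{aligned}
\text{scale equivariance:} \quad
  & \hat{\beta}(cY, X) = c\,\hat{\beta}(Y, X)
  && \text{for every } c \in \mathbb{R}, \\
\text{regression equivariance:} \quad
  & \hat{\beta}(Y + Xb, X) = \hat{\beta}(Y, X) + b
  && \text{for every } b \in \mathbb{R}^{p}, \\
\text{affine equivariance:} \quad
  & \hat{\beta}(Y, XA) = A^{-1}\hat{\beta}(Y, X)
  && \text{for every nonsingular } A \in \mathbb{R}^{p \times p}.
\end{aligned}
```

The ordinary least squares estimate $(X^{\top}X)^{-1}X^{\top}Y$ satisfies all three, which can be verified by direct substitution.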

Advanced (modern?) requirements on the point estimator
- Unbiasedness
- Consistency
- Asymptotic normality
- Gross-error sensitivity
- Reasonably high efficiency
- Low local shift sensitivity
- Finite rejection point
- Controllable breakdown point
- Scale- and regression-equivariance
- Algorithm with acceptable complexity and reliability of evaluation
- The heuristics on which the estimator is based should really work
Still not exhaustive.

THANKS for ATTENTION