Parametric Families of Distributions and Their Interaction with the Workshop Title Chris Jones The Open University, U.K.


How the talk will pan out …
it will start as a talk in distribution theory
–concentrating on generating one family of distributions
then it will continue as a talk in distribution theory
–concentrating on generating a different family of distributions
but in this second part, the talk will metamorphose through links with kernels and quantiles …
… and finally get on to a more serious application to smooth (nonparametric) quantile regression (QR)
the parts of the talk involving QR are joint with Keming Yu

Set Starting point: simple symmetric g How might we introduce (at most two) shape parameters a and b which will account for skewness and/or kurtosis/tailweight (while retaining unimodality)? Modelling data with such families of distributions will, inter alia, afford robust estimation of location (and maybe scale).

FAMILY 1 g

Actual density of order statistic: Generalised density of order statistic: (i,n integer) (a,b>0 real)
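The two formulas on this slide did not survive extraction. The standard results they most plausibly refer to are sketched below. Here g is the symmetric starting density and G its distribution function; the correspondence a = i, b = n - i + 1 is the usual one and is stated here as an assumption about the slide.

```latex
% Density of the i-th order statistic in a sample of size n from g (i, n integer):
\[
  f_{i:n}(x) = \frac{n!}{(i-1)!\,(n-i)!}\; g(x)\, G(x)^{i-1}\, \{1 - G(x)\}^{n-i}
\]
% Generalised version with real a, b > 0 (set a = i, b = n - i + 1 to recover the above):
\[
  f(x) = \frac{1}{B(a,b)}\; g(x)\, G(x)^{a-1}\, \{1 - G(x)\}^{b-1},
  \qquad B(a,b) = \frac{\Gamma(a)\,\Gamma(b)}{\Gamma(a+b)}
\]
```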

Roles of a and b
a=b=1: f = g
a=b: family of symmetric distributions
a≠b: skew distributions
a controls left-hand tail weight, b controls right
the smaller a or b, the heavier the corresponding tail
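The roles of a and b can be checked numerically. The following is a minimal sketch, assuming the generalised order statistic density g(x) G(x)^(a-1) {1-G(x)}^(b-1) / B(a,b) and taking the logistic distribution as an illustrative choice of g. The function names are hypothetical, and the code is only meant for moderate |x|.

```python
import math

def logistic_pdf(x):
    # Symmetric starting density g (illustrative choice).
    e = math.exp(-abs(x))
    return e / (1.0 + e) ** 2

def logistic_cdf(x):
    # Distribution function G of the logistic, written via tanh for stability.
    return 0.5 * (1.0 + math.tanh(0.5 * x))

def log_beta(a, b):
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def gos_pdf(x, a, b, pdf=logistic_pdf, cdf=logistic_cdf):
    # Generalised order statistic density: g * G^(a-1) * (1-G)^(b-1) / B(a, b).
    G = cdf(x)
    log_f = (a - 1.0) * math.log(G) + (b - 1.0) * math.log(1.0 - G) - log_beta(a, b)
    return pdf(x) * math.exp(log_f)

# a = b = 1 returns g itself; a = b is symmetric; swapping a and b reflects the density.
print(gos_pdf(0.7, 1.0, 1.0), logistic_pdf(0.7))
print(gos_pdf(1.3, 2.5, 2.5), gos_pdf(-1.3, 2.5, 2.5))
print(gos_pdf(1.3, 2.0, 0.5), gos_pdf(-1.3, 0.5, 2.0))
```

Decreasing b while holding a fixed raises the density far out in the right tail, which is the "smaller parameter, heavier tail" behaviour described on the slide.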

Properties of (Generalised) Order Statistic Distributions
Distribution function:
Tail behaviour. For large x>0:
–power tails:
–exponential tails:
Limiting distributions:
–a and b large: normal distribution
–one of a or b large: appropriate extreme value distribution
Other properties such as moments and modality need to be examined on a case-by-case basis
For more, see Jones (2004, Test)
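The formulas for these properties are missing from the transcript. A sketch of what follows directly from the generalised density g G^{a-1} (1-G)^{b-1} / B(a,b) is given below. The tail exponents α and β are notation introduced here for illustration, not taken from the slide.

```latex
% Distribution function: the incomplete beta function ratio evaluated at G(x)
\[
  F(x) = I_{G(x)}(a,b) = \frac{1}{B(a,b)} \int_0^{G(x)} u^{a-1} (1-u)^{b-1}\, du
\]
% Right tail, x -> infinity, where G(x) -> 1 so f(x) is proportional to g(x) {1 - G(x)}^{b-1}:
% power tails: if g(x) ~ x^{-(\alpha+1)}, then 1 - G(x) ~ x^{-\alpha} and
\[
  f(x) \sim x^{-(\alpha+1)}\, x^{-\alpha (b-1)} = x^{-(b\alpha + 1)}
\]
% exponential tails: if g(x) ~ e^{-\beta x}, then 1 - G(x) ~ e^{-\beta x} and
\[
  f(x) \sim e^{-\beta x}\, e^{-\beta (b-1) x} = e^{-b \beta x}
\]
% The left tail behaves in the same way with a in place of b.
```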

Tractable Example 1
Jones & Faddy's (2003, JRSSB) skew t density
When a=b: Student t density on 2a d.f.
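The density itself is missing from the transcript. The sketch below uses the form of the Jones & Faddy skew t density as I understand it, f(t) = C^(-1) {1 + t/(a+b+t^2)^(1/2)}^(a+1/2) {1 - t/(a+b+t^2)^(1/2)}^(b+1/2) with C = 2^(a+b-1) B(a,b) (a+b)^(1/2). Treat that formula as an assumption to be checked against the paper. The code compares it with the Student t density on 2a d.f. when a=b.

```python
import math

def skew_t_pdf(t, a, b):
    # Assumed form of the Jones & Faddy (2003) skew t density.
    s = math.sqrt(a + b + t * t)
    log_c = ((a + b - 1.0) * math.log(2.0)
             + math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
             + 0.5 * math.log(a + b))
    log_f = (a + 0.5) * math.log1p(t / s) + (b + 0.5) * math.log1p(-t / s) - log_c
    return math.exp(log_f)

def student_t_pdf(t, nu):
    # Standard Student t density on nu degrees of freedom.
    log_c = (math.lgamma(0.5 * (nu + 1.0)) - math.lgamma(0.5 * nu)
             - 0.5 * math.log(nu * math.pi))
    return math.exp(log_c - 0.5 * (nu + 1.0) * math.log1p(t * t / nu))

# With a = b the skew t should coincide with Student t on 2a d.f.
for t in (-2.0, 0.0, 0.8):
    print(t, skew_t_pdf(t, 1.5, 1.5), student_t_pdf(t, 3.0))
```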

Some skew t densities

… and with a and b swopped

f = skew t density arises from ??? g

Yes, the t distribution on 2 d.f.!
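This claim can be checked numerically. The sketch below takes g to be the t density on 2 d.f., g(x) = (2+x^2)^(-3/2) with G(x) = ½{1 + x/(2+x^2)^(1/2)}, forms the generalised order statistic density, and rescales by t = x {(a+b)/2}^(1/2). The rescaling step and the skew t formula used for comparison are my own working assumptions, not taken from the slides.

```python
import math

def log_beta(a, b):
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def t2_pdf(x):
    # Student t density on 2 d.f.
    return (2.0 + x * x) ** -1.5

def t2_cdf(x):
    # Student t distribution function on 2 d.f.
    return 0.5 * (1.0 + x / math.sqrt(2.0 + x * x))

def skew_t_from_t2(t, a, b):
    # Generalised order statistic density built on t_2, rescaled to the variable t.
    c = math.sqrt(2.0 / (a + b))          # x = c * t, Jacobian c
    x = c * t
    G = t2_cdf(x)
    return c * t2_pdf(x) * G ** (a - 1.0) * (1.0 - G) ** (b - 1.0) / math.exp(log_beta(a, b))

def skew_t_pdf(t, a, b):
    # Assumed closed form of the skew t density, for comparison.
    s = math.sqrt(a + b + t * t)
    c = 2.0 ** (a + b - 1.0) * math.exp(log_beta(a, b)) * math.sqrt(a + b)
    return (1.0 + t / s) ** (a + 0.5) * (1.0 - t / s) ** (b + 0.5) / c

for t in (-1.5, 0.0, 2.0):
    print(t, skew_t_from_t2(t, 2.0, 3.5), skew_t_pdf(t, 2.0, 3.5))
```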

Tractable Example 2
Q: The (order statistics of the) logistic distribution generate the ???
A: Log F distribution
–This has exponential tails
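A short derivation, under the assumption that the generalised order statistic density g G^{a-1} (1-G)^{b-1} / B(a,b) is applied to the standard logistic distribution, shows where the log F density and its exponential tails come from.

```latex
% Standard logistic: g(x) = e^{x} / (1 + e^{x})^{2},  G(x) = e^{x} / (1 + e^{x})
\[
  f(x) = \frac{1}{B(a,b)} \cdot \frac{e^{x}}{(1+e^{x})^{2}}
         \left( \frac{e^{x}}{1+e^{x}} \right)^{a-1}
         \left( \frac{1}{1+e^{x}} \right)^{b-1}
       = \frac{1}{B(a,b)} \cdot \frac{e^{a x}}{(1+e^{x})^{a+b}}
\]
% Tails: f(x) ~ e^{a x} / B(a,b) as x -> -infinity, and f(x) ~ e^{-b x} / B(a,b) as x -> +infinity
```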

These examples, seen before, are therefore log F distributions

The log F distribution

The simple exponential tail property is shared by:
–the log F distribution
–the asymmetric Laplace distribution
–the hyperbolic distribution
Is there a general form for such distributions?

FAMILY 2: distributions with simple exponential tails Starting point: simple symmetric g with distribution function G and General form for density is:
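The two formulas on this slide are missing. The form below is a reconstruction that is consistent with the special cases listed in the talk (asymmetric Laplace, log F, hyperbolic). The normalising constant K(a,b) is notation assumed here.

```latex
% Integrated distribution function of the symmetric starting distribution:
\[
  G^{[2]}(x) = \int_{-\infty}^{x} G(t)\, dt
\]
% General form of the density, with a, b > 0:
\[
  f(x) = \frac{1}{K(a,b)} \exp\left\{ a x - (a+b)\, G^{[2]}(x) \right\}
\]
% Since G^{[2]}(x) -> 0 as x -> -infinity and G^{[2]}(x) - x -> 0 as x -> +infinity
% (for symmetric g with finite mean), f(x) ~ e^{a x} on the left and f(x) ~ e^{-b x} on the right.
```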

Special Cases
G is point mass at zero, G^[2]=xI(x>0): f is asymmetric Laplace
G is logistic, G^[2]=log(1+exp(x)): f is log F
G is t_2, G^[2]=½(x+√(1+x^2)): f is hyperbolic
G is normal, G^[2]=xΦ(x)+φ(x)
G is uniform, G^[2]=¼(1+x)^2 I(-1<x<1)+xI(x>1)
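These special cases can be sanity-checked numerically. The sketch below assumes the Family 2 density is proportional to exp{a x - (a+b) G^[2](x)}, with G^[2] the integral of G. It confirms that each listed G^[2] has derivative G and that the logistic and t_2 cases reproduce the log F and hyperbolic kernels. All function names are hypothetical.

```python
import math

def softplus(x):
    # log(1 + exp(x)), computed stably: G^[2] for the logistic.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def logistic_cdf(x):
    return 0.5 * (1.0 + math.tanh(0.5 * x))

def t2_cdf(x):
    # Scaled t_2 distribution function matching G^[2] = (x + sqrt(1 + x^2)) / 2.
    return 0.5 * (1.0 + x / math.sqrt(1.0 + x * x))

def t2_g2(x):
    return 0.5 * (x + math.sqrt(1.0 + x * x))

def normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def normal_g2(x):
    return x * normal_cdf(x) + math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def derivative(fn, x, h=1e-5):
    # Central difference, used to check d/dx G^[2](x) = G(x).
    return (fn(x + h) - fn(x - h)) / (2.0 * h)

def family2_kernel(x, a, b, g2):
    # Unnormalised Family 2 density (assumed form).
    return math.exp(a * x - (a + b) * g2(x))

def log_f_pdf(x, a, b):
    log_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return math.exp(a * x - (a + b) * softplus(x) - log_beta)

print(derivative(softplus, 0.4), logistic_cdf(0.4))
print(family2_kernel(0.4, 2.0, 3.0, softplus) / log_f_pdf(0.4, 2.0, 3.0))  # equals B(2, 3)
```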

solid line: log F dashed line: hyperbolic dotted line: normal-based

Practical Point 1
The asymmetric Laplace is a three-parameter distribution; the other members of the family have four.
The fourth parameter is redundant in practice: the (asymptotic) correlations between the ML estimates of σ and either of a or b are very near 1.
Reason: σ, a and b are all scale parameters, yet only two such parameters are needed to describe the main scale-related aspects of the distribution [either (i) a left-scale and a right-scale or (ii) an overall scale and a left-right comparer].

Practical Point 2 Parametrise by μ, σ, a=1-p, b=p. Then, score equation for μ reads: This is kernel quantile estimation, with kernel G and bandwidth σ
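The score equation is missing from the transcript. Under the assumed Family 2 density with a=1-p and b=p, differentiating the log-likelihood in μ and using the symmetry of G gives the condition that the kernel-smoothed distribution function, (1/n) Σ G((μ - X_i)/σ), equals p. The sketch below solves that equation by bisection with a logistic kernel; the function names and the bracketing interval are illustrative choices.

```python
import math

def logistic_cdf(z):
    # Kernel distribution function G (illustrative choice of kernel).
    return 0.5 * (1.0 + math.tanh(0.5 * z))

def kernel_cdf(mu, data, sigma):
    # Kernel-smoothed distribution function estimate at mu, bandwidth sigma.
    return sum(logistic_cdf((mu - x) / sigma) for x in data) / len(data)

def kernel_quantile(data, p, sigma, tol=1e-10):
    # Solve kernel_cdf(mu) = p by bisection; kernel_cdf is increasing in mu.
    lo = min(data) - 50.0 * sigma
    hi = max(data) + 50.0 * sigma
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if kernel_cdf(mid, data, sigma) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

sample = [0.0, 1.0, 2.0, 3.0, 4.0]
print(kernel_quantile(sample, 0.5, sigma=0.5))   # symmetric sample, so close to 2.0
print(kernel_quantile(sample, 0.9, sigma=0.5))
```

Bisection is used here only because the smoothed distribution function is monotone in μ; any root-finder would do.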

Includes bandwidth selection by choosing σ to solve the second score equation: But its simulation performance is variable:
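The second score equation is also missing. A sketch of what it would be under the assumed density σ^{-1} K^{-1} exp{(1-p) z - G^{[2]}(z)}, with z = (x-μ)/σ, is given below. This is a reconstruction from that assumption rather than the formula shown on the slide.

```latex
% Log-likelihood with z_i = (X_i - \mu) / \sigma:
\[
  \ell(\mu, \sigma) = -n \log \sigma
    + \sum_{i=1}^{n} \left\{ (1-p)\, z_i - G^{[2]}(z_i) \right\} + \text{const}
\]
% Since \partial z_i / \partial \sigma = -z_i / \sigma and (G^{[2]})' = G:
\[
  \frac{\partial \ell}{\partial \sigma}
    = -\frac{n}{\sigma} + \frac{1}{\sigma} \sum_{i=1}^{n} z_i \left\{ G(z_i) - (1-p) \right\} = 0
  \quad \Longleftrightarrow \quad
  \frac{1}{n} \sum_{i=1}^{n} z_i \left\{ G(z_i) - (1-p) \right\} = 1
\]
```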

And so to Quantile Regression: The usual (regression) log-likelihood, is kernel localised to point x by
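The localised log-likelihood itself is not in the transcript. The sketch below shows one plausible reading, offered purely as an assumption: each observation's Family 2 log-density (with a=1-p, b=p, logistic G) is evaluated at a local linear location α + β(X_i - x) and weighted by a horizontal kernel with bandwidth h. The Gaussian horizontal kernel and all names are illustrative and are not the authors' specification.

```python
import math

def softplus(z):
    # G^[2] for the logistic kernel: log(1 + exp(z)), computed stably.
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))

def gaussian_kernel(u):
    # Horizontal kernel (an assumption; the slides do not specify it).
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def local_loglik(alpha, beta, sigma, x0, h, p, xs, ys):
    # Kernel-localised log-likelihood at x0, up to an additive constant.
    total = 0.0
    for x, y in zip(xs, ys):
        weight = gaussian_kernel((x - x0) / h) / h
        z = (y - alpha - beta * (x - x0)) / sigma   # vertical residual, bandwidth sigma
        total += weight * ((1.0 - p) * z - softplus(z) - math.log(sigma))
    return total

xs = [0.0, 0.5, 1.0, 1.5, 2.0]
ys = [0.0, 0.5, 1.0, 1.5, 2.0]           # points exactly on the line y = x
print(local_loglik(1.0, 1.0, 0.3, 1.0, 0.5, 0.5, xs, ys))   # fit through the points
print(local_loglik(1.0, 0.0, 0.3, 1.0, 0.5, 0.5, xs, ys))   # flat fit, lower value
```

Maximising this function over α (and β) at each x would give the estimated p-th conditional quantile at x under this reading.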

This (version of) DOUBLE KERNEL LOCAL LINEAR QUANTILE REGRESSION (DKLLQR) satisfies
Writing and
Contrast this with the Yu & Jones (1998, JASA) version of DKLLQR:
where

The vertical bandwidth σ=σ(x) can also be estimated by ML: solve
Compare 3 versions of DKLLQR:
–Yu & Jones (1998), including rule-of-thumb (r-o-t) σ and h;
–new version, including r-o-t σ and h;
–new version, including the above ML σ and r-o-t h.

Based on this limited evidence:
Clear recommendation:
–replace the Yu & Jones (1998) DKLLQR method by the (gently but consistently improved) new version
Unclear non-recommendation:
–use the new bandwidth selection?

References