Parametric Families of Distributions and Their Interaction with the Workshop Title Chris Jones The Open University, U.K.
How the talk will pan out …
it will start as a talk in distribution theory
– concentrating on generating one family of distributions
then it will continue as a talk in distribution theory
– concentrating on generating a different family of distributions
but in this second part, the talk will metamorphose through links with kernels and quantiles …
… and finally get on to a more serious application to smooth (nonparametric) quantile regression (QR)
the parts of the talk involving QR are joint with Keming Yu
Starting point: a simple symmetric density g.
How might we introduce (at most two) shape parameters a and b that account for skewness and/or kurtosis/tailweight (while retaining unimodality)?
Modelling data with such families of distributions will, inter alia, afford robust estimation of location (and maybe scale).
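Actual density of the i-th order statistic in a sample of size n from g (i, n integer): $f(x) = \frac{n!}{(i-1)!\,(n-i)!}\, g(x)\, G(x)^{i-1}\, \{1-G(x)\}^{n-i}$
Generalised density of order statistic (a, b > 0 real): $f(x) = \frac{1}{B(a,b)}\, g(x)\, G(x)^{a-1}\, \{1-G(x)\}^{b-1}$
Here G is the distribution function of g and B(a,b) is the beta function; the actual case corresponds to a = i, b = n-i+1.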
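A minimal numerical sketch of the generalised order statistic density, assuming the form f(x) = g(x) G(x)^(a-1) {1-G(x)}^(b-1) / B(a,b) and taking g to be the standard logistic purely for illustration. The function names are hypothetical, and the trapezoid rule is only a convenience for checking that the density integrates to one.

```python
import math

def logistic_cdf(x):
    """G(x) for the standard logistic, written to avoid overflow."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def logistic_sf(x):
    """1 - G(x), computed via symmetry so it stays positive in the right tail."""
    return logistic_cdf(-x)

def logistic_pdf(x):
    """g(x) = G(x) {1 - G(x)} for the standard logistic."""
    return logistic_cdf(x) * logistic_sf(x)

def beta_fn(a, b):
    """Beta function B(a, b) via log-gamma."""
    return math.exp(math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b))

def generalised_os_pdf(x, a, b):
    """Generalised order statistic density with real a, b > 0 (logistic g assumed)."""
    return (logistic_pdf(x) * logistic_cdf(x) ** (a - 1.0)
            * logistic_sf(x) ** (b - 1.0) / beta_fn(a, b))

def order_stat_pdf(x, i, n):
    """Density of the i-th order statistic of n draws from the logistic."""
    coef = math.factorial(n) / (math.factorial(i - 1) * math.factorial(n - i))
    return (coef * logistic_pdf(x) * logistic_cdf(x) ** (i - 1)
            * logistic_sf(x) ** (n - i))

def trapezoid(f, lo, hi, steps=20000):
    """Plain trapezoid rule on [lo, hi]; keep |lo|, |hi| moderate (e.g. 40)."""
    h = (hi - lo) / steps
    total = 0.5 * (f(lo) + f(hi))
    for k in range(1, steps):
        total += f(lo + k * h)
    return total * h
```

With a = b = 1 the function returns g itself, and with a = i, b = n-i+1 it agrees with the ordinary order statistic density.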
Roles of a and b
a = b = 1: f = g
a = b: family of symmetric distributions
a ≠ b: skew distributions
a controls left-hand tail weight, b controls right
the smaller a or b, the heavier the corresponding tail
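Properties of (Generalised) Order Statistic Distributions
Distribution function: $F(x) = I_{G(x)}(a,b)$, the regularised incomplete beta function evaluated at G(x)
Tail behaviour. For large x>0 (with α, β > 0):
– power tails: if $g(x) \sim x^{-(\alpha+1)}$ then $f(x) \sim x^{-(b\alpha+1)}$
– exponential tails: if $g(x) \sim e^{-\beta x}$ then $f(x) \sim e^{-b\beta x}$
Limiting distributions:
– a and b large: normal distribution
– one of a or b large: the appropriate extreme value distribution
Other properties such as moments and modality need to be examined on a case-by-case basis
For more, see Jones (2004, Test)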
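The right-tail behaviour can be read off the generalised order statistic density $f = g\,G^{a-1}(1-G)^{b-1}/B(a,b)$. The derivation below is a sketch up to multiplicative constants; the tail indices α and β are notation introduced here for illustration.

```latex
% As x -> +infinity, G(x) -> 1, so f(x) behaves like g(x) {1 - G(x)}^{b-1}.
\begin{align*}
\text{Power tails: } & g(x) \sim x^{-(\alpha+1)}
  \;\Rightarrow\; 1-G(x) \sim x^{-\alpha}
  \;\Rightarrow\; f(x) \sim x^{-(\alpha+1)}\, x^{-\alpha(b-1)} = x^{-(b\alpha+1)}, \\
\text{Exponential tails: } & g(x) \sim e^{-\beta x}
  \;\Rightarrow\; 1-G(x) \sim e^{-\beta x}
  \;\Rightarrow\; f(x) \sim e^{-\beta x}\, e^{-\beta(b-1)x} = e^{-b\beta x}.
\end{align*}
```

The left tail follows in the same way with a in place of b, which is why smaller a or b gives a heavier corresponding tail.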
Tractable Example 1: the Jones & Faddy (2003, JRSSB) skew t density. When a=b, this is the Student t density on 2a d.f.
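A small numerical check of the a = b reduction, assuming the skew t density has the form f(t) = C^{-1} {1 + t/(a+b+t^2)^{1/2}}^{a+1/2} {1 - t/(a+b+t^2)^{1/2}}^{b+1/2} with C = 2^{a+b-1} B(a,b) (a+b)^{1/2}. The function names are hypothetical and the sketch uses only the standard library.

```python
import math

def beta_fn(a, b):
    """Beta function B(a, b) via log-gamma."""
    return math.exp(math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b))

def skew_t_pdf(t, a, b):
    """Skew t density in the form assumed in the lead-in (a, b > 0)."""
    r = math.sqrt(a + b + t * t)
    c = 2.0 ** (a + b - 1.0) * beta_fn(a, b) * math.sqrt(a + b)
    return (1.0 + t / r) ** (a + 0.5) * (1.0 - t / r) ** (b + 0.5) / c

def student_t_pdf(t, nu):
    """Standard Student t density on nu degrees of freedom."""
    log_c = (math.lgamma(0.5 * (nu + 1.0)) - math.lgamma(0.5 * nu)
             - 0.5 * math.log(nu * math.pi))
    return math.exp(log_c) * (1.0 + t * t / nu) ** (-0.5 * (nu + 1.0))

# Compare the two densities at a few points when a = b = 1.5 (so 2a = 3 d.f.).
for t in (-2.0, -0.5, 0.0, 1.3, 4.0):
    print(t, skew_t_pdf(t, 1.5, 1.5), student_t_pdf(t, 3.0))
```

With a > b the density puts more weight in the right tail, in line with the roles of a and b described earlier.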
The simple exponential tail property is shared by:
– the log F distribution
– the asymmetric Laplace distribution
– the hyperbolic distribution
Is there a general form for such distributions?
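FAMILY 2: distributions with simple exponential tails
Starting point: simple symmetric g with distribution function G and integrated distribution function $G^{[2]}(x) = \int_{-\infty}^{x} G(t)\,dt$
General form for density is: $f(x) \propto \exp\{a x - (a+b)\, G^{[2]}(x)\}$
Special Cases
– G is point mass at zero, $G^{[2]}(x) = x\,I(x>0)$: f is asymmetric Laplace
– G is logistic, $G^{[2]}(x) = \log(1+e^{x})$: f is log F
– G is t_2, $G^{[2]}(x) = \tfrac12\{x+\sqrt{1+x^2}\}$: f is hyperbolic
– G is normal, $G^{[2]}(x) = x\Phi(x)+\phi(x)$
– G is uniform on (-1,1), $G^{[2]}(x) = \tfrac14(1+x)^2\, I(-1<x<1) + x\,I(x>1)$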
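The special cases can be checked numerically. The sketch below assumes the general form f(x) ∝ exp{a x - (a+b) G2(x)}, where G2 is the integral of the distribution function G from minus infinity to x; all function and dictionary names are hypothetical. It verifies that the logistic case has normalising constant B(a,b) (the log F density) and the point-mass case has constant 1/a + 1/b (the asymmetric Laplace).

```python
import math

def softplus(x):
    """log(1 + exp(x)), computed stably."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def std_normal_cdf(x):
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def std_normal_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

# Integrated distribution functions G2 for each special case.
G2 = {
    "point mass": lambda x: x if x > 0 else 0.0,
    "logistic": softplus,
    "t2": lambda x: 0.5 * (x + math.sqrt(1.0 + x * x)),
    "normal": lambda x: x * std_normal_cdf(x) + std_normal_pdf(x),
    "uniform": lambda x: 0.0 if x <= -1 else (0.25 * (1 + x) ** 2 if x < 1 else x),
}

def family2_kernel(x, a, b, g2):
    """Unnormalised Family 2 density exp{a x - (a + b) G2(x)}."""
    return math.exp(a * x - (a + b) * g2(x))

def normalising_constant(a, b, g2, lo=-60.0, hi=60.0, steps=60000):
    """Trapezoid-rule integral of the unnormalised density over [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.5 * (family2_kernel(lo, a, b, g2) + family2_kernel(hi, a, b, g2))
    for k in range(1, steps):
        total += family2_kernel(lo + k * h, a, b, g2)
    return total * h

def beta_fn(a, b):
    return math.exp(math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b))

a, b = 1.5, 0.8
print(normalising_constant(a, b, G2["logistic"]), beta_fn(a, b))
print(normalising_constant(a, b, G2["point mass"]), 1.0 / a + 1.0 / b)
```

In every case the left tail behaves like exp(a x) and the right tail like exp(-b x), which is the simple exponential tail property.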
Practical Point 1
the asymmetric Laplace is a three-parameter distribution; other members of the family have four
the fourth parameter is redundant in practice: (asymptotic) correlations between ML estimates of σ and either of a or b are very near 1
reason: σ, a and b are all scale parameters, yet you only need two such parameters to describe the main scale-related aspects of the distribution [either (i) a left-scale and a right-scale or (ii) an overall scale and a left-right comparer]
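Practical Point 2
Parametrise by μ, σ, a=1-p, b=p. Then the score equation for μ reads: $n^{-1}\sum_{i=1}^{n} G\{(\mu - X_i)/\sigma\} = p$
This is kernel quantile estimation, with kernel G and bandwidth σ: μ is the p-th quantile of the kernel-smoothed empirical distribution function of the data X_1, …, X_n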
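A minimal sketch of kernel quantile estimation in this sense: it solves n^{-1} Σ G((μ - X_i)/σ) = p for μ by bisection, assuming a logistic G and a bandwidth σ supplied by the user. The function names and the bisection bracket are illustrative choices, not part of the talk.

```python
import math

def logistic_cdf(z):
    """Standard logistic distribution function, written to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def smoothed_cdf(mu, data, sigma, G=logistic_cdf):
    """Kernel-smoothed empirical distribution function evaluated at mu."""
    return sum(G((mu - x) / sigma) for x in data) / len(data)

def kernel_quantile(data, p, sigma, G=logistic_cdf, tol=1e-10):
    """Solve smoothed_cdf(mu) = p for mu by bisection (0 < p < 1)."""
    lo = min(data) - 50.0 * sigma   # smoothed cdf is essentially 0 here
    hi = max(data) + 50.0 * sigma   # and essentially 1 here
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if smoothed_cdf(mid, data, sigma, G) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

data = [1.0, 2.0, 3.0, 4.0, 5.0]
print(kernel_quantile(data, 0.5, 1.0))    # smoothed median
print(kernel_quantile(data, 0.3, 0.01))   # small bandwidth: close to a sample quantile
```

As σ shrinks the estimate approaches an ordinary sample quantile, while larger σ smooths across neighbouring observations.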
This includes bandwidth selection, by choosing σ to solve the second score equation (the one for σ). But its simulation performance is variable.
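A sketch of where the two score equations come from, assuming the location-scale Family 2 density $f(x) \propto \sigma^{-1}\exp\{a z - (a+b)\,G^{[2]}(z)\}$ with $z = (x-\mu)/\sigma$, $a = 1-p$, $b = p$, and $G^{[2]}$ the integral of G. The symbols $X_i$, $z_i$ and $\ell$ are notation introduced for this sketch.

```latex
\begin{align*}
\ell(\mu,\sigma) &= -n\log\sigma + \sum_{i=1}^{n}\bigl\{(1-p)\,z_i - G^{[2]}(z_i)\bigr\} + \text{const},
  \qquad z_i = \frac{X_i-\mu}{\sigma}, \\
\frac{\partial\ell}{\partial\mu} &= \frac{1}{\sigma}\sum_{i=1}^{n}\bigl\{G(z_i) - (1-p)\bigr\} = 0
  \;\iff\; \frac{1}{n}\sum_{i=1}^{n} G\!\left(\frac{\mu - X_i}{\sigma}\right) = p, \\
\frac{\partial\ell}{\partial\sigma} &= -\frac{n}{\sigma} + \frac{1}{\sigma}\sum_{i=1}^{n} z_i\bigl\{G(z_i) - (1-p)\bigr\} = 0
  \;\iff\; \frac{1}{n}\sum_{i=1}^{n} z_i\bigl\{G(z_i) - (1-p)\bigr\} = 1.
\end{align*}
```

The first equivalence uses the symmetry $G(-z) = 1 - G(z)$; the second relies on $a + b = 1$ being fixed, so the normalising constant does not depend on σ.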
And so to Quantile Regression: the usual (regression) log-likelihood is kernel localised to the point x by
This (version of) DOUBLE KERNEL LOCAL LINEAR QUANTILE REGRESSION (DKLLQR) satisfies
Writing and
Contrast this with the Yu & Jones (1998, JASA) version of DKLLQR: where
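A rough sketch of a double kernel local linear quantile fit in this spirit. It is an assumption-laden illustration, not the talk's exact estimator: it assumes the localised log-likelihood weights each observation by a Gaussian kernel K((X_i - x)/h) in the horizontal direction, uses a logistic G with fixed vertical bandwidth σ, sets a = 1-p and b = p, and replaces μ by the local line α + β(X_i - x). All names are hypothetical, and a damped Newton iteration is just one convenient way to maximise the smooth objective.

```python
import math

def logistic_cdf(z):
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def softplus(z):
    """Integrated logistic distribution function, log(1 + exp(z))."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))

def local_linear_quantile(xs, ys, x0, p, h, sigma, iters=100, tol=1e-10):
    """Return (alpha, beta): local level and slope of the p-th quantile at x0."""
    d = [x - x0 for x in xs]
    w = [math.exp(-0.5 * (di / h) ** 2) for di in d]  # horizontal kernel weights
    alpha = sum(wi * yi for wi, yi in zip(w, ys)) / sum(w)
    beta = 0.0

    def objective(al, be):
        # Negative localised log-likelihood, dropping terms free of (al, be).
        total = 0.0
        for wi, di, yi in zip(w, d, ys):
            z = (yi - al - be * di) / sigma
            total += wi * (softplus(z) - (1.0 - p) * z)
        return total

    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for wi, di, yi in zip(w, d, ys):
            z = (yi - al_be(alpha, beta, di)) / sigma
            G = logistic_cdf(z)
            r = wi * (G - (1.0 - p)) / sigma
            c = wi * G * (1.0 - G) / sigma ** 2
            g0 -= r
            g1 -= r * di
            h00 += c
            h01 += c * di
            h11 += c * di * di
        det = h00 * h11 - h01 * h01
        if det <= 0.0:
            break
        s0 = -(h11 * g0 - h01 * g1) / det   # Newton direction -H^{-1} g
        s1 = -(-h01 * g0 + h00 * g1) / det
        step, q0 = 1.0, objective(alpha, beta)
        while step > 1e-8 and objective(alpha + step * s0, beta + step * s1) > q0:
            step *= 0.5                      # step halving keeps the fit a descent
        alpha += step * s0
        beta += step * s1
        if abs(step * s0) + abs(step * s1) < tol:
            break
    return alpha, beta

def al_be(alpha, beta, di):
    """Value of the local line alpha + beta * (X_i - x0)."""
    return alpha + beta * di

xs = [i / 20.0 for i in range(21)]
ys = [2.0 + 3.0 * x for x in xs]
print(local_linear_quantile(xs, ys, 0.5, 0.5, 0.2, 0.5))
```

On noise-free linear data with p = 0.5 the fit recovers the line itself; for other p the vertical smoothing shifts the level by an amount governed by σ.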
The vertical bandwidth σ=σ(x) can also be estimated by ML: solve the corresponding score equation for σ.
Compare 3 versions of DKLLQR:
– Yu & Jones (1998), including rule-of-thumb (r-o-t) σ and h;
– new version, including r-o-t σ and h;
– new version, including the above σ and r-o-t h.
Based on this limited evidence:
Clear recommendation:
– replace the Yu & Jones (1998) DKLLQR method by the (gently but consistently improved) new version
Unclear non-recommendation:
– use the new bandwidth selection?