Institute for Paper, Pulp and Fiber Technology 1 Verena Feirer Österreichische Statistiktage Two Distribution Families for Modelling Over- and Underdispersed.


1 Institute for Paper, Pulp and Fiber Technology. Verena Feirer. Österreichische Statistiktage, September 7th, 2011. Two Distribution Families for Modelling Over- and Underdispersed Binomial Frequencies. Feirer V., Hirn U., Friedl H., Bauer W., Institute for Paper, Pulp and Fiber Technology & Institute for Statistics, Graz University of Technology

2 Agenda: Motivation; Generalized Linear Models; Multiplicative Binomial Distribution; Double Binomial Distribution; Application of the Two Distributions; Summary

3 Motivation: consider the problem of successful ink transfer on paper; explain the occurrence of unprinted regions … part of a larger, industry-funded project at the IPZ. (No. of data points in sample: roughly 9 × 10^6; sample size: 3 × 6 mm²)

4 Predictor Variables: Topography; Formation … the way fibres are arranged

5 Response: true colour image

6 GENERALIZED LINEAR MODELS: Basics

7 Distribution of the Response: model for the response, here … part of the Exponential Family, with … the probability for successful ink transmission

8 The Generalized Linear Model*: the model for the response is linked to the mean through the linear predictor. Advantages over a linear model: the distribution of the relative frequencies is a member of the Exponential Family; the mean lies between 0 and 1. * Nelder & Wedderburn (1972). Generalized Linear Models. Journal of the Royal Statistical Society, 135.
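
A minimal sketch of how a link function keeps the mean between 0 and 1, assuming the logit link named later in the talk. The coefficient and predictor values are hypothetical and serve only as illustration; they are not taken from the printability data.

```python
import math

def logit(mu):
    """Link function: maps a mean in (0, 1) onto the real line."""
    return math.log(mu / (1.0 - mu))

def inv_logit(eta):
    """Inverse link: maps any linear-predictor value back into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-eta))

# Hypothetical coefficients and covariate values (illustration only).
beta = [-0.5, 1.2, 0.8]   # intercept, topography, formation
x = [1.0, 0.3, -1.1]      # leading 1.0 multiplies the intercept

eta = sum(b * v for b, v in zip(beta, x))  # linear predictor
mu = inv_logit(eta)                        # fitted probability, always in (0, 1)
print(eta, mu)
```

Whatever value the linear predictor takes, the inverse link returns a valid probability, which is what a plain linear model cannot guarantee.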

9 Model Deviance: Deviance = -2 × (maximized log-likelihood of considered model - maximized log-likelihood of saturated model); under certain regularity conditions, … a test for goodness-of-fit. Underdispersion: variance of data smaller than assumed by the model. Overdispersion: variance of data larger than assumed by the model.
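
The deviance definition above can be sketched for the classic binomial model as follows. This is an illustrative helper, not the authors' code; the function names are hypothetical, and the fitted probabilities are assumed to lie strictly between 0 and 1.

```python
import math

def xlogy(x, y):
    """x * log(y), using the convention 0 * log(0) = 0."""
    return 0.0 if x == 0 else x * math.log(y)

def binomial_deviance(y, n, p_hat):
    """-2 * (log-likelihood of fitted model - log-likelihood of saturated model).

    y:     observed success counts
    n:     number of trials per observation
    p_hat: fitted success probabilities (assumed strictly inside (0, 1))
    """
    dev = 0.0
    for yi, ni, pi in zip(y, n, p_hat):
        fitted = ni * pi
        dev += 2.0 * (xlogy(yi, yi / fitted)
                      + xlogy(ni - yi, (ni - yi) / (ni - fitted)))
    return dev

# A perfect fit has zero deviance; extreme counts under p = 0.5 do not.
print(binomial_deviance([18], [36], [0.5]))
print(binomial_deviance([0, 36], [36, 36], [0.5, 0.5]))
```

A common rule of thumb compares the deviance with the residual degrees of freedom: a ratio far above 1 points to overdispersion, a ratio far below 1 to underdispersion.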

10 Deviances of the Printability Datasets: distinct deviations from a binomial variance! (axis: few to many unprinted areas) … values from 11 different data sets

11 MULTIPLICATIVE BINOMIAL DISTRIBUTION: A Generalization of the Binomial Distribution

12 Definition: introduced by Altham* as a "multiplicative generalization of the binomial distribution". Altham considers litters of rabbits; animals within one litter are treated with the same dose of a certain drug; n … litter size; y … number of surviving animals; outcomes from animals within one litter are not mutually independent; Altham introduces an interaction parameter ω. * Altham (1978). Two Generalizations of the Binomial Distribution. Journal of the Royal Statistical Society, 27.

13 Properties: member of the 2-parameter Exponential Family; for ω = 1, it corresponds to the Binomial Distribution; for n = 1, it reduces to the Bernoulli distribution
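
The transcript does not preserve the density formula. As a sketch, the code below assumes the commonly cited form of Altham's multiplicative binomial, with probabilities proportional to C(n, y) p^y (1 - p)^(n - y) ω^(y(n - y)), and normalizes numerically. The function names and the symbol p for the probability parameter are choices made here, not taken from the slides.

```python
import math

def multiplicative_binomial_pmf(n, p, omega):
    """Return [P(Y=0), ..., P(Y=n)] under the assumed Altham form.

    Weights: C(n, y) * p^y * (1 - p)^(n - y) * omega^(y * (n - y)),
    divided by their sum so the probabilities add up to 1.
    """
    weights = [
        math.comb(n, y) * p**y * (1.0 - p)**(n - y) * omega**(y * (n - y))
        for y in range(n + 1)
    ]
    total = sum(weights)
    return [w / total for w in weights]

def pmf_variance(pmf):
    """Variance of a distribution given as a list of probabilities on 0..n."""
    mean = sum(y * q for y, q in enumerate(pmf))
    return sum((y - mean) ** 2 * q for y, q in enumerate(pmf))

# omega = 1 recovers the binomial; n = 1 gives a Bernoulli distribution.
print(pmf_variance(multiplicative_binomial_pmf(36, 0.5, 1.0)))
print(multiplicative_binomial_pmf(1, 0.8, 3.0))
```

Under this assumed form, the factor ω^(y(n - y)) vanishes from the weights when n = 1 because y(n - y) = 0 for y = 0 and y = 1, which is why the Bernoulli case does not depend on ω. For a symmetric case such as p = 0.5, values of ω below 1 widen the distribution and values above 1 narrow it. For large n and ω far from 1 the direct powers can overflow; working on the log scale avoids that.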

14 Comparison With Classic Binomial pdf: n = 36, probability parameter = 0.8; ω = 1 gives the classic binomial distribution

15 Comparison of the Variances: n = 36; ω = 1 gives the classic binomial distribution

16 Integration into GLM Context: log-likelihood function of distribution; the logit link keeps the probability parameter between 0 and 1; the log-linear link keeps ω > 0
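
A sketch of what such a two-link log-likelihood could look like, assuming the Altham-type weights C(n, y) p^y (1 - p)^(n - y) ω^(y(n - y)) with numerical normalization. All function and argument names are hypothetical; the slides do not show the authors' implementation.

```python
import math

def inv_logit(eta):
    """Inverse logit link: guarantees 0 < p < 1."""
    return 1.0 / (1.0 + math.exp(-eta))

def mult_binom_log_pmf(y, n, p, omega):
    """log P(Y = y) under the assumed multiplicative binomial form."""
    log_w = [
        math.log(math.comb(n, k)) + k * math.log(p)
        + (n - k) * math.log(1.0 - p) + k * (n - k) * math.log(omega)
        for k in range(n + 1)
    ]
    m = max(log_w)  # log-sum-exp trick for a stable normalizing constant
    log_norm = m + math.log(sum(math.exp(v - m) for v in log_w))
    return log_w[y] - log_norm

def log_likelihood(beta, gamma, X, Z, y, n):
    """beta: coefficients for logit(p); gamma: coefficients for log(omega).

    X, Z: lists of covariate rows, each starting with 1.0 for the intercept.
    """
    total = 0.0
    for x_row, z_row, yi, ni in zip(X, Z, y, n):
        p = inv_logit(sum(b * v for b, v in zip(beta, x_row)))      # logit link
        omega = math.exp(sum(g * v for g, v in zip(gamma, z_row)))  # log link
        total += mult_binom_log_pmf(yi, ni, p, omega)
    return total

# Intercept-only example: p = 0.5 and omega = 1, i.e. a plain binomial.
print(log_likelihood([0.0], [0.0], [[1.0]], [[1.0]], [18], [36]))
```

Because both parameters are reached through unconstrained coefficients, a general-purpose optimizer can maximize this function without box constraints.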

17 DOUBLE BINOMIAL DISTRIBUTION: A Second Generalization of the Binomial Distribution

18 Definition: introduced by Efron* as part of the Double Exponential Family; the second parameter allows variation of the variance: the variance is smaller than binomial if the second parameter lies between 0 and 1 and larger than binomial if it exceeds 1; a value of 1 gives the classic binomial distribution. * Efron (1986). Double Exponential Families and their Use in Generalized Linear Regression. Journal of the American Statistical Association, 81.
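
The density itself is not preserved in the transcript. The sketch below assumes Efron's double-exponential-family construction: weights proportional to g(y; mu)^theta × g(y; y/n)^(1 - theta), where g is the binomial probability function, normalized numerically. The names mu and theta are assumptions made here. In this assumed parameterization theta < 1 inflates and theta > 1 shrinks the variance, so the second parameter on the slide may be defined on a different scale (for example as a reciprocal).

```python
import math

def xlogy(x, y):
    """x * log(y), using the convention 0 * log(0) = 0."""
    return 0.0 if x == 0 else x * math.log(y)

def double_binomial_pmf(n, mu, theta):
    """Return [P(Y=0), ..., P(Y=n)] under the assumed Efron-type form."""
    log_w = []
    for y in range(n + 1):
        log_c = math.log(math.comb(n, y))
        # Binomial log-probability at the model mean mu ...
        log_g_mu = log_c + xlogy(y, mu) + xlogy(n - y, 1.0 - mu)
        # ... and at the saturated value y / n.
        log_g_sat = log_c + xlogy(y, y / n) + xlogy(n - y, (n - y) / n)
        log_w.append(theta * log_g_mu + (1.0 - theta) * log_g_sat)
    m = max(log_w)
    weights = [math.exp(v - m) for v in log_w]
    total = sum(weights)  # numerical normalization
    return [w / total for w in weights]

def pmf_variance(pmf):
    """Variance of a distribution given as a list of probabilities on 0..n."""
    mean = sum(y * q for y, q in enumerate(pmf))
    return sum((y - mean) ** 2 * q for y, q in enumerate(pmf))

for theta in (0.5, 1.0, 2.0):
    print(theta, pmf_variance(double_binomial_pmf(36, 0.8, theta)))
```

With theta = 1 the saturated factor drops out and the function returns the classic binomial probabilities.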

19 Comparison With Classic Binomial pdf: n = 36, probability parameter = 0.8; a second parameter equal to 1 gives the classic binomial distribution

20 Comparison of the Variances: n = 36; a second parameter equal to 1 gives the classic binomial distribution

21 Integration into GLM Context: member of the 2-parameter exponential family; log-likelihood function of distribution; the logit link keeps the probability parameter between 0 and 1; the log-linear link keeps the second parameter positive

22 AN APPLICATION: The Printability Dataset

23 Response and Explanatory Variables: occurrence of unprinted areas … ~ (explained by) … topography + formation

24 Comparison of Three Models (table): distributions classic binomial, multiplicative binomial, double binomial; rows DoF, DoF, AIC

25 Comparison of the Means

26 Comparison of the Means

27 Comparison of the Means: The second parameter influences the mean, too.
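
A small numerical illustration of this point for the multiplicative binomial, again assuming weights proportional to C(n, y) p^y (1 - p)^(n - y) ω^(y(n - y)). The function name and the ω values are chosen here for illustration only.

```python
import math

def multiplicative_binomial_mean(n, p, omega):
    """Mean of the assumed multiplicative binomial, computed by direct summation."""
    weights = [
        math.comb(n, y) * p**y * (1.0 - p)**(n - y) * omega**(y * (n - y))
        for y in range(n + 1)
    ]
    total = sum(weights)
    return sum(y * w for y, w in enumerate(weights)) / total

n, p = 36, 0.8
for omega in (0.98, 1.0, 1.02):
    # Only omega = 1 gives the binomial mean n * p.
    print(omega, round(multiplicative_binomial_mean(n, p, omega), 3))
```

Under this form, p equals the success probability only when ω = 1. Otherwise the mean moves away from n × p, so the two parameters cannot be read as a separate "location" and "dispersion" pair.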

28 Comparison of the Standard Deviations

29 Comparison of the Standard Deviations

30 Comparison of the Variances: the binomial Std. Dev. at n = 36 cannot be larger than 3; empirical Std. Deviations: up to 11; Multiplicative and Double Binomial Standard Deviations fit the empirical results much better
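
The bound of 3 follows from the binomial standard deviation sqrt(n p (1 - p)), which is largest at p = 0.5. A quick check, with a helper name chosen here for illustration:

```python
import math

def binomial_sd(n, p):
    """Standard deviation of a Binomial(n, p) count."""
    return math.sqrt(n * p * (1.0 - p))

n = 36
# Scan p over a fine grid; the maximum is attained at p = 0.5.
max_sd = max(binomial_sd(n, k / 1000) for k in range(1001))
print(max_sd)                # sqrt(36 * 0.25)
print(binomial_sd(n, 0.8))   # value at the probability used in the comparison slides
```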

31 Summary: Two generalizations of the binomial distribution might compensate for over- or underdispersion relative to the classic binomial distribution. Multiplicative Binomial Distribution (Altham, 1978): second parameter ω; in GLM context: model the probability parameter with the logistic link and ω with the log-linear link function

32 Summary 2: Double Binomial Distribution (Efron, 1986): second parameter; in GLM context: model the probability parameter with the logistic link and the second parameter with the log-linear link function

33 Thank You for Your Attention

