Published by Danna Muster. Modified over 9 years ago.
1
Two Distribution Families for Modelling Over- and Underdispersed Binomial Frequencies
Feirer V., Hirn U., Friedl H., Bauer W.
Institute for Paper, Pulp and Fiber Technology & Institute for Statistics, Graz University of Technology
Verena Feirer, Österreichische Statistiktage, September 7th, 2011
2
Agenda
- Motivation
- Generalized Linear Models
- Multiplicative Binomial Distribution
- Double Binomial Distribution
- Application of the Two Distributions
- Summary
3
Motivation
- consider the problem of successful ink transfer on paper
- explain the occurrence of unprinted regions
- part of a larger, industry-funded project at the IPZ
- number of datapoints in a sample: roughly 9 × 10^6; sample size: 3 × 6 mm²
4
Predictor Variables
- Topography
- Formation: the way fibres are arranged
5
Response
[Figure: true colour image]
6
GENERALIZED LINEAR MODELS
Basics
7
Distribution of the Response
- model for the response: here the binomial distribution
- part of the Exponential Family
- its parameter is the probability for successful ink transmission
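The formula on this slide did not survive in the transcript. As a reminder of why the binomial counts as a member of the Exponential Family, its probability function can be written in exponential form. The symbols y (number of successes), n (number of trials) and π (success probability) are assumptions chosen here, not taken from the slide.

```latex
f(y \mid n, \pi)
  = \binom{n}{y}\,\pi^{y}(1-\pi)^{n-y}
  = \binom{n}{y}\exp\!\left\{ y\,\log\frac{\pi}{1-\pi} + n\log(1-\pi) \right\},
  \qquad y = 0, 1, \dots, n .
```

The natural parameter log(π/(1-π)) is the logit of π, which is why the logit link appears later in the talk.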
8
the Generalized Linear Model*
- the linear predictor is linked to the mean by a link function
- advantages over a linear model: the distribution of the relative frequencies is a member of the Exponential Family; the mean lies between 0 and 1
* Nelder & Wedderburn (1972). Generalized Linear Models. Journal of the Royal Statistical Society, Series A, 135, 370-384
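A minimal sketch of how a link function keeps the mean between 0 and 1, assuming the logit link that is named later in the talk. The coefficient and covariate values are hypothetical and serve only as an illustration.

```python
import math

def logit(mu):
    """Link function g(mu) = log(mu / (1 - mu)), defined for 0 < mu < 1."""
    return math.log(mu / (1.0 - mu))

def inv_logit(eta):
    """Inverse link: maps any real linear predictor eta into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-eta))

def fitted_mean(beta, x):
    """Mean for one observation: eta = beta[0] + sum_j beta[j + 1] * x[j]."""
    eta = beta[0] + sum(b * xj for b, xj in zip(beta[1:], x))
    return inv_logit(eta)

# Hypothetical intercept and slopes for (topography, formation).
beta = [0.5, -1.2, 0.8]
x = [0.3, -0.1]
mu = fitted_mean(beta, x)
print(f"linear predictor = {logit(mu):.3f}, mean = {mu:.3f}")
```

Whatever values the linear predictor takes, the fitted mean stays strictly inside (0, 1), which a plain linear model cannot guarantee.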
9
Model Deviance
Deviance = -2 × (maximized log-likelihood of considered model - maximized log-likelihood of saturated model)
- under certain regularity conditions the deviance is approximately χ²-distributed with the residual degrees of freedom, which gives a test for goodness-of-fit
- if the deviance is much smaller than the degrees of freedom: Underdispersion. Variance of data smaller than assumed by the model
- if the deviance is much larger than the degrees of freedom: Overdispersion. Variance of data larger than assumed by the model
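The deviance definition above can be sketched for binomial counts as follows. The counts are made up for illustration, and comparing the deviance with its degrees of freedom is only the usual rule of thumb, not the authors' exact procedure.

```python
import math

def xlogy(a, b):
    """a * log(a / b) with the convention 0 * log(0) = 0."""
    return 0.0 if a == 0 else a * math.log(a / b)

def binomial_deviance(y, n, pi_hat):
    """-2 * (log-lik. of fitted model - log-lik. of saturated model)."""
    dev = 0.0
    for yi, ni, pi in zip(y, n, pi_hat):
        mu = ni * pi  # fitted count
        dev += 2.0 * (xlogy(yi, mu) + xlogy(ni - yi, ni - mu))
    return dev

# Made-up counts out of n = 36, fitted with one common probability.
y = [30, 12, 36, 20, 33]
n = [36] * len(y)
pi_hat = [sum(y) / sum(n)] * len(y)

dev = binomial_deviance(y, n, pi_hat)
df = len(y) - 1          # observations minus one fitted parameter
ratio = dev / df         # far above 1: overdispersion, far below 1: underdispersion
print(f"deviance = {dev:.2f} on {df} degrees of freedom")
```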
10
Deviances of the Printability Datasets
[Figure: deviance values from 11 different data sets, ordered from few to many unprinted areas]
- distinct deviations from a binomial variance!
11
MULTIPLICATIVE BINOMIAL DISTRIBUTION
A Generalization of the Binomial Distribution
12
Definition
- introduced by Altham* as "multiplicative generalization of the binomial distribution"
- Altham considers litters of rabbits; animals within one litter are treated with the same dose of a certain drug
- n: litter size, y: number of surviving animals
- outcomes from animals within one litter are not mutually independent
- Altham introduces an interaction parameter ω
* Altham (1978). Two Generalizations of the Binomial Distribution. Journal of the Royal Statistical Society, Series C (Applied Statistics), 27, 162-167
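The slide's formula is not preserved in the transcript. For orientation, the multiplicative binomial is usually stated as below, written here with the slide's interaction parameter ω. The symbol π for the probability parameter is an assumption, and the slide may have used a different but equivalent notation.

```latex
P(Y = y)
  = \frac{\binom{n}{y}\,\pi^{y}(1-\pi)^{n-y}\,\omega^{\,y(n-y)}}
         {\sum_{j=0}^{n}\binom{n}{j}\,\pi^{j}(1-\pi)^{n-j}\,\omega^{\,j(n-j)}},
  \qquad y = 0, 1, \dots, n, \quad 0 < \pi < 1, \quad \omega > 0 .
```

The factor ω^{y(n-y)} re-weights each outcome by the number of success-failure pairs among the n units, which is how dependence within a litter enters the model.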
13
Properties
- Member of the 2-parameter Exponential Family
- For ω=1, it corresponds to the Binomial Distribution
- For n=1, it reduces to the Bernoulli distribution
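The two special cases listed above can be checked numerically. This is a small sketch that assumes the usual form of the multiplicative binomial, with weights proportional to C(n, y) π^y (1-π)^(n-y) ω^(y(n-y)). The function names and the symbol π are illustrative choices, not taken from the slides.

```python
import math

def mult_binom_pmf(n, pi, omega):
    """Probabilities P(Y = y), y = 0..n, of the multiplicative binomial."""
    weights = [
        math.comb(n, y) * pi**y * (1.0 - pi)**(n - y) * omega**(y * (n - y))
        for y in range(n + 1)
    ]
    total = sum(weights)  # normalizing constant
    return [w / total for w in weights]

def binom_pmf(n, pi):
    """Classic binomial probabilities for comparison."""
    return [math.comb(n, y) * pi**y * (1.0 - pi)**(n - y) for y in range(n + 1)]

# omega = 1 reproduces the binomial probabilities.
p_mult = mult_binom_pmf(36, 0.8, 1.0)
p_bin = binom_pmf(36, 0.8)
max_diff = max(abs(a - b) for a, b in zip(p_mult, p_bin))

# n = 1: y * (n - y) is 0 for y = 0 and y = 1, so omega drops out (Bernoulli).
p_bern = mult_binom_pmf(1, 0.8, 2.5)
```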
14
Comparison With Classic Binomial pdf
[Figure: probability functions for n = 36 and probability parameter 0.8]
- ω=1 gives the classic binomial distribution
15
Comparison of the Variances
[Figure: variances for n = 36]
- ω=1 gives the classic binomial distribution
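A comparison of this kind can be sketched by direct summation, again assuming the usual form of the multiplicative binomial. Because ω also shifts the mean, each variance is compared with that of a binomial with the same mean. The ω values and the probability parameter 0.8 are illustrative and not read off the figure.

```python
import math

def mult_binom_moments(n, pi, omega):
    """Mean and variance of the multiplicative binomial, by direct summation."""
    w = [math.comb(n, y) * pi**y * (1.0 - pi)**(n - y) * omega**(y * (n - y))
         for y in range(n + 1)]
    total = sum(w)
    mean = sum(y * wy for y, wy in enumerate(w)) / total
    var = sum((y - mean)**2 * wy for y, wy in enumerate(w)) / total
    return mean, var

n, pi = 36, 0.8
results = {}
for omega in (0.97, 1.0, 1.03):
    mean, var = mult_binom_moments(n, pi, omega)
    binom_var = mean * (1.0 - mean / n)  # binomial variance at the same mean
    results[omega] = (mean, var, binom_var)
    print(f"omega={omega}: mean={mean:.2f}, variance={var:.2f}, "
          f"binomial variance at same mean={binom_var:.2f}")
```

In this sketch ω below 1 gives more variance than the matching binomial and ω above 1 gives less, while ω = 1 reproduces the binomial value n·π·(1-π).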
16
Integration into GLM Context
- log-likelihood function of the distribution
- probability parameter between 0 and 1: logit link
- ω > 0: log-linear link
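The log-likelihood on this slide is not preserved. The following is a minimal sketch of how the two links could enter a log-likelihood contribution, assuming the usual form of the multiplicative binomial. The argument names `eta_pi` and `eta_omega` are hypothetical linear predictors, and the authors' actual fitting code is not shown in the transcript.

```python
import math

def mult_binom_loglik(y, n, eta_pi, eta_omega):
    """Log-likelihood of one count y out of n.

    eta_pi    : linear predictor for the probability parameter (logit link)
    eta_omega : linear predictor for omega (log link)
    """
    pi = 1.0 / (1.0 + math.exp(-eta_pi))  # inverse logit keeps 0 < pi < 1
    log_omega = eta_omega                  # omega = exp(eta_omega) > 0

    def log_kernel(k):
        log_comb = (math.lgamma(n + 1) - math.lgamma(k + 1)
                    - math.lgamma(n - k + 1))
        return (log_comb + k * math.log(pi) + (n - k) * math.log(1.0 - pi)
                + k * (n - k) * log_omega)

    terms = [log_kernel(k) for k in range(n + 1)]
    m = max(terms)
    # log of the normalizing constant via log-sum-exp for numerical stability
    log_norm = m + math.log(sum(math.exp(t - m) for t in terms))
    return log_kernel(y) - log_norm

# With eta_omega = 0 (omega = 1) this is the ordinary binomial log-likelihood.
value = mult_binom_loglik(30, 36, math.log(4.0), 0.0)
```

Summing such contributions over all observations gives an objective that a general-purpose optimizer could maximize over the regression coefficients of both linear predictors.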
17
DOUBLE BINOMIAL DISTRIBUTION
A Second Generalization of the Binomial Distribution
18
Definition
- introduced by Efron* as part of the Double Exponential Family
- the second parameter allows variation of the variance: the variance is smaller than binomial if the second parameter lies between 0 and 1, and larger than binomial if it exceeds 1
- a second parameter equal to 1 gives the classic binomial distribution
* Efron (1986). Double Exponential Families and their Use in Generalized Linear Regression. Journal of the American Statistical Association, 81, 709-721
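For illustration, here is a sketch of the double binomial in Efron's θ form, normalized by direct summation instead of Efron's approximate constant. In this θ form, values below 1 spread the distribution out and values above 1 concentrate it. The slide describes its second parameter the other way round, so the two are presumably related by a reparameterization, which the transcript does not confirm. The names `mu` and `theta` are assumptions.

```python
import math

def klogp(k, p):
    """k * log(p) with the convention 0 * log(0) = 0."""
    return 0.0 if k == 0 else k * math.log(p)

def double_binom_pmf(n, mu, theta):
    """Double binomial probabilities on y = 0..n (Efron's theta form).

    Unnormalized weight:
      C(n, y) * [mu^y (1-mu)^(n-y)]^theta * [(y/n)^y (1-y/n)^(n-y)]^(1-theta)
    """
    logs = []
    for y in range(n + 1):
        log_w = (math.log(math.comb(n, y))
                 + theta * (klogp(y, mu) + klogp(n - y, 1.0 - mu))
                 + (1.0 - theta) * (klogp(y, y / n) + klogp(n - y, 1.0 - y / n)))
        logs.append(log_w)
    m = max(logs)
    w = [math.exp(v - m) for v in logs]
    total = sum(w)
    return [v / total for v in w]

def variance(p):
    """Variance of a distribution given as a list of probabilities on 0..n."""
    mean = sum(y * py for y, py in enumerate(p))
    return sum((y - mean)**2 * py for y, py in enumerate(p))

for theta in (0.5, 1.0, 2.0):  # illustrative values
    print(f"theta={theta}: variance={variance(double_binom_pmf(36, 0.8, theta)):.2f}")
```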
19
Comparison With Classic Binomial pdf
[Figure: probability functions for n = 36 and probability parameter 0.8]
- a second parameter equal to 1 gives the classic binomial distribution
20
Comparison of the Variances
[Figure: variances for n = 36]
- a second parameter equal to 1 gives the classic binomial distribution
21
Integration into GLM Context
- member of the 2-parameter exponential family
- log-likelihood function of the distribution
- probability parameter between 0 and 1: logit link
- second parameter > 0: log-linear link
22
AN APPLICATION
The Printability Dataset
23
Response and Explanatory Variables
occurrence of unprinted areas ~ topography + formation
("~" reads "explained by")
24
Comparison of Three Models
Distribution: classic binomial | multiplicative binomial | double binomial
17071845211632
DoF: 2483 | 2482
6614 | 5836 | 4117
DoF: 2481 | 2480
AIC: 6620 | 5845 | 4125
25
Comparison of the Means
26
Comparison of the Means
27
Comparison of the Means
- The second parameter influences the mean, too.
28
Comparison of the Standard Deviations
29
Comparison of the Standard Deviations
30
Comparison of the Variances
- binomial standard deviation at n=36: cannot be larger than 3
- empirical standard deviations: up to 11
- Multiplicative and Double Binomial standard deviations fit the empirical results much better
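The bound of 3 follows from the binomial variance n·π·(1-π), which is largest at π = 0.5, so the largest standard deviation at n = 36 is sqrt(36 · 0.25) = 3. A short check, with π as an assumed symbol for the success probability:

```python
import math

def binomial_sd(n, pi):
    """Standard deviation of a Binomial(n, pi) count."""
    return math.sqrt(n * pi * (1.0 - pi))

n = 36
# Scan a grid of probabilities; the maximum sits at pi = 0.5.
max_sd = max(binomial_sd(n, k / 1000) for k in range(1001))
print(max_sd)
```

An empirical standard deviation of up to 11 therefore cannot be reproduced by any binomial model with n = 36.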
31
Summary
Two generalizations of the binomial distribution can compensate for over- or underdispersion relative to the classic binomial distribution.
Multiplicative Binomial Distribution (Altham, 1978)
- second parameter ω
- in GLM context: model the probability parameter with the logistic link and ω with the log-linear link function
32
Summary 2
Double Binomial Distribution (Efron, 1986)
- has a second parameter as well
- in GLM context: model the probability parameter with the logistic link and the second parameter with the log-linear link function
33
Thank You for Your Attention