# SADC Course in Statistics: Modelling ideas in general – an appreciation (Session 20)


## Objectives

The aim of this session is to give you an appreciation of the approaches available for modelling variables that are not in the form of quantitative measurements.

## Contents

- A brief overview of modelling ideas in general.
- The emphasis is on the different analysis approaches that cater to the different types of response being modelled, and on an appreciation of the standard form of a model: data = pattern + residual.
- Presentation of the case study work will then follow.

## Steps in Modelling

- Exploratory stage
- Comparing competing models
- Fitting the chosen model
- Checking model assumptions
- Interpreting the model
- Presenting the results

Always aim for as simple a model as possible, but one that describes all the pattern.

## Statistical Models

data = pattern + residual

- data: e.g. yield in the paddy survey data.
- pattern: described by known or possible explanatory variables, e.g.
  - a linear relationship with the amount of fertiliser (a continuous variable);
  - yield that differs from variety to variety (a grouping variable).
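As an illustration of the decomposition, the following is a minimal sketch that fits a common fertiliser slope with a separate intercept for each variety and then splits each yield into pattern and residual. The data values, the variety labels and the function name `fit_parallel_lines` are invented for this example and do not come from the paddy survey.

```python
# Hypothetical paddy records: (variety, fertiliser amount, yield). Invented values.
data = [
    ("A", 0.0, 30.0), ("A", 1.0, 34.0), ("A", 2.0, 39.0), ("A", 3.0, 42.0),
    ("B", 0.0, 36.0), ("B", 1.0, 41.0), ("B", 2.0, 44.0), ("B", 3.0, 49.0),
]


def fit_parallel_lines(rows):
    """Least-squares fit of yield = intercept[variety] + slope * fertiliser."""
    groups = {}
    for variety, x, y in rows:
        groups.setdefault(variety, []).append((x, y))

    # Mean fertiliser and mean yield within each variety.
    means = {
        g: (sum(x for x, _ in pts) / len(pts), sum(y for _, y in pts) / len(pts))
        for g, pts in groups.items()
    }

    # Common slope from deviations about each variety's own means.
    sxy = sum((x - means[g][0]) * (y - means[g][1]) for g, x, y in rows)
    sxx = sum((x - means[g][0]) ** 2 for g, x, y in rows)
    slope = sxy / sxx

    # One intercept per variety (the grouping variable).
    intercepts = {g: my - slope * mx for g, (mx, my) in means.items()}
    return intercepts, slope


intercepts, slope = fit_parallel_lines(data)

# data = pattern + residual
pattern = [intercepts[g] + slope * x for g, x, _ in data]
residual = [y - p for (_, _, y), p in zip(data, pattern)]

for (g, x, y), p, r in zip(data, pattern, residual):
    print(f"variety {g}, fertiliser {x}: {y:.2f} = {p:.2f} + {r:+.2f}")
```

Here the pattern carries both explanatory variables (the slope for fertiliser and the intercepts for variety), and whatever is left over is the residual.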

## Statistical Models

data = pattern + residual, where the pattern is described by known or possible explanatory variables.

Use the residual component to check model assumptions, e.g. with plots of residuals, or a histogram for quantitative data (e.g. paddy yield). When the data are quantitative, the residuals are assumed to have a normal distribution with constant variance.
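The residual checks can be sketched roughly as below, using only the Python standard library. The fitted values and residuals are invented, and the two helpers (`histogram_counts`, `spread_ratio`) are hypothetical names for this illustration. They give a crude numerical stand-in for the plots one would normally inspect, not a formal test.

```python
import math
import statistics

# Hypothetical fitted values and residuals from some fitted model (invented numbers).
fitted = [30.0, 32.0, 34.0, 36.0, 38.0, 40.0, 42.0, 44.0, 46.0, 48.0]
residuals = [0.4, -0.9, 1.1, -0.3, 0.2, -1.2, 0.8, -0.5, 0.9, -0.5]


def histogram_counts(values, width=0.5):
    """Count values in bins of the given width, keyed by each bin's lower edge."""
    counts = {}
    for v in values:
        lower = math.floor(v / width) * width
        counts[lower] = counts.get(lower, 0) + 1
    return dict(sorted(counts.items()))


def spread_ratio(fitted_values, resid):
    """SD of residuals at the larger fitted values divided by SD at the smaller ones.

    A ratio far from 1 hints that the variance is not constant.
    """
    pairs = sorted(zip(fitted_values, resid))
    half = len(pairs) // 2
    low = [r for _, r in pairs[:half]]
    high = [r for _, r in pairs[half:]]
    return statistics.stdev(high) / statistics.stdev(low)


# Text histogram: roughly symmetric about zero is what a normal assumption suggests.
for lower, count in histogram_counts(residuals).items():
    print(f"{lower:+.1f} {'*' * count}")

print("spread ratio:", round(spread_ratio(fitted, residuals), 2))
```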

## Statistical Models

data = pattern + residual, where the pattern is described by known or possible explanatory variables.

The residual component needs further consideration when the data have a hierarchical structure, e.g.

- plants within pots, and leaves within plants;
- households, and individuals within households.

Different levels of variation require moving to more advanced procedures, such as multilevel modelling, with more than one residual.
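As a sketch of what "more than one residual" can look like, consider a two-level case with individuals $j$ within households $i$. The notation and the normality assumptions below are the conventional ones for a simple random-intercept model, and are assumptions of this illustration rather than something stated in the slides:

$$
y_{ij} = \text{pattern}_{ij} + u_i + e_{ij}, \qquad u_i \sim N(0, \sigma_u^2), \quad e_{ij} \sim N(0, \sigma_e^2)
$$

Here $u_i$ is the household-level residual, shared by everyone in household $i$, and $e_{ij}$ is the individual-level residual.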

## Generalised Linear Models

data = pattern + residual

Not all data are quantitative measurements: often the interest is in proportions (or percentages), or the response may be in the form of counts. Such data call for more modern methods, i.e. generalised linear models. In these models the residuals have non-normal distributions.
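For orientation, the usual textbook choices for these two response types are sketched below. The slides do not name the distributions or link functions, so the binomial and Poisson forms here are the conventional defaults, given as an assumption:

$$
\text{proportions: } y \sim \text{Binomial}(n, p), \quad \log\frac{p}{1-p} = \text{pattern}
$$

$$
\text{counts: } y \sim \text{Poisson}(\mu), \quad \log\mu = \text{pattern}
$$

In both cases the pattern is still built from the explanatory variables, but it describes a transformed mean rather than the mean itself.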

## Logistic/Log-linear Models

data = pattern + residual

- Logistic modelling is used for binary data, i.e. a response with only two categories. The quantity being modelled is the log odds of getting response = yes.
- Log-linear modelling is suitable for categorical data with more than two categories.
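To make the log odds concrete, here is a minimal sketch with invented counts for a yes/no response in two groups. With a single two-level explanatory variable, the logistic model's intercept is the log odds in the baseline group and its coefficient is the log odds ratio, so both can be computed directly from the table. The group names and counts are hypothetical.

```python
import math

# Hypothetical counts of a yes/no response in two groups (invented numbers).
counts = {
    "control": {"yes": 20, "no": 80},
    "treated": {"yes": 40, "no": 60},
}


def log_odds(yes, no):
    """Log odds of getting response = yes."""
    return math.log(yes / no)


def inverse_logit(eta):
    """Convert a log odds value back to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


# Baseline log odds (control group) and the shift for the treated group.
intercept = log_odds(**counts["control"])
effect = log_odds(**counts["treated"]) - intercept

print("log odds, control:", round(intercept, 3))
print("log odds ratio, treated vs control:", round(effect, 3))
print("P(yes | control):", round(inverse_logit(intercept), 3))
print("P(yes | treated):", round(inverse_logit(intercept + effect), 3))
```

Working on the log odds scale keeps the pattern unbounded, while the back-transformed probabilities always stay between 0 and 1.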

## In summary

The appropriate model depends on the data type of your key response measurement.

- For quantitative measurements, use standard regression/ANOVA-type models.
  - A normal distribution is assumed.
  - If the data are skewed, consider a transformation.
- For binary data, use logistic regression models.
- For categorical variates (more than two categories), use Poisson regression or log-linear models.
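The point about transforming skewed data can be illustrated with a small sketch. The values are invented, and the log transformation is only one common choice, assumed here for the example. The skewness measure is the simple moment-based coefficient.

```python
import math

# Hypothetical right-skewed quantitative measurements (invented values).
raw = [1.0, 2.0, 2.0, 3.0, 3.0, 4.0, 5.0, 8.0, 15.0, 40.0]


def skewness(values):
    """Moment-based skewness: third central moment over the 1.5 power of the second."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Log transformation (requires strictly positive data).
logged = [math.log(v) for v in raw]

print("skewness before:", round(skewness(raw), 2))
print("skewness after log:", round(skewness(logged), 2))
```

A skewness closer to zero after transformation suggests the normal-distribution assumption is more reasonable on the transformed scale. The residuals from the fitted model should still be checked.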

## Case Study

Presentation will follow.