Mark Tranmer Cathie Marsh Centre for Census and Survey Research Multilevel models for combining macro and micro data Unit 5
Introduction We will see how the multilevel model provides a framework for combining individual level survey data with aggregate group level data. We illustrate this with an example where individual level data from the European Social Survey are combined with country level data from the Eurostat New Cronos data. The dependent variable in our example is voter turnout: whether or not the respondent voted in the most recent election in their country of residence.
Learning objectives (1) To introduce the idea of multilevel modelling. To explain why multilevel modelling is useful when linking macro and micro data. To present the kinds of substantive research questions that can be answered using this approach. To outline software that permits multilevel models to be fitted.
Learning objectives (2) To give an example of linking micro and macro data in the multilevel model framework by combining the ESS micro data with country-level macro data from Eurostat New Cronos. To briefly outline the various multilevel models in this context. To explain how interactions between aggregate (macro) and individual level (micro) measures work in these models and why they might answer important substantive research questions.
Levels of analysis and inference Traditional regression models are used to carry out an analysis at a single level, such as the individual (person) level with individual level data, or the group (country) level with aggregate data. If we do an individual level analysis we can make individual level inferences but, without group level information, such inferences may be made out of the context in which the processes occur. This is sometimes referred to as the atomistic fallacy. Ideally we want to do the analysis in context.
Levels of analysis and inference We could also do a group (country level) analysis, for example relating the % voting in each European country to the unemployment rate in that country. This would tell us whether countries with higher unemployment tended to have higher (or lower) levels of voter turnout. But it would not tell us whether unemployed people were more (or less) likely to vote than employed people. To make such an inference about individuals from a group level analysis would be an example of the ecological fallacy. In general the results of analyses carried out at the group level do not apply at the individual level.
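The ecological fallacy can be illustrated with a small numerical sketch. The counts below are invented purely for illustration (they are not ESS or Eurostat figures), and the names `countries`, `pearson` and so on are hypothetical. Across the three made-up countries, higher unemployment goes with higher turnout, yet among individuals the unemployed are less likely to vote than the employed.

```python
# Invented counts per country (NOT real data):
# (n_employed, employed_voters, n_unemployed, unemployed_voters)
countries = {
    "A": (90, 54, 10, 4),
    "B": (80, 60, 20, 10),
    "C": (70, 63, 30, 21),
}

def pearson(xs, ys):
    """Pearson correlation of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

# Group (country) level: unemployment rate and overall turnout per country.
unemp_rate = [nu / (ne + nu) for ne, ve, nu, vu in countries.values()]
turnout = [(ve + vu) / (ne + nu) for ne, ve, nu, vu in countries.values()]
group_corr = pearson(unemp_rate, turnout)

# Individual level: turnout among employed and unemployed people, pooled.
employed_turnout = (sum(c[1] for c in countries.values())
                    / sum(c[0] for c in countries.values()))
unemployed_turnout = (sum(c[3] for c in countries.values())
                      / sum(c[2] for c in countries.values()))

print(f"country-level correlation: {group_corr:.3f}")
print(f"turnout, employed:   {employed_turnout:.3f}")
print(f"turnout, unemployed: {unemployed_turnout:.3f}")
```

In this constructed example the country-level association is strongly positive while the individual-level difference points the other way, which is why the group-level result cannot be read as a statement about individuals.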
Multilevel models Multilevel models allow us to consider the individual level and the group level in the same analysis, rather than having to choose one or the other. For example, we can consider the individual and the country level in the same analysis. An alternative is to include dummy variables for each of the groups (i.e. the countries) in the analysis: a so-called fixed effects approach. However, multilevel models have several advantages over this approach:
Multilevel models 1. They provide an ideal framework for combining data from several sources, such as individual level survey data (micro data) and country level aggregate data (macro data). 2. They allow sophisticated hypotheses to be tested without the need to add a lot of extra variables and interactions to the model. E.g. it is relatively straightforward to consider a research question such as: is the association of age with voter turnout stronger in some countries than in others?
Multilevel modelling framework The current example involves individual level micro data from the European Social Survey and country level aggregate macro data from the Eurostat New Cronos. There are three broad ways of fitting multilevel models for voter turnout with these data:
Multilevel modelling framework 1. Models that involve the micro data only. 2. Models that combine micro data and macro data and assess the additional contribution of the variables from the macro dataset in explaining variations in voter turnout. 3. Models that interact variables from the micro data and the macro data, such as whether or not someone is unemployed (micro data) with the % long term unemployed in the country (macro data).
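Before models of types 2 and 3 can be fitted, each respondent's record needs the macro variables for their country attached to it. The following is a minimal sketch of that linking step in plain Python; the records, the values and the variable names (`voted`, `unemployed`, `pct_long_term_unemp`, `unemp_x_ltu`) are all hypothetical and are not taken from the ESS or New Cronos files.

```python
# Hypothetical micro data: one record per survey respondent.
micro = [
    {"country": "GB", "voted": 1, "unemployed": 0},
    {"country": "GB", "voted": 0, "unemployed": 1},
    {"country": "ES", "voted": 1, "unemployed": 1},
]

# Hypothetical macro data: one record per country (made-up values).
macro = {
    "GB": {"pct_long_term_unemp": 1.5},
    "ES": {"pct_long_term_unemp": 4.0},
}

def combine(micro_rows, macro_by_country):
    """Attach country-level variables to each person-level record and
    build the micro-by-macro (cross-level) interaction term."""
    combined = []
    for row in micro_rows:
        merged = {**row, **macro_by_country[row["country"]]}
        merged["unemp_x_ltu"] = (merged["unemployed"]
                                 * merged["pct_long_term_unemp"])
        combined.append(merged)
    return combined

combined = combine(micro, macro)
for record in combined:
    print(record)
```

Every respondent in the same country receives the same macro value, so the macro variable varies only between countries, while the interaction term varies within countries as well.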
Multilevel modelling software We will use software called MLwiN. Although SPSS can be used for multilevel modelling to some extent, MLwiN is more flexible and has better graphics. More details are available on the MLwiN website. MLwiN is being made free to academics.
Part 1: multilevel models and ESS micro data
Modelling approaches: theory Model 1: Single level model – e.g. predicting chance of voting with age
Modelling approaches: theory Model 1: Single level model
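The equation on this slide is not preserved in the transcript. A standard way of writing a single level logistic model for this example is sketched below; the notation is an assumption, with $y_{ij}$ indicating whether person $i$ in country $j$ voted, $\pi_{ij}$ the probability of voting and $x_{ij}$ the person's age. The country index $j$ plays no role in this model.

```latex
\[
y_{ij} \sim \mathrm{Bernoulli}(\pi_{ij}), \qquad
\mathrm{logit}(\pi_{ij})
  = \log\!\left(\frac{\pi_{ij}}{1 - \pi_{ij}}\right)
  = \beta_0 + \beta_1 x_{ij}
\]
```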
Modelling approaches: theory Model 2: null model (multilevel) – getting a sense of where the variation in voter turnout is: between people or between countries
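In assumed notation ($\pi_{ij}$ is the probability that person $i$ in country $j$ voted), the null multilevel logistic model can be sketched as follows. The second line gives one common summary of the share of variation lying between countries, based on the latent variable formulation, in which the person level variance on the logit scale is fixed at about 3.29 (the variance of the standard logistic distribution). This summary is a conventional choice and is not necessarily the one used on the original slide.

```latex
\[
\mathrm{logit}(\pi_{ij}) = \beta_0 + u_{0j}, \qquad
u_{0j} \sim N(0, \sigma^2_{u0})
\]
\[
\text{share of variance between countries} \approx
\frac{\sigma^2_{u0}}{\sigma^2_{u0} + 3.29}
\]
```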
Modelling approaches: theory Model 3: Multilevel Model with varying intercepts. Relating age to voting and allowing overall turnout to be higher/lower in each European country.
Modelling approaches: theory Model 3: Multilevel Model with varying intercepts
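The equation on this slide is not preserved in the transcript. In assumed notation, with $x_{ij}$ the age of person $i$ in country $j$, a varying intercepts model can be sketched as below. The country effect $u_{0j}$ shifts the overall level of turnout up or down in country $j$, while the age coefficient $\beta_1$ is the same in every country.

```latex
\[
\mathrm{logit}(\pi_{ij}) = \beta_0 + \beta_1 x_{ij} + u_{0j}, \qquad
u_{0j} \sim N(0, \sigma^2_{u0})
\]
```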
Modelling approaches: theory Model 4: Multilevel Model with varying intercepts and slopes – relationship of age with voting can be stronger/weaker in each country
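In the same assumed notation, a model with varying intercepts and slopes can be sketched as follows. The age coefficient in country $j$ is $\beta_1 + u_{1j}$, so the association of age with voting can be stronger or weaker than average in each country, and the covariance $\sigma_{u01}$ allows intercepts and slopes to be related.

```latex
\[
\mathrm{logit}(\pi_{ij}) = \beta_0 + u_{0j} + (\beta_1 + u_{1j})\, x_{ij}
\]
\[
\begin{pmatrix} u_{0j} \\ u_{1j} \end{pmatrix}
\sim N\!\left(
\begin{pmatrix} 0 \\ 0 \end{pmatrix},
\begin{pmatrix}
\sigma^2_{u0} & \sigma_{u01} \\
\sigma_{u01} & \sigma^2_{u1}
\end{pmatrix}
\right)
\]
```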
Model 4: graphical representations
Using MLwiN to read in the data and set up the binomial model We will set up a binomial model in MLwiN and estimate some multilevel models (models 2-4) using the ESS micro data only. We will use an MLwiN worksheet called Lmmd6.ws.
Using MLwiN to read in the data and set up the binomial model Open MLwiN by locating it in the programmes listed in the Windows start menu or by clicking on the MLwiN icon on your desktop. The default worksheet size for this exercise is 5000 cells, which is too small to permit the analysis. However, it is easy to increase the worksheet size. To do this, go to options and increase the number of worksheet cells from 5000. NB: Do not save the worksheet when prompted. Now choose data manipulation > names.
Setting up the model in MLwiN
Null model (model 2) is now set up
Model 2 results
Model 3: results – add cent_age to model by clicking on add term
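The variable name cent_age suggests that age has been centred before entering the model, although the transcript does not say how. A minimal sketch, assuming centring around the overall (grand) mean and using made-up ages:

```python
from statistics import mean

# Made-up ages for illustration only (not ESS data).
ages = [20, 30, 40, 50, 60]

# Assumption: cent_age is age minus the overall (grand) mean age.
grand_mean = mean(ages)
cent_age = [age - grand_mean for age in ages]

print(grand_mean)
print(cent_age)
```

With age centred in this way, the intercept refers to a person of average age rather than to a person aged zero, which makes it easier to interpret.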
Model 4: set up
Model 4: results
Part 2: combining macro and micro data in multilevel models
Combining data in multilevel models: model 5 – Main effects
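The model specification on this slide is not preserved in the transcript. As a sketch under assumed notation, a main effects model that adds one micro variable and one macro variable to a varying intercepts model could be written as below, where $x_{ij}$ is age, $w_{ij}$ indicates whether person $i$ in country $j$ is unemployed, and $z_j$ is the % long term unemployed in country $j$. The actual variables used in model 5 may differ.

```latex
\[
\mathrm{logit}(\pi_{ij})
  = \beta_0 + \beta_1 x_{ij} + \beta_2 w_{ij} + \beta_3 z_j + u_{0j},
\qquad u_{0j} \sim N(0, \sigma^2_{u0})
\]
```

Because $z_j$ takes the same value for everyone in a country, $\beta_3$ can only explain variation between countries, that is, part of $\sigma^2_{u0}$.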
Combining data in multilevel models: model 6 – interactions
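The model specification on this slide is not preserved in the transcript. As a sketch under assumed notation ($x_{ij}$ age, $w_{ij}$ an indicator for being unemployed, $z_j$ the % long term unemployed in country $j$), a model with an interaction between a micro and a macro variable could be written as:

```latex
\[
\mathrm{logit}(\pi_{ij})
  = \beta_0 + \beta_1 x_{ij} + \beta_2 w_{ij} + \beta_3 z_j
  + \beta_4\, w_{ij} z_j + u_{0j},
\qquad u_{0j} \sim N(0, \sigma^2_{u0})
\]
```

On the logit scale the difference between unemployed and employed people in country $j$ is then $\beta_2 + \beta_4 z_j$, so $\beta_4$ indicates whether the association of being unemployed with voting depends on the level of long term unemployment in the country.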
Model 5 main effects: results
Model 6 Interactions: results
Summary: what you have learnt in this session 1. The multilevel model is an extremely useful framework for combining macro and micro data. 2. Multilevel logistic regression models can be used for an outcome with two categories, such as voter turnout. 3. We can fit a series of models to explore the nature and extent of individual and country level variations in voter turnout. We can use software such as MLwiN to do this.
Summary: what you have learnt in this session 4. We can estimate multilevel models with ESS micro data only. 5. We can then combine micro and macro data by adding variables from Eurostat New Cronos to the model. 6. Finally, we can also interact individual level ESS variables with country level variables from the New Cronos data.