Mark Tranmer Cathie Marsh Centre for Census and Survey Research Multilevel models for combining macro and micro data Unit 5.

Introduction We will see how the multilevel model provides a framework for combining individual level survey data with aggregate group level data. We illustrate this with an example in which individual level data from the European Social Survey (ESS) are combined with country level data from Eurostat New Cronos. The dependent variable in our example is voter turnout: whether the respondent voted in the most recent election in their country of residence.

Learning objectives (1) To introduce the idea of multilevel modelling. To explain why multilevel modelling is useful when linking macro and micro data. To present the kinds of substantive research questions that can be answered using this approach. To outline software that permits multilevel models to be fitted.

Learning objectives (2) To give an example of linking micro and macro data in the multilevel model framework by combining the ESS micro data with country-level macro data from Eurostat New Cronos. To briefly outline the various multilevel models in this context. To explain how interactions between aggregate (macro) and individual level (micro) measures work in these models and why they might answer important substantive research questions.

Levels of analysis and inference Traditional regression models are used to carry out an analysis at a single level, such as the individual (person) level with individual level data, or the group (country) level with aggregate data. If we do an individual level analysis we can make individual level inferences but, without group level information, such inferences may be made out of the context in which the processes occur. This is sometimes referred to as the atomistic fallacy. Ideally we want to do the analysis in context.

Levels of analysis and inference We could also do a group (country level) analysis, for example relating the % voting in each European country to the unemployment rate in that country. This would tell us whether countries with higher unemployment tended to have higher (or lower) levels of voter turnout. But it wouldn't tell us whether unemployed people were more (or less) likely to vote than employed people. To make such an inference about individuals from a group level analysis would be an example of the ecological fallacy. In general, the results of analyses carried out at the group level do not apply at the individual level.
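The ecological fallacy can be illustrated with a small sketch. The counts below are invented purely for illustration (they are not ESS or Eurostat figures), and the three country labels are hypothetical. They are chosen so that turnout rises with the unemployment rate across countries, while within every country unemployed people are less likely to vote than employed people.

```python
# Invented counts per hypothetical country:
# (employed who voted, employed total, unemployed who voted, unemployed total)
countries = {
    "A": (57, 95, 2, 5),
    "B": (63, 90, 5, 10),
    "C": (64, 80, 12, 20),
}


def country_summary(counts):
    """Group-level view: (unemployment rate, overall turnout) for one country."""
    emp_voted, emp_total, unemp_voted, unemp_total = counts
    n = emp_total + unemp_total
    return unemp_total / n, (emp_voted + unemp_voted) / n


def individual_gap(counts):
    """Individual-level view: turnout of unemployed minus turnout of employed."""
    emp_voted, emp_total, unemp_voted, unemp_total = counts
    return unemp_voted / unemp_total - emp_voted / emp_total


group_level = {name: country_summary(c) for name, c in countries.items()}
gaps = {name: individual_gap(c) for name, c in countries.items()}

for name in countries:
    rate, turnout = group_level[name]
    print(f"{name}: unemployment={rate:.2f} turnout={turnout:.2f} gap={gaps[name]:+.2f}")
```

In this toy data a country-level analysis would suggest that unemployment goes with higher turnout, whereas the within-country gap is negative everywhere, which is exactly the inference a group-level analysis cannot support.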

Multilevel models Multilevel models allow us to consider the individual level and the group level in the same analysis, rather than having to choose one or the other. For example, we can consider the individual and the country level in the same analysis. An alternative is to include dummy variables for each of the groups (i.e. countries) in the analysis: a so-called fixed effects approach. However, multilevel models have several advantages over this approach:

Multilevel models 1. They provide an ideal framework for combining data from several sources, such as individual level survey data (micro data) and country level aggregate data (macro data). 2. They allow sophisticated hypotheses to be tested without the need to add a lot of extra variables and interactions to the model. E.g. it is relatively straightforward to consider a research question such as: is the association of age with voter turnout stronger in some countries than in others?

Multilevel modelling framework The current example involves individual level micro data from the European Social Survey and country level aggregate macro data from Eurostat New Cronos. There are three broad ways of fitting multilevel models for voter turnout with these data:

Multilevel modelling framework 1. Models that involve the micro data only. 2. Models that combine micro data and macro data and assess the additional impact of the variables from the macro dataset in explaining variations in voter turnout. 3. Models that interact variables from the micro data and macro data, such as whether or not someone is unemployed (micro data) with the % long term unemployed in the country (macro data).
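The data preparation behind approaches 2 and 3 can be sketched as follows. This is a minimal illustration, not the procedure used to build the course worksheet: the field names (`country`, `unemployed`, `voted`, `pct_long_term_unemp`), the country codes and all values are hypothetical. Each micro record is given the macro value for its country, and the cross-level interaction is the product of the individual and country variables.

```python
# Hypothetical micro data: one record per survey respondent.
micro = [
    {"country": "AA", "unemployed": 1, "voted": 0},
    {"country": "AA", "unemployed": 0, "voted": 1},
    {"country": "BB", "unemployed": 1, "voted": 1},
]

# Hypothetical macro data: one record per country.
macro = {
    "AA": {"pct_long_term_unemp": 2.0},
    "BB": {"pct_long_term_unemp": 5.5},
}


def combine(micro_rows, macro_by_country):
    """Attach country-level values to each respondent and build the interaction."""
    combined = []
    for row in micro_rows:
        context = macro_by_country[row["country"]]
        merged = {**row, **context}
        # Cross-level interaction: individual status x country-level percentage.
        merged["unemp_x_pct_ltu"] = row["unemployed"] * context["pct_long_term_unemp"]
        combined.append(merged)
    return combined


combined = combine(micro, macro)
for record in combined:
    print(record)
```

Every respondent in the same country receives the same macro value, which is why the country has to be treated as a level in the model rather than as if each row carried independent information.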

Multilevel modelling software We will use software called MLwiN. Although SPSS can be used for multilevel modelling to some extent, MLwiN is more flexible and has better graphics. More details of MLwiN at […]. MLwiN is being made free to academics.

Part 1: multilevel models and ESS micro data

Modelling approaches: theory Model 1: Single level model – e.g. predicting chance of voting with age

Modelling approaches: theory Model 1: Single level model
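The equation on this slide is not preserved in the transcript. As a sketch, assuming the usual logistic form for a binary outcome (the notation below is an assumption, not the slide's own), a single-level model predicting the chance of voting from age can be written as:

```latex
% pi_i   : probability that individual i voted
% age_i  : age of individual i
\operatorname{logit}(\pi_i) = \log\!\left(\frac{\pi_i}{1-\pi_i}\right)
  = \beta_0 + \beta_1\,\mathrm{age}_i
```

Here country membership plays no part: every individual shares the same intercept and slope.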

Modelling approaches: theory Model 2: null model (multilevel) – getting a sense of where the variation in voter turnout is: between people or between countries
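A sketch of the null model in standard notation (assumed here, since the slide's formula is not preserved), with individuals i nested in countries j:

```latex
% pi_ij : probability that individual i in country j voted
% u_0j  : country-level random effect
\operatorname{logit}(\pi_{ij}) = \beta_0 + u_{0j},
\qquad u_{0j} \sim N\!\left(0,\ \sigma^2_{u0}\right)

% One common summary of "where the variation is", using the latent-variable
% formulation in which the individual-level variance is fixed at pi^2/3:
\mathrm{VPC} = \frac{\sigma^2_{u0}}{\sigma^2_{u0} + \pi^2/3}
```

In the second expression π is the mathematical constant, not the voting probability. Other definitions of the variance partition coefficient exist for binary responses, so this one should be read as one option among several.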

Modelling approaches: theory Model 3: Multilevel Model with varying intercepts. Relating age to voting and allowing overall turnout to be higher/lower in each European country.

Modelling approaches: theory Model 3: Multilevel Model with varying intercepts
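In the same assumed notation, the varying intercepts model adds age to the null model while keeping a single age slope for all countries:

```latex
\operatorname{logit}(\pi_{ij}) = \beta_0 + \beta_1\,\mathrm{age}_{ij} + u_{0j},
\qquad u_{0j} \sim N\!\left(0,\ \sigma^2_{u0}\right)
```

The term u_{0j} shifts the whole age-turnout line up or down for country j, so overall turnout can be higher or lower in each country.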

Modelling approaches: theory Model 4: Multilevel Model with varying intercepts and slopes – relationship of age with voting can be stronger/weaker in each country
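Continuing the assumed notation, the varying intercepts and slopes model gives each country its own departure from the average age slope:

```latex
\operatorname{logit}(\pi_{ij}) = \beta_0 + \beta_1\,\mathrm{age}_{ij}
  + u_{0j} + u_{1j}\,\mathrm{age}_{ij},
\qquad
\begin{pmatrix} u_{0j} \\ u_{1j} \end{pmatrix}
\sim N\!\left(
  \begin{pmatrix} 0 \\ 0 \end{pmatrix},
  \begin{pmatrix} \sigma^2_{u0} & \sigma_{u01} \\ \sigma_{u01} & \sigma^2_{u1} \end{pmatrix}
\right)
```

The slope for country j is β_1 + u_{1j}; the covariance σ_{u01} describes whether countries with higher intercepts tend to have steeper or shallower age slopes.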

Model 4: graphical representations

Using MLwiN to read in the data and set up the binomial model We will set up a binomial model in MLwiN and estimate some multilevel models (models 2-4) using the ESS micro data only. We will use an MLwiN worksheet called Lmmd6.ws.

Using MLwiN to read in the data and set up the binomial model Open MLwiN by locating it in the programmes listed in the Windows start menu or by clicking on the MLwiN icon on your desktop. The default worksheet size for this exercise is 5000 cells, which is too small to permit the analysis. However, it is easy to increase the worksheet size: go to options and increase the number of worksheet cells from the default of 5000. NB: Do not save the worksheet when prompted. Now choose data manipulation > names.

Setting up the model in MLwiN

Null model (model 2) is now set up

Estimation type

Model 2 results
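The results table for this slide is not preserved in the transcript. As a rough guide to reading null-model output of this kind, the sketch below converts an intercept on the logit scale into a probability and computes a variance partition coefficient using the latent-variable approximation (individual-level variance fixed at π²/3). The function names and the numbers passed to them are hypothetical and are not the estimates from the slide.

```python
import math


def inv_logit(x):
    """Convert a value on the logit scale to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def vpc_logistic(sigma2_u):
    """Share of variation at country level, latent-variable approximation."""
    return sigma2_u / (sigma2_u + math.pi ** 2 / 3)


# Hypothetical values, for illustration only.
example_intercept = 1.0
example_country_variance = 0.5

print(f"turnout probability for an average country: {inv_logit(example_intercept):.3f}")
print(f"share of variation between countries: {vpc_logistic(example_country_variance):.3f}")
```

The probability obtained from the intercept refers to a country whose random effect is zero, not to the average turnout across all respondents.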

Model 3: results – add cent_age to model by clicking on add term
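The name cent_age suggests a centred age variable. Assuming it is age centred on the overall (grand) mean, which the transcript does not confirm, the construction can be sketched as follows. The function name and the ages are hypothetical.

```python
def centre(values):
    """Subtract the overall mean so that zero corresponds to the average value."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


ages = [20, 35, 50, 65]   # hypothetical ages
cent_age = centre(ages)   # mean age is 42.5
print(cent_age)
```

With a centred predictor the intercept refers to a respondent of average age rather than to someone aged zero, which makes the intercept and its country-level variance easier to interpret.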

Model 4: set up

Model 4: results

Part 2: combining macro and micro data in multilevel models

Combining data in multilevel models: model 5 – Main effects

Combining data in multilevel models: model 6 – interactions
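The model equations for these two slides are not preserved. As a sketch, assuming both models extend the varying intercepts model and use the unemployment example mentioned earlier (the choice of variables and the notation are assumptions), they might take the form:

```latex
% unemp_ij : 1 if individual i in country j is unemployed, 0 otherwise (micro)
% ltu_j    : % long term unemployed in country j (macro)

% Model 5: main effects of the micro and macro variables
\operatorname{logit}(\pi_{ij}) = \beta_0 + \beta_1\,\mathrm{age}_{ij}
  + \beta_2\,\mathrm{unemp}_{ij} + \beta_3\,\mathrm{ltu}_j + u_{0j}

% Model 6: adds the cross-level interaction
\operatorname{logit}(\pi_{ij}) = \beta_0 + \beta_1\,\mathrm{age}_{ij}
  + \beta_2\,\mathrm{unemp}_{ij} + \beta_3\,\mathrm{ltu}_j
  + \beta_4\,\mathrm{unemp}_{ij}\times\mathrm{ltu}_j + u_{0j}

% In both models:
u_{0j} \sim N\!\left(0,\ \sigma^2_{u0}\right)
```

Under this form, β_4 answers the substantive question of whether the turnout gap between unemployed and employed people differs between countries with high and low long term unemployment.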

Model 5 main effects: results

Model 6 Interactions: results

Summary: what you have learnt in this session 1. The multilevel model is an extremely useful framework for combining macro and micro data. 2. Multilevel logistic regression models can be used for an outcome with two categories, such as voter turnout. 3. We can then fit a series of models to assess the nature and extent of individual and country level variations in voter turnout. We can use software such as MLwiN to do this.

Summary: what you have learnt in this session 4. We can estimate multilevel models with ESS micro data only. 5. We can then combine micro and macro data by adding variables from Eurostat New Cronos to the model. 6. Finally, we can also interact individual level ESS variables with country level variables from the New Cronos data.