Dealing with data on ethnicity: Principles and practice Paul Lambert, University of Stirling Talk presented to the DAMES Node workshop on Data on ethnicity.

1 Dealing with data on ethnicity: Principles and practice. Paul Lambert, University of Stirling. Talk presented to the DAMES Node workshop on Data on ethnicity in social survey research, Stirling, 28th Jan 2010. DAMES is an ESRC-funded research Node working on Data Management through e-Social Science.

2 ...dealing with data on ethnicity
1) Handling/enhancing categorical data (data management)
2) Handling/enhancing data on ethnicity

3 Categorical data is important..
- Principal social survey datum
  - Basis of most social research reports/analyses/comparisons
- It's rich and complex
  - We're often interested in very fine levels of detail / difference
  - We usually recode categories in some way for analysis
…how categorical data is managed is of great consequence to the results of analysis…
- Choices about recoding, boundaries, contrasts made [e.g. RAE analysis: Lambert & Gayle 2009]

4 EFFNATIS sample (1999): Subjective ethnic identity

5 UK EFFNATIS survey (1999) [Heckmann et al 2001]

7 Data management and categorical data
In DAMES, we identify three important categorical variables (occupations, educational qualifications, ethnicity), and collect information about them in order to improve data management and hence exploitation of such data
- Key social science variables
- Existing resources (and metadata & support on those resources)
- UK and beyond

8 Occupational Information Resources
- Small databases (square electronic files) linking lists of occupational positions (occupational unit groups) with information about those positions
- Many existing resources already used in academic research (> 1000)

9 Educational Information Resources
- Small databases (often on paper) linking lists of educational qualifications with information about them
- Many existing resources (> 500), but less communication between them
[Part of UK scheme from ONS (2008)]

10 Ethnic Minority/Migration Information Resources
- Data which links measures of ethnicity / migration status with other information
- In high demand, but few existing resources (? < 500)

11 Standardizing categorical data
Standardization refers to treating variables for the purposes of analysis, in order to aid comparison between variables
  - {In the terminology of survey research analysts}
1. Arithmetic standardization to re-scale metric values [$z_i = (x_i - \bar{x}) / sd$]
2. Ex-ante harmonisation (during data production) [ensuring measures of the same concept, collected from different contexts, are recorded in coordinated taxonomies]
3. Ex-post harmonisation [adapting measures of the same concept, collected from different contexts, using a coordinated re-coding procedure]
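As an illustration of arithmetic standardization (item 1 on the slide above), the following is a minimal sketch rather than the presenter's own code. The slide does not say whether `sd` is the population or the sample standard deviation; the population version is assumed here, and the function name and toy values are hypothetical.

```python
from statistics import mean, pstdev


def standardize(values):
    """Return z-scores, z_i = (x_i - mean) / sd.

    Assumption: sd is the population standard deviation (pstdev).
    """
    m = mean(values)
    sd = pstdev(values)
    if sd == 0:
        raise ValueError("cannot standardize a constant variable")
    return [(x - m) / sd for x in values]


# Toy metric variable (illustrative values only).
scores = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
z = standardize(scores)
```

After this transformation each value expresses a position relative to the mean and spread of its own context, which is what makes values from different contexts comparable.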

12 The big issue: standardization for comparisons
Comparisons are the essence [Treiman, 2009: 382]
- to make statements about differences [in measures] over contexts
Categorical data is highly problematic..
- Can't immediately conduct arithmetic standardization
- Struggle to enforce harmonised data collection ..which may not in any case be suitable..
- Struggle to achieve ex-post harmonisation
- Non-linear relations between categories
- Shifting underlying distributions

13 Two conventional ways to make comparisons [e.g. van Deth 2003]
- Measurement equivalence = ex ante harmonisation (or ex post harmonisation)
- Meaning equivalence = Arithmetic standardisation (or ex ante or ex post harmonisation)
Much comparative research flounders on an insufficient recognition of strategies for equivalence (One size doesn't fit all, so we can't go on)

14 Measurement equivalence: Measurement equivalence by assertion

15 Measurement equivalence can go wrong [Show tabplot here]

16 Meaning equivalence
- For categorical data, equivalence for comparisons is often best approached in terms of meaning equivalence (because of non-linear relations between categories and shifting underlying distributions), even if measurement equivalence seems possible
- Arithmetic standardisation offers a convenient form of meaning equivalence by indicating relative position within the structure defined by the current context
- For categorical data, this can be achieved by scaling categories in one or more dimensions of difference

17 Effect proportional scaling using parents' occupational advantage
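The slide's figure is not available in the transcript, so the sketch below only illustrates the general idea as it is usually described: each category is scored with the mean of a reference variable among its members. The category labels, the `parent_score` values and the function name are hypothetical and are not taken from the talk's data.

```python
from collections import defaultdict
from statistics import mean


def effect_proportional_scores(categories, reference):
    """Score each category by the mean of a reference variable within it."""
    groups = defaultdict(list)
    for cat, ref in zip(categories, reference):
        groups[cat].append(ref)
    return {cat: mean(vals) for cat, vals in groups.items()}


# Toy data: a categorical variable and a metric reference variable
# (standing in for a parental occupational advantage score).
group = ["A", "A", "B", "B", "B", "C"]
parent_score = [40.0, 50.0, 30.0, 35.0, 40.0, 60.0]

scores = effect_proportional_scores(group, parent_score)
# Replace each case's category with its category score.
scaled = [scores[g] for g in group]
```

The resulting `scaled` variable is metric, so it can then be standardized arithmetically within each context.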

18 What we do and what we ought to do (when standardizing categories)
Research applications tend to select a favoured categorisation of a concept and stick with it
- Due to coordinated instructions [e.g. Blossfeld et al. 2006]
- Due to perceived lack of available alternatives
- Due to perceived convenience
To make statistical analyses more robust we should…
- Operationalise and deploy various scalings and arithmetic measures
- Try out various categorisations and explore their distributional properties
- … and keep a replicable trail of all these activities..
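One way to try out several categorisations and inspect their distributional properties is sketched below. The detailed codes `g1` to `g5` and the three groupings are hypothetical placeholders, not categories used in the talk.

```python
from collections import Counter

# Hypothetical detailed codes and three alternative groupings of them.
SCHEMES = {
    "detailed": {"g1": "g1", "g2": "g2", "g3": "g3", "g4": "g4", "g5": "g5"},
    "three_way": {"g1": "X", "g2": "Y", "g3": "Y", "g4": "Z", "g5": "Z"},
    "binary": {"g1": "majority", "g2": "minority", "g3": "minority",
               "g4": "minority", "g5": "minority"},
}


def distribution(values, mapping):
    """Return the proportion of cases in each recoded category."""
    counts = Counter(mapping[v] for v in values)
    total = len(values)
    return {label: counts[label] / total for label in sorted(counts)}


# Toy sample of 10 cases; note that g5 does not occur at all.
sample = ["g1"] * 6 + ["g2"] * 2 + ["g3", "g4"]

summary = {name: distribution(sample, mapping) for name, mapping in SCHEMES.items()}
# Size of the smallest occupied category under each scheme (a sparsity check).
smallest = {name: min(d.values()) for name, d in summary.items()}
```

Keeping every scheme in one named structure means the same analysis can be rerun under each categorisation, and the structure itself is part of the replicable trail.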

19 2) Handling data on ethnicity & standardizing categorical data
GESDE projects are concerned with allowing social science researchers to navigate, and exploit, heterogeneous information resources
- Occupational Information Resources (GEODE)
- Educational Information Resources (GEEDE)
- Ethnic minority/Migration Information Resources (GEMDE)

20 Plenty of interest, and data, on ethnic minority groups, immigration, immigrants
Data includes:
- Generic & specialist studies collecting ethnic referents: ethnic identity; nationality, parents' nationality; country of birth; language spoken; religion; race
- National research and data management: most countries have evolving standard definitions of ethnic groups
- International research and data management: seen as highly problematic in many fields except immigration data
Lambert, P.S. (2005). Ethnicity and the Comparative Analysis of Contemporary Survey Data. In J. H. P. Hoffmeyer-Zlotnik & J. Harkness (Eds.), Methodological Aspects in Cross-National Research (pp. 259-277). Mannheim: ZUMA-Nachrichten Spezial 11.

21 …but working with ethnicity data in surveys is hard…!
- It's sparse
- It's collinear (e.g. to age)
- It's dynamic (cf. comparative research)

24 UK: ONS & ESDS data guides
- Input harmonisation within decades
- Output harmonisation between decades
  - Bosveld, K., Connolly, H., & Rendall, M. S. (2006). A guide to comparing 1991 and 2001 Census ethnic group data. London: Office for National Statistics.
- Academic strategies – ad hoc black group, etc
- Addition of extra categories over time
- Mixed ethnicities, marriages…
- UK focus on ethnic identity, lack of attention to alternative referents
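Output (ex-post) harmonisation between two data collections can be sketched as a pair of lookup tables into a common scheme. Everything in the example is hypothetical: the wave names, the category labels and the mapping itself are placeholders, and this is not the mapping recommended in the ONS guide cited above.

```python
# Hypothetical category labels for two data collections, each mapped
# onto a common target scheme. The later wave has extra categories.
TO_COMMON = {
    "wave_1": {"a": "group_1", "b": "group_2", "c_other": "other"},
    "wave_2": {"a1": "group_1", "a2": "group_1", "b": "group_2",
               "mixed": "other", "c_other": "other"},
}


def harmonise(records):
    """Map (wave, original_category) pairs onto the common scheme.

    Raises ValueError for any category without a harmonised code, so that
    unmapped categories are never dropped silently.
    """
    out = []
    for wave, category in records:
        try:
            out.append(TO_COMMON[wave][category])
        except KeyError:
            raise ValueError(
                f"no harmonised code for {category!r} in {wave!r}"
            ) from None
    return out


records = [("wave_1", "a"), ("wave_2", "a2"), ("wave_2", "mixed"), ("wave_1", "b")]
harmonised = harmonise(records)
```

Placing the added `mixed` category under `other` is exactly the kind of contestable choice the slide points to; a different analyst could reasonably map it elsewhere, which is why the table should be kept explicit.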

25 Comparative research solutions?
Measurement equivalence might be achieved by:
  - Survey data collection
  - Connecting related groups
  - Longitudinal linkage
Functional equivalence for categories:
  - Simplified categorical distinctions
  - Immigrant cohorts
  - Scaling ethnic categories

26 …Principles and practice…
3 themes in DAMES ought, in our perspective, to help here
1) Replicability / transparency
2) Plurality of approaches
3) Ease access (to off-putting operations)

27 Replicability / transparency
- Document your own recodes
- Access somebody else's recodes
- Identify commonly used recodes (& use them..!)
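A recode can be documented by storing it as data instead of burying it in analysis code. The sketch below assumes a simple JSON layout; the field names, scheme names and category codes are hypothetical and do not describe a DAMES file format.

```python
import json

# A recode stored as data, with enough metadata for someone else to reuse it.
recode_spec = {
    "variable": "ethnic_group",
    "source_scheme": "hypothetical_detailed",
    "target_scheme": "hypothetical_three_way",
    "note": "Illustrative grouping only.",
    "mapping": {"g1": "X", "g2": "Y", "g3": "Y", "g4": "Z"},
}


def apply_recode(values, spec):
    """Apply the mapping in a recode specification to a list of codes."""
    mapping = spec["mapping"]
    return [mapping[v] for v in values]


# Serialise the specification so it can be shared or archived,
# then confirm that the restored copy produces the same recode.
serialized = json.dumps(recode_spec, indent=2, sort_keys=True)
restored = json.loads(serialized)

values = ["g1", "g3", "g4", "g2"]
recoded = apply_recode(values, restored)
```

Because the specification is plain text, it can be versioned, cited, and applied unchanged by another researcher.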

28 Plurality of approaches
Diminishing excuses for not trying out multiple operationalisations…

29 Making complex things easier
- Organising complex categorical data
- Labelling, recoding, etc
- Effect proportional scaling
- Standardisation
- Interaction terms

30 Data used
Department for Education and Employment. (1997). Family and Working Lives Survey, 1994-1995 [computer file]. Colchester, Essex: UK Data Archive [distributor], SN: 3704.
Heckmann, F., Penn, R. D., & Schnapper, D. (Eds.). (2001). Effectiveness of National Integration Strategies Towards Second Generation Migrant Youth in a Comparative Perspective - EFFNATIS. Bamberg: European Forum for Migration Studies, University of Bamberg.
Inglehart, R. (2000). World Values Surveys and European Values Surveys 1981-4, 1990-3, 1995-7 [Computer file] (Vol. 2000). Ann Arbor, MI: Institute for Social Research [Producer]; Inter-university Consortium for Political and Social Research [Distributor].
Li, Y., & Heath, A. F. (2008). Socio-Economic Position and Political Support of Black and Ethnic Minority Groups in the United Kingdom, 1972-2005 [computer file]. 2nd Edition. Colchester, Essex: UK Data Archive [distributor], SN: 5666.
University of Essex, & Institute for Social and Economic Research. (2009). British Household Panel Survey: Waves 1-17, 1991-2008 [computer file], 5th Edition. Colchester, Essex: UK Data Archive [distributor], March 2009, SN 5151.

31 References
Agresti, A. (2002). Categorical Data Analysis, 2nd Edition. New York: Wiley.
Lambert, P. S., & Gayle, V. (2009). Data management and standardisation: A methodological comment on using results from the UK Research Assessment Exercise 2008. Stirling: University of Stirling, Technical paper 2008-3 of the Data Management through e-Social Science research Node.
Long, J. S. (2009). The Workflow of Data Analysis Using Stata. Boca Raton: CRC Press.
Simpson, L., & Akinwale, B. (2006). Quantifying Stability and Change in Ethnic Group. Manchester: University of Manchester, CCSR Working Paper 2006-05.
Stevens, S. S. (1946). On the theory of scales of measurement. Science, 103, 677-680.
Treiman, D. J. (2009). Quantitative Data Analysis: Doing Social Research to Test Ideas. New York: Jossey Bass.
van Deth, J. W. (2003). Using Published Survey Data. In J. A. Harkness, F. J. R. van de Vijver & P. P. Mohler (Eds.), Cross-Cultural Survey Methods (pp. 329-346). New York: Wiley.
