Quantifying culture: a discussion and reflection on current methodology. Michael Regier, BAR, BSc, MSc, Department of Statistics, UBC
2 Outline
NET definition of culture
Operational problem
Three current solutions
Conclusions
Open discussion
3 NET definition of culture
The cultural NET has defined culture as
–“a complex interplay of meanings that represent and shape the individual and collective lives of people”.
This definition is Postmodern in spirit.
–Individuals and communities are shaped by meanings (e.g. language, country of birth, religion).
–Identity is socially (contextually) constructed.
4 Operational problem 1: measurement
Although the cultural NET has a definition of culture, the definition provides little guidance on how to make it operational.
–No meta-narrative is available, e.g. “All persons who have ancestral links, within four generations, to the Indian subcontinent will be identified as South Asian.”
We need a function that maps (measures) the Postmodern definition to a value (e.g. a binary indicator or a population count).
5 Operational problem 2: cultural data
In the absence of individual-level cultural data, census data is used.
Neighbourhoods are considered equivalent to the dissemination area (DA).
–The DA is the smallest geographic area at which census data is freely released.
–In general, the DA is the basic building block for other census-based geographic areas.
–A DA is a compact (e.g. square) area with visible boundaries (e.g. road, river) and an average of 550 people.
Census data is collected on predetermined measures (e.g. mother tongue, country of birth, ethnic origins).
–These measures do not match the definitions used to abstract patients from the registry database.
–They are a proxy for an individual measure of culture.
–They are a proxy for the “true” definition of culture.
6 Three solutions
Cut-off measure
Compression and clustering (CC) measure
Bayesian measure
7 Cut-off measure
Methodology inherited from the Nova Scotia research team.
Uses a single cultural attribute (e.g. South Asian ethnicity) to define a DA.
A cut-off is selected for a specific census variable by exploring the impact of different cut-offs (e.g. 10%, 25%, 33%, 50%) on the number of people reporting the attribute of interest.
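The cut-off exploration described on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the DA identifiers and counts are invented, the function name `explore_cutoffs` is hypothetical, and "impact" is read here as the share of all people reporting the attribute who live in the flagged DAs, which is one possible interpretation and not necessarily the one used by the research team.

```python
# Hypothetical DA records: (DA id, total population, people reporting the attribute).
# All numbers are invented for illustration; they are not census figures.
das = [
    ("DA-001", 500, 30),
    ("DA-002", 620, 170),
    ("DA-003", 480, 250),
    ("DA-004", 550, 60),
    ("DA-005", 600, 210),
]

def explore_cutoffs(records, cutoffs):
    """For each cut-off, list the DAs whose proportion reporting the attribute
    is at or above it, and the share of all reporters living in those DAs."""
    total_reporting = sum(count for _, _, count in records)
    summary = {}
    for cutoff in cutoffs:
        flagged = [(da, count) for da, pop, count in records if count / pop >= cutoff]
        captured = sum(count for _, count in flagged)
        summary[cutoff] = {
            "das": [da for da, _ in flagged],
            "share_captured": captured / total_reporting,
        }
    return summary

result = explore_cutoffs(das, [0.10, 0.25, 0.33, 0.50])
for cutoff, row in result.items():
    print(f"{cutoff:.0%}: {len(row['das'])} DAs, "
          f"{row['share_captured']:.1%} of people reporting the attribute")
```

Raising the cut-off flags fewer DAs and captures a smaller share of the people reporting the attribute, which is the trade-off the researcher weighs when picking a value.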
8 Advantages and disadvantages of the cut-off measure
Advantages
–Uses freely available information
–Easy to implement
–Easy to interpret
–Quick identification of DAs of interest
Disadvantages
–Interesting sub-populations are predetermined by the researcher
–Arbitrary data reduction: multinomial distributions are reduced to binomial distributions based on a selective search and non-algorithmic decisions
–Unclear classification of a DA when multiple indicators are used, e.g. a DA that is 40% South Asian, 35% Canadian, and 25% East Asian
–Demands the use of complex interactions, with potential for over-fitting and spurious relationships; these models are difficult to interpret
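The classification ambiguity in the 40% / 35% / 25% example can be made concrete with a small sketch. The helper name `labels_above_cutoff` is hypothetical and the cut-off values are taken from the examples on the previous slide; nothing here is the research team's actual code.

```python
def labels_above_cutoff(shares, cutoff):
    """Return every attribute whose share of the DA meets the cut-off."""
    return sorted(name for name, share in shares.items() if share >= cutoff)

# The DA from the slide: 40% South Asian, 35% Canadian, 25% East Asian.
da_shares = {"South Asian": 0.40, "Canadian": 0.35, "East Asian": 0.25}

for cutoff in (0.25, 0.33, 0.50):
    labels = labels_above_cutoff(da_shares, cutoff)
    print(f"cut-off {cutoff:.0%}: {labels if labels else 'no label'}")
```

Depending on the cut-off, the same DA receives three labels, two labels, or none, so a single-attribute rule does not give it a unique classification.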
9 Compression and clustering measure (CC)
Uses Principal Component Analysis with a clustering algorithm to capture the complex interplay of meanings as defined by a single census variable.
–Multinomial distributions are compressed (data reduction) from the p-dimensional simplex space into an r-dimensional space where r<p.
–The principal components in the r-dimensional space are clustered based on the family of Mahalanobis distances.
–Clusters are chosen by minimizing the ratio of the intra- to inter-cluster variation.
The clusters represent a collection of DAs that have data-driven similarities.
Clusters are mutually exclusive.
–No cross-classification as with the cut-off method
Clusters represent cultural context.
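The compress-then-cluster idea can be sketched with the Python standard library. This is a deliberately simplified stand-in, not the CC indicator itself: the DA compositions are invented, only the first principal component is kept (r = 1), the clustering is a plain two-means with Euclidean distance in place of the Mahalanobis family, and the number of clusters is fixed at two instead of being chosen by the variation ratio. The ratio is computed only as a diagnostic. All function names are hypothetical.

```python
import math

# Hypothetical DA compositions over p = 3 categories (each row sums to 1).
das = {
    "DA-001": [0.70, 0.20, 0.10],
    "DA-002": [0.65, 0.25, 0.10],
    "DA-003": [0.60, 0.25, 0.15],
    "DA-004": [0.20, 0.30, 0.50],
    "DA-005": [0.15, 0.25, 0.60],
    "DA-006": [0.25, 0.25, 0.50],
}

def first_component_scores(rows, iterations=200):
    """Project centred rows onto the leading eigenvector of their covariance
    matrix, found by power iteration (compression from p to r = 1)."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centred = [[r[j] - means[j] for j in range(p)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centred) / (n - 1) for j in range(p)]
           for i in range(p)]
    vec = [1.0] + [0.0] * (p - 1)  # starting vector for power iteration
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * vec[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(v * v for v in nxt))
        vec = [v / norm for v in nxt]
    return [sum(c[j] * vec[j] for j in range(p)) for c in centred]

def two_means(scores, iterations=20):
    """Cluster one-dimensional scores into two groups (Euclidean two-means)."""
    centres = [min(scores), max(scores)]
    labels = [0] * len(scores)
    for _ in range(iterations):
        labels = [0 if abs(s - centres[0]) <= abs(s - centres[1]) else 1
                  for s in scores]
        for k in (0, 1):
            members = [s for s, lab in zip(scores, labels) if lab == k]
            if members:
                centres[k] = sum(members) / len(members)
    return labels, centres

def variation_ratio(scores, labels, centres):
    """Within-cluster sum of squares divided by between-cluster sum of squares."""
    grand = sum(scores) / len(scores)
    within = sum((s - centres[lab]) ** 2 for s, lab in zip(scores, labels))
    between = sum(labels.count(k) * (centres[k] - grand) ** 2 for k in (0, 1))
    return within / between

scores = first_component_scores(list(das.values()))
labels, centres = two_means(scores)
ratio = variation_ratio(scores, labels, centres)
for da, lab in zip(das, labels):
    print(da, "cluster", lab)
print("within/between ratio:", round(ratio, 4))
```

Each DA ends up in exactly one cluster, which is the mutual-exclusivity property noted on the slide; a small ratio indicates tight, well-separated clusters.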
10 Advantages and disadvantages of the compression and clustering measure
Advantages
–Data driven: the researcher interprets the clusters and does not predetermine them, which prevents a predetermined, results-driven analysis
–Clear analytic methodology based on widely accepted statistical techniques for clustering data
–The CC method represents context
–Single cultural constructs are not isolated from their context: important attributes arise naturally from the data, the measure inherently contains the complex interplay of meanings, and naturally occurring ethnic enclaves can be found
–No need for interactions in the model
Disadvantages
–Need a statistician
–Requires inter-disciplinary (or trans-disciplinary) collaboration for the interpretation of the indicator
–Interpretation can be difficult
–Not easy to implement: technical issues still remain for the development of the indicator
11 Bayesian measure
Any operational definition of culture will fail to fully quantify the NET definition of culture.
All operational definitions of culture will have some uncertainty with respect to how well they capture the NET definition of culture.
A Bayesian approach allows us to
–model the uncertainty of our definition
–analytically incorporate researcher knowledge
–naturally incorporate covariates and geographic hierarchies
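How a Bayesian approach carries uncertainty and researcher knowledge can be illustrated with a toy conjugate model. This sketch is an assumption made for illustration only: it uses a Beta prior on the proportion of a DA reporting an attribute and a Binomial likelihood, with invented counts and prior values. It is not the survival-analysis model under consideration, and it omits covariates and geographic hierarchies.

```python
import random

def posterior_params(prior_a, prior_b, reporting, sample_size):
    """Beta(prior_a, prior_b) prior + Binomial data -> Beta posterior parameters."""
    return prior_a + reporting, prior_b + (sample_size - reporting)

def prob_exceeds(a, b, cutoff, draws=20000, seed=1):
    """Monte Carlo estimate of P(proportion > cutoff) under a Beta(a, b) posterior."""
    rng = random.Random(seed)
    hits = sum(rng.betavariate(a, b) > cutoff for _ in range(draws))
    return hits / draws

# Hypothetical researcher knowledge: the attribute is expected in roughly a
# quarter of residents, held weakly (prior Beta(2, 6), prior mean 0.25).
# Hypothetical data: 45 of 120 sampled residents report the attribute.
a, b = posterior_params(prior_a=2, prior_b=6, reporting=45, sample_size=120)
posterior_mean = a / (a + b)
p_above = prob_exceeds(a, b, cutoff=0.33)

print(f"posterior: Beta({a}, {b}), mean {posterior_mean:.3f}")
print(f"estimated P(proportion > 33%) = {p_above:.2f}")
```

Instead of a hard yes/no classification at the 33% cut-off, the output is a probability that the DA exceeds it, which is one way to express how uncertain the operational definition is.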
12 Implementation of the Bayesian measure
A Bayesian approach has only recently been considered.
The approach is being considered in the context of survival analysis.
Bayesian methods will require a statistician.
13 Which measure to use?
The “correct” measure depends on context and research question.
–Cut-off approach: quick look at the data; descriptive statistics, preliminary models, preliminary investigation into spatial-temporal trends
–Compression and clustering: descriptive statistics, model construction, trend analysis (spatial-temporal), identify predictors, inference
–Bayesian: incidence and survival, model construction, trend analysis (spatial-temporal), identify predictors, inference
14 Conclusions
Making the conceptual definition of culture operational will result in a variety of functional definitions.
Functional definitions should be used with caution, as they may be reasonable for only a certain type of investigation.
Functional definitions are not “off-the-shelf” definitions.