1
GIS in Spatial Epidemiology: small area studies of exposure-outcome relationships
Robert Haining, Department of Geography, University of Cambridge
2
1. Spatial epidemiology:
Some definitions
Geographical correlation studies
Framework for analysis
Problems with small area analysis
Reasons for conducting small area analysis
Good practice
Regression models
2. Reference to a case study:
Data issues
Statistical modelling
3
Spatial epidemiology is concerned with describing and understanding spatial variation in disease risk. Individual level data; Counts for small areas. Recent developments owe much to: Geo-referenced health and population data; Computing advances; Development of GIS; Statistical methodology.
4
Geographical correlation studies: These studies typically involve examining geographical variations in exposure to environmental variables (air, water, soil, etc.) and their association with health outcomes, whilst controlling for other relevant factors using regression.
5
Framework for analysis: Population is unevenly distributed geographically; People move around (day-to-day movements; longer term movements including migration); People possess relevant individual characteristics (age, sex, genetic make-up, lifestyle, etc.); People live in communities.
6
Problems with small area analyses:
- Frequency and quality of population data (e.g. Census every 10 years);
- Spatial compatibility of different data sets;
- Availability of data on population movements;
- Measuring population exposure to the environmental variable;
- Environmental impacts are often likely to be quite small (relative to, for example, lifestyle effects) and there may be serious confounding effects;
- Cannot estimate strength of an association;
- Ecological (or aggregation) bias.
7
Reasons for conducting small area analysis: Provides a qualitative answer about the existence of an association (e.g. between an environmental variable and a health outcome); May provide evidence that can be followed up in other ways.
8
Good practice (Richardson 1992): Allow for heterogeneity of exposure; Use "well defined" population groups; Use survey data to help obtain good exposure data; Allow for latency times; Allow for population movement effects.
9
Regression model specification: $O_i$ denotes the number of cases for area $i$, $i = 1, \dots, n$. If the outcome is rare, it is typically assumed that $O_i$ is Poisson distributed with parameter $\lambda_i$. The expected value of $O_i$ is written:
$$\mathrm{E}[O_i] = \lambda_i = E_i r_i, \quad i = 1, \dots, n,$$
where $r_i$ is the unknown area-specific relative risk in area $i$, and $E_i$ is the expected number of cases for area $i$ given the size of the population and its age and sex composition.
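A minimal sketch of how the expected count for one area might be obtained by indirect standardisation, and how the crude ratio of observed to expected cases then estimates the relative risk. The strata, reference rates and counts are invented for illustration and are not taken from the case study.

```python
def expected_cases(stratum_populations, reference_rates):
    """Expected count for one area: population in each age-sex stratum
    multiplied by the reference (e.g. city-wide) rate for that stratum."""
    return sum(stratum_populations[s] * reference_rates[s]
               for s in stratum_populations)


def standardised_ratio(observed, expected):
    """Crude estimate of the area-specific relative risk: observed / expected."""
    return observed / expected


# Hypothetical reference rates (cases per person) by sex and age band.
reference_rates = {
    ("M", "45-64"): 0.004, ("M", "65+"): 0.020,
    ("F", "45-64"): 0.002, ("F", "65+"): 0.012,
}
# Hypothetical population of one small area, by the same strata.
area_population = {
    ("M", "45-64"): 500, ("M", "65+"): 200,
    ("F", "45-64"): 550, ("F", "65+"): 300,
}

expected = expected_cases(area_population, reference_rates)
ratio = standardised_ratio(13, expected)  # 13 observed cases (made up)
print(expected, ratio)
```

With small expected counts this crude ratio is very unstable, which is one motivation for modelling the relative risks rather than mapping the ratios directly.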
10
$$\ln[\lambda_i] = \ln[E_i] + \ln[r_i].$$
$$\ln[\lambda_i] = \ln[E_i] + \alpha + \beta_1 X_{1,i} + \beta_2 X_{2,i} + \dots + \beta_k X_{k,i} + \gamma Z_i$$
This defines a Poisson regression model where $\alpha$ is the intercept parameter, and $\beta_1, \beta_2, \dots, \beta_k$ and $\gamma$ are regression parameters. $\ln[E_i]$ is an offset. The area-specific relative risk at $i$ is associated with attributes of the population $X_1, \dots, X_k$ and the environmental exposure $Z$ at $i$. Adjustment for overdispersion is necessary because of population heterogeneity at the scale of the individual small areas (see, for example, Manton and Stallard 1981). Allowance is also needed for data uncertainty arising from the use of sample data.
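A sketch of fitting a Poisson regression with a log-expected-count offset, reduced to an intercept and a single exposure term and fitted by Newton-Raphson. The function name, the parameter names `alpha` and `gamma`, and the data are assumptions made for illustration. The sketch ignores overdispersion and the population covariates, so it is not the model used in the case study.

```python
import math


def fit_poisson_offset(observed, expected, exposure, iterations=25):
    """Fit ln(mean_i) = ln(E_i) + alpha + gamma * z_i by Newton-Raphson.

    observed: case counts, expected: expected counts (the offset),
    exposure: exposure value z_i for each area.
    """
    # Start from the overall observed/expected ratio and no exposure effect.
    alpha = math.log(sum(observed) / sum(expected))
    gamma = 0.0
    for _ in range(iterations):
        mu = [e * math.exp(alpha + gamma * z)
              for e, z in zip(expected, exposure)]
        # Score vector (first derivatives of the log-likelihood).
        s_a = sum(o - m for o, m in zip(observed, mu))
        s_g = sum((o - m) * z for o, m, z in zip(observed, mu, exposure))
        # Information matrix [[i_aa, i_ag], [i_ag, i_gg]].
        i_aa = sum(mu)
        i_ag = sum(m * z for m, z in zip(mu, exposure))
        i_gg = sum(m * z * z for m, z in zip(mu, exposure))
        det = i_aa * i_gg - i_ag * i_ag
        # Newton step: inverse information times score.
        alpha += (i_gg * s_a - i_ag * s_g) / det
        gamma += (i_aa * s_g - i_ag * s_a) / det
    return alpha, gamma


# Hypothetical data for six areas.
observed = [7, 12, 9, 20, 13, 19]
expected = [10.0, 12.0, 8.0, 15.0, 9.0, 11.0]
exposure = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
alpha_hat, gamma_hat = fit_poisson_offset(observed, expected, exposure)
print(alpha_hat, gamma_hat, math.exp(gamma_hat))
```

The exponential of the exposure coefficient is the multiplicative change in relative risk per unit increase in exposure.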
11
A short case study: I Data issues and GIS
Demographic, social and economic data:
- Pre-2001 Census:
» Enumeration Districts (EDs);
» Wards.
- 2001 Census:
» Output Areas (OAs);
» Super Output Areas (SOAs).
Health data (heart disease and stroke: mortality and admissions):
- Individual records geo-referenced to ED;
- Postcoded counts.
Environmental data (NOx; PM10; CO):
- Gridded.
12
Problem: obtain a measure of air pollution exposure at the ED level.
13
Step 1: Measuring NOx exposure. The Indic-Airviro model:
14
Average annual mean pollution levels, 1994-1999 (excluding 1998): a) NOx (µg/m³); b) PM10 (µg/m³)
15
Comparing modelled and monitored values for NOx.
16
Step 2: Transferring the gridded data to the ED framework. Areal interpolation: (i) area weighting
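A sketch of area weighting: the value assigned to a zone is the mean of the grid-cell values it overlaps, weighted by overlap area. For brevity both the grid cells and the zone are axis-aligned rectangles here. Real EDs are irregular polygons and would need a GIS overlay to obtain the overlap areas. All names and numbers are hypothetical.

```python
def overlap_area(a, b):
    """Intersection area of two axis-aligned rectangles
    given as (xmin, ymin, xmax, ymax)."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    return max(width, 0.0) * max(height, 0.0)


def area_weighted_exposure(zone, grid_cells):
    """Area-weighted mean of grid values over a zone.

    grid_cells: list of (rectangle, pollution value) pairs.
    """
    weighted_sum = 0.0
    total_area = 0.0
    for rect, value in grid_cells:
        area = overlap_area(zone, rect)
        weighted_sum += area * value
        total_area += area
    if total_area == 0.0:
        raise ValueError("zone does not overlap any grid cell")
    return weighted_sum / total_area


# Two adjacent 1 km cells with made-up pollution values.
cells = [((0.0, 0.0, 1.0, 1.0), 20.0), ((1.0, 0.0, 2.0, 1.0), 40.0)]
# A zone covering 0.5 km² of the first cell and 0.25 km² of the second.
zone = (0.5, 0.0, 1.25, 1.0)
print(area_weighted_exposure(zone, cells))
```

Area weighting treats the population as evenly spread across the zone, which is why it can differ from the point-based methods when people are clustered in one part of an ED.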
17
Areal interpolation (from grid to EDs): (ii) point in polygon using the ED centroid
18
Areal interpolation (from grid to EDs): (iii) point in polygon using weighted PostPoint
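A sketch of the two point-based alternatives. It assumes the centroid method takes the value of the grid cell containing the ED centroid, and that the weighted PostPoint method averages the grid values at postcode points, weighted by something like the number of residential addresses at each point. That reading of "weighted PostPoint", and all names and data below, are assumptions.

```python
def grid_value_at(x, y, grid, origin=(0.0, 0.0), cell_size=1.0):
    """Value of the grid cell containing point (x, y).

    grid[row][col], with row 0 the southernmost row and col 0 the
    westernmost column; origin is the south-west corner of the grid.
    """
    col = int((x - origin[0]) // cell_size)
    row = int((y - origin[1]) // cell_size)
    if not (0 <= row < len(grid) and 0 <= col < len(grid[row])):
        raise ValueError("point lies outside the grid")
    return grid[row][col]


def centroid_exposure(centroid, grid, **grid_args):
    """Exposure taken from the cell containing the zone centroid."""
    return grid_value_at(centroid[0], centroid[1], grid, **grid_args)


def weighted_point_exposure(points, grid, **grid_args):
    """Weighted mean of cell values at points given as (x, y, weight)."""
    total_weight = sum(w for _, _, w in points)
    return sum(w * grid_value_at(x, y, grid, **grid_args)
               for x, y, w in points) / total_weight


# One row of two 1 km cells with made-up pollution values.
grid = [[20.0, 40.0]]
# Centroid falls in the first cell, but most addresses are in the second.
centroid = (0.9, 0.5)
postcode_points = [(0.2, 0.5, 10), (1.5, 0.5, 30)]
print(centroid_exposure(centroid, grid))
print(weighted_point_exposure(postcode_points, grid))
```

The two estimates diverge in this toy case because the population is concentrated away from the centroid. Where the pollution surface is flat across a zone, all methods agree.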
21
Weighted PostPoint and ED centroid exposure measures are very similar; areal weighting differs
22
Weighted PostPoint differs from both ED centroid and areal weighting
23
Where all three methods will give the same or similar results
24
Step 3: Making allowance for population movements
1. Long term population movement: Sheffield Health and Illness Prevalence study:
- 12,239 representative individuals aged 18-94 tracked from 1994 to 2002;
- 1,491 died; 1,572 left Sheffield.
- Of the 9,176 remaining:
» 70% did not move;
» 23% made 1 move;
» 5% made 2 moves;
» just over 1% made 3 moves;
» under 1% made 4 or more moves.
=> significant risk of misclassification of exposure level.
25
2. Short term population movement
26
Spatially smoothed CO average of the annual mean pollution levels (1994-1999, excluding 1998) for Sheffield enumeration districts (µg/m³): (i) 1 km; (ii) 2 km; (iii) 4 km
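The slide does not say how the smoothing was done. One simple possibility, sketched here purely as an assumption, is to replace each ED value with the unweighted mean of the values of all EDs whose centroids lie within a given radius, as a rough proxy for exposure during short-range daily movement. The coordinates (in km) and values are invented.

```python
import math


def smooth_within_radius(centroids, values, radius):
    """Replace each zone value with the unweighted mean of the values of
    all zones whose centroid lies within `radius` (the zone itself included)."""
    smoothed = []
    for xi, yi in centroids:
        nearby = [v for (xj, yj), v in zip(centroids, values)
                  if math.hypot(xi - xj, yi - yj) <= radius]
        smoothed.append(sum(nearby) / len(nearby))
    return smoothed


# Three hypothetical zone centroids on a line, 1.5 km apart.
centroids = [(0.0, 0.0), (1.5, 0.0), (3.0, 0.0)]
values = [10.0, 20.0, 60.0]
for radius in (1.0, 2.0, 4.0):
    print(radius, smooth_within_radius(centroids, values, radius))
```

Larger radii pull every zone towards the overall mean, flattening local contrasts in the exposure surface.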
27
Comparing indoor and outdoor air pollution exposure: People spend between 75% and 90% of their time indoors. Indoor pollution levels depend not only on outdoor emissions but also on housing conditions (cooking, heating, ventilation, etc.). Evidence on the relationship between indoor and outdoor pollution levels.
28
Mean indoor and outdoor concentrations (µg/m³) and indoor/outdoor ratio
Pollutant   Source                  Indoor   Outdoor   Indoor/outdoor
CO          Valerio et al (1997)    7.8      9.55      0.82
NO          Drakou et al (1998)     56.53    70.66     0.80
NO2         Drakou et al (1998)     67.14    88.04     0.76
NOx         Drakou et al (1998)     126.98   187.74    0.68
O3          Drakou et al (1998)     17.54    37.14     0.47
PM2.5       Lee et al (1997)        25.3     26.3      0.96
PM2.5       Fischer et al (2000)    19.5     23        0.85
PM10        Lee and Chang (2000)    120      134       0.90
PM10        Fischer et al (2000)    29.5     39.5      0.75
SO2         Lee et al (1997)        6.18     11.1      0.56
29
Statistical modelling issues
$$\ln[\lambda_i] = \ln[E_i] + \alpha + \beta_1 X_{1,i} + \beta_2 X_{2,i} + \dots + \beta_k X_{k,i} + \gamma Z_i$$
1. Overdispersion linked to spatially correlated missing covariates.
2. Sampling errors where data are based on surveys (e.g. lifestyle data).
Fitted spatially structured random effects models in WinBUGS (MCMC estimation) to handle overdispersion; Used posterior densities for some of the lifestyle covariates (e.g. smoking prevalence); WinBUGS output sent to GIS to map model output (e.g. area-specific risks).
30
Map of excess relative risks of coronary heart disease. An area $i$ is considered to have excess relative risk when 97.5% of the simulated values of the relative risk of area $i$ ($r_i$) are greater than 1.
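The mapping rule can be sketched as a check on the simulated (posterior) values of the relative risk for one area: compute the share of draws above 1 and flag the area when that share reaches 97.5%. The function names and the draws are hypothetical. In practice the draws would come from the MCMC output.

```python
def exceedance_probability(samples, threshold=1.0):
    """Share of simulated relative-risk values greater than the threshold."""
    return sum(1 for r in samples if r > threshold) / len(samples)


def has_excess_risk(samples, threshold=1.0, cutoff=0.975):
    """True when at least `cutoff` of the simulated values exceed the threshold."""
    return exceedance_probability(samples, threshold) >= cutoff


# Made-up draws for two areas: 98% and 95% of values above 1 respectively.
area_a = [1.3] * 980 + [0.9] * 20
area_b = [1.3] * 950 + [0.9] * 50
print(has_excess_risk(area_a), has_excess_risk(area_b))
```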
31
References:
P. Brindley, R. Maheswaran, T. Pearson, S. Wise and R. Haining (2004) "Using modelled outdoor air pollution data for health surveillance." In R. Maheswaran and M. Craglia (eds) GIS in Public Health Practice. Taylor and Francis, London, pp. 125-149.
P. Brindley, S. Wise, R. Maheswaran and R. Haining (2005) "The effect of alternative representations of population location on the areal interpolation of air pollution exposure." Computers, Environment and Urban Systems, Vol. 29, 455-469.
R. Maheswaran, R. Haining, P. Brindley, J. Law, T. Pearson and N. Best (2006) "Outdoor NOx and stroke mortality: adjusting for small area level smoking prevalence using a Bayesian approach." Statistical Methods in Medical Research, Vol. 15, 499-516.