Presentation on theme: "Dr. Karyn Morrissey, Department of Geography and Planning, University of Liverpool."— Presentation transcript:
Dr. Karyn Morrissey, Department of Geography and Planning, University of Liverpool
Data Many forms; Level 1: Micro and Macro Public policy – requires micro–level data Model impact of policy at the individual level, Incorporate demographic, economic, social, environmental factors, plus… Account for behavioural change (longitudinal models) Host of methods to analysis this data Descriptive Statistics, Probability Equations, Deterministic Equations
Current Methods Host of methods to analysis this data Descriptive Statistics, Probability Equations, Deterministic Equations Numerous methodologies to augment and refine this microdata; Add spatial referring: Spatial microsimulation Add behavioural aspects: Dynamic simulation, Agent Based Modelling We are quite good at this now (in theory)!
Big Data Fuelled by continuing technological advances Generation of enormous real-time datasets ‘Big data’ Particularly in cities – ‘Smart Cities’ Potential – provide entirely new information on the functioning of society At various scales and across time Understanding such data, remains a major challenge.
Data Generation Data is doubling every two years and last year (2011) 1.8 ‘zettabytes’ was collected Catone (2011) suggests that it is equivalent to the storage in 57.5 billion 32 GB iPads but, Whatever the analogy - number is too big to comprehend. The volume of data released each day exceeds anything that could be collected in the typical academic lifetime of a generation ago Most of this information will be lost
Data Generators/Holders Private companies collect for the first time more personal information than central government Also Volunteered data Typically collected by individuals with no formal training There is an increasing amount of such data that could be, and in some cases is being, incorporated into formal scientific analyses. Unlike company based data Much historical volunteered information is held by public organisations and agencies who have an obligation to make their data holdings publicly available
Data Integrity Change in who collects data is important; Particularly with regard to its representativeness, And therefore reliability and robustness for analysis More data - does not mean more quality data Statistical integrity will or is often not be a priority Especially in comparison with national census agencies. Most of these data collection processes differ from that of a prescribed scientific experiment This has to be taken into consideration when analysing the data, calibrating models or testing hypotheses
Data Integrity This is a major issue given the ONS plans to phase out the National Census The golden standard for population data Data collected by infrastructure based agencies/companies – transport, telecommunication’s, energy Tend to be of a high standard – high-tech metering, etc However, population data sets – Health records, social welfare, etc, Prone to error and importantly - deemed highly confidential
Big Data – Big Usage? Also, quantitative revolution: began to use ‘large’ datasets to garner insights into the societal processes Most simulation models still find it difficult to provide datasets for the population of the UK Still lack computational power And many of our methods are yet to be harnessed for the latest datasets due to the rapidity and frequency of data releases Particularly real time data – temporal modelling is difficult – time as a continuous rather than a traditionally modelled discrete entity…
Admin Data The National Psychiatric In-patient Reporting System (NPIRS) is maintained by the Health Research Board (HRB) in Ireland Includes data including addresses on all admissions to and discharges from, psychiatric in-patient facilities in Ireland annually. Patients addresses may be geo- coded for spatial analysis Observe link between patient’s home ward and distance to hospital Individuals that live beside hospitals – higher level of admissions? Does proximity influence admissions? Important policy question
New data – new steps Examine the relationship further - need to use data from two different micro-level data sources SMILE (Simulation Model of the Irish Local Economy): spatial profile of individuals with depression & access scores to APH (Morrissey et al., 2010) Admin data -NPIRS: spatial profile of individuals admitted to APH No way to infer rates of psychiatric admission from SMILE Cannot extrapolate results from NPIRS to general population – determinants are specific to the in-patient population Need a linking methodology to join both datasets together
New data – new steps Using propensity score matching (PSM) Found that closer proximity to an acute psychiatric hospital increases an individuals with self-reported depression probability of being admitted to an APH. Brunsdon and Comber, 2012 – found climate change effects on flowering, by using random co-efficient modelling However their first model, a simple OLS, indicated that flowering dates were becoming later!!!
New data – new steps Over 32,000 respondents !Great! However - potential survey bias I’m interested in extrapolating my findings on depression from this survey to the general population Non-random sample – people with a higher propensity/already have a CMD may specifically listen/respond to the show Cannot use OLS/Binary Models Need Sample Selection Models University of Liverpool (Kinderman and Pontin) and BBC Use BBC website to develop a wellbeing scale Launched - All in the Mind a BBC Radio 4 programme
Conclusion In ‘bite-size’ pieces, this new data, be it real-time, volunteered, administrative, company-based, can be highly useful… New methods, new ways of looking at data Need to be aware of it’s limitations Definite potential to link this new and big data to older quantitative methods to help inform policy Particularly as we as researchers become aware of it’s capacity and
Final Word Need to question – Do we need all this data? An answer to this may only be provided by turning to our ‘qualitative colleagues and the ‘data’ they collect