AHM 2008, 11th September 2008
Supporting Security-Oriented Interdisciplinary Research: Crossing the Social, Clinical and Geospatial Domains
Prof. Richard O. Sinnott
National e-Science Centre, University of Glasgow, Scotland
email@example.com

The Context
Many Grids
– EGEE, NGS, D-Grid, Naregi, OSG, …
Many definitions and standards
– OGSA, OGSI, WSRF, WS-I, WS-ACRONYM-GOES-HERE, …
Many solutions
– Semantic web/Grid, …
– Web 2.0, wikis, mash-ups, collaboratories, clouds (fluffy/ill-defined!), …
– Unicore, Globus, gLite, WS-…, OGSA-DAI, …
'Tis all a bit (a lot!!!) of a mess… couple that with the data and knowledge explosion in many (all?) domains, and we have a recipe for chaos
"Grid" to me is a solution that supports
– (simple) seamless access to a heterogeneous variety of compute and data resources
  » Often domain specific, especially data!
– (simple) single sign-on
– researchers and research, especially inter- and trans-disciplinary research
  » Often at the risk of being non-sexy!

Example of Inter-disciplinary Research
Typical query
– What is the correlation between living adult males over 50 years of age in Scotland who have had type-2 diabetes for 5 years or more and those employed in manual versus office jobs, i.e. does having type-2 diabetes imply that those afflicted are more likely to be employed in manual or office jobs?
– Where in Scotland is this most prevalent?
– Why?
  » Health inequalities, impacts of policies, …
For example…
– Male life expectancy for the whole of Glasgow averages 70.7 years
– In East Glasgow, it goes right down to 53.9 years in the Calton ward
– UK national average 77 years, Mongolia 65, Ghana 59, Gambia 54
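
To make the typical query concrete, here is a minimal sketch of the cross-domain join it implies: clinical records filtered to the cohort, joined to occupational records, then tabulated by region and job type. Everything in it is an assumption for illustration: the table layouts, the column names, the shared `person_id` key and the toy rows are all invented and contain no real data. It only does the join and the counting; a real correlation analysis (and the access control needed to reach the data at all) would sit on top of this.

```python
import sqlite3

# Hypothetical toy rows, invented purely for illustration (not real data).
CLINICAL = [  # (person_id, sex, age, alive, region, years_with_t2d)
    (1, "M", 62, 1, "Glasgow", 7),
    (2, "M", 55, 1, "Glasgow", 6),
    (3, "M", 58, 1, "Edinburgh", 0),
    (4, "M", 71, 1, "Edinburgh", 9),
    (5, "F", 66, 1, "Glasgow", 8),
    (6, "M", 45, 1, "Glasgow", 10),
]
OCCUPATION = [  # (person_id, job_type)
    (1, "manual"), (2, "manual"), (3, "office"),
    (4, "office"), (5, "office"), (6, "manual"),
]


def cohort_job_counts():
    """Count manual vs office jobs, per region, for the cohort in the query:
    living males over 50 with type-2 diabetes for 5 years or more."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE clinical (person_id INTEGER, sex TEXT, age INTEGER, "
        "alive INTEGER, region TEXT, t2d_years INTEGER)"
    )
    con.execute("CREATE TABLE occupation (person_id INTEGER, job_type TEXT)")
    con.executemany("INSERT INTO clinical VALUES (?, ?, ?, ?, ?, ?)", CLINICAL)
    con.executemany("INSERT INTO occupation VALUES (?, ?)", OCCUPATION)
    rows = con.execute(
        """
        SELECT c.region, o.job_type, COUNT(*)
        FROM clinical AS c
        JOIN occupation AS o ON o.person_id = c.person_id
        WHERE c.sex = 'M' AND c.alive = 1 AND c.age > 50 AND c.t2d_years >= 5
        GROUP BY c.region, o.job_type
        ORDER BY c.region, o.job_type
        """
    ).fetchall()
    con.close()
    return rows


if __name__ == "__main__":
    for region, job_type, n in cohort_job_counts():
        print(region, job_type, n)
```

In practice the two tables live in different domains, with different owners and different access policies, which is exactly why the single `JOIN` here is the hard part.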

Data, Data, Data
No magic bullet for data management on the Grid
You can use data if you
– a/ know where it is,
– b/ are allowed to access it,
– c/ know its format,
– d/ trust it is authentic,
– e/ are sure of its quality,
– f/ have the right local widget to talk to the right remote widget
– z/ …
a/ there are MANY, MANY, MANY resources out there
– 'Tis scary just how big the internet is! (see later)
b/ there are MANY, MANY, MANY ways to define and enforce access policies
– Grid sexy security stuff vs real world of NHS, range of data stakeholders, ethics, information governance, sys-admin/user perspectives, …

Data Grids
There is no single solution
– Why? Things change
  » Science revolution
  » Grid technology revolution
General principles/patterns are what we need
– How do I set up a Virtual Organisation to do research into X?
  » Connecting users/software/resources across sites
  » Seamless access, end-to-end security, …
– How do I connect multiple Virtual Organisations to do research into X, Y and Z?
  » Clinical VOs vs other VOs

AAAAAAAA Grid Security A A
Users like usernames/passwords
– Provide them (once!)
Users don't like/understand X.509-based PKI
– Forget training, education for most users!
  » $> openssl pkcs12 -in cert.p12 -clcerts -nokeys -out usercert.pem
– The vast majority most certainly won't jump through hoops to get on the Grid
  » "me-Science" culture
Should all be transparent to end users and aligned with the way they want to work/access resources
– Access Management Federation (Shibboleth) + authZ technologies

Shibboleth Decentralised Approach
[Diagram: User, W.A.Y.F. (Federation), Identity Provider (Home Institution), Service Provider (Shib Frontend, Grid Portal, Grid Application); component labels AuthN, LDAP, AuthZ, uid]
1. User points browser at Grid resource/portal
2. Shibboleth redirects user to W.A.Y.F. service
3. User selects their home institution
4. Home site authenticates user and pushes attributes to the service provider
5. Pass authentication info and attributes to authZ function
6. Make final AuthZ decision
Log in once and roam
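
The six steps of the Shibboleth flow can be walked through in a toy, in-process model. This is only a sketch of the control flow, assuming made-up class names, users, institutions and role attributes; it is not the Shibboleth API, there are no browser redirects or SAML messages, and the plain-text password check merely stands in for the home institution's LDAP-backed authentication.

```python
from dataclasses import dataclass

# All class names, users, institutions and attributes here are hypothetical.


@dataclass
class Assertion:
    """What the home institution pushes to the service provider (step 4)."""
    user: str
    idp: str
    attributes: dict


class IdentityProvider:
    def __init__(self, name, directory):
        self.name = name
        self._directory = directory  # stands in for the LDAP directory

    def authenticate(self, username, password):
        entry = self._directory.get(username)
        if entry is None or entry["password"] != password:
            return None  # authentication failed at the home site
        return Assertion(username, self.name, dict(entry["attributes"]))


class ServiceProvider:
    def __init__(self, federation, required_role):
        self.federation = federation      # W.A.Y.F. lookup: institution -> IdP
        self.required_role = required_role

    def access(self, institution, username, password):
        # Step 1: the user arrives at the portal (this call).
        # Steps 2-3: W.A.Y.F. resolves the institution the user selected.
        idp = self.federation.get(institution)
        if idp is None:
            return False
        # Step 4: the home site authenticates and returns attributes.
        assertion = idp.authenticate(username, password)
        if assertion is None:
            return False
        # Steps 5-6: attributes go to the authZ function, which decides.
        return self.authorise(assertion)

    def authorise(self, assertion):
        return self.required_role in assertion.attributes.get("roles", ())


glasgow_idp = IdentityProvider(
    "glasgow-idp",
    {
        "alice": {"password": "s3cret", "attributes": {"roles": ["vo-researcher"]}},
        "bob": {"password": "hunter2", "attributes": {"roles": []}},
    },
)
portal = ServiceProvider({"University of Glasgow": glasgow_idp}, "vo-researcher")

if __name__ == "__main__":
    print(portal.access("University of Glasgow", "alice", "s3cret"))
    print(portal.access("University of Glasgow", "bob", "hunter2"))
```

The point of the split is that the service provider never sees or stores passwords: it only consumes attributes, so the final authorisation decision stays local to the resource.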

Inter-disciplinary Data
Data, data everywhere… Or better yet, services, services everywhere…
Clinical data (VOTES project)
– Primary care data
– Secondary care data
– Disease registries, …
Social science data (DAMES project)
– Occupational data
– Social classification
– Census data (educational, housing, family, ...)
– Survey data sets
– Ethnicity, …
Geospatial data (SeeGEO project)
– EDINA UK Borders
– DigiMap

Security-oriented Socio-, Geo-, Clinical Data Infrastructures
[Diagram labels: Licenses, privileges; Others]

VOTES
Virtual Organisations for Trials and Epidemiological Studies
3-year (£2.8M) MRC-funded project, started October 2005
Plans to develop a framework for producing Grid infrastructures to address key components of clinical trials/observational studies
– Recruitment of potentially eligible participants
– Data collection during the study
– Study administration and coordination
Involves Glasgow, Oxford, Leicester/Nottingham, Manchester, Imperial
– Strong links with UK Biobank

VOTES Scottish Experiences
Scottish Data Space… up to now
Scottish Care Information (SCI) Store
– Hospital batch system rolled out across Scotland (lab data, patient records, …)
Scottish Morbidity Records (SMR)
– Aggregated clinical records from the last 40 years across Scotland
– We have been given pseudo-anonymised
  » SMR01A General acute inpatient and day case discharges (3,719,206 records)
  » SMR04A Psychiatric and mental handicap hospitals and units: admissions, residents and discharges (241,599 records)
  » SMR06A Scottish cancer registrations (171,167 records)
  » SMR99A Deaths (173,615 records)
General Practitioners Administration System for Scotland (GPASS)
– Used by 85% of GPs across Scotland
Consent
– Opt-in/opt-out: trial, study, disease area, …
Applied in a range of areas/projects: UK Biobank, congenital anomaly, brain trauma, diabetes, knee pain/obesity, prostate cancer, …
Community Health Index (CHI) number key to this!
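
One common way to link pseudo-anonymised records that share an identifier such as the CHI number is to replace the identifier with a keyed hash, so that records from different sources still line up without the raw number being exposed. The sketch below illustrates that general technique only; it is an assumption, not a description of how VOTES or the SMR extracts were actually pseudonymised. The key, the identifier values and the record payloads are all invented.

```python
import hashlib
import hmac

# Hypothetical key: in practice it would be held only by a trusted party.
SECRET_KEY = b"demo-key-not-for-real-use"


def pseudonymise(identifier: str, key: bytes = SECRET_KEY) -> str:
    """Map an identifier to a stable pseudonym using HMAC-SHA256."""
    return hmac.new(key, identifier.encode("ascii"), hashlib.sha256).hexdigest()


def link(left, right):
    """Join two lists of (pseudonym, payload) pairs on the pseudonym."""
    index = {}
    for pid, payload in right:
        index.setdefault(pid, []).append(payload)
    return [(pid, a, b) for pid, a in left for b in index.get(pid, [])]


# Invented identifiers and payloads, for illustration only.
hospital_rows = [
    (pseudonymise("0000000001"), "hospital discharge record"),
    (pseudonymise("0000000002"), "cancer registration record"),
]
gp_rows = [(pseudonymise("0000000001"), "GP record")]

linked = link(hospital_rows, gp_rows)

if __name__ == "__main__":
    for pid, a, b in linked:
        print(pid[:12], a, b)
```

Because the hash is keyed, two sites can only produce matching pseudonyms if they share the key, which keeps linkage under the control of whoever governs that key.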

DAMES
NCeSS Data Management through e-Social Science node
Led by Stirling
NeSC Glasgow involvement started August 2008
– Occupational data
– Social classification
– Census data (educational, housing, family, ...)
– Survey data sets
– Ethnicity, …

Conclusions
Systems driven by Information Governance/Ethics
– MREC, LREC, PAC, PIAG, Caldicott Guardians, Joe Public
Once defined, have tools/techniques to rapidly roll out e-Infrastructures to support researchers
– Diabetes? Cancer? Obesity? Smoking? Health/Wealth? Genetics and Healthcare? Nature/Nurture?
Focus not on a single VO but on supporting many VOs that have their own access/usage policies
Understanding data models is ESSENTIAL to make any of this work!!!
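
As a rough illustration of "many VOs that have their own access/usage policies", the sketch below evaluates one user's attributes against several independent VO policies. The VO names, attribute strings and dataset names are hypothetical, and the rule used (a user must hold every attribute a VO requires) is just one simple policy model chosen for the example.

```python
from typing import Dict, Set

# Hypothetical VOs, attributes and dataset names, for illustration only.
VO_POLICIES = {
    "clinical": {
        "requires": {"role:clinician", "ethics:approved"},
        "datasets": {"hospital-records", "gp-records"},
    },
    "social": {
        "requires": {"role:social-scientist"},
        "datasets": {"census", "occupations"},
    },
    "geospatial": {
        "requires": {"licence:mapping"},
        "datasets": {"boundaries"},
    },
}


def accessible_datasets(user_attributes: Set[str]) -> Dict[str, Set[str]]:
    """Return, per VO, the datasets a user may reach.

    Each VO decides on its own: access is granted only if the user holds
    every attribute that VO requires.
    """
    return {
        vo: set(policy["datasets"])
        for vo, policy in VO_POLICIES.items()
        if policy["requires"] <= user_attributes
    }


if __name__ == "__main__":
    print(accessible_datasets({"role:social-scientist", "licence:mapping"}))
```

A user with a social science role and a mapping licence gets the social and geospatial data but nothing clinical, with no VO needing to know about the others' rules.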