SAIL: Documenting data content and quality, letting the computer take the strain Caroline Brooks Senior Research Analyst, College of Medicine, Swansea.

Presentation transcript:

SAIL: Documenting data content and quality, letting the computer take the strain
Caroline Brooks, Senior Research Analyst, College of Medicine, Swansea University
Ann Wrightson, Lead Technical Design Architect, NHS Wales Informatics Service; Hon. Research Associate, College of Medicine, Swansea University

Swansea Health Informatics Research & NWIS
  Partners in establishing and sustaining SAIL
  Wider collaboration in usability testing and innovation
  Sharing skills & thinking around secondary uses of data

Ideas and facts
  General approaches in data research:
    People have ideas and test them using the available facts
    Ideas come from the available facts
  But facts are not so easy to see in the data!
  Researchers need help...
    Which data resources contain the facts I need?
    What do I need to know about this data to use it well?

What’s in this repository, anyway?
  Dataset level - catalogue
    What/from where/from whom/how collected/rights to use
  Record level - dataset entry description
    Data model (entity-relationship model)
  Item level - field/attribute description
    Data types/ranges/controlled terms
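The three levels above could be modelled along the following lines. This is only a minimal sketch, not the SAIL catalogue's actual structure: every class name, field name and the sample dataset are hypothetical, chosen to mirror the wording of the slide.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class ItemDescription:
    """Item level: one field/attribute (hypothetical structure)."""
    name: str
    data_type: str
    allowed_range: Optional[Tuple[str, str]] = None
    controlled_terms: Optional[List[str]] = None


@dataclass
class RecordDescription:
    """Record level: one entity in the dataset's data model."""
    entity: str
    items: List[ItemDescription] = field(default_factory=list)
    related_entities: List[str] = field(default_factory=list)


@dataclass
class DatasetCatalogueEntry:
    """Dataset level: what, from where, how collected, rights to use."""
    title: str
    source: str
    collection_method: str
    usage_rights: str
    records: List[RecordDescription] = field(default_factory=list)


def find_datasets_with_item(catalogue, item_name):
    """Return titles of datasets that describe an item with the given name."""
    return [
        entry.title
        for entry in catalogue
        if any(item.name == item_name
               for record in entry.records
               for item in record.items)
    ]


# Illustrative, made-up catalogue entry.
catalogue = [
    DatasetCatalogueEntry(
        title="Example admissions dataset",
        source="Example hospital system",
        collection_method="Routine administrative collection",
        usage_rights="Approved research projects only",
        records=[
            RecordDescription(
                entity="admission",
                items=[
                    ItemDescription("admission_date", "date"),
                    ItemDescription("sex", "code", controlled_terms=["F", "M"]),
                ],
                related_entities=["patient"],
            )
        ],
    )
]

matches = find_datasets_with_item(catalogue, "admission_date")
```

A lookup such as `find_datasets_with_item` is one way a researcher's question "which data resources contain the facts I need?" might be answered from structured catalogue metadata.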

How good is this data? What can it do for me?
  Item
    Population of this field/attribute - Why present? Why absent?
    Significance of this field/attribute - What does it mean for me?
  Record
    Evidential value of presence and/or absence of a particular record
  Dataset
    What work has already been done with this data?

Work already done
  SAIL databank website includes human-readable dataset catalogue
    Description, source, related publications, data model
  Data Quality report (developed by SAIL team in 2013)
    Standardized informative documentation for each dataset
    Produced by automated analysis of data, published as PDF
  Working with Canadian colleagues (MCHP and Pop Data BC)
  Technology refresh of SAIL platform (CIPHER project – )
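The slide does not say how the automated analysis behind the Data Quality report works. As a rough illustration of the idea, the sketch below profiles how well each field is populated in a small set of records. The function names, the definition of "missing" (None or empty string) and the sample rows are all assumptions made for this example.

```python
def profile_field(rows, field_name):
    """Summarise population of one field across a list of dict records.

    Assumption: a value counts as missing if it is None, an empty string,
    or the key is absent from the record.
    """
    values = [row.get(field_name) for row in rows]
    populated = [v for v in values if v not in (None, "")]
    total = len(values)
    return {
        "field": field_name,
        "total": total,
        "populated": len(populated),
        "missing": total - len(populated),
        "completeness": len(populated) / total if total else 0.0,
        "distinct": len(set(populated)),
    }


def profile_dataset(rows):
    """Profile every field that appears in at least one record."""
    field_names = sorted({key for row in rows for key in row})
    return [profile_field(rows, name) for name in field_names]


# Made-up records, for illustration only.
rows = [
    {"id": 1, "sex": "F", "smoker": "Y"},
    {"id": 2, "sex": "M", "smoker": ""},
    {"id": 3, "sex": None, "smoker": None},
    {"id": 4, "sex": "F"},
]

report = profile_dataset(rows)
for entry in report:
    print(f'{entry["field"]:<8} completeness={entry["completeness"]:.2f} '
          f'distinct={entry["distinct"]}')
```

A profile like this only shows that a field is absent, not why; the "why present, why absent" questions raised earlier still need knowledge of how the data was collected.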

Work in progress
  Machine-readable format for catalogue and data quality information
    Data Documentation Initiative (DDI) format
    Initial target: publish on website as download link in catalogue
  Making outcomes of in-depth data quality work available for reuse
    Algorithms that instantiate clinical & social research concepts
    Evaluation of data coverage across populations of individuals
  Knowledge sharing with NWIS data warehouse team
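To give a feel for what a machine-readable catalogue entry might look like, the sketch below serialises a dataset description and per-field quality figures to XML. The element names are borrowed loosely from DDI Codebook, but the `completeness` element is invented for this example and the output is not validated against any DDI schema. Treat it as an illustration of the approach, not as the format SAIL publishes.

```python
import xml.etree.ElementTree as ET


def catalogue_entry_to_xml(title, source, variables):
    """Build a small DDI-inspired XML document (illustrative, not schema-valid).

    `variables` is a list of dicts with hypothetical keys:
    "name", "label" and "completeness" (a fraction between 0 and 1).
    """
    root = ET.Element("codeBook")

    # Dataset-level description.
    study = ET.SubElement(root, "stdyDscr")
    ET.SubElement(study, "titl").text = title
    ET.SubElement(study, "sources").text = source

    # Item-level descriptions with a data quality figure attached.
    data = ET.SubElement(root, "dataDscr")
    for variable in variables:
        var = ET.SubElement(data, "var", name=variable["name"])
        ET.SubElement(var, "labl").text = variable["label"]
        # "completeness" is an invented element, not part of DDI.
        ET.SubElement(var, "completeness").text = f'{variable["completeness"]:.2f}'

    return ET.tostring(root, encoding="unicode")


# Made-up example input.
xml_text = catalogue_entry_to_xml(
    title="Example admissions dataset",
    source="Example hospital system",
    variables=[
        {"name": "sex", "label": "Sex of patient", "completeness": 0.75},
        {"name": "smoker", "label": "Smoking status", "completeness": 0.25},
    ],
)
print(xml_text)
```

A file produced this way could be offered as the download link mentioned in the slide, alongside the human-readable PDF.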

Future directions
  Further work on characterizing concepts in data - reproducible, reusable
  How to make good use of SNOMED CT in source data
    New knowledge & skills needed, also issues with old/new data
    NWIS also working on this, another good area for collaboration
  More general use of knowledge models alongside data
    Comprehensive & integrated metadata reference architecture
    Data annotation, e.g. using biomedical science ontologies

Thank you for your attention