
1 Synchronising Diversely Implemented Databases to Support Administration of Clinical Research Stuart Anderson Mark Hartswood Conrad Hughes CRISP (Clinical Research Information Systems Project) School of Informatics University of Edinburgh

2 Many administrative databases, much the same data Project R#30 Title “A very important study” … Project R#30 Title “A very important study” …

3 Share the data automatically! Project R#30 Title “A very important study” … Project R#30 Title “A very important study” …

4 Research organisations Research Organisations (ROs) in NHS Lothian administering clinical research projects: –NHS Research & Development Office –Wellcome Trust Clinical Research Facility (WTCRF) –Scottish Cancer Research Network (SCRN) –Experimental Cancer Medicine Centre (ECMC) NHS R&D involved in all projects, at least in terms of handling approvals

5 Project meta-information Project title Project start and end dates Project ethics status and research approval Project sponsors, funders and finance data Project personnel Sponsor and personnel contact details Patient lists and activity records … Supposedly the same data, but in different databases

6 A CRISPy Opportunity We could: –Reduce data entry costs –Improve data quality –Improve awareness of activity ...if we find ways to share common data between databases. Suits government “bureaucracy busting”

7 Options Looked at commercial solutions: –Some didn’t understand the complexity and risks (e.g. rsync in two directions) –Competent-sounding ones were prohibitively expensive (e.g. £170k per site) Our solution: DIY approach using free software

8 Harmony Document synchronisation framework –By Benjamin Pierce et al.: http://www.seas.upenn.edu/~harmony Reconciles changes made to multiple disconnected structured documents containing the same data (or subsets thereof - the “view update” problem), e.g. –Internet browser bookmarks files –Calendar applications Strong theoretical approach with emphasis on provable safety: changes only propagated under well-defined circumstances

9 Overview of Harmony [Diagram: Harmony takes RO1’s Document X, RO2’s Document X and the Archive (~Old X) as inputs, and produces Updated Document X (RO1), Updated Document X (RO2), a New Archive, and a log of changes and conflicts.]

10 Harmony operation: Equality
Before running Harmony:
  Archive: R#30 | A very important study | yes
  RO1’s Document X: R#30 | A very important study | yes
  RO2’s Document X: R#30 | A very important study | yes
After running Harmony:
  New Archive: R#30 | A very important study | yes
  New Document X (RO1): R#30 | A very important study | yes
  New Document X (RO2): R#30 | A very important study | yes
  Log: “Documents are equal”

11 Harmony operation: Changes
Before running Harmony:
  Archive: R#30 | A very important study | no
  RO1’s Document X: R#30 | A very important study | yes
  RO2’s Document X: R#30 | A very very important study | no
After running Harmony:
  New Archive: R#30 | A very very important study | yes
  New Document X (RO1): R#30 | A very very important study | yes
  New Document X (RO2): R#30 | A very very important study | yes
  Log: “Project R#30: title change propagated from RO2 to RO1; ethics change propagated from RO1 to RO2”

12 Harmony operation: Conflict
Before running Harmony:
  Archive: R#30 | A very important study | yes
  RO1’s Document X: R#30 | A very unimportant study | yes
  RO2’s Document X: R#30 | A very very important study | yes
After running Harmony:
  New Archive: R#30 | A very important study | yes
  New Document X (RO1): R#30 | A very unimportant study | yes
  New Document X (RO2): R#30 | A very very important study | yes
  Log: “Project R#30 conflict over title fields – R#30 title changes not propagated”
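
The three cases on slides 10 to 12 (equality, changes, conflict) can be sketched as a field-by-field three-way comparison against the archive. This is only an illustration of the behaviour shown on the slides, not Harmony's own code or API; the function name `reconcile`, the field names `title` and `ethics`, and the assumption that all three records carry the same fields are hypothetical.

```python
def reconcile(archive, ro1, ro2):
    """Reconcile one record field by field (illustrative sketch only).

    Assumes archive, ro1 and ro2 are dicts with the same field names.
    Returns (new_archive, new_ro1, new_ro2, log).
    """
    new_archive, new_ro1, new_ro2, log = {}, {}, {}, []
    for field, old in archive.items():
        a, b = ro1[field], ro2[field]
        if a == b:
            # Both sides agree (unchanged, or both made the same change).
            new_archive[field] = new_ro1[field] = new_ro2[field] = a
        elif a == old:
            # Only RO2 changed the field: propagate its value to RO1.
            new_archive[field] = new_ro1[field] = new_ro2[field] = b
            log.append(f"{field} change propagated from RO2 to RO1")
        elif b == old:
            # Only RO1 changed the field: propagate its value to RO2.
            new_archive[field] = new_ro1[field] = new_ro2[field] = a
            log.append(f"{field} change propagated from RO1 to RO2")
        else:
            # Both changed it differently: a conflict, so nothing moves.
            new_archive[field], new_ro1[field], new_ro2[field] = old, a, b
            log.append(f"conflict over {field}; changes not propagated")
    return new_archive, new_ro1, new_ro2, log
```

Run on the slide 11 data, both changes propagate and all three outputs agree; run on the slide 12 data, every document is left as it was and the conflict is logged for a human to resolve.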

13

14 Conflicts
Project Id | Field | NHS R&D | WTCRF
E06377 | Project title | Global Registry of Acute Coronary Events (GRACE) | GRACE: Global Registry of Acute Coronary Events
E01058 | Project title | PROCARDIS (Precocious Coronary Artery Disease) study | PROCARDIS - Precocious coronary artery disease study - a study to identify inherited causes of heart disease
E01033 | Project title | VITATOPS: a randomised controlled trial of vitamins to prevent stroke. | Vitatops: A randomised controlled trial of vitamins to prevent sroke.
600 pre-roll-out conflicts to resolve; these examples are fairly trivial

15 Provenance issues? Trust Alignment Form and meaning Authority Control

16 Trust Organisations are allowing other participants to write to their databases Do you trust them? –Alignment of goals –Need to establish confidence in each other’s procedures and practices –Established through regular meetings –Others might know more than you do

17 Alignment: record identity Need to identify which records in different databases refer to the same project, funding body or person Use R&D Number, assigned by NHS R&D, for projects –Creation complicated because projects may initially be entered (without R&D#) by ROs –Deletion complicated because some projects may leave scope but no projects should really be deleted Funding bodies and persons are handled more loosely –Identity and duplication less critical here

18 Establishing identity Syncing two database tables [Diagram: records in Database 1 and Database 2 matched by shared key (SK) values 3, 7 and 9.] Unique Shared Keys identify records across databases. Synchronising tables across two databases depends on having a unique shared key. This value has to be guaranteed to be unique within each table, and to identify corresponding records uniquely across databases.
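
The shared-key requirement can be checked before any synchronisation is attempted. The sketch below assumes each table is a list of dicts and that the key column name is passed in; the function name `align_by_shared_key` and the data layout are hypothetical, not taken from the project.

```python
from collections import Counter


def align_by_shared_key(table1, table2, key):
    """Pair up rows from two tables by a shared key (illustrative sketch).

    Raises ValueError if the key is not unique within either table.
    Returns (matched, only_in_1, only_in_2).
    """
    for name, table in (("table1", table1), ("table2", table2)):
        counts = Counter(row[key] for row in table)
        dupes = sorted(k for k, n in counts.items() if n > 1)
        if dupes:
            raise ValueError(f"{name}: shared key not unique: {dupes}")

    index1 = {row[key]: row for row in table1}
    index2 = {row[key]: row for row in table2}
    # Rows present on both sides are candidates for reconciliation.
    matched = {k: (index1[k], index2[k]) for k in index1 if k in index2}
    # Rows present on one side only are creations or out-of-scope records.
    only_in_1 = [k for k in index1 if k not in index2]
    only_in_2 = [k for k in index2 if k not in index1]
    return matched, only_in_1, only_in_2
```

Keys found on one side only are returned rather than acted on, since the slides note that creation and deletion need separate handling.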

19 Do they have the same meaning? Start/end date –Approval? Funding? Recruitment? Analysis? –Often driven by reporting requirements Some fields too contentious, not useful to share, so not included in sync –Option to synchronise separate meanings as separate fields Get parties to agree on common meanings –Valuable communications exercise among participants

20 Shared meaning = shared form? Field types/sizes Field values –N/A na None No Pending –Funder classification varies from DB to DB Personnel roles –One column per role or one row per role? Some adjustment and convergence possible to participants’ databases Transform data to “standard” on export/import
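
Transforming data to a “standard” form on export/import might look like the sketch below for the field-value case. The mapping is an assumption made for illustration: it treats “N/A”, “na” and “None” as spellings of one “not applicable” value and leaves “No” and “Pending” alone as distinct meanings. The slides do not say which values the participants actually agreed to merge.

```python
# Hypothetical set of spellings treated as "not applicable" (compared lower-case).
MISSING_SYNONYMS = {"n/a", "na", "none", ""}
STANDARD_MISSING = "N/A"


def normalise_value(value):
    """Map one raw field value to its standard form (illustrative sketch)."""
    if value is None:
        return STANDARD_MISSING
    text = str(value).strip()
    if text.lower() in MISSING_SYNONYMS:
        return STANDARD_MISSING
    return text


def normalise_record(record):
    """Apply normalise_value to every field of an exported record."""
    return {field: normalise_value(value) for field, value in record.items()}
```

Applying the same normalisation on every site's export means that differences in spelling alone are not reported as conflicts.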

21 Authority Harmony is symmetric: no peer to a sync gets priority Some information should only be sourced by R&D (responsible for approvals) Some information is best sourced by ROs (personnel, funding) But: –Databases involved don’t record sources of information –Strict rules impair usability and make for an unpopular (and unused) system Solution: –Emphasise audit over control –But provide limited inter-site control at data import stage
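
One way to read “audit over control, with limited control at import” is sketched below: every incoming change is written to an audit log, and only a small set of fields is restricted to a named source. The field names in `R_AND_D_ONLY_FIELDS`, the source label "NHS R&D" as a string, and the function `import_changes` are all hypothetical; the slides do not describe the actual rules used.

```python
# Hypothetical list of fields that only NHS R&D may change.
R_AND_D_ONLY_FIELDS = {"ethics_status", "research_approval"}


def import_changes(local, incoming, source, audit_log):
    """Apply incoming field values to a local record (illustrative sketch).

    Every differing field is appended to audit_log; restricted fields are
    only updated when the change comes from NHS R&D.
    """
    updated = dict(local)
    for field, value in incoming.items():
        if local.get(field) == value:
            continue  # nothing to change, nothing to audit
        accepted = field not in R_AND_D_ONLY_FIELDS or source == "NHS R&D"
        audit_log.append({
            "field": field,
            "source": source,
            "old": local.get(field),
            "new": value,
            "accepted": accepted,
        })
        if accepted:
            updated[field] = value
    return updated
```

Rejected changes still appear in the audit log, so a disagreement stays visible to both sites instead of being silently dropped.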

22 Control Each database contains organisation-specific (and private) information Some content is just irrelevant to others Some patient data! Solution: import/export script run locally by each organisation only exports a chosen subset of tables, rows and columns
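
The export side of that solution can be sketched as a filter that keeps only chosen columns and rows of a table before anything leaves the organisation. The function name `export_subset`, the row predicate and the example column names are hypothetical, and a table is assumed to be a list of dicts.

```python
def export_subset(rows, columns, include_row=lambda row: True):
    """Return only the chosen columns of the chosen rows (illustrative sketch).

    rows        -- list of dicts, one per record in a local table
    columns     -- names of the columns the organisation agrees to share
    include_row -- predicate deciding whether a record is in scope
    """
    return [
        {column: row[column] for column in columns}
        for row in rows
        if include_row(row)
    ]


# Usage with made-up data: share id and title of projects that have an
# R&D number, and never export the private notes column.
projects = [
    {"rd_number": "R#30", "title": "A very important study", "notes": "private"},
    {"rd_number": None, "title": "Not yet registered", "notes": "private"},
]
shared = export_subset(
    projects,
    columns=["rd_number", "title"],
    include_row=lambda row: row["rd_number"] is not None,
)
```

Because the script runs locally, the choice of what to share stays with each organisation; which tables are exported at all would be decided by which ones the script is called on.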

23 Benefits Data only entered once for all Everyone takes responsibility for data they’re most expert in Disagreement (“conflict”) is permitted, and may be resolved through human-human communication Limited (inter-site) audit operating so expect/hope for responsible behaviour

24 Conclusion Real data synchronisation application has been far from the theoretical ideal –Issues of alignment, scope, identity, policy, trust, data quality, form and meaning Solutions to problems encountered aren’t just technical: organisational engagement and trust have been essential in keeping the task tractable Rolling out now, so reality yet to be seen –Depends on fair balance of effort and reward among participants

25 Thank you! conrad.hughes@ed.ac.uk School of Informatics University of Edinburgh

