1 Digital Curation or Digital Data? The impact of Services and Federation Phil Lord Newcastle University

2 Take Home Messages
Curation is important for the CARMEN project and for neuroinformatics.
To enable repeatability and rerunability, curation of services is as important as curation of data.
To enable federation and autonomy, data release, license and other policies must be represented so that they can be computed over.

3 Research Challenge
Understanding the brain may be the greatest informatics challenge of the 21st century.
Worldwide, more than 100,000 neuroscientists (about 5,000 in the UK) are generating vast amounts of data.
Principal experimental data formats:
– molecular (genomic/proteomic)
– neurophysiological (time-series electrical measures of activity)
– anatomical (spatial)
– behavioural
Neuroinformatics concerns how these data are handled and integrated, including the application of computational modelling.

4 Need for Cooperation
The OECD Neuroinformatics Working Group identified the need to work cooperatively in order to achieve major advances.
Cooperation will permit:
– development of common processes
– best value from data, including long term curation
– ‘mega-analysis’ of large data sets
– integration of data sets across different scales and different approaches
– interdisciplinary research

5 CARMEN – Focus on Neural Activity
Resolving the ‘neural code’ from the timing of action potential activity.
[Figure labels: neurone 1, neurone 2, neurone 3]
– raw voltage signal data collected by patch-clamp and single and multi-electrode array recording
– novel optical recording, particularly the activity dynamics of large networks

6 CARMEN is a new e-Science Pilot Project (UK research council funded) in Neuroinformatics.
To create a grid-enabled, real time ‘virtual laboratory’ environment for neurophysiological data.
To develop an extensible ‘toolkit’ for data extraction, analysis and modelling.
To provide a repository for archiving, sharing, integration and discovery of data.
To achieve wide community and commercial engagement in developing and using CARMEN.
– CARMEN is a 4-year project: if it is to last longer, it must become financially self-sufficient.
See http://www.carmen.org.uk

7 CARMEN Active Information Repository Node

8 Dynamic Service Deployment - Dynasoar
[Diagram labels: Client; Web Server; R; CWSP; request (req) and response (res); Compute Machines (node 1 hosting s2, s5, …; node n hosting s2); Service Repository (SR); step 2: service fetch & deploy; CAIRN]
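The diagram labels suggest a flow in which a request reaches a web server, is routed to a compute node that already hosts the service, and otherwise triggers a fetch from the service repository followed by deployment. A minimal sketch of that flow is given here; all class and method names are hypothetical, and the choice of the node with the fewest deployed services is an assumption made for illustration, not Dynasoar's actual scheduling.

```python
from typing import Any, Callable, Dict, List

Service = Callable[[Any], Any]


class ServiceRepository:
    """Hypothetical store of deployable services, keyed by name."""

    def __init__(self, services: Dict[str, Service]) -> None:
        self._services = dict(services)

    def fetch(self, name: str) -> Service:
        return self._services[name]


class ComputeNode:
    """Hypothetical compute machine that hosts deployed services."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.deployed: Dict[str, Service] = {}

    def deploy(self, service_name: str, service: Service) -> None:
        self.deployed[service_name] = service

    def invoke(self, service_name: str, payload: Any) -> Any:
        return self.deployed[service_name](payload)


class WebServiceProvider:
    """Routes requests; deploys a service on demand if no node hosts it."""

    def __init__(self, nodes: List[ComputeNode], repository: ServiceRepository) -> None:
        self.nodes = nodes
        self.repository = repository

    def handle(self, service_name: str, payload: Any) -> Any:
        # 1. Prefer a node that already has the service deployed.
        for node in self.nodes:
            if service_name in node.deployed:
                return node.invoke(service_name, payload)
        # 2. Otherwise fetch the service and deploy it on the node that
        #    currently hosts the fewest services (an assumed heuristic).
        node = min(self.nodes, key=lambda n: len(n.deployed))
        node.deploy(service_name, self.repository.fetch(service_name))
        # 3. Serve the request from the freshly deployed service.
        return node.invoke(service_name, payload)


repo = ServiceRepository({"s2": lambda x: x * 2})
nodes = [ComputeNode("node1"), ComputeNode("node n")]
provider = WebServiceProvider(nodes, repo)
result = provider.handle("s2", 3)
```

After the first call the service stays deployed, so later requests for "s2" skip the repository entirely.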

9 Distribution and Federation Initially, we plan to have two CAIRNs.

10 Distribution and Federation

11 What about digital curation? Courtesy of Wikipedia

12 CARMEN’s perspective We wish to store data, store its provenance and store its usage. We need release policies, we need retention policies, and we need to understand ownership.

13 What do we get from this?
Replicability: one scientist should be able to repeat another’s experiment, under equivalent conditions, at a different time.
Rerunability: a scientist should be able to apply an equivalent technique under new circumstances.
The addition of services into this mix complicates the issue.
[Figure labels: New Data, Old Data, Replicability, Rerunability]

14 [Figure: grid of New Data / Old Data against Old Services / New Services, labelled with Replicability and Rerunability]
Questions shown on the grid:
– Is the specification of what happened actually right?
– Has the state of the world advanced since previously?
– Has the world changed, in a comparable way?
– Has the service changed in a comparable way?
Roles shown on the grid: Error-Prone Neuroscientist; Eager Neuroscientist; Neuroscientist comparing to existing work; Tool Builder
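To place a repeated analysis on the data/services grid, the provenance of each run has to record which data and which service versions were used. The sketch below assumes a hypothetical `Run` record with plain version strings; it only reports the grid cell and does not decide whether that cell counts as replication or rerunning.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    """Hypothetical provenance record for one analysis run."""
    data_version: str
    service_version: str


def grid_cell(original: Run, repeat: Run) -> str:
    """Describe where a repeated run sits relative to the original run."""
    data = "old data" if repeat.data_version == original.data_version else "new data"
    services = (
        "old services"
        if repeat.service_version == original.service_version
        else "new services"
    )
    return f"{data}, {services}"


original = Run(data_version="d1", service_version="s1")
cell = grid_cell(original, Run(data_version="d2", service_version="s1"))
```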

15 So, what is the problem?
I would like to rerun this experiment and release the results. Can I?
Is the new data available?
Is the new data public?
Does the license allow derived results?
Who owns the derived results?
– data license
– software license

16 So, what’s the problem?
Can I compare how new data would have changed the results?
– Is that data available? (New and Old)
– Is that data public? (New and Old) etc… Is it embargoed, and will it become public later?
– Do the licenses allow derived results?
– Who owns the derived results? The licenses may conflict.
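These questions can be asked mechanically once each dataset carries the relevant metadata. The following is a minimal sketch, assuming a hypothetical `Dataset` record with an availability flag, an optional public-release date and a derivatives flag; none of these field names come from CARMEN.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class Dataset:
    """Hypothetical metadata needed to answer the release questions."""
    name: str
    available: bool
    public_from: Optional[date]  # None means no public release is planned
    allows_derivatives: bool = True


def release_blockers(datasets: List[Dataset], today: date) -> List[str]:
    """List every reason why derived results cannot be released today."""
    blockers = []
    for d in datasets:
        if not d.available:
            blockers.append(f"{d.name}: data is not available")
        if d.public_from is None:
            blockers.append(f"{d.name}: data is not public")
        elif d.public_from > today:
            blockers.append(f"{d.name}: embargoed until {d.public_from.isoformat()}")
        if not d.allows_derivatives:
            blockers.append(f"{d.name}: license does not allow derived results")
    return blockers


old = Dataset("old", available=True, public_from=date(2006, 1, 1))
new = Dataset("new", available=True, public_from=date(2030, 1, 1))
blockers = release_blockers([old, new], today=date(2029, 6, 1))
```

An empty list means nothing in the recorded metadata blocks release; ownership of the derived results is not modelled here.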

17 CARMEN Active Information Repository Node

18 Whose release policy?

19 Policy Issues
One of the main purposes of the CAIRN is to hide the distribution.
What if the CAIRNs have different release policies?
What if they have different licenses?
We cannot inflict these differences on the user.
Therefore, we must be able to compute over policies.
We must be able to represent justifications back to the users.
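One way to read "compute over policies" is that each CAIRN exposes its release policy as something a federated query can evaluate, returning a reason along with every decision. The sketch below assumes this reading; the `Record` type, the two example policies and the CAIRN names are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Record:
    """Hypothetical data record held at one CAIRN."""
    record_id: str
    cairn: str
    owner: str


# A policy takes a record and a user and returns (allowed, reason).
Policy = Callable[[Record, str], Tuple[bool, str]]


def open_policy(record: Record, user: str) -> Tuple[bool, str]:
    return True, "released to everyone"


def owner_only_policy(record: Record, user: str) -> Tuple[bool, str]:
    if user == record.owner:
        return True, "requested by the data owner"
    return False, "restricted to the data owner"


def federated_query(
    records: List[Record], policies: Dict[str, Policy], user: str
) -> Tuple[List[Record], Dict[str, str]]:
    """Apply each CAIRN's own policy and keep a justification per record."""
    visible: List[Record] = []
    justifications: Dict[str, str] = {}
    for record in records:
        allowed, reason = policies[record.cairn](record, user)
        justifications[record.record_id] = f"{record.cairn}: {reason}"
        if allowed:
            visible.append(record)
    return visible, justifications


records = [Record("r1", "cairn-a", "alice"), Record("r2", "cairn-b", "bob")]
policies = {"cairn-a": open_policy, "cairn-b": owner_only_policy}
visible, why = federated_query(records, policies, user="alice")
```

The user sees one result set regardless of where the records live, and the justification map explains any record that was withheld.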

20 An Example: Licensing Computationally amenable licenses are available. Take, for example, Creative Commons.
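Creative Commons licenses are built from a small set of elements (BY, NC, ND, SA), which makes them easy to represent as sets. The sketch below is a deliberately simplified model of combining licensed inputs into a derived result: it ignores license versions and jurisdictions, and the function names are assumptions made for illustration.

```python
from typing import FrozenSet, Iterable, List, Optional, Tuple

License = FrozenSet[str]  # e.g. frozenset({"BY", "SA"}) stands for CC BY-SA


def name(lic: License) -> str:
    """Render a set of license elements in the usual CC order."""
    parts = [element for element in ("BY", "NC", "ND", "SA") if element in lic]
    return "CC " + "-".join(parts) if parts else "no restrictions"


def derived_license(inputs: Iterable[License]) -> Tuple[Optional[License], List[str]]:
    """Return the license a derived result must carry, or None with reasons."""
    inputs = list(inputs)
    # NoDerivatives inputs block the release of derived results outright.
    reasons = [f"{name(lic)} forbids sharing derived results"
               for lic in inputs if "ND" in lic]
    if reasons:
        return None, reasons
    # The derived result inherits every restriction of every input.
    required: License = frozenset().union(*inputs)
    # A ShareAlike input demands its own license, so it conflicts with any
    # restriction that it does not already contain.
    reasons = [f"{name(lic)} requires the same license, "
               f"but the combined inputs require {name(required)}"
               for lic in inputs if "SA" in lic and lic != required]
    if reasons:
        return None, reasons
    return required, [f"derived results must carry {name(required)}"]


BY = frozenset({"BY"})
BY_SA = frozenset({"BY", "SA"})
BY_NC = frozenset({"BY", "NC"})
ok, ok_reasons = derived_license([BY_SA, BY])
conflict, conflict_reasons = derived_license([BY_SA, BY_NC])
```

The returned reasons double as the justification that can be shown to the user when licenses conflict.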

21

22 Take Home Messages
Curation is important for the CARMEN project and for neuroinformatics.
To enable repeatability and rerunability, curation of services is as important as curation of data.
To enable federation and autonomy, data release, license and other policies must be represented so that they can be computed over.

23 Acknowledgements
Professor Colin Ingram, Professor Jim Austin, Professor Leslie Smith, Professor Paul Watson
Dr. Stuart Baker, Professor Roman Borisyuk, Dr. Stephen Eglen, Professor Jianfeng Feng, Dr. Kevin Gurney, Dr. Tom Jackson, Dr. Marcus Kaiser, Dr. Phillip Lord, Dr. Paul Overton, Dr. Stefano Panzeri, Dr. Rodrigo Quian Quiroga, Dr. Simon Schultz, Dr. Evelyne Sernagor, Dr. V. Anne Smith, Dr. Tom Smulders, Professor Miles Whittington
Christoph Echtermeyer, Martyn Fletcher, Frank Gibson, Mark Jessop, Dr. Bojian Liang, Juan Martinez-Gomez, Dr. Chris Mountford, Agah Ogungboye, Georgios Pitsilis, Dr. Daniel Swan
University of St Andrews, The University of Sheffield

