© S.J. Coles 2006 Digital Repositories as a Mechanism for the Capture, Management and Dissemination of Chemical Data Simon Coles School of Chemistry, University.


1 © S.J. Coles 2006 Digital Repositories as a Mechanism for the Capture, Management and Dissemination of Chemical Data Simon Coles School of Chemistry, University of Southampton, U.K. s.j.coles@soton.ac.uk

2 © S.J. Coles 2006 Funding Body Viewpoint

3 © S.J. Coles 2006 Supporting Small Laboratory Working Practice. Data from experiments conducted as recently as six months ago might be suddenly deemed important, but those researchers may never find those numbers – or if they did might not know what those numbers meant. Lost in some research assistant's computer, the data are often irretrievable or an undecipherable string of digits. To vet experiments, correct errors, or find new breakthroughs, scientists desperately need better ways to store and retrieve research data. Data from Big Science is … easier to handle, understand and archive. Small Science is horribly heterogeneous and far more vast. In time Small Science will generate 2-3 times more data than Big Science. Source: Lost in a Sea of Science Data, S. Carlson, The Chronicle of Higher Education (23/06/2006).

4 © S.J. Coles 2006 The Information Environment Institutional Data Sources

5 © S.J. Coles 2006 A Data-Rich Subject – the Crystallography Problem [figure values: 30,000,000; 1,500,000; 450,000]

6 © S.J. Coles 2006 Data and Information Loss

7 © S.J. Coles 2006 Open Access as the Answer?

8 © S.J. Coles 2006 Separating Data from Interpretations: Underlying data; Intellect & Interpretation

9 © S.J. Coles 2006 Research & e-Science workflows [diagram: The scholarly knowledge cycle. Liz Lyon, eBankUK article. Ariadne, July 2003.] Diagram labels: Aggregator services (national, commercial); Repositories (institutional, e-prints, subject, data, learning objects); Data curation (databases & databanks); Validation; Harvesting metadata; Data creation / capture / gathering (laboratory experiments, Grids, fieldwork, surveys, media); Deposit / self-archiving; Peer-reviewed publications (journals, conference proceedings); Publication; Validation; Data analysis, transformation, mining, modelling; Searching, harvesting, embedding; Presentation services (subject, media-specific, data, commercial portals); Resource discovery, linking, embedding; Linking.

10 © S.J. Coles 2006 eBank-UK and the eCrystals Repository

11 © S.J. Coles 2006 Workflow Capture and Analysis: RAW DATA, DERIVED DATA, RESULTS DATA

12 © S.J. Coles 2006 The eCrystals Data Archive http://ecrystals.chem.soton.ac.uk

13 © S.J. Coles 2006 Access to the underlying data

14 © S.J. Coles 2006 Metadata Publication. Using simple Dublin Core: Crystal structure; Title (Systematic IUPAC Name); Authors; Affiliation; Creation Date. Additional chemical information through Qualified Dublin Core: Empirical formula; International Chemical Identifier (InChI); Compound Class & Keywords. Specifies which datasets are present in an entry. DOI: http://dx.doi.org/10.1594/ecrystals.chem.soton.ac.uk/145 ; Rights & Citation: http://ecrystals.chem.soton.ac.uk/rights.html ; Application Profile: http://www.ukoln.ac.uk/projects/ebank-uk/schemas/
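As a rough illustration of the metadata fields listed on this slide, the sketch below builds one record with Python's standard library. The `dc` namespace is the standard Dublin Core element set. The `ex` namespace, its element names (`affiliation`, `empiricalFormula`, `inchi`, `compoundClass`) and all the example values are hypothetical placeholders, not the eBank-UK application profile or a real eCrystals entry.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"      # standard Dublin Core element set
EX_NS = "http://example.org/ecrystals-sketch/"  # hypothetical; NOT the eBank-UK profile

ET.register_namespace("dc", DC_NS)
ET.register_namespace("ex", EX_NS)


def build_record(entry):
    """Build an illustrative metadata record for one crystal structure entry."""
    record = ET.Element("record")

    def add(ns, tag, text):
        element = ET.SubElement(record, f"{{{ns}}}{tag}")
        element.text = text

    # Simple Dublin Core fields named on the slide.
    add(DC_NS, "type", "Crystal structure")
    add(DC_NS, "title", entry["iupac_name"])
    for author in entry["authors"]:
        add(DC_NS, "creator", author)
    add(DC_NS, "date", entry["creation_date"])
    add(DC_NS, "identifier", entry["doi"])
    add(DC_NS, "rights", entry["rights_url"])
    for keyword in entry["keywords"]:
        add(DC_NS, "subject", keyword)

    # Chemical information; element names here are assumptions for illustration.
    add(EX_NS, "affiliation", entry["affiliation"])
    add(EX_NS, "empiricalFormula", entry["empirical_formula"])
    add(EX_NS, "inchi", entry["inchi"])
    add(EX_NS, "compoundClass", entry["compound_class"])
    return record


# Placeholder entry (benzene); every value below is made up for the example.
example_entry = {
    "iupac_name": "benzene",
    "authors": ["Researcher, A.", "Researcher, B."],
    "affiliation": "Example University",
    "creation_date": "2006-01-01",
    "doi": "doi:10.0000/example.1",
    "rights_url": "http://example.org/rights.html",
    "keywords": ["aromatic", "hydrocarbon"],
    "empirical_formula": "C6H6",
    "inchi": "InChI=1S/C6H6/c1-2-4-6-5-3-1/h1-6H",
    "compound_class": "organic",
}

record = build_record(example_entry)
print(ET.tostring(record, encoding="unicode"))
```

Keeping the chemistry-specific fields in a separate namespace mirrors the slide's split between simple and qualified Dublin Core; the real element names would come from the application profile linked on the slide.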

15 © S.J. Coles 2006 Metadata and Data Quality Control: Data manipulation toolbox; Associated Metadata; Value added; Format conversion

16 © S.J. Coles 2006 Harvesting & Aggregating: Google. Coles, S.J., Day, N.E., Murray-Rust, P., Rzepa, H.S., Zhang, Y., Org. Biomol. Chem., 2005, (10), 1832-1834. DOI: 10.1039/b502828k

17 © S.J. Coles 2006 Harvesting: OAIster
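A minimal sketch of what a metadata harvester does with a repository's records, assuming the repository exposes them over OAI-PMH in the `oai_dc` format (the slides do not name the protocol). The XML below is a hand-written placeholder response, not output from eCrystals or OAIster, and `harvest_titles` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Namespaces defined by the OAI-PMH 2.0 and Dublin Core specifications.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hand-written placeholder for a ListRecords response (two records).
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>placeholder compound A</dc:title>
          <dc:identifier>doi:10.0000/example.1</dc:identifier>
        </oai_dc:dc>
      </metadata>
    </record>
    <record>
      <header><identifier>oai:example.org:2</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>placeholder compound B</dc:title>
          <dc:identifier>doi:10.0000/example.2</dc:identifier>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def harvest_titles(xml_text):
    """Return (OAI identifier, title, DOI) for each record in a ListRecords response."""
    root = ET.fromstring(xml_text)
    harvested = []
    for rec in root.findall("oai:ListRecords/oai:record", NS):
        oai_id = rec.findtext("oai:header/oai:identifier", namespaces=NS)
        title = rec.findtext("oai:metadata/oai_dc:dc/dc:title", namespaces=NS)
        doi = rec.findtext("oai:metadata/oai_dc:dc/dc:identifier", namespaces=NS)
        harvested.append((oai_id, title, doi))
    return harvested


for row in harvest_titles(SAMPLE_RESPONSE):
    print(row)
```

An aggregator would typically fetch such responses from many repositories and index the extracted fields, following the DOI back to the full entry.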

18 © S.J. Coles 2006 Linking and aggregating

19 © S.J. Coles 2006 Embedded in a science portal

20 © S.J. Coles 2006 The Repository for the Laboratory – R4L

21 © S.J. Coles 2006 Repositories Supporting Laboratory Working Practice. eBank-UK / eCrystals concentrates on the dissemination of data compiled once a study is complete – ideal for complex studies. There is still a need to capture data from single-shot experiments on small laboratory instruments. To fully assure the quality and accuracy of metadata, it is essential to capture and describe data at the point when they are generated. Solution: a repository with the potential to store data and metadata as they are generated in the laboratory. Added bonus: a repository can manage data and provide automated report generation and data analysis tools.

22 © S.J. Coles 2006 Laboratory Repositories and Information Management

23 © S.J. Coles 2006 Workflow Analysis: Researcher, Compound, Experiment type, Timestamp; Sample preparation; Data acquisition; Deposit current dataset; Analyse (Refine experiment?); Complete experiment deposit
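The workflow on this slide can be read as a loop: record who, what and when, acquire data, deposit each dataset as it is produced, decide whether to refine, and finally close the deposit. The sketch below models that reading in Python. The class and function names (`ExperimentDeposit`, `run_experiment`, `acquire`, `needs_refinement`) are hypothetical and are not taken from the R4L software.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Dataset:
    description: str
    deposited_at: datetime


@dataclass
class ExperimentDeposit:
    # Metadata captured at the start: researcher, compound, experiment type, timestamp.
    researcher: str
    compound: str
    experiment_type: str
    started_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    datasets: list = field(default_factory=list)
    complete: bool = False

    def deposit_dataset(self, description):
        """Deposit the current dataset at the point it is generated."""
        if self.complete:
            raise RuntimeError("deposit is already complete")
        self.datasets.append(Dataset(description, datetime.now(timezone.utc)))

    def finish(self):
        """Mark the experiment deposit as complete."""
        self.complete = True


def run_experiment(deposit, acquire, needs_refinement, max_rounds=5):
    """Acquire, deposit, analyse; repeat while the experiment needs refining."""
    for round_no in range(1, max_rounds + 1):
        data = acquire(round_no)
        deposit.deposit_dataset(data)
        if not needs_refinement(data):
            break
    deposit.finish()
    return deposit


# Illustrative run: the third acquisition is judged good enough.
deposit = run_experiment(
    ExperimentDeposit("A. Researcher", "placeholder compound", "single-shot measurement"),
    acquire=lambda n: f"scan {n}",
    needs_refinement=lambda data: data != "scan 3",
)
print(len(deposit.datasets), deposit.complete)
```

Depositing inside the loop, rather than once at the end, is the point of the design: every intermediate dataset is timestamped and kept, not only the final one.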

24 © S.J. Coles 2006 The R4L Repository: Deposit; Search / Browse; Create new compound; Add experiment data and metadata

25 © S.J. Coles 2006 e-Research workflows [diagram: The eCrystals Federation Model] Diagram labels: Aggregator services; Institutional data repositories; Data curation & preservation (databases & databanks); Validation; Harvest; Data creation & capture in Smart lab; Deposit; Publishers (peer-review journals, conference proceedings); Publication; Validation; Data analysis, transformation, mining, modelling; Search, harvest; Presentation services (portals); Data discovery, linking, citation; Linking, citation; Laboratory repository; Deposit.

