Research Data Access and Preservation Summit. Panel 2 - Promoting Re-Use of Scientific Collections. Some responses to the questions posed... John Harrison, SHAMAN Project, University of Liverpool
How do you handle organization of collections today? We created a highly structured hierarchy of directories within our storage system (currently iRODS). This allows logical separation, while preserving the association, of: collection data; supporting documentation (context, provenance); system policies; software code; configurations, workflows; discovery mechanisms (indexes).
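As a rough illustration of the layout described above, the following minimal sketch creates one subdirectory per category under a single collection root. It works on an ordinary local filesystem rather than iRODS, and the directory names and the `create_collection_layout` helper are hypothetical, chosen only for this example; the slide does not give the actual hierarchy the project used.

```python
from pathlib import Path
import tempfile

# Hypothetical subdirectory names, one per category listed on the slide.
# The actual names used in the project's hierarchy are not given in the source.
CATEGORIES = [
    "data",
    "documentation",
    "policies",
    "software",
    "configurations_and_workflows",
    "indexes",
]


def create_collection_layout(root: Path, collection: str) -> list[Path]:
    """Create one subdirectory per category under root/collection.

    Each category lives in its own directory (logical separation), while
    all of them share the same collection root (association).
    """
    base = root / collection
    created = []
    for name in CATEGORIES:
        path = base / name
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created


if __name__ == "__main__":
    # Build the layout in a throwaway temporary directory and list it.
    with tempfile.TemporaryDirectory() as tmp:
        for path in create_collection_layout(Path(tmp), "example_collection"):
            print(path.relative_to(tmp))
```

Keeping every category under one collection root means the data can be moved or replicated together with the documentation, policies and code needed to interpret it.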
What are the biggest issues with building collections for new communities? Scalability: the quantity of data is increasing rapidly. It becomes more important to select and prioritize the data with the most potential to be useful to future generations. Mechanisms for identifying useful items in large reference collections become more important.
When new communities access existing data collections, what new access capabilities are required? It's difficult to generalize; it depends a great deal on the expectations of the community in question. Viewing the data will be essential for all communities. One important aspect of our approach has been to develop a display technology that is independent of the originating application: emulation, but with a layer of abstraction from the operating system (the Java Virtual Machine). This provides a platform for developing new and unforeseen capabilities for interacting with legacy (potentially obsolete) file formats.
What level of description is required to meet the expectations of new communities? Impossible to say for certain; expectations evolve as technology develops. The best we can do: rigidly adhere to the most stringent and well-documented standards of today, and preserve the means for future generations to interpret these descriptions by preserving documentation on the standards (tag libraries and schemas for XML; ontologies).
Is long-term sustainability enabled through re-purposing of collections? Theoretically, yes; only time will tell for sure. The best chance of achieving sustainability is by using open standards to describe: digital objects, their structure and associations; metadata (for digital objects and the archive as a whole); data management policies and processes.
Are there other driving purposes behind promoting re-use of collections? Data may provide insights into unforeseen areas; for example, the results of drug trials might inform future drug development in the pharmaceutical industry. In such a highly regulated industry, the ability to get back to the raw data to ensure authenticity is very important!
Which institutions can be approached for sustaining re-purposed collections? So far, it seems to be mainly memory institutions (libraries, archives, museums) that are looking at issues of digital preservation. Anyone with significant data should be thinking about issues surrounding preservation of their knowledge/information assets. In the future, funding bids should consider the costs of preserving the results of the research. I think inevitably many organizations will end up outsourcing digital preservation.