Criteria for the trustworthiness of data centres Jens Klump Helmholtz Centre Potsdam German Research Centre for Geosciences (GFZ) DataCite Summer Meeting 2010 Session 3: Trustworthiness of Data Centres – A technological, a structural and a legal discussion
Outline Objective Criteria for archives and criteria catalogues nestor Criteria Catalogue Translation of general criteria into practice Standards and certification
Objective Digital information has become part of our cultural heritage. Scientific findings are increasingly produced, documented, and presented in electronic form – often exclusively so. At the same time the underlying technology is changing rapidly. Many data are unique and cannot be produced again. To whom do we entrust our digital heritage?
Core Requirements for Digital Archives Published jointly in January 2007 by: the Digital Curation Centre (DCC, UK), DigitalPreservationEurope (DPE), nestor (Germany), and the Center for Research Libraries (USA). For repositories of all types and sizes, preservation activities must be scaled to the needs and means of the defined community.
nestor catalogue of criteria for trusted digital repositories The nestor working group produced a catalogue of criteria for trusted digital repositories. The catalogue of criteria is in the process of becoming a German DIN standard in preparation for ISO standardisation. The criteria follow the philosophy of ISO 9000.
Principles of the criteria catalogue The catalogue follows the reference model for open archival information systems (OAIS, ISO 14721:2003). The criteria of the catalogue have been kept abstract to allow transfer to any use case in digital archiving.
Structure of the Catalogue [Diagram] Three sections: Organisational Framework; Dealing with Objects; Infrastructure and Security. Each section is labelled with the same four principles: Documentation, Transparency, Adequacy, Measurability.
Organisational Framework The Digital Repository has defined its goals. The DR grants its designated community adequate access to the information represented by the digital objects. Legal and contractual rules are observed. The organisational form is appropriate for the DR. – Finance, staff, organisational structure, long-term planning, continuity beyond the existence of the repository The digital repository undertakes appropriate quality management.
Object Management I The digital repository ensures the integrity of the digital objects during all processing stages. – Ingest, Archival Storage, Access The digital repository ensures the authenticity of the digital objects during all stages of processing. The digital repository has a strategic plan for its technical preservation measures (preservation planning). The digital repository accepts digital objects from the producers based on defined criteria. (NB! Cost)
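The integrity criterion above is commonly put into practice with fixity checks: a checksum is recorded at ingest and recomputed at later stages. The following is a minimal sketch under that assumption; the catalogue itself does not prescribe a method, and the function names and the choice of SHA-256 are illustrative, not taken from the presentation.

```python
import hashlib


def sha256_of(data: bytes) -> str:
    """Return the SHA-256 hex digest of a digital object's bytes."""
    return hashlib.sha256(data).hexdigest()


def verify_fixity(data: bytes, recorded_digest: str) -> bool:
    """Compare the current digest with the one recorded at ingest."""
    return sha256_of(data) == recorded_digest


# Hypothetical object: the digest is recorded once, at ingest.
payload = b"example observation data"
digest_at_ingest = sha256_of(payload)

# Later (archival storage, access) the object is checked again.
intact = verify_fixity(payload, digest_at_ingest)
tampered = verify_fixity(payload + b"x", digest_at_ingest)
```

A matching digest shows only that the bytes are unchanged; authenticity and preservation planning need separate measures.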
Object Management: Ingest Process The ingest process is the most difficult and expensive part of long-term archiving. To minimise risks: – The digital repository specifies its submission information packages (SIPs). – The digital repository identifies which characteristics of the digital objects are significant for information preservation. – The digital repository has technical control of the digital objects in order to carry out long-term preservation measures.
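One way to read "the repository specifies its SIPs" is that each submission is checked against a written specification before it is accepted. The sketch below assumes a very small, hypothetical specification (at least one file, three required metadata fields); the field names and the `SubmissionPackage` type are invented for illustration and do not come from the nestor catalogue.

```python
from dataclasses import dataclass, field

# Hypothetical SIP specification: metadata every submission must carry.
REQUIRED_METADATA = ("title", "creator", "format")


@dataclass
class SubmissionPackage:
    files: dict = field(default_factory=dict)     # file name -> content bytes
    metadata: dict = field(default_factory=dict)  # descriptive metadata


def validate_sip(sip: SubmissionPackage) -> list:
    """Return a list of problems; an empty list means the SIP conforms."""
    problems = []
    if not sip.files:
        problems.append("package contains no files")
    for key in REQUIRED_METADATA:
        if not sip.metadata.get(key):
            problems.append(f"missing metadata field: {key}")
    return problems


good = SubmissionPackage(
    files={"data.csv": b"a,b\n1,2\n"},
    metadata={"title": "Example", "creator": "A. Author", "format": "text/csv"},
)
bad = SubmissionPackage(files={}, metadata={"title": "Example"})

good_problems = validate_sip(good)
bad_problems = validate_sip(bad)
```

Rejecting or returning a non-conforming package at this point is cheaper than repairing it after it has entered archival storage.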
Object Management II Archival storage of the digital objects is undertaken to defined specifications. The digital repository permits usage of the digital objects based on defined criteria. – Dissemination packages and transformation AIP -> DIP. The data management system is capable of providing the necessary digital repository functions. – Persistent identifiers and semantic relations, adequate metadata (format, content, identification, structure), provenance, technical description, usage rights, preservation of package structure.
Infrastructure and Security The IT infrastructure is appropriate. – Implementation of requirements, IT security The infrastructure protects the digital repository and its digital objects. – Protect against natural threats (e.g. fire, water, seismic activity) – Protect against risks caused by humans. The objects can be harmed directly by employees or through harmful programmes smuggled into the system (e.g. viruses). Protecting the data also involves preventing the unintentional forwarding of information by programmes (trojans) or people (espionage).
Addressing the Community The criteria of the catalogue have been kept abstract to allow transfer to all use cases in digital archiving. Implementation requires a translation into requirements and means of the community. What is required? (adequacy) What can be implemented now? Later? (compliance)
Levels of Compliance Example: ESA LTDP Common Guidelines, Issue 1, 11/2009
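The idea of compliance levels can be sketched as a simple check: each criterion becomes mandatory from a certain level onward, and a repository reaches a level once every criterion required up to that level is implemented. The criteria names and level numbers below are hypothetical and are not taken from the ESA LTDP Common Guidelines or the nestor catalogue.

```python
def achieved_level(criteria: dict, implemented: set) -> int:
    """Return the highest level whose cumulative requirements are all met.

    criteria maps a criterion name to the level from which it is mandatory.
    Returns 0 if even the lowest level is not met.
    """
    level = 0
    for candidate in sorted(set(criteria.values())):
        required = {name for name, lvl in criteria.items() if lvl <= candidate}
        if required <= implemented:
            level = candidate
        else:
            break
    return level


# Hypothetical criteria and the level from which each is mandatory.
CRITERIA = {
    "defined goals": 1,
    "fixity checks": 1,
    "persistent identifiers": 2,
    "preservation plan": 3,
}

implemented_now = {"defined goals", "fixity checks", "persistent identifiers"}
level_now = achieved_level(CRITERIA, implemented_now)
```

In this example the repository reaches level 2 now and would reach level 3 once a preservation plan is in place, which mirrors the "now" versus "later" question on the previous slide.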
Standards and Certification nestor Criteria Catalogue was due for voting at the German Bureau of Standards (DIN) on and was unanimously accepted. TRAC is being discussed by ISO. Certification? – Government archives are not interested in certification. – Data centres could be interested in certification to address requirements of science funding organisations and editorial boards. – Currently available certificates are not yet suitable for formal certification.
Summary We need trusted digital repositories to preserve our scientific heritage. The criteria catalogues help to specify the requirements towards digital archives. Before implementation the criteria need to be translated into the requirements of the designated user communities. Defined compliance levels may ease the transition. The federated nature of scientific research is best addressed by certification of trusted digital repositories according to compliance levels.