PV2013 Summary Results Data Stewardship Interest Group WGISS-37 Meeting Cocoa Beach (Florida-US) - April 14-18, 2014
Slide 2 PV 2013 Event The PV2013 conference was held at ESA-ESRIN in Frascati on 4-6 November 2013. 100 registered participants (94 attending the conference); 40 oral presentations; 22 posters.
Slide 3 PV 2013 Objectives Address prospects in the domain of scientific and technical data preservation, together with value adding to these data. Provide a forum where organizations dealing with the preservation of their own data and with value adding can present the status of their activities, plans and expectations. Share knowledge, experiences and lessons learned, and foster cooperation.
Slide 5 PV 2013 Topics Adding value to data and facilitation of data use; Ensuring long-term data and knowledge preservation; Data preservation: lessons learnt & Future prospects and cooperation.
Slide 6 Adding value to data and facilitation of data use (1) DATA CONSOLIDATION, ARCHIVING AND PRESERVATION Consolidation of raw (or Level 0) data records, where necessary rescuing them from old media; preservation of all raw data (you might not know what will be relevant in the future); completion of time series of data; recovery of analogue data from the pre-digital era. Preservation of all digital assets in a trustworthy state; certification of repositories; duplication of data to prevent loss (catastrophic events). Historical archives to be made attractive for users and exploited; pre-processing and (re)processing of data records into higher level products; use of persistent identifiers and exposure of data as linked data. Science value increases with the number of connected services and data products (e.g. combination of in-situ data and satellite data).
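As an illustration of the "duplication of data to prevent loss" point on Slide 6, one widely used way to check that a duplicated copy still matches its source is a fixity (checksum) comparison. The following is a minimal Python sketch and was not part of the PV2013 material; the function names `sha256_of` and `compare_replicas`, the choice of SHA-256, and the assumption that both copies share the same directory layout are illustrative assumptions only.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_replicas(primary: Path, replica: Path) -> dict[str, str]:
    """Compare every file under `primary` with its counterpart under `replica`.

    Hypothetical helper: returns a mapping from relative path (POSIX style)
    to one of "ok", "missing" or "mismatch".
    """
    report: dict[str, str] = {}
    for source in sorted(p for p in primary.rglob("*") if p.is_file()):
        relative = source.relative_to(primary)
        copy = replica / relative
        if not copy.is_file():
            report[relative.as_posix()] = "missing"
        elif sha256_of(source) != sha256_of(copy):
            report[relative.as_posix()] = "mismatch"
        else:
            report[relative.as_posix()] = "ok"
    return report
```

A check of this kind only detects divergence between copies; it does not say which copy is the correct one, so in practice the digests would usually also be stored at ingest time and compared against later.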
Slide 7 Adding value to data and facilitation of data use (2) METADATA AND STANDARDS Use standard and widely accepted data and metadata formats (do not invent new ones) to facilitate the use of standard discovery and access methods and tools; this allows better and easier exploitation of data, even across different domains. The available metadata is often not sufficient to describe data to interested users, e.g. in the context of climate activities. Additional commentary metadata, stemming from product providers and users, needs to be made available with the data to satisfy these needs. Standardised metadata is essential for interoperability, be it collection, product or service metadata; semantic interoperability is key in this context and needs to be improved.
Slide 8 Adding value to data and facilitation of data use (3) VALUE ADDING AND DATA EXPLOITATION Users are interested in: long time series of data to monitor changes; contributing to algorithm development (need L0); accessing higher level products for research activities; accessing external processing facilities; having the freedom to define query parameters on the archive. Scalable architectures are needed to support storage and processing (NRT & reprocessing) of huge amounts of data; integration of "Cloud" services might help to improve performance. Need to move towards: Data Intensive Science Platforms addressing both expert and non-expert users; Virtual Observatories as interactive exploitation platforms; user-tailored Research Platforms and Information Dashboards; Image Information Mining; Semantic Web (Linked Data). User feedback on data
Slide 9 Adding value to data and facilitation of data use (4) COOPERATION ON DATA PRESERVATION Ideally, a common set of models (conceptual models, metadata schemas, ontologies, vocabularies) for describing data, and a common strategy for harmonizing them, should be available. Raising awareness of the preservation issue and training are fundamental; provision of open data for education can be a means of ensuring data preservation. Appraisal is needed as resources are limited; collaboration is key in this area; there is a recognized need for, and benefit from, sharing information and lessons learned.
Slide 10 Ensuring long-term data and knowledge preservation (1) CHALLENGES TO DATA PRESERVATION Preservation is fundamental but has a cost; the funding necessary to guarantee long-term availability of data can be a problem; need to define "preservation models and use cases" and to prepare plans to support them. Preservation of data is carried out by providers according to their own policies and best practices; an overarching approach is needed for organizations dealing with different types of data (e.g. Earth Observation, in-situ). Early involvement of archive/curation experts in projects is important to get a clear definition of Submission Information Packages (SIP) and to avoid major efforts at the end of the project.
Slide 11 Ensuring long-term data and knowledge preservation (2) CHALLENGES TO DATA PRESERVATION Data preservation is not just about keeping the data safe, but also about keeping the data usable. Data preservation includes the software and documentation needed to understand and use the data; data records and non-data records have to be distinguished, and the handling of non-data records has to be improved. Align metadata for software and documentation, including links to the data. Extensive planning and prototypes are essential, especially for large projects; flexibility is key.
Slide 12 Ensuring long-term data and knowledge preservation (3) INTEROPERABILITY AND HARMONIZATION Urgency to recover and clean up datasets and associated knowledge; need for procedures and harmonization within and across communities; harmonised data formats, metadata, portals; INSPIRE compliance. Make data, knowledge and tools accessible in an interoperable way across data centres and across communities; federation of registries (e.g. metadata, data, knowledge); unified discovery. Careful documentation and metadata are crucial because higher level data products are increasingly used (e.g. need provenance information) and because data products can be used across communities (by users less familiar with the data). Preservation and accessibility activities driven by user needs and type of data: no need or ability to regenerate less advanced preservation effort.
Slide 13 Data preservation: lessons learnt & Future prospects and cooperation (1) Modular and flexible architectures for data preservation & access systems give the capability to change/adapt to different & evolving requirements; data archive management tools can facilitate the work of data centres from various disciplines over a long period of time. The diversity of instruments / data content & formats / metadata makes data preservation and knowledge representation very challenging (e.g. ISS); definition of "common core data models" and use of standards is key; special care should be taken with any data format to enable long-term preservation and interpretation of data sets. Common vision on data preservation among different organizations: data policy, common services, functionalities and processes. Definition and application of a common data policy (e.g. for historical data in Earth Observation, for ISS data) would be highly beneficial for data exploitation (and preservation).
Slide 14 Data preservation: lessons learnt & Future prospects and cooperation (2) Grid infrastructure & Cloud Computing can facilitate data preservation and exploitation but have limitations and can impose constraints; availability of shared infrastructure (e.g. at national level) and of generic web services supporting basic functionalities common to most research data management systems can reduce overhead and costs for single organizations. Big Data should not be considered a challenge but an opportunity; Data Volume, Value, Variety, Velocity (4Vs). It is fundamental to share knowledge, experiences and lessons learned; cooperation, communication, awareness raising, interaction with users. From LTDP to LTDP4V
Slide 15 PV 2015 Conference Hosted by EUMETSAT in November 2015 in Darmstadt (Germany).