VI-SEEM Data Repository


1 VI-SEEM Data Repository
Vladimir Dimitov, IICT-BAS
Acknowledgements to Vladimir Slavnić, IPB
The VI-SEEM project initiative is co-funded by the European Commission under the H2020 Research Infrastructures contract no

2 Agenda
- VI-SEEM Data Repository
- Underlying Software Technology
- Hardware Implementation
- Benefits of VI-SEEM Repo
- Features
- Types of data
- Information Model
- The DSpace (VI-SEEM Repo) Architecture
- Repository Organization
- Examples
Total number of slides: 25
VI-SEEM Regional Climate Training event - Belgrade, Serbia, Oct

3 VI-SEEM Repository
The VI-SEEM Repository provides long-term data preservation and is suitable for data set sharing.
Use cases:
- To store curated data sets for long-term preservation
- To share those data sets with selected collaborators, or open them up to whole communities, via a web interface
- To make such data sets searchable by associating metadata with them and then harvesting that metadata
The repository enables scientific communities to capture and describe digital works using a custom submission workflow module.
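The metadata harvesting use case can be sketched as follows. DSpace can expose metadata over OAI-PMH, but the slides do not say which mechanism VI-SEEM uses, so the choice of protocol here is an assumption. The endpoint address and the function name `list_records_url` are hypothetical. The sketch only builds a request URL and does not contact any server.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: the slides do not give the repository's address.
BASE_URL = "https://repo.example.org/oai/request"


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL (no network access)."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec is not None:
        # Restrict harvesting to one OAI set, e.g. a single collection.
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


url = list_records_url(BASE_URL)
print(url)
```

A harvester would fetch this URL and parse the returned XML; `oai_dc` is the unqualified Dublin Core format that every OAI-PMH repository is required to support.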

4 Underlying Software Technology
- Based on DSpace
- DSpace is a platform that allows you to capture items in any format: text, video, audio, and data.
- It distributes them over the web.
- It indexes your work, so users can search and retrieve your items.
- It preserves your digital work over the long term.
- Developed by the MIT Libraries with support from the HP-MIT Alliance
- A platform for building an institutional repository

5 Hardware Implementation
- The VI-SEEM Repo is installed on a virtual machine with 8 GB RAM and 4 virtual cores.
- The physical hosting server is an IBM 3650 M4 with two eight-core CPUs and 128 GB RAM.
- The storage array is formatted with GPFS and connected over InfiniBand (56 Gbit/s), using IBM GSS and ESS storage servers.
- Failover is handled automatically by GPFS.
- The storage capacity dedicated to the Repo is 50 TB, of which around 16 TB is currently occupied with useful data.
- Hosted and maintained by GRNET.

6 Benefits of VI-SEEM Repo
Some example benefits:
- Getting your research results out quickly, to a worldwide audience
- Reaching a worldwide audience through exposure to search engines such as Google
- Storing reusable teaching materials that you can use with course management systems
- Archiving and distributing material you would currently put on your personal website
- Storing examples of students’ projects (with the students’ permission)
- Showcasing students’ theses (again with permission)
- Keeping track of your own publications/bibliography
- Having a persistent network identifier for your work that never changes or breaks
- No more page charges for images: you can point to your images’ persistent identifiers in your published articles

7 Features
User Interface
- Web based, for submitters, end users and system administrators
- Search and retrieval of items by browsing or searching the metadata
Workflow
- Enables differing submission workflows for communities
- Models "e-people" who have "roles" in the workflow of a particular community in the context of a given collection
Persistent Identifiers (Handles)
- Implements CNRI handles as the persistent identifier associated with each item
- Soon to be integrated with the VI-SEEM PID service
Access Control
- Allows contributors to limit access to items in the repository, at both the collection and the individual item level
- Integrated with the VI-SEEM Login Service
Metadata Schema
- Utilises Qualified Dublin Core
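To make the handle and Qualified Dublin Core features more concrete, here is a minimal sketch. It assumes the usual `dc.element.qualifier` naming for qualified fields and the public Handle proxy at hdl.handle.net. The helper names, the record values and the handle `123456789/42` are all hypothetical and are not taken from the VI-SEEM repository.

```python
def dc_field(element, qualifier=None):
    """Return a qualified Dublin Core field name such as 'dc.date.issued'."""
    parts = ["dc", element]
    if qualifier:
        parts.append(qualifier)
    return ".".join(parts)


def handle_url(handle):
    """Turn a handle ('prefix/suffix') into a resolvable URL via the Handle proxy."""
    return "https://hdl.handle.net/" + handle


# Illustrative record only: every value below is made up.
record = {
    dc_field("title"): "Example regional climate simulation output",
    dc_field("contributor", "author"): "Doe, Jane",
    dc_field("date", "issued"): "2016-10-01",
    dc_field("description", "abstract"): "Placeholder abstract.",
}

for name, value in record.items():
    print(name, "=", value)
print(handle_url("123456789/42"))  # hypothetical handle
```

The qualifier narrows the meaning of the base element: `dc.date.issued` is still a date, but specifically the date of issue.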

8 Types of data in the VI-SEEM Repository
- Articles
- Preprints, e-prints
- Technical Reports
- Working Papers
- Conference Papers
- E-theses
- Audio/Video
- Lecture notes, visualizations, simulations
- Datasets in various formats (experimental, simulation, input, output)
- Images (visual, scientific)
- Teaching material
- Digitized library collections

9 Information Model
Communities
- Departments, Labs, Research Centers, Schools…
Collections (in communities)
- Distinct groupings of like items
Items (in collections)
- Logical content objects
- Receive a persistent identifier
Bitstreams (in items)
- Individual files
- Receive preservation treatment
Versioning: item "versions" can be
- All instances of a work in different formats, e.g. the XML, PDF, and PostScript versions
- All editions of a work over time
Metadata lists all available versions of items
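The containment chain in this information model (communities hold collections, collections hold items, items hold bitstreams) can be sketched with plain data classes. This is an illustration only, not DSpace's actual object model. The class fields, the helper `count_bitstreams` and all sample values are assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Bitstream:
    """An individual file belonging to an item."""
    filename: str
    size_bytes: int


@dataclass
class Item:
    """A logical content object; it is the item that gets the persistent identifier."""
    title: str
    handle: Optional[str] = None
    bitstreams: List[Bitstream] = field(default_factory=list)


@dataclass
class Collection:
    """A grouping of like items."""
    name: str
    items: List[Item] = field(default_factory=list)


@dataclass
class Community:
    """Top-level unit such as a department, lab or school."""
    name: str
    collections: List[Collection] = field(default_factory=list)


def count_bitstreams(community):
    """Count all files stored anywhere below a community."""
    return sum(
        len(item.bitstreams)
        for collection in community.collections
        for item in collection.items
    )


# Hypothetical sample: one work stored in two formats (PDF and XML).
report = Item(
    title="Sample report",
    handle="123456789/1",
    bitstreams=[Bitstream("report.pdf", 120_000), Bitstream("report.xml", 45_000)],
)
climate = Community("Climate", [Collection("Reports", [report])])
print(count_bitstreams(climate))
```

Storing the two formats as bitstreams of one item mirrors the "all instances of a work in different formats" notion of versions on the slide.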

10 The DSpace (VI-SEEM Repo) Architecture

11 Repository Organization
Each DSpace service is made up of communities, the highest level of the DSpace content hierarchy.
Communities may be:
- Departments
- Labs
- Research Centres
- Schools
Each community contains descriptive metadata about itself and the collections contained within it.

12 Collections
- Each community in turn has collections, which contain items or files.
- A collection can belong to a single community or to multiple communities (collaboration between communities may result in a shared collection).
- As with communities, each collection contains descriptive metadata about itself and the items contained within it.
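Because a collection may belong to more than one community, the relationship is many-to-many rather than a strict tree. A minimal sketch of that idea follows. The `Repository` class, its method names and the collection "Shared Visualizations" are hypothetical and do not reflect how DSpace stores these links internally.

```python
from collections import defaultdict


class Repository:
    """Tracks which collections are linked to which communities."""

    def __init__(self):
        # community name -> set of collection names
        self._collections_by_community = defaultdict(set)

    def link(self, community, collection):
        """Attach a collection to a community (a collection may be linked many times)."""
        self._collections_by_community[community].add(collection)

    def communities_of(self, collection):
        """Return, in sorted order, every community that contains the collection."""
        return sorted(
            community
            for community, collections in self._collections_by_community.items()
            if collection in collections
        )


repo = Repository()
repo.link("Climate", "Shared Visualizations")
repo.link("Life Sciences", "Shared Visualizations")
repo.link("Climate", "Future Climate Modelling")

print(repo.communities_of("Shared Visualizations"))
```

Here the shared collection is reported under both communities, while "Future Climate Modelling" belongs to "Climate" alone.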

13 Example Structures
Structures may be based around organizational units (Community, Collection, Items):
- Life Sciences: MD Data
- Climate: Future Climate Modelling
- Digital Cultural Heritage: Social Sciences Library
Structures are hierarchical (Community, Sub Community, Collection, Item):
- Life Sciences > Molecular Dynamics > Cancer Related > Data ....
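A hierarchical structure of this kind can be sketched as a nested dictionary that is walked recursively. The names are taken from the slide, but the exact nesting shown here is an assumption, since the original layout of the slide is not recoverable from the transcript. The function `walk` is a hypothetical helper.

```python
def walk(tree, prefix=()):
    """Yield every path from the top of the tree down to a leaf."""
    for name, children in tree.items():
        path = prefix + (name,)
        if children:
            # Descend into sub-communities, collections or items.
            yield from walk(children, path)
        else:
            yield path


# Assumed nesting: community > sub-community > collection > item.
structure = {
    "Life Sciences": {"Molecular Dynamics": {"Cancer Related": {"Data": {}}}},
    "Climate": {"Future Climate Modelling": {}},
}

paths = [" > ".join(path) for path in walk(structure)]
for line in paths:
    print(line)
```

Branches can have different depths: the "Climate" branch stops at the collection level, while "Life Sciences" goes down to an item.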

14 Example: Home screen

15 Example: Climate Sciences community

16 Example: Browsing by title

17 Example: Submissions and Workflow tasks

18 Example: Item submission, first step

19 Example: Item submission, description

20 Example: Item submission, third step

21 Example: File upload

22 Example: Item review

23 Example: Add Creative Commons (CC) license

24 Example: Distribution license

25 Conclusion
- The VI-SEEM Repository is the main place for long-term data preservation and is suitable for dataset sharing.
- The implementation is based on DSpace, a popular open source technology for building large data repositories.
- The VI-SEEM Repository is hosted on a high-performance infrastructure.
- Types of data may include: measurements, visualizations, simulations, audio/video, images, digitized library collections, articles, technical reports, training materials, raw data, etc.
- The dataset items are described with detailed metadata records, which must be filled in carefully and patiently by the submitters.
- A carefully selected license must be assigned to each dataset item.
Thank you for your attention. Questions?

