Introduction to Data Management in EGI

1 Introduction to Data Management in EGI
Vincenzo Spinoso EGI.eu/INFN

2 Outline
- Categorisation of data services in EGI
- Status and future plans

3 Components
Data management is performed by interoperable components.
Different components address different needs:
- Storage management at site level
- Transfer between sites
- Security
- Catalogue, metadata

4 Storage endpoints
How are data managed at site level?

5 Storage endpoints
- A unique namespace is provided to the client
- Authentication and encryption guarantee confidentiality and integrity
- Several protocols are supported for file access and transfer
- Distributing data across several disk servers guarantees scalability at site level
- If tapes are provided, access to tape is transparent

6 Storage endpoints
- DPM
- StoRM (on top of Lustre or GPFS)

7 What about interoperability, access, transfers?

8 Access, transfers
[Diagram: DPM, StoRM; abstraction layer («Storage element»); SRM, GridFTP, WebDAV, NFS/pNFS]
Applications and users can interact with the endpoints using different protocols.
SRM offers:
- storage management
- transparent disk/tape management
- interface between different transfer protocols
- standard interface
GridFTP offers advanced data transfer:
- Parallel streams
- Fault tolerance
- Security (authorization, encryption)
- Optimization
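
To make the «parallel streams» idea from this slide more concrete, here is a minimal sketch of how a file could be divided into contiguous byte ranges, one per stream. This is not GridFTP code: the function name `split_ranges` and the sizes used are hypothetical, chosen only for illustration.

```python
def split_ranges(file_size: int, streams: int) -> list[tuple[int, int]]:
    """Split [0, file_size) into `streams` contiguous (offset, length) ranges."""
    if file_size < 0 or streams < 1:
        raise ValueError("file_size must be >= 0 and streams >= 1")
    base, extra = divmod(file_size, streams)
    ranges = []
    offset = 0
    for i in range(streams):
        # The first `extra` ranges carry one additional byte so every byte is covered.
        length = base + (1 if i < extra else 0)
        ranges.append((offset, length))
        offset += length
    return ranges


if __name__ == "__main__":
    # Hypothetical 1 MB file moved over 4 streams.
    for offset, length in split_ranges(1_000_000, 4):
        print(f"stream reads {length} bytes starting at offset {offset}")
```

Each range could then be moved independently, which is also what makes restarting a single failed range cheaper than restarting the whole file.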

9 Access, transfers
[Diagram: DPM, StoRM; abstraction layer («Storage element»); SRM, GridFTP, WebDAV, NFS/pNFS]
Applications and users can interact with the endpoints using different protocols.
WebDAV offers a «web-based network file system»:
- Widely supported by many OSes
- Standard (IETF)
NFS4.1 provides «local access» (fast, POSIX)
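
Because WebDAV is plain HTTP with a few extra methods, a directory listing can be requested with nothing more than a standard HTTP library. The sketch below builds (but does not send) a `PROPFIND` request using Python's `urllib`. The endpoint URL is hypothetical, and authentication, which a real storage element would require, is left out.

```python
import urllib.request

# Hypothetical endpoint, for illustration only.
ENDPOINT = "https://storage.example.org/dav/myvo/data/"

PROPFIND_BODY = b"""<?xml version="1.0" encoding="utf-8"?>
<D:propfind xmlns:D="DAV:">
  <D:prop><D:displayname/><D:getcontentlength/></D:prop>
</D:propfind>"""


def build_listing_request(url: str) -> urllib.request.Request:
    """Build a PROPFIND request asking for the direct children of a collection."""
    return urllib.request.Request(
        url,
        data=PROPFIND_BODY,
        method="PROPFIND",
        # Depth: 1 means "the collection itself plus its immediate members".
        headers={"Depth": "1", "Content-Type": "application/xml"},
    )


if __name__ == "__main__":
    req = build_listing_request(ENDPOINT)
    print(req.get_method(), req.full_url)
    # Sending would be urllib.request.urlopen(req), given a reachable server
    # and valid credentials.
```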

10 Access, transfers
[Diagram: DPM; abstraction layer; SRM, GridFTP, WebDAV, NFS/pNFS]

11 Data transfer scheduling
Can transfers be scheduled?

12 Data transfer scheduling
- Schedule continuous, sustained data transfers across multiple endpoints
- Prioritize inter-VO and intra-VO file transfers
- Many different clients are available for several protocols (SRM, GridFTP, WebDAV…)
- Useful in the VO management context to control data transfers
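
The inter-VO and intra-VO prioritisation mentioned on this slide can be pictured as a two-level priority queue. The sketch below is a toy model only, not the interface of any EGI transfer service: the class `TransferQueue`, the VO names and the rank values are all hypothetical.

```python
import heapq
import itertools


class TransferQueue:
    """Toy scheduler: lower numbers are served first."""

    def __init__(self, vo_rank: dict[str, int]):
        self._vo_rank = vo_rank              # inter-VO priority (hypothetical values)
        self._heap = []
        self._counter = itertools.count()    # tie-breaker that keeps submission order

    def submit(self, vo: str, source: str, destination: str, priority: int = 5) -> None:
        # Sort first by VO rank (inter-VO), then by job priority (intra-VO).
        key = (self._vo_rank[vo], priority, next(self._counter))
        heapq.heappush(self._heap, (key, vo, source, destination))

    def next_transfer(self):
        """Return (vo, source, destination) of the next transfer, or None if empty."""
        if not self._heap:
            return None
        _, vo, source, destination = heapq.heappop(self._heap)
        return vo, source, destination


if __name__ == "__main__":
    queue = TransferQueue({"vo-a": 0, "vo-b": 1})
    queue.submit("vo-b", "siteA/f1", "siteB/f1", priority=1)
    queue.submit("vo-a", "siteA/f2", "siteC/f2", priority=5)
    queue.submit("vo-a", "siteA/f3", "siteC/f3", priority=2)
    while (job := queue.next_transfer()) is not None:
        print(job)
```

In this model every job of the higher-ranked VO is served before any job of the lower-ranked one. A production scheduler would more likely use shares or quotas so that no VO is starved.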

13 Catalogue Where are my files? lfn:grid/ /store/data/run1312

14 Catalogue
LFC
- hierarchical view of files to users, with a UNIX-like client interface
- Logical File Name (LFN) to Storage URL (SURL) mappings
- authorization on namespace
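
The LFN to SURL mapping and the UNIX-like hierarchical view can be illustrated with a small in-memory model. This is a sketch under stated assumptions, not the LFC client API: the class `ToyCatalogue`, its methods, and all file names and host names are hypothetical, and authorization is left out.

```python
class ToyCatalogue:
    """In-memory stand-in for a file catalogue: LFN -> list of replica SURLs."""

    def __init__(self) -> None:
        self._replicas: dict[str, list[str]] = {}

    def register(self, lfn: str, surl: str) -> None:
        """Record that a replica of `lfn` is stored at `surl`."""
        self._replicas.setdefault(lfn, []).append(surl)

    def replicas(self, lfn: str) -> list[str]:
        """Return all known SURLs for a logical file name."""
        return list(self._replicas.get(lfn, []))

    def listdir(self, directory: str) -> list[str]:
        """UNIX-like listing: names of the direct children of `directory`."""
        prefix = directory.rstrip("/") + "/"
        children = set()
        for lfn in self._replicas:
            if lfn.startswith(prefix):
                children.add(lfn[len(prefix):].split("/", 1)[0])
        return sorted(children)


if __name__ == "__main__":
    cat = ToyCatalogue()
    # Hypothetical logical names and storage URLs.
    cat.register("/grid/demo/run1/a.dat", "srm://se1.example.org/demo/a.dat")
    cat.register("/grid/demo/run1/a.dat", "srm://se2.example.org/demo/a.dat")
    cat.register("/grid/demo/run2/b.dat", "srm://se1.example.org/demo/b.dat")
    print(cat.listdir("/grid/demo"))
    print(cat.replicas("/grid/demo/run1/a.dat"))
```

The user only ever handles the logical name; which site actually holds the bytes is resolved through the catalogue.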

15 EGI «whole picture»
- Really complex infrastructure based on elementary «bricks»
  - each VO chooses its «recipe» of components
  - mature and stable
- Integration in a unified release controls stability of the «off-line» machinery
- Operations control stability of the «on-line» machinery

16 Globus Online
Provides robust and easy-to-use file transfer capabilities:
- Web interface
- Transfer management
- Performance monitoring
- Retries after failures, autorecover when possible
It’s a service, hosted at (US)
- But the files that the service moves among EGI sites DO NOT LEAVE Europe
- GridFTP «3rd party transfer» is used
- Files are copied directly between the EGI endpoints

17 iRODS
- Provides a high-level abstraction layer on top of storage resources
- Users focus on their data, not on where they are on the data grid
- Provides a native metadata catalogue
- Multiple authentication plugins (password, PAM, GSI…)
- Multiple access protocols (POSIX, S3, RADOS…)
- Rule-oriented approach: «policies» can be easily implemented as data management tasks
- Ongoing integration in the EGI infrastructure

18 FedCloud IaaS Capabilities
- Computing: VM Management, VM Marketplace
- Storage: Block Storage, Object Storage

19 Block Storage
Persistent block level storage to use with VMs
- Simple usage: use as any other block device from VMs; snapshotable
- High Performance: consistent and low-latency performance; SSDs (in some sites)
- Scale to your needs: from GB to TB; create and attach to VMs on demand

20 Object Storage
Data storage infrastructure for storing and retrieving data from anywhere at any time
- API Access: simple REST APIs for managing and accessing data
- Scalable: store as much data as needed; get accounted only for the space used
- Sharing: define ACLs on each object, share your data publicly
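
The put/get model and the per-object sharing described on this slide can be sketched with a toy in-memory store. This is not the API of any FedCloud object storage service: the class `ToyObjectStore`, the user names and the object keys are hypothetical, and the ACL is reduced to a single public/private flag.

```python
class ToyObjectStore:
    """In-memory stand-in for an object store with a per-object public/private flag."""

    def __init__(self) -> None:
        self._objects: dict[str, dict] = {}

    def put(self, user: str, key: str, data: bytes, public: bool = False) -> None:
        """Store `data` under `key`; only the owner may overwrite an existing object."""
        existing = self._objects.get(key)
        if existing is not None and existing["owner"] != user:
            raise PermissionError(f"{user} may not overwrite {key}")
        self._objects[key] = {"owner": user, "data": data, "public": public}

    def get(self, user: str, key: str) -> bytes:
        """Return the object if it is public or owned by `user`."""
        obj = self._objects[key]  # raises KeyError if the object does not exist
        if not obj["public"] and obj["owner"] != user:
            raise PermissionError(f"{user} may not read {key}")
        return obj["data"]


if __name__ == "__main__":
    store = ToyObjectStore()
    store.put("alice", "results/run1.csv", b"a,b\n1,2\n", public=True)
    store.put("alice", "notes.txt", b"private notes")
    print(store.get("bob", "results/run1.csv"))   # public object, readable by anyone
    try:
        store.get("bob", "notes.txt")
    except PermissionError as err:
        print("denied:", err)
```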

21 Block Storage vs Object Storage
Access
- Block Storage: only from within a VM, and only at the same site where the VM is located
- Object Storage: from any device connected to the internet
Sharing
- Block Storage: not possible
- Object Storage: possible (data can be kept private or public)
Accounting
- Block Storage: for the entire volume, regardless of how much of it is actually used
- Object Storage: only for the data stored
Integration
- Block Storage: easy with any application capable of writing/reading files from a local disk
- Object Storage: requires a client to be integrated within the application
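
The accounting difference in this comparison is easy to state as arithmetic. The sketch below encodes only the rule given on the slide; the function name `accounted_gb` and the example sizes are hypothetical, and no real accounting interface is implied.

```python
def accounted_gb(kind: str, provisioned_gb: float, stored_gb: float) -> float:
    """Return the amount of storage accounted for, following the comparison above."""
    if kind == "block":
        return provisioned_gb   # the entire volume, regardless of actual usage
    if kind == "object":
        return stored_gb        # only the data actually stored
    raise ValueError(f"unknown storage kind: {kind!r}")


if __name__ == "__main__":
    # Hypothetical case: 100 GB provisioned, 10 GB actually written.
    for kind in ("block", "object"):
        print(kind, accounted_gb(kind, provisioned_gb=100, stored_gb=10), "GB accounted")
```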

22 Use Cases (Block Storage, Object Storage)
- Application hosting
- Data Processing
- Database
- Large Data
- File Storage & Backup
- Static Content
- Media Serving & Sharing
- Big Data

23 In order to integrate a product in UMD, please follow the instructions on https://wiki.egi.eu/wiki/EGI_Software_Component_Delivery
Questions?

