DataNet – Flexible Metadata Overlay over File Resources Daniel Harężlak 1, Marek Kasztelnik 1, Maciej Pawlik 1, Bartosz Wilk 1, Marian Bubak 1,2 1 ACC.


1 DataNet – Flexible Metadata Overlay over File Resources
Daniel Harężlak 1, Marek Kasztelnik 1, Maciej Pawlik 1, Bartosz Wilk 1, Marian Bubak 1,2
1 ACC Cyfronet AGH
2 AGH University of Science and Technology, Institute of Computer Science AGH
Workshop on Cloud Services for File Synchronisation and Sharing, November 17-18, 2014, CERN

2 Presentation Plan
- PL-Grid Computing Infrastructure
- Motivation behind DataNet
- Metadata Management Requirements
- Architecture Description
- Deployment
- Conclusions

3 DataNet – PLGrid Landscape
PL-Grid Programme
- PL-Grid
  - Computing power: ca. 230 Tflops
  - Storage: ca Tbytes
  - Basic infrastructure services
- PL-Grid PLUS
  - Additional 500 Tflops and 4.4 Pbytes
  - Support for domain Grids
Diagram labels: Users, Domain Experts, Hardware/Middleware, Services

4 DataNet – Rationale and Objectives
Rationale
- Data management is a common requirement in computational sciences
- Workflow and scripting engines provide only limited support
- Each application is different and requires a dedicated metadata/data model
Objectives
- Provide means for ad-hoc metadata model creation and deployment of the corresponding storage facilities
- Create a research space for exchange and discovery of metadata models and their associated data repositories, with access restrictions in place
- Support different types of storage sites and data transfer protocols
- Support the exploratory paradigm by letting the models evolve together with the data

5 DataNet – PLGrid Requirements
- PLGrid infrastructure supporting different e-Science domains
  - Various applications coming from different scientific communities
  - Common computational resources
- Deployment of model data as repositories
  - Robust enablement of a dedicated interface
  - Access control capabilities
  - Exploitation of available storage infrastructure
- Universal availability of the repository
  - Platform independent
  - Facilitated by existing standards

6 DataNet – Architecture
- The Web Interface is used by users to create, extend and discover metadata models
- Model repositories are deployed in the PaaS Cloud layer for scalable and reliable access from computing nodes through REST interfaces
- Data items from Storage Sites are linked from the model repositories

7 DataNet – Data Model
- Set of entities with fields
- Simple types
- Array types
- File type
- Relations
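The data model outlined on this slide can be pictured with a small Python sketch. This is only an illustration under stated assumptions, not DataNet's actual schema format: the class names `Field` and `Entity`, the `kind` labels, the simple type names and the `validate` helper are all hypothetical. File and relation fields are assumed to be stored as string references.

```python
from dataclasses import dataclass, field

# Hypothetical simple type names; DataNet's real type names are not given in the slides.
SIMPLE_TYPES = {"string": str, "integer": int, "number": (int, float), "boolean": bool}


@dataclass
class Field:
    name: str
    kind: str                    # assumed labels: "simple", "array", "file", "relation"
    value_type: str = "string"   # simple type name, or element type for arrays
    target: str = ""             # name of the related entity (relations only)


@dataclass
class Entity:
    name: str
    fields: list = field(default_factory=list)


def validate(entity, record):
    """Return a list of problems found in `record`; an empty list means it fits."""
    problems = []
    for f in entity.fields:
        if f.name not in record:
            problems.append(f"missing field: {f.name}")
            continue
        value = record[f.name]
        if f.kind == "simple":
            ok = isinstance(value, SIMPLE_TYPES[f.value_type])
        elif f.kind == "array":
            ok = isinstance(value, list) and all(
                isinstance(v, SIMPLE_TYPES[f.value_type]) for v in value
            )
        else:
            # Assumption: files and relations are kept as string references.
            ok = isinstance(value, str)
        if not ok:
            problems.append(f"bad value for {f.name}")
    return problems


# Illustrative entity using all four field kinds from the slide.
simulation = Entity("simulation", [
    Field("label", "simple", "string"),
    Field("temperatures", "array", "number"),
    Field("output", "file"),
    Field("project", "relation", target="project"),
])

record = {
    "label": "run-1",
    "temperatures": [300.0, 310.5],
    "output": "results/run-1.dat",
    "project": "p-42",
}
```

Calling `validate(simulation, record)` returns an empty list for the record above, while a record with a wrong type or a missing field yields one message per problem.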

8 DataNet – Repository
- Repositories are accessed through REST
- Data view through a web application
- Configurable access control
  - Public
  - Private (within a group of users)
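The two access modes listed on this slide can be sketched as a simple check. This is a minimal illustration, assuming that a private repository keeps an explicit set of allowed users; the `Repository` class, its attributes and `can_access` are hypothetical names, not part of DataNet.

```python
from dataclasses import dataclass, field


@dataclass
class Repository:
    name: str
    visibility: str = "private"               # assumed values: "public" or "private"
    group: set = field(default_factory=set)   # users allowed when the repository is private


def can_access(repo, user):
    """Public repositories are open to everyone; private ones only to group members."""
    if repo.visibility == "public":
        return True
    return user in repo.group


open_repo = Repository("climate-models", visibility="public")
team_repo = Repository("heart-valves", group={"alice", "bob"})
```

With these definitions, `can_access(team_repo, "alice")` is true and `can_access(team_repo, "carol")` is false, while any user can read `open_repo`.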

9 DataNet – Repository Access
- Data is sent as JSON or FORM
- REST methods
  - POST: submit new data
  - PUT: modify data
  - DELETE: remove data
  - GET: retrieve data
- Queries with URL
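The four REST methods and URL queries listed on this slide can be sketched with the Python standard library. The base URL, the entity name `simulation`, the item identifier and the URL layout are all assumptions made for illustration; the slides do not specify DataNet's actual endpoints. The sketch only builds the requests and does not send them.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

BASE = "https://repo.example.org"  # hypothetical repository address


def build_request(method, entity, item_id=None, data=None, query=None):
    """Build (but do not send) a REST request for one entity of a repository."""
    url = f"{BASE}/{entity}"
    if item_id is not None:
        url += f"/{item_id}"
    if query:
        # Queries are expressed in the URL, as the slide states.
        url += "?" + urlencode(query)
    body = json.dumps(data).encode("utf-8") if data is not None else None
    headers = {"Content-Type": "application/json"} if body is not None else {}
    return Request(url, data=body, headers=headers, method=method)


create = build_request("POST", "simulation", data={"label": "run-1"})        # submit new data
update = build_request("PUT", "simulation", "abc123", {"label": "run-1b"})   # modify data
remove = build_request("DELETE", "simulation", "abc123")                     # remove data
search = build_request("GET", "simulation", query={"label": "run-1"})        # retrieve data
```

Each object can be passed to `urllib.request.urlopen` to perform the call against a real repository; form-encoded bodies would use `urlencode(data).encode()` in place of the JSON body.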

10 DataNet – PLGrid Deployment
- PLGrid Users
  - Access with regular PLGrid account
  - REST interface
  - Web Application for simple use cases
- Domain applications
  - User proxy delegation retrieved from PLGrid OpenID provider
  - REST interface
- DataNet as a Service already in place
  - Deployment procedure followed

11 DataNet – Service summary
Status: Production
Number of users: 5 – 10 (30 – 40 planned)
Default and Maximum Quota: 5GB (Home), 100GB (Storage)
Linux/Mac/Win user ratio: 3/0.5/2
Desktop/Mobile/Web access ratio: Web only
Technology: Cloud Foundry, Java, Ruby
Target communities: Researchers in various fields
Integration in your current environment: Using the same storage elements
Risk factors: Network reliability (upload interruptions)
Most important functionality: REST access
Missing functionality (if any): None so far

12 DataNet – User Feedback
- Easy integration in any programming language
- Possibility to view data via a web application
- Easy extension to other metadata engines
- Not end-user software

13 DataNet – DONEs and TODOs
DONEs
- A custom CloudFoundry environment was set up as a PaaS platform to ensure quick deployment of the required application and storage services
- A schema for metadata model creation was elaborated and evaluated for the NoSQL storage service MongoDB
- Storage site access libraries were implemented and tested
- A web-based tool to create, discover and manage metadata models was deployed
TODOs
- Support various types of metadata storage services to fulfil different application requirements (if required)
- Prototype a utility for data migration between model versions
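The last TODO, data migration between model versions, is not described further in the slides. The sketch below only illustrates what such a step might involve, assuming records are plain dictionaries and a new model version adds, renames or removes fields; the `migrate` function and its parameters are hypothetical and do not reflect the planned DataNet utility.

```python
def migrate(records, added_fields=None, renamed_fields=None, removed_fields=()):
    """Return copies of `records` adapted to a newer model version.

    added_fields:   {new field name: default value}
    renamed_fields: {old field name: new field name}
    removed_fields: names of fields dropped in the new version
    """
    added_fields = added_fields or {}
    renamed_fields = renamed_fields or {}
    migrated = []
    for record in records:
        new = {}
        for key, value in record.items():
            if key in removed_fields:
                continue
            new[renamed_fields.get(key, key)] = value
        for key, default in added_fields.items():
            # Existing values win over defaults.
            new.setdefault(key, default)
        migrated.append(new)
    return migrated


v1_records = [{"label": "run-1", "temp": 300.0, "scratch": "x"}]
v2_records = migrate(
    v1_records,
    added_fields={"status": "unknown"},
    renamed_fields={"temp": "temperature"},
    removed_fields={"scratch"},
)
```

The original records are left untouched, so a migration can be checked against the new model before the old data is replaced.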

14 Thank You
Acknowledgements
This research has been partially supported by the European Regional Development Fund program no. POIG /10 as part of the PL-Grid PLUS project.
- Contact us and help make DataNet better
- Visit http://dice.cyfronet.pl for more information
- See DataNet in action at (PLGrid account required)