San Diego Supercomputer Center www.irods.org iRODS DGMS Towards Data Grid Standard Implementations Arun Jagatheesan San Diego Supercomputer Center Open.

Presentation transcript:

San Diego Supercomputer Center iRODS DGMS Towards Data Grid Standard Implementations Arun Jagatheesan San Diego Supercomputer Center Open Grid Forum 19 Jan 31, 2007 – session II

Outline
- Community introduction: OGF-GFS
- User perspective
- Developer/vendor perspective
- Need for standard community implementation
- Community implementation process
- GFS-WG community architecture sketch
- Follow-up actions

Motivation
- Global namespace for unstructured data storage
- Collaboration amongst multiple partners / teams
- Long-term management of unstructured data
- Files, collection-based digital entities

NIH BIRN Data Grid

World Wide Datagrid

Used or Required by
- Large scale academic projects
- Federal agencies (NARA, LoC, …)
- Fortune 500, Forbes Global 2000, …

DGMS
Concept-wise
- Large-scale logical file system
- File System + Database System + Grid Computing = Data Grid Management System (DGMS)
Core concepts
- Logical shared collections
- Logical shared resources
- Collaborative communities

Problem solved / Requirements – 1
Collaborative logical namespace
- Global collaborations of multiple teams
- Collaborations of multiple organizations
- Avoid multiple mount points, as they restrict the scalability of the collaboration
- Coordinated data sharing at any level of granularity (data, metadata, annotations, …)

Problem solved / Requirements – 2
Data Distribution
- Multi-site replicas reduce access times
- Replicas have the same logical name everywhere in the enterprise (big plus for users)
- Concept of replica, copy, cache
- Replicas controlled by user, admin, or system-enabled (automated or policy-based)
- Reduce WAN latency (chattiness)
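The idea that every replica shares one logical name can be illustrated with a minimal Python sketch. It is not taken from the slides or from any DGMS implementation: the `Replica` and `ReplicaCatalog` classes, the site labels, the paths and the latency figures are all hypothetical, and "pick the lowest latency" stands in for whatever user, admin or policy-based selection a real system would apply.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Replica:
    site: str            # hypothetical site label
    physical_path: str   # where the bytes live at that site
    latency_ms: float    # assumed access cost from the caller to this site


@dataclass
class ReplicaCatalog:
    # One logical name maps to every physical replica registered for it.
    entries: Dict[str, List[Replica]] = field(default_factory=dict)

    def register(self, logical_name: str, replica: Replica) -> None:
        self.entries.setdefault(logical_name, []).append(replica)

    def resolve(self, logical_name: str) -> Replica:
        """Return the replica with the lowest assumed latency."""
        replicas = self.entries.get(logical_name)
        if not replicas:
            raise KeyError(f"no replica registered for {logical_name!r}")
        return min(replicas, key=lambda r: r.latency_ms)


catalog = ReplicaCatalog()
logical = "/grid/project/scan001.img"  # same name at every site
catalog.register(logical, Replica("site-a", "/vault1/a9f3.dat", 80.0))
catalog.register(logical, Replica("site-b", "/mnt/store/scan001", 12.0))

nearest = catalog.resolve(logical)
print(nearest.site, nearest.physical_path)
```

The caller only ever uses the logical name; which physical copy serves the request is decided behind the catalog.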

Problem solved / Requirements – 3
Data Classification and Discovery
- Major advantage for Global 2000 companies
- Tag data with any arbitrary metadata schema
- Each team can organize its data based on user-defined attributes
- Multiple teams can have different metadata attributes on the same data
- Query, discover and access data without knowing the path or protocol to be used
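A rough Python sketch of per-team tagging and attribute-based discovery follows. The `MetadataCatalog` class, the team names and the attributes are invented for illustration and do not represent the metadata model of SRB, iRODS or any other product; the sketch only shows that two teams can attach different attributes to the same logical name and that a query needs no path or protocol knowledge.

```python
from collections import defaultdict


class MetadataCatalog:
    """Hypothetical catalog: logical name -> team -> {attribute: value}."""

    def __init__(self):
        self._tags = defaultdict(lambda: defaultdict(dict))

    def tag(self, logical_name, team, **attributes):
        # Each team keeps its own attribute set on the same data object.
        self._tags[logical_name][team].update(attributes)

    def query(self, team, **criteria):
        """Return sorted logical names whose tags for `team` match all criteria."""
        return sorted(
            name
            for name, by_team in self._tags.items()
            if all(by_team.get(team, {}).get(k) == v for k, v in criteria.items())
        )


meta = MetadataCatalog()
meta.tag("/grid/project/scan001.img", "imaging", modality="MRI", subject="s01")
meta.tag("/grid/project/scan001.img", "stats", reviewed="yes")
meta.tag("/grid/project/scan002.img", "imaging", modality="CT")

mri_scans = meta.query("imaging", modality="MRI")
reviewed = meta.query("stats", reviewed="yes")
print(mri_scans, reviewed)
```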

User Perspective
Designed for
- Off the shelf: don’t want to assemble (or DIY)
- But able to customize the solution
- One point of contact or responsibility
- If it does not work, I have one mailing list or number to call

Vendor/developer perspective
- “OGF-GFS compatible”: OGF-GFS Data Grid Applications, OGF-GFS Data Grid Appliance
- Ease of standard evolution
- Avoid unnecessary dependencies on multiple interfaces for operations that are at the same level of granularity
- Ability to collaborate, learn and compete
- An end-to-end solution with a common interface
- Additional capabilities that add value to the solution

Lessons Learnt
Software vs. Specification
- Software implementation to engage and collaborate as we define standards (unless everyone wants to invest in software development from the start)
- Make both the user and the vendor/developer happy
- Users are happy and confident enough to share requirements and to demand the standards from vendors/developers
- Vendors/developers know it’s a real thing that can be implemented around their existing products or software

The scope (from GFS Architecture)
- A single interface
Protocols: a hybrid of XML and byte-level protocol
- XML: command channel of operations
- Byte-level: data movement
Possible functionalities
- File namespace and file operations (read, write, …)
- Metadata operations (user-defined metadata, search)
- Data Grid Language for policy, rules, etc.
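To make the split between an XML command channel and a byte-level data channel concrete, here is a small Python sketch of what a command message might look like. The slides do not define a schema, so the element and attribute names (`dgmsRequest`, `operation`, `logicalName`, `param`) are purely assumptions; only the standard-library `xml.etree.ElementTree` calls are real. The file bytes themselves would travel separately over the byte-level channel and are not modelled here.

```python
import xml.etree.ElementTree as ET


def build_command(operation, logical_name, **params):
    """Serialize a command for the XML channel (hypothetical schema)."""
    root = ET.Element("dgmsRequest")
    op = ET.SubElement(root, "operation", name=operation)
    ET.SubElement(op, "logicalName").text = logical_name
    for key, value in sorted(params.items()):
        ET.SubElement(op, "param", name=key).text = str(value)
    return ET.tostring(root, encoding="utf-8")


def parse_command(message):
    """Decode a command message back into (operation, logical name, params)."""
    root = ET.fromstring(message)
    op = root.find("operation")
    params = {p.get("name"): p.text for p in op.findall("param")}
    return op.get("name"), op.findtext("logicalName"), params


# The command says what to do; the data would move on the byte-level channel.
message = build_command("read", "/grid/project/scan001.img", offset=0, length=4096)
decoded = parse_command(message)
print(decoded)
```

Keeping commands in XML makes them easy to inspect and extend, while leaving bulk data out of the XML avoids encoding overhead on large transfers.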

What could be the right high level picture?
[Diagram labels: DGMS; XML-command protocol; Byte-level data protocol; Object-transfer; Facilitate SOA]

What could be the right high level picture?
[Diagram labels: three DGMS servers; XML-command protocol; Byte-level data protocol]

User perspective
- Logical resources
- Multiple replicas
- Users from different organizations
- User-defined metadata for data discovery
- Secret recipe

So what will we be doing (products?)
Definition
- Concept (data grid namespace, resource-namespace, …)
- Initial functionalities (DGMS operations to be targeted)
- Namespace (Files, Metadata, Resource, Policy rules)
XML protocol
- XML handshake and message transfer between DGMS client and DGMS server
Most importantly…
- Software as a common framework for the evolution, adoption and growth of the standard and DGMS concepts

So how will we do it? (process)
Community-based open design (OPEN FORUM)
- Design discussions as a community
- Code through multiple parties to make sure we keep the vendor/developer community and user community engaged
Community-based open standard (OPEN STDS)
- Specs written using wiki and other mechanisms
- Community-based spec for OGF
- Interoperability workshops, and workshops along with other relevant agencies like SNIA or DMTF

How can you get started?
Initial requirements
- Can you delete? (sign up for our mailing list)
- Got bandwidth and a browser? (visit our group page)
- Can you scream or shout or smile? (join our WG sessions)
Are you a user or consumer or researcher?
- Tell us what is needed
- What should be there for you to put this open source software/standard in production?
Are you a vendor/developer?
- Have your engineer or developer talk to us (we will convert him to a DGMS developer or DGMS Guru)
- We are developing an open standard: take advantage of it and develop a value-added solution around it

When do we get started?
- Right now (Hmmm... we did a long time back)
Conference calls every other week
- Mostly Wednesdays
- Attend through phone call, Skype or Polycom video conference (anything you like)
- Discussions influencing design requirements
Face-to-face meeting
- Once every quarter (planned), OGF sessions

Suggestions, comments, critiques
TO DO
- Standard operations based on policies/rules
- Take advantage of OGF standards where possible
- Other commercial or magic tools could be used below the standard
NOT TO DO

Conclusions
- Data Grids
- Data Grid Management Systems (DGMS)
- Strong user need in both academic and non-academic settings
- Need for standards framed by Grid File System WG
- Software-included spec strategy