San Diego Supercomputer Center, National Partnership for Advanced Computational Infrastructure

Presentation transcript:

Slide 1: POSIX-like OGSA/SOAP Services
San Diego Supercomputer Center, National Partnership for Advanced Computational Infrastructure, University of Florida
Arun Jagatheesan, Architect & Team Lead, SDSC Matrix, San Diego Supercomputer Center
GFS, Global Grid Forum-9, October 7, 2003, Chicago

Slide 2: Talk Outline
- Grid File System
- The small big picture
- Need for Schema
- Need for Operation definitions
- Data Transport

Slide 3: Grid File System
[Diagram; labels listed in the order they were extracted]
- Data Sources
- Grid File System Service (POSIX-like Interface)
- Data Services
- Virtual Directory Service (Management of virtualization)
- Coordinated with other groups
- Hierarchical Logical Name space, ACL, metadata
- Applications (Astronomy, Physics, Life Science, business apps, ...)
- NFS/CIFS …

Slide 4: The small big picture
[Diagram; labels listed in the order they were extracted]
- Data Sources
- Grid File System Service (POSIX-like Interface)
- Data Services
- Virtual Directory Service (Management of virtualization)
- NFS/CIFS …
- XML Schema for Collections, Data Sets
- OGSA/SOAP based interfaces for file operations
- NFS or other standard interface over the virtualized schema

Slide 5: Grid Collection Schema
- XML Schema based description for:
  - Collections or Virtual Directories
  - Data Sets
  - File System Meta-data (file size, date created, …)
  - Application Specific Meta-data
  - Access Permissions
  - …
- Logical Name space
  - Extensible
  - Scalable (more federations)
  - Dynamic Composition of the name space
  - Import and Export
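The collection description listed on this slide could look roughly like the sketch below. It is only an illustration: the slide does not define a schema, so every element and attribute name (`collection`, `dataSet`, `fileSystemMetadata`, `applicationMetadata`, `accessPermissions`, `grant`) is a hypothetical placeholder, and the functions `describe_collection`, `export_collection`, and `import_collection` are invented for this example.

```python
import xml.etree.ElementTree as ET


def describe_collection(logical_name, data_sets):
    """Build an XML element describing one collection (virtual directory).

    All element and attribute names here are hypothetical placeholders.
    """
    collection = ET.Element("collection", {"logicalName": logical_name})
    for ds in data_sets:
        node = ET.SubElement(collection, "dataSet", {"logicalName": ds["name"]})
        # File system metadata: size and creation date.
        fs_meta = ET.SubElement(node, "fileSystemMetadata")
        ET.SubElement(fs_meta, "size").text = str(ds["size"])
        ET.SubElement(fs_meta, "created").text = ds["created"]
        # Application-specific metadata as free-form name/value pairs.
        app_meta = ET.SubElement(node, "applicationMetadata")
        for key, value in ds.get("attributes", {}).items():
            ET.SubElement(app_meta, "attribute", {"name": key}).text = value
        # Access permissions as user/right grants.
        acl = ET.SubElement(node, "accessPermissions")
        for user, right in ds.get("acl", {}).items():
            ET.SubElement(acl, "grant", {"user": user, "right": right})
    return collection


def export_collection(collection):
    """Serialize a collection description to an XML string."""
    return ET.tostring(collection, encoding="unicode")


def import_collection(xml_text):
    """Parse an exported collection description back into an element tree."""
    return ET.fromstring(xml_text)


example = describe_collection(
    "/demo/myActiveNeuroCollection",
    [{"name": "image.jpg", "size": 2048, "created": "2003-10-07",
      "attributes": {"modality": "MRI"}, "acl": {"alice": "read"}}],
)
```

Exporting and re-importing `example` is one way to picture the "Import and Export" item: a description produced in one federation can be parsed and merged into another's name space.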

Slide 6: Operations on Logical Namespace
- OGSA/SOAP based interfaces
- Grid File System operations
  - Similar to traditional file systems operations / POSIX
  - Open (= Get a GSR?), Read, Seek’n’Read, Seek’n’Write, …
- Simple Control (Context) Operations
  - Management of Logical Namespace
  - SOAP based bindings
- Bulk (Content) Operations
  - Only SOAP bindings for data transport ??? (NOPE)
  - Alternative mechanisms needed in standard
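To make the POSIX-like operations on this slide concrete, here is a minimal in-memory sketch. It is an assumption-laden illustration, not the proposed service: the class `InMemoryGridFileService` and its method names are hypothetical, the handle returned by `open` is a plain integer rather than a GSR, and no SOAP binding or bulk transport is modeled.

```python
class InMemoryGridFileService:
    """Toy stand-in for a POSIX-like service keyed by logical paths."""

    def __init__(self):
        self._data = {}      # logical path -> bytearray with file content
        self._handles = {}   # integer handle -> logical path
        self._next_handle = 1

    def create(self, logical_path, content=b""):
        self._data[logical_path] = bytearray(content)

    def open(self, logical_path):
        """Return an opaque handle for an existing logical path."""
        if logical_path not in self._data:
            raise FileNotFoundError(logical_path)
        handle = self._next_handle
        self._next_handle += 1
        self._handles[handle] = logical_path
        return handle

    def seek_and_read(self, handle, offset, length):
        """Read up to `length` bytes starting at `offset`."""
        buf = self._data[self._handles[handle]]
        return bytes(buf[offset:offset + length])

    def seek_and_write(self, handle, offset, payload):
        """Overwrite bytes starting at `offset`, zero-padding any gap."""
        buf = self._data[self._handles[handle]]
        if offset > len(buf):
            buf.extend(b"\x00" * (offset - len(buf)))
        buf[offset:offset + len(payload)] = payload
        return len(payload)

    def close(self, handle):
        del self._handles[handle]
```

Combining seek with read or write in a single call, as the slide's Seek’n’Read and Seek’n’Write suggest, keeps each request self-contained, which suits a stateless request/response binding.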

Slide 7: How do we form the logical namespace?

Slide 8: Logical Layers (bits, data, information, ...)
[Diagram; labels and examples listed in the order they were extracted]
- Storage Resource Transparency
- Storage Location Transparency
  - E:\srbVault\image.jpg
  - /users/srbVault/image.jpg
  - Select … from srb.mdas.td where...
- Data Identifier Transparency
  - image_0.jpg … image_100.jpg
- Data Replica Transparency
  - image.sql, image.cgi, image.wsdl
- Virtual Data Transparency
- Collections or Virtual Directories
  - patientRecordsCollection, myActiveNeuroCollection

Slide 9: Storage Resource Transparency (1)
- Storage repository abstraction
  - Archival systems, file systems, databases, FTP sites, …
- Logical resources
  - Combine physical resources into a logical set of resources
  - Hide the type and protocol of physical storage system
  - Load balancing – based on access patterns
  - Unlike DBMS, user is aware of logical resources
- Flexibility to changes in mass storage technology
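The idea of a logical resource that groups physical resources and balances load across them can be sketched as follows. This is a simplified illustration under stated assumptions: `LogicalResource` and `pick` are hypothetical names, and "least accessed so far" is a stand-in policy, since the slide says only that load balancing is based on access patterns.

```python
class LogicalResource:
    """A named set of physical resources hidden behind one logical name."""

    def __init__(self, name, physical_resources):
        if not physical_resources:
            raise ValueError("a logical resource needs at least one member")
        self.name = name
        # Access counter per physical resource; insertion order is kept.
        self._access_counts = {r: 0 for r in physical_resources}

    def pick(self):
        """Choose the least-accessed member (ties go to the earliest listed)."""
        chosen = min(self._access_counts, key=self._access_counts.get)
        self._access_counts[chosen] += 1
        return chosen

    def members(self):
        return list(self._access_counts)
```

A caller addresses only the logical name; replacing one member with newer storage technology changes the member list, not the caller's code.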

Slide 10: Storage Resource Transparency (2)
- Standard operations at storage repositories
  - POSIX like operations on all resources
- Storage specific operations
  - Databases - bulk metadata access
  - Object ring buffers - object based access
  - Hierarchical resource managers - status and staging requests
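One way to picture the split between standard and storage-specific operations is a common driver interface with optional extras. The sketch below is hypothetical throughout: `StorageDriver`, `MemoryFileDriver`, `MemoryDatabaseDriver`, `supports`, and `bulk_metadata` are invented names, and both drivers keep data in memory rather than talking to real storage.

```python
from abc import ABC, abstractmethod


class StorageDriver(ABC):
    """Operations every repository is assumed to offer (POSIX-like)."""

    @abstractmethod
    def read(self, path, offset, length):
        ...

    @abstractmethod
    def write(self, path, offset, data):
        ...

    def supports(self, operation):
        """Report whether this driver offers a storage-specific operation."""
        return callable(getattr(self, operation, None))


class MemoryFileDriver(StorageDriver):
    """File-system-like driver backed by a dictionary of byte buffers."""

    def __init__(self):
        self._blobs = {}

    def read(self, path, offset, length):
        return bytes(self._blobs[path][offset:offset + length])

    def write(self, path, offset, data):
        buf = self._blobs.setdefault(path, bytearray())
        if offset > len(buf):
            buf.extend(b"\x00" * (offset - len(buf)))
        buf[offset:offset + len(data)] = data
        return len(data)


class MemoryDatabaseDriver(MemoryFileDriver):
    """Database-like driver that adds bulk metadata access."""

    def bulk_metadata(self, paths):
        # One call returns metadata for many objects at once.
        return {p: {"size": len(self._blobs.get(p, b""))} for p in paths}
```

A broker could call `supports("bulk_metadata")` before using the faster path and fall back to per-object reads otherwise.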

Slide 11: Storage Location Transparency
- Support replication of data for performance
- Transparent access to physical location and physical resource
- Virtualization of distributed data resources
  - Data naming managed by the data grid
- Redundancy for preservation
  - Resource redundancy – “m of n” resources in list
  - Location redundancy – replicate at multiple locations
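The "m of n" resource redundancy item can be read as: try to store a copy on each of n listed resources and treat the operation as successful once at least m copies exist. That reading is an assumption, as the slide does not spell out the policy. The function `replicate` and the `(name, store_function)` pairing below are hypothetical.

```python
def replicate(data, resources, m):
    """Store `data` on each resource; succeed if at least `m` copies land.

    `resources` is a list of (name, store_function) pairs. A store function
    is assumed to raise OSError when its resource is unavailable.
    """
    if not 1 <= m <= len(resources):
        raise ValueError("m must be between 1 and the number of resources")
    stored = []
    for name, store in resources:
        try:
            store(data)
        except OSError:
            continue  # skip the unavailable resource and try the next one
        stored.append(name)
    if len(stored) < m:
        raise RuntimeError(
            f"only {len(stored)} of {len(resources)} copies stored, need {m}"
        )
    return stored
```

Placing the listed resources at different sites would also give the location redundancy named on the slide; the function itself does not care where a resource lives.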

Slide 12: Data Identifier Transparency
Four Types of Data Identifiers:
1. Unique name
   - OID or handle
2. Descriptive name
   - Descriptive attributes – meta data
   - Semantic access to data
3. Collective name
   - Logical name space of a collection of data sets
   - Location independent
4. Physical name
   - Physical location of resource and physical path of data
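The four identifier types can be pictured as four ways of looking up the same catalog entry. The sketch below is illustrative only: `DataObject`, `Catalog`, and all field and method names are hypothetical, and the catalog is a plain in-memory list rather than a real metadata service.

```python
from dataclasses import dataclass, field


@dataclass
class DataObject:
    unique_id: str          # 1. unique name (OID or handle)
    logical_path: str       # 3. collective name in the logical name space
    physical_resource: str  # 4. physical name: resource ...
    physical_path: str      #    ... and path on that resource
    attributes: dict = field(default_factory=dict)  # 2. descriptive name


class Catalog:
    """In-memory catalog resolving any identifier type to data objects."""

    def __init__(self):
        self._objects = []

    def register(self, obj):
        self._objects.append(obj)

    def by_unique_id(self, unique_id):
        return next((o for o in self._objects if o.unique_id == unique_id), None)

    def by_logical_path(self, logical_path):
        return next((o for o in self._objects if o.logical_path == logical_path), None)

    def by_attributes(self, **wanted):
        """Semantic access: every requested attribute must match."""
        return [o for o in self._objects
                if all(o.attributes.get(k) == v for k, v in wanted.items())]
```

Only the physical resource and path change when data moves; the unique, descriptive, and collective names stay valid, which is the location independence the slide refers to.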

Slide 13: Data Replica Transparency
- Replication
  - Improve access time
  - Improve reliability
  - Provide disaster backup and preservation
  - Physically or Semantically equivalent replicas
- Replica consistency
  - Synchronization across replicas on writes
  - Updates might use “m of n” or any other policy
  - Distributed locking across multiple sites
- Versions of files
  - Time-annotated snapshots of data
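A toy model helps show how an "m of n" update policy, later synchronization, and time-annotated versions fit together. Everything below is an assumption made for illustration: `ReplicatedFile` and its methods are hypothetical, replicas live in one process, and distributed locking is not modeled at all.

```python
import time


class ReplicatedFile:
    """One logical file with a replica at each named site."""

    def __init__(self, sites, m):
        self.m = m                               # minimum replicas per write
        self._content = {s: b"" for s in sites}  # site -> replica content
        self._version = {s: 0 for s in sites}    # site -> version it holds
        self.latest = 0
        self.snapshots = []  # (timestamp, version, content) per accepted write

    def write(self, data, reachable):
        """Apply a write if at least m replicas are reachable."""
        targets = [s for s in self._content if s in reachable]
        if len(targets) < self.m:
            raise RuntimeError("write rejected: too few reachable replicas")
        self.latest += 1
        for site in targets:
            self._content[site] = data
            self._version[site] = self.latest
        self.snapshots.append((time.time(), self.latest, data))
        return targets

    def stale_sites(self):
        """Sites whose replica is older than the latest accepted write."""
        return [s for s, v in self._version.items() if v < self.latest]

    def synchronize(self):
        """Bring every stale replica up to the latest version."""
        if not self.snapshots:
            return
        _, version, data = self.snapshots[-1]
        for site in self.stale_sites():
            self._content[site] = data
            self._version[site] = version

    def read(self, site):
        return self._content[site]
```

The `snapshots` list is the simplest possible reading of "time-annotated snapshots of data": each accepted write is kept with the time it was applied, so an earlier version can still be retrieved.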

Slide 14: Conclusion
- Lots of possibilities
- Need for a Standard Grid File Schema and Global Logical Namespace for virtualization
- Need for a Standard description of Operations, or a Grid File System Service
- Call for Users, Projects, Developers, Vendors
- It’s a stone’s throw away – together, we will do it.