San Diego Supercomputer Center, University of California at San Diego Grid Physics Network (GriPhyN) University of Florida A Data Storage Language for.

Slides:



Advertisements
Similar presentations
National Partnership for Advanced Computational Infrastructure San Diego Supercomputer Center Data Grids for Collection Federation Reagan W. Moore University.
Advertisements

GFS OGF-22 Global Resource Naming Developers: Reagan Moore Arcot Mike.
The Storage Resource Broker and.
The Storage Resource Broker and.
Overview of the SDSC Storage Resource Broker Wayne Schroeder (and other SRB team members) May, 2004 San Diego Supercomputer Center, University of California.
Peter Berrisford RAL – Data Management Group SRB Services.
SACNAS, Sept 29-Oct 1, 2005, Denver, CO What is Cyberinfrastructure? The Computer Science Perspective Dr. Chaitan Baru Project Director, The Geosciences.
Data Grid: Storage Resource Broker Mike Smorul. SRB Overview Developed at San Diego Supercomputing Center. Provides the abstraction mechanisms needed.
San Diego Supercomputer Center Self-organizing Smart Namespaces : Next Generation Data Grid Systems Arun Jagatheesan iRODS.org.
San Diego Supercomputer CenterNational Partnership for Advanced Computational Infrastructure1 Grid Based Solutions for Distributed Data Management Reagan.
Background Chronopolis Goals Data Grid supporting a Long-term Preservation Service Data Migration Data Migration to next generation technologies Trust.
Looking ahead for GFS … Arun Jagatheesan San Diego Supercomputer Center Remote Talk at GGF-16 Athens, Greece.
Wayne Schroeder, Paul Tooby Data Intensive Cyber Environments Team (DICE) DICE Center, University of North Carolina at Chapel Hill; Institute for Neural.
A Very Brief Introduction to iRODS
Chronopolis: Preserving Our Digital Heritage David Minor UC San Diego San Diego Supercomputer Center.
Jean-Yves Nief, CC-IN2P3 Wilko Kroeger, SCCS/SLAC Adil Hasan, CCLRC/RAL HEPiX, SLAC October 11th – 13th, 2005 BaBar data distribution using the Storage.
Robust Tools for Archiving and Preserving Digital Data Joseph JaJa, Mike Smorul, and Mike McGann Institute for Advanced Computer Studies Department of.
DataGrid Kimmo Soikkeli Ilkka Sormunen. What is DataGrid? DataGrid is a project that aims to enable access to geographically distributed computing power.
EInfrastructures (Internet and Grids) - 15 April 2004 Sharing ICT Resources – “Think Globally, Act Locally” A point-of-view from the United States Mary.
QCDgrid Technology James Perry, George Beckett, Lorna Smith EPCC, The University Of Edinburgh.
Supercomputing Center Jysoo Lee KISTI Supercomputing Center National e-Science Project.
San Diego Supercomputer Center National Partnership for Advanced Computational Infrastructure San Diego Supercomputer Center National Partnership for Advanced.
By: Roman Olschanowsky An Introduction to the.
San Diego Supercomputer Center National Partnership for Advanced Computational Infrastructure San Diego Supercomputer Center National Partnership for Advanced.
San Diego Supercomputer CenterUniversity of California, San Diego Preservation Research Roadmap Reagan W. Moore San Diego Supercomputer Center
San Diego Supercomputer Center Grid Physics Network (GriPhyN) University of Florida Programming Gridflows using Matrix Arun Jagatheesan Architect, SDSC.
Jan Storage Resource Broker Managing Distributed Data in a Grid A discussion of a paper published by a group of researchers at the San Diego Supercomputer.
San Diego Supercomputer Center SDSC Storage Resource Broker SRB as data grid solution (Chinese version) Arun Jagatheesan San Diego Supercomputer.
San Diego Supercomputer Center Grid Physics Network (GriPhyN) University of Florida Dataflows in SRB using SDSC Matrix Arun Jagatheesan Architect & Team.
Data R&D Issues for GTL Data and Knowledge Systems San Diego Supercomputer Center University of California, San Diego Bertram Ludäscher
Rule-Based Data Management Systems Reagan W. Moore Wayne Schroeder Mike Wan Arcot Rajasekar {moore, schroede, mwan, {moore, schroede, mwan,
San Diego Supercomputer Center National Partnership for Advanced Computational Infrastructure San Diego Supercomputer Center National Partnership for Advanced.
San Diego Supercomputer Center SDSC Storage Resource Broker Data Grid Automation Arun Jagatheesan et al., San Diego Supercomputer Center University of.
San Diego Supercomputer Center National Partnership for Advanced Computational Infrastructure San Diego Supercomputer Center National Partnership for Advanced.
QCDGrid Progress James Perry, Andrew Jackson, Stephen Booth, Lorna Smith EPCC, The University Of Edinburgh.
San Diego Supercomputer Center SDSC Storage Resource Broker A Data Storage Language for the Requirements of Rebels and Misfits Arun Jagatheesan San Diego.
Production Data Grids SRB - iRODS Storage Resource Broker Reagan W. Moore
San Diego Supercomputer Center National Partnership for Advanced Computational Infrastructure San Diego Supercomputer Center National Partnership for Advanced.
Chad Berkley NCEAS National Center for Ecological Analysis and Synthesis (NCEAS), University of California Santa Barbara Long Term Ecological Research.
1-1.1 Sample Grid Computing Projects. NSF Network for Earthquake Engineering Simulation (NEES) 2004 – to date‏ Transform our ability to carry out research.
Telescience for Advanced Tomography Applications 1 Telescience featuring IPv6 enabled Telemicroscopy Steven Peltier National Center for Microscopy and.
Tool Integration with Data and Computation Grid GWE - “Grid Wizard Enterprise”
San Diego Supercomputer Center National Partnership for Advanced Computational Infrastructure SRB + Web Services = Datagrid Management System (DGMS) Arcot.
Rule-Based Preservation Systems Reagan W. Moore Wayne Schroeder Mike Wan Arcot Rajasekar Richard Marciano {moore, schroede, mwan, sekar,
Grid Architecture William E. Johnston Lawrence Berkeley National Lab and NASA Ames Research Center (These slides are available at grid.lbl.gov/~wej/Grids)
National Partnership for Advanced Computational Infrastructure San Diego Supercomputer Center Persistent Archive for the NSDL Reagan W. Moore Charlie Cowart.
San Diego Supercomputer Center Grid Physics Network (GriPhyN) University of Florida DGL: The Assembly Language for Grid Computing Arun swaran Jagatheesan.
NA-MIC National Alliance for Medical Image Computing UCSD: Engineering Core 2 Portal and Grid Infrastructure.
Presented by Scientific Annotation Middleware Software infrastructure to support rich scientific records and the processes that produce them Jens Schwidder.
GRID Overview Internet2 Member Meeting Spring 2003 Sandra Redman Information Technology and Systems Center and Information Technology Research Center National.
SAN DIEGO SUPERCOMPUTER CENTER By: Roman Olschanowsky An Introduction to the.
6 February 2009 ©2009 Cesare Pautasso | 1 JOpera and XtremWeb-CH in the Virtual EZ-Grid Cesare Pautasso Faculty of Informatics University.
Michael Doherty RAL UK e-Science AHM 2-4 September 2003 SRB in Action.
From SRB to IRODS: Policy Virtualization using Rule-Based Data Grids Reagan W. Moore Wayne Schroeder Arcot Rajasekar Mike Wan San Diego Supercomputer Center.
Introduction to The Storage Resource.
San Diego Supercomputer Center National Partnership for Advanced Computational Infrastructure San Diego Supercomputer Center National Partnership for.
San Diego Supercomputer Center, University of California at San Diego Grid Physics Network (GriPhyN) University of Florida Data Grid and Gridflow Management.
Biomedical Informatics Research Network The Storage Resource Broker & Integration with NMI Middleware Arcot Rajasekar, BIRN-CC SDSC October 9th 2002 BIRN.
Tool Integration with Data and Computation Grid “Grid Wizard 2”
AHM04: Sep 2004 Nottingham CCLRC e-Science Centre eMinerals: Environment from the Molecular Level Managing simulation data Lisa Blanshard e- Science Data.
National Archives and Records Administration1 Integrated Rules Ordered Data System (“IRODS”) Technology Research: Digital Preservation Technology in a.
The Storage Resource Broker and.
SAN DIEGO SUPERCOMPUTER CENTER An Intelligent Rule-Oriented Data Management System DataGrid Wayne Schroeder San Diego Supercomputer Center, University.
San Diego Supercomputer Center National Partnership for Advanced Computational Infrastructure San Diego Supercomputer Center National Partnership for Advanced.
1 Open Science Grid: Project Statement & Vision Transform compute and data intensive science through a cross- domain self-managed national distributed.
Introduction: AstroGrid increases scientific research possibilities by enabling access to distributed astronomical data and information resources. AstroGrid.
Data Grids, Digital Libraries and Persistent Archives: An Integrated Approach to Publishing, Sharing and Archiving Data. Written By: R. Moore, A. Rajasekar,
San Diego Supercomputer Center University of California, San Diego
OGCE Portal Applications for Grid Computing
Presentation transcript:

San Diego Supercomputer Center, University of California at San Diego Grid Physics Network (GriPhyN) University of Florida A Data Storage Language for the Rebels and Misfits Arun Jagatheesan San Diego Supercomputer Center (SDSC) University of California, San Diego

San Diego Supercomputer Center University of California at San Diego University of Florida 2 Talk Outline Problem Situation Problem Solution Concepts If or Why different? Status Boring – Isn’t this supposed to be a fun session? Hint: count the times he will use the word “Actually”

San Diego Supercomputer Center University of California at San Diego University of Florida 3 Problem Situation SDSC: Petabytes of storage Millions of files Distributed in a data grid We believe our Data Storage Software already has: Infrastructure Independence (more than location independence) Data virtualization (more than storage virtualization)

San Diego Supercomputer Center University of California at San Diego University of Florida 4 Countries actively using SDSC SRB

San Diego Supercomputer Center University of California at San Diego University of Florida TB 358 TB 682 TB Total data brokered by SDSC SRB

San Diego Supercomputer Center University of California at San Diego University of Florida 6 SDSC SRB User Community (Major US) BaBar, Stanford Linear Accelerator Center (SLAC) California Digital Library (CDL) Center for Integrated Space Weather Modeling (CISM) CVC, Visualization Portal LDC Data Storage NIH Bio Informatics Research Network (BIRN) NSF Southern California Earthquake Center (SCEC) National Archives and Records Administration (NARA) National Aeronautics and Space Administration Centers (NASA) National Virtual Observatory (NVO) Npackage, NSF Middleware Initiative (NMI) National Science Digital Library (NSDL) National Optical Astronomy Observatory (NOAO) ROADNet Purdue University SCCOOS, USA Scientific Rich Media Archive Salk Institute Strand Map Service, USA UC Berkeley Library UCSD Library University of Houston Persistent Archives Test bed University of Wisconsin, Madison WebBase, Stanford University Yale University Library

San Diego Supercomputer Center University of California at San Diego University of Florida 7 SDSC SRB User Community Academia Sinica, Taiwan Australian National University Bio-Lab, University of Genoa, Italy Council for the Central Laboratory of the Research Councils (CCLRC), UK CC-IN2P3, France Distributed Framework, Singapore Distributed Aircraft Maintenance Environment (DAME), UK eMinerals Project, UK eScience, Belfast Center Fraunhofer ITWM, Germany High Energy Accelerator Organization, KEK, Japan K* Grid Computing, Korea KEK Computing Center, Japan Lyon, France NorGrid, Norway Nanyang Data Grid, Singapore Queensland University of Technology (QUT), Australia Rutherford Appleton Laboratory (RAL), UK T-Systems, Germany UK eScience Project, UK UniGrid, Poland UMK, Poland Virtual Laboratory for eScience, Netherlands

San Diego Supercomputer Center University of California at San Diego University of Florida 8 Problem In short, we got rebels and misfits (in a nice way) Any file system event (like insert) on the Data Grid should start a long-run process on the file system Data Storage Workflow Automation Language Need a way to describe the workflow (currently scripts) Need more flexible, dynamic ways, to describe, manage and query data storage workflow

San Diego Supercomputer Center University of California at San Diego University of Florida 9 Solution

San Diego Supercomputer Center University of California at San Diego University of Florida 10 Data Grid Language (DGL) XML based gridflow description Describes data storage execution flow logic Implemented as a web service ECA-based rule description for execution ECA = Event, Condition, Action Querying of Status of Gridflow XQuery / Simple query of a Gridflow Execution Scoped variables and gridflow patterns For, For-each, parallel, sequential, switch-cases

San Diego Supercomputer Center University of California at San Diego University of Florida 11 Flow Scoped Variables that can control the flow Logic used by the sub-members Sub-members that are the real execution statements

San Diego Supercomputer Center University of California at San Diego University of Florida 12 Data Grid Request Annotations about the Data Grid Request Can be either a Flow or a Status Query

San Diego Supercomputer Center University of California at San Diego University of Florida 13 Concepts ECA : Event-Condition-ActionS Infrastructure Independent Process Description on Infrastructure independent logical storage Data Storage related operations and variables in workflow language Dynamic workflow description update Data Grid Workflow patterns

San Diego Supercomputer Center University of California at San Diego University of Florida 14 Status Just graduated from our “time well wasted” lab… From prototypes now to production software, next steps… Open source collaborative development project UCSD/SDSC, UCSB, UK CCLRC, Australia University, commercial company (-ies) Its very easy to write a driver and take advantage of our infrastructure

San Diego Supercomputer Center University of California at San Diego University of Florida 15 Acknowledgment SDSC Data Grid Group (SDSC SRB Software) SDSC Storage (Infrastructure and Production) SDSC Matrix Team