San Diego Supercomputer Center www.irods.org iRODS DGMS Towards Data Grid Standard Implementations Arun Jagatheesan San Diego Supercomputer Center Open.


1 San Diego Supercomputer Center www.irods.org iRODS DGMS Towards Data Grid Standard Implementations Arun Jagatheesan San Diego Supercomputer Center Open Grid Forum 19 Jan 31, 2007 – session II

2 Outline: Community introduction (OGF-GFS); User perspective; Developer/vendor perspective; Need for a standard community implementation; Community implementation process; GFS-WG community architecture sketch; Follow-up actions

3 Motivation: Global namespace for unstructured data storage; Collaboration amongst multiple partners / teams; Long-term management of unstructured data (files, collection-based digital entities)

4 NIH BIRN Data Grid

5 World Wide Datagrid

6 Used or Required by: Large-scale academic projects; Federal agencies (NARA, LoC, …); Fortune 500, Forbes Global 2000, …

7 DGMS Concept-wise: Large-scale logical file system. File System + Database System + Grid Computing = Data Grid Management System (DGMS). Core concepts: Logical shared collections; Logical shared resources; Collaborative communities

8 Problem solved / Requirements (1): Collaborative logical namespace. Global collaborations of multiple teams; Collaborations of multiple organizations; Avoid multiple mount points, as they restrict the scalability of the collaboration; Coordinated data sharing at any level of granularity (data, metadata, annotations, …)

9 Problem solved / Requirements (2): Data distribution. Multi-site replicas reduce access times; Replicas have the same logical name everywhere in the enterprise (a big plus for users); Concept of replica, copy, cache; Replicas are controlled by the user, the admin, or the system (automated or policy-based); Reduce WAN latency (chattiness)
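
The replica requirement on slide 9 can be illustrated with a minimal sketch. It assumes a catalog that maps one logical name to several physical replicas and picks the replica at the lowest-latency site. The class names, site labels, paths, and latency figures are hypothetical illustrations, not part of iRODS or of any OGF-GFS specification.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    site: str            # hypothetical site label
    physical_path: str   # where the bytes live at that site


@dataclass
class ReplicaCatalog:
    """Maps one logical name to the physical replicas behind it."""
    entries: dict = field(default_factory=dict)

    def register(self, logical_name, replica):
        # The same logical name may be registered at many sites.
        self.entries.setdefault(logical_name, []).append(replica)

    def resolve(self, logical_name, latency_ms):
        """Return the replica whose site has the lowest known latency."""
        replicas = self.entries[logical_name]
        return min(replicas, key=lambda r: latency_ms.get(r.site, float("inf")))


catalog = ReplicaCatalog()
catalog.register("/grid/projectA/scan42.img", Replica("site-a", "/archive/a1/scan42.img"))
catalog.register("/grid/projectA/scan42.img", Replica("site-b", "/disk7/projectA/scan42.img"))

# The user only ever names the logical path; the catalog chooses a site.
nearest = catalog.resolve("/grid/projectA/scan42.img", {"site-a": 80.0, "site-b": 12.0})
```

In this sketch the caller never sees a physical path until resolution, which is the property the slide highlights: the logical name stays the same wherever the replica lives.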

10 Problem solved / Requirements (3): Data classification and discovery. Major advantage for Global 2000 companies; Tag data with any arbitrary metadata schema; Each team can organize its data based on user-defined attributes; Multiple teams can have different metadata attributes on the same data; Query, discover and access data without knowing the path or protocol to be used
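
The classification and discovery requirement on slide 10 can be sketched as follows, assuming a simple in-memory catalog in which each team attaches its own attributes to the same logical name and queries by attribute rather than by path. `MetadataCatalog`, the team names, and the attribute names are hypothetical and only illustrate the idea.

```python
from collections import defaultdict


class MetadataCatalog:
    """Per-team, user-defined attributes attached to logical names."""

    def __init__(self):
        # logical_name -> team -> {attribute: value}
        self._tags = defaultdict(lambda: defaultdict(dict))

    def tag(self, logical_name, team, attribute, value):
        self._tags[logical_name][team][attribute] = value

    def query(self, team, **criteria):
        """Return logical names whose tags for `team` match every criterion."""
        return sorted(
            name
            for name, by_team in self._tags.items()
            if all(by_team.get(team, {}).get(k) == v for k, v in criteria.items())
        )


tags = MetadataCatalog()
# Two teams describe the same object with different schemas.
tags.tag("/grid/shared/run1.dat", "physics", "instrument", "detector-7")
tags.tag("/grid/shared/run1.dat", "archive", "retention", "10y")
tags.tag("/grid/shared/run2.dat", "physics", "instrument", "detector-9")

hits = tags.query("physics", instrument="detector-7")
```

The query takes only a team and attribute values, so the caller discovers data without knowing where it is stored or how it will be fetched.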

11 User Perspective: Designed to be off the shelf: users don't want to assemble it themselves (DIY), but want to be able to customize the solution. One point of contact or responsibility: if it does not work, I have one mailing list or number to call.

12 Vendor/developer perspective: "OGF-GFS compatible" (OGF-GFS Data Grid Applications; OGF-GFS Data Grid Appliance). Ease of standard evolution: avoid unnecessary dependencies on multiple interfaces for operations that are at the same level of granularity. Ability to collaborate, learn and compete: an end-to-end solution with a common interface; additional capabilities that add value to the solution.

13 Lessons Learnt: Software vs. specification: a software implementation lets us engage and collaborate as we define standards (unless everyone wants to invest in software development from the start). Make both the user and the vendor/developer happy: users become confident enough to share requirements and demand the standards from vendors/developers; vendors/developers know it is a real thing that can be implemented around their existing products or software.

14 The scope (from GFS Architecture): A single interface. Protocols: a hybrid of XML and byte-level protocols (XML for the command channel of operations; byte-level for data movement). Possible functionalities: file namespace and file operations (read, write, …); metadata operations (user-defined metadata, search); Data Grid Language for policy, rules, etc.
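
The hybrid protocol on slide 14 can be illustrated with a rough sketch: an XML message carries the command, while the data itself moves separately as raw byte chunks. The element name `command`, the attribute `op`, the element `logicalName`, and the chunk size are all assumptions made for this example; the slides do not define a message format.

```python
import xml.etree.ElementTree as ET


def encode_command(operation, logical_name):
    """Build an XML command message (hypothetical element names)."""
    root = ET.Element("command", {"op": operation})
    ET.SubElement(root, "logicalName").text = logical_name
    return ET.tostring(root, encoding="utf-8")


def decode_command(message):
    """Parse a command message back into (operation, logical name)."""
    root = ET.fromstring(message)
    return root.get("op"), root.findtext("logicalName")


def iter_chunks(data, chunk_size=4):
    """Byte-level channel: yield the payload as fixed-size raw chunks."""
    for start in range(0, len(data), chunk_size):
        yield data[start:start + chunk_size]


# Command channel: a small, self-describing XML message.
command = encode_command("write", "/grid/shared/run1.dat")
operation, name = decode_command(command)

# Data channel: the payload is never wrapped in XML.
chunks = list(iter_chunks(b"0123456789"))
```

Keeping the payload out of the XML avoids encoding overhead for bulk data, while the command stays easy to extend and inspect.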

15 What could be the right high-level picture? [Diagram labels: DGMS; XML-command protocol; Byte-level data protocol; Object-transfer; Facilitate SOA]

16 What could be the right high-level picture? [Diagram labels: DGMS server (shown three times); XML-command protocol; Byte-level data protocol]

17 User perspective [Diagram labels: Logical resources; Multiple replicas; Users from different organizations; User-defined metadata for data discovery; Secret Recipe]

18 So what will we be doing (products?): Definition: concept (data grid namespace, resource namespace, …); initial functionalities (DGMS operations to be targeted); namespace (files, metadata, resource, policy rules). XML protocol: XML handshake and message transfer between DGMS client and DGMS server. Most importantly: software as a common framework for the evolution, adoption and growth of the standard and DGMS concepts.

19 So how will we do it? (process): Community-based open design (OPEN FORUM): design discussions as a community; code written by multiple parties, to make sure we keep the vendor/developer community and the user community engaged. Community-based open standard (OPEN STDS): specs written using a wiki and other mechanisms; community-based spec for OGF; interoperability workshops, and workshops along with other relevant agencies like SNIA or DMTF.

20 How can you get started? Initial requirements: Can you delete email? (Sign up for our mailing list.) Got bandwidth and a browser? (Visit our group page.) Can you scream or shout or smile? (Join our WG sessions.) Are you a user, consumer or researcher? Tell us what is needed: what should be there for you to put this open source software/standard in production? Are you a vendor/developer? Have your engineer or developer talk to us (we will convert him to a DGMS developer or DGMS guru). We are developing an open standard: take advantage of it and develop a value-added solution around it.

21 When do we get started? Right now (Hmmm... we did a long time back). Conference calls every other week, mostly Wednesdays; attend by phone call, Skype or Polycom video conference (anything you like); discussions influence the design requirements. Face-to-face meetings: once every quarter (planned), at OGF sessions.

22 Suggestions, comments, criticism. TO DO: standard operations based on policies/rules; take advantage of OGF standards where possible; other commercial or magic tools could be used below the standard. NOT TO DO: (no items listed)

23 Conclusions: Data Grids and Data Grid Management Systems (DGMS); Strong user need in both academic and non-academic settings; Need for standards framed by the Grid File System WG; Software-included spec strategy

