San Diego Supercomputer Center www.iRODS.org Self-organizing Smart Namespaces : Next Generation Data Grid Systems Arun Jagatheesan iRODS.org.

1 Self-organizing Smart Namespaces: Next Generation Data Grid Systems
Arun Jagatheesan, San Diego Supercomputer Center, iRODS.org

2 Content Outline
State of the art
  Where we stand
  Concepts
What is next, new, hot and exciting?
  Yesterday’s research - now
  Today’s research - future?
What could be done from OGF, SNIA, IETF??
  Standard for distributed data management
  Risks, rewards

3 State of the art - where we are now (Shameless self promotion or fact!)
Estimated 2 petabytes of data brokerage
Multiple agencies - DoD, NARA, NSF, NIH, …
Multiple countries - US, UK, Japan, France, …, Antarctica
Spun off a private company …
We don’t live in the past anyway…

4 Concepts and Lessons (Current understanding - looking back)
Don’t hide distributed computing
  Allow users to “enjoy” a distributed namespace rather than cheat them with a “location opaque” namespace (unlike traditional file systems)
  Human readable or enjoy-able names (no URLs, UUIDs etc.)
Logical mappings to physical heterogeneities
  Data (files), storage resources, metadata, user groups, policies, and even file systems become logical entities in data grids
  Hide the physical details of each behind logical, human-friendly names
Keep it simple and scalable (it’s the data model & design)
  Not a layer on top of another layer. A finished product, not Lego blocks.
Hybrid approach - neither too much P2P nor too much centralization
  Just the right level of distributed computing with some TLC for users

5 Content Outline
State of the art
  Where we stand
  Concepts
What is next, new, hot and exciting?
  A use case - LSST
  Yesterday’s research - now
  Today’s research - future?
What could be done from OGF, SNIA, IETF??
  Standard for distributed data management
  Risks, rewards

6 Motivational Use Case
LSST = Large Synoptic Survey Telescope
150+ petabytes
Multiple countries, multiple data centers
Multiple heterogeneous file systems (high performance, high distribution, interoperability, P2P, …)
Multiple heterogeneous hardware platforms

7 Yesterday’s research
Data grid workflows and policies
Some concepts prototyped in SRB Matrix
Event, Condition, Action (ECA) based “data grid flows”
  If, for, for-each, if-else, switch-case
Server-side workflows on data grids
Use a separate language, the Data Grid Language, to capture the recipe of a workflow and execute it as an action
Let the flow be with you (a Flow data type was introduced)
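
The idea of a flow captured as data and executed on the server can be sketched as follows. This is a minimal illustration only: the class names (`Step`, `IfElse`, `ForEach`, `Flow`) and the `run` function are hypothetical and are not taken from SRB Matrix or the Data Grid Language.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Union

# Hypothetical flow model; every name here is an assumption made for illustration.

@dataclass
class Step:
    name: str
    action: Callable[[dict], None]  # the work done at this step

@dataclass
class IfElse:
    condition: Callable[[dict], bool]
    then_flow: "Flow"
    else_flow: "Flow"

@dataclass
class ForEach:
    var: str                        # context key that receives each item
    items: Callable[[dict], list]   # produces the items to iterate over
    body: "Flow"

@dataclass
class Flow:
    children: List[Union[Step, IfElse, ForEach, "Flow"]] = field(default_factory=list)

def run(flow: Flow, ctx: dict) -> None:
    """Interpret a flow description against a mutable context."""
    for node in flow.children:
        if isinstance(node, Step):
            node.action(ctx)
        elif isinstance(node, IfElse):
            run(node.then_flow if node.condition(ctx) else node.else_flow, ctx)
        elif isinstance(node, ForEach):
            for item in node.items(ctx):
                ctx[node.var] = item
                run(node.body, ctx)
        elif isinstance(node, Flow):
            run(node, ctx)

# Example: for each file, replicate FITS images and skip everything else.
log = []
flow = Flow([
    ForEach("file", lambda ctx: ctx["files"], Flow([
        IfElse(
            lambda ctx: ctx["file"].endswith(".fits"),
            Flow([Step("replicate", lambda ctx: log.append(("replicate", ctx["file"])))]),
            Flow([Step("skip", lambda ctx: log.append(("skip", ctx["file"])))]),
        ),
    ])),
])
run(flow, {"files": ["a.fits", "notes.txt"]})
```

Because the flow is a plain data structure, it could in principle be stored, shipped to a server, and nested inside other flows, which is the point of treating a flow as a data type.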

8 Today’s research = future
Now = Lessons learnt + yesterday’s research
Allow the logical namespace to reflect a local namespace (local file system logically mounted on the global namespace)
Allow users to define their own policies and workflows (services, rules)
iRODS.org - open source platform - world’s first open source Data Grid Management System (DGMS)
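
One way to picture a local file system logically mounted on the global namespace is a mount table with longest-prefix resolution. The sketch below is an assumption made for illustration, not the iRODS implementation; the class `LogicalNamespace`, its methods, and all paths are hypothetical.

```python
class LogicalNamespace:
    """Hypothetical mount table: logical prefix -> (resource name, local root)."""

    def __init__(self):
        self._mounts = {}

    def mount(self, logical_prefix, resource, local_root):
        # Store prefixes and roots without a trailing slash so joins stay predictable.
        self._mounts[logical_prefix.rstrip("/")] = (resource, local_root.rstrip("/"))

    def resolve(self, logical_path):
        """Return (resource, local path) for the longest matching mount prefix."""
        best = None
        for prefix in self._mounts:
            if logical_path == prefix or logical_path.startswith(prefix + "/"):
                if best is None or len(prefix) > len(best):
                    best = prefix
        if best is None:
            raise KeyError(f"no mount covers {logical_path}")
        resource, root = self._mounts[best]
        return resource, root + logical_path[len(best):]


ns = LogicalNamespace()
ns.mount("/grid/home/arun/laptop", "laptop-fs", "/Users/arun/data")
ns.mount("/grid/home/arun", "sdsc-disk", "/vault/arun")

resolved = ns.resolve("/grid/home/arun/laptop/run1/img.fits")
```

The user only ever sees the logical path; which resource serves it depends on the mount table, so the same global namespace can reflect several local ones.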

9 iRODS.org
It’s all about the namespace and how users or applications interact with it
What if we made this namespace “smart”?
ECA rules + machine learning or bootstrapped learning
  Event: anything, as simple as a file upload
  Condition: based on system or user metadata
  Action: any system-defined or user-defined service
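
The Event, Condition, Action triple can be sketched as a tiny rule engine. This is a hedged illustration, not the iRODS rule engine or its rule syntax: `Rule`, `RuleEngine`, the event name `"upload"`, and the metadata keys are all assumptions. The learning component mentioned on the slide is not modeled here.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class Rule:
    event: str                                  # e.g. a file upload
    condition: Callable[[Dict[str, str]], bool]  # test on system or user metadata
    action: Callable[[Dict[str, str]], str]      # system- or user-defined service

class RuleEngine:
    """Hypothetical engine: fires every registered rule that matches an event."""

    def __init__(self):
        self._rules: List[Rule] = []

    def register(self, rule: Rule) -> None:
        self._rules.append(rule)

    def fire(self, event: str, metadata: Dict[str, str]) -> List[str]:
        # Run the action of each rule whose event matches and whose condition holds.
        return [
            rule.action(metadata)
            for rule in self._rules
            if rule.event == event and rule.condition(metadata)
        ]

engine = RuleEngine()
engine.register(Rule(
    event="upload",
    condition=lambda md: md.get("project") == "LSST",
    action=lambda md: f"replicate {md['path']} to tape-archive",
))

result = engine.fire("upload", {"project": "LSST", "path": "/grid/lsst/night1.fits"})
```

Here the action only returns a description of what it would do; in a real system it would invoke a service.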

10 iRODS
Namespace #1 (data): human readable data names mapped to data (or virtual data)
Namespace #2 (resource): human readable resource names mapped to storage resources (allows distributed computing)
Namespace #3 (policies): human readable policy namespace describing how data needs to be managed
Again, everything can be accessed and controlled by end users (not just system admins)
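
The three namespaces can be pictured as three lookup tables that refer to each other by human readable names. The sketch below is an assumption for illustration only; `DataGrid`, its methods, the host names, and the paths are hypothetical and do not reflect the iRODS catalog schema.

```python
class DataGrid:
    """Hypothetical catalog holding the data, resource, and policy namespaces."""

    def __init__(self):
        self.resources = {}  # resource name -> physical endpoint
        self.data = {}       # logical data name -> list of (resource name, physical path)
        self.policies = {}   # policy name -> (logical collection, description)

    def add_resource(self, name, endpoint):
        self.resources[name] = endpoint

    def register_data(self, logical_name, resource, physical_path):
        if resource not in self.resources:
            raise KeyError(f"unknown resource: {resource}")
        self.data.setdefault(logical_name, []).append((resource, physical_path))

    def add_policy(self, name, collection, description):
        self.policies[name] = (collection.rstrip("/"), description)

    def locate(self, logical_name):
        """Map one logical name to the physical locations of its replicas."""
        return [(self.resources[res], path) for res, path in self.data[logical_name]]

    def policies_for(self, logical_name):
        """Names of policies whose collection contains the logical name."""
        return sorted(
            name for name, (collection, _) in self.policies.items()
            if logical_name.startswith(collection + "/")
        )

grid = DataGrid()
grid.add_resource("sdsc-disk", "disk.sdsc.example")
grid.add_resource("uk-tape", "tape.uk.example")
grid.register_data("/grid/lsst/night1.fits", "sdsc-disk", "/vol3/a91f.dat")
grid.register_data("/grid/lsst/night1.fits", "uk-tape", "/archive/0007")
grid.add_policy("keep-two-copies", "/grid/lsst", "every object has two replicas")

replicas = grid.locate("/grid/lsst/night1.fits")
applied = grid.policies_for("/grid/lsst/night1.fits")
```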

11 Content Outline
State of the art
  Where we stand
  Concepts
What is next, new, hot and exciting?
  A use case - LSST
  Yesterday’s research - now
  Today’s research - future?
What could be done from OGF, SNIA, IETF??
  Standard for distributed data management
  Risks, rewards

12 OGF, SNIA and iRODS.org
Collaborative data management
  FAN / Data grid??? - but still distributed data management
  But it still needs a simple standard API
Data grid namespace on XAM resources
  Standardize a simple API (Java, C/C++) to provide data grid concepts on top of existing SNIA XAM or products
Open source data grid software
  Involve engineers from different participating member organizations
Multi-institutional participation
  Multiple countries, multiple companies, academic and commercial participants

13 Enthusiasm is contagious
