Rights Management for Shared Collections Storage Resource Broker Reagan W. Moore

Slides:



Advertisements
Similar presentations
National Partnership for Advanced Computational Infrastructure San Diego Supercomputer Center Data Grids for Collection Federation Reagan W. Moore University.
Advertisements

OGF-23 iRODS Metadata Grid File System Reagan Moore San Diego Supercomputer Center.
© 2006 Open Grid Forum OGF19 Federated Identity Rule-based data management Wed 11:00 AM Mountain Laurel Thurs 11:00 AM Bellflower.
Data Grid: Storage Resource Broker Mike Smorul. SRB Overview Developed at San Diego Supercomputing Center. Provides the abstraction mechanisms needed.
NATIONAL PARTNERSHIP FOR ADVANCED COMPUTATIONAL INFRASTRUCTURE SAN DIEGO SUPERCOMPUTER CENTER Particle Physics Data Grid PPDG Data Handling System Reagan.
San Diego Supercomputer Center NARA Research Prototype Persistent Archive Building Preservation Environments with Data Grid Technology (NARA Research Prototype.
San Diego Supercomputer CenterNational Partnership for Advanced Computational Infrastructure1 Grid Based Solutions for Distributed Data Management Reagan.
Lecture 23 Internet Authentication Applications
A Very Brief Introduction to iRODS
PREMIS in Thought: Data Center for LC Digital Holdings Ardys Kozbial, Arwen Hutt, David Minor February 11, 2008.
Security Requirements for Shared Collections Storage Resource Broker Reagan W. Moore
GGF-17 Astro Workshop Preservation Environment Working Group Officers: Bruce Barkstrom (NASA Langley) Reagan Moore (SDSC) Goals  Demonstrate.
Introduction to PKI Seminar What is PKI? Robert Brentrup July 13, 2004.
Applying Data Grids to Support Distributed Data Management Storage Resource Broker Reagan W. Moore Ian Fisk Bing Zhu University of California, San Diego.
Brief Overview of Major Enhancements to PAWN. Producer – Archive Workflow Network (PAWN) Distributed and secure ingestion of digital objects into the.
Robust Tools for Archiving and Preserving Digital Data Joseph JaJa, Mike Smorul, and Mike McGann Institute for Advanced Computer Studies Department of.
OCT1 Principles From Chapter One of “Distributed Systems Concepts and Design”
PAWN: A Novel Ingestion Workflow Technology for Digital Preservation
UMIACS PAWN, LPE, and GRASP data grids Mike Smorul.
PAWN: A Novel Ingestion Workflow Technology for Digital Preservation Mike Smorul, Joseph JaJa, Yang Wang, and Fritz McCall.
On Developing Data Grid Workflows using Storage Resource Broker (SRB) and Kepler Tim H. Wong - UC Davis Efrat Frank - SDSC Dr. Bertram Ludäscher - UC Davis.
Data Grid Interactions with Firewalls Michael Wan Reagan Moore SDSC/UCSD/NPACI.
Makrand Siddhabhatti Tata Institute of Fundamental Research Mumbai 17 Aug
DCC Conference, Glasgow November, Digital Archive Policies and Trusted Digital Repositories MacKenzie Smith, MIT Libraries Reagan Moore, San Diego.
SYSTEM ADMINISTRATION Chapter 13 Security Protocols.
National Partnership for Advanced Computational Infrastructure Digital Library Architecture Reagan Moore Chaitan Baru Amarnath Gupta George Kremenek Bertram.
San Diego Supercomputer CenterUniversity of California, San Diego Preservation Research Roadmap Reagan W. Moore San Diego Supercomputer Center
Information Management and Distributed Data Reagan W. Moore Wayne Schroeder Mike Wan Arcot Rajasekar Richard Marciano {moore, schroede, mwan, sekar,
Working Group: Practical Policy Rainer Stotzka, Reagan Moore.
Core SRB Technology for 2005 NCOIC Workshop By Michael Wan And Wayne Schroeder SDSC SDSC/UCSD/NPACI.
Jan Storage Resource Broker Managing Distributed Data in a Grid A discussion of a paper published by a group of researchers at the San Diego Supercomputer.
Data Grids and Data Management Storage Resource Broker Reagan W. Moore
Managing Simulation Output Storage Resource Broker Reagan W. Moore
Lecture 23 Internet Authentication Applications modified from slides of Lawrie Brown.
Rule-Based Data Management Systems Reagan W. Moore Wayne Schroeder Mike Wan Arcot Rajasekar {moore, schroede, mwan, {moore, schroede, mwan,
GT Components. Globus Toolkit A “toolkit” of services and packages for creating the basic grid computing infrastructure Higher level tools added to this.
Chapter 9 Section 2 : Storage Networking Technologies and Virtualization.
Computer Security: Principles and Practice First Edition by William Stallings and Lawrie Brown Lecture slides by Lawrie Brown Chapter 22 – Internet Authentication.
Production Data Grids SRB - iRODS Storage Resource Broker Reagan W. Moore
1 Vigil : Enforcing Security in Ubiquitous Environments Authors : Lalana Kagal, Jeffrey Undercoffer, Anupam Joshi, Tim Finin Presented by : Amit Choudhri.
Working Group Practical Policy based on slides and latest documents from the PP WG chaired by Reagan Moore, Rainer Stotzka presented by Johannes Reetz.
Rule-Based Preservation Systems Reagan W. Moore Wayne Schroeder Mike Wan Arcot Rajasekar Richard Marciano {moore, schroede, mwan, sekar,
National Partnership for Advanced Computational Infrastructure San Diego Supercomputer Center Persistent Management of Distributed Data Reagan W. Moore.
National Partnership for Advanced Computational Infrastructure San Diego Supercomputer Center Persistent Archive for the NSDL Reagan W. Moore Charlie Cowart.
Module 6: Managing Client Access. Overview Implementing Client Access Servers Implementing Client Access Features Implementing Outlook Web Access Introduction.
Policy Based Data Management Data-Intensive Computing Distributed Collections Grid-Enabled Storage iRODS Reagan W. Moore 1.
San Diego Supercomputer CenterNational Partnership for Advanced Computational Infrastructure1 Data Grids, Digital Libraries, and Persistent Archives Reagan.
From SRB to IRODS: Policy Virtualization using Rule-Based Data Grids Reagan W. Moore Wayne Schroeder Arcot Rajasekar Mike Wan San Diego Supercomputer Center.
GGF-17 Preservation Environments Research Group Preservation Environment Working Group Officers: Bruce Barkstrom (NASA Langley) Reagan.
1 e-Science AHM st Aug – 3 rd Sept 2004 Nottingham Distributed Storage management using SRB on UK National Grid Service Manandhar A, Haines K,
National Science Foundation Cooperative Agreement: OCI Reagan Moore, PI Mary Whitton, Project Manager.
1 Active Directory Service in Windows 2000 Li Yang SID: November 2000.
National Archives and Records Administration1 Integrated Rules Ordered Data System (“IRODS”) Technology Research: Digital Preservation Technology in a.
4.1 © 2004 Pearson Education, Inc. Exam Managing and Maintaining a Microsoft® Windows® Server 2003 Environment Lesson 12: Implementing Security.
Building Preservation Environments with Data Grid Technology Reagan W. Moore Presenter: Praveen Namburi.
Active Directory Domain Services (AD DS). Identity and Access (IDA) – An IDA infrastructure should: Store information about users, groups, computers and.
Collection-Based Persistent Archives Arcot Rajasekar, Richard Marciano, Reagan Moore San Diego Supercomputer Center Presented by: Preetham A Gowda.
Building Preservation Environments Reagan W. Moore San Diego Supercomputer Center Storage Resource Broker.
Preservation Data Services Persistent Archive Research Group Reagan W. Moore October 1, 2003.
Working Group: Data Foundations and Terminology (Practical Policy Considerations) Reagan Moore.
Building Preservation Environments from Federated Data Grids Reagan W. Moore San Diego Supercomputer Center Storage.
Data Grids, Digital Libraries and Persistent Archives: An Integrated Approach to Publishing, Sharing and Archiving Data. Written By: R. Moore, A. Rajasekar,
Configuring File Services
GGF OGSA-WG, Data Use Cases Peter Kunszt Middleware Activity, Data Management Cluster EGEE is a project funded by the European.
Policy-Based Data Management integrated Rule Oriented Data System
Introduction to Data Management in EGI
Arcot Rajasekar Michael Wan Reagan Moore (sekar, mwan,
Technical Issues in Sustainability
Presentation transcript:

Rights Management for Shared Collections Storage Resource Broker Reagan W. Moore

Abstract National and international collaborations create shared collections that span multiple administrative domains. Shared collection are built using: Data virtualization, the management of collection properties independently of the storage system. Trust virtualization, the ability to assert authenticity and authorization independently of the underlying storage resources. What rights management are implied in trust virtualization? How are data integrity challenges met? How do you implement a “Deep Archive”?

State of the Art Technology Grid - workflow virtualization Manage execution of jobs (processes) independently of compute servers Data grid - data virtualization Manage properties of a shared collection independently of storage systems Semantic grid - information virtualization Reason across inferred attributes derived from multiple collections.

Shared Collections Purpose of SRB data grid is to enable the creation of a shared collection between two or more institutions Register digital entity into the shared collection Assign owner, access controls Assign descriptive, provenance metadata Manage audit trails, versions, replicas, backups Manage location mapping, containers, checksums Manage verification, synchronization Manage federation with other shared collections Manage interactions with storage systems Manage interactions with APIs

Trust Virtualization Collection owns the data that is registered into the data grid Data grid is a set of servers, installed at each storage repository Servers are installed under a SRB ID created for the shared collection All accesses to the data stored under the SRB ID are through the data grid Authentication and authorization are controlled by the data grid, independently of the remote storage system

Rights Management Shared collection approach Manage authentication of all users who will have privileges beyond that of public “read” Map users to groups Provide access controls on users and groups Virtual Organization Management approach (VOM) Authenticate users, and provide certificate asserting membership in a group Manage access controls on groups Provide rules for access based on membership in a group

Authentication / Authorization Collection owned data At each remote storage system, an account ID is created under which the data grid stores files User authenticates to the data grid Data grid checks access controls Data grid server authenticates to a remote data grid server Remote data grid server authenticates to the remote storage repository SRB servers return the data to the client

Authentication Mechanisms Grid Security Infrastructure PKI certificates Challenge-response mechanism No passwords sent over network Ticket Valid for specified time period or number of accesses Generic Security Service API Authentication of server to remote storage

Trust Implementation For authentication to work across multiple administrative domains Require collection-managed names for users For authorization to work across multiple administrative domains Require collection-managed names for files Result is that access controls remain invariant. They do not change as data is moved to different storage systems under shared collection control

Network Devices Access to data must handle security requirements of network devices Firewalls - typically requires access be initiated from within a firewall Network routers - location of data may change based on load leveling Private virtual network - location of data not known explicitly known Handled by creating network transport protocols to meet requirements of each network device

SRB server1 SRB agent SRB server2 Parallel mode Data Transfer – Server Initiated MCAT Sput -m SRB agent srbObjPut + socket addr, port and cookie 1.Logical-to-Physical mapping 2. Identification of Replicas 3.Access & Audit Control Peer-to-peer Request Connect to client Data transfer R Server behind firewall

SRB server1 SRB agent SRB server2 Parallel mode Data Transfer – Client Initiated MCAT Sput -M SRB agent srbObjPut 1.Logical-to-Physical mapping 2. Identification of Replicas 3.Access & Audit Control Return socket addr., port and cookie Connect to server Data transfer R 5 5 Client behind firewall

HIPAA Patient Confidentiality Access controls on data Access controls on metadata Access controls on storage systems Audit trails End-to-end encryption (manage keys) Localization of data to specific storage Access controls do not change when data is moved to another storage system under data grid control

Logical Name Spaces Storage Repository Storage location User name File name File context (creation date,…) Access constraints Data Access Methods (C library, Unix, Web Browser) Data access directly between application and storage repository using names required by the local repository

Logical Name Spaces Storage Repository Storage location User name File name File context (creation date,…) Access constraints Data Grid Logical resource name space Logical user name space Logical file name space Logical context (metadata) Control/consistency constraints Data Collection Data Access Methods (C library, Unix, Web Browser) Data is organized as a shared collection

Logical Resource Names Logical resource name represents multiple physical resources Writing to a logical resource name can result in: Replication - write completes when each physical resource has a copy Load leveling - write completes when the next physical resource in the list has a copy Fault tolerance - write completes when “k” of “n” resources have a copy Single copy - write is done to first disk at same IP address, then disk anywhere, then tape

Federation Between Data Grids Data Grid Logical resource name space Logical user name space Logical file name space Logical context (metadata) Control/consistency constraints Data Collection B Data Access Methods (Web Browser, DSpace, OAI-PMH) Data Grid Logical resource name space Logical user name space Logical file name space Logical context (metadata) Control/consistency constraints Data Collection A Access controls and consistency constraints on cross registration of digital entities

Authentication across Zones Follow Shibboleth model A user is assigned a “home” zone User identity is Home-zone:Domain:User-ID All authentications are done by the “home” zone

User Authentication Z1:D1:U1 Z2 Z1 Connect Challenge/Response Validation of Challenge Response

User Authentication Z1:D1:U1 Z2 Z1 Connect Challenge/Response Validation of Challenge Response Z3 Forward

Deep Archive Z2Z1 Z3 Z2:D2:U2 Register Z3:D3:U3 Register Pull Pull Firewall Server initiated I/O DeepArchive StagingZone Remote Zone No access by Remote zones PVN

NARA Persistent Archive NARAU MdSDSC MCAT Original data at NARA, data replicated to U Md Replicated copy at U Md for improved access, load balancing and disaster recovery Deep archive at SDSC, no user access, data pulled from NARA Demonstrate preservation environment Authenticity Integrity Management of technology evolution Mitigation of risk of data loss Replication of data Federation of catalogs Management of preservation metadata Scalability Types of data collections Size of data collections Federation of Three Independent Data Grids

For more Information on Storage Resource Broker Data Grid Reagan W. Moore