Tactical Storage: Simple, Secure, and Semantic Access to Remote Data Prof. Douglas Thain University of Notre Dame



Plentiful Computing Power
As of 25 April:
Condor Worldwide
  – 56,682 CPUs / ??? TB / 1758 sites
TeraGrid
  – 15,328 CPUs / 220 TB / 6 sites
Open Science Grid
  – 21,156 CPUs / 83 TB / 61 sites
EGEE Grid
  – Lots???

Complex Ecology of Storage
[Diagram labels: Shared Filesystem (shared disks); Independent Cluster Disks; private disks; access via HTTP, FTP, RFIO, gLite, SRB, SCP, RSYNC, ...]

Problems Accessing Data
Large burden on the user.
  – User may not be able/willing to state files in advance.
  – Different services/protocols available at different sites.
  – Programs not modified to take advantage of services.
Different access modes for different purposes.
  – File transfer: preparing system for intended use.
  – File system: access to data for running jobs.
Resources go unused.
  – Disks on each node of a cluster.
  – Unorganized resources in a department/lab.
  – Would like to combine disks into larger structures.
A global file system can't satisfy everyone!
  – (Global means different things to different people.)
  – Both a technical and social problem.

What's the Problem?
We often assume that the site administrator is responsible for making the site comfortable for the user. (Not possible on the grid!)
Rather, the user should be able to bring along a mechanism to access multiple independent (remote?) data sources.
Of course, we have to make it easy!

Tactical Storage Systems (TSS)
A TSS allows any node to serve as a file server or as a file system client.
All components can be deployed without special privileges, but with security.
Users can build up complex structures.
  – Filesystems, databases, caches, ...
  – Admins need not know/care about larger structures.
Two independent concepts:
  – Resources: the raw storage to be used.
  – Abstractions: the organization of storage.

[Diagram: applications run under Parrot and use abstractions (simple filesystem, distributed database, distributed filesystem) layered over file servers, each exporting a UNIX file system. The cluster administrator controls policy on all storage in the cluster; workstation owners control policy on each machine. Other labels: file transfer, 3PT, ???]

Key Properties
Tactical Storage is Simple:
  – Appears as an ordinary filesystem.
  – Applies to unmodified applications and data, without code changes, relinking, kernel modules, etc.
Tactical Storage is Secure:
  – Authentication with standard GSI or Kerberos.
  – Rich distributed access control system.
Tactical Storage is Semantic:
  – Name data by meaning, not by location.
  – Supports external name resolution mechanisms.

Access Control in File Servers
Unix security is not sufficient.
  – No global user database possible/desirable.
  – Mapping external credentials to Unix gets messy.
Instead, make external names first-class.
  – Perform access control on remote, not local, names.
  – Types: Globus, Kerberos, Unix, Hostname, Address
Each directory has an ACL:
  globus:/O=NotreDame/CN=DThain  RWLA
  ...  RWL
  hostname:*.cs.nd.edu  RL
  address: *  RWLA
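The per-directory ACL check described on this slide can be sketched in a few lines of Python. This is an illustration only, not the file server's actual code: the `address:10.0.0.*` entry is made up, the rights letters are treated as opaque flags, and the rule that a subject receives the union of all matching entries is an assumption.

```python
from fnmatch import fnmatchcase

# Hypothetical ACL for one directory: (subject pattern, rights) pairs.
# Subjects are written "type:name" as on the slide; the address entry
# is invented for this example.
ACL = [
    ("globus:/O=NotreDame/CN=DThain", "RWLA"),
    ("hostname:*.cs.nd.edu", "RL"),
    ("address:10.0.0.*", "RWLA"),
]


def rights_for(subject, acl):
    """Union of the rights of every ACL entry whose pattern matches subject."""
    granted = set()
    for pattern, rights in acl:
        if fnmatchcase(subject, pattern):
            granted |= set(rights)
    return granted


def allowed(subject, needed, acl):
    """True if subject holds every right letter in needed."""
    return set(needed) <= rights_for(subject, acl)
```

For example, `allowed("hostname:laptop.cs.nd.edu", "RL", ACL)` is true, while the same subject asking for `"W"` is refused. No local Unix account is consulted at any point, which is the point of making external names first-class.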

Distributed Group ACLs
[Diagram: file servers, each exporting a UNIX file system. Some servers publish group lists (Physics Group List, Chemistry Group List, Lab 5 Group List). Others hold application data whose ACLs refer to those groups:
  – App data ACL: Lab 5 RW, Chemistry R
  – App data ACL: Physics RW, Lab 5 R]

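Semantic Data Access
Mountlist used by Parrot on behalf of the application:
  /usr/local = /chirp/host5.nd.edu/software
  /tmp = /chirp/host9.nd.edu/scratch
  /data = /gsiftp/ftp.nd.edu/mydata
  /db = resolver:find_db
[Diagram: Appl runs under Parrot; /usr/local is served by host5, /tmp by host9, /data by an FTP server, and /db by the external resolver find_db.]
Resolver exchange:
  – Where is /db/dir/523?
  – It's at /ftp/ftp.infn.it/db/xz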
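The mountlist on this slide can be read as a longest-prefix rewrite table with an escape hatch to an external resolver. The Python sketch below illustrates that reading. The function names, the in-memory table inside `find_db`, and the fall-through behaviour for unmounted paths are assumptions for illustration, not Parrot's implementation.

```python
# Mountlist entries copied from the slide.
MOUNTLIST = {
    "/usr/local": "/chirp/host5.nd.edu/software",
    "/tmp": "/chirp/host9.nd.edu/scratch",
    "/data": "/gsiftp/ftp.nd.edu/mydata",
    "/db": "resolver:find_db",
}


def find_db(path):
    """Stand-in for an external resolver; a real one would query a catalog."""
    table = {"/db/dir/523": "/ftp/ftp.infn.it/db/xz"}
    return table[path]


RESOLVERS = {"find_db": find_db}


def translate(path, mountlist=MOUNTLIST, resolvers=RESOLVERS):
    """Map a logical path to a physical one using the longest matching prefix."""
    for prefix in sorted(mountlist, key=len, reverse=True):
        if path == prefix or path.startswith(prefix + "/"):
            target = mountlist[prefix]
            if target.startswith("resolver:"):
                # Hand the whole logical path to the named resolver.
                return resolvers[target[len("resolver:"):]](path)
            return target + path[len(prefix):]
    return path  # no mount applies: leave the path alone (assumption)
```

With this table, `translate("/usr/local/bin/gcc")` yields `/chirp/host5.nd.edu/software/bin/gcc`, and `translate("/db/dir/523")` returns whatever the resolver answers, here `/ftp/ftp.infn.it/db/xz`.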

Remote Database Access
HEP simulation needs direct DB access.
  – App linked against Objectivity DB.
  – Objectivity accesses filesystem directly.
  – How to distribute application securely?
Solution: remote root mount via Parrot:
  parrot -M /=/chirp/fileserver/rootdir
DB code can read/write/lock files directly.
[Diagram: script, sim.exe and libdb.so run under Parrot and reach a file server (file system holding the DB data) across the WAN with GSI authentication; the result appears as a simple FS.]
Credit: Sander Klous, NIKHEF

Remote Application Loading
Modular simulation needs many libraries.
  – Developed on workstations, then ported to grid.
  – Selection of library depends on analysis tech.
  – Constraint: must use HTTP for file access.
Solution: dynamic link with TSS+HTTP:
  – /home/cdfsoft -> /http/dcaf.fnal.gov/cdfsoft
  – Select several MB from 60 GB of libraries.
[Diagram: appl runs under Parrot and loads liba.so, libb.so and libc.so over HTTP, through a proxy, from an HTTP server's file system.]
Credit: Igor Sfiligoi, Fermi National Lab

Technical Problem
HTTP is not a filesystem! (No directories)
  – Advantages: firewalls, caches, admins.
[Diagram: Appl -> Parrot -> HTTP Module -> HTTP Server, which exports a tree: root; etc, home, bin; alice, cms, babar. The application's opendir(/home) becomes GET /home HTTP/1.0.]

Technical Problem
Solution: turn the directories into files.
  – Can be cached in ordinary proxies!
  – Hierarchical SHA1 integrity check.
[Diagram: Appl -> Parrot -> HTTP Module -> HTTP Server, same tree as before. "make httpfs" writes a .dir file into each directory; the one for /home lists alice, babar, cms. The application's opendir(/home) becomes GET /home/.dir HTTP/1.0.]
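One way to picture "directories as files" with a hierarchical SHA1 check is sketched below in Python. The slide does not give the `.dir` file format or say what `make httpfs` actually emits, so the line layout here (`kind digest name`) is invented. The sketch only shows the idea that each listing carries the digests of its children, so a change anywhere below alters every listing up to the root.

```python
import hashlib


def build_dir_files(tree, path=""):
    """Return ({listing_path: listing_text}, SHA1 of this directory's listing).

    tree is a nested dict: dict values are subdirectories, bytes values are
    file contents. The listing format is hypothetical.
    """
    listings = {}
    lines = []
    for name in sorted(tree):
        node = tree[name]
        if isinstance(node, dict):
            # A subdirectory is identified by the digest of its own listing.
            sub, digest = build_dir_files(node, f"{path}/{name}")
            listings.update(sub)
            kind = "D"
        else:
            digest = hashlib.sha1(node).hexdigest()
            kind = "F"
        lines.append(f"{kind} {digest} {name}")
    text = "\n".join(lines) + "\n"
    listings[f"{path}/.dir"] = text
    return listings, hashlib.sha1(text.encode()).hexdigest()
```

A client that wants `opendir("/home")` would then fetch `/home/.dir` as an ordinary file, which any HTTP proxy can cache, and could verify it against the digest recorded one level up.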

Logical Access to Bio Data
Many databases of biological data in different formats around the world:
  – Archives: Swiss-Prot, TrEMBL, NCBI, etc.
  – Replicas: Public, Shared, Private, ???
Users and applications want to refer to data objects by logical name, not location!
  – Access the nearest copy of the non-redundant protein database, don't care where it is.
Solution: EGEE data management system maps logical names (LFNs) to physical names (SFNs).
Credit: Christophe Blanchet, Bioinformatics Center of Lyon, CNRS IBCP, France

Logical Access to Bio Data
[Diagram: BLAST runs under Parrot, which has RFIO, gLite, HTTP and FTP modules. Available services: Chirp Server, FTP Server (holding nr.data), gLite Server, EGEE File Location Service.]
Sequence shown:
  – Run BLAST on LFN://ncbi.gov/nr.data
  – open(LFN://ncbi.gov/nr.data)
  – Where is LFN://ncbi.gov/nr.data?
  – Find it at: FTP://ibcp.fr/nr.data
  – open(FTP://ibcp.fr/nr.data)
  – RETR nr.data
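The lookup step in this sequence can be sketched as a catalog query followed by a choice among replicas. The Python below is a hedged illustration: the catalog is a plain dictionary standing in for the EGEE File Location Service, the `SRB://replica.example.org/...` replica is invented to show that unsupported protocols are skipped, and picking the first usable replica is an assumption, not the service's policy.

```python
from urllib.parse import urlsplit

# Hypothetical catalog: logical file name -> list of physical replicas.
CATALOG = {
    "LFN://ncbi.gov/nr.data": [
        "SRB://replica.example.org/nr.data",  # made-up replica, no SRB module
        "FTP://ibcp.fr/nr.data",              # replica named on the slide
    ],
}

# Protocol modules shown under Parrot on the slide.
SUPPORTED = {"rfio", "glite", "http", "ftp"}


def resolve(lfn, catalog=CATALOG, supported=SUPPORTED):
    """Return the first replica of lfn reachable with a supported protocol."""
    for sfn in catalog.get(lfn, []):
        if urlsplit(sfn).scheme.lower() in supported:
            return sfn
    raise FileNotFoundError(lfn)
```

The application keeps calling `open` on the logical name; only the adapter sees the physical name that `resolve` returns.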

Performance of Bio Apps on EGEE

Expandable Filesystem for Experimental Data
Project GRAND
[Diagram: buffer disk (2 GB/day today, could be lots more!) -> daily tapes -> 30-year archive; analysis code.]
Can only analyze the most recent data.
Credit: John Poirier, Notre Dame Astrophysics Dept.

Expandable Filesystem for Experimental Data
Project GRAND
[Diagram: buffer disk (2 GB/day today, could be lots more!) -> daily tapes -> 30-year archive; in addition, several file servers are combined into a Distributed Shared Filesystem that the analysis code reaches through an Adapter.]
Can analyze all data over large time scales.
Credit: John Poirier, Notre Dame Astrophysics Dept.

Current Work
Now that we can easily use any storage...
  – Much easier to arrange data/jobs arbitrarily.
  – Idea: combine cluster storage and cluster computing!
  – Goal: keep jobs close to the data that they need.
  – PINS: Processing in Storage
Example: GEMS Distributed Databank
  – Facility for creating, storing, and analyzing molecular dynamics data in a cluster.
  – Goal: be able to easily scale both CPU and storage capacity by adding commodity nodes.
Credit: Jesus Izaguirre and Aaron Striegel, Notre Dame

[Diagram: an App uses an Adapter and a Distributed Filesystem Abstraction over file servers (each a UNIX file system) plus a meta-data database. Data sets D1-D4 are replicated across the servers, and jobs J1-J4 run on them. Labels: Query (Mol=="CH4") && (T>300K); F; Fetch D1; Compute F(D1).]
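The "keep jobs close to data" idea can be sketched as a metadata query followed by placement on a node that already stores the selected data. Everything concrete in the Python below is hypothetical: the record fields, node names, and replica layout are invented, and the heuristic (place the most constrained data set first, then choose the least-loaded holder) is one simple choice, not the system's scheduler.

```python
# Hypothetical metadata records: which nodes hold a replica of each data set.
RECORDS = [
    {"id": "D1", "mol": "CH4", "temp_k": 350, "nodes": ["n1", "n4"]},
    {"id": "D2", "mol": "CH4", "temp_k": 280, "nodes": ["n2"]},
    {"id": "D3", "mol": "H2O", "temp_k": 400, "nodes": ["n3"]},
    {"id": "D4", "mol": "CH4", "temp_k": 310, "nodes": ["n1"]},
]


def select(records, mol, min_temp_k):
    """Metadata query in the spirit of (Mol=="CH4") && (T>300K)."""
    return [r for r in records if r["mol"] == mol and r["temp_k"] > min_temp_k]


def place_jobs(datasets, load=None):
    """Assign one job per data set to a node that already stores it.

    Data sets with the fewest replicas are placed first; among the holders,
    the least-loaded node wins (ties broken by name).
    """
    load = dict(load or {})
    plan = {}
    for d in sorted(datasets, key=lambda r: (len(r["nodes"]), r["id"])):
        node = min(d["nodes"], key=lambda n: (load.get(n, 0), n))
        load[node] = load.get(node, 0) + 1
        plan[d["id"]] = node
    return plan
```

With these records, `place_jobs(select(RECORDS, "CH4", 300))` runs the job for D4 on n1, its only holder, and the job for D1 on n4, so no data set has to move.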

More Open Problems
Resource Management
  – How to prevent overcommitment -> badput?
Security
  – How to easily express complex policies for sharing and controlling combined CPU/disk?
Reliability
  – How to deal with disconnection, erasure, rejection, unexpected performance, etc.?
Garbage Collection
  – What's to prevent me from filling every disk everywhere with computations that I might need?
Debugging
  – How do we dig the state relevant to a complex workflow out of numerous, noisy, distributed logs?

Conclusion Tactical storage allows end users to build large structures out of simple building blocks without getting stuck on the ugly details.

Acknowledgments
Science Collaborators:
  – Christophe Blanchet
  – Patrick Flynn
  – Sander Klous
  – Peter Kunszt
  – Erwin Laure
  – John Poirier
  – Igor Sfiligoi
CS Collaborators:
  – Jesus Izaguirre
  – Aaron Striegel
CS Students:
  – Paul Brenner
  – James Fitzgerald
  – Jeff Hemmes
  – Paul Madrid
  – Chris Moretti
  – Gerhard Niederwieser
  – Phil Snowberger
  – Justin Wozniak

For more information...
Cooperative Computing Lab
Cooperative Computing Tools
Douglas Thain

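Problem: Shared Namespace
[Diagram: a file server whose single directory has the ACL globus:/O=NotreDame/* RWLAX; every user's files (a.out, test.c, test.dat, cms.exe) land in the same namespace.]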

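Solution: Reservation (V) Right
[Diagram: a file server whose top-level ACL is O=NotreDame/CN=* V(RWLA), which allows mkdir only. A mkdir by /O=NotreDame/CN=Monk creates a directory with ACL /O=NotreDame/CN=Monk RWLA, holding a.out and test.c. A mkdir by /O=NotreDame/CN=Ted creates a separate directory with ACL /O=NotreDame/CN=Ted RWLA, holding its own a.out and test.c.]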
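The reservation right can be sketched as a special case in mkdir: a subject holding only V(...) may create a directory, and the new directory's ACL names that subject alone with the rights listed in the parentheses. The Python below is a minimal model under stated assumptions. The class, its method names, the triple representation of an ACL entry, and the inheritance rule for an ordinary mkdir by a subject with W are all invented for illustration.

```python
from fnmatch import fnmatchcase


class FileServer:
    """Toy model: each directory maps to a list of ACL entries.

    An entry is (subject pattern, plain rights, rights granted on reserve).
    """

    def __init__(self, root_acl):
        self.acls = {"/": list(root_acl)}

    def _match(self, path, subject):
        for pattern, rights, reserved in self.acls[path]:
            if fnmatchcase(subject, pattern):
                return rights, reserved
        return "", ""

    def can(self, subject, path, right):
        return right in self._match(path, subject)[0]

    def mkdir(self, subject, parent, name):
        rights, reserved = self._match(parent, subject)
        child = parent.rstrip("/") + "/" + name
        if "W" in rights:
            # Ordinary mkdir: copy the parent's ACL (assumption).
            self.acls[child] = list(self.acls[parent])
        elif reserved:
            # Reservation: the creator alone gets the reserved rights.
            self.acls[child] = [(subject, reserved, "")]
        else:
            raise PermissionError(f"{subject} may not mkdir in {parent}")
        return child
```

Starting from `FileServer([("/O=NotreDame/CN=*", "", "RWLA")])`, a mkdir by Monk yields a directory only Monk can use. Ted can reserve his own directory but cannot read Monk's, and neither can write at the top level.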