
Use of S3 Storage
Hironori Ito, Brookhaven National Laboratory

S3 Storage
S3 is a storage API implemented by various object stores: Amazon, Ceph, Riak, xRootd, and others support the S3 interface. In S3 storage, one stores objects (files/data) in a bucket (a directory-like placeholder). One can store a large number of objects in a bucket, or create a large number of buckets. There is generally no associated file system, although a FUSE mount is possible. To access S3 storage, one needs an access_key_id and a secret_access_key; the access_key_id is associated with buckets and their permissions.
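As a rough illustration of the bucket/object model, here is a minimal sketch using the boto Python API (the same library the pilots use, as described later). The keys and names are placeholders, not real credentials.

    # Minimal sketch of the S3 bucket/object model with the boto Python API.
    # Access keys are placeholders; real keys come from the storage provider.
    import boto

    conn = boto.connect_s3(
        aws_access_key_id='MY_ACCESS_KEY_ID',
        aws_secret_access_key='MY_SECRET_ACCESS_KEY',
    )

    # A bucket is a flat container; objects live in it under key names.
    bucket = conn.create_bucket('my-test-bucket')
    key = bucket.new_key('logs/job.12345.log')  # '/' is part of the name, not a directory
    key.set_contents_from_string('hello, S3')

    # Objects are listed per bucket; there is no real directory tree.
    for obj in bucket.list():
        print("%s %d" % (obj.name, obj.size))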

BNL Ceph
BNL currently has 1.8 PB of raw storage, built from retired storage systems. With a replication factor of 3, the actual usable space is about 0.6 PB. The Ceph cluster consists of 8 head nodes; two of them also serve as access gateways via the S3 API as well as regular HTTP. The details are shown in the next slide by Alexander Zaytsev (also see his HEPiX presentation: http://indico.cern.ch/event/320819/session/6/contribution/39). BNL is currently in the process of increasing the performance and capacity by adding more retired storage. BNL, MWT2 and AGLT2 are also discussing a possible test of federated Ceph storage, which follows a one-master + many-slaves model: http://ceph.com/docs/master/radosgw/federated-config/
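Because the gateway nodes expose the standard S3 API, the same boto calls work against the Ceph cluster; only the endpoint changes. A sketch, with a hypothetical gateway host name:

    # The same boto API pointed at a Ceph RADOS gateway instead of Amazon.
    # The host and keys are hypothetical, not the actual BNL endpoint.
    import boto
    import boto.s3.connection

    conn = boto.connect_s3(
        aws_access_key_id='MY_ACCESS_KEY_ID',
        aws_secret_access_key='MY_SECRET_ACCESS_KEY',
        host='cephgw.example.bnl.gov',   # hypothetical S3 gateway head node
        is_secure=False,
        calling_format=boto.s3.connection.OrdinaryCallingFormat(),  # path-style URLs
    )
    for bucket in conn.get_all_buckets():
        print("%s %s" % (bucket.name, bucket.creation_date))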

HEPiX slides by Alexander Zaytsev (details of the BNL Ceph cluster; see the Indico link on the previous slide)

BNL's Ceph Use
PANDA developers have been using the BNL Ceph storage for job log outputs at a relatively small scale. The pilot developers (Paul and Wen) have added S3 support using the boto Python API, and the PANDA server developers (Tadashi) have added support for storing the access_key_id and secret_access_key. PANDA developers have also asked/suggested that BNL create a simple HTTP interface for accessing the logs. Ceph S3 supports plain HTTP natively; however, it cannot be used here due to our network configuration. Instead, a simple RESTful Rails webapp was created as a proxy service to the backend storage; a log is accessed simply via http://host/bucket/object. The storage has been stable without any major issues.
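Reading a log back through the proxy is then an ordinary HTTP GET. A sketch with a hypothetical host, bucket, and object name:

    # Fetch a job log through the Rails proxy, which exposes the backend
    # Ceph S3 store as plain http://host/bucket/object (names hypothetical).
    import requests

    url = 'http://logserver.example.bnl.gov/panda-logs/job.12345.log.tgz'
    resp = requests.get(url, stream=True)
    resp.raise_for_status()
    with open('job.12345.log.tgz', 'wb') as out:
        for chunk in resp.iter_content(chunk_size=1 << 20):  # 1 MB chunks
            out.write(chunk)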

Amazon S3
BNL has been engaging with Amazon to use their cloud services as possible ATLAS computing and storage resources. Amazon offers three storage services: S3 (simple storage), EBS (elastic block store), and Glacier (backup/archive); see http://aws.amazon.com/s3/. BNL has been testing S3 as a possible regular ATLAS storage within Amazon: reads/writes by pilots, and reads/writes by DDM.

Access to Amazon S3
Pilots use the S3 API to read from and write to S3 (just like Ceph). They need an access_key_id and secret_access_key, which are stored in the PANDA server. DDM currently requires an SRM; the setup (by Carlos Gamboa) was to install BestMan SRM, which needs a file system, so the S3 storage is mounted via FUSE. The endpoints BNL-AWSEAST_DATADISK/PRODDISK/USERDISK were added to AGIS. DDM transfers have been tested for functionality and performance: writing to S3 reaches ~a few hundred MB/s because only one stream can be used (multiple streams break checksum validation); reading from S3 reaches ~several hundred MB/s, and multiple streams are fine. The performance can be increased. The most recent FTS3 supports direct S3 access without the use of SRM; this still needs testing, and DDM must also support S3 to make use of FTS3's S3 capabilities.
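The checksum constraint on writes comes from how S3 forms its ETag: for a plain single-stream PUT, the ETag is the MD5 of the object, which MD5-based validation can compare against the local checksum, while a multipart (multi-stream) upload returns an ETag of the form '<md5-of-part-md5s>-<nparts>', so the comparison fails. A sketch, with hypothetical bucket and file names:

    # Single-stream PUT: the returned ETag is the object's MD5, so a
    # DDM-style checksum comparison succeeds.  A multipart upload would
    # yield an ETag like 'abc123...-8' and the same comparison would fail.
    # Bucket and file names are hypothetical.
    import hashlib
    import boto

    conn = boto.connect_s3()  # credentials from the environment / PANDA server
    bucket = conn.get_bucket('bnl-awseast-datadisk')

    key = bucket.new_key('user/test/file.root')
    key.set_contents_from_filename('file.root')  # one stream, plain PUT

    local_md5 = hashlib.md5(open('file.root', 'rb').read()).hexdigest()
    remote = bucket.get_key(key.name)
    assert remote.etag.strip('"') == local_md5  # holds for single-stream PUTs only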