Service Challenge 3 Plans @ CERN
Vladimír Bahyl, IT/FIO
Vladimir.Bahyl@cern.ch
17 May 2005

Outline
- Important dates
- Network connectivity
- Hardware
- Software: 4 different configurations
- Conclusion

Important dates
- Setup phase
  - Starts 1st of July 2005, but preparations have already started
  - Includes throughput test:
    - 150 MB/s disk (CERN) → disk (Tier-1)
    - 60 MB/s disk (CERN) → tape (Tier-1)
    - = traffic up to 1 GB/s from CERN (see the check below)
- Service phase
  - Starts in September → end of 2005
  - Includes real experiments' data
    - CMS and ALICE in the beginning
    - ATLAS and LHCb join later (October/November)
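The aggregate target is essentially the per-site disk rate summed over the participating Tier-1 sites. A minimal back-of-the-envelope check in Python, assuming roughly seven participating sites (the site count is an assumption for illustration; the per-site rate comes from the slide above):

```python
# Back-of-the-envelope check of the aggregate SC3 throughput target.
# The number of participating Tier-1 sites is an assumption made for
# illustration only; the per-site rate is taken from the slide above.
DISK_RATE_MB_S = 150    # disk (CERN) -> disk (Tier-1), per site
N_TIER1_SITES = 7       # assumed number of sites in the throughput test

aggregate_mb_s = N_TIER1_SITES * DISK_RATE_MB_S
print(f"Aggregate disk-to-disk rate: {aggregate_mb_s} MB/s "
      f"(~{aggregate_mb_s / 1000:.1f} GB/s out of CERN)")
```

Seven disk-to-disk streams at 150 MB/s already add up to roughly the 1 GB/s quoted above.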

Network connectivity
[Diagram: dedicated SC3 network — 41 nodes on the 128.142.224.0/24 subnet with 41×1 Gb links, the cernh7 router (128.142.224.7), a 10 Gb link towards the Tier-1s, and 1 Gb / 2×1 Gb links (up to 4 Gbps and up to 8 Gbps) towards tapes, disks, databases and the General Purpose Network backbone.]

Hardware – list
- 20 IA64 Itanium servers (oplaproXX)
- 21 IA32 disk servers (lxshareXXXd)
- 10 other IA32 servers available on standby (lxfsXXXX / lxbXXXX)
- All nodes will use a single Gb network connection only
- Special routing to the Tier-1s
- Local host firewall (iptables) enabled (see the sketch below)
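Since every node runs a local iptables host firewall and only the transfer services should be reachable from the Tier-1 networks, the per-host rules can be generated mechanically. A minimal sketch, assuming hypothetical Tier-1 subnets (RFC 5737 documentation ranges), the standard GridFTP control port 2811, and placeholder SRM and data-channel ports; this is not the actual SC3 firewall configuration:

```python
# Sketch: generate per-node host-firewall (iptables) rules that only admit
# SRM and GridFTP traffic from Tier-1 networks. The subnets (RFC 5737
# documentation ranges), the SRM port and the data-port range are
# illustrative assumptions, not the actual SC3 configuration.
TIER1_SUBNETS = ["198.51.100.0/24", "203.0.113.0/24"]  # hypothetical Tier-1 subnets
GRIDFTP_CONTROL_PORT = 2811      # standard GridFTP control port
SRM_PORT = 8443                  # commonly used SRM port (assumption)
DATA_PORT_RANGE = "20000:25000"  # assumed GLOBUS_TCP_PORT_RANGE

def firewall_rules():
    rules = []
    for net in TIER1_SUBNETS:
        for port in (GRIDFTP_CONTROL_PORT, SRM_PORT, DATA_PORT_RANGE):
            rules.append(f"iptables -A INPUT -s {net} -p tcp"
                         f" --dport {port} -j ACCEPT")
    # Drop any other new inbound TCP connection (sketch only; a real
    # policy would also allow management and monitoring traffic).
    rules.append("iptables -A INPUT -p tcp --syn -j DROP")
    return rules

for rule in firewall_rules():
    print(rule)
```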

Hardware – layout
[Layout diagram not reproduced in the transcript.]

Software – July 2005
- Setup phase with throughput test
- Data from local disks
- SRM + GridFTP for transfers (see the sketch below)
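As a minimal sketch of what a single throughput-phase transfer looks like from the client side (not the SC3 transfer service itself), one disk-to-disk GridFTP copy can be driven with the Globus Toolkit's globus-url-copy client; the hostnames, paths and tuning values below are placeholders:

```python
# Sketch: drive one wide-area GridFTP transfer with globus-url-copy
# (Globus Toolkit client). Hostnames, paths and tuning values are
# illustrative placeholders, not SC3 production settings.
import subprocess

SRC = "gsiftp://oplapro01.cern.ch/data/sc3/file0001"      # hypothetical source
DST = "gsiftp://gridftp.tier1.example.org/sc3/file0001"   # hypothetical destination

cmd = [
    "globus-url-copy",
    "-p", "10",            # parallel TCP streams
    "-tcp-bs", "2097152",  # 2 MB TCP buffer for the long round-trip time
    SRC, DST,
]
subprocess.run(cmd, check=True)
```

Parallel streams and a large TCP buffer are the usual knobs for filling a high-latency WAN path; the actual values would be tuned per Tier-1 link.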

Software – September 2005
- Service phase – experiments' real data will be transferred
- Currently considering 4 different configuration scenarios:
  http://cern.ch/service-radiant/cern/plans-for-SC3.html

Software – scenario 1: GridFTP/SRM servers
Connection process:
1. Client will connect to any node running SRM.
2. SRM server will contact one of the experiments' stagers to enquire about the location of the data and send it back to the client.
3. Client will connect to another GridFTP server from the same DNS load-balancing alias and ask for the data (alias resolution sketched below).
4. GridFTP server will pull data from the disk server and serve it back to the client.
Pros:
- GridFTP/SRM separated from the rest of CASTOR services
- Easy to expand by adding more GridFTP/SRM nodes
Cons:
- Current SRM limitation – bulk transfers served by the same GridFTP node
- Limitation on the number of files with the current CASTOR stager
- Inefficient data transfers (twice via the same node)
[Diagram: client, GridFTP/SRM servers, CASTOR stagers, disk servers]
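A minimal sketch of the DNS load-balancing step in this scenario: the client resolves a (hypothetical) GridFTP/SRM alias and ends up on one of the front-end nodes. In practice the rotation is done by the load-balanced alias itself; the explicit random choice here is only for illustration:

```python
# Sketch of the scenario 1 client step that picks a GridFTP/SRM front-end
# behind a DNS load-balanced alias. The alias name is a hypothetical
# placeholder, not a real SC3 service name.
import random
import socket

GRIDFTP_ALIAS = "sc3-gridftp.cern.ch"   # hypothetical load-balanced alias

def pick_front_end(alias: str) -> str:
    # gethostbyname_ex() returns (hostname, aliaslist, ipaddrlist);
    # the load-balanced alias resolves to several A records.
    _, _, addresses = socket.gethostbyname_ex(alias)
    return random.choice(addresses)

print("Fetching data via GridFTP node", pick_front_end(GRIDFTP_ALIAS))
```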

Software – scenario 2: disk servers running GridFTP + SRM
Connection process:
1. Client at Tier-1 will connect via SRM to a disk server at CERN.
2. SRM will ask the stager about the location of the data and send this information back to the client.
3. Client will connect to a disk server running the GridFTP daemon (using a DNS load-balancing alias).
4. In the ideal case, when the data is available locally on that very same disk server, it would be sent back to the client via GridFTP.
5. If the data is not local, the GridFTP daemon running on the disk server would pull it from the other disk server that has it and transfer it to the client.
Pros:
- Data travels to the client directly from a disk server (ideal case)
- Setup will easily support 1 GB/s or more
Cons:
- Currently GridFTP cannot deal with the locality of the data (2006?) – until then data would have to be transferred between disk servers (see the sketch below)
- GridFTP is a heavy process to run on already loaded disk servers
- All disk servers would need to have their network settings changed (major operation)
- Limitation on the number of files with the current CASTOR stager
[Diagram: client, CASTOR stagers, disk servers running GridFTP + SRM]
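A minimal sketch, with hypothetical paths and helper names, of the locality decision the GridFTP daemon on a disk server would face in this scenario; the non-local branch is exactly the extra disk-server-to-disk-server copy listed under the cons:

```python
# Sketch of the locality decision scenario 2 depends on: serve the file
# directly if it already sits on this disk server, otherwise pull it from
# the disk server that owns it first. Paths and the peer-copy helper are
# hypothetical placeholders.
import os

LOCAL_POOL = "/srv/castor/pool"   # hypothetical local disk-pool path

def path_for_gridftp(castor_path: str, owning_server: str) -> str:
    """Return a local path that the GridFTP daemon can stream to the client."""
    local_copy = os.path.join(LOCAL_POOL, os.path.basename(castor_path))
    if os.path.exists(local_copy):
        return local_copy                                   # ideal case: data is local
    copy_from_peer(owning_server, castor_path, local_copy)  # hypothetical helper
    return local_copy

def copy_from_peer(server: str, remote_path: str, local_path: str) -> None:
    # Placeholder for the internal disk-server to disk-server copy that
    # scenario 2 would like to avoid; details are out of scope here.
    raise NotImplementedError
```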

Software – scenario 3: disk servers running GridFTP
Connection process:
1. Client at Tier-1 will contact one of the front-end SRM servers.
2. SRM will ask the local stager about the location of the data.
3. If the data is not available on any of the attached disk servers, the stager would stage it in from tape.
4. SRM will send the information about the location of the data back to the client.
5. Client will connect to the given disk server and fetch the data over GridFTP.
Pros:
- Elegant separation of services = low interference with the rest of CASTOR
- Easily expandable by adding more nodes
- Should support rates of 1 GB/s or more
Cons:
- Data would have to be migrated to tape first = impossible to read data generated on the fly
- (Pre-)staging of existing data required
- SRM would have to be modified to return a TURL with the disk server name (see the sketch below)
- Limitation on the number of files with the current CASTOR stager
- Throwaway solution in general
[Diagram: client, front-end SRM servers, disk servers running GridFTP, CASTOR]
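One of the cons above is that SRM would need to return a TURL naming the disk server that holds the staged copy. A minimal sketch of that SURL-to-TURL mapping, with hypothetical stager helpers and host names (not the actual SRM or CASTOR interfaces):

```python
# Sketch of the SRM-side change scenario 3 calls for: after the local
# stager has the file on disk, return a TURL naming the specific disk
# server. The stager-query helpers and host names are hypothetical.

def build_turl(surl: str) -> str:
    """Map a site URL (SURL) to a transfer URL (TURL) on a disk server."""
    castor_path = surl.split("?SFN=")[-1]             # e.g. /castor/cern.ch/sc3/file0001
    disk_server, on_disk = query_stager(castor_path)  # hypothetical stager lookup
    if not on_disk:
        disk_server = stage_in_from_tape(castor_path) # hypothetical tape recall
    return f"gsiftp://{disk_server}:2811{castor_path}"

def query_stager(path: str):
    # Placeholder: ask the local stager which disk server holds the file
    # and whether it is already staged in.
    return "lxfs6001.cern.ch", True                   # hypothetical answer

def stage_in_from_tape(path: str) -> str:
    # Placeholder: recall the file from tape onto an attached disk server.
    raise NotImplementedError

print(build_turl("srm://sc3-srm.cern.ch:8443/?SFN=/castor/cern.ch/sc3/file0001"))
```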

Software – scenario 4: GridFTP/SRM disk pool
Connection process (sketched below):
1. Client at Tier-1 will request a file from one of the externally visible front-end disk servers running SRM and GridFTP.
2. If the data is present in the disk pool, its location is forwarded to the client, which will then fetch the data via GridFTP from the given disk server.
3. If the data has to be fetched from another disk server (inside CERN), the new CASTOR stager will replicate it to the GridFTP disk pool of externally visible disk servers.
4. Client will be informed about the location of the data and fetch it from the appropriate disk server via GridFTP.
Pros:
- No limitation on the number of files in the CASTOR catalog
- Easily expandable by adding more nodes to the GridFTP disk pool
- Should support rates of 1 GB/s or more
- Few nodes exposed externally
- Separation of WAN services from CERN-internal CASTOR operations
- Closest to the real LHC scenario
Cons:
- Requires a lot of new software to be in place within < 2 months
- In the worst case, every file would have to be replicated between 2 disk servers = load on 2 nodes
[Diagram: client, externally visible GridFTP/SRM disk pool, new CASTOR stager, internal disk servers]
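A minimal sketch of the request flow in this scenario, with hypothetical file names, aliases and stager calls: files already in the externally visible pool are served directly, and the new CASTOR stager replicates anything else into the pool before the client fetches it:

```python
# Sketch of the scenario 4 request flow: files already in the externally
# visible GridFTP/SRM pool are served directly, anything else is first
# replicated there by the new CASTOR stager. All names are hypothetical.

EXTERNAL_POOL = {"file0001", "file0007"}   # files currently in the WAN-facing pool

def locate_for_client(filename: str) -> str:
    if filename not in EXTERNAL_POOL:
        # Worst case noted above: the file gets replicated between two
        # disk servers, putting load on both nodes.
        replicate_to_external_pool(filename)   # hypothetical stager call
        EXTERNAL_POOL.add(filename)
    # Hypothetical load-balanced alias for the external disk pool.
    return f"gsiftp://sc3-pool.cern.ch:2811/pool/{filename}"

def replicate_to_external_pool(filename: str) -> None:
    # Placeholder for the internal copy orchestrated by the new CASTOR stager.
    pass

print(locate_for_client("file0042"))
```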

Conclusion
Completed:
- Network setup is ready
- Hardware configuration of machines
- Nodes installed with Scientific Linux CERN 3.0.4
- IA32 nodes fully Quattorized
In progress:
- Quattorize IA64 nodes
- Finalize GridFTP/SRM/CASTOR stager configuration
- Deployment of the new CASTOR stager
- SRM development
To do:
- Decide which option we take and implement it

Thank you
E-mail: Vladimir.Bahyl@cern.ch