USATLAS SC4

2 [Diagram: networks 130.199.48.0 and 130.199.185.0] The same host name for a dual-NIC dCache door resolves to different IP addresses depending on which DNS server is queried.
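
The split DNS behaviour described on this slide can be sketched as two lookup tables, one per DNS server. This is only an illustration: the door host name, the host parts of the addresses (.10) and the /24 prefix lengths are assumptions, and the external network 192.12.15.0 is taken from the meeting notes later in this deck.

```python
import ipaddress

# Hypothetical door host name, for illustration only.
DOOR_NAME = "dcache-door.example.org"

# Networks named in the slides. The /24 prefix lengths are assumed.
NETWORKS = {
    "external": ipaddress.ip_network("192.12.15.0/24"),
    "internal": ipaddress.ip_network("130.199.185.0/24"),
}

# One zone per DNS server: the same name maps to a different address.
# The .10 host addresses are made up.
ZONES = {
    "external": {DOOR_NAME: "192.12.15.10"},
    "internal": {DOOR_NAME: "130.199.185.10"},
}


def resolve(name: str, dns_server: str) -> str:
    """Return the address that the given DNS server hands out for name."""
    return ZONES[dns_server][name]


def answer_is_in_expected_network(name: str, dns_server: str) -> bool:
    """Check that a DNS answer falls inside that server's own network."""
    address = ipaddress.ip_address(resolve(name, dns_server))
    return address in NETWORKS[dns_server]
```

A client that asks the internal DNS reaches the door on its internal interface, and a client that asks the external DNS reaches the same door on its external interface.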

4 BNL Tier 1 WAN Storage Interfaces and Logical View
[Diagram labels]
- WAN 2 x 10 Gb/s; LHC OPN VLAN; Tier 1 VLANs; 20 Gb/s logical connections
- HPSS Mass Storage System
- NSF RAID (20 TB)
- GridFTP (2 nodes / 0.8 TB local)
- HRM SRM (1 node)
- dCache SRM (1 node)
- GridFTP doors (4 nodes)
- Write pool (10 nodes / 2.1 TB RAID5)
- Read pool (314 nodes / 145 TB)
- Link speeds shown: 1 Gb/s, 2 x 1 Gb/s, 4 x 1 Gb/s, 5 x 1 Gb/s, N x 1 Gb/s, 20 Gb/s

5 SC4 Throughput Phase

6 SC4 Throughput Phase Summary
- All data transfers bypassed the BNL firewall for high performance.
- BNL achieved or exceeded the BNL USATLAS Tier 2 MoU target for data transfer, making it one of the best WLCG Tier 1 sites.
- We gained experience serving USATLAS production and the Service Challenge simultaneously on the same dCache system. BNL is the only Tier 1 site doing this.
- Identified several performance bottlenecks in the USATLAS data management and data transfer stack (Panda, DQ2, FTS, dCache, network) that can affect both SC4 and USATLAS production.
- Fixed the dCache bottleneck by separating core services onto multiple high-performance hosts, dedicating resources to the different ATLAS data transfer activities, and tuning memory, file system and database.
- Evaluated the new dCache release.

7 SC4 Service Phase (All ATLAS Tier 1 Sites)

8 SC4 Service Phase (DQ2 monitoring)

9 SC4 Service Phase Summary
- DQ2 coordinated the data transfers.
- BNL provided tape storage for RAW data export and disk-only storage for ESD and AOD.
- dCache was significantly improved compared with the throughput phase, thanks to the lessons learned in April. It can easily handle the data transfer requirements of Panda OSG production and SC4 ATLAS Tier 0 export.
- The data transfer channels from the other Tier 1 sites to BNL were verified by Hiro Ito (BNL DDM operations), except for CNAF, which was upgrading its SRM.
- The ATLAS DDM-coordinated data transfers between the USATLAS Tier 2s and Tier 1 are well ahead of schedule, thanks to OSG Panda production.
- We verified the integrated data flow of ATLAS AOD export from Tier 0 to BNL, and then on to a US Midwest Tier 2 site.
- The data flow between BNL and CERN uses LHCOPN, which provides a 10 Gbps Layer 2 network connection between NYC and CERN. STARLight is not involved.

11 Meeting Notes
- Use dual-homed dCache doors.
  - The external interfaces of the doors are in 192.12.15.0.
  - The internal interfaces of the doors are in 130.199.185.0.
  - Data flow (in and out) will always go through the doors.
  - Use external and internal DNS to resolve the same door host name to the external or internal IP address, depending on which DNS is used.
- Bring the routing for 130.199.185.0 and 130.199.48.0/23 back to USATLAS SW7.
- Request an ACL for VLAN 315(?), where 192.12.15.0 resides.
  - One end: LHC OPN address blocks or the 3+2 Tier 2s.
  - The other end: 192.12.15.0.
  - What about other T3 sites contacting the external interface of the dCache doors? Do they need to go through the firewall or not?
- Two types of storage (durable and permanent).
  - When we receive ESD2, ESD1 will be discarded, so we do not need to save ESD to HPSS. If we need it, we can get it from Tier 0 and other Tier 1 sites.
  - RAW, our fraction of ESD, AOD, and Tier 2 simulation results go to permanent storage, which has a tape backend.
  - Other ESD and AOD go to durable storage, which is not necessarily backed up by the tape system.
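
The durable/permanent split in these notes can be written down as a small rule. The sketch below is one reading of the notes, not an actual dCache or DQ2 configuration: it assumes that "our fraction" covers both ESD and AOD, and the function name, the "SIMULATION" label and the `in_bnl_fraction` flag are hypothetical.

```python
def storage_class(data_type: str, in_bnl_fraction: bool = False) -> str:
    """Pick 'permanent' (tape backend) or 'durable' (disk, no guaranteed tape copy).

    Assumed reading of the meeting notes:
      - RAW and Tier 2 simulation results are always permanent.
      - ESD and AOD are permanent only for BNL's own fraction.
      - All other ESD and AOD go to durable storage.
    """
    data_type = data_type.upper()
    if data_type in ("RAW", "SIMULATION"):
        return "permanent"
    if data_type in ("ESD", "AOD"):
        return "permanent" if in_bnl_fraction else "durable"
    raise ValueError(f"unknown data type: {data_type}")
```

For example, `storage_class("ESD", in_bnl_fraction=True)` gives "permanent", while an ESD replica outside BNL's fraction lands in durable storage and can be re-fetched from another site if lost.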

12 BNL SC4 Plans
- Can VLAN 315 send network traffic?
- FTS and LFC will be set up.
- LCG 2.7.0
  - VO box: ATLAS DQ2 is also installed on top of it (done).
  - BDII provides static and dynamic monitoring information (static setup?).
  - R-GMA provides traffic monitoring from Tier 1 to Tier 2 (planned to be available before the SC4 service phase).
  - The CE is based on the BNL Condor system (planned to be ready before the SC4 service phase in June).
  - lcg-utils (done).
- dCache preparation (durable, permanent, information publishing).
  - Permanent: the system manages the cache; tape copy; access is sometimes slow.
  - Durable: the user (VO) manages the cache; no tape copy; access is fast.

13 Publish Information for BNL dCache
- The list of transfer protocols per SE is available from the information system.
- SRM knows what it supports and can inform the client.
- FTS channel information.
- LFC information.

    dn: GlueSALocalID=dteam-durable,GlueSEUniqueID=dcache.my_domain,...
    [...]
    GlueSARoot: dteam:/pnfs/my_domain/durable-path/dteam
    GlueSAPath: /pnfs/my_domain/durable-path/dteam
    GlueSAType: durable
    [...]
    GlueChunkKey: GlueSEUniqueID=dcache.my_domain
    [...]

    dn: GlueSALocalID=dteam-permanent,GlueSEUniqueID=dcache.my_domain,...
    [...]
    GlueSARoot: dteam:/pnfs/my_domain/permanent-path/dteam
    GlueSAPath: /pnfs/my_domain/permanent-path/dteam
    GlueSAType: permanent
    [...]
    GlueChunkKey: GlueSEUniqueID=dcache.my_domain
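
To check which path is published as durable and which as permanent, the entries on this slide can be read into dictionaries. The parser below is a minimal sketch written for this slide's text only; it is not an LDAP or LDIF library, the function name is hypothetical, and the elided parts marked `[...]` and `...` are kept as they are.

```python
SAMPLE = """\
dn: GlueSALocalID=dteam-durable,GlueSEUniqueID=dcache.my_domain,...
[...]
GlueSARoot: dteam:/pnfs/my_domain/durable-path/dteam
GlueSAPath: /pnfs/my_domain/durable-path/dteam
GlueSAType: durable
[...]
GlueChunkKey: GlueSEUniqueID=dcache.my_domain
[...]
dn: GlueSALocalID=dteam-permanent,GlueSEUniqueID=dcache.my_domain,...
[...]
GlueSARoot: dteam:/pnfs/my_domain/permanent-path/dteam
GlueSAPath: /pnfs/my_domain/permanent-path/dteam
GlueSAType: permanent
[...]
GlueChunkKey: GlueSEUniqueID=dcache.my_domain
"""


def parse_entries(text: str) -> list:
    """Split the text into one dict per 'dn:' entry, skipping elided lines."""
    entries = []
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line == "[...]":
            continue
        key, _, value = line.partition(": ")
        if key == "dn":
            current = {"dn": value}
            entries.append(current)
        elif current is not None:
            current[key] = value
    return entries


def paths_by_type(entries: list) -> dict:
    """Map each GlueSAType to its published GlueSAPath."""
    return {entry["GlueSAType"]: entry["GlueSAPath"] for entry in entries}
```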

14 SC4 Pre-Production System
- The pre-production service will be used as soon as it is available, and its usage will not go away when SC4 starts. There may be periods when the pre-production service is not used extensively, but the goal from now on is to always develop against it.

15 SC4 April Throughput
- Need dCache!
- April 3rd (Monday) to April 13th (Thursday before Easter): sustain an average daily rate to each Tier 1 at or above the full nominal rate (200 MB/s).
- Continue to run at the same rates unattended over the Easter weekend (14-16 April).
- Tuesday April 18th to Monday April 24th: perform the tape tests at the rates in the table below (75 MB/s).
- From after the con-call on Monday April 24th until the end of the month, experiment-driven transfers can be scheduled (LFC will be needed by then for DQ2).
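
As a rough guide to what these targets mean in volume, the sketch below converts a sustained rate into data moved per day and per test window. It assumes decimal units (1 TB = 1,000,000 MB), a full 86,400-second day, and the year 2006 for the dates; the function names are illustrative.

```python
from datetime import date

SECONDS_PER_DAY = 86_400
MB_PER_TB = 1_000_000  # decimal units assumed


def tb_per_day(rate_mb_s: float) -> float:
    """Volume in TB moved in one day at a sustained rate in MB/s."""
    return rate_mb_s * SECONDS_PER_DAY / MB_PER_TB


def days_inclusive(first: date, last: date) -> int:
    """Number of calendar days in a window, counting both end dates."""
    return (last - first).days + 1


def window_volume_tb(rate_mb_s: float, first: date, last: date) -> float:
    """Total volume in TB for a window run at a constant sustained rate."""
    return tb_per_day(rate_mb_s) * days_inclusive(first, last)


# Disk throughput window (200 MB/s) and tape test window (75 MB/s).
disk_total = window_volume_tb(200, date(2006, 4, 3), date(2006, 4, 13))
tape_total = window_volume_tb(75, date(2006, 4, 18), date(2006, 4, 24))
```

Under these assumptions, 200 MB/s corresponds to about 17.3 TB per day, and 75 MB/s to about 6.5 TB per day.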

16 SC4 Tier 1 to Tier 1 Data Transfer (May)
- Within each VO, the details of the T1<->T1 transfers still need to be finalized. A "dTeam" phase should be foreseen to ensure that the basic infrastructure is set up. Similarly for T1->T2. A possible scenario follows:
  - We have to focus first on our two sister Tier 1 sites, IN2P3 and FZK.
  - All Tier 1s need to set up an FTS service and configure channels to enable transfers to and from all other Tier 1s.
  - dTeam transfers at 5 MB/s (10 MB/s?) need to be demonstrated between each T1 and all other T1s.
  - These tests would take place during May, after the April throughput tests and before the SC4 service phase begins in June.
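
Configuring channels to and from every other Tier 1 means one directed channel per ordered pair of sites, so the number of channels grows as n(n-1). The sketch below enumerates them for the four Tier 1 sites named in this deck; the site list is deliberately incomplete and the function names are illustrative, not FTS commands.

```python
from itertools import permutations

# Only the Tier 1 sites named in these slides; the full list is longer.
TIER1_SITES = ["BNL", "IN2P3", "FZK", "CNAF"]


def directed_channels(sites):
    """Return every (source, destination) pair with source != destination."""
    return list(permutations(sites, 2))


def outbound_rate_mb_s(sites, per_channel_mb_s):
    """Aggregate outbound rate for one site sending to all the others."""
    return (len(sites) - 1) * per_channel_mb_s


channels = directed_channels(TIER1_SITES)
```

With four sites this gives 12 directed channels, and a single site sending dTeam traffic at 5 MB/s to each of the other three would need 15 MB/s outbound.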

17 ATLAS Specific Plan
- Plans (ATLAS)
- Tier 2 Plans
- Tier 2 Workshop
- Background Information (Dario's slides)

18 Summary of requests from ATLAS
- March-April (pre-SC4): 3-4 weeks for internal Tier-0 tests (Phase 0)
- April-May (pre-SC4): tests of distributed operations on a “small” testbed (PPS)
- Last 3 weeks of June: Tier-0 test (Phase 1) with data distribution to Tier-1s (720 MB/s + full ESD to BNL), and AODs sent to (at least) a few Tier-2s
- 3 weeks in July: distributed processing tests (Part 1)
- 2 weeks in July-August: distributed analysis tests (Part 1)
- 3-4 weeks in September-October: Tier-0 test (Phase 2 of Part 1) with data to Tier-2s
- 3 weeks in October: distributed processing tests (Part 2)
- 3-4 weeks in November: distributed analysis tests (Part 2)

19 Tier 2 Plans
- Details of Tier 2 involvement are still being planned.
- Tier 2 dCache: dCache needs to be stable and operational at one or all of the Midwest, Southwest and Northwest sites (first week of June) in order to receive AODs at (at least) a few Tier-2s.
- All Tier 2 dCache instances should be up and in production in September.
  - Extend data distribution to all (most) Tier-2s
  - Use 3D tools to distribute calibration data
- Baseline client tools should be deployed at the Tier 2 centers.
- No services other than SRM and DQ2 are required at Tier 2.

20 WLCG Tier 2 Workshop
- https://twiki.cern.ch/twiki/bin/view/LCG/WorkshopAndTutorials
- http://indico.cern.ch/conferenceDisplay.py?confId=1148&view=egee_meeting&showDate=all&showSession=all&detailLevel=contribution
- From Monday 12 June 2006 (11:00) to Wednesday 14 June 2006 (18:00) at CERN (Council Chamber)
- Introduction to the activities of the four experiments
- MC simulation use cases
- An overview of calibration and alignment
- Analysis use cases
- Services required at / for Tier 2s (Grid, application)
- Support and operation issues
- Takes place in the middle of June.

ATLAS plans for 2006: Computing System Commissioning and Service Challenge 4 Dario Barberis CERN & Genoa University

22 Computing System Commissioning Goals
- The main aim of Computing System Commissioning is to test the software and computing infrastructure that we will need at the beginning of 2007:
  - Calibration and alignment procedures and conditions DB
  - Full trigger chain
  - Event reconstruction and data distribution
  - Distributed access to the data for analysis
- At the end (autumn-winter 2006) we will have a working and operational system, ready to take data with cosmic rays at increasing rates.

23 ATLAS Computing Model
- Tier-0:
  - Copy RAW data to Castor tape for archival
  - Copy RAW data to Tier-1s for storage and reprocessing
  - Run first-pass calibration/alignment (within 24 hrs)
  - Run first-pass reconstruction (within 48 hrs)
  - Distribute reconstruction output (ESDs, AODs & TAGS) to Tier-1s
- Tier-1s:
  - Store and take care of a fraction of RAW data
  - Run “slow” calibration/alignment procedures
  - Rerun reconstruction with better calib/align and/or algorithms
  - Distribute reconstruction output to Tier-2s
  - Keep current versions of ESDs and AODs on disk for analysis
- Tier-2s:
  - Run simulation
  - Keep current versions of AODs on disk for analysis

24 ATLAS Tier-0 Data Flow
[Diagram nodes: EF, Castor buffer, CPU farm, Tape, T1s]

Type   File size     File rate   Files/day   Throughput   Volume/day
RAW    1.6 GB/file   0.2 Hz      17K         320 MB/s     27 TB
ESD    0.5 GB/file   0.2 Hz      17K         100 MB/s     8 TB
AOD    10 MB/file    2 Hz        170K        20 MB/s      1.6 TB
AODm   500 MB/file   0.04 Hz     3.4K        20 MB/s      1.6 TB

Flow labels in the diagram: RAW + AOD; RAW + ESD (2x) + AODm (10x); RAW + ESD + AODm.
Aggregate rates in the diagram: 0.44 Hz, 37K f/day, 440 MB/s; 1 Hz, 85K f/day, 720 MB/s; 0.4 Hz, 190K f/day, 340 MB/s; 2.24 Hz, 170K f/day (temp), 20K f/day (perm), 140 MB/s.
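
The per-type numbers on this slide follow from file size times file rate, and the aggregate rates are sums over the types in each flow. The sketch below reproduces that arithmetic using the slide's file sizes and rates. The helper names are illustrative, and which diagram arrow each combination belongs to is an inference from the sums, not something the flattened slide text states.

```python
SECONDS_PER_DAY = 86_400

# Per-type values copied from the slide: (file size in MB, file rate in Hz).
STREAMS = {
    "RAW": (1600, 0.2),
    "ESD": (500, 0.2),
    "AOD": (10, 2.0),
    "AODm": (500, 0.04),
}


def throughput_mb_s(name: str, copies: int = 1) -> float:
    """Throughput of one stream: file size x file rate x number of copies."""
    size_mb, rate_hz = STREAMS[name]
    return size_mb * rate_hz * copies


def files_per_day(name: str, copies: int = 1) -> float:
    """Files per day for one stream, assuming a full 86,400-second day."""
    return STREAMS[name][1] * copies * SECONDS_PER_DAY


def aggregate(flow: dict) -> tuple:
    """flow maps stream name -> copies; returns (total Hz, total MB/s)."""
    total_hz = sum(STREAMS[name][1] * copies for name, copies in flow.items())
    total_mb_s = sum(throughput_mb_s(name, copies) for name, copies in flow.items())
    return total_hz, total_mb_s


# Combinations taken from the flow labels on the slide.
RAW_ESD_AODM = {"RAW": 1, "ESD": 1, "AODm": 1}
RAW_2ESD_10AODM = {"RAW": 1, "ESD": 2, "AODm": 10}
```

RAW + ESD + AODm sums to 0.44 Hz and 440 MB/s, and RAW + ESD (2x) + AODm (10x) sums to 1 Hz and 720 MB/s, matching two of the aggregate rates on the slide. The slide's per-day figures (for example 17K files or 27 TB) are rounded relative to what a full 86,400-second day gives.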

25 Recent Update for Tier 0 to Tier 1 Data Transfer

26 BNL Data Flow (2008, Based on 20%)
[Diagram nodes: Tier-0, CPU farm, other Tier-1s, Tier-2s, BNL disk buffer, BNL tape, disk storage. Flow boxes are listed in their order of appearance.]

Type                 File size     File rate   Files/day   Throughput   Volume/day
RAW                  1.6 GB/file   0.04 Hz     3.4K        64 MB/s      5.4 TB
ESD2                 0.5 GB/file   0.04 Hz     3.4K        20 MB/s      1.6 TB
AOD2                 10 MB/file    0.4 Hz      34K         4 MB/s       0.32 TB
AODm2                500 MB/file   0.004 Hz    0.34K       4 MB/s       0.32 TB
RAW + ESD2 + AODm2   -             0.088 Hz    7.48K       88 MB/s      7.32 TB
RAW                  1.6 GB/file   0.04 Hz     3.4K        64 MB/s      5.4 TB
AODm2                500 MB/file   0.008 Hz    0.68K       4 MB/s       0.32 TB
ESD2                 0.5 GB/file   0.04 Hz     3.4K        20 MB/s      1.6 TB
AOD2                 10 MB/file    0.4 Hz      34K         4 MB/s       0.32 TB
ESD2                 0.5 GB/file   0.02 Hz     1.7K        80 MB/s      0.8 TB
AODm2                500 MB/file   0.03 Hz     3.0K        16 MB/s      1.44 TB
ESD2                 0.5 GB/file   0.02 Hz     1.7K        20 MB/s      1.6 TB
AODm2                500 MB/file   0.036 Hz    3.1K        4*9 MB/s     1.44 TB
ESD1                 0.5 GB/file   0.2 Hz      17K         100 MB/s     8 TB
AODm1                500 MB/file   0.04 Hz     3.4K        20 MB/s      1.6 TB
AODm1                500 MB/file   0.04 Hz     3.4K        20 MB/s*3    1.6 TB
AODm2                500 MB/file   0.008 Hz    0.70K       4 MB/s*3     0.32 TB

Other labels: Plus simulation & analysis data flow; Real data storage, reprocessing and distribution; 234MB*n analysis.
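
The "Based on 20%" in the slide title can be checked by scaling the Tier-0 throughput figures from the ATLAS Tier-0 Data Flow slide by BNL's share. This is a sketch of that scaling only; the function names are illustrative, and decimal units and a full 86,400-second day are assumed.

```python
BNL_SHARE = 0.20
SECONDS_PER_DAY = 86_400
MB_PER_TB = 1_000_000  # decimal units assumed

# Tier-0 throughput per data type in MB/s, from the Tier-0 data flow slide.
TIER0_MB_S = {"RAW": 320.0, "ESD": 100.0, "AOD": 20.0, "AODm": 20.0}


def bnl_rate_mb_s(name: str, share: float = BNL_SHARE) -> float:
    """BNL's share of the Tier-0 rate for one data type."""
    return TIER0_MB_S[name] * share


def inbound_from_tier0_mb_s(types=("RAW", "ESD", "AODm")) -> float:
    """Sum of BNL's share over the types in the RAW + ESD2 + AODm2 flow."""
    return sum(bnl_rate_mb_s(name) for name in types)


def tb_per_day(rate_mb_s: float) -> float:
    """Daily volume in TB at a sustained rate in MB/s."""
    return rate_mb_s * SECONDS_PER_DAY / MB_PER_TB
```

The scaled values are 64 MB/s for RAW, 20 MB/s for ESD and 4 MB/s each for AOD and AODm, and the RAW + ESD + AODm sum is 88 MB/s, as on the slide. The slide's 7.32 TB/day is the sum of its rounded per-type volumes (5.4 + 1.6 + 0.32), which is slightly lower than 88 MB/s sustained over a full day.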

27 BNL to 3+2 Tier 2 (Estimation!)
- See https://uimon.cern.ch/twiki/bin/view/Atlas/Tier1DataFlow
- Tier 1 to Tier 2 transfers are likely to be very bursty and driven by analysis demands.
- Network bandwidth to each Tier 2 is expected to be a fraction of 10 Gbps (for UC, 30% of 10 Gbps is allocated; opportunistic usage may go up to 10 Gbps).
- We want to reach 100 MB/s for each of the 3+2 Tier 2 clusters.
- 300 MB/s to 500 MB/s in total to BNL.
- Tier 2 to Tier 1 transfers are almost entirely continuous simulation transfers.
- The aggregate input rate to the Tier 1 center is comparable to 20%-25% of the rate from Tier 0.
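
The totals on this slide are simple multiples of the per-cluster target, and the UC allocation can be put in the same units for comparison. The sketch below assumes decimal units (1 Gbps = 1000 Mbit/s, 8 bits per byte) and ignores protocol overhead; the function names are illustrative.

```python
PER_CLUSTER_TARGET_MB_S = 100


def gbps_to_mb_s(gbps: float) -> float:
    """Convert a link speed in Gbit/s to MB/s, ignoring protocol overhead."""
    return gbps * 1000 / 8


def aggregate_target_mb_s(clusters: int) -> int:
    """Total rate if every Tier 2 cluster reaches the per-cluster target."""
    return clusters * PER_CLUSTER_TARGET_MB_S


# 3 clusters is the lower bound of "3+2", 5 clusters the upper bound.
low_total = aggregate_target_mb_s(3)
high_total = aggregate_target_mb_s(5)

# UC: 30% of a 10 Gbps link, expressed in MB/s.
uc_allocated_mb_s = gbps_to_mb_s(10 * 0.30)
```

Under these assumptions the 30% allocation for UC is about 375 MB/s, comfortably above the 100 MB/s per-cluster target, and the 300 to 500 MB/s range corresponds to three to five clusters running at the target at the same time.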

28 BNL Data Flow (2008)
[Diagram labels]
- Nodes: Tier-0, Tier-1, BNL write buffer, T1, Tier-2s, BNL tape, read storage, CPU farm
- AODm2: 500 MB/file, 0.008 Hz, 0.68K f/day, 4 MB/s, 0.32 TB/day
- ESD2: 0.5 GB/file, 0.04 Hz, 3.4K f/day, 20 MB/s, 1.6 TB/day
- Per-type rates: ESD1 100MB; AODM1 20MB; RAW 64MB; ESD2 80MB (80% EST from T1s); AODM2 16MB; ESD2 20MB; AODM2 36MB
- Aggregate rates: 88MB (RAW, ESD, AOD); 350MB (including raw data); (Analysis AOD) 500MB; 200MB(?); (304MB*20%) ~ 60MB; Simu 60MB (Tier 2)

29 ATLAS SC4 Tests
- Complete Tier-0 test
  - Internal data transfer from “Event Filter” farm to Castor disk pool, Castor tape, CPU farm
  - Calibration loop and handling of conditions data
    - Including distribution of conditions data to Tier-1s (and Tier-2s)
  - Transfer of RAW, ESD, AOD and TAG data to Tier-1s
  - Transfer of AOD and TAG data to Tier-2s via Tier 1
  - Data and dataset registration in DB (add meta-data information to meta-data DB)
- Distributed production
  - Full simulation chain run at Tier-2s (and Tier-1s)
    - Data distribution to Tier-1s, other Tier-2s and CAF
  - Reprocessing raw data at Tier-1s
    - Data distribution to other Tier-1s, Tier-2s and CAF
- Distributed analysis
  - “Random” job submission accessing data at Tier-1s (some) and Tier-2s (mostly)
  - Tests of performance of job submission, distribution and output retrieval

30 ATLAS SC4 Plans (1)
- Tier-0 data flow tests:
  - Phase 0: 3-4 weeks in March-April for internal Tier-0 tests
  - Phase 1: last 3 weeks of June, with data distribution to Tier-1s
    - Run integrated data flow tests using the SC4 infrastructure for data distribution
    - Send AODs to (at least) a few Tier-2s
    - Automatic operation for O(1 week)
    - First version of shifter’s interface tools
    - Treatment of error conditions
  - Phase 2: 3-4 weeks in September-October
    - Extend data distribution to all (most) Tier-2s
    - Use 3D tools to distribute calibration data

31 ATLAS SC4 Plans (2)
- ATLAS includes continuous distributed simulation productions (Kaushik)
- SC4: distributed reprocessing tests:
  - Test of the computing model using the SC4 data management infrastructure
    - Needs file transfer capabilities between Tier-1s and back to CERN CAF
    - Also distribution of conditions data to Tier-1s (3D)
    - Storage management is also an issue
  - Could use 3 weeks in July and 3 weeks in October
- SC4: distributed simulation intensive tests:
  - Once reprocessing tests are OK, we can use the same infrastructure to implement our computing model for simulation productions
    - As they would use the same setup both from our ProdSys and the SC4 side
  - First separately, then concurrently

32 Overview of requirements for SC4
- SRM (“baseline version”) on all storage systems
- VO Box per Tier-1 and in Tier-0
- LFC server per Tier-1 and in Tier-0
- FTS server per Tier-1 and in Tier-0
- Permanent storage and durable storage
  - Separate SRM entry points for permanent and durable storage
  - Disk space is managed by DQ2
  - Counts as online (“disk”) data in the ATLAS Computing Model
- Ability to install FTS ATLAS VO agents on Tier-1 and Tier-0 VO Box
- Ability to deploy DQ2 services on VO Box as during SC3
- No new requirements on the Tier-2s besides SRM SE

33 Overview of FTS and VO Box
- Hence, an ATLAS VO Box will contain:
  - FTS ATLAS agents
  - The remaining DQ2 persistent services (less s/w than for SC3, as some functionality has been merged into FTS in the form of FTS VO agents)
- DQ2 site services will have associated SFTs for testing

34 ATLAS SC4 Requirement (PPS)
- Small testbed with (part of) CERN, a few Tier-1s and a few Tier-2s to test our distributed systems (ProdSys, DDM, DA) prior to deployment
  - It would allow testing new m/w features without disturbing other operations
  - We could also properly tune the operations on our side
- The aim is to reach the agreed scheduled time slots with an already tested system and use the available time for relevant scaling tests
- This setup would not interfere with concurrent large-scale tests or data transfers run by other experiments
