FAX Performance. TIM, Tokyo, May 2013. Ilija Vukotic.


1 FAX PERFORMANCE. TIM, Tokyo, May 2013

2 PERFORMANCE
TIM, Tokyo, May 2013. Ilija Vukotic, ivukotic@uchicago.edu
Metrics:
- Data coverage: better than 97%, more than 2 replicas
- Number of users: mostly UofC and Prague users
- Percentage of successful jobs: latest HC tests >99%
- Total amount of data delivered: ~2 PB/week
- Bandwidth usage
Source:
- Ganglia plots
- MonaLisa
- FAX Dashboard
- HC tests
- CostMatrix tests
- Special tests using dedicated resources

3 COST MATRIX
Rate in MB/s, one row per source.
Destination columns, in order: BNL-ATLAS, CERN-PROD, DESY-HH, INFN-ROMA1, LRZ-LMU, MWT2, RAL-LCG2, SWT2_CPB, UKI-LT2-QMUL, UKI-SCOTGRID-GLASGOW.
Empty cells were lost when the table was flattened, so a row with fewer than ten values cannot be assigned to columns unambiguously; values are listed in their original order.
AGLT2                 1.10 2.69 0.65 0.55 6.68 0.90 1.24 1.23
BNL-ATLAS             57.50 0.69
CERN-PROD             63.41 25.96 5.74 3.69 6.10 0.94 0.76 10.33 6.98
DESY-HH               3.73 46.52 1.47 3.91 0.51 4.56 5.02
IllinoisHEP           0.75 2.45 0.58 0.66 36.50 1.05 4.73 1.34
INFN-FRASCATI         1.44 0.72 4.00 0.77 0.62 0.95 0.79
INFN-NAPOLI-ATLAS     4.37 9.86 12.71 1.45 2.62 0.40 5.13
INFN-ROMA1            4.08 5.30 9.29 1.58 1.94 0.42 4.66 4.68
LRZ-LMU               3.96 11.95 2.13 39.97 8.31 0.43 0.69 8.80 5.82
MPPMU                 4.16 10.44 2.23 39.90 8.30 0.43 0.71 4.90 5.87
MWT2                  1.02 2.59 0.71 0.74 13.09 1.30 2.08 1.35
OU_OCHEP_SWT2         0.57 0.55 0.39 1.87 2.03 0.71 0.91
praguelcg2            2.58 2.96 1.48 2.14 2.49 0.44 0.53 3.95 1.76
RAL-LCG2              3.08 1.56 1.37 2.00 34.14 10.55 4.07
RU-Protvino-IHEP      0.94 1.29 0.85 0.95 1.03 0.51 0.45 1.69 1.54
SWT2_CPB              0.81 3.73 0.67 0.74 27.28 43.29 5.86 1.50
UKI-LT2-QMUL          3.02 2.44 1.26 1.46 1.38 2.19 0.42 1.60 3.46
UKI-SCOTGRID-ECDF     1.94 3.53 1.28 1.12 4.48 0.69 0.39 2.08 4.22
UKI-SCOTGRID-GLASGOW  7.63 4.55 5.12 1.96 2.01 2.86 1.09 0.53 7.77 10.34
UKI-SOUTHGRID-OX-HEP  3.19 3.80 1.52 3.82 4.32 5.46 3.29
WT2                   15.68 0.76 3.22 0.61 0.60 10.51 0.82 3.53 1.34
A place to get an idea of the rate a single job can expect to see. Are our pipes really this full? Let's see other sources of information.

4 COST MATRIX VS. PERFSONAR
Comparison of just one link in one direction: source AGLT2, destination MWT2. perfSONAR info at 4 h intervals.
Could it be that the worker-node links are saturating?

5 CLOGGING THE PIPES
(Diagram labels: MWT2, SLAC, AGLT2, BNL, CERN)
- Using HC-submitted jobs sent to 4 ANALY queues: AGLT2, BNL, MWT2, SLAC
- Each site runs 300 jobs of two types, 50 in parallel:
    - xrdcp of 3 files randomly chosen from the SMWZ datasets prepared for FDR, taken from the other sites
    - reading 10% of events from 3 files randomly chosen from the FDR SMWZ datasets, taken from the other sites
- Each job uploads time to finish, events/s, MB/s and pandaid, so jobs can be investigated
- All jobs submitted through the FDR web interface: http://ivukotic.web.cern.ch/ivukotic/FDR/index.asp
- All in parallel to other HC stress tests
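The copy-type job described above could look roughly like the following. This is only a sketch: the actual HC job scripts are not shown in the slides. The candidate URL list, the per-file sizes, and the helper names `xrdcp_copy` and `run_copy_job` are all hypothetical. Only the basic `xrdcp <source> <destination>` invocation is taken from the XRootD client.

```python
import random
import subprocess
import tempfile
import time
from pathlib import Path


def xrdcp_copy(url, dest):
    """Copy one remote file with the xrdcp client (must be on PATH)."""
    subprocess.run(["xrdcp", url, str(dest)], check=True)


def run_copy_job(candidates, sizes_mb, copy=xrdcp_copy, n_files=3, rng=random):
    """Copy n_files randomly chosen files and report the metrics a job uploads.

    candidates: list of remote file URLs (hypothetical input).
    sizes_mb:   dict mapping each URL to its size in MB (hypothetical input).
    """
    chosen = rng.sample(candidates, n_files)
    with tempfile.TemporaryDirectory() as workdir:
        start = time.perf_counter()
        for i, url in enumerate(chosen):
            copy(url, Path(workdir) / f"file_{i}.root")
        elapsed = time.perf_counter() - start
    total_mb = sum(sizes_mb[url] for url in chosen)
    rate = total_mb / elapsed if elapsed > 0 else float("inf")
    return {"files": chosen, "seconds": elapsed, "MB/s": rate}
```

Passing the copy function as a parameter keeps the timing logic testable on a machine without an XRootD client.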

6 TESTS
0.17% failure rate!

7 COPY
- Clearly not limited by WN links
- Assuming just 30 simultaneous jobs, worst-case delivery rates are:
    - BNL to CERN: 75 MB/s
    - CERN to AGLT2: 170 MB/s
    - MWT2 to AGLT2: 100 MB/s
    - AGLT2 to CERN: 90 MB/s
    - SLAC to BNL: 300 MB/s
- Average WAN access ~300 MB/s
xrdcp rate per job in MB/s (rows: source, columns: destination, "-": no value)
source      BNL-ATLAS  CERN-PROD  MWT2   AGLT2  SLAC
BNL-ATLAS   86.09      2.51       13.41  8.97   -
CERN-PROD   11.53      76.38      10.56  5.76   -
MWT2        11.1       1.91       27.08  3.32   -
AGLT2       22.41      2.92       3.08   65.49  -
SLAC        9.87       2.06       8.76   6.38   82.71
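The worst-case delivery rates quoted in the bullets appear to be the per-job xrdcp rate multiplied by the 30 simultaneous jobs. The sketch below reproduces that arithmetic. Pairing each bullet with a table cell is an assumption about the flattened table's layout, chosen because those cells reproduce the quoted numbers. The function name `aggregate_rate` is made up for illustration.

```python
# Per-job xrdcp rates in MB/s, keyed by (source, destination).
# The pairing of each value with a link is an assumption (see text).
PER_JOB_RATE = {
    ("BNL-ATLAS", "CERN-PROD"): 2.51,
    ("CERN-PROD", "AGLT2"): 5.76,
    ("MWT2", "AGLT2"): 3.32,
    ("AGLT2", "CERN-PROD"): 2.92,
    ("SLAC", "BNL-ATLAS"): 9.87,
}


def aggregate_rate(per_job_mb_s, n_jobs=30):
    """Total delivery rate if n_jobs each sustain the per-job rate."""
    return per_job_mb_s * n_jobs


for (src, dst), rate in PER_JOB_RATE.items():
    print(f"{src} -> {dst}: {aggregate_rate(rate):.0f} MB/s")
```

The results land within a few MB/s of the rounded values on the slide, for example 2.51 MB/s x 30 jobs = 75.3 MB/s for BNL to CERN.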

8 READ
- Jobs were reading 10% of events using a 30 MB TTreeCache (TTC)
- 100% of the data are transferred and decompressed.
- ROOT can decompress our D3PD at ~20 MB/s
- Rates are the same as for xrdcp, except for local access.
- Over WAN one should expect at least 50% of the CPU efficiency of local access.
- Fewer than 100 simultaneous standard analysis jobs will saturate a 10 Gb/s WAN link.
- FAX needs to be used judiciously; it can easily overwhelm weaker links
Read rate per job in events/s (rows: source, columns: destination, "-": no value)
source      BNL-ATLAS  CERN-PROD  MWT2    AGLT2   SLAC
BNL-ATLAS   163.7      19.61      48.24   91.99   -
CERN-PROD   62.07      224.23     46.63   63.35   -
MWT2        116.15     11.69      141.07  34.94   -
AGLT2       75.41      17.42      -       179.38  -
SLAC        48.67      8.8        34.92   43.08   92.46
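The link-saturation claim can be checked with simple arithmetic. The sketch below assumes decimal units (10 Gb/s = 1250 MB/s) and uses the ~20 MB/s decompression figure as a stand-in for the per-job network rate. That stand-in is an assumption, since the slides do not state the per-job WAN rate for analysis jobs. The helper name `jobs_to_saturate` is hypothetical.

```python
import math


def jobs_to_saturate(link_gbit_s, per_job_mb_s):
    """Smallest number of jobs whose combined rate fills the link."""
    link_mb_s = link_gbit_s * 1000 / 8  # decimal units: 1 Gb/s = 125 MB/s
    return math.ceil(link_mb_s / per_job_mb_s)


# Assumed per-job rate of 20 MB/s on a 10 Gb/s link.
print(jobs_to_saturate(10, 20))  # -> 63
```

At the assumed 20 MB/s per job the link fills at 63 jobs, and any per-job rate above 12.5 MB/s gives fewer than 100, consistent with the bullet above.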

9 MONA LISA

10 WAYS AHEAD
- Increase coverage, add redundancy, increase total bandwidth
    - Enlargement
- Increase performance, reduce bandwidth needs
    - Caching
    - Cost matrix: smart FAX
    - Smart network: bandwidth requests, QoS assurance
- Improve adoption rate
    - Presenting, teaching, preaching
    - New services
- Improve satisfaction
    - FAX tuning
    - Application tuning
    - New services
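The slides only name "Cost matrix, smart FAX" without describing it. One possible reading is that a job's input is fetched from whichever replica holder has the best measured rate to the job's site. The sketch below illustrates that idea under this assumption. The function `best_source` is hypothetical, and the sample rates are the per-job xrdcp values towards BNL-ATLAS from the COPY slide.

```python
# cost[source][destination] = measured per-job rate in MB/s.
# Sample values: xrdcp rates towards BNL-ATLAS from the COPY slide.
COST_MATRIX = {
    "CERN-PROD": {"BNL-ATLAS": 11.53},
    "MWT2": {"BNL-ATLAS": 11.1},
    "AGLT2": {"BNL-ATLAS": 22.41},
    "SLAC": {"BNL-ATLAS": 9.87},
}


def best_source(cost, destination, holders):
    """Return the replica holder with the highest measured rate, or None.

    holders: sites known to hold a replica of the requested file.
    Sites without a measurement for this destination are skipped.
    """
    rates = {
        site: cost[site][destination]
        for site in holders
        if destination in cost.get(site, {})
    }
    if not rates:
        return None
    return max(rates, key=rates.get)


print(best_source(COST_MATRIX, "BNL-ATLAS", ["CERN-PROD", "MWT2", "SLAC"]))
```

A production selector would presumably also weigh measurement age and current load. Neither is covered in the slides, so both are left out here.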

