CERN LCG Middleware Certification and Support Maarten Litmaath CERN IT/GD GridPP Workshop 2-4 June 2004.


1 CERN LCG Middleware Certification and Support Maarten Litmaath CERN IT/GD GridPP Workshop 2-4 June 2004

2 CERN Maarten Litmaath, GridPP meeting, 2004/06/03
LCG Middleware
LCG middleware has various origins
  – VDT: Globus, Condor(-G), MyProxy, …
  – EDG: WP1, WP2, …
  – DataTAG: GLUE, GridICE, …
  – LCG: GFAL, lcg-BDII, exp. SW tools, better Globus job managers, improved Replica Manager tools, …
  – DESY/FNAL: dCache
  – …
Maintenance and development
  – GD group cannot maintain lines of EDG code!
  – Ideally just do certification and testing of external middleware
  – Support agreements with WP1, VDT
  – GD maintains/develops critical components that currently:
      – Are missing (e.g. GFAL)
      – Have performance issues (e.g. Replica Manager, BDII)
  – GD provides fixes or workarounds for serious bugs only
EGEE/ARDA may completely replace many components

3 Where is the code?
CERN central CVS system autobuild (see next page)
  – EDG/LCG code
  – LCG configuration:
Everything else is external
  – Simplifies build
  – Complicates debugging
      – Need at least the sources
All RPMs under /afs/cern.ch/project/gd/RpmDir
LCG code guidelines adapted from EDG
  – http://grid-deployment.web.cern.ch/grid-deployment/cgi-bin/index.cgi
  – Documentation menu

4 The Builds
EDG autobuild system has been ported to LCG
  – http://lxshare0297.cern.ch/LCG/autobuild/
  – Allows nightly build of latest compliant CVS tag per package
  – Build-on-demand tag triggers immediate build
Currently only RH 7.3 supported
  – Porting to RH Enterprise Linux underway in GD group
      – WN being tested
  – Collaboration with CERN OpenLab to port code + build recipes to IA-64
      – CE + WN already included in EIS testbed
  – Other platforms being considered:
      – Fedora
      – RH 9
      – RH 6.2
      – Solaris
      – IRIX
      – …
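
The nightly-build rule above ("latest compliant CVS tag per package") can be illustrated with a minimal sketch. The slides do not state the tag naming convention, so the pattern `<package>_R_<major>_<minor>_<patch>` and the function name `latest_compliant_tags` are assumptions made only for illustration; the real autobuild system may work differently.

```python
import re

# Hypothetical tag convention (NOT taken from the slides): package_R_1_2_3
TAG_PATTERN = re.compile(
    r"^(?P<pkg>[a-z][a-z0-9\-]*)_R_(?P<major>\d+)_(?P<minor>\d+)_(?P<patch>\d+)$"
)


def latest_compliant_tags(tags):
    """Return {package: newest compliant tag}; non-compliant tags are ignored."""
    best = {}
    for tag in tags:
        match = TAG_PATTERN.match(tag)
        if match is None:
            continue  # tag does not follow the assumed convention, skip it
        # Compare versions numerically so that 1_10_0 sorts after 1_2_3.
        version = tuple(int(match.group(k)) for k in ("major", "minor", "patch"))
        pkg = match.group("pkg")
        if pkg not in best or version > best[pkg][0]:
            best[pkg] = (version, tag)
    return {pkg: tag for pkg, (_version, tag) in best.items()}


if __name__ == "__main__":
    example_tags = [
        "lcg-gfal_R_1_2_3",
        "lcg-gfal_R_1_10_0",
        "wip-branch",          # ignored: not a compliant tag
        "lcg-bdii_R_2_0_1",
    ]
    print(latest_compliant_tags(example_tags))
```

Comparing versions as integer tuples rather than strings avoids picking 1_2_3 over 1_10_0, which a plain string sort would do.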

5 The Certification
Resulting middleware must be integrated and then certified on all supported platforms
  – Also verify interoperability of all platforms
      – Complicates certification exponentially
  – Goal is production quality:
      – Stability, robustness, performance, scalability
      – Easy configuration, operation, maintenance
  – A lot of effort has been going into debugging
      – Get feedback from production system (e.g. rollout mailing list)
      – Send feedback to developers, but apply in-house patches in the meantime
      – See next talk by David Smith
Current big certification testbed shown on next page
  – Only RH 7.3 for now
  – Remote sites to be added (again)
      – Madison (VDT), Taipei, Budapest, …
  – Simulates multiple realistic configurations
      – Can test multiple platforms at the same time

6 Certification & Testing Testbed
[Diagram: hosts and roles of the certification testbed, with clusters labelled Cluster_1 to Cluster_5]
RB: h234 RB_a, h240 RB_b, h239 RB_3
BDII: h243 BDII_a, h281 BDII_b, h284 BDII_3
CE: h235 CE_a, h241 CE_b, h277 CE_2_a, h290 CE_3_a, h286 CE_4, h237 CE_5 (Condor), lxs5243 CE_6 (LSF)
SE: h236 SE_a, h278 SE_2_a, h291 SE_3_a, h287 SE_4, h247 SE_2_b (dcache), h282 SE_c (dcache), h303 SE_d (Castor), h229 SE_3_b (Castor)
WN: h238 WN_a1, h271 WN_a2, h279 WN_a3, h244 WN_b1, h245 WN_b2, h289 WN_b3, h280 WN_2_a1, h272 WN_2_a2, h273 WN_2_a3, h274 WN_2_a4, h300 WN_3_a1, h288 WN_3_a2, h230 WN_3_a3, h296 WN_4_a1, h294 WN_4_a2, h270 WN_5_1, h206 WN_5_2
UI: h275 UI_1, h276 UI_3, h285 UI_4
Other services: rlscert02 RLS_Oracle, h246 MyProxy, h248 pool (dcache)
Hosts shown without a role label: lxs5238, lxs5239, lxs5240, lxs5241, lxs5242
Annotations: "No home sharing" (twice), "share local /home"

7 The Tests
Feature testing
  – Workload Management, Data Management, Information System, …
      – Job distribution with and w/o data constraints, resource saturation, proxy renewal
      – Data access, replica services
Different architectures/configurations
  – Try to simulate the production system to some extent
Stress tests
  – Performance should degrade gracefully, no crashes
Explicit error injection
  – Study system reaction
Security
  – One should not be able to bypass it
Experiments integration testing done by GD/EIS on their testbed
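
The idea of explicit error injection ("study system reaction") can be sketched in a few lines. This is a generic illustration, not the LCG test suite: `inject_errors`, `submit_with_retry` and the fake submission function are hypothetical names, and the retry policy stands in for whatever reaction is actually under study.

```python
class InjectedError(RuntimeError):
    """Marks failures that the test harness introduced on purpose."""


def inject_errors(func, fail_on_calls):
    """Wrap func so that the listed call numbers (1-based) raise InjectedError."""
    fail_on = set(fail_on_calls)
    calls = {"n": 0}

    def wrapper(*args, **kwargs):
        calls["n"] += 1
        if calls["n"] in fail_on:
            raise InjectedError(f"injected failure on call {calls['n']}")
        return func(*args, **kwargs)

    return wrapper


def submit_with_retry(submit, job, max_attempts=3):
    """Hypothetical reaction under study: retry a failed submission.

    Returns (result, attempts_used); re-raises once max_attempts is exhausted.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return submit(job), attempt
        except InjectedError:
            if attempt == max_attempts:
                raise


if __name__ == "__main__":
    # Stand-in for a real submission call; fails on its first two invocations.
    flaky_submit = inject_errors(lambda job: f"accepted:{job}", fail_on_calls=[1, 2])
    print(submit_with_retry(flaky_submit, "job-1"))
```

Making the failures deterministic (by call number rather than at random) keeps a failing run reproducible, which matters when the reaction being studied is itself the thing under test.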

8 Certification, Testing & Release Cycle
[Diagram: flow of a release through certification, testing and deployment. Labels recovered from the figure:]
  – Phases: CERTIFICATION, TESTING, DEPLOYMENT
  – LCG C&T section: add features, fix problems, transmit problems
  – EGEE: fix problems, new releases
  – VDT: fix problems, new releases
  – Steps: Integrate; Basic Functionality Tests; Run Special Tests; Run Certification Matrix; Release candidate tagged
  – RELEASE: PRE-DEPLOYMENT, GENERAL RELEASE
  – EXPERIMENTS INTEGRATION: Experiments software installation; Testing experiments specific features
  – Certified release tag

9 Typical Certification Matrix
Errors reflect ongoing development
Details available through links
An LCG release candidate must not have any serious errors reported by the test suites
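
The release rule stated on this slide (no serious errors reported by the test suites) can be expressed as a simple gate over the matrix. The matrix layout, the status labels ("ok", "minor", "serious") and the suite and configuration names below are assumptions for illustration; the slide does not show how the matrix is stored.

```python
SERIOUS = "serious"  # hypothetical status label for a blocking error


def blocking_cells(matrix):
    """Return the (test suite, configuration) cells that report a serious error."""
    return sorted(cell for cell, status in matrix.items() if status == SERIOUS)


def can_tag_release_candidate(matrix):
    """A release candidate is allowed only if no cell reports a serious error."""
    return not blocking_cells(matrix)


if __name__ == "__main__":
    # Keys are (test suite, configuration); all names here are made up.
    matrix = {
        ("job-submission", "config-a"): "ok",
        ("job-submission", "config-b"): "minor",
        ("replica-management", "config-a"): "serious",
    }
    print(can_tag_release_candidate(matrix))
    print(blocking_cells(matrix))
```

Minor errors do not block the tag in this sketch, which matches the slide's remark that errors in the matrix reflect ongoing development.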

10 The Tasks
Web page to open bugs and tasks:
  – https://savannah.cern.ch/projects/lcgoperation/
Main task: stabilize LCG-2
  – Allow serious work to get done efficiently
  – Minor remaining inconveniences should be tolerable
      – To be addressed by EGEE/ARDA
Main ingredients
  – dCache
  – Porting to RH 7.3 successors
  – Redo Replica Manager core
  – Flexible info providers
      – corresponding changes in WP1/WP2 code
  – Shield CE against overload risk
  – …

11 More Tasks
Try and follow Globus releases (via VDT)
Use the VDT more:
  – Helps EU-US interoperability
  – Try more functionality already provided by VDT
      – Condor as default batch system?
      – PacMan?
  – Try and put more into the VDT
Try R-GMA for monitoring
  – Combine with GridICE
Get rid of MDS completely
LCFGng Quattor
…

