INFSO-RI-508833 Enabling Grids for E-sciencE www.eu-egee.org Integration and Testing, SA3 Markus Schulz CERN IT JRA1 All-Hands Meeting 22nd - 24th March.


1 Integration and Testing, SA3
Markus Schulz, CERN IT
JRA1 All-Hands Meeting, 22nd - 24th March 2006
Enabling Grids for E-sciencE, INFSO-RI-508833, www.eu-egee.org

2 Building a Release: The Plan
1. TCG provides prioritized list of functionality for the next release
2. SA3 “shops” for components
3. SA3 builds list of component candidates
4. TCG blesses the list
5. SA3 Integration (functional freeze)
6. SA3 Certification
7. SA1 Preproduction
   1. with users, integrates new components into operations
8. Loop quickly through 5-7 to add patches, remove broken parts
9. SA3 packages release (with contributions from developers & friends)
   1. Documentation (user, config, and admin)
   2. RPM repositories
   3. Configuration tools
10. SA1 & SA3 push out new release
11. SA1, SA3, and software developers support the release

3 Process to deployment (diagram)
Diagram labels: Middleware providers (VDT/OSG, OMII-Europe, JRA1, …); Integration; Testing & Certification; Certification activities; Pre-production service; Production service; Support, analysis, debugging. Activity labels: SA3, SA1, SA3+SA1.

4 Release Sequence: The Plan, Part II
1. Every second release ends up in production
2. x.1 releases contain all potential packages
3. The process (integration, certification) weeds out those that are not ready yet
4. The x.2 release candidate is moved to the pre-production service
5. The x.0 and x.1 activities have to overlap

5 Planned Schedule for gLite 3.0.0
Plan for gLite 3.1.0:
– March 31st: code freeze for development release gLite 3.1.0
– April 30th: end of integration
– May 31st: end of certification. Deployment on PPS
– July 31st: release of production version gLite 3.2.0. Start deployment at sites
– September: gLite 3.2.0 installed at sites and usable. PRODUCTION!!
Timeline for gLite 3.0.0 (chart, February to June; phases: Certification, PPS, Deploy in production):
– Tuesday 28/2/06: gLite 3.0.0β exits certification and enters pre-production
– Wednesday 15/3/06: gLite 3.0.0β available to users in the PPS
– Deployment of gLite 3.0.0β in PPS; continual bug fixing and patches passed to PPS
– Friday 28/4/06: PPS phase ends. gLite 3.0.0 passes from PPS to Production.
– Thursday 1/6/06: LCG Service Challenge 4 (SC4) starts!!

6 Time to rollout
Upgrade rate is roughly constant (~2.5 sites/day)
Takes a long time to upgrade the entire infrastructure
(Chart: LCG-2.6.0 rollout)
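As a rough illustration of why a full rollout takes so long at a constant rate, the sketch below converts a site count into elapsed days. Only the ~2.5 sites/day rate comes from the slide; the function name and the site counts in the usage note are hypothetical.

```python
import math


def days_to_upgrade(n_sites: int, sites_per_day: float = 2.5) -> int:
    """Estimate the days needed to upgrade n_sites at a constant rate.

    The default rate is the ~2.5 sites/day quoted on the slide;
    n_sites is whatever infrastructure size the caller assumes.
    """
    if n_sites < 0:
        raise ValueError("n_sites must be non-negative")
    if sites_per_day <= 0:
        raise ValueError("sites_per_day must be positive")
    # Round up: a partially used day still counts as a day.
    return math.ceil(n_sites / sites_per_day)
```

For example, under the assumption of 100 sites, `days_to_upgrade(100)` gives 40 days; doubling the assumed site count doubles the estimate, because the rate does not grow with the infrastructure.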

7 Problems
To get into a steady state
– No extra time for the merging of the two release preparation systems
– No time for establishing a new process
Integration and testing of gLite-3.0 is special
– 2 stacks (build systems)
– Multiple test components
– 2 sets of installation and configuration tools
– Many changes to the way integration is done
– Merging teams and procedures
SC4 requirements
– Core components still need to integrate core functionality
– Non-negotiable release date + non-negotiable functionality
Requirements and prioritization for next releases
– Become clear only during (pre-)production usage
– BUT: freeze a few days after startup of PPS…

8 Problems
gLite-3.1 currently not well defined
– Partially due to lack of time
Time for getting new services production ready is hard to predict
– This makes pushing 3.1 through the process a frightening task
– A tick list is needed to check components at certain process state transitions
  • Something like the next slide, but tailored for
    Enter Integration
    Integration -> Certification
    Certification -> PPS
    PPS -> Production
Still need to move a lot of code into the ETICS build system
Still need to define the process from code to release
– Working on gLite-3.0 is a good way to understand what is needed
Maybe the current concept of releases is not adequate?
– Component based with infrequent checkpoint releases + upgrades?
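One way to picture the proposed tick list is as a gate on each state transition. The sketch below is only an illustration: the four transitions come from the slide, but the `DEVELOPMENT` starting state, the item names, and the assignment of items to transitions are all assumptions made for the example, not an agreed SA3 list.

```python
from enum import Enum


class Stage(Enum):
    DEVELOPMENT = "development"  # assumed state before "Enter Integration"
    INTEGRATION = "integration"
    CERTIFICATION = "certification"
    PPS = "pps"
    PRODUCTION = "production"


# Hypothetical tick lists: which items belong to which transition
# is an assumption for illustration only.
TICK_LISTS = {
    (Stage.DEVELOPMENT, Stage.INTEGRATION): {"rpm_list", "configuration_details"},
    (Stage.INTEGRATION, Stage.CERTIFICATION): {"start_stop_procedure", "server_validation"},
    (Stage.CERTIFICATION, Stage.PPS): {"troubleshooting_guide", "service_status_reporting"},
    (Stage.PPS, Stage.PRODUCTION): {"security_audit", "escalation_procedure"},
}


def missing_items(current, target, ticked):
    """Return the tick-list items still open for the given transition."""
    try:
        required = TICK_LISTS[(current, target)]
    except KeyError:
        # Only the defined transitions are allowed; stages cannot be skipped.
        raise ValueError(f"no transition {current.value} -> {target.value}")
    return sorted(required - set(ticked))


def can_advance(current, target, ticked):
    """A component may advance only when every required item is ticked."""
    return not missing_items(current, target, ticked)
```

With these assumed lists, `missing_items(Stage.PPS, Stage.PRODUCTION, {"security_audit"})` reports that `escalation_procedure` is still open, so the component stays in PPS.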

9 Checklist for a new service
User support procedures (GGUS)
– Troubleshooting guides + FAQs
– User guides
Operations Team Training
– Site admins
– CIC personnel
– GGUS personnel
Monitoring
– Service status reporting
– Performance data
Accounting
– Usage data
Service Parameters
– Scope - Global/Local/Regional
– SLAs
– Impact of service outage
– Security implications
Contact Info
– Developers
– Support Contact
– Escalation procedure to developers
Interoperation
– Documented issues
First level support procedures
– How to start/stop/restart service
– How to check it’s up
– Which logs are useful to send to CIC/Developers
  • and where they are
SFT Tests
– Client validation
– Server validation
– Procedure to analyse these
  • error messages and likely causes
Tools for CIC to spot problems
– GIIS monitor validation rules (e.g. only one “global” component)
– Definition of normal behaviour
  • Metrics
CIC Dashboard
– Alarms
Deployment Info
– RPM list
– Configuration details
– Security audit

10 Current State of Integration
ETICS and lcg build systems
– Move to ETICS started
– Progress is slow because priority is SC4 functionality
– ETICS team handles building of meta RPMs
  • Define release candidates
– Repositories for certification, preproduction, and soon production
– Build for SC4: 32-bit RPMs only
– Lots of informal communication between:
  • Integrators, certifiers, and software providers
Next steps (after we are ready to roll for SC4)
– One build system
– Define and describe integration process
  • Including synchronization with external dependencies
– Prepare, with SA3 partners, for building releases for:
  • Multiple Linux distributions and package formats
  • 32/64-bit Intel & AMD

11 Testing
We now have an inventory of existing tests (Zdenek and friends)
– Not 100% complete, but an excellent start
We have started to identify gaps
– And have sign-ups for some of them
– Here the external partners in SA3 will contribute
On the certification testbed we use
– Gilbert’s test suite for LCG components
– Gilbert’s test suite for gLite components
– Some manually run tests
– SFT “External” tests for SRM interoperations, FTS, ..
An integration of tests is urgently needed
– Common reporting and archiving
– Should be linked to ETICS activities
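To make "common reporting and archiving" concrete, the sketch below shows one possible shape for it: every suite's results are normalized into the same record type, summarized per suite, and archived one record per line. The record fields, suite names, and JSON Lines format are assumptions for illustration; the slides do not specify a format, and nothing here reflects an ETICS interface.

```python
import json
from collections import Counter
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ResultRecord:
    """Hypothetical common record that any test suite could emit."""
    suite: str
    test: str
    passed: bool


def summarize(records):
    """Count passed and failed tests per suite."""
    summary = {}
    for record in records:
        counts = summary.setdefault(record.suite, Counter())
        counts["passed" if record.passed else "failed"] += 1
    return {suite: dict(counts) for suite, counts in summary.items()}


def archive(records):
    """Serialize records as JSON Lines, one record per line."""
    return "\n".join(json.dumps(asdict(r), sort_keys=True) for r in records)
```

A usage sketch with made-up suite and test names: `summarize([ResultRecord("lcg-suite", "job-submit", True), ResultRecord("manual", "login", False)])` yields one pass/fail tally per suite, and `archive(...)` on the same list produces text that can be appended to a shared log.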

12 Testing
Complexity of the certification testbeds has increased and will keep increasing
– Different WLMs (CEs, RBs)
– More services
– Hopefully soon external partner sites
  • Different platforms (OS, distributions, architecture)
  • Interoperability (different grids, different OS versions)
Main problem in the testing area is the lack of resources
– All hands are on deck getting gLite-3.0 out of the door
Next steps:
– Gap analysis
– Plan for an integrated test environment
  • Best within ETICS
– Explore ways to handle complexity and diversity more efficiently
  • Virtual machines?

13 Summary
With gLite-3.0 we have not yet reached a fully integrated release
– Work to be done on:
– Integrating the build systems
– Integrating all tests
– Defining a release process
  • Workflow
  • Acceptance criteria
All activities are focused on meeting the SC4 deadlines
– This helps to prioritize
– This slows down the lcg gLite integration process
Consolidation of test and certification activities will take some time
We have to rethink how we can evolve the production system
– How to introduce change?
– What is a release?
– The current approach 3.0 -> (3.1) -> 3.2 is not very realistic

