EGEE-III INFSO-RI-222667 Enabling Grids for E-sciencE www.eu-egee.org EGEE and gLite are registered trademarks Oliver Keeble SA3 Activity Leader CERN EGEE-III.


1 SA3 Status Report
Oliver Keeble, SA3 Activity Leader, CERN
EGEE-III First Review, 24-25 June, 2009
EGEE-III INFSO-RI-222667, Enabling Grids for E-sciencE, www.eu-egee.org
EGEE and gLite are registered trademarks

2 Activity Overview
Country | Total PM planned at M24 (1) | Total FTE
CERN | 396 | 16.5
Cyprus | 12 | 0.5
Czech Republic | 24 | 1.0
Finland | 12 | 0.5
Greece | 30 | 1.3
Ireland | 36 | 1.5
Italy | 96 | 4.0
Netherlands | 24 | 1.0
Poland | 24 | 1.0
Russia | 30 | 1.3
Spain | 32 | 1.3
UK | 36 | 1.5
Total | 752 | 31.3
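The FTE figures on the Activity Overview slide appear to follow from the person-month figures divided by the 24 months of the planning period (M24). That divisor, and rounding half-up to one decimal place, are assumptions inferred from the numbers rather than stated on the slide. A minimal Python sketch under those assumptions:

```python
from decimal import Decimal, ROUND_HALF_UP

# Person-months planned at M24, copied from the Activity Overview slide.
PLANNED_PM = {
    "CERN": 396, "Cyprus": 12, "Czech Republic": 24, "Finland": 12,
    "Greece": 30, "Ireland": 36, "Italy": 96, "Netherlands": 24,
    "Poland": 24, "Russia": 30, "Spain": 32, "UK": 36,
}

MONTHS = 24  # assumption: FTE = person-months / 24


def pm_to_fte(pm, months=MONTHS):
    """Convert person-months to FTE, rounded half-up to one decimal place."""
    ratio = Decimal(pm) / Decimal(months)
    return ratio.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)


fte_by_country = {country: pm_to_fte(pm) for country, pm in PLANNED_PM.items()}
total_pm = sum(PLANNED_PM.values())
total_fte = pm_to_fte(total_pm)

for country, fte in fte_by_country.items():
    print(f"{country:15s} {PLANNED_PM[country]:4d} PM  {fte} FTE")
print(f"{'Total':15s} {total_pm:4d} PM  {total_fte} FTE")
```

`Decimal` is used instead of `round()` because Python's built-in rounding would turn 30/24 = 1.25 into 1.2, whereas the slide shows 1.3 for Greece and Russia.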

3 SA3 Objectives
SA3’s main objectives are to:
– Produce well-tested and documented gLite releases together with associated configuration tools
– Improve the multi-platform support of gLite
– Increase interoperability of different Grid infrastructures by working towards best practices and established standards, and provide input to standardisation bodies
SA3 sits between JRA1 & SA1 in the software process

4 Tasks
TSA3.1: Integration, configuration and packaging (186PM)
TSA3.2: Testing and certification (319PM)
TSA3.3: Support, analysis, debugging, problem resolution (100PM)
TSA3.4: Interoperability & Platform support (141PM)
TSA3.5: Activity Management (46PM)
Figures: Distribution of tasks in SA3; Software change management SA3/JRA1
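The "Distribution of tasks in SA3" chart did not survive in the transcript, but the shares can be recomputed from the person-month figures on the Tasks slide. The sketch below uses only those five figures; the variable names are illustrative. Note that the task figures sum to 792 PM, which differs from the 752 PM planned total on the Activity Overview slide; the transcript does not explain the difference.

```python
# Person-months per task, as listed on the Tasks slide.
TASK_PM = {
    "TSA3.1": 186,  # Integration, configuration and packaging
    "TSA3.2": 319,  # Testing and certification
    "TSA3.3": 100,  # Support, analysis, debugging, problem resolution
    "TSA3.4": 141,  # Interoperability & Platform support
    "TSA3.5": 46,   # Activity Management
}

total_task_pm = sum(TASK_PM.values())

# Share of the total effort per task, in percent.
shares = {task: 100 * pm / total_task_pm for task, pm in TASK_PM.items()}

# Print tasks from largest to smallest share.
for task, pct in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{task}: {TASK_PM[task]} PM ({pct:.1f}%)")
```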

5 Middleware releases
Functional highlights introduced during year 1
– gLite’s interface to computing resources: CREAM
– a keystore for Medical Data Management: HYDRA
– a metadata catalogue: AMGA
– a local authorisation service: glexec/SCAS
– enabling grid access to local resources: Batch system integration
– making parallel computing easier on the grid: MPI
Updates to gLite 3.1 / SL4 / 32 & 64 bit
– Deployed across the infrastructure
– 22 updates made
– Each an aggregation of numerous changes
1556 change requests were opened and 1742 were closed
– Includes both bugs and enhancement requests
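Because more change requests were closed than opened, the open backlog shrank over the year. A quick check using only the two figures from the Middleware releases slide (variable names are illustrative):

```python
opened = 1556  # change requests opened during year 1
closed = 1742  # change requests closed during year 1

# Negative value means the number of open requests went down.
net_backlog_change = opened - closed

# Requests closed per request opened; above 1 means the backlog is shrinking.
closure_ratio = closed / opened

print(f"Net change in open requests: {net_backlog_change}")
print(f"Closed per opened: {closure_ratio:.2f}")
```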

6 Release history
Average during year 1 of EGEE-III is 12 patches per month
Each update represents numerous different changes
Changes released together were independent until the time of release

7 Fixing bugs

8 Open & closed change requests
Figure: total number of bugs, integrated over time
EGEE-III inherited a lot of ‘bugs’
Many are in fact fixed, invalid, obsolete, duplicate…
The discontinuities represent efforts to clean up different classes of ‘bug’

9 Certification & Release
Release
– Documented an updated release process
  * Acceptance criteria
– Post-mortem & Rollback
  * Fixes are not always available in time
– RPM signing
Certification
– Full documentation for devolution of certification
  * Ready for product teams
– Regression tests
– CREAM stress and comparative analysis

10 Nagios Monitoring of the testbed
The SA1 monitoring framework and tests are available in certification

11 Patch certification

12 Patch certification

13 Batch system integration
Target Torque/PBS/Maui, Condor, Sun Grid Engine (SGE) and LSF
– Required for the lcg-CE and CREAM
Support for Torque and LSF is in place
Sun Grid Engine
– During the first year of EGEE-III, SGE was fully certified as a batch system for the LCG Computing Element
– CREAM support for SGE is still ongoing
Condor
– Integration with the LCG-CE is working, with known issues
– Condor integration with CREAM is ongoing
New and updated TWiki pages on batch system integration and batch system support

14 GLUE 2.0 & InfoSys
Information system schema
– Will allow a better expression of what is on the grid and therefore more efficient use of resources
Work carried out within JRA1 & SA3
Ratified as an Open Grid Forum (OGF) standard
LDAP rendering is nearly done
– Will be packaged and pushed out in a few weeks
3 stage rollout process
– Deployment of ‘empty’ schema in parallel with 1.3
– Update of information providers to populate 2.0
– Implementation of support in clients
SAGA Service discovery API
– Candidate OGF specification
‘Scalability & infosys related problems’
– Led to BDII v5

15 Activity & Project Coordination
Task tracking
– Task tracking system implemented to follow effort
– Weekly meetings
EMT (Engineering Management Team)
– Cross-activity, short-term coordination, chaired by SA3
All-hands meetings
– Established the principle of joint sessions with JRA1
– CERN, Prague and Nicosia
– Expect one more before the end of the project

16 Population of task tracker
Figure: number of tasks tracked, integrated over time

17 Interoperability & Standards
Short- and medium-term interoperability goals have been achieved with key infrastructures: Open Science Grid (OSG), Nordic DataGrid Facility (NDGF)
– glite-WMS submission to NDGF resources now possible
  * And is actively used by CMS!
Work has moved beyond short-term fixes
– Now working towards long-term sustainability via standards
  * Pursued in OGF
  * BES/JSDL workshop
  * “Production Grid Infrastructures”: “investigate adoption of OGF recommendations in production grid infrastructures”
    Initially focusing on BES, GLUE & JSDL
– Ongoing talks with ARC and UNICORE
  * Harmonisation of European middleware stacks for EGI era
Maintaining relationships with other infrastructures
– NAREGI, Teragrid, DEISA, PRAGMA

18 Security Audits
SA3 has undertaken source-level security reviews of sensitive components in the middleware
– HYDRA
– SCAS
– DPM
The only EGEE activity dedicated to proactively finding security issues
Reports have been created and passed to the relevant development teams

19 Platform Support
Platform | Strategy/purpose | Status
SL5/x86_64 | Next reference platform; full gLite release | Worker Node and User Interface available; build complete
Debian4/x86_64 | To grid-enable existing compute resources; client release | Worker Node available
MacOSX | Required for User Interface; client release | Build not complete
SL4/i386 & SL4/x86_64 | To be maintained as required | Fully maintained
SL3/i386 | To be retired | Officially obsolete
Tarball | Can be adapted to other Linuxes | Available
Other Platforms | To be made available, where possible, with limited support | Builds set up in ETICS
Builds are performed with ETICS

20 The Debian Release

21 Platform Support – the issues
Issues are typically not technical
– No fundamental incompatibilities
Finding effort to deal with multiple platform builds while moving to gLite 3.2 / SL5
Slow turnaround times on reported problems
– Build complexity
– Prioritisation
Gradual introduction of platform support in ETICS
– Convergence on working Debian x86_64 builds
Poor availability of MacOSX to developers
A new platform requires additional expertise and resources all along the software lifecycle
– Impact of one source code change is amplified

22 General Issues
Change iteration time
– Access to developers
– Release reactivity
Build
– Managing the complexity in a sustainable way
– Provision of ETICS client addressing performance issues
Effort
– CERN SA3 understaffed
  * High turnover: lost 1 person every 2 months
  * Currently at 10FTE (cf 16.5), 3 hires pending
– Distribution of effort
  * Average outside CERN is 9PM (over 2yrs) per person
Certification expertise is highly specialised
Incompatibility of project objectives with local hierarchies
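The effort figures on the General Issues slide can be put on a common scale with a few lines of arithmetic. The sketch below assumes that each pending hire would count as one full FTE and that "over 2yrs" means 24 months; neither is stated explicitly on the slide, and the variable names are illustrative.

```python
PLANNED_CERN_FTE = 16.5  # planned CERN staffing, from the slide
current_cern_fte = 10.0  # current CERN staffing, from the slide
pending_hires = 3        # assumption: each hire counts as 1.0 FTE

# Gap between planned and actual staffing, now and once hires arrive.
shortfall_now = PLANNED_CERN_FTE - current_cern_fte
shortfall_after_hires = shortfall_now - pending_hires
staffing_level = current_cern_fte / PLANNED_CERN_FTE

# Average commitment per person outside CERN: 9 PM spread over 24 months.
pm_per_person = 9
project_months = 24  # assumption: "over 2yrs" = 24 months
fte_per_person_outside_cern = pm_per_person / project_months

print(f"CERN shortfall now: {shortfall_now} FTE ({staffing_level:.0%} of plan)")
print(f"CERN shortfall after pending hires: {shortfall_after_hires} FTE")
print(f"Average outside CERN: {fte_per_person_outside_cern} FTE per person")
```

On these assumptions, CERN would remain below plan even after the pending hires, and the average partner contributor outside CERN works on SA3 well under half time, which is consistent with the slide's concern about the distribution of effort.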

23 Year 2
Continue to deliver gLite updates while implementing necessary changes for EGI
– Complete gLite 3.2 / SL5 release
  * Worker Node on other target platforms
  * Understand implications for certification
Implement ‘Product Teams’
– Overlay this structure on existing teams
– Define requirements and constraints
– Implement/adopt any new technology required
Describe new release process
– And all other docs (eg developer’s guide)
gLite SDK and gLite 4 planning
– Source rpms, promoting community contributions
Fully document certification process

24 Summary
Optimisation of infrastructure
– Improved test coverage in certification
  * In regression tests
– Documentation ready for devolution of tasks to product teams
– Signed rpms
– Functional improvements
  * CREAM, MPI
– Continual flow of gLite updates
Extension of infrastructure
– Batch system integration
– SL5 releases
– Debian WN in certification
– NDGF Interoperability

