
Slide 1: Software Overview and LCG Project Status & Plans
Torre Wenaus, BNL/CERN
DOE/NSF Review of US LHC Software and Computing
NSF, Arlington, June 20, 2002

Slide 2: Outline
- Overview, organization and planning
- Comments on activity areas
- Personnel
- Conclusions
- LCG Project Status and Plans

Slide 3: U.S. ATLAS Software Project Overview
- Control framework and architecture
  - Chief Architect, principal development role. Software agreement in place
- Databases and data management
  - Database Leader, primary ATLAS expertise on the ROOT/relational baseline
- Software support for development and analysis
  - Software librarian, quality control, software development tools, training, ...
  - Automated build/testing system adopted by and (partly) transferred to Int'l ATLAS
- Subsystem software roles complementing hardware responsibilities
  - Muon system software coordinator
- Scope commensurate with the U.S. role in ATLAS: ~20% of overall effort
  - Commensurate representation on the Computing Steering Group
- Strong role and participation in the LCG common effort

Slide 4: U.S. ATLAS Software Organization (organization chart)

Slide 5: U.S. ATLAS - ATLAS Coordination
US roles in Int'l ATLAS software:
- D. Quarrie (LBNL), Chief Architect
- D. Malon (ANL), Database Coordinator
- P. Nevski (BNL), Geant3 Simulation Coordinator
- H. Ma (BNL), Raw Data Coordinator
- C. Tull (LBNL), Eurogrid WP8 Liaison
- T. Wenaus (BNL), Planning Officer

Slide 6: ATLAS Subsystem/Task Matrix

                          | Offline Coordinator | Reconstruction | Simulation    | Database
Chair                     | N. McCubbin         | D. Rousseau    | A. Dell'Acqua | D. Malon
Inner Detector            | D. Barberis         | D. Rousseau    | F. Luehring   | S. Bentvelsen / D. Calvet
Liquid Argon              | J. Collot           | S. Rajagopalan | M. Leltchouk  | H. Ma
Tile Calorimeter          | A. Solodkov         | F. Merritt     | V. Tsulaya    | T. LeCompte
Muon                      | J. Shank            | J.F. Laporte   | A. Rimoldi    | S. Goldfarb
LVL2 Trigger/Trigger DAQ  | S. George           | S. Tapprogge   | M. Weilers    | A. Amorim / F. Touchard
Event Filter              | V. Vercesi          | F. Touchard    |               |

Computing Steering Group members/attendees: 4 of 19 from the US (Malon, Quarrie, Shank, Wenaus)

Slide 7: Project Planning Status
- U.S./Int'l ATLAS WBS/PBS and schedule fully unified
  - 'Projected' out of common sources (XProject); mostly the same
- US/Int'l software planning covered by the same person
  - Synergies outweigh the added burden of the ATLAS Planning Officer role
  - No 'coordination layer' between US and Int'l ATLAS planning: direct 'official' interaction with Int'l ATLAS computing managers. Much more efficient
  - No more 'out of the loop' problems on planning (CSG attendance)
- True because of how the ATLAS Planning Officer role is currently scoped
  - As pointed out by an informal ATLAS computing review in March, ATLAS would benefit from a full FTE devoted to the Planning Officer function
  - I have a standing offer to the Computing Coordinator: to willingly step aside if/when a capable person with more time is found
  - Until then, I scope the job to what I have time for and what is highest priority
- ATLAS management sought to impose a different planning regime on computing (PPT) which would have destroyed US/Int'l planning commonality; we reached an accommodation which will make my time more rather than less effective, so I remained in the job

Slide 8: ATLAS Computing Planning
- US led a comprehensive review and update of the ATLAS computing schedule in Jan-Mar
  - Milestone count increased by 50% to 600; many others updated
  - Milestones and planning coordinated around the DC schedule
  - Reasonably comprehensive and detailed through 2002
- Things are better, but still not great
  - Schedule still lacks detail beyond end 2002
  - Data Challenge schedules and objectives unstable
  - Weak decision making (still a major problem) translates to weak planning
    - Strong recommendation of the March review to fix this; no observable change
- Use of the new reporting tool PPT (standard in the ATLAS construction project) may help improve overall planning
  - Systematic, regular reporting coerced by automated nagging
  - Being introduced so as to integrate with and complement XProject-based planning materials. XProject adapted; waiting on PPT adaptations.

Slide 9: Short ATLAS planning horizon (schedule chart, as of 3/02)

Slide 10: Summary Software Milestones (milestone chart; green: done, gray: original date, blue: current date)

Slide 11: Data Challenge 1
- DC1 phase 1 (simulation production for the HLT TDR) is ready to start
- Software ready and tested, much of it developed in the US
  - Baseline core software, VDC, Magda, production scripts
- 2M events generated and available for filtering and simulation
- US is providing the first 50K of filtered, fully simulated events for QA
  - Results will be reviewed by the QA group before the green light is given for full scale production in about a week
- During the summer we expect to process 500k fully simulated events at the BNL Tier 1

Slide 12: Brief Comments on Activity Areas
- Control Framework and Architecture
- Database
- Software Support and Quality Control
- Grid Software

Slide 13: Control Framework and Architecture
- US leadership and principal development roles
  - David Quarrie was recently offered, and accepted, a further 2-year term as Chief Architect
- Athena's role in ATLAS appears well consolidated
  - Basis of post-simulation Data Challenge processing
  - Actively used by end users, with feedback commensurate with Athena's stage of development
  - Collaboration with LHCb working well
  - FADS/Goofy (simulation framework) issue resolved satisfactorily
  - Regular, well attended tutorials
- Other areas still have to prove themselves
  - ATLAS data definition language
    - Being deployed now in a form capable of describing the ATLAS event model
  - Interactive scripting in Athena
    - Strongly impacted by funding cutbacks
    - New scripting service emerging now
- Tremendous expansion in ATLAS attention to the event model
  - About time! A very positive development
  - Broad, (US) coordinated effort across the subsystems to develop a coherent ATLAS event model
  - Built around the US-developed StoreGate infrastructure (see the sketch after this slide)
  - Core infrastructure effort receiving useful feedback from the expanded activity
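To illustrate the kind of service the StoreGate infrastructure provides (a transient per-event store in which one algorithm records data objects by type and key and later algorithms retrieve them), here is a minimal hypothetical Python sketch. It is not the real StoreGate API, which is C++; the class and method names below are invented for illustration.

```python
# Minimal sketch of a StoreGate-like transient event store (hypothetical,
# not the real ATLAS StoreGate API; names are invented for illustration).

class TransientEventStore:
    """Holds event data objects for the current event, keyed by (type, key)."""

    def __init__(self):
        self._store = {}

    def record(self, obj, key):
        """Register an object under its type and a string key."""
        self._store[(type(obj), key)] = obj

    def retrieve(self, obj_type, key):
        """Look up an object previously recorded by another algorithm."""
        try:
            return self._store[(obj_type, key)]
        except KeyError:
            raise LookupError(f"No {obj_type.__name__} recorded with key '{key}'")

    def clear(self):
        """Wipe the store at the end of the event."""
        self._store.clear()


# Usage: one algorithm records a cluster collection, a later one retrieves it.
class CaloClusterCollection(list):
    pass

store = TransientEventStore()
store.record(CaloClusterCollection(["cluster1", "cluster2"]), "EMClusters")
clusters = store.retrieve(CaloClusterCollection, "EMClusters")
print(len(clusters))   # -> 2
store.clear()          # between events
```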

Slide 14: Database
- US leadership and key technical expertise
  - The ROOT/relational hybrid store, in which the US has unique expertise within ATLAS, is now the baseline and is in active development
  - The early US effort in ROOT and relational approaches (in the face of 'dilution of effort' criticisms) was a good investment for the long term as well as the short term
- Event data storage and management now fully aligned with the LCG effort
  - ~1 FTE each at ANL and BNL identified to participate and now becoming active
  - Work packages and US roles now being laid out
- ATLAS and US ATLAS have to be proactive and assertive in the common project in the interests of ATLAS (I don't have my LCG hat on here!), and I am pushing this hard
  - Delivering a data management infrastructure that meets the needs of (US) ATLAS and effectively uses our expertise demands it

Slide 15: Software Support, Quality Control
- New releases are available in the US ~1 day after CERN (with some exceptions when problems arise!)
  - Provided in AFS for use throughout the US
- Librarian receives help requests and queries from ~25 people in the US
- US-developed nightly build facility used throughout ATLAS
  - Central tool in the day-to-day work of developers and in the release process
  - Recently expanded as a framework for progressively integrating more quality control and testing (see the sketch after this slide)
    - Testing at component, package and application level
    - Code checking to be integrated
  - CERN support functions being transferred to the new ATLAS librarian
- Plan to resume BNL-based nightlies
  - Much more stable build environment than CERN at the moment
  - Hope to use timely, robust nightlies to attract more usage to the Tier 1
- pacman (Boston U) for remote software installation
  - Adopted by grid projects for VDT, and a central tool in US grid testbed work
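As a rough illustration of the nightly build-and-test pattern described above (build the latest code, then run tests at component, package and application level and publish a report), here is a hypothetical Python sketch. It is not the actual ATLAS nightly build facility; the commands and package names are placeholders.

```python
# Hypothetical sketch of a nightly build-and-test loop (not the actual
# ATLAS nightly system; commands and package names are placeholders).
import subprocess, datetime, json

PACKAGES = ["CoreFramework", "EventModel", "Reconstruction"]  # placeholder names

def run(cmd):
    """Run a shell command, returning (success, combined output)."""
    result = subprocess.run(cmd, shell=True, capture_output=True, text=True)
    return result.returncode == 0, result.stdout + result.stderr

def nightly():
    report = {"date": datetime.date.today().isoformat(), "results": {}}
    ok, log = run("echo 'checkout + build of latest release candidate'")  # placeholder
    report["results"]["build"] = "ok" if ok else "FAILED"
    if ok:
        # Test at component, package and application level, as on the slide.
        for level in ["component", "package", "application"]:
            for pkg in PACKAGES:
                ok, log = run(f"echo 'running {level} tests for {pkg}'")  # placeholder
                report["results"][f"{pkg}:{level}"] = "ok" if ok else "FAILED"
    # Publish the report where developers can see the state of the nightly.
    with open("nightly_report.json", "w") as f:
        json.dump(report, f, indent=2)

if __name__ == "__main__":
    nightly()
```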

Slide 16: Grid Software
- Software development within the ATLAS complements of the grid projects is being managed as an integral part of the software effort
  - Objective is to integrate grid software activities tightly into the ongoing core software program, for maximal relevance and return
  - Grid project programs consistent with this have been developed
- And this has been successful
  - e.g. the distributed data manager tool (Magda) we developed was adopted ATLAS-wide for data management in the DCs
  - Grid goals and schedules integrated with the ATLAS (particularly DC) program
- However, we do suffer some program distortion
  - e.g. we have to limit effort on providing ATLAS with event storage capability in order to do work on longer-range, higher-level distributed data management services

Slide 17: Effort Level Changes
- ANL/Chicago: loss of 0.5 FTE in DB
  - Ed Frank departure; no resources to replace
- BNL: cancelled 1 FTE new hire in data management
  - Insufficient funding in the project and the base program to sustain the bare-bones plan
  - Results in transfer of DB effort to grid (PPDG) effort, because the latter pays the bills, even if it distorts our program towards lesser priorities
  - As funding looks now, >50% of the FY03 BNL software development effort will be on grid!!
- LBNL: stable FTE count in architecture/framework
  - One expensive/experienced person replaced by a very good postdoc
- It is the DB effort that is hardest hit, but this is ameliorated by the common project
  - Because the work is now in the context of a broad common project, the US can still sustain our major role in ATLAS DB
  - This is a real, material example of common effort translating into savings (even if we wouldn't have chosen to structure the savings this way!)

Slide 18: Personnel Priorities for FY02, FY03
- Priorities are the same as presented last time, and this is how we are doing...
- Sustain LBNL (4.5 FTE) and ANL (3 FTE) support
  - This we are doing so far.
- Add FY02, FY03 1 FTE increments at BNL to reach 3 FTEs
  - Failed. FY02 hire cancelled.
- Restore the 0.5 FTE lost at UC to ANL
  - No resources
- Establish a sustained presence at CERN
  - No resources
- As stated last time, we rely on the labs to continue base program and other lab support to sustain the existing complement of developers
  - And they are either failing or predicting failure soon. Lab base programs are being hammered as well.

Slide 19: SW Funding Profile Comparisons (chart comparing the 2000 agency guideline, the January 2000 PMP, the 11/2001 guideline, the compromise profile requested in 2000, and the current bare-bones profile)

Slide 20: Conclusions
- US has consolidated the leading roles in our targeted core software areas
- Architecture/framework effort level being sustained so far
  - And is delivering the baseline core software of ATLAS
- Database effort reduced, but so far preserving our key technical expertise
  - Leveraging that expertise for a strong role in the common project
  - Any further reduction will cut into our expertise base and seriously weaken the US ATLAS role and influence in LHC database work
- US has made major contributions to an effective software development and release infrastructure in ATLAS
  - Plan to give renewed emphasis to leveraging and expanding this work to make the US development and production environment as effective as possible
- Weakening support from the project and base programs while the emphasis on grids grows is beginning to distort our program in a troubling way

Slide 21: LCG Project Status & Plans (with an emphasis on applications software)
Torre Wenaus, BNL/CERN
DOE/NSF Review of US LHC Software and Computing
NSF, Arlington, June 20, 2002

Slide 22: The LHC Computing Grid (LCG) Project
- Developed in light of the LHC Computing Review conclusions
- Approved (3 years) by CERN Council, September 2001
- Injecting substantial new facilities and personnel resources
- Activity areas:
  - Common software for physics applications
    - Tools, frameworks, analysis environment
  - Computing for the LHC
    - Computing facilities (fabrics)
    - Grid middleware
    - Grid deployment
    - Global analysis environment
- Foster collaboration and coherence of LHC computing centers
Goal: prepare and deploy the LHC computing environment

Slide 23: LCG Project Structure (organization chart showing the LHCC, the Project Overview Board, the Resources Board (resource issues), the Software and Computing Committee (requirements, monitoring), the Project Execution Board and Project Manager with project execution teams, technical study groups, and links to HEP grid projects, computing grid projects and other labs; reports and reviews flow to the LHCC)

Slide 24: Current Status
- High level workplan just written (linked from this review's web page)
- Two main threads to the work:
  - Testbed development (Fabrics, Grid Technology and Grid Deployment areas)
    - A combination of primarily in-house CERN facilities work and working with external centers and the grid projects
    - Developing a first distributed testbed for data challenges by mid 2003
  - Applications software (Applications area)
    - The most active and advanced part of the project
- Currently three active projects in applications:
  - Software process and infrastructure
  - Mathematical libraries
  - Persistency
- Pressuring the SC2 to open additional project areas ASAP; not enough current scope to put available people to work effectively (new LCG and existing IT people)

Slide 25: LHC Manpower Needs for Core Software
From the LHC Computing Review (FTEs); only computing professionals counted.

        | 2000 Have (miss) | 2001  | 2002  | 2003 | 2004  | 2005
ALICE   | 12 (5)           | 17.5  | 16.5  | 17   | 17.5  | 16.5
ATLAS   | 23 (8)           | 36    | 35    | 30   | 28    | 29
CMS     | 15 (10)          | 27    | 31    | 33   | 33    | 33
LHCb    | 14 (5)           | 25    | 24    | 23   | 22    | 21
Total   | 64 (28)          | 105.5 | 106.5 | 103  | 100.5 | 99.5

LCG common project activity in applications software:
- Expected number of new LCG-funded people in applications: 23
- Number hired or identified to date: 9 experienced, 3 very junior
- Number working today: 8 LCG (3 in the last 2 weeks), plus ~3 existing IT staff, plus effort from the experiments

Slide 26: Applications Area Scope
- Application Software Infrastructure
  - Scientific libraries, foundation libraries, software development tools and infrastructure, distribution infrastructure
- Physics Data Management
  - Storing and managing physics data: events, calibrations, analysis objects
- Common Frameworks
  - Common frameworks and toolkits in simulation, reconstruction and analysis (e.g. ROOT, Geant4)
- Support for Physics Applications
  - Grid 'portals' and interfaces to provide distributed functionality to physics applications
  - Integration of physics applications with common software

Slide 27: Typical LHC Experiment Software Architecture -- a 'Grid-Enabled' View (diagram: applications for simulation, reconstruction, analysis and high level triggers built on one main framework (e.g. ROOT) plus various specialized frameworks for persistency (I/O), visualization, interactivity, simulation, etc.; these sit on toolkits, widely used utility libraries (STL, CLHEP) and standard libraries, with grid interfaces and distributed grid services integrated throughout; common solutions are being pursued or foreseen at each layer)

Slide 28: Current Major Project: Persistency
- First major common software project begun in April: the Persistency Framework (POOL)
  - To manage locating, reading and writing physics data
  - Moving data around will be handled by the grid, as will the distributed cataloging
  - Will support both event data and non-event (e.g. conditions) data
- Selected approach: a hybrid store (see the sketch after this slide)
  - Data objects stored by writing them to ROOT files
    - The bulk of the data
  - Metadata describing the files and enabling lookup are stored in relational databases
    - Small in volume, but with stringent access time and search requirements, well suited to relational databases
  - Successful approach in current experiments, e.g. STAR (RHIC) and CDF (Tevatron)
  - LHC implementation needs to scale to much greater data volumes, provide distributed functionality, and serve the physics data object models of four different experiments
- Early prototype is scheduled for September 02 (likely to be a bit late!)
  - Prototype to serve a scale of 50 TB, O(100k) files, O(10) sites
  - Early milestone driven by CMS, but would have been invented anyway: we need to move development work from abstract discussions to iterating on written software
- Commitments from all four experiments to development participation
  - ~3 FTEs each from ATLAS and CMS; in ATLAS, all the participation (~2 FTEs) so far is from the US (ANL and BNL); another ~1-2 FTE from LHCb + ALICE
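To make the hybrid-store approach concrete (bulk objects streamed to ROOT files, with small relational metadata enabling lookup), here is a minimal Python sketch using PyROOT and sqlite3. It only illustrates the general idea described on the slide, not the POOL design or API; the catalog schema and names are invented.

```python
# Minimal sketch of the hybrid-store idea: bulk data in a ROOT file,
# lookup metadata in a relational catalog. Illustrative only -- this is
# not the POOL API; the schema and names here are invented.
import sqlite3
import ROOT  # PyROOT

# 1) Write the bulk data (here a trivial histogram standing in for event
#    objects) into a ROOT file.
root_file = ROOT.TFile.Open("run001_events.root", "RECREATE")
h = ROOT.TH1F("pt", "example payload", 100, 0.0, 100.0)
h.FillRandom("gaus", 1000)
h.Write()
root_file.Close()

# 2) Record metadata about the file in a small relational catalog, which
#    is what queries and lookups actually hit.
catalog = sqlite3.connect("file_catalog.db")
catalog.execute("""
    CREATE TABLE IF NOT EXISTS files (
        logical_name TEXT PRIMARY KEY,
        physical_name TEXT,
        run_number INTEGER,
        n_objects INTEGER
    )""")
catalog.execute(
    "INSERT OR REPLACE INTO files VALUES (?, ?, ?, ?)",
    ("lfn:run001.events", "run001_events.root", 1, 1),
)
catalog.commit()

# 3) Lookup: resolve a logical name through the catalog, then open the
#    ROOT file and read the object back.
row = catalog.execute(
    "SELECT physical_name FROM files WHERE logical_name = ?",
    ("lfn:run001.events",),
).fetchone()
f = ROOT.TFile.Open(row[0])
print(f.Get("pt").GetEntries())   # -> 1000.0
f.Close()
catalog.close()
```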

Slide 29: Hybrid Data Store -- Schematic View (diagram showing the flow between the experiment event model, an Object Dictionary Service holding experiment-specific object model descriptions, a Persistency Manager (storage: pass objects, get IDs; retrieval: pass IDs, get objects), an Object Streaming Service and Storage Manager writing data objects and object descriptions to data files, a Locator Service backed by an ID-to-file database, a Dataset Locator and dataset database for human interaction, and a distributed (grid) Replica Manager that locates files)
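The schematic reduces to a store/retrieve cycle: clients pass objects and get back opaque IDs, and later pass IDs to get the objects back, with a catalog mapping each ID to the file that holds it. The hypothetical Python sketch below shows that flow, with pickle standing in for the ROOT-based streaming and storage layer; none of the names correspond to real POOL components.

```python
# Hypothetical sketch of the persistency-manager flow on the slide:
# store an object, get back an ID; later, pass the ID to get the object.
# Pickle stands in for the real object streaming/storage layer; the class
# and method names are invented, not POOL components.
import pickle
import uuid

class PersistencyManager:
    def __init__(self):
        self._id_to_file = {}          # the "ID -> file" catalog

    def store(self, obj, filename):
        """Stream the object to a file and return an opaque object ID."""
        object_id = str(uuid.uuid4())
        with open(filename, "wb") as f:
            pickle.dump(obj, f)        # streaming service stand-in
        self._id_to_file[object_id] = filename
        return object_id

    def retrieve(self, object_id):
        """Resolve the ID through the catalog and read the object back."""
        filename = self._id_to_file[object_id]   # locator service stand-in
        with open(filename, "rb") as f:
            return pickle.load(f)

# Usage: the caller only ever sees IDs, never file names.
pm = PersistencyManager()
oid = pm.store({"run": 1, "event": 42, "tracks": [0.5, 1.2]}, "event42.dat")
event = pm.retrieve(oid)
print(event["tracks"])   # -> [0.5, 1.2]
```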

Slide 30: Coming Applications RTAGs
- After a very long dry spell (since Jan), the SC2 has initiated the first stage of setting up additional projects: establishing requirements and technical advisory groups (RTAGs) with 2-3 month durations
  - Detector geometry and materials description
    - To address the high degree of redundant work in this area (in the case of ATLAS, even within the same experiment)
  - Applications architectural blueprint
    - High level architecture for LCG software
- Pending RTAGs in applications
  - Physics generators (launched yesterday)
  - A fourth attempt at a simulation RTAG in the works (politics!)
  - Analysis tools (will follow the blueprint RTAG)

Slide 31: Major LCG Milestones
- June 2003: LCG global grid service available (24x7 at 10 centers)
- June 2003: Hybrid event store release
- Nov 2003: Fully operational LCG-1 service and distributed production environment (capacity, throughput, availability sustained for 30 days)
- May 2004: Distributed end-user interactive analysis from Tier 3
- Dec 2004: Fully operational LCG-3 service (all essential functionality required for the initial LHC production service)
- Mar 2005: Full function release of the persistency framework
- Jun 2005: Completion of the computing service TDR

Slide 32: LCG Assessment
- In the computing fabrics (facilities) area, LCG is now the context (and funding/effort source) for CERN Tier 0/1 development
  - But countries have been slow to follow commitments with currency
- In the grid middleware area, the project is still trying to sort out its role as 'not just another grid project'; not yet clear how it will achieve the principal mission of ensuring the needed middleware is available
- In the deployment area (integrating the above two), testbed/DC plans are taking shape well, with an aggressive mid-03 production deployment
- In the applications area, the persistency project seems on track, but politics etc. have delayed the initiation of new projects
- The experiments do seem solidly committed to common projects
  - This will change rapidly if LCG hasn't delivered in ~1 year
- CMS is most proactive in integrating the LCG in their plans; ATLAS less so to date (this extends to the US program). I will continue to push (with my 'ATLAS' hat on!) to change this

