Applications Review, RWL Jones, GridPP12


1 Applications Review RWL Jones GridPP12

2 ATLAS
Risks:
– Delayed middleware, rapid changes in framework, competition
Progress:
– New GANGA version (first deliverable) for ATLAS use with:
    plug-in to ATLAS production system through ADA
    plug-in for ATLAS AMI (metadata system)
– Developing ADA job editor and builder, AMI browser
– Support of Grid operations: software kits and distribution tools (ADA package management)
– Support for OS variants
– Frederic Brochu, as production manager, will be part of the Grid tool selection team
Issues:
– Interoperability issues between Grids
– SE issues
    LCG Castor failures
    RLS corruption (transfer and m/w failure)

3 LHCb
Risks:
– Technology incompatibility
– Changing metadata/prodsys requirements
– Dependence on ARDA etc.
Progress:
– No level-1s due until the end of 05Q1.
– Production console prototype advanced because of changed DC schedule.
– Production desktop release in DIRAC was a level-1 in GridPP1, now delivered and in use.
– Bookkeeping interface developed in ARDA context.
    A Tomcat servlet is deployed.
    The XML-RPC service development is suspended in favour of the GANGA/ARDA interface.
Issues:
– Changing DC planning
– Integration with/dependence on GANGA and ARDA.

4 GANGA
Risks:
– Contention between ATLAS and LHCb objectives
– Competition
– Sudden changes in frameworks
– Middleware and metadata delays.
Progress:
– Revision of design underway to optimise link to client systems.
– Updates to job registry to allow reloading of pre-repackaging jobs.
– Improvements in the threading and monitoring of the core modules.
– Athena (Tan, for ATLAS) and Brunel (Soroko, for LHCb) modules plugged in using PyBus. Improvements in the DIRAC and DaVinci application handlers.
– Python bindings for DIAL, access to DIAL classes using PyDial.
    Allows access to the ATLAS production system etc.
– A prototype job submission is implemented.
– Job splitting is being developed and extensive testing is underway.
Issues:
– Slight delays in level-2s because priority was given to experiment-specific features; these are expected to be recovered during Soroko’s visit to CERN. A similar short trip by Harrison and Tan to BNL (2 weeks) has been requested to move things forward.

5 CMS
Risks:
– Everything!
– They list withdrawal of an experiment
Progress:
– Significant contribution to PhEDEx, which could be of more general use.
    Can sustain TB transfers
    Works across LCG and Grid3
– Work towards bookkeeping metadata system
– Unification of Italian (Grape) and UK (Gross) job submission.
    BOSS will be re-engineered to include required functionality
– R-GMA performance studies indicate it will scale; deployment on LCG-2 test bed
    Testing continues in CMS context
Issues:
– They list withdrawal of an experiment as a risk!
– Concerns about disparate effort met in large part by central planning

6 Data Management (PhEDEx)
Manages bulk data transfer for CMS
– Automated, scalable point-to-point and routed multi-hop file transfers
– Reliable: state of transfers always known, full rollback and retry of individual transfers
– Couples update of local file catalogues; triggers local actions on receipt of files or file groups
– Locates nearest replica for transfer
– Resources on Grid3, LCG and non-Grid
– Effectively forms a single datagrid
– Wide variety of tools, storage resources
Now approaching maturity
– TB per day transfers
– Multi-10k replicas per site every few days
– Sustained service imminent
– Talking to LCG re next phase of tool development
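The "locates nearest replica" and "routed multi-hop" bullets above can be illustrated with a minimal sketch. This is not PhEDEx code: the function name `nearest_replica`, the link topology and the link costs are all hypothetical, and a plain Dijkstra search stands in for whatever routing PhEDEx actually uses.

```python
import heapq

# Hypothetical topology: symmetric link costs between sites (illustrative only).
links = {
    "CERN": {"RAL": 1.0, "FNAL": 3.0},
    "RAL": {"CERN": 1.0, "IC": 0.5},
    "IC": {"RAL": 0.5},
    "FNAL": {"CERN": 3.0},
}


def nearest_replica(links, replica_sites, destination):
    """Return (cost, path) from the cheapest replica site to destination.

    Runs Dijkstra outwards from the destination and stops at the first
    site that holds a replica. Returns None if no replica is reachable.
    """
    best = {destination: 0.0}
    next_hop = {}  # next_hop[site] is the neighbour one step closer to destination
    queue = [(0.0, destination)]
    while queue:
        cost, site = heapq.heappop(queue)
        if cost > best.get(site, float("inf")):
            continue  # stale queue entry
        if site in replica_sites:
            # Walk back towards the destination to build the multi-hop route.
            path = [site]
            while path[-1] != destination:
                path.append(next_hop[path[-1]])
            return cost, path
        for neighbour, link_cost in links.get(site, {}).items():
            new_cost = cost + link_cost
            if new_cost < best.get(neighbour, float("inf")):
                best[neighbour] = new_cost
                next_hop[neighbour] = site
                heapq.heappush(queue, (new_cost, neighbour))
    return None
```

With the made-up costs above, a request at IC for a file held at CERN and FNAL would be routed from CERN via RAL, since that is the cheaper of the two replicas.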

7 Grid Analysis (GrOSS)
GrOSS
– Simple interface to LCG for physicist user
– Handles job splitting, prep, archiving, monitoring on the Grid
– Currently uses BOSS for job submission, tracking
– User options specified in extended classAds
– No changes required to applications
Latest developments
– Rationalisation of GrOSS and similar tools within CMS
– Joint development plan established
– Timescale of few months
– New tool will incorporate GrOSS functionality directly into BOSS with a scriptable Python user interface layer
– UK leading the integration process
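As a rough illustration of the job splitting mentioned above, the sketch below divides a total event count into fixed-size sub-jobs. It is an assumption about what splitting might look like, not the GrOSS or BOSS implementation; `split_job` and the `first_event`/`n_events` keys are hypothetical names.

```python
def split_job(total_events, events_per_job):
    """Split total_events into consecutive sub-jobs of at most events_per_job.

    Each sub-job is described by its first event index and event count.
    """
    if total_events < 0 or events_per_job <= 0:
        raise ValueError("total_events must be >= 0 and events_per_job > 0")
    jobs = []
    first = 0
    while first < total_events:
        # The last sub-job takes whatever remains.
        count = min(events_per_job, total_events - first)
        jobs.append({"first_event": first, "n_events": count})
        first += count
    return jobs
```

Each resulting dictionary could then be turned into one Grid submission, with the application itself left unchanged.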

8 Monitoring (R-GMA / BOSS)
R-GMA development / performance testing
– Promised functionality is now present
– Initial tests with ~1 minute jobs suggest that it will scale to the entire CMS data processing system
– Tests with more realistic jobs required
Latest developments
– Rolled out in LCG-2.2.0
– Evaluating for transport layer beneath CMS BOSS tool
– Links strongly to UK Grid Analysis work
Status
– Pushing to get R-GMA actually deployed on LCG testbed
– Started testing with longer CMS simulation jobs
– R-GMA is steadily becoming less fragile…

9 SAMGrid
Risks:
– Almost everything
– They also list withdrawal of an experiment
Progress:
– CDF: integrating SAM and Grid3
– Is this covered by JIM/RunJob deliverables?
    Basic job submission demonstrated.
    JIM installation at Oxford and Glasgow.
    SAM v6 switch took longer than needed because of a memory leak in db-server.
– D0: installing and testing SAMGrid at IC to allow P17 reprocessing. Adding reprocessing to D0RunJob. (Level-2s on track)
– SAM interfaces underway to allow the reprocessing to be of general use.
Issues:
– They list withdrawal of an experiment as a risk!
– Managerial: need for more clearly defined Level-3s
– CDF plans may face major changes after central review
– When do the Regional Analysis Centres become live and start reporting?

10 BaBar
Risks:
– LCG and other middleware
– Grid divergence/interoperability
– Breaking dependence on Objectivity
Progress:
– Requirement capture for distributed analysis complete.
– Grid submission Manchester/RAL, but problems with conditions db access, environment variables
– RLS read-write access to Italy demonstrated
Issues:
– Reporting!
– Effect of late hiring on delivery

11 QCDGrid
Risks:
– External middleware
– Staff retention
– Change of project scope
Progress:
– ILDG looks likely to adopt QCDgrid software
    C++ + Globus + EDG security
– Risk analysis and stress testing requirement capture complete
– Good progress on stress testing itself and software installation and distribution on UKQCD machines.
– QCDOC is now installed and being shaken down in Edinburgh along with ‘Tier 1’ 50 Tbyte store.
Issues:
– Contact/overlap with other GridPP activities

12 PhenoGrid
Risks:
– No risk assessment
Progress:
– ?
Issues:
– Reporting!
– Effect of late hiring on delivery

13 Portal
Risks:
– Beaten to the task by others
Progress:
– Requirements complete but technical review document not on time.
    Needs discussion with Rob Allen
– First portlet built allowing GridSphere application access to DN server
    can issue visual_grid_proxy_init using jnlp.
Issues:
– Need swift progress
– Links with other projects need to be established

