Distributed Analysis at the LCG
Torre Wenaus, BNL/CERN
LCG Applications Area Manager
http://cern.ch/lcg/peb/applications
Caltech Grid Enabled Analysis Workshop, June 24, 2003
Torre Wenaus, BNL/CERN GridPP meeting, July 1, 2003

Slide 2: Distributed Analysis Related Activity at the LCG
- Middleware requirements and use cases arising from distributed analysis (GAG, HEPCAL)
  - See Ruth’s talk
- Analysis modelling (Grid Technology Area)
  - See Kathrin Paschen’s talk
- Distributed analysis application layer (Applications Area)
- ‘ARDA’ RTAG
- Hopes for this meeting
Slide 3: Applications Area Activity
[Figure legend] Products mentioned are examples, not a comprehensive list. Blue: common activity. Grey: experiment specific.
Slide 4: Distributed Analysis in the Applications Area
- Anticipated activity:
  - Grid interfaces to the experiments: interfaces to physicist end users, and grid-enabled services serving higher-level applications and frameworks
  - Integration/adaptation of physics applications software in the grid environment
- Prerequisite: a mandate coming from agreement among experiments on common work
  - Via an ‘RTAG’: Requirements and Technical Assessment Group
  - Distributed Analysis RTAG established just a week ago
- But even in the absence of a mandate, we have started limited, focused work because we have two people hired explicitly to work on distributed analysis
  - Development of a remote launch service
  - Task agreed upon a week ago, and now starting
Slide 5: Remote Launch Service
- A ‘grid service’ in the LCG architecture
- Remotely launch the clients and/or masters making up a distributed parallel interactive analysis task
  - Using grid middleware
  - Providing immediate launch and responsiveness
- A generic service usable in different analysis tool contexts
  - The service will be integrated and used in both PROOF and Ganga
  - i.e., integrated with ROOT/CINT and as a Ganga Python module
- What middleware can/should we use?
  - Looking first at Condor ‘Computing On Demand’ (COD), which appears to have the specs we need
  - Very interesting talk by Derek Wright at http://www.cs.wisc.edu/condor/CondorWeek2003/presentations/
  - Maarten Ballintijn may already have Condor/PROOF COD working?
  - Looking forward to PROOF demo
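
As an illustration of the "generic service usable in different analysis tool contexts" idea, here is a minimal Python sketch, not the LCG design. Every name in it (`LaunchRequest`, `LaunchBackend`, `LocalBackend`, `RemoteLaunchService`) is hypothetical. The backend that would talk to grid middleware such as Condor COD is replaced by a stand-in that starts local processes, so only the shape of the interface is shown.

```python
import subprocess
import sys
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import List


@dataclass
class LaunchRequest:
    """What to start and where (all fields are illustrative assumptions)."""
    role: str            # e.g. "master" or "client"
    host: str            # target resource; ignored by the local stand-in
    command: List[str]   # executable and its arguments


class LaunchBackend(ABC):
    """Middleware-specific part; a real backend would call grid middleware."""

    @abstractmethod
    def launch(self, request: LaunchRequest) -> subprocess.Popen:
        ...


class LocalBackend(LaunchBackend):
    """Stand-in backend: starts the process on the local machine."""

    def launch(self, request: LaunchRequest) -> subprocess.Popen:
        return subprocess.Popen(request.command, stdout=subprocess.PIPE, text=True)


class RemoteLaunchService:
    """Tool-independent front end that different analysis tools could share."""

    def __init__(self, backend: LaunchBackend) -> None:
        self.backend = backend

    def launch_task(self, requests: List[LaunchRequest]) -> List[subprocess.Popen]:
        # Start every process without waiting for the previous one to finish.
        return [self.backend.launch(r) for r in requests]


def demo() -> List[str]:
    """Launch one 'master' and one 'client' and collect what they print."""
    service = RemoteLaunchService(LocalBackend())
    requests = [
        LaunchRequest(role, "localhost",
                      [sys.executable, "-c", f"print('{role} ready')"])
        for role in ("master", "client")
    ]
    procs = service.launch_task(requests)
    return [p.communicate()[0].strip() for p in procs]
```

Keeping the middleware behind `LaunchBackend` is what would let the same front end be wrapped once for ROOT/CINT and once as a Ganga Python module; a real handle type would also need to be independent of `subprocess`.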
Slide 6: Other Distributed Analysis Tasks
- Before ‘remote launch service’ was chosen as an initial distributed analysis task, others were proposed and considered
  - An indication of (some of) what is seen to be missing for interactive analysis
- Proposed tasks were
  - Grid-based control/communication service used between interactive masters/clients
    - Development of an OGSA(-like?) service making use of GSI
    - Is no middleware project going to provide us with this essential service?
  - Interface to datasets/file catalogs including querying on tags, LFN, etc., i.e., a dataset service
  - Interface to resource broker to find the best location(s) at which to run the query, based on the dataset and interactive availability
    - Do today’s resource brokers understand distributed interactive analysis? Will tomorrow’s?
- Comments on these and on how best to use 1-1.5 FTEs on distributed analysis are welcome
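
To make the dataset-service and resource-broker tasks concrete, the following toy Python sketch resolves a tag query to logical file names (LFNs) and then ranks sites by how much of the dataset they hold and how many interactive slots they have free. The data model, the function names and the ranking rule are all assumptions made for illustration; no existing catalog or broker API is implied.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass
class Site:
    name: str
    free_interactive_slots: int
    lfns: Set[str]  # logical file names with a replica at this site


def query_dataset(catalog: Dict[str, Set[str]], tag: str) -> List[str]:
    """Toy dataset service: return the LFNs whose tag set contains `tag`."""
    return sorted(lfn for lfn, tags in catalog.items() if tag in tags)


def rank_sites(sites: List[Site], lfns: List[str]) -> List[Site]:
    """Order usable sites: most of the dataset held first, then most free slots."""
    wanted = set(lfns)
    usable = [s for s in sites
              if s.free_interactive_slots > 0 and s.lfns & wanted]
    return sorted(
        usable,
        key=lambda s: (len(s.lfns & wanted), s.free_interactive_slots),
        reverse=True,
    )


def best_site(sites: List[Site], lfns: List[str]) -> Optional[Site]:
    """Return the top-ranked site, or None if no site can serve the query."""
    ranked = rank_sites(sites, lfns)
    return ranked[0] if ranked else None


def example() -> Optional[str]:
    """Made-up catalog and sites, used only to exercise the functions above."""
    catalog = {
        "lfn:/run1/a.root": {"dimuon", "2003"},
        "lfn:/run1/b.root": {"dimuon"},
        "lfn:/run2/c.root": {"dijet"},
    }
    sites = [
        Site("site_a", 0, {"lfn:/run1/a.root", "lfn:/run1/b.root"}),  # no free slots
        Site("site_b", 4, {"lfn:/run1/a.root"}),                      # partial data
        Site("site_c", 2, {"lfn:/run1/a.root", "lfn:/run1/b.root"}),  # full data
    ]
    chosen = best_site(sites, query_dataset(catalog, "dimuon"))
    return chosen.name if chosen else None
```

In this example `site_a` is skipped despite holding the whole dataset because it has no interactive capacity, which is the kind of interactive-availability awareness the slide asks whether resource brokers have.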
Slide 7: RTAG on An Architectural Roadmap towards Distributed Analysis (ARDA) 1)
- Observation:
  - Different LHC experiments have developed packages (AliEn, Ganga, Dirac, Impala, Boss, Grappa, Magda…) that sit on top of, complement, expand or parallel the functionality of the Grid middleware (VDT, EDG…)
  - At this time the LCG is coming to grips with the middleware development requirements
  - There is an expectation that an OGSA Services Architecture will be the basis for future development
  - The experiments need to specify in their TDRs their baselines, fallback and development strategies
- Motivation:
  - To agree on requirements as laid out in a first step by recent work within the GAG and identify commonalities within the current projects which might allow the LCG (both in the AA and GTA areas) to provide a focus of effort
  - To provide guidance to the LCG on future middleware development directions and interfacing work to match the experiment requirements
  - To build on the richness of the current technical solutions to avoid duplication of efforts
  - To clearly identify the roles and responsibilities of the components/layers/services in the experiment DA planning
  - To give guidance to the community on the expected division of work between the experiments, the LCG and the external projects

1) Arda was the name given by the Elves to their World and all it contained; see www.glyphweb.com/arda/
Slide 8: Mandate for the ARDA RTAG
- To review the current DA activities and to capture their architectures in a consistent way
- To confront these existing projects with the HEPCAL II use cases and the user's potential work environments in order to explore potential shortcomings
- To consider the interfaces between Grid, LCG and experiment-specific services
  - Review the functionality of experiment-specific packages, state of advancement and role in the experiment
  - Identify similar functionalities in the different packages
  - Identify functionalities and components that could be integrated in the generic GRID middleware
- To confront the current projects with critical GRID areas
- To develop a roadmap specifying wherever possible the architecture, the components and potential sources of deliverables to guide the medium-term (2 year) work of the LCG and the DA planning in the experiments
Slide 9: Schedule and Makeup of ARDA RTAG
- The RTAG shall provide a draft report to the SC2 by September 2003
  - It should contain initial guidance to the LCG and the experiments to inform the September LHCC manpower review, in particular on the expected responsibilities of
    - The experiment projects
    - The LCG (development and interfacing work rather than coordination work)
    - The external projects
- The final RTAG report is expected for October 2003
- The RTAG shall be composed of
  - Two members from each experiment
  - Representatives of the LCG GTA and AA
  - If not included above, the RTAG shall co-opt or invite representatives from the major Distributed Analysis projects and non-LHC running experiments with DA experience
Slide 10: This Meeting
- I hope this meeting can give a kick start to the RTAG…
- Informed by a survey of what exists (code, use cases) now,
  - What are the components/layers/services required specifically for distributed analysis?
  - What software currently exists or is in the works to cover these?
  - Can an architecture that is realizable in the near term be blocked out? Can it be agreed on?
    - On the principle that we have to start with realizable architectures and tools and build upwards incrementally over time
  - With due consideration for the R&D nature of present work, can we work in a coherent and complementary way?
  - Can we identify elements which should be pursued as common solutions?
  - When we confront current middleware with our needs, what is missing? How will the holes be filled?