Data Management Overview David M. Malon Argonne U.S. LHC Computing Review Berkeley, CA 14-18 January 2003.

1 Data Management Overview David M. Malon Argonne U.S. LHC Computing Review Berkeley, CA 14-18 January 2003

2 Outline
15 January 2003, David M. Malon, ANL, U.S. LHC Computing Review
• Technology transition
• Architecture and design
• Support for data challenges
• LHC-wide common projects
• Database support for detector description
• Other collaborative efforts: conditions databases
• Some challenges on the horizon
• Summary

3 Technology transition
• Objectivity/DB has been the ATLAS baseline, and the persistence technology for Data Challenge 0, but will be phased out this year
• Currently retained as a reference implementation
• ATLAS Software Release 6.0.0 (January 2003) and the developer releases leading up to it will not depend upon Objectivity/DB (on schedule)
• Migration of data in Objectivity/DB databases based upon earlier releases is planned for 2003
• Technology strategy is to adopt the LHC-wide common persistence infrastructure (hybrid relational and ROOT-based streaming layer) as soon as this is feasible
• A U.S.-developed ROOT-based conversion service provides the persistence technology for at least Phase I of Data Challenge 1
• This, too, will be phased out when common project software is sufficiently capable

4 Technology transition (2)
• ATLAS architectural separation of transient and persistent representations has meant that the transition has been relatively painless for physicists and physics software developers
• Not so painless for the database group, partly because of the need to support multiple technologies simultaneously with limited personpower
• But AthenaROOT conversion services provide valuable prototyping for LHC common project work
• Short-term problem in any case
• Complicated by the need to support data access inside and outside Athena
• Remember that Geant3 simulations are still in FORTRAN
• When LHC common project dictionary infrastructure (see David Quarrie’s talk) is more mature, such transitions may be substantially easier
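The separation of transient and persistent representations described on this slide can be illustrated with a minimal sketch. It is not Athena code: the `Track` class, the `Converter` interface, and both backends are hypothetical names chosen for illustration, and JSON and pickle merely stand in for two interchangeable persistence technologies. The point it shows is that client code written against the transient class does not change when the backend is swapped.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
import json
import pickle


@dataclass
class Track:
    """Transient object seen by physics code (hypothetical example class)."""
    pt: float
    eta: float


class Converter(ABC):
    """Hypothetical conversion-service interface between the two representations."""

    @abstractmethod
    def to_persistent(self, obj: Track) -> bytes:
        ...

    @abstractmethod
    def to_transient(self, data: bytes) -> Track:
        ...


class JsonConverter(Converter):
    """Stand-in for one persistence technology."""

    def to_persistent(self, obj: Track) -> bytes:
        return json.dumps({"pt": obj.pt, "eta": obj.eta}).encode()

    def to_transient(self, data: bytes) -> Track:
        fields = json.loads(data.decode())
        return Track(fields["pt"], fields["eta"])


class PickleConverter(Converter):
    """Stand-in for a second persistence technology."""

    def to_persistent(self, obj: Track) -> bytes:
        return pickle.dumps((obj.pt, obj.eta))

    def to_transient(self, data: bytes) -> Track:
        pt, eta = pickle.loads(data)
        return Track(pt, eta)


def round_trip(track: Track, converter: Converter) -> Track:
    """Client code: depends only on Track and the Converter interface."""
    return converter.to_transient(converter.to_persistent(track))
```

Calling `round_trip(Track(1.5, 0.2), JsonConverter())` and `round_trip(Track(1.5, 0.2), PickleConverter())` yields the same transient object, which is the property that keeps a technology change out of the physicists' view.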

5 Architecture and design
• U.S.-led effort produced an event store architecture document last fall
• Since the last review, a U.S.-led effort produced a hybrid (relational/streaming) event store design document, using the architecture document as a starting point
• Represents the most detailed thinking by any of the LHC experiments about how to build a hybrid store
• Circulated to other LHC software architects, and the principal subject of an April database workshop in Orsay
• CERN IT/DB and ROOT team experts attended as well
• Not all of the ideas will survive an LHC-wide common project, but many will, and they provide a non-trivial starting point for LHC-wide discussions in any case

6 Support for data challenges
• Data Challenge 0 was finally completed (Summer ’02), when Data Challenge 1 was well underway(!)
• It’s a good thing, though, that ATLAS did not simply declare success without serious continuity tests; these should be the true legacy of DC0
• Database group was therefore supporting two different persistence technologies (Objectivity/DB and AthenaROOT) for these data challenges
• Also supporting event generation for both data challenges, to different extents
• Seizing the opportunity to introduce grid project technologies into ATLAS data challenges
• Magda from PPDG
• Virtual data ideas from GriPhyN in event generation and simulation recipes, even in advance of the release of the GriPhyN VDL toolkit (Chimera)

7 Support for data challenges (2)
• U.S. database group has been trying to avoid losing too many developers to (worthwhile) day-to-day data challenge production responsibilities
• Not being involved in DC production management, though, makes it harder to push for prototyping and adoption of tools developed by U.S. grid projects (e.g., virtual data infrastructure from GriPhyN)
• Leads inevitably to some ad hoc solutions to data challenge problems
• CERN-based database effort (Goossens, Smirnov), though, has been largely lost to the data challenges
• ATLAS Data Challenge Coordinator (Poulard) is also the CERN group leader
• Update: Smirnov is leaving CERN to join the Chicago iVDGL team
• Note that there has been a significant increase in DC involvement by U.S. ATLAS grid testbed participants

8 LHC-wide common projects
• First RTAG (Requirements Technical Assessment Group) commissioned by SC2 was to try to find sufficient common ground for an LHC-wide project to deliver a shared persistence infrastructure
• RTAG membership: Brun (ROOT), Duellmann (IT/DB), Innocente (CMS), Malon (ATLAS; convenor), Mato (LHCb), Rademakers (ALICE)
• Succeeded in producing a document and achieving consensus sufficient to launch a common project
• Final report delivered 5 April 2002
• Proposes a ROOT-based streaming layer plus a relational database layer
• Persistence project launch workshop held 5-6 June 2002 at CERN
• Quarrie and Malon also represent ATLAS in the LCG Architects Forum (and there are other RTAGs)
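The hybrid idea proposed by the RTAG, a streaming layer for object data plus a relational layer beside it, can be sketched roughly as follows. This is only an illustration under loud assumptions: `pickle` into an in-memory byte stream stands in for ROOT-based streaming, an in-memory SQLite database stands in for the relational layer, and the `HybridStore` class, its table layout, and the `n_tracks` attribute are all hypothetical. It is not the POOL design or API.

```python
import io
import pickle
import sqlite3


class HybridStore:
    """Toy hybrid store: streamed payloads plus a relational index (illustrative only)."""

    def __init__(self):
        # Streaming layer stand-in: one append-only byte stream of serialized events.
        self.stream = io.BytesIO()
        # Relational layer stand-in: queryable metadata and payload locations.
        self.db = sqlite3.connect(":memory:")
        self.db.execute(
            "CREATE TABLE events ("
            "run INTEGER, event INTEGER, n_tracks INTEGER, "
            "offset INTEGER, length INTEGER)"
        )

    def write(self, run, event, payload):
        """Stream the payload, then record where it lives and a selectable attribute."""
        data = pickle.dumps(payload)
        self.stream.seek(0, io.SEEK_END)
        offset = self.stream.tell()
        self.stream.write(data)
        self.db.execute(
            "INSERT INTO events VALUES (?, ?, ?, ?, ?)",
            (run, event, len(payload["tracks"]), offset, len(data)),
        )

    def select(self, min_tracks):
        """Query the relational layer, then read only the matching payloads."""
        rows = self.db.execute(
            "SELECT offset, length FROM events "
            "WHERE n_tracks >= ? ORDER BY run, event",
            (min_tracks,),
        ).fetchall()
        selected = []
        for offset, length in rows:
            self.stream.seek(offset)
            selected.append(pickle.loads(self.stream.read(length)))
        return selected
```

The design choice being illustrated is that selection happens in the relational layer, so only the selected objects are ever deserialized from the streaming layer.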

9 LHC common persistence infrastructure (POOL)
• Workshop produced agreement to attempt to meet a rather aggressive schedule: a September 2002 release with non-trivial functionality, and a Spring 2003 release sufficient to support serious data challenges
• ATLAS database group is fully committed to contributing to this effort and to adopting this technology
• To be clear: the common project infrastructure that POOL will provide IS our baseline event store technology
• All event-store-related ATLAS database development is planned to be a contribution to or an extension of POOL, or an integration of POOL into the ATLAS/Athena environment
• U.S. is contributing approximately 2 FTEs directly to common project development; this should increase as Objectivity responsibilities wane
• Orsay plans to contribute ~1 FTE

10 U.S. contributions to POOL
• We are attempting to avoid mission creep in the common project as well, by participating in selected, clearly defined work packages
• Common project event collections and collection management (ANL)
• Contributions to persistence for non-ROOT objects (BNL)
• Craig Tull (LBNL) is contributing to the dictionary effort
• We have also volunteered for management of external MySQL-related packages (ensures coherence of ATLAS and LHC-wide versions)
• Both ANL and BNL (Malon, Adams) will continue to contribute to overall common project architecture and design
• We have established “local” liaisons with Fermilab-based CMS contributors to the common project (Joshi, Tanenbaum)

11 Integration of POOL into ATLAS
• First really testable release of POOL delivered in December
• U.S. ATLAS plans to deliver an Athena conversion service based on POOL in Release 6.0.0 (January ’03) if the POOL release is sufficiently capable
• Likely to be a prototype, for use by database developers
• First “production” release of POOL is due in Spring ’03
• First ATLAS “production” conversion service based upon POOL will follow (Release 7.0.0, tentatively)

12 Database support for detector description
• Database group was asked in April by the detector description working group to provide Athena applications with access to “primary numbers” (numbers that parameterize the ATLAS geometry description)
• U.S. database group delivered this access in August, via a conversion service that respects the Gaudi/Athena architecture
• Numbers are resident in a MySQL database
• Approach strongly leverages NOVA work, funded at BNL as an LDRD project
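As a rough illustration of serving geometry parameters from a relational database, the sketch below loads named numbers for one subsystem into a dictionary. Everything in it is an assumption made for the example: SQLite stands in for MySQL so the code is self-contained, the `primary_numbers` table and its columns are hypothetical rather than the NOVA schema, and the subsystem names and values are placeholders, not real ATLAS geometry.

```python
import sqlite3


def make_demo_db():
    """Build an in-memory database with placeholder parameters (not real geometry)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE primary_numbers (subsystem TEXT, name TEXT, value REAL)"
    )
    conn.executemany(
        "INSERT INTO primary_numbers VALUES (?, ?, ?)",
        [
            ("demo_barrel", "inner_radius", 10.0),  # placeholder value
            ("demo_barrel", "outer_radius", 25.0),  # placeholder value
            ("demo_endcap", "half_length", 40.0),   # placeholder value
        ],
    )
    return conn


def load_primary_numbers(conn, subsystem):
    """Return {name: value} for one subsystem, as a client application might request."""
    rows = conn.execute(
        "SELECT name, value FROM primary_numbers WHERE subsystem = ?",
        (subsystem,),
    )
    return {name: value for name, value in rows}
```

In the setup the slide describes, a lookup of this kind would sit behind a conversion service, so that applications ask for transient parameter objects and never issue SQL themselves.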

13 Other collaborative efforts: conditions databases
• Strategy has been to use an IT-provided conditions database if possible, rather than writing such a service ourselves
• IT implementations, though, are in Objectivity/DB and Oracle9i
• Lisbon ATLAS group has delivered a MySQL implementation for the TDAQ community; we have enlisted their help, and encouraged them to contribute this to the LCG repository (done!)
• Concurrently, the TDAQ implementation has been integrated into ATLAS offline software releases
• U.S. database group has refrained from work in this area in an effort to avoid overcommitment, but real work is needed soon to connect the conditions infrastructure to Athena
• Planned for Release 6.0.0
• We have organized a conditions database workshop for 4-5 February
• Challenge: Muon test beam in April(?!)

14 Some challenges on the horizon
• Development of an architecture based upon POOL
• POOL project is currently delivering building blocks upon which an architecture might be based
• Will be helped by Blueprint RTAG work
• U.S. ATLAS is in a position to take a leadership role in articulation of a common project persistence architecture, thanks to prior work on ATLAS database architecture and hybrid event store design
• Development of a coherent approach to relational database services
• Currently many, many MySQL applications in ATLAS; ad hoc approach to access and services
• Should be addressed on an LHC-wide level
• ATLAS database group is currently formulating strawman requirements, near-term deployment plans, and proposed longer-term strategies as starting points for LHC-wide discussions
• Presentation made to LCG audience in November 2002

15 Summary
• We are in the midst of a major technology transition while supporting data challenges, with no reserve personpower
• No replacement for Ed Frank (Chicago), delayed ramp-up at BNL; U.S. database group is funded at a level that is less than “bare bones”
• Have lost an additional 0.5 FTE at Argonne; expect restoration in ’03
• Committed to ensuring the success of the LCG persistence project, and to using the resulting infrastructure as the principal ATLAS persistence technology
• Relying upon joint projects and leveraging other projects wherever possible (LCG (POOL), CERN IT and ATLAS/TDAQ (conditions), PPDG (Magda), GriPhyN (virtual data), LDRD (NOVA for primary numbers), HENP Grand Challenge (POOL event collections and iterators), …)

