Working Group 4 Facilities and Technologies

1 Working Group 4 Facilities and Technologies
Stephen Wolbers (Fermilab) and Yves Kemp (DESY), co-chairs
Workshop on Data Preservation and Long Term Analysis in HEP
SLAC, May 26-28, 2009

2 Outline
- Charge to WG4 and Overview
- Methodology and responses
- Overall Results
- Plans for the workshop
May 26-29, 2009 Stephen Wolbers

3 Charge and Overview
The Working Subgroup (WG4), Facilities and Technologies, was charged to provide:
- A survey and assessment of existing infrastructures in HEP and their adaptability to data preservation requirements.
- A reflection on the impact of new technologies on data preservation methods.

4 Working Group Methodology
There are two co-leaders of this working group:
- Yves Kemp (DESY)
- Stephen Wolbers (FNAL)
We decided to survey as many institutions and experiments as possible in a short time to get a broad overview of what the various institutions practice and what their official policy is with respect to data management.
- 10 data centers and 9 experiments were contacted.
- 9 centers replied.
- 3 experiments replied.

5 Questions in the survey
Questions were developed rather unscientifically, but with the goal of learning how the various concerned parties are thinking about data preservation and long term analysis.
- General questions about data management and storage, backup, and related policies (6)
- Questions related to databases, code and other types of data required for analysis (2)
- Question about VM technology (1)
Although the questions are directed at centers, experiments were asked to comment on the same questions to gauge what they believe they will need from the centers and/or will develop with the centers.

6 Results from the survey
The data centers are remarkably consistent in many ways:
- The size of the data storage.
- Practices for managing the data.
- Lack of formality in dealing with data.
- Access and authentication practices.
- Basic architecture for storage and access.
- Handling of “non-data” files such as databases, code, user files, calibrations and other constants, documentation, etc.
- Technology investigations.

7 Results from the survey (2)
Differences seen from the survey:
- Robotic and tape technologies.
- HSM software.
- Disk cache and details of access methods.
- Expectations for long term preservation and access.
- Size of data managed:
  - All are quite large by current standards.
  - Some are just “larger”.
It is hard to tell whether the differences are essential or merely represent different initial conditions and working points, but it is most likely the latter.

8 Results from the Survey (3)
Experiments:
- The experiments have less to say on these matters; at least, that is what we have found so far.
- These matters are likely to be addressed in the other working groups.
- The requests will likely fall into the following categories:
  - Need for long-term agreements for data storage and access.
  - Formal agreements.
  - Need for access to all of the myriad other information required for analysis.

9 What’s missing from the survey
We put together this survey rather quickly. We neglected some possible technologies or techniques which may be important as we move forward:
- Grid technologies and techniques (but these are used in any case)
- Cloud technologies (still new and not well understood)
- Other emerging ideas such as Digital Libraries, INSPIRE, etc.
- Other new ideas.
- Resources (tape drives, tapes, CPU, networks)
- Regulatory, security, budget, priority issues.
- Workflow systems

10 Experiments' views
In general the experiments rely on the data centers for facilities and services.
- Hope for the best.
- Normally get what is required.
Experiments are in a strong position when:
- The experiment is taking data
- Results are coming out
- Press releases are common and frequent
Experiments are in a much weaker position when:
- The experiment is no longer taking data
- No upgrades are planned
- Fewer results, fewer publications, fewer collaborators are active, etc.

11 Facilities View
Computing Facilities will need a reason to invest resources in maintaining access to data. Part of the mission of these workshops is to develop the case.
Data preservation is not normally:
- Mandated
- Funded
- High priority
- Part of an agreement
- Technologically challenging or leading-edge
- High demand
- Headline or PR enabling

12 Dangers
One would like to protect against:
- Change of mission
  - Lab
  - Computing organization
  - Other
- Change of leadership
- Budget troubles
- Loss of expertise

13 Other Issues
Many of the issues for the facilities are important. But there are differences:
- Size of the facility and its future.
  - Larger facilities with large increases in data size find it easier to deal with migration and access to older data.
- Mission of the facility
  - Single-purpose facility
  - Multi-purpose facility
- Funding

14 Cost
Nothing is free:
- People (they are also busy with other things)
- Expertise
- Equipment
- Software
Facility managers will require some sort of funding mechanism and justification to maintain the functionality and capability of systems. This is not normally a part of their mandate.
Funding agencies and/or the management of the facility as a whole are not completely flexible.

15 Plans for this workshop and for future work
- We should examine the resources required to provide facilities and services for:
  - Data preservation and migration
  - Access to the data
  - DB, code, auxiliary file preservation and access
  - Documentation
  - Web pages
- Ideas for what is needed to negotiate agreements
- Plan for and understand “technology migration”
- Many other ideas that will come from the discussions.

16 Some specific suggestions (thanks to Yves)
- Virtualization projects.
- Cloud computing trials, a la Belle.
- Interface of HSM from job to data.
  - SRM, dcap, rfio, etc.
- More will be added…

17 Specific plan
- Well-defined goals based on input from the survey and from the representatives here.
- Assignment of work and timescales, including authorship responsibility.
- Some attempt at coordination with other groups to ensure that the scope is consistent with what other groups are doing.

