Data Preservation in HEP Use Cases, Business Cases, Costs & Cost Models Grid Deployment Board International Collaboration for Data.

Presentation transcript:

Data Preservation in HEP Use Cases, Business Cases, Costs & Cost Models Grid Deployment Board International Collaboration for Data Preservation and Long Term Analysis in High Energy Physics

DPHEP-fest
Today:
Monday Oct 14:
– Update on progress since CHEP 2013
Wednesday Oct 16: DPHEP CHEP
– DPHEP “Common Projects”;
– Moving from a “Problem Statement” (Blueprint) to Services, Solutions and Projects

DPHEP Implementation Board
Equivalent to GDB / MB for DPHEP
– Indico:
– Twitter:
– Mail archives: https://groups.cern.ch/group/DPHEP-IB/default.aspx

DP in the Wider Context
Many projects / disciplines active
At least some “mature” in many aspects
– We can profit a lot by collaboration (bi-directional)
International / inter-disciplinary coordination:
– Alliance for Permanent Access (APA) [ executive board candidate ]
– RDA Preservation e-Infrastructure Interest Group [ vice-chair ]
Several relevant conferences / workshops:
– APA
– iDCC
– iPRES
– PV
– (RDA)

High Level Strategy wrt Others
Make “them” aware of us
– “Them” = other projects, funding agencies, …
Clarify what we can offer
– e.g. “bit preservation” at 100PB -> 1EB scale
This seems to be working

The remainder of this talk will concentrate on:
– Use Cases;
– Associated Business Cases;
– Costs & Cost Models.
Why is this relevant for the GDB?
– Because there are messages and implications for the funders
– As there may well be service and other implications (“best practices”)
– Because members of the GDB can provide input to the elaboration of the costs & cost models
Once we have these we can prepare a “roadmap” for handling the key Use Cases
An analysis of the costs is essential for future work…

DPHEP – 1st Workshop
“The problem is substantial and past experience shows that early preparation is needed and sufficient resources should be allocated.”
“The “raison d’être” of data preservation should be clearly and convincingly formulated, including a viable economic model.”

Use Cases
Three Use Cases have been identified, based on the “Problem Statement(s)” in the DPHEP Blueprint
They are simple enough for discussions with non-experts
They may be over-simplified but IMHO this does not dramatically alter the bottom line

1 – Long Tail of Papers

2 – New Theoretical Insights

3 – “Discovery” to “Precision”

4 – (whatever)
There is a general feeling that “we” should preserve data “forever” “just in case”
No clear business case
An understanding of the costs can help clarify the strategy (e.g. “best effort” – bit preservation + ?)
Preservation of data + software + knowledge beyond human lifetimes not obvious… (Cost benefit analysis)

Use Case Summary
1. Keep data usable for ~1 decade
2. Keep data usable for ~2 decades
3. Keep data usable for ~3 decades
Re-visit after we have understood costs & cost models, plus potential “solutions”

COSTS AND COST MODELS

Costs – Introduction
We do not know exactly what the costs will be in the future
But, we can make estimates, based on our “knowledge” and experience
In some areas these estimates will be relatively accurate
In others, much less so
“Acceptable” costs compared to what?
– Cost of LHC? WLCG? A specific service, such as DB?

A DB Service
Costs include:
– Hardware;
– Licenses & maintenance;
– People.
There is also value = business case
10 = EUR1M/year

Costs of Curation Workshop
Within DPHEP, and in collaboration with external projects (e.g. 4C), we are planning a “no stone left un-turned” workshop
Look at the many migrations we have performed in the (recent) past – plus those foreseen
Estimate / calculate costs
Come up with scenarios for the future:
– 10 year preservation = 3 media migrations + n build systems + p s/w repositories + q O/S versions + …
– 20 year preservation: more disruptive changes
– 30 year preservation: more still
Manpower almost certainly the dominant cost
What can we do to optimize it?
Coordinate validation activities -> service
Streamline emulation activities -> tool-kit(s)
Best practices & support for migration activities -> support activity
Can we do things in a way that costs less in the future – and make our data more “preservational”?
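
The 10-year scenario above is essentially a sum of counted activities, each with its own effort. A minimal sketch of how such a scenario could be tallied is given here. Only the figure of 3 media migrations comes from the slide; the values chosen for n, p and q, every per-activity effort, the cost per FTE-year and the hardware figure are hypothetical placeholders, not DPHEP numbers, and the names `Activity`, `total_effort` and `scenario_cost` are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Activity:
    """One class of preservation work, e.g. media migrations."""
    name: str
    count: int               # how many times it occurs in the period
    effort_fte_years: float  # HYPOTHETICAL effort per occurrence


def total_effort(activities):
    """Total manpower for a scenario, in FTE-years."""
    return sum(a.count * a.effort_fte_years for a in activities)


def scenario_cost(activities, fte_cost_eur, hardware_eur=0.0, licences_eur=0.0):
    """Split a scenario into manpower, hardware and licence costs (EUR)."""
    manpower = total_effort(activities) * fte_cost_eur
    return {
        "manpower": manpower,
        "hardware": hardware_eur,
        "licences": licences_eur,
        "total": manpower + hardware_eur + licences_eur,
    }


# 10-year scenario. "3 media migrations" is from the slide; n, p, q and all
# efforts and prices below are placeholders chosen only to make this runnable.
n, p, q = 2, 1, 3
ten_year = [
    Activity("media migration", 3, 0.5),
    Activity("build system", n, 0.25),
    Activity("s/w repository", p, 0.25),
    Activity("O/S version", q, 0.5),
]

cost = scenario_cost(ten_year, fte_cost_eur=100_000, hardware_eur=50_000)
print(cost)
```

The 20- and 30-year scenarios would reuse the same functions with longer activity lists. Keeping manpower as a separate entry makes it easy to check, for any set of inputs, whether it really is the dominant cost.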

Summary
Your input and experience are needed to make the workshop successful
– Jan 13/14 (or Jan 27/28)
We will start to build the agenda now – output will be a report with costs & cost models
This should help guide our work – and IMHO is a pre-requisite for obtaining funding / resources

Conclusions
Unless there are real surprises (IMHO not consistent with “experiment”), the real and necessary costs of curation are affordable
Affordable means business case is valid / strong
Knowing the numbers can only help

Entity: DPHEP Project Manager
Description: Project management, administrative, technical, funding
Input and Positioning: Main operational coordinator, maintains contacts, organises meetings, leads proposals for funding
Output: Reports to the steering committee