Published by Erika Hardy. Modified over 6 years ago.
1
Data Management Considerations for the International Polar Year
World Data Center for Glaciology, Boulder
Facilitating the international exchange of snow and ice data

“In the midst of the present IGY, with its vast ramifications, its flood of observations, messages, reports, etc. threatening to overwhelm the individual, the just proudness of this … progress is mingled with a sentimental nostalgia [for] IPY2 … IPY2 was like chamber-music compared to the symphony of the present IGY”
-- Julius Bartels, Annals of the International Geophysical Year, Vol. 1, p. 205

Mark A. Parsons, Ronald L. Weaver, Ruth Duerr, and Roger G. Barry
American Geophysical Union, San Francisco, California, 14 December 2004
2
IPY1, IPY2, IGY (IPY3), IPY4?
So if IGY was a symphony, what is IPY4, with its overwhelming volume of data? Let’s hope it’s not cacophony.
3
What will IPY4 bring?
Will you be able to find all the data relevant to your research and see relationships between data sets?
Will you be able to retrieve IPY4 data in 2050?
Will you be able to merge and integrate different data sets across experiments and disciplines?
Will you be able to subset, visualize, and transform your data?
etc.
Data Management Considerations for IPY; Parsons, Weaver, Duerr, Barry; AGU, 14 December 2004
4
Organization of IPY Data Management
Diagram labels: Data Policy & Management Subcommittee; scientists; data managers; funding agencies; IPY Joint Committee; eGY; Programme Office; Data & Information Service; Users; Projects; Data Centers, Virtual Observatories, etc.
From the IPY Framework document, based on recommendations from JCADM, CliC, and the ICSU Priority Area Assessment on Scientific Data and Information.
Note: a service, not a system, but it still needs to be a portal.
The framework recommends, and I am generally assuming, open and free access.
The Data & Information Service (DIS) is the “conductor” that ensures all data components are coordinated and follow best practices.
5
Systems and Innovation
Chart categories: Succeeded, “Challenged”, Failed
Source: The Standish Group’s “CHAOS report”, an assessment of 40,000 IT application projects.
Greater risk with size and complexity.
Be careful of a monolithic data system or overreliance on technology; build on what exists, as stated in the framework.
A federated approach can encourage innovation with less risk.
All bound together with a simple DIS.
The DIS is the conductor, but we all need to know the music.
6
Organization of IPY Data Management
Diagram labels: Data Policy & Management Subcommittee; scientists; data managers; funding agencies; IPY Joint Committee; eGY; Programme Office; Data & Information Service; Users; Projects; Data Centers, Virtual Observatories, etc.
Seeking a step improvement in data management.
7
The People Part
“A striking proportion of project difficulties stem from people in both customer and supplier organisations failing to implement known best practice.” -- Oxford University/Computer Weekly survey of public and private sector IT projects (emphasis added)
However, people are much more able to adapt to change, uncertainty, and messy systems.
Systems don’t solve problems; people do, but they need standards and best practices.
Service counts.
8
The People Part: Science and Data Management
Many have stated the need to involve scientists in data management, but…
It is also important to involve data managers in conducting science.
Field experiments: 20% increase in data quality (Parsons et al., 2004)
70% of experiment cost is data collection (Longley et al., 2001)
Observing systems
The NRC (repeatedly), the US Climate Change Science Program, JCADM, the PAA, and IPY agree on scientific involvement, which enhances usability.
We also saw a ~20% improvement in data quality, increased completeness, and an improved data collection protocol when data managers were involved in data collection for a large field experiment.
This is especially important given all the experiments (~70% of an experiment’s cost is data collection; Longley et al., 2001) and the new observing systems slated to be part of IPY.
9
Preservation and Access—Two Peas in a Pod
Scientific Data Stewardship:
“preservation and responsive supply of reliable and comprehensive data, products, and information for use in building new knowledge to…” -- USGCRP, 1998
“the long-term preservation of the scientific integrity, monitoring and improving the quality, and the extraction of further knowledge from the data” -- H. Diamond et al., NOAA/NESDIS, 2003
Preservation and access have overlapping metadata requirements
Overlapping integrity requirements
Both driven by changing user needs
Both disrupted by changing technology
Therefore: both must be considered during planning, collection, processing, archiving…
“Scientific data and information management can no longer be viewed as a task for untrained amateurs or as part of routine ‘clean up’ at the completion of a research project.” -- ICSU PAA
Describe archive needs as in OAIS (fixity, integrity, etc.) versus access needs such as catalog metadata (also needed for the archive).
However, access still needs work…
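The notes on this slide mention fixity as one of the archive needs. As a rough illustration only, the sketch below assumes fixity is tracked with a SHA-256 checksum recorded at ingest; the function names and the sample record are hypothetical and are not taken from the presentation or from the OAIS Reference Model.

```python
import hashlib


def fixity_digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of a data object's bytes."""
    return hashlib.sha256(data).hexdigest()


def verify_fixity(data: bytes, recorded_digest: str) -> bool:
    """Return True if the bytes still match the digest recorded at ingest."""
    return fixity_digest(data) == recorded_digest


# Hypothetical data object: one line of a snow-depth file.
original = b"snow_depth_cm,12.5\n"
recorded = fixity_digest(original)      # stored alongside the archived object

altered = b"snow_depth_cm,12.6\n"       # a single changed character
print(verify_fixity(original, recorded))
print(verify_fixity(altered, recorded))
```

The same recorded digest can serve both the archive (has the object changed?) and the user (did I receive what the archive holds?), which is one concrete way the integrity requirements of preservation and access overlap.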
10
Access. What is it?
Preservation requirements are well defined in the Open Archival Information System (OAIS) Reference Model, but:
There is no similar model for access requirements; eGY could help
There is not even a common definition of “access” and what restricts it
Unique access requirements for social science data and non-digital collections (physical samples, photographs, audio, etc.)
Access needs relate to the portal. We have OAIS; now we need an access reference model. First should be an open access definition, data integration.
Access relates to data policy, capacity, association with publication, finding, data mining, and data integration.
11
Documentation
Use existing standards, e.g.
ISO 19115 metadata standard
OAIS Reference Model
Describe uncertainty
Challenge your assumptions
“We must not … start from any and every accepted opinion, but only from those we have defined — those accepted by our judges or by those whose authority they recognize.” -- Aristotle, c. 350 BC
No new standards!
Uncertainty and errors (data quality) are different things, e.g. uncertainty in an algorithm vs. errors in measurement.
Some uncertainty is inherent and uncorrectable (Couclelis, 2003).
Lakoff and Johnson (1980) argue that people need a conceptual basis to understand something and that scientists invoke key metaphorical concepts to work observations into a clear, consistent structure.
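One way to read the distinction between uncertainty and errors is that they belong in separate fields of a data record. The sketch below is an assumption made for illustration: the `Observation` class, its field names, and the sample values are hypothetical and do not come from ISO 19115 or from the presentation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Observation:
    value: float
    units: str
    uncertainty: Optional[float] = None  # inherent spread, same units as value
    suspected_error: bool = False        # quality flag: the measurement may be wrong


def describe(obs: Observation) -> str:
    """Summarize a record, keeping uncertainty and error status separate."""
    if obs.uncertainty is None:
        spread = "uncertainty not documented"
    else:
        spread = f"+/- {obs.uncertainty} {obs.units}"
    flag = "flagged as possible error" if obs.suspected_error else "no error flagged"
    return f"{obs.value} {obs.units} ({spread}; {flag})"


# Hypothetical records: the first has documented uncertainty and no known error;
# the second has a suspected measurement error and undocumented uncertainty.
documented = Observation(value=42.0, units="cm", uncertainty=1.5)
suspect = Observation(value=420.0, units="cm", suspected_error=True)
print(describe(documented))
print(describe(suspect))
```

Keeping the two fields apart means a flagged error can later be corrected or removed without losing the statement of uncertainty, which may be inherent and uncorrectable.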
12
The Data Itself
Formats: Archives and users may have different needs
Consider four themes (Raymond, 2004):
Transparency
Interoperability
Extensibility
Storage or transaction economy
Unique considerations with audio/video
There will never be a single standard format, but some good examples are in the works (text, OGC, others). eGY could help here.
13
Data Management Considerations or Themes
In summary:
Manage technical innovation
Systems need people
Scientists and data managers working together
Preservation and access: two peas in a pod
The nature of the documentation
The nature of the data
14
Data Management Principles (bumper stickers)
Preservation without access is pointless; access without preservation is impossible.
It’s about DATA, not systems.
Involve scientists in data management & data managers in science.
Think about long-term archiving NOW!
Document uncertainty!
Keep things simple & flexible.
Consider the needs of current, future, and unknown users.
A good bumper sticker is catchy and conveys much in few words.
15
What’s Next?
The Data and Information Service should be created soon.
The Data Sub-Committee needs to consider these themes and principles when developing the IPY data policy.
If we don’t think about data and their stewardship now and continuously over the next several years, the results of the International Polar Year will be meaningless.