Science Archives Workshop - April 25, 2007 - Page 1 Archive Policies and Implementation: A Personal View from a NASA Heliophysics Data Policy Perspective.


1 Archive Policies and Implementation: A Personal View from a NASA Heliophysics Data Policy Perspective
D. Aaron Roberts, NASA GSFC
25 April 2007

2 Define: Archive (some Google results)
- A site containing a large number of files, possibly acquired over time, and often publicly accessible. (100 Best Web Hosting)
- A function permitting users to copy one or more files to a long-term storage device. Archive copies can:
  - Accompany descriptive information;
  - Imply data compression software usage;
  - Be retrieved by archive date, file name, or description (Tivoli Storage Manager)
- Archive is a London-based Trip-hop group. (Wikipedia)

3 Science Data Archive Definition
- Easily accessible, scientifically useable, well-documented, secure data = a good archive.
- Requires:
  - Open data policy
  - Independently useable data
  - Science input (data preparation and serving)
  - Proper registration and backup

4 Archiving Homilies
- Archiving is a journey, not a destination
- “Archive early, archive often” as a natural extension of serving data
- “Central” archiving is more about knowledge than acquisition
- Knowledge must be easily available: presentation matters
- The customer is always right
- Standards are only as good as the community that supports them, but they are essential: “It’s the metadata, stupid”
- Consider the legacy

5 Archiving is a journey
Properly described, well-documented, accessible data should easily move from one archiving stage to the next:
- NASA missions produce Active Archives (nothing is “ingested”)
- Products, delivery, and initial long-term data plans are set out in the Project Data Management Plan
- Virtual Observatories provide uniform descriptions of, and access to, many such archives
- The archive continues to develop in the extended mission
- A Mission Archive Plan provides updates to the Senior Reviews on status, plans, and actions for post-mission products and services
- After the mission, a Resident Archive can continue to serve data
- Active upgrades of data products are to be funded by other means
- NSSDC manages the RAs
- “Permanent” archiving may just be moving the data and documentation to a more generic Resident Archive (e.g., SDAC, SPDF) for continued access
- At all stages, backups and registries maintain safety and knowledge of the data products

6 “Central” archiving
More about knowledge than acquisition:
- What exists?
- Where is it?
- Is it well documented?
- Is it safe?
- New focus for NSSDC role (at least for HP): knowledge of data environment; management of RAs.
- (Harvested) VO registries augmented as needed can provide a complete set of resources.
- Information about the above should be available in ways that provide easy overviews as well as details.
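The four questions on this slide can be read as fields of a registry record. The following is a minimal sketch in Python, assuming a hypothetical record layout: the class, field names, and example URLs are illustrative only and are not taken from SPASE, a VO registry, or any NSSDC system.

```python
from dataclasses import dataclass


@dataclass
class RegistryEntry:
    """Hypothetical registry record; field names are illustrative only."""
    product_id: str          # What exists?
    access_url: str          # Where is it?
    has_documentation: bool  # Is it well documented?
    backup_locations: tuple  # Is it safe? (independent copies)


def needs_attention(entries):
    """Return IDs of products that lack documentation or any backup copy."""
    return [
        e.product_id
        for e in entries
        if not e.has_documentation or len(e.backup_locations) == 0
    ]


# Made-up example entries.
entries = [
    RegistryEntry("example/mag_1min", "https://archive.example/mag", True, ("site-b",)),
    RegistryEntry("example/plasma_raw", "https://archive.example/plasma", False, ()),
]

print(needs_attention(entries))
```

A harvested registry in this spirit gives the overview the slide asks for (which products are at risk) while each record still carries the details.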

7 The customer is always right
The community determines directions:
- Peer review of VOs, RAs, Data Centers, Missions: What is working? What could be improved? What can go?
- HP Data and Computing Working Group provides feedback on HQ directions
- “Top-down vision, bottom-up implementation”
- “Market-driven” including what we want from archives

8 It’s the metadata, stupid
Standards that work:
- Value of sharing data
- SPASE data model provides a uniform description of data products
- SPASE description + data = “SIP”, “AIP”, and “DIP”
- Preserved data should be in common, open, supported formats (e.g., FITS, HDF, CDF, documented ASCII, …)
- Communication and other standards TBD
- Important to decide the level of description
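“SIP”, “AIP”, and “DIP” are the OAIS terms for Submission, Archival, and Dissemination Information Packages. The idea that a description plus the data makes up a package can be sketched as follows. This is an assumption-laden illustration: the description keys below are hypothetical placeholders, not actual SPASE element names, and the SHA-256 fixity check is an added convention, not something stated on the slide.

```python
import hashlib
import json


def build_package(description: dict, data: bytes) -> dict:
    """Bundle a metadata description with fixity information for its data."""
    return {
        "description": description,
        "sha256": hashlib.sha256(data).hexdigest(),
        "size_bytes": len(data),
    }


def verify_package(package: dict, data: bytes) -> bool:
    """Check that the data still matches what the package recorded."""
    return (
        package["size_bytes"] == len(data)
        and package["sha256"] == hashlib.sha256(data).hexdigest()
    )


# Made-up data in documented ASCII, with a hypothetical description.
data = b"time,bx_nT\n0,1.5\n60,1.7\n"
description = {
    "resource_id": "example/mag_1min",  # hypothetical identifier
    "format": "documented ASCII (CSV)",
    "columns": ["time (s)", "bx (nT)"],
}

package = build_package(description, data)
print(json.dumps(package, indent=2))
print(verify_package(package, data))
```

Because the description travels with the data and a checksum, the same bundle can in principle be handed from a mission archive to a Resident Archive without losing the information needed to use or validate it.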

9 Consider the legacy
Preserving and serving what matters for the long term:
- What is most useful? (If “all” is not possible)
- What works now, and what will last (and how)?
- Calibrated, best-effort products should accompany level-zero plus software/algorithms

10 A model Heliophysics never quite implemented
Main problems:
(1) “Planning” is a mission function (in collaboration with VOs and others)
(2) “Ingest” is replaced by “production” and “transfer”
(3) “Access” is a distributed function, as are the archives in general

11 The New Heliophysics Mission Data Lifecycle and Framework

12 Summary
- Easily accessible, scientifically useable, well-documented, secure data = a good archive.
- Archiving is a journey, not a destination
- “Central” archiving is more about knowledge than acquisition
- Knowledge must be easily available: presentation matters
- The customer is always right
- Standards are only as good as the community that supports them, but they are essential: “It’s the metadata, stupid”
- Consider the legacy

13 Backup Slides (HP Data Policy)

14 The HP Data Environment
- Data from the Heliophysics Great Observatory reside in a distributed environment and are served from multiple sources.
- Multimission Data Centers
  - Solar Data Analysis Center
  - Space Physics Data Facility (CDAWeb, OMNIWeb, etc.)
  - National Space Science Data Center
- Mission-level active archives: e.g., ACE, TIMED, TRACE, Cluster, etc.
- Much of our data are served from individual instrument sites.
- We are moving into a new data environment of
  - Virtual Observatories for convenient search and access of the distributed data, and
  - Resident Archives to retain the distributed data sources even after mission termination.
- We have a Data and Computing Working Group to help us move ahead.

15 Goals of the HP Science Data Management Policy
- Improve management of and access to HP mission data.
- Clarify the architecture and associated data lifecycle milestones of the data environment.
- Provide guidelines for proposals, Project Data Management Plans, NRAs, peer reviews, and other activities related to the HP data environment.

16 Basic Philosophy
- Evolve the existing HP data environment:
  - take advantage of new computer and Internet technologies to
  - respond to our evolving mission set and community research needs (enable the HP Great Observatory)
- Blend ‘bottom-up’, ‘market-driven’ implementation approaches with a ‘top-down’ vision for an integrated data environment.
- Assure that the HP science community participates in all levels of data management.

17 Guiding Principles
- All data produced by the HP missions will be open and made available as soon as is practical.
  - Gurman's “Right Amount of Glue” from the Fall 2002 AGU meeting sets the philosophy [see …], a key component of which is a standard of behavior: share one’s data with everyone.
- Data will be independently scientifically usable:
  - adequate documentation, including uniform SPASE descriptions
  - sustainable and open data formats
  - easy electronic access
  - provision of appropriate analysis tools.

18 Architecture
- The environment will be distributed
  - Many archives with different internal workings
- Data integration capabilities provided by discipline-based virtual observatories (“VxO’s”; VSO first for x = “Solar” and now 5 others)
  - linked by a central dictionary (“SPASE Data model”) and machine-to-machine communication routines.
  - Easily permits the inclusion of essential data sets from non-NASA sources.
  - Provides a context for services and advanced analysis tools developed under, e.g., AISRP, LWS TR&T, and the VxOs.

19 Policy Recommendations, Etc.
- The Policy includes:
  - Roles of data environment components,
  - “Rules of the Road” for data use,
  - Recommendations for Project Data Management Plans and Mission Archive Plans,
  - A timeline of the HP mission data lifecycle

20 Implementation
- Use peer-review processes to assist in managing the elements of the environment.
  - NRAs for: (a) VxOs, (b) Data quality and access improvement, (c) Resident Archives, and (d) Value-added services.
  - Mission and Data Center Senior Reviews; RA reviews.
- Success will be determined by community use and feedback. The process is “market-driven.”

21 Current Activities
- Finalizing the Data Policy with community input.
  - Our goal is to have this ready for the MIDEX AO
- Implementing a second round of VxOs and processing the next round of proposals for VxOs and related services.
- Coordinating these efforts through frequent interactions and work with the SPASE group.
- Implementing Resident Archives and the processes to manage these archives.
- Working with new missions to incorporate the Data Policy from the start, and “retrofitting” older missions through VxOs and other means.
- Working on collaboration with other NASA science divisions, other US agencies, and international partners.
- Maintaining a web site for latest news about our data environment:

