
1 Archiving of solar data Luis Sanchez Solar and Heliospheric Archive Scientist Research and Scientific Support Department

2 Contents
A bit of history on ESA archives.
The SOHO archive.
Evolution: the new ESA approach to science operations and archiving.
Archive infrastructure and data products.
The virtual observatory layer.
NASA's Heliophysics Virtual Observatory.

3 ESA's old approach to archiving science data
'Traditional' approach: no science archiving done at ESA (except Hipparcos); the funding agencies supporting the PIs were responsible for archiving data.
The Infrared Space Observatory changed that: ESA's RSSD established an archive for this observatory-type mission.
This triggered involvement in the archiving of astronomy data at VILSPA: active participation in the IVOA, with the development of virtual-observatory-aware mission archives for astronomy missions (ISO, XMM, Integral…).
The same path was also followed for planetary data: establishment of the Planetary Science Archive at VILSPA, with very close ties to the PDS.
Meanwhile, other mission-specific archives were established elsewhere: SOHO at GSFC, Ulysses and the CAA at ESTEC. These had limited interoperability with virtual observatories.

4 Archiving of SOHO data
The SOHO archive was developed by ESA with the collaboration of NASA:
Servers supplied by ESA.
Software designed and developed by ESA in 1997.
Storage provided by NASA as part of the Solar Data Analysis Center (SDAC) and shared with other missions.
Network infrastructure contributed by NASA.
Simple, modular design based on several components:
A relational database management system (RDBMS).
A web-based user interface (UI).
Middleware for passing information between the UI and the RDBMS.
Validation and ingestion of data products.
Off-line (batch) distribution of data products.
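
The modular split between UI, middleware, and RDBMS can be sketched as follows. This is an illustrative sketch in Python (the original system was Perl-based); the table layout and function names are hypothetical, not taken from the actual SOHO archive.

```python
import sqlite3

# Hypothetical middleware: turn UI search parameters into an SQL query
# against the metadata tables, keeping the UI and the RDBMS decoupled.
def search_products(conn, instrument=None, start=None, end=None):
    query = "SELECT filename, obs_start FROM products WHERE 1=1"
    params = []
    if instrument:
        query += " AND instrument = ?"
        params.append(instrument)
    if start:
        query += " AND obs_start >= ?"
        params.append(start)
    if end:
        query += " AND obs_start <= ?"
        params.append(end)
    return conn.execute(query, params).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (filename TEXT, instrument TEXT, obs_start TEXT)")
conn.execute("INSERT INTO products VALUES "
             "('eit_195_19970101.fits', 'EIT', '1997-01-01T00:00')")
rows = search_products(conn, instrument="EIT")
print(rows)  # [('eit_195_19970101.fits', '1997-01-01T00:00')]
```

Because the UI only supplies parameter values and never SQL text, the same middleware can sit behind different user interfaces, matching the design goals listed on slide 7.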

5 The SOHO archive's place in SOHO operations

6 The ingestion is instrument-based:
A software module is written to validate and extract metadata from all the data products provided by a given instrument.
Adding new data products, or modifying existing ones, does not affect data products from other instruments.
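
The per-instrument module scheme amounts to a plugin registry. A minimal sketch, assuming hypothetical extractor functions (a real module would parse FITS headers rather than return canned values):

```python
# Each instrument supplies its own validator/metadata extractor, so
# adding or changing one instrument's products leaves the others alone.
# All function and instrument names here are illustrative.

def extract_eit(filename):
    # Placeholder: a real extractor would validate the file and read
    # metadata from its FITS headers.
    return {"instrument": "EIT", "filename": filename}

def extract_lasco(filename):
    return {"instrument": "LASCO", "filename": filename}

EXTRACTORS = {"EIT": extract_eit, "LASCO": extract_lasco}

def ingest(instrument, filename):
    try:
        extractor = EXTRACTORS[instrument]
    except KeyError:
        raise ValueError(f"No ingestion module for instrument {instrument!r}")
    metadata = extractor(filename)
    # ...insert metadata into the RDBMS here...
    return metadata

print(ingest("EIT", "eit_195_19970101.fits"))
```

Supporting a new instrument means writing one extractor and adding one registry entry; nothing in the other extractors changes.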

7 The SOHO archive
Pros:
Easy software maintenance.
Designed to be used with any RDBMS supported by Perl's DBI module.
Designed to be used with a variety of user interfaces.
Runs on any major operating system.
Cons:
Primitive interface with virtual observatories.
Not easy to run applications on top of it (for example, basic data analysis).
Ten-year-old technology.
A new archive is being developed at ESAC, reusing the existing code base for science archives, to fit the new approach to science operations and archiving.

8 ESA's new approach to sciops/archiving
VILSPA is now ESAC (European Space Astronomy Centre), the focal point for science operations and archiving.
ESA supports the establishment of long-term science archives across all disciplines, reusing the infrastructure already developed at ESAC for astronomy and planetary missions.
ESA's RSSD is discussing a renewed approach to science operations and archiving for Solar System missions, an on-going process tied to the RSSD reorganization:
More resources to PI teams to produce calibrated data.
Improved consolidation across missions for operations and archiving.
Development of mission archives, for all science disciplines, that support existing virtual observatories.

9 Archive building blocks
Mission- or discipline-oriented long-term archives.
The archive infrastructure can be common to 'active' and long-term archives. This is the 'technical layer':
Hardware (servers, storage, and networks).
Operating system and application/utility-level software such as the RDBMS.
Great scope for infrastructure consolidation (lower costs, more efficiency), but it has to work properly with the archive holdings it is to hold.
Archive holdings. This is where the science is:
Data products in the traditional sense.
Software.
Science applications.
Procedures.
Logs.
Documentation.

10 Archive infrastructure requirements
Some basic requirements for the archive infrastructure:
Completeness: all data from the mission are stored together with the software and procedures; the different levels of data products are also stored.
Longevity: hardware and software ought to be upgradable as easily as possible during the life of the archive.
Integrity: data products should not change (see also 'security').
Availability: data products should be accessible to PIs and other scientists without restrictions and in a timely manner.
Accountability: every operation on the archive is documented and traceable.
Security: against tampering and denial-of-service attacks.
Status information: the status of the archive, covering the data holdings but also the operational status (users, queries executed, data distributed…).
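
The integrity requirement is commonly met by recording a checksum at ingestion time and re-verifying it on access, so that any silent change to a data product is detected. A minimal sketch (SHA-256 is one reasonable choice; the slide does not prescribe a mechanism):

```python
import hashlib

# Record a checksum when a data product is ingested; verify it whenever
# the product is read back, so tampering or corruption is detectable.
def checksum(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

def verify(data: bytes, recorded: str) -> bool:
    return checksum(data) == recorded

original = b"FITS data product contents"
recorded = checksum(original)

print(verify(original, recorded))            # True
print(verify(b"tampered contents", recorded))  # False
```

Logging each verification alongside the user and query would also serve the accountability requirement above.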

11 Data products (including software, documentation, etc.)
Some aspects to take into account when defining data products:
Intended usage (science analysis, housekeeping, public relations…).
Intended audience (PI team, engineering team, wider scientific community…).
Tools to be used when accessing and using them.
Turnaround times for generation, expiration, and access.
Dependencies on other data products (for generation, expiration, and access).
Versions to be produced (perhaps for different calibrations or purposes).
Metadata required to fully describe each product.
Relationship between the metadata used and those used by the science community for similar or related data products.
Format for data and metadata representation.
Physical implementation of the chosen format.
Documentation on the procedure and software used to generate each product.
Documentation on what the data product represents.
Data quality information (very hard to do properly a posteriori).
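
Several of the aspects above (versioning, dependencies, format, provenance, quality) translate directly into fields of a metadata record. A hypothetical minimal record, with invented field names, might look like:

```python
from dataclasses import dataclass, field

# Illustrative metadata record covering some of the aspects listed
# above; field names are invented, not from any real archive schema.
@dataclass
class DataProduct:
    product_id: str
    instrument: str
    version: int                      # different calibrations/purposes
    data_format: str                  # e.g. "FITS"
    depends_on: list = field(default_factory=list)   # other products
    generation_procedure: str = ""    # pointer to documentation
    quality_flag: str = "unassessed"  # hard to assign a posteriori

p = DataProduct("eit_195_19970101", "EIT", 1, "FITS",
                depends_on=["eit_cal_v3"])
print(p.version)  # 1
```

Defining such a record up front, rather than a posteriori, is precisely what makes the quality information on the last bullet tractable.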

12 Virtual Observatory layer
An additional layer on top of existing archives and services (the order below is roughly chronological, from now onwards to some future time):
The archive location is irrelevant to the user (distributed access).
Data and metadata may be held in different locations.
Searches are independent of the archive holding the data.
Searches use a common set of parameters defined by a data model.
Data retrieval is done from one or many data repositories.
Possibility to run science applications on the data holdings.
Eventually, data retrieval might not even be necessary: GRID computing (remote data, services, and computation).
The virtual observatory 'added value':
Working with science data is easier (less tedious, non-productive work).
It opens up new science: analyses that were too work-intensive become feasible, and 'data mining' makes it possible to find new relationships between data products.
Making new data products accessible to the science community also becomes easier.
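
The distributed-search idea above can be sketched as one query, expressed in a common data model, fanned out over several archives whose locations the user never sees. Archive names and holdings below are invented:

```python
# Virtual observatory layer sketch: the same parameter set is applied
# to every registered archive; results carry the archive name only so
# retrieval can later be routed, not because the user asked for it.

ARCHIVES = {
    "archive_a": [{"instrument": "EIT", "start": "1997-01-01"}],
    "archive_b": [{"instrument": "LASCO", "start": "1998-06-15"}],
}

def vo_search(instrument):
    """Search every registered archive with a common parameter set."""
    results = []
    for name, holdings in ARCHIVES.items():
        for record in holdings:
            if record["instrument"] == instrument:
                results.append((name, record))
    return results

print(vo_search("EIT"))
```

In a real virtual observatory the inner loop would be a remote service call against each archive's query interface rather than an in-memory scan, but the fan-out structure is the same.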

13 NASA Virtual Observatories initiative
Heliophysics Virtual Observatory:
Data from existing missions (SOHO, TRACE, RHESSI, Wind, Cluster, ACE, Polar, Geotail, FAST, IMAGE, TIMED, SORCE, Ulysses, Voyager…) and upcoming ones (STEREO, Solar-B, SDO…).
Heliophysics becomes separated from Earth Sciences.
Distributed environment:
A 'small box' approach, with the Virtual Solar Observatory (VSO) as pathfinder.
Resident archives (the existing ones) retain the data collections.
Virtual observatories provide convenient search with access to all the data.
Distributed funding and implementation.
The SPASE data model as the 'Rosetta stone' for interoperability of heliophysics data.
Magnetospheric data in the PDS are to be made compatible with SPASE so they become accessible to the space physics community.
The Heliophysics Virtual Observatory is the umbrella, or the sum, of all these virtual observatories.
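
The 'Rosetta stone' role of a common data model comes down to mapping each mission's native metadata into shared resource descriptions. A minimal sketch, with SPASE-like but simplified field names (these do not reproduce the actual SPASE schema):

```python
# Interoperability sketch: map a mission-specific metadata record onto
# a simplified, SPASE-style resource description. Field names are
# illustrative only, not the real SPASE schema.

def to_spase_like(record):
    return {
        "ResourceID": f"spase://Example/{record['instrument']}/{record['product_id']}",
        "InstrumentID": record["instrument"],
        "StartDate": record["start"],
        "StopDate": record["end"],
    }

native = {"product_id": "eit_195_19970101", "instrument": "EIT",
          "start": "1997-01-01T00:00", "end": "1997-01-01T23:59"}

print(to_spase_like(native)["InstrumentID"])  # EIT
```

Writing one such mapper per resident archive is what lets a single virtual observatory search span holdings that were never designed to interoperate.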

