Presentation on theme: "Science Archives in the 21st Century Best Practices in Ingestion and Data Access at the NASA/IPAC Infrared Science Archive"— Presentation transcript:
Science Archives in the 21st Century Best Practices in Ingestion and Data Access at the NASA/IPAC Infrared Science Archive http://irsa.ipac.caltech.edu/ G. Bruce Berriman (IPAC, Caltech)
The NASA/IPAC Infrared Science Archive
- The archive node for NASA’s infrared astronomy data sets
- Housed at the Infrared Processing and Analysis Center (IPAC) in Pasadena
- Multi-wavelength archive that curates data from IRAS, 2MASS, SWAS, MSX, IRTS and the Spitzer Legacy projects
- Holdings: 200 source catalogs, 10 million images, 30,000 spectra
- Interoperable with the Spitzer archive, ISO, NED and VizieR
Best Practices in Data Standards & Ingestion
- Data standards are designed to enable astronomers to use the data
- Source catalogs: attributes of all columns must be fully specified
  - Preferably delivered in column-delimited ASCII format
- Images must comply with the FITS standard and include the WCS footprint
- Spectra must comply with the FITS standard or be delivered as a table
  - Include slit center coordinate and position angle
- Data should be self-describing and include provenance
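The requirement that every catalog column be fully specified can be illustrated with a small check. This is only a sketch: the attribute set (name, dtype, units, description) and the function name `missing_column_attributes` are assumptions made for illustration, not IRSA's actual delivery specification.

```python
# Hypothetical set of attributes a data provider must supply for every column.
REQUIRED_ATTRIBUTES = ("name", "dtype", "units", "description")


def missing_column_attributes(columns):
    """Return {column label: [missing attribute names]} for incomplete columns."""
    problems = {}
    for index, column in enumerate(columns):
        missing = []
        for attribute in REQUIRED_ATTRIBUTES:
            value = column.get(attribute)
            # Treat absent, None, and blank values as "not specified".
            if value is None or not str(value).strip():
                missing.append(attribute)
        if missing:
            label = column.get("name") or f"column {index}"
            problems[label] = missing
    return problems


columns = [
    {"name": "ra", "dtype": "double", "units": "deg",
     "description": "Right ascension (J2000)"},
    {"name": "j_m", "dtype": "double", "units": "",
     "description": "J-band magnitude"},
]

report = missing_column_attributes(columns)
print(report)
```

In this sketch a blank units field counts as unspecified, so a dimensionless column would need an explicit placeholder value; that convention is also an assumption.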
IRSA Best Practices: Become a resource for data providers
- One: Archive provides data management support for an active mission
  - IRSA provided this function for 2MASS and will provide it for WISE
  - Data products are already incorporated into the archive infrastructure
  - Work day to day with the processing and science teams; problems are inexpensive to solve
- Two: Archive staff serve as members of science or data processing teams
IRSA Best Practices
- Three: Schedule and budget pressures complicate delivery of well-structured data sets
  - Reprocessing is expensive, e.g. Spitzer Legacy team re-deliveries cost IRSA $200K over the past two years
  - Encourage early delivery of sample products
- IRSA provides on-line and downloadable tools that aid QA (http://irsa.ipac.caltech.edu/irsa-dataQA.html)
  - Tools were developed in response to common problems. Examples:
    - Document the attributes of a source catalog
    - Validate the structure and format of a source table
    - Validate the syntax, WCS information and astrometry of an image
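As a rough illustration of what validating the structure and format of a source table involves, the sketch below checks that every row has the expected number of fields and that each field parses as its declared type. It assumes a simplified whitespace-separated table and a hypothetical type vocabulary ("int", "double", "char"); it is not the IRSA tool and does not implement any particular table format.

```python
# Hypothetical mapping from declared column types to Python converters.
CONVERTERS = {"int": int, "double": float, "char": str}


def validate_table(lines, column_types):
    """Return a list of (line_number, message) tuples describing problems."""
    problems = []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # ignore blank lines
        fields = line.split()
        if len(fields) != len(column_types):
            problems.append(
                (number, f"expected {len(column_types)} fields, found {len(fields)}")
            )
            continue
        for position, (field, type_name) in enumerate(zip(fields, column_types), start=1):
            try:
                CONVERTERS[type_name](field)
            except ValueError:
                problems.append(
                    (number, f"column {position}: {field!r} is not a valid {type_name}")
                )
    return problems


rows = ["10.5 -3.2 14.1", "11.0 abc 13.9", "12.0 4.4"]
problems = validate_table(rows, ["double", "double", "double"])
for line_number, message in problems:
    print(f"line {line_number}: {message}")
```

Reporting line numbers alongside each message lets a data provider fix a sample delivery before the full data set is produced.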
Image Validation Tool
- Provides a simple check of positional accuracy by overlaying the positions of 2MASS sources
- Checks that FITS keywords comply with FITS syntax (fverify)
- Checks that WCS information is complete
- Edit FITS headers
- Control image display
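A check that WCS information is complete might look like the following sketch. It relies on the FITS convention that a header card holds the keyword in its first 8 characters, followed by "= " when the card carries a value. The list of keywords treated as required here is an assumption chosen for a simple two-dimensional image, not the rule set used by the IRSA tool, and the `card` helper exists only to build example input.

```python
# Assumed minimum keyword set for a simple 2-D celestial image.
BASIC_WCS_KEYWORDS = ("CTYPE1", "CTYPE2", "CRVAL1", "CRVAL2", "CRPIX1", "CRPIX2")
# The pixel scale may be given either as CDELT values or as a CD matrix.
SCALE_KEYWORD_SETS = (
    ("CDELT1", "CDELT2"),
    ("CD1_1", "CD1_2", "CD2_1", "CD2_2"),
)


def header_keywords(cards):
    """Collect the keywords of all value-bearing header cards."""
    keywords = set()
    for card_text in cards:
        if card_text[8:10] == "= ":
            keywords.add(card_text[:8].strip())
    return keywords


def missing_wcs_keywords(cards):
    """Return the assumed-required WCS keywords that the header lacks."""
    present = header_keywords(cards)
    missing = [k for k in BASIC_WCS_KEYWORDS if k not in present]
    if not any(all(k in present for k in group) for group in SCALE_KEYWORD_SETS):
        missing.append("CDELT1/CDELT2 or CD1_1..CD2_2")
    return missing


def card(keyword, value):
    """Build a simplified 80-character card for the example below."""
    return f"{keyword:<8}= {value}".ljust(80)


cards = [
    card("CTYPE1", "'RA---TAN'"),
    card("CTYPE2", "'DEC--TAN'"),
    card("CRVAL1", "10.0"),
    card("CRVAL2", "-5.0"),
    card("CRPIX1", "256.5"),
    card("CDELT1", "-0.000277"),
    card("CDELT2", "0.000277"),
]

missing = missing_wcs_keywords(cards)
print(missing)
```

A completeness check of this kind only confirms that the keywords exist; verifying that their values are correct is what the overlay of 2MASS source positions is for.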
Best Practice in Data Access
- Best practice: use a common software architecture
- All IRSA services are integrated into the Infrared Science Information System
- Component-based architecture
  - Modules are stand-alone, portable ANSI C tools that are plugged together
  - Supports extensive software re-use
  - Controls maintenance costs
- Anatomy of a user application
  - Application is usually a CGI program
  - Components are plugged together and controlled by an executive library
  - Executive starts components as child services and parses their return values
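The executive pattern described above, in which a controlling program starts stand-alone components and parses what they return, can be sketched as follows. The IRSA components are ANSI C tools driven by an executive library; this Python version only illustrates the idea. The `key=value` return format, the function name `run_component`, and the stand-in component are all hypothetical.

```python
import subprocess
import sys


def run_component(argv):
    """Start one component as a child process and parse its return values.

    Assumes, for illustration only, that a component reports results as
    'key=value' lines on standard output.
    """
    completed = subprocess.run(argv, capture_output=True, text=True, check=False)
    result = {"exit_code": completed.returncode}
    for line in completed.stdout.splitlines():
        key, separator, value = line.partition("=")
        if separator:
            result[key.strip()] = value.strip()
    return result


# Stand-in for a stand-alone component: a tiny script run by the same interpreter.
component = [sys.executable, "-c", "print('status=OK'); print('count=3')"]

result = run_component(component)
print(result)
```

Because each component is a separate program with a text interface, the same tool can be reused by many applications without being relinked, which is the source of the re-use and maintenance benefits the slide lists.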
Archive Software Infrastructure: Benefits to Data Providers
- Efficient deployment of new end-user services
- IRSA has used this infrastructure to build archives for customers
  - Michelson Science Center (MSC)
  - W. M. Keck Observatory Archive
  - Transit Data Set archives
  - Interferometry archives (KI, PTI)
  - Cosmic Evolution Survey (COSMOS)
  - NASA Stellar and Exoplanet Database (NStED)
- Estimated savings from re-use in MSC, COSMOS and NStED: $3M