APOGEE-2 Data Infrastructure Jon Holtzman (NMSU) APOGEE team.


Data infrastructure
– Data infrastructure for APOGEE-2 will be similar to that of APOGEE-1, generalized to multiple observatories, and with improved tracking of processing
– APOGEE raw data and data products are stored on the Science Archive Server (SAS)
– Reduction and analysis software is (mostly) managed through the SDSS SVN repository
– Raw and reduced data are described (mostly) through the SDSS datamodel
– Data and processing are documented via SDSS web pages and technical papers

Raw data
– APOGEE instrument reads continuously (every ~10 s) as data are accumulating, 3 chips at 2048x2048 each
  Raw data are stored on the instrument control computer (current capacity is several weeks of data)
  Individual readouts are “annotated” with information from the telescope and stored on the “analysis” computer (current capacity is several months)
  These frames are archived to local disks that are “shelved” at APO (currently 20 x 3 TB disks)
– “Quick reduction” software at the observatory assembles data into data cubes and compresses them (lossless) for archiving on SAS
  Maximum daily compressed data volume ~60 Gb
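As a rough check on the readout figures above, the uncompressed size of one readout can be estimated from the detector format. This is only a sketch: it assumes 2 bytes per pixel, which the slides do not state, and it ignores headers and any reference pixels. All names are illustrative.

```python
# Rough size of one uncompressed APOGEE readout.
# ASSUMPTION: 2 bytes per pixel (not stated in the slides); headers ignored.
N_CHIPS = 3            # "3 chips" per the slide
CHIP_SIDE = 2048       # "2048x2048 each" per the slide
BYTES_PER_PIXEL = 2    # assumed
READ_INTERVAL_S = 10   # "every ~10s" per the slide


def readout_bytes(n_chips=N_CHIPS, side=CHIP_SIDE, bpp=BYTES_PER_PIXEL):
    """Bytes in one readout of all chips."""
    return n_chips * side * side * bpp


def bytes_per_hour(interval_s=READ_INTERVAL_S):
    """Uncompressed bytes accumulated in one hour of continuous reads."""
    reads_per_hour = 3600 // interval_s
    return reads_per_hour * readout_bytes()


if __name__ == "__main__":
    print(f"one readout: {readout_bytes() / 1e6:.1f} MB")
    print(f"one hour:    {bytes_per_hour() / 1e9:.2f} GB")
```

Under these assumptions the individual readouts add up to several GB per hour, which gives a sense of why the data are assembled into cubes and compressed before archiving.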

– Does not include NMSU 1m + APOGEE data
– LCO data will be concurrent
– Total 2.5m raw data to date: ~11 TB

Initial processing
“Quick reduction” software estimates S/N (at H=12.2), which is inserted into the plate database for use in autoscheduling decisions
APOGEE-1
– Data transferred to SAS next day, transferred to NMSU later that day, processed with full pipeline the following day, updated S/N loaded into platedb, initial QA inspection
APOGEE-2 proposal:
– Process data at observatory with full pipeline next day, or at SAS location (Utah), and/or
– Improve “quick reduction” S/N

Pipeline processing
Three main stages (+1 post-processing)
– APRED: processing of individual visits (multiple exposures at different detector spectral dither positions) into visit-combined spectra, with initial RV estimates. Can be done daily
– APSTAR: combine multiple visits into combined spectra, with final RV determination. For APOGEE-1, has been run annually (DR10: year 1, DR11: year 1 + year 2)
– ASPCAP: process combined (or resampled visit) spectra through the stellar parameters and chemical abundances pipeline. For APOGEE-1, has been run 3 times
– ASPCAP/RESULTS: apply calibration relations to derived parameters, set flag values for these

APOGEE data products
Raw data: data cubes (apR)
Processed exposures (maybe not of general interest?)
– 2D images (ap2D)
– Extracted spectra (ap1D)
– Sky subtracted and telluric corrected (apCframe)
Visit spectra
– Combine multiple exposures at different dither positions
– apVisit files: native wavelength scale, but with wavelength array
Combined spectra
– Combine multiple visits, requires relative RVs
– apStar files: spectra resampled to log(lambda) scale
Derived products from spectra
– Radial velocities and scatter from multiple measurements (done during combination)
– Stellar parameters/chemical abundances from best-fitting template
  Parameters: Teff, log g, microturbulence (fixed), [M/H], [alpha/M], [C/M], [N/M]
  Abundances for 15 individual elements
– aspcapStar and aspcapField files: stellar parameters of best fit, pseudo-continuum normalized spectra, and best-fitting templates
Wrap-up catalog files (allStar, allVisit)
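The file-name prefixes listed above can be used to sort a directory of products by type. The following is a minimal sketch of that idea. Only the prefixes come from the slide; the `classify` helper and the example file names in the usage note are hypothetical.

```python
# Map APOGEE file-name prefixes (from the slide) to the product they hold.
# The helper and any example file names are illustrative assumptions.
PRODUCTS = {
    "apR": "raw data cube",
    "ap2D": "processed 2D image",
    "ap1D": "extracted spectrum",
    "apCframe": "sky-subtracted, telluric-corrected exposure",
    "apVisit": "visit spectrum",
    "apStar": "combined spectrum",
    "aspcapStar": "ASPCAP parameters, normalized spectrum, best-fit template",
    "aspcapField": "ASPCAP parameters, normalized spectra, best-fit templates",
    "allStar": "summary catalog of stars",
    "allVisit": "summary catalog of visits",
}


def classify(filename):
    """Return the product description for a file name, or None if unknown."""
    # Try longer prefixes first so the most specific prefix wins.
    for prefix in sorted(PRODUCTS, key=len, reverse=True):
        if filename.startswith(prefix):
            return PRODUCTS[prefix]
    return None
```

For example, `classify("apVisit-example.fits")` returns `"visit spectrum"`, while an unrecognized name returns `None`.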

APOGEE data volume
Raw data:
– 2.5m+APOGEE: ~4 TB/year (APOGEE-1); ~6 TB/year with MaNGA co-observing
– 1m+APOGEE: ~2 TB/year
– LCO+APOGEE: ~3 TB/year
– TOTAL APOGEE-1 + APOGEE-2: ~75 TB
Processed data:
– Visit files: ~3 TB/year (80% individual exposure reductions)
– Combined star files: ~500 GB/100,000 stars
– ASPCAP files: raw FERRE files ~500 GB/100,000 stars; bundled output ~100 GB/100,000 stars
– TOTAL APOGEE-1 + APOGEE-2 (one reduction!): ~40 TB
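The per-star figures above can be scaled to a given sample size. The sketch below assumes the sizes scale linearly with the number of stars, which the slide implies but does not state. The function name is illustrative.

```python
# Sizes quoted on the slide, in GB per 100,000 stars.
GB_PER_100K_STARS = {
    "combined star files": 500,
    "raw FERRE files": 500,
    "bundled ASPCAP output": 100,
}


def estimate_gb(n_stars):
    """Scale the quoted sizes to n_stars, ASSUMING linear scaling."""
    return {
        name: gb * n_stars / 100_000
        for name, gb in GB_PER_100K_STARS.items()
    }


if __name__ == "__main__":
    for name, gb in estimate_gb(60_000).items():
        print(f"{name}: ~{gb:.0f} GB")
```

For a sample of 60,000 stars this gives about 300 GB each for combined star files and raw FERRE files, and about 60 GB of bundled output.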

APOGEE data access
“Flat files” available via SDSS SAS:
– all intermediate and final data product files
– summary “wrap-up” files (catalog)
“Catalog files” available via SDSS CAS:
– apogeeVisit, apogeeStar, aspcapStar
Spectrum files available via SDSS API and web interface
Planning 4 data releases in SDSS-IV:
– DR14: July 2017 (data through July 2016)
– DR15: July 2018 (data through July 2017; first APOGEE-S)
– DR16: July 2019 (data through July 2018)
– DR17: Dec 2020 (all data)

APOGEE software products
– apogeereduce: IDL reduction routines (apred and apstar)
– aspcap
  speclib: management of spectral libraries, but not all input software (no stellar atmospheres code, limited spectral synthesis code)
  ferre: F95 code to interpolate in libraries and find the best fit
  idlwrap: IDL code to manage ASPCAP processing
– apogeetarget: IDL code for targeting

APOGEE pipeline processing
– Software all installed and running on Utah servers
– Software already in pipeline form (few lines per full reduction step to distribute and complete among multiple machines/processors)
– Some need to improve distribution of knowledge and operation among team
– Some external data/software required for ASPCAP operation
  Generation of stellar atmospheres (Kurucz and/or MARCS)
  Generation of synthetic spectra (ASSET, but considering MOOG and TURBOSPECTRUM)

APOGEE software/personnel
apogeereduce
– Developer: Nidever, Holtzman, (Nguyen)
– Operation: Holtzman, (Hayden, Nidever, Nguyen)
ASPCAP
– Grids:
  ASSET: Allende Prieto / Koesterke
  Turbospec: Zamora, Garcia-Hernandez, Sobeck, Garcia-Perez, Holtzman
  MOOG: Shetrone, Holtzman (pipeline), others
– speclib postprocessing: Allende Prieto, Holtzman
– ferre: Allende Prieto
– idlwrap: Holtzman, Garcia-Perez (Shane)
– Operation: Holtzman (Shane, Shetrone)

END

Star level bitmasks
– Targeting flags APOGEE_TARGET1, APOGEE_TARGET2: main survey vs ancillary, telluric, etc.
– STARFLAG: bitmask flagging potential conditions, e.g.
  LOW_SNR
  BAD_PIXELS
  VERY_BRIGHT_NEIGHBOR
  PERSIST_HIGH

Data quality/issues: ASPCAP
– Current ASPCAP runs are fits for 6 parameters: Teff, log g, [M/H], [alpha/M], [C/M], [N/M]
– Teff, log g, [M/H], and [alpha/M] have been “calibrated” using observations of clusters: systematic corrections have been applied to these parameters, and are nonzero for Teff, log g, and [M/H]
– Results for [C/M] and [N/M] are more challenging to verify, and are more suspect
– In flat files: PARAM (calibrated parameters) vs. FPARAM (fit parameters)
– In CAS database: TEFF, LOGG, METALS, ALPHAFE (calibrated) vs. FIT_TEFF, FIT_LOGG, FIT_METALS, FIT_ALPHAFE (fit)
Key catalog bitmasks
– ASPCAP_FLAG: bitmask flagging potential conditions, e.g., STAR_BAD, STAR_WARN
– PARAMFLAG: details about nature of ASPCAP_FLAG bits

Scope of Data
DR10: Data taken from April 2011 through July 2012
– First year survey data
  all observed spectra, even if all visits not complete: summed spectra of what is available
  release spectra and ASPCAP results
– Commissioning data (through June 2011): degraded LSF (especially red chip). No ASPCAP
– 170 fields (includes a few commissioning-only fields)
– 710 plates (+ sky frames + calibration frames/monitors)
– 40-50K stars
Looking past DR10
– 250+ fields available as of May, currently being combined
– Plan to have DR10-level reductions of all year 2 data around time of DR10 release

Data access: flat files
SAS: “flat” files
Datamodel:
– APOGEE_TARGET: targeting files include all _possible_ targets as well as selected ones
– APOGEE_DATA: raw data cubes
– APOGEE_REDUX: reduced data
APOGEE_REDUX: currently corresponds to
Embedded web pages provide a guide and some static plots
Versions / organization
– Identify via apred_version/apstar_version/aspcap_version/results_version
– apred_version: contains visit files (apVisit), organized by plate/MJD
– apstar_version: contains combined star files, organized by field location
– aspcap_version: raw ASPCAP results, organized by field location
– results_version: adds ASPCAP “calibrated” results and sets some additional data quality bits
Current version is r3/s3/a3/v302; DR10 version likely to be v303?
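The four-level version string quoted above (for example r3/s3/a3/v302) can be split into its components when building paths or logging which reduction was used. This is a minimal sketch. The `ReductionVersion` type and `parse_version` helper are hypothetical and not part of any SDSS software.

```python
from typing import NamedTuple


class ReductionVersion(NamedTuple):
    """The four version levels named on the slide, in order."""
    apred: str
    apstar: str
    aspcap: str
    results: str


def parse_version(path):
    """Split 'apred/apstar/aspcap/results' into its four components."""
    parts = path.strip("/").split("/")
    if len(parts) != 4:
        raise ValueError(f"expected 4 version levels, got {len(parts)}: {path!r}")
    return ReductionVersion(*parts)


if __name__ == "__main__":
    v = parse_version("r3/s3/a3/v302")
    print(v.apred, v.apstar, v.aspcap, v.results)
```

Keeping the levels separate makes it explicit that a new results version can be layered on an unchanged apred, apstar, or aspcap run.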

Summary “wrap-up” files
Main summary data files
– allStar-v302.fits: catalog data for all DR10 stars
– allVisit-v302.fits: catalog data for all DR10 visits
These files are not overly large (~60000 star entries in allStar currently), so they are quite manageable
Pay attention to bitmasks!
  allstar=mrdfits('allStar-v302.fits',1)
  ; skip stars with STAR_BAD (bit 23) or NO_ASPCAP_RESULT (bit 31) set in aspcapflag
  badbits=(2L^23 or 2L^31)
  ; keep only entries where neither bad bit is set
  gd=where((allstar.aspcapflag and badbits) eq 0)
  plot,allstar[gd].teff,allstar[gd].logg,….
  ; find giant binaries: good ASPCAP results, large RV scatter, low log g
  badbits=(2L^23 or 2L^31)
  gd=where(allstar.vscatter gt 1 and (allstar.aspcapflag and badbits) eq 0 and allstar.logg lt 3.8)
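The same bitmask selection can be sketched in Python. The bit numbers (23 for STAR_BAD, 31 for NO_ASPCAP_RESULT) and the cuts on vscatter and log g are taken from the IDL example. The rows below are made-up stand-ins for allStar entries, and the helper names are hypothetical. Reading the real FITS file is left out so the sketch stays self-contained.

```python
# Bit numbers as given in the IDL example on the slide.
STAR_BAD = 1 << 23
NO_ASPCAP_RESULT = 1 << 31
BADBITS = STAR_BAD | NO_ASPCAP_RESULT

# MADE-UP rows standing in for allStar entries (not real data).
stars = [
    {"id": "A", "aspcapflag": 0, "logg": 2.4, "vscatter": 0.2},
    {"id": "B", "aspcapflag": STAR_BAD, "logg": 2.0, "vscatter": 4.0},
    {"id": "C", "aspcapflag": 0, "logg": 2.9, "vscatter": 3.5},
    {"id": "D", "aspcapflag": NO_ASPCAP_RESULT, "logg": 1.5, "vscatter": 2.0},
    {"id": "E", "aspcapflag": 1 << 7, "logg": 4.5, "vscatter": 2.0},
]


def is_good(star):
    """True when neither STAR_BAD nor NO_ASPCAP_RESULT is set."""
    return (star["aspcapflag"] & BADBITS) == 0


def giant_binary_candidates(rows, min_vscatter=1.0, max_logg=3.8):
    """Good stars with large RV scatter and low surface gravity."""
    return [
        s for s in rows
        if is_good(s) and s["vscatter"] > min_vscatter and s["logg"] < max_logg
    ]


if __name__ == "__main__":
    print([s["id"] for s in stars if is_good(s)])
    print([s["id"] for s in giant_binary_candidates(stars)])
```

The key point is the comparison with zero: a star is kept only when the bitwise AND with the bad-bit mask is zero, so other flag bits (as in row E) do not exclude it.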

Data access: API
– Can get programmatic access to data via APOGEE API (soon)
– One particularly useful application: downloading a subset of spectra
– Also the basis for the SAS web app: visual interface to spectra
– APOGEE API currently under development, available in next several months
– Database used by API is loaded, graphical spectrum access available via web app:

Data access: CAS
Data from summary files (allStar, allVisit, allPlates) have been loaded into CAS (TESTDR10, currently restricted access)
– tables apogeePlate, apogeeStar, apogeeVisit, aspcapStar
Example:
  SELECT top 10 p.star, p.ra, p.dec, p.glon, p.glat, p.vhelio_avg, p.vscatter, a.teff, a.logg, a.metals, v.vhelio
  FROM apogeeStar p
  JOIN aspcapStar a on a.apstar_id = p.apstar_id
  JOIN apogeeVisit v on a.star = v.star
  WHERE (a.aspcap_flag & dbo.fApogeeAspcapFlag('STAR_BAD')) = 0 and p.nvisits > 6
  order by a.star
Object search through CAS implemented in sky server
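The join-plus-bitmask pattern in the example query can be tried offline. The sketch below uses an in-memory SQLite database as a stand-in for CAS, so it is an illustration of the query shape only. The table contents are made up, the visit join is omitted for brevity, `LIMIT 10` replaces `top 10`, and the literal bit value for STAR_BAD (bit 23, per the IDL example) replaces the `dbo.fApogeeAspcapFlag` lookup.

```python
import sqlite3

STAR_BAD = 1 << 23  # bit number taken from the slides' IDL example

con = sqlite3.connect(":memory:")
# MADE-UP tables and rows; only the column names follow the slide's query.
con.executescript("""
CREATE TABLE apogeeStar (apstar_id TEXT, star TEXT, vhelio_avg REAL, nvisits INTEGER);
CREATE TABLE aspcapStar (apstar_id TEXT, star TEXT, teff REAL, logg REAL, aspcap_flag INTEGER);
INSERT INTO apogeeStar VALUES
  ('id1', 's1', 10.0, 8), ('id2', 's2', -5.0, 9), ('id3', 's3', 3.0, 2);
INSERT INTO aspcapStar VALUES
  ('id1', 's1', 4500, 2.5, 0), ('id2', 's2', 5000, 3.0, 8388608), ('id3', 's3', 4700, 2.8, 0);
""")

rows = con.execute(
    """
    SELECT p.star, p.vhelio_avg, a.teff, a.logg
    FROM apogeeStar p
    JOIN aspcapStar a ON a.apstar_id = p.apstar_id
    WHERE (a.aspcap_flag & ?) = 0 AND p.nvisits > 6
    ORDER BY a.star
    LIMIT 10
    """,
    (STAR_BAD,),
).fetchall()

if __name__ == "__main__":
    for row in rows:
        print(row)
```

Here s2 is dropped because its STAR_BAD bit is set and s3 is dropped because it has too few visits, leaving only s1.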

– Abundances of cooler stars
– Second instrument or first instrument relocation
– Surface gravity issues: red clump vs red giant
– Abundance analysis of faint bulge stars: RR Lyr and RC stars
– Achieving distance distribution