1
The JWST Data Calibration Pipeline
Bryan Hilbert, NIRCam Team New Hire Training, 23 Feb 2017
2
Outline
- Pipeline Description
- Pipeline Philosophy
- Creation Workflow
- Typical User Experience
- High-Level Pipeline Architecture
- Data Products
- Pipeline Versions – baseline vs. optimal
- Pipeline Availability
- Installation
- Preparing to Run
- Running the Pipeline – 3 ways
- Resources and Documentation
3
Pipeline Philosophy
- Algorithms based on community input
  - Input from instrument teams and mode-specific expert teams
  - Overall goal is the best-justified algorithms
- Use the same code for different instruments where possible
  - Easier to maintain
  - Takes advantage of the strengths of all teams
- Provide the pipeline directly to the community
4
Pipeline Philosophy: Hubble Calibration Pipelines
A little bit of history and background…
- 11 different science instruments built over 20+ years
- Each one had its own custom calibration pipeline
- Written in a variety of languages: IRAF SPP, Fortran, C, Python
  - Maintenance nightmare
  - Little possibility for code sharing
- Written as monolithic, procedural applications that could only feed data through the whole process, start to finish
- Distributed to astronomers, but very difficult to customize
5
Pipeline Philosophy: JWST Calibration Pipelines
- Completely new design
- Written in Python (with some C extensions)
- Each calibration step written as a Python class
- Step classes can be strung together into pipelines
- Pipelines can be strung together
- Can run from the command line or within Python (a minimal sketch follows)
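To make the "strung together" idea concrete, here is a minimal sketch (not from the original slides) that chains two of the step classes appearing later in this deck; the input and output file names are hypothetical:

  from jwst.dq_init import DQInitStep
  from jwst.superbias import SuperBiasStep

  # Each step accepts a file name or the data model returned by a previous step
  dq_done = DQInitStep.call('my_uncal_ramp.fits')   # hypothetical raw ramp file
  sb_done = SuperBiasStep.call(dq_done)             # chain steps via data models
  sb_done.save('my_sb_subtracted.fits')             # hypothetical output name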
6
Pipeline Creation Workflow: Algorithm Requests
- Majority of inputs: the instrument teams (NIRCam, NIRSpec, NIRISS, FGS, MIRI) – instrument experts drawing on extensive ground testing
- Special-case inputs: the mode-specific working groups (Coronagraphic WG, Time Series Obs WG, Moving Target WG) – science experts, including instrument experts
7
Pipeline Creation Workflow: Decisions – JWST Calibration WG
The algorithm requests from the instrument teams and working groups feed into the JWST Calibration WG, which:
- Is composed of STScI and external members
- Critically reviews the inputs
- Identifies common steps
- Encourages cross-pollination of ideas and cross-instrument development
- Decides on the algorithms for development (resource constrained)
8
Pipeline Creation Workflow: Implementation – JWST DMS WG
Algorithms selected by the JWST Calibration WG pass to the JWST DMS WG, which:
- Is composed of scientists and programmers
- Works with the programmers to implement the algorithms
- Defines products and supports DMS design
- Tests and validates the pipeline
- Provides feedback to the JCalWG
9
Pipeline Creation Workflow: Code! – Science Software Branch
Finally, the Science Software Branch:
- Writes the pipeline
- Integrates it into the Data Management System (DMS)
10
User Experience
- Pipeline automatically run on all data
  - Default parameters for all pipeline steps
- Pipeline products produced and archived
  - Final as well as raw and intermediate products
- User can download and run the pipeline locally
  - Change defaults
  - Add customized reduction steps
- Pipeline may require an internet connection for telemetry and reference file queries
11
Quick Aside - Data Format
- Raw data stored in multi-extension FITS files
  - Generally a data array and associated header in each extension
- Data array shape: (i × g × y × x) pixels
  - i = number of integrations (i.e. ramps)
  - g = number of groups per integration (≈ detector readouts)
  - y, x = array dimensions (e.g. 2048 × 2048 for NIRCam full frame)
- After line fitting, "slope" data have shape (i × y × x) pixels
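A quick way to confirm these dimensions on a raw file, using the RampModel data model introduced later in this deck; this is a minimal sketch and the file name is hypothetical:

  from jwst.datamodels import RampModel

  ramp = RampModel('my_uncal_ramp.fits')       # hypothetical raw (uncalibrated) file
  nints, ngroups, ny, nx = ramp.data.shape     # (integrations, groups, y, x)
  print(nints, ngroups, ny, nx)                # e.g. 1 10 2048 2048 for NIRCam full frame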
12
JWST Pipeline Architecture
Stage 1: Ramps-to-Slopes → Stage 2: Calibrated Slopes → Stage 3: Ensemble Processing → Archive
13
Stage 1: CALDETECTOR1
Stage 2: CALIMAGE2, CALSPEC2
Stage 3: CALCORON3, CALAMI3, CALTSO3, CALSPEC3
Products from each stage go to the ARCHIVE
CORON – Coronagraphy; TSO – Time Series Observations; AMI – Aperture Masking Interferometry (NIRISS only)
14
CALDETECTOR1
Near-Infrared Detectors:
- Data Quality Initialization
- Saturation Check
- Error Initialization
- Superbias Subtraction
- Reference Pixel Correction
- Nonlinearity Correction
- Persistence Correction
- Dark Correction
- Jump Detection
- Slope Fitting
Mid-Infrared Detectors:
- Data Quality Initialization
- Saturation Check
- Error Initialization
- Nonlinearity Correction
- RSCD Correction
- Last Frame Correction
- Dark Correction
- Reference Pixel Correction
- Persistence Correction
- Jump Detection
- Slope Fitting
15
CALIMAGE2
Input: Uncalibrated slope images for all integrations
Steps (which steps are applied depends on the instrument: NIRCam, NIRISS, FGS, MIRI, NIRSpec):
- Target Acquisition
- WCS Info
- Background Subtraction
- Telescope Emission Subtraction
- Flat Field
- Flux Calibration
- Rectify 2D Image
Output: Calibrated slope images for all integrations
16
CALIMAGE3
Input: Calibrated slope images for all integrations
Steps (which steps are applied depends on the instrument: NIRCam, NIRISS, FGS, MIRI):
- Refine Relative WCS
- Background Matching
- Outlier Detection
- Image Combination
- Self Calibration
- Create Exposure-Level Products
- Source Catalog
Output: Coadded images (i.e. a mosaic), a catalog of sources, and updated exposure-level products
17
Data Products
- Raw data
- Intermediate-stage data
- Final data
  - Best quality from the automated pipeline
  - Flux, wavelength, and position calibrated
  - Baseline and Optimal versions very similar; Optimal has improved quality
- Any user can rerun the pipeline offline
  - Change parameters of specific steps
  - Replace a step with a user-written version (a rough sketch follows)
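As a rough illustration of replacing a step with a user-written version, here is a minimal sketch assuming the stpipe conventions (a Step subclass with a spec string and a process() method); the step itself, its name, and its parameter are hypothetical:

  from jwst import datamodels
  from jwst.stpipe import Step

  class MyScaleStep(Step):
      """Hypothetical custom step that scales the science array."""
      spec = """
          factor = float(default=1.0)  # multiplicative scale factor
      """
      def process(self, input):
          with datamodels.open(input) as model:
              result = model.copy()
              result.data = result.data * self.factor
          return result

  # result = MyScaleStep.call('input.fits', factor=2.0)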
18
Imaging Data Products
- Mosaic of all images with the same filter
  - In sky coordinates (e.g. RA, Dec)
- Catalog of sources (automated generation)
- Integration level
  - All images fully corrected, with flagging of CRs, etc.
  - In detector and sky (rectified) coordinates
19
Imaging Data Products
Very basic example of CALDETECTOR1 results for a single pixel:
- Saturated readouts identified
- Superbias subtraction – a single offset
- Linearity correction – saturated readouts skipped
- Slope calculated from unsaturated, non-cosmic-ray-impacted readouts
- Final slope = average of the slope pieces (Slope #1 and Slope #2 on either side of the jump in the original figure), as sketched below
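A toy numerical illustration (not the pipeline's actual ramp-fitting code) of averaging slope pieces on either side of a cosmic-ray jump; all values are made up:

  import numpy as np

  t = np.arange(10) * 10.7          # group times in seconds (made-up cadence)
  counts = 5.0 * t                  # ideal ramp at 5 counts/s
  counts[5:] += 300.0               # cosmic-ray jump between groups 4 and 5

  slope1 = np.polyfit(t[:5], counts[:5], 1)[0]     # slope piece before the jump
  slope2 = np.polyfit(t[5:], counts[5:], 1)[0]     # slope piece after the jump
  final_slope = np.mean([slope1, slope2])          # final slope = average of pieces
  print(slope1, slope2, final_slope)               # all ~5 counts/s in this toy case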
20
Imaging Data Products: Example
Figure from Gordon et al. (2015, PASP, 127, 696) – the MIRI PASP paper
21
Spectroscopic Data Products
- Slit & slitless: 2D spectral image (along-slit position and wavelength)
  - Includes MOS targets (each processed separately)
- IFU: 3D spectral cube (RA, Dec, and wavelength)
- Extracted spectrum for the target source
  - Observers will be able to advise the pipeline whether the source is point-like or extended via APT
- Integration level
  - All images fully corrected, with flagging of CRs, etc.
  - In detector and sky (rectified) coordinates
  - 3D cubes or 2D images, depending on observation type
22
Spectroscopic Data Products: IFU Example
Figure from Gordon et al. (2015, PASP, 127, 696) – the MIRI PASP paper
23
Coronagraphic Data Products
- Coadded images
  - In detector coordinates
  - After PSF subtraction
- Integration level
  - All images fully calibrated, with flagging of CRs, etc.
  - In detector and sky (rectified) coordinates
24
Coronagraphic Data Products: MIRI 4QPM Example
Figure from Gordon et al. (2015, PASP, 127, 696) – the MIRI PASP paper
25
Aperture Masking Interferometry (AMI) Data Products
- NIRISS only
- Fringe parameters: closure phases and amplitudes
- Reconstructed images
26
Time Series Observation (TSO) Data Products
Time Series Observations – for monitoring a variable source on short timescales (e.g. stellar light curves, occultations, etc.)
- Photometry
  - Aperture photometry per integration
- Spectroscopy
  - Extracted spectrum per integration
  - White-light photometry per integration
- Integration level
  - Image or spectral image, as appropriate, fully corrected with flagging of CRs, etc.
  - In detector and sky (rectified) coordinates
27
Pipeline Versions
"Baseline" Version
- Correct algorithms
- Completed and tested prior to launch
"Optimal" Version
- "Best" algorithms
- Continually evolving
Details of, and differences between, the individual steps in the baseline and optimal pipelines can be found on the Calibration Working Group website.
28
Pipeline Availability
- SSB software is available via Anaconda, from Continuum Analytics
- Most software (but NOT the JWST pipeline) is in the "astroconda" channel
- Instructions for installing Anaconda, as well as the STScI astroconda packages, are at:
- You can then install the jwst package, which DOES contain the Build 7 version of the JWST pipeline, using:
  % conda create -n <ENV_NAME> --file <URL>
  where <ENV_NAME> is your chosen name for the environment (e.g. jwstb7), and <URL> depends on the version of Python you want to install and your OS:
  - python 2.7 on linux
  - python 3.5 on linux
  - python 2.7 on OSX
  - python 3.5 on OSX: http://ssb.stsci.edu/conda/jwstdp-0.7.0rc/jwsthp osx-py35.0.txt
  (Note: there is a bug in the python 3.5 version of the Build 7 pipeline. Best to use python 2.7 for now.)
29
Pipeline Availability
- This will install the Build 7 version of the jwst package, along with all of its dependencies, into an environment on your system called "jwstb7"
- Activate this environment using "source activate jwstb7"
- A test command of "which strun" should return something that looks like:
  //anaconda/envs/jwstb7/bin/strun
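As an extra sanity check (not on the original slide), you can also confirm that the package imports in the new environment, assuming the package exposes a __version__ attribute:

  % source activate jwstb7
  % python -c "import jwst; print(jwst.__version__)"

This should print something like 0.7.0 for the Build 7 installation.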
30
Preparing to Run the Pipeline
- Download the necessary configuration files to a working directory:
  $ collect_pipeline_cfgs /your/dir/path/ (e.g. '.')
- One configuration file for each pipeline step, plus one for each pipeline
- Each contains the name of the step, the name of the class, and the mandatory input parameters
  - Optional inputs can be entered as well

Examples: files for individual steps

Reference Pixel Subtraction (refpix.cfg):
  name = "refpix"
  class = "jwst.refpix.RefPixStep"
  odd_even_columns = True
  use_side_ref_pixels = True
  side_smoothing_length = 11
  side_gain = 1.0
  odd_even_rows = False
  output_file = "rfpx_sub.fits"

Superbias subtraction (superbias.cfg):
  name = "superbias"
  class = "jwst.superbias.SuperBiasStep"
  override_superbias = "my_superbias.fits"
  output_file = "sb_subtracted.fits"

The entries written by collect_pipeline_cfgs are the default (mandatory) ones; parameters such as override_superbias and output_file are user-entered, optional additions.
31
Preparing to Run the Pipeline
name = "SloperPipeline" class = "jwst.pipeline.SloperPipeline" save_calibrated_ramp = False [steps] [[dq_init]] config_file = dq_init.cfg output_file = ‘myfile_dq.fits’ [[saturation]] config_file = saturation.cfg [[ipc]] skip = True [[superbias]] config_file = superbias.cfg [[refpix]] config_file = refpix.cfg odd_even_columns = False [[reset]] config_file = reset.cfg [[lastframe]] config_file = lastframe.cfg [[linearity]] config_file = linearity.cfg [[dark_current]] [[jump]] config_file = jump.cfg [[ramp_fit]] config_file = ramp_fit.cfg Example: File for pipeline (CALDETECTOR1 a.k.a calwebb_sloper) Pipeline will run listed steps in pre-defined calwebb_sloper order. (order of steps within cfg file does not matter) Can add optional/non-default parameters for a step directly in pipeline cfg file e.g. “skip” to skip the step, “output_file” to specify fits file output name Only need to list configuration files for steps where you want to use non-default parameters stored in those files
32
Running the Pipeline: strun
- Run from the command line using 'strun'
- Use a config file to set non-default params:
  $ strun calwebb_sloper.cfg input.fits
  $ strun superbias.cfg input.fits
- Use the pipeline/step class name:
  $ strun jwst.pipeline.SloperPipeline input.fits
  $ strun jwst.dq_init.DQInitStep input.fits
- Set non-default param values on the command line:
  $ strun dq_init.cfg input.fits --output_file=input_dqinit.fits
  $ strun dq_init.cfg input.fits --override_mask=mymask.fits
  $ strun calwebb_sloper.cfg input.fits --steps.refpix.skip=True
- Use the -h command-line option to get help on all available arguments:
  $ strun calwebb_sloper.cfg -h
33
Running the Pipeline: within Python
- Run the full pipeline:
  % from jwst.pipeline import SloperPipeline
  % result = SloperPipeline.call('input.fits', config_file='calwebb_sloper.cfg')
- Run individual steps:
  % from jwst.dq_init import DQInitStep
  % from jwst.refpix import RefPixStep
- Data models as intermediate outputs:
  % dqresult = DQInitStep.call('input.fits')
  % refresult = RefPixStep.call(dqresult)
  % dqresult.save('input_dq.fits')
  % refresult.save('input_ref.fits')
- FITS files as intermediate outputs:
  % DQInitStep.call('input.fits', output_file='input_dq.fits')
  % RefPixStep.call('input_dq.fits', output_file='input_ref.fits')
34
Running the Pipeline: notebook
Alicia Canipe of the NIRCam team has produced a Jupyter notebook which runs the pipeline up through the Flat Fielding step on some NIRCam imaging data.
35
Record Keeping
- Output products (files and data models) contain keywords to help keep track of the pipeline and CRDS versions that were used:
  - CAL_VER: Cal Pipeline version number ('0.7.0')
  - CAL_SVN: Cal Pipeline repo revision number ('4341')
  - CRDS_VER: CRDS client version number ('6.0.0')
  - CRDS_CTX: Name of the CRDS context map ('jwst_0192.pmap')
- Products also contain step-completion keywords
  - S_IPC, S_DQINIT, S_REFPIX, S_DARK, …
  - Set to "COMPLETE" if successfully applied
  - Set to "SKIPPED" if the attempt was aborted for any reason
  - The keyword doesn't appear if the step is not in the pipeline flow
- As well as a record of the reference files used
  - R_DARK, R_GAIN, R_IPC, R_MASK, R_LINEAR, …
  - e.g. R_DARK = 'crds://jwst_miri_dark_0026.fits'
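A minimal sketch (not from the slide) of checking these keywords in a product file with astropy; the file name is hypothetical, and the keywords are assumed to live in the primary header:

  from astropy.io import fits

  hdr = fits.getheader('my_calibrated_product.fits')   # hypothetical product file
  for kw in ('CAL_VER', 'CRDS_CTX', 'S_DQINIT', 'S_REFPIX', 'R_DARK'):
      print(kw, '=', hdr.get(kw, 'not present'))        # 'not present' if step not in flow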
36
Data Models
- Python objects which encapsulate all information associated with a given type of data/file, and are used as input/output between pipeline steps
- Each model is a bundle of array or tabular data, along with metadata
- Models exist for many observational data types and for reference data
  - Examples: RampModel, ImageModel, IFUCubeModel, FlatModel, GainModel, etc.
- Structure and contents defined using YAML

  from jwst.datamodels import RampModel
  mod = RampModel()                # empty model instance
  mod = RampModel("myramp.fits")   # populated instance

- Arrays containing pixel data: mod.data, mod.pixeldq, mod.err, mod.groupdq
- Metadata: mod.meta.exposure.integration_time, mod.meta.instrument.filter
- Search the schema: mod.search_schema("time")
37
Calibration Reference Database System (CRDS)
- Database containing the collection of reference files for all instruments/pipeline steps
- CRDS automatically selects the appropriate reference files for the observation being run through the pipeline
  - Based on details of the observation (exposure type, instrument, detector, filter, etc.)
  - Uses a context mapping file (e.g. jwst_0192.pmap)
- You can use reference files other than those selected by CRDS via the override_XXXXX keywords in each pipeline step, as sketched below
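Putting the override mechanism together with the interfaces shown on the earlier slides, a minimal sketch (file names hypothetical) of overriding the CRDS-selected superbias reference file:

From the command line:
  $ strun superbias.cfg input.fits --override_superbias=my_superbias.fits

Or within Python:
  % from jwst.superbias import SuperBiasStep
  % result = SuperBiasStep.call('input.fits', override_superbias='my_superbias.fits')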
38
Resources and Documentation
- Pipeline user guide
- Pipeline documentation for individual steps and data models
- Core schema for data models – JWST/jwst/blob/master/jwst/datamodels/schemas/core.schema.yaml
  (This may change soon, with the core schema common to science and reference-file data models at the link above, and the remaining science-file-only schema in a different location.)
- Calibration Working Group – descriptions of the algorithms to be implemented for each pipeline step
- Reference file formats and descriptions of how CRDS chooses reference files for a given observation
- Jupyter notebook with an example of running pipeline steps up through flat fielding, using NIRCam imaging data (in the training_notebooks directory)
39
Extra Slides
40
CALSPEC2
41
CALCORON3
42
CALAMI3
43
CALSPEC3
44
CALTSO3