Palomar Transient Factory Data Flow Jason Surace IPAC/Caltech.


1 Palomar Transient Factory Data Flow Jason Surace IPAC/Caltech

2 Complicated Data Path Data flows through multiple pipelines, creating a variety of science products tailored for different purposes. These pipelines operate on multiple timescales. What data you want depends in large part on what science you want to do. Realtime Data Processing – image subtraction, transient and solar system object detection. High Fidelity Daily Processing – nightly processing and recalibration for highest data quality images and source catalogs. Ensemble Processing – periodic construction of coadded images, processing of catalogs to create high precision light curves. Long-term Data Curation – storage of all raw data, processed data (images and extracted photometry), and an advanced data archive with data exploration tools, with public release.

3 [Data flow diagram; labels:] P48; Caltech/Cahill; NERSC; Image Subtraction and Transient Detection/RB Pipeline; Ingest; Realtime Image Subtraction Pipeline; Photometric Pipeline; Reference Pipeline; Lightcurve Pipeline; Transient Candidates; Lightcurves; Reference Catalogs; Epochal Images and Catalogs; IPAC; Moving Object Pipeline; SSOs; Reference Images

4 Data Transfer Data flows from the 48-inch and the PTF camera system via high-speed microwave link through a relay node at the San Diego Supercomputer Center to Cahill at Caltech. From there it forks to two places: NERSC at LBNL and IPAC at Caltech. Raw data moves as a multi-extension FITS (MEF) file containing all 12 CCD images in an exposure, along with header metadata. This is the raw data product, and will not be used by many of you.

5 Raw Data 12 chips extracted from the MEF file and mosaicked together. Dead CCD

6 NERSC Realtime Pipeline NERSC/LBL developed the first version of a realtime data pipeline which performs basic calibration, image subtraction against a reference image dataset, transient candidate detection, and candidate vetting via the RealBogus software. Still operating today. This is the feed-in for the existing extragalactic transient marshal. Designed around SNe detection. Most of the SNe work you have seen has come from this pipeline. In-collaboration dataset. The functionality has been redeveloped, improved, and expanded by IPAC, and the IPAC version will be the basis for future ZTF alerts.

7 IPAC Data Ingest Data flows in realtime to IPAC. Upon receipt, the MEF files are broken up into individual CCD files. The PTF data system processes all the CCDs wholly independently. Metadata about all the images goes into an operations database. Data receive an initial WCS. All the data are stored on spinning disk and in a deep tape archive. You are here. PTF data lives here.

8 Infrared Processing and Analysis Center Multi-mission Science Center (IRAS, ISO, Spitzer, WISE, Herschel, Planck, 2MASS, etc.) Maintains several data rooms. ~1 TB of data every 4-5 days. 24 drones with 240 cores. Roughly 0.5 PB spinning disk. Associated network equipment. Database and file servers. Archive servers. Tape backup. This will increase by a factor of 10 in the ZTF era! One shudders to imagine LSST, which will be measuring its computing power in megawatts. IPAC Morrisroe Computer Center

9 R-band Holdings 1449 nights, 3.3 million images 50 billion source apparitions (epochal detections)

10 g-band Holdings 406 nights, 830 thousand images

11 H-alpha Holdings 99 nights, 125 thousand images, plus an equal amount in at least one other H-alpha filter.

12 Realtime Pipeline Realtime – data is processed as received, turnaround in 10-20 minutes. Needed for same-night followup. Astrometrically and photometrically calibrated. Image subtraction against a reference image library constructed from all the data to date. In-house software. “Streak detection” for fast-moving objects; moving object pipeline constructs solar system object tracklets. Transient candidate detection and extraction via psf-fitting and aperture extraction. Machine learning “scores” candidates. Image subtractions and candidate catalogs are pushed to an external gateway where they are picked up by the solar system, ToO, and extragalactic marshals. Not publicly available at this time.
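
The subtract-then-detect step above can be sketched in a few lines. This is only a toy illustration, not the in-house package: it assumes the science and reference images are already aligned and flux-matched (the hard part that real image-subtraction software handles with PSF matching), and the function name `subtract_and_detect` and the 5-sigma default are hypothetical choices made for this example.

```python
from statistics import median


def subtract_and_detect(science, reference, nsigma=5.0):
    """Difference two aligned images (lists of rows) and flag bright residuals.

    Returns a list of (row, col, difference) for pixels whose difference
    exceeds the median by more than nsigma robust standard deviations.
    """
    # Pixel-by-pixel difference image.
    diff = [[s - r for s, r in zip(srow, rrow)]
            for srow, rrow in zip(science, reference)]

    # Robust noise estimate from the median absolute deviation (MAD).
    flat = [v for row in diff for v in row]
    med = median(flat)
    sigma = 1.4826 * median(abs(v - med) for v in flat)
    if sigma == 0:
        return []  # no noise estimate to threshold against

    # Keep only positive residuals above the threshold.
    return [(r, c, v)
            for r, row in enumerate(diff)
            for c, v in enumerate(row)
            if v - med > nsigma * sigma]
```

A real pipeline would then group flagged pixels into sources, measure them with psf-fitting or aperture photometry, and pass them to the machine-learning scorer.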

13 Realtime Image Subtraction and Transient Detection Originally the community “HOTPANTS” software, now replaced with a more sophisticated in-house image subtraction package.

14 Realtime Pipeline This is a fast streak candidate from the Solar System Marshal.

15 Photometric Pipeline This pipeline processes data in the traditional manner. Starts up at the end of the night, after all the data has been received. Calibration is derived from the entire night’s worth of data. Specifically, the bias and flat-fields are derived from the data themselves. Photometric calibration is derived from extracted photometry from all sources, fitting color, extinction, time and large-scale spatial variations vs. the SDSS. Typically reaches an accuracy of a few %. Astrometric calibration is done individually at the CCD level, against a combined SDSS and UCAC4 catalog. Typically good to 0.15”. Outputs from this pipeline are calibrated single-CCD FITS images and single-CCD catalog FITS binary tables (both aperture and psf-fit). These are archived through IRSA. Available 1-3 days after observation. These are publicly available data products.
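
As a rough illustration of the kind of fit described above, a per-night calibration against SDSS could take the following form. The functional form and all symbols are assumptions made for this sketch, not the pipeline's documented model: $ZP$ is a nightly zero point, $c$ a color coefficient, $k$ an extinction coefficient multiplying airmass $X$, $a$ a linear drift in time $t$, and $f(x, y)$ a low-order polynomial in CCD position.

```latex
m_{\mathrm{SDSS}} - m_{\mathrm{inst}}
  = ZP + c\,(g - r)_{\mathrm{SDSS}} + k\,X + a\,(t - t_0) + f(x, y)
```

With many matched sources per night, the coefficients would be solved for by least squares, and the scatter of the residuals gives the "few %" accuracy quoted above.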

16 Photometric Pipeline Output Single R-band thumbnail image of Arp 220, 8 arcminutes across. Aperture extractions catalog (SExtractor-based) overlaid. All observations and detections of everything are saved in the archive. Products are a reduced image, a bit-encoded data quality mask, and catalogs. All products are FITS.
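
For readers unfamiliar with bit-encoded masks, the sketch below shows how such a mask value is typically decoded. The bit assignments in `MASK_BITS` are hypothetical and chosen only for illustration; the actual PTF bit definitions are not given in these slides and should be taken from the archive documentation.

```python
# Hypothetical bit assignments, for illustration only.
MASK_BITS = {0: "saturated", 1: "bad_pixel", 2: "cosmic_ray"}


def decode_mask(value):
    """Return the names of all flags set in an integer mask value."""
    return [name for bit, name in sorted(MASK_BITS.items())
            if value & (1 << bit)]


def is_clean(value, reject_bits=(0, 1, 2)):
    """True if none of the bits in reject_bits are set in the mask value."""
    reject = sum(1 << b for b in reject_bits)
    return (value & reject) == 0
```

For example, a mask value of 5 has bits 0 and 2 set, so `decode_mask(5)` returns both of those flag names, and `is_clean(5)` is false.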

17 Reference Image Pipeline Once enough individual observations accumulate, the “reference image” pipeline is triggered. This pipeline coadds the existing data, after selecting “best frames”, e.g. best seeing, photometric conditions, astrometry, etc. Coaddition is done based on CCD id, PTF tile, and filter. These images are the reference of the static sky, at a level deeper than the individual observations. “Reference Catalogs” are extracted from these images. This concept is important, because these are both the underlying basis of the image subtractions, and also the basis of the light-curve pipeline. Like PTF coverage, the depth of these is variable, but is currently 5<n<50. Resulting products are FITS images and FITS binary tables. Publicly available.
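
The "select best frames, then coadd" idea can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline's actual algorithm: the frame dictionary keys (`seeing`, `photometric`), the seeing cut, and the use of a plain per-pixel median are all hypothetical choices, and the frames are assumed to be already registered to a common pixel grid for one CCD, tile, and filter.

```python
from statistics import median


def select_best_frames(frames, max_seeing=2.5, keep=None):
    """Keep photometric frames with good seeing, best seeing first.

    frames: list of dicts with hypothetical keys 'seeing' (arcsec)
    and 'photometric' (bool).
    """
    good = [f for f in frames
            if f["photometric"] and f["seeing"] <= max_seeing]
    good.sort(key=lambda f: f["seeing"])
    return good if keep is None else good[:keep]


def median_coadd(images):
    """Per-pixel median of equally sized, aligned images (lists of rows)."""
    nrows, ncols = len(images[0]), len(images[0][0])
    return [[median(img[r][c] for img in images) for c in range(ncols)]
            for r in range(nrows)]
```

A median is used here because it rejects outliers such as cosmic rays and satellite trails without extra bookkeeping; a production coadd would more likely use weighted averaging with explicit outlier rejection.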

18 Reference Images Single Image 60 sec @R Field 5257, Chip 7, Stack of 34

19 Deep Sky Coadds aka “Reference Images” * Results not typical. Near Galactic Center.

20 Deep Coadds

21 Light Curve Pipeline Each night, all detected sources from the photometric pipeline are matched against the reference catalog (better than a generic catalog-matching approach). All sources ever seen for a given CCD, PTF tile, and filter combination are loaded and analyzed. Least variable sources used as anchors for the calibration. Image-by-image correction factors computed for that image as a whole and stored as a lookup table. Application of these secondary correction factors improves overall relative calibration to near-millimag levels for bright sources (that part is important). Triggers less frequently (planned weekly updates). Highest level of our products. Not publicly available.
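
The anchor-based relative calibration described above can be sketched like this. It is a simplified illustration, assuming light curves are stored as `{source_id: {image_id: magnitude}}` and that a single additive offset per image is enough; the function names and that data layout are hypothetical, not the pipeline's own.

```python
from statistics import median, pstdev


def pick_anchors(light_curves, n_anchors):
    """Return the ids of the n_anchors least-variable sources."""
    scatter = {sid: pstdev(list(lc.values()))
               for sid, lc in light_curves.items() if len(lc) > 1}
    return sorted(scatter, key=scatter.get)[:n_anchors]


def image_corrections(light_curves, anchors):
    """Per-image offset: median deviation of anchors from their own median."""
    ref = {sid: median(list(light_curves[sid].values())) for sid in anchors}
    offsets = {}
    for sid in anchors:
        for image_id, mag in light_curves[sid].items():
            offsets.setdefault(image_id, []).append(mag - ref[sid])
    # This dictionary plays the role of the per-image lookup table.
    return {image_id: median(vals) for image_id, vals in offsets.items()}


def apply_corrections(light_curve, corrections):
    """Subtract each image's offset from a source's magnitudes."""
    return {image_id: mag - corrections.get(image_id, 0.0)
            for image_id, mag in light_curve.items()}
```

Because the offsets are measured from sources that barely vary, any shift shared by all anchors in one image is attributed to that image (for example, thin cloud) rather than to the sky, and removing it tightens the relative photometry of every other source in the image.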

22 Example Light Curves Something a little different, these are relatively faint asteroid light curves from Chang et al. 2014.

23 Data Products What you can publicly get today: Calibrated epochal images and catalog files at g and R-band for all data taken prior to Dec 31, 2012. Calibrated reference images and catalogs for all fields observed prior to Dec 31, 2012. In one year: Rolling release of iPTF data including light curves.

24 PTF Archive at IRSA Data products can be searched and retrieved via sophisticated GUI tools and also through an application program interface that allows integration of the archive into other, third-party software.

