The Palomar Transient Factory or Adventures in High Fidelity Rapid Turnaround Data Processing at IPAC. Jason Surace (IPAC/Caltech).

Presentation transcript:

The Palomar Transient Factory or Adventures in High Fidelity Rapid Turnaround Data Processing at IPAC. Jason Surace. Russ Laher, Frank Masci, Wei Mi (did the IPAC work); Branimir Sesar, Eran Ofek, David Levitan (students & post-docs); Vandana Desai, Carl Grillmair, Steve Groom, Eugean Hacopians, George Helou, Ed Jackson, Lisa Storrie-Lombardi, Lin Yan (IPAC team); Eric Bellm (Project Scientist), Shri Kulkarni (PI)

What was/is PTF/iPTF? PTF is a robotic synoptic sky survey system designed to study transient (time-domain) phenomena. It surveys on the order of a thousand square degrees a night, predominantly at R-band. Primarily aimed at supernova science, but it can also study variable stars, exoplanets, asteroids, etc., and it produces an imaging sky survey like SDSS over a larger area. PTF ran 4 years on-sky starting in 2009, and is now "iPTF" for another 3. An early foray into the next big theme in astronomy. Total budget ~$3M. Surace 2014

Surace 2011 Former CFHT 12k Camera -> PTF Camera. Eliminated the nitrogen dewar; the camera is now mechanically cryo-cooled. New field flattener, etc. 7.8 square degrees of active area. Surace 2014

The Venerable 48-inch Telescope Surace 2014

PTF camera installed in late 2008; operations started in 2009. Fully robotic operation: it automatically opens, takes calibrations and science data, and adapts to weather closures. Human intervention is used to guide science programs.

Infrared Processing and Analysis Center Surace 2009 IPAC is NASA's multi-mission science center and data archive center for IR/submm astronomy. Specifically, we handle processing, archiving, and/or control for numerous missions including IRAS, ISO, Spitzer, GALEX, Herschel, Planck, and WISE, as well as 2MASS, KI, and PTI. Also the seat of the Spitzer Science Center, NExScI, NED, NStED, and IRSA. Approximately 150 employees in two buildings on the Caltech campus. Surace 2014

R-band Holdings: 1292 nights, 3.1 million images, 47 billion source apparitions (epochal detections). Surace 2014

g-band Holdings: 241 nights, 500 thousand images. Surace 2014

H-alpha Holdings: 99 nights, 125 thousand images. Surace 2014

[Data flow diagram: raw data from the P48 flows to IPAC (ingest, realtime image subtraction, photometric, reference, lightcurve, and moving object pipelines) and to Caltech/Cahill and NERSC (image subtraction and transient detection / RB pipeline); products include transient candidates, lightcurves, reference images and catalogs, epochal images and catalogs, and SSOs.] Surace 2014

IPAC Infrastructure. Data transmission from Palomar via microwave link to SDSC; ~1 TB of data every 4-5 days. 24 drones with 240 cores; mixed Sun and Dell blade units running RHEL. Roughly 0.5 PB of spinning disk in Nexsan storage units. Associated network equipment. Database and file servers. Archive servers. Tape backup. IPAC Morrisroe Computer Center. Surace 2014

Cluster/Parallelization Architecture. PTF data are observed on a fixed system of spatial tiles on the sky, which vastly simplifies data organization and processing. PTF field and CCD combinations are the basic unit used to parallelize processing over multiple cluster nodes; each node processes one CCD at a time. A "Virtual Pipeline Operator" on a master control node oversees job coordination and staging. Multi-tiered local scratch disk, "sandbox" (working area), and archive disk structure; this architecture was inherited from previous projects and driven by issues with very large file counts and I/O-heavy processes. The disk system is shared with the archive due to budget constraints. Surace 2014
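As a rough illustration of this parallelization pattern, the sketch below fans the per field/CCD units of work out over a pool of workers. The field IDs, worker count, and process_ccd() function are hypothetical stand-ins; the real Virtual Pipeline Operator stages jobs across cluster nodes rather than a single process pool.

```python
# Minimal sketch of field/CCD parallelization (hypothetical names and values;
# the production system dispatches jobs to cluster drones, not one local pool).
from multiprocessing import Pool
import itertools

FIELDS = [2049, 2050, 2051]   # PTF tile IDs, illustrative values only
CCDS = range(12)              # the 12 CCDs of the PTF mosaic

def process_ccd(job):
    """Run one pipeline instance on a single (field, ccd) unit of work."""
    field, ccd = job
    # ... stage raw frames to local scratch, run the pipeline modules,
    # write products to the archive disk, update the tracking database ...
    return (field, ccd, "done")

if __name__ == "__main__":
    jobs = list(itertools.product(FIELDS, CCDS))
    with Pool(processes=8) as pool:            # one worker per available slot
        for field, ccd, status in pool.map(process_ccd, jobs):
            print(f"field {field} ccd {ccd}: {status}")
```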

Software Structure. Individual modules are written predominantly in C, but also Fortran, Python, MATLAB, and IDL, connected by a Perl wrapper infrastructure into discrete pipelines. A Postgres database is used for tracking dataflow, data quality, etc. A relational database is not used in the operations system for catalog storage; it is not needed, and flat-file access is more efficient. Heavy use of community software: SExtractor, SWarp, SCAMP, astrometry.net, DAOPHOT, HOTPANTS. Cheaper not to re-invent the wheel. Software is replaced as needed by new code development. Highly agile development program: unknown and changing science requirements, a small team, and no separate development system due to budget constraints! Continuous refinement process. There's a trap with big data development on a new instrument: requirements only become clear once real data are flowing, so the system has to keep evolving. Surace 2014
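A minimal sketch of the wrapper pattern, written in Python with the standard-library sqlite3 module standing in for the operations Postgres database (the production infrastructure is Perl-based); the module command, file names, and table layout below are made up for illustration.

```python
# Sketch of a pipeline wrapper that shells out to a compiled module and records
# the outcome in a tracking database. sqlite3 keeps the example self-contained;
# the real system uses Postgres and Perl wrappers.
import sqlite3
import subprocess
import time

db = sqlite3.connect("tracking.db")
db.execute("""CREATE TABLE IF NOT EXISTS jobs
              (module TEXT, image TEXT, status INTEGER, started REAL, ended REAL)""")

def run_module(cmd, image):
    """Run one external pipeline module on an image and log its exit status."""
    t0 = time.time()
    status = subprocess.call(cmd + [image])
    db.execute("INSERT INTO jobs VALUES (?, ?, ?, ?, ?)",
               (cmd[0], image, status, t0, time.time()))
    db.commit()
    return status

# Example: run SExtractor on a processed CCD image (requires the 'sex' binary;
# the config and image names are hypothetical).
run_module(["sex", "-c", "ptf.sex"], "PTF_d2049_c07.fits")
```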

Realtime Pipeline. Realtime means data are processed as received, with turnaround in 20 minutes; this is needed for same-night followup. Astrometrically and photometrically calibrated. Image subtraction against a reference image library constructed from all the data to date, using in-house software. "Streak detection" for fast-moving objects; the moving object pipeline constructs solar system object tracklets. Transient candidate detection and extraction via PSF-fitting and aperture extraction. Machine learning "scores" the candidates. Image subtractions and candidate catalogs are pushed to an external gateway where they are picked up by the solar system, ToO, and extragalactic marshals. Surace 2014
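A toy sketch of the core subtraction and detection step, on fabricated data: it simply thresholds a direct pixel difference, whereas the production pipeline PSF-matches the new image to the reference before differencing, extracts candidates with PSF-fit and aperture photometry, and scores them with a machine-learned classifier.

```python
# Toy illustration of realtime subtraction/detection; not the production algorithm.
import numpy as np

def find_candidates(new_image, reference, nsigma=5.0):
    diff = new_image - reference
    noise = np.std(diff)                        # crude noise estimate
    ys, xs = np.where(diff > nsigma * noise)    # pixels significantly brighter
    return list(zip(xs.tolist(), ys.tolist(), diff[ys, xs].tolist()))

rng = np.random.default_rng(0)
ref = rng.normal(100.0, 5.0, size=(64, 64))     # fake reference image
new = ref + rng.normal(0.0, 5.0, size=ref.shape)
new[30, 40] += 80.0                             # inject a fake transient
for x, y, flux in find_candidates(new, ref):
    print(f"candidate at x={x}, y={y}, diff flux={flux:.1f}")
```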

Realtime Image Subtraction and Transient Detection Surace 2014 Originally the community “HOTPANTS” package, now replaced with a more sophisticated in-house image subtraction algorithm.

Photometric Pipeline. This pipeline processes data in the traditional manner. It starts up at the end of the night, after all the data have been received. Calibration is derived from the entire night's worth of data; specifically, the bias and flat-fields are derived from the data themselves. Photometric calibration is derived from extracted photometry of all sources, fitting color, extinction, time, and large-scale spatial variations against the SDSS. Typically reaches an accuracy of a few percent. Astrometric calibration is done individually at the CCD level, against a combined SDSS and UCAC4 catalog; typically good to 0.15". Outputs from this pipeline are calibrated single-CCD FITS images and single-CCD catalog FITS binary tables (both aperture and PSF-fit). These are archived through IRSA and available 1-3 days after observation.
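A much-simplified sketch of the photometric solution: fit only a zero point, color term, and airmass (extinction) term to matched SDSS calibrators by linear least squares. The production fit also includes time-dependent and large-scale spatial terms, and all numbers below are fabricated for illustration.

```python
# Simplified nightly photometric calibration fit against SDSS calibrators.
import numpy as np

def fit_zeropoint(m_inst, m_sdss, color, airmass):
    """Solve m_sdss - m_inst = ZP + c*color + k*airmass by least squares."""
    A = np.column_stack([np.ones_like(m_inst), color, airmass])
    coeffs, *_ = np.linalg.lstsq(A, m_sdss - m_inst, rcond=None)
    zp, c, k = coeffs
    return zp, c, k

# Fake calibrator data, roughly in the right ballpark for illustration only.
rng = np.random.default_rng(1)
n = 500
color = rng.uniform(0.3, 1.5, n)               # e.g. (g - r) of the calibrators
airmass = rng.uniform(1.0, 2.0, n)
m_sdss = rng.uniform(15.0, 19.0, n)
m_inst = m_sdss - (27.0 + 0.08 * color - 0.15 * airmass) + rng.normal(0, 0.02, n)
print(fit_zeropoint(m_inst, m_sdss, color, airmass))   # recovers ~(27.0, 0.08, -0.15)
```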

Photometric Pipeline Output. Single R-band thumbnail image of Arp 220, 8 arcminutes across, with the aperture extraction catalog (SExtractor-based) overlaid. All observations and detections of everything are saved in the archive. Products are a reduced image, a bit-encoded data quality mask, and catalogs. All products are FITS.

Reference Image Pipeline. Once enough individual observations accumulate, the "reference image" pipeline is triggered. This pipeline coadds the existing data after selecting the "best frames", e.g. best seeing, photometric conditions, astrometry, etc. Coaddition is done per CCD id, PTF tile, and filter. These images are the reference for the static sky, at a level deeper than the individual observations. "Reference catalogs" are extracted from these images. This concept is important because these are the underlying basis of both the image subtractions and the light-curve pipeline. Like PTF coverage, the depth of these is variable, but is currently 5 < n < 50. The resulting products are FITS images and FITS binary tables.
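A simplified sketch of the reference build for one (field, CCD, filter) combination: rank frames by seeing, keep the best, and median-combine them. The real pipeline also cuts on photometric conditions and astrometry and uses community tools such as SWarp for reprojection and coaddition; here the frames are assumed to be already registered, and all data are fake.

```python
# Sketch of reference-image construction from the best-seeing frames.
import numpy as np

def build_reference(frames, seeing, n_best=34):
    """frames: list of 2-D arrays; seeing: per-frame FWHM in arcsec."""
    order = np.argsort(seeing)                   # best (smallest) seeing first
    best = [frames[i] for i in order[:n_best]]
    return np.median(np.stack(best), axis=0)     # outlier-resistant coadd

# Fake stack of pre-registered 64x64 frames with varying seeing.
rng = np.random.default_rng(2)
frames = [rng.normal(100.0, 8.0, size=(64, 64)) for _ in range(50)]
seeing = rng.uniform(1.5, 3.5, size=50)
ref = build_reference(frames, seeing)
print(ref.shape, round(ref.std(), 2))            # noise drops roughly as sqrt(n)
```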

Reference Images: a single 60 s image compared with a stack of 34 (Field 5257, Chip 7). Surace 2014

Deep Sky Coadds aka “Reference Images” * Results not typical. Near Galactic Center. Surace 2014

Deep Coadds Surace 2014

Light Curve Pipeline. Each night, all detected sources from the photometric pipeline are matched against the reference catalog (better than a generic catalog-matching approach). All sources ever seen for a given CCD, PTF tile, and filter combination are loaded and analyzed. The least variable sources are used as anchors for the calibration. Image-by-image correction factors are computed for each image as a whole and stored as a lookup table. Application of these secondary correction factors improves the overall relative calibration to near-millimag levels for bright sources (that part is important). Triggers less frequently (planned weekly updates). The highest level of our products.
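A sketch of the relative-photometry step on fabricated data: pick the least variable stars as anchors, derive one magnitude offset per image, and apply it to all sources in that image. Array shapes, the number of anchors, and the noise levels are illustrative only.

```python
# Sketch of per-image relative-photometry corrections from anchor stars.
import numpy as np

def relative_correction(mags, ref_mags, n_anchors=50):
    """mags: (n_images, n_stars) epochal mags; ref_mags: (n_stars,) reference mags."""
    scatter = np.nanstd(mags - ref_mags, axis=0)            # variability per star
    anchors = np.argsort(scatter)[:n_anchors]               # least variable stars
    offsets = np.nanmedian(mags[:, anchors] - ref_mags[anchors], axis=1)
    return mags - offsets[:, None], offsets                 # corrected light curves

rng = np.random.default_rng(3)
n_img, n_star = 200, 300
ref = rng.uniform(14.0, 18.0, n_star)
zp_err = rng.normal(0.0, 0.03, n_img)                       # per-image calibration error
mags = ref[None, :] + zp_err[:, None] + rng.normal(0, 0.005, (n_img, n_star))
corrected, offsets = relative_correction(mags, ref)
print(round(np.std(corrected - ref[None, :]), 4))           # residual scatter ~0.005 mag
```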

Surace 2014. From Van Eyken: binary star light curves taken from PTF-processed images in Orion.

Example Light Curves. Something a little different: these are relatively faint asteroid light curves from Chang et al. Surace 2014

PTF Archive at IRSA. Surace 2014. Data products can be searched and retrieved via sophisticated GUI tools and also through an application program interface that allows integration of the archive into other, third-party software.

PTF Archive at IRSA. Surace 2014. IRSA is looking to hire a UI software developer; see the Caltech website or ask Steve Groom at this meeting.

PTF “Marshals” PTF “Science Marshals” sit on top of the data archive. Marshals are like interactive science twikis. Marshals are predominantly written by science users for their science collaborations, with coordinated interaction between them and the ops/archive system. The ops system produces science products (e.g. data), the archive produces access to science products, the marshals help turn the science products into science results (e.g. papers). They can be used to classify data, listen for alerts, lay down new observations for robotic followup, coordinate collaborators, etc. Surace 2014

iPTF Extragalactic Marshal Surace 2014

iPTF Extragalactic Marshal Surace 2014

NEA “Streaker” Marshal Surace 2014

NEA “Streaker” Marshal Surace 2014

GRB Target of Opportunity (ToO) Marshal. iPTF ToO Marshal iPhone App. Surace 2014. GRBs and (should they ever be detected) gravitational waves can only be localized to tens to a few hundred square degrees. PTF and ZTF can survey these areas in tens of minutes as targets of opportunity to localize fading electromagnetic counterparts. The Marshal receives alerts from Fermi and Swift, automatically lays down proposed ToO observations, and alerts a user by phone to activate the followup.

Zwicky Transient Facility. ZTF was awarded full funding through NSF MSIP (the Mid-Scale Innovations Program). ZTF is now a roughly 50:50 public:private partnership, with a total budget of ~$17M. More or less what PTF was, but an order of magnitude more of it. Surace 2014

Wafer-Scale CCDs. Surace 2014. e2v CCD231-C6: 6k x 6k format with 15 micron pixels, a little under 4 inches on a side. Focal plane readout time <10 seconds! 16 CCDs, 4 readouts each. And they are cheap. The 30-second cadence means 1.2 GB of raw data every 45 seconds, ~16x the current data rate from PTF. 5 CCDs in hand, the remaining 11 now ordered.
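A quick back-of-the-envelope check of the quoted data rate, assuming 16-bit raw pixels and approximate CCD dimensions (both are assumptions, not project specifications):

```python
# Rough check of the ZTF raw data volume per exposure and the sustained rate.
n_ccds = 16
pixels_per_ccd = 6144 * 6160          # "6k x 6k" format, approximate dimensions
bytes_per_pixel = 2                   # assumed 16-bit raw pixels
exposure_cycle_s = 45                 # 30 s exposure plus readout/slew overhead

bytes_per_exposure = n_ccds * pixels_per_ccd * bytes_per_pixel
print(f"{bytes_per_exposure / 1e9:.2f} GB per exposure")                    # ~1.2 GB
print(f"{bytes_per_exposure / exposure_cycle_s / 1e6:.0f} MB/s sustained")  # ~27 MB/s
```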

Surace 2014. The ZTF camera FOV is 50 square degrees, the largest camera by area on a >1 m telescope in the world. Or, to make it a little clearer, here's Orion: the white box is the ZTF imaging area, and the Moon is in the upper right corner of the white box.

Surace 2014

And to Process All This? Surace 2014. IPAC is the data processing and archive center for all aspects of ZTF. Continuous raw data flow of 30 MB/s, with data products at the PB/yr scale. Drone farm of 128 computers. Replication of the proven PTF design in subunits similar to the PTF data load (camera quadrants).
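As a rough scale check (an estimate, not a project figure): a sustained 30 MB/s, if kept up around the clock, is just under a petabyte per year, consistent with PB/yr-scale product volumes; actual totals depend on the nightly duty cycle and product expansion factors.

```python
# Scale check: sustained 30 MB/s over a full year.
seconds_per_year = 365.25 * 24 * 3600
rate_mb_s = 30
print(f"{rate_mb_s * seconds_per_year / 1e9:.2f} PB/yr")   # ~0.95 PB/yr
```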

Surace 2014

Transient Science Summer Schools

Schedule
Early 2014 – PTF data for selected high-cadence fields (M81, Beehive, Orion, Kepler, Stripe 82, Cass-A)
Complete PTF Archive release
Rolling releases of the iPTF Archive, including deep reference images and light curves
ZTF first light (Jan), commissioning of the camera, building of new reference images
First ZTF data release (images, catalogs, light curves, transient candidates)
2019 – Release of transient alerts
NSF-funded period ends; the project continues with private partners
Surace 2014

Surace 2014