1 KFPA Critical Design Review – Fri., Jan. 30, 2009
KFPA Data Pipeline
Bob Garwood – NRAO-CV

2 KFPA Critical Design Review – Fri., Jan. 30, 2009
History
● Science and Data Pipeline Workshop (November) – initial pipeline sketch.
● Conceptual Design Review (February) – initial design.
● KFPA Data Analysis Meeting (June).
● Memo describing possible KFPA observing modes (Pisano, August 2008).

3 KFPA Critical Design Review – Fri., Jan. 30, 2009
Changes since Conceptual Design Review
● Basic design essentially unchanged.
● Out-of-scope items (deferred):
– continuum
– cross-correlation (polarization)
– complicated calibration schemes (“basketweaving”)
● Baseline fitting added as an explicit step.

4 KFPA Critical Design Review – Fri., Jan. 30, 2009

5 Existing GBT Data Analysis Software
● The sdfits tool produces the SDFITS file – it associates raw data from a backend (DCR, SP, Spectrometer) with metadata describing the observations. (data capture; a minimal read sketch follows below)
● GBTIDL – the recommended spectral line analysis tool. Focused on processing and analysis of individual spectra, not on imaging; used to prepare the data to be imaged elsewhere. (calibration, editing)
● AIPS is used to produce images.
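For illustration only, a minimal sketch of inspecting a GBT SDFITS file with astropy (this is not part of the sdfits/GBTIDL tool chain itself; the file name, the "SINGLE DISH" extension, and the column names are assumptions based on typical SDFITS output):

from astropy.io import fits

# Hypothetical quick look at an SDFITS file; all names below are assumptions.
with fits.open("TKFPA_01.raw.acs.fits") as hdul:   # made-up file name
    sd = hdul["SINGLE DISH"]          # SDFITS binary table extension
    print(sd.columns.names)           # e.g. DATA, TCAL, FEED, ...
    spectrum = sd.data["DATA"][0]     # raw bandpass of the first row
    print(spectrum.shape)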

6 KFPA Critical Design Review – Fri., Jan. 30, 2009
We can reduce K-band data now
● K-band spectrometer data has been calibrated and imaged using the existing tools.

7 KFPA Critical Design Review – Fri., Jan. 30, 2009
Missing components
● None of the steps to an image are automated.
● Uses lab-measured Tcal values.
● Uses a scalar Tcal, without regard to any structure in Tcal across the bandpass.
● Cross-correlation (polarization) data is not supported after the sdfits step.
● Poor support for continuum data.

8 KFPA Critical Design Review – Fri., Jan. 30, 2009
Missing components, continued
● Only a prototype tool exists for visually interacting with large amounts of data (e.g. visual flagging).
● Only prototype tools exist for statistically flagging or editing the data (e.g. RFI rejection).

9 KFPA Critical Design Review – Fri., Jan. 30, 2009
Goals of the Prototype Pipeline
● Support KFPA commissioning.
● Explore new processing tools/techniques not yet widely available in Green Bank (vector calibration, statistical data flagging and editing, visualization, parallel processing).
● Prototype an automated pipeline – add the necessary metadata to capture user intent.
● Prototype tools necessary to support a larger focal plane array (e.g. parallel computing).

10 KFPA Critical Design Review – Fri., Jan. 30, 2009
Goals, continued
● Based on the prototyped tools, estimate the cost of delivering a pipeline and the computing hardware necessary to handle the expected data rates of a larger focal plane array.
● Develop these tools and the pipeline infrastructure for use with data from other backends.

11 KFPA Critical Design Review – Fri., Jan. 30, 2009
Pipelines
● A crude pipeline can be assembled from existing components for quick-look images.
– Small modification to sdfits (data capture) to properly combine individual feed offsets with the pointing position (see the sketch below).
– Some additional metadata to capture default image parameters and the associated “off” information.
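A minimal sketch of how a feed's cross-elevation and elevation offsets could be combined with the boresight pointing, assuming one common small-angle convention; the function name, arguments, and example numbers are made up, and the actual sdfits change may differ:

import numpy as np

# Illustrative only: small-angle approximation for per-feed pointings.
def feed_pointing(az_bore_deg, el_bore_deg, xel_off_deg, el_off_deg):
    """Return (az, el) of a feed from the boresight pointing plus feed offsets."""
    el = el_bore_deg + el_off_deg
    az = az_bore_deg + xel_off_deg / np.cos(np.radians(el_bore_deg))
    return az, el

# e.g. a feed offset ~95 arcsec in cross-elevation, observed at 45 deg elevation
print(feed_pointing(120.0, 45.0, 95.0 / 3600.0, 0.0))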

12 KFPA Critical Design Review – Fri., Jan. 30, 2009
Pipelines
● Imperative for a large focal plane array – large data rates and volume.
● Necessary even for a modest 7-element array.
● Useful for data from other GBT backends:
– Users often end up creating partial pipelines.
– The NRAO archive needs this to be able to provide more than just the raw GBT data.
– Other telescopes routinely provide roughly-calibrated data to their users – most institutions consider this the starting point of a data pipeline.

13 KFPA Critical Design Review – Fri., Jan. 30, 2009
Pipelines
● Requires using a standard observing mode.
– Sufficient metadata needs to be captured to drive the pipeline (e.g. groups of scans that should be processed together, associated “off” information, etc.); a sketch of the kind of metadata involved follows below.
● Individual components can be used outside of the pipeline – often with additional options.
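Purely as an illustration of the kind of per-group metadata that could drive the pipeline (every field name here is invented, not the actual design):

from dataclasses import dataclass, field

# Hypothetical scan-group metadata; field names and defaults are assumptions.
@dataclass
class ScanGroup:
    project: str                                     # GBT project ID
    map_scans: list = field(default_factory=list)    # scans to process/image together
    off_scans: list = field(default_factory=list)    # associated "off"/reference scans
    observing_mode: str = "OnOffMap"                 # standard mode that triggers the pipeline
    image_center: tuple = (0.0, 0.0)                 # default image parameters (deg)
    image_size: tuple = (256, 256)

group = ScanGroup(project="AGBT09A_001", map_scans=list(range(10, 30)), off_scans=[9, 31])
print(group)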

14 KFPA Critical Design Review – Fri., Jan. 30, 2009
Pipeline
● None of these steps is unique to the KFPA.
– KFPA-specific steps are most likely to appear in the statistical flagging and editing component and in data capture.
● Components are being developed independently – no dependencies between components.
● Some components are likely to be useful interactively – especially flagging and editing.

15 KFPA Critical Design Review – Fri., Jan. 30, 2009
Pipeline design, continued
● Eventually, continuum data will be extracted from the spectral line data at the appropriate point in the pipeline. This work is out of scope for the initial pipeline.
● Language – Python:
– experience with Python in Green Bank;
– the same language is used in the ALMA pipeline and in CASA.

16 KFPA Critical Design Review – Fri., Jan. 30, 2009
Pipeline design, continued
● Data formats – SDFITS up to the imaging step.
– Currently produced by data capture (sdfits).
– Tools already exist to interact with this data.
– It may be necessary to split the data into multiple SDFITS files for parallel computing (one possible approach is sketched below).
– Alternatives used as necessary – for speed or to take advantage of existing tools (e.g. AIPS).
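One possible approach to splitting a file by feed, sketched only; the "SINGLE DISH" extension name and FEED column are assumptions, and real SDFITS keyword handling would need more care than this:

from astropy.table import Table

# Hypothetical per-feed split of an SDFITS binary table for parallel processing.
def split_by_feed(sdfits_path):
    tab = Table.read(sdfits_path, hdu="SINGLE DISH")
    for feed in sorted(set(int(f) for f in tab["FEED"])):
        sub = tab[tab["FEED"] == feed]
        sub.write(sdfits_path.replace(".fits", f"_feed{feed}.fits"), overwrite=True)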

17 KFPA Critical Design Review – Fri., Jan. 30, 2009
Parallel Computing
● Most of these steps are “embarrassingly parallel” – data from individual feeds can be processed independently (a minimal sketch follows below).
– Exceptions: some statistical flagging and editing, and cross-correlation data; these are out of scope for the initial pipeline.
● Parallel processing will be explored during KFPA pipeline development.
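A minimal sketch of that per-feed parallelism using Python's standard multiprocessing module; calibrate_feed() and the file names are placeholders, not actual pipeline components:

from multiprocessing import Pool

def calibrate_feed(feed_file):
    # placeholder for whatever per-feed calibration/editing step is run
    return feed_file

if __name__ == "__main__":
    feed_files = [f"session_feed{n}.fits" for n in range(7)]    # a 7-element array
    with Pool(processes=7) as pool:
        results = pool.map(calibrate_feed, feed_files)          # one process per feed
    print(results)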

18 KFPA Critical Design Review – Fri., Jan. 30, 2009
Development Priorities
● Calibration
– Complete the GBTIDL vector Tcal and initial calibration database work (a sketch of vector-Tcal calibration follows below).
– Design the pipeline calibration database.
● Data Capture
– This is the current bottleneck. Work is underway to improve the processing speed; a new raw data format may be necessary.
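A minimal numpy sketch of position-switched calibration with a per-channel (vector) Tcal rather than a scalar value; the array names are assumptions, and in practice the per-channel system temperature would likely be smoothed:

import numpy as np

def calibrate_ps(sig_calon, sig_caloff, ref_calon, ref_caloff, tcal):
    """Antenna temperature Ta(nu) for one integration; all inputs are 1-D spectra."""
    # System temperature from the reference scan, channel by channel
    tsys = tcal * ref_caloff / (ref_calon - ref_caloff) + tcal / 2.0
    sig = 0.5 * (sig_calon + sig_caloff)   # average the cal-on/cal-off states
    ref = 0.5 * (ref_calon + ref_caloff)
    return tsys * (sig - ref) / ref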

19 KFPA Critical Design Review – Fri., Jan. 30, 2009
Priorities, continued
● Data capture (continued)
– Ensure that feed offsets are combined properly with the pointing direction to get individual feed pointings.
– Put default calibration values into the calibration database (GBTIDL model first, pipeline model when the design is completed).
– Add appropriate metadata as necessary to automate data flow through the pipeline.

20 KFPA Critical Design Review – Fri., Jan. 30, 2009
Priorities, continued
● Pipeline design and implementation
– Automate the flow of data between existing components.
– Initially this will be a simple script, triggered by the standard observing modes, using default values and the available metadata (a skeleton is sketched below).
– It will be possible to re-run the pipeline using alternative parameters (e.g. baseline fits, additional statistical flags, interactive flagging and editing, etc.).
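A skeleton, not the actual design, of what such a simple driver script might look like; every step function below is a placeholder stub:

def data_capture(scan_group):   return {"scans": scan_group}        # sdfits + metadata
def calibrate(data, tcal):      return {**data, "tcal": tcal}       # vector Tcal
def flag(data, **kw):           return data                         # statistical flags
def fit_baselines(data, **kw):  return data                         # baseline removal
def make_image(data, **kw):     return data                         # gridding/imaging

def run_pipeline(scan_group, rerun_options=None):
    """Default run uses standard values; re-runs may override per-step options."""
    opts = rerun_options or {}
    data = data_capture(scan_group)
    data = calibrate(data, tcal="vector")
    data = flag(data, **opts.get("flagging", {}))
    data = fit_baselines(data, **opts.get("baseline", {}))
    return make_image(data, **opts.get("imaging", {}))

print(run_pipeline(scan_group=[10, 11, 12]))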

21 KFPA Critical Design Review – Fri., Jan. 30, 2009
Priorities, continued
● Data Visualization
– Evaluate existing tools for viewing and interacting with GBT data in SDFITS form:
● data quality throughout the pipeline
● interactive flagging
– Summer student project (2008) – prototype data viewer. It can do interactive flagging but is not sufficiently general.
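As an illustration of the kind of quick-look display such a viewer would provide (simulated data only, not GBT data), a waterfall plot of integrations versus channels with matplotlib:

import numpy as np
import matplotlib.pyplot as plt

data = np.random.normal(1.0, 0.05, size=(200, 1024))   # integrations x channels
data[:, 300:305] += 2.0                                 # a fake narrow-band RFI feature

plt.imshow(data, aspect="auto", origin="lower")
plt.xlabel("channel")
plt.ylabel("integration")
plt.colorbar(label="relative power")
plt.show()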

22 KFPA Critical Design Review – Fri., Jan. 30, 2009
Priorities, continued
● Investigate simple parallel processing options:
– start with existing code (sdfits);
– take advantage of the independence of data from each feed;
– keep things simple.

23 KFPA Critical Design Review – Fri., Jan. 30, 2009
Priorities, continued
● Statistical data flagging (a simple example is sketched below)
– Borrow from code developed by GBTIDL users.
– Borrow from the aips++/CASA autoflagger.
– Develop a “basketweaving” equivalent for the KFPA array:
● use (near) crossing points on the sky (same feed; multiple feeds) to adjust the data;
● out of scope for initial pipeline development.
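A minimal sketch of one common statistical approach (median / median-absolute-deviation thresholding per channel); it stands in for whatever the borrowed GBTIDL-user or aips++/CASA autoflagger code actually does:

import numpy as np

def mad_flag(spectra, threshold=5.0):
    """Boolean mask (True = flagged) for an (integrations x channels) array."""
    med = np.median(spectra, axis=0)
    mad = np.median(np.abs(spectra - med), axis=0)
    robust_sigma = 1.4826 * mad                 # MAD -> sigma for Gaussian noise
    return np.abs(spectra - med) > threshold * robust_sigma

data = np.random.normal(1.0, 0.05, size=(100, 1024))
data[40, 512] = 10.0                            # inject an outlier to catch
print(mad_flag(data).sum(), "samples flagged")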

24 KFPA Critical Design Review – Fri., Jan. 30, 2009
Priorities, continued
● Algorithm development (calibration, continuum data handling, etc.) – Roberto Ricci, U. Calgary.

25 KFPA Critical Design Review – Fri., Jan. 30, 2009
Resources
● Bob Garwood, NRAO – 1 FTE, component design and development
● Roberto Ricci, U. Calgary – algorithm development