ALMA Integrated Computing Team Coordination & Planning Meeting #1 Santiago, 17-19 April 2013 ACA plan Manabu Watanabe National Astronomical Observatory.

Presentation transcript:

ALMA Integrated Computing Team Coordination & Planning Meeting #1 Santiago, April 2013 ACA plan Manabu Watanabe National Astronomical Observatory of Japan

ICT-CPM April 2013 ACA involved failures in Q

- 24% of failures originate in bugs in the ACA software (from a simple JIRA ticket analysis).
- Shared memory trouble and bulk data trouble account for up to 10/13.

Planning items and time frame (1)

- ACA planning items with rough time frames

Planning items and time frame (2)

- ACA planning items with rough time frames (continued)

Bug fix

- Failure in attaching the shared memory
  When this happens the observation fails, and the container involved must be shut down and restarted.
- sendData() method sometimes takes very long to return
  When this happens the observation fails because the CDP master fails to send data, and the observation must be run again. The root cause is still unclear: network, software (ACACORR, BDS, bulk data receivers), ...
- XP delay should be effective for single-dish observation as well
  The cross-polarization delay does not work for single-dish cross-polarization observations. It does work properly for interferometry.
- 1.907 kHz shift in the center frequency of channels
  The channel center frequencies are always shifted by about 1.907 kHz between the frequency label and the actual spectra.
- Phase of the Walsh function should be changed every subscan
  ACACORR has assumed that the 90-degree phase switching starts at the beginning of each subscan, but the LO actually starts the switching from T00:00:00, so the two switching phases can differ from each other. ACACORR plans to change the starting phase of the 90-degree phase switching for each subscan.
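The last item above amounts to deriving the starting phase of each subscan from a fixed reference epoch instead of assuming it is zero. A minimal sketch of that arithmetic follows. The function name, the 16 ms step, the sequence length of 4, and the reference date are all illustrative assumptions and are not taken from ACACORR or the LO design.

```python
from datetime import datetime, timedelta, timezone


def switching_step_at(subscan_start, reference_epoch, step, sequence_length):
    """Return the index in the switching sequence active at subscan_start.

    Hypothetical helper: the sequence is assumed to advance one step every
    `step` and to repeat after `sequence_length` steps, counted from
    `reference_epoch` rather than from the start of the subscan.
    """
    if subscan_start < reference_epoch:
        raise ValueError("subscan starts before the reference epoch")
    # timedelta // timedelta yields an exact integer number of whole steps.
    elapsed_steps = (subscan_start - reference_epoch) // step
    return elapsed_steps % sequence_length


# Illustrative values only (assumptions, not ALMA parameters).
reference = datetime(2013, 4, 17, tzinfo=timezone.utc)
start = reference + timedelta(seconds=10)
starting_step = switching_step_at(start, reference, timedelta(milliseconds=16), 4)
```

With these assumed values the subscan begins 625 steps after the epoch, so it starts at step 1 of the sequence rather than step 0.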

Adjustment to ACA correlator (1)

- Relax the health check of the 3-bit histogram
  The observation fails when the total number of samples in the histogram is NOT equal to the expected count at the correlator calibration. That count corresponds to 960 ms, the sampling period of the histogram. This health check fails frequently in observations these days, and Fujitsu confirms that the histogram is sound even when its total number of samples differs from the expected count. We plan to relax the health check of the histogram.
- Increase the time interval for getting the 3-bit histogram
  The ACA correlator sometimes fails in inter-module communication, and the failure may lead to an observation failure. We suspect that fetching the 3-bit histogram frequently disturbs the inter-module communication of the ACA correlator. This change should be available in April or May, after further investigation of the problem.
- Remove the check of the FFT overflow flag in CDP nodes
  CDP nodes print FFT overflow messages when they detect the FFT overflow flag in the data header received from the ACA correlator. However, the ACA correlator has been changed: the flag is still there but is no longer trustworthy. It would be nice to remove the unreliable FFT overflow messages from the container log of the CDP nodes.
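One way to picture the relaxed histogram check is as a tolerance on the total sample count instead of strict equality. The sketch below is only an illustration. The function name, the tolerance parameter, and the sample numbers are assumptions, and the real expected count is not reproduced here.

```python
def histogram_is_healthy(bin_counts, expected_total, tolerance=0):
    """Return True if the histogram total is within `tolerance` of the expected count.

    Hypothetical check: tolerance=0 reproduces a strict equality test,
    while a positive tolerance relaxes it.
    """
    if len(bin_counts) != 8:
        # A 3-bit sampler has 2**3 = 8 quantization levels.
        raise ValueError("a 3-bit histogram must have 8 bins")
    total = sum(bin_counts)
    return abs(total - expected_total) <= tolerance


# Illustrative numbers only: one sample is missing from the last bin.
counts = [100, 100, 100, 100, 100, 100, 100, 99]
strict_result = histogram_is_healthy(counts, expected_total=800)
relaxed_result = histogram_is_healthy(counts, expected_total=800, tolerance=5)
```

Here the strict check rejects the histogram and the relaxed one accepts it. How large a tolerance is safe is a question for the hardware vendor, not something this sketch decides.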

Adjustment to ACA correlator (2)

- Suppress warnings for the FFT overflow and the delta-sigma overflow
  CCC prints messages about FFT overflows and delta-sigma overflows when the ACA correlator detects them. This is useful information. The problem is that FFT overflows and delta-sigma overflows happen continuously during the interval between observations. During that interval the input signals from the antennas may NOT be reliable (missing frames, broken frames, zero signal levels, and so on), so the overflow messages are useless and very annoying. It would be nice to remove these overflow messages during the interval between observations.
- Parallelize the monitor commands for all quadrants
  CCC monitors the status (temperature, fan speed, voltage) of the ACA correlator. The monitoring will be parallelized across the 4 quadrants. The "get hardware failure" command will be parallelized as well.
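The parallelized monitoring could look roughly like the following sketch, which issues one blocking request per quadrant from a thread pool. `read_quadrant_status` is a hypothetical stand-in with made-up readings. It is not the CCC monitor command, whose interface is not described in these slides.

```python
import time
from concurrent.futures import ThreadPoolExecutor

QUADRANTS = (0, 1, 2, 3)


def read_quadrant_status(quadrant):
    """Hypothetical stand-in for a blocking monitor request to one quadrant."""
    time.sleep(0.05)  # simulate the hardware round trip
    return {
        "quadrant": quadrant,
        "temperature_c": 40.0 + quadrant,  # made-up reading
        "fan_rpm": 3000,                   # made-up reading
        "voltage_v": 12.0,                 # made-up reading
    }


def monitor_all_quadrants(read=read_quadrant_status):
    """Query all quadrants concurrently and return results in quadrant order."""
    with ThreadPoolExecutor(max_workers=len(QUADRANTS)) as pool:
        # Executor.map preserves the order of its input iterable.
        return list(pool.map(read, QUADRANTS))


statuses = monitor_all_quadrants()
```

Threads suit this case because each request mostly waits on I/O. The total time then approaches that of the slowest quadrant instead of the sum over all four.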

New features (1)

- New ACACorrGUI
  We have several requests to improve ACACorrGUI. Some of them motivate a totally new ACACorrGUI. The new ACACorrGUI will receive the BDF transmitted from the CDP master and display the spectra for all baselines at once.
- Alarm based on analysis of container log files
  Failures recur at a steady rate in observations with the ACA correlator, and every time it takes a long time to identify the root cause. We plan to implement a simple log inspection program that raises alarms by recognizing some of the familiar failures.
- ACA-specific delay read from TMCDB
  The ACA correlator needs its own specific delay compensation. Takeshi Kamazaki requests that the specific delay be stored in TMCDB so that it can be changed when necessary.
- Window function read from TMCDB
  The ACA correlator applies a window function as a weighted running mean. Takeshi requests that the weight function be stored in TMCDB so that it can be changed when necessary.
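A simple log inspection program of the kind described above might be sketched as a table of known failure signatures matched against each log line. The signature names, the regular expressions, and the sample log lines below are invented for illustration and do not reproduce actual ACA container log messages.

```python
import re

# Hypothetical signatures for familiar failures; real log text will differ.
KNOWN_FAILURES = {
    "shared_memory_attach": re.compile(r"attach.*shared memory", re.IGNORECASE),
    "send_data_slow": re.compile(r"sendData\(\).*took"),
}


def scan_log(lines):
    """Return (line_number, failure_name) for every line matching a known signature."""
    alarms = []
    for line_number, line in enumerate(lines, start=1):
        for name, pattern in KNOWN_FAILURES.items():
            if pattern.search(line):
                alarms.append((line_number, name))
    return alarms


# Invented sample lines, for demonstration only.
sample_log = [
    "INFO subscan started",
    "ERROR failed to attach shared memory segment",
    "WARNING sendData() took 42 s to return",
]
alarms = scan_log(sample_log)
```

Keeping the signatures in one table means a newly recognized failure can be added as a single entry without touching the scanning loop.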

New features (2)

- Finite dead time in bin switching
  ACACORR supports bin switching. The dwell time must be given in advance and the dead time is assumed to be zero. These assumptions are reasonable for frequency switching but not for nutator switching.
- Increase the number of bins (3 or more)
  ACACORR supports bin switching for the 2-bin use case. The bins of ACACORR should be extended if 3 or more bins are needed.
- WVR coefficients
  ACACORR takes the validity period of the WVR coefficients into account, but CORR does not. CORR can hold multiple sets of WVR coefficients, one per spectral window, whereas ACACORR can hold only one set for the receiver band at a time. ACACORR should (or should not?) follow CORR.
- ACACORR porting to 64-bit OS
  ACACORR should be ported to 64-bit RH6.4 or so.
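To see why the zero-dead-time assumption matters, the sketch below computes the fraction of a switching cycle spent integrating when every bin is followed by a dead interval. The function name and the millisecond values are illustrative assumptions, not ACACORR parameters.

```python
def switching_efficiency(dwell_ms_per_bin, dead_ms):
    """Fraction of one switching cycle spent integrating.

    Hypothetical model: one cycle visits each bin once, and every bin is
    followed by `dead_ms` during which no data is accumulated.
    """
    if dead_ms < 0 or any(d <= 0 for d in dwell_ms_per_bin):
        raise ValueError("dwell times must be positive and dead time non-negative")
    useful = sum(dwell_ms_per_bin)
    cycle = useful + dead_ms * len(dwell_ms_per_bin)
    return useful / cycle


# Illustrative values: two bins of 100 ms each.
ideal = switching_efficiency([100, 100], dead_ms=0)      # zero dead time
with_dead = switching_efficiency([100, 100], dead_ms=25)  # finite dead time
```

With these assumed numbers the ideal case integrates for the whole cycle, while a 25 ms dead time per bin leaves 200 ms of integration in a 250 ms cycle.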

New features (3)

- BDNT configuration read from TMCDB
  Bogdan requests that CORR and ACACORR read the BDNT configuration from TMCDB.
- TCP connection in BDNT
  Bogdan requests that CORR and ACACORR use TCP instead of UDP for data transmission from the CDP nodes to the CDP master.
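One practical consequence of moving from UDP to TCP is that TCP delivers a byte stream without message boundaries, so the sender has to frame each block of data itself. The sketch below shows generic length-prefixed framing over a stream socket. It is not the BDNT protocol, and the function names and the 4-byte header are assumptions made for illustration.

```python
import socket
import struct

HEADER = struct.Struct("!I")  # 4-byte big-endian payload length (assumed format)


def send_frame(sock, payload):
    """Send one length-prefixed frame over a stream socket."""
    sock.sendall(HEADER.pack(len(payload)) + payload)


def recv_exact(sock, size):
    """Read exactly `size` bytes, looping because recv may return fewer."""
    chunks = []
    remaining = size
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("peer closed the connection mid-frame")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def recv_frame(sock):
    """Receive one length-prefixed frame and return its payload."""
    (length,) = HEADER.unpack(recv_exact(sock, HEADER.size))
    return recv_exact(sock, length)


# Local demonstration with a connected pair of stream sockets.
sender, receiver = socket.socketpair()
send_frame(sender, b"subscan-data")
send_frame(sender, b"")
first = recv_frame(receiver)
second = recv_frame(receiver)
sender.close()
receiver.close()
```

The receiver recovers each payload intact even though both frames were written back to back into the same stream.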

Request for improvement (1)

- Increasing efficiency (1) in Tsys measurement
  Stuart requests that ACA reduce the Tsys measurement time to 30 seconds from the 2 minutes it currently takes. We think we can reduce it to 1 minute by taking advantage of the subscan sequence with "delta requantization correction".
- Increasing efficiency (2)
  Takeshi requests a reduction of the overhead (lead time and processing time), which is about 20 seconds for the correlator calibration and about 15 seconds for the real observation. The slow response of the ACA correlator accounts for most of the lead time, so software changes can recover only a limited amount of time.
- Special data rate calculation in AUTO_ONLY mode
  Takeshi requests that, in AUTO_ONLY mode, the data rate be calculated as for a TP array when the array contains 4 or fewer antennas, regardless of their CAIs.
- Reduce unnecessary warning messages
  It would be nice to reduce the annoying log messages where practical.
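The AUTO_ONLY request above reduces to a small decision rule, sketched here. The function name and the `max_tp_antennas` parameter are hypothetical, the mode strings are used only as example labels, and the actual data rate formula is not part of this sketch.

```python
def use_tp_data_rate(correlation_mode, num_antennas, max_tp_antennas=4):
    """Decide whether the data rate should be computed as for a TP array.

    Hypothetical rule following the request: in AUTO_ONLY mode, arrays of
    `max_tp_antennas` or fewer antennas are treated as a TP array. The
    antennas' CAIs are deliberately not consulted.
    """
    if num_antennas < 1:
        raise ValueError("an array needs at least one antenna")
    return correlation_mode == "AUTO_ONLY" and num_antennas <= max_tp_antennas


small_auto_only = use_tp_data_rate("AUTO_ONLY", 4)
large_auto_only = use_tp_data_rate("AUTO_ONLY", 5)
small_cross = use_tp_data_rate("CROSS_AND_AUTO", 4)
```

Only the first case qualifies: the second array is too large and the third is not in AUTO_ONLY mode.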

Request for improvement (2)

- Updating the 3-bit linearity correction every integration
  Takeshi requests an enhancement of the 3-bit linearity correction.
- Updating the delta requantization correction every integration
  Takeshi requests an enhancement of the delta requantization correction.
- Automatic self-test of the ACA correlator when ACACORR starts
  Takeshi requests that ACACORR run a self-test of the ACA correlator (mci_st) automatically whenever ACACORR starts. This will help the operator.

New features unspecified yet

- Digitizer quantization correction
  Takeshi should provide the algorithm for the digitizer quantization correction for the ACA correlator. ACACORR will then implement it.
- Subarraying in an SB
  ACA phase calibration may require subarraying during the execution of an SB. Science should clarify the calibration plan first, and Computing should then discuss the implementation in detail. Scheduling, CONTROL, DataCapture, ASDM, OT, and ACACORR will probably need to be involved.

No plan yet

- 3LO in interferometry
  3LO is available for single-dish observation, but the ACA 3LO does not work as planned for interferometry. Takeshi explains the root cause of the problem in the ticket. Please refer to the ticket for details. Note that 2LO should work properly, and 90-degree phase switching is another alternative.
- Phase-up mode
  The ACA phase-up mode for VLBI has never been considered seriously. If the phase-up mode is needed, the ACA correlator will require further development work, which in turn requires further work on ACACORR.