
1 ES Slowdown, Optimization, Testing

2 Plan for shutdown: Timeline
April: Focus on resolution of major outstanding issues:
– Bulk data deployment → stable use of multiple arrays: status nominally allows effective use of multiple arrays, but it is not clear whether the high rate of execution failures (5 of 15-20) is related.
– Notification channel issues: often result in long delays at start-up and, at times, missing data (WVR/Tsys). The problem comes and goes.
– Data capture issues: container crashes resulting in lost data, which limits the maximum execution time. Status: fixes are in, require testing/verification.
– "Handover" time has grown: systems not coming up; this is a combination of hardware and software issues. Tzu/Nick/Emilio are working on this.
May:
– Start acceptance testing.
– Focus on simulation and testing improvements begins.
– Potential missions to work on remaining issues if needed.
– Focus of debugging moves from issues affecting Cycle 1 to issues affecting the CSV scope of testing for future cycles.

3 Plan for shutdown: planned missions
Bulk data: mission completed, but issues still remain. CSV is using a "stable" version that has troubles (typically about 5-7 executions a night fail for reasons that look like timeout issues).
Notification channel mission: unfortunately marginalized by the power shutdown.
Data capture mission: after this meeting.

4 Plan for shutdown: Obsmode Suite
Test suite for basic, science-like executions. Tests SSR/SOS functionality, data recording and the range of Cycle 1 capabilities.
Completed late March: it is taking a while to get repeatably good datasets. Despite this, reasonable data are now getting to the pipeline testers (SACM group).
This will become the SB execution regression for Cycle 1. It will be extended for Cycle 2 capabilities as they come and are verified (we already have a polarization "science-like" SB that will evolve this way).
All reduction is intended to be done with the Pipeline. Where it cannot be, it will inform the reduction of new modes.

5 Plan for shutdown: Software basic
Minimal set of tests run as a weekly regression to verify functionality.
Will likely consist of Total Power, autocorrelation and ACA+BL correlator runs at the same time (4-5 executions):
– Total power raster
– Autocorrelation raster (to be combined with the above when dual mode works)
– ACA+BL executions of PNT, SBR, Tsys, Bandpass, PNT, Tsys, Bandpass
– These are nominally not directly reducible by the pipeline.
Initial set defined; need to iterate with ADC on the details. The initial proposal sent to ADC has evolved a bit.
A toolkit is being developed based on the MS side. Metrics will include things like detectDelayJump(threshold, timescale), detectPlatforming(threshold), etc. Scan, spw and data-size metrics will also be used to make sure everything that should be there is there, and the flagging fraction will be checked (a sketch of such metrics follows below).
It is assumed that this is throw-away code whose checks will eventually be implemented as metrics in the pipeline. Discuss timeline?
It is intended that the basic executions are not pipeline reducible (too much overhead for a weekly regression).
The idea is for Computing to run the tests and Science to provide the pass/fail criteria.
Contributions likely from CSV, DSO and the ARCs, spanning SACM, DMG and Pipeline-related staff.
Deadline for design and execution blocks: April 30. Deadline for toolkit: TBD (in progress, will likely evolve).
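A minimal, illustrative sketch (Python) of what such metric functions might look like. Only the names detectDelayJump(threshold, timescale), detectPlatforming(threshold) and the flagging-fraction check come from the plan above; the remaining arguments, the numpy-array inputs (assumed to be already extracted from the MeasurementSet) and the detection logic are assumptions for illustration, not the actual toolkit.

    import numpy as np

    def detectDelayJump(phases, times, threshold, timescale):
        # phases: (ntime, nchan) unwrapped phase [rad] for one antenna/spw,
        # assumed already extracted from the MS. The per-integration phase
        # slope across channels is a delay proxy; report the times where it
        # changes by more than `threshold` within `timescale` seconds.
        chan = np.arange(phases.shape[1])
        slopes = np.polyfit(chan, phases.T, 1)[0]  # one slope per integration
        return [times[i] for i in range(1, len(times))
                if times[i] - times[i - 1] <= timescale
                and abs(slopes[i] - slopes[i - 1]) > threshold]

    def detectPlatforming(amplitudes, threshold):
        # Flag a step ("platform") when the median level before and after any
        # point in the amplitude time series differs by more than `threshold`.
        amplitudes = np.asarray(amplitudes, dtype=float)
        return any(abs(np.median(amplitudes[:i]) - np.median(amplitudes[i:])) > threshold
                   for i in range(2, len(amplitudes) - 1))

    def flaggingFraction(flags):
        # Fraction of flagged visibilities; `flags` is a boolean array.
        flags = np.asarray(flags, dtype=bool)
        return flags.sum() / flags.size

In practice these would run per antenna/spw after the data are pulled out of the MS, with Science supplying the thresholds that define pass/fail.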

6 Plan for shutdown: Plan for intensive
Designed to catch issues often present in major releases.
Again, design tools that can eventually be put into the Pipeline when time allows.
All of these need SSR work to make source selection automatic for the "science target" as well as the calibrators. Special execution scripts are not needed.
Designed to use the Pipeline (initial reduction done in the Pipeline; all tool creation in progress is to be absorbed into the Pipeline eventually):
– Frequency labels (SB created)
– Phase transfer, phase/delay jump (mixed mode, SB created)
– Return to phase/delay after band change (SB to be made by end of April)
– TDM phase/delay jump and platforming detection (SB to be created by end of April, fast dumps)
– Scan sequence stresses/latency check (SB to be created by end of April)
Not to be reduced, at least to first order, in the Pipeline:
– Verify execution of all CalTargets and that their results are repeatable (see the repeatability sketch below).
– Includes data checks for "applied online" as well as "reduced offline" targets.
The intensive suite will incorporate new capabilities as they come forward, with the goal of not introducing new tests but incorporating new features into the old tests (not a new idea…).
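A minimal sketch of what the CalTarget repeatability check could look like, assuming per-execution dictionaries of solution arrays (e.g. the same target solved in several executions, or "applied online" versus "reduced offline"); the data structure, the median reference and the tolerance are illustrative assumptions, not a defined interface.

    import numpy as np

    def checkCalTargetRepeatability(results_by_execution, rel_tol=0.05):
        # results_by_execution: {execution_id: {caltarget_name: solution_array}}.
        # A CalTarget passes if every execution's solution agrees with the
        # ensemble median to within rel_tol.
        report = {}
        targets = set().union(*(r.keys() for r in results_by_execution.values()))
        for target in targets:
            solutions = [np.asarray(r[target], dtype=float)
                         for r in results_by_execution.values() if target in r]
            reference = np.median(solutions, axis=0)
            ok = all(np.allclose(s, reference, rtol=rel_tol) for s in solutions)
            report[target] = "pass" if ok else "fail"
        return report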

7 Plan for next year: SSR/SOS and unit tests
SSR/SOS review completed Monday/Tuesday.
High priority placed on the query interface refactor:
– Would like to eventually migrate things into the calibrator catalog interface, but will design now so as to ease this at a later date.
– Target-based queries go into the Target.
High priority placed on merging observing mode functions that are on the SSR/SOS side into Control where needed, making the SSR/SOS obsmode inherit from Control, not the other way around (don't ask…); a sketch of the intended structure follows below.
Development of Sessions, Observatory Calibration Scripts and new modes will add a layer of ObservingStrategy.
Timeline for this full refactor is ~1 year given manpower and the need to develop some new functionality on our side. Development will be done in parallel branches, with the refactor worked on in one branch and separable new capabilities in another.
ScanLists will manage the logic of execution breaks (currently the ScanList is a dumb handler).
Unit tests will be updated as time allows.
Development/refactor assignments: N. Phillips (SIST) observatory calibration scripts; P. Cortes (DSO) sessions and observing-strategy rework; Ignacio Toledo (DSO-DA) query refactor; S. Corder (CSV) ScanList intelligence design; further assignments to other groups as possible (this item is completely dependent on the refactor and is not on the critical path for Cycle 2).
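A rough sketch (Python) of the inheritance direction and new layers described above: the SSR/SOS obsmode inheriting from Control rather than the other way around, an ObservingStrategy layer, and a ScanList that owns the execution-break logic. All class and method names here are illustrative placeholders; only the structural relationships come from the plan.

    class ControlObsMode:
        # Control-side observing mode: the hardware-facing base (assumed shape).
        def execute_scan(self, scan):
            print("executing scan:", scan)

    class ObservingStrategy:
        # New layer added with Sessions, Observatory Calibration Scripts and new modes.
        def order_scans(self, scans):
            return scans

    class ScanList:
        # No longer a dumb handler: it manages the logic of execution breaks.
        def __init__(self, scans):
            self.scans = scans
        def __iter__(self):
            return iter(self.scans)
        def should_break_after(self, scan):
            return False  # placeholder policy; real break criteria TBD

    class SsrObsMode(ControlObsMode):
        # SSR/SOS obsmode inherits from Control, not the other way around.
        def __init__(self, strategy, scan_list):
            self.strategy = strategy
            self.scan_list = scan_list
        def run(self):
            for scan in self.strategy.order_scans(list(self.scan_list)):
                self.execute_scan(scan)
                if self.scan_list.should_break_after(scan):
                    break

    # Hypothetical usage:
    SsrObsMode(ObservingStrategy(), ScanList(["pointing", "bandpass", "science"])).run()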

8 Optimization/Coordination
Who will do which work? During what array time?
What is the timescale for getting performance metrics into the pipeline?
Has CSV left anything out that would help provide a long-term viable observatory operational model (>3 years)?
Are the divisions of the testing suites appropriate/complete?
What level of support can be provided for the refactor/unit tests?
What is the model for getting more coordinated and complete testing into the lower levels?
– Can we test with a more realistic simulation environment? (Better testing of interactions?)
– Can we test with better scalability considerations?

