Published by Hope Sullivan. Modified over 6 years ago.
1
NCRs and Waivers GLAST Large Area Telescope LAT Pre-Shipment Review
Gamma-ray Large Area Space Telescope (GLAST)
Large Area Telescope (LAT) Pre-Shipment Review
NCRs and Waivers
Pat Hascall, Systems Engineering
Stanford Linear Accelerator Center
2
NCR Introduction
- Presentation focus is on the main hardware-related NCRs that remain open
  - Impact assessment for those NCRs identified as “Can Not Duplicate” (CND) will be discussed
- Several NCRs were left open for environmental testing at NRL and will be closed when the final data review is completed after TV (documentation of analysis or minor rework/inspection)
- NCRs are classified into categories for discussion purposes
NCR Summary
- List of all open NCRs is presented for reference
3
NCR Category Definitions
Open NCRs classified into categories for discussion purposes:
Category | Definition | Count
Hardware Discrepancy | H/W issue; half can be closed with minor rework or inspection after TV | 6
FSW Discrepancy | Identified FSW bug; FSW JIRA in work or completed |
Monitor for Verification | Likely test issue; trending for repeat | 2
Known Feature | Specification not violated, but trending or changes required to accommodate behavior | 11
EGSE/Data Processing | NCR has been isolated to EGSE/Data Processing | 12
CND | Could not duplicate | 7
4
NCR Resolution Definitions
Open NCRs classified into categories for discussion purposes:
Closure Plan | Definition | Count
Issue resolved, closure plan in place or no impact to LAT:
  Closing | Issue resolved, documentation required to close | 13
  Close after TV | Monitored during TV with no issues seen (complete writeup and close) or minor rework scheduled after TV | 5
  B0.6.11 | Resolved with FSW release B0.6.11 | 1
  B0.7.0 | Resolved with FSW release B0.7.0 | 4
  B1.0.0 | Resolved with FSW release B1.0.0 |
  Closure deferred | EGSE issues, low priority to fix or hold open to track repeats | 9
Work continuing:
  QAR | CND or other issue expected to be transferred to a QAR | 7
  In process | Investigation ongoing | 3
  Waiver | Waiver in process |
5
NCR Resolution vs Category
6
NCRs For Discussion
NCR, Date Opened, Affected Hardware, Problem Description, NCR Category, Resolution
855 3/23/06 DAQ LATC verify occasionally reports errors in CRC, SPT, ARC CND QAR
880 4/12/06 SIU Spontaneous Reboot
881 EPU Spontaneous Reboot
902 5/9/06 EPU 0 unexpected reboot
922 6/5/06 Housekeeping telemetry stopped FSW Discrepancy in process
946 7/11/06 ACD ACD FREE board 5 not responsive after power up
948 7/19/06 EPU 1 reboot, EPU 2 reboot
949 7/21/06 Primary SIU reboot, redundant SIU reboot
957 8/4/06 EPU 0 CPU junction temperature higher than expected in vacuum Hardware Discrepancy Closure near
975 9/1/06 Datagram timetag error
7
NCR 855 LATC Verify Errors CND, QAR
Issue
- Calorimeter and Tracker front-end register (RC and FE) was not read successfully
Analysis
- Issue in 16 of 3805 LATC configurations
- Affects about 10 bits of ~2 million bits written/read in each LATC execution
- Mostly at the start of commissioning:
  - Fourteen happened in first 290 runs
  - One in run 777
  - One in run 1426
  - None in last 2400 write/reads
  - None using FSW B0.6.9 (2120 runs)
- Have not been able to replicate on the testbed
Resolution Plan
- Mitigation plans in FSW
  - JIRA FSW-653 (B0.6.10) adds a reset to the calorimeter at power up
  - JIRA FSW-729 adds a retry by LATC when a verify error occurs, to be implemented if the problem reappears
Impacts on On-orbit performance
- With no action, the LAT will not start the physics acquisition and one orbit's worth of data would not be collected
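The retry mitigation described under JIRA FSW-729 (repeat the write when the read-back verify reports an error) can be illustrated with a minimal sketch. This is not the flight software: the function name `write_with_verify`, its parameters, and the simulated read-backs are all hypothetical and exist only to show the write/verify/retry pattern.

```python
from typing import Callable


def write_with_verify(write: Callable[[], None],
                      verify: Callable[[], bool],
                      max_attempts: int = 3) -> int:
    """Write a configuration, read it back, and retry on a verify error.

    Hypothetical helper for illustration only. Returns the number of
    attempts used; raises RuntimeError if every attempt fails to verify.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        write()            # push the configuration to the registers
        if verify():       # read back and compare
            return attempt
    raise RuntimeError(f"verify failed after {max_attempts} attempts")


# Simulated hardware (assumption): first read-back fails, second succeeds.
readbacks = iter([False, True])
attempts_used = write_with_verify(lambda: None, lambda: next(readbacks))
print(attempts_used)  # prints 2
```

Bounding the number of attempts keeps a persistent fault from stalling the configuration step indefinitely; the limit of three used here is an arbitrary illustrative choice.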
8
NCR 922 Housekeeping Telemetry Stopped
QAR
Issue:
- During a charge injection test, the LAT housekeeping process (LHK) twice emitted an error and subsequently did not create telemetry packets successfully
Analysis:
- Review by FSW indicates a possible cause: the LHK and LCI (LAT Charge Injection) processes may be in contention for memory
- Was reproduced on testbed, analysis in progress
Resolution Plan:
- Find and correct FSW error
Impacts on On-Orbit performance:
- Loss of housekeeping data until restart
9
NCR 946 ACD Free 5 Power Up CND, QAR
Issue:
- ACD FREE board 5 was not responsive to commands after power up
Analysis:
- Other power up parameters (voltages and currents) reviewed with no abnormalities seen
- Occurred once in over 275 power ups (122 using redundant GASU)
- Two possible causes
  - Look-at-me from redundant GASU did not reach the FREE board, or might have been decoded improperly on it, in which case the board would only respond to commands from the primary (unpowered) GASU
  - Custom power up sequence (compensates for an old bug in ASIC primary/redundant switching code) did not result in successful power up in redundant mode
Resolution Plan:
- Cycle power to FREE card (interim solution)
- Add functionality to FSW to be able to send Look-at-me command to a FREE card without power cycling (JIRA FSW 718 adds capability, targeted for B1.0.0)
Impacts on On-Orbit performance:
- Extend power up sequence timeline slightly
10
CPU Reboots CND, QAR
Issue
- EPU and SIU reboots seen on 7 occasions (next chart has details)
Analysis
- Several of these have likely causes relating to EGSE
- Two recent reboots have occurred without the EGSE symptoms
- Rate is 7 in about 1800 hours, or about once per 250 hours
- Existing data does not give sufficient information to determine the cause
Resolution Plan
- FSW diagnostic software to be loaded that will help diagnose the cause of the reboots by identifying which tasks are executing
  - Records most recent 1024 task switches
  - Can be read out in primary or secondary boot
- Run LAT on a non-interference basis at observatory to accumulate run hours
Impacts on On-orbit performance
- Loss of data and LAT housekeeping telemetry until SIU is rebooted, potential for SC directed loadshed
- Loss of ½ the science data until EPU is rebooted
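The diagnostic approach in the resolution plan (keep only the most recent 1024 task switches so they can be read out after a reboot) is a fixed-depth ring buffer. The sketch below shows that idea only; the class `TaskSwitchTrace`, the record layout, and the task names are hypothetical and are not taken from the flight software.

```python
from collections import deque

TRACE_DEPTH = 1024  # depth quoted on the slide


class TaskSwitchTrace:
    """Fixed-depth trace: the oldest entries are overwritten by new ones."""

    def __init__(self, depth: int = TRACE_DEPTH) -> None:
        # A deque with maxlen silently discards the oldest item when full.
        self._events = deque(maxlen=depth)

    def record(self, tick: int, task_name: str) -> None:
        """Log one task switch (hypothetical record: tick and task name)."""
        self._events.append((tick, task_name))

    def dump(self) -> list:
        """Return the retained switches, oldest first."""
        return list(self._events)


# Simulate 3000 task switches; only the last 1024 are retained.
trace = TaskSwitchTrace()
for tick in range(3000):
    trace.record(tick, f"task_{tick % 5}")

history = trace.dump()
print(len(history), history[0], history[-1])
# prints: 1024 (1976, 'task_1') (2999, 'task_4')
```

A fixed depth bounds the memory cost of tracing, at the price of losing anything older than the last 1024 switches before the readout.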
11
CPU Reboots (Continued)
NCR, Unit, Type of reboot, LAT Activity, Potential Cause
880 SIU redundant VxWorks reboot TkrTotGain_SVC_500hz
881 EPU2 uncorrectable memory error Transient memory error
902 CPU exception, PPC Vector 0x300 (DSI) During LatReinit, concurrent with main feed on command EPU2 was in primary boot and transmitting boot telemetry when script rebooted the SIU. This is a nonstandard configuration that may have contributed to the reboot
948-1 EPU1 Watchdog LPA: tackscan-6_0.55hr LCB errors at the time of the reboot indicate that the VSC was falling behind.
949-1 SIU primary e2e_LAT_22xGammafilterNoPer_0.17hr 1 pps/timetone errors from all 3 processors, and timehack table entry errors from the SIU, indicate reboot likely induced by an incorrect sequence of 1 PPS and timetone messages from the VSC
948-2 Checkstop LAT-22x_0.50hr muon run
949-2 intSeAppLrs_e2e_LAT-22xGammafilterNoPer_0.50hr
No clear pattern has developed
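The reboot rate quoted for these events (7 in about 1800 hours) can be turned into a rough expectation for a given run length. The sketch below assumes reboots arrive independently at a constant rate (a Poisson model); that assumption is not made in the presentation, and the 24-hour window is an arbitrary illustrative choice.

```python
import math

reboots = 7          # from the slide
run_hours = 1800.0   # approximate accumulated run time, from the slide

rate_per_hour = reboots / run_hours
mean_hours_between = run_hours / reboots  # about 257 h, "once per 250 hours"


def prob_at_least_one(hours: float, rate: float = rate_per_hour) -> float:
    """P(at least one reboot in `hours`) under the assumed Poisson model."""
    return 1.0 - math.exp(-rate * hours)


print(f"mean time between reboots: {mean_hours_between:.0f} h")
print(f"P(>=1 reboot in 24 h): {prob_at_least_one(24.0):.3f}")
```

With only seven events the rate itself is poorly constrained, so any figure derived this way is an order-of-magnitude guide rather than a prediction.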
12
NCR 957 EPU 0 CPU Junction Temperature
Issue:
- After applying vacuum, EPU 0 CPU junction temperature was higher than the other CPUs by up to 28 degrees and reached 104 degrees during bakeout. Telemetry returned to normal after repressurization.
Analysis:
- Not observed at unit test or ambient level LAT testing
- Possible causes
  - Degradation of thermal connection from heat sink to CPU
  - CPU on-chip sensor defective (unlikely)
- Thermal analysis of board and tests on similar hardware support first hypothesis (disconnected heat sink: test and analysis results consistent with observed temperature rise)
- Under this condition, thermal and reliability margin still exceeds requirements
  - Expected junction temperature for this box at hot is 80 degrees
- No potential for electrical side effects
  - Heatsink is brazed to the rail, and is adjacent to an empty slot
  - Epoxy is not electrically conductive
  - Board is conformal coated
Resolution Plan:
- Use as is
Impacts on On-Orbit performance:
- None
13
NCR 975 Datagram Time Tags
Issue:
- Some datagrams have a time tag that is 4.2 seconds “fast”
Analysis:
- Rate is about 1 in datagrams
- Likely FSW bug
- Analysis in progress
  - Will reproduce on testbed
- Could potentially be worked around on the ground
Resolution Plan:
- Find bug and fix it
Impacts on On-orbit performance:
- None, if fixed
- May use GSW mitigation until FSW error is found and fixed
14
Monitor for Verification
Open NCRs
NCR, Date Opened, Affected Hardware, Problem Description, NCR Category, Resolution, Could Not Duplicate
535 6/20/05 TKR Monitor trend data for TKR 4 thru LAT testing (spin-off NCR) Known Feature Close after TV
624 8/27/05 ACD Fluctuations in temperature reading (originally an ACD PR) Monitor for Verification
625 AcdVetoHitmapPha apparent retrigger Closure near
626 ACDMonitor script high counts during T/V (originally ACD PR)
684 10/3/05 Noise Occupancy failures due to intermittent hot strips.
718 10/27/05 VETO threshold for channel 1123 can't be set below minimum value of 0.45 pC (originally an ACD PR)
806 1/17/06 Trigger CAL is retriggering during SVAC run Closing
840 3/8/06 DAQ RunControl software problems observed on spare PDU during T/V test (NCR #794); opened NCR to track FSW updates B0.7.0
851 3/16/06 One or more EPU resets unexpectedly due to large number of events and interaction with end of run activities FSW Discrepancy B1.0.0
852 LCB errors observed during muon run EGSE Closure deferred
855 3/23/06 FSW LATC verify occasionally reports errors in CRC, SPT, ARC CND QAR
15
Open NCRs
NCR, Date Opened, Affected Hardware, Problem Description, NCR Category, Resolution, Could Not Duplicate
859 3/23/06 FSW CAL LCI data compression less than expected, results in very large datagrams FSW Discrepancy B0.6.11
880 4/12/06 Integration SIU Spontaneous Reboot CND QAR
881 EPU Spontaneous Reboot
882 DAQ GLAT2525 power up software crash due to Result FIFO not empty. Known Feature B0.7.0
884 4/15/06 VSC errors observed during LAT power on EGSE Closure deferred
894 4/29/06 LATC dump GTFE mask error in process
902 5/9/06 EPU 0 unexpected reboot
909 5/16/06 LAT06X EEPROM writecount test results misinterpreted Closing
913 5/23/06 FSW error in packing ACD charge injection data
922 6/5/06 Housekeeping telemetry stopped
932 6/20/06 TKR RS103 susceptibility in tracker Hardware Discrepancy Waiver
933 6/22/06 EGSE overwrites timetone messages
16
Open NCRs
NCR, Date Opened, Affected Hardware, Problem Description, NCR Category, Resolution, Could Not Duplicate
936 6/24/06 EGSE EGSE timed out on the 1553 bus Closure deferred
938 6/26/06 Housekeeping packet sampled early, causing test error
939 6/28/06 Thermal CS102 sensitivity in RTD sensors Hardware Discrepancy B0.7.0
941 Mech Temperature sensors swapped Closing
942 6/29/06 ACT to LAT blanket interface not consistent Close after TV
945 7/6/06 DAQ EGSE map of FSW files (in FMX) not properly initialized
946 7/11/06 ACD ACD FREE board 5 not responsive after power up CND QAR
948 7/19/06 EPU 1 reboot, EPU 2 reboot
949 7/21/06 Primary SIU reboot, redundant SIU reboot
957 8/4/06 EPU 0 CPU junction temperature higher than expected in vacuum
958 8/7/06 GTRC phase error induced by FPGA bug Known Feature
959 8/8/06 EGSE VSC crashes
960 Script missed an LPA stop command, resulting in test error
17
Open NCRs
NCR, Date Opened, Affected Hardware, Problem Description, NCR Category, Resolution, Could Not Duplicate
966 8/14/06 EGSE EGSE current measurement scale incorrect Closing
967 EGSE temperature measurements inaccurate over temp Closure deferred
969 8/16/06 ACD ACD FREE HV monitors read 100V at cold Known Feature
971 8/18/06 TKR GTFE (0,-x6,20) readback intermittent at cold
975 9/1/06 DAQ Datagram timetag error FSW Discrepancy in process
976 9/2/06 Watchdog reboot at hot induces correctable memory errors
977 9/5/06 Script sampled telemetry too soon, causing test error
18
LAT Waivers (1/3)
CCR # | Title | Description | Status
433-0311 | DC Voltage Tolerance | LAT is required to tolerate 0-40V DC. Due to MOSFET switches at power feed inputs, LAT can tolerate minimum 15V, excluding transient events. | Approved
 | Test Point Short Circuit Isolation | LAT is required to operate within spec if any test point is shorted to ground. A shorted external clock select pin would render the redundant GASU inoperable. |
 | DC Voltage Tolerance #2 | LAT is required to tolerate 0-40V DC. After a voltage drop analysis, it was found that the TEM MOSFET switches would receive too low a voltage with the DAQ feed voltage at 15V. To operate the TEMs safely, the input voltage needs to be 18.5V minimum. |
 | GTFE TID | LAT is required to perform TID testing on all GTFE ASIC lots. The final two lots were not tested since previous lots exhibited such large margins. |
 | Tracker Environmental Test With Non-Flt or Missing Cables | Several tracker towers went through environmental test with a subset of missing or non-flight flex cables. The replacement flight cables were not subjected to component-level vibe and will not see twelve tvac cycles. |
19
LAT Waivers (2/3)
CCR # | Title | Description | Status
433-0361 | 24AWG STD Strength Cu | High strength Cu alloy is required for 24AWG wire. LAT uses standard strength Cu wire. As reported by the LAT PCB, standard strength 24AWG wire has been used on previous NASA projects with GSFC’s approval with no compromise to product reliability. | Approved
 | J-STD vs NASA STD | LAT circuit card assemblies use J-STD-001 as the workmanship standard instead of NASA-STD |
 | Tracker Flex Cable and MCM Coupon Failures | Several flex cables and MCMs are installed on the LAT although they have failed coupons. |
 | Radiator Sine Vibe | The radiators will not be installed for LAT-level sine vibe test. Instead, the radiators were subjected to alternative tests, i.e. pull test, tap test, LAT-level acoustic test. |
 | EMI Skirt Stay Clear | Center EMI skirt pieces near SC-LAT flexures exceed the LAT stay-clear by 0.015” max. |
 | VCHP CECM | The VCHP feed violates the CECM requirement. The measured value is ~700mVp-p vs the requirement of 200mVp-p. |
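The size of the VCHP CECM exceedance (~700mVp-p measured against a 200mVp-p requirement) is easier to compare with other EMI margins when expressed as a ratio in decibels. The short calculation below uses only the two values quoted in the waiver; the helper name `exceedance_db` is hypothetical, and the 20·log10 amplitude-ratio convention is an assumption about how such a margin would be reported, not something stated in the presentation.

```python
import math


def exceedance_db(measured: float, limit: float) -> float:
    """Amount by which `measured` exceeds `limit`, in dB (amplitude ratio)."""
    if measured <= 0 or limit <= 0:
        raise ValueError("amplitudes must be positive")
    return 20.0 * math.log10(measured / limit)


measured_mv_pp = 700.0  # approximate measured value from the waiver text
limit_mv_pp = 200.0     # requirement from the waiver text

ratio = measured_mv_pp / limit_mv_pp
over_db = exceedance_db(measured_mv_pp, limit_mv_pp)
print(f"ratio = {ratio:.1f}x, exceedance = {over_db:.1f} dB")
# prints: ratio = 3.5x, exceedance = 10.9 dB
```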
20
LAT Waivers (3/3)
CCR # | Title | Description | Status
446-0402 | Keyed Test Point Connectors | Test point connectors are not keyed. | Approved
 | Segregate Signal Types on Test Connectors | JL-39 contains signals of different signal classes. |
446-0XX | RS103 TKR Noise Occupancy | Tracker noise occupancy exceeded its limit of 10^-4 during RS 103 (vertical polarization) between MHz. | Submitted to GSFC
 | Cable Shield Termination | Floating shields are not properly terminated per NASA-STD |
21
SC-LAT ICD Waivers
ICN # | Title | Description | Status
-095 | LAT Grid Interface Hole Out-of-Tolerance | Several grid interface hole locations are out of tolerance. Using the as-built LAT Grid and SC interface hole locations, the analysis shows that the predicted forces to align the shear pins are small and that a minimum of 0.007” exists between the bolts and holes in the flexures, so mating should not be an issue. | Approved
-107 | Recessed Grid Bushings | The +Y and –Y LAT grid interface hole bushings are recessed by 0.022” worst case. Stress analysis at the SC mount interface shows that the margin of safety for ultimate and yield bearing strength is 7%, which is acceptable. The margin of safety for pin bending is >200%. |
22
NCR and Waiver Summary
- The LAT Team is confident that none of the NCRs or Waivers presented are significant enough to prevent the LAT from shipping.
- The LAT Team recommends proceeding with the LAT shipping as planned.