Presentation on theme: "Systems Prognostic Health Management EMIS 7305 March 28, 2006"— Presentation transcript:
1 Systems Prognostic Health Management EMIS 7305 March 28, 2006
Systems Engineering Program
Christopher Thompson, Senior Research Engineer, Lockheed Martin Missiles and Fire Control
Disclaimer: This briefing is unclassified and contains no proprietary information. Any views expressed by the author are his, and in no way represent those of Lockheed Martin Corporation.
2 Topic Outline
Introduction
Definitions
The Goal of Prognostic Health Management
PHM Stakeholders
PHM Modeling
Sensors
Prognostics Analysis Tools
Availability
Examples
3 Introduction
Education
B.S. in Electrical Engineering, SMU (1997)
M.S. in Mechanical Engineering, SMU (2001)
- Focus: Fatigue and Fracture Mechanics
M.S. in Systems Engineering (one class remaining)
- Focus: Reliability, Statistical Analysis
Ph.D. in Applied Science (anticipated ~ 2008)
- Proposed Dissertation Title: Sensor Optimization for Systems Prognostic-Diagnostic Health Management in an Unmanned Ground Combat Vehicle
4 Introduction
Experience
Lockheed Martin Missiles and Fire Control, Dallas TX
Systems Engineer
- Multifunction Utility/Logistics Equipment (MULE)
Reliability Engineer
- Army Tactical Missile System (TACMS)
Lockheed Martin Aeronautics, Fort Worth TX
Vehicle Systems - Prognostic Health Management
- F-35 Joint Strike Fighter
SMU School of Engineering
- TA for Dr. Jerrell Stracener
6 Introduction
Some keys to the successful fielding of the U.S. Army’s Future Combat Systems are:
Reducing the Logistics footprint
Increasing Availability
Reducing total cost of ownership
Implementing Performance Based Logistics
Improvements in the ‘ilities’ (RAM-T)
- Reliability
- Availability
- Maintainability
- Testability
- Supportability
7 Some Definitions
Prognostics - Of or relating to prediction; a sign of a future happening; a portent.
Prognostics is the process of calculating and reporting an estimate of remaining useful life for a component, within sufficient time to repair or replace it before failure occurs.
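One very simple way to turn "an estimate of remaining useful life" into a number is to fit a straight line to a degradation measurement and extrapolate it to a failure threshold. The sketch below assumes a linear wear trend; the function name, the wear readings and the threshold are all made up for illustration and are not from the briefing.

```python
from statistics import mean

def estimate_rul(hours, wear, failure_threshold):
    """Fit a least-squares line through (hours, wear) and return the hours
    remaining until the line reaches failure_threshold.
    Returns None when the data show no upward wear trend."""
    h_bar, w_bar = mean(hours), mean(wear)
    sxx = sum((h - h_bar) ** 2 for h in hours)
    sxy = sum((h - h_bar) * (w - w_bar) for h, w in zip(hours, wear))
    slope = sxy / sxx
    if slope <= 0:
        return None  # no degradation trend to extrapolate
    intercept = w_bar - slope * h_bar
    hours_at_failure = (failure_threshold - intercept) / slope
    # Remaining life is measured from the most recent observation.
    return max(0.0, hours_at_failure - hours[-1])

# Hypothetical wear readings (fraction of allowable wear) versus operating hours.
hours = [0, 100, 200, 300, 400]
wear = [0.10, 0.20, 0.30, 0.40, 0.50]
rul = estimate_rul(hours, wear, failure_threshold=1.0)
print(f"Estimated remaining useful life: {rul:.0f} h")
```

Real degradation is rarely linear, so a fielded system would likely replace the straight line with a physics-of-failure or data-driven model and report a confidence bound alongside the estimate.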
8 Some Definitions
Prognostic Health Management (PHM) – The implementation of an integrated software and hardware system which monitors the health, status and performance of a vehicle or system, tracks consumables (oil, batteries, ammunition, filters, fuel, coolant…) and configuration (software versions, part history…), and determines remaining life of all safety and performance critical components, predicting failures before they occur, thereby enhancing logistics and maintenance activities. PHM consists of ‘on-board’ as well as ‘off-board’ components.
9 Some Definitions
Diagnostics - The identification of a fault or failure condition of an element, component, sub-system or system, combined with the deduction of the lowest measurable cause of that condition through confirmation, localization, and isolation.
Confirmation is the process of validating that a failure/fault has occurred, filtering false alarms, and assessing intermittent behavior.
Localization is the process of restricting a failure to a subset of possible causes.
Isolation is the process of identifying a specific cause of failure, down to the smallest possible ambiguity group.
10 Some Definitions
Fault – A condition that renders an element unable to perform its required function at the desired level of performance, or able to perform it only in a degraded mode.
Failure – The inability of a component, system or sub-system to perform its intended function as designed. Failure may be the result of one or more faults.
Fault Tolerance – The design of a system so that it will continue to operate in a degraded or reduced level rather than failing completely, when some part of the system fails.
11 Some Definitions
Failure Cascade – The result when a failure occurs in a system of interconnected components in which the successful operation of each component depends on the successful operation of a preceding component. A single failure can therefore trigger the failure of successive parts, and potentially amplify the result or impact. Redundancy and fault tolerant design can reduce the criticality or impact of the cascade, but not necessarily prevent a failure.
12 Some Definitions
Design Failures – These take place due to inherent errors or flaws in the system design.
Infant Mortality Failures - These cause newly manufactured systems to fail, and can generally be attributed to errors in the manufacturing process, or poor material quality control.
Random Failures - These can occur at any time during the entire life of a system. Electrical systems are more likely to fail in this manner.
Wear Out Failures - As a system ages, degradation will cause systems to fail. Mechanical systems are more likely to fail in this manner.
13 Some Definitions
One-To-One Redundancy - Each active component in a system has a redundant backup on standby. The active component is monitored at all times, and the standby component activates if the primary component fails. Since the probability of both components failing at the same time is low, One-To-One Redundancy provides the highest level of availability, but with the considerable disadvantage of requiring double the size, weight, power and cost, while reducing reliability (there are more components which can fail).
14 Some Definitions
N + X Redundancy – N components are required to perform a function, but the system is configured with N + X components. When any of the N components fails, one of the X spare modules activates. The advantage lies in reduced size, weight, power and cost of the system, in the case where X is smaller than N. In case of multiple component failures, this scheme provides lower system availability than One-To-One Redundancy.
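The availability trade between One-To-One and N + X redundancy can be put into rough numbers. The sketch below assumes independent, identical components that are each available with probability a, and ignores switchover time and failure of the monitoring logic; the function names and the value a = 0.95 are illustrative assumptions only.

```python
from math import comb

def one_to_one_availability(a, n):
    """n functions, each served by an active unit plus a dedicated standby.
    A function is lost only if both of its units are down."""
    return (1 - (1 - a) ** 2) ** n

def n_plus_x_availability(a, n, x):
    """n units are required out of n + x installed: the probability that
    at least n of the n + x units are up (binomial sum)."""
    total = n + x
    return sum(comb(total, k) * a ** k * (1 - a) ** (total - k)
               for k in range(n, total + 1))

a = 0.95                                         # assumed per-unit availability
no_spare = n_plus_x_availability(a, n=4, x=0)    # 4 units installed
n_plus_one = n_plus_x_availability(a, n=4, x=1)  # 5 units installed
one_to_one = one_to_one_availability(a, n=4)     # 8 units installed

print(f"no spares : {no_spare:.4f}")
print(f"4 + 1     : {n_plus_one:.4f}")
print(f"one-to-one: {one_to_one:.4f}")
```

Under these assumptions the ordering matches the definitions: one-to-one gives the highest availability with twice the hardware, while a single shared spare recovers most of the benefit with one extra unit.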
15 Some Definitions
Load Sharing – Multiple components share a combined load. A higher level component manages load distribution, and monitors the health and status of the components. If one of the load sharing components fails, the load is re-distributed among the others, allowing for graceful performance degradation. In this scheme, there is almost no extra cost. The main disadvantage is that, with multiple failures, system performance may degrade below an acceptable level.
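A minimal sketch of the load manager's redistribution step, assuming identical units that share the load evenly and a fixed per-unit capacity. The function name and the numbers are hypothetical.

```python
def share_load(total_load, healthy_units, unit_capacity):
    """Split total_load evenly over the healthy units, capped at each unit's
    capacity. Returns (load per unit, total load actually delivered)."""
    if healthy_units == 0:
        return 0.0, 0.0
    per_unit = min(total_load / healthy_units, unit_capacity)
    return per_unit, per_unit * healthy_units

# Four units rated at 30 each carry a demand of 90; units then fail one by one.
for healthy in (4, 3, 2):
    per_unit, delivered = share_load(90.0, healthy, 30.0)
    status = "OK" if delivered >= 90.0 else "DEGRADED"
    print(f"{healthy} units: {per_unit:.1f} each, {delivered:.1f} delivered ({status})")
```

The first failure is absorbed by the surviving units, which is the graceful degradation the definition describes; the second failure leaves the system unable to meet demand.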
16 The Ultimate Goal of Prognostics
The purpose of Prognostic Health Management is to repair systems before they fail, while maximizing useful life consumption, and to have the necessary parts, tools and maintainers waiting nearby to resolve the correct problem as quickly and efficiently as possible.
18 Systems Engineering’s Role in PHM
Requirements Development
System Integration
System Architecture
Interface Management
Risk Assessment
Performance Measures: TPMs & KPPs
System Modeling & Knowledge Integration
Functional Decomposition
19 PHM Requirements
The PHM system shall isolate X percent of all detected failures to a single component, with Y percent confidence.
The PHM system shall predict X percent of expected failures for the next Y hours of operation.
The PHM system shall predict all failures that can result in a Safety Critical Failure.
The PHM system shall incorporate sensors to assess platform health, status and performance.
The PHM system shall incorporate sensors to monitor platform consumables.
The PHM system shall record and store all sensor data in onboard memory.
20 The ‘Ilities’ & Product Support
Reliability
- FMECA: Failure Modes, Effects & Criticality Analysis
- FRACAS: Failure Reporting, Analysis & Corrective Action System
- Measures: MTBF, MTBSA, MTBEFF, MTBUMA
Maintainability
- Maintenance Ratio
- Preventive Maintenance Checks
- Condition Based Maintenance
- Design for Maintainability
Availability
- AO, AI, AA
21 The ‘Ilities’ & Product Support
Testability
- Verification and Validation
- Fault Insertion
- Simulation
Supportability
- Consumables Monitoring
- Supply Planning and Prediction
System Safety
- Single & Multiple Fault Tolerant Design
- Safety Critical Failures
- Human/Machine Interaction
22 PHM Modeling
eXpress Modeling Tool
Model Based Reasoning
Case Based Reasoning
Knowledge Bases
Prognostics Analysis Tools
24 Impact Technologies
Prognostics developed at Impact Technologies:
• Gas Turbine Engines and Auxiliary Systems
• Avionics PHM and Reasoning
• Aircraft Actuators (EMA, EHA)
• Switching Mode Power Supplies, GPS Receivers and Power Electronics
• Generators and Electric Drive Systems
• Bearings, Gears, Shafts, Drive Trains, and Clutches
• Hydraulic, Lube Oil and Fuel Systems
• Structures and Components
• Diesel Engines
25 Impact Technologies
Prognostics modules have been developed and successfully tested on the following systems:
Pratt & Whitney F-100 engine on F-15 and F-22
Engine, generator, lubrication system and gearbox on Honeywell F124
Oil wetted components on GE F , GE F404, Rolls Royce F405
CH-47 T-55 engine and drive-train and CH-60 intermediate gearbox
Blackhawk Carrier Plate Prognosis System
JSF Clutch Wear and Lift-Fan Prognosis System
Fuel system and Power generation system on DDG-class Navy Ships
26 Impact Technologies
A number of different techniques have been used in the development of these prognostics:
Analytical and stochastic physics of failure models
Advanced signal processing
Feature extraction methods
Health state estimation and prediction algorithms
Statistical reliability
Bayesian updating methods
Component damage accumulation models
Probabilistic remaining useful life estimation
Data driven modeling techniques
27 Model Based Reasoning
Model Based Reasoning (MBR) is a qualitative scheme where a model of the system is combined with an inference engine that is able to accomplish fault detection and fault isolation. The qualitative model, or ‘Knowledge Base’, describes the elements and components, interconnections, and input/output behavior of the system being diagnosed, and establishes an envelope of ‘correct behavior’. To accomplish diagnosis, the differences between the actual behavior of the system and the behavior predicted by the model are determined. The inference engine, using this comparison information, accomplishes the fault isolation task.
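The compare-then-isolate loop described above can be sketched for a toy system. The example assumes a hypothetical three-stage chain in which each stage's nominal behavior is a simple gain and a measurement is available after every stage; the component names, gains and tolerance are invented for illustration and are far simpler than a real knowledge base.

```python
# Hypothetical knowledge base: (component name, nominal gain), in signal order.
MODEL = [("power_supply", 2.0), ("amplifier", 5.0), ("actuator", 0.5)]
TOLERANCE = 0.05  # allowed fractional deviation: the envelope of 'correct behavior'

def isolate_faults(system_input, measured_outputs):
    """Compare each stage's measured output with the model's prediction and
    return the names of the stages that fall outside the envelope."""
    suspects = []
    stage_input = system_input
    for (name, gain), measured in zip(MODEL, measured_outputs):
        expected = gain * stage_input
        if abs(measured - expected) > TOLERANCE * abs(expected):
            suspects.append(name)
        # Feed the measured value forward so that a healthy downstream stage
        # is not blamed for a fault that originated upstream.
        stage_input = measured
    return suspects

print(isolate_faults(1.0, [2.0, 10.0, 5.0]))  # healthy system
print(isolate_faults(1.0, [2.0, 6.0, 3.0]))   # amplifier output is low
```

Feeding the measured value forward is what lets this sketch isolate the fault to one stage; with fewer measurement points the result would be a larger ambiguity group.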
28 Case Based Reasoning
Case Based Reasoning (CBR) is the process of solving problems based on past understanding of similar problems. The vast majority of this type of information resides with the maintainers and operators, in the experience and knowledge of the people using the system in question. CBR retrieves a past case and forms an implicit generalization of it by identifying commonalities between the retrieved case and the target problem.
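The retrieval step of CBR can be sketched as a nearest-case lookup. The example below assumes cases are stored as sets of observed symptoms paired with the repair that resolved them, and uses Jaccard similarity to measure commonality; the case library and symptom names are hypothetical.

```python
# Hypothetical case library: (observed symptoms, repair that resolved the case).
CASE_LIBRARY = [
    ({"high_oil_temp", "low_oil_pressure"}, "replace oil pump"),
    ({"high_vibration", "metal_in_oil"}, "inspect gearbox bearings"),
    ({"no_start", "low_battery_voltage"}, "replace battery"),
]

def jaccard(a, b):
    """Shared symptoms divided by all symptoms seen in either case."""
    return len(a & b) / len(a | b)

def retrieve(symptoms):
    """Return the repair from the most similar past case and its similarity."""
    best_symptoms, best_repair = max(
        CASE_LIBRARY, key=lambda case: jaccard(case[0], symptoms))
    return best_repair, jaccard(best_symptoms, symptoms)

fix, score = retrieve({"high_vibration", "metal_in_oil", "high_oil_temp"})
print(f"Suggested action: {fix} (similarity {score:.2f})")
```

A fuller CBR cycle would also adapt the retrieved solution to the new problem and store the outcome as a new case, which is how maintainer experience accumulates in the library.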
31 Prognostic Analysis Tools
Traditional Academic Solutions to PHM:
Run-to-Failure analysis of large, expensive systems, such as ship or rail engines
Analysis involves impractical, complex math models that require years of training to understand and interpret
Very expensive
Time consuming process
Rarely offer concrete design guidelines or solutions
32 Prognostic Analysis Tools
Why Engineers in Industry Need More:
We have bottom lines and schedules to meet!
We have customer requirements to satisfy!
Systems Engineers work with designers who don’t like impractical, complex math models that require years of training to understand and interpret!
We have program managers who don’t like very expensive, time consuming solutions!
We like concrete design guidelines and solutions!
33 Sensor Technology
BIT/BITE
Sensor Fusion and Virtual Sensors
Sensor Conditioning and Filtering
Smart Sensors
34 Availability Analysis
Availability, Achieved
where
MTBF = Mean Time Between Failures
MTTR = Mean Time To Repair
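The equation on this slide did not survive in the transcript. Assuming the usual steady-state ratio of uptime to total time, built only from the two quantities defined above, it would read as follows. This is a reconstruction rather than the author's confirmed formula, and many references call this particular ratio inherent availability.

```latex
A = \frac{\mathrm{MTBF}}{\mathrm{MTBF} + \mathrm{MTTR}}
```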
35 Availability Analysis
Availability, Operational
where
MTBUMA = Mean Time Between Unscheduled Maintenance Actions
ALDT = Administrative Logistical Down Time
MTTR = Mean Time To Repair
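The equation on this slide is also missing from the transcript. Assuming operational availability is the ratio of uptime to uptime plus all downtime, and using only the three quantities defined above, a likely form is the following. Treat it as a reconstruction under that assumption.

```latex
A_O = \frac{\mathrm{MTBUMA}}{\mathrm{MTBUMA} + \mathrm{ALDT} + \mathrm{MTTR}}
```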
36 Availability Analysis
MTBUMA = Mean Time Between Unscheduled Maintenance Actions
where
MTBF = Mean Time Between Failures
MTBM = Mean Time Between Maintenance
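The expression for MTBUMA is missing from the transcript. One common way to combine the terms, assuming unscheduled maintenance actions arise independently from inherent failures, induced damage and no-defect removals so that their rates add, is shown below. The subscripted MTBM terms are an assumption based on the ‘induced’ and ‘no defect’ categories named on the slides that follow.

```latex
\frac{1}{\mathrm{MTBUMA}} =
  \frac{1}{\mathrm{MTBF}}
  + \frac{1}{\mathrm{MTBM}_{\text{induced}}}
  + \frac{1}{\mathrm{MTBM}_{\text{no defect}}}
```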
37 Availability Analysis
How can we improve AO?
- By decreasing Administrative & Logistical Down Time (ALDT)
- By increasing Mean Time Between Failures (MTBF)
- By decreasing Mean Time To Repair (MTTR)
- By increasing Mean Time Between Unscheduled Maintenance Actions (MTBUMA) – [by increasing MTBM induced and MTBM no defect]
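The effect of each lever on AO can be compared numerically. The sketch below assumes AO = MTBUMA / (MTBUMA + ALDT + MTTR) and that MTBUMA combines MTBF with the induced and no-defect MTBM terms by adding their rates; both formulas are assumptions, since the original equations are not in the transcript, and the baseline numbers are invented.

```python
def mtbuma(mtbf, mtbm_induced, mtbm_no_defect):
    """Combine the three sources of unscheduled maintenance by adding rates."""
    return 1.0 / (1.0 / mtbf + 1.0 / mtbm_induced + 1.0 / mtbm_no_defect)

def operational_availability(p):
    """AO from a dict of parameters, all in hours (assumed formula)."""
    up = mtbuma(p["mtbf"], p["mtbm_induced"], p["mtbm_no_defect"])
    return up / (up + p["aldt"] + p["mttr"])

# Hypothetical baseline values in hours.
baseline = {"mtbf": 500.0, "mtbm_induced": 2000.0, "mtbm_no_defect": 2000.0,
            "aldt": 20.0, "mttr": 4.0}

scenarios = {
    "baseline": baseline,
    "halve ALDT": {**baseline, "aldt": 10.0},
    "double MTBF": {**baseline, "mtbf": 1000.0},
    "halve MTTR": {**baseline, "mttr": 2.0},
    "no false alarms": {**baseline, "mtbm_no_defect": 1e12},
}
results = {name: operational_availability(p) for name, p in scenarios.items()}
for name, value in results.items():
    print(f"{name:16s} AO = {value:.4f}")
```

With these made-up numbers, downtime is dominated by ALDT, so logistics and failure-rate improvements move AO far more than faster repairs; a different baseline could reverse that ranking.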
38 Availability Analysis
How can we decrease ALDT?
- By improving Logistics
  Improve scheduling of inspections
  Improve commonality of parts
  Decrease time to get replacements
- By improving Prognostics
  Replace parts before they fail, not after
  Maximize use of component life
  Improve off-board prognostics trending
  More sensors!!
39 Availability Analysis
How can we increase MTBF?
- By improving Reliability
  Select more rugged components
  Improve life screening and testing
  Improve thermal management
- By improving Quality
  Better parts screening
  Better manufacturing processes
- By adding Redundancy
  At the cost of Size, Weight and Power!
40 Availability Analysis
How can we decrease MTTR?
- By improving Maintainability
  Improve quality and efficacy of training
  Simplify fault isolation
  Decrease number of tools and special equipment
  Decrease access time (panels, connectors…)
  Improve Preventative Maintenance
- By improving Diagnostics
  Improve BIT and BITE
  Decrease ambiguity group size
  Improve maintenance manuals and training
41 Availability Analysis
How can we increase MTBM (induced/no defect)?
- By improving Safety
  Limit the potential for accidental damage
- By improving Prognostics
  Improve PHM models to monitor induced damage
- By improving Diagnostics
  Lower the false alarm rate
  Don’t repair/replace things which aren’t broken!
42 Sensor Example
Engine Health/Performance Monitoring:
Place an acoustic sensor on the engine housing.
Establish ‘nominal’ operating parameters.
Develop a library relating fault precedents (odd sounds which warn of impending failure) to failures.
Monitor for ‘out of nominal’ acoustic signature.
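The ‘establish nominal, then monitor for out of nominal’ steps can be sketched with a single acoustic feature. The example assumes the signature is summarized by the RMS level of short frames of samples and flags any frame more than k standard deviations from the nominal mean; the feature choice, the threshold k = 3 and the sample values are illustrative assumptions.

```python
from math import sqrt
from statistics import mean, stdev

def rms(frame):
    """Root-mean-square level of one frame of acoustic samples."""
    return sqrt(sum(s * s for s in frame) / len(frame))

def learn_nominal(frames):
    """Establish 'nominal' as the mean and spread of RMS over healthy frames."""
    levels = [rms(f) for f in frames]
    return mean(levels), stdev(levels)

def out_of_nominal(frame, nominal, k=3.0):
    """Flag a frame whose RMS lies more than k standard deviations from nominal."""
    mu, sigma = nominal
    return abs(rms(frame) - mu) > k * sigma

# Hypothetical frames recorded from a healthy engine.
healthy_frames = [
    [1.0, -1.0, 1.0, -1.0],
    [1.1, -1.1, 1.1, -1.1],
    [0.9, -0.9, 0.9, -0.9],
    [1.0, -1.0, 1.0, -1.0],
]
nominal = learn_nominal(healthy_frames)

quiet_ok = out_of_nominal([1.05, -1.05, 1.05, -1.05], nominal)
loud_alarm = out_of_nominal([2.0, -2.0, 2.0, -2.0], nominal)
print(f"slightly louder frame flagged: {quiet_ok}")
print(f"much louder frame flagged:     {loud_alarm}")
```

An overall level only says that something has changed. Building the library that relates sounds to specific failures would likely need spectral features, so that each fault precedent can be told apart.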
43 PHM Example
Consider a toaster: Not just any toaster, but the toaster on the first mission to Mars. NASA could only afford to send one, and it must work, every time, or else the astronauts won’t have toast. The toaster must also not endanger the mission by causing a safety hazard or wasting bread.
Mission Critical Function:
- make toast
Safety Critical Functions:
- don’t injure the astronauts
- don’t damage the spaceship
- don’t burn the toast!
44 PHM Example
Identify the elements of a toaster.
What are the failure modes?
What should we monitor for safety hazards?
What elements should we monitor for diagnostics?
What data should we collect for prognostics?
How would we optimize the sensor coverage and data collection?
45 Issues Related to PHM
Continually monitoring sensors and storing all that data for analysis will quickly consume available bandwidth and storage space.
Capturing ‘profound knowledge’ of a complex engineered system and its myriad failure modes is very difficult, and involves integrating knowledge which crosses discipline boundaries: SE, EE, ME, RAM-T, Safety, Software, Math, Statistics, Physics…
Prognostic analysis of data is a very difficult problem, with no easy or universal solution.
PHM is a relatively new field.
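A back-of-the-envelope calculation shows how quickly continuous recording adds up. The sensor count, sample rate and sample size below are hypothetical and chosen only to illustrate the arithmetic.

```python
SECONDS_PER_DAY = 86_400

def bytes_per_day(sensors, sample_rate_hz, bytes_per_sample):
    """Raw storage needed to record every sample from every sensor for a day."""
    return sensors * sample_rate_hz * bytes_per_sample * SECONDS_PER_DAY

# Hypothetical platform: 200 sensors, 1 kHz sampling, 2 bytes per sample.
raw = bytes_per_day(sensors=200, sample_rate_hz=1_000, bytes_per_sample=2)
# The same sensors reduced on board to one summary value per second.
summarized = bytes_per_day(sensors=200, sample_rate_hz=1, bytes_per_sample=2)

print(f"raw recording  : {raw / 1e9:.2f} GB per day")
print(f"1 Hz summaries : {summarized / 1e6:.2f} MB per day")
```

Reducing data on board, for example by storing features or exceedances instead of raw samples, is one way to keep storage and downlink bandwidth manageable, at the price of losing detail for off-board analysis.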
46 Final Remarks
Do I have any practical PHM suggestions?
- Aim for the low hanging fruit
  Use the sensors you already have in creative ways.
  Only add sensors when you must.
  You can’t monitor everything, so don’t try.
- Don’t reinvent the wheel
  Build on others’ work and experience.
  Find good tools to design your system.