Using Machine Learning to Automate Fault Detection in Flight Discrepancy and Software Problem Anomaly Reports Horatiu Dumitru, Adam Czauderna, Jane Cleland-Huang.


1 Using Machine Learning to Automate Fault Detection in Flight Discrepancy and Software Problem Anomaly Reports Horatiu Dumitru, Adam Czauderna, Jane Cleland-Huang DePaul University DePaul University, Systems and Requirements Engineering Center This project was jointly funded by NSF REU Supplement CCF: and a grant from Lockheed Martin via the Software Engineering Research Consortium.

2 Flight Maintenance
Hardware and software errors are reported by pilots and maintenance crews and entered into flight discrepancy reports and software problem anomaly reports. Analysts use basic search features to look through the reports for recurring problems.
[Diagram labels: Flight incidents, test failures, etc. | Flight discrepancy reports | Software problem anomaly reports]

3 Problem Statement
- Thousands of problem and anomaly reports are generated for each aircraft.
- Searching for and monitoring recurring faults is time-consuming and relies on the intuition of the analysts.
- Many critical fault trends go undetected, leading to potential failures and lost opportunities to mitigate problems.

4 A More Automated Approach
Instead of relying upon analysts to search for recurring problems, utilize machine learning techniques to discover and monitor cross-cutting faults.
[Diagram labels: Flight incidents, test failures, etc. | Flight discrepancy reports | Software problem anomaly reports | Corpus of known faults | Data mining tools to detect occurrence of known faults and to identify new fault trends | Analyst reviews candidate faults]

5 Sub-problems
We identified two sub-problems:
- Identifying and monitoring known problem trends
  - Detect recurrence of the fault.
  - Determine when a fault has been successfully mitigated (and no longer recurs).
- Identifying previously unknown problems
  - These problems may never have been conceived of.
  - Once identified, a problem transitions to monitored status.

6 A Top Down Approach
- A fault pattern exhibits itself as a cross-cutting concern that cuts across problem reports and affects various hardware and/or software devices.
- A primary concern of a software system is defined as a dominant aspect, such as a specific hardware or software feature.
  - Example: A feature to display medical records
- A cross-cutting concern represents an aspect that is scattered across a number of more dominant concerns.
  - Example: Login feature

7 A Typical cross-cutting topic
Cluster name: The login and password are sent to the server.
- The details of the health units and specialties are retrieved.
Cluster name: The system shows the specific screen for each type of complaint.
- The system shows the login screen
- Error message should be showed.
- Show a message informing the employee of the missing/incorrect data.
Cluster name: The result of the login attempt is presented to the employee on their local display.
- The query results are formatted and presented to the user on their local display.
Cluster name: The system retrieves the employee details using the login as a unique identifier.
- The unique identifier is used to retrieve the complaint entry.
- The unique identifier is used to retrieve the disease type to query.
- The unique identifier is used to retrieve the list of health units which are associated with the selected specialty.
- The unique identifier is used by the system to search the repository for the selected health unit.

8 Identify and remove dominant terms
- The system retrieves the employee details using the login as a unique identifier.
- The unique identifier is used to retrieve the complaint entry.
- The unique identifier is used to retrieve the disease type to query.
- The unique identifier is used to retrieve the list of health units which are associated with the selected specialty.
- The unique identifier is used by the system to search the repository for the selected health unit.
[Legend: dominant terms | stop words | recessive terms]

9 Recluster around recessive terms
Step 2: Dominant terms are removed and requirements are re-clustered around weaker terms.
- The system retrieves the employee details using the login as a unique identifier.
- The system shows the login screen
- The login and password are sent to the server.
- The employee provides the login and password.
- The result of the login attempt is presented to the employee on their local display.

10 An overview of our solution
Step 1: Preprocess data to remove stop words and stem words to root forms.
Step 2: Cluster the problem reports using an unsupervised clustering method.
Step 3: Compute cohesion and size metrics and use them to select the best cluster.
Step 4: Identify the key terms for the selected cluster.
Step 5: Create a problem topic from identified terms and add to topic list.
Step 6: Remove the identified terms from ALL problem reports.
Step 7: Repeat steps 2-6 until no more clusters are found.
Step 8: Present topic list to analyst for review.
[Diagram: example topic list containing the terms nozzle, repair, cracked]

11 Step 1: Preprocessing
Step 1: Preprocess data to remove stop words and stem words to root forms.
1. Parse each of the feature requests and stem each word to its root form, so that similar words can be matched.
2. Remove common words known as ‘stop words’, as these are not useful in computing similarity between documents.
3. Remove any words that appear only once, as these are not helpful in the clustering process.
4. Use a term frequency-inverse document frequency (tf-idf) model to represent each feature request as a weighted vector of terms (t_1, t_2, ..., t_n).
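The four preprocessing steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the stop-word list is a tiny hypothetical one, `crude_stem` is a rough stand-in for a real stemmer (the slides do not say which stemmer was used), and the tf-idf weighting shown is the common tf × log(N/df) form scaled to unit length, which may differ from the authors' exact formula.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; a real run would use a fuller one.
STOP_WORDS = {"the", "a", "an", "to", "of", "for", "and", "be", "is", "me"}

def crude_stem(word):
    """Very rough suffix stripping; a stand-in for a real stemmer."""
    for suffix in ("ions", "ion", "ing", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def preprocess(documents):
    """Tokenize, drop stop words, stem, then drop terms that occur only once overall."""
    tokenized = [
        [crude_stem(w) for w in re.findall(r"[a-z]+", doc.lower()) if w not in STOP_WORDS]
        for doc in documents
    ]
    totals = Counter(t for doc in tokenized for t in doc)
    return [[t for t in doc if totals[t] > 1] for doc in tokenized]

def tfidf_vectors(tokenized):
    """Return one {term: weight} dict per document, scaled to unit length."""
    n_docs = len(tokenized)
    doc_freq = Counter(t for doc in tokenized for t in set(doc))
    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        vec = {t: tf * math.log(n_docs / doc_freq[t]) for t, tf in counts.items()}
        norm = math.sqrt(sum(w * w for w in vec.values()))
        vectors.append({t: w / norm for t, w in vec.items()} if norm > 0 else {})
    return vectors

reports = [
    "Provide a weather forecast",
    "Display the weather forecast",
    "Provide flight information",
    "Display flight delays",
]
tokens = preprocess(reports)
vectors = tfidf_vectors(tokens)
```

In this toy input, "information" and "delays" each occur once, so their stems are dropped before weighting. Scaling the vectors to unit length is a design choice that makes a plain dot product equal to the cosine similarity used by spherical clustering methods.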

12 Step 2: Clustering
Step 2: Cluster the problem reports using an unsupervised clustering method.
Our approach uses an underlying clustering method known as SPK-Means.
Two-stage spherical K-means clustering
Input: unlabelled instances, number of clusters K, initial centroids I, convergence condition
Output: crisp K-partition
Steps:
1. Initialization: initialize the centroids using I [formula lost in transcript].
2. Batch instance assignment and centroid update, repeated until convergence:
   a. assign each instance to the nearest cluster i, i.e. the one whose centroid has the largest cosine similarity to the instance [formula lost in transcript]
   b. update each centroid [formula lost in transcript]
3. Incremental optimization of the objective function, repeated until convergence:
   a. randomly select an instance
   b. move it to the cluster that maximizes the gain of the objective function
   c. update each centroid [formula lost in transcript]
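As a rough illustration of the batch stage (step 2 of the algorithm above), here is a minimal Python sketch. It assumes the textbook spherical k-means rules: instances and centroids are unit vectors, assignment picks the centroid with the largest dot product, and each centroid is updated to the normalized sum of its members. The incremental stage (step 3) is omitted, and the function and variable names are illustrative rather than taken from the authors' tool.

```python
import math

def normalize(v):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def spherical_kmeans(instances, initial_centroids, max_iter=100):
    """Batch stage only: assign by largest cosine similarity, then re-normalize centroids."""
    data = [normalize(x) for x in instances]
    centroids = [normalize(c) for c in initial_centroids]
    labels = None
    for _ in range(max_iter):
        # Assignment: on unit vectors, the dot product is the cosine similarity.
        new_labels = [
            max(range(len(centroids)), key=lambda k: dot(x, centroids[k])) for x in data
        ]
        if new_labels == labels:  # converged: no instance changed cluster
            break
        labels = new_labels
        # Update: each centroid becomes the normalized sum of its members.
        for k in range(len(centroids)):
            members = [x for x, lab in zip(data, labels) if lab == k]
            if members:  # an empty cluster keeps its previous centroid
                centroids[k] = normalize([sum(col) for col in zip(*members)])
    return labels, centroids

points = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.8]]
labels, centroids = spherical_kmeans(points, initial_centroids=[[1.0, 0.0], [0.0, 1.0]])
```

The result is a crisp partition: every instance receives exactly one label.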

13 Step 2: Clustering (continued)
Step 2: Cluster the problem reports using an unsupervised clustering method.
A consensus approach is taken in which n clusterings are generated and then combined:
1. For each clustering, 70% of the fault reports are randomly selected and clustered using SPK-Means.
2. The remaining 30% of the fault reports are classified into the generated clusters.
3. A co-association matrix is generated that records the number of times each pair of faults occurs together in the same cluster.
4. The faults are re-clustered using a simple hierarchical clustering scheme in which the values in the co-association matrix represent the proximities between faults.
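The co-association idea can be sketched as follows. The example assumes each of the n runs has already produced a complete label list (that is, sampling, clustering, and classification of the held-out reports are done). The final grouping uses single-link merging at a hypothetical agreement threshold, `min_agreement`, as one simple form of hierarchical clustering; the slides do not specify the linkage or the cut-off.

```python
def co_association(clusterings):
    """Count, for every pair of items, how many clusterings put them in the same cluster."""
    n = len(clusterings[0])
    matrix = [[0] * n for _ in range(n)]
    for labels in clusterings:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    matrix[i][j] += 1
    return matrix

def consensus_clusters(matrix, n_runs, min_agreement=0.5):
    """Single-link grouping: join items that co-occur in more than min_agreement of runs."""
    n = len(matrix)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving keeps the trees shallow
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if matrix[i][j] / n_runs > min_agreement:
                parent[find(i)] = find(j)
    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

# Three hypothetical runs over five fault reports (cluster ids are arbitrary per run).
runs = [
    [0, 0, 1, 1, 1],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 1, 1],
]
matrix = co_association(runs)
clusters = consensus_clusters(matrix, n_runs=len(runs))
```

Because only co-membership is counted, the arbitrary cluster ids of individual runs never need to be aligned with one another.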

14 Step 3: Find the best cluster
Step 3: Compute cohesion and size metrics and use them to select the best cluster.
Our goal is to find the single most cohesive cluster in each iteration of the process.
1. The cosine distance of each problem statement to the centroid of its cluster is computed.
2. For each cluster, all distances are summed.
3. The average distance is computed for each cluster.
4. The two values from (2) and (3) are normalized and used to determine the best cluster.
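A small Python sketch of the per-cluster metrics follows. The slides do not say how the normalized sum and average are combined, so the selection rule here is an assumption made purely for illustration: choose the cluster with the lowest average distance among clusters that have at least `min_size` members (a hypothetical parameter standing in for the size metric).

```python
import math

def cosine_distance(u, v):
    """1 - cosine similarity; 0 means the vectors point the same way."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm if norm > 0 else 1.0

def cluster_metrics(clusters, centroids):
    """Return (summed distance, average distance) for each cluster."""
    metrics = []
    for members, centroid in zip(clusters, centroids):
        distances = [cosine_distance(x, centroid) for x in members]
        total = sum(distances)
        metrics.append((total, total / len(members)))
    return metrics

def best_cluster(clusters, centroids, min_size=2):
    """ASSUMED rule: lowest average distance among clusters with at least min_size members."""
    metrics = cluster_metrics(clusters, centroids)
    eligible = [k for k, members in enumerate(clusters) if len(members) >= min_size]
    return min(eligible, key=lambda k: metrics[k][1])

clusters = [
    [[1.0, 0.1], [1.0, -0.1], [1.0, 0.05]],   # tight cluster
    [[0.5, 1.0], [-0.5, 1.0], [0.3, 1.0]],    # loose cluster
    [[3.0, 4.0]],                             # singleton: trivially cohesive
]
centroids = [[1.0, 0.0], [0.0, 1.0], [3.0, 4.0]]
metrics = cluster_metrics(clusters, centroids)
best = best_cluster(clusters, centroids)
```

The size filter is what keeps the singleton, whose average distance is zero, from winning every iteration.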

15 Step 4: Identify key terms
Step 4: Identify the key terms for the selected cluster.
1. Determine the cluster’s dominant terms:
   - Find the terms with the highest weights in the cluster centroid.
   - Pick only terms above a certain threshold.
2. Add the dominant terms to a list.
Please note: Due to export control regulations and non-disclosure agreements, we are unable to illustrate our approach with the Lockheed Martin data. Instead, we illustrate it with the user requirements for an airport kiosk.
Cluster: (weather, forecast, condit)
- To be informed of current weather conditions
- Be able to check the weather
- Provide a weather forecast for the length of the traveler's stay.
- Need a service to show me the current weather and forecast.
- Local weather information.
- Provide local weather conditions and forecasts.
- View current weather conditions
- Provide weather information for various destinations
- Be able to know the information about Weather.
- Provide weather forecasts
- Display the weather forecast.
- Provide local weather information for the week.
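Selecting dominant terms from a centroid amounts to a threshold filter over its term weights. In the sketch below, the centroid weights and the threshold value are invented for illustration; the slides give neither.

```python
def dominant_terms(centroid, threshold=0.3):
    """Return terms whose centroid weight exceeds the threshold, heaviest first."""
    ranked = sorted(centroid.items(), key=lambda item: item[1], reverse=True)
    return [term for term, weight in ranked if weight > threshold]

# Hypothetical centroid of the weather cluster (weights are made up).
centroid = {"weather": 0.71, "forecast": 0.48, "condit": 0.35, "local": 0.22, "provid": 0.18}
topic = dominant_terms(centroid)
```

With these assumed weights the result is the three-term topic (weather, forecast, condit); lowering the threshold would pull weaker terms such as "local" into the topic.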

16 Step 5: Add problem topic to list
Step 5: Create a problem topic from identified terms and add to topic list.
1. Take the dominant terms identified from the ‘best cluster’ in the last iteration and add them as a group to the topic list.
2. Four sample topics from the airport kiosk:
   - weather, forecast, condit
   - reserv, hotel
   - destin, map, direct
   - flight, connect, inform

17 Step 6: Remove topics
Step 6: Remove the identified terms from ALL problem reports.
1. Remove the dominant terms of the selected topic from all of the problem reports.
Why? Because we would like to form additional clusters around the remaining concepts.
[Diagram: topic list (nozzle, repair, cracked)]
Cluster: (weather, forecast, condit)
- To be informed of current weather conditions
- Be able to check the weather
- Provide a weather forecast for the length of the traveler's stay.
- Need a service to show me the current weather and forecast.
- Local weather information.
- Provide local weather conditions and forecasts.
- View current weather conditions
- Provide weather information for various destinations
- Be able to know the information about Weather.
- Provide weather forecasts
- Display the weather forecast.
- Provide local weather information for the week.

18 Step 7: Repeat
Step 7: Repeat steps 2-6 until no more clusters are found.
1. Once dominant terms have been removed, re-cluster around the remaining terms.
2. Repeat steps 2-6 until a stopping condition is met.
3. Candidate stopping conditions:
   - No additional interesting topics remain.
   - Individual problem statements contain only stop words.
4. Note: This approach generates fuzzy clusters, i.e. a single statement can be placed into multiple clusters.
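The remove-and-repeat loop of steps 5 through 7 can be sketched as below. To keep the example self-contained, `pick_topic` is a deliberately simple stand-in for the clustering, cluster selection, and key-term steps: it just returns the term shared by the most reports. It is not the authors' method, and `min_support` and `max_topics` are hypothetical parameters.

```python
from collections import Counter

def pick_topic(reports, min_support=2):
    """Stand-in for steps 2-4: return the most widely shared term, or None."""
    doc_freq = Counter(term for report in reports for term in set(report))
    if not doc_freq:
        return None
    term, support = max(sorted(doc_freq.items()), key=lambda item: item[1])
    return [term] if support >= min_support else None

def mine_topics(reports, max_topics=10):
    """Steps 5-7: record a topic, strip its terms from ALL reports, and repeat."""
    remaining = [list(report) for report in reports]
    topics = []
    while len(topics) < max_topics:
        topic = pick_topic(remaining)
        if topic is None:  # stopping condition: nothing shared is left
            break
        topics.append(topic)
        remaining = [[t for t in report if t not in topic] for report in remaining]
    return topics

# Hypothetical stemmed reports, loosely modelled on the airport kiosk example.
reports = [
    ["weather", "forecast"],
    ["weather", "condit"],
    ["weather", "flight"],
    ["weather", "local"],
    ["flight", "inform"],
    ["flight", "connect"],
]
topics = mine_topics(reports)
```

The third report contains both "weather" and "flight", so it falls under both discovered topics, which is the fuzzy-cluster behaviour described in the note on this slide.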

19 Step 8: Analysis and Review
Step 8: Present topic list to analyst for review.
1. Engineers review the list of candidate faults.
2. Engineers mark each detected fault as:
   - Valid
   - Invalid
   - Insignificant

20 Sample Results (from Airport Kiosk)
Cluster: (flight, connect, inform)
- Provide flight information including departure times and gate numbers.
- Provide up to date information about flight delays.
- To be informed of connecting flights
- To provide connecting flight information
- The kiosk should have secured network access to get ongoing flight’s information.
- Get flight status
- Check in for flight
- Provide current flight information for O’hare and other airports around the country.
- I need to be able to check for my flight information.
Cluster: (nearby, restaur)
- To provide nearby restaurant listings
- To provide nearby traffic conditions
- Locate restaurants
- Display the location of food courts and other restaurants on the airport map.
- Provide information on nearby businesses, hotels, and restaurants, and their relation to the airport.
- Be able to see some reviews on the certain restaurants or hotels
- Make reservation at restaurant near the hotel.
- Create list of restaurants nearby attractions

21 Evaluation against Answer Set
1. A standard metric (Jaccard) was used to compare the generated clusterings against the airport kiosk answer set. (Note that this metric returns relatively low values even for fairly similar clusterings.)
2. However, our observations suggest that many of the additional clusters discovered by our tool represent good topics that were not manually discovered by human analysts. Additional evaluation is needed to confirm this hypothesis.

Answer set compared to             Jaccard metric result
Random clustering                  0.04
Consensus clustering               0.17
Incremental with 100 iterations    0.19
Incremental with 50 iterations     0.16
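One common way to apply the Jaccard metric to clusterings is pair counting: take the set of item pairs that share a cluster in each clustering and divide the size of the intersection by the size of the union. The sketch below assumes that variant; the slides do not state which formulation the authors used.

```python
from itertools import combinations

def same_cluster_pairs(clusters):
    """Set of item pairs that share at least one cluster (works for overlapping clusters)."""
    pairs = set()
    for members in clusters:
        pairs.update(combinations(sorted(members), 2))
    return pairs

def jaccard(clustering_a, clustering_b):
    """Pair-counting Jaccard index between two clusterings of the same items."""
    pairs_a = same_cluster_pairs(clustering_a)
    pairs_b = same_cluster_pairs(clustering_b)
    union = pairs_a | pairs_b
    return len(pairs_a & pairs_b) / len(union) if union else 1.0

# Two hypothetical clusterings of five items that differ only in where item 3 goes.
answer_set = [[1, 2, 3], [4, 5]]
generated = [[1, 2], [3, 4, 5]]
score = jaccard(answer_set, generated)
```

Moving a single item between clusters already drops the score to one third here, which is consistent with the remark that the metric returns low values even for fairly similar clusterings.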

22 Results on C-130 Data
- Results from our clustering process were presented to engineers at Lockheed Martin.
- Engineers were well satisfied with the results for several reasons:
  - The iterative clustering approach pushes the BEST clusters to the top of the list and appears to produce higher quality clusters than standard SPK-Means approaches.
  - In an initial review session, our process identified at least one recurring fault that engineers may previously have been unaware of.

23 Future Steps
- Incorporate techniques that use real-time feedback to improve the fault detection algorithms and decrease the ranking of rejected clusters.
- Incorporate techniques such as acronym expansion and synonym recognition to reduce redundancy in results.
- Deliver GUI-based tools to Lockheed Martin (LM) that can be incorporated into their fault management process.
- Arrange an onsite visit by DePaul researchers to Lockheed Martin to test the same techniques on additional datasets.

24 AUTOMATED MINING OF CROSS-CUTTING CONCERNS FROM PROBLEM REPORTS AND REQUIREMENTS SPECIFICATIONS Horatiu Dumitru, Adam Czauderna, Jane Cleland-Huang DePaul University DePaul University, Systems and Requirements Engineering Center This project was jointly funded by NSF REU Supplement CCF: and a grant from Lockheed Martin via the Software Engineering Research Consortium.

