LANGLEY RESEARCH CENTER Big Data Analytics and Machine Learning in Aerospace Manjula Ambur, Lin Chen, Charles Liles, Robert Milletich,


1 LANGLEY RESEARCH CENTER Big Data Analytics and Machine Learning in Aerospace Manjula Ambur, Lin Chen, Charles Liles, Robert Milletich, Daniel Sammons, Ted Sidehamer, and Jeremy Yagle NASA Langley Research Center March 17, 2016

2 Outline: NASA Overview; Big Data Analytics and Machine Learning Background; Vision of Virtual Experts/Assistants; Data Intensive Scientific Discovery; Knowledge Analytics/Cognitive Computing; Projects towards Virtual Assistants; Aerospace Data Assistants and Algorithms; Aerospace Knowledge Assistants and Software; Progress, Insights, and Challenges; Team Acknowledgements

3 NASA Vision and Centers. NASA Vision: “To reach for new heights and reveal the unknown so that what we do and learn will benefit all humankind”

4 NASA Langley Technical Areas: Aerosciences; Atmospheric Characterization; Entry, Descent & Landing; Intelligent Flight Systems; Measurement Systems; Systems Analysis & Concepts; Advanced Materials & Structural Systems. Mission: NASA Langley is a research, science, technology and development center that provides game-changing innovations to enable NASA to make significant contributions to the nation

5 What is Big Data Analytics and Machine Intelligence?
Big Data Analytics: Data whose scale, diversity and complexity require new techniques and algorithms to manage it and extract value and knowledge
Machine Learning: Algorithms capable of learning from both data and human interaction to enable insights and make predictions
Machine Intelligence: An autonomous entity that can observe and act upon its environment and make decisions like humans
Four V’s of Big Data Analytics: Volume: scale of data; Velocity: analysis of streaming data; Variety: numerous forms of data; Veracity: mitigating uncertainty of data

6 Why Big Data is a Big Deal. Convergence of Factors: Data, Technology, and Thinking. Data-based decisions; Deluge of Data; Compute Power; Machine Learning

7 Why Big Data is a Big Deal: Huge Investments. Federal Research: ~$1 Billion; Google, Microsoft, IBM, Facebook: Big Investments: ~Many Billions; Brain Initiatives: EU and US; Being used at Boeing, GE, Lockheed Martin, DOE, NASA…; Transforming Medical Diagnosis and Research; Universities: Data Analytics Programs

8 Why Big Data is a Big Deal: Real World Examples. Aerodynamics in Formula One Racing; Oncology advisor; GE Maintenance; Recommendations: Google, Amazon, Netflix; Air Traffic Management; CERN; Asteroids and Stars

9 NASA Langley Comprehensive Digital Transformation
Vision: Catalyst to Enable Transformative Solutions to NASA Mission Challenges
Core Capabilities - Emphasis Areas:
  Modeling & Simulation (M&S): Integrated analysis and design of complex systems; Facilitate improved physics-based discipline tools; Optimally combine testing and M&S
  Advanced Information Technology: Open, secure collaboration for synergy; Networks handle burgeoning data; Data governance, architecture, and management
  High Performance Computing (HPC): Next generation software development; Rapid Compute power for M&S and BDA&MI; Architecture for real-time analysis and design
  Big Data Analytics & Machine Intelligence (BDA&MI): Rapid synthesis of global scientific info. for new insights; Data intensive scientific discoveries for advanced designs; Virtual Experts: Human-machine symbiosis
Goals: Accelerated Scientific Innovations and Discoveries; Focused, Relevant Research and Technology Development; Intelligent and Rapid Engineering and System designs; Virtual Analysis, Design, and Verification of Aerospace Systems and Science Instruments
Collaboration and Partnership is Paramount -- NASA, OGA, Industry, Universities

10 Big Data Analytics & Machine Intelligence Capability Vision: Virtual Research and Design Partner. Enable NASA employees to achieve greater scientific discoveries and systems innovations

11 LaRC 2035 Vision: Virtual Research and Design Partner
Research & Design Faster and Smarter with Minimal Time/Effort. Goal: Productivity x3 by 2035
Enable NASA employees to achieve significantly greater scientific and engineering discoveries, and systems innovation and optimization
  Able to quickly digest the latest research innovations and leverage insights
  Deep analysis of world-wide multimedia scientific information and data enabling discovery of trends, unobvious relationships, and possible paths with evidence
  Ability to ask engineering design-related questions and get reliable answers
  Process modeling and simulation data in real time for effective/efficient testing
  Accelerated ideation & design; increase research productivity

12 Two Key Areas for Virtual Partner/Assistants
The Fourth Paradigm: Deriving new insights, correlations, and discoveries not otherwise possible from diverse experimental and computational data
Cognitive Computing: Obtaining insights, identifying trends, aiding in discovery, and finding answers to specific questions by mining knowledge from scholarly, web, and multimedia content

13 Aerospace Data Assistants: Data Intensive Scientific Discovery Projects

14 Data Intensive Scientific Discovery: Emergence of a Fourth Paradigm…
A thousand years ago: Experimental Science. Description of natural phenomena; Galileo
Last few hundred years: Theoretical Science. Newton’s Laws, Maxwell’s Equations
Last few decades: Computational Science. Simulation of complex phenomena
Today: Data Intensive Science. Discoveries and insights from Data
Source: The Fourth Paradigm: Data-Intensive Scientific Discovery by Microsoft Research

15 Aerospace Data Assistants: Projects/Pilots
Data Intensive Scientific Discovery (DISD): Deriving new insights, correlations, and discoveries from diverse experimental and computational data sets
Entry Descent Landing Trajectory Analysis; Rapid Exploration of Aerospace Designs; Cognitive Assessment of Crew State Monitoring; Climate Data Fusion and Analysis; Predicting Flutter from Aeroelasticity Data; Anomaly Detection in the Non-Destructive Evaluation of Materials

16 Aerospace Data Assistants Projects – 1 of 2
Anomaly Detection in Non-Destructive Evaluation of Materials Images: Develop techniques and algorithms to automatically detect anomalies during the nondestructive evaluation of materials
  Goals: Significantly reduce SME analysis time and assist experts in discovering additional anomalies; Help to design better material compositions and structures
  Techniques: Two-Dimensional Regression designed to detect anomalous pixels; Convolutional Neural Networks to classify the image data
  Accomplishments & Next Steps: Algorithms are validated with real data sets and further enhanced; Deliver a tool with a good UI for SMEs to use as an ‘Assistant’ for anomaly detection of composite materials analysis
Predicting Flutter from Aeroelasticity Data: Develop methods to automatically detect the onset of flutter during wind tunnel testing
  Goals: Find new ways of predicting flutter in the time domain; Identify non-traditional predictor variables and unseen patterns; Better understand precursors to flutter and improve configurations
  Techniques: Piecewise Regression to locate peaks, track coalescence of structural modes; Time Series Motifs to identify signatures in the data that could represent precursors to flutter
  Accomplishments & Next Steps: Peak detection tested with multiple datasets; Several significant time series motifs detected; Testing with synthetic data for validation of algorithms

17 Aerospace Data Assistants Projects – 2 of 2
Cognitive Assessment of Crew State Monitoring: Build classification models for predicting cognitive state using physiological data collected during flight simulations
  Goals: Identify unsafe cognitive states in aircrew in real time; Apply results for more effective pilot training
  Techniques: Ensemble of machine learning tools (deep neural network, gradient boosting, random forest, support vector machine, decision tree); Data pre-processing using detrending and power spectral density
  Accomplishments & Next Steps: Initial data mapping, statistical analysis, and signals processing; Explore combining multiple signal models using ensembling; Developing models from test subjects' data from different days
Rapid Exploration of Aerospace Designs: Develop a generalized machine learning platform to be used for analyzing mod-sim data for design optimization
  Goals: Provide surrogate modeling to explore the trade space of aerospace vehicle designs with easy-to-use web interface; Use fast machine learning models instead of computationally intensive code for rapid exploration and optimization
  Techniques: Supervised machine learning algorithms, SVM, and Neural Networks trained on labeled data
  Accomplishments & Next Steps: Python 2.7 with scikit-learn algorithms is being used; Web interface using PHP being developed for SME use
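The pre-processing named on this slide (detrending followed by power spectral density) can be sketched roughly as below. This is only an illustration under stated assumptions: the function name `band_power_features`, the 128 Hz sampling rate, the band edges, and the synthetic signal are all hypothetical, and a plain windowed periodogram stands in for whatever PSD estimator the team actually used.

```python
import numpy as np

def band_power_features(signal, fs, bands):
    """Linearly detrend a signal, estimate its power spectrum with a
    Hann-windowed periodogram, and return the mean power per band."""
    t = np.arange(len(signal)) / fs
    # Remove slow linear drift with a first-order least-squares fit.
    trend = np.polyval(np.polyfit(t, signal, 1), t)
    detrended = signal - trend
    # Periodogram estimate (correct up to a constant window-gain factor).
    spectrum = np.fft.rfft(detrended * np.hanning(len(detrended)))
    psd = np.abs(spectrum) ** 2 / (fs * len(detrended))
    freqs = np.fft.rfftfreq(len(detrended), d=1.0 / fs)
    return {name: psd[(freqs >= lo) & (freqs < hi)].mean()
            for name, (lo, hi) in bands.items()}

# Synthetic example: a 10 Hz rhythm riding on a linear drift (made-up data).
fs = 128.0
t = np.arange(0, 8, 1 / fs)
signal = np.sin(2 * np.pi * 10 * t) + 0.5 * t
bands = {"theta": (4, 8), "alpha": (8, 13), "beta": (13, 30)}  # assumed bands
features = band_power_features(signal, fs, bands)
```

Features of this kind (one value per band per channel) are a common input to the classifiers listed under Techniques; the slide does not say which features were actually used.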

18 Aerospace Data Analytics: Challenge of Physics-Based Algorithms
Current State: SME relies on traditional methods to pre-select data; Requires expertise and is time-consuming
Being Developed: Algorithms that mimic SME knowledge: Validate the algorithm; Save SME time; Application of algorithms to data, and to other legacy datasets; Yields New Insights
Being Developed: Data Mining techniques to detect patterns and correlations which will be validated by SMEs
Long Term Vision: Virtual Expert: Autonomous Assistant to SME that analyzes all possible data and augments decision making
Diagram labels: All Data; SME-defined subset of data
Data Analytics Team and SMEs working together

19 Aerospace Data Assistants: Algorithms and Techniques. Data Intensive Scientific Discovery Projects

20 Linear regression
Application 1: Non-Destructive Evaluation (NDE) Image analysis
  Goal: Automate delamination detection
  Method: Fit data with linear regression and detect outlier regions. Regression performed on 1D and 2D signals; uses C++ and R
Application 2: Aeroelastic Flutter Data Analytics
  Goal: Detect precursors and onset of aeroelastic flutter
  Method: Fit best quadratics between structural modes to detect mode coalescence; uses MATLAB
Figures: Top: Linear regression of 1D signals for anomaly detection in carbon fiber; Bottom: Mode identification in flutter time-series data using linear regression
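The "fit with regression, then detect outlier regions" method can be illustrated with a small NumPy sketch on a 1D signal. It is not the team's C++/R code: the function name `flag_outliers`, the median-absolute-deviation threshold, and the synthetic signal are assumptions made for illustration. Setting `degree=2` gives a quadratic fit of the kind mentioned for the flutter application.

```python
import numpy as np

def flag_outliers(signal, degree=1, threshold=3.0):
    """Fit a polynomial by least squares and flag samples whose residual
    is more than `threshold` robust standard deviations from the median."""
    x = np.arange(len(signal), dtype=float)
    coeffs = np.polyfit(x, signal, degree)
    residuals = signal - np.polyval(coeffs, x)
    centered = residuals - np.median(residuals)
    # Median absolute deviation, scaled to match a Gaussian standard deviation.
    mad = np.median(np.abs(centered))
    scale = 1.4826 * mad if mad > 0 else 1.0
    return np.abs(centered) / scale > threshold

# Synthetic 1D signal: linear trend plus noise, with an injected anomaly.
rng = np.random.default_rng(0)
signal = 0.05 * np.arange(200) + rng.normal(0.0, 0.1, 200)
signal[90:100] += 2.0  # made-up "anomalous region"
mask = flag_outliers(signal)
```

A robust scale estimate is used here so that the anomaly itself does not inflate the threshold; the slide does not state how the outlier cutoff was chosen.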

21 Time Series Motifs
Application: Pattern mining of time domain Aeroelastic Flutter Data
Goal: Identify flutter precursors to: Create a dictionary of motifs for a given configuration; Classify data for use with machine learning algorithms that will support a real-time ‘Flutter Assistant’
Method: Application of the Motif Enumeration (MOEN) algorithm created by Dr. Abdullah Mueen; Open-source framework
Justification: MOEN has been successfully applied to research problems in other scientific domains including robotics, biology, and seismology
In order to detect motifs across the various sensor signals, a given sensor’s output (Signal A) is compared to another sensor (Signal B) by creating a composite signal (Signal A/B). The algorithm is then applied to the composite signal to detect the motifs (above right) common to both sensors. Significant motifs are identified by a physics-based selection process and then validated by SMEs.
Scott, Robert C., et al. "Aeroservoelastic Wind-Tunnel Test of the SUGAR Truss Braced Wing Wind-Tunnel Model." 56th AIAA/ASCE/AHS/ASC Structures, Structural Dynamics, and Materials Conference. 2015.
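To make the idea of a time series motif concrete, here is a brute-force sketch that finds the closest pair of non-overlapping, z-normalized subsequences of a fixed length. It is a simplified stand-in, not the MOEN algorithm (which enumerates motifs over a range of lengths far more efficiently). The names `znorm` and `best_motif_pair` and the synthetic series are hypothetical, and a single series is used rather than a composite of two sensors.

```python
import numpy as np

def znorm(window):
    """Z-normalize a window so motifs match on shape, not offset or scale."""
    std = window.std()
    centered = window - window.mean()
    return centered / std if std > 0 else centered

def best_motif_pair(series, length):
    """Return (i, j, distance) for the closest pair of non-overlapping
    subsequences of the given length (brute force, O(n^2) comparisons)."""
    n = len(series) - length + 1
    windows = np.array([znorm(series[i:i + length]) for i in range(n)])
    best = (-1, -1, np.inf)
    for i in range(n):
        for j in range(i + length, n):  # skip trivially overlapping matches
            d = np.linalg.norm(windows[i] - windows[j])
            if d < best[2]:
                best = (i, j, d)
    return best

# Synthetic series: noise with the same pattern planted at two locations.
rng = np.random.default_rng(1)
series = rng.normal(0.0, 1.0, 300)
pattern = 3.0 * np.sin(np.linspace(0, 4 * np.pi, 30))
series[40:70] = pattern + rng.normal(0.0, 0.05, 30)
series[200:230] = pattern + rng.normal(0.0, 0.05, 30)
i, j, dist = best_motif_pair(series, 30)
```

On this made-up data the recovered pair should sit at or very near the two planted locations (indices 40 and 200).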

22 Deep learning: convolutional neural network (CNN)
Application: NDE Image Analysis to segment delaminations
Method: Convolutional encoder/decoder neural network; end-to-end training to map raw data to segmentation; using Caffe
Justification: Very successful in medical image analysis such as wound segmentation (top right)
Figures: Results on Simulated Data; Results on Experimental Data
From Wang, Changhan, et al. "A unified framework for automatic wound segmentation and analysis with deep convolutional neural networks." Engineering in Medicine and Biology Society (EMBC), 2015 37th Annual International Conference of the IEEE. IEEE, 2015.

23 Artificial neural networks
Application 1: Crew State Monitoring
  Goal: Build classification models capable of accurate, real-time prediction of aircrew cognitive state using physiological data collected during flight simulations
  Method: ANN trained to classify cognitive state
Application 2: Rapid Exploration of Aerospace Designs (READ)
  Goal: Build classification / regression models on user-uploaded data for aerospace designs
  Method: Train ANNs on labeled data, use trained models for prediction and visualization
Diagram: EEG, ECG, Galvanic Skin Response, Respiration Rate, Eye Tracking; Feature Generation; Input Layer; Hidden Layer; Output Layer / Classification: “Normal” State, Channelized Attention, Diverted Attention, Startle / Surprise
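As a rough illustration of the READ idea (train a neural network on labeled design data, then use it as a fast surrogate for prediction), the sketch below fits scikit-learn's `MLPRegressor` to samples of a cheap analytic function. Everything specific here is an assumption: `expensive_analysis` is a made-up stand-in for a slow physics-based code, the two design variables are hypothetical, and the network size is arbitrary.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor

def expensive_analysis(x):
    """Hypothetical stand-in for a slow physics-based analysis code."""
    return x[:, 0] ** 2 + np.sin(3.0 * x[:, 1])

# "Labeled data": sampled designs and their analysis results.
rng = np.random.default_rng(0)
designs = rng.uniform(0.0, 1.0, size=(400, 2))  # two made-up design variables
responses = expensive_analysis(designs)

X_train, X_test, y_train, y_test = train_test_split(
    designs, responses, test_size=0.25, random_state=0)

surrogate = MLPRegressor(hidden_layer_sizes=(32, 32), activation="tanh",
                         solver="lbfgs", max_iter=2000, random_state=0)
surrogate.fit(X_train, y_train)
r2 = surrogate.score(X_test, y_test)  # coefficient of determination

# Rapid trade-space sweep: vary one variable with the other held fixed.
sweep = np.column_stack([np.linspace(0.0, 1.0, 50), np.full(50, 0.5)])
predicted = surrogate.predict(sweep)
```

Once trained, the surrogate can be evaluated many times at negligible cost, which is what makes interactive exploration and optimization of the trade space practical.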

24 Ensemble of Machine Learning Techniques
Application 1: Non-Destructive Evaluation (NDE) Image analysis
  Goal: Automate delamination detection
  Method: Combine several machine learning models into overall prediction using regression to determine if sample contains a delamination; using MATLAB and Python scikit-learn
Application 2: Crew State Monitoring
  Goal: Build classification models capable of accurate, real-time prediction of aircrew cognitive state using physiological data collected during flight simulations
  Method: Utilize 2-level Meta Model combining multiple classification algorithms to improve classification accuracy; using Theano & Python
Techniques:
  Random forests: Fully grow k independent classification trees and combine predictions
  Extremely random forests: Similar to random forests but split per node is also randomized
  AdaBoost: Fit consecutive weak learners based on classification tree stumps and combine predictions
  Gradient boosting: Similar to AdaBoost with different loss function
  k-nearest neighbors: Identify k closest points to test sample based on distance metric
Diagram: EEG, ECG, Galvanic Skin Response, Respiration Rate, Eye Tracking; Feature Generation; Level 1 Models: Artificial Neural Network, Gradient Boost Classifier, Random Forest; Level 2 Meta Model: Artificial Neural Network; Classes: “Normal” State, Channelized Attn, Diverted Attn, Startle / Surprise
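The 2-level meta model in the diagram (level 1: neural network, gradient boosting, random forest; level 2: neural network) resembles what scikit-learn calls stacking. The sketch below uses `StackingClassifier` on synthetic four-class data as one possible way to express that structure. It is not the team's Theano implementation, and the dataset, layer sizes, and hyperparameters are all made up.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (GradientBoostingClassifier,
                              RandomForestClassifier, StackingClassifier)
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in for generated physiological features; four classes
# mirror the four cognitive states on the slide.
X, y = make_classification(n_samples=600, n_features=12, n_informative=6,
                           n_classes=4, n_clusters_per_class=1,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y)

# Level 1: three different model families trained on the same features.
level1 = [
    ("ann", MLPClassifier(hidden_layer_sizes=(32,), max_iter=2000,
                          random_state=0)),
    ("gb", GradientBoostingClassifier(random_state=0)),
    ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
]
# Level 2: a small neural network trained on the level-1 predictions,
# which are produced out-of-fold via 5-fold cross-validation.
level2 = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000,
                       random_state=0)

model = StackingClassifier(estimators=level1, final_estimator=level2, cv=5)
model.fit(X_train, y_train)
accuracy = model.score(X_test, y_test)
```

Training the meta model on out-of-fold predictions keeps it from simply learning which level-1 model overfit the training set.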

25 Machine Learning Languages and Libraries
Libraries by language and task (tasks: Neural Networks and Deep Learning; Regression; Ensemble Learning; Time Series Motifs):
  Python: Neural Networks and Deep Learning: Theano/Keras; Ensemble Learning: Scikit-Learn, XGBoost
  MATLAB: Regression: MATLAB Functions; Time Series Motifs: Mex C# Wrapper
  Lua: Neural Networks and Deep Learning: Torch
  C/C++/C#: Neural Networks and Deep Learning: Caffe; Regression: Developed in-house; Time Series Motifs: Open source code
Many robust, open-source tools are available and being used - Available for most languages
Have enterprise license for MATLAB and our scientists and engineers use it
Initially, use languages team members are comfortable with
  Solution to problem is more important than language/library
  Allows for efficient exploration of solutions
Once solutions are found, re-implement into single language
  Initial investment leads to significant time-savings (e.g. debugging) down the road

26 Aerospace Knowledge Assistants. Data Intensive Scientific Discovery Projects

27 Two Key Areas for Virtual Partner/Assistants
The Fourth Paradigm: Deriving new insights, correlations, and discoveries not otherwise possible from diverse experimental and computational data
Knowledge Assistants/Cognitive Computing: Obtaining insights, identifying trends, aiding in discovery, and finding answers to specific questions by mining knowledge from scholarly, web, and multimedia content

28 Knowledge Assistants Using Watson Content Analytics
Key Capabilities: Digest and analyze thousands of articles without reading; Rapidly identify groupings, trends, connections, and experts; Explore technology gaps that could be leveraged; Identify cross-domain leverages and research
Research areas: Autonomous Flight Research; Carbon Nanotubes Research; Space Radiation Research
Analyses: Analysis of 130,000 articles from a 20-year time span; Analysis of 4,000 articles integrating scholarly and web content; Analysis of 1,000 articles of NASA research from Human Research Program
Human Machine Teaming, Uncertainty Quantification, Vehicle Design being worked
Successfully demonstrated value and developed robust expertise and capability
Buying licensed content is a challenge; Working with NASA content, open content and individual researchers' collections

29 Knowledge Assistants: Deep Analytics Examples
Automated Document Clustering and Trend Analysis; Expert Networks
Software: Utilizing IBM Watson Content Analytics algorithms. K-means is a scalable means of clustering large datasets; statistical techniques combined with semantic techniques and visualization provide the analytics power
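For readers unfamiliar with K-means document clustering, a small open-source sketch with scikit-learn follows. It does not use or reproduce IBM Watson Content Analytics: the six toy "abstracts" are invented, and TF-IDF weighting is an assumed (though common) choice of statistical text representation.

```python
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

# Invented one-line "abstracts" loosely themed on the slide's research areas.
documents = [
    "autonomous flight control for unmanned aircraft",
    "flight path planning for autonomous aircraft",
    "carbon nanotubes composite strength",
    "tensile strength of carbon nanotubes fibers",
    "space radiation shielding for astronauts",
    "radiation dose limits for astronauts in space",
]

# Represent each document as a TF-IDF weighted bag of words.
vectors = TfidfVectorizer(stop_words="english").fit_transform(documents)

# Group the documents into three clusters; n_init restarts guard against
# a poor random initialization.
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(vectors)
```

Documents that share vocabulary end up with the same cluster label; at the scale described in the deck, the same pattern would be applied to thousands of articles rather than six sentences.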

30 Cognitive Technologies for Aerospace Using Watson Discovery Advisor
Goal: Accelerate the discovery of new insights by synthesizing information in seconds, and providing answers with evidence
Develop and demonstrate two Proofs of Concept - Application to our aerospace domains: 1. Understand scientific and domain language 2. Adapt and learn quickly from inquiries, results, and iteration 3. Generate new hypotheses and discoveries 4. Compose and visualize information at large
Aerospace Innovation Advisor Proof of Concept. Program Linkage: CAS. Example Topics: Hybrid Electric Propulsion; On Demand Mobility
Pilot Advisor Proof of Concept. Program Linkage: SASO. Flight deck expert system for Root Cause Analysis and advice
ARC, AFRC, and JSC are also investigating use; LaRC is connected with those efforts

31 Partnerships and Education
Partnerships: Universities
  GA Tech: Data Analytics and ML for systems integration
  MIT: Computer Science and AI Lab: ML algorithms
  University of New Mexico: ML algorithms
  ODU: Machine Learning
  University of Michigan: Confluence of Mod Sim, HPC, & Big Data
  Carnegie Mellon: Machine Learning
  University of Washington: Big Data in Aerospace Program
Partnerships: Agencies and Industry
  NASA – Ames, Glenn, JSC, HQ
  OGA - IARPA; DOE;...........
  IBM: Cognitive Computing Technologies
  Boeing, Lockheed,….
Education & Outreach
  ~14 Seminars & ~5 Workshops
  Machine Learning and Analytics Courses – MATLAB; Deep Learning..
  Websites: Big Data; Machine Learning Toolbox; Knowledge Analytics; Lunch & Learn Sessions

32 Progress and Challenges: Diversified Portfolio; Mission-Focused Use Cases; Understanding Problem Domain and Data; Leveraging Collaborations & Open Source Tools; Applications in Aerospace in early stages – Treading path..; Lot of hype and misperceptions; Motivated and multi-skilled team; Technical community sees value; Strong collaboration with SMEs; Mix of Research, Experimentation, Iteration, and Persistence needed

33 Next Steps: Frameworks and Methodology to help expand the capability; Suite of ML tools for broad use; Synergy with Mod-Sim, HPC & Adv. IT; Productize algorithms for SME use; Formal Collaborations with Universities, Industry and OGAs; Build Stronger Linkages/advocacy to Missions; Broad Buy-in and Understanding of Value

34 Acknowledgements – Big Data Analytics Team
Data Analytics and Machine Learning Expertise: Manjula Ambur, Lin Chen, Christina Heinich, Charles Liles, Robert Milletich, Daniel Sammons, Ted Sidehamer, and Jeremy Yagle
Subject Matter Expertise: Danette Allen, Damodar Ambur, Dale Arney, Trey Arthur, Randy Bailey, Eric Burke, Jeff Cerro, Kyle Ellis, Christie Funk, Dana Hammond, Angela Harrivel, Jeff Herath, Jon Holbrook, Patty Howell, Lisa Le Vie, Constantine Lukashin, Alan Pope, Brandi Quam, Cheryl Rose, Jamshid Samareh, Mark Sanetrik, Rob Scott, Lisa Scott-Carnell, Steve Scotti, Walt Silva, Mia Siochi, Chad Stephens, Scott Striepe, Marty Waszak, Bill Winfree, and Kristopher Wise

