Deep Learning for Big Data


1 Deep Learning for Big Data
P. Baldi
University of California, Irvine
Department of Computer Science
Institute for Genomics and Bioinformatics
Center for Machine Learning and Intelligent Systems

2 Intelligence in Brains and Machines

3 Intelligence in Brains and Machines
LEARNING

4 Intelligence in Brains and Machines
LEARNING, DEEP LEARNING

5 Cutting Edge of Machine Learning: Deep Learning in Neural Networks
Engineering applications: computer vision, speech recognition, natural language understanding, robotics.

6 Computer Vision - Image Classification
ImageNet: over 1 million images, 1000 classes, different sizes (average 482x415), color.
16.42% error: deep CNN with dropout, 2012
6.66% error: 22-layer CNN (GoogLeNet), 2014
4.9% error: Google, Microsoft, 2015 (super-human performance)
Sources: Krizhevsky et al., ImageNet Classification with Deep Convolutional Neural Networks; Lee et al., Deeply Supervised Nets, 2014; Szegedy et al., Going Deeper with Convolutions, ILSVRC2014; Sanchez & Perronnin, CVPR 2011; Benenson,
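The percentages on this slide are classification error rates. ImageNet results are commonly reported as top-5 error, where a prediction counts as correct if the true class is among the five highest-scoring classes. The following is a minimal sketch of that metric, assuming top-5 is the metric meant here; the function name `top_k_error` and the toy scores are hypothetical and not taken from the talk.

```python
def top_k_error(scores, labels, k=5):
    """Fraction of examples whose true label is NOT among the k highest-scoring classes."""
    misses = 0
    for row, label in zip(scores, labels):
        # Class indices sorted by descending score, truncated to the best k.
        top_k = sorted(range(len(row)), key=lambda c: row[c], reverse=True)[:k]
        if label not in top_k:
            misses += 1
    return misses / len(labels)


# Hypothetical toy data: 3 examples, 6 classes (ImageNet has 1000 classes).
scores = [
    [0.10, 0.50, 0.20, 0.05, 0.12, 0.03],
    [0.30, 0.08, 0.12, 0.10, 0.05, 0.35],
    [0.01, 0.04, 0.10, 0.20, 0.30, 0.35],
]
labels = [1, 2, 0]

top1 = top_k_error(scores, labels, k=1)
top5 = top_k_error(scores, labels, k=5)
print(f"top-1 error: {top1:.3f}, top-5 error: {top5:.3f}")
```

Top-5 error is never larger than top-1 error on the same predictions, which is one reason the two should not be compared directly across papers.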

7 Deep Learning Applications
Engineering: computer vision (e.g. image classification, segmentation); speech recognition; natural language processing (e.g. sentiment analysis, translation).
Science: biology (e.g. protein structure prediction, analysis of genomic data); chemistry (e.g. predicting chemical reactions); physics (e.g. detecting exotic particles); and many more.

8 Deep Learning in Biology: Mining Omic Data

9 Deep Learning in Biology: Mining Omic Data
Solved
C. Magnan and P. Baldi. SSpro/ACCpro 5.0: Almost Perfect Prediction of Protein Secondary Structure and Relative Solvent Accessibility. Problem Solved? Bioinformatics (advance access June 18, 2014).

12 Deep Learning
P. Di Lena, K. Nagata, and P. Baldi. Deep Architectures for Protein Contact Map Prediction. Bioinformatics, 28, (2012).

13 Deep Learning Chemical Reactions
RCH=CH2 + HBr → RCH(Br)–CH3

14 Deep Learning Chemical Reaction: ReactionPredictor
M. Kayala, C. Azencott, J. Chen, and P. Baldi. Learning to Predict Chemical Reactions. Journal of Chemical Information and Modeling, 51, 9, 2209–2222, (2011).
M. Kayala and P. Baldi. ReactionPredictor: Prediction of Complex Chemical Reactions at the Mechanistic Level Using Machine Learning. Journal of Chemical Information and Modeling, 52, 10, 2526–2540, (2012).

15 Deep Learning in Physics: Searching for Exotic Particles

16

17 Daniel Whiteson, Peter Sadowski

18 Higgs Boson Detection
Deep network improves AUC by 8%. BDT = Boosted Decision Trees, as implemented in the TMVA package. We also had shallow NNs trained with TMVA, but these were not as good as our shallow networks. Same training and test sets. 5 different weight initializations to get standard deviations. Nature Communications, July 2014.
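The comparison on this slide rests on AUC, averaged over several weight initializations to obtain a standard deviation. The following is a minimal sketch of that bookkeeping, not the authors' evaluation code: AUC is computed here as the probability that a randomly chosen signal event scores above a randomly chosen background event, and the scores, labels, and three runs are hypothetical toy values (the slide reports 5 initializations).

```python
from statistics import mean, stdev


def auc(scores, labels):
    """AUC via pairwise comparison: P(signal score > background score), ties count 1/2."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical test set: 3 signal events (1) and 3 background events (0).
labels = [1, 1, 1, 0, 0, 0]

# Hypothetical classifier outputs from three different weight initializations.
runs = [
    [0.9, 0.8, 0.4, 0.5, 0.3, 0.2],
    [0.9, 0.7, 0.6, 0.5, 0.3, 0.2],
    [0.6, 0.4, 0.35, 0.5, 0.45, 0.2],
]

aucs = [auc(scores, labels) for scores in runs]
print(f"AUC = {mean(aucs):.3f} +/- {stdev(aucs):.3f} over {len(aucs)} runs")
```

The pairwise form is quadratic in the number of events, so it only suits small examples; a rank-based or library implementation would be the practical choice on a full test set.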

19 THANK YOU

20 Deep Learning in Chemistry: Predicting Chemical Reactions
RCH=CH2 + HBr → RCH(Br)–CH3
Many important applications (e.g. synthesis, retrosynthesis, reaction discovery).
Two different approaches: (1) write a system of rules; (2) learn the rules from big data.
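To make the first approach concrete, here is a toy sketch of a single hand-written rule for the reaction on this slide: Markovnikov addition of a hydrogen halide to a terminal alkene. It operates on condensed formula strings with regular expressions, which is an assumption made purely for illustration; it is not how Reaction Explorer or any real cheminformatics system represents molecules or rules, and the names `ALKENE`, `HALIDE`, and `hydrohalogenation` are hypothetical.

```python
import re

# Toy patterns on condensed formula strings (illustrative only).
ALKENE = re.compile(r"^(?P<r>\w+)CH=CH2$")   # terminal alkene, e.g. RCH=CH2
HALIDE = re.compile(r"^H(?P<x>Cl|Br|I)$")    # hydrogen halide, e.g. HBr


def hydrohalogenation(reactant_a, reactant_b):
    """Apply the rule if both reactants match; return the product string or None."""
    alkene = ALKENE.match(reactant_a)
    halide = HALIDE.match(reactant_b)
    if not (alkene and halide):
        return None  # the rule does not apply to these reactants
    # Markovnikov orientation: the halogen ends up on the more substituted carbon.
    return f"{alkene['r']}CH({halide['x']})–CH3"


print(hydrohalogenation("RCH=CH2", "HBr"))   # RCH(Br)–CH3
print(hydrohalogenation("RCH=CH2", "H2O"))   # None
```

Each new reaction type needs its own patterns and product template, which hints at why a hand-written rule system grows large and is hard to extend.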

21 Writing a System of Rules: Reaction Explorer
J. Chen and P. Baldi. No Electron Left Behind: A Rule-Based Expert System to Predict Chemical Reactions and Reaction Mechanisms. Journal of Chemical Information and Modeling, 49, 9, (2009).
ReactionExplorer system has about 1800 rules. Covers the undergraduate organic chemistry curriculum. Interactive educational system. Licensed by Wiley from ReactionExplorer and distributed worldwide.
Jonathan Chen

22 Problems
Very tedious. Non-scalable. Limited coverage (undergraduate).

