Convolutional LSTM Networks for Subcellular Localization of Proteins

Convolutional LSTM Networks for Subcellular Localization of Proteins. Søren Kaae Sønderby, Casper Kaae Sønderby, Henrik Nielsen*, and Ole Winther. *Center for Biological Sequence Analysis, Department of Systems Biology, Technical University of Denmark. Introduce myself: biology background.

Protein sorting in eukaryotes. Various compartments have different functions and different sets of proteins. Nobel Prize to Günter Blobel in 1999.

Feed-forward Neural Networks. Problems for sequence analysis: no built-in concept of sequence; no natural way of handling sequences of varying length; no mechanism for handling long-range correlations (beyond the input window size). Nevertheless widely used in protein sequence analysis, e.g. by me.

LSTM networks. An LSTM (Long Short-Term Memory) cell. LSTM networks are easier to train than other types of recurrent neural networks, can process very long time lags of unknown size between important events, and are used in speech recognition, handwriting recognition, and machine translation. In the cell diagram, xt is the input at time t and ht−1 the previous output; i is the input gate, f the forget gate, o the output gate, g the input modulation gate, and c the memory cell. The blue arrowhead refers to ct−1.
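For reference, a minimal NumPy sketch of one step of a standard LSTM cell, using the gate names from the slide. This is the common textbook formulation; the weight layout and shapes are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM step. W: (4n, d) input weights, U: (4n, n) recurrent weights,
    b: (4n,) biases, with the gate rows stacked in the order [i, f, o, g]."""
    n = h_prev.shape[0]
    z = W @ x_t + U @ h_prev + b
    i = sigmoid(z[0 * n:1 * n])      # input gate
    f = sigmoid(z[1 * n:2 * n])      # forget gate
    o = sigmoid(z[2 * n:3 * n])      # output gate
    g = np.tanh(z[3 * n:4 * n])      # input modulation gate
    c_t = f * c_prev + i * g         # memory cell: forget old content, add new
    h_t = o * np.tanh(c_t)           # output at time t
    return h_t, c_t
```

The forget gate is what lets the cell carry information across very long time lags: as long as f stays near 1, the memory cell content is preserved.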

“Unrolled” LSTM network. Each square represents a layer of LSTM cells at a particular time (1, 2, ..., t). The target y is presented at the final timestep.

Regular LSTM networks. “Double unidirectional” (what was shown on the previous slide): one target per sequence. Bidirectional: one target per position.
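Purely as an illustration of the distinction (Keras is used as a stand-in here; it is not the authors' framework), the two variants differ in which hidden states are kept:

```python
from tensorflow.keras import layers  # illustrative framework choice only

# Bidirectional: hidden states are returned at every position,
# so one target can be predicted per position.
per_position = layers.Bidirectional(layers.LSTM(200, return_sequences=True))

# "Double unidirectional": a forward and a backward LSTM each return only
# their final state; concatenating the two states gives one fixed-length
# vector and hence one target per sequence.
forward_lstm = layers.LSTM(200)
backward_lstm = layers.LSTM(200, go_backwards=True)
```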

Attention LSTM networks. Bidirectional, but with one target per sequence. Alignment weights determine where in the sequence the network directs its attention.

Convolutional Neural Networks. A convolutional layer in a neural network consists of small collections of neurons that each look at a small portion of the input image, called a receptive field. Convolutional networks are often used in image processing, where they handle translation invariance. The figure shows first-layer convolutional filters learned in an image-processing network; note that many of the filters are edge detectors or color detectors.

Our basic model. The amino acid sequence (..., Y, K, P, W, A, ...) is encoded position by position (..., xt−2, xt−1, xt, xt+1, xt+2, ...) and passed through a 1D convolution with filters of variable width, then through an LSTM layer and a fully connected feed-forward layer (FFN), and a softmax gives the target prediction at the final step t = T. Note that the convolutional weights are shared across sequence steps.
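A rough NumPy sketch of this forward pass. All shapes, the single filter width, and the random weights are illustrative assumptions; the actual model uses several filter widths and learned parameters.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def conv1d_same(X, filters):
    """1D convolution over the sequence. X: (T, d) encoded residues,
    filters: (k, w, d). The filter weights are shared across positions;
    zero padding keeps the output length at T."""
    T, d = X.shape
    k, w, _ = filters.shape
    pad = w // 2
    Xp = np.vstack([np.zeros((pad, d)), X, np.zeros((pad, d))])
    return np.stack([np.tensordot(filters, Xp[t:t + w], axes=([1, 2], [0, 1]))
                     for t in range(T)])            # shape (T, k)

# Illustrative shapes only: T=70 residues, 20-dim sparse encoding,
# 10 filters of width 5, 200 LSTM units, 10 localization classes.
rng = np.random.default_rng(0)
X = rng.random((70, 20))                            # encoded amino acid sequence
feats = conv1d_same(X, rng.standard_normal((10, 5, 20)) * 0.1)   # (70, 10)

n, d = 200, feats.shape[1]
W = rng.standard_normal((4 * n, d)) * 0.01          # input weights, gates stacked [i, f, o, g]
U = rng.standard_normal((4 * n, n)) * 0.01          # recurrent weights
b = np.zeros(4 * n)
h, c = np.zeros(n), np.zeros(n)
for t in range(feats.shape[0]):                     # unroll the LSTM over the sequence
    z = W @ feats[t] + U @ h + b
    i, f, o = (sigmoid(z[j * n:(j + 1) * n]) for j in range(3))
    g = np.tanh(z[3 * n:])
    c = f * c + i * g
    h = o * np.tanh(c)

W_out = rng.standard_normal((10, n)) * 0.01
logits = W_out @ h                                  # use only the final hidden state (t = T)
p = np.exp(logits - logits.max())
p /= p.sum()                                        # softmax over localization classes
```

The key point the sketch makes concrete is weight sharing: the same convolutional filters and the same LSTM weights are applied at every sequence position, so the model handles sequences of any length.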

Our model, with attention. The encoder (convolution + LSTM) produces hidden-state vectors ht, ht+1, ..., hT containing the activations of each LSTM unit at each time step. An attention module assigns a weight 𝛼t to each sequence position; the decoder forms the weighted hidden average of the hidden states, which is passed through an FFN and a softmax to give the target prediction. (Skip this if time is short.)

Our model, specifications. Input encoding: sparse, BLOSUM80, HSDM and profile (R^{1×80}). Conv. filter sizes: 1, 3, 5, 9, 15, 21 (10 of each). LSTM layer: 1×200 units. Fully connected FFN layer: 1×200 units. Attention model: Wa (R^{200×400}), va (R^{1×200}).
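One plausible reading of the attention shapes above, as a hedged NumPy sketch: the hidden state at each position is assumed to be 400-dimensional (forward and backward LSTM states of 200 units each, concatenated), Wa maps it to 200 dimensions, and va scores each position. The exact scoring function is defined in the paper and may differ.

```python
import numpy as np

# Assumed shapes: T positions, 400-dim hidden states, Wa in R^{200x400}, va in R^{1x200}.
rng = np.random.default_rng(0)
T = 70
H = rng.random((T, 400))                  # h_1 ... h_T from the LSTM encoder
Wa = rng.standard_normal((200, 400)) * 0.01
va = rng.standard_normal((1, 200)) * 0.01

scores = (va @ np.tanh(Wa @ H.T)).ravel() # one alignment score per sequence position
alpha = np.exp(scores - scores.max())
alpha /= alpha.sum()                      # attention weights, sum to 1 over positions
context = alpha @ H                       # weighted hidden average, shape (400,)
# 'context' then feeds the fully connected FFN layer and the softmax output.
```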

MultiLoc architecture. MultiLoc is an SVM-based predictor using only sequence as input. MultiLoc is the source of our data set.

MultiLoc2 architecture. PhyloLoc = phylogenetic profiles: in which taxonomic range is the gene found? GOLoc = Gene Ontology codes from homologous proteins. MultiLoc2 corresponds to MultiLoc + PhyloLoc + GOLoc. Thus, its input is not only sequence but also metadata derived from homology searches.

SherLoc2 architecture. SherLoc2 corresponds to MultiLoc2 + EpiLoc. EpiLoc = a prediction system based on features derived from PubMed abstracts found through homology searches.

Results: performance. Ensemble = several trainings with different random seeds. Note that our model is biologically naïve compared to MultiLoc.

Learned Convolutional Filters. Images made with the Seq2Logo program. Filter D may represent the cytoplasmic end of a TM helix.

Learned Attention Weights. A point is coloured black if the attention weight for that position in that sequence is above a certain threshold.

t-SNE plot of LSTM representation

Contributions. 1. We show that LSTM networks combined with convolutions are efficient for predicting subcellular localization of proteins from sequence. 2. We show that convolutional filters can be used for amino acid sequence analysis and introduce a visualization technique. 3. We investigate an attention mechanism that lets us visualize where the LSTM network focuses. 4. We show that the LSTM network effectively extracts a fixed-length representation of variable-length proteins.

Acknowledgments. Thanks to: Søren & Casper Kaae Sønderby and Ole Winther for doing the actual implementation and training; Ole Winther for supervising Søren & Casper; Søren Brunak for introducing me to the world of neural networks; the organizers for accepting our paper; and you for listening!