
Convolutional LSTM Networks for Subcellular Localization of Proteins


1 Convolutional LSTM Networks for Subcellular Localization of Proteins
Søren Kaae Sønderby, Casper Kaae Sønderby, Henrik Nielsen*, and Ole Winther
*Center for Biological Sequence Analysis, Department of Systems Biology, Technical University of Denmark
Introduce myself: biology background.

2 Protein sorting in eukaryotes
Various compartments have different functions and different sets of proteins. Nobel Prize to Günter Blobel in 1999 for the discovery that proteins carry intrinsic signals governing their transport and localization in the cell.

3 Feed-forward Neural Networks
Widely used in protein sequence analysis, e.g. by me. Problems for sequence analysis:
- No built-in concept of sequence
- No natural way of handling sequences of varying length
- No mechanism for handling long-range correlations beyond the input window size (see the sketch below)
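To make the window limitation concrete, here is a minimal sketch (my illustration, not from the talk) of the classic fixed-window, sparsely encoded input to a feed-forward network; any context outside the window simply never reaches the model:

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}

def one_hot(seq):
    # Sparse encoding: one 20-dim indicator vector per residue.
    x = np.zeros((len(seq), 20))
    for i, aa in enumerate(seq):
        x[i, AA_INDEX[aa]] = 1.0
    return x

def windows(seq, win=11):
    # Flatten each length-`win` window into one fixed-size input vector;
    # residues outside the window are invisible to the network.
    half = win // 2
    x = one_hot(seq)
    padded = np.vstack([np.zeros((half, 20)), x, np.zeros((half, 20))])
    return np.stack([padded[i:i + win].ravel() for i in range(len(seq))])

X = windows("MKTAYIAKQR")  # shape (10, 220): one fixed-size row per residue
```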

4 LSTM networks
An LSTM (Long Short-Term Memory) cell. LSTM networks:
- are easier to train than other types of recurrent neural networks
- can handle very long time lags of unknown length between important events
- are used in speech recognition, handwriting recognition, and machine translation
xt: input at time t; ht-1: previous output; i: input gate, f: forget gate, o: output gate, g: input modulation gate, c: memory cell. The blue arrow head refers to ct−1. (A sketch of one LSTM step follows below.)
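As an illustration of the gate equations behind the cell diagram, here is a minimal sketch of one LSTM step in NumPy; the weight shapes and random initialization are placeholder assumptions, not the paper's code:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    # One time step: x_t is the input, h_prev/c_prev the previous output
    # and memory cell. W, U, b hold the stacked parameters of all gates.
    z = W @ x_t + U @ h_prev + b                  # all four gates in one affine map
    i, f, o, g = np.split(z, 4)
    i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)  # input, forget, output gates
    g = np.tanh(g)                                # input modulation gate
    c_t = f * c_prev + i * g                      # memory cell update
    h_t = o * np.tanh(c_t)                        # output
    return h_t, c_t

n_in, n_hid = 20, 200
rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(4 * n_hid, n_in))
U = rng.normal(scale=0.1, size=(4 * n_hid, n_hid))
b = np.zeros(4 * n_hid)
h, c = np.zeros(n_hid), np.zeros(n_hid)
h, c = lstm_step(rng.normal(size=n_in), h, c, W, U, b)
```

The forget gate's multiplicative path through c is what lets gradients survive over long time lags, which is why LSTMs train more easily than plain recurrent networks.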

5 “Unrolled” LSTM network
Each square represents a layer of LSTM cells at a particular time (1, 2, ... t). The target y is presented at the final timestep.

6 Regular LSTM networks
- Bidirectional: one target per position
- Double unidirectional: one target per sequence (“double unidirectional” = what was shown on the previous slide)

7 Attention LSTM networks
Bidirectional, but with one target per sequence. Alignment weights determine where in the sequence the network directs its attention.

8 Convolutional Neural Networks
A convolutional layer in a neural network consists of small neuron collections that look at small portions of the input image, called receptive fields. Convolutional networks are often used in image processing, where they provide translation invariance. First-layer convolutional filters learned in an image-processing network; note that many filters are edge detectors or color detectors.

9 Our basic model
[Figure: the basic architecture. A 1D convolution of variable width, with weights shared across sequence steps, slides over the encoded amino acid sequence (… Y K P W A …) to produce the inputs xt, xt+1, …, xT; these feed an LSTM layer, and a feed-forward network (FFN) with a softmax makes the target prediction at t = T.] (A sketch of the shared-weight 1D convolution follows below.)
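A minimal sketch of this shared-weight 1D convolution over an encoded sequence; the filter widths and counts are taken from slide 11, and random weights stand in for the learned filters:

```python
import numpy as np

def conv1d(x, filters):
    # x: (T, 20) encoded sequence; filters: (n_filt, width, 20).
    # Zero padding preserves the sequence length; the same filter
    # weights are applied at every position.
    n_filt, width, _ = filters.shape
    half = width // 2
    T = x.shape[0]
    padded = np.vstack([np.zeros((half, 20)), x, np.zeros((half, 20))])
    out = np.empty((T, n_filt))
    for t in range(T):
        window = padded[t:t + width]  # receptive field at position t
        out[t] = np.tensordot(filters, window, axes=([1, 2], [0, 1]))
    return out

rng = np.random.default_rng(0)
x = rng.random((50, 20))                          # a 50-residue encoded sequence
feature_maps = [conv1d(x, rng.normal(size=(10, w, 20)))
                for w in (1, 3, 5, 9, 15, 21)]    # 10 filters of each width
features = np.concatenate(feature_maps, axis=1)   # (50, 60): input to the LSTM
```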

10 Our model, with attention
[Figure: the attention architecture. The convolutional LSTM encoder produces vectors ht, ht+1, …, hT containing the activations in each LSTM unit at each time step. Attention weights 𝛼t, 𝛼t+1, …, 𝛼T over the sequence positions form a weighted hidden average, which the FFN decoder turns into the target prediction via a softmax.] Skip this if time is short.

11 Our model, specifications
- Input encoding: sparse, BLOSUM80, HSDM and profile (R^(1×80))
- Conv. filter sizes: 1, 3, 5, 9, 15, 21 (10 of each)
- LSTM layer: 1×200 units
- Fully connected FFN layer: 1×200 units
- Attention model: Wa ∈ R^(200×400), va ∈ R^(1×200) (see the sketch below)
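A minimal sketch of the attention computation with these dimensions. The 400-dim hidden states are presumably the concatenated forward and backward 200-unit LSTM states, and the tanh scoring form is an assumption consistent with standard attention; the paper's exact formulation may differ in detail:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def attention(H, W_a, v_a):
    # H: (T, 400) hidden states; returns the weighted hidden average
    # (the "context") and the per-position attention weights alpha.
    scores = (v_a @ np.tanh(W_a @ H.T)).ravel()  # one scalar score per position
    alpha = softmax(scores)                      # attention weights, sum to 1
    context = alpha @ H                          # (400,) weighted hidden average
    return context, alpha

rng = np.random.default_rng(0)
H = rng.normal(size=(120, 400))                  # e.g. a 120-residue protein
W_a = rng.normal(scale=0.05, size=(200, 400))
v_a = rng.normal(scale=0.05, size=(1, 200))
context, alpha = attention(H, W_a, v_a)          # context feeds the FFN + softmax
```

Because alpha has one entry per sequence position, it can be plotted directly, which is what the learned-attention-weights slide below shows.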

12 MultiLoc architecture
MultiLoc is an SVM-based predictor using only sequence as input. MultiLoc is the source of our data set.

13 MultiLoc2 architecture
PhyloLoc = phylogenetic profiles: in which taxonomic range is the gene found? GOLoc = Gene Ontology codes from homologous proteins. MultiLoc2 corresponds to MultiLoc + PhyloLoc + GOLoc. Thus, its input is not only sequence, but also metadata derived from homology searches.

14 SherLoc2 architecture SherLoc2 corresponds to MultiLoc2 + EpiLoc
EpiLoc = a prediction system based on features derived from PubMed abstracts found through homology searches

15 Results: performance
Ensemble = several trainings with different random seeds. Note that our model is biologically naïve compared to MultiLoc.

16 Learned Convolutional Filters
Images made by the Seq2Logo program. Filter D may represent the cytoplasmic end of a transmembrane (TM) helix.

17 Learned Attention Weights
A point is coloured black if the attention weight for that position in that sequence is above a certain threshold.

18 t-SNE plot of LSTM representation

19 Contributions
1. We show that LSTM networks combined with convolutions are efficient for predicting the subcellular localization of proteins from sequence.
2. We show that convolutional filters can be used for amino acid sequence analysis and introduce a visualization technique.
3. We investigate an attention mechanism that lets us visualize where the LSTM network focuses.
4. We show that the LSTM network effectively extracts a fixed-length representation of variable-length proteins.

20 Acknowledgments
Thanks to:
- Søren & Casper Kaae Sønderby for doing the actual implementation and training
- Ole Winther for supervising Søren & Casper
- Søren Brunak for introducing me to the world of neural networks
- The organizers for accepting our paper
- You for listening!

