Principles of using neural networks for predicting molecular traits from DNA sequence


(A) DNA sequence and the molecular response variable along the genome for three individuals. Conventional approaches in regulatory genomics consider variations between individuals, whereas deep learning can exploit intra‐individual variations by tiling the genome into DNA sequence windows centred on individual traits, resulting in large training data sets from a single sample.

(B) One‐dimensional convolutional neural network for predicting a molecular trait from the raw DNA sequence in a window. Filters of the first convolutional layer (example shown on the edge) scan for motifs in the input sequence. Subsequent pooling reduces the input dimension, and additional convolutional layers can model interactions between motifs in the previous layer.

(C) The response variable predicted by the neural network shown in (B) for a wild‐type and a mutant sequence is used as input to an additional neural network that predicts a variant score, which allows normal variants to be discriminated from deleterious ones.

(D) Visualization of a convolutional filter by aligning genetic sequences that maximally activate the filter and creating a sequence motif.

(E) Mutation map of a sequence window. Rows correspond to the four possible base pair substitutions, columns to sequence positions. The predicted impact of any sequence change is colour‐coded. Letters on top denote the wild‐type sequence, with the height of each nucleotide denoting the maximum effect across mutations (figure panel adapted from Alipanahi et al, 2015).

Christof Angermueller et al. Mol Syst Biol 2016;12:878. © as stated in the article, figure or figure legend.
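The filter scan in (B) and the mutation map in (E) can be illustrated with a minimal Python sketch. It is a toy under stated assumptions, not the network from the figure: the single filter is hand-built from a hypothetical motif ("TGA") rather than learned, the "model" is one convolution followed by global max pooling, and all function names are invented for this illustration.

```python
BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA string as one row per position, one column per base."""
    return [[1.0 if base == b else 0.0 for b in BASES] for base in seq]

def motif_filter(motif):
    """Hypothetical hand-built filter; a trained network would learn these weights."""
    return one_hot(motif)

def conv1d(encoded, filt):
    """Slide the filter along the sequence and apply a ReLU to each window score."""
    width = len(filt)
    scores = []
    for start in range(len(encoded) - width + 1):
        total = sum(encoded[start + i][j] * filt[i][j]
                    for i in range(width) for j in range(4))
        scores.append(max(0.0, total))
    return scores

def predict(seq, filt):
    """Toy model: one convolutional filter followed by global max pooling."""
    return max(conv1d(one_hot(seq), filt))

def mutation_map(seq, filt):
    """Predicted change for every single-base substitution.

    Keys are the substituted bases (rows), list indices are positions (columns).
    The wild-type base at each position has an effect of zero.
    """
    wild_type = predict(seq, filt)
    effects = {b: [] for b in BASES}
    for pos in range(len(seq)):
        for b in BASES:
            mutant = seq[:pos] + b + seq[pos + 1:]
            effects[b].append(predict(mutant, filt) - wild_type)
    return effects

def max_effect_per_position(effects):
    """Largest absolute effect at each position (the letter heights in panel E)."""
    n = len(effects["A"])
    return [max(abs(effects[b][pos]) for b in BASES) for pos in range(n)]

window = "ACTGAC"              # toy sequence window containing the motif
filt = motif_filter("TGA")     # assumed motif, for illustration only
effects = mutation_map(window, filt)
heights = max_effect_per_position(effects)
```

In this toy case only substitutions inside the motif change the prediction, so the per-position heights are non-zero only at the three motif positions. A real model would use many learned filters and further layers, and its mutation map would generally show both positive and negative effects.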