Mining proteomes for short motifs (possible potential as bioactive peptides)

Proteomes: man, pathogens, food organisms
Computation: evolutionary conservation, evolutionary convergence, predicting SLiM-like properties

Short linear motifs: the SLiMPred predictor
Restricted training set to protein-binding motifs.
Structural classes: α-helix, β-sheet, polyproline II
Motifs included: LIG_EH1_1, LIG_Dynein_DLC8_1, LIG_CAP-Gly_1, LIG_GLEBS_BUB3_1, LIG_PDZ_1, LIG_SH3_1, LIG_IQ, LIG_PP1, LIG_SH3_2, LIG_MDM2, LIG_PP2B_1, LIG_SH3_3, LIG_NRBOX, LIG_SH2_GRB2, LIG_SH3_5, LIG_Sin3_1, LIG_SH2_SRC, LIG_TRAF2_1, LIG_Sin3_3, LIG_SH2_STAT3, LIG_TRAF6, LIG_SH2_STAT5, LIG_WW_1, LIG_SIAH_1, LIG_TRFH_1, LIG_WRPW_1

Training a short linear motif predictor (SLiMPred)
Training-set table (counts not preserved in this transcript; only the fragment ",410" survives). Columns: α-helix, β-sheet, polyproline II, other. Rows: sequences, unique ELMs, SLiM residues, non-SLiM residues.
Most motifs lie in disordered regions of proteins.
The existing predictor ANCHOR predicts protein binding within disordered regions.

SLiMPred (blue) vs ANCHOR (red)
Figure panels: alpha-helix, beta-sheet, polyproline-II helix, other

SLiMPred has some predictive ability in ordered regions too
Figure panels: disordered regions, ordered regions

SLiMPred: predicting motif-like regions along a protein
Figure tracks: disorder, SLiMPred, relative local conservation
Mooney et al., J Mol Biol (2012) 415:
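
The relative local conservation track compares each residue's conservation with that of its sequence neighbours. A minimal sketch of one way to compute such a score is given here. It assumes per-residue conservation scores have already been derived from an alignment. The function name, the window size, and the z-score form are illustrative assumptions, not the definition used by Mooney et al.

```python
from statistics import mean, pstdev

def relative_local_conservation(scores, flank=2):
    """Per residue: (score - window mean) / window std over +/- flank residues.

    Assumed, simplified definition for illustration only.
    """
    rlc = []
    for i, s in enumerate(scores):
        # Window is truncated at the sequence ends.
        window = scores[max(0, i - flank): i + flank + 1]
        mu = mean(window)
        sd = pstdev(window)
        # A flat window carries no local signal, so report 0.
        rlc.append(0.0 if sd == 0 else (s - mu) / sd)
    return rlc

# Hypothetical conservation scores (0 = variable, 1 = invariant).
scores = [0.2, 0.3, 0.2, 0.9, 0.8, 0.9, 0.3, 0.2, 0.2]
rlc = relative_local_conservation(scores, flank=3)

# Residues more conserved than their surroundings are motif candidates.
candidates = [i for i, v in enumerate(rlc) if v > 0.5]
```

Scoring against the local window rather than the whole protein means that a short conserved stretch inside a fast-evolving disordered region can still stand out.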

Which kinds of interactions should we use in searching for novel motifs?
Interaction sets compared: all, yeast two-hybrid, complex

casl.ucd.ie/empa/programberlin/2-uncategorised/52

Potential workflows to identify novel peptides from proteins
Conservation analysis: SLiMPrints
Convergent evolution analysis: SLiMFinder
Extracellular peptides: PeptideRanker
Intracellular peptides: SLiMPred, ANCHOR
Known structure of protein ligand, candidate peptide sequence: Pepsite (Trabuco et al., Nucleic Acids Res. 2012)
Known structure with linear peptide in complex: predict cyclised peptide mimetic (CYCLOPS virtual library; Duffy et al. 2012)
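
Each of these workflows eventually yields candidate motif occurrences in protein sequences. Short linear motifs are commonly written as regular expressions, so the simplest mining step is a pattern scan over a proteome. The sketch below assumes a toy proteome held in a dictionary. The pattern `P..P` (the widely cited PxxP core of SH3-binding peptides) is used only as an illustration and is not the definition of any ELM class. The function name and the sequences are hypothetical.

```python
import re

def scan_proteome(proteome, pattern):
    """Yield (protein_id, 1-based start, matched text) for every match.

    The pattern is wrapped in a lookahead so overlapping hits are reported.
    """
    regex = re.compile(f"(?=({pattern}))")
    for protein_id, seq in proteome.items():
        for m in regex.finditer(seq):
            yield protein_id, m.start() + 1, m.group(1)

# Hypothetical sequences, for illustration only.
proteome = {
    "protA": "MSTAPPLPPRSGKK",
    "protB": "MKKLLEEAVNQ",
}

hits = list(scan_proteome(proteome, r"P..P"))
for protein_id, start, text in hits:
    print(protein_id, start, text)
```

Patterns this short match by chance very often, which is one reason the workflows combine a scan with conservation, convergence, or disorder-based evidence.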

SLiMFinder on human data: known versus new motifs
Known true-positive motifs are discovered, with variations, in many protein interaction sets.
Novel motifs are much sparser, often discovered only once.
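
The contrast between recurrent and one-off motifs can be tallied by counting the number of interaction sets in which each pattern is returned. The sketch below assumes motif-discovery output has already been collected per interaction set. The dataset names and patterns are hypothetical and do not come from the SLiMFinder results.

```python
from collections import Counter

def motif_recurrence(results):
    """Count in how many interaction sets each motif pattern was returned."""
    counts = Counter()
    for motifs in results.values():
        counts.update(set(motifs))  # count a motif once per interaction set
    return counts

# Hypothetical output: interaction set -> motif patterns returned for it.
results = {
    "hub1_partners": ["P..P", "[ST]Q"],
    "hub2_partners": ["P..P"],
    "hub3_partners": ["P..P", "L..LL"],
}

counts = motif_recurrence(results)
recurrent = sorted(m for m, n in counts.items() if n > 1)
singletons = sorted(m for m, n in counts.items() if n == 1)
```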

All the human motif discovery results are available in an online searchable database (search on genes or motifs): bioware.soton.ac.uk/slimdb