PART II. Prediction of functional regions within disordered proteins Zsuzsanna Dosztányi MTA-ELTE Momentum Bioinformatics Group Department of Biochemistry.

Slides:



Advertisements
Similar presentations
PREDetector : Prokaryotic Regulatory Element Detector Samuel Hiard 1, Sébastien Rigali 2, Séverine Colson 2, Raphaël Marée 1 and Louis Wehenkel 1 1 Bioinformatics.
Advertisements

Mining proteomes for short motifs (possible potential as bioactive peptides) Proteomes – Man – pathogens – food organisms Computation – Evolutionary conservation.
© Wiley Publishing All Rights Reserved. Analyzing Protein Sequences.
Structural bioinformatics
Protein-DNA interactions: amino acid conservation and the effects of mutations on binding specificity Nicholas M. Luscombe and Janet M. Thornton JMB (2002)
The Model To model the complex distribution of the data we used the Gaussian Mixture Model (GMM) with a countable infinite number of Gaussian components.
Protein Modules An Introduction to Bioinformatics.
Tutorial 2: Some problems in bioinformatics 1. Alignment pairs of sequences Database searching for sequences Multiple sequence alignment Protein classification.
APC = anaphase-promoting complex
Detecting the Domain Structure of Proteins from Sequence Information Niranjan Nagarajan and Golan Yona Department of Computer Science Cornell University.
BTN323: INTRODUCTION TO BIOLOGICAL DATABASES Day2: Specialized Databases Lecturer: Junaid Gamieldien, PhD
Protein Tertiary Structure Prediction
Graves’ Disease Case: Previously Normal thyroid signaling requires circuit of signaling: hypothalamus, pituitary, thyroid Signaling between cells requires.
Linear motifs and phosphorylation sites. What is a linear motif? ( in molecular biology )
Protein Bioinformatics Course
CRB Journal Club February 13, 2006 Jenny Gu. Selected for a Reason Residues selected by evolution for a reason, but conservation is not distinguished.
Predicting protein disorder Peter Tompa Institute of Enzymology Hungarian Academy of Sciences Budapest, Hungary.
Makromolekulak_2010_12_07 Simon István. Prion protein.
Prediction of protein disorder Zsuzsanna Dosztányi MTA-ELTE Momentum Bioinformatics Group Department of Biochemistry Eotvos Lorand University, Budapest,
Prediction of protein disorder Zsuzsanna Dosztányi Institute of Enzymology, Budapest, Hungary
* only 17% of SNPs implicated in freshwater adaptation map to coding sequences Many, many mapping studies find prevalent noncoding QTLs.
Protein Secondary Structure Prediction Based on Position-specific Scoring Matrices Yan Liu Sep 29, 2003.
From Structure to Function. Given a protein structure can we predict the function of a protein when we do not have a known homolog in the database ?
TGCAAACTCAAACTCTTTTGTTGTTCTTACTGTATCATTGCCCAGAATAT TCTGCCTGTCTTTAGAGGCTAATACATTGATTAGTGAATTCCAATGGGCA GAATCGTGATGCATTAAAGAGATGCTAATATTTTCACTGCTCCTCAATTT.
Unraveling condition specific gene transcriptional regulatory networks in Saccharomyces cerevisiae Speaker: Chunhui Cai.
Intrinsically disordered proteins: drug development and the most interesting examples Peter Tompa Institute of Enzymology Hungarian Academy of Sciences.
Bioinformatics Ayesha M. Khan 9 th April, What’s in a secondary database?  It should be noted that within multiple alignments can be found conserved.
Computational Genomics and Proteomics Lecture 8 Motif Discovery C E N T R F O R I N T E G R A T I V E B I O I N F O R M A T I C S V U E.
Functional classification of IDPs Peter Tompa Institute of Enzymology Hungarian Academy of Sciences Budapest, Hungary.
Protein-Protein Interaction Hotspots Carved into Sequences Yanay Ofran 1,2, Burkhard Rost 1,2,3 1.Department of Biochemistry and Molecular Biophysics,
Lecture 9. Functional Genomics at the Protein Level: Proteomics.
Tertiary Structure Globular proteins (enzymes, molecular machines)  Variety of secondary structures  Approximately spherical shape  Water soluble 
CSBSI 2007 Bioinformatics and Computational Biology Program Department of Genetics, Development, and Cell Biology Department of Computer Science Generating.
Protein Disordered Regions and the Evolution of Eukaryotes Allan Wu Phar 201 Phil Bourne.
PREDICTION OF CATALYTIC RESIDUES IN PROTEINS USING MACHINE-LEARNING TECHNIQUES Natalia V. Petrova (Ph.D. Student, Georgetown University, Biochemistry Department),
Introduction to Protein Structure Prediction BMI/CS 576 Colin Dewey Fall 2008.
Alignment & Secondary Structure You have learned about: Data & databases Tools Amino Acids Protein Structure Today we will discuss: Aligning sequences.
Pre-mRNA secondary structures influence exon recognition Michael Hiller Bioinformatics Group University of Freiburg, Germany.
Russell Group, Protein Evolution _________ ____ Rob Russell Cell Networks University of Heidelberg Interactions and Modules: the how and why of molecular.
Graves’ Disease Case: Previously Normal thyroid signaling requires circuit of signaling: hypothalamus, pituitary, thyroid Signaling between any cells requires.
Proteomics Informatics (BMSC-GA 4437) Instructor David Fenyö Contact information
Motif Search and RNA Structure Prediction Lesson 9.
BMC Bioinformatics 2005, 6(Suppl 4):S3 Protein Structure Prediction not a trivial matter Strict relation between protein function and structure Gap between.
Ubiquitination Sites Prediction Dah Mee Ko Advisor: Dr.Predrag Radivojac School of Informatics Indiana University May 22, 2009.
Transcription factor binding motifs (part II) 10/22/07.
Artificial Intelligence Research Laboratory Bioinformatics and Computational Biology Program Computational Intelligence, Learning, and Discovery Program.
Protein Tertiary Structure Prediction Structural Bioinformatics.
Intrinsically disordered proteins Zsuzsanna Dosztányi EMBO course Budapest, 3 June 2016.
Protein families, domains and motifs in functional prediction
There are four levels of structure in proteins
binding sites 58 of the 473 unambiguously assigned phosphorylation sites are predicted by Scansite to be sites for binding. 50 of these correspond.
Sequence alignment of C-terminal phosphorylated plant aquaporins
Protein Bioinformatics Course
Phosphorylation and sequence disorder in microtubule-associated protein Tau.A, schematic illustration of the domain profile of Tau with all known phosphorylation.
Ligand Docking to MHC Class I Molecules
Relationship between Genotype and Phenotype
Structure and Intrinsic Disorder in Protein Autoinhibition
Phosphopeptides identified harboring minimal binding motifs
Different Genes ~ Protein Primary Structure
Significantly enriched phosphorylation motifs from up-regulated phosphopeptides by Motif-X analysis. Significantly enriched phosphorylation motifs from.
A Million Peptide Motifs for the Molecular Biologist
N-terminal extension of a gene using peptides mapping upstream to an annotated start site. N-terminal extension of a gene using peptides mapping upstream.
Nora Pierstorff Dept. of Genetics University of Cologne
TGFβ Signaling in Growth Control, Cancer, and Heritable Disorders
Deep Learning in Bioinformatics
Relationship between Genotype and Phenotype
Segment 5 Molecular Biology Part 1b
Phosphopeptides identified harboring minimal binding motifs
Survey of phosphorylation motifs
Model of the change in receptor structure on engagement of the ligand IFN-γ. Model of the change in receptor structure on engagement of the ligand IFN-γ.
Presentation transcript:

PART II. Prediction of functional regions within disordered proteins Zsuzsanna Dosztányi MTA-ELTE Momentum Bioinformatics Group Department of Biochemistry Eotvos Lorand University, Budapest, Hungary

Protein disorder is prevalent LDR (40<) protein, % kingdom B A E

Protein disorder is functional Xie H et al. J Proteome Res. 2007, 6,

Protein disorder is important Prion proteinPrion disease CFTRCystic fibrosis  Alzheimer’s  -synucleinParkinson’s p53, BRCA1cancer

Functions of intrinsically disordered proteins I Entropic chains ‏ II Linkers III Molecular recognition IV Protein modifications (e.g. phosphorylation) V Assembly of large multiprotein complexes

Interactions of IDPs Complex between p53 and MDM2 Coupled folding and binding

Interactions of proteins

Coupled folding and binding Functional advantages Weak transient, yet specific interactions Post-translational modifications Flexible binding regions that can overlap Evolutionary plasticity Signaling Regulation

Coupled folding and binding Experimental difficulties Highly flexible Weak interactions Short half-life Complexes of IDPs in the PDB:~ 200 Known instances:~ 2,000 Estimated number of such interactions in the human proteome:~ 100,000

Interactions of IDPs F19, W23 and L26

p27 Cyclin-dependent kinase (Cdk) inhibitor, p27Kip1 (p27) Cell cycle regulation Binds to cdk-cyclin komplex and inhibitis their activity Fully disordered protein

p27

Partial unfolding enables the phosphorylation of Tyr88, starting a series of signaling events that leads to the beginning of S phase.

Prediction of functional regions within IDPs Disordered binding regions (ANCHOR) Linear motifs (ELM, SlimPred) Morfs (Morfpred) Specific conservation pattern

Disordered protein complexes Complex between p53 and MDM2 Interaction sites are usually linear (consist of only 1 part) enrichment of interaction prone amino acids Sequence Binding sites No need for structure, binding sites can be predicted from sequence alone

Prediction of disordered binding regions – ANCHOR  What discriminates disordered binding regions? A cannot form enough favorable interactions with their sequential environment It is favorable for them to interact with a globular protein  Based on simplified physical model Based on an energy estimation method using statistical potentials Captures sequential context

ANCHOR Human p53 C –terminal region

ANCHOR and linear motifs NCOA2 transcription co-activator The regions between is disordered Contains 3 receptor binding motifs: xLxxLLx (LIG_NRBOX)

LMs and Disordered Binding Regions Linear motifs and disordered binding regions often overlap Complementary approaches Prediction of disordered binding regions can help to increase likelihood of true instances

LMs and Disordered Binding Regions Linear motifs and disordered binding regions often overlap Complementary approaches Prediction of disordered binding regions can help to increase likelihood of true instances

Machine learning approaches SlimPred: trained on ELM database Morfpred: trained on short chains in complex Very small datasets Negative datasets

Conservation

Conservation patterns of linear motifs No evolutionary constraints to keep the structure Strong constraints on functional site Island-like conservation

SlimPrints Generates sequence alignments of orthologous sequences Relative conservation score per position Filters out less reliable regions Fails if sequences are too divergent, or too similar