Study of the DcpS Family. March 5th 2009. Structural Bioinformatics, MSc BIOINFO, UPF. Salvador Jesús Capella Gutiérrez, Juan Ramón Meneu Hernández, Rut Carolina Morata Gil.

Study of the DcpS Family
March 5th 2009
Structural Bioinformatics, MSc BIOINFO, UPF
Salvador Jesús Capella Gutiérrez, Juan Ramón Meneu Hernández, Rut Carolina Morata Gil

Main Scheme
1) Main Approaches
   a) Initial Purpose: P-Bodies
   b) Second Purpose: HIT Family
   c) Final Purpose: DcpS Family
2) DcpS Family
   a) Biological Aspects
   b) Basic Analysis
   c) Extended Analysis

1st PART: Main Approaches

Initial Purpose: P-Bodies
P-Bodies are discrete cytoplasmic domains where several proteins involved in mRNA degradation, translational repression and some other related functions colocalize.

Initial Purpose: P-Bodies
MAIN PROBLEM: P-Bodies are constituted by proteins belonging to different families.
SOLUTION: Look for the protein belonging to the family with the most documented structures: the HIT Family.
Pfam table columns: Protein, Family, Interactions, Species, Structures. Row: DcpS, HIT.

Second Purpose: HIT Family
Histidine Triad / HIT Motif / HIT hexapeptide
H φ H φ H φ φ, where φ = hydrophobic residue
>swissprot|Q96C86|DCPS_HUMAN
MADAAPQLGKRKRELDVEEAHAASTEEKEAGVGNGTCAPVRLPFSGFRLQKVLRESARDK
IIFLHGKVNEASGDGDGEDAVVILEKTPFQVEQVAQLLTGSPELQLQFSNDIYSTYHLFP
PRQLNDVKTTVVYPATEKHLQKYLRQDLRLIRETGDDYRNITLPHLESQSLSIQWVYNIL
DKKAEADRIVFENPDPSDGFVLIPDLKWNQQQLDDLYLIAICHRRGIRSLRDLTPEHLPL
LRNILHQGQEAILQRYRMKGDHLRVYLHYLPSYYHLHVHFTALGFEAPGSGVERAHLLAE
VIENLECDPRHYQQRTLTFALRADDPLLKLLQEAQQS
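The histidine triad can be located in the DCPS_HUMAN sequence above with a simple pattern search. The sketch below is only illustrative: the set of residues treated as hydrophobic and the restriction to the H-φ-H-φ-H core are assumptions made here, and the function name is hypothetical, not part of the original analysis.

```python
import re

# DCPS_HUMAN (Q96C86) as shown on the slide, 337 residues.
DCPS_HUMAN = (
    "MADAAPQLGKRKRELDVEEAHAASTEEKEAGVGNGTCAPVRLPFSGFRLQKVLRESARDK"
    "IIFLHGKVNEASGDGDGEDAVVILEKTPFQVEQVAQLLTGSPELQLQFSNDIYSTYHLFP"
    "PRQLNDVKTTVVYPATEKHLQKYLRQDLRLIRETGDDYRNITLPHLESQSLSIQWVYNIL"
    "DKKAEADRIVFENPDPSDGFVLIPDLKWNQQQLDDLYLIAICHRRGIRSLRDLTPEHLPL"
    "LRNILHQGQEAILQRYRMKGDHLRVYLHYLPSYYHLHVHFTALGFEAPGSGVERAHLLAE"
    "VIENLECDPRHYQQRTLTFALRADDPLLKLLQEAQQS"
)

# Assumed hydrophobic residue set (phi); other definitions exist.
HYDROPHOBIC = "AILMFVWY"

# Core of the histidine triad: H-phi-H-phi-H.
HIT_CORE = re.compile(f"H[{HYDROPHOBIC}]H[{HYDROPHOBIC}]H")


def find_hit_core(sequence):
    """Return (1-based start, matched text) for each H-phi-H-phi-H hit."""
    return [(m.start() + 1, m.group()) for m in HIT_CORE.finditer(sequence)]


if __name__ == "__main__":
    for start, text in find_hit_core(DCPS_HUMAN):
        print(start, text)
```

Under these assumptions the search reports a single hit in the human sequence, HLHVH starting at residue 275. A stricter or looser definition of φ would change which candidates are accepted.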

Second Purpose: HIT Family
Search workflow (TARGET: DCPS_Human):
  BLAST against UniRef100: PSSM matrix
  PSSM matrix + target: BLAST against PDB
  HMMPfam against Pfam: DcpS Family
  HMMFetch against Pfam: HMM matrix
  HMM matrix + target: HMMSearch against PDB
  Results: filter out mutated sequences, select consensus sequences

Mutagenesis Studies: H φ (H => N) φ H φ φ

Same Sequence. Different Substrates.

Second Purpose: HIT Family
9 final structures for the HIT Family

Filter out mutated sequences. Select consensus sequences.
Alignment tools: ClustalW, T-Coffee, STAMP, ALIGNFIT + STAMP

Second Purpose: HIT Family
9 final structures for the HIT Family
The superposition turned out to be a real mess!
[Superposition panels: ALIGNFIT + STAMP, STAMP, ALIGNFIT + STAMP]

Second Purpose: HIT Family
Histidine Triad / HIT Motif / HIT hexapeptide
H φ H φ H φ φ, where φ = hydrophobic residue
HIT Superfamily branches: Fhit, DcpS, Hint
Within each branch => high degree of conservation among proteins
Between branches => the HIT MOTIF is the only region absolutely conserved
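The statement that only the HIT motif is absolutely conserved between branches amounts to looking for alignment columns in which every sequence carries the same residue. A minimal sketch of that check follows; the three aligned fragments are hypothetical toy data, not real Fhit, DcpS or Hint sequences, and the function name is an assumption made for illustration.

```python
def conserved_columns(alignment):
    """Return 0-based indices of columns where all sequences share one residue."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    return [
        i
        for i in range(length)
        # A column made only of gaps is not counted as conserved.
        if alignment[0][i] != "-" and len({seq[i] for seq in alignment}) == 1
    ]


# Hypothetical aligned fragments, one per branch (toy data only).
toy_alignment = [
    "AHLHVHFT",
    "GHVHLHIL",
    "SHIHVHVM",
]

print(conserved_columns(toy_alignment))
```

In this toy alignment only the three histidine columns are identical in every sequence, which mirrors the pattern described on the slide.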

Final Purpose: DcpS Family
9 final structures for the HIT Family
The superposition turned out to be a real mess!
=> DcpS Family

2nd PART: DcpS Family

Biological Aspects
Main degradation pathway
In mammals, this family contains only one member, DcpS, which:
  Stands for “scavenger mRNA decapping enzyme”
  Hydrolyses the residual cap structure following 3' to 5' mRNA decay
  Is the first member of the HIT family of proteins with a defined biological function

Biological Aspects
DcpS shares functional similarity with Dcp2.

Biological Aspects
Mutations in the HIT motif lead to complete loss of function.
The region (binding site) is critical for decapping activity.

Basic Analysis
Several orthologues have been studied: Human (Homo sapiens), Mouse (Mus musculus), Yeast (Sacch. cerevisiae), Rat (Rattus norvegicus), Bovine (Bos taurus), Pig (Sus scrofa)
Groups: TEMPLATES (PDB Structure); TARGET SEQUENCE (no PDB Structure)

Basic Analysis
TEMPLATES (PDB Structure): 1XMM Human (Homo sapiens) + 1VLR Mouse (Mus musculus)
86% sequence identity
Superposition: STAMP

Basic Analysis
TEMPLATES (PDB Structure): 1XMM Human (Homo sapiens) + 1VLR Mouse (Mus musculus)
86% sequence identity
Superposition: DALI

Basic Analysis
TEMPLATES (PDB Structure): 1XMM Human (Homo sapiens) + 1VLR Mouse (Mus musculus)
86% sequence identity
Superposition: SAP
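A figure such as "86% sequence identity" is the share of aligned positions that carry the same residue in both sequences. The sketch below shows one common way to compute it; the choice of denominator (columns without gaps) is an assumption, since tools differ on this point, and the function name is hypothetical. The mouse sequence is not given in the slides, so the example uses the first ten residues of DCPS_HUMAN and DCPS_PIG as they appear in this presentation.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns in which both sequences have a residue.
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)


# First ten residues of the human and pig DcpS sequences from the slides.
human_start = "MADAAPQLGK"
pig_start = "MADTAPQPSK"

print(percent_identity(human_start, pig_start))
```

Seven of these ten positions match, so this short fragment gives 70%. The value for full-length sequences depends on the alignment used.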

Basic Analysis
TARGET SEQUENCE (no PDB Structure)
>sp|Q8MIZ3|DCPS_PIG Scavenger mRNA-decapping enzyme DcpS OS=Sus scrofa GN=DCPS PE=2 SV=1
MADTAPQPSKRKRERDPEEAEAPSTEEKEARVGNGTSAPVRLPFSGFRVKKVLRESARDK
IIFLHGKVNEASGDGDGEDAIVILEKTPFQVDQVAQLLMGSPELQLQFSNDIYSTYHLFP
PRQLSDVKTTVVYPATEKHLQKYLHQDLHLVRETGGDYKNITLPHLESQSLSIQWVYNIL
DKKAEADRIVFENPDPSDGFVLIPDLKWNQKQLDDLYLIAICHRRGIKSLRDLTPEHLPL
LRNILREGQEAILQRYQVTGDRLRVYLHYLPSYYHLHVHFTALGFEAPGAGVERAHLLAE
VIENLEQDPEHYQRRTLTFALRADDPLLTLLQEAQRS
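The header line of the target entry packs several fields together: database, accession, entry name, protein name, organism (OS), gene name (GN), protein existence level (PE) and sequence version (SV). A minimal sketch of splitting it apart is shown below. The function name and the returned dictionary layout are assumptions made for illustration, and the parsing only covers headers shaped like the one on the slide.

```python
import re

HEADER = (
    ">sp|Q8MIZ3|DCPS_PIG Scavenger mRNA-decapping enzyme DcpS "
    "OS=Sus scrofa GN=DCPS PE=2 SV=1"
)


def parse_uniprot_header(header):
    """Split a UniProt-style FASTA header into its main fields."""
    db, accession, rest = header.lstrip(">").split("|", 2)
    entry_name, _, description = rest.partition(" ")
    # Two-letter KEY=value fields (OS, GN, PE, SV) follow the protein name.
    fields = dict(re.findall(r"\b([A-Z]{2})=(.+?)(?=\s[A-Z]{2}=|$)", description))
    # Everything before the first KEY= field is the protein name.
    protein_name = re.split(r"\s[A-Z]{2}=", description, maxsplit=1)[0]
    return {
        "db": db,
        "accession": accession,
        "entry_name": entry_name,
        "protein_name": protein_name,
        **fields,
    }


if __name__ == "__main__":
    for key, value in parse_uniprot_header(HEADER).items():
        print(f"{key}: {value}")
```

Keeping the accession and organism as separate fields makes it easier to tell the target (pig) apart from the human and mouse templates when several sequences are handled together.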

Basic Analysis
TARGET DCPS_Pig + DcpS Templates (2)
Alignment tools: ClustalW, T-Coffee, STAMP

Basic Analysis
SWISS-MODEL

Basic Analysis
STAMP

Extended Analysis
DcpS protein, HUMAN (337 AA):
  DcpS N-terminal domain
  DcpS C-terminal domain
  HIT domain
  HIT MOTIF

Extended Analysis
DcpS dimer in complex with m7GpppG, where:
  N-terminal domain: swapped dimer
  C-terminal domain: dimer
[Figure labels: Chain A, Chain B]

Extended Analysis
After analysing the HIT domain, we go deep into the DcpS N-terminal domain and the DcpS C-terminal domain to look for hints on the specificity of DcpS.
>swissprot|Q96C86|DCPS_HUMAN Scavenger mRNA-decapping enzyme DcpS;
MADAAPQLGKRKRELDVEEAHAASTEEKEAGVGNGTCAPVRLPFSGFRLQKVLRESARDK
IIFLHGKVNEASGDGDGEDAVVILEKTPFQVEQVAQLLTGSPELQLQFSNDIYSTYHLFP
PRQLNDVKTTVVYPATEKHLQKYLRQDLRLIRETGDDYRNITLPHLESQSLSIQWVYNIL
DKKAEADRIVFENPDPSDGFVLIPDLKWNQQQLDDLYLIAICHRRGIRSLRDLTPEHLPL
LRNILHQGQEAILQRYRMKGDHLRVYLHYLPSYYHLHVHFTALGFEAPGSGVERAHLLAE
VIENLECDPRHYQQRTLTFALRADDPLLKLLQEAQQS

Extended Analysis
We do the analysis through 3 different approaches:
  Sequence: BLAST
  Structure: DALI
  Literature: PubMed

Extended Analysis
DcpS C-terminal domain. Sequence: BLAST
Hits are members of the DcpS Family and predicted or putative proteins (not reviewed).
No new information related to other families.

Extended Analysis
DcpS C-terminal domain. Structure: DALI
All the results are members of the DcpS and HIT Families.
No new information related to other families.

Extended Analysis
DcpS C-terminal domain. Literature: PubMed
The C-Terminal DcpS Domain Is Related, but Distinct from the HIT Protein Family
“A DALI search of the Protein Data Bank revealed structural similarity between DcpS and a number of HIT proteins ...”
“... In addition and as noted above, the DcpS C-terminal domain is not sufficient for cap hydrolysis, indicating substantive differences between these protein families.”

Extended Analysis
DcpS C-terminal domain: summary of BLAST, DALI and PubMed
Basically members of the DcpS and HIT Families.
No new information related to other families.

Extended Analysis
DcpS N-terminal domain. Sequence: BLAST
Hits are members of the DcpS Family and predicted or putative proteins (not reviewed).
No new information related to other families.

Extended Analysis
DcpS N-terminal domain. Literature: PubMed
The N-terminal domain shares structural homology with NTF2-like proteins.
PDB search using DALI:
  Carotenoid binding protein (CBP)
  Metazoan mRNA export factor p15
  Yeast mRNA export factor Mex67

Extended Analysis
DcpS N-terminal domain. Literature: PubMed
[Figure: DALI structural comparison of DcpS with mRNA export factor p15, carotenoid binding protein (CBP) and mRNA export factor Mex67]

Extended Analysis
DcpS N-terminal domain. Literature: PubMed
These NTF2-like proteins form HETERODIMERS.

Extended Analysis
DcpS N-terminal domain. Literature: PubMed
DcpS dimer in complex with m7GpppG, where:
  N-terminal domain: swapped dimer
  C-terminal domain: dimer
[Figure labels: Chain A, Chain B]
DcpS forms swapped HOMODIMERS!

Extended Analysis
DcpS N-terminal domain. Literature: PubMed
“N-terminal domain shares structural homology to NTF2-like proteins …” BUT “… DcpS domain-swapped dimer has a unique topology and organization, which is different from either Mex67/Mtr2 or p15/TAP complexes”
“Mex67/Mtr2 and p15/TAP have been implicated in mRNA export pathways …” BUT “… Any functional significance to the DcpS N-terminal domain structure remains unclear.”

Extended Analysis
DcpS N-terminal domain. Structure: DALI
A PDB search using DALI shows some structural homology with proteins belonging to other families.
Further studies should be carried out.

Thank you. Any questions?