Biology 224 Instructor: Tom Peavy Feb 21 & 26, 2008 <Images from Bioinformatics and Functional Genomics by Jonathan Pevsner> Protein Structure & Analysis.

1 Biology 224 Instructor: Tom Peavy Feb 21 & 26, 2008 <Images from Bioinformatics and Functional Genomics by Jonathan Pevsner> Protein Structure & Analysis

2 Protein
Protein families
Protein function
Physical properties
Protein localization
Gene ontology (GO):
--cellular component
--biological process
--molecular function

3 Protein domains, motifs & signatures

4 Definitions
Signature: a protein category such as a domain or motif
Domain: a region of a protein that can adopt a 3D structure
--a fold
--a family is a group of proteins that share a domain
--examples: zinc finger domain, immunoglobulin domain
Motif (or fingerprint): a short, conserved region of a protein, typically 10 to 20 contiguous amino acid residues

5 Definition of a domain
According to InterPro at EBI (http://www.ebi.ac.uk/interpro/): A domain is an independent structural unit, found alone or in conjunction with other domains or repeats. Domains are evolutionarily related.
According to SMART (http://smart.embl-heidelberg.de): A domain is a conserved structural entity with distinctive secondary structure content and a hydrophobic core. Homologous domains with common functions usually show sequence similarities.

6 15 most common domains (human)
Domain                          Proteins
Zn finger, C2H2 type            1093
Immunoglobulin                  1032
EGF-like                        471
Zn-finger, RING                 458
Homeobox                        417
Pleckstrin-like                 405
RNA-binding region RNP-1        400
SH3                             394
Calcium-binding EF-hand         392
Fibronectin, type III           300
PDZ/DHR/GLGF                    280
Small GTP-binding protein       261
BTB/POZ                         236
bHLH                            226
Cadherin                        226

7 Varieties of protein domains
--Extending along the length of a protein
--Occupying a subset of a protein sequence
--Occurring one or more times

8 Example of a protein with domains: Methyl CpG binding protein 2 (MeCP2)
[Diagram: MeCP2 with its MBD and TRD regions labeled]
The protein includes a methylated DNA binding domain (MBD) and a transcriptional repression domain (TRD). MeCP2 is a transcriptional repressor. Mutations in the gene encoding MeCP2 cause Rett Syndrome, a neurological disorder that primarily affects girls.

9 Result of an MeCP2 blastp search: A methyl-binding domain shared by several proteins

10 Are proteins that share only a domain homologous?

11 Proteins can have both domains and patterns (motifs)
--Domain (aspartyl protease)
--Domain (reverse transcriptase)
--Pattern (several residues)
--Pattern (several residues)

12 SwissProt entry for HIV-1 pol links to many databases

14 http://www.ebi.ac.uk/Databases/
ExPASy Proteomics Server: The ExPASy (Expert Protein Analysis System) proteomics server of the Swiss Institute of Bioinformatics (SIB) is dedicated to the analysis of protein sequences and structures as well as 2-D PAGE.
http://ca.expasy.org/
InterPro is a database of protein families, domains and functional sites in which identifiable features found in known proteins can be applied to unknown protein sequences.
http://www.ebi.ac.uk/interpro/

15 PROSITE: Database of protein families and domains
http://ca.expasy.org/prosite/
Pfam is a large collection of multiple sequence alignments and hidden Markov models covering many common protein domains.
http://www.sanger.ac.uk/Software/Pfam/index.shtml
PRINTS is a compendium of protein fingerprints.
http://umber.sbs.man.ac.uk/dbbrowser/PRINTS/
The ProDom protein domain database consists of an automatic compilation of homologous domains.
http://prodes.toulouse.inra.fr/prodom/current/html/home.php

16 SMART (a Simple Modular Architecture Research Tool) allows the identification and annotation of genetically mobile domains and the analysis of domain architectures.
http://smart.embl-heidelberg.de/
PIR (the Protein Information Resource) houses the PIRSF, iProClass and iProLINK databases.
http://pir.georgetown.edu/

17 Protein family classification and databases
PIRSF          http://pir.georgetown.edu/iproclass/
TIGRFAMs       http://www.tigr.org/TIGRFAMs/index.shtml
SUPERFAMILY    http://supfam.mrc-lmb.cam.ac.uk/SUPERFAMILY/
Gene3D         http://www.biochem.ucl.ac.uk/bsm/cath/Gene3D/
PANTHER        http://www.pantherdb.org/

19 Definition of a motif
A motif (or fingerprint) is a short, conserved region of a protein. Its size is often 10 to 20 amino acids.
Simple motifs include transmembrane domains and phosphorylation sites. These do not imply homology when found in a group of proteins.
In PROSITE, a pattern is a qualitative motif description (a protein either matches a pattern, or not).
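
Because a PROSITE pattern is a yes-or-no description, it maps naturally onto a regular expression. The following is a minimal sketch, not PROSITE's own scanning software: the function names `prosite_to_regex` and `find_matches` and the test sequence are hypothetical, introduced here only for illustration. The pattern shown is the N-glycosylation site pattern N-{P}-[ST]-{P}, in which {P} means "any residue except proline" and [ST] means "serine or threonine".

```python
import re
from typing import List, Tuple


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style pattern into a Python regular expression."""
    pieces = []
    for element in pattern.strip().rstrip(".").split("-"):
        # '<' anchors the match at the N-terminus, '>' at the C-terminus.
        start = "^" if element.startswith("<") else ""
        end = "$" if element.endswith(">") else ""
        element = element.strip("<>")

        # Split off an optional repeat count such as x(3) or x(2,4).
        match = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        core, low, high = match.groups()

        if core == "x":                 # x = any residue
            core = "."
        elif core.startswith("{"):      # {P} = any residue except P
            core = "[^" + core[1:-1] + "]"
        # A single letter or a [ST]-style class is already valid regex.

        repeat = ""
        if low is not None:
            repeat = "{" + low + ("," + high if high else "") + "}"
        pieces.append(start + core + repeat + end)
    return "".join(pieces)


def find_matches(pattern: str, sequence: str) -> List[Tuple[int, str]]:
    """Return (1-based start position, matched residues) for every hit."""
    # The lookahead lets overlapping hits be reported as well.
    regex = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
    return [(m.start() + 1, m.group(1)) for m in regex.finditer(sequence)]


# Hypothetical test sequence, not a real protein.
hits = find_matches("N-{P}-[ST]-{P}.", "MKNGSAANPTQNYTW")
print(hits)  # [(3, 'NGSA'), (12, 'NYTW')]
```

The site at position 8 (N-P-T) is not reported because the residue after N is a proline. This sketch covers only the common pattern elements; less common syntax, such as an anchor placed inside square brackets, is not handled.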

20 Gene Ontology (GO) Consortium

21 The Gene Ontology Consortium
An ontology is a description of concepts. The GO Consortium compiles a dynamic, controlled vocabulary of terms related to gene products.
There are three organizing principles:
--Molecular function
--Biological process
--Cellular component

22 Example
Gene product: cytochrome c
GO entry terms:
--molecular function = electron transporter activity
--biological process = oxidative phosphorylation and induction of cell death
--cellular component = mitochondrial matrix and mitochondrial inner membrane
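
One way to picture a GO entry is as a record with one list of terms per organizing principle. The sketch below is illustrative only: the `GOAnnotation` class and its field names are hypothetical and do not correspond to any GO file format or API. The terms themselves are copied from the cytochrome c example on the slide.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class GOAnnotation:
    """Terms for one gene product, grouped by the three GO principles."""
    gene_product: str
    molecular_function: List[str] = field(default_factory=list)
    biological_process: List[str] = field(default_factory=list)
    cellular_component: List[str] = field(default_factory=list)

    def terms_by_aspect(self) -> Dict[str, List[str]]:
        """Return the terms keyed by the name of each organizing principle."""
        return {
            "molecular function": self.molecular_function,
            "biological process": self.biological_process,
            "cellular component": self.cellular_component,
        }


# Terms as listed on the slide for cytochrome c.
cytochrome_c = GOAnnotation(
    gene_product="cytochrome c",
    molecular_function=["electron transporter activity"],
    biological_process=["oxidative phosphorylation", "induction of cell death"],
    cellular_component=["mitochondrial matrix", "mitochondrial inner membrane"],
)

for aspect, terms in cytochrome_c.terms_by_aspect().items():
    print(aspect + ": " + "; ".join(terms))
```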

23 GO consortium (http://www.geneontology.org)
No centralized GO database. Instead, curators of organism-specific databases assign GO terms to gene products for each organism.
AmiGO is the searchable portion of the GO
--Gene symbol, name, UniProt accession numbers, and text searches can be used to find GO entries

24 The Gene Ontology Consortium: Evidence Codes
IC     Inferred by curator
IDA    Inferred from direct assay
IEA    Inferred from electronic annotation
IEP    Inferred from expression pattern
IGI    Inferred from genetic interaction
IMP    Inferred from mutant phenotype
IPI    Inferred from physical interaction
ISS    Inferred from sequence or structural similarity
NAS    Non-traceable author statement
ND     No biological data
TAS    Traceable author statement
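
Evidence codes are useful for filtering annotations by how well supported they are. The sketch below stores the codes from the slide in a lookup table and keeps only annotations backed by experimental evidence. The grouping of IDA, IEP, IGI, IMP and IPI as experimental follows the GO Consortium's evidence code documentation rather than the slide, and the function names and the placeholder annotations are hypothetical.

```python
from typing import List, Tuple

# Codes and descriptions as listed on the slide.
EVIDENCE_CODES = {
    "IC": "Inferred by curator",
    "IDA": "Inferred from direct assay",
    "IEA": "Inferred from electronic annotation",
    "IEP": "Inferred from expression pattern",
    "IGI": "Inferred from genetic interaction",
    "IMP": "Inferred from mutant phenotype",
    "IPI": "Inferred from physical interaction",
    "ISS": "Inferred from sequence or structural similarity",
    "NAS": "Non-traceable author statement",
    "ND": "No biological data",
    "TAS": "Traceable author statement",
}

# Assumption: this grouping comes from GO documentation, not from the slide.
EXPERIMENTAL = {"IDA", "IEP", "IGI", "IMP", "IPI"}


def describe(code: str) -> str:
    """Look up the description of an evidence code (case-insensitive)."""
    return EVIDENCE_CODES[code.upper()]


def keep_experimental(annotations: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Keep only (term, evidence code) pairs with experimental support."""
    return [(term, code) for term, code in annotations if code in EXPERIMENTAL]


# Placeholder annotations for illustration; not taken from any database.
example = [("term A", "IDA"), ("term B", "IEA"), ("term C", "IMP"), ("term D", "TAS")]
print(keep_experimental(example))  # [('term A', 'IDA'), ('term C', 'IMP')]
print(describe("iea"))             # Inferred from electronic annotation
```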
