Protein and Function Databases



1 Tutorial 9: Protein and Function Databases

2 Protein and Function Databases
UniProt (Swiss-Prot/TrEMBL), PROSITE, Pfam, Gene Ontology, DAVID

3 Glossary
Domain: A structural unit which can be found in multiple protein contexts.

4 Glossary
Repeat: A short unit which is unstable in isolation but forms a stable structure when multiple copies are present.
Family: A collection of related proteins.

5 UniProt
The Universal Protein Resource (UniProt) is a central repository of protein sequences, functions, classifications and cross-references. It was created by combining the information contained in Swiss-Prot and TrEMBL.

6 UniProt input
Protein search, reviewed protein

7 UniProt output
Sequence download, accession number, protein status, organism, length
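
The fields listed on this slide (accession number, protein status, organism, length) can also be read from a downloaded FASTA record. The following is a minimal sketch, assuming the standard UniProtKB FASTA header layout, in which the prefix `sp` marks a reviewed (Swiss-Prot) entry and `tr` an unreviewed (TrEMBL) one. The function name `parse_uniprot_fasta` is hypothetical, and the sequence lines in the example are a short illustrative fragment, not the full entry.

```python
import re

# UniProtKB FASTA header: >db|Accession|EntryName ProteinName OS=Organism OX=TaxonId ...
HEADER_RE = re.compile(
    r"^>(?P<db>sp|tr)\|(?P<accession>[^|]+)\|(?P<entry_name>\S+)\s+"
    r"(?P<protein_name>.+?)\s+OS=(?P<organism>.+?)\s+OX=(?P<taxon_id>\d+)"
)


def parse_uniprot_fasta(text):
    """Parse one FASTA record (hypothetical helper, single record only)."""
    lines = [line.strip() for line in text.strip().splitlines()]
    match = HEADER_RE.match(lines[0])
    if match is None:
        raise ValueError("not a UniProtKB FASTA header")
    record = match.groupdict()
    # "sp" = Swiss-Prot (reviewed), "tr" = TrEMBL (unreviewed)
    record["status"] = (
        "reviewed (Swiss-Prot)" if record["db"] == "sp" else "unreviewed (TrEMBL)"
    )
    sequence = "".join(lines[1:])
    record["sequence"] = sequence
    record["length"] = len(sequence)
    return record


# Illustrative record: the sequence is truncated for the example.
example = """>sp|P37840|SYUA_HUMAN Alpha-synuclein OS=Homo sapiens OX=9606 GN=SNCA PE=1 SV=1
MDVFMKGLSKAKEGVVAAAE
KTKQGVAEAAGKTKEGVLYV
"""

record = parse_uniprot_fasta(example)
print(record["accession"], record["status"], record["organism"], record["length"])
```

The length printed here is that of the truncated fragment only; a real download would give the full sequence.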

8 Information for one protein
General information annotations

9 General keywords
GO annotation (MF: molecular function, BP: biological process, CC: cellular component)

10 Alternative splicing isoforms; features in the sequence

11 Sequences; references

12 Alignment for two or more proteins

13 MSA

14 Blast

15 Pfam Pfam is a database of multiple alignments of protein domains or conserved protein regions.

16 What kind of domains can we find in Pfam?
Trusted domains
Repeats
Fragment domains
Nested domains
Disulfide bonds
Important residues (e.g., active sites)
Transmembrane domains

17 What kind of domains can we find in Pfam? (continued)
Context domains: domains that, despite not scoring above the family threshold, are expected to be real based on the other domains found in the protein.
Signal peptides: indicate a protein that will be secreted.
Low-complexity regions
Coiled coils: two or three alpha helices that wind around each other.

18

19 Pfam input

20 Domains Domain range and score

21 Description Structure info Gene Ontology Links

22

23 PROSITE
PROSITE is a database of protein domains and motifs that can be searched using either regular-expression-like patterns or sequence profiles.
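
PROSITE patterns use their own notation: elements are separated by `-`, `x` matches any residue, `[ST]` lists allowed residues, `{P}` lists forbidden residues, `(n)` or `(n,m)` gives a repeat count, and `<` / `>` anchor the pattern to the N- or C-terminus. The sketch below translates that notation into a Python regular expression. The function names are hypothetical, the test sequence is made up, and the rarer case of `>` inside square brackets is deliberately not handled.

```python
import re


def prosite_to_regex(pattern):
    """Translate a PROSITE pattern such as 'N-{P}-[ST]-{P}' into a Python regex."""
    pattern = pattern.rstrip(".")  # PROSITE patterns often end with a period
    parts = []
    for element in pattern.split("-"):
        prefix = suffix = ""
        if element.startswith("<"):  # N-terminal anchor
            prefix, element = "^", element[1:]
        if element.endswith(">"):  # C-terminal anchor
            suffix, element = "$", element[:-1]
        repeat = ""
        if element.endswith(")"):  # repeat count: (3) or (2,4)
            element, _, count = element[:-1].partition("(")
            repeat = "{" + count + "}"
        if element == "x":  # any residue
            core = "."
        elif element.startswith("{"):  # forbidden residues
            core = "[^" + element[1:-1] + "]"
        else:  # a single residue or an allowed set like [ST]
            core = element
        parts.append(prefix + core + repeat + suffix)
    return "".join(parts)


def find_motif(pattern, sequence):
    """Return 1-based start positions of non-overlapping matches."""
    regex = re.compile(prosite_to_regex(pattern))
    return [m.start() + 1 for m in regex.finditer(sequence)]


# N-glycosylation site pattern, scanned against a made-up sequence.
glyco = "N-{P}-[ST]-{P}"
print(prosite_to_regex(glyco))
print(find_motif(glyco, "MKNGSAANPTL"))
```

In the made-up sequence, the second `N` is followed by `P`, which the `{P}` element forbids, so only the first site is reported.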

24

25 Search Results Domains architecture

26

27 Gene Ontology (GO)
GO is a database of biological processes, molecular functions and cellular components. GO contains neither sequence information nor gene or protein descriptions; instead, it is linked to gene and protein databases. GO is structured hierarchically, as a directed acyclic graph rather than a strict tree.

28 Search by AmiGO

29 Three principal branches

30 Directed Acyclic Graph
The GO structure is a directed acyclic graph (DAG): unlike in a tree, a term can have more than one parent.
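
The difference between a DAG and a tree can be illustrated with a toy ontology. The sketch below is only an illustration: the term names "term X" and "term Y" and the `PARENTS` table are hypothetical, not real GO content. It collects all ancestors of a term by walking every parent link, which is what makes a multi-parent term inherit from several branches at once.

```python
# Hypothetical toy ontology: each term maps to the list of its parent terms.
PARENTS = {
    "root": [],
    "binding": ["root"],
    "catalytic activity": ["root"],
    "term X": ["binding", "catalytic activity"],  # two parents, so not a tree
    "term Y": ["term X"],
}


def ancestors(term, parents):
    """Return every term reachable by following parent links upward."""
    seen = set()
    stack = list(parents[term])
    while stack:
        current = stack.pop()
        if current not in seen:  # a shared ancestor is visited only once
            seen.add(current)
            stack.extend(parents[current])
    return seen


print(sorted(ancestors("term Y", PARENTS)))
```

Because annotations propagate upward, a gene annotated with "term Y" in this toy example would implicitly carry both "binding" and "catalytic activity" as well.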

31 GO sources (evidence codes)
ISS: Inferred from Sequence/Structural Similarity
IDA: Inferred from Direct Assay
IPI: Inferred from Physical Interaction
TAS: Traceable Author Statement
NAS: Non-traceable Author Statement
IMP: Inferred from Mutant Phenotype
IGI: Inferred from Genetic Interaction
IEP: Inferred from Expression Pattern
IC: Inferred by Curator
ND: No Data available
IEA: Inferred from Electronic Annotation
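
Evidence codes are often used to filter annotations, for example to keep only those backed by an experiment and to set aside electronic (IEA) ones. The sketch below assumes annotations are available as simple (gene, term, code) tuples; the gene and term names are placeholders, and the grouping of IDA, IPI, IMP, IGI and IEP as experimental codes follows the usual GO classification.

```python
# Evidence codes from the slide, keyed by abbreviation.
EVIDENCE_CODES = {
    "ISS": "Inferred from Sequence/Structural Similarity",
    "IDA": "Inferred from Direct Assay",
    "IPI": "Inferred from Physical Interaction",
    "TAS": "Traceable Author Statement",
    "NAS": "Non-traceable Author Statement",
    "IMP": "Inferred from Mutant Phenotype",
    "IGI": "Inferred from Genetic Interaction",
    "IEP": "Inferred from Expression Pattern",
    "IC": "Inferred by Curator",
    "ND": "No Data available",
    "IEA": "Inferred from Electronic Annotation",
}

# Codes that rest on a direct experiment.
EXPERIMENTAL = {"IDA", "IPI", "IMP", "IGI", "IEP"}

# Placeholder annotations: (gene, GO term, evidence code).
annotations = [
    ("geneA", "term 1", "IDA"),
    ("geneA", "term 2", "IEA"),
    ("geneB", "term 1", "ISS"),
    ("geneB", "term 3", "IMP"),
]


def experimental_only(rows):
    """Keep annotations whose evidence code is experimental."""
    return [row for row in rows if row[2] in EXPERIMENTAL]


for gene, term, code in experimental_only(annotations):
    print(gene, term, code, "-", EVIDENCE_CODES[code])
```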

32 Results for alpha-synuclein

33 DAVID: Functional Annotation Bioinformatics Microarray Analysis
Identify enriched biological themes, particularly GO terms
Discover enriched functionally related gene/protein groups
Cluster redundant annotation terms
Explore gene names in batch
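
"Enriched" here means that a term occurs in the submitted gene list more often than expected by chance, given how common it is in the background population. The sketch below computes a plain one-sided hypergeometric tail probability to illustrate the idea. It is not DAVID's own statistic (DAVID reports the EASE score, a modified Fisher exact test), and the function name and the numbers in the example are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(list_hits, list_size, pop_hits, pop_size):
    """P(X >= list_hits) when list_size genes are drawn without replacement
    from pop_size genes, of which pop_hits are annotated with the term."""
    total = comb(pop_size, list_size)
    upper = min(list_size, pop_hits)
    tail = sum(
        comb(pop_hits, k) * comb(pop_size - pop_hits, list_size - k)
        for k in range(list_hits, upper + 1)
    )
    return tail / total


# Made-up numbers: 10 background genes, 3 carry the term,
# and all 3 genes in the submitted list carry it.
p_value = hypergeom_enrichment_p(list_hits=3, list_size=3, pop_hits=3, pop_size=10)
print(p_value)
```

With real data, one such test is run per term, so the resulting p-values need a multiple-testing correction before interpretation.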

34 Annotation, classification, ID conversion

35 Functional annotation
Upload Annotation options

36

