Functional Annotation of the Horse Genome


1 Functional Annotation of the Horse Genome
Genomic Annotation and Functional Modeling Workshop, Maxwell H. Gluck Equine Research Center, 15-16 November 2011

2 Genomic Annotation
Genome annotation is the process of attaching biological information to genomic sequences. It consists of two key steps: (1) identifying functional elements in the genome (“structural annotation”) and (2) attaching biological information to these elements (“functional annotation”). Biologists often use the term “annotation” when they are referring only to structural annotation.

3 Structural & Functional Annotation
Structural Annotation: Open reading frames (ORFs) are predicted during genome assembly; predicted ORFs require experimental confirmation.
Functional Annotation: Initially, predicted ORFs have no functional literature, and functional annotation relies on sequence analysis, etc. However, functional literature exists for many genes/proteins prior to genome sequencing, so functional annotation does not rely on a completed genome sequence. It relies on molecular data.

4 Genomic Annotation
Genomic annotation comprises structural annotation and functional annotation.

5 Genomic Annotation
Genomic annotation is distributed across different databases. Databases have different file formats, annotation pipelines, etc. Biologists don’t care (they just want their data). Problem: how to exchange and share genomic annotations from different databases?

6 Bio-ontologies
Ontologies were first used in biology to enable databases to share and exchange data. Bio-ontologies are used to capture biological information in a way that can be read by both humans and computers, to annotate data in a consistent way, and to allow data sharing across databases. These same features allow computational analysis of high-throughput “omics” datasets using ontologies.

7 Ontologies
An ontology defines terms and the relationships between terms. Each term has a digital identifier (for computers) and a description (for humans).
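
The three parts named on this slide (identifier, description, relationships) can be sketched as a small data structure. This is a minimal illustration, not how any GO tool stores its data: the `Term` class, the `ONTOLOGY` dictionary and the `ancestors` helper are names made up for this example, only `is_a` relationships are modelled, and the three term IDs and names are meant as examples to be checked against a current GO release.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    term_id: str                               # digital identifier (for computers)
    name: str                                  # description (for humans)
    is_a: list = field(default_factory=list)   # relationships to parent terms


# Three illustrative terms; verify IDs and names against a current GO release.
ONTOLOGY = {
    "GO:0008150": Term("GO:0008150", "biological_process"),
    "GO:0009987": Term("GO:0009987", "cellular process", ["GO:0008150"]),
    "GO:0007049": Term("GO:0007049", "cell cycle", ["GO:0009987"]),
}


def ancestors(term_id, ontology):
    """Return the IDs of every term reachable by following is_a links."""
    seen = set()
    stack = list(ontology[term_id].is_a)
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(ontology[parent].is_a)
    return seen


print(sorted(ancestors("GO:0007049", ONTOLOGY)))
```

Because relationships are stored explicitly, a gene product annotated to a specific term can also be counted under each of that term's ancestors, which is what makes ontology-based analysis of large data sets possible.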

8 Ontologies & Genomic Annotation
Structural Annotation: Open reading frames (ORFs) are predicted during genome assembly; predicted ORFs require experimental confirmation. Sequence Ontology Project (SO): provides a structured controlled vocabulary for the description of primary annotations of nucleic acid sequence.
Functional Annotation: Initially, predicted ORFs have no functional literature, and GO annotation relies on computational methods (rapid). Functional literature exists for many genes/proteins prior to genome sequencing. Gene Ontology (GO): annotation of gene product function.

9 Genomic Annotation
Structural annotation, including Sequence Ontology. Functional annotation using Gene Ontology. Other annotations using other bio-ontologies, e.g. Anatomy Ontology. Nomenclature (species’ genome nomenclature committees).

10 Horse GO Annotation
EBI GOA provides sequence-based GO annotation for all UniProtKB records, based upon analysis of functional motifs and domains. AgBase provides sequence-based GO annotation for horse protein records not in UniProtKB (e.g. NCBI predicted proteins). AgBase also provides sequence-based GO annotation for ESTs and mRNAs with no linked protein records if they are on a microarray (or by request). For gene products with published literature, AgBase provides experiment-based GO annotation.

11 Horse currently has 23,126 genes and 24,518 proteins in Entrez
(20,691 RefSeq proteins).

12 The EBI GO annotation project provides 29,413 GO annotations for 7,826 horse proteins in UniProtKB,
compared to 24,518 proteins in NCBI (20,691 RefSeq proteins).
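
To put these figures side by side, the arithmetic can be written out as below. This is only a rough comparison: the numbers are the ones quoted on the slides, the variable names are made up for the example, and UniProtKB and NCBI protein records are not the same set, so the fraction is an approximation rather than a true coverage figure.

```python
# Figures quoted on the slides.
go_annotations = 29_413      # GO annotations from the EBI GO annotation project
annotated_proteins = 7_826   # horse proteins in UniProtKB with GO annotation
ncbi_proteins = 24_518       # horse proteins in NCBI

# Average number of GO annotations per annotated protein.
annotations_per_protein = go_annotations / annotated_proteins

# Annotated UniProtKB proteins relative to the NCBI protein count (approximate).
annotated_fraction = annotated_proteins / ncbi_proteins

print(f"{annotations_per_protein:.2f} annotations per annotated protein")
print(f"{annotated_fraction:.1%} of the NCBI protein count")
```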

13 Ensembl total: 22,554 genes + 4,400 pseudogenes (26,954),
compared to 23,126 genes in NCBI.

14 Horse Genes
Genes unique to NCBI (3,955); genes with overlap (19,462); genes unique to Ensembl (26,954). These numbers are based upon published gene numbers from both NCBI and Ensembl. How important is it for the community to have a reference gene set?
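
A comparison of this kind reduces to set operations once both gene lists use a shared identifier. The sketch below assumes that mapping has already been done; `compare_gene_sets` and the gene names are hypothetical and are not real horse data, and the sketch does not reproduce how the numbers on the slide were derived.

```python
def compare_gene_sets(ncbi_ids, ensembl_ids):
    """Count genes unique to each source and genes found in both."""
    ncbi, ensembl = set(ncbi_ids), set(ensembl_ids)
    return {
        "unique_to_ncbi": len(ncbi - ensembl),
        "overlap": len(ncbi & ensembl),
        "unique_to_ensembl": len(ensembl - ncbi),
    }


# Hypothetical gene identifiers, assumed to be mapped to a common namespace.
ncbi_genes = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
ensembl_genes = ["GENE_C", "GENE_D", "GENE_E"]

counts = compare_gene_sets(ncbi_genes, ensembl_genes)
print(counts)
```

In practice the hard part is the identifier mapping itself, which is one reason a community reference gene set matters.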

15 Provide annotations (data)
Provide tools (to help with modeling using this data)
Provide training (to enable use of the data & tools)

17 Based on literature (detailed, species-specific).

18 Based on functional domains/motifs (generalized).

19 Based on sequence similarity to other species (not species-specific).

20 “No Data”.
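
The four kinds of annotation on slides 17 to 20 can be told apart in a data set by the GO evidence code attached to each annotation. The grouping below is an assumption made for illustration: the evidence codes are standard GO codes, but assigning IEA to the "functional domains/motifs" category is a simplification (IEA also covers other electronic methods), and `CATEGORY_BY_EVIDENCE`, `summarize` and the protein names are hypothetical.

```python
from collections import Counter

# Assumed correspondence between the slide categories and GO evidence codes.
CATEGORY_BY_EVIDENCE = {
    # Experimental codes: annotation based on literature.
    "EXP": "literature", "IDA": "literature", "IPI": "literature",
    "IMP": "literature", "IGI": "literature", "IEP": "literature",
    # Electronic annotation, e.g. from functional domains/motifs (simplified).
    "IEA": "functional domains/motifs",
    # Sequence-similarity codes.
    "ISS": "sequence similarity", "ISO": "sequence similarity",
    "ISA": "sequence similarity", "ISM": "sequence similarity",
    # No biological data available.
    "ND": "no data",
}


def summarize(annotations):
    """Count annotations per category; unrecognized codes count as 'other'."""
    return Counter(
        CATEGORY_BY_EVIDENCE.get(code, "other") for _, _, code in annotations
    )


# Hypothetical (gene product, GO ID, evidence code) rows.
annotations = [
    ("PROT_1", "GO:0007049", "IDA"),
    ("PROT_1", "GO:0009987", "IEA"),
    ("PROT_2", "GO:0009987", "ISS"),
    ("PROT_3", "GO:0008150", "ND"),
    ("PROT_4", "GO:0007049", "IEA"),
]

summary = summarize(annotations)
print(dict(summary))
```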

21 AgBase: the next three years
New grant will expand the biocuration effort: biocuration for USDA Animal Genome species. Capture functional information in other ontologies (e.g., anatomy, disease, cell type). Provide tools and HPC support for larger data sets. Focus on genes that have no literature. Continued funding for workshops. Helping by request: bioinformatics, biocomputing, functional modeling. Linking phenotype to traits and functions. Comparative browsers. Tools to help locate QTL candidate genes. PROJ NO: MIS AGENCY

22 Aim 1: expanding biocuration.
Use text mining to provide ranked horse gene list for manual biocuration. eGIFT – links functional terms from literature to papers for biocurators. Use eGIFT to determine what ontologies we need to capture horse literature information (not just GO).

24 Aim 2: pipelines for annotations.
RNASeq, etc – finding novel genes with no literature. Expand existing pipelines to add GO for functional modeling based on sequence analysis. Develop new tools for functional modeling of RNASeq data: PMID: PMID:

26 Aim 3: Linking genotype to phenotype
Provide comparative genome browsers to enable cross-species comparison of common traits. Phenotype – anatomy, physiology, behavior. Working with NSF funded groups to develop phenotype annotations. Use data to link phenotype to genotype.

27 Horse Genome Annotation
AgBase will provide ranked gene list for horse – community feedback and comment. User requests for annotation also. Continuing support: bioinformatics, biocomputing, tool development, help with functional modeling. What do you want?

28 Workshop
This afternoon: more detail about functional modeling; a working knowledge of functional annotation (GO); adding GO to a data set; mapping accessions.
Tomorrow (hands on): working on your own data sets; example data to work on; tutorial: learning more about GO.
WIIFM: making connections between molecular and ‘omics’; back to biology.
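
As a preview of "mapping accessions" and "adding GO to a data set", the two steps can be sketched as a pair of dictionary lookups. This is not the AgBase workflow or any of its tools: `ACCESSION_MAP`, `GO_BY_ACCESSION`, `add_go` and all accessions are hypothetical placeholders, and real data would come from mapping and annotation files rather than hard-coded dictionaries.

```python
# Hypothetical accession mapping (e.g. NCBI protein -> UniProtKB); not real IDs.
ACCESSION_MAP = {"NCBI_0001": "UNIPROT_A", "NCBI_0002": "UNIPROT_B"}

# Hypothetical GO annotations keyed by the mapped accession.
GO_BY_ACCESSION = {
    "UNIPROT_A": ["GO:0007049", "GO:0009987"],
    "UNIPROT_B": [],
}


def add_go(accessions, accession_map, go_by_accession):
    """Return {input accession: GO IDs}; unmapped or unannotated inputs get []."""
    result = {}
    for acc in accessions:
        mapped = accession_map.get(acc)          # step 1: map the accession
        result[acc] = list(go_by_accession.get(mapped, []))  # step 2: add GO
    return result


dataset = ["NCBI_0001", "NCBI_0002", "NCBI_0003"]
annotated = add_go(dataset, ACCESSION_MAP, GO_BY_ACCESSION)
for accession, go_ids in annotated.items():
    print(accession, go_ids or "no GO annotation")
```

Entries that come back empty, either because the accession did not map or because the mapped record has no GO, are the ones that need sequence-based annotation or manual biocuration.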
