
1 wfleabase.org/docs/arthropod-gene-finding/ Unlocated Arthropod genes and ways to find them. Many bug genes are hard to find: Daphnia’s many tandems were lost for a bit. Duplicate genes, a bane and a boon. Genome tile expression picks out many more. April 2008. Don Gilbert, Genome Informatics Lab, Biology Dept., Indiana University, gilbertd@indiana.edu

2 Environ Stresses find Novels. Novel Daphnia genes show up under stress. Novel Drosophila species genes are missed by prediction.

3 Duplicate genes are common. Daphnia surpasses C. elegans in the richness of its tandem gene set. Bugs have many tandem genes.
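One way to count tandem genes is to look for runs of neighbouring genes on the same scaffold that belong to the same gene family. The slides do not define how tandems were called, so the sketch below is an assumption: the adjacency rule, the gene records, and the family labels are all hypothetical.

```python
from itertools import groupby

# Hypothetical gene records: (scaffold, start, gene_id, family_id).
genes = [
    ("scaffold_1", 1000, "g1", "famA"),
    ("scaffold_1", 3000, "g2", "famA"),
    ("scaffold_1", 5000, "g3", "famB"),
    ("scaffold_1", 7000, "g4", "famA"),
    ("scaffold_2", 500, "g5", "famB"),
    ("scaffold_2", 2500, "g6", "famB"),
    ("scaffold_2", 4500, "g7", "famB"),
]

def tandem_clusters(genes):
    """Return runs of two or more adjacent same-family genes per scaffold.

    Assumed definition of 'tandem': genes that are direct neighbours on a
    scaffold and share a family label. Real pipelines may allow gaps.
    """
    ordered = sorted(genes, key=lambda g: (g[0], g[1]))
    clusters = []
    # groupby merges consecutive genes sharing both scaffold and family.
    for _key, run in groupby(ordered, key=lambda g: (g[0], g[3])):
        ids = [g[2] for g in run]
        if len(ids) >= 2:
            clusters.append(ids)
    return clusters

if __name__ == "__main__":
    print(tandem_clusters(genes))
```

In this toy input, g4 is not counted as tandem with g1 and g2 because g3, from another family, sits between them.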

4 Duplicates confuse Finders. Prediction errors are common in duplicate gene regions. None of 13 predictors found all 4 tandem genes of this Dwil P450 cluster, but each gene was correctly predicted by at least one of them.
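The pattern on this slide, where no single predictor recovers the whole cluster but the predictors together cover every gene, can be checked with set operations. This is an illustrative sketch only: the predictor names, gene labels, and calls are hypothetical placeholders, not the data behind the slide.

```python
# Hypothetical 4-gene tandem cluster and per-predictor calls.
CLUSTER = {"geneA", "geneB", "geneC", "geneD"}

calls = {
    "predictor1": {"geneA", "geneB"},
    "predictor2": {"geneB", "geneC", "geneD"},
    "predictor3": {"geneA", "geneD"},
}

def complete_predictors(calls, cluster):
    """Return names of predictors that found every gene in the cluster."""
    return sorted(name for name, found in calls.items() if cluster <= found)

def missed_by_all(calls, cluster):
    """Return cluster genes that no predictor found."""
    found_by_any = set().union(*calls.values()) if calls else set()
    return sorted(cluster - found_by_any)

if __name__ == "__main__":
    print("complete predictors:", complete_predictors(calls, CLUSTER))
    print("missed by all:", missed_by_all(calls, CLUSTER))
```

Both lists come back empty for this toy input, which mirrors the slide: combining predictors recovers genes that each one alone misses.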

5 Duplicates find Errors. Duplicates solve a prediction dilemma in Drosophila. The prediction cline is an artifact of Dmel training; retraining with Dmoj removes it.

6 Odorant genes concur. Curation of Drosophila Obp genes also removes the prediction cline. Vieira et al. (2007), and my own further analysis, recovered genes using PSI-BLAST trained on species Obp genes. Computational errors are significantly more common in the Far- and Mid-mel groups. Obp genes show no overall gain or loss across groups.

7 Tile expression finds genes. Daphnia tile expression with gene finding calls 26% of genome bases as coding, compared to 17% from gene predictions, which suggests 5,000 - 10,000 new genes. Manak et al. (2006), with Drosmel, also found 24% CDS/genome, up from 18% CDS/genome in the reference gene set. Computational tools need to mature; gene finding is preliminary.
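The percentages on this slide can be turned into a relative gain in coding sequence with a little arithmetic. The sketch below uses only the four percentages quoted above; the genome size passed to `extra_coding_bases` is a round placeholder, not a figure from the slides.

```python
def relative_gain(baseline_pct, tile_pct):
    """Fractional increase in coding bases going from baseline_pct to tile_pct."""
    return (tile_pct - baseline_pct) / baseline_pct

def extra_coding_bases(genome_size_bp, baseline_pct, tile_pct):
    """Additional coding bases implied by the percentage difference."""
    return round(genome_size_bp * (tile_pct - baseline_pct) / 100)

if __name__ == "__main__":
    # Percentages quoted on the slide.
    print(f"Daphnia: {relative_gain(17, 26):.0%} more coding bases")
    print(f"Dros. mel: {relative_gain(18, 24):.0%} more coding bases")
    # Placeholder genome size (assumption), for illustration only.
    print(extra_coding_bases(100_000_000, 17, 26), "extra coding bases")
```

With these inputs the Daphnia tile data imply roughly half again as many coding bases as the predictions alone, and the Drosophila data about a third more.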

8 Summary: Locating novel genes
1. More genes are expressed in unusual environs, and are specific. Use many environmental, developmental and tissue conditions to see the range of genes via expression. Understand the limits of gene homology.
2. Duplicate genes are common, a problem, and an aid to finding genes. Examine duplicate genes carefully. Tools that distinguish these can be used to find paralogs missed by traditional methods.
3. Near-species training reduces errors and spurious effects. Use same-species and near-species data as much as possible in preparing automated annotations. Be aware of and control for informant species-distance as a source of bias.
4. Genome-wide tile expression finds more genes. As an alternative to EST studies, it has values and drawbacks. Computational methods need to improve to use this data well.

9 Genome maps on your laptop
Genome data sets that I use are available for your computer. Includes GMOD GBrowse software in a ready-to-run bundle*: http://eugenes.org/gmod/genomeview-package2008/
* This is fully configured for Intel-MacOSX 10.5; others need further installation. See http://www.gmod.org/GBrowse
Map data (large) are at ftp://eugenes.org/eugenes/gbrowse/databases/
daphnia_pulex : Daphnia genome data from wfleabase.org
nasonia : Wasp gene predictions, homology, EST
tribcas : Tribolium basic gene set from NCBI genomes
drospege : 12 Drosophila genomes
drosmel : Dros. mel rel 5.5 genome with Affymetrix transcriptome data

10 End note
Acknowledgements
I am grateful for support from NSF (DBI-0640462) and the NIH, including a TeraGrid award, for making this work possible. Daphnia sequencing and portions of the analyses were provided by the DOE Joint Genome Institute, in collaboration with the Daphnia Genomics Consortium (DGC).
References
Gilbert, 2007. New and old genes in Drosophila genomes. http://insects.eugenes.org/DroSpeGe/about/analysis-doc/
Gilbert, 2007. Daphnia gene duplicates. http://wfleabase.org/genome-summaries/gene-duplicates/
Gilbert, 2008. Tandem genes lost + found. http://insects.eugenes.org/DroSpeGe/about/analysis-doc/
Manak, J.R. et al., 2006... unannotated transcription in Dros. mel. Nature Genetics, doi:10.1038/ng1875
Vieira, F.G. et al., 2007... analysis of the Odorant-Binding genes in Drosophila genomes. Genome Biology, doi:10.1186/gb-2007-8-11-r235

