ANALYSIS AND VISUALIZATION OF SINGLE COPY ORTHOLOGS IN ARABIDOPSIS, LETTUCE, SUNFLOWER AND OTHER PLANT SPECIES. Alexander Kozik and Richard W. Michelmore.


University of California, Davis, Dept. of Vegetable Crops, Davis, CA 95616, USA

Approximately 3,700 of the genes in the Arabidopsis Col-0 genome are single copy. These genes were used to identify conserved orthologs in several other plant species. Using computational approaches, we identified 1104 lettuce, 686 sunflower, 1704 tomato, 2016 soybean, 1701 maize and 1290 rice ESTs that are conserved orthologs of these Arabidopsis genes. Each EST sequence in these sets has a single, unambiguous strong BLAST hit to the Arabidopsis genome. Reciprocal BLAST searches (Arabidopsis single copy genes versus EST assemblies) showed that more than 80% of BLAST hits had only a single strong hit. This indicates that the majority of these conserved orthologs are represented by single genes in multiple plant species.

The total number of Arabidopsis genes that have similarity (BLAST E-value of 1e-20 or better) to at least one of these selected ESTs is 2205, which is about 60% of the 3,714 single copy genes in Arabidopsis. Only 248 sequences were in common between the EST collections from different species and the Arabidopsis single copy genes. This can be partially explained by the incomplete representation within each EST collection.

Analysis and visualization of single copy genes over Arabidopsis chromosomes (http://cgpdb.ucdavis.edu/COS_Arabidopsis/arabidopsis_single_copy_genes_2003.html) revealed that these genes are distributed throughout the genome regardless of large scale chromosomal duplications. This indicates that deduction of the order of genes in common ancestors is required for informative analyses of synteny.
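The "single strong BLAST hit" criterion described above can be illustrated with a minimal Python sketch. It is not the authors' script: it assumes the hits have already been parsed into (query, subject, E-value) tuples, it treats "single strong hit" as exactly one distinct subject at or below the 1e-20 cutoff (the poster does not say how weaker secondary hits were handled), and the EST and gene identifiers are made up for illustration.

```python
from collections import defaultdict

E_VALUE_CUTOFF = 1e-20  # cutoff quoted in the poster


def single_strong_hit_queries(hits, cutoff=E_VALUE_CUTOFF):
    """Return {query_id: subject_id} for queries with exactly one strong subject.

    `hits` is an iterable of (query_id, subject_id, e_value) tuples.
    A hit counts as strong when its E-value is at or below `cutoff`.
    Several alignments to the same subject count as one subject.
    This definition is an assumption made for the sketch.
    """
    strong_subjects = defaultdict(set)
    for query_id, subject_id, e_value in hits:
        if e_value <= cutoff:
            strong_subjects[query_id].add(subject_id)
    return {
        query_id: next(iter(subjects))
        for query_id, subjects in strong_subjects.items()
        if len(subjects) == 1
    }


# Hypothetical identifiers, for illustration only.
hits = [
    ("EST_A", "gene_1", 1e-45),
    ("EST_A", "gene_1", 1e-30),  # second alignment to the same gene
    ("EST_A", "gene_3", 1e-3),   # weak hit, ignored
    ("EST_B", "gene_2", 1e-60),
    ("EST_B", "gene_4", 1e-25),  # second strong subject: ambiguous, dropped
    ("EST_C", "gene_5", 1e-10),  # no strong hit, dropped
]
result = single_strong_hit_queries(hits)
print(result)
```

Only EST_A survives in this toy input, because it is the only query with one strong subject. Running the same function with queries and subjects swapped gives one way to approximate the reciprocal check.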
SINGLE COPY ORTHOLOGS SUMMARY

source                                  number of single copy orthologs
lettuce                                 1104
sunflower                               686
tomato                                  1704
soybean                                 2016
maize                                   1701
rice                                    1290
common between all                      248
common between lettuce and sunflower    431
Arabidopsis (total)                     2205 (out of 3,714 single copy genes)

Graphical representation of BLAST search of lettuce, sunflower, tomato, soybean, maize and rice ESTs against Arabidopsis genome. The picture displays potential conserved orthologs (single copy genes in Arabidopsis). Each box (element) is a single copy Arabidopsis gene having homology to selected sets of plant ESTs. Genes are plotted along five Arabidopsis chromosomes according to their physical positions.

Patterns of segmental duplications in Arabidopsis genome (generated by GenomePixelizer, http://www.atgc.org/). Regions selected by white boxes are shown in large scale above.

Segmental duplication between Arabidopsis chromosomes 4 and 5

Color Scheme:
Black - single copy genes
Purple - kinases
Green - cytochrome
Red - resistance genes
Yellow - ribosomal proteins
Gray lines connect genes with sequence identity 40% or greater

Note: Single copy genes are distributed evenly through both segments of the duplicated region. Image was generated by GenomePixelizer using the "locus zoomer" function.
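The "common between" and "Arabidopsis (total)" rows of the summary are set operations over the per-species lists of Arabidopsis genes with hits. The following Python sketch shows that arithmetic under stated assumptions: the function names and the toy gene identifiers are hypothetical, and only the 2205 and 3,714 figures come from the poster.

```python
def shared_genes(hits_by_species):
    """Arabidopsis genes hit by every species (set intersection)."""
    gene_sets = [set(genes) for genes in hits_by_species.values()]
    return set.intersection(*gene_sets) if gene_sets else set()


def covered_genes(hits_by_species):
    """Arabidopsis genes hit by at least one species (set union)."""
    return set().union(*hits_by_species.values())


def coverage_percent(n_hit, n_total):
    """Percentage of single copy genes with at least one EST hit."""
    return 100.0 * n_hit / n_total


# Toy data with made-up gene identifiers, for illustration only.
toy_hits = {
    "lettuce": {"g1", "g2", "g3"},
    "sunflower": {"g2", "g3", "g4"},
    "rice": {"g3", "g5"},
}
common_all = shared_genes(toy_hits)
common_pair = shared_genes({k: toy_hits[k] for k in ("lettuce", "sunflower")})
total = covered_genes(toy_hits)

# Figures from the summary: 2205 genes with hits out of 3,714.
poster_coverage = round(coverage_percent(2205, 3714), 1)
print(sorted(common_all), sorted(common_pair), len(total), poster_coverage)
```

With the poster's own counts the coverage comes to 59.4%, which the abstract rounds to 60%.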
Additional information is available at: http://www.atgc.org/GP_Ref/presentation/

Credits: This work was funded by USDA IFAFS Plant Genome Program to the Compositae Genome Project.

Questions and comments to Alexander Kozik, email: akozik@atgc.org

Raw data and a detailed description of the sequence extraction pipeline are available at: http://cgpdb.ucdavis.edu/COS_Arabidopsis/

Putative scenario of gene loss after segmental duplication

Because of extensive gene loss after duplication, deduction of gene order in ancestral genomes is required for informative synteny analysis between different genomes.

PIPELINE TO IDENTIFY SINGLE COPY ORTHOLOGS

Input: Arabidopsis predicted proteins (27,169 seqs)
[step 1] BLAST search of Arabidopsis proteins against themselves and selection of Arabidopsis single copy genes. Result: Arabidopsis single copy genes (3,714 seqs)
[step 2] BLAST search of Arabidopsis single copy genes versus full sets of ESTs; selection of ESTs with BLAST hits to the Arabidopsis single copy subset. EST sets:
lettuce ESTs (68,197 seqs)
sunflower ESTs (67,180 seqs)
tomato ESTs (113,932 seqs)
maize ESTs (362,510 seqs)
soybean ESTs (341,564 seqs)
rice ESTs (107,329 seqs)
[step 3] BLAST search of selected ESTs versus all Arabidopsis predicted proteins and selection of ESTs with a single strong hit to Arabidopsis genome (Exp cutoff 1e-20)

PIPELINE TO EXTRACT ALIGNMENTS AT NUCLEOTIDE LEVEL
http://cgpdb.ucdavis.edu/COS_Arabidopsis/Codon_Usage_Pipeline.html

Input: GenBank files of Arabidopsis genome (DNA sequences of entire chromosomes and corresponding annotation)
[step 1] GenBank Parser. Result: spliced DNA sequences corresponding to ORFs
[step 2] Translation. Result: translated (protein) sequences [subject]
[step 3] BLASTX search [ESTs vs proteins], with the ESTs (unigene) set as [query]. Result: BLAST output (alignment)
[step 4] BLAST parser (Tcl/Tk script). Result: tab-delimited file with info about BLAST alignments (start points and end points for each sequence in BLAST report)
[step 5] Final step of the pipeline: SeqsExtractorFromBlastX (Python script). Extraction of DNA sequences corresponding to BLAST alignments from "spliced DNA" (subject) and EST (query) files. Script automatically counts codon usage. Output: spreadsheet with info about codon usage

MULTIPLE ALIGNMENT VISUALIZED WITH TkLife ( http://www.atgc.org/TkLife/ )

Rows shown: Arabidopsis, lettuce, sunflower, alignment summary

Alignment summary categories:
codon match (and amino acid match)
codon mismatch and amino acid match (synonymous substitutions)
codon mismatch and amino acid mismatch (non-synonymous substitutions)
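The three alignment summary categories (codon match, synonymous substitution, non-synonymous substitution) can be sketched in Python as below. This is an illustrative sketch, not the SeqsExtractorFromBlastX script: it assumes two gap-free, in-frame nucleotide sequences of equal length, uses the standard genetic code, and the function name and example sequences are hypothetical.

```python
from collections import Counter
from itertools import product

# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_codons(seq_a, seq_b):
    """Count codon matches, synonymous and non-synonymous substitutions.

    Both sequences must be in frame, gap-free and of equal length
    (assumptions of this sketch). Codons containing characters other
    than A, C, G or T raise KeyError.
    """
    if len(seq_a) != len(seq_b) or len(seq_a) % 3 != 0:
        raise ValueError("sequences must have equal length divisible by 3")
    counts = Counter()
    for i in range(0, len(seq_a), 3):
        codon_a = seq_a[i:i + 3].upper()
        codon_b = seq_b[i:i + 3].upper()
        if codon_a == codon_b:
            counts["codon match"] += 1
        elif CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            counts["synonymous"] += 1
        else:
            counts["non-synonymous"] += 1
    return counts


# Made-up sequences: ATG=ATG, GCT/GCC both code Ala, AAA (Lys) vs AGA (Arg).
summary = classify_codons("ATGGCTAAA", "ATGGCCAGA")
print(dict(summary))
```

Real BLASTX alignments contain gaps and frame shifts, so a production version would need to walk the alignment coordinates from the parsed BLAST report rather than slice fixed windows.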

