Annotating a Scarlet Runner Bean Genome Fragment Assembled by Shotgun Sequencing Max Bachour
Predicting Genes Web-based programs can predict the genes in a sequence. The predicted genes are then checked against the NCBI database to see whether they match any known gene or protein. Three gene-prediction programs are: –Genscan –FGENESH –GeneMark
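The "check against the NCBI database" step can be sketched as a filter over BLAST hits. This is a minimal sketch, assuming the results were exported in BLAST's 12-column tabular format (the `-outfmt 6` layout). The sample rows, the subject names, and the `max_evalue` cutoff are hypothetical and are not results from this project.

```python
import csv
import io

# Column order of BLAST's default tabular output (-outfmt 6).
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]

# Hypothetical rows for illustration only; not results from this project.
SAMPLE = (
    "gene1\tsubjA\t31.2\t80\t50\t3\t1\t80\t10\t89\t2.1\t28.5\n"
    "gene3\tsubjB\t88.4\t410\t45\t2\t5\t414\t1\t410\t1e-150\t512\n"
    "gene3\tsubjC\t71.0\t395\t110\t4\t8\t402\t3\t397\t3e-90\t330\n"
)


def strong_hits(tabular_text, max_evalue=1e-10):
    """Return {query: [(subject, evalue), ...]} for hits at or below max_evalue.

    max_evalue is an assumed cutoff for what counts as a "substantial" match.
    """
    hits = {}
    reader = csv.reader(io.StringIO(tabular_text), delimiter="\t")
    for row in reader:
        # Skip blank lines and comment lines (present in some exports).
        if not row or row[0].startswith("#"):
            continue
        rec = dict(zip(COLUMNS, row))
        evalue = float(rec["evalue"])
        if evalue <= max_evalue:
            hits.setdefault(rec["qseqid"], []).append((rec["sseqid"], evalue))
    return hits


if __name__ == "__main__":
    for query, matches in strong_hits(SAMPLE).items():
        print(query, matches)
```

Predicted genes that never appear in the returned dictionary are the ones with no substantial match under the chosen cutoff.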
Genscan Genscan predicted three genes. The first two did not match anything substantial. The third, however, returned a large number of matches when its protein sequence was searched with BLAST. Several were strong matches to a “maturase-related” protein; the most notable came from soybean, a close relative of the Scarlet Runner Bean (SRB).
GeneMark GeneMark predicted 20 genes. Only gene 10 had a substantial match against the protein database. Again, the protein predicted from gene 10 matched maturase. In addition, there was a match to an unnamed protein in grape.
Predicted exons (type, start, end):
Terminal 8577 8897
Internal 9001 9131
Internal 9381 10322
Internal 10616 10685
Initial 10793 11395 603
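The exon coordinates on this slide can be sanity-checked with a little arithmetic. The sketch below assumes the coordinates are 1-based and inclusive, so an exon's length is end minus start plus one. Under that assumption the initial exon comes out at 603 bases, which agrees with the trailing 603 on the slide. The function names are illustrative.

```python
# Exon coordinates as listed on the slide: (type, start, end).
# Assumption: coordinates are 1-based and inclusive.
EXONS = [
    ("Terminal", 8577, 8897),
    ("Internal", 9001, 9131),
    ("Internal", 9381, 10322),
    ("Internal", 10616, 10685),
    ("Initial", 10793, 11395),
]


def exon_length(start, end):
    """Length in bases of a 1-based inclusive interval."""
    return end - start + 1


def summarize(exons):
    """Return per-exon lengths, total coding length, and whether it is a multiple of 3."""
    lengths = [exon_length(start, end) for _, start, end in exons]
    total = sum(lengths)
    return lengths, total, total % 3 == 0


if __name__ == "__main__":
    lengths, total, in_frame = summarize(EXONS)
    for (kind, start, end), length in zip(EXONS, lengths):
        print(f"{kind:8s} {start:6d} {end:6d} {length:5d}")
    print("total coding bases:", total, "multiple of 3:", in_frame)
```

A total that divides evenly by three is what one would expect from a complete coding sequence, so this is a cheap consistency check on a predicted gene model.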
FGENESH FGENESH predicted three genes. Genes 1 and 3 had no substantial matches. Gene 2 once again matched the maturase protein from soybean.
Expressed Sequence Tags (ESTs) ESTs help further confirm the accuracy of a predicted gene. The more records there are of a sequence being expressed, the more likely it is to be a gene. Goldberg Lab EST
ESTs Compared to Predicted Genes The EST hits from BLASTing the entire contig line up with the predicted genes that had good matches.
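One way to make "fit together" concrete is to measure how much of each EST hit falls inside predicted exons. This is a minimal sketch, not the method used for the slides: the exon intervals are the GeneMark coordinates listed earlier, while the EST hit names and positions (`est_A`, `est_B`) are hypothetical placeholders. Intervals are assumed to be 1-based and inclusive.

```python
# Exon intervals (start, end) taken from the GeneMark exon list.
EXONS = [
    (8577, 8897),
    (9001, 9131),
    (9381, 10322),
    (10616, 10685),
    (10793, 11395),
]

# Hypothetical EST alignment positions on the contig, for illustration only.
EST_HITS = [
    ("est_A", 9400, 10300),
    ("est_B", 12000, 12400),
]


def overlap_bp(a_start, a_end, b_start, b_end):
    """Bases shared by two 1-based inclusive intervals (0 if disjoint)."""
    return max(0, min(a_end, b_end) - max(a_start, b_start) + 1)


def est_support(exons, est_hits):
    """Map each EST hit to the fraction of its length that lies inside exons.

    Assumes the exon intervals do not overlap one another.
    """
    report = {}
    for name, start, end in est_hits:
        covered = sum(overlap_bp(start, end, s, e) for s, e in exons)
        report[name] = covered / (end - start + 1)
    return report


if __name__ == "__main__":
    for name, fraction in est_support(EXONS, EST_HITS).items():
        print(f"{name}: {fraction:.0%} of the hit lies inside predicted exons")
```

An EST hit that falls mostly inside predicted exons supports the gene model; a hit that falls outside every exon may point to a gene the predictors missed.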
Conclusion No significant repeats were found. There is overwhelming evidence that the maturase gene is a real gene: a low error value and a high EST match. Next steps: sequence again and examine more contigs.
THANK YOU! x ∞