Annotating a Scarlet Runner Bean genome fragment put together by shotgun sequencing. Max Bachour.

Predicting Genes
Web-based programs can predict the genes in a sequence. The predicted genes are then checked against the NCBI database to see whether they match any known gene or protein. Three gene-prediction programs were used:
– Genscan
– Fgenesh
– GeneMark

Genscan
Genscan predicted three genes. The first two did not match anything substantial. The third gene, however, returned a large number of matches when its protein sequence was searched with BLAST. Several strong matches to a “maturase related” protein were found, the most notable one coming from soybean, a close relative of the Scarlet Runner Bean (SRB).
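Separating "substantial" matches from weak ones can be sketched in code. The example below is a minimal illustration, not the workflow used for these slides (which relied on web-based BLAST): it assumes the hits have been saved in BLAST's standard 12-column tabular format, and every hit line, subject name, and the 1e-10 cutoff is hypothetical.

```python
# Column order of BLAST's standard tabular output (outfmt 6).
FIELDS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
          "qstart", "qend", "sstart", "send", "evalue", "bitscore"]

# Hypothetical hit lines, invented purely for illustration.
SAMPLE = (
    "gene3\thypothetical_maturase_soybean\t82.5\t400\t70\t0\t1\t400\t1\t400\t1e-150\t520\n"
    "gene3\thypothetical_protein_x\t35.0\t60\t39\t0\t5\t64\t10\t69\t0.5\t30.1\n"
    "gene1\thypothetical_protein_y\t30.0\t40\t28\t0\t1\t40\t1\t40\t3.2\t25.0\n"
)


def parse_hits(text):
    """Turn tab-separated BLAST lines into dictionaries keyed by column name."""
    hits = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        row = dict(zip(FIELDS, line.split("\t")))
        row["evalue"] = float(row["evalue"])
        row["bitscore"] = float(row["bitscore"])
        hits.append(row)
    return hits


def strong_hits(hits, max_evalue=1e-10):
    """Keep hits at or below an E-value cutoff (the cutoff is an assumed value)."""
    return [h for h in hits if h["evalue"] <= max_evalue]


hits = parse_hits(SAMPLE)
strong = strong_hits(hits)
for h in strong:
    print(h["qseqid"], h["sseqid"], h["evalue"])
```

With the invented sample data, only the maturase-like hit for gene 3 passes the cutoff, which mirrors the pattern described for the Genscan predictions.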

GeneMark
GeneMark predicted 20 genes. Only gene 10 had a substantial match against the protein database. Again, the protein predicted from gene 10 matched maturase. In addition, there was a match to an unnamed protein in grape.
Predicted exons (type; start; end):
Terminal 8577; 8897
Internal 9001; 9131
Internal 9381; 10322
Internal 10616; 10685
Initial 10793; 11395; 603
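The exon coordinates listed on this slide can be checked with a little arithmetic. The sketch below assumes the coordinates are 1-based and inclusive, and that the five exons belong to a single gene model; neither assumption is stated on the slide.

```python
# Exon coordinates copied from the GeneMark slide: (type, start, end).
EXONS = [
    ("Terminal", 8577, 8897),
    ("Internal", 9001, 9131),
    ("Internal", 9381, 10322),
    ("Internal", 10616, 10685),
    ("Initial", 10793, 11395),
]


def exon_length(start, end):
    """Length of an exon, assuming 1-based inclusive coordinates."""
    return end - start + 1


lengths = [exon_length(start, end) for _, start, end in EXONS]
total = sum(lengths)

for (kind, start, end), length in zip(EXONS, lengths):
    print(f"{kind:<9}{start:>6}{end:>7}{length:>6}")
print("total coding length:", total, "| multiple of 3:", total % 3 == 0)
```

Under these assumptions the Initial exon comes out at 603 bases, the same value as the trailing number on the slide, which suggests that number is an exon length. The five exons sum to 2067 bases, a multiple of three, as expected for a complete coding sequence.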

Fgenesh
Fgenesh predicted three genes. Genes 1 and 3 had no substantial matches. Gene 2 once again matched the maturase protein from soybean.
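The results from the three predictors can be tallied in a small data structure. The counts and gene numbers below are taken from the slides; the dictionary layout and function name are illustrative choices only.

```python
# Per-program results as reported on the slides.
PREDICTIONS = {
    "Genscan": {"predicted": 3, "maturase_gene": 3},
    "GeneMark": {"predicted": 20, "maturase_gene": 10},
    "Fgenesh": {"predicted": 3, "maturase_gene": 2},
}


def programs_with_maturase(predictions):
    """Names of the programs whose output included a maturase-matching gene."""
    return sorted(name for name, result in predictions.items()
                  if result["maturase_gene"] is not None)


agreeing = programs_with_maturase(PREDICTIONS)
total_predicted = sum(result["predicted"] for result in PREDICTIONS.values())

print(f"{len(agreeing)} of {len(PREDICTIONS)} programs predicted a maturase-like gene")
print("total predicted genes across programs:", total_predicted)
```

All three programs independently predicted a gene matching maturase, while most of the other predictions had no substantial database match.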

Expressed Sequence Tags (ESTs)
ESTs help further confirm the accuracy of a predicted gene. The more records there are of a sequence being expressed, the more likely it is to be a gene.
Goldberg Lab EST

ESTs Compared to Predicted Genes
The ESTs found by searching the entire contig with BLAST line up with the predicted genes that had good matches.
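One way to make "fit together" concrete is to count how many bases of each EST alignment fall inside predicted exons. The sketch below uses the exon coordinates from the GeneMark slide; the EST names and alignment positions are hypothetical, since the slides do not give them, and coordinates are assumed to be 1-based and inclusive.

```python
# Exon coordinates from the GeneMark slide: (start, end).
EXONS = [(8577, 8897), (9001, 9131), (9381, 10322), (10616, 10685), (10793, 11395)]

# Hypothetical EST alignments on the contig: (name, start, end).
EST_ALIGNMENTS = [("est_a", 9400, 10100), ("est_b", 12000, 12400)]


def overlap(a_start, a_end, b_start, b_end):
    """Number of shared positions between two inclusive intervals."""
    return max(0, min(a_end, b_end) - max(a_start, b_start) + 1)


def exon_support(est_start, est_end, exons):
    """Total EST bases that land inside any of the given exons."""
    return sum(overlap(est_start, est_end, start, end) for start, end in exons)


support = {name: exon_support(start, end, EXONS)
           for name, start, end in EST_ALIGNMENTS}

for name, bases in support.items():
    verdict = "supports a predicted exon" if bases else "no exon overlap"
    print(name, bases, verdict)
```

An EST that overlaps a predicted exon is independent evidence that the region is transcribed; one that falls outside every exon neither supports nor contradicts the prediction.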

Conclusion
No significant repeats were found.
There is overwhelming evidence that the maturase gene is a real gene: a low BLAST E-value and a strong EST match.
Next steps: sequence again and examine more contigs.

THANK YOU!