Eukaryotic Gene Prediction, Rui Alves


How are eukaryotic genes different? [Diagram: DNA -> (RNA Pol) -> mRNA -> (Ribosome) -> Protein]

How are eukaryotic genes different? [Diagram: DNA -> (RNA Pol) -> pre-mRNA -> (Spliceosome) -> mRNA -> (Ribosome) -> Protein]
Correctly identifying splice sites is not a trivial task.

How do we predict splicing sites?
By homology
Ab initio
–Splice site (SS) motifs
–Codon usage
–Exonic Splicing Enhancers
–Intronic Splicing Enhancers
–Exonic Splicing Silencers
–Intronic Splicing Silencers
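As a rough illustration of the "SS motifs" item, the sketch below lists candidate introns using only the canonical rule that most introns begin with GT and end with AG. The function name `candidate_introns` and the length limits are assumptions made for this example, not part of the slides; real predictors score much longer motifs and combine them with the other signals listed above.

```python
import re

def candidate_introns(seq, min_len=4, max_len=10_000):
    """Return (start, end) index pairs of substrings that start with GT and end with AG.

    Indices are 0-based and `end` is exclusive, so seq[start:end] is the candidate intron.
    """
    seq = seq.upper()
    # Lookahead patterns so that overlapping matches are also reported.
    donors = [m.start() for m in re.finditer("(?=GT)", seq)]
    acceptors = [m.start() + 2 for m in re.finditer("(?=AG)", seq)]
    return [(d, a) for d in donors for a in acceptors
            if min_len <= a - d <= max_len]

if __name__ == "__main__":
    dna = "AAGTCCCAGTT"
    for start, end in candidate_introns(dna):
        print(start, end, dna[start:end])
```

Even on short sequences this rule alone produces many false candidates, which is why the additional signals (codon usage, enhancers, silencers) are needed.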

Homology Splice Site Prediction [Figure labels: known spliced gene; predicted spliced gene]

Splice Site Motifs

Exonic Splicing Enhancers

Exonic Splicing Silencers Genes & Development 18:

Interaction between SE and SI

Rules for Splicing
–The 3’ end is the likely target for repression
–Distance between SE and 3’ end < 100 bp
–Splicing efficiency is proportional to p(interaction SEC-3’ end)
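The distance rule can be read as a simple filter on enhancer positions. The sketch below assumes 0-based coordinates on a single sequence and uses a hypothetical function name, `enhancers_near_3prime_end`; the 100 bp threshold is the one stated on the slide. It does not attempt to model splicing efficiency, since the slides do not give a functional form for the interaction probability.

```python
def enhancers_near_3prime_end(enhancer_positions, three_prime_pos, max_distance=100):
    """Keep the enhancer positions that lie closer than max_distance bp to the 3' end."""
    return [p for p in enhancer_positions
            if abs(p - three_prime_pos) < max_distance]

if __name__ == "__main__":
    # Hypothetical positions, for illustration only.
    print(enhancers_near_3prime_end([510, 599, 600, 750], three_prime_pos=500))
```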

Methods for splicing detection
–Start from a set of known spliced genes and divide it into a training set and a test set
–Training set of known spliced genes -> Algorithm (GA, NN, HMM, Bayesian, ME)
–Test set of known spliced genes -> Trained algorithm -> Predictions
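The training/test workflow can be sketched independently of which algorithm is used. The helper names below (`split_genes`, `site_accuracy`) and the 25% test fraction are assumptions for illustration; the slides do not specify how the split or the scoring is done.

```python
import random

def split_genes(genes, test_fraction=0.25, seed=0):
    """Shuffle a list of known spliced genes and split it into (training, test)."""
    shuffled = genes[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def site_accuracy(predicted_sites, known_sites):
    """Return (sensitivity, precision) of predicted splice sites against known ones."""
    predicted, known = set(predicted_sites), set(known_sites)
    true_positives = len(predicted & known)
    sensitivity = true_positives / len(known) if known else 0.0
    precision = true_positives / len(predicted) if predicted else 0.0
    return sensitivity, precision

if __name__ == "__main__":
    training, test = split_genes([f"gene{i}" for i in range(8)])
    print(len(training), "training genes,", len(test), "test genes")
    print(site_accuracy([10, 20, 30], [10, 20, 40, 50]))
```

Keeping the test genes out of training is what makes the reported accuracy a fair estimate of how the method behaves on new sequences.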

A Genetic Algorithm Method
[Table of motifs: DM1 … AM i … EM against DM1, AM, EM, IM, with entries p(i)]
–Shuffle lines and columns k times, and each time calculate the probability of a given combination of motifs getting spliced
–Select the m best combinations and continue to evolve the algorithm until it predicts the training set
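The select-and-evolve loop can be sketched generically. The code below is a textbook genetic algorithm with crossover and point mutation, not necessarily the line-and-column shuffling scheme from the slide; the function `evolve`, its parameters, and the toy fitness function are all assumptions made for this example. Each individual is a 0/1 list saying which motifs are included in a combination, and `fitness` stands in for the probability that the combination gets spliced.

```python
import random

def evolve(fitness, n_motifs, pop_size=20, m_best=5, generations=30, seed=0):
    """Evolve 0/1 motif combinations; needs n_motifs >= 2 and 2 <= m_best < pop_size."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_motifs)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Select the m best combinations; they survive unchanged (elitism).
        parents = sorted(population, key=fitness, reverse=True)[:m_best]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_motifs)   # one-point crossover
            child = a[:cut] + b[cut:]
            i = rng.randrange(n_motifs)        # flip one motif on or off
            child[i] = 1 - child[i]
            children.append(child)
        population = parents + children
    return max(population, key=fitness)

if __name__ == "__main__":
    # Toy fitness: agreement with a hypothetical "true" motif combination.
    target = [1, 0, 1, 1, 0, 0]
    score = lambda combo: sum(c == t for c, t in zip(combo, target))
    best = evolve(score, n_motifs=len(target))
    print(best, score(best))
```

In a real setting the fitness would be measured on the training set of known spliced genes, and the loop would stop once the training set is predicted well enough rather than after a fixed number of generations.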

A Neural Net Method
[Diagram elements: sequences; weight table for splice elements; hidden nodes; predicted splicing; corrected weight table for splice elements]
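The idea of a weight table that is corrected after each prediction can be shown with a single-layer perceptron. This is a deliberate simplification: it has no hidden nodes, and the feature encoding (one 0/1 value per splice element) and the names `train_perceptron` and `predict` are assumptions for this example rather than the method on the slide.

```python
def train_perceptron(examples, n_features, epochs=20, rate=0.1):
    """Learn one weight per splice element from (features, spliced) pairs."""
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        for features, spliced in examples:
            score = bias + sum(w * x for w, x in zip(weights, features))
            predicted = 1 if score > 0 else 0
            error = spliced - predicted        # 0 when the prediction is right
            # Correct the weight table in the direction that reduces the error.
            bias += rate * error
            weights = [w + rate * error * x for w, x in zip(weights, features)]
    return weights, bias

def predict(weights, bias, features):
    """Return 1 (spliced) or 0 (not spliced) for one feature vector."""
    return 1 if bias + sum(w * x for w, x in zip(weights, features)) > 0 else 0

if __name__ == "__main__":
    # Toy data: splicing happens exactly when the first element is present.
    data = [([1, 0], 1), ([0, 1], 0), ([1, 1], 1), ([0, 0], 0)]
    weights, bias = train_perceptron(data, n_features=2)
    print(weights, bias, [predict(weights, bias, x) for x, _ in data])
```

A network with hidden nodes applies the same predict-compare-correct cycle, but propagates the correction through an extra layer so that it can capture interactions between elements.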

Summary
–Eukaryotic genes are split into exons separated by introns
–Biological rules combined with mathematical and statistical approaches can be used to predict the boundaries of the exons and to predict the splice variants

How to find what genes a string of DNA contains Rui Alves

Simple steps
–Go to a known gene prediction server (or google for one)
–Input the sequence and wait for the prediction
–Get the prediction(s), either as cDNA or as a translated protein sequence, and do homology searches to identify them in a known database (e.g. NCBI or Swiss-Prot)
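If the server returns a cDNA, the translated protein sequence can be derived locally before running the homology search. The sketch below uses the standard genetic code and assumes the cDNA starts in frame; the function name `translate` is chosen for this example. The prediction and homology-search steps themselves depend on the external server and are not shown.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def translate(cdna):
    """Translate an in-frame cDNA into protein, stopping at the first stop codon."""
    cdna = cdna.upper().replace("U", "T")
    protein = []
    for i in range(0, len(cdna) - 2, 3):
        amino_acid = CODON_TABLE.get(cdna[i:i + 3], "X")  # X marks ambiguous codons
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)

if __name__ == "__main__":
    print(translate("ATGGCCTAA"))
```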

Paper Presentation
The human genome (Science) vs. The human genome (Nature)
Nature: Pages 875 to 901
Science: Pages
Compare the differences in methods and results for the annotation
DO NOT SPEND TIME TALKING ABOUT THE SEQUENCING OR ASSEMBLY ITSELF
Do not go into the comparative genome analysis