1/29 Comparative Genomics. 2/29 Overview of the Talk Comparing Genomes Homologies & Families Sequence Alignments.

Presentation transcript:

1/29 Comparative Genomics

2/29 Overview of the Talk
- Comparing Genomes
- Homologies & Families
- Sequence Alignments

3/29 Evolution at the DNA Level
…ACTGACATGTACCA…
…AC----CATGCACCA…
- Sequence edits: Mutation, Deletion
- Rearrangements: Inversion, Translocation, Duplication
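The sequence edits on this slide can be counted directly from a pairwise alignment. The sketch below is only an illustration: the function name `classify_columns` and the toy sequences are made up for this example, and rearrangements (inversion, translocation, duplication) cannot be detected column by column, so they are not covered.

```python
def classify_columns(ref: str, other: str, gap: str = "-") -> dict:
    """Count matches, substitutions and gapped columns in a pairwise alignment."""
    if len(ref) != len(other):
        raise ValueError("aligned sequences must have the same length")
    counts = {"match": 0, "substitution": 0, "deletion": 0, "insertion": 0}
    for a, b in zip(ref, other):
        if a == gap and b == gap:
            continue  # a column of two gaps carries no information
        if b == gap:
            counts["deletion"] += 1      # base present in ref, absent in other
        elif a == gap:
            counts["insertion"] += 1     # base absent in ref, present in other
        elif a == b:
            counts["match"] += 1
        else:
            counts["substitution"] += 1  # point mutation
    return counts


# Hypothetical toy alignment (both rows have 14 columns).
ref = "ACTGACATGTACCA"
other = "AC----ATGCACCA"
print(classify_columns(ref, other))
```

"Deletion" and "insertion" are relative to the first sequence; without an outgroup the two cannot be told apart, which is why they are often reported together as indels.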

4/29 Why Compare Genomes?
- We can better understand evolution and speciation
- We can find important, functional regions of the sequence (codons, promoters, regulatory regions)
- It can help us locate genes in other species that are missing or not well-defined (also through comparison and alignments)

5/29 Comparing Genomes
- Mammals have roughly 3 billion base pairs in their genomes
- Over 98% of human genes are shared with primates, with 95-98% or greater similarity between genes
- Even the fruit fly shares 60% of its genes with humans! (March 2000)
- Differences: gene structure, sequence
Remember: a single nucleotide change can cause diseases such as sickle cell anemia and cancer.

6/29 How Does Ensembl Predict Homology?
- Uses all the species
- Uses a representative protein (the longest) for every gene
- Builds a gene tree
EnsemblCompara GeneTrees: Analysis of complete, duplication aware phylogenetic trees in vertebrates. Vilella AJ, Severin J, Ureta-Vidal A, Durbin R, Heng L, Birney E. Genome Res Nov 24.

7/29 Steps in Homology Prediction
1. Load the longest protein for every gene from all species
2. WU Blastp + SmithWaterman: longest translation of every gene against every other (Blast Reciprocal Hit / Blast Score Ratio)
3. Protein clustering, build multiple alignments (MCoffee)
4. From each alignment, build a gene tree
5. Reconcile each gene tree with the species tree to determine internal nodes (TreeBest)
6. Orthologues, paralogues…
[Figure: example protein sequence ..MEDPATA…]
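The "Blast Reciprocal Hit / Blast Score Ratio" idea in the all-against-all step can be sketched as follows. This is a minimal illustration, not the Ensembl pipeline: the gene names and scores are invented, the function names are hypothetical, and the score ratio is taken here as the pair's score divided by the query's self-score, which is an assumption about how the ratio is normalised.

```python
from typing import Dict, Set, Tuple

Scores = Dict[Tuple[str, str], float]  # (query, subject) -> alignment score


def best_hits(scores: Scores) -> Dict[str, str]:
    """For each query, return the non-self subject with the highest score."""
    best: Dict[str, Tuple[str, float]] = {}
    for (query, subject), score in scores.items():
        if query == subject:
            continue  # self-hits are only used for normalisation
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(scores: Scores) -> Set[Tuple[str, str]]:
    """Pairs (a, b) where b is a's best hit and a is b's best hit."""
    best = best_hits(scores)
    return {tuple(sorted((q, s))) for q, s in best.items() if best.get(s) == q}


def score_ratio(scores: Scores, query: str, subject: str) -> float:
    """Score of query vs subject, normalised by the query's self-score."""
    return scores[(query, subject)] / scores[(query, query)]


# Hypothetical scores for one human protein and two mouse proteins.
scores = {
    ("human_A", "human_A"): 200, ("mouse_A", "mouse_A"): 210, ("mouse_B", "mouse_B"): 190,
    ("human_A", "mouse_A"): 180, ("mouse_A", "human_A"): 178,
    ("human_A", "mouse_B"): 90, ("mouse_B", "human_A"): 88,
    ("mouse_A", "mouse_B"): 95, ("mouse_B", "mouse_A"): 96,
}
print(reciprocal_best_hits(scores))
print(score_ratio(scores, "human_A", "mouse_A"))
```

In this toy data `mouse_B` has a best hit (`mouse_A`) but the relation is not reciprocal, so only the `human_A`/`mouse_A` pair is reported.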

8/29 Viewing Trees in Ensembl

9/29 Types of Homologues
- Orthologues: any pairwise gene relation where the ancestor node is a speciation event
- Paralogues: any pairwise gene relation where the ancestor node is a duplication event
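These two definitions translate into a simple rule once a gene tree has its internal nodes labelled: find the last common ancestor of the two genes and look at its event type. The sketch below assumes a hand-built tree; the `Node` class, the function names and the gene names are hypothetical and not part of any Ensembl API.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Node:
    name: str
    event: Optional[str] = None  # "speciation" or "duplication" for internal nodes
    children: List["Node"] = field(default_factory=list)


def leaf_names(node: Node) -> Set[str]:
    """Names of all leaves (genes) below a node."""
    if not node.children:
        return {node.name}
    names: Set[str] = set()
    for child in node.children:
        names |= leaf_names(child)
    return names


def last_common_ancestor(root: Node, a: str, b: str) -> Node:
    """Deepest node whose subtree contains both genes."""
    node = root
    while True:
        for child in node.children:
            if {a, b} <= leaf_names(child):
                node = child  # both genes are still together, go deeper
                break
        else:
            return node  # no single child holds both genes


def relation(root: Node, a: str, b: str) -> str:
    """Classify a gene pair by the event at its ancestor node."""
    if a == b or not {a, b} <= leaf_names(root):
        raise ValueError("need two different genes that are both in the tree")
    ancestor = last_common_ancestor(root, a, b)
    return "orthologues" if ancestor.event == "speciation" else "paralogues"


# Hypothetical tree: one ancient duplication, then a human/mouse speciation in each copy.
tree = Node("n0", "duplication", [
    Node("n1", "speciation", [Node("human_g1"), Node("mouse_g1")]),
    Node("n2", "speciation", [Node("human_g2"), Node("mouse_g2")]),
])
print(relation(tree, "human_g1", "mouse_g1"))
print(relation(tree, "human_g1", "mouse_g2"))
```

Note that `human_g1` and `mouse_g2` come from different species but are still paralogues under this definition, because their ancestor node is the duplication.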

10/29 The Gene Tree for INS (insulin precursor)
- A red square is a duplication event (Paralogues)
- A blue square is a speciation event (Orthologues)

11/29 Reconciliation
[Figure: a species tree for M, R and H and an unrooted gene tree are reconciled; legend: duplication node, speciation node, gene loss (R’, H’, M’)]

12/29 Orthologue Types What is ‘1 to 1’? What is ‘1 to many’?
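One way to think about these questions is to count, for each orthologue pair between two species, how many partners each gene has in the other species. The sketch below is a simplified illustration with invented gene names and a hypothetical function name; it also reports a "many to many" class, which the slide does not mention, and it is not necessarily how Ensembl derives the labels from the gene tree.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

Pair = Tuple[str, str]  # (gene in species A, gene in species B)


def orthologue_types(pairs: Iterable[Pair]) -> Dict[Pair, str]:
    """Label each orthologue pair by how many partners each gene has."""
    unique_pairs = set(pairs)
    partners_a = Counter(a for a, _ in unique_pairs)  # partners per species-A gene
    partners_b = Counter(b for _, b in unique_pairs)  # partners per species-B gene
    types: Dict[Pair, str] = {}
    for a, b in unique_pairs:
        if partners_a[a] == 1 and partners_b[b] == 1:
            types[(a, b)] = "1 to 1"
        elif partners_a[a] == 1 or partners_b[b] == 1:
            types[(a, b)] = "1 to many"
        else:
            types[(a, b)] = "many to many"
    return types


# Hypothetical orthologue pairs between human (h*) and mouse (m*).
pairs = [
    ("hA", "mA"),                 # a single partner on both sides
    ("hB", "mB1"), ("hB", "mB2"),  # one human gene, two mouse genes
    ("hC1", "mC1"), ("hC1", "mC2"), ("hC2", "mC1"), ("hC2", "mC2"),
]
for pair, label in sorted(orthologue_types(pairs).items()):
    print(pair, label)
```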

13/29 Protein Families
How: Cluster proteins for every isoform in every species + UniProt proteins.
BLASTP comparison of:
- all Ensembl ENSP…
- all metazoan (animal) proteins in UniProt
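To make the "cluster proteins from BLASTP comparisons" step concrete, the sketch below groups proteins into families by single-linkage clustering: any two proteins connected by a hit above a score cut-off end up in the same family. This is a deliberately simple stand-in chosen for illustration; the slide does not say which clustering algorithm Ensembl uses, and the protein names, scores, cut-off and function name are all hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

Hit = Tuple[str, str, float]  # (protein 1, protein 2, BLASTP score)


def cluster_families(proteins: Iterable[str], hits: Iterable[Hit],
                     min_score: float) -> List[List[str]]:
    """Single-linkage clustering of proteins using a union-find structure."""
    parent: Dict[str, str] = {p: p for p in proteins}

    def find(p: str) -> str:
        # Follow parent links to the representative, shortening the path on the way.
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    for a, b, score in hits:
        if score >= min_score:
            parent[find(a)] = find(b)  # merge the two families

    families: Dict[str, List[str]] = {}
    for p in parent:
        families.setdefault(find(p), []).append(p)
    return sorted(sorted(members) for members in families.values())


# Hypothetical proteins and pairwise scores.
proteins = ["p1", "p2", "p3", "p4", "p5"]
hits = [("p1", "p2", 250.0), ("p2", "p3", 120.0), ("p4", "p5", 40.0)]
print(cluster_families(proteins, hits, min_score=100.0))
```

Single linkage tends to chain loosely related proteins into one large family, which is one reason real pipelines usually prefer more robust graph-clustering methods.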

14/29 Homologues Exercise
1. Find the human MYL6 gene: go to its gene summary.
2. How many paralogues does it have? Find them in the gene tree.
3. Which paralogue is closest to the human MYL6 gene? In what taxon is the common ancestor?

15/29 Pan-Compara (Ensembl Genomes)
Bacillus subtilis, Escherichia coli K12, Mycobacterium tuberculosis H37Rv, Neisseria meningitidis A 4A, Pyrococcus horikoshii, Staphylococcus aureus N315, Streptococcus pneumoniae TIGR4, Streptococcus pyogenes M1 SF370
Plasmodium falciparum, Plasmodium vivax
Anolis carolinensis, Ciona savignyi, Danio rerio, Equus caballus, Gallus gallus, Homo sapiens, Macaca mulatta
Anopheles gambiae, Caenorhabditis elegans, Drosophila melanogaster
Arabidopsis thaliana, Oryza sativa japonica, Vitis vinifera
Saccharomyces cerevisiae, Schizosaccharomyces pombe
Monodelphis domestica, Mus musculus, Ornithorhynchus anatinus, Pan troglodytes, Pongo pygmaeus, Xenopus tropicalis
[Figure labels: x8, x3, x2, x13]

16/29

17/29 Families

18/29 Ensembl Proteins in the Family

19/29 Overview of the Talk Comparing Genomes Homologies and Families Sequence Alignments

20/29 Non-Coding Regions
Large stretches of non-coding regions in vertebrates
Regulatory regions of:
- Developmental genes
- Transcription factors
- miRNA
Kikuta et al., Genome Research, May 2007

21/29 Comparative Genomics today

22/29 Aligning Whole Genomes: Why?
- To identify homologous regions
- To spot problematic gene predictions
- Conserved regions could be functional
- To define syntenic regions (long regions of DNA sequence where order and orientation are highly conserved)

23/29 Aligning large genomic sequences
Difficulties:
- Requires significant computing resources
- Scalability, as more and more genomes are sequenced
- Time constraints
- As the "true" alignment is not known, it is difficult to measure alignment accuracy and to choose the right method

24/29 Whole Genome Alignments
- BLASTZ-net (nucleotide level): closer species, e.g. human – mouse
- Translated BLAT (amino acid level): more distant species, e.g. human – zebrafish
- EPO/PECAN: multispecies alignments
- ORTHEUS: used to determine ancestral alleles

25/29 Alignments Exercise
1. Find the Ensembl MYH2 gene for human and go to Region in Detail.
2. Turn on the BLASTZ alignment against cow. What part of the cow genome aligns to this region in human?
3. Jump to the region in cow.

26/29 Alignments Exercise
Go back to the human page. Use the Alignments (text) and Multi-species view links to explore the alignments.

27/29 Conserved Regions Exercise
1. Go back to Region in Detail.
2. Turn on the conservation score for 31 species and the constrained elements tracks. Where are the regions of high conservation?
3. Click on the regulatory feature that corresponds to a highly conserved block of sequence. What is it?
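To get a feel for what a conservation track summarises, the sketch below scores each column of a small multiple alignment by the fraction of sequences that share the most common base, then reports runs of fully conserved columns. This identity-based score is a simple stand-in chosen for illustration and is not the method behind the Ensembl conservation score or constrained elements tracks; the alignment and function names are hypothetical.

```python
from collections import Counter
from typing import List, Tuple


def column_conservation(alignment: List[str], gap: str = "-") -> List[float]:
    """Fraction of sequences sharing the most common base, per column."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    scores = []
    for column in zip(*alignment):
        bases = [base for base in column if base != gap]
        if not bases:
            scores.append(0.0)  # a column of gaps only
            continue
        top_count = Counter(bases).most_common(1)[0][1]
        scores.append(top_count / len(column))  # gaps count against conservation
    return scores


def conserved_blocks(scores: List[float], threshold: float = 1.0,
                     min_length: int = 3) -> List[Tuple[int, int]]:
    """Half-open (start, end) runs of columns scoring at least the threshold."""
    blocks = []
    start = None
    for i, score in enumerate(scores + [float("-inf")]):  # sentinel closes the last run
        if score >= threshold:
            if start is None:
                start = i
        elif start is not None:
            if i - start >= min_length:
                blocks.append((start, i))
            start = None
    return blocks


# Hypothetical four-species alignment.
alignment = ["ACGTTGCA", "ACGTAGCA", "ACGT-GCA", "ACGTCGCA"]
scores = column_conservation(alignment)
print(scores)
print(conserved_blocks(scores))
```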

28/29 Ancestral Alleles Go to the variation tab for rs , and take the Phylogenetic Context link. What is the allele in the four primates? Hint… either go to the gene tab and click on the SNP ID from the variation table, or do a new search using rs

29/29 Compara Team at EBI
Javier Herrero, Kathryn Beal, Stephen Fitzgerald, Albert Vilella

30/29 End of Course Survey Exercises on page 43. Answers are on page 44.