1/30 Comparative Genomics. 2/30 Overview of the Talk Comparing Genomes Homologies & Families Sequence Alignments.

Presentation transcript:

1/30 Comparative Genomics

2/30 Overview of the Talk
- Comparing Genomes
- Homologies & Families
- Sequence Alignments

3/30 Evolution at the DNA Level
…ACTGACATGTACCA…
…AC----CATGCACCA…
Sequence edits: mutation
Rearrangements: deletion, inversion, translocation, duplication

4/30 Why Compare Genomes?
- We can better understand evolution and speciation.
- We can find important, functional regions of the sequence (codons, promoters, regulatory regions).
- It can help us locate genes in other species that are missing or not well defined (also through comparison and alignments).

5/30 Comparing Genomes
- Mammals have roughly 3 billion base pairs in their genomes.
- Over 98% of human genes are shared with primates, with more than 95-98% similarity between genes.
- Even the fruit fly shares 60% of its genes with humans! (March 2000)
- Differences: gene structure, sequence.
Remember: one nucleotide change can cause disease, such as sickle cell anemia and cancer.

6/30 How Does Ensembl Predict Homology?
- Uses all the species
- Uses a representative protein (the longest) for every gene
- Builds a gene tree
Reference: EnsemblCompara GeneTrees: Analysis of complete, duplication aware phylogenetic trees in vertebrates. Vilella AJ, Severin J, Ureta-Vidal A, Durbin R, Heng L, Birney E. Genome Res Nov 24.

7/30 Steps in Homology Prediction
1. Load the longest protein for every gene from all species.
2. WU Blastp + SmithWaterman: longest translation of every gene against every other (Blast Reciprocal Hit / Blast Score Ratio).
3. Protein clustering; build multiple alignments (MCoffee).
4. From each alignment, build a gene tree (TreeBest).
5. Reconcile each gene tree with the species tree to determine internal nodes (TreeBest).
6. Orthologues, paralogues…
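The "Blast Reciprocal Hit" idea in the steps above can be illustrated with a minimal sketch. It assumes BLAST scores have already been parsed into dictionaries; the gene names (`hA`, `mA`, and so on), the scores, and the function names are hypothetical and are not taken from the Ensembl pipeline.

```python
def best_hits(scores):
    """Return the highest-scoring target for each query.

    scores maps (query, target) pairs to a similarity score.
    """
    best = {}
    for (query, target), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (target, score)
    return {query: target for query, (target, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Pairs (a, b) where b is a's best hit and a is b's best hit."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


# Hypothetical toy scores for two species (h* versus m* proteins)
a_vs_b = {("hA", "mA"): 200, ("hA", "mB"): 80,
          ("hB", "mB"): 150, ("hB", "mC"): 140}
b_vs_a = {("mA", "hA"): 198, ("mB", "hB"): 149,
          ("mB", "hA"): 79, ("mC", "hB"): 141}

pairs = reciprocal_best_hits(a_vs_b, b_vs_a)
print(pairs)
```

Here `mC` has `hB` as its best hit, but `hB` prefers `mB`, so `mC` is left out of the reciprocal pairs.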

8/30 Viewing Trees in Ensembl

9/30 Types of Homologues
Orthologues: any gene pairwise relation where the ancestor node is a speciation event
Paralogues: any gene pairwise relation where the ancestor node is a duplication event

10/30 The Gene Tree for INS (insulin precursor)
A red square is a duplication event (Paralogues)
A blue square is a speciation event (Orthologues)

11/30 Reconciliation
[Figure: a species tree over M, R and H is reconciled with an unrooted gene tree; gene-tree nodes are marked as duplication nodes or speciation nodes, and gene losses (M’, R’, H’) are indicated.]
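One common way to label gene-tree nodes is to map each node to the lowest common ancestor (LCA) of its species in the species tree. The sketch below uses that textbook LCA rule and is not necessarily what TreeBest does. It assumes a rooted gene tree (the slide shows an unrooted one) and a hypothetical species tree ((M, R), H); all names are illustrative.

```python
# Hypothetical species tree ((M, R), H), stored as child -> parent
PARENT = {"M": "MR", "R": "MR", "MR": "MRH", "H": "MRH", "MRH": None}


def ancestors(node):
    """Path from a species-tree node up to the root, inclusive."""
    path = []
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path


def lca(a, b):
    """Lowest common ancestor of two species-tree nodes."""
    seen = set(ancestors(a))
    for node in ancestors(b):
        if node in seen:
            return node
    raise ValueError("nodes are not in the same tree")


def reconcile(gene_tree, events):
    """Map a gene-tree node onto the species tree and record its event type.

    A leaf is a species label (str); an internal node is a (left, right) tuple.
    A node is a duplication if it maps to the same species-tree node as one
    of its children, otherwise a speciation.
    """
    if isinstance(gene_tree, str):
        return gene_tree
    left, right = gene_tree
    m_left = reconcile(left, events)
    m_right = reconcile(right, events)
    here = lca(m_left, m_right)
    kind = "duplication" if here in (m_left, m_right) else "speciation"
    events.append((here, kind))
    return here


# Two full copies of the family: the root must be a duplication
gene_tree = ((("M", "R"), "H"), (("M", "R"), "H"))
events = []
root = reconcile(gene_tree, events)
print(events)
```

Under this rule, pairs of genes whose last common node is a speciation would be called orthologues, and pairs joined at a duplication node would be called paralogues.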

12/30 Orthologue Types What is ‘1 to 1’? What is ‘1 to many’?

13/30 Protein Families
How: cluster proteins for every isoform in every species + UniProt proteins.
BLASTP comparison of:
– all Ensembl ENSP…
– all metazoan (animal) proteins in UniProt
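As a rough illustration of clustering proteins from pairwise BLASTP hits, the sketch below groups proteins into connected components with a union-find structure. This single-linkage approach is an assumption chosen for brevity; the slide does not say which clustering algorithm is used. The protein names, scores and the `min_score` threshold are hypothetical.

```python
def cluster_families(hits, min_score=50.0):
    """Group proteins into families by linking pairs scoring >= min_score.

    hits is an iterable of (protein_a, protein_b, score) tuples.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b, score in hits:
        root_a, root_b = find(a), find(b)  # registers both proteins
        if score >= min_score and root_a != root_b:
            parent[root_a] = root_b

    families = {}
    for protein in list(parent):
        families.setdefault(find(protein), set()).add(protein)
    return sorted(sorted(members) for members in families.values())


# Hypothetical BLASTP hits
hits = [("P1", "P2", 120), ("P2", "P3", 95),
        ("P4", "P5", 200), ("P3", "P4", 20)]
families = cluster_families(hits)
print(families)
```

The weak `P3`/`P4` hit falls below the threshold, so the two groups stay separate.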

14/30 Homologues Exercise
1. Find the human MYL6 gene: go to its gene summary.
2. How many paralogues does it have? Find them in the gene tree.
3. Which paralogue is closest to the human MYL6 gene? In what taxon is the common ancestor?

15/30 Pan-taxonomic compara
Anopheles gambiae, Caenorhabditis elegans, Drosophila melanogaster, Aspergillus nidulans, Neurospora crassa, Saccharomyces cerevisiae, Schizosaccharomyces pombe, B_aphidicola_Tokyo_1998, B_burgdorferi_DSM_4680, B_subtilis, E_coli_K12, M_tuberculosis_H37Rv, N_meningitidis_A, P_horikoshii, S_aureus_N315, S_pneumoniae_TIGR4, S_pyogenes_SF370, W_pipientis_wMel, Anolis carolinensis, Ciona savignyi, Danio rerio, Equus caballus, Gallus gallus, Homo sapiens, Macaca mulatta, Monodelphis domestica, Mus musculus, Ornithorhynchus anatinus, Pan troglodytes, Pongo pygmaeus, Xenopus tropicalis, Dictyostelium discoideum, Plasmodium falciparum, Plasmodium vivax, Arabidopsis thaliana, Oryza sativa, Vitis vinifera

16/30

17/30 Families

18/30 Ensembl Proteins in the Family

19/30 Overview of the Talk
- Comparing Genomes
- Homologies and Families
- Sequence Alignments

20/30 Aligning Whole Genomes: Why?
- To identify homologous regions
- To spot problematic gene predictions
- Conserved regions could be functional
- To define syntenic regions (long regions of DNA sequence where order and orientation are highly conserved)
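The notion of a syntenic region as a run of conserved order and orientation can be sketched as follows. This is a deliberately strict toy: it assumes genes in genome A are already paired with orthologues in genome B, and it only accepts runs where the B ranks change by exactly +1 (same orientation) or -1 (inverted). Real synteny detection tolerates gaps and works on alignments; the function name and `min_genes` threshold are hypothetical.

```python
def synteny_blocks(b_rank, min_genes=3):
    """Find runs of genes whose order is conserved between two genomes.

    b_rank[i] is the rank in genome B of the orthologue of the i-th gene
    in genome A. Returns (first_index, last_index, orientation) tuples.
    """
    blocks = []
    start, step = 0, None
    for i in range(1, len(b_rank) + 1):
        delta = b_rank[i] - b_rank[i - 1] if i < len(b_rank) else None
        if delta in (1, -1) and step in (None, delta):
            step = delta  # run continues in a consistent direction
            continue
        if step is not None and i - start >= min_genes:
            orientation = "forward" if step == 1 else "inverted"
            blocks.append((start, i - 1, orientation))
        start, step = i, None  # begin a new candidate run
    return blocks


# Hypothetical ranks: a collinear run, then an inverted run, then a stray gene
blocks = synteny_blocks([0, 1, 2, 3, 9, 8, 7, 5])
print(blocks)
```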

21/30 Aligning large genomic sequences
Difficulties:
- Requires significant computing resources
- Scalability, as more and more genomes are sequenced
- Time constraints
- As the "true" alignment is not known, it is difficult to measure alignment accuracy and to choose the right method

22/30 Whole Genome Alignments
- BLASTZ-net (nucleotide level): closer species, e.g. human – mouse
- Translated BLAT (amino acid level): more distant species, e.g. human – zebrafish
- EPO/PECAN: multispecies alignments
- ORTHEUS: used to determine ancestral alleles

23/30 Which Multispecies Alignments?
Mercator-Pecan
- 16 amniota vertebrates + constrained elements
Enredo-Pecan-Ortheus (EPO)
- For 6 primates
- For 5 teleost fish + constrained elements
- For 12 eutherian mammals
- For 34 eutherian mammals + constrained elements

24/30 Non-Coding Regions
“Phylogenetic Footprinting”: conserved noncoding regions can be functional.
Regulatory regions discovered in this way for genes: Hoxb-1, Hoxb4, PAX6, SOX9
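The idea of spotting conserved regions can be illustrated with a sliding-window identity scan over a pairwise alignment. This is only a sketch: the constrained-element tracks mentioned elsewhere in the talk are computed from multispecies alignments, not from a simple window like this. The sequences, window size and threshold are hypothetical.

```python
def conserved_windows(a, b, window=10, min_identity=0.9):
    """Return (start, identity) for alignment windows at or above min_identity.

    a and b are aligned sequences of equal length; '-' marks a gap and
    never counts as a match.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    hits = []
    for start in range(len(a) - window + 1):
        pairs = zip(a[start:start + window], b[start:start + window])
        same = sum(1 for x, y in pairs if x == y and x != "-")
        identity = same / window
        if identity >= min_identity:
            hits.append((start, identity))
    return hits


# Hypothetical toy alignment: a conserved stretch followed by a divergent one
seq_a = "ACGTACGTAC" + "TTTTTTTTTT"
seq_b = "ACGTACGTAC" + "GGGGGGGGGG"
hits = conserved_windows(seq_a, seq_b)
print(hits)
```

Windows that pass the threshold in a noncoding region would be candidates for closer inspection, not proof of function.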

25/30 More Examples
- Highly conserved transcription factor binding sites discovered, e.g. a 401 bp non-coding sequence involved in transcriptional regulation of interleukins.
- New genes (human-mouse comparison), e.g. APOA5, identified as a paralogue of APOA4 in human and mouse.

26/30 Going Beyond Mammals
Where human-mouse is too conserved, go to other species:
- Chicken (mammals and birds: 300 MYA), e.g. a cardiac-specific enhancer of Nkx2-5
- Human and fish ( MYA): in 2002, comparison of human to Fugu rubripes led to identification of 1000 genes.

27/30 Regulatory Features of the PDX1 gene Region in Detail shows conservation of sequence in regions involved in PDX1 transcriptional regulation ( kb upstream of the gene).

28/30 Alignments Exercise
1. Have a look at Region in Detail for the ACN9 gene.
2. Turn on the BLASTZ alignment against macaque. What parts of the macaque genome align to this region in human?
3. Turn on the constrained elements for the 33 eutherian mammals. How does this track differ from the BLASTZ alignment?

29/30 Alignments Continued
1. Zoom out one box in the zoom slider. Are there constrained elements upstream of the ACN9 transcript that overlap a regulatory feature?
2. View the ‘6 primates alignment’ using the Alignments links at the left.

30/30 Compara Team at EBI: Javier Herrero, Kathryn Beal, Stephen Fitzgerald, Leo Gordon