A high-resolution map of human evolutionary constraint using 29 mammals
Kerstin Lindblad-Toh et al., 2011
Presentation by Robert Lewis and Kaylee Wells



What is Evolutionary Constraint?
Restrictions that conserve non-deleterious alleles.
Explains why something didn't (or doesn't) evolve.
An aspect of an organism that has not changed over time.

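Phylogeny and constrained elements from the 29 genome sequences
Compared with the earlier four-species HMRD (human, mouse, rat, dog) alignment.
Looked at 100 bp sites.
Total branch length: 4.2 substitutions per site for the 29 mammals vs 0.68 for HMRD.
A sequence that is not under purifying selection has a low probability of remaining fixed across all 29 species.
Therefore, better constraint detection.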
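To see why a longer total branch length gives better constraint detection, the sketch below estimates the chance that neutral sequence shows no substitutions at all. It assumes a simple Poisson model in which substitutions accumulate independently along the tree and back-mutations are ignored; this is an illustrative simplification, not the statistical model used in the paper. The function name `prob_unchanged` is hypothetical.

```python
import math


def prob_unchanged(branch_length: float, n_sites: int = 1) -> float:
    """Probability that n_sites neutral sites all show zero substitutions.

    Assumption: the number of substitutions per site is Poisson-distributed
    with mean equal to the total branch length (substitutions per site),
    and sites evolve independently.
    """
    return math.exp(-branch_length * n_sites)


if __name__ == "__main__":
    # Branch lengths as quoted on the slide.
    for label, total_branch_length in [("HMRD", 0.68), ("29 mammals", 4.2)]:
        single = prob_unchanged(total_branch_length)
        twelve = prob_unchanged(total_branch_length, n_sites=12)
        print(f"{label}: single base {single:.3f}, 12 bases {twelve:.2e}")
```

Under this assumption roughly half of all neutral bases would look unchanged in the four-species comparison, while very few would in the 29-species comparison, so an unchanged base is much stronger evidence of constraint.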

Shotgun Sequencing

Method for Detecting Constraint
Generated 2X-coverage shotgun sequence.
Contigs were 2.8 kb; scaffolds were 51.8 kb.
Depth (coverage) = N × (L / G), where N = number of reads, L = read length, G = genome length.
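The coverage formula on this slide can be turned into a small calculation. The sketch below implements Depth = N × (L / G) and its inverse; the genome size and read length in the example are made-up round numbers for illustration, not values from the paper, and the function names are hypothetical.

```python
import math


def coverage(n_reads: int, read_length: int, genome_length: int) -> float:
    """Average sequencing depth: N * (L / G)."""
    return n_reads * read_length / genome_length


def reads_needed(target_depth: float, read_length: int, genome_length: int) -> int:
    """Smallest number of reads giving at least the target depth."""
    return math.ceil(target_depth * genome_length / read_length)


if __name__ == "__main__":
    # Hypothetical example: a 3 Gb genome sequenced with 700 bp reads.
    genome = 3_000_000_000
    read_len = 700
    n = reads_needed(2.0, read_len, genome)
    print(f"Reads needed for 2X: {n:,}")
    print(f"Depth achieved: {coverage(n, read_len, genome):.3f}X")
```

At 2X depth many bases of any single genome are missed or covered only once, which is one reason low-coverage assemblies are analyzed jointly in a multi-species alignment rather than on their own.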

Sequence Assembly and Alignment
With 29 mammalian species they were able to find:
3.6 million elements spanning 4.2% of the human genome.
Elements were significantly shorter: 36 bp vs 123 bp in the HMRD comparison.

PhastCons: how well individual bases are conserved.
SiPhy: indicates bases under selection.
HMRD vs 29 mammals (NRSF binding): HMRD, 1 element; 29 mammals, 4 elements.

~1.5% of the genome is protein coding.
~5% is undergoing purifying selection.
Of the 5%, 3.5% are regulatory elements.
Genome-wide association studies rely on non-coding sequences.

Exons and protein-coding regions
3,788 candidate exons (a 2% increase).
Stop-codon readthrough to a subsequent stop codon in 4 genes (regulatory).
>10,000 synonymous constrained elements (SCEs) in 25% of genes.
These are regions with a very low synonymous substitution rate (no change in amino acid): highly constrained.

HoxA2 (2 sites with SCEs)
Synonymous rate = rate of base changes that do not change the amino acid.
PhyloP = nucleotide conservation score, on a scale from -14 to 3: (+) = more conserved, (-) = faster evolution (changing).
dN/dS indicates selective pressure: a ratio > 1 means amino-acid changes are favored (positive selection); a ratio < 1 means they are being removed (purifying selection).
These sites are known enhancers and drive expression in other Hox
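The two definitions on this slide, synonymous change and dN/dS, can be illustrated in a few lines. The sketch below builds the standard genetic code, checks whether a single codon change alters the amino acid, and labels a dN/dS ratio. The function names and the `tolerance` band around 1 are hypothetical choices for illustration; real analyses estimate dN/dS with statistical models rather than a fixed cutoff.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (third base fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def is_synonymous(codon_a: str, codon_b: str) -> bool:
    """True if both codons encode the same amino acid (or both are stops)."""
    return CODON_TABLE[codon_a.upper()] == CODON_TABLE[codon_b.upper()]


def interpret_dn_ds(ratio: float, tolerance: float = 0.05) -> str:
    """Rough label for a dN/dS ratio; the tolerance band is an assumption."""
    if ratio < 1 - tolerance:
        return "purifying (negative) selection"
    if ratio > 1 + tolerance:
        return "positive selection"
    return "approximately neutral"


if __name__ == "__main__":
    print(is_synonymous("CTT", "CTC"))  # both leucine
    print(is_synonymous("ATG", "ATA"))  # methionine vs isoleucine
    print(interpret_dn_ds(0.2))
```

A synonymous constrained element is notable precisely because changes like CTT to CTC leave the protein untouched, so their absence points to a second function overlapping the coding sequence.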

RNA structures and structural elements
Look at RNA sequences → determine secondary structure.
Found 37,381 possible elements.
Important because structure indicates function (look at the structure and find the likely target).

Promoters
Again, structure = function.
Organized into 3 categories:
High constraint: development.
Intermediate constraint: basic cell functions.
Low constraint: immunity and reproduction.

Regulatory Motifs
HMRD had already produced a catalog of motifs conserved across the genome, but it was not good for finding new motifs.
29 mammals revealed 688 regulatory motifs associated with 345 transcription factors.
2.7 million conserved instances form a regulatory network.
375 motif targets, with 21 regulators per target gene.

Chromatin Signatures
Indicate possible functions for 37.5% of unexplained conserved elements.
Functions of elements outside coding regions, UTRs, and proximal promoters.

Accounting for constrained elements
~30% of constrained elements are associated with protein-coding transcripts.
~27% overlap specific enriched chromatin states.
~1.5% overlap novel RNA structures.
~3% overlap conserved regulatory motifs.
~60% of constrained elements overlap at least one of these features.

Implications for interpreting disease-associated variants
SNPs associated with human disease are 1.37-fold enriched in constrained regions.
Only a small portion of SNPs are likely to be causative.
HOXB1 and HOXB2 are associated with tooth development phenotypes.
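A fold-enrichment figure such as the 1.37 quoted on this slide is a ratio of an observed fraction to an expected fraction. The sketch below shows the arithmetic under the simplifying assumption that the expected fraction is just the share of the genome that is constrained (4.2%, from the earlier slide); the paper's actual background model may differ, and the SNP counts in the example are invented purely for illustration.

```python
def fold_enrichment(hits: int, total: int, background_fraction: float) -> float:
    """Observed fraction of items in a region class divided by the expected fraction."""
    if total <= 0 or background_fraction <= 0:
        raise ValueError("total and background_fraction must be positive")
    return (hits / total) / background_fraction


if __name__ == "__main__":
    # Invented counts: 58 of 1,000 disease-associated SNPs fall in constrained elements.
    # Assumed background: constrained elements span 4.2% of the genome.
    enrichment = fold_enrichment(hits=58, total=1000, background_fraction=0.042)
    print(f"Fold enrichment: {enrichment:.2f}")
```

Even a modest enrichment like this means most disease-associated SNPs still lie outside constrained elements, which fits the slide's point that only a small portion are likely to be causative.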

Implications for disease-associated variants
Look at SNPs for HOXB1.
Constraint helps resolve which SNPs disrupt function.
Rs disrupts a Forkhead-family motif in an enhancer.

Codon-specific selection
Looked at 6.05 million codons.
84.2%: purifying (negative) selection sites.
2.4%: positive selection sites.
4,431 proteins with 15,383 positively selected sites.
Distributed positively selected sites for: immune response, taste perception, meiotic chromosome segregation, and transcription regulation.
Localized positively selected sites for: microtubule-based movement, topological change, and telomere maintenance.

Exaptation of mobile elements
Mobile elements can move and be retained where advantageous in the genome.
280,000 mobile element exaptations are common to mammalian genomes.
Of the ~1.1 million constrained elements from 90 million years of divergence between marsupials and eutherians, 19% can be traced to mobile elements.
11% of mobile elements constrained.

Accelerated evolution in the primate lineage
564 human-accelerated regions (HARs); previously 202 known.
577 primate-accelerated regions (PARs).
These regions contain constrained elements for brain and limb development.
Genes harboring or neighboring these regions are enriched for extracellular signaling, receptor activity, immunity, axon guidance, cartilage development, and embryonic pattern formation.
Why are we different from our primate lineage?

Main Points and Key Techniques
Analysis of 29 mammalian genomes produced a map of >3.5 million constrained elements, covering ~4% of the human genome.
The function of ~60% of these constrained sequences can be identified:
Protein-coding sequences
RNA structures
Promoters and transcriptional regulators
Chromatin signatures
This article shows the importance of constrained elements in the evolution of the mammalian lineage as well as their role in diseases.

Acknowledgments
Lindblad-Toh, K. et al. A high-resolution map of human evolutionary constraint using 29 mammals. Nature 478, 476–482 (2011).
Bejerano, G. et al. Ultraconserved elements in the human genome. Science 304, 1321–1325 (2004).