28-Way vertebrate alignment and conservation track in the UCSC Genome Browser Journal club Dec. 7, 2007.

Presentation transcript:

1 28-Way vertebrate alignment and conservation track in the UCSC Genome Browser Journal club Dec. 7, 2007

2 Vertebrate genome sequencing: the Broad Institute of MIT (Massachusetts Institute of Technology) and Harvard; the Human Genome Sequencing Center at the Baylor College of Medicine; the Genome Sequencing Center at Washington University; the Sanger Center; the Department of Energy's (DOE's) Joint Genome Institute; the National Institute of Genetics in Japan.

3

4 Alignment: Similarities & differences between genome sequences: 1. functional noncoding regions 2. protein-coding genes 3. non-coding RNA genes

5 Aims 1. To more reliably identify functional elements via sequence alignment 2. To enhance the effectiveness of disease-model species for experiments 3. To determine the course of evolution and reconstruct the ancestral genome sequence

6 April 2007: 17 → 28 species: 11 old species with old data, 6 old species with updated data, 11 new species

7

8

9

10 >79% Heterogeneous mix

11 Coverage: 2 genomes: >99%; 16 genomes: 5.1% ~ 8.5%; 10 genomes: ~2x (2x: 87.5%, 5x: 99.4%). Cloning bias …
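The "2x" and "5x" figures on this slide can be related to expected genome coverage with a back-of-the-envelope sketch. It assumes the idealized Poisson (Lander-Waterman) model, which the slide does not name and which ignores effects such as cloning bias; the function name is hypothetical.

```python
import math


def expected_fraction_covered(redundancy: float) -> float:
    """Expected fraction of bases hit by at least one read.

    Assumption (not from the slides): reads land uniformly at random,
    so the number of reads over a base is Poisson with mean `redundancy`.
    """
    return 1.0 - math.exp(-redundancy)


for c in (2, 5):
    print(f"{c}x: {expected_fraction_covered(c):.1%}")
```

Under this idealized model the values come out near 86.5% and 99.3%, in the same range as the 87.5% and 99.4% quoted on the slide; the slide's exact figures presumably come from a different estimate.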

12 Applications Application 1: indels in protein-coding regions Application 2: conservation of start and stop codons Application 3: phylogenetic extent of alignment of functional regions

13 Application 1: Did indels accumulate at a uniform rate during evolution? What are the phenotypic consequences of human-specific protein indels? Have the positions of potentially disease-associated indels resisted substitution over evolutionary time (interspecies conservation)?

14 6-bp indel near the start of PRNP Primate & glires PG D

15

16 Total indels: 209. Number of indels / number per million years (MY). Parametric bootstrap test: the rates differ significantly from the uniform-rate hypothesis. 4/MY, 2/MY
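A parametric bootstrap test of a uniform indel rate can be sketched as follows. This is only an illustration under stated assumptions, not the procedure used in the work being presented: the null model places each indel on a branch with probability proportional to branch length, the test statistic is a chi-square-style sum, and the counts and branch lengths below are invented for the example.

```python
import random


def chi_square_stat(counts, expected):
    """Sum of (observed - expected)^2 / expected over branches."""
    return sum((o - e) ** 2 / e for o, e in zip(counts, expected))


def bootstrap_uniform_rate_test(counts, branch_lengths_my, n_sim=2000, seed=0):
    """Return (observed statistic, bootstrap p-value) for the uniform-rate null.

    counts: indels observed per branch (hypothetical input).
    branch_lengths_my: branch lengths in million years (hypothetical input).
    """
    rng = random.Random(seed)
    total = sum(counts)
    total_len = sum(branch_lengths_my)
    probs = [b / total_len for b in branch_lengths_my]
    expected = [total * p for p in probs]
    observed_stat = chi_square_stat(counts, expected)

    branches = range(len(counts))
    hits = 0
    for _ in range(n_sim):
        # Simulate the null: drop each indel on a branch in proportion to its length.
        sim = [0] * len(counts)
        for b in rng.choices(branches, weights=probs, k=total):
            sim[b] += 1
        if chi_square_stat(sim, expected) >= observed_stat:
            hits += 1
    # Add-one correction keeps the p-value strictly positive.
    return observed_stat, (hits + 1) / (n_sim + 1)


# Invented example: three branches of equal length with unequal indel counts.
stat, p_value = bootstrap_uniform_rate_test([60, 25, 15], [10, 10, 10])
print(f"statistic={stat:.2f}, p={p_value:.4f}")
```

A small p-value means the simulated uniform-rate data rarely look as uneven as the observed counts, which is the sense in which the rates "differ significantly" from the hypothesis.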

17 Human specific protein indels

18 SULF1: human-specific 3-bp insertion in exon 11 (replication slippage) 1. Fixed in humans 2. Very conserved region (retains 4 Es over 2 billion years) 3. No 3D data

19 GFM2: human-specific 6-bp insertion 1. Not a conserved region 2. This insertion occurs only in some human individuals 3. 3D data for a similar protein implied no phenotypic consequence

20

21 Human disease-associated amino acid replacement mutations are overabundant and occur predominantly at positions essential to the structure and function of the proteins. Subramanian and Kumar, BMC Genomics 2006, 7:306

22 Disease-associated deletions: considering more species. Data from PhenCode Locus Variants (PAH). Simplified distance: the number of distinct amino acids.
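The "simplified distance" can be illustrated with a short sketch. It assumes the measure is the count of distinct amino acids seen across species at one column of a protein alignment, with gap characters ignored; the function name and the example columns are hypothetical.

```python
def distinct_amino_acids(column: str, gap_chars: str = "-.") -> int:
    """Count distinct residues in one alignment column, ignoring gaps.

    `column` holds one character per species (hypothetical input format).
    """
    return len({aa.upper() for aa in column if aa not in gap_chars})


# Invented columns: a fully conserved position and a variable one.
print(distinct_amino_acids("LLLLLL"))
print(distinct_amino_acids("LLMI-L"))
```

Under this reading, a lower count marks a position that has tolerated fewer substitutions across the aligned species.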

23 6

24 >79% < Hard to identify precise gene boundaries based on comparative genomics data Drift away

25 Hypothesis 1: the CpG islands that are common near gene starts are more difficult to sequence

26 Hypothesis 2: Selection at the start codon might be more relaxed in genes with multiple promoters (alternate promoters) 4% 1.65%

27 Hypothesis 3: the program may not have enough surrounding conserved sequence to reliably align the small initial coding exon around the start codon

28 Hypothesis 3: the program may not have enough surrounding conserved sequence to reliably align the small initial coding exon around the start codon similar

29 Conclusion: A bias against CpG islands in the draft sequence, combined with difficulty in aligning small initial coding exons, does explain a great deal of the observed unalignability of start codons compared with stop codons. Gene models based on multiple genomic alignments must be aware of the start codon.
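The start-versus-stop comparison can be made concrete with a small sketch. It assumes the input is, for each comparison species, the triplet aligned to the human start or stop codon, with "---" standing for an unaligned position; the function name, input format, and example species data are all hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codon_conservation(aligned_codons: dict, kind: str) -> float:
    """Fraction of species whose aligned triplet is a valid start or stop codon.

    aligned_codons: species name -> triplet aligned to the human codon
    (hypothetical format; "---" means nothing aligned there).
    kind: "start" (expects ATG) or "stop" (expects TAA, TAG or TGA).
    """
    if kind not in ("start", "stop"):
        raise ValueError("kind must be 'start' or 'stop'")
    if not aligned_codons:
        return 0.0

    def is_conserved(codon: str) -> bool:
        codon = codon.upper()
        return codon == "ATG" if kind == "start" else codon in STOP_CODONS

    conserved = sum(is_conserved(c) for c in aligned_codons.values())
    return conserved / len(aligned_codons)


# Invented triplets for illustration only.
starts = {"mouse": "ATG", "dog": "ATG", "chicken": "---", "opossum": "CTG"}
stops = {"mouse": "TGA", "dog": "TAA", "chicken": "TAC"}
print(codon_conservation(starts, "start"))
print(codon_conservation(stops, "stop"))
```

Counting unaligned positions as not conserved, as done here, mixes sequence divergence with missing or unalignable sequence, which is the distinction the hypotheses on the preceding slides try to separate.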

30 Background: finding functional elements. Conservation in noncoding regions is much more subject to evolutionary turnover than in protein-coding regions. Evolutionary (conservation) turnover: most studies tacitly equate homology of functional elements with sequence homology. This assumption is violated by the phenomenon of turnover, in which functionally equivalent elements reside at locations that are nonorthologous at the sequence level (Frith et al., Genome Research 2006). Genomic data from more species gives higher resolution.

31 251,000 coding exons of RefSeq genes; 481 ultraconserved elements; 94,000 predicted regulatory regions (PRPs); 3,900 putative transcriptional regulatory regions (pTRRs)

32 Alignability: the fraction that aligns with a designated comparison species
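One way to compute such a fraction is sketched below. It assumes alignability is measured per base: the share of bases in a set of non-overlapping elements that fall inside blocks aligned to the comparison species. The slide does not say whether the fraction is per base or per element, so this choice, the half-open (start, end) coordinates, and the function names are assumptions.

```python
def merge_intervals(intervals):
    """Merge overlapping or touching half-open (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged


def alignability(elements, aligned_blocks):
    """Fraction of element bases covered by blocks aligned to one species.

    elements: non-overlapping (start, end) intervals for a class of
    functional elements on the reference (hypothetical input).
    aligned_blocks: (start, end) reference intervals that align to the
    designated comparison species (hypothetical input).
    """
    blocks = merge_intervals(aligned_blocks)
    total = sum(end - start for start, end in elements)
    if total == 0:
        return 0.0
    aligned = 0
    for start, end in elements:
        for b_start, b_end in blocks:
            # Length of the overlap between this element and this block.
            aligned += max(0, min(end, b_end) - max(start, b_start))
    return aligned / total


# Invented coordinates: 70 of 150 element bases are aligned.
print(alignability([(100, 200), (300, 350)], [(150, 250), (240, 320)]))
```

The nested loop is fine for a toy example; at genome scale a sorted sweep or an interval tree would be the usual replacement.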

33 Human

