[Bejerano Aut07/08] 1 MW 11:00-12:15 in Redwood G19 Profs: Serafim Batzoglou, Gill Bejerano TA: Cory McLean.


1 http://cs273a.stanford.edu [Bejerano Aut07/08] 1 MW 11:00-12:15 in Redwood G19 Profs: Serafim Batzoglou, Gill Bejerano TA: Cory McLean

2 Lecture 17: Ultraconservation

3 Sequence Conservation Implies Function (but which function/s?). Comparative genomics of distantly related species. [Figure: human and another species descend from a common ancestor; their aligned sequences, ...CTTTGCGA-TGAGTAGCATCTACTATTT... and ...ACGTGGGACTGACTA-CATCGACTACGA..., contain a stretch marked "functional region!"] Note: the inverse, “no conservation ⇒ no function”, is a much weaker statement given current knowledge.

4 How They Measured. [Figure: all human-mouse alignments vs. human-mouse ancestral repeats alignment.] Difference: 5% of the human genome. [Mouse consortium, Nature 2002]

5 What They Found [Science 2004 Breakthrough of the Year, 5th runner up]. Human genome: 3 × 10^9 letters. 1.5% known function; >50% junk. Compare to other species: >5% of the human genome is functional, 3x more functional DNA than known! ~10^6 substrings do not code for protein. What do they do then?

6 Ultraconservation

7 Typical DNA Conservation Levels [Bejerano et al., Science 2004]. Conserved elements between human and mouse are on average 85% identical [Mouse consortium, 2002].
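To make "85% identical" concrete, here is a minimal sketch of a percent-identity calculation over two aligned rows. The function name and the choice to skip columns containing a gap are assumptions made for illustration; the slide does not say how gaps were treated in the cited measurement.

```python
def percent_identity(row_a: str, row_b: str) -> float:
    """Percent of gap-free alignment columns in which both rows carry the same base.

    Assumption: columns with a gap ('-') in either row are skipped entirely.
    """
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(row_a.upper(), row_b.upper()):
        if a == "-" or b == "-":
            continue  # gap column: not counted (illustrative convention)
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0
```

Counting gap columns as mismatches instead would lower the figure, so the convention matters when comparing numbers across studies.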

8 Ultraconserved Elements [Bejerano et al., Science 2004] [Figure label: fish]

9 Ultraconserved Elements [Bejerano et al., Science 2004]

10 No known function requires this much conservation. [Figure labels: CDS, ncRNA, TFBS; * * * * * seq.; ?]

11 Discovery can be fun: 58,300 results (compared to 4 results the day before our ScienceExpress paper).

12 Ultra Conserved (UC) Elements. Any contiguous block of human-mouse-rat alignment that is identical in all three species, syntenic and ≥200 bp (p = 10^-22 of finding one such element in 2.9 Gb of neutral DNA evolving at the slowest rate). It turns out there are 481(!) such blocks, of sizes 200-779 bp (126 Kb in total), on all chromosomes but 21 and Y. * 68 (61%) are associated with alt-spliced exons.
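The definition above can be sketched as a single scan over a three-row alignment. This is an illustration under stated assumptions, not the authors' pipeline: the function name is hypothetical, the rows are assumed to be already aligned and syntenic (synteny is not checked here), and the returned coordinates are alignment columns rather than genome positions.

```python
from typing import List, Tuple


def identical_blocks(human: str, mouse: str, rat: str,
                     min_len: int = 200) -> List[Tuple[int, int]]:
    """Return half-open (start, end) column ranges where all three rows are
    identical and gap-free for at least min_len consecutive columns."""
    if not (len(human) == len(mouse) == len(rat)):
        raise ValueError("alignment rows must have equal length")
    blocks: List[Tuple[int, int]] = []
    run_start = None  # column where the current identical run began
    for i, (h, m, r) in enumerate(zip(human, mouse, rat)):
        same = h == m == r and h != "-"
        if same and run_start is None:
            run_start = i
        elif not same and run_start is not None:
            if i - run_start >= min_len:
                blocks.append((run_start, i))
            run_start = None
    # Close a run that extends to the end of the alignment.
    if run_start is not None and len(human) - run_start >= min_len:
        blocks.append((run_start, len(human)))
    return blocks
```

With the default `min_len=200` this mirrors the ≥200 bp threshold on the slide; a smaller value is convenient for toy inputs.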

13 Ultra Clusters. By joining two ultras into a cluster when they are separated by <675 Kb we obtained 89 clusters (each named after prominent gene/s). Non-exonic elements tend to congregate in clusters. Exonic elements are distributed more randomly (and tend to overlap an alt-spliced exon). [Figure legend: exonic, possibly exonic, non-exonic]
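The joining rule can be sketched as a one-pass grouping of sorted intervals. Several details are assumptions, since the slide does not specify them: distance is measured here from the end of the cluster so far to the start of the next element, isolated elements form single-member clusters, and the function name and `(chrom, start, end)` tuple layout are hypothetical.

```python
from typing import List, Tuple

Element = Tuple[str, int, int]  # (chromosome, start, end)


def cluster_by_gap(elements: List[Element],
                   max_gap: int = 675_000) -> List[List[Element]]:
    """Group elements on the same chromosome whose separation is < max_gap bp."""
    clusters: List[List[Element]] = []
    for elem in sorted(elements):  # sorts by chromosome, then start, then end
        chrom, start, _end = elem
        if clusters:
            last = clusters[-1]
            last_chrom = last[-1][0]
            last_end = max(e[2] for e in last)  # rightmost end in the cluster
            if chrom == last_chrom and start - last_end < max_gap:
                last.append(elem)
                continue
        clusters.append([elem])
    return clusters
```

Because each element is compared only with the cluster immediately before it, chains of nearby elements merge into one cluster even when the first and last members are far more than 675 Kb apart.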

14 Genomic Distribution [Figure legend: exonic, possibly exonic, non-exonic]

15 Associate Ultras with Nearby Genes. 481 UCEs identified: 111 exonic, 114 possibly exonic, 256 non-exonic (100 in introns, 156 intergenic). 93 type I genes; 225 type II genes.

16 Functional Annotation of Related Genes. [Venn diagram of ultra-related genes: Exonic 86, shared 7, Non Exonic 218.] Exonic: RNA processing & transcription regulation. Non-exonic: regulation of transcription at the DNA level. [Chart p-value labels, in source order: p = 1.3 × 10^-19, p = 4.8 × 10^-10, p = 5.6 × 10^-18, p = 1.2 × 10^-4, p = 9.1 × 10^-6, p = 4.4 × 10^-6; p = 0.39, p = 0.44, p = 0.77; p = 2.5 × 10^-20, p = 1.8 × 10^-15, p = 7.8 × 10^-15.]

17 Non Exonic Enhancers. The non-exonic ultras are often found in “gene deserts” (140 / 256 are >10 Kb from a known gene; 88 are >100 Kb away). The genes flanking these ultras are GO enriched for development (p = 10^-6), particularly early developmental tasks (p = 2-7 × 10^-5), suggesting distal enhancer roles. Indeed, uc.351 is contained in a proven enhancer of DACH, located 225 Kb upstream of it [Nobrega et al., Science 2003]. [Figure label: ultra conserved]
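Counts such as "140 / 256 are >10 Kb from a known gene" come down to a nearest-gene distance per element. A minimal sketch follows, assuming half-open `(chrom, start, end)` intervals and hypothetical function names; the gene set and distance convention actually used for the slide are not stated.

```python
from typing import Iterable, List, Optional, Tuple

Interval = Tuple[str, int, int]  # (chromosome, start, end), half-open


def interval_gap(a_start: int, a_end: int, b_start: int, b_end: int) -> int:
    """Distance in bp between two intervals; 0 if they overlap or touch."""
    return max(0, b_start - a_end, a_start - b_end)


def nearest_gene_distance(element: Interval,
                          genes: Iterable[Interval]) -> Optional[int]:
    """Smallest gap to any gene on the same chromosome, or None if there is none."""
    chrom, start, end = element
    gaps = [interval_gap(start, end, g_start, g_end)
            for g_chrom, g_start, g_end in genes if g_chrom == chrom]
    return min(gaps) if gaps else None


def count_beyond(elements: Iterable[Interval], genes: Iterable[Interval],
                 threshold: int) -> int:
    """Number of elements whose nearest gene is more than threshold bp away."""
    gene_list: List[Interval] = list(genes)
    count = 0
    for element in elements:
        distance = nearest_gene_distance(element, gene_list)
        if distance is not None and distance > threshold:
            count += 1
    return count
```

This brute-force scan is quadratic in the worst case, which is fine for a few hundred elements; a sorted index would be the natural next step for genome-scale inputs.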

18 Zoom to uc.351, 225 Kb upstream of DACH [Figure labels: ultra conserved; e.d 12.5]

19 Validating Regulatory Elements. [Figure: a transgenic construct (Conserved Element, Minimal Promoter, Reporter Gene) compared with wild type.] Where is the wild type gene expressed? Where is the reporter gene expressed?

20 Predictions and Proofs I. Based on public domain genome wide data: ultraconserved elements; one subset codes protein, a larger subset does not; generate testable hypotheses for function from existing knowledge (2004). [Pennacchio et al., Nature, 2006]

21 The most conserved elements in the genome. If one concatenates ultras that are 1-2 bp away, all 4 longest ultras (1044, 779, 731, 711 bp) lie at the 3’ end of POLA, near ARX, on chr X. The longest has 8 subs. and no indels (99.3% id) in chicken. ARX is a homeobox gene involved in CNS development; defects in the gene are linked to epilepsy, mental retardation, autism and cerebral malformations. [Figure label: ultra conserved]

22 Exonic uc’s correlate with Alt Splicing. 68 / 111 exonic ultras overlap an exon that shows clear evidence of being alternatively spliced. Of the 59 GO annotated genes containing these elements: 24 are RNA binding (p = 8.1 × 10^-18), including HNRPU, HNRPDL, HNRPH1, HNRPK, HNRPM; 16 contain the RNA recognition motif (p = 9.1 × 10^-19), including SFRS1, SFRS3, SFRS6, SFRS7, SFRS10, SFRS11. These ultras often overlap a short exon that is retained only in some tissues. One explicitly studied example is uc.33 in PTBP2 (length 312 bp), which overlaps a 34 bp exon included in the mature transcript only in the brain [Rahman et al., Genomics 2004].
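The slides do not say which test produced these enrichment p-values. One common choice for GO term enrichment is the upper tail of a hypergeometric distribution, sketched below as an assumption rather than as the authors' method. The function name is hypothetical, and the background counts in the usage comment are placeholders, not values from the lecture.

```python
from math import comb


def hypergeom_tail(population: int, annotated: int, drawn: int, hits: int) -> float:
    """P(X >= hits) when `drawn` genes are sampled without replacement from
    `population` genes, of which `annotated` carry the GO term."""
    upper = min(annotated, drawn)
    favourable = sum(
        comb(annotated, i) * comb(population - annotated, drawn - i)
        for i in range(hits, upper + 1)
    )
    return favourable / comb(population, drawn)


# Usage sketch: 59 GO annotated genes, 24 of them RNA binding (from the slide).
# The background of 15_000 genes with 500 RNA-binding ones is a made-up placeholder.
p_value = hypergeom_tail(population=15_000, annotated=500, drawn=59, hits=24)
```

The sum uses exact integer arithmetic and divides only once at the end, which avoids underflow for the very small tail probabilities typical of such tests.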

23 Predictions and Proofs II. Based on public domain genome wide data: ultraconserved elements; one subset codes protein, a larger subset does not; generate testable hypotheses for function from existing knowledge (2004). [Pennacchio et al., Nature, 2006] Post transcriptional modification.

24 Splicing Microarrays

25 Nonsense-Mediated Decay (NMD)

26 Of the 29 exonic ultraconserved elements in RNA-binding protein genes in human, 15 have human and/or mouse EST evidence suggesting the presence of AS-NMD in those regions. [Ni et al., Genes & Dev 2007]

27 Model for Homeostatic Auto/Cross-regulation [Ni et al., Genes & Dev 2007]

28 Similar Results [Lareau et al., Nature 2007]. [Figure legend: 100% conservation, associated with AS; normal splice; alternative splice; retained intron; normal stop codon; premature termination codon.]

29 Ultraconserved Non-coding RNA [Calin et al., Cancer Cell, 2007]. About 1/3 of all ultras are expressed. Some are predicted to provide microRNA targets (miRNA complementarity). A few are anti-correlated with miRNA expression levels. A few even act as oncogenes.

30 Ultras are Under Strong Human Selection [Katzman et al., Science, 2007]. [Figure: DAF (derived allele frequency) for ultras vs. nonsynonymous sites; example site: chimp A; humans A A G A.] Mutational cold spots? NO. Rare (new) mutations are introduced to the population. Fierce purifying selection? YES. Very few of these get anywhere near fixation.
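The DAF comparison rests on a simple per-site quantity: the fraction of sampled human alleles that differ from the ancestral state. A minimal sketch follows, assuming the chimp base is taken as ancestral (an outgroup-polarization assumption) and using a hypothetical function name.

```python
from typing import Iterable


def derived_allele_frequency(ancestral: str, human_alleles: Iterable[str]) -> float:
    """Fraction of sampled human alleles that differ from the ancestral base.

    Assumption: the outgroup (chimp) base is the ancestral state.
    """
    alleles = [a.upper() for a in human_alleles]
    if not alleles:
        raise ValueError("need at least one human allele")
    derived = sum(1 for a in alleles if a != ancestral.upper())
    return derived / len(alleles)


# The slide's example site: chimp A; humans A, A, G, A.
example_daf = derived_allele_frequency("A", ["A", "A", "G", "A"])  # 0.25
```

Under purifying selection most derived alleles stay rare, so the distribution of this value across sites is shifted toward low frequencies, which is the pattern the slide attributes to ultras.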

31 Relation to Human Disease [Derti et al., Nature Genetics, 2006]. [Figure labels: SHH, LMBR1, 1 Mb, Limb; Lettice et al., HMG 2003, 12: 1725-35]

32 Link to Disease Remains Elusive

33 A Vertebrate Innovation? Only 24 ultras can be partially traced back through direct sequence search to Ciona, C. elegans or Drosophila. All overlap coding exons from known genes (17 of which show clear evidence of alt-splicing, inc. EIF2C1, DDX, BCL11A, EVI1, ZFR, CLK4, HNRPH1, GRIA3). No intronic element in human was found to be coding in another species, although in some cases EST evidence indicates intron retention, presumably not as CDS. Interestingly, ribosomal DNA (not part of the draft genomes) also harbors 6 ultraconserved elements in 18S, 28S.

34 Genomic Distribution of Ultraconserved Elements [Figure legend: exonic, possibly exonic, non-exonic]

35 Computationally Driven Biology. Simplified case study. [Diagram labels: hypothesis set, generalize, survey, analyze, CS, BIO, candidates, experiment]

36 What we do understand. Ultraconserved elements exist. They are maintained via strong on-going selection. It is a heterogeneous bunch: some mediate splicing, some regulate gene expression, some express ncRNAs (categories are not necessarily mutually exclusive). Knockouts of four regulatory ultras did not lead to severe phenotypes (similar protein cases: Pbx2, Nkx6.2, Gli1).

37 What we don’t understand. Their functional density. How did they come to be? What is the selective advantage that lets them persist?

38 Broad Guess: it’s about 3-D structure. Observation: rDNA (18S, 28S) has ultraconserved stretches, under multiple constraints in a complex 3-D structure, the ribosome. ncRNA ultras: structure confers function. Splicing related ultras: the spliceosome. Cis-reg ultras: TSS 3-D proximity, chromatin and/or packed TFBS (transcription factories?).

