Genome organization Eukaryotic genomes are complex and DNA amounts and organization vary widely between species.


1 Genome organization
Eukaryotic genomes are complex, and DNA amounts and organization vary widely between species.

2 Genome Organization

3 C-value paradox: The amount of DNA in the haploid genome of an organism is not related to its evolutionary complexity or number of genes.

4 Highly Repeated Sequences

7 There are different classes of eukaryotic DNA based on sequence complexity.

8 Amount of DNA in a Genome Does Not Correlate with Complexity
[Figure; axis: base pairs]

9 How many genes do humans have?
Original estimate was between 50,000 and 100,000 genes
We now think humans have ~20,000 genes
How does this compare to other organisms?
Mice have ~30,000 genes
Pufferfish have ~35,000
Nematodes (C. elegans) have ~19,000
Yeast (S. cerevisiae) has ~6,000
The microbe responsible for tuberculosis has ~4,000
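To make the comparison concrete, here is a minimal sketch that pairs the gene counts quoted on this slide with approximate haploid genome sizes. The genome sizes are assumptions added for illustration (rounded textbook values, not taken from the slides), and the function names are hypothetical.

```python
# Gene counts: approximate figures as quoted on the slide.
GENE_COUNTS = {
    "human": 20_000,
    "mouse": 30_000,
    "pufferfish": 35_000,
    "nematode": 19_000,
    "yeast": 6_000,
    "tuberculosis bacterium": 4_000,
}

# ASSUMPTION: rounded haploid genome sizes in megabases (not from the slides).
GENOME_SIZE_MB = {
    "human": 3200,
    "mouse": 2700,
    "pufferfish": 390,
    "nematode": 100,
    "yeast": 12,
    "tuberculosis bacterium": 4.4,
}


def genes_per_mb(organism):
    """Return the gene density (genes per megabase) for one organism."""
    return GENE_COUNTS[organism] / GENOME_SIZE_MB[organism]


def rank_by(values):
    """Return organism names sorted from largest to smallest value."""
    return sorted(values, key=values.get, reverse=True)


if __name__ == "__main__":
    for name in rank_by(GENOME_SIZE_MB):
        print(f"{name:24s} {GENOME_SIZE_MB[name]:>7} Mb  "
              f"{GENE_COUNTS[name]:>6} genes  {genes_per_mb(name):8.1f} genes/Mb")
```

Under these assumed sizes, ranking the organisms by genome size gives a different order from ranking them by gene count, which is the point the slide is making.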

11 Single Copy Sequences Exome

12 Even the Amount of DNA a Gene Spans Differs Among Species

13 Problems?
Some gene products are RNA (tRNA, rRNA, others) instead of protein
Some nucleic acid sequences that do not encode gene products (noncoding regions) are necessary for production of the gene product (protein or RNA)
Eukaryotic genes are complex!

14 Gene Identification
Open reading frames
Sequence conservation: database searches, synteny
Sequence features: CpG islands
Evidence for transcription: ESTs, microarrays
Gene inactivation: transformation, RNAi
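As an illustration of the "sequence features" approach, the sketch below scores a sequence for CpG-island character. The thresholds (length of at least 200 bp, GC content above 50%, observed/expected CpG ratio above 0.6) are commonly used criteria and are an assumption here, not something stated on the slide; the function names are hypothetical.

```python
def cpg_stats(seq):
    """Return (GC fraction, observed/expected CpG ratio) for a DNA string."""
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        return 0.0, 0.0
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # "CG" cannot overlap itself, so count() is exact
    gc_fraction = (c + g) / n
    # Expected CpG count if C and G were placed independently: c * g / n
    obs_exp = (cpg * n) / (c * g) if c and g else 0.0
    return gc_fraction, obs_exp


def looks_like_cpg_island(seq, min_len=200, min_gc=0.5, min_obs_exp=0.6):
    """Apply the assumed thresholds to a single candidate window."""
    gc_fraction, obs_exp = cpg_stats(seq)
    return len(seq) >= min_len and gc_fraction > min_gc and obs_exp > min_obs_exp
```

A real scan would slide this test along a chromosome in windows; the version here only evaluates one window at a time.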

15 Unique genes

16 Noncoding regions
Regulatory regions: RNA polymerase binding site, transcription factor binding sites
Introns
Polyadenylation [poly(A)] sites

17 Splice Sites
Eukaryotes only
Removal of internal parts of the newly transcribed RNA
Takes place in the cell nucleus
Splice sites are difficult to predict
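One way to see why splice sites are hard to predict: most introns begin with GT and end with AG, but those dinucleotides occur by chance all over a sequence. The sketch below enumerates every GT...AG pair as a candidate intron. It is an illustration only; the function names and the `min_len` cutoff are hypothetical, and real predictors use much richer signals.

```python
import re


def candidate_sites(seq):
    """Return (donor positions, acceptor positions) for GT and AG dinucleotides."""
    seq = seq.upper()
    donors = [m.start() for m in re.finditer("(?=GT)", seq)]
    acceptors = [m.start() for m in re.finditer("(?=AG)", seq)]
    return donors, acceptors


def candidate_introns(seq, min_len=4):
    """Return (start, end) slices that begin with GT and end with AG.

    min_len is an illustrative cutoff (ASSUMPTION), far shorter than real introns.
    """
    donors, acceptors = candidate_sites(seq)
    return [
        (d, a + 2)               # end is exclusive, so seq[d:a + 2] ends in AG
        for d in donors
        for a in acceptors
        if a + 2 - d >= min_len
    ]
```

Even a 14-base toy sequence such as `"AAGTCCAGTTAGGG"` yields several candidates, and the number grows roughly with the square of the sequence length.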

18 One gene, many proteins via alternative splicing, 3’ cleavage and polyadenylation

20 Exon Shuffling

21 Trans-Splicing in Higher Eukaryotes
Gingeras, Nature (2009) 461

22 Non-contiguous Transcription Generates an Enormous Number of Possible Transcripts
• Trans-splicing exists in higher eukaryotes as well as in lower ones such as trypanosomes
• Six 2-exon co-linear combinations from four exons
• Blue: only co-linear; Red: all combinations
• 325 combinations of 3-exons, non-colinear
• Reassortment of exons coding for ncRNA or protein domains could dramatically increase the number of functional products beyond the number of ‘genes’
Gingeras, Nature (2009) 461
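The "six 2-exon co-linear combinations from four exons" figure can be checked by counting. The sketch below assumes that co-linear means the exons keep their genomic order, and that "all combinations" means any order is allowed; the second reading is an assumption about the figure. It does not attempt to reproduce the 325 figure, which depends on details of the cited figure that the transcript does not give. Function names are hypothetical.

```python
from itertools import combinations, permutations


def colinear_combinations(exons, k):
    """k-exon products that preserve genomic order (ordinary cis-splicing)."""
    return list(combinations(exons, k))


def any_order_combinations(exons, k):
    """k-exon products in any order (ASSUMED reading of 'all combinations')."""
    return list(permutations(exons, k))


if __name__ == "__main__":
    exons = ["E1", "E2", "E3", "E4"]
    print(len(colinear_combinations(exons, 2)))   # order-preserving pairs
    print(len(any_order_combinations(exons, 2)))  # pairs in either order
```

Allowing any order doubles the number of 2-exon products from four exons, and the gap widens quickly as more exons are joined.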

23 Why genome size isn’t the only concern (size doesn’t matter?)
More sophisticated regulation of expression?
Proteome vastly larger than genome? Alternative splicing, RNA editing
Post-translational modifications?
Cellular location?
Moonlighting

24 Gene families
E.g. globins, actin, myosin
Clustered or dispersed
Pseudogenes

26 Pseudogenes
Nonfunctional copies of genes
Formed by duplication of an ancestral gene, or by reverse transcription (and integration)
Not expressed due to mutations that produce a stop codon (nonsense or frameshift) or prevent mRNA processing, or due to lack of regulatory sequences
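The stop-codon route to a pseudogene can be sketched in a few lines: read a coding sequence in frame and report whether a stop codon appears before the final codon. The toy sequences in the usage note are invented for illustration, and the function names are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_stop_codon_index(cds):
    """Return the codon index of the first in-frame stop, or None if absent."""
    cds = cds.upper()
    for i in range(0, len(cds) - 2, 3):
        if cds[i:i + 3] in STOP_CODONS:
            return i // 3
    return None


def has_premature_stop(cds):
    """True if an in-frame stop codon occurs before the last complete codon."""
    index = first_stop_codon_index(cds)
    complete_codons = len(cds) // 3
    return index is not None and index < complete_codons - 1
```

For example, `"ATGCTGACCGGATAA"` reads ATG CTG ACC GGA TAA and passes. Changing GGA to TGA (a nonsense mutation) or deleting the fourth base so the frame becomes ATG TGA ... (a frameshift) both make `has_premature_stop` return `True`.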

27 Duplicated genes
Encode closely related (homologous) proteins
Formed by duplication of an ancestral gene followed by mutation
Five functional genes and two pseudogenes

28 Paralogs vs Orthologs
Different members of the globin gene family are paralogs, having evolved one from another through gene duplication. Paralogs are separated by a gene duplication event.
Each specific gene family member (e.g. a specific gene in human) is an ortholog of the same family member in another species (e.g. mouse). Both evolved from an ancestral globin gene. Orthologs are separated by a speciation event.
It is not always easy to distinguish true orthologs from paralogs, especially in polyploid organisms!

29 Protein-coding sequences make up less than 1.5% of the human genome!

30 Noncoding RNAs (ncRNA)
Do not have translated ORFs
Small
Not polyadenylated

34 Functions of Known lncRNAs
• Transcriptional interference - lncRNA transcription turns off transcription of a nearby gene
• Initiation of chromatin remodeling - lncRNA transcription turns on transcription of a nearby gene
• Promoter inactivation - lncRNA binds to TFIIB and to promoter DNA
• Activation of an accessory protein - lncRNA binds to allosteric effector protein TLS and inhibits histone acetyltransferase, decreasing transcription
Ponting et al, Cell (2009) 136

35 Functions of Known lncRNAs
• Activation of transcription factors - binding of lncRNA to Dlx2 activates Dlx5/6 activity
• Oligomerization of an accessory protein - lncRNA induces heat shock factor trimerization
• Transport of transcription factors - lncRNA NRON keeps NFAT out of the nucleus
• Epigenetic silencing of gene clusters - Xist RNA inactivates the X chromosome
• Epigenetic repression of genes in trans - HOTAIR binds PRC2, leading to methylation and silencing of several genes in the HOXD locus
Ponting et al, Cell (2009) 136

36 ncRNA
~97-98% of the transcriptional output of the human genome is ncRNA
Introns
Transfer RNAs (tRNA): ~500 tRNA genes in the human genome
Ribosomal RNAs: tandem arrays on several chromosomes
copies of 28S – 5.8S – 18S cluster
copies of 5S cluster

38 Genome Organization - ncRNA
The level of transcription from human chromosomes 21 and 22 is an order of magnitude higher than can be accounted for by known or predicted exons
Almost half of all transcripts from well-constructed mouse cDNA libraries are ncRNAs (identified because they do not contain an open reading frame longer than 100 codons)
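The 100-codon criterion mentioned above can be sketched as a longest-ORF test. This version assumes an ORF runs from an ATG to the next in-frame stop codon and searches all six reading frames; both choices, and the function names, are assumptions for illustration rather than the method used in the cited studies.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def longest_orf_codons(seq):
    """Length in codons (start included, stop excluded) of the longest ORF."""
    best = 0
    for strand in (seq.upper(), reverse_complement(seq)):
        for frame in range(3):
            start = None
            for i in range(frame, len(strand) - 2, 3):
                codon = strand[i:i + 3]
                if start is None and codon == "ATG":
                    start = i                      # open an ORF at the first ATG
                elif start is not None and codon in STOP_CODONS:
                    best = max(best, (i - start) // 3)
                    start = None                   # close it at the stop codon
    return best


def looks_noncoding(seq, min_codons=100):
    """True if no ORF longer than min_codons is found (the slide's cutoff)."""
    return longest_orf_codons(seq) <= min_codons
```

A cutoff like this is a heuristic: short proteins are missed, and long noncoding transcripts can contain a long ORF by chance.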

39 Repeat sequences – 50% or more of the genome

40 Repetitive DNA
Moderately repeated DNA
  Tandemly repeated rRNA, tRNA and histone genes (gene products needed in high amounts)
  Large duplicated gene families
  Mobile DNA
Simple-sequence DNA
  Tandemly repeated short sequences
  Found in centromeres and telomeres (and others)
  Used in DNA fingerprinting to identify individuals

41 Segmental duplications
Found especially around centromeres and telomeres
Often come from nonhomologous chromosomes
Many can come from the same source
Tend to be large (10 to 50 kb)
Unique to humans?

42 Repetitive DNA - Segmental duplications

43 Mobile DNA
Moves within genomes
Most of the moderately repeated DNA sequences found throughout higher eukaryotic genomes
L1 LINE is ~5% of human DNA (~50,000 copies)
Alu is ~5% of human DNA (>500,000 copies)
Some encode enzymes that catalyze movement

45 Repetitive DNA – Highly repetitive satellite DNA
