BS222 – Genome Science Lecture 3 What is a gene Dr. Vladimir B. Teif.

1 BS222 – Genome Science Lecture 3 What is a gene Dr. Vladimir B. Teif

2 Module structure
Genomes, sequencing projects and genomic databases (VT)
Sequencing technologies (VT)
What is a gene (VT)
Epigenetics overview (PVW)
Transcription regulation (VT)
3D chromatin organisation (VT)
DNA methylation and other DNA modifications (VT)
NGS applications I: Experiments and basic analysis (VT)
NGS applications II: Data integration (VT)
Comparative genomics (JP, guest lecture)
SNPs, CNVs, population genomics (LS, guest lecture)
Histone modifications (PVW)
Non-coding RNAs (PVW)
Genome Stability (PVW)
Transcriptomics (PVW)
Year's best paper (PVW)
Revision lecture (all lecturers; spring term)

3 WHAT IS A GENE?
A UNIT OF HEREDITY TRANSFERRED TO OFFSPRING...
A DNA SEQUENCE THAT ENCODES FUNCTION...
THE BASIC PHYSICAL/FUNCTIONAL UNIT OF HEREDITY…
Watch this *oversimplified* video:
While watching, try to formulate the definition of the gene

4 “PRE-HISTORIC” GENE DEFINITIONS
Gerstein et al., Genome Res :

5 The central dogma
DNA (genome) {A,C,G,T} to RNA {A,C,G,U}: 1 to 1 mapping
RNA {A,C,G,U} to protein {20 letters}: see next page
Adapted from Gill Bejerano,

6 The genetic code T = Adapted from Gill Bejerano,
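
The two mappings from the central dogma slides can be sketched in a few lines of Python. This is an illustrative sketch rather than lecture material: the function names `transcribe` and `translate` are hypothetical, the table is the standard genetic code, and translation is assumed to start at the first base and stop at the first stop codon.

```python
from itertools import product

# Standard genetic code, codons enumerated in U, C, A, G order ("*" = stop).
BASES = "UCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_strand: str) -> str:
    """DNA -> RNA: a 1-to-1 letter mapping (T becomes U)."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """RNA -> protein: read triplets and stop at the first stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


mrna = transcribe("ATGGCCTAA")
print(mrna, translate(mrna))
```

The first step maps letters one to one, while the second maps 64 codons onto 20 amino acids plus stop, so several codons share the same amino acid.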

7 Genes can be on both strands
“plus strand” “minus strand” Adapted from Gill Bejerano,
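
A gene on the minus strand is read from the reverse complement of the plus-strand sequence. A minimal sketch, assuming the sequence contains only the letters A, C, G and T; the helper name `reverse_complement` is hypothetical:

```python
# Complement each base, then reverse, so the result reads 5' to 3'
# along the minus strand.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(plus_strand: str) -> str:
    return plus_strand.upper().translate(COMPLEMENT)[::-1]


# A minus-strand gene appears "backwards" on the plus strand:
print(reverse_complement("TTAGGCCAT"))
```

In this toy input the plus strand shows TTAGGCCAT, while the minus strand reads ATGGCCTAA, which begins with a start codon and ends with a stop codon.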

8 THE CENTRAL DOGMA CORRECTED
Adapted from

9 Gene structure
UTR = Untranslated Region
CDS = Coding Sequence
Adapted from Gill Bejerano,

10 EXAMPLE: β-globin gene

11 ALTERNATIVE SPLICING
In alternative splicing, particular exons of a gene may be included in or excluded from the final messenger RNA (mRNA) produced from that gene.
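
Exon inclusion and exclusion can be illustrated by enumerating the mRNAs that a toy gene could produce. This is a sketch under simplifying assumptions: the exon names and sequences are invented, every optional exon is treated as an independent cassette exon, and `splice_isoforms` is a hypothetical helper.

```python
from itertools import product


def splice_isoforms(exons, optional):
    """Return {isoform label: mRNA sequence} for every include/exclude choice.

    exons: ordered list of (name, sequence) pairs.
    optional: set of exon names that may be skipped (cassette exons).
    """
    optional_names = [name for name, _ in exons if name in optional]
    isoforms = {}
    for choice in product([True, False], repeat=len(optional_names)):
        included = dict(zip(optional_names, choice))
        # Constitutive exons are always kept; optional ones follow `choice`.
        kept = [(n, s) for n, s in exons if included.get(n, True)]
        label = "-".join(n for n, _ in kept)
        isoforms[label] = "".join(s for _, s in kept)
    return isoforms


toy_gene = [("E1", "AUGGCC"), ("E2", "GGA"), ("E3", "UAA")]
for label, mrna in splice_isoforms(toy_gene, {"E2"}).items():
    print(label, mrna)
```

With n independent cassette exons this yields 2^n candidate isoforms; real genes usually express only a subset of these.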

13 EXAMPLES OF ALTERNATIVE SPLICING
(A) Alternative splicing producing variant proteins. Alternative splicing results in the variable presence of a 17 amino acid (17aa) peptide near the middle of the WT1 Wilms tumor protein and of a Lys-Thr-Ser tripeptide (KTS) between the third and fourth zinc finger (ZF) domains. Four different isoforms exist for the human ERBB4 protein. Just before the transmembrane (TM) domain there is the alternative presence of a 23-amino-acid peptide or a 13-amino-acid peptide (JM-a and JM-b isoforms, respectively). And within the tyrosine kinase (TK) domain is the variable presence of a 16-amino-acid peptide that has a binding site for phosphatidylinositol-3-kinase (CYT-1 isoforms have the peptide; CYT-2 isoforms lack it).

14 Some diagrams of alternative splicing mechanisms

15 OVERLAPPING READING FRAMES
Alternative splicing of the CDKN2A gene produces two entirely different tumour suppressor proteins, p16-INK4A and p14-ARF, which work in cell cycle control. Exon 2, the one exon with coding sequence for both proteins, is translated in different reading frames.
This gene generates several transcript variants which differ in their first exons. At least three alternatively spliced variants encoding distinct proteins have been reported, two of which encode structurally related isoforms known to function as inhibitors of CDK4 kinase. The remaining transcript includes an alternate first exon located 20 kb upstream of the remainder of the gene; this transcript contains an alternate open reading frame (ARF) that specifies a protein which is structurally unrelated to the products of the other variants. This ARF product functions as a stabilizer of the tumor suppressor protein p53 as it can interact with, and sequester, the E3 ubiquitin-protein ligase MDM2, a protein responsible for the degradation of p53. In spite of the structural and functional differences, the CDK inhibitor isoforms and the ARF product encoded by this gene, through the regulatory roles of CDK4 and p53 in cell cycle G1 progression, share a common functionality in cell cycle G1 control. This gene is frequently mutated or deleted in a wide variety of tumors, and is known to be an important tumor suppressor gene. (provided by RefSeq, Sep 2012)
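
The effect of reading one stretch of DNA in different frames can be shown on a toy sequence. The sequence below is invented for illustration and is not CDKN2A; `translate_frame` is a hypothetical helper that uses the standard genetic code and writes stop codons as "*".

```python
from itertools import product

# Standard genetic code on the DNA alphabet, codons in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_frame(dna: str, frame: int) -> str:
    """Translate every complete codon starting at offset `frame` (0, 1 or 2)."""
    seq = dna.upper()[frame:]
    return "".join(
        CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3)
    )


toy_exon = "ATGGCCATTGTAATGGGCCGC"
for frame in range(3):
    print(frame, translate_frame(toy_exon, frame))
```

Shifting the start by a single base changes every codon, which is why two proteins read from the same exon in different frames can be structurally unrelated.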

16 RNA EDITING

17 Copy number variation (CNV)
The number of copies of a particular gene can vary from one individual to another.

18 COPY NUMBER VARIATION IN CANCER
Exome sequencing of osteosarcoma Nature Communications 6, 8940 (2015)

19 Pseudogenes
Although usually not fully functional, pseudogenes may sometimes be functional.
Pseudogenes may be included in annotations, but are marked as “pseudogenes”. Compared with their functional gene relatives, pseudogenes have lost at least some of their ability to be expressed in the cell or to code for protein. Pseudogenes often result from the accumulation of mutations.

20 Retrogenes
Retrogenes count as genes in annotations

21 MOBILE GENETIC ELEMENTS
Mobile genetic elements can move around within a genome, or transfer from one species to another.
Barbara McClintock (1902 – 1992) discovered mobile genetic elements in 1948, but scientists at that time did not believe her. After 1953 she stopped publishing on this topic (to avoid alienating colleagues). She was awarded the Nobel Prize in 1983.

22 McClintock’s experiments
To learn more details about McClintock’s experiments, watch this video at home:
Image: McClintock's microscope and ears of corn, National Museum of Natural History, USA

23 TRANSPOSABLE ELEMENTS (TEs)
Watch this video that explains TEs (3min)
Class I: "copy and paste" retrotransposons
Class II: "cut and paste" DNA transposons
Up to 50% of the human genome is formed by mobile elements

24 Protein splicing Liu, 2000, Annu. Rev. Genet. 34, 61-76

25 Protein splicing
Known since 1990
Less common than RNA splicing
Found in all kingdoms of life
Liu, 2000, Annu. Rev. Genet. 34, 61-76

26 REFINING THE CONCEPT OF A GENE
RNA splicing
Overlapping reading frames
Regulatory elements
Copy-number variants
Intronic genes
RNA editing
Mobile elements
NOT pseudogenes
Retrogenes
Protein splicing
Gerstein et al., Genome Res :

27 REFINING THE CONCEPT OF A GENE
“The gene is a union of genomic sequences encoding a coherent set of potentially overlapping functional products” Gerstein et al., Genome Res. 2007, 17,

28 HOW MANY GENES ARE HERE? Suggest your answer!

29 HOW MANY GENES ARE THERE IN THE HUMAN GENOME?

30 “For the purposes of our study, genes will include any interval along the chromosomal DNA that is transcribed and then translated into a functional protein, or that is transcribed into a functional RNA molecule. By “functional” we mean to include any gene that appears to perform a biological function, even one that might not be essential. Our definition intentionally excludes pseudogenes… When multiple proteins or RNAs are produced from the same region through alternative splicing or alternative transcription initiation, we will count these variants as part of a single gene. Our total gene count, therefore, corresponds to the total number of distinct chromosomal intervals, or loci, that encode either proteins or noncoding RNAs”
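
The counting rule in this quotation, where transcript variants from the same region count as one gene, can be sketched as interval merging. This is an assumption-laden illustration and not the procedure used in the quoted study: the coordinates are invented, intervals are half-open, transcripts are grouped by chromosome and strand, and `count_loci` is a hypothetical helper.

```python
from collections import defaultdict


def count_loci(transcripts):
    """Count distinct loci among (chrom, strand, start, end) transcripts.

    Transcripts that overlap on the same chromosome and strand are merged
    into one locus (assumption: any overlap means "same gene").
    """
    by_strand = defaultdict(list)
    for chrom, strand, start, end in transcripts:
        by_strand[(chrom, strand)].append((start, end))

    loci = 0
    for intervals in by_strand.values():
        intervals.sort()
        current_end = None
        for start, end in intervals:
            if current_end is None or start >= current_end:
                loci += 1                 # gap before this transcript: new locus
                current_end = end
            else:
                current_end = max(current_end, end)  # extend the current locus
    return loci


toy_transcripts = [
    ("chr1", "+", 100, 500),    # two splice variants of one gene
    ("chr1", "+", 300, 900),
    ("chr1", "-", 400, 600),    # opposite strand: separate locus
    ("chr1", "+", 1000, 1200),
    ("chr2", "+", 100, 200),
]
print(count_loci(toy_transcripts))
```

Under this rule the number of loci can be far smaller than the number of transcripts, which is the gap the quotation describes.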

31 “we observed over 30 million distinct transcripts in approximately 700,000 distinct genomic locations, of which only about 40,000 (5%) appear to represent functional gene loci”

32 = miscellaneous RNA Pertea et al., bioRxiv 332825

33 GENES IN THE GENOME BROWSER
https://genome. ucsc
Gene direction is shown by >>>> or <<<<<

34 GENES IN THE GENOME BROWSER
https://genome. ucsc
Gene direction is shown by >>>> or <<<<<

35 Prokaryotes vs eukaryotes
Adapted from

36 Prokaryotes vs eukaryotes

37 In prokaryotes the number of genes is ~proportional to the genome size
Is it also true for eukaryotes?

39 Prokaryotes vs eukaryotes
Prokaryotes:
85-88% of the genome in coding regions
Usually no introns
Organised in polycistronic transcriptional units (operons)
In total genes
Well-defined promoters
Eukaryotes:
Just 2-4% of the genome in protein coding regions
Excessive intron use and much longer genes
Some genes are organised in clusters, but this is not very typical
~ protein genes
Complex regulatory regions

41 Prokaryotes vs eukaryotes
Homo sapiens (human)
Saccharomyces cerevisiae (yeast)
Drosophila melanogaster (fruit fly)
Maize (plant)
Escherichia coli (bacteria)
Legend: Gene, Intron, Pseudogene, Repetitive elements
Adapted from

42 TAKE HOME MESSAGE
What is a gene? What looks like a gene but is not a gene? How many genes do we have?
Corrected central dogma; pro-, eukaryotic gene structure
definition of the gene
genome size, gene density
MUST KNOW: PROMOTER, TRANSCRIPTION START SITE, OPEN READING FRAME, ALTERNATIVE PROMOTER, TRANSCRIPT, PSEUDOGENE
