Genomes.

Definition: A genome is an organism's complete set of DNA: the complete set of instructions for making an organism, i.e. the master blueprints for all enzymes, cellular structures and activities. Equivalently, it is the total genetic information carried by a single set of chromosomes in a haploid nucleus.

Viral genomes: ssRNA, dsRNA, ssDNA or dsDNA; linear or circular. Viruses with RNA genomes: almost all plant viruses and some bacterial and animal viruses; these genomes are rather small (a few thousand nucleotides). Viruses with DNA genomes (e.g. lambda = 48,502 bp): often a circular genome. Replicative forms of viral genomes: all ssRNA viruses produce dsRNA molecules, and many linear DNA molecules become circular. Molecular weight and contour length: duplex length per base pair = 3.4 Å; molecular weight per base pair ≈ 660 Da.
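The two conversion factors on this slide are enough to estimate the physical size of any duplex genome. The sketch below applies them to the lambda genome quoted above; the function names are illustrative only, and the result is a back-of-the-envelope figure that assumes ideal B-form DNA.

```python
# Conversion factors quoted on the slide: 3.4 Å (0.34 nm) of duplex length
# and roughly 660 Da of mass per base pair.
RISE_PER_BP_NM = 0.34
DALTONS_PER_BP = 660


def contour_length_um(n_bp: int) -> float:
    """Contour length of a duplex of n_bp base pairs, in micrometres."""
    return n_bp * RISE_PER_BP_NM / 1000.0


def molecular_weight_da(n_bp: int) -> int:
    """Approximate molecular weight of a duplex of n_bp base pairs, in daltons."""
    return n_bp * DALTONS_PER_BP


LAMBDA_BP = 48_502  # lambda genome size from the slide
lambda_length_um = contour_length_um(LAMBDA_BP)
lambda_mw_da = molecular_weight_da(LAMBDA_BP)

print(f"lambda contour length: {lambda_length_um:.1f} um")
print(f"lambda molecular weight: {lambda_mw_da / 1e6:.1f} MDa")
```

Under these assumptions the lambda genome comes out at roughly 16.5 µm long and about 32 MDa.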

Prokaryotic genomes: Generally 1 circular chromosome (dsDNA). Usually without introns. Relatively high gene density (~2500 genes per mm of E. coli DNA). Indigenous plasmids are often present. Examples: Escherichia coli, Agrobacterium tumefaciens.

Bacterial genomes: E. coli. 4288 protein-coding genes. Average ORF: 317 amino acids. Average gene size: 1000 bp. Very compact: average distance between genes is 118 bp. Contour length of genome: 1.7 mm.
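The E. coli figures can be cross-checked against each other with simple arithmetic. The sketch below uses only numbers from the slides plus the 0.34 nm per base pair rise; the variable names are illustrative, and the "implied" genome size is a rough estimate that assumes every gene is followed by one average intergenic gap.

```python
# Figures quoted on the slides for E. coli.
GENES = 4288
AVG_GENE_BP = 1000
AVG_INTERGENIC_BP = 118
CONTOUR_MM = 1.7
RISE_PER_BP_NM = 0.34  # 3.4 Å per base pair

# Gene density along the DNA molecule.
genes_per_mm = GENES / CONTOUR_MM

# Rough genome size if each gene is followed by one average intergenic gap.
implied_genome_bp = GENES * (AVG_GENE_BP + AVG_INTERGENIC_BP)

# Contour length implied by that genome size (nm converted to mm).
implied_contour_mm = implied_genome_bp * RISE_PER_BP_NM / 1e6

print(f"gene density: {genes_per_mm:.0f} genes per mm")
print(f"implied genome size: {implied_genome_bp / 1e6:.2f} Mbp")
print(f"implied contour length: {implied_contour_mm:.2f} mm")
```

The density agrees with the ~2500 genes per mm quoted for prokaryotic genomes, and the implied contour length lands close to the 1.7 mm on the slide.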

Bacterial gene-finding: an easy problem. Dense genomes; short intergenic regions; uninterrupted ORFs; conserved signals; abundant comparative information; complete genomes.
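Because bacterial ORFs are uninterrupted, a naive scan for long open reading frames already recovers many genes. The following is a minimal sketch of that idea, not a real gene finder: it assumes a plain DNA string, looks only at the forward strand, treats ATG as the sole start codon, and ignores the conserved signals and comparative information mentioned above. The function name and the 100-codon default are illustrative choices.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq: str, min_codons: int = 100):
    """Return (start, end, frame) for forward-strand ORFs of at least min_codons.

    An ORF here runs from the first ATG in a frame to the next in-frame stop
    codon; `end` is the index just past the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # open a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                n_codons = (i - start) // 3  # codons before the stop
                if n_codons >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # look for the next ATG in this frame
    return orfs


# Toy sequence: a 6-codon ORF (ATG + 5 x GCT) followed by a TAA stop.
toy = "CC" + "ATG" + "GCT" * 5 + "TAA" + "GG"
print(find_orfs(toy, min_codons=5))
```

With the default threshold of 100 codons the toy sequence yields no hits, which mirrors the slides' point that real ORFs are usually longer than 100 amino acids.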

Plasmids: Extrachromosomal circular DNAs. Found in bacteria, yeast and other fungi. Size varies from ~3,000 to 250,000 bp. Replicate autonomously (origin of replication, ori). May contain resistance genes (e.g. β-lactamase). May be transferred from one bacterium to another. May be transferred across kingdoms. Multicopy plasmids (up to ~400 plasmids per cell). Low-copy plasmids (1-2 copies per cell). Plasmids may be incompatible with each other. Used as vectors that can carry a foreign gene of interest.

Agrobacterium tumefaciens: Characteristics. Plant parasite that causes crown gall disease. Lives in the intercellular spaces of the plant. Carries a large (~250 kbp) plasmid called the tumor-inducing (Ti) plasmid. The plasmid contains genes responsible for the disease. Wound = entry point → 10-14 days later, a tumor forms. A portion of the Ti plasmid, the T-DNA (transferred DNA), is transferred from bacterial cells to plant cells.

Agrobacterium tumefaciens: Characteristics. T-DNA integrates stably into the plant genome. The single-stranded T-DNA fragment is converted to a dsDNA fragment by the plant cell and then integrated into the plant genome. Two 25 bp direct repeats (the T-DNA borders) play an important role in the excision and integration process.

Agrobacterium tumefaciens: What is naturally encoded in T-DNA? Enzymes for auxin and cytokinin synthesis, which cause a hormone imbalance → tumor formation/undifferentiated callus (mutants in these enzymes have been characterized). Opine synthesis genes (e.g. octopine or nopaline); opines are a carbon and nitrogen source for A. tumefaciens growth. Insertion genes: virulence (vir) genes, which lie on the Ti plasmid outside the T-DNA, allow excision of the T-DNA and its integration into the plant genome.

Ti plasmid of A. tumefaciens

Auxin, cytokinin and opine synthesis genes are transferred to the plant. The plant makes all 3 compounds. Auxins and cytokinins cause gall formation. Opines provide a unique carbon/nitrogen source that only A. tumefaciens can use!

Fungal genomes: S. cerevisiae. First completely sequenced eukaryotic genome. Very compact genome: short intergenic regions, scarcity of introns, lack of repetitive sequences. Strong evidence of duplication: chromosome segments and single genes. Redundancy: non-essential genes provide selective advantage.

Eukaryotic genomes: Located on several chromosomes. Relatively low gene density (50 genes per mm of DNA in humans). Cells carry organellar genomes as well.

Human genome: 50,000 genes × 2 kbp = 100 Mbp. Introns = 300 Mbp? Regulatory regions = 300 Mbp? Remaining 2300 Mbp = ??? Only 5-10% of the human genome codes for genes; the function of the other DNA (mostly repetitive sequences) is unknown, but it might serve structural or regulatory roles.
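The slide's genome budget is simple arithmetic, sketched below. The 3000 Mbp total is an assumption: it is the figure implied by adding the slide's own numbers (100 + 300 + 300 + 2300 Mbp), and the 50,000-gene count is the slide's rough estimate rather than a measured value.

```python
# Rough estimates taken from the slide.
GENES = 50_000
MEAN_CODING_KBP = 2
INTRONS_MBP = 300
REGULATORY_MBP = 300

# Assumption: total haploid genome size implied by the slide's figures.
TOTAL_MBP = 3000

coding_mbp = GENES * MEAN_CODING_KBP / 1000  # kbp converted to Mbp
accounted_mbp = coding_mbp + INTRONS_MBP + REGULATORY_MBP
unexplained_mbp = TOTAL_MBP - accounted_mbp
coding_fraction = coding_mbp / TOTAL_MBP

print(f"coding sequence:   {coding_mbp:.0f} Mbp ({coding_fraction:.1%} of the genome)")
print(f"accounted for:     {accounted_mbp:.0f} Mbp")
print(f"function unknown:  {unexplained_mbp:.0f} Mbp")
```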

Plant genomes: A plant cell contains three genomes (nuclear, plastid and mitochondrial). The nuclear genetic information is divided among the chromosomes. Genome size is species dependent. Differences in genome size are mainly due to differing numbers of repeated sequences of various sizes arranged in tandem. The genes for ribosomal RNAs occur as repetitive sequences and, together with the genes for some transfer RNAs, are present in several thousand copies. Structural genes are present in only a few copies, sometimes just a single copy. Structural genes encoding structurally and functionally related proteins often form a gene family. The DNA in the genome is replicated during interphase, before mitosis.

Plant genomes: Arabidopsis thaliana. A weed growing at roadsides in central Europe. It has only 2 x 5 chromosomes. Its genome is just 70 Mbp (an early estimate; the sequenced genome is about 125 Mbp). It has a life cycle of only 6 weeks. It contains 25,498 structural genes from 11,000 families. The structural genes are present in only a few copies, sometimes just one. Structural genes encoding structurally and functionally related proteins often form a gene family.

Peculiarities of plant genomes: Huge genomes reaching tens of billions of base pairs. Numerous polyploid forms. Abundant (up to 99%) non-coding DNA, which seriously hinders sequencing, gene mapping and the design of gene contigs. Poor morphological, genetic and physical mapping of chromosomes. A large number of “small-chromosome” species, in which the chromosome length does not exceed 3 μm. Difficulty of chromosomal mapping of individual genes using in situ hybridization. The number of chromosomes and the DNA content of many species are still unknown.

Size of the genome in plants and in human (millions of base pairs)

                Arabidopsis thaliana   Zea mays   Vicia faba   Human
Nucleus         70                     3900       14500        2800
Plastid         0.156                  0.136      0.120        -
Mitochondrion   0.370                  0.570      0.290        0.017

Organisation of the genome into chromosomes: The nuclear genome is organized into chromosomes. Each chromosome consists essentially of one long DNA helix wound around histones to form nucleosomes. At metaphase, when the genome is relatively inactive, the chromosomes are most condensed and therefore most easily observed cytologically, counted or separated. Chromosomes provide the means by which the plant genome constituents are replicated and segregated regularly in mitosis and meiosis. Large genome segments are defined by the conserved order of their constituent genes.

Chromosome

Chromosome parts:
1. Heterochromatin: darkly staining portions of chromosomes, believed to be due to a high degree of coiling.
   a. Centromere: the “middle” of the chromosome; spindle attachment site.
   b. Telomeres: the ends of the chromosome; important for the stability of chromosome tips.
2. Euchromatin: lightly staining portion of chromosomes. It represents most of the genome and contains most of the genes.

Genome organization

Protein Coding Genes: A segment of DNA that can be transcribed and translated into an amino acid sequence.

Protein Coding Genes: Transcribed region ≈ open reading frame (ORF). An ORF that is long (usually >100 aa) or matches “known” proteins → likely a gene. Basal signals: transcription, translation. Regulatory signals: depend on organism (prokaryotes vs eukaryotes). Yeast: ~1% of genes have ORFs <100 aa.

Protein Coding Genes: Plants contain about 10,000 to 30,000 structural genes. They are present in only a few copies, sometimes just one (single-copy gene). They often form a gene family. The transcription of most structural genes is subject to very complex and specific regulation. Genes for enzymes of metabolism or protein biosynthesis, processes that proceed in all cells, are transcribed more often. Most genes are switched off and are activated only in certain organs, and then often only in certain cells. Many genes are switched on only at specific times. Housekeeping genes: the genes that every cell needs for such basic functions, independent of its specialization.

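Prokaryotic gene (figure): Promoter, Cistron 1, Cistron 2, ..., Cistron N, Terminator. Transcription by RNA polymerase yields a single mRNA (5’ to 3’) carrying cistrons 1 to N. Translation by ribosomes, tRNAs and protein factors yields one polypeptide (N-terminus to C-terminus) per cistron.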

Promoter: Region on DNA upstream from the transcription start site; the initial binding site of RNA polymerase and its initiation (sigma) factor. Promoter recognition is a prerequisite for initiation. E. coli consensus promoter regions: -35 site = TTGACA; -10 site = TATAAT (“TATA” box, also called the Pribnow box).
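A consensus promoter can be searched for with a simple mismatch-tolerant scan. The sketch below is only an illustration of the idea: the -35 hexamer comes from the slide, while the TATAAT form of the -10 hexamer and the 16 to 18 bp spacer are commonly cited values that are assumed here. The function names and the mismatch threshold are hypothetical choices, and real promoter prediction uses position weight matrices rather than exact hexamers.

```python
MINUS_35 = "TTGACA"  # -35 consensus from the slide
MINUS_10 = "TATAAT"  # assumed full -10 consensus (Pribnow box)


def mismatches(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def find_promoters(seq: str, max_mismatches: int = 2, spacers=range(16, 19)):
    """Return (pos_35, pos_10, total_mismatches) for candidate promoters.

    A candidate is a -35-like hexamer followed, after an assumed 16-18 bp
    spacer, by a -10-like hexamer, with at most max_mismatches in total.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 5):
        m35 = mismatches(seq[i:i + 6], MINUS_35)
        for gap in spacers:
            j = i + 6 + gap
            if j + 6 > len(seq):
                break  # longer spacers would run past the sequence end
            total = m35 + mismatches(seq[j:j + 6], MINUS_10)
            if total <= max_mismatches:
                hits.append((i, j, total))
    return hits


# Toy sequence with a perfect consensus promoter and a 17 bp spacer.
toy_promoter = "GG" + MINUS_35 + "A" * 17 + MINUS_10 + "CC"
print(find_promoters(toy_promoter, max_mismatches=0))
```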

Eukaryotic genes

Pseudogenes: Nonfunctional copies of genes. Formed by duplication of an ancestral gene, or by reverse transcription (and integration). Not expressed due to mutations that produce a stop codon (nonsense or frame-shift) or prevent mRNA processing, or due to a lack of regulatory sequences.

Tandemly Repeated DNA: A large number of identical repeated DNA sequences. It is spread over the entire chromosome. There is within-species variation in the number of copies in allelic arrays. Variations in the lengths of tandem repeat units have been used as a source of molecular markers. It is divided into: 1. Tandemly repeated expressed DNA. 2. Tandemly repeated non-expressed DNA.

Tandemly Repeated Genes: Genes that are duplicated and clustered at many locations in the genome. Ribosomal 18S, 5.8S, 25S and 5S RNA genes are highly reiterated in clusters at sites called nucleolus organizer regions (NORs). There is within-species variation in the number of copies in allelic arrays. Variations in the lengths of rDNA repeat units have been used as a source of molecular markers. Tandemly repeated expressed DNAs are also observed for tRNA genes and histone genes.

Tandemly Repeated Non-expressed DNA: Repetitive sequences that are not expressed but are found in huge amounts in the genome. Simple-sequence DNA. Moderately repeated DNA (mobile DNA).

Simple Sequence DNA: Very short sequences repeated many times in tandem in large clusters. Also called satellite DNA. It often lies in heterochromatin, especially in centromeres and telomeres (and elsewhere). It is divided into 2 groups: minisatellites, or variable number tandem repeats (VNTRs), and microsatellites, or simple sequence repeats (SSRs). It is used in DNA fingerprinting to identify individuals.

Mobile DNA: Units of DNA that are predisposed to move to another location within the genome, sometimes involving replication of the unit, with the help of products of genes on the element or on related elements. Mobile elements make up most of the moderately repeated DNA sequences found throughout higher eukaryotic genomes. L1 LINE is ~5% of human DNA (~50,000 copies). Alu is ~5% of human DNA (>500,000 copies). Some encode enzymes that catalyze movement. 2 types: a. Transposon; b. Retrotransposon.
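The percentages and copy numbers quoted for L1 and Alu imply a mean length per copy. The sketch below works this out under the assumption of a 3000 Mbp haploid genome, a round figure that is not stated on this slide; since the Alu copy number is given as a lower bound, its mean length is an upper bound.

```python
# Assumption: haploid human genome of about 3000 Mbp.
GENOME_BP = 3_000_000_000


def mean_element_length_bp(percent_of_genome: int, copies: int) -> float:
    """Mean length per copy for a repeat family occupying percent_of_genome."""
    return GENOME_BP * percent_of_genome / 100 / copies


l1_mean_bp = mean_element_length_bp(5, 50_000)    # slide: ~5%, ~50,000 copies
alu_mean_bp = mean_element_length_bp(5, 500_000)  # slide: ~5%, >500,000 copies

print(f"L1 mean length per copy:  ~{l1_mean_bp:.0f} bp")
print(f"Alu mean length per copy: at most ~{alu_mean_bp:.0f} bp")
```

The order-of-magnitude gap between the two results matches the "long" versus "short" interspersed element naming.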

Transposon: Chromosomal loci capable of being transposed from one spot to another within and among the chromosomes of a complement. Movement of mobile DNA involves copying of the mobile DNA element and insertion into a new site in the genome. Why? Molecular parasite: “selfish DNA”. Mobile elements have probably had a significant effect on evolution by facilitating gene duplication, which provides the fuel for evolution, and exon shuffling.

Retrotransposon (retroelement): Transposon-like segment of DNA. Resembles a retrovirus lacking the sequence encoding the structural envelope protein. Major component of plant genomes. Size ranges from 1 to 13 kb. Widely distributed over the chromosomes of many plant species. Retrovirus: a virus of higher organisms whose genome is RNA, but which can insert a DNA copy of its genome into the host chromosome.

The Repetitive DNA Content of Genomes

Tandemly repeated DNA. Microsatellite: unit size at most 5 bp, e.g. ATATATATATATATATATATATAT. Minisatellite: unit size up to 25 bp, e.g. ATTGCTGTATTGCTGTATTGCTGT.
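Tandem repeats like the two examples on this slide can be located with a back-referencing regular expression. This is a minimal sketch for exact repeats only: the function name and the default thresholds (unit of at most 5 bp, at least 4 copies) are illustrative choices, and dedicated repeat finders also handle imperfect copies.

```python
import re


def find_tandem_repeats(seq: str, max_unit: int = 5, min_copies: int = 4):
    """Return (start, unit, copies) for exact tandem repeats in seq.

    The lazy quantifier makes the regex prefer the shortest repeating unit,
    so ATATAT... is reported with unit AT rather than ATAT.
    """
    pattern = re.compile(r"([ACGT]{1,%d}?)\1{%d,}" % (max_unit, min_copies - 1))
    return [
        (m.start(), m.group(1), len(m.group(0)) // len(m.group(1)))
        for m in pattern.finditer(seq.upper())
    ]


# The two example sequences from the slide.
microsatellite = "ATATATATATATATATATATATAT"
minisatellite = "ATTGCTGTATTGCTGTATTGCTGT"

print(find_tandem_repeats("GGC" + microsatellite + "CCG"))
print(find_tandem_repeats(minisatellite, max_unit=25, min_copies=3))
```

The second call raises `max_unit` to the slide's 25 bp minisatellite limit, because the 8 bp unit would otherwise be skipped by the microsatellite default.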

Interspersed genome-wide repeats. Retrotransposons (RNA intermediate): retroviruses, LTR (long terminal repeat) retrotransposons, and non-LTR retrotransposons, i.e. LINEs (long interspersed elements) and SINEs (short interspersed elements). Transposons (DNA intermediate).

Classification of transposable elements

Retrotransposon

Structural features of transposable elements (figure):
SINEs (80-630 bp): poly(A) tail (AAAAA)
LINEs (6-8 kbp): Orf1, pol, poly(A) tail (AAAAA)
Retrotransposon (4-8 kbp): gag, pol, flanked by LTRs
Endogenous retrovirus (4-8 kbp): gag, pol, env, flanked by LTRs

Life cycle of LTR retrotransposons IN, integrase; PR, protease; RT, reverse transcriptase; VLP, virus-like particle

Eukaryotic cells Peter J. Russell, iGenetics: Copyright © Pearson Education, Inc., publishing as Benjamin Cummings.

Mitochondrial genome (mtDNA): The number of mitochondria per plant cell can be between 50 and 2000. One mitochondrion contains 1 to 100 genome copies (multiple identical circular chromosomes), comprising one large and several smaller molecules. Size: ~15 kb in animals; ~200 kb to 2,500 kb in plants. mtDNA is replicated before or during mitosis. Transcription of mtDNA can yield an mRNA that does not contain the correct information for the protein to be synthesized; RNA editing exists in plant mitochondria. Over 95% of mitochondrial proteins are encoded in the nuclear genome. Often A+T-rich genomes.

Chloroplast genome (ctDNA): Multiple circular molecules, similar to those of prokaryotic cyanobacteria although much smaller (0.001-0.1% of the size of nuclear genomes). Cells contain many plastids, and each plastid contains many genome copies. Size ranges from 120 kb to 160 kb. The plastid genome has changed very little during evolution: even when two plants are very distantly related, their plastid genomes are rather similar in gene composition and arrangement. Some plastid genomes contain introns. Many chloroplast proteins are encoded in the nucleus (with a separate signal sequence).

“Cellular” genomes (figure). Viruses: viral genome inside the capsid. Prokaryotes: bacterial chromosome, plasmids. Eukaryotes: chromosomes (nuclear genome, in the nucleus), mitochondrial genome, chloroplast genome. Genome: all of an organism’s genes plus intergenic DNA. Intergenic DNA = DNA between genes.

Estimated genome sizes (figure): size ranges for mammals, plants, fungi, bacteria (>100), mitochondria (~100) and viruses (1024), plotted on a logarithmic axis from 1e1 to 1e12 nucleotides. Number in ( ) = completely sequenced genomes.

What Did These Individuals Contribute to Molecular Genetics? Anton van Leeuwenhoek: observed single cells under the microscope (bacteria, protists, red blood cells).

What Did These Individuals Contribute to Molecular Genetics? Gregor Johann Mendel: discovered the basic laws of inheritance.

What Did These Individuals Contribute to Molecular Genetics? Walter Sutton: proposed that chromosomes carry the units of heredity.

What Did These Individuals Contribute to Molecular Genetics? Thomas Hunt Morgan Discovered how genes are transmitted through chromosomes

What Did These Individuals Contribute to Molecular Genetics? Rosalind Elsie Franklin Research led to the discovery of the double helix structure of DNA

What Did These Individuals Contribute to Molecular Genetics? James Watson and Francis Crick: determined the double helix structure of DNA.

DNA’s History
1866  Gregor Mendel: laws of heredity
1900  Carl Correns, Hugo de Vries & Erich von Tschermak: rediscovery of Mendel’s laws
1944  Avery, MacLeod & McCarty: genes consist of DNA
1952  Hershey and Chase: DNA as the genetic material
1953  Watson & Crick: DNA double helix
1971  Cohen & Boyer: transformation technology
1972  Berg: recombinant DNA technology
1973  Arber, Smith & Nathans: restriction enzymes

Gene: The hereditary determinant of a specified difference between individuals. The unit of heredity. The unit that is passed from generation to generation following simple Mendelian inheritance. A segment of DNA that encodes a protein. Any of the units occurring at specific points on the chromosomes by which hereditary characters are transmitted and determined; each is regarded as a particular state of organization of the chromatin in the chromosome, consisting primarily of DNA and protein.

Gene classification (figure): a simplified chromosome carries coding genes, non-coding genes and intergenic regions. Coding genes → messenger RNA → proteins (structural proteins, enzymes). Non-coding genes → structural RNA (transfer RNA, ribosomal RNA, other RNA).

Gene: Molecular definition: DNA sequence encoding protein. What are the problems with this definition?

Gene Some genomes are RNA instead of DNA Some gene products are RNA (t-RNA, r-RNA, and others) instead of protein Some nucleic acid sequences that do not encode gene products (non-coding regions) are necessary for production of the gene product (RNA or protein)

Coding region Nucleotides (open reading frame) encoding the amino acid sequence of a protein The molecular definition of gene includes more than just the coding region

Noncoding regions of eukaryotic gene Regulatory regions RNA polymerase binding site Transcription factor binding sites Introns Polyadenylation [poly(A)] sites

Gene: Molecular definition: the entire nucleic acid sequence necessary for the synthesis of a functional polypeptide (protein chain) or functional RNA.

Bacterial genes: Most do not have introns. Many are organized in operons: contiguous genes, transcribed as a single polycistronic mRNA, that encode proteins with related functions. A polycistronic mRNA encodes several proteins.

Bacterial operon

Eukaryotic genes Most have introns Produce mono-cistronic mRNA (only one encoded protein) Large

Eukaryotic genes: hemoglobin beta subunit gene. Exon 1: 90 bp; Intron A: 131 bp; Exon 2: 222 bp; Intron B: 851 bp; Exon 3: 126 bp. Introns: intervening sequences within a gene that are not translated into a protein sequence. Exons: sequences within a gene that encode protein sequences. Splicing: removal of introns from the pre-mRNA molecule.
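The exon and intron lengths on this slide are enough to work through splicing as a coordinate calculation. The sketch below uses a toy transcript made of placeholder letters rather than the real beta-globin sequence, and the function names are illustrative; it treats the listed segments as the whole transcript, which ignores anything outside the regions shown on the slide.

```python
# Segment lengths (bp) in transcript order, as listed on the slide.
SEGMENTS = [
    ("exon", 90),
    ("intron", 131),
    ("exon", 222),
    ("intron", 851),
    ("exon", 126),
]


def exon_coordinates(segments):
    """Return ([(start, end), ...] for exons, total transcript length)."""
    coords, pos = [], 0
    for kind, length in segments:
        if kind == "exon":
            coords.append((pos, pos + length))
        pos += length
    return coords, pos


def splice(pre_mrna: str, exon_coords) -> str:
    """Join the exon slices of a primary transcript, dropping the introns."""
    return "".join(pre_mrna[start:end] for start, end in exon_coords)


exon_coords, primary_length = exon_coordinates(SEGMENTS)

# Toy transcript: 'E' marks exon positions, 'i' marks intron positions.
toy_transcript = "".join(
    ("E" if kind == "exon" else "i") * length for kind, length in SEGMENTS
)
mature = splice(toy_transcript, exon_coords)

print(f"primary transcript: {primary_length} bp")
print(f"spliced exons:      {len(mature)} bp = {len(mature) // 3} codons")
```

With these numbers, introns account for well over half of the primary transcript.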

Alternative splicing: the primary transcript from some genes can be spliced into two or more different mRNAs.

Number of genes in eukaryotes

Species                         # genes
S. cerevisiae (yeast)           ~6700
D. melanogaster (fruit fly)     ~14400
C. elegans (roundworm)          ~20600
A. thaliana (mustard weed)      ~25000
H. sapiens (human)              ~24000
P. troglodytes (chimpanzee)     ~22500
M. musculus (mouse)             ~27000
R. norvegicus (rat)             ~23400
C. familiaris (dog)             ~20400

Top Ten Terms in Molecular Genetics
Amino acids: the 20 building blocks of proteins, each coded for by one or more specific three-nucleotide codons.
Allele: one of the alternative forms of a specific gene; a diploid individual carries two copies of each gene.
Polymorphism: a DNA sequence variation that is found in more than 1% of the population. Most commonly, these are single nucleotide polymorphisms (SNPs).
Transcription: the synthesis of an RNA copy from a sequence of DNA; the first step in gene expression.
Translation: the synthesis of proteins from amino acids, directed by mRNA.

Top Ten Terms in Molecular Genetics (continued)
Gene: specific sequence of nucleotide bases that carries information for constructing proteins; exons are the regions that actually encode the protein.
Chromosome: physically separate microscopic units of DNA that comprise the genome.
Genetics: the study of the patterns of inheritance of specific traits.
Genomics: the study of an organism’s entire complement of genetic material and its function.
Proteomics: the study of an organism’s entire protein material, its structure and function.