DNA and Genome sequencing


Genome: The hereditary information of an organism is encoded in its DNA and enclosed in a cell (unless it is a virus). All the information contained in the DNA of a single organism is its genome. A DNA molecule can be thought of as a very long sequence of nucleotides or bases over the alphabet Σ = {A, T, C, G}.

Nucleic Acid Basics: Nucleic acids are polymers. Each monomer, a nucleotide, consists of three moieties: a base + a ribose sugar (deoxyribose in DNA) + a phosphate. A base joined to the sugar alone, without the phosphate, is a nucleoside. A base can be one of five rings:

Pyrimidines (cytosine, thymine, uracil) and purines (adenine, guanine). Pyrimidines and purines can base-pair (Watson-Crick pairs): A with T (U in RNA) and G with C.
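The pairing rule can be illustrated with a short sketch. This is only an illustration: the function name reverse_complement and the example strand are assumptions made here, not part of the slides.

```python
# Watson-Crick pairing in DNA: A pairs with T, G pairs with C
# (a purine always pairs with a pyrimidine).
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_complement(strand: str) -> str:
    """Return the complementary strand, written 5'->3'."""
    return "".join(PAIR[base] for base in reversed(strand))

# Hypothetical example strand, written 5'->3'.
print(reverse_complement("CATGCC"))  # GGCATG
```

Reversing the strand before complementing keeps both sequences written in the usual 5’ to 3’ direction, since the two strands of a double helix run antiparallel.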

Three-Dimensional Structures of Double Helices (figure labels: A-DNA, major groove, minor groove)

Genome in Detail: Apart from reproductive cells (gametes) and mature red blood cells, every cell in the human body contains 23 pairs of chromosomes, each a packet of compressed DNA.

Genome Size: Genomes vary widely in size, from a few thousand base pairs for viruses to … bp for certain amphibians and flowering plants. Coliphage MS2 (a virus) has one of the smallest genomes: only 3.5 × 10^3 bp. Mycoplasmas (unicellular organisms) have the smallest cellular genomes: about 5 × 10^5 bp. C. elegans (a nematode worm, a simple multicellular organism) has a genome of about 10^8 bp.

Species | Haploid genome size (bp) | Chromosome number
E. coli | 4.64 × … | …
S. cerevisiae | … × … | …
C. elegans | … | …/12
D. melanogaster | 1.7 × … | …
M. musculus | 3 × … | …
H. sapiens | 3 × … | …
A. cepa (onion) | 1.5 × … | …

What is genome sequencing? The process of determining the exact order of the chemical building blocks of a whole genome.

DNA sequencing: The process of determining the exact order of the chemical building blocks (called bases and abbreviated A, T, C, and G) that make up the DNA. The main approaches are: the “chemical cleavage protocol” of Maxam and Gilbert (1977), a chain cleavage method of sequencing DNA fragments; the Sanger dideoxynucleotide termination sequencing method, an enzymatic chain termination procedure; and sequencing by hybridization (chips).

DNA sequencing: Determination of nucleotide sequence is the determination of the precise sequence of nucleotides in a sample of DNA. There are two similar methods: 1. the Maxam and Gilbert method; 2. the Sanger method. Both depend on the production of a mixture of oligonucleotides, labeled either radioactively or with fluorescein, that share one common end and differ in length by a single nucleotide at the other end. This mixture of oligonucleotides is separated by high-resolution electrophoresis on polyacrylamide gels and the position of the bands is determined.

Maxam-Gilbert: Walter Gilbert –Harvard physicist –Knew James Watson –Became intrigued with the biological side –Became a biophysicist. Allan Maxam –Student of Gilbert

Maxam and Gilbert’s technique: A single fragment of DNA was isolated and labelled at one end (the 5’ end). This fragment was then partially cleaved with 4 specific chemical reactions that cleaved at either A+G, just G, C+T, or just C. This yields specific labelled fragments whose sizes correspond to the sequence.

The Maxam-Gilbert Technique Principle: Chemical Degradation of Purines –Purines (A, G) are damaged by dimethylsulfate –Methylation of the base –Heat releases the base –Alkali cleaves at G –Dilute acid cleaves A>G

Maxam-Gilbert Technique Principle Chemical Degradation of Pyrimidines –Pyrimidines (C, T) are damaged by hydrazine –Piperidine cleaves the backbone –2 M NaCl inhibits the reaction with T

Maxam and Gilbert Method: chemical degradation of purified fragments. The single-stranded DNA fragment to be sequenced is end-labeled: treatment with alkaline phosphatase removes the 5’ phosphate, and reaction with 32P-labeled ATP in the presence of polynucleotide kinase then attaches a 32P label to the 5’ terminus. The labeled DNA fragment is then divided into four aliquots, each of which is treated with a reagent that modifies a specific base or pair of bases:
1. Aliquot A + dimethyl sulphate, which methylates guanine residues
2. Aliquot B + formic acid, which modifies adenine and guanine residues
3. Aliquot C + hydrazine, which modifies thymine and cytosine residues
4. Aliquot D + hydrazine + 5 mol/l NaCl, which makes the reaction specific for cytosine
The four aliquots are then incubated with piperidine, which cleaves the sugar-phosphate backbone of the DNA next to the residue that has been modified.
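The logic of reading a sequence from the four chemical reactions can be sketched in a few lines. This is an idealized illustration, not a model of the real chemistry: it assumes every position is cleaved in some molecules, that bands are perfectly resolved, and that a fragment of length zero is still detectable. The names LANES, cleavage_ladder and read_ladder are hypothetical.

```python
# Bases attacked in each of the four reactions (lanes).
LANES = {"G": "G", "A+G": "AG", "C+T": "CT", "C": "C"}

def cleavage_ladder(seq):
    """For each lane, the sizes of the 5'-labeled fragments that remain.

    Cleaving at 0-based position i destroys that base and leaves a
    labeled fragment containing the i bases on its 5' side.
    """
    return {lane: {i for i, base in enumerate(seq) if base in targets}
            for lane, targets in LANES.items()}

def read_ladder(ladder, length):
    """Call one base per fragment size, from shortest to longest."""
    calls = []
    for size in range(length):
        if size in ladder["G"]:
            calls.append("G")   # band in both G and A+G lanes
        elif size in ladder["A+G"]:
            calls.append("A")   # band in A+G lane only
        elif size in ladder["C"]:
            calls.append("C")   # band in both C and C+T lanes
        elif size in ladder["C+T"]:
            calls.append("T")   # band in C+T lane only
        else:
            raise ValueError(f"no band of size {size}")
    return "".join(calls)

seq = "GATTACAGC"  # hypothetical fragment, 5'->3'
print(read_ladder(cleavage_ladder(seq), len(seq)))  # GATTACAGC
```

The point of the sketch is that two of the reactions are not base-specific on their own: an A is inferred from a band that appears in the A+G lane but not in the G lane, and a T from a band in the C+T lane but not in the C lane.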

Maxam-Gilbert sequencing - modifications

Chemical cleavage method (figure labels: dimethyl sulphate; piperidine formate; hydrazine; hydrazine + NaCl)

Advantages/disadvantages of Maxam-Gilbert sequencing: Requires lots of purified DNA and many intermediate purification steps. Relatively short readings. Automation not available (no sequencers). Its remaining use is for ‘footprinting’ (partial protection against DNA modification when proteins bind to specific regions, which produces ‘holes’ in the sequence ladder). In contrast, the Sanger sequencing methodology requires little if any DNA purification, no restriction digests, and no labeling of the DNA sequencing template.

Sanger Method: Fred Sanger –Was originally a protein chemist –Made his first mark in sequencing proteins (Nobel Prize, 1958) –Made his second mark in sequencing RNA –Dideoxy sequencing (Nobel Prize, 1980)

Original Sanger Method: Random incorporation of a dideoxynucleoside triphosphate into a growing strand of DNA. Requires DNA polymerase I. Requires a cloning vector with an initial primer (M13, a high-yield bacteriophage, modified by adding beta-galactosidase screening and a polylinker). Uses 32P-labeled deoxynucleoside triphosphates.

Sanger Method (enzymatic method): In vitro DNA synthesis using ‘terminators’, that is, dideoxynucleotides that do not permit chain elongation after their integration. DNA synthesis using deoxy- and dideoxynucleotides results in termination of synthesis at specific nucleotides. It requires a primer, DNA polymerase, a template, a mixture of nucleotides, and a detection system. Incorporation of dideoxynucleotides into the growing strand terminates synthesis. Synthesized strand sizes are determined for each dideoxynucleotide by using gel or capillary electrophoresis.

Dideoxynucleotide: no hydroxyl group at the 3’ end, which prevents strand extension. (Figure: nucleotide structure showing the triphosphate on the 5’ carbon, the base, and the 3’ position.)

The principles: Partial copies of DNA fragments are made with DNA polymerase. Using ddNTPs gives a collection of DNA fragments that terminate with A, C, G or T. The fragments are separated by gel electrophoresis and the DNA sequence is read.

"G" tube: all four dNTP's, ddGTP and DNA polymerase "A" tube: all four dNTP's, ddATP and DNA polymerase "T" tube: all four dNTP's, ddTTP and DNA polymerase "C" tube: all four dNTP's, ddCTP and DNA polymerase

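Example (figure): template 3’-CCGTAC-5’, extended from the primer with dNTPs. Products ending in each terminator: ddATP gives GGCA; ddTTP gives GGCAT; ddCTP gives GGC; ddGTP gives G, GG and GGCATG. The gel has four lanes: A, T, C, G.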
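The four-tube example can be reproduced with a small sketch. It uses the template from the figure, 3’-CCGTAC-5’, and leaves out the primer, so each product is shown as only the newly added bases. The function names are hypothetical, and the model assumes every possible termination product is present and resolved on the gel.

```python
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}

def new_strand(template_3to5):
    """Strand synthesized 5'->3' along a template given 3'->5'."""
    return "".join(PAIR[base] for base in template_3to5)

def tube_products(strand):
    """Chain-terminated products in each ddNTP tube.

    In the ddX tube, synthesis can stop wherever the new strand
    contains base X, so every prefix ending in X appears there.
    """
    tubes = {base: [] for base in "ACGT"}
    for end, base in enumerate(strand, start=1):
        tubes[base].append(strand[:end])
    return tubes

def read_gel(tubes):
    """Read the lanes from the shortest fragment to the longest."""
    bands = sorted((len(frag), base)
                   for base, frags in tubes.items() for frag in frags)
    return "".join(base for _, base in bands)

tubes = tube_products(new_strand("CCGTAC"))
print(tubes["G"])       # ['G', 'GG', 'GGCATG']
print(read_gel(tubes))  # GGCATG
```

Reading the bands in order of increasing length gives the newly synthesized strand, which is the complement of the template.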

Dideoxy Chain Terminator: Template. Primer. Extension chemistry –polymerase –termination –labeling. Separation. Detection.

Chain Terminator Basics: The target template-primer is extended in the presence of labeled terminators (ddA, ddG, ddC, ddT). For the template TGCA the products are ddA, A-ddC, AC-ddG and ACG-ddT. The ratio dN : ddN is 100 : 1.
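The effect of the dN : ddN ratio can be made concrete with a simple calculation. This is a sketch under a strong assumption that is not in the slides: the polymerase picks a dNTP or the matching ddNTP purely in proportion to their concentrations, so each position terminates with probability p = 1 / (ratio + 1). The read length of 700 is only an illustrative value.

```python
def termination_profile(ratio_dn_to_ddn=100.0, length=700):
    """Fraction of chains that terminate exactly at positions 1..length.

    Assumes (simplification) that a ddNTP is incorporated with
    probability p = 1 / (ratio + 1) at every position, independently.
    A chain stops at position k if it extended k - 1 times, then stopped.
    """
    p = 1.0 / (ratio_dn_to_ddn + 1.0)
    return [(1.0 - p) ** (k - 1) * p for k in range(1, length + 1)]

profile = termination_profile()
print(f"terminated at position 1:   {profile[0]:.4f}")
print(f"terminated at position 700: {profile[-1]:.6f}")
print(f"terminated within 700:      {sum(profile):.4f}")
```

Under this model a higher proportion of ddNTP shortens the ladder (strong signal near the primer, little for long fragments), while a lower proportion spreads the signal toward longer fragments, which is why the relative concentrations matter.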

Electrophoresis

Sanger Method Sequencing Gel

Template ssDNA vectors – M13 –pUC PCR dsDNA (+/- PCR)

Primers: Universal primers –cheap, reliable, easy, fast, parallel –BULK sequencing. Custom primers –expensive, slow, one-at-a-time –ADAPTABLE.

Extension: Polymerase –Sequenase –Thermostable (cycle sequencing). Terminators –Dye labels (“Big Dye”): spectrally different, high fluorescence –ddA, ddC, ddG, ddT with primer labels.

Separation: Gel electrophoresis. Capillary electrophoresis –suited to automation: rapid (2 hrs vs 12 hrs), re-usable, simple temperature control, 96-well format.

Sequencing of DNA by the Sanger method

Comparison: Sanger Method –Enzymatic –Requires DNA synthesis –Termination of chain elongation. Maxam-Gilbert Method –Chemical –Requires DNA –Requires long stretches of DNA –Breaks DNA at different nucleotides.

Modern Techniques: Modern techniques are based on chain termination, but no longer use x-ray film to detect the DNA letters. Some years ago a means of detecting the ddNTPs with fluorescent tags was developed. This requires only a single test tube instead of four.

The DNA to be sequenced is prepared as a single strand. This template DNA is supplied with: a mixture of all four normal (deoxy) nucleotides in ample quantities (dATP, dGTP, dCTP, dTTP); a mixture of all four dideoxynucleotides, each present in limiting quantities and each labeled with a "tag" that fluoresces a different color (ddATP, ddGTP, ddCTP, ddTTP); and DNA polymerase I.

In fact the whole experiment is run in a single tube. Here the relative concentrations of dNTPs and ddNTPs are critical

Laser reads gels directly, with all four samples in a single lane

Capillary sequencing replaces gels

Sample Output 1 lane

This is what you end up with

Problems: No signal. Low signal. Compression. Stop.

Shotgun Method - Overview: Cut the genome into short fragments. Sequence the DNA fragments. Create contigs. Contig: a continuous set of overlapping sequences. Regions not covered by any contig remain as gaps.

Shotgun Sequencing: Isolate chromosome → Shear DNA into fragments → Clone into sequencing vectors → Sequence

Shotgun Method The shotgun approach to sequence assembly. The DNA molecule is broken into small fragments, each of which is sequenced. The master sequence is assembled by searching for overlaps between the sequences of individual fragments. In practice, an overlap of several tens of base pairs would be needed to establish that two sequences should be linked together.
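The overlap search described here can be sketched with a greedy merge. This is a toy illustration, not the algorithm of any particular assembler: it ignores sequencing errors, reverse-complement reads and reads contained inside other reads, and it uses a minimum overlap of 3 bases only because the example reads are tiny (in practice, as noted above, several tens of bases would be needed). All names and the example reads are hypothetical.

```python
def overlap(a, b, min_len):
    """Length of the longest suffix of a that is a prefix of b.

    Returns 0 if the longest match is shorter than min_len.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0

def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of reads with the largest overlap."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicates
    while len(reads) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # no overlaps left: the remaining contigs are separated by gaps
        merged = reads[best_i] + reads[best_j][best_k:]
        reads = [r for idx, r in enumerate(reads)
                 if idx not in (best_i, best_j)] + [merged]
    return reads

# Three overlapping reads from a hypothetical 15-base source sequence.
print(greedy_assemble(["CATCCGA", "AGCTTAGG", "TAGGCATC"]))
# ['AGCTTAGGCATCCGA']
```

If the middle read is left out, the function returns two separate contigs, which corresponds to a gap. Repeated sequences longer than the reads break this simple scheme, because the same overlap can then join the wrong pair of reads.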

Shotgun technique This approach entails sampling DNA fragments as randomly as possible from the source sequence and then producing a sequencing read of the first 300 to 900 bases of one end of each fragment. If enough fragments are sequenced and their sampling is sufficiently random across the source, the process should let us determine the source by finding sequence overlaps among the reads of fragments that were sampled from overlapping stretches.
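How many reads count as "enough" can be estimated with the standard idealized coverage model (often associated with Lander and Waterman): if reads land uniformly at random, a given base is missed by all reads with probability of about e^(-c), where c = N × L / G is the average coverage. The genome size, read length and read counts below are hypothetical values chosen for illustration, and real libraries deviate from uniform sampling.

```python
import math

def coverage(n_reads, read_len, genome_len):
    """Average coverage c = N * L / G."""
    return n_reads * read_len / genome_len

def expected_fraction_covered(c):
    """Expected fraction of bases hit by at least one read (Poisson model)."""
    return 1.0 - math.exp(-c)

genome_len = 1_000_000  # hypothetical source length in bases
read_len = 500          # within the 300 to 900 base range mentioned above

for n_reads in (2_000, 8_000, 16_000):
    c = coverage(n_reads, read_len, genome_len)
    frac = expected_fraction_covered(c)
    print(f"{n_reads:>6} reads: coverage {c:.0f}x, "
          f"expected fraction covered {frac:.4f}")
```

The model shows why shotgun projects sequence the source many times over: at 1x coverage roughly a third of the bases are expected to be missed, while the uncovered fraction shrinks exponentially as coverage grows.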

Steps in shotgun technique. Step-1 Technicians randomly fracture the sample, either using sound (sonication) or by passing it through a nozzle under pressure (nebulization), which produces a uniformly random partitioning of each copy of the source strand into a collection of DNA fragments.

Step-2 To remove fragments that are too large or too small, this pool of fragments is size selected, typically using size separation under gel electrophoresis and then simply excising a band of the gel containing the desired size.

Step-3 The technicians then insert the size-selected fragments into the DNA of a genetically engineered bacterial virus (phage), called a vector. Usually, at most one fragment is inserted at a predetermined point in the vector, called the cloning site. The fragments at this point are often called inserts, and the collection of inserts is a library.

Principles of DNA Sequencing (figure: DNA fragment cloned into pBR322, with Amp, Tet, Ori and the primer site labeled). Denature with heat to produce ssDNA, then add Klenow + ddNTP + dNTP + primers.

Multiplexed CE with Fluorescent detection ABI x700 bases

Shotgun Sequencing: Sequence → Chromatogram → Send to computer → Assembled sequence

Shotgun Method – Haemophilus influenzae Sequencing: Extract DNA → Sonicate → Electrophoresis (1.5-2 kb) → DNA library → Sequence → Construct contigs → Sequenced

Shotgun Sequencing: Very efficient process for small-scale (~10 kb) sequencing (preferred method). First applied to whole genome sequencing in 1995 (H. influenzae). Now standard for all prokaryotic genome sequencing projects. Successfully applied to D. melanogaster. Moderately successful for H. sapiens.

Shotgun Method Repeats as the main problem

Introduction to Peptide Sequence Determination

Primary Structure The primary structure is the amino acid sequence plus any disulfide links.

Classical Strategy (Sanger)
1. Determine what amino acids are present and their molar ratios.
2. Cleave the peptide into smaller fragments, and determine the amino acid composition of these smaller fragments.
3. Identify the N-terminus and C-terminus in the parent peptide and in each fragment.
4. Organize the information so that the sequences of small fragments can be overlapped to reveal the full sequence.

Amino Acid Analysis

Acid hydrolysis of the peptide (6 M HCl, 24 hr) gives a mixture of amino acids. The mixture is separated by ion-exchange chromatography, which depends on the differences in charge (pI) among the various amino acids. Amino acids are detected using ninhydrin. The method is automated and requires only a very small sample of peptide.

Partial Hydrolysis of Proteins

Partial Hydrolysis of Peptides and Proteins Acid-hydrolysis of the peptide cleaves all of the peptide bonds. Cleaving some, but not all, of the peptide bonds gives smaller fragments. These smaller fragments are then separated and the amino acids present in each fragment determined. Enzyme-catalyzed cleavage is the preferred method for partial hydrolysis.

Partial Hydrolysis of Peptides and Proteins The enzymes that catalyze the hydrolysis of peptide bonds are called peptidases, proteases, or proteolytic enzymes.

Trypsin: Trypsin is selective for cleaving the peptide bond to the carboxyl group of lysine or arginine. (Figure: peptide bond on the carboxyl side of a residue whose side chain R is that of lysine or arginine.)

Chymotrypsin: Chymotrypsin is selective for cleaving the peptide bond to the carboxyl group of amino acids with an aromatic side chain. (Figure: peptide bond on the carboxyl side of a residue whose side chain R is that of phenylalanine, tyrosine, or tryptophan.)

Carboxypeptidase: Carboxypeptidase is selective for cleaving the peptide bond to the C-terminal amino acid. (Figure: protein chain with the C-terminal residue released.)

End Group Analysis

An amino acid sequence is ambiguous unless we know whether to read it left-to-right or right-to-left. We need to know what the N-terminal and C-terminal amino acids are. The C-terminal amino acid can be determined by carboxypeptidase-catalyzed hydrolysis. Several chemical methods have been developed for identifying the N-terminus. They depend on the fact that the amino N at the terminus is more nucleophilic than any of the amide nitrogens.

Sanger's Method: The key reagent in Sanger's method for identifying the N-terminus is 1-fluoro-2,4-dinitrobenzene. 1-Fluoro-2,4-dinitrobenzene is very reactive toward nucleophilic aromatic substitution. (Figure: structure of 1-fluoro-2,4-dinitrobenzene.)

Sanger's Method: 1-Fluoro-2,4-dinitrobenzene reacts with the amino nitrogen of the N-terminal amino acid. (Reaction scheme: 1-fluoro-2,4-dinitrobenzene + a tetrapeptide gives the peptide carrying a 2,4-dinitrophenyl group on its N-terminal nitrogen.)

Sanger's Method: Acid hydrolysis cleaves all of the peptide bonds, leaving a mixture of amino acids, only one of which (the N-terminus) bears a 2,4-DNP group. (Reaction scheme: the DNP-labeled tetrapeptide + H3O+ gives the DNP-labeled N-terminal amino acid plus the three other free amino acids.)

The specificity of some common methods for fragmenting a polypeptide chain:
Trypsin: Lys (K) and/or Arg (R) at C-ter
Chymotrypsin: Phe (F), Trp (W) and Tyr (Y) at C-ter
Pepsin: Phe (F), Trp (W) and Tyr (Y) at N-ter
Cyanogen bromide: Met (M) at C-ter

Insulin (B chain): FVNQHLCGSHLVEALYLVCGERGFFYTPKA
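The cleavage specificities listed earlier can be tried out on the B chain with a small sketch. It uses the bovine insulin B chain as usually reported (Glu, E, at position 13) and a simplified rule: cut after every occurrence of a target residue. Real enzymes have further preferences that are ignored here, and the names digest, TRYPSIN and CHYMOTRYPSIN are hypothetical.

```python
B_CHAIN = "FVNQHLCGSHLVEALYLVCGERGFFYTPKA"  # bovine insulin B chain, 30 residues

# Residues after which each reagent cuts (on the carboxyl side).
TRYPSIN = "KR"
CHYMOTRYPSIN = "FWY"

def digest(sequence, cut_after):
    """Split a peptide on the C-terminal side of every listed residue."""
    fragments, start = [], 0
    for i, residue in enumerate(sequence):
        if residue in cut_after:
            fragments.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        fragments.append(sequence[start:])  # C-terminal remainder
    return fragments

print(digest(B_CHAIN, TRYPSIN))
# ['FVNQHLCGSHLVEALYLVCGER', 'GFFYTPK', 'A']
print(digest(B_CHAIN, CHYMOTRYPSIN))
# ['F', 'VNQHLCGSHLVEALY', 'LVCGERGF', 'F', 'Y', 'TPKA']
```

Because the two reagents cut at different residues, their fragments overlap: for example, the chymotryptic fragment LVCGERGF spans the tryptic cut after R, which is what allows the fragments to be ordered into the full sequence.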

Insulin is a polypeptide with 51 amino acids. It has two chains, called the A chain (21 amino acids) and the B chain (30 amino acids). The following describes how the amino acid sequence of the B chain was determined.

Primary Structure of Bovine Insulin (figure: the A chain and B chain drawn residue by residue, with the N terminus and C terminus of each chain labeled and the disulfide (S-S) links shown)

The Edman Degradation and Automated Sequencing of Peptides

Edman Degradation
1. Method for determining the N-terminal amino acid.
2. Can be done sequentially, one residue at a time, on the same sample. Usually one can determine the first 20 or so amino acids from the N-terminus by this method.
3. A very small sample is sufficient.
4. Has been automated.

Edman Degradation: The key reagent in the Edman degradation is phenyl isothiocyanate, C6H5N=C=S.

Edman Degradation: Phenyl isothiocyanate reacts with the amino nitrogen of the N-terminal amino acid. (Reaction scheme: phenyl isothiocyanate + the N-terminal amino group of the peptide.)

Edman Degradation (reaction scheme: phenyl isothiocyanate + peptide gives the phenylthiocarbamoyl derivative of the peptide)

Edman Degradation: The product is a phenylthiocarbamoyl (PTC) derivative. The PTC derivative is then treated with HCl in an anhydrous solvent. The N-terminal amino acid is cleaved from the remainder of the peptide.

Edman Degradation (reaction scheme: PTC derivative + HCl gives a thiazolone plus the peptide shortened by one residue)

Edman Degradation: The product is a thiazolone. Under the conditions of its formation, the thiazolone rearranges to a phenylthiohydantoin (PTH) derivative.

Edman Degradation: The PTH derivative is isolated and identified. The remainder of the peptide is subjected to a second Edman degradation.

Edman Degradation: The Edman degradation is carried out on a machine, called a sequenator, that mixes reagents in the proper proportions, separates the products, identifies them, and records the results.
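The cycle that the sequenator repeats can be sketched as a loop that removes and records one N-terminal residue per round. This is a bookkeeping illustration only; it does not model the chemistry or the yield losses that limit real runs to about 20 residues. The function name edman_cycles and the default of 20 cycles are assumptions made for this example, and the peptide is the bovine insulin B chain as usually reported.

```python
def edman_cycles(peptide, cycles=20):
    """Simulate repeated Edman degradation on one sample.

    Each cycle releases the current N-terminal residue (identified as
    its PTH derivative) and leaves a peptide one residue shorter.
    Returns the residues identified, in order, and what remains.
    """
    identified = []
    remaining = peptide
    for _ in range(min(cycles, len(peptide))):
        identified.append(remaining[0])  # residue released in this cycle
        remaining = remaining[1:]        # shortened peptide goes to the next cycle
    return "".join(identified), remaining

b_chain = "FVNQHLCGSHLVEALYLVCGERGFFYTPKA"
read, rest = edman_cycles(b_chain, cycles=20)
print(read)  # FVNQHLCGSHLVEALYLVCG
print(rest)  # ERGFFYTPKA
```

Since a single run reads only the first stretch of a chain, longer proteins are first cut into fragments, each fragment is sequenced from its own N-terminus, and the overlaps between fragments are used to order them.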