DNA and Genome sequencing. Genome –Hereditary information of an organism is encoded in its DNA and enclosed in a cell (unless it is a virus). All the.

Presentation on theme: "DNA and Genome sequencing. Genome –Hereditary information of an organism is encoded in its DNA and enclosed in a cell (unless it is a virus). All the."— Presentation transcript:

1 DNA and Genome sequencing

2 Genome –Hereditary information of an organism is encoded in its DNA and enclosed in a cell (unless it is a virus). All the information contained in the DNA of a single organism is its genome. A DNA molecule can be thought of as a very long sequence of nucleotides or bases over the alphabet Σ = {A, T, C, G}
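To make the "sequence over an alphabet" view concrete, here is a minimal Python sketch. The names `DNA_ALPHABET`, `is_dna` and `base_counts` are illustrative assumptions, not part of the presentation.

```python
from collections import Counter

# The four-letter alphabet from the slide (hypothetical constant name).
DNA_ALPHABET = frozenset("ATCG")


def is_dna(seq: str) -> bool:
    """Return True if every symbol of seq belongs to the DNA alphabet."""
    return all(base in DNA_ALPHABET for base in seq.upper())


def base_counts(seq: str) -> dict:
    """Count how often each of A, T, C, G occurs in a valid DNA string."""
    if not is_dna(seq):
        raise ValueError("sequence contains symbols outside {A, T, C, G}")
    counts = Counter(seq.upper())
    return {base: counts.get(base, 0) for base in "ATCG"}
```

For example, `is_dna("GATUACA")` is false because U is an RNA base and not in the alphabet used here.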

3 Nucleic Acid Basics Nucleic acids are polymers. Each monomer consists of three moieties. Nucleotide: a base + a ribose sugar + a phosphate. Nucleoside: a base + a ribose sugar (no phosphate). A base can be one of five rings: adenine, guanine, cytosine, thymine or uracil.

4 Pyrimidines Purines Pyrimidines and Purines can Base-Pair (Watson-Crick Pairs)

5 Three Dimensional Structures of Double Helices A-DNA Major Groove Minor Groove

6 Genome in Detail Apart from reproductive cells (gametes) and mature red blood cells, every cell in the human body contains 23 pairs of chromosomes, each a packet of compressed DNA

7 Genome Size Genomes vary widely in size, from a few thousand base pairs for viruses to 2-3 × 10^11 bp for certain amphibians and flowering plants. Coliphage MS2 (a virus) has the smallest genome: only 3.5 × 10^3 bp. Mycoplasmas (unicellular organisms) have the smallest cellular genomes: 5 × 10^5 bp. C. elegans (a nematode worm, a primitive multicellular organism) has a genome of about 10^8 bp.

Species                 Haploid genome size (bp)   Chromosome number
E. coli                 4.64 × 10^6                1
S. cerevisiae           1.205 × 10^7               16
C. elegans              10^8                       11/12
D. melanogaster         1.7 × 10^8                 4
M. musculus             3 × 10^9                   20
H. sapiens              3 × 10^9                   23
A. cepa (onion)         1.5 × 10^10                8

8 What is genome sequencing? The process of determining the exact order of the chemical building blocks of a whole genome.

9 DNA sequencing DNA sequencing: the process of determining the exact order of the chemical building blocks (called bases and abbreviated A, T, C, and G) that make up DNA. –Maxam and Gilbert's “chemical cleavage protocol” (1977), a chain cleavage method of sequencing DNA fragments –Sanger's dideoxynucleotide termination sequencing method, an enzymatic chain termination procedure –Sequencing by hybridization (chips)

10 DNA sequencing Determination of nucleotide sequence: the determination of the precise sequence of nucleotides in a sample of DNA. Two similar methods: 1. Maxam and Gilbert method 2. Sanger method. Both depend on the production of a mixture of oligonucleotides, labeled either radioactively or with fluorescein, with one common end and differing in length by a single nucleotide at the other end. This mixture of oligonucleotides is separated by high-resolution electrophoresis on polyacrylamide gels and the positions of the bands are determined.

11 Maxam-Gilbert Walter Gilbert –Harvard physicist –Knew James Watson –Became intrigued with the biological side –Became a biophysicist Allan Maxam –Student of Gilbert

12 Maxam and Gilbert’s technique A single fragment of DNA was isolated and labelled at one end (the 5’ end). This fragment was then partially cleaved in 4 specific chemical reactions that cleave at either A+G, just G, C+T, or just C. This yields specific labelled fragments whose sizes correspond to the sequence.

13 The Maxam-Gilbert Technique Principle - Chemical Degradation of Purines –Purines (A, G) are damaged by dimethylsulfate –Methylation of base –Heat releases base –Alkali cleaves G –Dilute acid cleaves A>G

14 Maxam-Gilbert Technique Principle Chemical Degradation of Pyrimidines –Pyrimidines (C, T) are damaged by hydrazine –Piperidine cleaves the backbone –2 M NaCl inhibits the reaction with T

15 Maxam and Gilbert Method Chemical degradation of purified fragments
The single-stranded DNA fragment to be sequenced is end-labeled: treatment with alkaline phosphatase removes the 5’ phosphate, and reaction with 32P-labeled ATP in the presence of polynucleotide kinase then attaches the 32P label to the 5’ terminus.
The labeled DNA fragment is then divided into four aliquots, each of which is treated with a reagent that modifies a specific base:
1. Aliquot A + dimethyl sulphate, which methylates guanine residues
2. Aliquot B + formic acid, which modifies adenine and guanine residues
3. Aliquot C + hydrazine, which modifies thymine and cytosine residues
4. Aliquot D + hydrazine + 5 mol/l NaCl, which makes the reaction specific for cytosine
The four aliquots are then incubated with piperidine, which cleaves the sugar-phosphate backbone of the DNA next to the residue that has been modified.
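The logic of reading a sequence from the four base-specific reactions (G, A+G, C+T, C) can be sketched in Python. This is an idealized model under stated assumptions: every target position is cleaved in some molecule, each labelled fragment is represented only by its length, and even the shortest fragments are resolved on the gel. The function and dictionary names are hypothetical.

```python
# Which bases each of the four chemical reactions attacks (from the slides).
REACTIONS = {"G": "G", "A+G": "AG", "C+T": "CT", "C": "C"}


def cleavage_ladder(seq):
    """Map each reaction to the set of 5'-labelled fragment lengths it yields.

    Cleaving at position i (0-based) destroys that base and leaves a
    labelled fragment containing the i bases in front of it.
    """
    return {
        name: {i for i, base in enumerate(seq) if base in targets}
        for name, targets in REACTIONS.items()
    }


def read_ladder(ladder, length):
    """Reconstruct the sequence by comparing the four lanes band by band."""
    calls = []
    for i in range(length):
        if i in ladder["G"]:
            calls.append("G")  # band in both the G and A+G lanes
        elif i in ladder["A+G"]:
            calls.append("A")  # band in the A+G lane only
        elif i in ladder["C"]:
            calls.append("C")  # band in both the C and C+T lanes
        elif i in ladder["C+T"]:
            calls.append("T")  # band in the C+T lane only
        else:
            raise ValueError(f"no band for position {i}")
    return "".join(calls)
```

The key point is that A and T are never read from a lane of their own: they are inferred from a band that appears in the combined lane but not in the G or C lane.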

16 Maxam-Gilbert sequencing - modifications

17 Dimethyl sulphate piperidine formate hydrazine +NaCl Chemical cleavage method

18

19

20 Advantages/disadvantages Maxam-Gilbert sequencing Requires lots of purified DNA, and many intermediate purification steps Relatively short readings Automation not available (sequencers) Remaining use for ‘footprinting’ (partial protection against DNA modification when proteins bind to specific regions, and that produce ‘holes’ in the sequence ladder) In contrast, the Sanger sequencing methodology requires little if any DNA purification, no restriction digests, and no labeling of the DNA sequencing template

21 Sanger Method Fred Sanger –Was originally a protein chemist –Made his first mark in sequencing proteins (Nobel Prize, 1958) –Made his second mark in sequencing RNA –Dideoxy sequencing (Nobel Prize, 1980)

22 Original Sanger Method –Random incorporation of a dideoxynucleoside triphosphate into a growing strand of DNA –Requires DNA polymerase I –Requires a cloning vector with initial primer (M13, a high-yield bacteriophage, modified by adding beta-galactosidase screening and a polylinker) –Uses 32P-deoxynucleoside triphosphates

23 Sanger Method Enzymatic methods –In-vitro DNA synthesis using ‘terminators’: dideoxynucleotides that do not permit chain elongation after their integration –DNA synthesis using deoxy- and dideoxynucleotides results in termination of synthesis at specific nucleotides –Requires a primer, DNA polymerase, a template, a mixture of nucleotides, and a detection system –Incorporation of dideoxynucleotides into the growing strand terminates synthesis –Synthesized strand sizes are determined for each dideoxynucleotide by gel or capillary electrophoresis

24 Dideoxynucleotide No hydroxyl group at the 3’ end, which prevents strand extension. (Structure: triphosphate on the 5’ carbon, base on the sugar, no OH on the 3’ carbon.)

25

26 The principles Partial copies of DNA fragments made with DNA polymerase Collection of DNA fragments that terminate with A,C,G or T using ddNTP Separate by gel electrophoresis Read DNA sequence

27 "G" tube: all four dNTPs, ddGTP and DNA polymerase
"A" tube: all four dNTPs, ddATP and DNA polymerase
"T" tube: all four dNTPs, ddTTP and DNA polymerase
"C" tube: all four dNTPs, ddCTP and DNA polymerase

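28 (Figure: template 3’-CCGTAC-5’ with an annealed primer and dNTPs. Terminated products per reaction: ddATP gives GGCA; ddTTP gives GGCAT; ddCTP gives GGC; ddGTP gives G, GG and GGCATG. Gel lanes: A, T, C, G.)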
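The four-tube experiment with the template CCGTAC (written 3’ to 5’) can be mimicked with a small Python sketch. It is a simplified model, assuming the primer contributes no bases to the fragments and that every possible termination product is present. All function names are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def synthesized_strand(template_3to5):
    """Strand the polymerase builds 5'->3' along a template given 3'->5'."""
    return "".join(COMPLEMENT[base] for base in template_3to5)


def terminated_fragments(template_3to5):
    """Fragments found in each ddNTP tube, keyed by the terminating base."""
    strand = synthesized_strand(template_3to5)
    tubes = {base: [] for base in "ACGT"}
    for end, base in enumerate(strand, start=1):
        # A ddNTP of this base can stop synthesis exactly at this position.
        tubes[base].append(strand[:end])
    return tubes


def read_gel(tubes):
    """Read the sequence from the shortest band to the longest band."""
    bands = sorted(
        (len(fragment), base)
        for base, fragments in tubes.items()
        for fragment in fragments
    )
    return "".join(base for _, base in bands)
```

Calling `terminated_fragments("CCGTAC")` puts GGCA in the A tube and GGCAT in the T tube, and reading the bands by increasing length recovers the newly synthesized strand rather than the template.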

29 Dideoxy Chain Terminator Template Primer Extension Chemistry –polymerase –termination –labeling Separation Detection

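30 Chain Terminator Basics (Figure: the target template-primer is extended in the presence of the labeled terminators ddA, ddG, ddC and ddT. For the template TGCA the products are ddA, A-ddC, AC-ddG and ACG-ddT.) Ratio dN : ddN = 100 : 1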

31

32

33

34 Electrophoresis

35 Sanger Method Sequencing Gel

36 Template ssDNA vectors – M13 –pUC PCR dsDNA (+/- PCR)

37 Primers Universal primers –cheap, reliable, easy, fast, parallel –BULK sequencing Custom primers –expensive, slow, one-at-a-time –ADAPTABLE

38 Extension Polymerase –Sequenase –Thermostable (Cycle Sequencing) Terminators –Dye labels (“Big Dye”) spectrally different, high fluorescence –ddA,C,G,T with primer labels

39 Separation Gel Electrophoresis Capillary Electrophoresis –suited to automation –rapid (2 hrs vs 12 hrs) –re-usable –simple temperature control –96 well format

40

41 Sequencing of DNA by the Sanger method

42 Comparison
Sanger Method –Enzymatic –Requires DNA synthesis –Termination of chain elongation
Maxam Gilbert Method –Chemical –Requires DNA –Requires long stretches of DNA –Breaks DNA at different nucleotides

43 Modern Techniques Modern techniques are based on chain termination, but no longer use x-ray film to detect the DNA letters. About 15-20 years ago a means of detecting the ddNTPs with fluorescent tags was discovered. This required only a single test tube instead of four.

44

45 The DNA to be sequenced is prepared as a single strand. This template DNA is supplied with:
–a mixture of all four normal (deoxy) nucleotides in ample quantities: dATP, dGTP, dCTP, dTTP
–a mixture of all four dideoxynucleotides, each present in limiting quantities and each labeled with a "tag" that fluoresces a different color: ddATP, ddGTP, ddCTP, ddTTP
–DNA polymerase I

46 In fact the whole experiment is run in a single tube. Here the relative concentrations of dNTPs and ddNTPs are critical
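Why the dNTP to ddNTP ratio matters can be illustrated with a short calculation. The sketch below assumes an idealized polymerase that picks a terminator with probability equal to its share of the nucleotide pool (real enzymes discriminate between the two), so chain lengths follow a geometric distribution. The function names are hypothetical, and the 100 : 1 ratio is the one quoted on the earlier "Chain Terminator Basics" slide.

```python
def termination_probability(dntp, ddntp):
    """Chance that a given incorporation is a chain-terminating ddNTP."""
    if dntp < 0 or ddntp < 0 or dntp + ddntp == 0:
        raise ValueError("concentrations must be non-negative and not both zero")
    return ddntp / (dntp + ddntp)


def fraction_terminated_at(k, p):
    """Fraction of chains that stop exactly at the k-th added base."""
    return (1 - p) ** (k - 1) * p


def fraction_longer_than(k, p):
    """Fraction of chains that are still growing after k added bases."""
    return (1 - p) ** k


p = termination_probability(100, 1)          # dN : ddN = 100 : 1
still_growing = fraction_longer_than(500, p)  # chains longer than 500 bases
```

Under this model, too much terminator leaves almost no long fragments and too little leaves almost no short ones, which is the sense in which the relative concentrations are critical.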

47 Laser reads gels directly, all four samples in a single lane

48 Capillary sequencing replaces gels

49 Sample Output 1 lane

50 This is what you end up with

51 Problems No signal Low signal Compression Stop

52 Shotgun Method - Overview Cut genome into short fragments Sequence DNA fragments Create contigs Contig - continuous set of overlapping sequences Gap!

53 Shotgun Sequencing Isolate Chromosome Shear DNA into Fragments Clone into Seq. Vectors Sequence

54 Shotgun Method The shotgun approach to sequence assembly. The DNA molecule is broken into small fragments, each of which is sequenced. The master sequence is assembled by searching for overlaps between the sequences of individual fragments. In practice, an overlap of several tens of base pairs would be needed to establish that two sequences should be linked together.
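Assembly by searching for overlaps can be sketched with a greedy merge in Python. This is a toy illustration, not the assembler used in any real project: it assumes error-free reads from one strand, ignores repeats, and uses a small `min_len` only so the example stays short (the slide suggests several tens of base pairs in practice). The names `overlap` and `greedy_assemble` are hypothetical.

```python
def overlap(a, b, min_len):
    """Length of the longest suffix of a that equals a prefix of b.

    Returns 0 if the best overlap is shorter than min_len.
    """
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if n > 0 and a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of sequences with the largest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_len:
                        best_len, best_i, best_j = n, i, j
        if best_len == 0:
            return contigs  # no more overlaps: remaining pieces are contigs
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [
            c for k, c in enumerate(contigs) if k not in (best_i, best_j)
        ] + [merged]
```

When two reads share no sufficient overlap, the function returns more than one contig, which corresponds to a gap in the assembly.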

55 Shotgun technique This approach entails sampling DNA fragments as randomly as possible from the source sequence and then producing a sequencing read of the first 300 to 900 bases of one end of each fragment. If enough fragments are sequenced and their sampling is sufficiently random across the source, the process should let us determine the source by finding sequence overlaps among the reads of fragments that were sampled from overlapping stretches.
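How many reads count as "enough" can be estimated with a standard idealization that is not part of the slides: if N reads of length L are sampled uniformly from a source of length G, the average coverage is c = N·L/G, and under a Poisson approximation the expected fraction of the source covered by at least one read is 1 - e^(-c). The sketch below just evaluates these two formulas, and the numbers in the example are made up.

```python
import math


def coverage(num_reads, read_len, source_len):
    """Average number of reads covering each base: c = N * L / G."""
    return num_reads * read_len / source_len


def expected_fraction_covered(c):
    """Poisson approximation of the fraction of bases hit by >= 1 read."""
    return 1.0 - math.exp(-c)


# Hypothetical example: 10,000 reads of 500 bases from a 1 Mb source.
c = coverage(10_000, 500, 1_000_000)
covered = expected_fraction_covered(c)
```

The approximation ignores edge effects and the minimum overlap needed to join reads, so it is optimistic about how many gaps remain.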

56 Steps in shotgun technique Technicians randomly fracture the sample either using sound (sonication) or by passing it through a nozzle under pressure (nebulization), which produces a uniformly random partitioning of each copy of the source strand into a collection of DNA fragments.

57

58 Step-2 To remove fragments that are too large or too small, this pool of fragments is size selected, typically using size separation under gel electrophoresis and then simply excising a band of the gel containing the desired size.
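In computational terms, excising a gel band amounts to a length filter. A minimal sketch, assuming fragments are plain strings and that the band is defined by a hypothetical inclusive size window:

```python
def size_select(fragments, min_len, max_len):
    """Keep only fragments whose length falls inside the excised band."""
    return [f for f in fragments if min_len <= len(f) <= max_len]
```

For instance, a window of 1500 to 2000 bases would keep fragments in the 1.5-2 kb range mentioned for the H. influenzae project later in the presentation.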

59 Step-3 The technicians then insert the size-selected fragments into the DNA of a genetically engineered bacterial virus (phage), called a vector. Usually, at most one fragment is inserted at a predetermined point, called the cloning site, in the vector. The fragments at this point are often called inserts and the collection of inserts is a library.

60 Principles of DNA Sequencing (Figure: plasmid pBR322 with Amp, Tet and Ori markers, carrying the DNA fragment and a primer site.) Denature with heat to produce ssDNA, then add Klenow + ddNTP + dNTP + primers

61 Multiplexed CE with Fluorescent detection ABI 3700: 96 × 700 bases

62 Shotgun Sequencing Sequence Chromatogram Send to Computer Assembled Sequence

63 Shotgun Method – Haemophilus influenzae Sequencing Extract DNA, sonicate, electrophoresis, DNA library, sequence, construct contigs. Fragment size: 1.5-2 kb.

64 Shotgun Sequencing Very efficient process for small-scale (~10 kb) sequencing (preferred method) First applied to whole genome sequencing in 1995 (H. influenzae) Now standard for all prokaryotic genome sequencing projects Successfully applied to D. melanogaster Moderately successful for H. sapiens

65 Shotgun Method Repeats as the main problem

66 Introduction to Peptide Sequence Determination

67 Primary Structure The primary structure is the amino acid sequence plus any disulfide links.

68 Classical Strategy (Sanger) 1.Determine what amino acids are present and their molar ratios. 2. Cleave the peptide into smaller fragments, and determine the amino acid composition of these smaller fragments. 3. Identify the N-terminus and C-terminus in the parent peptide and in each fragment. 4.Organize the information so that the sequences of small fragments can be overlapped to reveal the full sequence.

69 Amino Acid Analysis

70 Acid hydrolysis of the peptide (6 M HCl, 24 hr) gives a mixture of amino acids. The mixture is separated by ion-exchange chromatography, which depends on the differences in charge (pI) among the various amino acids. Amino acids are detected using ninhydrin. Automated method; requires only 10^-5 to 10^-7 g of peptide.

71 Partial Hydrolysis of Proteins

72 Partial Hydrolysis of Peptides and Proteins Acid-hydrolysis of the peptide cleaves all of the peptide bonds. Cleaving some, but not all, of the peptide bonds gives smaller fragments. These smaller fragments are then separated and the amino acids present in each fragment determined. Enzyme-catalyzed cleavage is the preferred method for partial hydrolysis.

73 Partial Hydrolysis of Peptides and Proteins The enzymes that catalyze the hydrolysis of peptide bonds are called peptidases, proteases, or proteolytic enzymes.

74 Trypsin Trypsin is selective for cleaving the peptide bond to the carboxyl group of lysine or arginine. (Structure: the cleaved bond follows the lysine or arginine residue.)

75 Chymotrypsin Chymotrypsin is selective for cleaving the peptide bond to the carboxyl group of amino acids with an aromatic side chain: phenylalanine, tyrosine, tryptophan. (Structure: the cleaved bond follows the aromatic residue.)

76 Carboxypeptidase Carboxypeptidase is selective for cleaving the peptide bond to the C-terminal amino acid. (Structure: protein chain ending in the C-terminal residue.)

77 End Group Analysis

78 An amino acid sequence is ambiguous unless we know whether to read it left-to-right or right-to-left. We need to know what the N-terminal and C-terminal amino acids are. The C-terminal amino acid can be determined by carboxypeptidase-catalyzed hydrolysis. Several chemical methods have been developed for identifying the N-terminus. They depend on the fact that the amino N at the terminus is more nucleophilic than any of the amide nitrogens.

79 Sanger's Method The key reagent in Sanger's method for identifying the N-terminus is 1-fluoro-2,4-dinitrobenzene. 1-Fluoro-2,4-dinitrobenzene is very reactive toward nucleophilic aromatic substitution. (Structure: benzene ring bearing F and two NO2 groups.)

80 Sanger's Method 1-Fluoro-2,4-dinitrobenzene reacts with the amino nitrogen of the N-terminal amino acid. (Reaction scheme: the tetrapeptide Val-Phe-Ala-Gly reacts at its N-terminal valine to give the 2,4-dinitrophenyl (DNP) peptide.)

81 Sanger's Method Acid hydrolysis cleaves all of the peptide bonds, leaving a mixture of amino acids, only one of which (the N-terminus) bears a 2,4-DNP group. (Reaction scheme: H3O+ hydrolysis of the DNP peptide gives DNP-valine plus free phenylalanine, alanine and glycine.)

82 The specificity of some common methods for fragmenting a polypeptide chain
Trypsin: Lys (K) or Arg (R), cleaves at the C-terminal side
Chymotrypsin: Phe (F), Trp (W) and Tyr (Y), cleaves at the C-terminal side
Pepsin: Phe (F), Trp (W) and Tyr (Y), cleaves at the N-terminal side
Cyanogen bromide: Met (M), cleaves at the C-terminal side
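The cleavage specificities listed above translate directly into a small digestion routine. The Python sketch below is a simplification that applies only the residue and side rules from the slide and ignores known exceptions (for example, reduced cleavage next to proline). The peptide in the usage note is a made-up example, and `CLEAVAGE_RULES` and `digest` are hypothetical names.

```python
# agent -> (residues recognised, side of the residue where the bond is cut)
CLEAVAGE_RULES = {
    "trypsin": ("KR", "C"),
    "chymotrypsin": ("FWY", "C"),
    "pepsin": ("FWY", "N"),
    "cyanogen bromide": ("M", "C"),
}


def digest(peptide, agent):
    """Split a one-letter peptide string according to the agent's rule."""
    residues, side = CLEAVAGE_RULES[agent]
    cut_points = []
    for i, amino_acid in enumerate(peptide):
        if amino_acid in residues:
            cut = i + 1 if side == "C" else i
            if 0 < cut < len(peptide):  # cuts at the very ends split nothing
                cut_points.append(cut)
    fragments, start = [], 0
    for cut in cut_points:
        fragments.append(peptide[start:cut])
        start = cut
    fragments.append(peptide[start:])
    return fragments
```

For the made-up peptide AKFMGRWYL, trypsin gives AK, FMGR and WYL, while cyanogen bromide gives AKFM and GRWYL. Fragments from different agents overlap, which is what allows the full sequence to be pieced together.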

83 Insulin (B chain) FVNQHLCGSHLVEALYLVCGERGFFYTPKA

84 Insulin is a polypeptide with 51 amino acids. It has two chains, called the A chain (21 amino acids) and the B chain (30 amino acids). The following describes how the amino acid sequence of the B chain was determined.

85 Primary Structure of Bovine Insulin (Figure: the A chain and the B chain drawn residue by residue from N terminus to C terminus, with residue numbering and the disulfide (S-S) links between cysteines.)

86 The Edman Degradation and Automated Sequencing of Peptides

87 Edman Degradation 1. Method for determining the N-terminal amino acid. 2. Can be done sequentially, one residue at a time, on the same sample. Usually one can determine the first 20 or so amino acids from the N-terminus by this method. 3. 10^-10 g of sample is sufficient. 4. Has been automated.

88 Edman Degradation The key reagent in the Edman degradation is phenyl isothiocyanate (C6H5N=C=S).

89 Edman Degradation Phenyl isothiocyanate reacts with the amino nitrogen of the N-terminal amino acid. (Reaction scheme: peptide N-terminus + C6H5N=C=S.)

90 Edman Degradation (Reaction scheme: the N-terminal amino group of the peptide adds to C6H5N=C=S to give C6H5NH-C(=S)-NH-CHR-C(=O)-NH-peptide.)

91 Edman Degradation The product is a phenylthiocarbamoyl (PTC) derivative. The PTC derivative is then treated with HCl in an anhydrous solvent. The N-terminal amino acid is cleaved from the remainder of the peptide.

92 Edman Degradation (Reaction scheme: HCl converts the PTC derivative into a cyclic product containing the N-terminal residue, plus the shortened peptide with a new H3N+ terminus.)

93 Edman Degradation The product is a thiazolone. Under the conditions of its formation, the thiazolone rearranges to a phenylthiohydantoin (PTH) derivative.

94 Edman Degradation The PTH derivative is isolated and identified. The remainder of the peptide is subjected to a second Edman degradation.

95 Edman Degradation The Edman degradation is carried out on a machine, called a sequenator, that mixes reagents in the proper proportions, separates the products, identifies them, and records the results.
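The cycle a sequenator repeats can be written as a short loop. This Python sketch models only the bookkeeping (identify the N-terminal residue, continue with the shortened peptide) and none of the chemistry. The default limit of 20 cycles reflects the "first 20 or so" residues mentioned earlier, and the function names are hypothetical.

```python
def edman_cycle(peptide):
    """One degradation cycle: (identified N-terminal residue, shorter peptide)."""
    if not peptide:
        raise ValueError("no residues left to degrade")
    return peptide[0], peptide[1:]


def edman_sequence(peptide, max_cycles=20):
    """Run repeated cycles; return (residues identified, peptide left over)."""
    identified = []
    remaining = peptide
    for _ in range(max_cycles):
        if not remaining:
            break
        residue, remaining = edman_cycle(remaining)
        identified.append(residue)
    return "".join(identified), remaining
```

A peptide longer than the cycle limit is only partly read, which is one reason long proteins are first cut into fragments and the fragments sequenced separately.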

