DNA as Biological Information, Rasmus Wernersson


DNA as Biological Information
Rasmus Wernersson

Overview
Learning objectives
– About biological information
– A note about DNA sequencing techniques and DNA data
– File formats used for biological data
– Introduction to the GenBank database

Information flow in biological systems

DNA sequences = summary of information
Double-stranded DNA:
5’ AGCC 3’
3’ TCGG 5’
Written as a single strand: 5’ ATGGCCAGGTAA 3’
DNA backbone: (deoxy)ribose
[Figure: ribose and deoxyribose, with the 5’ and 3’ positions labelled]
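The strand notation above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides; the function names `complement` and `reverse_complement` are chosen here for the example.

```python
# Watson-Crick base pairing: A pairs with T, G pairs with C.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def complement(seq: str) -> str:
    """Return the opposite strand, read in the same left-to-right order (3'->5')."""
    return seq.translate(COMPLEMENT)


def reverse_complement(seq: str) -> str:
    """Return the opposite strand written in the conventional 5'->3' direction."""
    return complement(seq)[::-1]


top = "AGCC"                      # 5' AGCC 3' from the slide
print(complement(top))            # TCGG (the bottom strand as drawn, 3'->5')
print(reverse_complement(top))    # GGCT (the same strand written 5'->3')
```

Because one strand fully determines the other, databases store only one strand, written 5’ to 3’.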

PCR
– Melting: 96 °C, 30 sec
– Annealing: ~55 °C, 30 sec
– Extension: 72 °C, 30 sec
– 35 cycles
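The arithmetic behind this programme can be worked through in Python. This is a rough sketch under idealised assumptions that the slide does not state: every template is doubled in every cycle, and ramp times, initial denaturation and final extension are ignored. Real reactions plateau well before the ideal yield.

```python
# Step durations and cycle count as given on the slide.
CYCLES = 35
STEP_SECONDS = {"melting": 30, "annealing": 30, "extension": 30}


def ideal_copies(start_copies: int, cycles: int) -> int:
    """Upper bound on copy number, assuming perfect doubling in every cycle."""
    return start_copies * 2 ** cycles


def cycling_minutes(cycles: int, step_seconds: dict) -> float:
    """Time spent in the three temperature steps only (no ramping)."""
    return cycles * sum(step_seconds.values()) / 60


print(f"{ideal_copies(1, CYCLES):,}")          # 34,359,738,368
print(cycling_minutes(CYCLES, STEP_SECONDS))   # 52.5
```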

PCR
[Figures: PCR animation and PCR graph]

Gel electrophoresis
DNA fragments are separated using gel electrophoresis.
– Typically 1% agarose.
– Stained with EtBr or SYBR Green (glows in UV light).
– A DNA “ladder” of known fragment lengths is used to estimate the lengths of the sample fragments.
[Figures: gel picture, PCR setup]

The Sanger method of DNA sequencing
[Figure labels: terminator, OH, X-ray sequencing gel]

Automated sequencing
The major breakthrough in sequencing has come through automation.
– Fluorescent dyes.
– Laser-based scanning.
– Capillary electrophoresis.
– Computer-based base-calling and assembly.

Handout exercise: “base-calling”
Handout: chromatogram. Groups of 2-3.
Tasks:
– Identify “difficult” regions.
– Identify “difficult” sequence stretches.
– Try to estimate the best interval to use.
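The last task, choosing the best interval, is done by eye in the exercise. One way a program could approach it is sketched below. It assumes something the handout does not provide: a numeric quality score per base, with a hypothetical cutoff `threshold` below which a base counts as poor. The function `best_interval` is illustrative only and is not the method used by any particular base-caller.

```python
def best_interval(qualities, threshold=20):
    """Return (start, end) of the half-open slice that maximises
    the sum of (quality - threshold), i.e. the stretch where good
    bases most outweigh poor ones. Returns (0, 0) if no base helps."""
    best_sum, best = 0, (0, 0)
    run_sum, run_start = 0, 0
    for i, q in enumerate(qualities):
        run_sum += q - threshold
        if run_sum <= 0:
            # The current stretch no longer pays off: restart after this base.
            run_sum, run_start = 0, i + 1
        elif run_sum > best_sum:
            best_sum, best = run_sum, (run_start, i + 1)
    return best


# Made-up scores: poor at both ends, one weak base in the middle.
scores = [5, 8, 30, 35, 40, 12, 38, 36, 9, 4]
print(best_interval(scores))  # (2, 8)
```

Note that the single weak base at index 5 is kept inside the interval, because the good bases on either side outweigh it. This mirrors the judgement asked for in the exercise.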

Biological data on computers
The GenBank database
File formats
– FASTA
– GenBank

NCBI GenBank
GenBank is one of the main international DNA databases. GenBank is hosted by NCBI: National Center for Biotechnology Information. GenBank has existed since [year missing]. The database is public - no restrictions on the use of the data within.

FASTA format (handout)

>alpha-D
ATGCTGACCGACTCTGACAAGAAGCTGGTCCTGCAGGTGTGGGAGAAGGTGATCCGCCAC
CCAGACTGTGGAGCCGAGGCCCTGGAGAGGTGCGGGCTGAGCTTGGGGAAACCATGGGCA
AGGGGGGCGACTGGGTGGGAGCCCTACAGGGCTGCTGGGGGTTGTTCGGCTGGGGGTCAG
CACTGACCATCCCGCTCCCGCAGCTGTTCACCACCTACCCCCAGACCAAGACCTACTTCC
CCCACTTCGACTTGCACCATGGCTCCGACCAGGTCCGCAACCACGGCAAGAAGGTGTTGG
CCGCCTTGGGCAACGCTGTCAAGAGCCTGGGCAACCTCAGCCAAGCCCTGTCTGACCTCA
GCGACCTGCATGCCTACAACCTGCGTGTCGACCCTGTCAACTTCAAGGCAGGCGGGGGAC
GGGGGTCAGGGGCCGGGGAGTTGGGGGCCAGGGACCTGGTTGGGGATCCGGGGCCATGCC
GGCGGTACTGAGCCCTGTTTTGCCTTGCAGCTGCTGGCGCAGTGCTTCCACGTGGTGCTG
GCCACACACCTGGGCAACGACTACACCCCGGAGGCACATGCTGCCTTCGACAAGTTCCTG
TCGGCTGTGTGCACCGTGCTGGCCGAGAAGTACAGATAA
>alpha-A
ATGGTGCTGTCTGCCAACGACAAGAGCAACGTGAAGGCCGTCTTCGGCAAAATCGGCGGC
CAGGCCGGTGACTTGGGTGGTGAAGCCCTGGAGAGGTATGTGGTCATCCGTCATTACCCC
ATCTCTTGTCTGTCTGTGACTCCATCCCATCTGCCCCCATACTCTCCCCATCCATAACTG
TCCCTGTTCTATGTGGCCCTGGCTCTGTCTCATCTGTCCCCAACTGTCCCTGATTGCCTC
TGTCCCCCAGGTTGTTCATCACCTACCCCCAGACCAAGACCTACTTCCCCCACTTCGACC
TGTCACATGGCTCCGCTCAGATCAAGGGGCACGGCAAGAAGGTGGCGGAGGCACTGGTTG
AGGCTGCCAACCACATCGATGACATCGCTGGTGCCCTCTCCAAGCTGAGCGACCTCCACG
CCCAAAAGCTCCGTGTGGACCCCGTCAACTTCAAAGTGAGCATCTGGGAAGGGGTGACCA
GTCTGGCTCCCCTCCTGCACACACCTCTGGCTACCCCCTCACCTCACCCCCTTGCTCACC
ATCTCCTTTTGCCTTTCAGCTGCTGGGTCACTGCTTCCTGGTGGTCGTGGCCGTCCACTT
CCCCTCTCTCCTGACCCCGGAGGTCCATGCTTCCCTGGACAAGTTCGTGTGTGCCGTGGG
CACCGTCCTTACTGCCAAGTACCGTTAA
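Reading this format takes only a few lines of code. The sketch below is a minimal hand-written parser for illustration (the function name `parse_fasta` is chosen here). It treats each line starting with `>` as a record name and joins the following lines into one sequence. The example input uses shortened sequences, not the full handout entries.

```python
def parse_fasta(text: str) -> dict:
    """Map each FASTA header (without the leading '>') to its joined sequence."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue                      # skip blank lines
        if line.startswith(">"):
            name = line[1:].strip()       # header line starts a new record
            records[name] = []
        elif name is not None:
            records[name].append(line)    # sequence lines may be wrapped
    return {n: "".join(parts) for n, parts in records.items()}


# Shortened example in the same layout as the handout.
example = """>alpha-D
ATGCTGACCGAC
TCTGACAAG
>alpha-A
ATGGTGCTG
"""

for name, seq in parse_fasta(example).items():
    print(name, len(seq))
# alpha-D 21
# alpha-A 9
```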

GenBank format
Originates from the GenBank database. Contains both a DNA sequence and annotation of features (e.g. the location of genes). (Handout)

GenBank format - HEADER

LOCUS       CMGLOAD      1185 bp    DNA     linear   VRT 18-APR-2005
DEFINITION  Cairina moschata (duck) gene for alpha-D globin.
ACCESSION   X01831
VERSION     X GI:62724
KEYWORDS    alpha-globin; globin.
SOURCE      Cairina moschata (Muscovy duck)
  ORGANISM  Cairina moschata
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Archosauria; Aves; Neognathae; Anseriformes; Anatidae; Cairina.
REFERENCE   1  (bases 1 to 1185)
  AUTHORS   Erbil,C. and Niessing,J.
  TITLE     The primary structure of the duck alpha D-globin gene: an unusual
            5' splice junction sequence
  JOURNAL   EMBO J. 2 (8), (1983)
  PUBMED
COMMENT     Data kindly reviewed (13-NOV-1985) by J. Niessing.
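The LOCUS line packs several fields into one row. The sketch below splits the line from this slide into named fields. It assumes the eight whitespace-separated fields seen here; LOCUS lines in other records can differ (for example in how the molecule type is written), so this is an illustration, not a general GenBank parser. The function name `parse_locus` and the dictionary keys are chosen for this example.

```python
def parse_locus(line: str) -> dict:
    """Split a LOCUS line laid out like the one on the slide into named fields."""
    fields = line.split()
    if len(fields) < 8 or fields[0] != "LOCUS" or fields[3] != "bp":
        raise ValueError(f"unexpected LOCUS layout: {line!r}")
    return {
        "name": fields[1],          # locus name
        "length": int(fields[2]),   # sequence length in base pairs
        "molecule": fields[4],      # e.g. DNA
        "topology": fields[5],      # linear or circular
        "division": fields[6],      # e.g. VRT
        "date": fields[7],          # date of last modification
    }


locus = parse_locus("LOCUS CMGLOAD 1185 bp DNA linear VRT 18-APR-2005")
print(locus["name"], locus["length"], locus["division"])
```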

GenBank format - ORIGIN section

ORIGIN
        1 ctgcgtggcc tcagcccctc cacccctcca cgctgataag ataaggccag ggcgggagcg
       61 cagggtgcta taagagctcg gccccgcggg tgtctccacc acagaaaccc gtcagttgcc
      121 agcctgccac gccgctgccg ccatgctgac cgccgaggac aagaagctca tcgtgcaggt
      181 gtgggagaag gtggctggcc accaggagga attcggaagt gaagctctgc agaggtgtgg
      241 gctgggccca gggggcactc acagggtggg cagcagggag caggagccct gcagcgggtg
      301 tgggctggga cccagagcgc cacggggtgc gggctgagat gggcaaagca gcagggcacc
      361 aaaactgact ggcctcgctc cggcaggatg ttcctcgcct acccccagac caagacctac
      421 ttcccccact tcgacctgca tcccggctct gaacaggtcc gtggccatgg caagaaagtg
      481 gcggctgccc tgggcaatgc cgtgaagagc ctggacaacc tcagccaggc cctgtctgag
      541 ctcagcaacc tgcatgccta caacctgcgt gttgaccctg tcaacttcaa ggcaagcggg
      601 gactagggtc cttgggtctg ggggtctgag ggtgtggggt gcagggtctg ggggtccagg
      661 ggtctgagtt tcctggggtc tggcagtcct gggggctgag ggccagggtc ctgtggtctt
      721 gggtaccagg gtcctggggg ccagcagcca gacagcaggg gctgggattg catctgggat
      781 gtgggccaga ggctgggatt gtgtttggaa tgggagctgg gcaggggcta gggccagggt
      841 gggggactca gggcctcagg gggactcggg gggggactga gggagactca gggccatctg
      901 tccggagcag gggtactaag ccctggtttg ccttgcagct gctggcacag tgcttccagg
      961 tggtgctggc cgcacacctg ggcaaagact acagccccga gatgcatgct gcctttgaca
     1021 agttcttgtc cgccgtggct gccgtgctgg ctgaaaagta cagatgagcc actgcctgca
     1081 cccttgcacc ttcaataaag acaccattac cacagctctg tgtctgtgtg tgctgggact
     1141 gggcatcggg ggtcccaggg agggctgggt tgcttccaca catcc
//
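In the ORIGIN section the sequence is broken into blocks of ten bases, each row starts with the position of its first base, and `//` ends the record. Recovering the plain sequence means dropping the numbers and the spacing. A minimal sketch follows; the function name `parse_origin` is chosen for this example, and the input is only the last row of the record above.

```python
def parse_origin(block: str) -> str:
    """Join the base blocks of an ORIGIN section into one uppercase sequence."""
    bases = []
    for token in block.split():
        if token == "//":
            break                     # end-of-record marker
        if token == "ORIGIN" or token.isdigit():
            continue                  # skip the keyword and position numbers
        bases.append(token)
    return "".join(bases).upper()


# Last row of the record: it starts at position 1141.
tail = """ORIGIN
     1141 gggcatcggg ggtcccaggg agggctgggt tgcttccaca catcc
//"""

seq = parse_origin(tail)
print(len(seq))   # 45
```

The row starts at position 1141 and holds 45 bases, so the sequence ends at base 1185, which agrees with the length given on the LOCUS line.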

GenBank format - FEATURE section

FEATURES             Location/Qualifiers
     source
                     /organism="Cairina moschata"
                     /mol_type="genomic DNA"
                     /db_xref="taxon:8855"
     CAAT_signal
     TATA_signal
     precursor_RNA
                     /note="primary transcript"
     exon
                     /number=1
     CDS             join( , , )
                     /codon_start=1
                     /product="alpha D-globin"
                     /protein_id="CAA "
                     /db_xref="GI: "
                     /db_xref="GOA:P02003"
                     /db_xref="InterPro:IPR000971"
                     /db_xref="InterPro:IPR002338"
                     /db_xref="InterPro:IPR002340"
                     /db_xref="InterPro:IPR009050"
                     /db_xref="UniProt/Swiss-Prot:P02003"
                     /translation="MLTAEDKKLIVQVWEKVAGHQEEFGSEALQRMFLAYPQTKTYFP
                     HFDLHPGSEQVRGHGKKVAAALGNAVKSLDNLSQALSELSNLHAYNLRVDPVNFKLLA
                     QCFQVVLAAHLGKDYSPEMHAAFDKFLSAVAAVLAEKYR"
     repeat_region
                     /note="direct repeat 1"
     intron
                     /number=1
     repeat_region
                     /note="direct repeat 1"
     exon
                     /number=2
     intron
                     /number=2
     exon
                     /number=3
     polyA_signal
     polyA_signal    1114

Exercise: GenBank Work in groups of 2-3 people. The exercise guide is linked from the course programme. Read the guide carefully - it contains a lot of information about GenBank.