Classification depicted as a tree Species, Genus, Family, Order, Class
Comparison of limbs Image source:
Theory of evolution Charles Darwin
Phylogenetic basis of systematics Linnaeus: Ordering principle is God. Darwin: Ordering principle is shared descent from common ancestors. Today, systematics is explicitly based on phylogeny.
Natural Selection: Darwin’s four postulates More young are produced each generation than can survive to reproduce. Individuals in a population vary in their characteristics. Some differences among individuals are based on genetic differences. Individuals with favorable characteristics have higher rates of survival and reproduction. Evolution by means of natural selection Presence of “design-like” features in organisms: quite often features are there “for a reason”.
Evolution at the sequence level
About DNA DNA contains the recipes for how to make proteins / enzymes. Every time a cell divides, its DNA is duplicated, and each daughter cell gets a copy.
The DNA alphabet The information in the DNA is written in a four letter code: A, T, G, C. The DNA can be “sequenced” and the result stored in a computer file. ATGGCCCTGTGGAT
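Once a sequence is stored as text, simple questions about it become string operations. A minimal sketch, assuming the sequence is a plain string over A, T, G, C (the function name `base_composition` is made up for this illustration):

```python
from collections import Counter

DNA_ALPHABET = set("ATGC")


def base_composition(seq: str) -> dict:
    """Count each of the four bases; reject anything outside A, T, G, C."""
    seq = seq.upper()
    unexpected = set(seq) - DNA_ALPHABET
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    counts = Counter(seq)
    return {base: counts[base] for base in "ATGC"}


# The example sequence from the slide.
print(base_composition("ATGGCCCTGTGGAT"))  # {'A': 2, 'T': 4, 'G': 5, 'C': 3}
```

Real sequence files may also contain ambiguity codes such as N; this sketch deliberately rejects them.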
DNA is always written 5’ to 3’
5’ AGCC 3’
3’ TCGG 5’
5’ ATGGCCAGGTAA 3’
DNA backbone: (Deoxy)ribose: Ribose, Deoxyribose
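The two-strand example can be reproduced in a few lines. This is a sketch, assuming an upper- or lower-case string of A, T, G, C only; the helper names are chosen for illustration:

```python
# Translation table for base pairing: A<->T and G<->C.
COMPLEMENT = str.maketrans("ATGC", "TACG")


def complement(seq: str) -> str:
    """Base-pair partner of each position, in the same left-to-right order."""
    return seq.upper().translate(COMPLEMENT)


def reverse_complement(seq: str) -> str:
    """The opposite strand, written in the conventional 5' to 3' direction."""
    return complement(seq)[::-1]


top = "AGCC"
print(complement(top))          # TCGG: the lower strand, read 3' to 5'
print(reverse_complement(top))  # GGCT: the same lower strand, written 5' to 3'
```

Because sequences are written 5’ to 3’, the lower strand of the example would appear in a sequence file as GGCT, not TCGG.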
Can DNA be changed? ATGGCCCTGTGGATGCG
Can DNA be changed?
ATGGCCCTGTGGATGCG
ATGGCCCTATGGATGCG
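The difference between the two sequences is hard to spot by eye but easy to find by comparing them position by position. A small sketch, assuming both sequences have the same length (substitutions only, no insertions or deletions); `point_differences` is an illustrative name:

```python
def point_differences(before: str, after: str) -> list:
    """List (position, old_base, new_base) for every mismatch; positions are 1-based."""
    if len(before) != len(after):
        raise ValueError("sequences must have equal length")
    return [
        (pos, old, new)
        for pos, (old, new) in enumerate(zip(before, after), start=1)
        if old != new
    ]


original = "ATGGCCCTGTGGATGCG"
mutated = "ATGGCCCTATGGATGCG"
print(point_differences(original, mutated))  # [(9, 'G', 'A')]
```

Here a single G at position 9 has been replaced by an A.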
A history of mutations ATGGCCCTGTGTATGCG ATGGCAATGTGGATGCA ATGGCCCTGTGGATGCG ATGGCCCCGTGGATGCG ATGTCCCCGTGGATGCG ATGGCCCCGTGGAACCG Time
Real life example: Alignment
Insulin from 7 different species
Homo:   ATGGCCCTGTGGATGCGCCTCCTGCCCCTGCTGGCGCTGCTGGCCCTCTGGGGACCTGACCCAGCCGCAGCCTTTGTGAA
Pan:    ATGGCCCTGTGGATGCGCCTCCTGCCCCTGCTGGTGCTGCTGGCCCTCTGGGGACCTGACCCAGCCTCGGCCTTTGTGAA
Sus:    ATGGCCCTGTGGACGCGCCTCCTGCCCCTGCTGGCCCTGCTGGCCCTCTGGGCGCCCGCCCCGGCCCAGGCCTTCGTGAA
Ovis:   ATGGCCCTGTGGACACGCCTGGTGCCCCTGCTGGCCCTGCTGGCACTCTGGGCCCCCGCCCCGGCCCACGCCTTCGTCAA
Canis:  ATGGCCCTCTGGATGCGCCTCCTGCCCCTGCTGGCCCTGCTGGCCCTCTGGGCGCCCGCGCCCACCCGAGCCTTCGTTAA
Mus:    ATGGCCCTGTTGGTGCACTTCCTACCCCTGCTGGCCCTGCTTGCCCTCTGGGAGCCCAAACCCACCCAGGCTTTTGTCAA
Gallus: ATGGCTCTCTGGATCCGATCACTGCCTCTTCTGGCTCTCCTTGTCTTTTCTGGCCCTGGAACCAGCTATGCAGCTGCCAA
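One simple way to quantify how similar aligned sequences are is percent identity: the share of columns where two sequences carry the same base. The sketch below uses only the first 20 bases of three of the sequences above to keep it short, and assumes an ungapped alignment. A 20-base prefix is not representative of the whole gene.

```python
from itertools import combinations


def percent_identity(a: str, b: str) -> float:
    """Percentage of aligned positions where the two sequences agree."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


# First 20 bases of three of the insulin sequences from the alignment.
prefixes = {
    "Homo": "ATGGCCCTGTGGATGCGCCT",
    "Pan": "ATGGCCCTGTGGATGCGCCT",
    "Sus": "ATGGCCCTGTGGACGCGCCT",
}

for (name1, seq1), (name2, seq2) in combinations(prefixes.items(), 2):
    print(f"{name1}-{name2}: {percent_identity(seq1, seq2):.1f}%")
# Homo-Pan: 100.0%
# Homo-Sus: 95.0%
# Pan-Sus: 95.0%
```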
Real life example: Tree
Interpretation of Multiple Alignments Conserved features are assumed to be important for functionality. For instance: conserved pairs of cysteines indicate a possible disulphide bridge.
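Finding conserved positions in a multiple alignment amounts to scanning it column by column. The sketch below looks for columns where every sequence has a cysteine (C). The three protein sequences are invented toy data, not real proteins, and the function name is illustrative.

```python
def conserved_columns(alignment: list, residue: str) -> list:
    """Return the 1-based columns in which every sequence has `residue`."""
    if len({len(seq) for seq in alignment}) != 1:
        raise ValueError("aligned sequences must have equal length")
    return [
        col
        for col, residues in enumerate(zip(*alignment), start=1)
        if all(r == residue for r in residues)
    ]


# Invented toy alignment (hypothetical sequences, for illustration only).
toy_alignment = [
    "MKCLAGCWT",
    "MRCLSGCWS",
    "MKCIAGCFT",
]
print(conserved_columns(toy_alignment, "C"))  # [3, 7]
```

Two fully conserved cysteine columns would make this pair a candidate for a disulphide bridge; the alignment alone cannot confirm it.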
Sequences are related Darwin: all organisms are related through descent with modification. Prediction: similar molecules have similar functions in different organisms. Protein synthesis is carried out by very similar RNA-containing molecular complexes (ribosomes) that are present in all known organisms.
Sequences are related, II Related oxygen-binding proteins in humans
DNA as Biological Information Rasmus Wernersson
Overview Learning objectives –About Biological Information –A note about DNA sequencing techniques and DNA data –File formats used for biological data –Introduction to the GenBank database
Information flow in biological systems
DNA sequences = summary of information
5’ AGCC 3’
3’ TCGG 5’
5’ ATGGCCAGGTAA 3’
DNA backbone: (Deoxy)ribose: Ribose, Deoxyribose
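The short sequence ATGGCCAGGTAA shows how much a plain string of letters summarizes: read three bases at a time, it spells out a tiny protein. The sketch below assumes the sequence is the coding strand read from its first base, and its codon table is only an excerpt of the standard genetic code covering the four codons that occur here; any other codon raises a KeyError.

```python
# Excerpt of the standard genetic code: only the codons needed for the example.
CODON_TABLE = {"ATG": "M", "GCC": "A", "AGG": "R", "TAA": "*"}  # "*" marks stop


def transcribe(dna: str) -> str:
    """mRNA with the same sequence as the coding strand (T replaced by U)."""
    return dna.upper().replace("T", "U")


def translate(dna: str) -> str:
    """Translate codon by codon from position 0 until a stop codon."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


print(transcribe("ATGGCCAGGTAA"))  # AUGGCCAGGUAA
print(translate("ATGGCCAGGTAA"))   # MAR
```

The result reads Met-Ala-Arg followed by a stop codon.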
Gel electrophoresis DNA fragments are separated using gel electrophoresis –Typically 1% agarose –Stained with EtBr or SYBR Green (glows in UV light). –A DNA “ladder” is used for identification of known DNA lengths. (Figures: gel picture, PCR setup.)
The Sanger method of DNA sequencing (Figure labels: terminator, X-ray sequencing gel, OH.)
Automated sequencing The major breakthrough in sequencing has happened through automation. Fluorescent dyes. Laser-based scanning. Capillary electrophoresis. Computer-based base-calling and assembly.
Handout exercise: “base-calling” Handout: Chromatogram Groups of 2-3. Tasks: –Identify “difficult” regions –Identify “difficult” sequence stretches. –Try to estimate the best interval to use.
Biological data on computers The GenBank database File formats –FASTA –GenBank
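FASTA is the simpler of the two formats: each record is a header line starting with ">" followed by one or more lines of sequence. A minimal reader might look like the sketch below, which assumes well-formed input and uses invented example records; `read_fasta` is an illustrative name, not part of any library.

```python
import io


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted text stream."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)  # sequence may span several lines
    if header is not None:
        yield header, "".join(chunks)


# Invented example records; a real file would be opened with open(path).
example = io.StringIO(
    ">seq1 hypothetical example record\n"
    "ATGGCCCTGTGG\n"
    "ATGCG\n"
    ">seq2 single point mutation\n"
    "ATGGCCCTATGGATGCG\n"
)
records = list(read_fasta(example))
for header, seq in records:
    print(header, len(seq))
```

Only the sequence and a free-text header are stored; annotation such as gene locations needs a richer format like GenBank.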
NCBI GenBank GenBank is one of the main international DNA databases. GenBank is hosted by NCBI: National Center for Biotechnology Information. GenBank has existed since [...]. The database is public - no restrictions on the use of the data within.
GenBank format Originates from the GenBank database. Contains both a DNA sequence and annotation of features (e.g. location of genes). (handout)
GenBank format - HEADER
LOCUS       CMGLOAD                 1185 bp    DNA     linear   VRT 18-APR-2005
DEFINITION  Cairina moschata (duck) gene for alpha-D globin.
ACCESSION   X01831
VERSION     X GI:62724
KEYWORDS    alpha-globin; globin.
SOURCE      Cairina moschata (Muscovy duck)
  ORGANISM  Cairina moschata
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Archosauria; Aves; Neognathae; Anseriformes; Anatidae; Cairina.
REFERENCE   1  (bases 1 to 1185)
  AUTHORS   Erbil,C. and Niessing,J.
  TITLE     The primary structure of the duck alpha D-globin gene: an unusual
            5' splice junction sequence
  JOURNAL   EMBO J. 2 (8), (1983)
  PUBMED
COMMENT     Data kindly reviewed (13-NOV-1985) by J. Niessing.
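A GenBank header is laid out so that a program can read it: the keyword sits in the first 12 columns of a line and the value follows, with continuation lines left blank in the keyword field. The sketch below relies on that fixed-column assumption and uses an excerpt of the record above; `parse_genbank_header` is an illustrative helper, not a library function, and it flattens sub-keywords such as ORGANISM into the same dictionary.

```python
def parse_genbank_header(text: str) -> dict:
    """Map each header keyword to its value, joining continuation lines."""
    fields = {}
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue
        # Assumption: keyword in columns 1-12, value from column 13 onward.
        keyword, value = line[:12].strip(), line[12:].strip()
        if keyword:
            current = keyword
            fields[current] = value
        elif current is not None:
            fields[current] = (fields[current] + " " + value).strip()
    return fields


header = """\
LOCUS       CMGLOAD                 1185 bp    DNA     linear   VRT 18-APR-2005
DEFINITION  Cairina moschata (duck) gene for alpha-D globin.
ACCESSION   X01831
KEYWORDS    alpha-globin; globin.
SOURCE      Cairina moschata (Muscovy duck)
  ORGANISM  Cairina moschata
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Archosauria; Aves; Neognathae; Anseriformes; Anatidae; Cairina.
"""

fields = parse_genbank_header(header)
print(fields["ACCESSION"])                 # X01831
print(fields["LOCUS"].split()[1], "bp")    # 1185 bp
```

For real work an established parser is the safer choice; this sketch only illustrates why the fixed layout makes the format machine-readable.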