

An Information Flow in Biology Primer
DNA (nucleotides) -> RNA (nucleotides) -> Protein (amino acids)
DNA -> DNA: replication (mutation!)
DNA -> RNA: transcription
RNA -> Protein: translation
DNA: genes; RNA: messages
Nucleic acids ~ “software”; Protein ~ “hardware”

Examples of DNA and Protein Structure
DNA: (figure)
Protein: amino acids, antibody, HIV reverse transcriptase (figures)

An Evolution by Natural Selection Primer
Mutations (hence new varieties) do not arise because they are needed -- they arise by chance.
Mutations merely furnish random raw material for evolution, and rarely, if ever, determine the course of the process.
Natural selection is the differential reproduction of genotypes (genes).
Evolution is the change in the genetic composition of a population over time.
“Natural Selection is not Evolution” – Ronald Fisher, The Genetical Theory of Natural Selection
[see the Weasel applet for a demonstration of the power of selection]
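The point that mutation is random while selection is not can be sketched in a few lines of Python. This is not the code behind the Weasel applet; it is a minimal version of the classic "weasel" demonstration, and the target phrase, brood size, mutation rate, and the choice to keep the parent in the selection pool are all assumptions made for illustration.

```python
import random
import string

ALPHABET = string.ascii_uppercase + " "
TARGET = "METHINKS IT IS LIKE A WEASEL"  # assumed target phrase


def fitness(candidate, target=TARGET):
    """Number of positions at which candidate already matches the target."""
    return sum(a == b for a, b in zip(candidate, target))


def mutate(parent, rate, rng):
    """Copy parent, replacing each letter at random with probability `rate`."""
    return "".join(
        rng.choice(ALPHABET) if rng.random() < rate else ch for ch in parent
    )


def weasel(target=TARGET, offspring=100, rate=0.05, max_generations=5000, seed=0):
    """Return (final_string, generations_used).

    Mutations are blind to the target; only the selection step looks at fitness.
    """
    rng = random.Random(seed)
    parent = "".join(rng.choice(ALPHABET) for _ in target)
    for generation in range(max_generations):
        if parent == target:
            return parent, generation
        brood = [mutate(parent, rate, rng) for _ in range(offspring)]
        # Differential reproduction: the fittest string (parent included,
        # an assumed design choice) founds the next generation.
        parent = max(brood + [parent], key=lambda s: fitness(s, target))
    return parent, max_generations


if __name__ == "__main__":
    final, generations = weasel()
    print(final, "after", generations, "generations")
```

Setting `offspring=1` shows how much of the work selection does: with a single copy per generation there is little variation to choose from, and progress toward the target slows sharply.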

[Figure labels: species, gene frequency, time]

Chapter VI …Organs of extreme perfection and complication. -- To suppose that the eye, with all its inimitable contrivances for adjusting the focus to different distances, for admitting different amounts of light, and for the correction of spherical and chromatic aberration, could have been formed by natural selection, seems, I freely confess, absurd in the highest possible degree. Yet reason tells me, that if numerous gradations from a perfect and complex eye to one very imperfect and simple, each grade being useful to its possessor, can be shown to exist; if further, the eye does vary ever so slightly, and the variations be inherited, which is certainly the case; and if any variation or modification in the organ be ever useful to an animal under changing conditions of life, then the difficulty of believing that a perfect and complex eye could be formed by natural selection, though insuperable by our imagination, can hardly be considered real. …

Chapter VI …Organs of extreme perfection and complication. -- To suppose that the eye, with all its inimitable contrivances for adjusting the focus to different distances, for admitting different amounts of light, and for the correction of spherical and chromatic aberration, could have been formed by natural selection, seems, I freely confess, abserd in the highest possible degree. Yet reason tells me, that if numerous gradations from a perfect and complex eye to one very imperfect and simple, each grade being useful to its possessor, can be shown to exist; if further, the eye does vary ever so slightly, and the variations be inherited, which is certainly the case; and if any variation or modification in the organ be ever useful to an animal under changing conditions of life, then the difficulty of believing that a perfect and complex eye could be formed by natural selection, though insuperable by our imagination, can hardly be considered real. …

Chapter VI …Organs of extreme perfection and complication. -- To suppose that the eye, with all its inimitable contrivances for adjusting the focus to different distances, for admitting different amounts of light, and for the correction of spherical and chromatic aberration, could have been formed by natural selection, seems, I freely confess, abserd in the highest possible degree. Yet reason tells me, that if numerous gradations from a perfect and complex eye to one very imperfect and simple, each grade being useful to its possessor, can be shown to exist; if further, the eye does vary ever so slightly, and the variations be inherited, which is certainly the case; and if any variation or modification in the organ be ever useful to an animal under changing conditions of life, then the difficulty of believing that a perfect and complex eye could be formed by natural selection, though insuperible by our imagination, can hardly be considered real. …

Chapter VI …Organs of extreme perfection and complication. -- To suppose that the eye, with all its inimitable contrivances for adjusting the focus to different distances, for admitting different amounts of light, and for the correction of spherical and chromatic aberration, could have been formed by natural selection, seems, I freely confess, abserd in the highest possible degree. Yet reason tells me, that if numerous gradations from a perfect and complex eye to one very imperfect and simple, each grade being useful to its possessor, can be shown to exist; if further, the eye does vary ever so slightly, and the variations be inherited, which is certainly the case; and if any variation or modification in the organ be ever useful to an animal under changing conditions of life, then the difficulty of believing that a perfect and complex eye could be formed by natural selection, though insuperible by our imagination, can hardly be considered real. …

Chapter VI …Organs of extreme perfection and complication. -- To suppose that the eye, with all its inimitable contrivances for adjusting the focus to different distances, for admitting different amounts of light, and for the correction of spherical and chromatic aberration, could have been formed by natural selection, seems, I freely confess, abserd in the highest possible degree. Yet reason tells me, that if numerous gradations from a perfect and complex eye to one very imperfect and simple, each grade being useful to its possesser, can be shown to exist; if further, the eye does vary ever so slightly, and the variations be inherited, which is certainly the case; and if any variation or modification in the organ be ever useful to an animal under changing conditions of life, then the difficulty of believing that a perfect and complex eye could be formed by natural selection, though insuperible by our imagination, can hardly be considered real. …

Chapter VI Yet reason tells me, that if numerous gradations from a perfect and complex eye to one very imperfect and simple, each grade being useful to its possesser, can be shown to exist; if further, the eye does vary ever so slightly, and the variations be inherited, which is certainly the case; and if any variation or modification in the organ be ever useful to an animal under changing conditions of life, then the difficulty of believing that a perfect and complex eye could be formed by natural selection, though insuperible by our imagination, can hardly be considered real. … …Organs of extreme perfection and complication. -- To suppose that the eye, with all its inimitable contrivances for adjusting the focus to different distances, for admitting different amounts of light, and for the correction of spherical and chromatic aberration, could have been formed by natural selection, seems, I freely confess, abserd in the highest possible degree.

Chapter VI A phylogeny of Chapter VI’s

Determining Similarity*

To determine how similar your sequence is to other sequences in the database, you rely on scores determined by the program you are using. The scores are generated in different ways by each of the different programs, but in general they are determined by substitution matrices. To understand how these matrices work, let's look at the simplest case: aligning two nucleotide sequences. Generally, when two nucleotide sequences are aligned, the scores are as follows:

+2 = identity
-1 = mismatch

Example        Score

GATACA
GATACA         = 12

GATACC
GATACA         = 9

GAAACA
GATACA         = 9

GAAGCC
GATACA         = 3 (three identities, three mismatches)

GATCCCACA
GAT   ACA      = 9 (gap)

GATAC
GATACA         = 9 (gap)

The situation for proteins is more complex!

* This slide modified from Andreas Matern’s presentation
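The +2 / -1 scheme is easy to check by hand or in code. The sketch below scores two sequences that have already been aligned, with "-" marking a gap position. The slide does not state a gap penalty; -1 per gap position is an assumption, chosen because it reproduces the two gapped scores of 9 in the examples. The function name is hypothetical.

```python
MATCH = 2      # identity, from the slide
MISMATCH = -1  # mismatch, from the slide
GAP = -1       # assumption: each gap position costs the same as a mismatch


def alignment_score(top, bottom):
    """Score two pre-aligned sequences of equal length, column by column."""
    if len(top) != len(bottom):
        raise ValueError("aligned sequences must have the same length")
    score = 0
    for a, b in zip(top, bottom):
        if a == "-" or b == "-":
            score += GAP
        elif a == b:
            score += MATCH
        else:
            score += MISMATCH
    return score


if __name__ == "__main__":
    print(alignment_score("GATACA", "GATACA"))        # six identities
    print(alignment_score("GATACC", "GATACA"))        # five identities, one mismatch
    print(alignment_score("GATCCCACA", "GAT---ACA"))  # six identities, three gap positions
    print(alignment_score("GATAC-", "GATACA"))        # five identities, one gap position
```

A protein version would replace the two constants with a lookup into a substitution matrix, so that each pair of residues gets its own score.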

BLAST - Basic Local Alignment Search Tool
Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. J. Mol. Biol. (1990) 215:

BLAST sifts through the huge amounts of data in a database, scanning a nucleotide database at 2 x 10^6 bases per second and a protein database at 500,000 residues per second! How does it do it so fast? Well, this is a little hairy -- and probably more statistics than anyone really needs to know. I'll try to add some of the statistical stuff at a later date, but for now remember that, as previously mentioned, BLAST does not go through each and every sequence; it uses a LOCAL alignment heuristic.

Without belaboring the statistics, BLAST divides your sequence into words - smaller segments with a given length (w). Nucleotide sequences are normally broken up into words of length w = 12. Then BLAST goes through the database (which has also been broken up into words) to find word pairs with a score above a predetermined threshold. Using a statistical heuristic developed by Karlin and Altschul, BLAST can eliminate some of these attempted word pairings by estimating the score at which a match is no better than chance. By ignoring all pairings at and below this score, BLAST can effectively disregard a great deal of the database.

BLAST finds only those word pairs that have a score of at least T - a threshold value. Once it finds a hit, it tries to extend that hit until a cutoff score (S) is reached. Extending means that it adds letters to the ends of the word pair and then assesses the new score.

To reiterate:
1. Cut up the sequences into smaller pieces called words
2. Ignore all pairs below the threshold score
3. Try to extend all remaining hits until you get to a cutoff score

In the original BLAST paper, Altschul et al. go through a series of tests to generate (via random simulation and through real data) values of w, T, and S which are the most biologically relevant and yet computationally useful.

Generally speaking, the lower the threshold (T) value is, the greater the chance of finding a "hit" of at least S. However, small values of T increase the number of hits, and therefore the amount of time it takes for BLAST to sort through the database.

This slide taken from Andreas Matern’s presentation
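The three steps can be sketched as a toy seed-and-extend search in Python. This is not BLAST's implementation. It assumes nucleotide sequences, treats only exact word matches as hits (so the threshold T is not modeled), reuses the +2 / -1 scores from the similarity slide, and stops extending when the running score falls `drop` below the best score seen, which is an assumed stopping rule. The function names, the short word length w = 4, and the cutoff S = 12 are all hypothetical choices for a small example.

```python
from collections import defaultdict

MATCH, MISMATCH = 2, -1  # nucleotide scores from the similarity slide


def index_words(seq, w):
    """Step 1: map every length-w word in seq to its start positions."""
    index = defaultdict(list)
    for i in range(len(seq) - w + 1):
        index[seq[i:i + w]].append(i)
    return index


def extend(query, subject, qi, si, w, drop):
    """Step 3: grow an exact word hit right, then left, without gaps.

    Returns (score, query_start, subject_start, length) of the best segment.
    """
    best = score = w * MATCH
    left = right = 0
    k = 0
    while qi + w + k < len(query) and si + w + k < len(subject):
        score += MATCH if query[qi + w + k] == subject[si + w + k] else MISMATCH
        k += 1
        if score > best:
            best, right = score, k
        elif best - score >= drop:
            break
    score = best  # continue leftwards from the best right-hand extension
    k = 0
    while qi - k - 1 >= 0 and si - k - 1 >= 0:
        score += MATCH if query[qi - k - 1] == subject[si - k - 1] else MISMATCH
        k += 1
        if score > best:
            best, left = score, k
        elif best - score >= drop:
            break
    return best, qi - left, si - left, left + w + right


def toy_blast(query, subject, w=4, S=12, drop=3):
    """Return the set of ungapped segment pairs that score at least S."""
    index = index_words(subject, w)
    hits = set()
    for qi in range(len(query) - w + 1):
        # Step 2: only query words that also occur in the subject are kept.
        for si in index.get(query[qi:qi + w], []):
            segment = extend(query, subject, qi, si, w, drop)
            if segment[0] >= S:
                hits.add(segment)
    return hits


if __name__ == "__main__":
    print(toy_blast("GATACA", "TTGATACATT"))
```

In this example the three overlapping words of the query (GATA, ATAC, TACA) each seed a hit, and all three extend to the same six-letter segment, so a single segment pair with score 12 is reported. Indexing the subject once and looking words up in a dictionary is what lets the search skip most of the database instead of aligning against every position.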

Let’s do a BLAST search

Protein Tyrosine Kinases (PTKs)
Protein Tyrosine Phosphatases (PTPs)

DPez: 1252 aa, ~140 kD
Domain labels: WIP, PTP, FERM
34% identical, 46% similar to human Pez
37% identical, 53% similar to human Pez
27% identical, 47% similar to human WIP

FERM PTP