Pipelines for Computational Analysis (Bioinformatics)

Presentation transcript:

Pipelines for Computational Analysis (Bioinformatics) Tutorial on Comparative Genomics part IV NSF DBI-1515704

What are the questions? We can use computational techniques from bioinformatics to address a number of core questions about the evolution of species, drawing on the large number of genome sequences that have been generated. Some core questions are: Which genes are found uniquely in a genome or set of closely related genomes? Which genes were duplicated or lost at different points in the evolution of species? Which protein-encoding genes have evolved particularly rapidly, potentially changing function under selective pressure, on any particular lineage?

How do we answer these questions? We will need a full bioinformatics pipeline to address these questions systematically. The first few steps in this pipeline identify sets of related genes, based upon sequence similarity, from the same genome and from the genomes of other species; each such set is a gene family. A tool called BLAST is used to do this efficiently. A sample BLAST output from the NCBI website is shown below.
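The grouping step can be sketched in a few lines of Python. This is only an illustration, not the tutorial's actual pipeline: it assumes BLAST was run with tabular output (`-outfmt 6`, where column 11 is the e-value), and the e-value cutoff, the single-linkage clustering rule, and the toy hit lines are all assumptions made for the example.

```python
from collections import defaultdict


def parse_blast_tabular(lines, max_evalue=1e-5):
    """Yield (query, subject) pairs from BLAST tabular (-outfmt 6) lines.

    The e-value cutoff is an assumed, illustrative threshold.
    """
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        query, subject, evalue = fields[0], fields[1], float(fields[10])
        # Ignore self-hits and hits weaker than the cutoff.
        if query != subject and evalue <= max_evalue:
            yield query, subject


def cluster_families(pairs):
    """Group genes into families by single linkage (union-find)."""
    parent = {}

    def find(gene):
        parent.setdefault(gene, gene)
        while parent[gene] != gene:
            parent[gene] = parent[parent[gene]]  # path halving
            gene = parent[gene]
        return gene

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    families = defaultdict(set)
    for gene in list(parent):
        families[find(gene)].add(gene)
    return sorted((sorted(f) for f in families.values()), key=lambda f: f[0])


# Hypothetical hits: the last line has a weak e-value and is filtered out.
EXAMPLE_HITS = [
    "geneA\tgeneB\t85.0\t200\t30\t0\t1\t200\t1\t200\t1e-50\t300",
    "geneB\tgeneC\t70.0\t180\t54\t0\t1\t180\t5\t184\t1e-30\t200",
    "geneD\tgeneE\t90.0\t150\t15\t0\t1\t150\t1\t150\t1e-60\t280",
    "geneA\tgeneD\t30.0\t50\t35\t0\t1\t50\t1\t50\t0.5\t25",
]

if __name__ == "__main__":
    for family in cluster_families(parse_blast_tabular(EXAMPLE_HITS)):
        print(family)
```

Single linkage is the simplest possible rule and can merge unrelated families through one spurious hit, so real pipelines typically add coverage filters or use a dedicated clustering tool.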

Multiple Sequence Alignment Once a gene family has been obtained, the next step is to produce a multiple sequence alignment. The alignment shows the historical relationship of each amino acid position in a given protein to the corresponding position in every other protein, and it identifies positions that do not share a historical relationship (are not homologous). The figure at right, taken from http://www.pagepress.org/journals/index.php/eb/article/view/eb.2010.e7/2536, shows a sample multiple sequence alignment.
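To make the idea of alignment columns concrete, here is a small Python sketch that takes an already-aligned set of sequences and reports which columns contain gaps (positions with no homolog in some sequence) and which are fully conserved. The toy sequences and the function name are invented for illustration; producing the alignment itself requires a dedicated alignment program.

```python
def classify_columns(alignment):
    """Return (gapped, conserved) column indices of an aligned sequence set.

    `alignment` maps a sequence name to its aligned string, with '-' for gaps.
    """
    rows = list(alignment.values())
    if len({len(row) for row in rows}) != 1:
        raise ValueError("aligned sequences must all have the same length")

    gapped, conserved = [], []
    for index, column in enumerate(zip(*rows)):
        if "-" in column:
            gapped.append(index)       # at least one sequence lacks this position
        elif len(set(column)) == 1:
            conserved.append(index)    # identical residue in every sequence
    return gapped, conserved


# Hypothetical three-sequence protein alignment.
TOY_ALIGNMENT = {
    "human": "MKT-LLV",
    "mouse": "MKTALLV",
    "fish":  "MRT-LIV",
}

if __name__ == "__main__":
    gapped, conserved = classify_columns(TOY_ALIGNMENT)
    print("gapped columns:", gapped)
    print("conserved columns:", conserved)
```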

Phylogenies (Gene Trees) Phylogenetic trees show the relationship of taxa, which can be species or genes, to each other. The vertical direction in the trees shown approximates time, and following the branching pattern shows the relationships. Internal points (nodes) that connect taxa represent their most recent common ancestor. There are several ways to estimate phylogenetic trees from patterns of sequence similarity, most commonly using an explicit mathematical model for the evolutionary process of sequence divergence. An example tree derived from an alignment is shown below.
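As a rough illustration of how a tree can be estimated from an alignment, the Python sketch below computes pairwise distances under the Jukes-Cantor model and joins taxa with UPGMA. This is one simple distance-based approach chosen for brevity, not necessarily the method used in the tutorial; likelihood-based and Bayesian methods are more common in practice. The sequences and names are made up.

```python
import math
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Fraction of differing sites, ignoring columns where either has a gap."""
    pairs = [(x, y) for x, y in zip(seq_a, seq_b) if x != "-" and y != "-"]
    return sum(x != y for x, y in pairs) / len(pairs)


def jukes_cantor(p):
    """Jukes-Cantor corrected distance for nucleotides (requires p < 0.75)."""
    return -0.75 * math.log(1 - 4 * p / 3)


def upgma(seqs):
    """Build a rooted tree (nested tuples) from aligned DNA sequences."""
    clusters = {name: 1 for name in seqs}  # cluster label -> number of leaves
    dist = {
        frozenset((a, b)): jukes_cantor(p_distance(seqs[a], seqs[b]))
        for a, b in combinations(seqs, 2)
    }
    while len(clusters) > 1:
        # Join the two closest clusters.
        a, b = min(combinations(clusters, 2), key=lambda ab: dist[frozenset(ab)])
        size_a, size_b = clusters.pop(a), clusters.pop(b)
        merged = (a, b)
        # Distance to the new cluster is the size-weighted average.
        for other in clusters:
            dist[frozenset((merged, other))] = (
                size_a * dist[frozenset((a, other))]
                + size_b * dist[frozenset((b, other))]
            ) / (size_a + size_b)
        clusters[merged] = size_a + size_b
    return next(iter(clusters))


# Hypothetical aligned DNA sequences.
TOY_SEQS = {
    "A": "ACGTACGTAC",
    "B": "ACGTACGTAT",
    "C": "ACGAACGGAT",
    "D": "TCGAACGGTT",
}

if __name__ == "__main__":
    print(upgma(TOY_SEQS))
```

UPGMA assumes that all lineages evolve at the same rate, which is often violated in real data; it is used here only because it fits in a few lines.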

Reconciling gene trees and species trees From large collections of genome sequence data, we have established relationships among many species. These have been assembled into phylogenetic trees, called species trees. We can also build gene trees for individual genes. Comparing a gene tree with the species tree tells us where particular genes originated, were duplicated, or were lost. The totality of this information tells us the sets of genes that changed along particular species tree lineages. Taken from the PrIMETV website (http://prime.sbc.su.se/primetv/), a sample reconciliation of primate genes is shown.
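One standard way to compare the two trees is LCA mapping: each gene tree node is mapped to the smallest species tree clade that contains all of its species, and a node that maps to the same clade as one of its children is called a duplication. The Python sketch below illustrates this under some assumptions: trees are written as nested tuples, gene leaves are named `species_copy`, and losses are not inferred. It is not the algorithm used by PrIMETV.

```python
def leaf_set(tree):
    """Set of leaf names under a node of a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def species_clades(species_tree):
    """All clades of the species tree as leaf sets, smallest first."""
    clades = []

    def visit(node):
        clades.append(leaf_set(node))
        if not isinstance(node, str):
            for child in node:
                visit(child)

    visit(species_tree)
    return sorted(clades, key=len)


def find_duplications(gene_tree, species_tree,
                      species_of=lambda leaf: leaf.split("_")[0]):
    """Return the gene tree nodes inferred to be duplications (LCA mapping)."""
    clades = species_clades(species_tree)

    def lca(species):
        # Smallest clade that contains every species in the set.
        return next(clade for clade in clades if species <= clade)

    duplications = []

    def mapping(node):
        if isinstance(node, str):
            return lca(frozenset([species_of(node)]))
        child_maps = [mapping(child) for child in node]
        here = lca(frozenset().union(*child_maps))
        if here in child_maps:  # maps to the same clade as a child
            duplications.append(node)
        return here

    mapping(gene_tree)
    return duplications


# Hypothetical example: one duplication before the three species diverged.
SPECIES_TREE = (("human", "chimp"), "gorilla")
GENE_TREE = (
    (("human_a", "chimp_a"), "gorilla_a"),
    (("human_b", "chimp_b"), "gorilla_b"),
)

if __name__ == "__main__":
    for node in find_duplications(GENE_TREE, SPECIES_TREE):
        print("duplication at:", node)
```

In this toy case only the root of the gene tree is flagged, because both of its subtrees span all three species; every other internal node corresponds to a speciation.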

Detecting Positive Diversifying Selection Because 64 codons encode only 20 amino acids (plus stop signals), there are two types of nucleotide changes in protein-coding regions of genomes. Synonymous substitutions change the codon to one that still encodes the same amino acid. Nonsynonymous substitutions change the codon to one that encodes a different amino acid. If a protein is evolving without selective pressure, one expects the rates of these two types of change to be the same. However, because proteins already function, many amino acid mutations make proteins worse and are eliminated from the population by selection. This makes the rate of synonymous change faster than the rate of nonsynonymous change. In rare cases, however, the nonsynonymous rate is faster. This can indicate selective pressure to change protein function on a particular gene tree branch. The figures on the next slide show the genetic code and a lineage of a phylogenetic tree with the ratio biased towards more nonsynonymous change.
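The distinction between the two kinds of change can be illustrated with a short Python sketch that builds the standard genetic code and classifies the codon differences between two aligned coding sequences. This is a deliberately simplified count under stated assumptions: the sequences are in frame, and codons that differ at more than one position are skipped. A real rate ratio also normalizes by the number of synonymous and nonsynonymous sites and corrects for multiple substitutions, which is left to dedicated software. The example sequences are invented.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def count_substitutions(seq_a, seq_b):
    """Count (synonymous, nonsynonymous) single-nucleotide codon differences.

    Simplification: only codon pairs differing at exactly one position are
    classified; gapped or ambiguous codons are skipped.
    """
    if len(seq_a) != len(seq_b) or len(seq_a) % 3:
        raise ValueError("sequences must be aligned, in frame, and equal length")

    synonymous = nonsynonymous = 0
    for start in range(0, len(seq_a), 3):
        codon_a, codon_b = seq_a[start:start + 3], seq_b[start:start + 3]
        if codon_a not in CODON_TABLE or codon_b not in CODON_TABLE:
            continue
        if sum(x != y for x, y in zip(codon_a, codon_b)) != 1:
            continue
        if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            synonymous += 1
        else:
            nonsynonymous += 1
    return synonymous, nonsynonymous


# Hypothetical sequences: CTT->CTC keeps leucine, AAA->AGA changes K to R.
SEQ_A = "ATGCTTGGAAAA"
SEQ_B = "ATGCTCGGAAGA"

if __name__ == "__main__":
    print(count_substitutions(SEQ_A, SEQ_B))
```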

Figures for Positive Diversifying Selection On the left is a figure of the genetic code taken from https://upload.wikimedia.org/wikipedia/commons/thumb/d/d6/GeneticCode21-version-2.svg/2000px-GeneticCode21-version-2.svg.png. It shows the multiple codons that encode each individual amino acid (all except M and W, which have a single codon each). On the right is an image showing the ratio of nonsynonymous to synonymous rates for the myostatin gene; the ratio is greater than 1 on the grey lineages of the species tree for this single copy gene. This is taken from Mol. Phylo. Evol. 33:782 (2004).