T-COFFEE Multiple Alignments of Orthologous Sequences Horizontal Gene Transfer (Phylogenetic Trees) WebLogo.

Presentation transcript:

T-COFFEE Multiple Alignments of Orthologous Sequences Horizontal Gene Transfer (Phylogenetic Trees) WebLogo

Overview
T-COFFEE
– Tree-based Consistency Objective Function for alignment Evaluation
– Focuses on orthologous gene sequences
– Used to generate multiple sequence alignments
WebLogo
– Constructed from multiple sequence alignment
Phylogenetic Trees
– Used to determine if your gene is derived from horizontal gene transfer

“Click” Enter ortholog sequences into query box – Where do I get these?

RECALL: What are orthologs?
Homologs
– Orthologs: genes in different species that diverged from a single ancestral gene when new species appeared (speciation)
  – Usually keep the same function in different organisms
– Paralogs: genes duplicated within a species
  – Perform slightly different tasks in the cell
    » Can develop new capabilities
    » Can become a pseudogene if functionality is lost but sequence similarity is retained
Insert Figure 8-41 from Microbiology – An Evolving Science © 2009 W.W. Norton & Company, Inc.

Under Homolog Selection, choose “Paralogs/Orthologs” from drop-down menu Where do I find orthologs? Scroll down

Scroll down to table containing list of orthologs

Select the genes by clicking these boxes The genes are ranked by ascending E-values Add the top 5 orthologs to Gene Cart Notice orthologous genes are from different organisms
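The ranking step above can be pictured as a simple sort. This is a minimal sketch, assuming the ortholog table has been copied into a list by hand; the `Hit` record, the Gene OIDs, and the organism names are hypothetical placeholders, not values from the database.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    gene_oid: str    # hypothetical identifier
    organism: str    # hypothetical organism name
    e_value: float   # smaller E-value = better match


def top_orthologs(hits, n=5):
    """Return the n hits with the smallest E-values, best match first."""
    return sorted(hits, key=lambda h: h.e_value)[:n]


# Illustrative values only.
example_hits = [
    Hit("1003", "Organism C", 3e-20),
    Hit("1001", "Organism A", 1e-50),
    Hit("1002", "Organism B", 2e-35),
]
best_two = top_orthologs(example_hits, n=2)
```

Sorting in ascending order puts the smallest E-value first, which matches the order of the table on the ortholog page.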

“Click” Scroll down to bottom of page

Only 5 genes were selected, why are 6 genes shown in the Gene Cart? One of the genes shown is your ASSIGNED gene (the one you are annotating)

Generate amino acid sequences for orthologs in FASTA format Select “FASTA Amino Acid format” Scroll down to “Export Genes” “Click”

Amino Acid sequences in FASTA format for all 6 genes will appear Scroll down

Your assigned gene is located at the bottom of this list (inspect Gene OID number) Copy / paste all 5 ortholog sequences into your notebook for this module EXCLUDE your gene, which should already be in your notebook Scroll down
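If you prefer to separate the ortholog sequences from your own gene programmatically rather than by eye, a small FASTA parser is enough. This sketch assumes the first whitespace-separated token of each header line is the Gene OID; that layout is an assumption, so check it against your exported file. The function names are illustrative.

```python
def parse_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def exclude_gene(records, gene_oid):
    """Drop the record whose header starts with gene_oid (assumed layout)."""
    return [(h, s) for h, s in records if h.split()[:1] != [gene_oid]]
```

Calling `exclude_gene(parse_fasta(exported_text), "your_gene_oid")` would leave only the ortholog records, ready to paste into the notebook.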

Recording results in your notebook The amino acid sequences in FASTA format for the top 5 orthologs Add heading and box

STEP 1: Copy / paste the amino acid sequence in FASTA format for your assigned gene into the query box for T-COFFEE. Return to the T-COFFEE page

STEP 2: Copy / paste the amino acid sequences in FASTA format for the top 5 orthologs into the same query box as your gene. T-COFFEE entries: separate individual sequences by a hard return. “Click”
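Steps 1 and 2 amount to assembling one block of FASTA text, with your assigned gene first and each ortholog starting on its own line. A minimal sketch, assuming the sequences are held as (header, sequence) pairs; the function name and the 60-character line width are illustrative choices, not requirements stated in the slides.

```python
def to_fasta(records, width=60):
    """Join (header, sequence) pairs into FASTA text, one record after another."""
    blocks = []
    for header, seq in records:
        # Break the sequence into fixed-width lines.
        lines = [seq[i:i + width] for i in range(0, len(seq), width)]
        blocks.append("\n".join([">" + header] + lines))
    # Each record begins on a new line (the "hard return" between entries).
    return "\n".join(blocks) + "\n"


def build_query(assigned, orthologs, width=60):
    """Put the assigned gene first, followed by the ortholog records."""
    return to_fasta([assigned] + list(orthologs), width=width)
```

The result of `build_query` can be pasted into the query box as a single block.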

Wait a few moments...

Select “Start JalView” to examine the multiple sequence alignment of the ortholog sequences T-COFFEE Results

Alignment inspection using JalView
Select “Percentage Identity” under the “Colour” menu
Compare to the consensus sequence
Reminder: Light Blue = Low Frequency; Dark Blue = High Frequency
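To see roughly what the blue shading reflects, the sketch below computes, for each alignment column, the most common residue and the percentage of sequences that carry it. It is a simplified illustration and not JalView's own code; counting gaps in the denominator and the function name are assumptions made here.

```python
from collections import Counter


def column_identity(alignment):
    """Return (consensus residue, percent of sequences matching it) per column.

    alignment is a list of equal-length aligned strings; '-' marks a gap.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("aligned sequences must all have the same length")
    results = []
    for i in range(length):
        residues = [row[i] for row in alignment if row[i] != "-"]
        if not residues:
            results.append(("-", 0.0))  # column made only of gaps
            continue
        residue, count = Counter(residues).most_common(1)[0]
        # Assumption: gaps count toward the total, lowering the percentage.
        results.append((residue, 100.0 * count / len(alignment)))
    return results
```

Columns with a high percentage correspond to the darker positions, and columns with a low percentage to the lighter ones.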

Copy / paste this alignment into your lab notebook Return to T-COFFEE Results

Identify organism in alignment by Gene OID Recording results in your notebook

T-COFFEE complete On to WebLogo

“Click” “Right Click” and open in IE tab (not Firefox)

Copy/paste multiple sequence alignment Scroll down

1- Select “amino acid” as sequence type 2- Select box for multiline logo “Click”

WebLogo Results
Zoom in
In IE, save picture as a .png file for upload to notebook
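The height of each letter stack in a sequence logo reflects how conserved that column is, measured in bits. A minimal sketch of that calculation for one column follows, assuming a 20-letter amino acid alphabet and ignoring gaps. Logo tools typically also apply a small-sample correction, which is omitted here, so the numbers will not match the WebLogo output exactly.

```python
import math
from collections import Counter


def column_information(column, alphabet_size=20):
    """Information content (bits) of one alignment column, without correction."""
    residues = [c for c in column if c != "-"]  # gaps are ignored (assumption)
    if not residues:
        return 0.0
    total = len(residues)
    counts = Counter(residues)
    # Shannon entropy of the observed residue frequencies.
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    # Maximum possible entropy minus observed entropy.
    return math.log2(alphabet_size) - entropy
```

A fully conserved column gives the maximum value of log2(20) bits, while a column split evenly between two residues gives one bit less.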

Recording results in your notebook

WHAT ARE OUR GOALS? 1. Build a phylogenetic tree 2. Determine if assigned genes are derived from horizontal gene transfer

Phylogenetic tree of Bacteria showing established & candidate phyla (organismal phylogeny ) Domain Phylum Class Order Family Genus Species Insert Figure 1 from Handelsman (2004) Microbiol. Mol. Biol. Rev. 68:

Three bacterial phyla closely related to Planctomycetes by 23S rRNA analysis (organismal phylogeny) Insert Figure 4A from Pilhofer et al. (2008) Characterization and Evolution of Cell Division and Cell Wall Synthesis Genes in the Bacterial Phyla Verrucomicrobia, Lentisphaerae, Chlamydiae, and Planctomycetes and Phylogenetic Comparison with rRNA Genes. J Bacteriology 190:

16S rRNA gene supports the monophyletic grouping Planctomycetales (organismal phylogeny by rDNA analysis)
Closest phylogenetic relatives of P. limnophilus (same family)

How do we build a phylogenetic tree?
– include P. limnophilus gene (first module)
– include the top 5 orthologs (second module)
– include genes from organisms closely related to P. limnophilus (i.e., same family)
– include genes from organisms less closely related to P. limnophilus (i.e., from phyla Verrucomicrobia, Chlamydiae, and Lentisphaerae)
– include genes from organisms that are distantly related to P. limnophilus

Recall: We want to include genes from organisms more closely related to P. limnophilus AND genes from organisms that are less closely related to P. limnophilus. So, depending on the organisms in your top 5 orthologs, there are 2 paths you can take after selecting the top 5 orthologs:
PATH 1: If top 5 are closely related to P. limnophilus, choose 5-10 less closely related organisms
PATH 2: If top 5 are less closely related to P. limnophilus, choose 5-10 more closely related organisms
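The choice between the two paths can be written as a small decision rule. This is a sketch under simplifying assumptions: relatedness is judged only by whether each ortholog's organism is in the same family as P. limnophilus, "most" is taken to mean more than half, and the family names are supplied by you. None of these details are fixed by the slides.

```python
def choose_path(ortholog_families, target_family, majority=0.5):
    """Pick PATH 1 or PATH 2 from the families of the top orthologs.

    ortholog_families: family name for each of the top orthologs.
    target_family: family of the organism being annotated.
    majority: fraction that counts as "most" (assumed to be one half).
    """
    if not ortholog_families:
        raise ValueError("no orthologs supplied")
    same = sum(1 for family in ortholog_families if family == target_family)
    if same / len(ortholog_families) > majority:
        return "PATH 1: add 5-10 orthologs from less closely related organisms"
    return "PATH 2: add 5-10 orthologs from more closely related organisms"
```

For example, if four of the five top orthologs come from the same family as your organism, the function returns the PATH 1 message.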

Building a phylogenetic tree. EXAMPLE: Organisms closely related to P. limnophilus (i.e., same family)

Building a phylogenetic tree. EXAMPLE: Organisms less closely related to P. limnophilus (i.e., from phyla Verrucomicrobia, Chlamydiae, and Lentisphaerae) Insert Figure 4A from Pilhofer et al. (2008) Characterization and Evolution of Cell Division and Cell Wall Synthesis Genes in the Bacterial Phyla Verrucomicrobia, Lentisphaerae, Chlamydiae, and Planctomycetes and Phylogenetic Comparison with rRNA Genes. J Bacteriology 190:

Inspect your top 5 orthologs: Which path? Example: PATH #1 – Most are in same family as P. limnophilus, so choose 5-10 sequences from less closely related organisms

Under Homolog Selection, choose “Paralogs/Orthologs” from drop-down menu Where do I find sequences? Scroll down

Scroll through the ortholog list and select some genes from less closely related as well as some distantly related organisms Once 5-10 orthologs are selected, add them to your gene cart

Generate amino acid sequences for orthologs in FASTA format Select “FASTA Amino Acid format” Scroll down to “Export Genes” “Click”

Amino Acid sequences in FASTA format Scroll down Remember: Your assigned gene is at the bottom of the list

Recording results in your notebook Create another box in your lab notebook, and copy/paste ONLY the 5-10 NEW ortholog FASTA sequences (i.e., exclude those already in first & second module)

Recording results in your notebook

What if your top 5 orthologs are distantly related to P. limnophilus? Example: PATH #2 – Most are not in the same phylum as P. limnophilus, so choose 5-10 sequences from more closely related organisms

Scroll through the ortholog list and select some genes from closely related as well as other distantly related organisms Once 5-10 orthologs are selected, add them to your gene cart Copy / paste FASTA format protein sequences into notebook

Use Phylogeny.fr site to create a phylogenetic tree “Click”

Select “A la Carte” from menu Creating a phylogenetic tree

1- Select “T-Coffee” for multiple alignment 2- Leave other settings as default Scroll down

“Click” Scroll down

Copy/paste sequences into the query box: your P. limnophilus gene, your top 5 orthologs, and the 5-10 new orthologs. Scroll down & select “submit”

Results of phylogenetic analysis
Download and save as .png for upload to notebook

Recording results in your notebook How do I interpret the tree results?

Possible scenarios resulting from construction of a phylogenetic tree
[Figure: three example trees. Leaves shown: (1) P. limnophilus, Blastopirellula, P. maris, Pirellula; (2) P. limnophilus, Carboxydothermus, Bacillus; (3) Bacillus, Clostridium, Carboxydothermus, P. limnophilus, P. maris, Blastopirellula, with branches labeled “Clustered” and “Not clustered”]
– No HGT, since P. limnophilus and Blastopirellula are in the same family and are clustered together (i.e., gene phylogeny matches organismal phylogeny).
– Possible HGT, since P. limnophilus and Carboxydothermus are very distantly related yet clustered together (i.e., gene phylogeny does NOT match organismal phylogeny).
– Maybe HGT, but unsure because there is also an unresolved or multifurcating branch.

Interpreting your phylogenetic tree
– If your Planctomyces limnophilus gene is clustered with that from an organism in the P. limnophilus family → probably not horizontal gene transfer
– If your Planctomyces limnophilus gene is clustered with that from an organism that is NOT in the P. limnophilus family → may be horizontal gene transfer
– If your Planctomyces limnophilus gene is clustered with more than one organism in the tree (multifurcating branch) → unresolved phylogeny
In the example below, is the gene derived by HGT? Why or why not?
[Figure: example tree with leaves Planctomyces limnophilus, Blastopirellula marina, Planctomyces maris, Pedosphaera parvula (Ellin 514), Lentisphaera araneosa, Verrucomicrobium spinosum, Sorangium cellulosum, Escherichia coli K12, Rhodopirellula baltica, Thiobacillus denitrificans, Moorella thermoacetica, Brucella canis, Hydrogenivirga sp., Clostridium perfringens, Gemmata obscuriglobus]
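The three interpretation rules can be sketched in code. The example below assumes the tree has been transcribed by hand into nested Python tuples (leaves are organism names) and that you supply the set of same-family organisms yourself. Treating a sister clade as "in the family" when any of its leaves is a family member is a simplification introduced here, and all function names are illustrative.

```python
def leaves(node):
    """List every leaf name under a node (a leaf is a plain string)."""
    if isinstance(node, str):
        return [node]
    found = []
    for child in node:
        found.extend(leaves(child))
    return found


def find_parent(tree, target):
    """Return the internal node that has target as a direct child, or None."""
    if isinstance(tree, str):
        return None
    if target in tree:
        return tree
    for child in tree:
        parent = find_parent(child, target)
        if parent is not None:
            return parent
    return None


def interpret(tree, target, family_members):
    """Apply the three rules to the branch that holds the target gene."""
    parent = find_parent(tree, target)
    if parent is None:
        raise ValueError("target not found in tree")
    if len(parent) > 2:
        return "unresolved phylogeny (multifurcating branch)"
    sisters = [leaf for child in parent if child != target for leaf in leaves(child)]
    if any(leaf in family_members for leaf in sisters):
        return "probably not horizontal gene transfer"
    return "may be horizontal gene transfer"


# Hand-transcribed toy trees based on the scenarios described in the slides.
family = {"Blastopirellula", "P. maris", "Pirellula"}
tree_no_hgt = (("P. limnophilus", "Blastopirellula"), ("P. maris", "Pirellula"))
tree_maybe_hgt = (("P. limnophilus", "Carboxydothermus"), "Bacillus")
tree_unresolved = ("P. limnophilus", "Bacillus", "Clostridium")
```

A result from this kind of rule is only a first pass; the GC content and gene neighborhood evidence still need to be weighed before concluding anything about HGT.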

Recording results in your notebook
In the “Interpretation” box of your lab notebook, answer:
– Is there evidence of horizontal gene transfer?
– What organisms does your gene cluster with? The same family? Or the three more closely related phyla (Verrucomicrobia, Chlamydiae, Lentisphaerae)?
– Do the gene and genome GC content differ by more than 5%?
– Do the neighborhoods for the top 5 orthologs look similar to or different from that of your gene in P. limnophilus?
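The GC content question can be checked with a few lines of code. This sketch assumes you have the nucleotide sequence of your gene and a genome-wide GC percentage taken from the genome's summary page, and it reads "more than 5%" as more than 5 percentage points; that reading is an assumption. The function names are illustrative.

```python
def gc_percent(dna):
    """Percentage of G and C bases in a nucleotide sequence."""
    dna = dna.upper()
    if not dna:
        raise ValueError("empty sequence")
    gc = sum(1 for base in dna if base in "GC")
    return 100.0 * gc / len(dna)


def gc_differs(gene_gc, genome_gc, threshold=5.0):
    """True if gene and genome GC content differ by more than the threshold.

    Both values and the threshold are in percentage points (assumption).
    """
    return abs(gene_gc - genome_gc) > threshold
```

For example, `gc_differs(gc_percent(gene_sequence), genome_gc)` returns True when the gene's GC content is unusual for its genome, which is one of the hints of horizontal gene transfer listed above.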