Scratch Protein Predictor Result Q:S and percent identity with Lore


The Use of Bioinformatics Tools to Predict the Functions of Hypothetical Proteins from Pham 6637 Cluster AN of Bacteriophages

Guynup, Taylor; Reyes, Andrea; Chang, Joseph

Abstract

This study attempted to discover the functions of proteins whose functions were previously unknown, using bioinformatics tools available on the internet. Because the protein functions are unknown, there is only a slight chance that any of the websites and bioinformatics tools already available will be able to detect them. To address this question, Lore_14, Jessica_15, StewieGriff_14, and Toulouse_13 were run through Scratch Protein Predictor, NCBI, and TMHMM. The hope was that these tools could determine the function of such genes without the need for wet-lab work. Afterwards, the same methods were applied to the singleton BRock_Draft, and genes 36, 37, and 39 were analyzed. Overall, this research indicated that the majority of hypothetical proteins need to be taken to a wet lab to have their function determined, since the bioinformatics tools were not as reliable as expected.

Introduction

Phage genome mapping is a relatively new subject in biology, but it has profound applications in the field. A vital aspect of this subject is discovering phage gene functions. Before its function is verified, a potential gene is labeled a "hypothetical protein." As more genomes are discovered, similar genomes are grouped into clusters, whereas genomes without other known related genomes are called singletons. Currently, several bioinformatics prediction programs infer relationships between unknown and known genes, but it is often the researcher's responsibility to determine a gene's function from the results of those tools, as the programs often leave other factors out of their calculations. Nevertheless, protein function prediction programs are constantly improving, and we are interested in evaluating their reliability in predicting the functions of hypothetical proteins.

Materials and Methods

1. Used Phagesdb to find genomic sequences of different genomes of Pham 3367 of the AN cluster.
2. Ran the genomic sequences through NCBI BLAST to find protein function.
3. Ran the genomic sequences through Scratch Protein Predictor (predicts tail or capsid proteins) and TMHMM (predicts transmembrane proteins).
4. Compared the NCBI BLAST results with the results from the different databases.
5. Compared the functions of the different Pham members to see commonalities.

Results

Figure 1

Lore 14
NCBI Results: Hypothetical Protein SEA_JESSICA [Arthrobacter phage Jessica]; E value: 0.0
TMHMM Results: Negative
Scratch Protein Predictor Result: Capsid Sequence: Yes (.994914); Tail Sequence: Yes (.141622)
Q:S and percent identity with Lore: 1:1; 100%

Toulouse 13
NCBI Results: Minor Tail Protein [Arthrobacter phage Toulouse]
Scratch Protein Predictor Result: Capsid Sequence: Yes (1.206675); Tail Sequence: No (-0.032368)
Q:S and percent identity with Lore: 1:2; 92%

Jessica 15
NCBI Results: Minor Tail Protein [Arthrobacter phage Jessica]
Scratch Protein Predictor Result: Capsid Sequence: Yes (1.156418); Tail Sequence: No (-0.092472)

StewieGriff 14
Q:S and percent identity with Lore: 2:1

Discussion

From the data collected, seen in Figure 1, it can be predicted that Lore 14 will be a minor tail protein. The negative TMHMM results predicted that Lore 14 and the rest of the Pham 3367 proteins were not transmembrane proteins. StewieGriff 14 supports Lore 14 as a minor tail protein through both its NCBI 1:1 score and its Scratch Protein Predictor results. Scratch Protein Predictor has 90% to 97% accuracy and uses protein amino acid composition, secondary structure, and alignment contact fragments of both capsid and tail proteins. Both Jessica 15 and Toulouse 13 support Lore 14 as a minor tail protein in their NCBI Q:S scores but not through Scratch Protein Predictor. The scores for Jessica and Toulouse are not far enough from the average sequence, so they do not meet the criteria for a tail protein. Bioinformatics tools can help predict the functions of hypothetical proteins; in addition, integrating homology-based inference can help narrow down the options for what a protein could be. Homology-based inference has been used before, using both BLAST and E-values to compare proteins (Hamp 2013).

Application of Methods to Singletons

BRock_Draft: Cluster Singleton, Streptomyces Bacteriophage
Gene 36: Pham 22979; 1503 base pairs; 17538 to 16036 (reverse gene)
NCBI Results: Hypothetical protein [Streptomyces phage BRock]
TMHMM Results: Negative
Scratch Protein Predictor Results: Capsid Sequence: Yes (0.585782); Tail Sequence: No (0.324156)
Identity with Lore: No significant similarity found

Figure 2: Phamerator Map of BRock

Acknowledgements

Special thanks to Dr. Tamarah Adair and Lathan Lucas for advisement throughout the research process.

Works Cited

Hamp, T., Kassner, R., Seemayer, S., Vicedo, E., Schaefer, C., Achten, D., … Rost, B. (2013). Homology-based inference sets the bar high for protein function prediction. BMC Bioinformatics, 14(Suppl 3), S7. http://doi.org/10.1186/1471-2105-14-S3-S7
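
The coordinates reported for BRock gene 36 (17538 to 16036, reverse gene, 1503 base pairs) can be checked with a few lines of Python. This is only a minimal sketch: the function names are hypothetical, and it assumes the coordinates are 1-based and inclusive and that the reported length includes the stop codon, none of which the poster states.

```python
def gene_length_bp(start: int, stop: int) -> int:
    """Length in base pairs, assuming 1-based inclusive coordinates.

    Works for forward and reverse genes, since only the distance
    between the two coordinates matters.
    """
    return abs(start - stop) + 1


def protein_length_aa(length_bp: int) -> int:
    """Number of amino acids, assuming the length includes one stop codon."""
    if length_bp % 3 != 0:
        raise ValueError("gene length is not a whole number of codons")
    return length_bp // 3 - 1


# Gene 36 of BRock as listed above: reverse gene from 17538 to 16036.
length = gene_length_bp(17538, 16036)
print(length)                     # 1503, matching the reported length
print(protein_length_aa(length))  # 500 under the stop-codon assumption
```

Under these assumptions the coordinates and the reported length of 1503 base pairs agree, and the gene would encode a 500-residue protein.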