Bioinformatics for biomedicine
Summary and conclusions. Further analysis of a favorite gene
Lecture 8, 2006-11-07
Per Kraulis


Themes in bioinformatics
1. Databases
2. Sequences
3. Sequence search
4. Sequence, evolution, function
5. Protein 3D structure
6. Sequence alignment
7. Annotation
8. Gene expression; data analysis
9. Pathways and processes

1. Databases
Data models
  – Domain: included, excluded
  – Central data object(s)
  – Relations
Database policy
  – Manually curated vs. automated
  – Updates
  – Access, licenses, copyright

2. Sequences
Sequence databases
  – Nucleotide, protein
  – Annotation
  – Cross-references, links
Sequence analysis
  – Features
  – Similarities
  – Phylogenetics

3. Sequence search
Sequence search
  – BLAST, FASTA
  – Smith-Waterman
Sequence patterns
  – Regular expressions: Prosite
  – Hidden Markov Models (HMMs): Pfam
  – PSI-BLAST, PHI-BLAST, HMMER
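The link between Prosite patterns and regular expressions can be made concrete with a small sketch. It assumes only a subset of the PROSITE pattern syntax (single residues, `x`, `[...]`, `{...}`, repeat counts, and the `<` / `>` anchors); the function names `prosite_to_regex` and `find_sites` are hypothetical and not part of any library. The example pattern `N-{P}-[ST]-{P}` is the PROSITE N-glycosylation motif; the test sequence is made up.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style pattern into a Python regular expression.

    Hypothetical helper covering only a subset of the syntax:
    x = any residue, [ABC] = allowed, {ABC} = forbidden,
    (n) or (n,m) = repetition, < and > = sequence start and end.
    """
    parts = []
    for element in pattern.rstrip(".").split("-"):
        m = re.fullmatch(r"(<?)([^()<>]+)(?:\((\d+(?:,\d+)?)\))?(>?)", element)
        if m is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        start, core, repeat, end = m.groups()
        if core == "x":
            core = "."                       # any residue
        elif core.startswith("{") and core.endswith("}"):
            core = "[^" + core[1:-1] + "]"   # forbidden residues
        # A single letter or an [ABC] set is already valid regex syntax.
        if repeat:
            core += "{" + repeat + "}"
        parts.append(("^" if start else "") + core + ("$" if end else ""))
    return "".join(parts)


def find_sites(sequence: str, pattern: str) -> list[int]:
    """Return 0-based start positions of all (possibly overlapping) matches."""
    # The lookahead lets overlapping sites be reported separately.
    regex = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
    return [m.start() for m in regex.finditer(sequence)]


if __name__ == "__main__":
    print(prosite_to_regex("N-{P}-[ST]-{P}"))
    print(find_sites("MNGSANPTQNVTK", "N-{P}-[ST]-{P}"))
```

Patterns of this kind are exact match rules, which is why the slide lists HMMs and PSI-BLAST separately: those methods score positions probabilistically instead of accepting or rejecting a residue outright.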

4. Sequence, evolution, function
Sequence and evolution
  – Sequence similarity
  – Homology
Sequence and function
  – Domains
  – Activity, enzyme, binding
Function and evolution
  – Orthologs: speciation event; similar function, presumably
  – Paralogs: duplication event; divergent function, presumably

5. Protein 3D structure
Protein sequence and 3D structure
  – Sequence determines structure
Structure and function
  – Strongly conserved
Structure prediction
  – Folding problem
  – Modelling: using similarity
Structural features
  – Folds and domains

6. Sequence alignment
Sequence alignment
  – Part of sequence search
  – Required for 3D model from template
  – Quality depends on similarity
Multiple sequence alignment
  – Heuristic algorithms required
  – Hard to obtain optimal solution
  – Phylogenetics
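For the pairwise case, an optimal local alignment can be computed exactly by dynamic programming, the Smith-Waterman approach named under sequence search. The following is a minimal sketch. The scoring values (match 2, mismatch -1, gap -2) and the linear gap penalty are illustrative assumptions; practical tools typically use substitution matrices and affine gap penalties. The function name `smith_waterman` is hypothetical.

```python
def smith_waterman(a: str, b: str, match: int = 2, mismatch: int = -1,
                   gap: int = -2) -> tuple[int, str, str]:
    """Return (best local score, aligned part of a, aligned part of b).

    Sketch with a linear gap penalty; the scoring values are assumptions.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # H[i][j] = best score of a local alignment ending at a[i-1], b[j-1].
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the best cell until the score drops to zero.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if H[i][j] == diag:
            out_a.append(a[i - 1]); out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-")
            i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    print(smith_waterman("CCCGATTACACCC", "TTGATTACATT"))
```

The table has one cell per pair of positions, so the cost grows with the product of the two sequence lengths. Extending the same table to many sequences at once adds one dimension per sequence, which is why multiple alignment relies on heuristics.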

7. Annotation
Annotation: properties, features, …
Guilt by association
  – Sequence similarity
  – Behavioral similarity: gene expression; proteomics; binding, physical association
Gene Ontology
  – Controlled vocabulary of keywords

8. Gene expression; data analysis
Gene expression
  – EST, SAGE, microarrays
  – Experimental design: time course
Data analysis
  – Normalization
  – Clustering
  – Statistics
  – Visualization
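The normalization and clustering steps can be illustrated with a minimal sketch. The slide does not say which methods were meant, so both choices here are assumptions: log2 transformation with median centering as one simple normalization, and Pearson correlation as one common similarity measure between expression profiles before clustering. The function names and the input values are hypothetical.

```python
import math
import statistics


def log2_median_center(intensities: list[float]) -> list[float]:
    """Log2-transform positive intensities and subtract their median.

    One simple normalization (an assumption, not the lecture's method):
    afterwards the median log-intensity of the array is zero.
    """
    logs = [math.log2(x) for x in intensities]
    med = statistics.median(logs)
    return [v - med for v in logs]


def pearson(x: list[float], y: list[float]) -> float:
    """Pearson correlation between two expression profiles of equal length.

    Raises ZeroDivisionError if either profile is constant.
    """
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


if __name__ == "__main__":
    # Made-up intensities for one array and two made-up time-course profiles.
    print(log2_median_center([1, 2, 4, 8, 16]))
    print(pearson([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
```

Genes whose profiles correlate strongly across a time course would end up in the same cluster under a correlation-based distance such as 1 minus the Pearson coefficient.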

9. Pathways and processes
Gene activity
  – Protein activity and interactions
  – Expression as proxy
Pathways
  – Metabolism
  – Signaling and regulation
Biological processes
  – Temporal and spatial
  – Hierarchy: different levels and scales

Bioinformatics: The future
More complete genomes
  – Phylogenetics
Functional genomics
  – Annotation, experimental design, integration
Pathways
  – Current DBs incomplete
  – Data model?
Processes
  – How to model?
  – Systems biology; towards prediction

Bioinformatics on the web 1
EBI
  – Site to be modified 11 Dec 2006!
  – Databases
      EMBL: Nucleotide sequences
      UniProt: Protein sequences, annotation, literature
      IntAct: Protein interactions
      ArrayExpress: Expression data
  – Tools
  – 2Can: Bioinformatics educational resource
  – Research groups

Bioinformatics on the web 2
NCBI
  – Databases: GenBank, RefSeq, Proteins, OMIM, Taxonomy, PubMed
  – Bookshelf: Biology textbooks on-line
  – Tools: BLAST, Entrez

Bioinformatics on the web 3
Ensembl
  – Eukaryotic genomes: nucleotide sequence, genes, transcripts, proteins
  – Databases and tools
Vega (vega.sanger.ac.uk)
  – Curated eukaryotic genomes
ExPASy
  – UniProt (Swiss-Prot & TrEMBL)
  – Databases and tools

Bioinformatics on the web 4
GeneCards
  – Human genes
  – Integrated database: other DBs used
GeneLynx
  – Human, rat, mouse
  – Links for genes to other DBs
Google
  – Now several useful DBs indexed!
  – Google Scholar

Bioinformatics on the web 5
SGD: Saccharomyces genome DB
BDG: Drosophila genome DB
FlyBase: Drosophila genome DB (flybase.bio.indiana.edu)
MGI: Jackson lab mouse genome DB

Bioinformatics on the web 6
PDB: 3D biomolecular structures
3D structural motifs hierarchy
  – Manual curation
3D structure classification
  – Automated curation

Bioinformatics on the web 7
KEGG
  – Pathways, metabolic and signaling
  – Started with human and eukaryotes
BioCyc
  – Pathways, metabolic
  – Started with prokaryotes
Reactome
  – Pathways, signaling, reactions
  – Started with human

Bioinformatics on the web 8
Biological processes
  – Several dedicated to specific processes
  – Educational in nature
  – No developed data models
Systems biology
  – (Seattle)
  –

Further reading 1
Bioinformatics. Genes, Proteins & Computers
  – C.A. Orengo, D.T. Jones & J.M. Thornton
  – 320 pp
  – BIOS Scientific Publishers Limited, 2003
  – ISBN
Bioinformatics. Sequence and Genome Analysis
  – D.W. Mount
  – 692 pp
  – Cold Spring Harbor Lab Press, 2004
  – ISBN

Further reading 2
Sequence - Evolution - Function
  – E.V. Koonin & M.Y. Galperin
  – 488 pp
  – Springer, 2002
  – ISBN
  – NCBI Bookshelf =bv.View..ShowTOC&rid=sef.TOC&depth=1