Compiling Information and Inferring Useful Knowledge for Systems Biology by Text Mining the Literature Anália Lourenço IBB – Institute for Biotechnology.


Compiling Information and Inferring Useful Knowledge for Systems Biology by Text Mining the Literature
Anália Lourenço
IBB – Institute for Biotechnology and Bioengineering, Centre of Biological Engineering, BioPSE group, Universidade do Minho

Systems Biology
Systems Biology does not investigate individual cellular components one at a time; it studies the behaviour and relationships of all the elements of a particular biological system while it is functioning.
nts_blogsphere/2009/07/system-biology-for-personalized-medicine.html

Biomedical Literature Mining
The idea is to train computers to retrieve, read and interpret the text under processing:
– what to read;
– how to read;
– what to do with the processed text.
Roughly speaking, we want to emulate human reading behaviour as closely as possible.
– Learning domain-specific behaviour.
– Aiming at delivering intuitive, comprehensible domain-specific knowledge.

Biomedical Literature Mining
Automatic Information Retrieval:
– bulk access to PubMed contents;
– retrieval of full-text documents;
– …
Automatic Information Extraction:
– document classification and clustering;
– extraction of biologically relevant information;
– knowledge inference;
– ...
Typical tasks:
– Bio-entity tagging (mainly genes and proteins)
– Gene–disease association
– Protein relations (binary relations and interactions)
– Function annotation and localization relations
– Protein sequence (mutations, polymorphisms, modifications)
– Acronym, synonym and term collection
– ...
– Enzyme-related information
– Pharmacokinetics
– Metagenomics
– ...
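
Bio-entity tagging and gene–disease association, two of the tasks listed above, can be illustrated with a minimal sketch. It assumes a plain dictionary lookup followed by sentence-level co-occurrence, which is only one simple way to approach these tasks and not necessarily the method used in this work. The lexicons, the gene names `geneA` and `geneB`, and the function names are hypothetical.

```python
import re
from itertools import product

# Hypothetical toy lexicons; real systems rely on curated vocabularies.
GENES = {"geneA", "geneB"}
DISEASES = {"gastritis", "sepsis"}


def tag_entities(sentence, lexicon):
    """Return the lexicon terms that occur as whole tokens in the sentence.

    Matching is case-sensitive and exact, so synonyms and spelling
    variants are missed (an accepted limitation of this sketch).
    """
    tokens = set(re.findall(r"[A-Za-z0-9]+", sentence))
    return sorted(tokens & lexicon)


def cooccurrences(text):
    """Pair every gene with every disease mentioned in the same sentence."""
    pairs = []
    # Naive sentence splitting on ., ! or ? followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        genes = tag_entities(sentence, GENES)
        diseases = tag_entities(sentence, DISEASES)
        pairs.extend(product(genes, diseases))
    return pairs


text = "Mutations in geneA were linked to gastritis. geneB showed no change."
print(cooccurrences(text))
```

Co-occurrence only suggests candidate associations; a curator or a trained relation classifier would still need to confirm each pair.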

In the Scope of Our Research Group...
– In silico Metabolic Engineering
– Systems Biology
– Bioinformatics
– Customised end-user applications
– Heterogeneous data integration
– Open-source plugins
– Biomedical Literature Mining
– Modelling of Metabolic and Regulatory Networks
– Modelling of fed-batch fermentation processes
– Optimization of fed-batch fermentation processes
– Escherichia coli, Helicobacter pylori, Saccharomyces cerevisiae, Kluyveromyces lactis, …

Genome-scale Model Reconstruction
To have a comprehensible knowledge base
– Metabolic machinery
– Transcriptional regulatory events
To be able to perform in silico simulations
– In need of a set of balanced reactions => genome-scale model
Rocha et al (2007), Gene Ess Gen Scale
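
The requirement for a set of balanced reactions can be checked mechanically. The sketch below is illustrative only: it assumes each metabolite is annotated with its elemental formula and uses the decomposition of hydrogen peroxide (2 H2O2 -> 2 H2O + O2) as toy data. The identifiers and the function name are hypothetical, and charge balance is not checked.

```python
from collections import Counter

# Element counts per metabolite (toy data for illustration).
FORMULAS = {
    "h2o2": {"H": 2, "O": 2},
    "h2o": {"H": 2, "O": 1},
    "o2": {"O": 2},
}


def element_imbalance(reaction, formulas=FORMULAS):
    """Return the net element counts of a reaction.

    `reaction` maps metabolite id -> stoichiometric coefficient
    (negative for substrates, positive for products).
    An empty result means the reaction is elementally balanced.
    """
    net = Counter()
    for metabolite, coeff in reaction.items():
        for element, count in formulas[metabolite].items():
            net[element] += coeff * count
    return {element: n for element, n in net.items() if n != 0}


balanced = {"h2o2": -2, "h2o": 2, "o2": 1}
unbalanced = {"h2o2": -1, "h2o": 1, "o2": 1}

print(element_imbalance(balanced))    # no surplus elements
print(element_imbalance(unbalanced))  # one surplus oxygen atom
```

Running such a check over every reaction in a draft model is a cheap way to flag entries that need manual curation before simulation.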

Work in progress: Genome-scale Model Reconstruction
Consolidating knowledge on Escherichia coli K-12 MG1655
– The latest E. coli genome-scale metabolic model, iAF1260
– EcoCyc contents
  – Manually curated metabolic data
  – Regulatory information uploaded from RegulonDB
– BRENDA contents on specific enzymatic activities
  – e.g. functional parameters such as Ki, Km, ... + metabolic regulators + cofactors
– MPIDB contents on experimentally determined interactions among E. coli proteins
– Literature
  – To help in conflict/inconsistency resolution
  – To add novel information (e.g. information on protein/gene relation to particular stress conditions)

Work in progress: Genome-scale Model Reconstruction
Reconstructing models for other organisms...
– Helicobacter pylori
– Kluyveromyces lactis
– Streptococcus faecalis

Work in progress: Biomedical Literature Mining
Mining the bibliome for a systematic review on the stringent response of the bacterium Escherichia coli

Work in progress: Biomedical Literature Mining
Establishing an Evaluation Baseline for Document Classifiers in Biomedical Curation – the enzyme scenario
– Where does enzyme information come from?
– Biocuration
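
An evaluation baseline for document classifiers is usually reported as precision, recall and F1 against a curated gold standard. The sketch below assumes a deliberately naive keyword classifier as the baseline. The keyword list, the example abstracts and the gold labels are hypothetical, and nothing here reproduces the classifiers or data of the cited study.

```python
# Hypothetical trigger words for "contains enzyme-related information".
KEYWORDS = ("enzyme", "kinetic", "substrate")


def keyword_baseline(abstract):
    """Label an abstract as relevant if it mentions any trigger word."""
    text = abstract.lower()
    return any(keyword in text for keyword in KEYWORDS)


def precision_recall_f1(gold, predicted):
    """Compute precision, recall and F1 for the positive (relevant) class."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g and p)
    fp = sum(1 for g, p in pairs if not g and p)
    fn = sum(1 for g, p in pairs if g and not p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


abstracts = [
    "Kinetic parameters of the purified enzyme were measured.",
    "Phylogeny of soil bacteria.",
]
gold = [True, False]  # curator decisions (toy labels)
predicted = [keyword_baseline(a) for a in abstracts]
print(precision_recall_f1(gold, predicted))
```

Any trained classifier can then be judged by how far it improves on these baseline figures for the same document set.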

Work in progress: Biomedical Literature Mining
Developing component modules for text mining services for biocuration
Lourenço et al. (2010), Expert Systems With Applications, 37(4), 3444–3453.
Lourenço et al. (2009), J Biomed Inform. 42(4):

Work in progress: Network Analysis
Developing a framework for the integrated analysis of metabolic and regulatory networks
[Figure: network schematic with nodes A, B, C, D, R1, G, Promoter, TF and E; labels: Genetic Regulation, Metabolic, Transcriptional Regulation, Inhibition / Activation]
Case study: Escherichia coli K-12
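
One way to picture an integrated analysis is to chain the regulatory and metabolic layers in a single data structure and query across them. The sketch below reuses the node labels from the slide (TF, G, E, R1), but the wiring between them is an assumption made for illustration. The dictionaries and the function name are hypothetical and do not describe the framework itself.

```python
# Regulatory layer: transcription factor -> (target gene, mode of action).
REGULATES = {"TF": [("G", "activation")]}
# Gene-protein layer: gene -> enzymes it encodes.
ENCODES = {"G": ["E"]}
# Metabolic layer: enzyme -> reactions it catalyses.
CATALYSES = {"E": ["R1"]}


def reactions_controlled_by(tf):
    """List (reaction, gene, mode) triples reachable from a transcription factor."""
    hits = []
    for gene, mode in REGULATES.get(tf, []):
        for enzyme in ENCODES.get(gene, []):
            for reaction in CATALYSES.get(enzyme, []):
                hits.append((reaction, gene, mode))
    return hits


print(reactions_controlled_by("TF"))
```

Keeping the layers as separate mappings makes it straightforward to load each one from a different source and still answer cross-layer questions.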

Work in progress: Optimization Tools
OptFlux is an open-source, user-friendly and modular software aimed at being the reference computational tool for metabolic engineering applications. It allows the use of stoichiometric metabolic models for simulation and optimization purposes.
Rocha et al (2010), BMC Syst Biol. 4:45.
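
Simulations on stoichiometric models commonly assume a steady state, in which every internal metabolite is produced and consumed at the same rate. The sketch below checks that condition for a toy three-reaction model. It is not OptFlux code and does not use the OptFlux API; the model, the flux values and the function names are hypothetical.

```python
# Toy model: reaction id -> {metabolite: stoichiometric coefficient}.
MODEL = {
    "uptake": {"A": 1},          # -> A
    "R1": {"A": -1, "B": 1},     # A -> B
    "export": {"B": -1},         # B ->
}


def net_production(model, fluxes):
    """Net production rate of each metabolite for a given flux distribution."""
    net = {}
    for reaction, stoichiometry in model.items():
        flux = fluxes.get(reaction, 0.0)
        for metabolite, coeff in stoichiometry.items():
            net[metabolite] = net.get(metabolite, 0.0) + coeff * flux
    return net


def is_steady_state(model, fluxes, tol=1e-9):
    """True if no metabolite accumulates or is depleted beyond the tolerance."""
    return all(abs(rate) <= tol for rate in net_production(model, fluxes).values())


print(is_steady_state(MODEL, {"uptake": 1.0, "R1": 1.0, "export": 1.0}))
print(net_production(MODEL, {"uptake": 1.0, "R1": 1.0, "export": 0.5}))
```

An optimization tool searches among the flux distributions that pass this kind of check for one that maximizes a chosen objective, such as the production of a target compound.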