Literature Mining Tools for Analysis of Genomic Data
Ramin Homayouni, Ph.D.
Associate Professor of Biology, Director of Bioinformatics
UTHSC BINF, April 24, 2008

Gene Expression Profiling
Alizadeh et al. (2000) Nature 403:503.
Now what?

Useful Links for Functional Analysis
Databases:
- GO
- MeSH
- MEDLINE
- GEO
Programs:
- GOTM (GO)
- PubGene (MEDLINE)
- Chilibot (MEDLINE)
- Arrowsmith (MEDLINE)
- PubMatrix (MEDLINE)
- TXTGate (MEDLINE)
- iHOP (MEDLINE)
- STRING (MEDLINE)

Gene Ontology Consortium
• A controlled vocabulary applied to genes in a variety of organisms; updated every 30 minutes!
• Established in 1998 as a collaboration between:
  - FlyBase (Drosophila)
  - Saccharomyces Genome Database (SGD)
  - Mouse Genome Database (MGD)
• Three main classifications:
  - Molecular Function (7385 terms)
  - Biological Process (8822 terms)
  - Cellular Component (1430 terms)

Gene Ontology Consortium

GO Tree Machine (GOTM) from WebGestalt
Bing Zhang & Jay Snoddy, Vanderbilt University
Zhang et al., BMC Bioinformatics Feb 18;5(1):

GO Tree Machine Demo GOTM

GO Tree Machine -- Example

Problems with Gene Ontology, or any other manual indexing approach
• The vocabulary is limited
• The vocabulary is general
• Not comprehensive, and therefore biased toward well-studied genes
• Human error: ~66% consistency between professional indexers!
[Figure; genes shown: EGFR, ERBB2, TRP53, TGFB1, DAB1, RELN, LRP8, VLDLR]

Products of the National Library of Medicine (NLM) & National Center for Biotechnology Information (NCBI)
• Databases
  - GenBank, UniGene, LocusLink (Gene)
  - MEDLINE
  - OMIM
• Services
  - HealthSTAR
  - Health Services Research Projects in Progress
  - HSTAT
• Vocabulary
  - Medical Subject Headings (MeSH)
  - NLM Classification
  - Unified Medical Language System (UMLS)

MEDLINE
• MEDLINE is the premier bibliographic database for biomedicine, supported by the National Library of Medicine
• MEDLINE contains approximately 18 million references, most of which have abstracts
• MEDLINE covers over 4800 journals in over 30 languages
• MEDLINE citations date back to 1966
• Free abstracts!

Defining Functional Relationships between Genes
• Direct relationship: gene relationships already known (e.g., A-B or B-C)
  - Term co-occurrence
    Gene symbol: PubGene (Jenssen et al., Nature Genetics 28:21)
    Gene names (synonyms and aliases)
  - Biochemical
• Indirect relationship: gene relationships unknown (e.g., A-C)
[Figure: genes A, B, and C, with B linking A and C]
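The direct and indirect cases on this slide can be sketched in a few lines of Python. This is a toy illustration, not the method of PubGene or any other tool named here: the function names `cooccurrence_counts` and `indirect_pairs`, the placeholder genes GeneA/GeneB/GeneC, and the three "abstracts" are all invented for the example. Direct relationships are counted as co-mentions in the same abstract; an indirect relationship is proposed when two genes never co-occur but share a neighbor.

```python
import re
from collections import defaultdict
from itertools import combinations


def cooccurrence_counts(abstracts, genes):
    """Count abstracts in which each pair of gene names is mentioned together."""
    counts = defaultdict(int)
    for text in abstracts:
        tokens = set(re.findall(r"[a-z0-9]+", text.lower()))
        present = sorted(g for g in genes if g.lower() in tokens)
        for a, b in combinations(present, 2):
            counts[(a, b)] += 1
    return dict(counts)


def indirect_pairs(counts):
    """Find pairs with no direct co-occurrence that share at least one neighbor."""
    neighbors = defaultdict(set)
    for a, b in counts:
        neighbors[a].add(b)
        neighbors[b].add(a)
    indirect = {}
    for a, c in combinations(sorted(neighbors), 2):
        if c in neighbors[a]:
            continue  # already a direct relationship
        shared = neighbors[a] & neighbors[c]
        if shared:
            indirect[(a, c)] = sorted(shared)
    return indirect


# Invented toy abstracts using placeholder gene names (hypothetical data).
toy_abstracts = [
    "GeneA binds GeneB in neurons",
    "GeneB phosphorylates GeneC",
    "GeneA and GeneB are coexpressed",
]
direct = cooccurrence_counts(toy_abstracts, ["GeneA", "GeneB", "GeneC"])
print(direct)
print(indirect_pairs(direct))
```

In this toy set, A-B and B-C are direct relationships, and A-C is reported as indirect through B. Real gene symbols need more careful matching than whole-token lookup, since many symbols collide with ordinary words.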

Reelin Signaling Pathway
[Figure; labeled components: Dab1, ApoE, Reelin, VLDLR, ApoER2, APP, p35, Cdk5, amyloid plaques, pTau, fyn]

Gene Document Test Set
• Miscellaneous: Trp53, Fos, Nras, Rasa1, Rab1, Src, Notch1, Dll1, Jag1, Robo1, Ptch, Smo
• Reeler: Reln, Dab1, VLDLR, Lrp8
• Alzheimer Disease: APP, Aplp2, Aplp1, Psen1, Psen2, Lrp1, Mapt, Apoe, A2m, Apbb1, Apba1, Cdk5, Cdk5r, Cdk5r2

PubGene Query: Dab1 (Mouse, Human)
PubMed Query: Dab1 AND Reln = 16
PubMed Query: Dab1 AND reelin = 152!
Jenssen et al., Nat Genet May;28(1):21-8.
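The gap between the two PubMed counts on this slide comes from searching on a gene symbol alone versus a gene name. A minimal sketch of alias-aware matching follows, assuming a hypothetical helper `count_hits` and four invented toy sentences; it does not query PubMed, and the alias sets are chosen only for illustration. Terms within a group are OR-ed and groups are AND-ed.

```python
import re


def mentions(text, alias):
    """True if the alias occurs in the text as a whole word (case-insensitive)."""
    return re.search(r"\b" + re.escape(alias) + r"\b", text, re.IGNORECASE) is not None


def count_hits(abstracts, alias_groups):
    """Count abstracts matching at least one alias from every group."""
    return sum(
        all(any(mentions(text, alias) for alias in group) for group in alias_groups)
        for text in abstracts
    )


# Invented toy sentences standing in for abstracts (hypothetical data).
toy_abstracts = [
    "Dab1 is phosphorylated in response to reelin",
    "Reln mutant mice lack Dab1 phosphorylation",
    "Reelin binds VLDLR and ApoER2",
    "Disabled-1 acts downstream of reelin",
]

symbols_only = count_hits(toy_abstracts, [{"Dab1"}, {"Reln"}])
with_aliases = count_hits(toy_abstracts, [{"Dab1", "Disabled-1"}, {"Reln", "reelin"}])
print(symbols_only, with_aliases)
```

Expanding each gene to its synonyms and aliases recovers abstracts that a symbol-only query misses, which is the effect the two query counts illustrate.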

iHOP Query: Dab1

iHOP Query: Dab1; Sentence Structure

iHOP Query: Dab1; Network building

iHOP Demo
iHOP (Information Hyperlinked over Proteins)

Chilibot
• Extracts term-term relationships from MEDLINE abstracts.
• Differentiates interactive (e.g., stimulation or inhibition) and non-interactive (e.g., homology, co-existence) relationships.
• Color-codes gene expression values when data are provided.
• Automatically suggests new hypotheses based on the literature.
Chen and Sharp (2004) BMC Bioinformatics 5(1):147.
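To make the interactive versus non-interactive distinction concrete, here is a deliberately naive keyword sketch. It is not Chilibot's algorithm: the function `classify_relationship`, the two verb lists, and the placeholder gene names are assumptions made for illustration only. A sentence mentioning both terms is labeled by the verbs found between them.

```python
import re

# Small illustrative verb lists (assumptions, not taken from any tool).
STIMULATORY = {"activates", "stimulates", "induces", "enhances"}
INHIBITORY = {"inhibits", "represses", "suppresses", "blocks"}


def classify_relationship(sentence, term_a, term_b):
    """Label the relationship between two terms in one sentence.

    Returns None when the sentence does not mention both terms.
    """
    words = re.findall(r"[a-z0-9]+", sentence.lower())
    a, b = term_a.lower(), term_b.lower()
    if a not in words or b not in words:
        return None
    i, j = sorted((words.index(a), words.index(b)))
    between = set(words[i + 1:j])
    if between & STIMULATORY:
        return "interactive (stimulatory)"
    if between & INHIBITORY:
        return "interactive (inhibitory)"
    return "non-interactive"


print(classify_relationship("GeneA induces phosphorylation of GeneB", "GeneA", "GeneB"))
print(classify_relationship("GeneA and GeneB are homologous", "GeneA", "GeneB"))
```

A keyword match like this ignores negation, passive voice, and sentence structure, so it only shows the shape of the classification task.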

Chilibot Demo Chilibot

STRING at EMBL

STRING Demo
STRING

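Vector Space Model: Latent Semantic Indexing
[Figure: vector space with term axes w1, w2, w3, showing the Query and gene document G1; term-by-gene-document matrix with rows W1, W2, W3, ..., Wx and columns G1, G2, ..., Gx]
Matrix entries: $a_{ij} = l_{ij}\, g_i$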
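A minimal NumPy sketch of this model follows. The slide gives only the form a_ij = l_ij g_i, so the specific choice here is an assumption: l_ij is taken as a log local weight for term i in gene document j, and g_i as an entropy-based global weight for term i. The function names, the five toy "gene documents", and the rank k = 2 are all invented for illustration.

```python
import numpy as np


def term_document_matrix(docs):
    """Build raw term counts: rows are terms, columns are gene documents."""
    vocab = sorted({w for text in docs.values() for w in text.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}
    genes = list(docs)
    counts = np.zeros((len(vocab), len(genes)))
    for j, gene in enumerate(genes):
        for w in docs[gene].lower().split():
            counts[index[w], j] += 1
    return vocab, genes, counts


def log_entropy_weight(counts):
    """Apply a_ij = l_ij * g_i with log local and entropy global weights (assumed scheme)."""
    n_docs = counts.shape[1]
    local = np.log2(1.0 + counts)                      # l_ij
    p = counts / counts.sum(axis=1, keepdims=True)     # term distribution over documents
    with np.errstate(divide="ignore", invalid="ignore"):
        plogp = np.where(p > 0, p * np.log2(p), 0.0)
    global_w = 1.0 + plogp.sum(axis=1) / np.log2(n_docs)  # g_i
    return local * global_w[:, None]


def lsi_gene_vectors(A, k):
    """Project gene documents into a rank-k space using a truncated SVD."""
    _, s, Vt = np.linalg.svd(A, full_matrices=False)
    return Vt[:k].T * s[:k]  # one row per gene document


def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))


# Invented toy gene documents (hypothetical data, not real abstracts).
toy_docs = {
    "Reln": "reelin cortex migration",
    "Dab1": "migration adaptor phosphorylation",
    "Vldlr": "phosphorylation receptor lipoprotein",
    "Trp53": "tumor suppressor apoptosis cancer",
    "Nras": "oncogene tumor cancer",
}
vocab, genes, counts = term_document_matrix(toy_docs)
A = log_entropy_weight(counts)
reduced = lsi_gene_vectors(A, k=2)
r, v = genes.index("Reln"), genes.index("Vldlr")
print("raw cosine Reln-Vldlr:", cosine(A[:, r], A[:, v]))
print("LSI cosine Reln-Vldlr:", cosine(reduced[r], reduced[v]))
```

In the toy data, Reln and Vldlr share no terms, so their similarity in the full term space is zero, but both share terms with Dab1. After rank reduction they land close together, which is the kind of indirect relationship the reduced space is meant to expose.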

50-Gene Document Collection

Hierarchical Tree
[Figure; cluster labels: Development, Cancer, Alzheimer, Development]

Unrooted Tree (Graph)

Semantic Gene Organizer © User Interface

GeneIndexer Software