TDM in the Life Sciences Application to Drug Repositioning *


TDM in the Life Sciences: Application to Drug Repositioning *
Dr. George Tsatsaronis, Senior NLP Scientist, Operations (Content and Innovation)
e-mail: g.tsatsaronis@elsevier.com
* This research was conducted during the period 2010-2016 at the BIOTEC center of TU Dresden, Dresden, Germany, and was funded by DFG, BMBF, and EU research projects/programs.

The Problem: Drug Repositioning
Dove, Nature, 2003
Costs for one drug: $500 million to $2,000 million [Adams and Brantner, 2006]
George Tsatsaronis, TDM in the Life Sciences, 16/11/2016, EC

The Potential: TDM in Life Sciences enables us to…
ask a question: What is the biological role of expansins in fungi?
and get back an answer automatically: Expansins are extracellular proteins that increase plant cell-wall extensibility. These wall-loosening proteins are involved in cell wall extension and polysaccharide degradation. In fungi, expansins and expansin-like proteins have been found to localize in the conidial cell wall and are probably involved in cell wall remodeling during germination.

The Potential: TDM in Life Sciences enables us to…
focus on a specific disease: Raynaud’s syndrome
and get an automatically generated hypothesis on treatment options:
Fact 1: Tiejen 1975: “patients with Raynaud’s syndrome… increased blood viscosity”
Fact 2: Woodcock 1984: “Beneficial effect of fish oil on blood viscosity”
Hypothesis: Fact 1 + Fact 2: Fish oil as treatment of Raynaud’s syndrome
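The reasoning on this slide (two separately published facts that share an intermediate concept are combined into a new hypothesis) can be sketched in a few lines of Python. This is only an illustrative sketch, not the engine used in the talk: the function name `generate_hypotheses`, the dictionary layout, and the distractor entry `compound X` are assumptions made for this example, and the facts are assumed to have already been extracted from the literature.

```python
# Facts are modelled as "term -> set of intermediate concepts it is linked to".
# The first two entries mirror Fact 1 and Fact 2 from the slide;
# "compound X" is a hypothetical distractor added for illustration only.
disease_facts = {
    "Raynaud's syndrome": {"blood viscosity"},
}
intervention_facts = {
    "fish oil": {"blood viscosity"},
    "compound X": {"unrelated concept"},
}


def generate_hypotheses(disease_facts, intervention_facts, known_pairs=frozenset()):
    """Return (intervention, disease, shared concepts) for indirectly linked pairs.

    A pair is proposed when both sides share at least one intermediate concept
    and the pair is not already a known (intervention, disease) association.
    """
    hypotheses = []
    for disease, disease_concepts in disease_facts.items():
        for intervention, intervention_concepts in intervention_facts.items():
            shared = disease_concepts & intervention_concepts
            if shared and (intervention, disease) not in known_pairs:
                hypotheses.append((intervention, disease, sorted(shared)))
    return hypotheses


hypotheses = generate_hypotheses(disease_facts, intervention_facts)
for intervention, disease, via in hypotheses:
    print(f"Hypothesis: {intervention} as treatment of {disease} (via {', '.join(via)})")
```

A realistic system would also need to weight and filter the intermediate concepts, since very general concepts link almost everything to everything.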

The Potential: TDM in Life Sciences enables us to…
focus on a scientific field and automatically get a view of where research is heading.

The Current State of the Art
…these case studies can already be reproduced by existing text mining engines, e.g., by combining text and data mining, natural language processing, and semantic integration of resources.

The Process: An Overview

The Challenges: Scalability and Multilinguality
Source: World Intellectual Property Report 2011, WIPO, http://www.wipo.int/edocs/pubdocs/en/intproperty/944/wipo_pub_944_2011.pdf (pp. 52-53)

The Challenges: Heterogeneity and Integration of Resources
During the 6 years of research at TU Dresden, we never had problems accessing the data. The main challenges have been how to integrate and combine all of this information, how to find the human expertise to guide the TDM feature engineering process, and how to create models with a reasonable biological basis/interpretation.

The Algorithm
- Annotate all of the textual resources (e.g., clinical trials, patents, database entries, ontology definitions, scientific abstracts, gene functions) with ontological concepts
- Unify/integrate the annotated information
- Focus on drugs, targets (genes and their products), and diseases
- Use statistical measures such as PMI and chi-square to build entities’ profiles, keeping the most important terms that describe each drug and each gene (alternatively, word vectors, deep learning, recurrent neural nets with skip-grams)
- Use measures of semantic relatedness to compute the pairwise similarity of drug and gene profiles
- Rank the most associated genes for each drug
- Verbalize the connections/relations
- Allow expert biologists/clinical doctors to review and reject obvious false positives
- Manually/automatically collect supporting evidence for the remaining top-ranked pairs; suggest drug repositioning of that drug via this target (gene) for the indications participating in the profiles
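The profile-building, similarity, and ranking steps above can be illustrated with a small Python sketch. It is a minimal example under stated assumptions, not the published method: documents are assumed to be already annotated and reduced to lists of ontology concepts, PMI is estimated from document frequencies, and plain cosine similarity stands in for the "measures of semantic relatedness" named on the slide. All entity names (`drug_A`, `gene_X`, `gene_Y`), the toy corpus, and the function names are hypothetical.

```python
import math
from collections import Counter


def pmi_profile(entity_docs, all_docs, top_k=3):
    """Build a {term: PMI} profile for one entity.

    PMI(entity, term) = log( P(entity, term) / (P(entity) * P(term)) ),
    with probabilities estimated as document frequencies. `entity_docs` must
    be the subset of `all_docs` annotated with the entity. Only terms with
    positive PMI are kept, and only the `top_k` highest-scoring ones.
    """
    n = len(all_docs)
    term_df = Counter(t for doc in all_docs for t in set(doc))
    co_df = Counter(t for doc in entity_docs for t in set(doc))
    p_entity = len(entity_docs) / n
    scores = {}
    for term, co in co_df.items():
        pmi = math.log((co / n) / (p_entity * term_df[term] / n))
        if pmi > 0:
            scores[term] = pmi
    top = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]
    return dict(top)


def cosine(p, q):
    """Cosine similarity between two sparse {term: weight} profiles."""
    dot = sum(w * q[t] for t, w in p.items() if t in q)
    norm = math.sqrt(sum(w * w for w in p.values())) * math.sqrt(sum(w * w for w in q.values()))
    return dot / norm if norm else 0.0


def rank_genes(drug_profile, gene_profiles):
    """Rank genes by profile similarity to the drug, most similar first."""
    scored = [(gene, cosine(drug_profile, prof)) for gene, prof in gene_profiles.items()]
    return sorted(scored, key=lambda pair: -pair[1])


# Hypothetical toy corpus: each entity maps to its annotated documents,
# and each document is a list of ontology concepts.
corpus = {
    "drug_A": [["inflammation", "pain"], ["inflammation", "fever"]],
    "gene_X": [["inflammation", "pain"], ["inflammation", "cytokine"]],
    "gene_Y": [["cell cycle", "mitosis"], ["cell cycle", "apoptosis"]],
}
all_docs = [doc for docs in corpus.values() for doc in docs]
profiles = {name: pmi_profile(docs, all_docs) for name, docs in corpus.items()}
ranking = rank_genes(profiles["drug_A"], {g: profiles[g] for g in ("gene_X", "gene_Y")})
print(ranking)
```

In this toy setup `gene_X` ranks above `gene_Y` because its profile shares concepts with the drug profile. The expert review and evidence-collection steps are deliberately left out of the sketch, since they involve human judgment.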

The Results
M. Kissa, G. Tsatsaronis, M. Schroeder. “Prediction of drug-gene associations via ontological profile similarity with application to drug repositioning”, Elsevier Methods, In Press, 2015

The Potential
M. Kissa, G. Tsatsaronis, M. Schroeder. “Prediction of drug-gene associations via ontological profile similarity with application to drug repositioning”, Elsevier Methods, In Press, 2015

The Main Bottleneck: Open Challenges
- Verbalizing datasets (assays, microarray experiments) and ontologies, and integrating them
- Creating models that produce results with biological interpretations

The Main Bottleneck: Complexity

Conclusions and Take Home Messages
Semantic integration and data/text mining already provide helpful novel tools and services to researchers.
We are experiencing the transition to the era of automated hypothesis generation and validation!
Key challenges are:
- Integration of the heterogeneous data sources
- Interpretation of models and predictions
- Human expertise on how to link resources, and what features to use for the model learning

Thank you very much for your attention! Questions / Discussion

Can we exploit all this information? The life cycle of TDM
[Diagram labels: Unstructured text (implicit knowledge); Structured content (explicit knowledge); Information extraction; Semantic metadata; Knowledge discovery; Retrieval; Semantic search / Data mining]
George Tsatsaronis, TDM in the Life Sciences, 03/11/2016, EC

Application: Drug Repositioning; evidence
M. Kissa, G. Tsatsaronis, M. Schroeder. “Prediction of drug-gene associations via ontological profile similarity with application to drug repositioning”, Elsevier Methods, In Press, 2015

Application: Drug Repositioning; performance
M. Kissa, G. Tsatsaronis, M. Schroeder. “Prediction of drug-gene associations via ontological profile similarity with application to drug repositioning”, Elsevier Methods, In Press, 2015