Dr. Michael Schroeder Department of Computing City University, London, UK Visiting Scientist Medical.

Presentation transcript:

Dr. Michael Schroeder Department of Computing City University, London, UK Visiting Scientist Medical Research Council Cambridge, UK BioGrid

Drowning in information...
Biology has changed dramatically from an information-light to an information-intensive area
–The much-publicised Human Genome Project is only the tip of the iceberg
–>500 tools online
–>8000 new abstracts per month

BioGrid
–Provide access to multiple, heterogeneous and geographically distributed information sources
–Perform active searches for relevant information in non-local domains (includes retrieving, analysing, manipulating, and integrating information)

BioGrid Objectives
Objectives: an information and knowledge grid allowing knowledge discovery and access to multiple types of structured and unstructured data, including gene expression and protein interaction data
Business objectives:
–Grid for next-generation classification research infrastructure for large proteomics and genomics databases
–Efficient transactional enterprise collaboration
–Faster time to market for biotech innovation

Example
A scientist is interested in a gene, e.g. NOX4
–Search PubMed for articles
  Too many hits
  The gene is also known under different names
–Analyse gene expression data
  Which genes behave similarly to NOX4?
  What is the function of NOX4?
–Analyse protein interactions
  Which interactions and processes does expression of NOX4 trigger?
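The question "which genes behave similarly to NOX4?" can be illustrated with a small sketch. It ranks genes by the Pearson correlation of their expression profiles with the target gene. This is only one common similarity measure and not necessarily the one BioGrid uses; the gene names GENE_A to GENE_C and all expression values are made up for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long expression profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a flat profile carries no correlation information
    return cov / (sd_x * sd_y)


def most_similar(target, profiles, top=3):
    """Return the genes whose profiles correlate best with the target gene."""
    reference = profiles[target]
    scored = [
        (pearson(reference, profile), gene)
        for gene, profile in profiles.items()
        if gene != target
    ]
    scored.sort(reverse=True)
    return [(gene, round(score, 3)) for score, gene in scored[:top]]


# Hypothetical expression values over four experiments (illustration only).
profiles = {
    "NOX4": [1.0, 2.0, 3.0, 4.0],
    "GENE_A": [2.0, 4.1, 5.9, 8.0],
    "GENE_B": [4.0, 3.0, 2.0, 1.0],
    "GENE_C": [1.0, 1.0, 2.0, 1.0],
}

ranking = most_similar("NOX4", profiles)
```

With these toy values GENE_A ranks first (it rises with NOX4) and GENE_B ranks last (it falls as NOX4 rises).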

Challenges
Semantic complexity
–Computer does not “understand” data
–DBs and systems cannot inter-operate
Computational complexity
–Generating a protein interaction map takes ca. 7 days
–Analysing large sets of gene expression data can take up to an hour
–Analysis of large text bodies is complex

BioGrid Vision
[Diagram: BioGrid linking interaction data, metabolic pathway data, expression data, sequences, characterisation of target sequence, and scientific literature]

Approach
Semantic Web
–Global and local ontologies to capture meta-data and facilitate semantic inter-operability
Grid technology
–Transparent access to distributed resources
Agent technology
–Personal information agent collecting and presenting relevant information on behalf of its user
[Diagram labels: BioGrid Client (three times), BioGrid Server, Literature Classification Server, The Grid, Space Explorer, PSIMAP]

Classification server
Finding and processing relevant scientific literature
[Diagram: BioGrid overview]

Results of PubMed
Lorenz P. Transcriptional repression mediated by the KRAB domain of the human C2H2 zinc finger protein Kox1/ZNF10 does not require histone deacetylation. Biol Chem Apr;382(4):
Fredericks WJ. An engineered PAX3-KRAB transcriptional repressor inhibits the malignant phenotype of alveolar rhabdomyosarcoma cells harboring the endogenous PAX3-FKHR oncogene. Mol Cell Biol Jul;20(14):
[Labels on the slide: Author, Title, Year, Journal]
However, to a machine things look different!

Results of PubMed
Lorenz P. Transcriptional repression mediated by the KRAB domain of the human C2H2 zinc finger protein Kox1/ZNF10 does not require histone deacetylation. Biol Chem Apr;382(4):
Fredericks WJ. An engineered PAX3-KRAB transcriptional repressor inhibits the malignant phenotype of alveolar rhabdomyosarcoma cells harboring the endogenous PAX3-FKHR oncogene. Mol Cell Biol Jul;20(14):
Solution: tag data (XML)
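As a minimal sketch of what "tag data (XML)" could look like, the snippet below wraps the fields of the first citation in XML elements using Python's standard library. The element names citation, author, title and journal are assumptions chosen for illustration, not a schema taken from the slides; the year is left out because it is not legible in the transcript.

```python
import xml.etree.ElementTree as ET


def tag_citation(author, title, journal):
    """Wrap citation fields in XML elements (element names are illustrative)."""
    citation = ET.Element("citation")
    ET.SubElement(citation, "author").text = author
    ET.SubElement(citation, "title").text = title
    ET.SubElement(citation, "journal").text = journal
    return citation


record = tag_citation(
    author="Lorenz P",
    title=(
        "Transcriptional repression mediated by the KRAB domain of the human "
        "C2H2 zinc finger protein Kox1/ZNF10 does not require histone "
        "deacetylation."
    ),
    journal="Biol Chem",
)

# Serialise the record, then read it back the way a consuming program would.
xml_text = ET.tostring(record, encoding="unicode")
parsed = ET.fromstring(xml_text)
journal = parsed.findtext("journal")
```

Once the fields are tagged, a program can pick out the journal or the author by element name instead of guessing where one field ends and the next begins.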

Results of PubMed
Lorenz P
Transcriptional repression mediated by the KRAB domain of the human C2H2 zinc finger protein Kox1/ZNF10 does not require histone deacetylation.
Biol Chem
However, to a machine things look different!

Results of PubMed
Lorenz P
Transcriptional repression mediated by the KRAB domain of the human C2H2 zinc finger protein Kox1/ZNF10 does not require histone deacetylation.
Biol Chem
Solution: use ontologies (Semantic Web)

Semantic Web
–DAML+OIL is an XML-based language to specify ontologies
–Annotations of data refer to a global ontology (where appropriate), hence a joint understanding of the data is possible
–Ongoing efforts in bioinformatics: e.g. Gene Ontology
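The idea of annotations that refer to a global ontology can be sketched without any ontology language at all. In the example below, plain Python dictionaries stand in for the mapping from local tags to shared terms; this is not DAML+OIL, and the source names source_a and source_b as well as all tag names are hypothetical.

```python
# Shared vocabulary that every source agrees on (illustrative terms only).
GLOBAL_TERMS = {"Author", "Title", "Journal"}

# Each source keeps its own local tags and declares how they map to the
# shared vocabulary. Source and tag names are hypothetical.
LOCAL_TO_GLOBAL = {
    "source_a": {"author": "Author", "title": "Title", "journal": "Journal"},
    "source_b": {"creator": "Author", "heading": "Title", "periodical": "Journal"},
}


def to_global(source, record):
    """Re-key a record from a source's local tags to the shared terms."""
    mapping = LOCAL_TO_GLOBAL[source]
    return {mapping[tag]: value for tag, value in record.items() if tag in mapping}


record_a = {"author": "Lorenz P", "journal": "Biol Chem"}
record_b = {"creator": "Lorenz P", "periodical": "Biol Chem"}

unified_a = to_global("source_a", record_a)
unified_b = to_global("source_b", record_b)
```

Both records end up with the same keys, so two systems that tag the same data differently can still be compared field by field.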

Classification Server
Scientific objectives:
–Effective concept recognition
–Pattern matching
–Intelligent data sourcing agents and tagging technology
–Automated categorisation in a biotechnology domain
–Metadata hierarchy
–Functional interoperability methodology design
–Domain knowledge mapping
–Implementing a logical domain ontology
–Integration of agent & classification logic & visualisation technology

Space Explorer
… is a general purpose visualisation tool facilitating interactive exploration of large data sets
… deals with multi-variate and proximity data
… provides
–principal component analysis
–multi-dimensional scaling (principal co-ordinate analysis, spring embedding)
–clustering
… provides
–dendrograms
–2D and 3D (using VRML) scatter plots
–graphs and colour maps
[Diagram: BioGrid overview]
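To make the link between clustering and dendrograms concrete, here is a minimal sketch of single-linkage agglomerative clustering. It is a generic textbook procedure and not Space Explorer's own implementation; the four 2D points are invented. Each recorded merge corresponds to one junction in a dendrogram, with the merge distance as its height.

```python
from math import dist  # available from Python 3.8


def single_linkage(points):
    """Merge the two closest clusters repeatedly; return the merge history.

    Each entry is (cluster_id_a, cluster_id_b, distance). Original points
    get ids 0..n-1, and every merge creates the next free id.
    """
    clusters = {i: [i] for i in range(len(points))}
    merges = []
    next_id = len(points)
    while len(clusters) > 1:
        best = None
        ids = sorted(clusters)
        for pos, a in enumerate(ids):
            for b in ids[pos + 1:]:
                # Single linkage: distance between the closest members.
                d = min(
                    dist(points[p], points[q])
                    for p in clusters[a]
                    for q in clusters[b]
                )
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        clusters[next_id] = clusters.pop(a) + clusters.pop(b)
        merges.append((a, b, d))
        next_id += 1
    return merges


# Two tight pairs far apart from each other (illustrative data).
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.5)]
merges = single_linkage(points)
```

For these points the two close pairs merge first, and the final merge joins the two resulting clusters at a much larger distance, which is what a dendrogram would show as a tall top-level branch.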

Example: gene expression data

Example: Protein topology

Protein Interaction: PSIMAP
–Based on 3D structure, PSIMAP determines interactions of proteins
–The structure of the map is of great importance for understanding biological processes
–Generation and analysis of the map are computationally expensive
[Diagram: BioGrid overview]
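A rough sketch of how interactions might be derived from 3D structure: count how many atom pairs from two domains lie within a distance cutoff, and call the domains interacting if there are enough such contacts. The cutoff of 5.0 and the minimum of 5 contacts are assumptions made for this example, not parameters confirmed by the slides, and the coordinates are toy values.

```python
from math import dist  # available from Python 3.8


def count_contacts(domain_a, domain_b, cutoff=5.0):
    """Count coordinate pairs, one from each domain, within the cutoff."""
    return sum(1 for p in domain_a for q in domain_b if dist(p, q) <= cutoff)


def interacts(domain_a, domain_b, cutoff=5.0, min_contacts=5):
    """Treat two domains as interacting if they share enough close contacts."""
    return count_contacts(domain_a, domain_b, cutoff) >= min_contacts


# Toy coordinates: domain_b sits next to domain_a, domain_c is far away.
domain_a = [(float(x), 0.0, 0.0) for x in range(3)]
domain_b = [(float(x), 3.0, 0.0) for x in range(3)]
domain_c = [(float(x), 50.0, 0.0) for x in range(3)]

near = interacts(domain_a, domain_b)
far = interacts(domain_a, domain_c)
```

Every pair of domains requires comparing all of their coordinates with each other, which hints at why generating the full map is computationally expensive.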

Partners
No. | Organisation (abbreviation) | Country | RTD role in the project
1 | University of Groningen (RUG) | NL | User, bioinformatics on drug discovery
2 | ZooRobotics (ZRO) | NL | Co-ordinator, supplier of GRID Classification Server, exploitation management
3 | City University London (CIT) | UK | Supplier of intelligent agents and Space Explorer
4 | University of Cyprus (UCY) | EL | Supplier of GRID knowledge engineering
5 | Medical Research Centre (MRC) | UK | Supplier of PSIMAP, user, bioinformatics on food and nutrition

Pert diagram

Work packages
Workpackage | Title
WP0 | Management
WP1 | Source domain analysis
WP2 | Hierarchy creation, metadata model development
WP3 | Classification logic integration
WP4 | Agent implementation
WP5 | Visualisation implementation
WP6 | Measurement and evaluation
WP7 | Dissemination and exploitation

Expression Space: Space Explorer
Pathway Space: BioGrid
Interaction Space: PSIMAP
Literature Space: Classification Server
BioGrid Mission: Distributed computational biology platform for fast pharmaceutical research