1 CIS607, Fall 2004 Semantic Information Integration Attendees: Vikash Agarwal, Julian M Catchen Kevin A Huck, Kushal M Koolwal, Paea J Le Pendu Xiangkui.

Presentation transcript:

1 CIS607, Fall 2004: Semantic Information Integration
Attendees: Vikash Agarwal, Julian M Catchen, Kevin A Huck, Kushal M Koolwal, Paea J Le Pendu, Xiangkui Yao, Xiaofang Zhang
Instructor/Organizer: Dejing Dou
Week 10 (Dec. 1)

2 Outline
Personal Information Management (PIM)
Semantic Integration in PIM
Medical Informatics and Bioinformatics
Semantic Integration in Biomedical Informatics

3 Personal Information
– Homepages (HTML, XML)
– Personal emails (Text)
– Spreadsheets (e.g. Microsoft Excel)
– Contact Lists (Text)
– Calendar
– Publications and Presentations (Word, LaTeX, PowerPoint)
– Personal Databases (SQL)
– ……

4 Personal Information Management (PIM)
How to organize personal information resources
– They are currently organized by application and location.
How to integrate and share the data
– Mostly manually (e.g. copy & paste).
How to search (query)
– E.g. Prof. Wang wants to know which papers his students presented at conferences and the travel expenses paid from grants.
Good news: The development of the Internet, the Web, and wireless communication makes personal information accessible from desktops, laptops, Palm devices, and cellphones.
The problems: Different formats and data structures, and different contents depending on the application.

5 Association (Relationship)-based PIM
Organize the personal information resources based on their associations (relationships).
– Emails ↔ Contact Lists
– Homepage ↔ Publications
– Calendar ↔ Spreadsheets
Use a domain ontology to define those concepts and store the associations (relationships) as mappings.
Develop an integration engine to process data and queries based on the domain ontology and the mappings.
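The idea of storing associations as mappings and querying through them can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the record layouts, the `ONTOLOGY` dictionary, and the function names are invented for the example and are not the system described in the slides.

```python
from collections import defaultdict

# Hypothetical personal information resources, already reduced to records.
EMAILS = [
    {"id": "mail-1", "sender": "alice@example.edu", "subject": "Camera-ready draft"},
    {"id": "mail-2", "sender": "bob@example.edu", "subject": "Travel receipts"},
    {"id": "mail-3", "sender": "alice@example.edu", "subject": "Slides for Week 10"},
]
CONTACTS = [
    {"id": "contact-1", "name": "Alice", "email": "alice@example.edu"},
    {"id": "contact-2", "name": "Bob", "email": "bob@example.edu"},
]

# Toy "domain ontology" (an assumption): concept name -> required properties.
ONTOLOGY = {
    "Person": {"name", "email"},
    "Email": {"sender", "subject"},
}


def conforms(record, concept):
    """Check that a record carries every property the concept requires."""
    return ONTOLOGY[concept].issubset(record.keys())


def build_associations(emails, contacts):
    """Store Contact <-> Email associations as explicit id mappings."""
    by_address = {c["email"]: c["id"] for c in contacts if conforms(c, "Person")}
    associations = defaultdict(list)
    for mail in emails:
        contact_id = by_address.get(mail["sender"])
        if contact_id is not None and conforms(mail, "Email"):
            associations[contact_id].append(mail["id"])
    return associations


def subjects_from(name, emails, contacts, associations):
    """Answer 'which emails did this person send?' by following the mappings."""
    subjects_by_id = {m["id"]: m["subject"] for m in emails}
    result = []
    for contact in contacts:
        if contact["name"] == name:
            result.extend(subjects_by_id[i] for i in associations[contact["id"]])
    return result


ASSOCIATIONS = build_associations(EMAILS, CONTACTS)
print(subjects_from("Alice", EMAILS, CONTACTS, ASSOCIATIONS))
```

The query never inspects the email format directly; it only follows the stored associations, which is the role the slide assigns to the integration engine.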

6 Association (Relationship)-based PIM (cont'd)
[Figure labels: User; Integration Engine; Domain ontology; Information Resources (Data); Person, Homepage, Contacts, SpreadSheet, Publications, Calendar, Emails, Personal DBs (SQL)]

7 Main Topics in Association-based PIM
How to integrate structured and unstructured data
– Databases and spreadsheets are structured; XML and LaTeX are semi-structured.
– Emails, HTML, contacts, and Word documents are unstructured text.
How to define the domain ontology. The concepts in different resources follow different hierarchies.
How to express the mappings (rules) between different information resources.
How the integration engine can use those mappings to integrate data and answer queries.
– Emails ↔ Contact Lists
– Homepage ↔ Publications ↔ Personal Databases
– Calendar ↔ Spreadsheets

8 Bioinformatics and Medical Informatics
What it is
The analysis of biological and medical information using computers and statistical techniques; the science of developing and utilizing computer databases and algorithms to accelerate and enhance biological and medical research.
What it can do
– In genomics, bioinformatics includes the development of methods to search databases quickly, to analyze DNA sequence information, and to predict protein sequence and structure from DNA sequence data.
– In neuroscience, medical informatics can analyze EEG and MRI data to study the functions of neurons and the human brain.
– In pharmacy, medical informatics can help study drug use and drug interactions.
– In clinical studies, medical informatics (e.g. expert systems) can help study diseases and the treatment of patients.

9 Good news and problems
Good news
– Most biomedical data is already stored in databases as structured data.
– Statistics-based data mining techniques have been used successfully to find patterns in the data.
Problems in biomedical data integration
– Most biomedical databases were developed locally for specific applications, so there is little agreement among their schemas.
– It is difficult for other people, especially people without biomedical knowledge, to understand the schemas.
– Database schemas are not expressive enough to capture the meaning ("semantics") of the data and of patterns in the data.

10 Integrating Neuronal Databases
Cooperation with the Yale Medical Informatics Center to integrate the web-based neuronal databases of Senselab (Yale) and CNDB (Cornell).
– Senselab: model and structure information for a particular class of neurons.
– CNDB: experimental data for individual neurons measured on a particular day.
Researchers at Senselab have marked up their data and database schema with EDSP [Marenco et al. 03], an XML specification.
Cornell's researchers have also marked up their data and database schema, using another XML dialect.
[Figure labels: structure image; experimental EEG data (electroencephalography)]

11 Integrating Neuronal Databases (cont'd)
Get the database schemas from the XML files and transform them into class and property definitions.
Find the mapping between the two neuronal database schemas with the help of domain experts (neuroscientists).
Merge the two database schemas with bridging axioms, e.g.: (forall (n - neuron) (if n n)))
We have developed some initial semi-automatic tools and GUIs to help domain experts, such as neuroscientists, map and merge two neuronal database schemas.
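The three steps above (schemas to class and property definitions, a mapping, a merge with bridging axioms) can be illustrated with a small Python sketch. Everything named here is an assumption made for the example: the two schemas, the `a:`/`b:` prefixes, and the single axiom are hypothetical and do not reproduce the Senselab or CNDB schemas or the axiom on the slide. An axiom is modeled simply as "every instance of the source class is also an instance of the target class".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClassDef:
    name: str
    properties: tuple


def schema_to_classes(prefix, schema):
    """Turn {table: [columns]} into namespaced class and property definitions."""
    classes = {}
    for table, columns in schema.items():
        class_name = f"{prefix}:{table}"
        props = tuple(f"{prefix}:{column}" for column in columns)
        classes[class_name] = ClassDef(class_name, props)
    return classes


# Hypothetical schemas standing in for the two neuronal databases.
SCHEMA_A = {"neuron": ["name", "compartment"]}
SCHEMA_B = {"cell": ["label", "recording_date"]}

# Hypothetical bridging axiom: (source_class, target_class).
BRIDGING_AXIOMS = [("a:neuron", "b:cell")]


def merge(classes_a, classes_b, axioms):
    """Merge two sets of class definitions and keep the axioms that link them."""
    merged = {**classes_a, **classes_b}
    for source, target in axioms:
        if source not in merged or target not in merged:
            raise KeyError(f"axiom refers to an unknown class: {source} -> {target}")
    return {"classes": merged, "axioms": list(axioms)}


def apply_axioms(facts, axioms):
    """Forward-chain over (class, individual) facts until nothing new is derived."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for source, target in axioms:
            for cls, individual in list(derived):
                if cls == source and (target, individual) not in derived:
                    derived.add((target, individual))
                    changed = True
    return derived


MERGED = merge(
    schema_to_classes("a", SCHEMA_A),
    schema_to_classes("b", SCHEMA_B),
    BRIDGING_AXIOMS,
)
print(sorted(apply_axioms({("a:neuron", "n1")}, MERGED["axioms"])))
```

Keeping each source's symbols in its own namespace means the merge never has to rename anything; all cross-database knowledge lives in the axioms, which is what lets domain experts review and edit them separately.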

12 Interactive Axiom Composition by Domain Experts
Ontology mapping: Similarity matching using dictionaries, e.g. Protein vs. Enzyme.
Axiom production: Allow domain experts to give concrete examples of how two symbols in different ontologies (database schemas) are related. Generalize the examples into usable bridging axioms, a machine learning approach to generating mapping rules.
Pattern reuse: Because a large number of correspondences can usually be sorted into a small set of patterns, allow domain experts to note and reuse these patterns.
Consistency testing: Detect contradictions among the generated bridging axioms; display the bugs to domain experts and allow the axioms to be edited.
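Two of the steps above, dictionary-based similarity matching and consistency testing, lend themselves to a short Python sketch. It is a toy under stated assumptions: the `DICTIONARY` entries, the Jaccard score, the threshold, and the axiom tuples are all invented for illustration and are not the tools described in the slides.

```python
# Hypothetical hand-written dictionary: term -> related or broader terms.
DICTIONARY = {
    "enzyme": {"protein", "catalyst"},
    "protein": {"macromolecule"},
    "neuron": {"cell", "nerve cell"},
}


def related_terms(term):
    """The term itself plus everything the dictionary lists for it."""
    key = term.lower()
    return {key} | DICTIONARY.get(key, set())


def similarity(term_a, term_b):
    """Jaccard overlap of the two related-term sets (0.0 to 1.0)."""
    a, b = related_terms(term_a), related_terms(term_b)
    return len(a & b) / len(a | b)


def suggest_mappings(symbols_a, symbols_b, threshold=0.2):
    """Propose candidate correspondences for a domain expert to confirm."""
    candidates = []
    for a in symbols_a:
        for b in symbols_b:
            score = similarity(a, b)
            if score >= threshold:
                candidates.append((a, b, score))
    return candidates


def find_contradictions(axioms):
    """Flag class pairs asserted to be both subclass-related and disjoint.

    Axioms are ("subclass", A, B) or ("disjoint", A, B) tuples. Such a pair is
    reported as a likely bug, since class A could then have no instances.
    """
    disjoint = {frozenset((a, b)) for kind, a, b in axioms if kind == "disjoint"}
    return [
        (a, b)
        for kind, a, b in axioms
        if kind == "subclass" and frozenset((a, b)) in disjoint
    ]


for a, b, score in suggest_mappings(["Enzyme", "Neuron"], ["Protein", "Cell"]):
    print(f"{a} ~ {b} (score {score:.2f})")
```

The matcher only proposes candidates; accepting them, and fixing anything `find_contradictions` reports, is left to the domain expert, in line with the interactive workflow on the slide.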

13 The mappings between EEG and MRI data
[Figure labels: EEG data acquisition; magnetic resonance imaging (MRI)]

14 Ontology-based Data Analysis (Mining)
You can consider it an expert system. It is at least useful for training purposes.
[Figure labels: Data M; Data R; Inference Engine; O_R; O_M; EEG, MRI … data; Computational tools]
What are the features (patterns) of the processed data?
What can the patterns tell us (e.g. about any function or disease of the brain)?

15 Ontology-based Genome DB Mediation
Integrating databases with the domain ontology. The system can process meaningful queries and data based on the mapping rules.
[Figure labels: Query based on domain ontology; Domain Ontology (includes GO); Onto 1, Onto 2, Onto 3; DB 1 (e.g. ZFIN), DB 2 (e.g. another zebrafish lab DB), DB 3 (e.g. human DB), ……]
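The mediation pattern on this slide, one query phrased in domain-ontology terms and answered from several databases through mapping rules, might be sketched as follows. This is a minimal sketch, not the system in the slides: the table layouts, the property names `Gene.symbol` and `Gene.go_annotation`, and all data values are placeholders invented for the example, and two in-memory SQLite databases stand in for the real sources.

```python
import sqlite3


def make_db(ddl, insert_sql, rows):
    """Create an in-memory SQLite database holding one table of sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(ddl)
    conn.executemany(insert_sql, rows)
    return conn


# Two hypothetical sources that describe the same kind of data differently.
DB1 = make_db(
    "CREATE TABLE gene (symbol TEXT, go_term TEXT)",
    "INSERT INTO gene VALUES (?, ?)",
    [("gene-a", "term-1"), ("gene-b", "term-2")],
)
DB2 = make_db(
    "CREATE TABLE locus (locus_name TEXT, annotation TEXT)",
    "INSERT INTO locus VALUES (?, ?)",
    [("locus-x", "term-1")],
)

# Mapping rules (assumed): domain-ontology property -> local table and column.
MAPPINGS = {
    "db1": {
        "conn": DB1,
        "table": "gene",
        "columns": {"Gene.symbol": "symbol", "Gene.go_annotation": "go_term"},
    },
    "db2": {
        "conn": DB2,
        "table": "locus",
        "columns": {"Gene.symbol": "locus_name", "Gene.go_annotation": "annotation"},
    },
}


def mediated_query(select_prop, where_prop, value):
    """Rewrite one ontology-level query for every source and merge the answers."""
    results = []
    for source, mapping in MAPPINGS.items():
        columns = mapping["columns"]
        # Identifiers come from the trusted mapping table; the value is bound.
        sql = (
            f'SELECT {columns[select_prop]} FROM {mapping["table"]} '
            f"WHERE {columns[where_prop]} = ?"
        )
        for (answer,) in mapping["conn"].execute(sql, (value,)):
            results.append((source, answer))
    return results


print(mediated_query("Gene.symbol", "Gene.go_annotation", "term-1"))
```

The caller only ever names ontology properties; adding a third database means adding one more entry to `MAPPINGS`, with no change to the query code.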

16 Genotypes + Environment => Phenotypes
Data G, O_G: The features (makeup) of genes.
– GO (Gene Ontology): Cellular Component, Molecular Function, Biological Process
Data E, O_E: Environment features.
– temperature, pressure, light, ……
Data P, O_P: The observable characteristics produced by the genotype interacting with the environment.
Too many features.