Development of the Amphibian Anatomical Ontology

Slides:



Advertisements
Similar presentations
Dr. Leo Obrst MITRE Information Semantics Information Discovery & Understanding Command & Control Center February 6, 2014February 6, 2014February 6, 2014.
Advertisements

1 Computational Asset Description for Cyber Experiment Support using OWL Telcordia Contact: Marian Nodine Telcordia Technologies Applied Research
SEVENPRO – STREP KEG seminar, Prague, 8/November/2007 © SEVENPRO Consortium SEVENPRO – Semantic Virtual Engineering Environment for Product.
Using the Semantic Web to Construct an Ontology- Based Repository for Software Patterns Scott Henninger Computer Science and Engineering University of.
Automated tools to help construction of Trait Ontologies Chris Mungall Monarch Initiative Gene.
Who am I Gianluca Correndo PhD student (end of PhD) Work in the group of medical informatics (Paolo Terenziani) PhD thesis on contextualization techniques.
Information and Business Work
1 Introduction to XML. XML eXtensible implies that users define tag content Markup implies it is a coded document Language implies it is a metalanguage.
OntoBlog: Linking Ontology and Blogs Aman Shakya 1, Vilas Wuwongse 2, Hideaki Takeda 1, Ikki Ohmukai 1 1 National Institute of Informatics, Japan 2 Asian.
R R R CSE870: Advanced Software Engineering (Cheng): Intro to Software Engineering1 Advanced Software Engineering Dr. Cheng Overview of Software Engineering.
How can Computer Science contribute to Research Publishing?
Extracting Test Cases by Using Data Mining; Reducing the Cost of Testing Andrea Ciocca COMP 587.
Connecting Diverse Web Search Facilities Udi Manber, Peter Bigot Department of Computer Science University of Arizona Aida Gikouria - M471 University of.
Knowledge Science & Engineering Institute, Beijing Normal University, Analyzing Transcripts of Online Asynchronous.
Genome database & information system for Daphnia Don Gilbert, October 2002 Talk doc at
Moving beyond free text. Authors Scientist does research Scientist publishes research results in journal article Old Paradigm:
Chapter 1 Database Systems. Good decisions require good information derived from raw facts Data is managed most efficiently when stored in a database.
Chapter 7 Web Content Mining Xxxxxx. Introduction Web-content mining techniques are used to discover useful information from content on the web – textual.
Funded by: European Commission – 6th Framework Project Reference: IST WP 2: Learning Web-service Domain Ontologies Miha Grčar Jožef Stefan.
School of Computing FACULTY OF ENGINEERING Developing a methodology for building small scale domain ontologies: HISO case study Ilaria Corda PhD student.
How do we Collect Data for the Ontology? AmphibiaTree 2006 Workshop Saturday 11:30–11:45 J. Leopold.
Jennie Ning Zheng Linda Melchor Ferhat Omur. Contents Introduction WordNet Application – WordNet Data Structure - WordNet FrameNet Application – FrameNet.
WEB SEARCH PERSONALIZATION WITH ONTOLOGICAL USER PROFILES Data Mining Lab XUAN MAN.
Markup and Validation Agents in Vijjana – A Pragmatic model for Self- Organizing, Collaborative, Domain- Centric Knowledge Networks S. Devalapalli, R.
Alan Ruttenberg PONS R&D Task force Alan Ruttenberg Science Commons.
5 - 1 Copyright © 2006, The McGraw-Hill Companies, Inc. All rights reserved.
Benchmarking ontology-based annotation tools for the Semantic Web Diana Maynard University of Sheffield, UK.
Personalized Interaction With Semantic Information Portals Eric Schwarzkopf DFKI
AL-MAAREFA COLLEGE FOR SCIENCE AND TECHNOLOGY INFO 232: DATABASE SYSTEMS CHAPTER 1 DATABASE SYSTEMS Instructor Ms. Arwa Binsaleh.
Using Domain Ontologies to Improve Information Retrieval in Scientific Publications Engineering Informatics Lab at Stanford.
March 31, 1998NSF IDM 98, Group F1 Group F Multi-modal Issues, Systems and Applications.
Towards a Glossary of Activities in the Ontology Engineering Field Mari Carmen Suárez-Figueroa and Asunción Gómez-Pérez {mcsuarez, Ontology.
EBI is an Outstation of the European Molecular Biology Laboratory. Gautier Koscielny VectorBase Meeting 08 Feburary 2012, EBI VectorBase Text Search Engine.
Metadata By N.Gopinath AP/CSE Metadata and it’s role in the lifecycle. The collection, maintenance, and deployment of metadata Metadata and tool integration.
Distributed Data Analysis & Dissemination System (D-DADS ) Special Interest Group on Data Integration June 2000.
1 Open Ontology Repository initiative - Planning Meeting - Thu Co-conveners: PeterYim, LeoObrst & MikeDean ref.:
Adaptive Faceted Browsing in Job Offers Danielle H. Lee
Big Data that might benefit from ontology technology, but why this usually fails Barry Smith National Center for Ontological Research 1.
Selected Semantic Web UMBC CoBrA – Context Broker Architecture  Using OWL to define ontologies for context modeling and reasoning  Taking.
© NCSR, Frascati, July 18-19, 2002 CROSSMARC big picture Domain-specific Web sites Domain-specific Spidering Domain Ontology XHTML pages WEB Focused Crawling.
Intelligent Database Systems Lab Presenter: YU-TING LU Authors: Yong-Bin Kang, Pari Delir Haghighi, Frada Burstein ESA CFinder: An intelligent key.
Advanced Software Engineering Dr. Cheng
Chapter 1 Database and Database Users
Towards a unified MOD resource: An Overview
UNIFIED MEDICAL LANGUAGE SYSTEMS (UMLS)
Why Create a PGDB? Perform pathway analyses as part of a genome project Analyze omics data Create a central public information resource for the organism,
Outline Types of Databases and Database Applications Basic Definitions
Achieving Semantic Interoperability of Cancer Registries
Database Systems: Design, Implementation, and Management Tenth Edition
Lecture #11: Ontology Engineering Dr. Bhavani Thuraisingham
Data challenges in the pharmaceutical industry
Datamining : Refers to extracting or mining knowledge from large amounts of data Applications : Market Analysis Fraud Detection Customer Retention Production.
Taxonomies, Lexicons and Organizing Knowledge
Tools of Software Development
Chapter 1 Database Systems
Data Warehousing and Data Mining
ece 627 intelligent web: ontology and beyond
MANAGING DATA RESOURCES
2. An overview of SDMX (What is SDMX? Part I)
CVE.
Manuscript Transcription Assistant Initiative
Web Mining Department of Computer Science and Engg.
Operating Systems : Overview
Metadata The metadata contains
Chapter 1 Database Systems
Operating Systems : Overview
Social Abstractions for Information agents
DSS Concepts, Methodologies and Technologies
©Ian Sommerville 2004Software Engineering, 7th edition. Chapter 8 Slide 1 Tools of Software Development l 2 types of tools used by software engineers:
Versioning in Adaptive Hypermedia
Presentation transcript:

Development of the Amphibian Anatomical Ontology Anne Maglia Department of Biological Sciences Missouri University of Science and Technology (formerly the University of Missouri-Rolla) Jennifer Leopold and Analía Pugener, Missouri S&T Susan Gauch, U. Arkansas

Community-identified need (ATOL) Anatomical ontology vital to amphibian research Common gene expression and embryology models Three disparate lexicons Lack of terminological standardization limits understanding of phenotype evolution and integrative research on gene expression, embryology, and comparative anatomy Challenge

Challenge Develop an amphibian anatomical ontology that accommodates the diversity of structures present in all amphibians; include definitions, literature references, coding of phylogenetic characters, and images and annotations …and include the domain expert community …and maintain interoperability with other ontologies …and do it in a reasonable amount of time Challenge

AmphibAnat approach Combine existing tools and methodologies with new approaches Semi-automated construction Ontology maintenance Web-based community curation AmphibAnat

1. Semi-automated construction Our challenge required automated techniques to reduce manual efforts and enhance existing, manually created ontologies by: Enriching concepts in the ontology Applying metrics to benchmark the semi-automated ontology’s suitability Semi-automated construction

Semi-automated approach Manually construct robust ontology of small subset(s) of domain (started with parts of ZFIN) Rigorous community evaluation, augmentation, and modification of subset ontology Seed IR software with skeleton of ontology and attempt to recreate it Benchmark, repeat until software is reasonably accurate, then let loose on entire ontology Semi-automated construction

How does IR software work? Identify relevant electronic data sources Use topic-specific spider to generate queries for concepts in the ontology Collect potentially relevant documents IR system extracts info from documents Pattern-based extraction methods Statistical natural language processing algorithms that identify and weight most important elements Semi-automated construction

Benchmarking the IR software Intrinsic benchmarking How well the software-based ontology reflects the concepts in the manually-developed ontology Can software learn preferred terminology automatically by identifying the most frequently used term from a synset, or the term most often used by authorities Semi-automated construction

Benchmarking the IR software Extrinsic benchmarking How well the distribution of concepts in the ontology reflects that in the literature Concepts with many instances may require refinement and subdivision Concepts with no instances may be pruned (or designated as a less preferred synonym) Semi-automated construction

2. Ontology maintenance Our needs required maintenance software that: Allows web-based access and collaboration Ensures consistency and concurrent, authorized access Allows query and update of context and structure Allows import/export of common exchange formats (e.g., OBO, OWL) Allows node-based access and editing privileges Maintenance

Maintenance

RDBOM (“red-bomb”) Relational Database Ontology Maintenance Features: Structural (e.g., ancestors/descendents) and content-based queries (e.g., all concepts with particular phrase) Structural updating (e.g., move/insert/delete nodes) Content-based updating (e.g., change the value) Collaborative commenting Security (node-based per user; multiple users) Version control Web-based access Generic (any ontology) Modularization of data Maintenance

Modularization OBO (or OWL) file Phenote/PATO RDBOM Anatomical Ontology Phenotype Annotation (Character Coding) Taxonomy Literature Citations Images and Image Annotations 350 phenotype annotations, 3 model species: Salamandra, Dermophis, Xenopus User Comments Transaction History Access Permission Maintenance

3. Web-based community curation Our challenge required us to develop a virtual organization of amphibian experts Encourage user commenting by node Allow “super-user” node-based curation Teams and individuals take ownership of particular subsets for which they are the experts Web-based community curation

www.amphibanat.org Web-based community curation

Commenting tool Web-based community curation

Privilege assignment Web-based community curation

“Super-user” functions Web-based community curation

Simple interface Web-based community curation

Web-based community curation Facilitates inclusion of area expertise Provides forum for argumentation E.g., definitions, homology, preferred terminology Allows community ownership (and thus guaranteed use) Value-added: unites diverse groups toward common goal Web-based community curation

Summary Biodiverse group with disparate lexicons required combination of existing tools and novel approaches Semi-automated construction will help populate ontology faster, with less manual effort Community-based curation allows user ownership, thus mining expert knowledge and facilitating use RDBOM ontology maintenance system allows robust biodiversity data and functionality while allowing interoperability with other groups (e.g., OBO Foundry) Summary