Automated tools to help construction of Trait Ontologies
Chris Mungall
Monarch Initiative
Gene Ontology Consortium
Lawrence Berkeley National Laboratory
PRO-PO-GO Workshop 2013

Trait Ontologies
What is a trait?
  – Working elucidation: attributes, not values
Existing Trait Ontologies:
  – TO – plant traits
      Should really be called ‘PTO’
  – VT – vertebrate traits
  – GO biological attribute ontology
      Internal ontology used to provide logical definitions for terms like ‘regulation of blood pressure’, ‘regulation of synaptic plasticity’

Many terms in TOs can be trivially composed
Many follow the ‘EA’ pattern
  – Entity (anatomical structure, chemical entity, …)
  – Attribute (subset of PATO)
Pre- vs post-composition?
  – Strictly speaking, from a logical perspective it doesn’t matter
  – BUT there are many practical advantages to pre-composition
      Provided the right tools are used
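
As a rough illustration of the ‘EA’ pattern, the sketch below composes a trait record from an entity and an attribute. The `EX:` identifiers, the `Term` class, the `compose_trait` helper, and the use of `inheres_in` as the linking relation are assumptions made for this example, not something taken from TO, VT, or PATO releases.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Term:
    """A minimal ontology term: an identifier plus a human-readable label."""
    id: str
    label: str


def compose_trait(entity: Term, attribute: Term) -> dict:
    """Build a pre-composed trait record following the entity-attribute pattern."""
    return {
        # The label is the entity label followed by the attribute label.
        "label": f"{entity.label} {attribute.label}",
        # The attribute acts as the genus (the general kind of thing).
        "genus": attribute.id,
        # The entity is attached through a relation (assumed: inheres_in).
        "differentia": ("inheres_in", entity.id),
    }


# Placeholder identifiers, not real ontology IDs.
cotyledon = Term("EX:0000001", "cotyledon")
length = Term("EX:0000002", "length")

trait = compose_trait(cotyledon, length)
print(trait["label"])
```

The same record can be stored ahead of time (pre-composition) or built at annotation time (post-composition); the logical content is identical either way.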

Using TermGenie for traits
Scenario:
  – Annotator needs new trait term “cotyledon length”
      Today please if possible!
Approach:
  – Go to termgenie.org, log in
  – Select “cotyledon” as entity, “length” as attribute
  – TermGenie uses the Elk reasoner
      Checks the new class is not equivalent to an existing class
      Checks ‘satisfiability’
      Determines placement in hierarchy
  – TermGenie suggests text definition and synonyms based on template
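
The request flow above might be sketched as follows. This is not TermGenie's code: a dictionary lookup on the (entity, attribute) pair stands in for the reasoner's equivalence check, and the definition and synonym templates, along with the `request_trait` function name, are hypothetical.

```python
def request_trait(entity: str, attribute: str, existing: dict) -> dict:
    """Handle a templated trait request.

    `existing` maps (entity, attribute) pairs to labels of terms already in
    the ontology. A real system would ask an OWL reasoner whether the new
    class is equivalent to an existing one; the lookup here only imitates that.
    """
    key = (entity, attribute)
    if key in existing:
        # Reuse the existing term instead of minting a duplicate.
        return {"status": "exists", "label": existing[key]}

    # Fill in label, text definition, and synonym from simple templates.
    return {
        "status": "new",
        "label": f"{entity} {attribute}",
        "definition": f"The {attribute} of a {entity}.",
        "synonyms": [f"{attribute} of {entity}"],
    }


existing_terms = {("leaf", "size"): "leaf size"}

hit = request_trait("leaf", "size", existing_terms)
new = request_trait("cotyledon", "length", existing_terms)
print(hit["status"], new["label"])
```

Satisfiability checking and placement in the hierarchy are left out; both need a real reasoner over the full ontology.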

GO biological attribute TG instance
GO has been using TG for GO terms for >2 years
Recently created a new instance for biological attributes
  – To create new terms like ‘regulation of cell shape’
TO and VT automatically pulled in
  – Use existing terms if available

Demo
Examples:
  – ‘leaf size’
  – ‘gynoecium size’
  – ‘cotyledon size’

How does this work?
Requires OWL equivalence axioms
For plant TO, these currently live in the trait_xp.obo file
  – Seeded using Obol, manually vetted and improved
  – Modeling patterns documentation: Best_Practice
Integrating phenotype ontologies across multiple species. CJ Mungall, GV Gkoutos, CL Smith, MA Haendel, SE Lewis, M Ashburner. Genome Biology 11 (1), R2
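
To give a feel for what such an equivalence axiom looks like in OBO format, the sketch below prints a term stanza with a genus-differentia definition written as two `intersection_of` lines. The `EX:` identifiers, the `inheres_in` relation, and the `obo_stanza` helper are illustrative assumptions; the actual contents of trait_xp.obo may differ.

```python
def obo_stanza(term_id: str, name: str,
               attribute: tuple, entity: tuple,
               relation: str = "inheres_in") -> str:
    """Render an OBO [Term] stanza whose logical definition is
    'attribute AND (relation SOME entity)'.

    `attribute` and `entity` are (id, label) pairs; the labels are written
    as trailing '!' comments, as is conventional in OBO files.
    """
    lines = [
        "[Term]",
        f"id: {term_id}",
        f"name: {name}",
        # Genus: the attribute class.
        f"intersection_of: {attribute[0]} ! {attribute[1]}",
        # Differentia: the relation to the entity class.
        f"intersection_of: {relation} {entity[0]} ! {entity[1]}",
    ]
    return "\n".join(lines)


# Placeholder identifiers, not real ontology IDs.
stanza = obo_stanza(
    "EX:0000003", "cotyledon length",
    attribute=("EX:0000002", "length"),
    entity=("EX:0000001", "cotyledon"),
)
print(stanza)
```

A reasoner reads the two `intersection_of` lines together as one equivalence axiom, which is what lets it detect duplicates and place a new term in the hierarchy.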

Proposal: Combined trait ontology
Merge of existing TOs
  – Existing plant TO IDs can be ‘grandfathered’ in
Not taxon-restricted
  – Some cell component traits are shared across all kingdoms
  – Taxon subsets can be extracted automatically
Editors’ version is in OWL
Most term requests via TermGenie
  – Templated or freeform
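
One way automatic taxon subsetting could work is sketched below, assuming each trait carries an optional "only in taxon" restriction inherited from its entity. The tiny taxonomy, the trait list, and the function names are toy data invented for this example.

```python
# Toy taxonomy: child -> parent (heavily simplified, for illustration only).
PARENT = {
    "Arabidopsis thaliana": "Viridiplantae",
    "Viridiplantae": "Eukaryota",
    "Mus musculus": "Vertebrata",
    "Vertebrata": "Eukaryota",
}

# Trait label -> taxon restriction (None means no restriction).
TRAITS = {
    "leaf size": "Viridiplantae",
    "femur length": "Vertebrata",
    "mitochondrion size": None,
}


def lineage(taxon: str) -> set:
    """Return the taxon together with all of its ancestors."""
    seen = set()
    current = taxon
    while current is not None:
        seen.add(current)
        current = PARENT.get(current)
    return seen


def taxon_subset(taxon: str) -> list:
    """Select the traits that are applicable to the given taxon."""
    ancestors = lineage(taxon)
    return sorted(
        label for label, only_in in TRAITS.items()
        if only_in is None or only_in in ancestors
    )


plant_traits = taxon_subset("Arabidopsis thaliana")
mouse_traits = taxon_subset("Mus musculus")
print(plant_traits)
print(mouse_traits)
```

The unrestricted cell component trait appears in both subsets, which is the point of keeping the merged ontology free of taxon restrictions at the top level.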

Modeling issues
Best OWL model
  – Many pros and cons
Complex traits
  – Ratios
Integration with quantitative data

Integration with phenotype ontologies
Phenotype ontologies
  – Terms can be thought of as ‘leaf nodes’ of TO terms (values)
  – Many phenotypes are complex (multiple traits)
Diverse species
  – FYPO – fission yeast
  – CPO – all species, cellular, automatically generated
  – MP – mouse
  – HP – human
  – WBPhenotype – worm
  – FBcv – fly
  – Zebrafish – autogenerated from post-compositions

Combined Phenotype Ontologies
Examples
  – ‘UberPheno’
  – Phenomenet combined ontology
Automatically generated using Uberon as bridging anatomy ontology
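
The bridging idea can be sketched as follows: species-specific anatomy terms are mapped to a shared bridging class, and phenotypes from different species are then grouped under that class. All identifiers, labels, and mappings below are invented placeholders, not actual Uberon, MP, or HP content, and the real pipelines work from OWL axioms instead of a lookup table.

```python
from collections import defaultdict

# Species anatomy term -> bridging (species-neutral) anatomy class.
BRIDGE = {
    "mouse_anat:heart": "bridge:heart",
    "human_anat:heart": "bridge:heart",
    "fish_anat:heart": "bridge:heart",
    "mouse_anat:femur": "bridge:femur",
}

# (phenotype label, species anatomy term it affects)
PHENOTYPES = [
    ("mouse: enlarged heart", "mouse_anat:heart"),
    ("human: enlarged heart", "human_anat:heart"),
    ("zebrafish: small heart", "fish_anat:heart"),
    ("mouse: short femur", "mouse_anat:femur"),
]


def group_by_bridge(phenotypes: list, bridge: dict) -> dict:
    """Group phenotype labels by the bridging class of their anatomy term."""
    groups = defaultdict(list)
    for label, anatomy_id in phenotypes:
        # Anatomy terms without a mapping are skipped in this sketch.
        shared = bridge.get(anatomy_id)
        if shared is not None:
            groups[shared].append(label)
    return dict(groups)


combined = group_by_bridge(PHENOTYPES, BRIDGE)
print(combined["bridge:heart"])
```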

Summary
Automated tools can make the ontology development cycle more efficient
  – Problem: OWL environments hard for non-experts (and experts)
  – TermGenie provides a simple intuitive interface
      Configurable
Merging trait efforts is a win-win