Plant Phenotype Pilot Project
AIM: To use ontologies to express and analyze plant phenotypes from multiple species
The Issue: Traditional free-text phenotype descriptions are inadequate for large-scale computerized comparative analyses
Phenotype Ontology Research Coordination Network
4 Working Groups: Vertebrates, Arthropods, Plants, Informatics
Many fields of biology represented: Systematics, Evolutionary biology, Genetics/developmental biology, Ecology, Paleontology, …
Unifying ideas: Shared ontologies, Shared tools and methods, Best practices, Community outreach
Challenges of managing phenotype data
Extremely diverse data type (can range from expression profile to behavior)
Can be associated with individuals, populations or species
Different levels (summary, measurement data)
Can be comparative (mutant vs. wild type) or absolute (days to flowering of a cultivar)
Data integration: needs extensive connections to other types of data (seed stocks, genes, experimental methods, publications)
Database schema and interface design
Data representation: how to represent the data in a consistent way across experiments, research communities and species
Data accessibility: how do we get data out of the literature and into the database?
Collection of phenotype data: Who is involved?
Species | Genes included in project set | Source
Glycine max | 233 | SoyBase
Solanum lycopersicum | 74 | SGN
Medicago truncatula | 443 | LIS
Zea mays | 324 | MaizeGDB
Oryza sativa | 138 | PO/Gramene/Oryzabase
Arabidopsis thaliana | 2400 | Lloyd and Meinke 2012
Phenotype Data (diagram labels): phenotypic measurement, image, genotype measured, reference genotype, experimental treatment, control treatment, growth conditions, data collection method, statistical method. Example: "Leaves are 1 cm wide"
Phenotype Summary: "Mutant yfg1-1 has narrow leaves and flowers early in short days"
Data interpretation: preferably done by experimenter
Pilot Project, limited scope:
– Mutant phenotypes (not natural variants)
– Emphasis on visual and morphological (no gene expression patterns)
– Summary data (not phenotype measurements)
Why use ontologies?
Supplement, not replacement, for free text
Provides standardized vocabulary
– "Dwarf", "short stature", "small plant" and "reduced height" are different ways of expressing the same idea
Provides relationships among terms
– Vascular leaf is_a type of leaf
– Leaf abscission zone part_of leaf
– Leaf develops_from leaf primordium
Makes computational approaches possible
– Searches
– Categorization
– Network analysis, semantic similarity
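The two benefits listed on this slide, a standardized vocabulary and explicit relationships among terms, can be sketched in a few lines of Python. This is only a toy illustration: the `SYNONYMS` table, the choice of "decreased height" as the shared label, and the helper names `normalize` and `terms_under` are assumptions made for the example, not part of the Plant Ontology, PATO, or any project tool.

```python
# Toy vocabulary (assumption): free-text variants mapped to one shared label.
# "decreased height" is used as an illustrative label, not a real term ID.
SYNONYMS = {
    "dwarf": "decreased height",
    "short stature": "decreased height",
    "small plant": "decreased height",
    "reduced height": "decreased height",
}

# (subject, relation, object) triples copied from the slide's examples.
RELATIONS = [
    ("vascular leaf", "is_a", "leaf"),
    ("leaf abscission zone", "part_of", "leaf"),
    ("leaf", "develops_from", "leaf primordium"),
]


def normalize(free_text):
    """Map a free-text description to its shared label, if one is known."""
    key = free_text.strip().lower()
    return SYNONYMS.get(key, key)


def terms_under(term, relations=("is_a", "part_of")):
    """Collect every term that reaches `term` through the given relations."""
    found = set()
    frontier = [term]
    while frontier:
        current = frontier.pop()
        for subj, rel, obj in RELATIONS:
            if obj == current and rel in relations and subj not in found:
                found.add(subj)
                frontier.append(subj)
    return found


if __name__ == "__main__":
    print(normalize("Short stature"))
    print(sorted(terms_under("leaf")))
```

A search for "leaf" that expands through `terms_under` would also match annotations made to "vascular leaf" or "leaf abscission zone", which plain free-text matching would miss.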
Outline of Pilot Project (workflow diagram)
Existing phenotype datasets: Phenotypes of cloned genes; Phenotypes of mutant loci, QTL
Existing reference ontologies: Plant Ontology, Gene Ontology, ChEBI, PATO, Plant EO
Steps: Ontology statements; Consistent and thorough set of ontology annotations; Semantic similarity computational analysis
Phenotypes and Ontologies:
From an ontological perspective, a phenotype is a combination of an entity and a quality that inheres in that entity.
Quality | inheres in | Entity | Phenotype name
fused | inheres in | juvenile vascular leaf | adherent leaf
lobed | inheres in | petal | notched petal
increased mass | inheres in | seed | high yield
increased rate | inheres in | transpiration | increased water loss
Phenotypes may also consist of two entities and a relationship between them:
Entity 1 | Relationship* | Entity 2
juvenile vascular leaf | fused with | stem
gynoecium | basal to | perianth
*In PATO, the relationship is called a "relational quality".
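The entity-quality pattern on this slide maps naturally onto a small record type. The following is a minimal sketch, assuming a hypothetical `EQStatement` class with plain-text labels in place of real ontology identifiers; it is not the data model of PATO or of the pilot project.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EQStatement:
    """One entity-quality statement (hypothetical structure for illustration)."""
    entity: str                            # e.g. a Plant Ontology term label
    quality: str                           # e.g. a PATO quality label
    related_entity: Optional[str] = None   # set only for relational qualities

    def describe(self):
        # Simple phenotype: a quality that inheres in one entity.
        if self.related_entity is None:
            return f"{self.quality} inheres_in {self.entity}"
        # Relational phenotype: entity 1, relationship, entity 2.
        return f"{self.entity} {self.quality} {self.related_entity}"


if __name__ == "__main__":
    notched_petal = EQStatement(entity="petal", quality="lobed")
    fused_leaf = EQStatement("juvenile vascular leaf", "fused with", "stem")
    print(notched_petal.describe())
    print(fused_leaf.describe())
```

Making the record frozen keeps statements hashable, so they can be stored in sets and compared directly when annotations from different species are pooled.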
Examples of mutant phenotypes shared across species: dwarf plants, rolled leaves
Examples
Description of Mutant Phenotype | Atomized Phenotype statement | Entity | Quality (PATO)
Dwarf with profuse slender tillers, small panicles | dwarf | PO: shoot system | decreased height
 | profuse tillers | PO: whole plant | has extra parts of type (basal axillary shoot system)
 | slender tillers | PO: basal axillary shoot system | slender
 | small panicles | PO: inflorescence | decreased size
Delayed flowering; Reduction in total chlorophyll | delayed flowering | GO: flowering | delayed
 | reduction in total chlorophyll | ChEBI: chlorophyll | decreased concentration
Next steps:
Data analysis
– Clustering of genes into pathways
– Degree of correlation between sequence and phenotype
– Computational prediction of gene candidates for uncloned mutant genes and QTL
Apply lessons learned
– Is the data set big enough?
– Are the ontologies complete enough?
– Is our annotation consistency good enough?
– Better analysis methods?
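As a rough illustration of how annotated phenotypes could feed the analyses listed here, the sketch below scores pairs of mutants by the overlap of their entity-quality statements. It uses plain Jaccard set overlap, which is a much simpler stand-in for ontology-aware semantic similarity and not the method used in the project. The mutant names and their annotation sets are hypothetical toy data; only the statement labels are borrowed from the examples earlier in this presentation.

```python
def jaccard(a, b):
    """Jaccard index: shared statements divided by all distinct statements."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical toy annotations: mutant name -> set of (entity, quality) pairs.
ANNOTATIONS = {
    "mutant_1": {
        ("PO: shoot system", "decreased height"),
        ("PO: inflorescence", "decreased size"),
    },
    "mutant_2": {
        ("PO: shoot system", "decreased height"),
        ("GO: flowering", "delayed"),
    },
    "mutant_3": {
        ("ChEBI: chlorophyll", "decreased concentration"),
    },
}


def rank_similar(query, annotations):
    """Rank all other mutants by phenotype overlap with `query`, best first."""
    scores = [
        (other, jaccard(annotations[query], statements))
        for other, statements in annotations.items()
        if other != query
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for name, score in rank_similar("mutant_1", ANNOTATIONS):
        print(f"{name}: {score:.2f}")
```

Exact-match overlap ignores the ontology structure: two mutants annotated to "leaf" and "vascular leaf" would score zero here, which is the gap that semantic similarity measures over is_a and part_of relations are meant to close.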
Future Possibilities with cROP
Expansion to use Protein Ontology (PRO)
Ontology statements (diagram) drawing on: Plant Ontology, Gene Ontology, PRO, PATO, Plant EO, ChEBI
Acknowledgements
Funding: NSF - Phenotype Ontology Research Coordination Network (RCN)
Oregon State University: Laurel Cooper, Pankaj Jaiswal, Laura Moore
University of Arizona: Ramona Walls (PO / iPlant)
USDA-ARS-CICGRU: Steven Cannon, Scott Kalberer, Carolyn Lawrence, Lisa Harper, Rex Nelson, David Grant
George Gkoutos (University of Aberystwyth)
Anika Oellrich (EBI)
Oklahoma State University: David Meinke
Boyce Thompson Institute: Lukas Mueller (SGN), Naama Menda (SGN)
Michigan State University: Johnny Lloyd
University of Nottingham: Sean May