Presentation on theme: "An Ontology-centric Architecture for Extensible"— Presentation transcript:
1 An Ontology-centric Architecture for Extensible Scientific Data Management Systems
Gavin Kennedy (1,2), Dr Yuan-Fang Li (3)
1: CSIRO Plant Industry, Black Mountain, Cnr Clunies Ross St & Barry Drive, Canberra, ACT 2601
2: School of ITEE, University of Queensland, St Lucia, QLD
3: Clayton School of IT, Monash University, Clayton, VIC
Novel High Resolution tools at the HRPPC: Dr Xavier Sirault (1), Dr Bob Furbank (1)
2 What is Plant Phenomics?
Phenome = Genome x Environment
Genomics is accelerating gene discovery, but how do we capitalise on these data sets to establish gene function and develop new genotypes for agriculture?
High-throughput, high-resolution analysis capacity is now the factor limiting the discovery of new traits and varieties.
“In the next 50 years we must produce more food than we have consumed in the history of mankind.” Megan Clark, CSIRO CEO, 2009
3 Phenomics from the Leaf to the Field
Imagine a plant breeder walking his trials, logging plant performance from distributed sensors with his mobile phone, or logging on to Phenonet from home to view his wheat in real time.
4 HRPPC: Canberra node of the Australian Plant Phenomics Facility
Role:
Deep phenotyping
Development of next generation tools to probe plant function and performance (come and see us)
Infrastructure:
1500 m² lab space
245 m² greenhouse
260 m² growth cabinets
Analytical tools packaged in:
1- Model Plant Module (HTP)
2- Crop-Plant Shoot Module (MTP)
3- Crop-Plant Root Module (MTP)
4- Crop-Plant Field Module (HTP)
Species: Brachypodium distachyon, Arabidopsis thaliana, Gossypium species, Triticum and Hordeum species, Vigna unguiculata (cowpea), Cicer arietinum (chickpea), Zea mays (maize), Sorghum bicolor, …
5 Capitalising on new imaging technologies
Plant Morphology and Plant Function
Visible imaging: plant area, biomass, structure; senescence, relative chlorophyll content, pathogenic lesions
Far Infrared imaging: canopy / leaf temperature; water use / salt tolerance
Chlorophyll Fluorescence imaging: physiological state of photosynthetic machinery
Near IR imaging: tissue water content; soil water content
FTIR Imaging Spectroscopy / Hyperspectral imaging: cellular localisation of metabolites (sugars, protein, aromatics); carbohydrates, pigments and proteins
6 Addressing issues with fluorescence and environmental control
PlantScan: next generation phenotyping platform for n-dimensional models
Light Detection and Ranging (LiDAR)
Micro-bolometer sensors (Far-Infrared)
4-CCD line scanner (NIR and visible split)
7 Automated feature extraction and quantification of n-dimensional models
Jurgen Fripp, CSIRO ICT E-Health, Brisbane
Automated segmentation: extracted stem
Bounding box extraction and Delaunay triangulation for convex 3D hull
Volume over time
Height and total volume extraction
Sirault, Fripp and Furbank (in preparation)
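The bounding-box and height steps named on this slide can be sketched in a few lines of Python. This is only an illustration, assuming the plant model is available as a list of (x, y, z) points with z as the vertical axis. The function names are hypothetical and are not taken from the authors' pipeline. The convex-hull volume obtained from the Delaunay triangulation is not reproduced here; the bounding-box volume computed below is only an upper bound on it.

```python
from typing import Iterable, Tuple

Point = Tuple[float, float, float]


def bounding_box(points: Iterable[Point]) -> Tuple[Point, Point]:
    """Return the (min corner, max corner) of the axis-aligned bounding box."""
    pts = list(points)
    if not pts:
        raise ValueError("point cloud is empty")
    xs, ys, zs = zip(*pts)
    return (min(xs), min(ys), min(zs)), (max(xs), max(ys), max(zs))


def plant_height(points: Iterable[Point]) -> float:
    """Height = extent of the bounding box along z (assumed vertical axis)."""
    lo, hi = bounding_box(points)
    return hi[2] - lo[2]


def bounding_box_volume(points: Iterable[Point]) -> float:
    """Volume of the bounding box: an upper bound on the convex-hull volume."""
    lo, hi = bounding_box(points)
    return (hi[0] - lo[0]) * (hi[1] - lo[1]) * (hi[2] - lo[2])
```

Calling these functions on the point cloud from each imaging session would give a height and volume series over time, which is the kind of trace the "Volume over time" item refers to.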
8 An integrated phenotyping platform for Model Plants
PAM Fluorescence imaging
Far Infrared imaging
Visible imaging for growth
Climate controlled in equilibration chamber and imaging chambers
2500 plants per day
Applications:
1001 genomes project: 65 re-sequenced Arabidopsis thaliana ecotypes under analysis, with Detlef Weigel
USDA Brachypodium distachyon project
9 www.phenonet.com Distributed Sensor Network for Phenomics
Measure and log a range of environmental factors on field trials.
ZigBee wireless transmitters: thermopile temperature sensor, humidity, ambient temperature, soil moisture
Imaging: estimate biomass; greenness index for fertilization; detect flowering; estimate yield.
Imaging constrained: develop smarter portable platforms.
10 Ontologies
Ontologies are sets of formalised terms that allow us to represent knowledge about the concepts and relationships in a domain.
Annotating with ontologies means describing a domain object or process.
Modelling with ontologies means classifying a domain object or process and its relationships to other domain concepts.
This image shows that the wheat plant on the left has increased “salt tolerance (TO: )”.
OBI: : “platform”: “A platform is an object_aggregate that is the set of instruments and software needed to perform a process.”
11 Ontologies
Evolutionary changes in domain, model & data
Expressed in OWL (& RDF Schema)
Provides syntax & semantics, which enables reasoning
Expressivity vs decidability
Validation via reasoning
Designed to be open & interoperable
Facilitates sharing, reuse & integration
Maturing technology stacks: APIs, reasoners, triple stores, query engines
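To make "validation via reasoning" concrete, here is a toy sketch in plain Python. It is not an OWL reasoner: it only follows subclass statements transitively and then checks a property's declared domain against the inferred types of an individual. All class, property and individual names are hypothetical. Treating the domain as a constraint is also a closed-world simplification; under RDFS semantics a domain statement would instead let a reasoner infer that the subject is a Plant.

```python
# Hypothetical schema: child class -> parent class.
SUBCLASS_OF = {"Wheat": "CropPlant", "CropPlant": "Plant"}

# Hypothetical property domains: property -> class its subject must belong to.
DOMAIN = {"hasSaltTolerance": "Plant"}

# Hypothetical asserted types: individual -> most specific class.
TYPES = {"plant_42": "Wheat", "camera_1": "Instrument"}


def superclasses(cls):
    """Return cls followed by every ancestor reachable via SUBCLASS_OF."""
    seen = []
    while cls is not None and cls not in seen:  # 'not in seen' guards cycles
        seen.append(cls)
        cls = SUBCLASS_OF.get(cls)
    return seen


def is_valid(subject, prop):
    """True if the subject's inferred types include the property's domain."""
    inferred = superclasses(TYPES[subject])
    return DOMAIN[prop] in inferred
```

Here `is_valid("plant_42", "hasSaltTolerance")` holds because Wheat is inferred to be a Plant, while the same property on `camera_1` is rejected.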
12 PODD: The Phenomics Ontology Driven Data repository
A research data and metadata repository.
Managing phenomics data from multiple heterogeneous, high-volume, high-resolution data generation platforms.
A methodology for managing and publishing research data outputs.
A semantic web data resource.
[Diagram labels: PlantScan, Phenonet, Phenomobile, TrayScan; Data; Metadata; PODD Metadata Repository; PODD Data Stores]
13 Putting the OD in PODD
Basics: ontologies as domain models for research data
Model domain objects as ontological objects
Base ontology: domain independent
Phenomics ontology: domain specific
Organizes data logically
Represented as metadata objects
Parent-child relationship
Referential relationship
Drives all operations in the data lifecycle
Domain concepts -> OWL classes
Attributes and relations -> OWL predicates
Domain objects -> OWL individuals
Comments, descriptions -> OWL annotations
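The mapping on this slide can be illustrated with a handful of triples written as plain Python tuples. This is a sketch, not the PODD ontology itself: the `ex:` names (including `ex:hasInvestigation`, `ex:refersTo` and `ex:hasStartDate`) are hypothetical, while `rdf:type`, `owl:Class`, `owl:ObjectProperty`, `owl:DatatypeProperty` and `rdfs:comment` are standard RDF/OWL vocabulary terms.

```python
RDF_TYPE = "rdf:type"


def build_triples():
    """Return example (subject, predicate, object) triples for each mapping row."""
    return [
        # Domain concepts -> OWL classes
        ("ex:Project", RDF_TYPE, "owl:Class"),
        ("ex:Investigation", RDF_TYPE, "owl:Class"),
        # Attributes and relations -> OWL predicates (properties)
        ("ex:hasInvestigation", RDF_TYPE, "owl:ObjectProperty"),  # parent-child
        ("ex:refersTo", RDF_TYPE, "owl:ObjectProperty"),          # referential
        ("ex:hasStartDate", RDF_TYPE, "owl:DatatypeProperty"),    # attribute
        # Domain objects -> OWL individuals
        ("ex:project1", RDF_TYPE, "ex:Project"),
        ("ex:investigation1", RDF_TYPE, "ex:Investigation"),
        ("ex:project1", "ex:hasInvestigation", "ex:investigation1"),
        # Comments, descriptions -> OWL annotations
        ("ex:project1", "rdfs:comment", '"Salt tolerance screen"'),
    ]


def children_of(triples, parent, predicate="ex:hasInvestigation"):
    """Follow a parent-child predicate from one object to its children."""
    return [o for s, p, o in triples if s == parent and p == predicate]


def instances_of(triples, cls):
    """List the individuals asserted to be of the given class."""
    return [s for s, p, o in triples if p == RDF_TYPE and o == cls]
```

Because the parent-child link is itself a triple, walking the object hierarchy is just a filter over the same data that holds the classes and annotations.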
14 The PODD Ontology
[Diagram labels: Project, Project Plan, Investigation, Platform, Analysis, Event, Genotype, Treatment, Material, Material, Container, Data, Environment, Design, Gene, Sex, Observation/Phenotype, Treatment, Archive, Data, Sequence, Measurement, Measurement, Parameter]
15 PODD Architecture
Objects represented semantically
Semantics (metadata) captured in RDF
Repository operations on RDF: ingestion, retrieval, update, query & search, export
Backend object management: Fedora Commons
Fedora objects mapped to Java objects for the Business Logic Layer and the Interface Layer
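The repository operations listed on this slide (ingestion, retrieval, update, query, export) can be sketched with a tiny in-memory triple store. This is an illustration in Python only; PODD itself is described as using Fedora Commons with Java objects, and none of the names below come from its code. The `http://example.org/podd/` namespace used in the usage note is hypothetical, and the export writes N-Triples-style lines under the simplifying assumption that any object starting with a double quote is a literal and everything else is an IRI.

```python
class TripleStore:
    """Minimal in-memory store of (subject, predicate, object) string triples."""

    def __init__(self):
        self._triples = set()

    def ingest(self, triples):
        """Add an iterable of triples to the store."""
        self._triples.update(triples)

    def retrieve(self, subject):
        """Return all (predicate, object) pairs describing one subject."""
        return {(p, o) for s, p, o in self._triples if s == subject}

    def update(self, subject, predicate, new_object):
        """Replace every value of subject/predicate with a single new value."""
        self._triples = {
            t for t in self._triples
            if not (t[0] == subject and t[1] == predicate)
        }
        self._triples.add((subject, predicate, new_object))

    def query(self, s=None, p=None, o=None):
        """Pattern match; None acts as a wildcard. Results are sorted."""
        return sorted(
            t for t in self._triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)
        )

    def export(self):
        """Serialise as N-Triples-style lines (quoted objects are literals)."""
        lines = []
        for s, p, o in sorted(self._triples):
            obj = o if o.startswith('"') else f"<{o}>"
            lines.append(f"<{s}> <{p}> {obj} .")
        return "\n".join(lines)
```

A typical round trip would ingest a project's metadata, call `update` to correct a title, use `query(p=...)` to find all objects linked by a given relation, and finally `export()` the result for sharing.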
16 Future Work
Annotation services
Ontological tagging of PODD objects
Annotation tools, search/discovery tools, browsers, etc.
Virtual laboratory environment
Support Phenome to Genome (and back) discovery processes
Analyse linkages across data resources
Workflows for statistical inference & mathematical modelling
Visualisation tools, etc.
17 Resources
Plant Phenomics Test Instance:
Plant Phenomics Production Instance:
Mouse Phenomics Production Instance:
PODD Project Website:
Contact:
Ph:
This work is part of a National eResearch Architecture Taskforce (NeAT) project, supported by the Australian National Data Service (ANDS) through the Education Investment Fund (EIF) Super Science Initiative, and the Australian Research Collaboration Service (ARCS) through the National Collaborative Research Infrastructure Strategy Program.
18 The Team
PODD Project Manager: Gavin Kennedy
University of Queensland eResearch Lab: Faith Davies (Developer), Simon McNaughton (Developer), Jane Hunter (eResearch Lab Leader)
APPF/HRPPC/CSIRO: Xavier Sirault (Science Leader, HRPPC), Xueqin Wang (Tester, Documenter), Bob Furbank (APPF HRPPC Leader)
APPF/Plant Accelerator/Uni of Adelaide: Bogdan Masznicz (Bioinformatician), Mark Tester (APPF TPA Leader)
APN: Philip Wu (Developer), Martin Hamilton (Developer), Adrienne McKenzie (APN Head of Network Services)
Monash University: Yuan-Fang Li (Designer)
NeAT: Andrew Treloar (Deputy Director ANDS), Paul Coddington (Projects Manager, ARCS)
ALA: Donald Hobern (Director, ALA)