Example projects using metadata and thesauri: the Biodiversity World Project Richard White Cardiff University, UK


1 Example projects using metadata and thesauri: the Biodiversity World Project
Richard White, Cardiff University, UK
R.J.White@cs.cf.ac.uk

2 The Biodiversity World project
- 3-year e-Science project funded by the UK BBSRC research council, 2003-2006
- Universities of Cardiff, Reading and Southampton
- The Natural History Museum (London)

3 Some difficult biodiversity questions
- How should conservation efforts be concentrated? (example of Biodiversity Richness & Conservation Evaluation)
- Where might a species be expected to occur, under present or predicted climatic conditions? (example of Bioclimatic & Ecological Niche Modelling)
- How can geographical information assist in inferring possible evolutionary pathways? (example of Phylogenetic Analysis & Palaeoclimate Modelling)

4 Point data from various herbaria

5 GARP prediction of climatic suitability

6 Distribution data from ILDIS database

7 Types of resource used in these biodiversity studies
Data sources:
- Catalogue of Life (names of species: Species 2000, GBIF)
- Biodiversity data
  - Descriptive data
  - Distribution of specimens and observations
- Geographical data
  - Boundaries of geographical & political units
  - Climate surfaces
- Genetic sequences
Analytic tools:
- Biodiversity richness assessment: various metrics
- Bioclimatic modelling: bioclimatic ‘envelope’ generation
- Phylogenetic analysis (generation of phylogenetic trees)

8 Some challenges …
- Finding the resources
- Knowing how to use these heterogeneous resources
  - Originally constructed for various reasons
  - Often little thought was given to standards or interoperability

9 The Biodiversity World vision (1)
- Problem Solving Environment for Biodiversity studies
- Heterogeneous, diverse resources
- Facilitating integration of both legacy and newly-developed resources
- Flexible workflows
- Main challenges centre around interoperability, resource discovery, metadata, etc.; high-performance computing is secondary (though relevant)

10 Our architecture …

11 Biodiversity World as a flexible PSE
Architecture diagram labels:
- Species 2000 & ITIS Catalogue of Life
- Thematic data source
- Abiotic data source
- GSD
- Analytic tool (two instances)
- Wrapper (two instances)
- BDW Grid
- Ontology: metadata; resource & analytic tool descriptions; maintenance tools
- Problem Solving Environment: resource discovery; support for workflows
- Problem Solving Environment user interface (Triana)
- User
- Local tools

12 User interaction with BDWorld …

13 Example work-flow (Climate-space Modelling)
Workflow components (diagram labels) and their roles:
- Species 2000: submit scientific name; retrieve accepted name & synonyms for species
- Localities: retrieve distribution data for species of interest
- Climate: present or recent climate surfaces; possibly different climate surfaces (e.g. predicted climate) for the prediction
- Climate Space Model: model of climatic conditions where species is currently found
- Base Maps: world or regional maps
- Prediction: prediction of suitable regions for species of interest
- Projection: projection of predicted distribution on to base map
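The chain of steps in this work-flow can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the BDWorld or Triana implementation: all function names, the grid-cell representation and the single climate variable are hypothetical, the model is a simple min/max bioclimatic envelope standing in for tools such as GARP, and the final projection on to a base map is left out.

```python
from typing import Dict, List, Tuple

Cell = Tuple[int, int]  # hypothetical grid-cell coordinates


def resolve_name(name: str, synonyms: Dict[str, str]) -> str:
    """Species 2000 step (stubbed): map a submitted name to its accepted name."""
    return synonyms.get(name, name)


def retrieve_localities(accepted: str, records: Dict[str, List[Cell]]) -> List[Cell]:
    """Localities step (stubbed): cells where the species has been recorded."""
    return records.get(accepted, [])


def build_envelope(localities: List[Cell], climate: Dict[Cell, float]) -> Tuple[float, float]:
    """Climate Space Model step: min/max of one climate variable at known localities."""
    values = [climate[cell] for cell in localities if cell in climate]
    if not values:
        raise ValueError("no climate data at any known locality")
    return min(values), max(values)


def predict(envelope: Tuple[float, float], climate: Dict[Cell, float]) -> List[Cell]:
    """Prediction step: cells whose climate value falls inside the envelope."""
    low, high = envelope
    return sorted(cell for cell, value in climate.items() if low <= value <= high)


def run_workflow(name: str,
                 synonyms: Dict[str, str],
                 records: Dict[str, List[Cell]],
                 present_climate: Dict[Cell, float],
                 target_climate: Dict[Cell, float]) -> List[Cell]:
    """Run the steps in order; target_climate may be a predicted climate surface."""
    accepted = resolve_name(name, synonyms)
    localities = retrieve_localities(accepted, records)
    envelope = build_envelope(localities, present_climate)
    return predict(envelope, target_climate)
```

Passing the present climate surface twice gives a prediction for current conditions; passing a different surface as `target_climate` corresponds to the "possibly different climate surfaces" branch of the diagram.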

14 BDWorld / Triana in operation: Workflow creation (design, editing)

15-20 Triana screen-shots

21 BDWorld / Triana in operation: Workflow execution (enactment, run-time)

22-26 Triana screen-shots

27 A dream
- A desktop environment in which scientists can “drag & drop” data sources, analysis and modelling tools and visualisation interfaces into a desired sequence of operations which can be run automatically
- BDWorld is just about at this stage
- With additional features, the environment could be made richer and more productive, and could support research groups
- Essentially a component-based visual programming environment
- Not just for biodiversity!

28 Role of metadata
- Metadata is needed to enable discovery of resources and to indicate how they are to be used
- Properties to help locate appropriate resources
- Check interoperability, suggest transformations
- Provenance of data sets
- Log of work-flows executed

29 Resources have to be matched
- To the user’s requirements
- To the capabilities of the user’s workstation environment
- To each other, so that data sets generated by one task can be used by another

30 Finding a resource that matches the user’s requirements
- Metadata is stored when a resource is registered
- This metadata is used to find a resource which
  - meets the user’s needs (possibly interpreted with the help of an ontology)
  - can run in the user’s environment (users have to register their metadata too)
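A minimal sketch of this register-then-find cycle is given below. It is not the BDWorld metadata repository: the class names, the fields and the representation of the ontology as a mapping from a broad term to its narrower terms are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass
class ResourceMetadata:
    """Metadata stored when a resource is registered (hypothetical fields)."""
    name: str
    function: str          # what the resource does
    platforms: Set[str]    # platforms on which it can run


@dataclass
class UserMetadata:
    """Metadata the user registers about their own environment."""
    platform: str


class Registry:
    def __init__(self) -> None:
        self._resources: List[ResourceMetadata] = []

    def register(self, resource: ResourceMetadata) -> None:
        self._resources.append(resource)

    def find(self,
             need: str,
             user: UserMetadata,
             ontology: Optional[Dict[str, Set[str]]] = None) -> List[str]:
        """Names of resources that meet the need and can run for this user.

        The ontology (an assumption here) maps a term to narrower terms, so a
        request for a broad term also matches more specific resources.
        """
        wanted = {need} | (ontology or {}).get(need, set())
        return [r.name for r in self._resources
                if r.function in wanted and user.platform in r.platforms]
```

With an ontology entry such as `{"modelling": {"bioclimatic modelling"}}`, a request for "modelling" would return a resource registered under the narrower term, provided the user's platform is among those the resource supports.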

31 Metadata about resources
- Description, functionality
- Input and output data sets
- User interaction, if any
- Platform, requirements, restrictions
- Quality & reputation

32 Users’ needs
- What the resource does (or data source delivers)
- Algorithm used
- Whether it uses the right data type
- Quality, reliability, reputation …

33 Matching resources to users
Users have varying capabilities and privileges which may affect their ability to use resources which:
- run on specific platforms only
- have IPR or cost limitations imposed on their use
- interact with their user locally in real time
- have other unexpected requirements

34 The user’s environment
- Platform: OS, supporting software
- Privileges, licences held, etc.
- Connection (bandwidth etc.)
- Workstation hardware (display, memory, speed, etc.)
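One way to picture the check of a resource against the user's environment is a function that lists which requirements are not met. The sketch below is illustrative only: the field names, the units and the choice of four checks (platform, licence, bandwidth, memory) are assumptions, not a schema taken from BDWorld.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Environment:
    """The user's registered environment (hypothetical fields and units)."""
    os: str
    licences: Set[str] = field(default_factory=set)
    bandwidth_mbps: float = 0.0
    memory_mb: int = 0


@dataclass
class Requirements:
    """What a resource demands of whoever runs it (hypothetical fields)."""
    platforms: Set[str] = field(default_factory=set)  # empty set: any platform
    licence: Optional[str] = None                     # None: no licence needed
    min_bandwidth_mbps: float = 0.0
    min_memory_mb: int = 0


def unmet_requirements(req: Requirements, env: Environment) -> List[str]:
    """Return the names of requirements the environment fails to satisfy."""
    problems: List[str] = []
    if req.platforms and env.os not in req.platforms:
        problems.append("platform")
    if req.licence is not None and req.licence not in env.licences:
        problems.append("licence")
    if env.bandwidth_mbps < req.min_bandwidth_mbps:
        problems.append("bandwidth")
    if env.memory_mb < req.min_memory_mb:
        problems.append("memory")
    return problems
```

Returning the list of failures, rather than a bare yes or no, lets an interface tell the user why a resource was excluded.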

35 Matching resource inputs and outputs
- The output of an earlier task may be the input of a later one
- Thus inputs and outputs of resources have to be tested for matching
- The only real criterion for this is the later resource: it has been programmed to read a data set, and will complain if it isn’t suitable
- However …

36 Matching input and output data sets
Can be done with various levels of rigour:
- Is the same word used to describe their type?
- Do they have the same schema?
- Do they have schemas which contain the same elements?
- Do they have schemas which can be proved to be equivalent? (this is very hard)
- Are there additional parameters which have to match? (e.g. matrix dimensions)
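The cheaper levels of rigour can be expressed as independent checks. The sketch below makes several assumptions: a schema is modelled as an ordered mapping from element name to element type, "same schema" is read as identical elements in identical order, and "same elements" as the same set of element names in any order. Proving two schemas equivalent is not attempted.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class DataSetType:
    """Description of a data set as produced or expected (hypothetical model)."""
    type_name: str
    schema: Dict[str, str]  # element name -> element type, in declared order
    parameters: Dict[str, int] = field(default_factory=dict)  # e.g. matrix dimensions


def compare(produced: DataSetType, expected: DataSetType) -> Dict[str, bool]:
    """Run each check separately so the caller can choose the rigour required."""
    return {
        # Level 1: the same word is used to describe the type.
        "same_type_name": produced.type_name == expected.type_name,
        # Level 2: identical schema, including element order.
        "same_schema": list(produced.schema.items()) == list(expected.schema.items()),
        # Level 3: the schemas contain the same element names, in any order.
        "same_elements": set(produced.schema) == set(expected.schema),
        # Additional parameters the consumer insists on must be matched exactly.
        "parameters_match": all(produced.parameters.get(key) == value
                                for key, value in expected.parameters.items()),
    }
```

Two data sets whose schemas list latitude and longitude in opposite orders would fail the "same schema" check but pass the "same elements" check, which is the kind of mismatch a transformation step could repair.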

37 Transforming data sets
If the data sets don’t match, the metadata may allow the workflow designer to
- supply parameters to the wrapper to adjust its generation of output data or its interpretation of input data
- choose a transformation tool which can be
  - inserted into the workflow
  - called as a local tool on the user’s workstation
- control a more flexible data transformation tool
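The option of inserting a transformation tool into the workflow can be sketched as follows. This is a hypothetical illustration: tasks are reduced to a name plus one input and one output type name, matching is by type name only, and the table of conversion tools keyed by (from type, to type) is an assumption rather than a BDWorld structure.

```python
from typing import Dict, List, NamedTuple, Tuple


class Task(NamedTuple):
    name: str
    input_type: str
    output_type: str


def insert_transformations(workflow: List[Task],
                           converters: Dict[Tuple[str, str], Task]) -> List[Task]:
    """Return a copy of the workflow with a converter placed before any task
    whose input type differs from the output type of the preceding task."""
    if not workflow:
        return []
    result = [workflow[0]]
    for task in workflow[1:]:
        produced = result[-1].output_type
        if produced != task.input_type:
            key = (produced, task.input_type)
            if key not in converters:
                raise LookupError(
                    f"no transformation tool from {produced} to {task.input_type}")
            result.append(converters[key])
        result.append(task)
    return result
```

A missing converter raises an error at design time, before the workflow is run, instead of leaving the later resource to complain about unsuitable input.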

38 Summary
Need metadata about
- Resources
- Operations
- Data set types (and schemas)
- Conversion tools
- Users (and their workstations)

