GBIF Program Officer for Data Access and Database Interoperability

Presentation transcript:

Architecture and Standards for Global Biodiversity Informatics: A GBIF and TDWG Perspective
Donald Hobern, GBIF Program Officer for Data Access and Database Interoperability
November 2004

TDWG and GBIF
TDWG – Taxonomic Databases Working Group
  Not-for-profit scientific and educational association
  Affiliated to the International Union of Biological Sciences
  Mission
    To provide an international forum for biological data projects
    To develop and promote the use of standards
    To facilitate data exchange
  Products
    Standards/guidelines for recording/exchanging data about organisms
    Promotion of use of these standards
    Forum for discussion (especially annual meeting)
GBIF – Global Biodiversity Information Facility
  Megascience activity involving 42 countries/economies and 28 international organisations
  Secretariat based in Copenhagen, Denmark
  Free and universal access to world’s biodiversity data via Internet
  Sharing primary biodiversity data for society, science and a sustainable future
    Registry of biodiversity data resources
    Index of biodiversity data
    Software tools
    Web portals (http://www.gbif.net) and data services

Primary biodiversity data
Taxonomic Names
  Class: Insecta
  Order: Lepidoptera
  Family: Pyralidae
  Genus: Ostrinia Hübner, 1825
  Species: Ostrinia nubilalis (Hübner, 1796)
  Synonym: Pyralis nubilalis Hübner, 1796
  Vernacular (EN): European Corn-borer
  Vernacular (DE): Maiszünsler
  Vernacular (ES): Piral del maíz
  Vernacular (FR): Pyrale du maïs
Sequence Data
  Locus: AAL35331
  Definition: acyl-CoA Z/E11 desaturase
  1 mvpyattadg hpekdecfed...
Taxonomic Descriptions
  Diagnosis: Wingspan 26-30mm; sexually dimorphic; male: forewings ochreous to dark brown; female: forewings pale yellow; …
Digital Literature and Web Resources
  Pheromones of Ostrinia: http://www.nysaes.cornell.edu/fst/faculty/acree/pheronet/phlist/ostrinia.html
Ecological Interactions
  Foodplant: Zea mays L. 1753 (Family: Gramineae)
Specimens and Observations
  Collection: DGH Lepidoptera
  Record id: DGHEUR_003217
  Country: France
  Coordinates: 03.047°E 48.730°N
  Date: 28 June 2003
  Collector: Donald Hobern
Abiotic Data
  Average Rainfall, Location: 48.82°N 2.29°E
  Jan Feb Mar Apr ...
  182.3 120.6 158.1 204.9 ...

Standardised structured data
<?xml version="1.0" encoding="UTF-8"?>
<response>
  <record>
    <darwin:DateLastModified>2003-06-08</darwin:DateLastModified>
    <darwin:InstitutionCode>DGH</darwin:InstitutionCode>
    <darwin:CollectionCode>DGH Lepidoptera</darwin:CollectionCode>
    <darwin:CatalogNumber>DGHEUR_0002976</darwin:CatalogNumber>
    <darwin:ScientificName>Dichomeris marginella (Fabricius, 1781)</darwin:ScientificName>
    <darwin:BasisOfRecord>O</darwin:BasisOfRecord>
    <darwin:Kingdom>Animalia</darwin:Kingdom>
    <darwin:Order>Lepidoptera</darwin:Order>
    <darwin:Family>Gelechiidae</darwin:Family>
    <darwin:Genus>Dichomeris</darwin:Genus>
    <darwin:Species>marginella</darwin:Species>
    <darwin:ScientificNameAuthor>(Fabricius, 1781)</darwin:ScientificNameAuthor>
    <darwin:IdentifiedBy>Donald Hobern</darwin:IdentifiedBy>
    <darwin:Collector>Donald Hobern</darwin:Collector>
    <darwin:YearCollected>2003</darwin:YearCollected>
    <darwin:MonthCollected>06</darwin:MonthCollected>
    <darwin:DayCollected>08</darwin:DayCollected>
    <darwin:ContinentOcean>Europe</darwin:ContinentOcean>
    <darwin:Country>Denmark</darwin:Country>
    <darwin:County>Københavns Amt</darwin:County>
    <darwin:Locality>Merianvej, Hellerup</darwin:Locality>
    <darwin:Longitude>12.538</darwin:Longitude>
    <darwin:Latitude>55.737</darwin:Latitude>
    <darwin:CoordinatePrecision>100</darwin:CoordinatePrecision>
    <darwin:IndividualCount>1</darwin:IndividualCount>
    <darwin:Notes>1 in Skinner trap</darwin:Notes>
  </record>
</response>
Observation record formatted using the Darwin Core
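As a rough illustration of how a client might consume such a record, here is a minimal Python sketch using only the standard library. It assumes the `darwin:` prefix is bound to the namespace URI that appears in the DiGIR response later in this talk; the slide itself omits the declaration, so the binding on `<response>` is an assumption, and `record_to_dict` is a hypothetical helper name. Only a subset of the elements is reproduced.

```python
import xml.etree.ElementTree as ET

# Namespace URI as declared in the DiGIR response later in this talk.
# Binding it on <response> is an assumption; the slide omits the declaration.
DARWIN_NS = "http://digir.net/schema/conceptual/darwin/2003/1.0"

SAMPLE = """\
<response xmlns:darwin="http://digir.net/schema/conceptual/darwin/2003/1.0">
  <record>
    <darwin:InstitutionCode>DGH</darwin:InstitutionCode>
    <darwin:CatalogNumber>DGHEUR_0002976</darwin:CatalogNumber>
    <darwin:ScientificName>Dichomeris marginella (Fabricius, 1781)</darwin:ScientificName>
    <darwin:Country>Denmark</darwin:Country>
    <darwin:Longitude>12.538</darwin:Longitude>
    <darwin:Latitude>55.737</darwin:Latitude>
  </record>
</response>
"""


def record_to_dict(record):
    """Map each darwin:* child of a <record> to {local element name: text}."""
    prefix = "{" + DARWIN_NS + "}"
    return {
        child.tag[len(prefix):]: (child.text or "").strip()
        for child in record
        if child.tag.startswith(prefix)
    }


root = ET.fromstring(SAMPLE)
records = [record_to_dict(r) for r in root.findall("record")]
first = records[0]
position = (float(first["Latitude"]), float(first["Longitude"]))
print(first["ScientificName"], position)
```

Because every descriptor is a flat child of `<record>`, the whole record collapses naturally into a dictionary, which is part of what makes the Darwin Core simple to process.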

TDWG Data Standards
Darwin Core
  Simple XML data model to represent taxon occurrence records (only core attributes)
  Extensions to handle e.g. curation details, microbial data, image data
ABCD Schema – Access to Biological Collection Data
  More complex XML data model to represent collection or observation data
  Detailed document structure including features for different communities
DiGIR – Distributed Generic Information Retrieval
  XML protocol for searching remote data resources
  Suitable for use with a wide range of different data models
BioCASe Protocol
  XML protocol for searching remote data resources with more complex schema (e.g. ABCD)
  Derived from DiGIR – new unified DiGIR/BioCASe protocol being developed
Taxon Concept Schema
  XML data model currently under development for exchange of nomenclatural/taxonomic data
  First version to be used for implementation in 2005
SDD Schema – Structured Descriptive Data
  XML data model for descriptive data relating to taxa or specimens (highly generalised)
  Suitable for representation of character tables, diagnostic keys, etc.

BioCASe-ABCD
<?xml version='1.0' encoding='UTF-8'?>
<response xmlns='http://www.biocase.org/schemas/protocol/1.3' xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance' xsi:schemaLocation='http://www.biocase.org/schemas/protocol/1.3 http://www.bgbm.org/biodivinf/schema/protocol_1_3.xsd'>
  <header>
    <version software='Python Interpreter'>2.3 (#46, Jul 29 2003, 18:54:32) [MSC v.1200 32 bit (Intel)]</version>
    <sendTime>2004-10-10T22:22:40+02:00</sendTime>
    <source>192.168.1.12</source>
    <destination>132.181.101.155</destination>
    <type>search</type>
  </header>
  <content recordDropped='0' recordCount='1' recordStart='0' totalSearchHits='1'>
    <DataSets xmlns='http://www.tdwg.org/schemas/abcd/1.2'>
      <DataSet>
        <OriginalSource>
          <SourceInstitutionCode>BGBM</SourceInstitutionCode>
          <SourceName>Bridel Herbar</SourceName>
          <SourceLastUpdatedDate>2004-04-29</SourceLastUpdatedDate>
        </OriginalSource>
        <DatasetDerivations>
          <DatasetDerivation>
            <DateSupplied>2004-07-29</DateSupplied>
            <Supplier>
              <Organisation><OrganisationName>Botanic Garden and Botanical Museum Berlin-Dahlem</OrganisationName></Organisation>
              <Person><PersonName>Andrea Hahn</PersonName></Person>
              <TelephoneNumbers><TelephoneNumber><Number>+49 (0)30 838 50286</Number></TelephoneNumber></TelephoneNumbers>
              <URLs><URL>http://www.bgbm.org</URL></URLs>
            </Supplier>
            <Rights>
              <TermsOfUse>The use of the data is allowed only for non-profit scientific use and for non-profit nature conservation purpose.</TermsOfUse>
              <LegalOwner>
                <Organisation>
                  <OrganisationName>Botanic Garden and Botanical Museum Berlin-Dahlem</OrganisationName>
                  <OrganisationCodes><OrganisationCode>BGBM</OrganisationCode></OrganisationCodes>
                </Organisation>
              </LegalOwner>
              <CopyrightDeclaration>No part of this data base may be copied or reproduced without written permission from the legal owner.</CopyrightDeclaration>
              <IPRDeclaration>The Intellectual Property Rights are held by the legal owner or, in case of living persons, by the collector or determinator.</IPRDeclaration>
            </Rights>
            <Statements><Disclaimer>No responsibility is accepted for the accuracy of the information in this data base.</Disclaimer></Statements>
          </DatasetDerivation>
        </DatasetDerivations>
        <Units>
          <Unit>
            <UnitID>Bridel-1-362</UnitID>
            <Identifications>
              <Identification PreferredIdentificationFlag='0'>
                <TaxonIdentified>
                  <HigherTaxa>
                    <HigherTaxon TaxonRank='Family'>Pottiaceae</HigherTaxon>
                    <HigherTaxon TaxonRank='Kingdom'>Plantae</HigherTaxon>
                  </HigherTaxa>
                  <NameAuthorYearString>Leucophanes octoblepharioides Brid. 1827</NameAuthorYearString>
                  <ScientificNameString>Leucophanes octoblepharioides</ScientificNameString>
                  <AuthorString>Brid. 1827</AuthorString>
                </TaxonIdentified>
                <Identifier><IdentifierPersonName><PersonName>Allen, Noris Salazar</PersonName></IdentifierPersonName></Identifier>
                <IdentificationDate><ISODateTimeBegin>1986-07</ISODateTimeBegin></IdentificationDate>
              </Identification>
            </Identifications>
            <Gathering>
              <GatheringSite>
                <ContinentOrOcean>Asia</ContinentOrOcean>
                <Country><CountryName>NP</CountryName></Country>
                <AreaDetail>Nepal</AreaDetail>
              </GatheringSite>
            </Gathering>
          </Unit>
        </Units>
      </DataSet>
    </DataSets>
  </content>
  <diagnostics><diagnostic>OK</diagnostic></diagnostics>
</response>
Callout labels on the slide, in order: PROTOCOL; COLLECTION (INCLUDING METADATA); SPECIMEN; PROTOCOL.
Note that structure of record elements is part of the content schema (ABCD), not part of the protocol.

DiGIR-Darwin Core
<?xml version='1.0' encoding='utf-8' ?>
<response xmlns='http://digir.net/schema/protocol/2003/1.0'>
  <header>
    <version>$Revision: 1.14 $</version>
    <sendTime>2004-10-10T13:48:12-0700</sendTime>
    <source resource="martius_munchen_infocomp">http://digir.bebif.be/main/DiGIR.php</source>
    <destination>132.181.101.155</destination>
    <type>search</type>
  </header>
  <content xmlns:darwin='http://digir.net/schema/conceptual/darwin/2003/1.0' xmlns:xsd='http://www.w3.org/2001/XMLSchema'>
    <record>
      <darwin:DateLastModified>2004-05-12</darwin:DateLastModified>
      <darwin:InstitutionCode>Botanische Staatssammlung München</darwin:InstitutionCode>
      <darwin:CollectionCode>Infocomp</darwin:CollectionCode>
      <darwin:CatalogNumber>010702P1</darwin:CatalogNumber>
      <darwin:ScientificName>Wedelia longifolia Mart. ex Baker</darwin:ScientificName>
      <darwin:BasisOfRecord>Label</darwin:BasisOfRecord>
      <darwin:ScientificNameAuthor>Baker, J.G.</darwin:ScientificNameAuthor>
      <darwin:TypeStatus>Holotypus</darwin:TypeStatus>
      <darwin:CollectorNumber>s.n.</darwin:CollectorNumber>
      <darwin:Collector>Martius, C.F.P. von</darwin:Collector>
      <darwin:ContinentOcean>South America</darwin:ContinentOcean>
      <darwin:Country>Brazil</darwin:Country>
      <darwin:Locality>' ... in prov. S. Paulo, inter herbas locis irriguis ad Lorena ... ' (op. cit.)</darwin:Locality>
      <darwin:Notes>Tribe: Heliantheae, Reference protologue: Martius, C.F.P. von: Flora Brasiliensis 6(3): 182-183. 1884., </darwin:Notes>
    </record>
  </content>
  <diagnostics>
    <diagnostic code="STATUS_INTERVAL" severity="info">3600</diagnostic>
    <diagnostic code="STATUS_DATA" severity="info">79,5,2</diagnostic>
    <diagnostic code="MATCH_COUNT" severity="info">1</diagnostic>
    <diagnostic code="RECORD_COUNT" severity="info">1</diagnostic>
    <diagnostic code="END_OF_RECORDS" severity="info">true</diagnostic>
  </diagnostics>
</response>
Callout labels on the slide, in order: PROTOCOL; SPECIMEN; PROTOCOL.
Note that structure of record elements is part of the protocol – the content schema (Darwin Core) only defines the attributes describing the record.
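The protocol-level part of a DiGIR response can be read independently of the content schema. The following Python sketch extracts the `<diagnostic>` entries from a trimmed copy of the response above. Treating `END_OF_RECORDS` as the signal that no further page needs to be requested is an assumption made for illustration, and `read_diagnostics` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Default namespace of the DiGIR response shown on the slide.
DIGIR_NS = "http://digir.net/schema/protocol/2003/1.0"

# Trimmed copy of the slide's response: header and content are omitted.
RESPONSE = """\
<response xmlns='http://digir.net/schema/protocol/2003/1.0'>
  <diagnostics>
    <diagnostic code="MATCH_COUNT" severity="info">1</diagnostic>
    <diagnostic code="RECORD_COUNT" severity="info">1</diagnostic>
    <diagnostic code="END_OF_RECORDS" severity="info">true</diagnostic>
  </diagnostics>
</response>
"""


def read_diagnostics(xml_text):
    """Return {diagnostic code: text} for every <diagnostic> element."""
    root = ET.fromstring(xml_text)
    ns = {"digir": DIGIR_NS}
    return {
        d.get("code"): (d.text or "").strip()
        for d in root.findall("digir:diagnostics/digir:diagnostic", ns)
    }


diag = read_diagnostics(RESPONSE)
# Assumption: a client would request another page unless END_OF_RECORDS is "true".
more_pages = diag.get("END_OF_RECORDS") != "true"
print(diag["MATCH_COUNT"], more_pages)
```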

BioCASe-ABCD compared to DiGIR-Darwin Core
BioCASe-ABCD model
  Document-based (response document includes metadata and records as a structured package)
  Strengths
    No problem with modelling complex nested structures and repeating elements.
    Fits perfectly with UBIF proposal – ABCD DataSet elements and ABCD Metadata could readily be standardised, with rather little work, with the DataSet/Metadata structures from other TDWG standards such as Structured Descriptive Data (SDD) and Taxon Concept Schema (TCS). DataSets from all three of these could be combined to form a single document with cross-references between sections.
  Possible weaknesses
    Not simple for specialist networks to extend the structure with additional elements of their own (requires well-planned open extension points to be designed into the schema), especially if a provider wishes to be part of more than one such specialist network at the same time.
    (At present) all elements in the ABCD schema are versioned together. Handling an updated version of the schema requires significant additional effort on the part of providers and users. For example, adding new elements to support plant genetic resource data, without changing the elements for museum/herbarium specimens, requires all users to handle a new version of the schema.
DiGIR-Darwin Core model
  Record-based (response returns a set of records which may contain descriptor elements from any schema)
  Strengths
    Highly flexible and extensible model allowing different networks to use a common protocol and shared core elements alongside their own network-specific extensions.
    (In integrated protocol version) could return ABCD elements as part of response records. If ABCD is treated as a library of elements, this fits even better.
    Model maps well to supporting a flexible object-oriented data model for biodiversity informatics.
  Possible weaknesses
    (In existing version) cannot readily handle complex data structures with nested repeating elements.
    Records have no intrinsic data type – currently relies on an implicit understanding between user and data provider.

Exchange via web services
Diagram: Heterogeneous Databases are exposed through Web Services as Standardised Structured Data. Users send a <request> over the Internet and each service answers with a <response> containing <record> elements.

GBIF network of biodiversity data nodes
Diagram: four data nodes connected to the GBIF Network through DiGIR-DarwinCore, BioCASe-ABCD and Taxon Concept Schema interfaces.
  Museum A: Specimens: Flowering Plants of Africa; Specimens: Proteaceae of the World; Taxon Names: Proteaceae of the World
  Observer Network B: Observations: Birds of Central America; Observations: Butterflies of Belize; Checklist: Birds of Belize
  Museum C: Specimens: Mammals of North Europe; Taxon Names: Mammals of the World; Further Links: Mammals
  University D: Specimens: Bacteria Cultures; Taxon Names: Bacteria; Further Links: Bacteria

Central GBIF registry of data nodes
                    Type of data          Taxon             Region           Records
Museum A            Specimen/Observation  Flowering Plants  Africa           327000
                                          Proteaceae        World            23000
                    Taxonomic Names                                          1500
Observer Network B                        Birds             Central America  68500
                                          Butterflies       Belize           4200
                    Name List                                                587
Museum C                                  Mammals           North Europe     1800
                                                                             8000
                    General Resources                                        600
University D                              Bacteria                           1200
                                                                             5000
                                                                             400
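One way to picture the registry is as a list of resource descriptions that can be filtered by taxon or region before any data node is contacted. The Python sketch below is only an illustration: the `Resource` class and both helper functions are hypothetical, only rows whose taxon, region and record count all appear on the slide are included, and the type of data for the Observer Network B rows is an assumption because that cell is not legible in the transcript.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Resource:
    """One registry row: which node offers which data, and how much of it."""
    node: str
    data_type: str
    taxon: str
    region: str
    records: int


REGISTRY = [
    Resource("Museum A", "Specimen/Observation", "Flowering Plants", "Africa", 327000),
    Resource("Museum A", "Specimen/Observation", "Proteaceae", "World", 23000),
    # data_type for the next two rows is assumed, not taken from the slide.
    Resource("Observer Network B", "Specimen/Observation", "Birds", "Central America", 68500),
    Resource("Observer Network B", "Specimen/Observation", "Butterflies", "Belize", 4200),
]


def find_resources(registry: List[Resource],
                   taxon: Optional[str] = None,
                   region: Optional[str] = None) -> List[Resource]:
    """Return the rows matching every criterion that was supplied."""
    return [
        r for r in registry
        if (taxon is None or r.taxon == taxon)
        and (region is None or r.region == region)
    ]


def records_per_node(registry: List[Resource]) -> Dict[str, int]:
    """Sum the advertised record counts for each data node."""
    totals: Dict[str, int] = {}
    for r in registry:
        totals[r.node] = totals.get(r.node, 0) + r.records
    return totals


belize = find_resources(REGISTRY, region="Belize")
totals = records_per_node(REGISTRY)
print([r.taxon for r in belize], totals)
```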

DiGIR-BioCASe Protocol and Nested Networks
Example user requests
  Get DiGIR-style records each with a set of Darwin Core descriptors and a complete ABCD Unit.
  Get full set of Soy Bean crop descriptors.
  Get standard plant genetic resource Passport data for all crop types.
  Get complete ABCD documents from each BioCASe provider.
  Get Darwin Core records where darwin:ScientificName equals Puma concolor from any provider.
Providers and the element sets they serve
  BioCASe Provider (two shown): Darwin Core, ABCD, Taxon Occurrence
  MaNIS Provider: Darwin Core, Curatorial, Taxon Occurrence
  OBIS Provider: Darwin Core, Marine, Taxon Occurrence
  IPGRI Banana Provider: Darwin Core, IPGRI Passport, Banana Descriptor, Taxon Occurrence
  IPGRI Soy Bean Provider: Darwin Core, IPGRI Passport, Soy Bean Descriptor, Taxon Occurrence
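The last request on this slide works across networks because every provider shares the Darwin Core elements, whatever extensions it adds. A minimal Python sketch of that idea follows. The provider names come from the slide, but the records, the extension element names (`curatorial:Preparations`, `abcd:UnitID`, `marine:Depth`) and the helper functions are all hypothetical and chosen only for illustration.

```python
# Hypothetical records: each provider mixes shared darwin:* descriptors
# with its own network-specific extension elements.
PROVIDERS = {
    "MaNIS Provider": [
        {"darwin:ScientificName": "Puma concolor", "curatorial:Preparations": "skin"},
        {"darwin:ScientificName": "Lynx rufus", "curatorial:Preparations": "skull"},
    ],
    "BioCASe Provider": [
        {"darwin:ScientificName": "Puma concolor", "abcd:UnitID": "X-1"},
    ],
    "OBIS Provider": [
        {"darwin:ScientificName": "Delphinus delphis", "marine:Depth": "12"},
    ],
}


def search(providers, concept, value):
    """Return (provider name, record) pairs where record[concept] == value."""
    return [
        (name, record)
        for name, records in providers.items()
        for record in records
        if record.get(concept) == value
    ]


def core_view(record, prefix="darwin:"):
    """Keep only the shared core descriptors, dropping network extensions."""
    return {k: v for k, v in record.items() if k.startswith(prefix)}


hits = search(PROVIDERS, "darwin:ScientificName", "Puma concolor")
core_hits = [(name, core_view(record)) for name, record in hits]
print(core_hits)
```

A client that only understands Darwin Core can use `core_view` and ignore the rest, while a specialist client can still read the extension elements from the same records.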

GBIF index to biodiversity data
Diagram components: User requests; Biodiversity Data Access; Biodiversity Data Index; Taxonomic Name Service (ECAT); Catalogue of Life; GBIF Data Nodes reached via DiGIR/BioCASe and serving Specimen Data, Observation Data, Name Lists, Taxon Concept data and Links to other data.

GBIF data index

Central portal to biodiversity data
Request to the GBIF Portal: Show specimen records for Erinaceus europaeus
Responses from data nodes: 6 records; 35 records; 17 records; 0 records
Result: 58 records: Museum A Paris; Museum A Nice; Museum A Avignon; Museum A Marseille; Observer B Norwich; Observer B Southampton; . . .
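The portal's role here is to fan a single query out to the data nodes and merge whatever comes back. The Python sketch below mimics that with in-memory stand-ins for the nodes. The node identifiers and the `static_node` and `portal_search` helpers are hypothetical; only the per-node counts (6, 35, 17 and 0) are taken from the slide.

```python
SPECIES = "Erinaceus europaeus"


def static_node(hits_by_name):
    """Build a stand-in data node that answers from a fixed dictionary."""
    def search(name):
        return list(hits_by_name.get(name, []))
    return search


def fake_hits(node_id, count):
    """Generate placeholder record labels for one node."""
    return [f"{node_id} record {i + 1}" for i in range(count)]


# Hypothetical node identifiers; counts as shown on the slide.
NODES = {
    "node-1": static_node({SPECIES: fake_hits("node-1", 6)}),
    "node-2": static_node({SPECIES: fake_hits("node-2", 35)}),
    "node-3": static_node({SPECIES: fake_hits("node-3", 17)}),
    "node-4": static_node({}),
}


def portal_search(nodes, name):
    """Query every node and return (merged hits, hit count per node)."""
    per_node = {node_id: search(name) for node_id, search in nodes.items()}
    merged = [hit for hits in per_node.values() for hit in hits]
    counts = {node_id: len(hits) for node_id, hits in per_node.items()}
    return merged, counts


merged, counts = portal_search(NODES, SPECIES)
print(len(merged), counts)
```

A real portal would issue the node queries over the network, probably concurrently and with timeouts; the sequential loop here only shows the aggregation step.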

GBIF Data Portal

GBIF Data Portal

Participant Nodes with tailored information
Diagram components: GBIF Portal; GBIF France; Geographic Services
Requests shown
  Show specimen records for Erinaceus europaeus from France
  Show occurrence of Hérisson d’Europe
Result lists shown
  26 records: Museum A Paris; Museum A Nice; Museum A Avignon; Museum A Marseille; 23. Observer B Calais; 29. Observer B Paris; . . . 58. Museum C Toulouse
  58 GBIF records: Museum A Paris ✓; Museum A Nice ✓; Museum A Avignon ✓; Museum A Marseille ✓; Observer B Norwich ✗; Observer B Southampton ✗; . . . 58. Museum C Toulouse ✓

Flexible applications
A customs official discovers specimens of a possible pest species of weevil (Curculionidae) on a consignment of agricultural produce at a port of entry. The GBIF Network generates an identification key to support identification of pest weevil species to allow the official to determine appropriate response. This application requires access to data from a wide range of sources, including those GBIF participants that are organisations.
Diagram components: WANTED; Provide key to identify reportable Curculionidae; GBIF; List of names of reportable pest species; Descriptive data
Example key (as far as legible on the slide)
  Elytra brown → 2; Elytra not brown → 5
  Thorax black; Thorax brown → 3
  Hind tibia black → Non-pest; Hind tibia brown → 4
  Hind femur brown; Hind femur black → Non-pest
  . . .

Monitoring of data usage
Diagram components: GBIF Portal; Data Usage Logs; Data Usage Reports
Requests and results
  Show specimen records for Upupa epops: 81 records: Museum A Paris; Museum A Nice; . . .
  Show bird specimen records from Nice: 126 records: Museum A Upupa epops; Museum A Apus apus; Museum A Athene noctua; . . .
GBIF Usage: Museum A
  16 August 2003 Search: Upupa epops, 5 records returned
  18 August 2003 Search: Birds from Nice, 16 records returned
GBIF Usage: Observer B
  16 August 2003 Search: Upupa epops, 2 records returned
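The usage reports on this slide amount to grouping the portal's log entries by data provider. A minimal Python sketch under that assumption follows. The log structure and both function names are hypothetical; the three entries reproduce the figures shown for Museum A and Observer B.

```python
from collections import defaultdict
from datetime import date

# Hypothetical log structure; the entries mirror the slide's two usage reports.
USAGE_LOG = [
    {"provider": "Museum A", "date": date(2003, 8, 16), "search": "Upupa epops", "returned": 5},
    {"provider": "Observer B", "date": date(2003, 8, 16), "search": "Upupa epops", "returned": 2},
    {"provider": "Museum A", "date": date(2003, 8, 18), "search": "Birds from Nice", "returned": 16},
]


def usage_reports(log):
    """Group log entries by provider, as chronological report lines."""
    reports = defaultdict(list)
    for entry in sorted(log, key=lambda e: e["date"]):
        reports[entry["provider"]].append(
            f"{entry['date'].isoformat()} Search: {entry['search']}, "
            f"{entry['returned']} records returned"
        )
    return dict(reports)


def records_served(log):
    """Total number of records each provider has returned."""
    totals = defaultdict(int)
    for entry in log:
        totals[entry["provider"]] += entry["returned"]
    return dict(totals)


reports = usage_reports(USAGE_LOG)
served = records_served(USAGE_LOG)
for provider, lines in reports.items():
    print(f"GBIF Usage: {provider}")
    for line in lines:
        print("  " + line)
```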

Future activity
Globally unique identifiers
  TDWG-GBIF collaboration to develop models to allow data providers to attach persistent identifiers to their data records
  Allow software to detect multiple instances of the same record
  Allow users to save resolvable references to specimens, collections, taxon concepts, etc.
Schema repository
  Central library of information on data models
  Resource for discovering documentation or mappings between different schemas
  Better support for intelligent software applications
Data validation tools
  Framework for running sets of validation tests against XML data (content values, controlled vocabularies, relating georeference data to named localities, etc.)
  Support different uses (data providers to locate possible problems in data; users to assure themselves of suitability of data; GBIF to provide metadata on data completeness/coherence)
Access to a wide range of taxonomic name data
  Taxonomic/nomenclatural authorities (nomenclators, global species databases, revisions, etc.)
  Lists used by different communities/organisations (red lists, pest species, regional checklists, etc.)
Customised portals
  Organised according to taxon lists used by each user
  Notifications of new data based on user profiles (taxonomic, geographic, etc.)
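To make the idea of a validation framework concrete, here is a small Python sketch in which each test is a function that inspects one record and returns a list of problems. Everything in it is an assumption made for illustration: the check functions, the `run_checks` helper and in particular the `ALLOWED_BASIS` vocabulary are hypothetical and not taken from any GBIF or TDWG specification. The first record reuses values from the Darwin Core example earlier in the talk.

```python
# Hypothetical controlled vocabulary, for illustration only.
ALLOWED_BASIS = {"O", "S"}


def check_latitude(record):
    """Content-value test: latitude must be a number between -90 and 90."""
    value = record.get("Latitude")
    if value is None:
        return []
    try:
        latitude = float(value)
    except ValueError:
        return [f"Latitude is not a number: {value!r}"]
    if -90.0 <= latitude <= 90.0:
        return []
    return [f"Latitude out of range: {latitude}"]


def check_month(record):
    """Content-value test: month must be an integer between 1 and 12."""
    value = record.get("MonthCollected")
    if value is None:
        return []
    if value.isdigit() and 1 <= int(value) <= 12:
        return []
    return [f"MonthCollected is not a valid month: {value!r}"]


def check_basis_of_record(record):
    """Controlled-vocabulary test against the assumed ALLOWED_BASIS set."""
    value = record.get("BasisOfRecord")
    if value is None or value in ALLOWED_BASIS:
        return []
    return [f"BasisOfRecord not in vocabulary: {value!r}"]


CHECKS = [check_latitude, check_month, check_basis_of_record]


def run_checks(record, checks=CHECKS):
    """Run every check and collect all reported problems."""
    problems = []
    for check in checks:
        problems.extend(check(record))
    return problems


good = {"Latitude": "55.737", "MonthCollected": "06", "BasisOfRecord": "O"}
bad = {"Latitude": "95", "MonthCollected": "13", "BasisOfRecord": "Label"}
print(run_checks(good))
print(run_checks(bad))
```

Keeping each test as an independent function would let providers, users and GBIF run different subsets for the different uses listed above.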

Links
Taxonomic Databases Working Group: http://www.tdwg.org/ (including access to working groups)
Global Biodiversity Information Facility
  Communications Portal: http://www.gbif.org/
  Data Portal: http://www.gbif.net/
  Architecture documents: http://circa.gbif.net/Public/irc/gbif/dadi/library?l=/architecture