An introduction to data exchange protocols in TDWG Renato De Giovanni TDWG 2008.


1 An introduction to data exchange protocols in TDWG
Renato De Giovanni
TDWG 2008

2 Overview of the presentation
History and context: when and how protocols started to be discussed in TDWG
The basic idea behind distributed queries
Main features of TAPIR
Other protocols
Current status of TAPIR

3 TDWG Standards
Historically, TDWG has concentrated its efforts on:
Creating controlled vocabularies, indexes, guidelines and best practices (1985 until ~2000):
– Index Herbariorum, Authors of plant names, Floristic regions of the world, etc.
Creating standards to represent different types of biodiversity data (2000 until today):
– SDD (descriptions), ABCD (specimens), TCS (names and concepts). More being created.

4 First networks in our community
[Diagram: the first networks (including REMIB and ENHSIN) and the protocols they used (Z39.50, HISPID, custom protocols).]
More networks followed...

5 Protocols, Networks & TDWG
[Timeline labels: HISPID, Z39.50, DiGIR, BioCASe, TAPIR]
Australia's Virtual Herbarium (1999)
– Included a data abstraction layer (HISPID) and a simple protocol to return records.
– HISPID became a TDWG standard.
– This approach was only used by the AVH.
The Species Analyst (1999)
– Z39.50 was created and maintained by the Library of Congress.
– Pre-Web technology (no HTTP).
– Protocol is bound to the data abstraction layer.
– Limited support for XML and Unicode.
MaNIS, speciesLink, OBIS... (2002)
– DiGIR was funded by an NSF project.
– Motivation was to replace Z39.50 with a new protocol without the Z39.50 limitations and then split TSA (The Species Analyst) into multiple thematic networks.
BioCASE Network (2003)
– Created after many unsuccessful attempts to reach an agreement with the DiGIR community.
– Can be used with more complex data abstraction layers like ABCD.
TAPIR protocol (2004)
– Initial study contracted by GBIF to eliminate interoperability problems and duplication of efforts.
– TDWG was the venue for discussions (currently an official task group).

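6 Main scenario: Distributed queries
[Diagram: a client sends queries over HTTP (http://...) to provider 1, provider 2, provider 3 and other providers, all speaking the same protocol + data abstraction layer (e.g. DarwinCore). Providers run on different platforms (Windows, GNU/Linux, Mac OS X) and databases (MS Access, PostgreSQL, MySQL).]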
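The distributed-query scenario can be sketched roughly as follows. This is only an illustration of the idea, not TAPIR code: the `Provider` class, its in-memory records and the `distributed_query` helper are hypothetical, and real providers would be reached over HTTP instead of being called locally.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field


@dataclass
class Provider:
    """Hypothetical stand-in for a remote provider (a real one is queried over HTTP)."""
    name: str
    records: list = field(default_factory=list)

    def search(self, scientific_name):
        # Each provider maps its local database to the shared abstraction
        # layer; here the records already use a DarwinCore-like key.
        return [r for r in self.records if r["ScientificName"] == scientific_name]


def distributed_query(providers, scientific_name):
    """Send the same query to every provider and merge the answers."""
    merged = []
    with ThreadPoolExecutor(max_workers=len(providers)) as pool:
        # pool.map returns results in the same order as `providers`.
        per_provider = pool.map(lambda p: p.search(scientific_name), providers)
        for provider, hits in zip(providers, per_provider):
            merged.extend({"provider": provider.name, **hit} for hit in hits)
    return merged


providers = [
    Provider("provider 1", [{"ScientificName": "Puma concolor", "Country": "Brazil"}]),
    Provider("provider 2", [{"ScientificName": "Tapirus terrestris", "Country": "Peru"}]),
    Provider("provider 3", [{"ScientificName": "Puma concolor", "Country": "Mexico"}]),
]
results = distributed_query(providers, "Puma concolor")
```

The point of the sketch is that the client does not need to know which operating system or database sits behind each provider, only the shared protocol and abstraction layer.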

7 Reasons for the existence of TAPIR
TAPIR can potentially be used to exchange data encoded in most (if not all) XML standards defined by the other TDWG groups.
TAPIR is one of the main components of the new TDWG Architecture.
When integrating DiGIR and BioCASe, the other existing protocol alternatives were not considered suitable.
Changing the existing DiGIR and BioCASe networks to use a completely different protocol would have a major impact on existing tools.
TAPIR keeps many similarities with DiGIR and BioCASe to avoid such impacts.

8 Main features of TAPIR
Uses the Web (HTTP) to communicate with providers.
Responses are always structured in XML.
Can be used with different data abstraction layers.
Can return different types of search responses.
Tries to address the basic needs of federated networks through 5 operations.

9 Metadata operation (default)
1- Need to identify providers and get basic information about the service
– Who is responsible for the service?
– How can the owner be contacted?
– What kind of data is being served?
– In which language is the data?
– Are there any IPR restrictions?

10 Capabilities operation
2- Need to get technical information about the service
– What data abstraction layer is being used?
– What operations are available?
– Does the provider only understand specific query templates? (Which ones?)
– Does the provider support custom (on-the-fly) filters?
– Does the provider support custom (on-the-fly) response types?

11 Inventory operation
3- Need to inspect existing content
– How many records are available?
– For what species is there any data?
– For what countries/regions is there any data?

12 Search operation
4- Need to search content
– What records satisfy these parameters or filter conditions?
– Networks are free to define their own response types.
– Responses can be paged.
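Paged search responses can be illustrated with a small client loop. The sketch below is an assumption-laden simulation: `fetch_page` and `fetch_all` are hypothetical helpers, the "provider" is just a Python list, and the start/limit style of paging is chosen for illustration and is not quoted from the TAPIR specification.

```python
def fetch_page(records, start, limit):
    """Hypothetical provider side: return one slice plus the next offset (or None)."""
    if limit < 1:
        raise ValueError("limit must be at least 1")
    page = records[start:start + limit]
    next_start = start + limit if start + limit < len(records) else None
    return page, next_start


def fetch_all(records, limit):
    """Client side: keep requesting pages until the provider reports no more."""
    collected, start = [], 0
    while start is not None:
        page, start = fetch_page(records, start, limit)
        collected.extend(page)
    return collected


records = [f"record-{i}" for i in range(7)]
first_page, next_start = fetch_page(records, 0, 3)
everything = fetch_all(records, 3)
```

Paging keeps each response small, which matters when a client queries many providers at once.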

13 Ping operation
5- Need to monitor providers
– Is the service ready to receive requests?
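As a rough sketch of how a client might address the five operations over HTTP, the helper below builds request URLs. Several things here are assumptions and should be checked against the TAPIR specification: the `op` query parameter, the `concept` parameter used for the inventory example, and the access point URL (which is a placeholder). `build_request_url` is a hypothetical helper, not part of any TAPIR library.

```python
from urllib.parse import urlencode

# Operation names come from the slides above; the "op" parameter name is an
# assumption about the key-value request encoding.
OPERATIONS = ("metadata", "capabilities", "inventory", "search", "ping")


def build_request_url(access_point, operation="metadata", **params):
    """Build a GET URL for one operation; metadata is the default operation."""
    if operation not in OPERATIONS:
        raise ValueError(f"unknown operation: {operation}")
    query = {"op": operation, **params}
    return f"{access_point}?{urlencode(query)}"


# Placeholder access point, not a real service.
ACCESS_POINT = "http://example.org/tapir/provider"

ping_url = build_request_url(ACCESS_POINT, "ping")
metadata_url = build_request_url(ACCESS_POINT)
# "concept" is an assumed parameter name used only for illustration.
inventory_url = build_request_url(ACCESS_POINT, "inventory", concept="ScientificName")
```

A monitoring tool could call the ping URL periodically, while a portal would typically start with metadata and capabilities before issuing inventory or search requests.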

14 Other protocols
WFS (Open Geospatial Consortium Web Feature Service)
– Spatial queries
– Additional operations: update/insert/remove/lock feature
– TDWG Geospatial Interest Group

15 Other protocols
OAI-PMH (Open Archives Initiative – Protocol for Metadata Harvesting)
– Data harvesting = gathering of data from several sources to store into a single database.
– TAPIR itself can be used directly for this purpose, including incremental harvesting (depends only on the abstraction layer).
– An OAI-PMH service can be set up on top of a TAPIR service.
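Incremental harvesting can be sketched as follows. This is a simplified illustration under stated assumptions: each record is assumed to expose a `modified` timestamp through the abstraction layer, the sources are in-memory lists and not remote services, and the `harvest` function is hypothetical. A single global checkpoint is also a simplification; a real harvester might track one per provider.

```python
from datetime import datetime


def harvest(sources, store, since=None):
    """Copy records changed after `since` from every source into one store."""
    latest = since
    for source_name, records in sources.items():
        for record in records:
            modified = record["modified"]
            if since is not None and modified <= since:
                continue  # unchanged since the previous harvest
            store[(source_name, record["id"])] = record
            if latest is None or modified > latest:
                latest = modified
    return latest  # pass this back as `since` on the next run


sources = {
    "provider 1": [{"id": 1, "modified": datetime(2008, 1, 10)}],
    "provider 2": [{"id": 1, "modified": datetime(2008, 3, 5)}],
}
store = {}
checkpoint = harvest(sources, store)  # first run: full harvest

sources["provider 1"].append({"id": 2, "modified": datetime(2008, 6, 1)})
new_checkpoint = harvest(sources, store, since=checkpoint)  # picks up only the new record
```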

16 TAPIR - current status
Provider software available.
~75 providers registered in GBIF's UDDI (> 100).
Client libraries available.
Online service validator.
Online tools to help building TAPIR documents.
Documentation:
– protocol specification
– network guide
– executive summary
TDWG resources: mailing list, Wiki, Subversion, etc.
Final version of the specification should be submitted this year to the TDWG standards track (minor changes still being considered before submitting).

17 Thank you
renato (at) cria.org.br

