GEOSS ADC Architecture Workshop: Clearinghouse, Catalogues, Registries
Doug Nebert, U.S. Geological Survey
February 5, 2008
GEOSS Access Context
[Diagram. Labelled elements: Offerors; Community Resources; GEOSS Component and Service Registry; Standards and Special Arrangements Registries; GEOSS Clearinghouse; Catalogues; Services; Web Portals and client applications; User. Labelled relationships: contribute, register, references, get list of catalogue services, search, accesses, invoke, operate.]
GEOSS Clearinghouse
– The Clearinghouse acts as a broker to Community Catalogues.
– It searches the GEOSS Service Registry to identify services that can be searched.
– Community Catalogues may either be harvested in advance or searched at the time of a user query.
– Searches are received from the GEO Web Portal, Community Portals, or any other external application acting as a catalog client.
– Brief or full responses are marshaled and returned to the requesting client as XML.
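The broker role described on this slide can be sketched roughly as follows. This is a minimal illustration, not the behaviour of any tested Clearinghouse: the registry layout, the record fields, the XML element names and the functions `searchable_services`, `search_catalogue` and `broker_search` are all assumptions made for the example, and the remote catalogues are replaced by in-memory lists.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the GEOSS Service Registry; real registry
# entries and catalogue records are structured differently.
REGISTRY = [
    {"id": "cat-a", "type": "catalogue", "records": [
        {"id": "a1", "title": "Global land cover",
         "abstract": "Land cover classes derived from satellite imagery."},
        {"id": "a2", "title": "River discharge",
         "abstract": "Daily discharge at gauging stations."},
    ]},
    {"id": "wms-1", "type": "map", "records": []},
    {"id": "cat-b", "type": "catalogue", "records": [
        {"id": "b1", "title": "Land surface temperature",
         "abstract": "Monthly composites."},
    ]},
]


def searchable_services(registry):
    """Step 1: ask the registry which services can be searched."""
    return [s for s in registry if s["type"] == "catalogue"]


def search_catalogue(service, term):
    """Step 2: query one catalogue (here a simple title match)."""
    term = term.lower()
    return [r for r in service["records"] if term in r["title"].lower()]


def broker_search(registry, term, element_set="brief"):
    """Step 3: merge the hits and marshal a brief or full response as XML."""
    root = ET.Element("SearchResults", {"elementSet": element_set})
    for service in searchable_services(registry):
        for rec in search_catalogue(service, term):
            node = ET.SubElement(
                root, "Record", {"source": service["id"], "id": rec["id"]})
            ET.SubElement(node, "title").text = rec["title"]
            if element_set == "full":
                # Full responses carry more elements than brief ones.
                ET.SubElement(node, "abstract").text = rec["abstract"]
    root.set("numberOfRecords", str(len(root)))
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(broker_search(REGISTRY, "land", element_set="brief"))
```

The client only ever sees the merged XML; whether a given catalogue was harvested in advance or queried live is hidden behind `search_catalogue`.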
Use Case: coordination of Registry and Clearinghouse
– Providers interact with the Registry through a GUI to register components and services.
– The Clearinghouse is routinely updated with selected contents of the Service Registry.
– Portals (both GEO and Community) and other clients search the Clearinghouse through a catalog service interface, i.e., not a GUI.
– Searches of the Clearinghouse are accomplished via
  – 1) metadata held in the Clearinghouse, previously harvested from remote catalogues
  – 2) distributed searches sent to remote catalogues at the time of the user's search.
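The two search paths in this use case can be combined in a single query handler. The sketch below is illustrative only: the cache contents, the catalogue identifiers, the `remote_fast` and `remote_slow` stand-ins and the timeout values are assumptions, and a real Clearinghouse would call remote catalogue services rather than local functions.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout

# Path 1: metadata harvested in advance and held locally (hypothetical data).
HARVESTED = {"cat-a": ["Global land cover", "River discharge"]}


def remote_fast(term):
    """Stand-in for a remote catalogue that answers promptly."""
    return [t for t in ["Land surface temperature"] if term in t.lower()]


def remote_slow(term):
    """Stand-in for a remote catalogue that is too slow to be useful."""
    time.sleep(0.6)
    return ["Land use change"]


# Path 2: catalogues that are only reachable by distributed search.
REMOTE = {"cat-b": remote_fast, "cat-c": remote_slow}


def search(term, timeout=0.2):
    """Return (hits, failed): hits from both paths, plus timed-out catalogues."""
    term = term.lower()
    hits, failed = [], []

    # Harvested metadata is answered from the local cache.
    for cat, titles in HARVESTED.items():
        hits += [(cat, t) for t in titles if term in t.lower()]

    # Remaining catalogues are queried in parallel at search time.
    with ThreadPoolExecutor(max_workers=len(REMOTE)) as pool:
        futures = {cat: pool.submit(fn, term) for cat, fn in REMOTE.items()}
        for cat, fut in futures.items():
            try:
                # Each result is awaited for up to `timeout` seconds.
                hits += [(cat, t) for t in fut.result(timeout=timeout)]
            except FutureTimeout:
                failed.append(cat)
    return hits, failed


if __name__ == "__main__":
    print(search("land"))
```

Returning the list of catalogues that did not answer lets the client tell a complete result from a partial one.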
Use Case: coordination of Registry and Clearinghouse
– Activity A (publishing): a GEOSS publisher activates an online service and documents its existence, or its data sources, in a catalog.
– Activity B: details the transactions between a publisher who is registering a Component and a Service, and the Service and Standards registries.
– Activity C: the GEOSS Clearinghouse discovers eligible services, including catalog services, in the GEOSS Service Registry and then accesses the found services directly. Some remote catalogs are set up for real-time distributed query; others are set up for harvesting or for processing the results into a local cache.
– Activity D: shows the expected interaction between a Web Portal, the Clearinghouse, and the Component and Service registry.
Interaction Diagram – Clearinghouse
Interaction Diagram, continued
Clearinghouse testing
– Three implementations tested
  – Geonetwork Clearinghouse
  – ESRI Clearinghouse
  – Compusult Clearinghouse
– Three sets of tests were performed
  – Clearinghouse to Service Registry
  – Search of Clearinghouses by GEO Web Portal candidates
  – Clearinghouse to Community Catalogues
Clearinghouse Requirements
– Assessment of GEOSS Clearinghouse candidates is based on fulfillment of the requirements contained in the CFP
  – Requirements contain slight changes vs. the CFP
– Clearinghouse candidate self-assessment against requirements
  – Compliant except where requirements are ambiguous
  – Expectation that all registered catalog services should be made searchable through each Clearinghouse instance
Clearinghouse trade study: Distributed search vs. Harvest
– A set of evaluation criteria was defined, followed by analysis of the alternatives.
– Harvest: the advantage is quick searches; the disadvantage is metadata duplication.
– Distributed Search: the advantage is that metadata is maintained closer to the source; the disadvantage is that searches take longer to complete and have more chances of not completing.
– Recommendation: Harvest when possible
  – Harvest only collection metadata
  – Policy of the community catalogue must be respected
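The completeness argument in the trade study can be illustrated with a little arithmetic. The sketch below assumes, purely for illustration, that every remote catalogue answers independently with the same probability; neither that assumption nor the numbers come from the trade study itself, and `completion_probability` and `fan_out_latency` are hypothetical helpers.

```python
def completion_probability(p_each, n_catalogues):
    """Chance that ALL catalogues answer, assuming independent responses."""
    return p_each ** n_catalogues


def fan_out_latency(latencies_s):
    """A parallel distributed search is only as fast as its slowest catalogue."""
    return max(latencies_s)


if __name__ == "__main__":
    # Illustrative values only: 50 catalogues, each answering 99% of the time.
    print(round(completion_probability(0.99, 50), 3))
    # Illustrative per-catalogue response times in seconds.
    print(fan_out_latency([0.2, 0.4, 3.5]))
```

Under these assumptions a search across 50 catalogues that are each 99% reliable returns a complete answer only about 60% of the time, which is the kind of effect behind the "more chances of not completing" disadvantage; a harvested cache avoids both the fan-out and the wait on the slowest catalogue.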
Integration Issues
– Catalogues registered with GEOSS vary widely in the standards they implement. Protocols include:
  – ISO 23950 (Z39.50) GEO Profile Version 2.2
    – FGDC (CSDGM Metadata)
    – ANZLIC Metadata
    – ISO Metadata
  – OGC Catalogue Service for the Web (Version and 2.0.2)
    – ebRIM Profile (incl. ISO and EO Extension Packages)
    – FGDC Profile
    – ISO Profile
  – SRU/SRW, OpenSearch
  – OAI Protocol for Metadata Harvesting (OAI-PMH)
  – Dublin/Darwin Core Metadata
  – Web-accessible folder/ftp?
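To make the protocol differences concrete, the sketch below builds one search request for an OGC CSW 2.0.2 catalogue and one harvest request for an OAI-PMH repository. The endpoint URLs are placeholders and the helper names `csw_getrecords_url` and `oai_listrecords_url` are hypothetical; the query parameters follow the key-value-pair encodings of the two specifications as commonly used, but individual catalogues may support only a subset of them.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def csw_getrecords_url(endpoint, text, element_set="brief", max_records=10):
    """Build a CSW 2.0.2 GetRecords request (KVP encoding) for a text search."""
    params = {
        "service": "CSW",
        "version": "2.0.2",
        "request": "GetRecords",
        "typeNames": "csw:Record",
        "resultType": "results",
        "elementSetName": element_set,      # brief, summary or full
        "constraintLanguage": "CQL_TEXT",
        "constraint_language_version": "1.1.0",
        "constraint": f"AnyText like '%{text}%'",
        "maxRecords": max_records,
    }
    return f"{endpoint}?{urlencode(params)}"


def oai_listrecords_url(endpoint, metadata_prefix="oai_dc"):
    """Build an OAI-PMH ListRecords request, the usual starting point for a harvest."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return f"{endpoint}?{urlencode(params)}"


if __name__ == "__main__":
    # Placeholder endpoints, not real GEOSS catalogues.
    print(csw_getrecords_url("https://catalogue.example.org/csw", "water"))
    print(oai_listrecords_url("https://repository.example.org/oai"))
```

The CSW request is a live query with a search constraint, while the OAI-PMH request simply asks for records to copy, so the second protocol lends itself to harvesting rather than distributed search.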
Who are the primary user types?
– Registries
– Clearinghouse
– Catalogues
What resource types should be registered?
– Consider service, data set, data collection (series), and items as alternatives, along with the ability to transition from one to the other.
– Current results are too heterogeneous.
What protocols can be expected?
– Let responses to the CFP suggest choices
– Support a test harness capability to self-test registered catalog service types
– Clearinghouse instances must expose identical service interfaces
What metadata formats are found?
– ISO and Profiles (INSPIRE, ANZLIC, NAP)
– FGDC CSDGM
– Dublin Core
– Darwin Core
What metadata? How should it be presented?
– Need to refine the core metadata results that are handled and presented by the Clearinghouse, either as an intersection of data elements or as a Summary-style record synthesized from the remote response
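One way to read "an intersection of data elements" is as a crosswalk: map each metadata format onto a shared vocabulary, keep only the elements every format can supply, and synthesize a summary record from whatever the remote catalogue returned. The sketch below assumes flat records and uses simplified, illustrative field names; they are not the real schema paths of ISO, FGDC CSDGM or Dublin Core, and `core_elements` and `summarize` are hypothetical helpers.

```python
# Hypothetical crosswalk: common element -> simplified native field name.
CROSSWALK = {
    "iso": {
        "title": "citation.title",
        "abstract": "abstract",
        "extent": "extent.bbox",
        "contact": "pointOfContact",
    },
    "fgdc": {
        "title": "citeinfo.title",
        "abstract": "descript.abstract",
        "extent": "spdom.bounding",
        "contact": "ptcontac",
    },
    "dublin_core": {
        "title": "title",
        "abstract": "description",
        "extent": "coverage",
    },
}


def core_elements(crosswalk):
    """Elements that every format in the crosswalk can supply."""
    return set.intersection(*(set(mapping) for mapping in crosswalk.values()))


def summarize(record, fmt, crosswalk=CROSSWALK):
    """Synthesize a summary record restricted to the shared core elements."""
    mapping = crosswalk[fmt]
    return {element: record.get(mapping[element])
            for element in sorted(core_elements(crosswalk))}


if __name__ == "__main__":
    remote = {"citeinfo.title": "River discharge",
              "descript.abstract": "Daily discharge at gauging stations.",
              "ptcontac": "Hydrology office"}
    print(core_elements(CROSSWALK))
    print(summarize(remote, "fgdc"))
```

In this sketch `contact` drops out of the summary because one format lacks it, which shows the cost of a strict intersection: the more formats the Clearinghouse accepts, the smaller the guaranteed core becomes.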
Specific recommendations (agreements for Clearinghouse testing and implementation)
– Performance and scalability need to be addressed, including usage expectations and the type and volume of use.
– Typical use cases for query, presentation, and load handling need to be included so that numerous users and query loads are handled gracefully.