DiGIR 1 Distributed Databases and Applications
  John Wieczorek, Museum of Vertebrate Zoology, UC Berkeley
DiGIR 2 Distributed Databases
  Multiple sources of data
  …under local control,
  …with concepts in common
  …and a desire to deliver data as part of a community.
DiGIR 3 Distributed Databases
  - The Species Analyst (TSA)
  - The Integrated Taxonomic Information System (ITIS)
  - FishNet
  - The Mammal Networked Information System (MaNIS)
  - HerpNET
  - The Ornithological Information System (ORNIS)
  - …
DiGIR 4 Distributed Databases
  - European Natural History Science Information Network (ENHSIN)
  - Biological Collection Access for Europe (BioCASE)
  - Australia Virtual Herbarium (AVH)
  - Red Mundial de Información Sobre Biodiversidad, Comisión Nacional para el Conocimiento y Uso de la Biodiversidad (REMIB, CONABIO)
  - …
DiGIR 5 Distributed Databases
  - Mountain and Plains Spatio-Temporal Database-Informatics (MaPSTeDI)
  - Ocean Biogeographic Information System (OBIS)
  - Pacific Basin Information Node, National Biological Information Infrastructure (PBIN, NBII)
  - Species Link, Centro de Referência em Informação Ambiental (Species Link, CRIA)
  - A Virtual Herbarium of the Chicago Region (vPlants)
  - Spatial Analysis of Local Vegetation Inventories Across Scales (SALVIAS)
  - …
DiGIR 6 Distributed Databases
  - Berkeley Natural History Museums (BNHM)
  - Association of Biological Collections, UC Davis
  - …
DiGIR 7 Distributed Databases
  - LifeMapper
  - Global Biodiversity Information Facility (GBIF)
DiGIR 8 Distributed vs. centralized
  Multiple sources of data
  …under local control,
  …with concepts in common
  …and a desire to deliver data as part of a community
DiGIR 9 Distributed vs. centralized In other words, distribute the headache rather than have one central migraine.
DiGIR 10 DiGIR: Distributed Generic Information Retrieval
  John Wieczorek, Stan Blum, Dave Vieglais, P.J. Schwartz
DiGIR 11 Project Rationale
  - To avoid multiple incongruous development efforts
  - To pool resources and create a community of experts
  - To solve the problem of scalability
DiGIR 12 Project Goals
  - To define a protocol for retrieving structured data from multiple, heterogeneous databases across the Internet
  - To build a reference implementation of both provider and portal software using said protocol
DiGIR 13 Design Goals
  - To use open protocols and standards, such as HTTP and XML
  - To decouple the protocol, software and semantics
  - To make new data provider installations as easy as possible
  - To have open source development and GNU General Public Licensing
DiGIR 14 DiGIR Architecture
  [Diagram, built up over slides 14 to 23. Components: User Interface, Portal Engine, Provider, Registry. The Protocol sits between the User Interface and the Portal Engine, and between the Portal Engine and the Provider. The Registry appears alongside both the Provider and the Portal Engine.]
DiGIR 24 DiGIR Component Summary
DiGIR 25 DiGIR Protocol
  - Defines request and response message formats for communication between provider, portal engine, and user interfaces
      - Metadata requests
      - Search requests
      - Inventory requests
  - Remains unfettered by the structure of the data it transfers
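
The three request types and the separation of message envelope from payload can be sketched roughly as follows. This is an illustration only: the element names (`request`, `header`, `type`, `destination`, `filter`, `equals`) and the endpoint URL are assumptions made for the example and are not taken from the DiGIR protocol schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical element names, chosen for illustration only;
# they are not taken from the DiGIR protocol schema.
REQUEST_TYPES = ("metadata", "search", "inventory")


def build_request(request_type, destination, body=None):
    """Return a request document as an XML string."""
    if request_type not in REQUEST_TYPES:
        raise ValueError(f"unknown request type: {request_type}")
    root = ET.Element("request")
    header = ET.SubElement(root, "header")
    ET.SubElement(header, "type").text = request_type
    ET.SubElement(header, "destination").text = destination
    if body is not None:
        # The envelope never inspects the payload, so any element can ride along.
        root.append(body)
    return ET.tostring(root, encoding="unicode")


def read_request_type(document):
    """Read the request type back out of a request document."""
    return ET.fromstring(document).findtext("header/type")


# A search payload expressed in whatever vocabulary a community agrees on.
payload = ET.Element("filter")
ET.SubElement(payload, "equals", {"concept": "Genus"}).text = "Peromyscus"

# Made-up endpoint for the example.
search_doc = build_request("search", "http://provider.example.org/digir", payload)
metadata_doc = build_request("metadata", "http://provider.example.org/digir")
```

Because `build_request` only wraps the payload, the same envelope would work for any data vocabulary, which is one way to read "unfettered by the structure of the data it transfers".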
DiGIR 26 Portal Engine
  - The entry point for a “user”
  - Can query a registry for potential providers
  - Can determine, based on provider metadata, whether a provider should be queried
  - Can send requests to multiple providers
  - Communicates via protocol compliant messaging only
DiGIR 27 Portal Engine, continued
  - Assembles responses from providers
  - Returns packaged results to the “user”
  - Logs activity
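
A minimal sketch of the portal engine's job as listed above: filter providers by their metadata, send one request to each remaining provider, assemble the responses, and log activity. It assumes providers are plain Python callables standing in for remote services; `make_provider`, `should_query`, and `search_all` are hypothetical names, and the records are made-up demo data. The reference portal is Java software, so this is not its implementation.

```python
from concurrent.futures import ThreadPoolExecutor
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("portal")


def make_provider(name, concepts, records):
    """Build an in-process stand-in for a remote provider (illustration only)."""
    def search(concept, value):
        return [r for r in records if r.get(concept) == value]
    return {"name": name, "metadata": {"concepts": set(concepts)}, "search": search}


def should_query(metadata, concept):
    """Decide from provider metadata whether the provider can answer."""
    return concept in metadata["concepts"]


def search_all(providers, concept, value):
    """Send one search to every suitable provider and merge the results."""
    targets = [p for p in providers if should_query(p["metadata"], concept)]
    results = []
    with ThreadPoolExecutor(max_workers=max(1, len(targets))) as pool:
        futures = {pool.submit(p["search"], concept, value): p for p in targets}
        for future, provider in futures.items():
            try:
                records = future.result(timeout=30)
            except Exception as exc:  # one failing provider must not sink the rest
                log.warning("provider %s failed: %s", provider["name"], exc)
                continue
            log.info("provider %s returned %d records", provider["name"], len(records))
            # Tag each record with its source so the user can see where it came from.
            results.extend({"provider": provider["name"], **r} for r in records)
    return results


providers = [
    make_provider("A", ["Genus"], [{"Genus": "Peromyscus", "CatalogNumber": "1"}]),
    make_provider("B", ["Genus"], [{"Genus": "Mus", "CatalogNumber": "7"}]),
    make_provider("C", ["Family"], []),  # skipped: its metadata lacks "Genus"
]
hits = search_all(providers, "Genus", "Peromyscus")
```

Querying providers in parallel keeps total response time close to that of the slowest provider rather than the sum of all of them, and catching errors per provider lets the portal return partial results.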
DiGIR 28 Provider
  - Receives requests
  - Retrieves data from database
  - Sends results to requestor
  - Supplies metadata to describe data classification and availability
  - Logs requests
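
The provider responsibilities above can be sketched as a single request handler in front of a local database. This is a sketch under stated assumptions: requests and responses are plain dictionaries rather than protocol documents, the table layout and rows are made-up demo data, and `make_demo_db` and `handle_request` are hypothetical names. The actual DiGIR provider is PHP software.

```python
import sqlite3

request_log = []  # stands in for a real request log


def make_demo_db():
    """Create an in-memory database with a few made-up specimen rows."""
    conn = sqlite3.connect(":memory:")
    conn.row_factory = sqlite3.Row
    conn.execute("CREATE TABLE specimen (catno TEXT, genus TEXT, species TEXT)")
    conn.executemany(
        "INSERT INTO specimen VALUES (?, ?, ?)",
        [
            ("demo-1", "Peromyscus", "maniculatus"),
            ("demo-2", "Peromyscus", "truei"),
            ("demo-3", "Neotoma", "fuscipes"),
        ],
    )
    return conn


def handle_request(conn, request):
    """Log the request, then answer it from the local database."""
    request_log.append(request)
    if request["type"] == "metadata":
        count = conn.execute("SELECT COUNT(*) FROM specimen").fetchone()[0]
        return {"type": "metadata", "name": "Demo provider", "record_count": count}
    if request["type"] == "search":
        rows = conn.execute(
            "SELECT catno, genus, species FROM specimen WHERE genus = ?",
            (request["genus"],),
        ).fetchall()
        return {"type": "search", "records": [dict(r) for r in rows]}
    return {"type": "error", "message": "unsupported request type"}


conn = make_demo_db()
metadata = handle_request(conn, {"type": "metadata"})
response = handle_request(conn, {"type": "search", "genus": "Peromyscus"})
```

The data stays in the institution's own database; only the handler is exposed to the network, which is what keeps each source under local control.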
DiGIR 29 Registry
  - Supports provider “advertising”
  - May be global and open
  - May be private
  - Need not be used at all
  - Example: Universal Description, Discovery and Integration (UDDI)
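
As a rough illustration of provider "advertising", a registry can be reduced to a lookup table that providers write to and portals read from. The `Registry` class, its method names, and the endpoints below are hypothetical; this in-memory sketch is not UDDI and does not model any particular registry service.

```python
class Registry:
    """In-memory registry: providers advertise, portals look them up."""

    def __init__(self):
        self._providers = {}

    def advertise(self, name, endpoint, concepts):
        """Register (or update) a provider and the concepts it serves."""
        self._providers[name] = {
            "endpoint": endpoint,
            "concepts": frozenset(concepts),
        }

    def withdraw(self, name):
        """Remove a provider; removing an unknown name is a no-op."""
        self._providers.pop(name, None)

    def find(self, concept=None):
        """Return provider names, optionally only those serving a concept."""
        return sorted(
            name
            for name, entry in self._providers.items()
            if concept is None or concept in entry["concepts"]
        )

    def endpoint(self, name):
        return self._providers[name]["endpoint"]


registry = Registry()
# Made-up endpoints for the example.
registry.advertise("A", "http://a.example.org/digir", ["Genus", "Species"])
registry.advertise("B", "http://b.example.org/digir", ["Family"])
genus_providers = registry.find("Genus")
```

A portal that does not use a registry at all would simply carry a fixed list of endpoints in its own configuration.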
DiGIR 30 User Interfaces
  - Must be able to assemble and send a request document to a portal
  - Must be able to receive and interpret a response document from the portal
  - This is where the real fun is!
DiGIR 31 Example Network Configurations
DiGIR 32 BNHM Network Configuration
  [Diagram. Five institutions (PHMA, UCBG, UCJEPS, UCMP, Essig), each with a working database (UCMP has four) and an online database; one DiGIR Provider; the BNHM DiGIR Portal; the BNHM Presentation Layer.]
DiGIR 33 MaNIS Network Configuration
  [Diagram. Five working databases, each with an online database; five DiGIR Providers; three MaNIS DiGIR Portals; three MaNIS Presentation Layers.]
DiGIR 34 MaNIS Network Configuration
  [Diagram, repeated on slides 34 to 38. Working databases: LACM (MS Access), MVZ (Sybase), TTU (FoxPro), UWBM (4D-Mac), CAS (SQL Server). Online databases: four MS Access and one SQL Server. Five DiGIR Providers; three MaNIS DiGIR Portals; presentation layers: MVZ-MaNIS, LACM-MaNIS, UWBM-MaNIS.]
DiGIR 39 Other Network Configurations
  [Diagram. Five working databases, four online databases, four DiGIR Providers, three DiGIR Portals.]
DiGIR 40 DiGing a little deeper
DiGIR 41 Provider Installation
  - Web server (Apache, IIS, etc.)
  - PHP: Hypertext Preprocessor (PHP)
  - Provider software (DiGIR)
  - Configuration tool
  - Testing scripts
  - Provider scripts
  - Provider manual (DiGIR)
DiGIR 42 Provider Configuration Tool
  - Provider metadata
  - Resources
  - Database connection
  - Establishing table relationships
  - Concept to column (i.e., field, attribute) mapping
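
Table relationships and concept-to-column mapping can be pictured with the sketch below, which translates shared concept names into a query against one institution's local schema. Everything here is an assumption made for illustration: the concept names, the `specimen` and `taxon` tables, their columns, the demo rows, and the `build_query` helper. The real configuration tool's file format is not shown.

```python
import sqlite3

# Hypothetical mapping from shared concepts to (table, column) in a local schema.
CONCEPT_MAP = {
    "Genus": ("taxon", "genus_name"),
    "Species": ("taxon", "species_epithet"),
    "CatalogNumber": ("specimen", "cat_no"),
}
# Hypothetical table relationship, as it might be recorded during configuration.
JOIN = "specimen JOIN taxon ON specimen.taxon_id = taxon.id"


def column(concept):
    """Resolve a concept to its local column; unmapped concepts raise KeyError."""
    table, col = CONCEPT_MAP[concept]
    return f"{table}.{col}"


def build_query(return_concepts, filters):
    """Translate concept names into a parameterized SQL statement."""
    select = ", ".join(f"{column(c)} AS {c}" for c in return_concepts)
    where = " AND ".join(f"{column(c)} = ?" for c in filters)
    sql = f"SELECT {select} FROM {JOIN}"
    if where:
        sql += f" WHERE {where}"
    # Values travel as parameters; only names from CONCEPT_MAP reach the SQL text.
    return sql, list(filters.values())


# Made-up local database matching the mapping above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE taxon (id INTEGER, genus_name TEXT, species_epithet TEXT)")
conn.execute("CREATE TABLE specimen (cat_no TEXT, taxon_id INTEGER)")
conn.execute("INSERT INTO taxon VALUES (1, 'Peromyscus', 'maniculatus')")
conn.execute("INSERT INTO taxon VALUES (2, 'Neotoma', 'fuscipes')")
conn.execute("INSERT INTO specimen VALUES ('demo-1', 1)")
conn.execute("INSERT INTO specimen VALUES ('demo-2', 2)")

sql, params = build_query(["CatalogNumber", "Species"], {"Genus": "Peromyscus"})
rows = conn.execute(sql, params).fetchall()
```

Two institutions with different schemas would each supply their own `CONCEPT_MAP` and `JOIN`, while a portal keeps asking for the same concepts.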
DiGIR 43 Portal Configuration
  - Web server (Apache, IIS, etc.)
  - Sun Java 2 (JDK 1.4)
  - Tomcat (Apache)
  - Portal software (DiGIR)
  - Portal installation documentation (DiGIR)
DiGIR 44 Portal Installation
  - Engine configuration file (finding providers)
  - Presentation configuration file (defining the Information Domain)
  - Presentation customization
  - Engine start and stop scripts
  - Presentation start and stop scripts
DiGIR 45 Portal Demonstrations
DiGIR 46 DiGIR Project Information
  - The DiGIR project is a collaborative effort
  - DiGIR is currently established as an open source development project on SourceForge
  - Further documentation is available on the DiGIR web site