Published by Clifton Wilkins. Modified over 9 years ago.
1
Advanced Computing and Information Systems laboratory
iDigBio Cloud and Appliances: Concept, Processes and Progress
Jose Fortes (on behalf of the iDigBio IT team)
2
iDigBio (idigbio.org)
Goal: making data and images for millions of biological specimens available in electronic format for the biological research community, agencies, students, educators, and the public.
Mission: leadership, coordination, and outreach in digitization of collections by implementing resources for communication, use of technology, access to data, research and education.
A resource: permanent cloud computing infrastructure to link biological data from collections across the USA and to use search and analytics tools to mine and reference data.
3
Seven Thematic Collections Networks (TCNs)
  InvertNet: An Integrative Platform for Research on Environmental Change, Species Discovery and Identification (Illinois Natural History Survey, University of Illinois) invertnet.org
  Plants, Herbivores, and Parasitoids: A Model System for the Study of Tri-Trophic Associations (American Museum of Natural History) tcn.amnh.org
  North American Lichens and Bryophytes: Sensitive Indicators of Environmental Quality and Change (U of Wisconsin) symbiota.org/nalichens/index.php symbiota.org/bryophytes/index.php
  Digitizing Fossils to Enable New Syntheses in Biogeography-Creating a PALEONICHES-TCN (U of Kansas)
  The Macrofungi Collection Consortium: Unlocking a Biodiversity Resource for Understanding Biotic Interactions, Nutrient Cycling and Human Affairs (New York Botanical Garden)
  Mobilizing New England Vascular Plant Specimen Data to Track Environmental Change (Yale University)
  Southwest Collections of Arthropods Network (SCAN): A Model for Collections Digitization to Promote Taxonomic and Ecological Research (Northern Arizona University) http://hasbrouck.asu.edu/symbiota/portal/index.php
More than 130 participating institutions
4
iDigBio IT Vision
Cyberinfrastructure to enable the collaborative creation, integration and management of digitized biocollections, and their use in scientific research, education and outreach.
Visible as a collection of persistent Internet-accessible services, data and resources for:
  biocollection “producers”, “consumers” and “service providers”
  cyberinfrastructure providers
  national/global data aggregators
5
CI Stakeholders
[Diagram relating iDigBio to five stakeholder groups: Domain Data Producers, Infrastructure Providers, Domain Service Providers, Domain Data Consumers, National/Global Data Aggregators]
[Diagram labels, in extraction order: Museums, Amazon WS, Google, Microsoft Azure, DataONE, TCNs, Collectors, GBIF, ALA, Researchers, Amazon Turk, Georeferencing, Imaging services, Data quality, Mapping, EOL, TCNs, Government, Translation, OCR, BISON, NESCent, Data Conservancy, iPlant, Teachers, Citizens, TCNs, Domain-level data]
6
Evolution of iDigBio capabilities
[Diagram, capabilities plotted against time:]
  Data ingestion
  Data access, provision and visualization
  Provide and enable data feedback
  Data linking and federation
  Process and visualize integrated data
  Increasing storage and server hosting in support of the above
  Increasing number of appliances in support of the above
  Web site for interaction with public, community, education and above
7
iDigBio.org
News, Events, Forums, Documents, Links, Data portal, Working groups
8
Building the iDigBio Cloud
  Useful services/APIs (programmatic and web-based)
  Scalable object storage and information processing
  Digitization-oriented virtual appliances
  Standards, proven solutions and software reuse if possible
  Input from stakeholders (surveys, summit, workshops, …)
  Needs: storage, server hosting, data feedback transformations …
9
iDigBio data portal v0 at work
10
iDigBio Data Portal: Tutorial
11
iDigBio data portal v0: search
12
iDigBio data portal v0: record info
13
Storage hosting
“… able to facilitate storage of images on a case-by-case basis.”
“iDigBio currently does not provide archival storage, and hosting of images in iDigBio should not be seen as such.”
Currently, approximately 30 TB of space is committed to storage for the dissemination of images and derivatives produced by TCNs:
  North American Lichens and Bryophytes
  The Macrofungi Collection Consortium
  Plants, Herbivores, and Parasitoids
If you would like iDigBio to store and disseminate your TCN data as well, please contact us.
iDigBio also provides limited storage space along with its hosting services; this space currently totals approximately 8 TB.
14
Appliances, Virtual Private Servers
iDigBio packages and distributes pre-configured software tools and environments as software “appliances”
  Deployment in an end-user or a hosted server environment
iDigBio cloud hosts virtual private servers exposing services to the bio-collections community
  Proposal requests through iDigBio portal interface
Virtual private servers on iDigBio cloud: Symbiota, FilteredPush, VertNet, Biogeomancer
Virtual appliances
  Under development: Media ingestion; augmenting-OCR workshop and hack-a-thon
  Community interactions: Image-to-record services (OCR, NLP, duplicate discovery, workflow), Kepler Kurator, Specify
15
Short term: Ingestion appliance
Facilitate data ingestion, interface with iDigBio
[Diagram labels: Images captured (e.g. HD/flash media): /images/1/100.tif, /1/101.tif, /2/200.tif, …; File interface; Web server; Web-based UI; Cloud client; Batch upload, Cloud APIs; iDigBio object storage cloud (Swift); /1/100.tif GUID1, /1/101.tif GUID2]
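The ingestion slide pairs each image path with an identifier (/1/100.tif GUID1, /1/101.tif GUID2). A minimal Python sketch of that pairing follows. It is illustrative only: the function name build_manifest, the use of random UUIDs as GUIDs, and the in-memory dictionary are assumptions, not the appliance's actual scheme or API.

```python
import uuid


def build_manifest(paths):
    """Map each image path to a GUID (hypothetical scheme: random UUID4).

    A path that appears more than once keeps the GUID it was first given,
    so re-running over an overlapping list never re-labels a file.
    """
    manifest = {}
    for path in paths:
        if path not in manifest:
            manifest[path] = str(uuid.uuid4())
    return manifest


# Example paths taken from the slide; the duplicate is deliberate.
paths = ["/1/100.tif", "/1/101.tif", "/2/200.tif", "/1/100.tif"]
manifest = build_manifest(paths)

for path, guid in manifest.items():
    print(path, guid)
```

Keeping the path-to-GUID mapping in one place means the upload step can refer to a file by either name, and a repeated path does not produce a second identifier.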
16
Initial Setup
17
Initial Screen – Sign In
18
Fill out Sign In Form
19
Settings Pane After Signing In
20
Fill Out Settings
21
Move Next to Uploader Pane
22
Copy and Paste Path, Upload
23
Upload Started
24
Case 1: Ingestion Successful on the First Attempt
25
Upload Finishes Successfully
26
Case 2: Ingestion Successful After Several Attempts
27
Network Failed - Upload Aborted
28
Upload Resumes
29
Upload Finished with Some Errors
30
Resume Again
31
Now Entire Batch is Successful
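Case 2 describes a batch that is aborted by a network failure, resumed, finishes with some errors, and is resumed again until every file is in. One way to sketch that behavior in Python is to remember which files already succeeded and retry only the rest. Everything here is hypothetical: upload_batch, resume_until_done and FlakyUploader are illustrative names, and the uploader only simulates failures rather than calling any real storage API.

```python
def upload_batch(paths, upload_one, done):
    """Try every path not yet in `done`; return the paths that failed."""
    failed = []
    for path in paths:
        if path in done:
            continue  # already uploaded in an earlier attempt
        try:
            upload_one(path)
            done.add(path)
        except OSError:  # ConnectionError is a subclass of OSError
            failed.append(path)
    return failed


def resume_until_done(paths, upload_one, max_attempts=5):
    """Re-run the batch until nothing fails; return (attempts, done)."""
    done = set()
    for attempt in range(1, max_attempts + 1):
        if not upload_batch(paths, upload_one, done):
            return attempt, done
    raise RuntimeError("batch still incomplete after %d attempts" % max_attempts)


class FlakyUploader:
    """Simulated uploader: fails a fixed number of times per path."""

    def __init__(self, failures):
        self.failures = dict(failures)
        self.calls = []

    def __call__(self, path):
        self.calls.append(path)
        if self.failures.get(path, 0) > 0:
            self.failures[path] -= 1
            raise ConnectionError("simulated network failure")


paths = ["/1/100.tif", "/1/101.tif", "/2/200.tif"]
uploader = FlakyUploader({"/1/101.tif": 1, "/2/200.tif": 2})
attempts, done = resume_until_done(paths, uploader)
print(attempts, sorted(done))
```

Because successful files are recorded in `done`, each resume skips them, so a file that went through on the first attempt is never sent twice.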
32
Summary
iDigBio cloud
  Service-oriented, standards-based, focused on ADBC needs
  Scalable data management and information processing using standard interfaces, data formats, protocols, tools
Toolboxes as appliances
  Evolving collection of community-selected tools
  Built-in interfaces for effortless iDigBio integration
  Embed best practices and standards in biocollections work
After the first year we have a functional web site, data portal, storage and server hosting services
Ingestion appliances and ingestion APIs for images and data soon available
For feedback: fortes@ufl.edu and “Contacts” at idigbio.org
33
Linking Collections to…
  Ecology
  Paleontology
  Genomics
  Living Collections
  Other repositories
  PRAGMA activities
34
Acknowledgments
National Science Foundation
  Judith Skog and Anne Maglia
iDigBio IT team at U. of Florida
  Renato Figueiredo & Andrea Matsunaga, Senior Personnel
  Alex Thompson, Kevin Love & Matt Collins, IT Experts
  Jiangyan Xu, Graduate student
iDigBio IT team at Florida State U.
  Greg Riccardi, Director for Informatics
  Austin Mast, Senior Personnel
  Gil Nelson & Deb Paul, Digitization Specialists
  Guillaume Pierre, IT expert