Using Specimen Data in Scientific Workflow Environments to Connect to Metadata Archive and Discovery Services in Environmental Biology CJ Grady, J.H. Beach,

Presentation transcript:

Using Specimen Data in Scientific Workflow Environments to Connect to Metadata Archive and Discovery Services in Environmental Biology CJ Grady, J.H. Beach, A. Stewart, J. Cavner University of Kansas Biodiversity Institute

Geospatial Metadata
– Describes
  – What it is
  – What it looks like
  – Who assembled it
  – When it was collected
  – Etc.

EML
– Ecological Metadata Language
  – XML Schema
  – Open Source
  – Community Driven
  – Describes ecological data
    – Occurrence Data
    – Climate Layers
    – Species Ranges
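As a rough illustration of the kind of record EML holds, the sketch below builds a small EML-style XML document for an occurrence data set using only Python's standard library. The element names (dataset, title, creator, keywordSet) follow common EML usage, but the output is a simplified stand-in that is not validated against the EML schema. The package ID, title, and creator name are made-up placeholders.

```python
import xml.etree.ElementTree as ET


def build_eml_like(package_id, title, creator_surname, data_kind):
    """Return a simplified, EML-style XML string (not schema-validated)."""
    # Placeholder root element; a real EML document uses a namespaced root.
    root = ET.Element("eml", {"packageId": package_id, "system": "example"})
    dataset = ET.SubElement(root, "dataset")
    ET.SubElement(dataset, "title").text = title

    # Who assembled the data.
    creator = ET.SubElement(dataset, "creator")
    name = ET.SubElement(creator, "individualName")
    ET.SubElement(name, "surName").text = creator_surname

    # Tag the kind of ecological data being described.
    keywords = ET.SubElement(dataset, "keywordSet")
    ET.SubElement(keywords, "keyword").text = data_kind

    return ET.tostring(root, encoding="unicode")


doc = build_eml_like(
    "example.1.1",
    "Occurrence points for a hypothetical species",
    "Doe",
    "occurrence data",
)
print(doc)
```

The same function could describe a climate layer or a species range by changing the keyword, which is the sense in which one schema covers several kinds of ecological data.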

Narratives
– Transformation of metadata into a story that is appropriate for the intended audience
– Same metadata can be used to create a narrative for:
  – Scientists
  – Undergrads
  – K-12 students
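One simple way to picture this transformation is a set of audience-specific templates filled from a single metadata record. The sketch below is only an illustration of that idea, not the approach used in the presentation. The template wording, the audience keys, and the sample metadata values are all assumptions.

```python
# Hypothetical templates: one per intended audience.
NARRATIVE_TEMPLATES = {
    "scientist": (
        "{count} georeferenced occurrence records for {species}, "
        "drawn from {source}."
    ),
    "undergrad": (
        "This data set maps {count} places where {species} has been "
        "recorded, based on {source}."
    ),
    "k12": "Scientists found {species} in {count} different places!",
}


def make_narrative(metadata, audience):
    """Render the same metadata record as a narrative for one audience."""
    if audience not in NARRATIVE_TEMPLATES:
        raise ValueError(f"No narrative template for audience: {audience!r}")
    # Unused metadata keys are ignored by str.format.
    return NARRATIVE_TEMPLATES[audience].format(**metadata)


# Placeholder metadata record; the values are invented for illustration.
metadata = {
    "species": "Quercus alba",
    "count": 128,
    "source": "museum specimen records",
}

for audience in ("scientist", "undergrad", "k12"):
    print(f"{audience}: {make_narrative(metadata, audience)}")
```

Keeping the metadata separate from the templates means a new audience only needs a new template, not a new copy of the data.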

Narrative Example

DataONE
– Distributed system for:
  – Queries
  – Data replication
– Initially supports EML

Study of Experiment Reproducibility
Ellison, Aaron. Repeatability and transparency in ecological research. Ecology 90.

There is a Solution!

Process Metadata
– Data about the process used
– Descriptive and prescriptive
– Documents process used to generate data / metadata
  – Quality control
  – SDM experiments
  – RAD experiments

Capabilities
– Reproducibility
  – Actions are documented
– Transparency
  – Experiments can be evaluated and validated
– Publishing
  – Metadata can be published along with results

What we have done
– EML for all of our Species Distribution Modeling services
– Simple process metadata
  – Documents how an experiment is run through our cluster, including which software versions are used
  – Also describes which web services would be called to execute the experiment again
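A minimal sketch of what such a simple process metadata record might look like is given below. The structure is hypothetical and is not the Lifemapper schema. The class names, field names, software names, versions, and example.org URLs are all assumptions chosen to show the two roles described above: recording what ran, and listing what to call to run it again.

```python
import json
from dataclasses import dataclass, asdict, field


@dataclass
class ProcessStep:
    """One documented step of an experiment (hypothetical structure)."""
    name: str
    software: str
    version: str
    service_url: str  # web service that would be called to repeat the step


@dataclass
class ProcessRecord:
    """Descriptive and prescriptive record of how an experiment was run."""
    experiment_id: str
    steps: list = field(default_factory=list)

    def add_step(self, step):
        self.steps.append(step)

    def replay_plan(self):
        # Prescriptive view: the service calls needed to run it again, in order.
        return [step.service_url for step in self.steps]

    def to_json(self):
        # Descriptive view: everything that was recorded, ready to publish.
        return json.dumps(asdict(self), indent=2)


record = ProcessRecord("experiment-001")
record.add_step(ProcessStep(
    "build model", "example-sdm-tool", "1.0",
    "https://example.org/services/models",
))
record.add_step(ProcessStep(
    "project model", "example-sdm-tool", "1.0",
    "https://example.org/services/projections",
))

print(record.to_json())
print(record.replay_plan())
```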

What we have done
– Clients
  – Python library
  – VisTrails
  – QGIS

What we are doing
– Publishing EML to a repository
– Client extensions
– Extending process metadata
  – HTTP message
  – XPath

Process Metadata Extensions
– HTTP Message
  – Documents any web resource call over HTTP
– XPath processing
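To make the two extensions concrete, the sketch below pairs a documented HTTP call with an XPath step that pulls values out of the XML response. It illustrates the idea only, under several assumptions: the class names, the example.org URL, and the response document are invented, and the response is a canned string so that no network access is needed. The XPath evaluation uses the limited XPath subset supported by Python's standard `xml.etree.ElementTree`.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field


@dataclass
class HttpMessageStep:
    """Documents a single web resource call over HTTP (hypothetical shape)."""
    method: str
    url: str
    headers: dict = field(default_factory=dict)
    body: str = ""


@dataclass
class XPathStep:
    """Documents how values are extracted from an XML response."""
    expression: str

    def apply(self, xml_text):
        root = ET.fromstring(xml_text)
        # ElementTree supports a limited subset of XPath expressions.
        return [element.text for element in root.findall(self.expression)]


# The documented request; the URL is a placeholder.
request = HttpMessageStep(
    "GET",
    "https://example.org/services/experiments/42",
    {"Accept": "application/xml"},
)

# Canned response standing in for what the service might return.
canned_response = (
    "<experiment>"
    "<model><status>complete</status></model>"
    "<model><status>queued</status></model>"
    "</experiment>"
)

statuses = XPathStep("./model/status").apply(canned_response)
print(request.method, request.url)
print(statuses)
```

Recording the request and the extraction rule separately lets the same XPath step be reused against the response of any documented HTTP call.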

What we will do next
– Use standard APIs to communicate with DataONE
– Continue to search for standard process metadata and include it whenever possible
– Contribute process metadata extensions back to the community
– Add conditional analysis elements to the schema (JSON, etc.)

Reproducibility
– Simple process metadata
– EML process metadata extensions
– Lifemapper client EML reader

Transparency
– EML for all service objects
– Descriptive process metadata

Publishing Aid
– Client access to public data / metadata catalogs
– Publish buttons

Lessons Learned
– Had success starting with narrow, specific process steps and generalizing them
  – Calls to our web services expanding to any HTTP call
– Easy to get carried away with all of the possibilities

Lifemapper funded by: U.S. National Science Foundation (NSF EPSCoR, EHR/DRL, BIO/DBI, OCI/CI-TEAM)