Lifemapper 2.0 Using and Creating Geospatial Data and Open Source Tools for the Biological Community Aimee Stewart, CJ Grady, Dave Vieglais, Jim Beach.


Lifemapper 2.0
Using and Creating Geospatial Data and Open Source Tools for the Biological Community
Aimee Stewart, CJ Grady, Dave Vieglais, Jim Beach
Natural History Museum and Biodiversity Institute, University of Kansas

Overview
- Overall goals
- History
- Current version
- Implementation
- Future

Niche Modeling
- Yeah yeah yeah
- Data
  – Environmental
  – Occurrence
- Computational limitations

Lifemapper 1.0
- NSF funded
- Experimental app.
- Successful DC project
- Enthusiastic users
- Limited by
  – Data quality
  – Architectural decisions

Lifemapper 2.0
- Demo pipeline processing specimen data from GBIF cache
- Funded by NSF/EPSCoR
- Simpler, controlled architecture
- Goals
  – On-demand computation
  – Model archive
  – Data and analysis service

Components
- Cluster
- Spatial data library (SDL)
- Workflow controller
- Open-source Python

Operation
- Client
  – retrieves point data
  – constructs request
  – sends job to cluster by REST
- Cluster front end receives and schedules job
- Cluster nodes
  – retrieve environmental data
  – dispatch job to openModeller (OM)
- Client
  – polls for status
  – retrieves and stores model/projection
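
The client side of this cycle (construct request, send job, poll, retrieve) can be sketched in Python. This is a minimal sketch, not the Lifemapper code: `InMemoryCluster`, `build_request`, `run_job`, the request fields, and the status strings are all hypothetical names chosen for illustration, and the cluster's REST front end is replaced by an in-memory object so the example runs without a network.

```python
import itertools
import time


class InMemoryCluster:
    """Hypothetical stand-in for the cluster's REST front end (no network)."""

    def __init__(self, polls_until_done=2):
        self._ids = itertools.count(1)
        self._jobs = {}
        self._polls_until_done = polls_until_done

    def submit(self, request):
        # Front end receives the job and schedules it; returns a job id.
        job_id = next(self._ids)
        self._jobs[job_id] = {"request": request, "polls": 0}
        return job_id

    def status(self, job_id):
        # Pretend the job finishes after a fixed number of status checks.
        job = self._jobs[job_id]
        job["polls"] += 1
        return "done" if job["polls"] >= self._polls_until_done else "running"

    def result(self, job_id):
        # A real service would return the model and projection; this is a placeholder.
        request = self._jobs[job_id]["request"]
        return {"species": request["species"], "projection": "raster-placeholder"}


def build_request(species, points, env_layer_urls, algorithm="GARP"):
    """Assemble a job request; the field names are assumptions."""
    return {
        "species": species,
        "points": list(points),
        "layers": list(env_layer_urls),
        "algorithm": algorithm,
    }


def run_job(cluster, request, poll_interval=0.0, max_polls=100):
    """Send a job, poll for status, then retrieve the result."""
    job_id = cluster.submit(request)
    for _ in range(max_polls):
        if cluster.status(job_id) == "done":
            return cluster.result(job_id)
        time.sleep(poll_interval)
    raise TimeoutError(f"job {job_id} did not finish after {max_polls} polls")
```

Because the client only polls, the cluster never needs to call back into the client, which fits the deck's point that the REST service is the only outside connection.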

Data
- Environmental data
  – URL in job, retrieved via WCS by node
  – Cached on nodes for efficiency
- Point data
  – Could be REST or WFS URL
- Result data
  – Model (ruleset) stored on file system
  – Projection (raster map) registered in SDL
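
The environmental-data path (a WCS URL carried in the job, fetched by a node, cached locally) might look roughly like the sketch below. It assumes WCS 1.0.0 key-value GetCoverage requests. The function names, the cache layout (one file per URL hash), and the injected `fetch` callable are illustrative assumptions, not details from the talk.

```python
import hashlib
from pathlib import Path
from urllib.parse import urlencode


def wcs_getcoverage_url(base_url, coverage, bbox, width, height,
                        crs="EPSG:4326", fmt="GeoTIFF"):
    """Build a WCS 1.0.0 GetCoverage URL from key-value parameters."""
    params = {
        "SERVICE": "WCS",
        "VERSION": "1.0.0",
        "REQUEST": "GetCoverage",
        "COVERAGE": coverage,
        "CRS": crs,
        "BBOX": ",".join(str(v) for v in bbox),  # minx,miny,maxx,maxy
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": fmt,
    }
    return f"{base_url}?{urlencode(params)}"


def cached_fetch(url, cache_dir, fetch):
    """Return a local path for `url`, calling `fetch(url)` only on a cache miss.

    `fetch` is any callable that returns the layer as bytes; on a real node
    it could wrap an HTTP GET against the spatial data library.
    """
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    path = cache_dir / hashlib.sha256(url.encode("utf-8")).hexdigest()
    if not path.exists():
        path.write_bytes(fetch(url))
    return path
```

Keying the cache on the full request URL means two jobs that ask for the same layer, extent, and resolution share one download, while any change in those parameters produces a separate cache entry.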

Cluster
- 64 nodes, 128 processors
- 2 TB storage
- NPACI Rocks
- Sun Grid Engine scheduler
- HTTP REST service
  – Run openModeller (GARP or other algorithm)
  – Get status
  – Get result data

Spatial Data Library
- MapServer with custom Python W*S
- Layer metadata in PostGIS
- Independent service, so could
  – be standalone
  – be one of multiple SDLs servicing pipeline
- Will have
  – search/query web service
  – browsable web interface

Workflow Controller
- Could simply generate jobs … so easy to integrate
- Currently
  – Harvests from GBIF
  – Generates jobs per species
  – Reproduces LM1 but with refined data and scalable system
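
The "generates jobs per species" step can be illustrated with a short Python sketch. The record fields (`species`, `lon`, `lat`), the job dictionary layout, and the `min_points` threshold are assumptions made for the example. The deck does not say how the controller filters or structures jobs.

```python
from collections import defaultdict


def group_points_by_species(records):
    """Group occurrence records into {species: [(lon, lat), ...]}.

    Records without coordinates are skipped (an assumed cleaning rule).
    """
    grouped = defaultdict(list)
    for rec in records:
        lon, lat = rec.get("lon"), rec.get("lat")
        if lon is None or lat is None:
            continue
        grouped[rec["species"]].append((lon, lat))
    return dict(grouped)


def generate_jobs(records, env_layer_urls, algorithm="GARP", min_points=3):
    """Emit one job description per species with enough usable points.

    `min_points` is a hypothetical threshold, not a value from the talk.
    """
    jobs = []
    for species, points in sorted(group_points_by_species(records).items()):
        if len(points) < min_points:
            continue
        jobs.append({
            "species": species,
            "points": points,
            "layers": list(env_layer_urls),
            "algorithm": algorithm,
        })
    return jobs
```

Since the controller only produces job descriptions, any other source of occurrence data could feed the same cluster by emitting jobs in the same shape.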

Overall system
- Standalone system
  – Only outside connection is REST service
  – Easily moved to smaller/larger system
  – Or multiple systems for failover for high demand
- Interface: easier than existing SOAP
- Designed to provide high throughput, not rapid evaluation of single model
- Open Source Software

Implementation Status
- Core components operational
  – harvest data
  – generate jobs
  – output projections
  – store back to SDL
- No user interface yet
- W*S so existing viz solutions easy

What does the future hold?
- Fine-tune
- Taxonomic resolution
- Data cleaning
- Multiple
  – algorithms
  – projection scenarios
- Analysis services

Acknowledgements
- Funding
  – NSF Award (EPSCoR )
  – Kansas Technology Enterprise Corporation
- openModeller
  – CRIA and more
- Original GARP
  – David Stockwell, SDSC
- Environmental data
  – Climate Research Unit
  – Intergovernmental Panel on Climate Change
  – Normalization
- BDWorld, Tim Sutton, Pete Brewer
- GBIF and contributing collections
- Lifemapper1 Team, especially Ricardo Pereira

Questions?