Presented at the GHRC User Working Group Meeting September 25-26, 2014 INFRASTRUCTURE At the GHRC DAAC Will Ellett IT Manager

Presentation transcript:

Presented at the GHRC User Working Group Meeting September 25-26, 2014 INFRASTRUCTURE At the GHRC DAAC Will Ellett IT Manager Support: Michele Garrett, Michael McEniry, Jason Toone

GHRC Overview
Data Systems: Ingest & Processing; Public (Web, FTP); Database
Storage Systems: Tape-based Archive; Disk-based Archive; Backup; NAS Data Storage

GHRC Network
NASA Public: Web, FTP
NASA Private: Ingest, Processing, Archive, NAS
UAH Private: User Workstations
Diagram labels: Internet, Firewall, VPN

Public Network Systems (web/ftp)
Production Sites: Dell PowerEdge R510 (GHRC.nsstc.nasa.gov, LIGHTNING.nsstc.nasa.gov, SCS3.nsstc.nasa.gov)
Field Campaigns: Dell PowerEdge R510 (AIRBORNESCIENCE.nsstc.nasa.gov, FCPORTAL.nsstc.nasa.gov, GPM.nsstc.nasa.gov)
LANCE Project: Dell PowerEdge R720 (LANCE.nsstc.nasa.gov)
HS3 Project: Dell PowerEdge R510 (HS3.nsstc.nasa.gov)
RTMM Project: Dell PowerEdge 2950 (RTMM2.nsstc.nasa.gov, retiring soon)

Private Network Systems
Ingest/Processing: Dell PowerEdge R510 (gale)
LMA Processing: Dell Precision T7500 (LMA Processing)
AMSR Processing: Sun Fire X4270 (AMSR1-3)
AMSR Storage: Sun Storage (amsrnas1: 16TB NAS; amsrnas2: 20TB NAS, scalable to 60TB)
LANCE Processing: Dell PowerEdge R720 (gwen1)
Database: Sun Fire X4250 (neptune)
Storage: NetGear PowerNAS TB NAS
Backup/Logs: Dell PowerEdge R710 (underdog)

Private Network Systems (continued)
Hosts: KELVIN, GHRCARC1-2
Sun V880/L700: 90TB Tape Archive, 75% full
Sun ZFS Storage: TB Disk Archive, 10% full

Archive Migration
Replacing the aging Tape Archive, to be completed by Summer 2015
Installed Sept 2002: Sun V880/L700, 90TB usable, scalable to 500TB
Installed June 2013: Sun ZFS Storage, TB usable, scalable to 2PB

Data Backup
Tape Backup: System files; Source code; Critical data
Tape/Disk Archive: Datasets (multiple copies)
Researching Off-Site Archive: Datasets
Diagram labels: GHRC Public, GHRC Private, Firewall, Internet, Backup, Archive, Amazon Glacier (Future)

Future Projects
User Registration System (URS)
o Require registration for data access
FTP to HTTPS
o Evaluate impact on users
LIS Space Station
o Set up new Operations Center
Development Server
o Help reduce load on gale
o Additional storage
Off-Site Archive
o Amazon Glacier

Presented at the GHRC User Working Group Meeting September 25-26, 2014 GHRC DATA PROCESSING Lamar Hawkins Operations Manager Bruce Beaumont Lead Software Engineer

The Situation by the Numbers
~300 cataloged datasets
~30 ongoing datasets
Frequent field campaigns
o ~25 real-time data ingests (each)
1-1/2 Operations staff

Goals
Automate everything!
Standardize data processing
Simplify data flow
Reduce duplicated code
Increase maintainability
Document everything
Automated watchdogs

Environments
DEV (development)
o Writable by all developers
o Basic (unit) testing done here
TEST (integration & test)
o Writable by Operations staff only
o Acceptance testing done here
OPS (production)
o Writable by SysAdmin only
o Certain directories are writable by Ops staff
o Operational processing done here

Overall Data Flow
Ingest -> Process -> Distribute

Data Ingest
PUSH method
o Remote site delivers data to us periodically
o Standard SW discovers new data
PULL method
o We poll a remote site for new data
o Standard SW handles new data
Other method
o Data delivered on media
o Other PUSH method (socket, LDAP)
Ingest metrics are generated for most streams
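
The "standard SW discovers new data" step of the PUSH method could look roughly like the sketch below. This is not the GHRC software: the function name `discover_new_files`, the idea of a drop directory, and tracking handled files in a set are all assumptions made for illustration.

```python
from pathlib import Path


def discover_new_files(drop_dir, seen):
    """Return files in drop_dir that have not been handled yet, sorted by name.

    `seen` is a set of file names already handled; it is updated in place.
    Both the drop directory and the `seen` set are hypothetical stand-ins
    for however a real ingest system tracks deliveries.
    """
    new_files = []
    for path in sorted(Path(drop_dir).iterdir()):
        if path.is_file() and path.name not in seen:
            seen.add(path.name)
            new_files.append(path)
    return new_files
```

Calling the function on a schedule with the same `seen` set returns only the files delivered since the previous call. A PULL stream could reuse the same bookkeeping after listing the remote site instead of a local directory.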

Processing
Science processing for some data
May include reformatting, renaming, etc.
Processing is not required
Modules are stream-specific

Data Distribution
Data distribution is handled by a common module
Distribution may include
o Copying files to public or private FTP areas
o Putting files on the archive (in OPS only!)
o Staging files for delivery to external users via PUSH
File-level metadata are generated for most streams
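
One way to picture the "archive in OPS only" rule inside a common distribution module is the sketch below. The function `distribute`, its parameters, and the directory layout are hypothetical; the slides do not describe how the actual module is written.

```python
import shutil
from pathlib import Path


def distribute(file_path, environment, ftp_dir, archive_dir):
    """Copy a file to the FTP area, and to the archive only when running in OPS.

    `environment` is assumed to be one of "DEV", "TEST" or "OPS".
    Returns the list of paths that were written.
    """
    src = Path(file_path)
    destinations = [Path(ftp_dir)]
    if environment == "OPS":
        # Archive writes are restricted to the production environment.
        destinations.append(Path(archive_dir))
    copied = []
    for dest in destinations:
        dest.mkdir(parents=True, exist_ok=True)
        copied.append(Path(shutil.copy2(src, dest / src.name)))
    return copied
```

Keeping the environment check in one shared function means a stream-specific module cannot write to the archive from DEV or TEST by accident.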

Presented at the GHRC User Working Group Meeting September 25-26, 2014 GHRC DATA SEARCH, ACCESS AND ORDER Mary Nair User Services and Data Management Team Member Sherry Harrison DBA and Data Management Team Member

Overview
Search: HyDRO; Reverb; GCMD; Data Set List; OpenSearch; Tropical Storm Tracks
Access: Field Campaign Portals; DOIs; Data Set Landing Pages; Guides; OPeNDAP; FTP (Future: HTTPS)
Order: Automated Order Processing; Data Subscriptions (PUSH & GDX)

Hydrologic Data Search, Retrieval, and Order System (HyDRO)
Application developed at the GHRC by Bruce Beaumont
Highlights
o Quick Search
o Advanced Search
o Data Sets by Collection
o Data Set Information
o Download Data
o Order Data

Data Search Tools
Reverb
Global Change Master Directory (GCMD)
Data Set List
o hydro/search.pl
OpenSearch
o Provides a web service API for searching the GHRC catalog
o hydro/ghost.xml

Tropical Storm Tracks
Application developed at the GHRC
Storm data from the National Hurricane Center
Updates at ~6-hour intervals during active storms

Field Campaign Portals
Access restricted to field campaign participants and collaborators

Digital Object Identifiers (DOIs)
What is a DOI?
o Unique alphanumeric string used to identify a digital object
o Provides persistent identification with a permanent online link
o Enables easier access to research data
o Assigned and regulated by the International DOI Foundation (IDF)
o Often used in citations in online publications
DOIs at the GHRC
o DOIs have been defined for most of the approximately 300 datasets in the GHRC catalog, with about 65% of these registered through ESDIS.
o Dataset Landing Pages are already provided for all GHRC datasets, whether or not a DOI is in place.
DOI example: F17/SSMIS/DATA302
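
The "permanent online link" for a DOI is conventionally formed by appending the DOI name to the public resolver at https://doi.org/. The sketch below illustrates that convention only; the function name `doi_to_url` and the DOI in the usage note are placeholders, not GHRC identifiers.

```python
from urllib.parse import quote


def doi_to_url(doi):
    """Build a resolver link for a DOI name, assuming the public doi.org resolver."""
    # Keep "/" unescaped because it separates the DOI prefix from the suffix.
    return "https://doi.org/" + quote(doi.strip(), safe="/")
```

For example, `doi_to_url("10.1234/EXAMPLE.DATA")` returns `https://doi.org/10.1234/EXAMPLE.DATA`, where `10.1234/EXAMPLE.DATA` is a made-up DOI name.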

Data Set Landing Pages
details.pl?ds=gpmparprbgcpex
One-paragraph description
Citation Information
Basic metadata
Coverage information
Links to documentation and software
DOI
We get this information from the PI.

Guides
Data set overview document composed by the GHRC from PI-provided information
Features
o Instrument Overview
o Data Format and File Naming Convention
o Investigator Information
o Algorithm Details
o PI Documentation and Software Information and Links
o Citations and References

Additional Access Methods
ftp://ghrc.nsstc.nasa.gov/
ftp://gpm.nsstc.nasa.gov/
Future: HTTPS

Automated Order Processing
Diagram components: Order Submitter (HyDRO, Reverb); GHRC Order Database; Order Broker; Value-added process; FTP area
Extracts files from tarred/gzipped bundles
Performs HEW (HDF-EOS) subsetting
Packs results into convenient tar bundles for delivery
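
The extract-and-repack steps listed above can be sketched with Python's standard `tarfile` module. This is an illustration only: the function `repack_order` and its arguments are hypothetical, the HEW subsetting step is omitted, and the slides do not say how the GHRC order software is implemented.

```python
import tarfile


def repack_order(bundle_path, wanted_names, output_path):
    """Copy the requested members of a .tar.gz bundle into a new .tar.gz for delivery.

    Returns the names of the members that were packed. Any subsetting of the
    file contents (such as the HEW step) would happen between reading and writing.
    """
    packed = []
    with tarfile.open(bundle_path, "r:gz") as src, \
            tarfile.open(output_path, "w:gz") as out:
        for member in src.getmembers():
            if member.isfile() and member.name in wanted_names:
                # Stream the member straight from the source bundle into the new one.
                out.addfile(member, src.extractfile(member))
                packed.append(member.name)
    return packed
```

Streaming members from one archive to the other avoids unpacking the whole bundle to disk when an order asks for only a few files.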

Data Subscriptions
Data subscription
o Scheduled delivery of data on a near-real-time basis to individual subscribers
o Delivery via applications developed at the GHRC (PUSH, GDX)
o Access to subscription applications is limited to GHRC operations staff
Product / User Subscription Handler (PUSH)
o Primary Data Subscription Service
o Configurable for the dataset and the transfer interval
GPM Data Interchange (GDX)
o Command-line mechanism for data transfer which includes handshaking
o Near-real-time LIS provided to PPS (Erich Stocker)
o Configurable to transfer various data sets

Discussion
THANK YOU for your attention!
Please cite your data.
o When the DOI is available, please use it in your data citation.
o When your publication cites our data, please notify us.
What data formats do you prefer?
What metadata is most useful to you?
Do you find the user guide documents useful?
Are there additional data access methods to consider?
If you have not already done so, please respond to the ESDIS survey for the GHRC DAAC.
Please contact GHRC User Services for any help or questions.