System Development & Operations: NSF DataNet Site Visit to MIT, February 8, 2010 (DataSpace)

Presentation transcript:

DataSpace High-Level Architecture
Global Network (Web)
  – MIT Node
  – Other USA Nodes
  – International Nodes
Local Network
  – Metadata Repository for Scientific Data
  – Multiple Scientific Data Repositories (DataSpace Native Architecture)
  – Interface to Legacy Scientific Data Repositories
DataSpace Services
  – Distributed Data Management Services: Security, Replication, Administration, Policy Management, Workflow Services
  – Data Curation Services: Process, Catalog, Annotate, Preserve
  – Basic Data User Services: Discovery, Quality, Conversion, Integration
  – Additional Data User Services: Data Analytics, Data Visualization
  – 3rd Party Specialized Data Services
Basic Workflow
  – Scientist: provides data and preliminary metadata
  – Curator: processes and ingests data, completes metadata and policies (e.g. retention)
  – User: searches (meta)data, accesses/integrates data, analyzes/visualizes data (via DataSpace data services or 3rd-party data services)
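The basic workflow on this slide (scientist, then curator, then user) can be illustrated with a minimal Python sketch. All names here (`Dataset`, `scientist_submit`, `curator_ingest`, `user_search`) are hypothetical and are not part of any DataSpace API; the repository is just an in-memory list.

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    """Hypothetical record passed along the scientist -> curator -> user workflow."""
    data: bytes
    metadata: dict
    policies: dict = field(default_factory=dict)
    ingested: bool = False


def scientist_submit(data: bytes, preliminary_metadata: dict) -> Dataset:
    # The scientist provides the data and preliminary metadata only.
    return Dataset(data=data, metadata=dict(preliminary_metadata))


def curator_ingest(ds: Dataset, extra_metadata: dict, policies: dict,
                   repository: list) -> Dataset:
    # The curator completes the metadata, attaches policies (e.g. retention),
    # and ingests the dataset into the repository.
    ds.metadata.update(extra_metadata)
    ds.policies.update(policies)
    ds.ingested = True
    repository.append(ds)
    return ds


def user_search(repository: list, **criteria) -> list:
    # The user searches metadata; only ingested datasets are visible.
    return [
        ds for ds in repository
        if ds.ingested and all(ds.metadata.get(k) == v for k, v in criteria.items())
    ]
```

In this sketch a dataset becomes discoverable only after the curation step, which mirrors the division of roles on the slide.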

PLATFORM ARCHITECTURE

Platform Architecture: Version 0.1 and Version 1.0

Federated Architecture

Multiple Implementations

Federated Model
Data can be widely distributed; Web-based services can be centralized or federated
  – e.g. centralized, domain-specific search service that harvests metadata from relevant archives (“Google for biological oceanography”)
  – e.g. real-time data integration across small sets of archives identified via subject search
DataSpace will develop some services, but more importantly will create an ecosystem that others can contribute to (e.g. technology & scientific companies, universities, researchers, labs)
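The first example above (a centralized, domain-specific search service that harvests metadata from distributed archives) can be sketched in Python as follows. This is a minimal illustration under stated assumptions: `Archive` and `CentralSearchService` are hypothetical names, the archives are in-memory objects rather than Web services, and each record is assumed to carry a `subjects` list.

```python
from dataclasses import dataclass, field


@dataclass
class Archive:
    """Hypothetical archive that exposes its metadata records for harvesting."""
    name: str
    records: list = field(default_factory=list)

    def harvest(self) -> list:
        # A real archive would answer over the Web; here the records are
        # returned directly, tagged with the archive they came from.
        return [dict(record, archive=self.name) for record in self.records]


class CentralSearchService:
    """Centralized subject index built from metadata harvested from many archives."""

    def __init__(self) -> None:
        self.index = {}

    def harvest_from(self, archives: list) -> None:
        # The data stays in the archives; only metadata is copied into the index.
        for archive in archives:
            for record in archive.harvest():
                for subject in record["subjects"]:
                    self.index.setdefault(subject.lower(), []).append(record)

    def search(self, subject: str) -> list:
        return self.index.get(subject.lower(), [])

    def archives_for(self, subject: str) -> list:
        # Names of the archives holding matching data, e.g. as candidates
        # for integration across a small set of archives.
        return sorted({record["archive"] for record in self.search(subject)})
```

The `archives_for` helper hints at the second example on the slide: a subject search narrows the set of archives before any data is integrated across them.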

Development Methodology
Behavior-Driven Development model
Continuous Integration Process
  – iterative research prototyping and production implementation phases
Small centralized development team to start
Institutional partners add developers in years 1-2
Transparent, open source process
Close collaboration with Data Conservancy

OPERATIONS

Local Operations – MIT Example
Scientists
  – data production, early-stage curation
  – lots of domain expertise, little or no curation expertise
Libraries
  – outreach and recruitment (e.g. HMI study)
  – later-stage data curation, ingest
  – some domain expertise, lots of curation expertise
IS&T
  – identifying, operating hardware & system
  – Enterprise systems management expertise
  – lots of IT expertise, some curation expertise

Project-Wide Operations
Platform governance
  – distributed open source software model
  – transparent decision-making process
Service model(s) for each institutional partner
  – including all data curation activities
  – including CI templates (e.g. hardware, cloud)
  – associated cost model for each service model

Project-Wide Operations
Ongoing usability studies with researchers, students, public audiences
Develop certification strategy for TDRs using DataSpace (.arc domain)

Data Curation Lifecycle Highlights
Deposit workflows for researchers based on locally-produced data (interactive and batch)
Data Curators
  – outreach, marketing, data recruitment
  – metadata creation and data ontology application
  – curatorial policies developed, applied
  – tailored preservation strategies (local, consortial, outsourced)
Direct access to data creators and boots-on-the-ground support services

Data Curation Lifecycle Highlights
Novel distributed, standards-based policy management strategy based on emerging Semantic Web standards and TRAC
Semantic Web standards (e.g. RDF) to support improved data integration and interoperability
Separation of access layer (discovery, use) from curation layer, in support of broad federation, distributed tool development
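To make the idea of standards-based policy management more concrete, the sketch below expresses a retention policy as subject-predicate-object triples, the data model that RDF is built on. It is an illustration only: the triples are plain Python tuples rather than real RDF, and every `ex:` name, the dataset identifier, and the helper functions are hypothetical, not taken from DataSpace or TRAC.

```python
from datetime import date

# Policy statements as (subject, predicate, object) triples.
# All "ex:" identifiers are hypothetical, for illustration only.
POLICY_TRIPLES = [
    ("ex:dataset-1", "ex:hasPolicy", "ex:retain-10y"),
    ("ex:retain-10y", "ex:retentionYears", 10),
    ("ex:dataset-1", "ex:depositedOn", date(2010, 2, 8)),
]


def objects(triples, subject, predicate):
    """Return every object stated for the given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def retention_end_year(triples, dataset):
    """Last year the dataset must be kept, or None if no policy applies."""
    deposited = objects(triples, dataset, "ex:depositedOn")
    years = [
        y
        for policy in objects(triples, dataset, "ex:hasPolicy")
        for y in objects(triples, policy, "ex:retentionYears")
    ]
    if not deposited or not years:
        return None
    # When several policies apply, the longest retention period wins.
    return deposited[0].year + max(years)
```

Because the policy is data rather than code, a curation-layer tool and an access-layer tool could both read the same statements, which is in the spirit of the layer separation described on the slide.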