George E. Brown, Jr. Network for Earthquake Engineering Simulation Data Curation and Quality Assurance at NEES Stanislav Pejša HUBbub 2012 Indianapolis,

Presentation transcript:

George E. Brown, Jr. Network for Earthquake Engineering Simulation
Data Curation and Quality Assurance at NEES
Stanislav Pejša
HUBbub 2012, Indianapolis, IN
This work is licensed under a Creative Commons Attribution 3.0 Unported License.

DCC Curation Life-Cycle Model
Data
– Digital objects
Full Lifecycle Actions
– Description and Representation information
– Preservation planning
– Community watch and participation
– Curate and Preserve
Sequential actions
– Conceptualise
– Create or receive
– Appraise and select
– Ingest
– Preservation action
– Store
– Access, use, re-use
– Transform
Occasional actions
– Dispose
– Reappraise
– Migrate

Testing Collected sensor measurements Visualisation Access and re-use project 637

NEEScomm Data Goals
Aligned with NSF Data Management Plan (DMP) requirements*
– All research data and documentation will be archived
– Archived data will be of high quality
– Archived data will be accessible and shareable
– Archived data will be re-usable
– Archived data will be preserved
NEEScomm Data Sharing and Archiving Policies, *

Data Archiving at NEES
Who
– research team, site personnel, curator, NEEScomm
What
– sensor measurements
– sensor calibrations
– observations
– analyses
– numerical simulations
– reports (including publications and presentations)
When
– Dates are stated in the Data Sharing and Archiving Policies (1 month, 6 months, 12 months)
– For as long as the data are useful ~ indefinitely ~ for 20 years
Where
– Project Warehouse
Why
– increases researcher's impact
– saves work, time, money
– facilitates knowledge transfer
– maintains authenticity and integrity of data
– good practice
– advances research

What kind of data?
diverse
– shared facilities, not always practices
– research domain
  » structural engineering
  » geotechnical engineering
  » geophysical research
  » material engineering
  » tsunami research
– type of data
  » experimental
  » observational
  » computational
increasingly complex
– number of sensors
– interdisciplinarity
– experimenting with computational modeling

Where are the data coming from?
– DAQ systems on the site
– DAQ systems of research teams
– Researchers' computers
– Thumb-drives
Transfer: ssh, sftp; http, php; ssh, sftp
Destination: Project Warehouse

Quality assurance
Content – dependent on human input and monitoring
  Metadata
  Completeness
  – Based on standards and requirements
    » On the level of research hierarchy
    » Timeline
Technical – machine-actionable
  Formats
  – Interoperability
  – Accessibility
  – Preservability
  File integrity

Understandable data
Metadata need to be:
– meaningful
– purposeful
– consistent
– accurate
– predictable
– "standardized"
Relationship among:
– Instrumentation plan
– Sensor metadata
– Data

CONTENT - Metadata
Autocomplete
– Names
– Organization
– Facility
Check boxes
– Equipment
System-generated folders/Tabs
– Genre/Format
– Proxy for metadata
– Location on file system
– File format constraints
Templates
– Sensors
Forms
– Material properties

Metadata - Autocomplete Personal Names

Metadata - Autocomplete Personal Names Organization / Facility

Metadata - Checkboxes Personal Names Organization/Facility Facility/Equipment

System-generated folders/Tabs
– control of proper position
– provides metadata
– file format constraints
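How a system-generated folder can act as a proxy for metadata and constrain file formats might be sketched as below. The folder names and the allowed extensions in `FOLDER_RULES` are hypothetical; the slides do not list the actual Project Warehouse folders or their accepted formats.

```python
from pathlib import PurePosixPath

# Hypothetical folder-to-format rules, for illustration only.
# The real Project Warehouse folders and formats are not given in the slides.
FOLDER_RULES = {
    "Photos": {".jpg", ".png"},
    "Reports": {".pdf"},
    "Sensor_data": {".csv", ".txt"},
}


def check_placement(relative_path):
    """Derive a genre from the top-level folder and check the file extension.

    Returns a dict with the implied genre (None if the folder is unknown or
    the file sits outside any folder) and whether the upload would be accepted.
    """
    path = PurePosixPath(relative_path)
    folder = path.parts[0] if len(path.parts) > 1 else None
    allowed = FOLDER_RULES.get(folder)
    accepted = allowed is not None and path.suffix.lower() in allowed
    return {"genre": folder if allowed is not None else None, "accepted": accepted}


if __name__ == "__main__":
    for candidate in ("Photos/specimen5.JPG", "Reports/data.csv", "loose.pdf"):
        print(candidate, check_placement(candidate))
```

The point of the design is that the depositor never types the genre: choosing the folder supplies it, and the same choice limits which formats can land there.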

Templates
sensor metadata
– difficult to collect
– variety of measuring instruments
– different research needs

Completeness
Research teams
– Professional standards
– Local Site guidelines
– Team guidelines for data management
– NEEScomm documentation and metadata requirements
– NEEScomm data archiving milestones
  » T0: completion of an experiment
  » T + 1 month: Unprocessed data uploaded
  » T + 6 months: Metadata and Documentation are uploaded
  » T + 12 months: Rest of the data and experimental report are uploaded
NEES Sites
– Certifications
– Professional standards
– Local guidelines (naming conventions, etc.)
NEES repository
– OAIS
– PREMIS
– Dublin Core
– NEEScomm documentation and archiving policies
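The archiving milestones above can be turned into concrete due dates once T0 is known. The following is a minimal sketch: the function names are made up for illustration, and clamping to the last day of a shorter month is an assumption, since the slides do not say how month-end dates are handled.

```python
import calendar
from datetime import date

# Offsets in months after T0 (completion of an experiment), as listed above.
MILESTONES = {
    "Unprocessed data uploaded": 1,
    "Metadata and documentation uploaded": 6,
    "Rest of the data and experimental report uploaded": 12,
}


def add_months(start, months):
    """Return the date `months` calendar months after `start`.

    Assumption: if the target month is shorter, the day is clamped to the
    last day of that month (e.g. 31 August + 6 months -> 28 February).
    """
    index = start.month - 1 + months
    year = start.year + index // 12
    month = index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))


def archiving_deadlines(t0):
    """Map each milestone label to its due date for an experiment ending on t0."""
    return {label: add_months(t0, offset) for label, offset in MILESTONES.items()}


if __name__ == "__main__":
    for label, due in archiving_deadlines(date(2012, 8, 31)).items():
        print(due.isoformat(), label)
```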

Technical quality
format identification – FITS tool
– JHOVE
– NLNZ Metadata Extractor
– Exiftool
– DROID
– FFident
– File utility (Windows)
format validation – JHOVE
file integrity – checksums
– MD5
? Objects with file extension '*.jpg' in PW

Technical quality
format identification – FITS tool
– JHOVE
– NLNZ Metadata Extractor
– Exiftool
– DROID
– FFident
– File utility (Windows)
format validation – JHOVE
file integrity – checksums
– MD5
true false File does not begin with SPIFF, Exif or JFIF segment offset=115
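The message "File does not begin with SPIFF, Exif or JFIF segment" points at a check on the first bytes of a file that carries a '*.jpg' extension. The sketch below is a simplified approximation of such a check, not JHOVE's actual logic: it only looks for the JPEG start-of-image marker followed by a first segment whose identifier is JFIF, Exif, or SPIFF. The function names are hypothetical.

```python
def leading_jpeg_segment(header):
    """Return 'JFIF', 'Exif' or 'SPIFF' if `header` starts like such a file.

    Simplified: expects the start-of-image marker (FF D8), one marker segment
    (FF xx plus a two-byte length), and the identifier string right after it.
    Returns None when the bytes do not match that layout.
    """
    if len(header) < 12 or header[:2] != b"\xff\xd8" or header[2] != 0xFF:
        return None
    identifier = header[6:12]
    for name in (b"JFIF\x00", b"Exif\x00", b"SPIFF\x00"):
        if identifier.startswith(name):
            return name.rstrip(b"\x00").decode("ascii")
    return None


def check_jpeg_file(path):
    """Read the first 12 bytes of a file and classify its leading segment."""
    with open(path, "rb") as handle:
        return leading_jpeg_segment(handle.read(12))


if __name__ == "__main__":
    jfif_header = b"\xff\xd8\xff\xe0\x00\x10JFIF\x00\x01\x01\x00"
    png_header = b"\x89PNG\r\n\x1a\n" + b"\x00" * 8
    print(leading_jpeg_segment(jfif_header))
    print(leading_jpeg_segment(png_header))
```

A file that fails this kind of check may still open in some viewers, which is why a trusted extension alone is not enough for format identification.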

Technical quality
format identification – FITS tool
– JHOVE
– NLNZ Metadata Extractor
– Exiftool
– DROID
– FFident
– File utility (Windows)
format validation – JHOVE
file integrity – checksums
– MD5
true true

Technical quality
format identification – FITS tool
– JHOVE
– NLNZ Metadata Extractor
– Exiftool
– DROID
– FFident
– File utility (Windows)
format validation – JHOVE
file integrity – MD5
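A file-integrity check with MD5 checksums can be sketched as follows, using Python's standard `hashlib`. This is an illustration of the general technique, not the repository's implementation; the function names and the idea of a checksum recorded at ingest are assumptions.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks
    so that large data files do not have to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_unchanged(path, recorded_md5):
    """Compare the current checksum with one recorded earlier (hypothetically,
    at ingest). A mismatch means the bytes changed since then."""
    return md5_of_file(path) == recorded_md5.strip().lower()
```

MD5 is adequate for detecting accidental corruption; it is not a safeguard against deliberate tampering.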

Curation - Path through SWAMP The harder you try to get out, the deeper you sink Image credit:

Curation - Path to SWAMP Straightforward (relatively) Way to Authorship Merit and Publication

Re-use of Data
Use of known, tested, and open formats is key to the success of any future attempt to use data.
Data Use - Using research data for the current research purpose/activity to infer new knowledge about the research subject.
Data Re-use - Using research data for a research purpose/activity other than that for which it was intended.
Data Purposing - Making research data available and fit for the current research activity.
Data Re-purposing - Making existing research data available and fit for a future known research activity.
Supporting Data Re-use - Managing existing research data such that it will be available for a future unknown research activity.
Darlington, M. (ed.) (2011a) "ERIM Terminology", version 4. University of Bath, last updated April 12,
Ball, A., Darlington, M., Howard, T., McMahon, C., Culley, S. (2012). Visualizing Research Data Records for their Better Management. Journal of Digital Information, Vol 13, No 1. Available at

Photo credits
Slide 3: Specimen 5 after axial failure at 1.0% drift in Y direction.
Slide 7: of Buffalo. Stanislav Pejša (2011) DAQ setup. Ian Prowell (2009) Vibpilot. Thumbdrive.
Slide 9: LVDT layout 1 SensorLayout_FrontView.png
Slide 21: The Fendt. Stuck on peat marshes at Hardley in Norfolk

Questions? Thank you. Comments? Standa Pejša - NEEScomm Data Curator Network for Earthquake Engineering Simulation