Data Practices across Disciplines: Informing Collections & Curation Carole L. Palmer Melissa H. Cragin, Tiffany Chao, & Nic Weber Center for Informatics.

Presentation transcript:

Data Practices across Disciplines: Informing Collections & Curation
Carole L. Palmer, Melissa H. Cragin, Tiffany Chao, & Nic Weber
Center for Informatics Research in Science & Scholarship
Graduate School of Library & Information Science
University of Illinois at Urbana-Champaign
iConference, 9 February 2011, Seattle, WA

Data Conservancy studies of scientists

Small science is big, and poorly curated

12,025 NSF grants awarded in 2007 = $2,865,388,605

Number of Grants    20%                       80%
Total Dollars       $1,747,957,451            $1,117,431,154
Range               $300,000 - $38,131,952    $579 - $300,000
(Heidorn, 2009)

Top 254 grants received 20% of the total awarded
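The arithmetic behind the slide's figures can be checked with a short sketch. The dollar amounts and grant counts below are transcribed from the slide; the variable names and the `percent` helper are illustrative assumptions, not part of the original presentation.

```python
# Figures transcribed from the slide (Heidorn, 2009).
# All names here are hypothetical and chosen for illustration only.
TOTAL_AWARDED = 2_865_388_605          # all 12,025 NSF grants awarded in 2007
TOP_20_DOLLARS = 1_747_957_451         # dollars in the "20%" column
BOTTOM_80_DOLLARS = 1_117_431_154      # dollars in the "80%" column
GRANT_COUNT = 12_025
TOP_GRANTS = 254                       # grants said to receive 20% of the total


def percent(part: int, whole: int) -> float:
    """Return part as a percentage of whole, rounded to one decimal place."""
    return round(100 * part / whole, 1)


# The two columns should add up to the stated total.
columns_sum = TOP_20_DOLLARS + BOTTOM_80_DOLLARS

# Share of all dollars held by each column of grants.
top_dollar_share = percent(TOP_20_DOLLARS, TOTAL_AWARDED)
bottom_dollar_share = percent(BOTTOM_80_DOLLARS, TOTAL_AWARDED)

# Share of all grants represented by the top 254 awards.
top_grant_share = percent(TOP_GRANTS, GRANT_COUNT)

print(f"Columns sum to stated total: {columns_sum == TOTAL_AWARDED}")
print(f"Largest 20% of grants hold {top_dollar_share}% of dollars")
print(f"Remaining 80% of grants hold {bottom_dollar_share}% of dollars")
print(f"Top {TOP_GRANTS} grants are {top_grant_share}% of all grants")
```

Read this way, the smaller awards that make up 80% of grants still account for well over a third of the funding, which is the sense in which small science is "big".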

Research questions & target domains

– What data, in what forms, are needed to advance research?
– What factors predict value for reuse of data sets?
– How do the dependencies among research communities evolve around data resources?

Earth & life science intersections, with challenging curation problems: systems geobiology - soil ecology - oceanography...
– Interdisciplinary research; need for data from outside fields, integration of data across fields and scales.
– Production and use of compound / complex data sets.
– Ingest / curation of community databases, policy and reuse issues.

Progressive data collection

Talking shop about data - efficient exchange with the right scientists about the right things

– Scientists leading research - IP, access, discovery, research context
– Pre-interview worksheets
– Semi-structured interviews
– Follow-up sessions with selected participants
– Scientists managing data - stages, versions, standards, tools (post docs, others from labs and research groups)
– Data deposit & sharing worksheet
– Data samples, related documentation

Units of analysis

Data “sets”
– Aligned with research group production and dissemination
– Workflows and services
– Policies on attribution, embargoing, etc.

Data communities
– Aligned with current and future interactions around data
– Representation, functionality, and use
– Policies for selection, appraisal, retention, description

Data communities

What are the meaningful social units for organization and use of data over the long term?

– Sub-discipline focused on particular kinds of data that produce specific measurements or analysis - (systems geobiology)
– Specialized domain focused on a research problem, often interdisciplinary in nature - (urban vulnerability)
– Developers of shared community-level data collection (i.e., “Resource Collection”, NSB 2005) - (soil science)

Core research challenge: Predict and design for communities of users, which will differ from producers, and change over time

Data curation and sharing dynamics

Data units
– Geobiology: site-specific time series: reduced spreadsheets (rock, water, microbial); microscopy images; annotated digital photographs
– Volcanology: rock profile: physical rock; thin section; chemical analysis; photographs; field notes
– Soil ecology: database: multiple abiotic soil measurements; associated metadata

User communities
– Geobiology: Geology; Chemistry; Microbiology; Genomics; U.S. Park Service
– Volcanology: Geology (igneous petrology); Geophysics; Geochemistry
– Soil ecology: Geology (biogeochemistry); Earthworm ecology; Sensor network researchers

Sharing conventions
– Geobiology: by request; no repository
– Volcanology: mostly post-publication; some unpublished by request; no repository
– Soil ecology: public resource collection

Data Curation Framework

Data Conservancy collection criteria

– Broad scope, targeted research areas / needs
  – earth sciences, life sciences, social sciences, and astronomy
– At-risk and highly unique or valuable data for target research areas
  – consistent with the traditional role of special collections
– Data with high potential for future reuse
  – Yet, producers often fail to recognize the potential for reuse by others. (Cragin, Palmer, Carlson, & Witt, Philosophical Transactions of the Royal Society A)

Hjørland’s epistemological potential of documents

– Representation (subject analysis) should go beyond description of aboutness
– Expose ability to “transfer knowledge”; this requires “understanding of which future problems can give rise to the use of the document in question” (p. 93)
– Documents can have an infinite number of properties capable of informing a user, therefore description must be informed by:
  – Analysis of contributions to various user groups, beyond the originally intended audience
  – Prioritization of the contributions with the most “long-term utility”
  – Categorizations that will function in the information system

Data as raw materials of research

– Do not transfer knowledge directly
– Processing and tools for intelligibility and interpretation
– Effort and resources to determine integrity and fit for new purpose

Curation roles in DC:
– Integrity - assessed in part by applying OAIS criteria for preservation description information.
– Fit-for-purpose - alignment with the methods and tools of a given research community.

Analytic potential of data

[Diagram labels: user communities; contributions; categorization; description; domains of interest; integrity; fit-for-purpose]

Data curation expertise

As was true with bibliographic resources, understanding future uses of data involves comprehension of:
– Particulars of data functionality and application
– Historical and cultural dynamics of research areas
– Broad cross-disciplinary epistemological trends

These are needed to address the needs of current and as-yet-unknown user groups.

Questions & comments, please
Center for Informatics Research in Science and Scholarship