Contouring Curation in Research Libraries: Defining “Working” Data Units and Communities. Carole L. Palmer, Center for Informatics Research in Science & Scholarship

Presentation transcript:

Contouring Curation in Research Libraries: Defining “Working” Data Units and Communities
Carole L. Palmer, Center for Informatics Research in Science & Scholarship
FOURTH BLOOMSBURY CONFERENCE ON E-PUBLISHING AND E-PUBLICATIONS
Valued Resources: Roles and Responsibilities of Digital Curators and Publishers
JUNE 2010

Data curation and the future of research libraries
Data assets vital for universities and research centers
- to produce competitive science and scholarship
- to be good stewards of the common good produced through research
Natural extension of research library mission
- to provide information resources to support current and future scholarship
The new stacks? (W. Tabb)
The new special collections? (S. Choudhury)
Flickr: stancia, rh creative commons flickr.com/photos/001fj/ /

Same “metascience” & specialist responsibilities on the research team & in the library (Bates 1999)
But the comprehensive and functioning infrastructure and services envisioned for interdisciplinary & multi-scale science and scholarship require information and data expertise
Provide access to and promote sharing of a broad landscape of information
- across institutions and disciplines: in the tradition of union catalogs, bibliographies of bibliographies
- across generations: long-term, just-in-case collecting

Research on range of organizational structures
Research libraries will provide direct support for some -- align with and connect to others
- local cross-departmental data – “faculty of the environment”
- geographic site cross-disciplinary data – unique research intensive location
- disciplinary “resource collections” – neuroscience case
- institutional repository services – individuals, across disciplines
- national research library initiative – Data Conservancy
Functionality will need to support “strategic reading” (Renear & Palmer, 2009), not just of literature, but of data sets as well.

Discipline based repository
Information and Discovery in Neuroscience Project (NSF/CISE, )
Tensions managing data repository efforts & scientific research activities
Depositor & user perspectives: 341 multi-scale, multi-format data sets
- cell biologists, microscopists, modelers
Used with permission from NCMIR
Important functions beyond archiving and access
- Registration, certification, awareness function (see Cragin, 2009 dissertation)
Implications for moving “research” collections to “resource” level repositories
Methods development - progressive, critical materials approach to data collection from multiple information seeking, use, and management perspectives

Institutional repository
Data Curation Profiles Project (IMLS NLG )
Individual scientist’s data production workflows and perspectives on sharing
Scott Brandt, PI; Collaborators: M. Witt & J. Carlson (Purdue); Palmer, Cragin, & Shreeves (Illinois)
- derive requirements for managing data sets in IRs
- develop policies for archiving and access
- articulate librarian roles & skill sets for supporting archiving & sharing
Biochemistry, Biology, Civil Engineering, Electrical Engineering, Food Sciences, Earth and Atmospheric Sciences, Soil Science, Anthropology, Geology, Plant Sciences, Kinesiology, Speech and Hearing

Data collection and analysis
- Interviews: with scientists and data managers
- Case Studies: with selected research groups in geology and civil engineering
- Focus Groups: with liaison librarians on their work with academic researchers related to data issues
- Needs Analysis: policy assertions for preservation and access, based on researchers as data producers, suppliers, and users
- Curation Profiles: detailed disciplinary profiles
Instrument for curatorial practice

Data Conservancy - assertion and approach
Nationally scoped research library repository
Integrated and comprehensive data curation strategy to collect, organize, validate, and preserve data to address grand research challenges that face society
Infrastructure builds on & connects existing exemplar projects and communities
- deep engagement with scientists
- extensive experience with large-scale, distributed system development
Research libraries will be a core part of the emerging, distributed network of data collections and services.

Data Conservancy.org
PI, Sayeed Choudhury, Sheridan Libraries
Network of domain and data scientists, information and computer scientists, enterprise experts, librarians, and engineers.
Co-PIs and Partners
- Carl Lagoze, Cornell University
- Mary Marlino, National Center for Atmospheric Research (NCAR)
- Carole Palmer, CIRSS, GSLIS, University of Illinois at U-C
- Paddy Patterson, Marine Biological Laboratory
- Chris Borgman, University of California Los Angeles
- Ruth Duerr, National Snow and Ice Data Center
- Mark Evans, Tessella, Inc.
- Eileen Fenton, Portico
- Sandy Payette, DuraSpace / Fedora Commons

Astronomy as an exemplar community
- Success in data standards, practices, documentation, and associated services
- Ingest astronomy data into preservation archive, connect data to existing services used by astronomers.
- Demonstrate utility of hosting data in environment that supports existing scientific capabilities in a sustainable manner.
Scope to include: life sciences, earth sciences, social sciences

Science and library based hubs
Marine Biological Laboratory
- Encyclopedia of Life - taxonomic organization, ontology indexing
- species identification queries for climate change analyses
National Snow & Ice Data Center
- extensive sensor network, fieldwork, aircraft and satellite data
- access node on the DC network, test bed for distributed services
National Center for Atmospheric Research
- civic decision making and climate science in megacities
Cornell University Library
- DataStar - promotes archiving to disciplinary data centers
- arXiv eprints - OAI-ORE to link research data with publications

Data framework
- Start with a common conceptualization that applies across domains -- scientific observation
- Examine, adapt, and adopt existing models: National Virtual Observatory, Scientific Observations Network (Sonet)
- Define fundamental concepts and identity conditions – collections, data sets, version, etc. (Data Concepts team at Illinois, led by Allen Renear)
- Accommodate range of disciplinary data and metadata standards -- dozens in earth, atmospheric, soil science alone, yet the “typical” scientist may know of none

User requirements and research

Applying quasi-profiling approach
- Data kinds and stages - sharing targets, workflow / provenance, context
- Intellectual property - owner(s), stakeholders, terms of use, attribution
- Ingest org / description – formal / local standards, documentation
- Access - embargo, access control, mirror site
- Preservation – targets, duration, migration
- Tools - analytical, visualization, integration
- Interoperability - needs, APIs, 3rd party data, etc.
- Storage, integrity, security - audits, version control
- Discovery – browse, search, external

Progressive data collection
Talking shop about data - efficient exchange with the right scientists about the right things
Scientists leading research - IP, access, discovery, research context
- Pre-interview worksheets
- Semi-structured interviews
- Follow-up sessions with selected participants
Scientists managing data - stages, versions, standards, tools (post docs, others from labs and research groups)
- Data deposit & sharing worksheet
- Data samples, related documentation

Units of analysis
Data “sets”
- aligned with research group production and dissemination workflows and services
- policies on attribution, embargoing, etc.
Data communities
- aligned with current and future interactions around data representation, functionality, and use
- policies for selection, appraisal, retention, description

Data communities
What are the meaningful social units for organization and use of data over the long term?
- Sub-discipline focused on particular kinds of data that produce specific measurements or analysis
- Specialized domain focused on a research problem, often interdisciplinary in nature
- Developers of shared community-level data collection (i.e., “Resource Collection”, NSB 2005)
Core research challenge: Predict and design for communities of users, which will differ from producers, and change over time

Systems oriented “small” science: Geobiology, Volcanology, Soil ecology
Analytical data unit
- Geobiology: site-specific time series: reduced spreadsheets (rock, water, microbial), microscopy images, annotated digital photographs
- Volcanology: rock profile: physical rock, thin section, chemical analysis, photographs, field notes
- Soil ecology: database: multiple abiotic soil measurements, associated metadata
User communities
- Geobiology: Geology, Chemistry, Microbiology, Genomics, U.S. Park Service
- Volcanology: Geology – igneous petrology, Geophysics, Geochemistry
- Soil ecology: Geology – biogeochemistry, Earthworm ecology, Sensor network researchers
Sharing conventions: by request, no repository, mostly post-publication, some unpublished, by request, no repository, public resource collection
At present, literature and conference-based sharing relationships
Individual data components required for reuse

Research informing LIS education
Preparing information professionals for range of workforce demands:
- Summer Institutes
- In-service professional development
- Biological Information Specialist
- Masters in bioinformatics
- Curation in the Humanities
- Curation in the Sciences
- MSLIS concentration in data curation
sciences, humanities,

6th International Digital Curation Conference
Chicago, IL, Dec. 6-8, 2010
hosted by CIRSS / GSLIS in partnership with Digital Curation Centre, UK
pre-conference DataNet Education Summit
post-conference LIS Research Summit

Questions & comments, please Center for Informatics Research in Science and Scholarship

Data curation is... the active and on-going management of (research) data through its lifecycle of interest and usefulness to scholarship, science, and education.
Tasks: appraisal and selection, representation, authentication, data integrity, maintaining links, format conversions
Functions: enable discovery and retrieval, maintain data quality, add value, provide for re-use over time, archiving, preservation