Data Practices across Disciplines: Informing Collections & Curation Carole L. Palmer Melissa H. Cragin, Tiffany Chao, & Nic Weber Center for Informatics.


1 Data Practices across Disciplines: Informing Collections & Curation
Carole L. Palmer, Melissa H. Cragin, Tiffany Chao, & Nic Weber
Center for Informatics Research in Science & Scholarship, Graduate School of Library & Information Science, University of Illinois at Urbana-Champaign
iConference, 9 February 2011, Seattle, WA

2 Data Conservancy studies of scientists

3 Small science is big, and poorly curated
                    20%                       80%
Number of Grants    2,405                     9,621
Total Dollars       $1,747,957,451            $1,117,431,154
Range               $300,000 - $38,131,952    $579 - $300,000
(Heidorn, 2009)
12,025 NSF grants awarded in 2007 = $2,865,388,605
Top 254 grants received 20% of the total awarded
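
The arithmetic behind the slide's point can be checked with a short sketch. It assumes the "20%" and "80%" columns refer to the largest 20% and the remaining 80% of grants by award size, which is a reading of the slide rather than something it states. The variable names and the `share` helper are illustrative only.

```python
# Figures transcribed from the slide (Heidorn, 2009).
# Assumption: "20%" = largest 20% of grants by award size, "80%" = the rest.
large_grants = {"count": 2405, "dollars": 1_747_957_451}
small_grants = {"count": 9621, "dollars": 1_117_431_154}

total_dollars = large_grants["dollars"] + small_grants["dollars"]
total_count = large_grants["count"] + small_grants["count"]


def share(part, whole):
    """Return part as a percentage of whole."""
    return 100 * part / whole


large_dollar_share = share(large_grants["dollars"], total_dollars)
small_dollar_share = share(small_grants["dollars"], total_dollars)
small_count_share = share(small_grants["count"], total_count)

print(f"Total awarded: ${total_dollars:,}")
print(f"Large grants: {large_dollar_share:.1f}% of dollars")
print(f"Small grants: {small_dollar_share:.1f}% of dollars, "
      f"{small_count_share:.1f}% of grants")
```

The two dollar columns sum to the $2,865,388,605 total given on the slide. The two grant counts sum to 12,026, one more than the 12,025 stated; the figures are used here as transcribed. Under the stated assumption, the smaller grants make up about four fifths of all awards and roughly 39% of the money, which is the sense in which small science is "big".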

4 Research questions & target domains
– What data, in what forms, are needed to advance research?
– What factors predict value for reuse of data sets?
– How do the dependencies among research communities evolve around data resources?
Earth & life science intersections, with challenging curation problems: systems geobiology - soil ecology - oceanography...
– interdisciplinary research; need for data from outside fields, integration of data across fields and scales
– production and use of compound / complex data sets
– ingest / curation of community databases, policy and reuse issues

5 Progressive data collection
Talking shop about data: efficient exchange with the right scientists about the right things
– Scientists leading research: IP, access, discovery, research context
– Pre-interview worksheets
– Semi-structured interviews
– Follow-up sessions with selected participants
– Scientists managing data: stages, versions, standards, tools (postdocs, others from labs and research groups)
– Data deposit & sharing worksheet
– Data samples, related documentation

6 Units of analysis
Data “sets”: aligned with research group production and dissemination
– workflows and services
– policies on attribution, embargoing, etc.
Data communities: aligned with current and future interactions around data
– representation, functionality, and use
– policies for selection, appraisal, retention, description

7 Data communities
What are the meaningful social units for organization and use of data over the long term?
– Sub-discipline focused on particular kinds of data that produce specific measurements or analysis (systems geobiology)
– Specialized domain focused on a research problem, often interdisciplinary in nature (urban vulnerability)
– Developers of shared community-level data collection (i.e., “Resource Collection”, NSB 2005) (soil science)
Core research challenge: predict and design for communities of users, which will differ from producers and change over time

8 Data curation and sharing dynamics
Data units
– Geobiology: site-specific time series (reduced spreadsheets: rock, water, microbial; microscopy images; annotated digital photographs)
– Volcanology: rock profile (physical rock, thin section, chemical analysis, photographs, field notes)
– Soil ecology: database (multiple abiotic soil measurements, associated metadata)
User communities
– Geobiology: Geology, Chemistry, Microbiology, Genomics, U.S. Park Service
– Volcanology: Geology – igneous petrology, Geophysics, Geochemistry
– Soil ecology: Geology – biogeochemistry, Earthworm ecology, Sensor network researchers
Sharing conventions
– Geobiology: by request; no repository; mostly post-publication, some unpublished
– Volcanology: by request; no repository
– Soil ecology: public resource collection

9 Data Curation Framework

10 Data Conservancy collection criteria
– Broad scope, targeted research areas / needs: earth sciences, life sciences, social sciences, and astronomy
– At-risk and highly unique or valuable data for target research areas: consistent with the traditional role of special collections
– Data with high potential for future reuse: yet producers often fail to recognize the potential for reuse by others (Cragin, Palmer, Carlson, & Witt. 2010. Philosophical Transactions of the Royal Society A)

11 Hjørland’s epistemological potential of documents
– Representation (subject analysis) should go beyond description of aboutness
– Expose ability to “transfer knowledge”: requires “understanding of which future problems can give rise to the use of the document in question” (p. 93)
– Documents can have an infinite number of properties capable of informing a user, therefore description must be informed by:
  – Analysis of contributions to various user groups, beyond the originally intended audience
  – Prioritization of the contributions with the most “long-term utility”
  – Categorizations that will function in the information system

12 Data as raw materials of research
– Do not transfer knowledge directly
– Processing and tools for intelligibility and interpretation
– Effort and resources to determine integrity and fit for new purpose
Curation roles in DC:
– Integrity: assessed in part by applying OAIS criteria for preservation description information
– Fit-for-purpose: alignment with the methods and tools of a given research community

13 Analytic potential of data
[Diagram; labels: user communities, contributions, categorization, description, domains of interest, integrity, fit-for-purpose]

14 Data curation expertise
As was true with bibliographic resources, understanding future uses of data involves comprehension of:
– particulars of data functionality and application
– historical and cultural dynamics of research areas
– broad cross-disciplinary epistemological trends
to address the needs of current and as yet unknown user groups.

15 Questions & comments, please
Center for Informatics Research in Science and Scholarship
clpalmer@illinois.edu
http://cirss.lis.uiuc.edu/

