
Principles for Data Citation. Micah Altman, Institute for Quantitative Social Science, Harvard University. Prepared for DataCite's Summer Meeting: Data and the Scholarly Record, the Changing Landscape, August 23-24, 2011.


1 Principles for Data Citation. Micah Altman, Institute for Quantitative Social Science, Harvard University. Prepared for DataCite's Summer Meeting: Data and the Scholarly Record, the Changing Landscape, August 23-24, 2011

2 Collaborators*
Leonid Andreev, Ed Bachman, Adam Buchbinder, Ken Bollen, Bryan Beecher, Steve Burling, Kevin Condon, Jonathan Crabtree, Merce Crosas, Gary King, Patrick King, Tom Lipkis, Freeman Lo, Jared Lyle, Marc Maynard, Nancy McGovern, Lois Timms-Ferrarra, Akio Sone, Bob Treacy
* And co-conspirators
Research Support: Thanks to the Library of Congress (PA#NDP03-1), the National Science Foundation (DMS-0835500, SES 0112072), IMLS (LG-05-09-0041-09), the Harvard University Library, the Institute for Quantitative Social Science, the Harvard-MIT Data Center, and the Murray Research Archive.

3 Related Work
M. Altman. 2008. "A Fingerprint Method for Verification of Scientific Data", in Advances in Systems, Computing Sciences and Software Engineering (Proceedings of the International Conference on Systems, Computing Sciences and Software Engineering 2007), Springer Verlag.
M. Altman and G. King. 2007. "A Proposed Standard for the Scholarly Citation of Quantitative Data", D-Lib, 13, 3/4 (March/April).
G. King. 2007. "An Introduction to the Dataverse Network as an Infrastructure for Data Sharing", Sociological Methods and Research, Vol. 32, No. 2, pp. 173-199.

4 Principles for Data Citation AKA (19 Ways of Looking at Data ^Citations)

5 Common Principles

6 Thanks to 37 Participants
Alberto Accomazzi; CFA, Harvard University
Micah Altman; IQSS, Harvard University
Peter Buneman; University of Edinburgh & University of Pennsylvania
Douglas Burke; CFA, Harvard University
Sarah Callaghan; STFC Rutherford Appleton Laboratory & CODATA-ICSTI
Todd Carpenter; NISO
Tim Clark; Harvard U.
Bonnie C. Carroll; Information International Associates
Dan Cohen; National Academies and Library of Congress
Merce Crosas; IQSS, Harvard University
Monica Duke; UK Office for Library and Information, University of Bath & SageCite
Christopher C. Erdmann; CFA, Harvard University
Martin Fenner; University of Hannover Medical School & ORCID
Consol Garcia; Universitat Politècnica de Catalunya
Paul Groth; Free University of Amsterdam & W3C
Mark Hahnel; Imperial College & NHLI
Joel Hammond; Thomson Reuters
Simon Hodson; JISC
Michelle Hudson; Yale University
John Kunze; California Digital Library
Emilie Marcus; Cell Press
Terri Mitton; OECD
August Muench; CFA, Harvard University
Pascale Cissokho Mutter; OECD
Alberto Pepe; CFA, Harvard University
Heather Piwowar; Dryad, NesCent
Jonathan Rees; Science Commons/Creative Commons
David De Roure; University of Southampton
Mackenzie Smith; MIT & Creative Commons
Gudmundur A. Thorisson; University of Leicester
Caitlin Trasande; Digital Science
Paul Uhlir; National Academies
Mary Vardigan; ICPSR & IASSIST
Todd Vision; UNC Chapel Hill
Robin King Wendler; Harvard University Library
Max Wilkinson; British Library
Keith Wollman; Cell Press
Motivations; Elements; Citing Data; Virtual Archives

7 What are we talking about?

8 Workflow
In-text Reference: [Fenner 2006] Same-old-same old?
In-Article Reference: Fenner, M. [ORCID:112358132134] Best Article Ever. Journal of I Can Haz Tenure?, Vol ε Issue δ, pp. - -- +. More better? ORCID goes where…?

9 Workflow
References
In-text Reference: [Fenner 2006] Same-old-same old?
In-Article Reference: Fenner, M. [ORCID:112358132134] Best Article Ever. Journal of I Can Haz Tenure?, Vol ε Issue δ, pp. - -- +. More better? ORCID goes where…?
Identifier Catalog Metadata. Figure 1: This is the caption of the first figure... Web resolution image. More better on steroids? Who did what part? What did each author do?

10 Workflow
References
In-text Reference: [Fenner 2006] Same-old-same old?
In-Article Reference: Fenner, M. [ORCID:112358132134] Best Article Ever. Journal of I Can Haz Tenure?, Vol ε Issue δ, pp. - -- +. More better? ORCID goes where…?
Citation Catalog Metadata. Figure 1: This is the caption of the first figure... Web resolution image. More better on steroids? Who did what part? What did each author do?
External Services. Tagged: psychoceramics. Recommendations for Martin: Greatest. Author. Ever!

11 Design Principles
- Separate scientific principles, use cases, requirements
- Distinguish syntax from presentation
- Design for ecosystem & lifecycle
- Incremental value for incremental effort
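One way to read "distinguish syntax from presentation" is that a citation's elements are stored once and every display style is derived from them. The following is a minimal sketch under that reading; the field names, the placeholder identifier, and both display styles are assumptions made for illustration and do not come from the slides.

```python
# Hypothetical citation elements; field names and values are illustrative only.
citation = {
    "authors": ["Doe, J.", "Roe, R."],
    "year": "2011",
    "title": "Example Survey Data",
    "identifier": "hdl:0000/EXAMPLE",  # placeholder, not a real identifier
    "version": "V1",
}


def render_author_date(c):
    """Author-date style: one possible ordering of the stored elements."""
    authors = "; ".join(c["authors"])
    return f'{authors} ({c["year"]}). "{c["title"]}", {c["identifier"]}, {c["version"]}.'


def render_compact(c):
    """Compact style: first author only, identifier in brackets at the end."""
    first = c["authors"][0]
    suffix = " et al." if len(c["authors"]) > 1 else ""
    return f'{first}{suffix}, {c["title"]} ({c["year"]}) [{c["identifier"]}]'


print(render_author_date(citation))
print(render_compact(citation))
```

Both functions read the same record, so a journal could change its reference style without touching the stored elements.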

12 Theory: Principles
- The article is (only) a summary: data is the research outcome
- Science is based on replicability: data is required to replicate
- Data should be as easy to discover and cite as other works

13 Theory: Principles + Core Requirements
- Data citations should be first class objects for publication: they should appear with citations and be as easy to cite as other works
- At minimum, all data necessary to understand, assess, and extend conclusions in scholarly work should be cited
- Citations should persist and enable access to a fixed version of the data for at least as long as the citing work
- Data citation should support unambiguous attribution of credit to all contributors, possibly through the citation ecosystem

14 [Diagram layers: Services/Use Cases (Access, Provenance, Discovery, Persistence); Requirements; Principles. Theory + Practice]

15 Use Cases: Attribution, Discovery, Persistence, Access, Provenance. Linking Data to Publications through Citation and Virtual Archives

16 Use Cases (details)
Attribution
- Cite data as first class work
- Identify contributors to data
Discovery
- Associate a persistent id with a work
- Locate data via identifier
- Locate data integral to article
- Locate works related to data: articles, derivatives, sources
Persistence
- Reference exists as long as referring object
- Evidence persists as long as assertions based on evidence?
- Durability of data transparent?
Access
- Citation provides for mediated access
- Access to surrogate
- On-line access to object
- Machine understandability
- Long-term human understandability
Provenance
- Associate work with version of evidence used
- Verify fixity of information
Operational Constraints?
- Syntax
- Interoperability
- Technical contexts of use
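The "verify fixity of information" use case under Provenance can be illustrated with a small check: record a fingerprint of the data when it is cited, then recompute it later and compare. This sketch uses a plain SHA-256 digest of the file bytes as a stand-in; that choice is an assumption for illustration and is not the fingerprint method listed under Related Work. The function names and sample data are hypothetical.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Hex digest of the raw bytes (SHA-256 used as an illustrative stand-in)."""
    return hashlib.sha256(data).hexdigest()


def verify_fixity(data: bytes, recorded: str) -> bool:
    """True when the data still matches the fingerprint recorded with the citation."""
    return fingerprint(data) == recorded


# Hypothetical dataset contents, fingerprinted at the time of citation.
original = b"id,score\n1,0.5\n2,0.7\n"
recorded = fingerprint(original)

print(verify_fixity(original, recorded))                # unchanged copy
print(verify_fixity(original + b"3,0.9\n", recorded))   # altered copy
```

A byte-level digest ties the check to one particular file serialization, so re-exporting the same values in another format would fail it; the slides do not say which fingerprinting approach is intended.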

17 Operational Requirements?
- Syntax
- Metadata
- Interoperability
- Core technical contexts of use

18 Shared Infrastructure & Practice - ecosystem of infrastructure, policy, practice necessary?
Actors

19 Shared Infrastructure & Practice - ecosystem of infrastructure, policy, practice necessary?
Actors
Journal Publishers
- Policies and procedures?
- Metadata?

20 Shared Infrastructure & Practice - ecosystem of infrastructure, policy, practice necessary?
Actors
Journal Publishers
- Policies and procedures?
- Metadata?
Cited Authors
- How to create new citation, identifier …?
- How to claim works?
- How to disclaim works?
- How to express relationship to works, parts, aggregations

21 Shared Infrastructure & Practice - ecosystem of infrastructure, policy, practice necessary?
Actors
Journal Publishers
- Policies and procedures?
- Metadata?
Cited Authors
- How to claim works?
- How to disclaim works?
- How to express relationship to works, parts, aggregations
Citing Authors
- How to create new citation, identifier …?

22 Use Cases + Requirements + Scientific Principles
Scientific Principles
- The article is (only) a summary of the research
- Science requires reproducibility
- Scientific disciplines require a common evidence base
Requirements
- Data citations should be first class objects for publication: appear in references; be as easy to reference as other works
- All data necessary to assess conclusions in scholarly work should be cited
- Citations should persist and enable access to fixed version of data, for as long as the citing works exist
- Citation should support unambiguous attribution of credit to all contributors (possibly through the citation ecosystem of metadata, indices, etc.)
- Separate presentation and content
Key Use Cases
- Attribution: legal & scientific
- Persistence: persistence of reference; identify responsible curators
- Access: short & long term; machine & human
- Discovery: locate instances; discover derivative, parent, and related works
- Provenance: associate scientific claim and evidence; verify fixity of evidence

23 Principles for Data Citation Simple P

24 Simple Proposal
- Semantic: Persistent ID, Author, Title, Version (or date)
- Presentation: Any style; grouped with other references; actionable in context
- Policy: If it's scientific evidence, cite it; offer credit to all contributors
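The semantic elements in the proposal (persistent ID, author, title, and a version or date) can be treated as a minimal record that a tool checks for completeness. The sketch below assumes a hypothetical `DataCitation` class, field names, and placeholder identifier; none of these come from the slides, and the check encodes only the "version (or date)" reading of the first bullet.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class DataCitation:
    """Hypothetical record holding the semantic elements named on the slide."""
    persistent_id: str
    authors: Tuple[str, ...]
    title: str
    version: Optional[str] = None   # either a version...
    date: Optional[str] = None      # ...or a date is expected

    def missing_elements(self):
        """List which of the semantic elements are absent."""
        missing = []
        if not self.persistent_id.strip():
            missing.append("persistent_id")
        if not self.authors:
            missing.append("authors")
        if not self.title.strip():
            missing.append("title")
        if not (self.version or self.date):
            missing.append("version or date")
        return missing


# Placeholder values for illustration; the identifier is not a real one.
complete = DataCitation("hdl:0000/EXAMPLE", ("Doe, J.",), "Example Survey Data", version="V1")
partial = DataCitation("", ("Doe, J.",), "Example Survey Data")

print(complete.missing_elements())
print(partial.missing_elements())
```

Keeping the record free of any display logic leaves the "any style" presentation choice to whoever renders the reference list.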

25 Contact Us: Micah Altman, maltman.hmdc.harvard.edu. The Dataverse Network®, thedata.org

