Identifiers and trust: lessons for data publishers
Valued Resources: Roles and Responsibilities of Digital Curators and Publishers
FOURTH BLOOMSBURY CONFERENCE ON E-PUBLISHING AND E-PUBLICATIONS, 24 & 25 JUNE 2010
Ongoing stewardship of scholarly content
Certification of the version of record
Version of record
Scholarly Publishing Roundtable (US House Committee on Science and Technology / White House Office of Science and Technology Policy):
"To the fullest extent possible, access should be to the definitive version of journal articles, the version of record (VoR) produced and stewarded by the publisher."
Versions & Citation
- When is something a new version?
- When does something get a new identifier?
- Focus on citation: if something will change the interpretation of a work, it gets a new identifier
- Must keep older versions - users should get to what was cited
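
The citation rule on this slide can be sketched in code. This is a minimal illustration under stated assumptions, not a description of any existing system: the `VersionRegistry` class, its method names, and the `changes_interpretation` flag are all hypothetical. A change that affects interpretation gets a new identifier, a minor fix stays under the existing one, and no earlier revision is ever discarded.

```python
class VersionRegistry:
    """Hypothetical registry: every identifier keeps its full revision history."""

    def __init__(self):
        # identifier -> list of content revisions, oldest first
        self._revisions = {}

    def register(self, identifier, content):
        """Publish content under a brand-new identifier."""
        if identifier in self._revisions:
            raise ValueError(f"identifier already in use: {identifier}")
        self._revisions[identifier] = [content]

    def revise(self, identifier, content, changes_interpretation, new_identifier=None):
        """Record a change and return the identifier that now carries it."""
        if identifier not in self._revisions:
            raise KeyError(identifier)
        if changes_interpretation:
            # The work would be read differently, so it gets a new identifier.
            if new_identifier is None:
                raise ValueError("a new identifier is required for this change")
            self.register(new_identifier, content)
            return new_identifier
        # Minor fix: same identifier, but the earlier revision is retained.
        self._revisions[identifier].append(content)
        return identifier

    def resolve(self, identifier):
        """Return the current content for a cited identifier."""
        return self._revisions[identifier][-1]

    def history(self, identifier):
        """Return every revision recorded under an identifier, oldest first."""
        return list(self._revisions[identifier])
```

A citation to the original identifier keeps resolving after a later, interpretation-changing version has been registered under its own identifier.
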
"together we can create a reality that we all agree on the reality we just agreed on…any user can change any entry, and if enough users agree with them, it becomes true."
Sir Tim told BBC News that there needed to be new systems that would give websites a label for trustworthiness once they had been proved reliable sources… "So I'd be interested in different organisations labeling websites in different ways."
Industry Problems
- The scholarly pre-publication process is largely invisible
- The common belief that the publisher's job is done on publication of the final version
- A proliferation of versions of content online that are not stewarded
- Trust metrics have not been established on the web
CrossMark
- A logo identifying a publisher-certified version of record
- Clicking the logo tells you:
  - Whether the copy is publisher-maintained and whether there have been corrections
  - Where the publisher-maintained version is
  - Other metadata the publisher chooses to include
Enables Researchers to
- Easily determine if they are looking at a publisher-maintained version of record and, if not, follow a link to the publisher version
- Easily ascertain the current status of the document and if there have been updates
- Easily access and use any non-bibliographic metadata the publisher has provided
Enables Publishers to
- Identify the publisher-maintained version of record
- Emphasize initial certification of the version of record AND ongoing stewardship
- Highlight and disseminate corrections in an industry-standard way
- Highlight other (often invisible) steps taken to ensure the trustworthiness of the content
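
The information a click on the logo is described as returning can be pictured as a small status record. The sketch below is an assumption made for illustration only: `StatusRecord`, its fields, `describe`, and the example URL are hypothetical and do not represent CrossMark's actual data model or API.

```python
from dataclasses import dataclass, field


@dataclass
class StatusRecord:
    """Hypothetical record of what a status lookup might report for one copy."""
    identifier: str
    publisher_maintained: bool          # is this copy stewarded by the publisher?
    publisher_url: str                  # where the publisher-maintained version lives
    corrections: list = field(default_factory=list)      # updates since publication
    extra_metadata: dict = field(default_factory=dict)   # anything else the publisher adds


def describe(record):
    """Turn a record into the short statements a reader would see."""
    lines = []
    if record.publisher_maintained:
        lines.append("This copy is maintained by the publisher.")
    else:
        lines.append(
            "This copy is not publisher-maintained; see " + record.publisher_url
        )
    if record.corrections:
        lines.append(
            f"{len(record.corrections)} correction(s) recorded: "
            + "; ".join(record.corrections)
        )
    else:
        lines.append("No corrections recorded.")
    # Non-bibliographic metadata is passed through as-is, in a stable order.
    for key, value in sorted(record.extra_metadata.items()):
        lines.append(f"{key}: {value}")
    return lines
```

For example, `describe(StatusRecord("example-id", False, "https://publisher.example/article"))` reports that the copy is not publisher-maintained and points to the publisher's version.
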
Things to think about
- If it's not online it doesn't exist
- If it's not linked it doesn't exist
- The identifier is only one small piece of the puzzle
- Any ID must be unique, persistent and discoverable
- Sustainable infrastructure - technical and social
Articles vs data
- CrossRef builds on existing citation practices established over 350 years
- Reward system firmly established for articles and article citation
- Not the case for data: social aspects are much harder than the technical
- Collaboration critical to interlink data and articles
- Data is different - publishers don't want it!
Conclusion
- Identifiers are tools to enable services and are useless without metadata
- Editorial selection and citation practices are critical
- More work is needed to establish trust metrics online
- Journals must establish data policies requiring deposit in appropriate repositories
Data publishers must...
- Establish trust through editorial and production processes
- Certify and steward versions of record for citation purposes (and keep old versions!) so researchers get credit
- Create a system of persistent, actionable IDs and authoritative metadata
- Develop a community and services
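
As one illustration of "persistent, actionable IDs and authoritative metadata", the sketch below turns a DOI into a resolvable link and refuses to create a record that lacks metadata. It assumes DOIs as the identifier scheme and the public `https://doi.org/` resolver; the `REQUIRED_FIELDS` list, the function names, and the example DOI are hypothetical choices made for the example, not a standard.

```python
from urllib.parse import quote

DOI_RESOLVER = "https://doi.org/"

# Assumed minimum metadata for this sketch; a real policy would define its own.
REQUIRED_FIELDS = ("title", "creator", "publisher", "year")


def actionable_url(doi):
    """Make an identifier actionable by turning it into a resolvable link."""
    doi = doi.strip()
    if not doi.startswith("10.") or "/" not in doi:
        raise ValueError(f"not a DOI-shaped identifier: {doi!r}")
    # Percent-encode unusual characters but keep the prefix/suffix slash.
    return DOI_RESOLVER + quote(doi, safe="/")


def make_record(doi, metadata):
    """Pair an identifier with its metadata; reject identifiers that have none."""
    missing = [name for name in REQUIRED_FIELDS if not metadata.get(name)]
    if missing:
        raise ValueError("identifier without metadata: missing " + ", ".join(missing))
    return {"id": doi.strip(), "url": actionable_url(doi), **metadata}
```

Rejecting incomplete records at creation time reflects the point made in the conclusion: an identifier on its own enables no services.
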