Published by Jewel Baker. Modified over 9 years ago.
1
An Open Access publisher’s perspective on data publishing
Matthew Cockerill, Managing Director, BioMed Central
Dryad-UK meeting, HEFCE, London, 28 April 2010
2
About BioMed Central
Largest publisher of peer-reviewed open access research journals
Launched first open access journals in 2000
Part of Springer since October 2008
Now publishes 207 OA titles
~70,000 peer-reviewed OA articles published
All research articles Creative Commons licensed
Costs covered by “article processing charge” (APC)
3
Data is a first-class citizen in BioMed Central publications
Electronic version of article is authoritative
“Additional files”, not “Supplementary material”
Additional files can be central to the reported findings of the paper
Where possible, a file is presented in a convenient embedded form (movies, chemical structures, KML etc.) while also being made downloadable
“Mini-websites” provide a generic (too generic?) approach for presentation of complex data
4
Efficient online publication processes can facilitate dataset publication
Only a fraction of experimental datasets make it into the literature
Many more datasets have the potential to be useful, but do not warrant a traditional publication
For certain standard types of data, appropriate databases exist (e.g. nucleotide sequences)
But what if no such database exists, or if further description of the experimental context is required?
6
Plans to extend reusability of data
BioMed Central aims to provide more explicit guidelines to facilitate data reuse, both generic and specific to particular disciplines and formats
Making authors’ original vector-based figure files available expands their reuse capability
Similar possibility with data: make any table of data from within articles conveniently downloadable in spreadsheet form
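As a rough illustration of the last point, the sketch below turns an HTML table into CSV text that a spreadsheet can open. It is an assumption about how such a feature could work, not a description of BioMed Central's system: the names `TableExtractor` and `table_to_csv` are hypothetical, the sample table is invented, and only the first table in the fragment is handled.

```python
import csv
import io
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect the rows of the first <table> in an HTML fragment (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.rows = []          # finished rows, each a list of cell strings
        self._row = None        # row currently being built, or None
        self._cell = None       # text pieces of the current cell, or None
        self._tables_seen = 0   # rows are only read while this equals 1

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self._tables_seen += 1
        elif tag == "tr" and self._tables_seen == 1:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Collapse runs of whitespace inside a cell to single spaces.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def table_to_csv(html):
    """Return the first table in `html` as CSV text."""
    parser = TableExtractor()
    parser.feed(html)
    parser.close()
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(parser.rows)
    return out.getvalue()


# Invented sample data, for illustration only.
SAMPLE = (
    "<table>"
    "<tr><th>Sample</th><th>Value</th></tr>"
    "<tr><td>A</td><td>1.5</td></tr>"
    "<tr><td>B</td><td>2.0</td></tr>"
    "</table>"
)

if __name__ == "__main__":
    print(table_to_csv(SAMPLE), end="")
```

Using the standard `csv` writer means cells that contain commas or quotes are escaped correctly, which matters for tables that include free-text columns.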
7
Scientific cloud computing
Bioinformaticists have been rapid adopters of cloud computing (as they were of the web)
Cloud computing can reduce the barriers to reproducibility
Publications can include or refer to the necessary datasets and the computational tools that can be fired up to carry out or reproduce the analysis
Large datasets can live in the cloud: take the analysis to the data, rather than vice versa
8
Preservation
Publishers not best placed to run repositories for long-term preservation of large datasets
Mirrors of publisher content not able to accept arbitrary amounts of additional data
Long-term preservation presents a challenge with respect to continuity
Redundant international mirrors with independent governance and funding could help to reduce risk
10
Huge culture variation between disciplines
Value is maximized if everyone shares data
But cultural norms vary heavily by discipline
Prisoner’s dilemma: if no one else is sharing their data, you have little to gain, and much to lose, by sharing your own data
Funders are theoretically well placed to enforce norms for sharing data
But effectiveness of funder data sharing policies is questionable
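The prisoner's dilemma point can be made concrete with a toy two-researcher model. The payoff numbers below are purely illustrative assumptions, not figures from the talk; they are chosen only so that withholding is each researcher's best individual move while mutual sharing gives the highest combined value. The names `PAYOFF`, `best_response` and `total_value` are hypothetical.

```python
# Payoff to "me" for (my choice, other's choice). Illustrative numbers only.
PAYOFF = {
    ("share", "share"): 3,        # both benefit from each other's data
    ("share", "withhold"): 0,     # I give data away and get nothing back
    ("withhold", "share"): 5,     # I use their data and keep my own advantage
    ("withhold", "withhold"): 1,  # status quo: nobody gains much
}
CHOICES = ("share", "withhold")


def best_response(other):
    """Return the choice that maximizes my payoff given the other's choice."""
    return max(CHOICES, key=lambda mine: PAYOFF[(mine, other)])


def total_value(a, b):
    """Combined payoff of both researchers for choices a and b."""
    return PAYOFF[(a, b)] + PAYOFF[(b, a)]


if __name__ == "__main__":
    for other in CHOICES:
        print(f"If the other researcher chooses {other!r}, "
              f"my best response is {best_response(other)!r}")
    for a in CHOICES:
        for b in CHOICES:
            print(f"{a:>8} / {b:<8} total value = {total_value(a, b)}")
```

Under these assumed payoffs, withholding is the best response to either choice, yet the combined value is highest when both share. That gap is why an outside party such as a funder might be needed to enforce a sharing norm.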
11
Data sharing in medicine
Clinical trial data is one example of data which presents challenges re: privacy and consent
Perfect anonymization often impossible, certainly not without losing key aspects of the data
Increasing collection of genomic data in trials accentuates this issue
Trial consent should include info re: limits of anonymizability
Full access to underlying dataset could be made available for approved research purposes