Making research findings visible – the future of the scientific paper
Matthew Cockerill
Publisher, BioMed Central
"There is nothing more amusing than watching business interests work themselves up into a righteous frenzy over a threat to their monopoly profits from a new technology or some upstart with a different business model. Invariably, the monopolists… try to present themselves as champions of the consumer, or defenders of a level playing field, as if they hadn't become ridiculously rich by sticking it to consumers and enjoying years in which the playing field was tilted to their advantage." Steven Pearlstein in the Washington Post, July
Status of open access publishing
Momentum for transition to OA
We are seeing action (not just words) from funding agencies and governments
–Wellcome and several UK research councils now require OA deposit as a condition of grants
–Federal Research Public Access Act may do the same in the US
OA journals continue to grow rapidly
Impressive impact factors demonstrate that OA and quality are absolutely compatible
The move to OA is basically unstoppable
Growth of OA
Impact factors
Genome Biology – IF 9.71
BMC Bioinformatics – IF 4.96
BMC Genomics – IF 4.09
Genome Biology is:
10th of 124 in GENETICS & HEREDITY
4th of 139 in BIOTECHNOLOGY & APPLIED MICROBIOLOGY
What does this mean for the future of the scientific article?
Why did we start BioMed Central as an open access publisher?
Limited access to research articles makes further research needlessly inefficient
Barriers to access obstruct interdisciplinary cross-fertilization
It is in the interest of researchers for their research to be read and cited as widely as possible
Traditional scientific publishing is not an effective market, and so high serials prices mean a poor deal for the scientific community
The main reason we started BioMed Central
Publications and data are a continuum
Publications include data
Publications are data
To make sense of data and publications delivered by post-genomic science, we need
–The best possible tools
–The widest possible collection of raw material
Open access stimulates the creation of tools by providing access to the raw material
The future of the scientific article
Computers will be at least as important as human readers
Text mining
Open access facilitates text mining
BioMed Central XML corpus of full text articles is freely downloadable
The more semantics that are captured in the XML, the richer the possibilities for mining
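A minimal sketch of why richer XML helps mining, assuming a made-up article format. The element names (`article`, `body`, `p`, `gene`), the `id` attribute, and the `EX:` identifiers are hypothetical and are not the actual BioMed Central schema. With semantic tags, entities can be counted by identifier; without them, a miner falls back to splitting free text into words.

```python
import re
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical article markup: element names, attributes and IDs are illustrative only.
SAMPLE_ARTICLE = """
<article>
  <title>Example article</title>
  <body>
    <p>Expression of <gene id="EX:G1">geneA</gene> rose after treatment.</p>
    <p>Both <gene id="EX:G1">geneA</gene> and <gene id="EX:G2">geneB</gene> were measured.</p>
  </body>
</article>
"""


def tagged_entities(xml_text, tag="gene"):
    """Count entities that the XML marks up explicitly, keyed by identifier."""
    root = ET.fromstring(xml_text)
    return Counter(element.get("id") for element in root.iter(tag))


def free_text_words(xml_text):
    """Fallback: flatten the body to plain text and split it into lowercase words."""
    root = ET.fromstring(xml_text)
    body = root.find("body")
    text = " ".join(body.itertext())
    return re.findall(r"[a-z0-9]+", text.lower())


if __name__ == "__main__":
    print(tagged_entities(SAMPLE_ARTICLE))
    print(Counter(free_text_words(SAMPLE_ARTICLE)).most_common(3))
```

The tagged route yields unambiguous identifiers, while the free-text route only yields strings that still need to be disambiguated.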
Existing examples of automated sifting of published research
Postgenomic
CiteULike
This is just bibliographic information – but it's a start
Semantic enrichment
Ensure that the rest of the knowledge represented in scientific articles is structured to be computer-readable
Ideally capture semantics unambiguously at time of publication
Mining of free text is a stopgap/fall-back
It is not just articles that need semantic enrichment, but data sets too
Appropriate standards are now emerging
RDF
Useful common technical standard for expressing semantics
Subject-predicate-object triples
BioMed Central already exposes bibliographic RDF for all articles
Tools like PiggyBank can capture RDF and then store it in triple-stores (local or networked)
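To make the subject-predicate-object idea concrete, here is a small in-memory sketch in plain Python. It is not a real RDF library and not how PiggyBank works internally; the `ex:` URIs, the `dc:title`/`dc:creator` predicate labels, and the class names `Triple` and `TripleStore` are assumptions chosen for illustration.

```python
from typing import List, NamedTuple, Optional


class Triple(NamedTuple):
    """One statement: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str


class TripleStore:
    """Toy triple store: a set of triples plus wildcard pattern matching."""

    def __init__(self) -> None:
        self._triples = set()

    def add(self, subject: str, predicate: str, obj: str) -> None:
        self._triples.add(Triple(subject, predicate, obj))

    def match(self, subject: Optional[str] = None,
              predicate: Optional[str] = None,
              obj: Optional[str] = None) -> List[Triple]:
        """Return triples matching the pattern; None acts as a wildcard."""
        return sorted(
            t for t in self._triples
            if (subject is None or t.subject == subject)
            and (predicate is None or t.predicate == predicate)
            and (obj is None or t.obj == obj)
        )


def build_example_store() -> TripleStore:
    """Load a few made-up bibliographic statements."""
    store = TripleStore()
    store.add("ex:article/1", "dc:title", "Example title one")
    store.add("ex:article/1", "dc:creator", "ex:person/a")
    store.add("ex:article/2", "dc:title", "Example title two")
    store.add("ex:article/2", "dc:creator", "ex:person/a")
    return store


if __name__ == "__main__":
    # Which articles list ex:person/a as creator?
    for triple in build_example_store().match(predicate="dc:creator", obj="ex:person/a"):
        print(triple.subject)
```

Because every statement has the same three-part shape, bibliographic facts from different publishers can be merged into one store and queried together.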
Semantic Laundry List
Scientific stuff
–Genes
–Proteins
–Anatomy
–Taxonomy
–Small molecules/drugs
–Macromolecules
–Diseases
–Experimental methodologies
–Experimental data types
General stuff
–People, Places, Organizations, Relationships
NCBO
Example of enriched research
Neurocommons.org
A ScienceCommons project
Working with open access articles from BioMed Central and PLoS
Attempting to define best practices/gold standard for semantic enrichment of articles
Text mining and enhanced authoring tools both have a role
The role of wikis
The challenge:
Ontologies, to be useful, must stay up-to-date and receive ongoing maintenance and curation
Scope of problem is enormous - every entity and relationship of relevance to science
Wikis provide a promising approach - perhaps the only viable approach
e.g. AuthorIDs
Projects at BioMed Central to capture structured info
Case reports
Clinical trials
Biological processes
Chemical structures
Taxonomic descriptions
Publishing research articles in a more structured form allows the results to be treated as a database
Structured authoring
Publicon – an experiment in structured authoring
Benefits of structure
Live maths in articles
Problem – adding structure is a hassle
Incentivize authors
Ideally, create structured authoring tools that remove work rather than add it (e.g. EndNote)
If you do create extra work for authors, find a way to provide the author with an immediate return on investment
Reduce work - smart authoring
e.g. auto suggest
Standard way to disambiguate contacts
Why not chemicals, genes, species too?
–Unambiguously capture semantics
–Increase accuracy, save time, encourage uptake
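A rough sketch of auto-suggest applied to scientific terms rather than contacts. The vocabulary, the `EX:` identifiers, and the `suggest` function are all hypothetical; a real tool would draw on a curated ontology or registry. The point is that the author types a few letters and picks an entry, so the article stores an unambiguous identifier and not just a string.

```python
# Hypothetical controlled vocabulary: labels mapped to made-up identifiers.
VOCABULARY = {
    "aspirin": "EX:CHEM:1",
    "aspartame": "EX:CHEM:2",
    "Homo sapiens": "EX:TAXON:1",
    "Homo neanderthalensis": "EX:TAXON:2",
}


def suggest(prefix, vocabulary, limit=5):
    """Return up to `limit` (label, identifier) pairs whose label starts with `prefix`.

    Matching ignores case; an empty prefix returns no suggestions.
    """
    key = prefix.casefold()
    if not key:
        return []
    matches = [
        (label, identifier)
        for label, identifier in vocabulary.items()
        if label.casefold().startswith(key)
    ]
    # Sort alphabetically, ignoring case, so suggestions appear in a stable order.
    return sorted(matches, key=lambda pair: pair[0].casefold())[:limit]


if __name__ == "__main__":
    for label, identifier in suggest("homo", VOCABULARY):
        print(f"{label} -> {identifier}")
```

A linear scan is enough for a sketch; a large vocabulary would more likely use a sorted index or a trie.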
Return on investment
Automatic update of meta-analysis based on clinical trial data
Automatic deposit of taxonomic information in registry (Zoobank)
Automatic list of closely-related case reports from database
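As one possible illustration of an automatically updated meta-analysis, the sketch below pools trial results with a fixed-effect, inverse-variance weighted average. The slides do not say which method would be used, so this choice is an assumption, and the function name `pooled_estimate` and the trial numbers are invented for the example. If each trial is published as structured data (effect estimate plus standard error), re-running the pooling when a new trial appears is a single function call.

```python
import math


def pooled_estimate(trials):
    """Fixed-effect inverse-variance pooling.

    `trials` is a list of (effect, standard_error) pairs.
    Returns (pooled_effect, pooled_standard_error).
    """
    if not trials:
        raise ValueError("at least one trial is required")
    if any(se <= 0 for _, se in trials):
        raise ValueError("standard errors must be positive")
    # Each trial is weighted by the inverse of its variance.
    weights = [1.0 / (se ** 2) for _, se in trials]
    total_weight = sum(weights)
    pooled = sum(w * effect for w, (effect, _) in zip(weights, trials)) / total_weight
    return pooled, math.sqrt(1.0 / total_weight)


if __name__ == "__main__":
    # Made-up numbers, for illustration only.
    trials = [(0.2, 0.1), (0.4, 0.1)]
    print(pooled_estimate(trials))
    # A newly published structured trial triggers an updated estimate.
    trials.append((0.6, 0.1))
    print(pooled_estimate(trials))
```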
Q & A