The XML mark up process from the viewpoint of a biodiversity publisher Lyubomir Penev, Donat Agosti, Teodor Georgiev, Terry Catapano, Vladimir Blagoderov,


1 The XML mark up process from the viewpoint of a biodiversity publisher Lyubomir Penev, Donat Agosti, Teodor Georgiev, Terry Catapano, Vladimir Blagoderov, David Roberts, Vincent S. Smith, Norman F. Johnson, Guido Sautter, Robert A. Morris, Vishwas Chavan, Tim Robertson, Pavel Stoev, Jeremy Miller, Sandra Knapp, Cynthia Parr, W. John Kress, Terry Erwin

2 The four stages of the XML-based editorial workflow
- SUBMISSION – tagged or non-tagged manuscripts?
- PEER-REVIEW/EDITORIAL – the technical challenges of the mark up process
- PUBLICATION – the different publishing formats and to whom they are addressed
- DISSEMINATION and USE – Link yourself or perish!

3 Quick facts about ZooKeys
- Launched on 4th of July 2008; 60 issues and >11,000 pages published so far; 4,400 registered users
- The first mandatory Open Access journal in taxonomy
- ZooKeys registers all new taxa in ZooBank (mandatory from the 1st issue); all new taxon descriptions are supplied through XML to EOL; all taxon treatments are supplied to Plazi; Wikispecies registers all new taxa as well
- Since July 2010, ZooKeys has used an XML, TaxPub-based editorial workflow
- The journal partners with GBIF, EOL, BHL, NLM, NHM, Plazi and others in various innovative publishing and dissemination projects
- CrossRef member; covered by ISI and Scopus; indexed in Zoological Record, DOAJ, CABI Abstracts, Google Scholar; approved for archiving in PubMedCentral
- Impact Factor 1.133 in the 3rd year of existence

4 Semantic tagging: What do we currently have at our disposal?
- Plazi’s TaxonX (mainly for legacy literature) and NLM TaxPub (for prospective publishing) as published working XML schemas; TaXMLit is being developed as a promised TDWG standard
- Domain-specific mark up tools: Golden Gate (TaxonX), and PMT (Pensoft Mark Up Tool) (based on NLM TaxPub, any other schema)
- TDWG standards and vocabularies approved (DarwinCore) or under discussion (TaxonConcept, SPM, TaXMLit, etc.)
- RDF/XML and OWL links to external resources (ontologies), LSIDs, GUIDs, etc., are quickly gaining popularity and acceptance!
- ECAT, GNA, GNI, GNUB promise an exciting infrastructure for taxon names at global level
- BHL CiteBank promises a similarly exciting infrastructure for biodiversity literature references
- Semantic Web enhancements to taxonomic papers promote connections; data publication seems set to become an indispensable part of taxonomic papers soon
- Dissemination of published results becomes at least as important as the publication itself!

5 Four stages of an XML-based publication and dissemination work flow
[Workflow diagram; the labels below are listed as extracted, and the connections between them are not recoverable from the transcript]
- Scratchpads and Lifedesk-generated manuscripts
- GBIF IPT-generated manuscripts from metadata descriptions
- Manuscripts generated from authors’ databases
- Manuscripts marked up with MS Word & Open Office plugins
- Upfront pre-submission mark up (tagged manuscripts)
- Non-tagged manuscripts
- PENSOFT MARK UP TOOL (PMT), InDesign layout
- Mark up integrated simultaneously with the peer-review, editorial and publication process
- Marked up final publication in PDF, HTML and XML formats
- Legacy publications: PDF, HTML, OCR scans
- PLAZI’s GOLDEN GATE (GG)
- Post-publication mark up of legacy literature
- Marked up publication and treatments in HTML and XML formats
- Dissemination, archiving, indexing, harvesting: ISI, Zoological Record; Indexing (GBIF, GNA, etc.); Aggregators (EOL, WIKI, etc.); PubMedCentral & other archives

6 Pensoft Mark Up Tool (PMT) work flow

7

8

9 But why mark up? Who will be using it? What does it offer beyond the usual PDF?

10 The four publication formats and their targets
- Print (high-resolution, full-color, identical to the PDF): to provide paper archiving and to satisfy the requirements of the Biological Codes
- PDF: electronic copy of the printed version; for e-archiving (personal and institutional libraries; BHL); easy to read, browse and search
- HTML: addressed to individual users to provide interactive reading and semantic enhancements; saves readers time and effort through cross-referencing, Web harvesting, linking to external resources, etc.
- XML: to provide institutions (repositories, e-archives, aggregators, indexers, taxon-oriented web platforms, etc.) with a format for archiving, data mining, and data use/re-use
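To illustrate what the XML format offers a machine consumer, here is a minimal Python sketch that pulls tagged taxon names out of a treatment. The sample fragment is hypothetical and heavily simplified: its element names and namespace are modeled on TaxPub but should be treated as assumptions, not as a copy of the published schema, and the name "Aus bus" is a placeholder.

```python
import xml.etree.ElementTree as ET

# Namespace assumed here for the TaxPub extension elements.
NS = {"tp": "http://www.plazi.org/taxpub"}

# Hypothetical, heavily simplified treatment fragment (not a real article).
SAMPLE = """\
<article xmlns:tp="http://www.plazi.org/taxpub">
  <body>
    <tp:taxon-treatment>
      <tp:nomenclature>
        <tp:taxon-name>
          <tp:taxon-name-part taxon-name-part-type="genus">Aus</tp:taxon-name-part>
          <tp:taxon-name-part taxon-name-part-type="species">bus</tp:taxon-name-part>
        </tp:taxon-name>
      </tp:nomenclature>
    </tp:taxon-treatment>
  </body>
</article>
"""


def taxon_names(xml_text):
    """Return one dict per tagged taxon name, mapping part type to text."""
    root = ET.fromstring(xml_text)
    names = []
    for name in root.iterfind(".//tp:nomenclature/tp:taxon-name", NS):
        parts = {
            part.get("taxon-name-part-type"): (part.text or "").strip()
            for part in name.iterfind("tp:taxon-name-part", NS)
        }
        names.append(parts)
    return names


if __name__ == "__main__":
    for parts in taxon_names(SAMPLE):
        # Join the parts in document order to rebuild the binomial.
        print(" ".join(parts.values()))
```

Because each name part carries its own type, an aggregator or indexer can pick out genus and species directly instead of guessing them from running text, which is the kind of re-use a PDF cannot offer.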

11 Semantic enhancements to published texts

12

13 The occurrence dataset in Google Earth
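As a rough sketch of how marked-up occurrence records could end up in Google Earth, the Python below converts a list of records into a KML document. The records are hypothetical; the field names follow Darwin Core terms (scientificName, decimalLatitude, decimalLongitude), and nothing here is claimed to be the conversion actually used for ZooKeys.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"

# Hypothetical occurrence records keyed by Darwin Core term names.
OCCURRENCES = [
    {"scientificName": "Aus bus", "decimalLatitude": 42.70, "decimalLongitude": 23.32},
    {"scientificName": "Aus bus", "decimalLatitude": 41.50, "decimalLongitude": 24.10},
]


def occurrences_to_kml(records):
    """Build a KML string with one Placemark per occurrence record."""
    # Serialize KML elements in the default namespace (no prefix).
    ET.register_namespace("", KML_NS)
    kml = ET.Element(f"{{{KML_NS}}}kml")
    doc = ET.SubElement(kml, f"{{{KML_NS}}}Document")
    for rec in records:
        placemark = ET.SubElement(doc, f"{{{KML_NS}}}Placemark")
        ET.SubElement(placemark, f"{{{KML_NS}}}name").text = rec["scientificName"]
        point = ET.SubElement(placemark, f"{{{KML_NS}}}Point")
        # KML expects longitude first, then latitude.
        ET.SubElement(point, f"{{{KML_NS}}}coordinates").text = (
            f'{rec["decimalLongitude"]},{rec["decimalLatitude"]}'
        )
    return ET.tostring(kml, encoding="unicode")


if __name__ == "__main__":
    print(occurrences_to_kml(OCCURRENCES))
```

Saving the returned string to a `.kml` file gives something Google Earth can open. The conversion is only this short because the coordinates are already tagged as separate fields in the source records.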

14 Automated dissemination of published contents to GBIF, EOL, Plazi, etc.

15 Archiving in PubMedCentral (as TaxPub XML, PDF and separate image files)

16 The lessons learned
- The main difficulties are caused by:
  - The specificity of the domain (e.g., taxon names, synonyms, instability of nomenclature, lack of global LSID infrastructure, etc.)
  - Mark up of occurrence data (certainly a great challenge)
  - Cost efficiency
  - Sociological barriers: the majority of authors are not willing to change their writing habits; most are still unaware of the tremendous advantages of Web 2.0 technologies
- Most small taxonomy publishers (and some bigger ones) do not understand semantic tagging or simply cannot afford it
- Semantic tagging of and semantic enhancements to biodiversity papers are the publishers’ responsibility; publishers should better present and disseminate the published contents to the benefit of their authors and society
- It is not easy, but... it is exciting!

17 In 2010, Pensoft is committed to:
- Participate in all stages of extension, testing and implementation of the NLM TaxPub schema
- Extend the list of semantic enhancements and links to external resources by some 30%
- Implement all these practices routinely also in botany and ecology through:

