Florian Gräf Software Developer of the McEntyre group at EMBL-EBI


1 Implementing the Joint Data Citation Principles in JATS for core life sciences data resources
Florian Gräf, Software Developer in the McEntyre group at EMBL-EBI. Working on: Europe PMC, ORCID integration, THOR, OpenAIRE

2 EBI-Services Landscape
Genes, genomes & variation ArrayExpress Metabolights Expression Atlas PRIDE InterPro Pfam UniProt ChEMBL ChEBI Literature & ontologies Europe PubMed Central Gene Ontology Experimental Factor Ontology Molecular structures Protein Data Bank in Europe Electron Microscopy Data Bank European Nucleotide Archive 1000 Genomes Gene, protein & metabolite expression Protein sequences, families & motifs Chemical biology Reactions, interactions & pathways IntAct Reactome MetaboLights Systems BioModels BioSamples Enzyme Portal Ensembl Ensembl Genomes European Genome-phenome Archive Metagenomics portal
These are the core services at EBI, but there are also many research groups. Europe PMC is one of many services at EBI.

3 Europe PMC Partner of PMC International
- 30M abstracts, including PubMed
- 3.5M full-text articles
- Managing the EMBL ORCID integration
- OpenAIRE and THOR contributor

4 Why do we do this?
What does data citation (often) look like? Fig. 2, http://europepmc.org/articles/PMC3710810

5 11/27/2018 Why did we start text mining?
- We are interested in the context of data mentions / data citations
- Accessions are often not tagged by publishers

6 Our Route to Data Citation
- Data-Literature Integration: OpenAIRE, THOR
- JATS v1.1: data citation extension
- Text mining accessions
- Force11 meeting, Amsterdam
Data tagging in publications? Often poor or nonexistent, so better data-literature integration is the goal. Jee Hyub learned about the F11 DCP at a Force11 meeting in Amsterdam, which led us to start text mining accessions. We contributed to the data citation integration in JATS v1.1. We are engaged in OpenAIRE (data-literature integration) and THOR.
Photo by James, Wheeler; “Yoho Road” [

7 OpenAIRE 2020 Task 7.3 – Data-Literature Integration
- Evaluation of the status quo and how to improve it
- What does data citation look like now?
- To what degree can the DCP be satisfied today?
- How can data-literature links be presented?
- How can they be machine readable?
- Implementing the DCP in JATS v1.1
- Building a proof-of-concept tool that provides JATS XML from an accession and a database name
- Contribution to the Data-Literature interlinking service
Status quo: what does data citation look like? We have seen this.

8 F11 DCP Compliance
Columns: evaluated metadata; ENA; PDB
- Sample size: Open Access set from Europe PMC, text-mined accessions; 70k; 16.5k
- Credit, Attribution: Data Repository, Submitters; ~94%; 100%
- Unique Identification, Access, Persistence: Accession
- Specificity, Verifiability: Version, (Modification Date); ~49.2%
- Overall: All above; ~98%; ~83%; No versions
Evaluation: Interoperability and Flexibility are fulfilled by the JATS XML format.
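
One plausible reading of the percentages on this slide is the share of records whose metadata carries every field a principle needs. The sketch below illustrates that reading only; the field names (`repository`, `submitters`, `accession`, `version`), the rule that "Overall" requires all fields, and the toy records are assumptions for illustration, not the evaluation code or data behind the slide.

```python
# Assumed mapping from principle to the metadata fields it requires.
# The grouping follows the slide; the key names are hypothetical.
PRINCIPLE_FIELDS = {
    "Credit, Attribution": ["repository", "submitters"],
    "Unique Identification, Access, Persistence": ["accession"],
    "Specificity, Verifiability": ["version"],
}


def compliance_rate(records, required_fields):
    """Fraction of records in which every required field is present and non-empty."""
    if not records:
        return 0.0
    compliant = sum(
        1
        for record in records
        if all(record.get(field) for field in required_fields)
    )
    return compliant / len(records)


def evaluate(records):
    """Compute a compliance rate per principle plus an overall rate."""
    rates = {
        principle: compliance_rate(records, fields)
        for principle, fields in PRINCIPLE_FIELDS.items()
    }
    # Assumption: "Overall" means that all fields above are present.
    all_fields = [f for fields in PRINCIPLE_FIELDS.values() for f in fields]
    rates["Overall"] = compliance_rate(records, all_fields)
    return rates


# Toy records, invented for illustration only.
toy_records = [
    {"repository": "PDB", "submitters": ["A. Author"], "accession": "3g76", "version": "1"},
    {"repository": "PDB", "submitters": [], "accession": "0xyz"},
]
rates = evaluate(toy_records)
```

Run over real records, a function like `evaluate` would produce one rate per row of the table, one repository at a time.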

9 Accession2Jats Prototype Workflow
Input: Repository: PDB, Accession: 3g76. Retrieve metadata from public API.
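
The first step of this workflow, turning a repository name and an accession into a metadata request, could be sketched roughly as follows. The URL templates point at a placeholder host (`api.example.org`) and are hypothetical; the actual prototype may call different endpoints and parse a different response format.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Hypothetical endpoint templates, used as placeholders only.
ENDPOINTS = {
    "PDB": "https://api.example.org/pdb/entry/{accession}",
    "ENA": "https://api.example.org/ena/record/{accession}",
}


def build_metadata_url(repository, accession):
    """Return the metadata URL for an accession in a known repository."""
    key = repository.strip().upper()
    if key not in ENDPOINTS:
        raise ValueError(f"unsupported repository: {repository!r}")
    # Percent-encode the accession so it is safe inside a URL path.
    return ENDPOINTS[key].format(accession=quote(accession.strip(), safe=""))


def fetch_metadata(repository, accession, opener=urlopen):
    """Fetch and decode a JSON metadata record (needs network access by default)."""
    with opener(build_metadata_url(repository, accession)) as response:
        return json.load(response)
```

Passing `opener` as a parameter keeps the lookup testable without network access: a stub that returns a file-like object can stand in for `urlopen`.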

10 JATS 1.1 Data Citation Example

11 Accession2Jats Prototype Workflow
Input: Repository: PDB, Accession: 3g76. Retrieve metadata from public API. XSL transformation. Output: Cossu F., Milani M., Mastrangelo E., Bolognesi M. (27 Oct 2009). Crystal structure of XIAP-BIR3 in complex with a bivalent compound. PDB 3g76 [
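
To give a feel for the last step, here is a minimal sketch that builds a JATS-style data citation from the metadata shown on this slide. It uses Python's `xml.etree.ElementTree` rather than the XSL transformation the prototype uses, and the choice and order of elements (`element-citation` with `publication-type="data"`, `data-title`, `source`, and `pub-id` with `pub-id-type="accession"`) is an assumption about how such a citation might be tagged, not the prototype's actual output.

```python
import xml.etree.ElementTree as ET


def build_data_citation(authors, year, title, repository, accession):
    """Return a JATS-style <element-citation> element for one dataset.

    authors is a list of (surname, given_names) pairs.
    """
    citation = ET.Element("element-citation", {"publication-type": "data"})
    group = ET.SubElement(citation, "person-group", {"person-group-type": "author"})
    for surname, given_names in authors:
        name = ET.SubElement(group, "name")
        ET.SubElement(name, "surname").text = surname
        ET.SubElement(name, "given-names").text = given_names
    ET.SubElement(citation, "year").text = str(year)
    ET.SubElement(citation, "data-title").text = title
    ET.SubElement(citation, "source").text = repository
    # Assumed tagging: the accession goes into <pub-id>, with the
    # repository named as the assigning authority.
    pub_id = ET.SubElement(
        citation,
        "pub-id",
        {"pub-id-type": "accession", "assigning-authority": repository},
    )
    pub_id.text = accession
    return citation


citation = build_data_citation(
    authors=[("Cossu", "F."), ("Milani", "M."), ("Mastrangelo", "E."), ("Bolognesi", "M.")],
    year=2009,
    title="Crystal structure of XIAP-BIR3 in complex with a bivalent compound",
    repository="PDB",
    accession="3g76",
)
print(ET.tostring(citation, encoding="unicode"))
```

An XSLT stylesheet would express the same mapping declaratively; the Python version is used here only because it is easy to run and inspect.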

12 And now?
- Complete prototype. GitHub: FlorianGraef/acc2jats
- Work with publishers to incorporate tools into workflows
- Contact me with feedback and suggestions
Next steps: we are interested in working with publishers on how to incorporate tools based on what I have just described into their workflows.

