Francesco Osborne KMi, The Open University, United Kingdom April 2016 Two roads to semantic publishing.

1 Francesco Osborne KMi, The Open University, United Kingdom April 2016 Two roads to semantic publishing

2 The destination
“The Semantic Web will likely profoundly change the very nature of how scientific knowledge is produced and shared, in ways that we can now barely imagine.” (T. Berners-Lee)
“Researchers will benefit from better, faster, cheaper access to data related to publications, enhancing the capacity for in silico meta-research.” (D. Shotton)

3 Two roads
Semantic Publishing

4 Two roads
Semantic Publishing
Producing machine-readable research publications
– Technical challenge: Creating user-friendly formats/tools/vocabularies
– Political challenge: Convincing publishers, authors and other stakeholders to use them

5 Two roads
Semantic Publishing
Producing machine-readable research publications
– Technical challenge: Creating user-friendly formats/tools/vocabularies
– Political challenge: Convincing publishers, authors and other stakeholders to use them
Extracting semantic information from text
– Technical challenges: Information extraction, entity linking, ontology mapping/learning and so on…
– Political challenge: Data Access

7 Three roads to semantic publishing?
Semantic Publishing
Producing machine-readable research publications
– Technical challenge: Creating user-friendly formats/tools/vocabularies
– Political challenge: Convincing publishers, authors and other stakeholders to use them
Extracting semantic information from text
– Technical challenges: Information extraction, entity linking, ontology mapping/learning and so on…
– Political challenge: Data Access
Showing the potential of semantic technologies
– Technical challenge: Smart analytics, semantic search, novel visualizations…
– Political challenge: Convincing major companies to do so.

9 Two roads
Semantic Publishing
Producing machine-readable research publications
– Technical challenge: Creating user-friendly formats/tools/vocabularies
– Political challenge: Convincing publishers, authors and other stakeholders to use them
Extracting semantic information from text
– Technical challenges: Information extraction, entity linking, ontology mapping/learning and so on…
– Political challenge: Data Access
My work

10 RASH
The RASH (Research Article in Simple HTML) Framework is a set of specifications and tools for writing, converting, visualising and enhancing academic articles in RASH, an HTML+RDF-based markup language for writing scholarly documents.
Not “yet another format”, but a subset of HTML (31 tags) for which it is easier to build tools.
The documentation, the tools and additional information are publicly available on GitHub.
Main developers: Silvio Peroni, Andrea Nuzzolese, Francesco Poggi
https://github.com/essepuntato/rash
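One reason a small HTML subset is easier to build tools for is that conformance checks stay simple. The sketch below is only an illustration of that idea: the `ALLOWED_TAGS` set is a hypothetical allow-list made up for this example, not the actual 31 elements defined by RASH, and a real check would rely on the RASH grammar itself.

```python
from html.parser import HTMLParser

# Hypothetical allow-list for illustration only; the real RASH grammar
# defines its own set of 31 elements.
ALLOWED_TAGS = {"html", "head", "title", "body", "section",
                "h1", "p", "a", "em", "strong"}


class TagCollector(HTMLParser):
    """Records the name of every element that is opened in a document."""

    def __init__(self):
        super().__init__()
        self.seen = set()

    def handle_starttag(self, tag, attrs):
        self.seen.add(tag)


def disallowed_tags(document):
    """Return the set of element names that are not in ALLOWED_TAGS."""
    collector = TagCollector()
    collector.feed(document)
    collector.close()
    return collector.seen - ALLOWED_TAGS


sample = ("<html><body><section><h1>Intro</h1>"
          "<p>Text <font>old markup</font></p></section></body></html>")
print(disallowed_tags(sample))
```

With a closed vocabulary like this, converters and visualisers only need to handle a known, finite set of elements.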

11 RASH
Allows the inclusion of RDF annotations (JSON-LD, RDF/XML, Turtle, RDFa)
Inspired by
– https://github.com/w3c/scholarly-html/issues/13
– Accessible Rich Internet Applications 1.1 and Digital Publishing WAI-ARIA Module 1.0
Similar projects:
– Linked Research: Capadisli, S., Riedl, R., & Auer, S. (2015). Enabling Accessible Knowledge. In Proc. of CeDEM 2015.
– ScholarlyMarkdown: Lin, T. T. Y., & Beales, G. (2015). ScholarlyMarkdown Syntax Guide. Guide, 31 January 2015. http://scholarlymarkdown.com/Scholarly-Markdown-Guide.html
Debuted at SAVE-SD 2015 (WWW)
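To make the idea of RDF annotations inside an HTML article concrete, here is a minimal sketch that embeds a JSON-LD block in a page and reads it back. It is not taken from the RASH tools: the function names, the page skeleton and the choice of schema.org terms (`ScholarlyArticle`, `headline`, `author`) are assumptions for illustration, and the output has not been validated against the RASH grammar.

```python
import json
from html import escape
from html.parser import HTMLParser


def article_with_jsonld(title, author, body_html):
    """Build a small HTML page carrying its metadata as JSON-LD."""
    metadata = {
        "@context": "https://schema.org",   # vocabulary chosen for this example
        "@type": "ScholarlyArticle",
        "headline": title,
        "author": {"@type": "Person", "name": author},
    }
    # Escape "</" so the JSON payload cannot close the script element early.
    payload = json.dumps(metadata, indent=2).replace("</", "<\\/")
    return (
        "<!DOCTYPE html>\n<html>\n<head>\n"
        f"<title>{escape(title)}</title>\n"
        f'<script type="application/ld+json">\n{payload}\n</script>\n'
        "</head>\n<body>\n"
        f"{body_html}\n"
        "</body>\n</html>\n"
    )


class JsonLdExtractor(HTMLParser):
    """Collects every JSON-LD script block found in a document."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._chunks = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._chunks = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append(json.loads("".join(self._chunks)))
            self._in_jsonld = False


def extract_jsonld(document):
    parser = JsonLdExtractor()
    parser.feed(document)
    parser.close()
    return parser.blocks


page = article_with_jsonld("Two roads to semantic publishing",
                           "Francesco Osborne",
                           "<section><h1>Introduction</h1><p>Text.</p></section>")
print(extract_jsonld(page)[0]["headline"])
```

The same metadata could equally be written as RDFa attributes or linked Turtle; JSON-LD is used here only because it round-trips easily with the standard library.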

12 Venues that adopted RASH
– PROV: Three Years Later 2016 Workshop, held during the Provenance Week 2016
– Semantic Publishing Challenge 2016 (SemPub2016), held during the 13th Extended Semantic Web Conference (ESWC 2016)
– 4th International Workshop on Linked Media (LIME 2016), held during the 13th Extended Semantic Web Conference (ESWC 2016)
– 20th International Conference on Knowledge Engineering and Knowledge Management (EKAW 2016)
– 2016 Workshop on Web APIs and RESTful Design (WS-REST2016), held during the 16th International Conference on Web Engineering (ICWE2016)
– 15th International Semantic Web Conference (ISWC 2016)
– 13th Extended Semantic Web Conference (ESWC 2016)
– 2016 International Workshop on Semantics, Analytics, Visualisation: Enhancing Scholarly Data (SAVE-SD 2016), held during the 25th International World Wide Web Conference (WWW 2016)
– 2015 International Workshop on Learning in the Cloud (LC2015), held during the 26th ACM Conference on Hypertext and Social Media (Hypertext 2015)
– Semantic Publishing Challenge 2015 (SemPub2015), held during the 12th Extended Semantic Web Conference (ESWC 2015)
– 1st International Workshop on LINKed EDucation at the ISWC 2015, held during the 14th International Semantic Web Conference (ISWC 2015)
– 3rd International Workshop on Linked Data for Information Extraction (LD4IE 2015), held during the 14th International Semantic Web Conference (ISWC 2015)
– 2015 International Workshop on Semantics, Analytics, Visualisation: Enhancing Scholarly Data (SAVE-SD 2015), held during the 24th International World Wide Web Conference (WWW 2015)

14 RASH
RelaxNG + HTML5 validation via HTML + RDF

16 Rexplore
Goal
– To provide an environment which effectively supports users in exploring research data and enables them:
  – to detect and make sense of the important trends in one or more research areas
  – to identify researchers in one or multiple areas, according to a variety of requirements
  – to discover and explore a variety of dynamic relations between researchers, between topics, and between researchers and topics
  – to rank specific sets of authors, generated using the relevant filters, according to various performance metrics
Approach
– Integration of advanced and innovative solutions drawn from Machine Learning, Semantic Technologies, and Human-Computer Interaction
More info: http://technologies.kmi.open.ac.uk/rexplore/

17 Building ontologies of research topics
To make sense of academic data, it is very useful to have a comprehensive and up-to-date ontology of research topics.
Unfortunately, human-crafted classifications evolve too slowly and tend to be too coarse-grained.
Ontology learning is the answer.
Osborne, F. and Motta, E. (2012) Mining Semantic Relations between Research Areas. International Semantic Web Conference, Boston, MA

18 Klink-2
Klink-2 is a novel approach for learning ontologies of research areas and is able:
– to scale up to large interdisciplinary ontologies (it is able to generate the topic ontology incrementally)
– to handle ambiguous keywords, e.g., “java (programming)”, “java (Indonesia)”, “java (Coffee)”
– to take as input any kind of statistical or semantic relationship, e.g., involving authors, organizations, venues…

19 Klink-2
[Figure: Klink-2 pipeline. Inputs: input keywords (K), together with venues (V), authors (A), organizations (O) and the Linked Data Cloud. Stages shown: statistical inferences, filtering, triples generation, clusterization, disambiguation. Inferred relations: skos:relatedEquivalent, skos:broaderGeneric, contributesTo.]
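As a rough illustration of how a statistical inference step can end in triple generation, the sketch below applies a simple co-occurrence subsumption heuristic to keyword sets. This is a stand-in technique, not the Klink-2 algorithm: the function name, the `threshold` and `min_support` parameters and the toy data are all assumptions, and the triple direction (narrower topic as subject) follows the usual SKOS convention.

```python
from collections import Counter
from itertools import combinations


def infer_broader(papers, threshold=0.8, min_support=2):
    """Infer (narrow, 'skos:broaderGeneric', broad) triples from keyword sets.

    A keyword is treated as narrower than another when it almost always
    co-occurs with it, while the reverse is less often true.
    """
    freq = Counter()   # number of papers mentioning each keyword
    pair = Counter()   # number of papers mentioning both keywords of a pair
    for keywords in papers:
        freq.update(keywords)
        pair.update(frozenset(p) for p in combinations(sorted(keywords), 2))

    triples = []
    for key, both in pair.items():
        a, b = sorted(key)
        for narrow, broad in ((a, b), (b, a)):
            if freq[narrow] < min_support:
                continue  # too little evidence for this keyword
            p_broad_given_narrow = both / freq[narrow]
            p_narrow_given_broad = both / freq[broad]
            if (p_broad_given_narrow >= threshold
                    and p_narrow_given_broad < p_broad_given_narrow):
                triples.append((narrow, "skos:broaderGeneric", broad))
    return sorted(triples)


# Toy data, invented for the example.
papers = [
    {"semantic web", "linked data"},
    {"semantic web", "linked data"},
    {"semantic web", "ontology"},
    {"semantic web"},
    {"ontology", "philosophy"},
]
print(infer_broader(papers))
```

The same counting scheme could be fed with co-occurrence over authors, venues or organizations instead of papers, which is the kind of input the slide lists.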

20 Handling ambiguous keywords
Klink-2 mainly addresses three categories of ambiguous keywords:
– Terms which actually have two or more different meanings, e.g., “owl”, the Web Ontology Language, and “owl”, the bird.
– Vague terms, whose meaning can change according to the paper they are associated with, e.g., “mapping”, “indexing”, “performance”.
– Terms that used to have a unique meaning, but are now used in specialized ways by different research communities, e.g., “ontology”.
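One common way to separate the senses of a keyword such as “owl” is to group the papers that use it by the other keywords they carry. The sketch below does this with a greedy, single-pass grouping based on Jaccard similarity. It is an assumption-laden illustration rather than the disambiguation procedure used by Klink-2; `split_senses`, `min_sim` and the toy data are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def split_senses(keyword, papers, min_sim=0.2):
    """Group the contexts of `keyword` into candidate senses.

    Each paper is a set of keywords. A paper joins the most similar
    existing group if the similarity reaches `min_sim`, otherwise it
    starts a new group.
    """
    clusters = []  # each cluster is the set of context keywords seen so far
    for keywords in papers:
        if keyword not in keywords:
            continue
        context = keywords - {keyword}
        best = max(clusters, key=lambda c: jaccard(c, context), default=None)
        if best is not None and jaccard(best, context) >= min_sim:
            best |= context          # in-place update of the chosen cluster
        else:
            clusters.append(set(context))
    return clusters


# Toy data, invented for the example.
papers = [
    {"owl", "ontology", "rdf"},
    {"owl", "rdf", "reasoning"},
    {"owl", "bird", "ecology"},
    {"owl", "ecology", "habitat"},
]
for sense in split_senses("owl", papers):
    print(sorted(sense))
```

Each resulting group can then be treated as a separate topic label, in the spirit of “java (programming)” versus “java (Indonesia)”.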

21 Some examples: Improving analytics and exploration
Osborne, F., Motta, E. and Mulholland, P. (2013) Exploring Scholarly Data with Rexplore. International Semantic Web Conference, Sydney, Australia

22 Some examples: The Map of the Semantic Web
Osborne, F., Scavo, G. and Motta, E. (2014) A hybrid semantic approach to building dynamic maps of research communities. In EKAW 2014, Linköping, Sweden.

23 Some examples: STM

24 Technology extraction
We created a method for identifying software technologies (applications, frameworks, formats, algorithms) in research papers and automatically generating an OWL ontology describing them.
Approach
– Combines Natural Language Processing and Semantic Technologies.
– Analyses the syntactic structure of the sentence and searches for clue terms and verbs usually adopted to introduce or describe technologies.
– Exploits WordNet, Wiktionary and the Klink-2 ontology.
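The clue-term idea can be illustrated with a deliberately small sketch: look for phrases that typically introduce a technology and capture the capitalised name that follows. This is a simplified stand-in using only regular expressions. It does not parse sentence structure and does not use WordNet, Wiktionary or Klink-2, and the clue list and sample sentences are assumptions made up for the example.

```python
import re

# Hypothetical clue phrases; a real system would use a much richer list
# together with syntactic analysis.
CLUE_PATTERN = re.compile(
    r"\b(?i:we (?:use|adopt|present|introduce)|implemented (?:in|with)|based on)"
    r"\s+(?:the\s+)?"
    r"([A-Z][\w-]*(?:\s+[A-Z][\w-]*)*)"   # one or more capitalised tokens
)


def candidate_technologies(sentences):
    """Return capitalised names that follow a clue phrase, in order."""
    found = []
    for sentence in sentences:
        for match in CLUE_PATTERN.finditer(sentence):
            found.append(match.group(1))
    return found


sentences = [
    "We use Protege to edit the ontology.",
    "The system is implemented in Apache Jena and evaluated on DBpedia.",
    "Our approach is based on the Stanford Parser.",
    "We use a simple heuristic.",
]
print(candidate_technologies(sentences))
```

Candidates found this way would still need filtering, for instance against a lexical resource, before being added to an ontology.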

25 OWL construction
HELENE POSTER

26 Preliminary evaluation
The approach was tested on a gold standard of 300 manually annotated abstracts (from Microsoft Academic Search), with 702 sentences and 144 unique technologies, in the field of the Semantic Web.
We tried six methods, using respectively:
1) only abstracts (A)
2) abstracts with Klink-2 ontology (AK)
3) abstracts with Klink-2 ontology and WordNet (AKW)
4) titles and abstracts (TA)
5) titles and abstracts with Klink-2 ontology (TAK)
6) titles and abstracts with Klink-2 ontology and WordNet (TAKW)

            A     AK    AKW   TA    TAK   TAKW
Precision   0.68  0.75  0.92  0.73  0.81  0.96
Recall      0.52  0.50  0.42  0.75  0.68  0.57
F1          0.59  0.60  0.58  0.74        0.72

It seems we can do even better by using statistics and entity linking techniques. Coming soon…
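For reference, F1 is the harmonic mean of precision and recall, F1 = 2PR / (P + R). The short sketch below recomputes it from the precision and recall figures reported on this slide. Because the inputs are already rounded to two decimals, the recomputed values are approximations and may differ from the authors' figures in the last digit.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


# (precision, recall) pairs as reported on the slide.
reported = {
    "A":    (0.68, 0.52),
    "AK":   (0.75, 0.50),
    "AKW":  (0.92, 0.42),
    "TA":   (0.73, 0.75),
    "TAK":  (0.81, 0.68),
    "TAKW": (0.96, 0.57),
}

for method, (precision, recall) in reported.items():
    print(f"{method:5s} P={precision:.2f} R={recall:.2f} "
          f"F1={f1_score(precision, recall):.2f}")
```

The pattern in the table follows directly from the formula: adding WordNet raises precision sharply but lowers recall, so F1 barely moves, whereas adding titles lifts recall and with it F1.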

27 Future work
Semantic Publishing
Ability to convert:
– from Microsoft Word
– from LaTeX
– to EPUB
ISWC
Online annotator
Online editor
Novel methods for extracting a rich semantic network of software technologies, authors, topics and so on.
Supporting publishers in (semi-)automatically annotating their publications.

28 Future work
Thanks!
Francesco Osborne, KMi, The Open University
francesco.osborne@open.ac.uk
RASH: https://github.com/essepuntato/rash
Rexplore: http://technologies.kmi.open.ac.uk/rexplore/

