Two roads to semantic publishing
Francesco Osborne, KMi, The Open University, United Kingdom
April 2016


The destination
“The Semantic Web will likely profoundly change the very nature of how scientific knowledge is produced and shared, in ways that we can now barely imagine.” (T. Berners-Lee)
“Researchers will benefit from better, faster, cheaper access to data related to publications, enhancing the capacity for in silico meta-research.” (D. Shotton)

Two roads (Semantic Publishing)
1. Producing machine-readable research publications
– Technical challenge: creating user-friendly formats, tools and vocabularies
– Political challenge: convincing publishers, authors and other stakeholders to use them
2. Extracting semantic information from text
– Technical challenges: information extraction, entity linking, ontology mapping/learning and so on
– Political challenge: data access

Three roads to semantic publishing?
A possible third road, alongside the two above:
3. Showing the potential of semantic technologies
– Technical challenge: smart analytics, semantic search, novel visualizations
– Political challenge: convincing major companies to do so


Two roads: my work
1. Producing machine-readable research publications
2. Extracting semantic information from text

RASH
The RASH (Research Article in Simple HTML) Framework is a set of specifications and tools for writing, converting, visualising and enhancing academic articles in RASH, an HTML+RDF-based markup language for writing scholarly documents.
It is not ‘yet another format’, but a subset of HTML (31 tags) for which it is easier to build tools.
The documentation, the tools and additional information are publicly available on GitHub.
Main developers: Silvio Peroni, Andrea Nuzzolese, Francesco Poggi

RASH
Allows RDF annotations to be included (JSON-LD, RDF/XML, Turtle, RDFa).
Inspired by:
– Accessible Rich Internet Applications 1.1 and Digital Publishing WAI-ARIA Module 1.0
Similar projects:
– Linked Research: Capadisli, S., Riedl, R., & Auer, S. (2015). Enabling Accessible Knowledge. In Proc. of CeDEM
– ScholarlyMarkdown: Lin, T. T. Y., & Beales, G. (2015). ScholarlyMarkdown Syntax Guide. Guide, 31 January
Debuted at SAVE-SD 2015 (WWW).
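To make the idea of HTML with embedded RDF annotations more concrete, here is a minimal Python sketch that builds a small HTML document carrying a JSON-LD block. It is not the RASH toolchain and does not claim to produce valid RASH: the function name `build_article`, the element layout and the use of the schema.org vocabulary are all assumptions chosen for illustration.

```python
import json
from html import escape


def build_article(title: str, author: str, body_paragraphs: list[str]) -> str:
    """Return a minimal HTML document with an embedded JSON-LD annotation.

    Hypothetical helper for illustration only; it is not part of RASH.
    """
    # Metadata expressed with schema.org terms (an assumption, not a RASH requirement).
    metadata = {
        "@context": "http://schema.org",
        "@type": "ScholarlyArticle",
        "headline": title,
        "author": {"@type": "Person", "name": author},
    }
    # Escape "</" so the JSON can never close the <script> element early.
    json_ld = json.dumps(metadata, indent=2).replace("</", "<\\/")
    paragraphs = "\n".join(f"      <p>{escape(p)}</p>" for p in body_paragraphs)
    return f"""<!DOCTYPE html>
<html>
  <head>
    <title>{escape(title)}</title>
    <script type="application/ld+json">
{json_ld}
    </script>
  </head>
  <body>
    <section>
      <h1>Introduction</h1>
{paragraphs}
    </section>
  </body>
</html>"""


doc = build_article(
    "Two roads to semantic publishing",
    "Francesco Osborne",
    ["Text & markup."],
)
print(doc)
```

Because the annotation lives in a standard `script` element, a generic HTML parser can read the article while an RDF-aware tool can pick up the metadata.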

Venues that adopted RASH
– PROV: Three Years Later 2016 Workshop, held during the Provenance Week 2016
– Semantic Publishing Challenge 2016 (SemPub2016), held during the 13th Extended Semantic Web Conference (ESWC 2016)
– 4th International Workshop on Linked Media (LIME 2016), held during the 13th Extended Semantic Web Conference (ESWC 2016)
– 20th International Conference on Knowledge Engineering and Knowledge Management (EKAW 2016)
– 2016 Workshop on Web APIs and RESTful Design (WS-REST2016), held during the 16th International Conference on Web Engineering (ICWE2016)
– 15th International Semantic Web Conference (ISWC 2016)
– 13th Extended Semantic Web Conference (ESWC 2016)
– 2016 International Workshop on Semantics, Analytics, Visualisation: Enhancing Scholarly Data (SAVE-SD 2016), held during the 25th International World Wide Web Conference (WWW 2016)
– 2015 International Workshop on Learning in the Cloud (LC2015), held during the 26th ACM Conference on Hypertext and Social Media (Hypertext 2015)
– Semantic Publishing Challenge 2015 (SemPub2015), held during the 12th Extended Semantic Web Conference (ESWC 2015)
– 1st International Workshop on LINKed EDucation at the ISWC 2015, held during the 14th International Semantic Web Conference (ISWC 2015)
– 3rd International Workshop on Linked Data for Information Extraction (LD4IE 2015), held during the 14th International Semantic Web Conference (ISWC 2015)
– 2015 International Workshop on Semantics, Analytics, Visualisation: Enhancing Scholarly Data (SAVE-SD 2015), held during the 24th International World Wide Web Conference (WWW 2015)


RASH
RelaxNG + HTML5 validation via HTML + RDF


Rexplore
Goal: to provide an environment which effectively supports users in exploring research data and enables them:
– to detect and make sense of the important trends in one or more research areas
– to identify researchers in one or multiple areas, according to a variety of requirements
– to discover and explore a variety of dynamic relations between researchers, between topics, and between researchers and topics
– to rank specific sets of authors, generated using the relevant filters, according to various performance metrics
Approach: integration of advanced and innovative solutions drawn from Machine Learning, Semantic Technologies, and Human-Computer Interaction
More info:

Building ontologies of research topics
To make sense of academic data, it is very useful to have a comprehensive and up-to-date ontology of research topics. Unfortunately, human-crafted classifications evolve too slowly and tend to be too coarse-grained. Ontology learning is the answer.
Osborne, F. and Motta, E. (2012) Mining Semantic Relations between Research Areas. International Semantic Web Conference, Boston, MA

Klink-2
Klink-2 is a novel approach for learning ontologies of research areas. It is able:
– to scale up to large interdisciplinary ontologies: it can generate the topic ontology incrementally
– to handle ambiguous keywords, e.g., “java (programming)”, “java (Indonesia)”, “java (Coffee)”
– to take as input any kind of statistical or semantic relationship, e.g., involving authors, organizations, venues

Klink-2
Figure: Klink-2 architecture. Labels: input keywords; keywords, authors, organizations, venues and the Linked Data Cloud; statistical inferences (skos:relatedEquivalent, skos:broaderGeneric, contributesTo); clusterization and disambiguation; filtering; triples generation.
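The slides do not spell out how the statistical inference step works, so the following Python sketch uses a plain co-occurrence subsumption heuristic as a stand-in. It is not the Klink-2 algorithm: the function name `infer_relations`, the single `threshold` parameter and the toy data are assumptions, and only the two relation names are taken from the slide. Triples follow the SKOS reading in which (x, broader, y) means that y is the broader topic.

```python
from itertools import permutations


def infer_relations(papers_by_keyword, threshold=0.8):
    """Infer (subject, predicate, object) triples from keyword co-occurrence.

    Simplified stand-in for a statistical inference step, not Klink-2 itself.
    papers_by_keyword maps each keyword to the set of paper ids that use it.
    """
    triples = []
    for x, y in permutations(sorted(papers_by_keyword), 2):
        papers_x, papers_y = papers_by_keyword[x], papers_by_keyword[y]
        shared = len(papers_x & papers_y)
        if shared == 0:
            continue
        p_y_given_x = shared / len(papers_x)  # share of x's papers that also use y
        p_x_given_y = shared / len(papers_y)  # share of y's papers that also use x
        if p_y_given_x >= threshold and p_x_given_y >= threshold:
            # Both keywords almost always appear together: treat them as equivalent.
            if x < y:  # emit the symmetric relation only once
                triples.append((x, "skos:relatedEquivalent", y))
        elif p_y_given_x >= threshold:
            # x nearly always appears with y, but not the reverse: y is broader.
            triples.append((x, "skos:broaderGeneric", y))
    return triples


# Toy input (hypothetical data): keyword -> ids of the papers that use it.
toy_papers = {
    "semantic web": {1, 2, 3, 4, 5, 6},
    "linked data": {1, 2, 3},
    "ontology matching": {4, 5},
    "ontology alignment": {4, 5},
}
triples = infer_relations(toy_papers)
for triple in triples:
    print(triple)
```

The same loop could accept co-occurrence with authors, venues or organizations instead of papers, which is the kind of flexibility the Klink-2 slide describes.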

Handling ambiguous keywords
Klink-2 mainly addresses three categories of ambiguous keywords:
– Terms which actually have two or more different meanings, e.g., “owl”, the Web Ontology Language, and “owl”, the bird.
– Vague terms, whose meaning can change according to the paper they are associated with, e.g., “mapping”, “indexing”, “performance”.
– Terms that used to have a unique meaning, but are now used in specialized ways by different research communities, e.g., “ontology”.
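One way to picture the first category is to group the papers that use an ambiguous keyword by the other keywords they carry. The Python sketch below does this with a greedy Jaccard clustering; it is an illustrative stand-in rather than the disambiguation procedure used by Klink-2, and the names `split_senses`, `jaccard`, the `threshold` value and the toy papers are all assumptions.

```python
from collections import Counter


def jaccard(a, b):
    """Jaccard similarity between two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def split_senses(keyword, papers, threshold=0.2):
    """Group the papers using `keyword` into candidate senses.

    Illustrative greedy clustering, not the Klink-2 procedure.
    papers maps a paper id to the set of keywords attached to it.
    Returns a list of (sense label, [paper ids]) pairs.
    """
    clusters = []  # each cluster: {"papers": [...], "context": Counter}
    for paper_id, keywords in papers.items():
        context = set(keywords) - {keyword}
        if keyword not in keywords or not context:
            continue  # keyword absent, or no co-occurring keywords to compare
        for cluster in clusters:
            if jaccard(context, set(cluster["context"])) >= threshold:
                cluster["papers"].append(paper_id)
                cluster["context"].update(context)
                break
        else:
            clusters.append({"papers": [paper_id], "context": Counter(context)})
    senses = []
    for cluster in clusters:
        # Label the sense with its most frequent co-occurring keyword;
        # ties are broken alphabetically so the output is deterministic.
        label = min(cluster["context"].items(), key=lambda kv: (-kv[1], kv[0]))[0]
        senses.append((f"{keyword} ({label})", cluster["papers"]))
    return senses


# Toy input (hypothetical data).
toy_keywords = {
    "p1": {"owl", "ontology", "reasoning"},
    "p2": {"owl", "ontology", "rdf"},
    "p3": {"owl", "bird", "habitat"},
    "p4": {"owl", "bird", "migration"},
}
senses = split_senses("owl", toy_keywords)
print(senses)
```

The labels mirror the “java (programming)” style of naming shown on the Klink-2 slide, although how Klink-2 actually chooses them is not stated here.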

Some examples: Improving analytics and exploration
Osborne, F., Motta, E. and Mulholland, P. (2013) Exploring Scholarly Data with Rexplore, International Semantic Web Conference, Sydney, Australia

Some examples: The Map of Semantic Web
Osborne, Francesco; Scavo, Giuseppe and Motta, Enrico (2014). A hybrid semantic approach to building dynamic maps of research communities. In EKAW, Linköping, Sweden.

Some examples: STM

Technology extraction
We created a method for identifying software technologies (applications, frameworks, formats, algorithms) in research papers and automatically generating an OWL ontology describing them.
Approach:
– Combines Natural Language Processing and Semantic Technologies.
– Analyses the syntactic structure of the sentence and searches for clue terms and verbs usually adopted to introduce or describe technologies.
– Exploits WordNet, Wiktionary and the Klink-2 ontology.
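The clue-term idea can be sketched with a few regular expressions in Python. This is a deliberately crude stand-in: it uses surface patterns instead of syntactic analysis, and it does not consult WordNet, Wiktionary or the Klink-2 ontology. The clue lists, the pattern and the function name `extract_technologies` are assumptions made for illustration.

```python
import re

# Hypothetical clue verbs and nouns; the real lists are not given in the slides.
CLUE_VERBS = r"(?:present|introduce|propose|describe|use|adopt)s?"
CLUE_NOUNS = r"(?:application|framework|format|algorithm|tool|system|language)"

# Matches "<clue verb> <Capitalised Name>, a/an [up to 3 modifiers] <clue noun>".
PATTERN = re.compile(
    rf"\b{CLUE_VERBS}\s+"
    rf"(?P<name>[A-Z][\w-]*(?:\s[A-Z0-9][\w-]*)*)"
    rf",\s+an?\s+(?:[\w-]+\s+){{0,3}}?{CLUE_NOUNS}\b"
)


def extract_technologies(text):
    """Return candidate technology names, in order of first appearance."""
    found = []
    # Naive sentence split on terminal punctuation followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        for match in PATTERN.finditer(sentence):
            name = match.group("name")
            if name not in found:
                found.append(name)
    return found


sample = (
    "We present Klink-2, a novel algorithm for learning ontologies. "
    "In this paper we use RDF Schema, a language for describing vocabularies. "
    "We describe the results of our evaluation."
)
candidates = extract_technologies(sample)
print(candidates)
```

A pattern like this trades recall for simplicity; lexical resources and a topic ontology, as listed on the slide, would be one way to filter or extend the candidates.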

OWL construction
HELENE POSTER

Preliminary evaluation
The approach was tested on a gold standard of 300 manually annotated abstracts (from Microsoft Academic Search), 702 sentences and 144 unique technologies in the field of Semantic Web.
We tried six methods, using respectively:
1) only abstracts (A)
2) abstracts with Klink-2 ontology (AK)
3) abstracts with Klink-2 ontology and WordNet (AKW)
4) titles and abstracts (TA)
5) titles and abstracts with Klink-2 ontology (TAK)
6) titles and abstracts with Klink-2 ontology and WordNet (TAKW)
Results table: Precision, Recall and F for A, AK, AKW, TA, TAK and TAKW.
It seems we can do even better by using statistics and entity linking techniques. Coming soon…
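For readers less familiar with the three measures in the results table, the following Python sketch shows how precision, recall and F are typically computed against a gold standard. It assumes that F denotes the balanced F1 score, which the slide does not state, and the example sets are invented rather than taken from the evaluation.

```python
def evaluate(extracted, gold):
    """Return (precision, recall, f1) for extracted items against a gold standard."""
    extracted, gold = set(extracted), set(gold)
    true_positives = len(extracted & gold)
    precision = true_positives / len(extracted) if extracted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    # F1 is the harmonic mean of precision and recall (assumed meaning of "F").
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Invented example: 3 of 4 extracted technologies are correct, 5 are in the gold set.
extracted_example = {"RDF", "OWL", "SPARQL", "Java"}
gold_example = {"RDF", "OWL", "SPARQL", "Jena", "Pellet"}
precision, recall, f1 = evaluate(extracted_example, gold_example)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```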

Future work (Semantic Publishing)
Ability to convert:
– from Microsoft Word
– from LaTeX
– to EPUB
ISWC
Online annotator
Online editor
Novel methods for extracting a rich semantic network of software technologies, authors, topics and so on.
Supporting publishers in (semi-)automatically annotating their publications.

Thanks!
Francesco Osborne, KMi, The Open University
RASH:
Rexplore: