Low-bandwidth Semantic Web

Low-bandwidth Semantic Web. Onno Valkering. Supervisor: Victor de Boer. Second reader: Stefan Schlobach. 25 May 2019.

Contents
Why Semantic Web and low-bandwidth? SPARQL over SMS. Challenges. Experiments. Practical validation. Conclusion.

Why? Semantic Web
Knowledge sharing. The WWW is quite good. Open Linked Data.
Example: DigiVet, used by farmers in rural areas of developing countries. There are more examples like it, and they have one thing in common: valuable knowledge generated by users. We want to share that knowledge between users through the application. One way to achieve this is to build something custom, but that creates an exclusive club. We want to use something that other people also use and can use. A lot of people use the Web for this, so we want to use that. More specifically, the Semantic Web (SW) provides the right tools for our knowledge-sharing needs.
Image: DigiVet 3.0 on a Kasadaka, by Gossa Lô and Romy Blankendaal.

(Some) Semantic Web in a Nutshell
S: http://example.org/bob
P: http://xmlns.com/foaf/0.1/topic_interest
O: http://www.wikidata.org/entity/Q12418
Data is represented as triples which, combined, make up graphs. By using URIs we can easily reuse common definitions and entities defined by other data sets. When you want to know more about a certain entity or property, just go to the URL used. In this case we state that Bob is interested in the Mona Lisa, and instead of defining what the Mona Lisa is, we use an already existing definition.
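
As a minimal sketch of the triple above, the statement can be held as a plain Python tuple and written out as one N-Triples line. The `Triple` class and `to_ntriples` helper are illustrative names, not part of SPARQL over SMS, and the helper assumes all three terms are URIs.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF statement: subject, predicate, object (all URIs in this sketch)."""
    subject: str
    predicate: str
    obj: str


def to_ntriples(triple: Triple) -> str:
    """Serialize a URI-only triple as a single N-Triples line."""
    return f"<{triple.subject}> <{triple.predicate}> <{triple.obj}> ."


# The example from the slide: Bob is interested in the Mona Lisa.
bob_likes_mona_lisa = Triple(
    "http://example.org/bob",
    "http://xmlns.com/foaf/0.1/topic_interest",
    "http://www.wikidata.org/entity/Q12418",
)

# A graph is a set of such triples.
graph = {bob_likes_mona_lisa}
line = to_ntriples(bob_likes_mona_lisa)
```

Because the object is a Wikidata URI rather than a local label, anyone receiving this line can look up what Q12418 is without a definition being sent along.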

(Some) Semantic Web in a Nutshell
Image: Linking Open Data cloud diagram 2014, by Max Schmachtenberg, Christian Bizer, Anja Jentzsch and Richard Cyganiak. http://lod-cloud.net/
When a lot of people start using this, you end up with a huge graph that contains a lot of knowledge. It would be nice to have such a graph based on applications used in rural areas of developing countries.

Why? Low-bandwidth
Rural areas of developing countries. Dependent on the GSM network. Cost-effective messaging.
But, surprise: there is no internet, and so no Web infrastructure. We are dependent on cellular networks for our machine-to-machine (M2M) communication, which means SMS. Since SMS messages aren't free, we need to be thrifty with sending them. To me, this qualifies as a low-bandwidth network.

SPARQL over SMS
Enable (Semantic) Web data exchange over GSM networks.
Practical differences between HTTP and SMS: SMS works with phone numbers, HTTP works with URLs. SMS has a size restriction, HTTP practically has none. SMS is one-way messaging, HTTP follows request-response.
Basic M2M communication based on SPARQL.
In short, we want Web-like data exchange over GSM networks. The differences listed here give an idea of the gap between the de facto Web protocol, HTTP, and SMS. Following the saying "think big, start small", the initial implementation is based on a subset of SPARQL. SPARQL is a language to query the kind of graphs we saw earlier. With the subset, basic read/write M2M communication can be achieved.

SPARQL over SMS
http://{db-hostname}/sparql?query=...
I won't go into much technical detail, but I want to give an idea of the concept.

SPARQL over SMS
http://{sos-hostname}/{phonenumber}/sparql?query=...
We're tricking the application by mimicking the behavior of the database on the other end. Technically, the application and the database talk to each other as they would over the Web.
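
To make the two URL shapes concrete, here is a small sketch that builds both for the same query. The hostnames and phone number are made-up placeholders, and the helper names are hypothetical; only the URL patterns come from the slides.

```python
from urllib.parse import urlencode


def direct_endpoint(db_hostname: str, query: str) -> str:
    """URL the application would use when the database is reachable over the Web."""
    return f"http://{db_hostname}/sparql?{urlencode({'query': query})}"


def sms_endpoint(sos_hostname: str, phone_number: str, query: str) -> str:
    """URL pointing at the local SPARQL-over-SMS service instead.

    The phone number in the path identifies the remote database's SMS gateway.
    """
    return f"http://{sos_hostname}/{phone_number}/sparql?{urlencode({'query': query})}"


query = "ASK { ?s ?p ?o }"
direct_url = direct_endpoint("db.example", query)            # placeholder hostname
sms_url = sms_endpoint("sos.example", "31600000000", query)  # placeholder host and number
```

The query string is identical in both cases, which is the point: the application only needs a different endpoint URL and is otherwise unaware that SMS is used underneath.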

Challenges
Blending synchronous and asynchronous messaging. SPARQL compression. RDF compression. Unpredictable query result sizes.
This is easier said than done; there are a few challenges. We performed experiments to establish our compression methods.

RDF Compression: Experiment setup
Set of real-world RDF data: a quarter of a million files. Four different strategies: serialization format, text compression, dictionary-encoding, reasoning.
We acquired almost a quarter of a million real-world files, ranging from 1 to 1000 triples, and subjected them to four compression strategies. We measured the compression rate of each strategy to come up with the best method.

RDF Compression: Experiment setup. Serialization formats
Five in total: N-Triples, Turtle, RDF/XML, EXI, HDT.
The serialization format is the format in which the triples are stored; some formats are more verbose than others. We serialized each file into these formats and recorded the compression rate.

RDF Compression: Experiment setup. Text compression
Only gzip compression, applied after serialization.
In addition, we applied gzip compression to each serialization format and again recorded the compression rate.
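
A rough sketch of this step: gzip is applied to already-serialized triples and the saving is measured. The input below is synthetic, and "compression rate" is taken here to mean the fraction of bytes saved, which is an assumption; the slides do not define the measure.

```python
import gzip


def compression_rate(original: bytes, compressed: bytes) -> float:
    """Fraction of bytes saved (assumed definition); negative if the output grew."""
    return 1 - len(compressed) / len(original)


# Synthetic N-Triples input: 50 statements that share long URI prefixes.
ntriples = "\n".join(
    f"<http://example.org/animal/{i}> "
    "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type> "
    "<http://example.org/vocab#Animal> ."
    for i in range(50)
).encode("utf-8")

compressed = gzip.compress(ntriples)
rate = compression_rate(ntriples, compressed)
```

Repeated URI prefixes are exactly what a general-purpose text compressor exploits, which is why verbose serializations can still end up small after gzip.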

RDF Compression: Experiment setup. Dictionary-encoding
Shared vocabulary files. Common definitions. Top 10, 20, 30 (by popularity). Predefined IDs for URIs.
This kind of compression is based on vocabulary files: sets of reusable common definitions for our types and properties. For each URI in the vocabulary an ID is generated. When we go through the RDF file, known URIs are replaced by the shorter ID. We performed this with the Top 10, Top 20 and Top 30 most popular vocabularies.
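
The replacement step can be sketched as follows. The `#n` ID format, the function names and the tiny three-URI vocabulary are assumptions for illustration; the actual IDs and vocabulary files used in the experiment are not given in the slides.

```python
def build_dictionary(vocabulary_uris):
    """Assign a short ID to every vocabulary URI, in a fixed (sorted) order.

    Both sides must build the dictionary from the same vocabulary files,
    so that the IDs are predefined and never have to be transmitted.
    """
    return {uri: f"#{index}" for index, uri in enumerate(sorted(vocabulary_uris))}


def encode(triples, dictionary):
    """Replace known URIs by their ID; unknown terms pass through unchanged."""
    return [tuple(dictionary.get(term, term) for term in triple) for triple in triples]


def decode(encoded, dictionary):
    """Map IDs back to the full URIs."""
    reverse = {short_id: uri for uri, short_id in dictionary.items()}
    return [tuple(reverse.get(term, term) for term in triple) for triple in encoded]


# Hypothetical shared vocabulary (a real one would hold whole vocabulary files).
vocabulary = [
    "http://xmlns.com/foaf/0.1/topic_interest",
    "http://xmlns.com/foaf/0.1/name",
    "http://www.w3.org/1999/02/22-rdf-syntax-ns#type",
]
dictionary = build_dictionary(vocabulary)

triples = [(
    "http://example.org/bob",
    "http://xmlns.com/foaf/0.1/topic_interest",
    "http://www.wikidata.org/entity/Q12418",
)]
encoded = encode(triples, dictionary)
```

In this sketch only the predicate is shortened, because the subject and object are not part of the shared vocabulary. A real implementation would also need IDs that cannot collide with ordinary terms.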

RDF Compression: Experiment setup. Reasoning
Shared vocabulary files. Common definitions. Top 10, 20, 30 (by popularity). Semantic redundancies: 12 RDFS patterns, 2 OWL patterns.
This strategy is also based on the vocabularies, but it searches for semantic redundancies based on 14 patterns. We'll see later, during the practical validation, how this works.
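
The slides do not list the 14 patterns, so the sketch below uses one standard RDFS entailment rule as an illustration: if a vocabulary declares that property p has rdfs:domain C, then any statement (s, p, o) implies (s, rdf:type, C). Such a type statement is redundant and can be dropped before sending, then re-derived by the receiver. Whether this exact rule is among the 12 RDFS patterns is an assumption, and the vocabulary URIs are made up.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def inferable_types(triples, domains):
    """Type statements that follow from rdfs:domain declarations in the vocabulary."""
    return {(s, RDF_TYPE, domains[p]) for s, p, _ in triples if p in domains}


def strip_redundant(triples, domains):
    """Sender side: drop every triple the receiver can infer on its own."""
    return set(triples) - inferable_types(triples, domains)


def restore(triples, domains):
    """Receiver side: add the inferable triples back."""
    return set(triples) | inferable_types(triples, domains)


# Hypothetical vocabulary knowledge: property -> its declared rdfs:domain.
domains = {"http://example.org/vocab#diagnosis": "http://example.org/vocab#Animal"}

data = {
    ("http://example.org/cow1", "http://example.org/vocab#diagnosis", "mastitis"),
    ("http://example.org/cow1", RDF_TYPE, "http://example.org/vocab#Animal"),
}
to_send = strip_redundant(data, domains)
received = restore(to_send, domains)
```

The saving depends on both sides holding the same vocabulary, which is why this strategy builds on the shared vocabulary files.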

RDF Compression: Results
Table (number of SMSes; avg. number of triples, black line; avg. number of triples, orange line): 1 2 3 6 8 4 9 16 5 21 24 66 84 7 98 116 126 175 189 10 301
Short summary of the results. The black line shows the first two strategies (general-purpose). We see some nice compression rates, which increase as the number of triples increases. More interesting is the orange line, which adds the last two strategies (Semantic Web specific). By using SW-specific features, we can compress data even more. The table shows the practical effect: for example, 4 SMSes hold on average 9 triples with the black-line strategies and 16 triples with the orange-line strategies.

SPARQL Compression: Experiment setup
Set of real-world SPARQL queries: 500 in total. Two strategies: text compression and RDF compression.
We performed a comparable experiment for SPARQL compression. The results were mixed: RDF compression scored best most often, but for almost 40% of the queries it was best to send them plain.

SPARQL Compression: Results
Compression does not always give smaller results. For 18% of the queries text compression is best, for 38% no compression is best, and for 44% RDF compression is best. The best strategy is therefore determined dynamically.
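
One way to determine the best strategy dynamically is to encode the query with every available strategy and send the shortest result, prefixed with a tag so the receiver knows how to decode it. The sketch below covers only plain and gzip; the one-byte tags and function names are assumptions, and an RDF-based encoder would simply be one more candidate in the list.

```python
import gzip

# Assumed one-byte tags identifying the strategy used for a payload.
PLAIN, GZIP = b"P", b"G"


def encode_query(query: str) -> bytes:
    """Try each strategy and keep whichever payload is shortest."""
    raw = query.encode("utf-8")
    candidates = [PLAIN + raw, GZIP + gzip.compress(raw)]
    return min(candidates, key=len)


def decode_query(payload: bytes) -> str:
    """Read the tag, undo the matching strategy, and return the query text."""
    tag, body = payload[:1], payload[1:]
    if tag == GZIP:
        body = gzip.decompress(body)
    return body.decode("utf-8")


short_query = "ASK { ?s ?p ?o }"
long_query = (
    "SELECT * WHERE { "
    + " ".join(
        f"?s{i} <http://xmlns.com/foaf/0.1/topic_interest> ?o{i} ." for i in range(40)
    )
    + " }"
)
short_payload = encode_query(short_query)
long_payload = encode_query(long_query)
```

The short query goes out plain because the gzip header alone is longer than the query, while the long, repetitive one is compressed. This mirrors the mixed results reported for the experiment.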

Practical Validation
Original: 14 SMSes, $0.196.
Now that we understand what is going on, let's see it in action. The query result contains 18 triples describing 3 animal diagnoses from DigiVet. We added the case-specific vocabulary to the set of vocabularies used in compression. The number of SMSes is calculated at 140 characters per SMS, because of the implementation used.

Practical Validation. Serialization: 8 SMSes, $0.112.

Practical Validation. Reasoning: 7 SMSes, $0.098.

Practical Validation. Dictionary-encoding: 6 SMSes, $0.084.

Practical Validation. Text compression: 3 SMSes, $0.042.
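
The cost figures follow directly from the SMS counts. The sketch below assumes a flat price of $0.014 per SMS, which is not stated on the slides but is implied by $0.196 for 14 SMSes and is consistent with every other stage. The 140 characters per SMS comes from the practical validation notes.

```python
import math

SMS_CHARS = 140       # characters per SMS, as stated for the implementation used
COST_PER_SMS = 0.014  # assumed flat price in dollars, derived from $0.196 / 14 SMSes


def sms_count(payload_length: int) -> int:
    """Number of SMSes needed for a payload of the given length."""
    return math.ceil(payload_length / SMS_CHARS)


def cost(n_sms: int) -> float:
    """Total price in dollars, rounded to a tenth of a cent."""
    return round(n_sms * COST_PER_SMS, 3)


# SMS counts per stage, taken from the practical validation slides.
stages = {
    "Original": 14,
    "Serialization": 8,
    "Reasoning": 7,
    "Dictionary-encoding": 6,
    "Text compression": 3,
}
costs = {stage: cost(n) for stage, n in stages.items()}
```

Going from 14 SMSes to 3 cuts the price of this one query result from $0.196 to $0.042.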

Conclusion
Semantic Web can be used without a Web infrastructure. Specific Semantic Web features can be used for compression. SPARQL over SMS is free and open-source.
As we have seen, it is possible to use the Semantic Web without a Web infrastructure. SMS is one example; the same concept can be applied to other types of networks. This allows for M2M knowledge sharing in the way we are used to: the Web. It also became clear that specific properties of the Semantic Web can be used for compression.