Presentation is loading. Please wait.

Presentation is loading. Please wait.

Introduction to the Semantic Web Ivan Herman, W3C ESA, Noordwijk, the Netherlands 16 th April, 2007 Ivan Herman.

Similar presentations


Presentation on theme: "Introduction to the Semantic Web Ivan Herman, W3C ESA, Noordwijk, the Netherlands 16 th April, 2007 Ivan Herman."— Presentation transcript:

1 Introduction to the Semantic Web Ivan Herman, W3C ESA, Noordwijk, the Netherlands 16 th April, 2007 Ivan Herman

2 , Introduction to the Semantic Web ( 2 /67) ‏ (2)‏(2)‏ > Towards a Semantic Web The current Web represents information using  natural language (English, Hungarian, Chinese,…)  graphics, multimedia, page layout Humans can process this easily  can deduce facts from partial information  can create mental associations  are used to various sensory information (well, sort of… people with disabilities may have serious problems on the Web with rich media!)

3 , Introduction to the Semantic Web ( 3 /67) ‏ (3)‏(3)‏ > Towards a Semantic Web Tasks often require to combine data on the Web:  hotel and travel information may come from different sites  searches in different digital libraries  etc. Again, humans combine these information easily  even if different terminologies are used!

4 , Introduction to the Semantic Web ( 4 /67) ‏ (4)‏(4)‏ > However… However: machines are ignorant!  partial information is unusable  difficult to make sense from, e.g., an image  drawing analogies automatically is difficult  difficult to combine information automatically is same as ? how to combine different XML hierarchies?  …

5 , Introduction to the Semantic Web ( 5 /67) ‏ (5)‏(5)‏ > Example: Automatic Airline Reservation Your automatic airline reservation  knows about your preferences  builds up knowledge base using your past  can combine the local knowledge with remote services: airline preferences dietary requirements calendaring etc It communicates with remote information (i.e., on the Web!)  (M. Dertouzos: The Unfinished Revolution)

6 , Introduction to the Semantic Web ( 6 /67) ‏ (6)‏(6)‏ > Example: Data(base) Integration Databases are very different in structure, in content Lots of applications require managing several databases  after company mergers  combination of administrative data for e-Government  biochemical, genetic, pharmaceutical research  etc. Most of these data are accessible from the Web (though not necessarily public yet)

7 , Introduction to the Semantic Web ( 7 /67) ‏ (7)‏(7)‏ > And the problem is real…

8 , Introduction to the Semantic Web ( 8 /67) ‏ (8)‏(8)‏ > Example: Digital Libraries It means catalogs on the Web  librarians have known how to do that for centuries  goal is to have this on the Web, World-wide  extend it to multimedia data, too But it is more: software agents should also be librarians!  help you in finding the right publications

9 , Introduction to the Semantic Web ( 9 /67) ‏ (9)‏(9)‏ > What Is Needed? (Some) data should be available for machines for further processing Data should be possibly combined, merged on a Web scale Sometimes, data may describe other data (like the library example, using metadata)… … but sometimes the data is to be exchanged by itself, like my calendar or my travel preferences Machines may also need to reason about that data

10 , Introduction to the Semantic Web ( 10 /67) ‏ ( 10 ) ‏ > In what follows… We will use a simplistic example to introduce the main Semantic Web concepts We take, as an example area, data integration

11 , Introduction to the Semantic Web ( 11 /67) ‏ ( 11 ) ‏ > The rough structure of data integration 1. Map the various data onto an abstract data representation  make the data independent of its internal representation… 2. Merge the resulting representations 3. Start making queries on the whole!  queries that could not have been done on the individual data sets

12 , Introduction to the Semantic Web ( 12 /67) ‏ ( 12 ) ‏ > A simplified bookstore data (dataset “A”)

13 , Introduction to the Semantic Web ( 13 /67) ‏ ( 13 ) ‏ > 1 st : export your data as a set of relations

14 , Introduction to the Semantic Web ( 14 /67) ‏ ( 14 ) ‏ > Some notes on the exporting the data Relations form a graph  the nodes refer to the “real” data or contain some literal  how the graph is represented in machine is immaterial for now Data export does not necessarily mean physical conversion of the data  relations can be generated on-the-fly at query time via SQL “bridges” scraping HTML pages extracting data from Excel sheets etc. One can export part of the data

15 , Introduction to the Semantic Web ( 15 /67) ‏ ( 15 ) ‏ > Another bookshop data (dataset “F”)

16 , Introduction to the Semantic Web ( 16 /67) ‏ ( 16 ) ‏ > 2 nd : export your second set of data

17 , Introduction to the Semantic Web ( 17 /67) ‏ ( 17 ) ‏ > 3 rd : start merging your data

18 , Introduction to the Semantic Web ( 18 /67) ‏ ( 18 ) ‏ > 3 rd : start merging your data (cont.)

19 , Introduction to the Semantic Web ( 19 /67) ‏ ( 19 ) ‏ > 3 rd : merge identical resources

20 , Introduction to the Semantic Web ( 20 /67) ‏ ( 20 ) ‏ > Start making queries… User of data “F” can now ask queries like:  « donnes-moi le titre de l’original »  (ie: “give me the title of the original”) This information is not in the dataset “F”… …but can be automatically retrieved by merging with dataset “A”!

21 , Introduction to the Semantic Web ( 21 /67) ‏ ( 21 ) ‏ > However, more can be achieved… We “feel” that a:author and f:auteur should be the same But an automatic merge doest not know that! Let us add some extra information to the merged data:  a:author same as f:auteur  both identify a “Person”  a term that a community may have already defined: a “Person” is uniquely identified by his/her name and, say, homepage it can be used as a “category” for certain type of resources

22 , Introduction to the Semantic Web ( 22 /67) ‏ ( 22 ) ‏ > 3 rd revisited: use the extra knowledge

23 , Introduction to the Semantic Web ( 23 /67) ‏ ( 23 ) ‏ > Start making richer queries! User of dataset “F” can now query:  « donnes-moi la page d’accueil de l’auteur de l’original »  (ie, “give me the home page of the original’s author”) The data is not in dataset “F”… …but was made available by:  merging datasets “A” and datasets “F”  adding three simple extra statements as an extra “glue”

24 , Introduction to the Semantic Web ( 24 /67) ‏ ( 24 ) ‏ > Combine with different datasets Using, e.g., the “Person”, the dataset can be combined with other sources For example, data in Wikipedia can be extracted using simple (e.g., XSLT) tools  there is an active development to add some simple semantic “tag” to wikipedia entries

25 , Introduction to the Semantic Web ( 25 /67) ‏ ( 25 ) ‏ > Merge with Wikipedia data

26 , Introduction to the Semantic Web ( 26 /67) ‏ ( 26 ) ‏ > Merge with Wikipedia data

27 , Introduction to the Semantic Web ( 27 /67) ‏ ( 27 ) ‏ > Merge with Wikipedia data

28 , Introduction to the Semantic Web ( 28 /67) ‏ ( 28 ) ‏ > Is that surprising? Maybe but, in fact, no… What happened via automatic means is done all the time, every day by the users of the Web! The difference: a bit of extra rigor (e.g., naming the relationships) is necessary so that machines could do this, too

29 , Introduction to the Semantic Web ( 29 /67) ‏ ( 29 ) ‏ > What did we do? We combined different datasets  all may be of different origin somewhere on the web  all may have different formats (mysql, excel sheet, XHTML, etc)  all may have different names for relations (e.g., multilingual) We could combine the data because some URI-s were identical (the ISBN-s in this case) We could add some simple additional information (the “glue”), also using common terminologies that a community has produced As a result, new relations could be found and retrieved

30 , Introduction to the Semantic Web ( 30 /67) ‏ ( 30 ) ‏ > It could become even more powerful We could add extra knowledge to the merged datasets  e.g., a full classification of various type of library data  geographical information  etc. This is where ontologies, extra rules, etc, may come in Even more powerful queries can be asked as a result

31 , Introduction to the Semantic Web ( 31 /67) ‏ ( 31 ) ‏ > What did we do? (cont)

32 , Introduction to the Semantic Web ( 32 /67) ‏ ( 32 ) ‏ > The abstraction pays off because… … the graph representation is independent on the exact structures in, say, a relational database … a change in local database schema's, XHTML structures, etc, do not affect the whole, only the “export” step  “schema independence” … new data, new connections can be added seamlessly, regardless of the structure of other data sources

33 , Introduction to the Semantic Web ( 33 /67) ‏ ( 33 ) ‏ > So where is the Semantic Web? The Semantic Web provides technologies to make such integration possible! For example: In what follows, we will touch upon some of these technology pieces

34 , Introduction to the Semantic Web ( 34 /67) ‏ ( 34 ) ‏ > RDF Triples Let us to formalize (a bit) what we did!  we “connected” the data…  but a simple connection is not enough… it should be named somehow  hence the RDF Triples: a labeled connection between two resources

35 , Introduction to the Semantic Web ( 35 /67) ‏ ( 35 ) ‏ > RDF Triples (cont.) An RDF Triple (s,p,o) is such that:  “ s ”, “ p ” are URI-s, ie, resources on the Web; “ o ” is a URI or a literal conceptually: “ p ” connects, or relates the “ s ” and “ o ” note that we use URI-s for naming: i.e., we can use http://www.example.org/original  here is the complete triple: RDF is a general model for such triples (with machine readable formats like RDF/XML, Turtle, n3, RXR, …) … and that’s it! (,, )

36 , Introduction to the Semantic Web ( 36 /67) ‏ ( 36 ) ‏ > RDF Triples (cont.) RDF Triples are also referred to as “triplets”, or “statements” Resources can use any URI; it can denote an element within an XML file on the Web, not only a “full” resource, e.g.:  http://www.example.org/file.xml#xpointer(id('home'))  http://www.example.org/file.html#home RDF Triples form a directed, labeled graph (best way to think about them!)

37 , Introduction to the Semantic Web ( 37 /67) ‏ ( 37 ) ‏ > A Simple RDF Example (in RDF/XML) Le palais des mirroirs

38 , Introduction to the Semantic Web ( 38 /67) ‏ ( 38 ) ‏ > A Simple RDF Example (in Turtle) f:titre "Le palais des mirroirs"@fr; f:original.

39 , Introduction to the Semantic Web ( 39 /67) ‏ ( 39 ) ‏ > URI-s Play a Fundamental Role URI-s made the merge possible Anybody can create (meta)data on any resource on the Web  e.g., the same XHTML file could be annotated through other terms  semantics is added to existing Web resources via URI-s  URI-s make it possible to link (via properties) data with one another URI-s ground RDF into the Web  information can be retrieved using existing tools  this makes the “Semantic Web”, well… “Semantic Web”

40 , Introduction to the Semantic Web ( 40 /67) ‏ ( 40 ) ‏ > Need for RDF Schema’s This is the simple form of our “extra knowledge”:  define the terms we can use  what restrictions apply  what extra relationships are there? This is where RDF Schema’s (RDFS) come in

41 , Introduction to the Semantic Web ( 41 /67) ‏ ( 41 ) ‏ > Classes, Resources, … Think of well known traditional ontologies or taxonomies:  use the term “novel”  “every novel is a fiction”  “«The Glass Palace» is a novel”  etc. RDFS defines resources and classes:  everything in RDF is a “resource”  “classes” are also resources, but…  …they are also a collection of possible resources (i.e., “individuals”) “fiction”, “novel”, …

42 , Introduction to the Semantic Web ( 42 /67) ‏ ( 42 ) ‏ > Classes, Resources, … (cont.) Relationships are defined among classes/resources:  “typing”: an individual belongs to a specific class (“«The Glass Palace» is a novel”) to be more precise: “« isbn:000651409X » is a novel”  “subclassing”: instance of one is also the instance of the other (“every novel is a fiction”) RDFS formalizes these notions in RDF

43 , Introduction to the Semantic Web ( 43 /67) ‏ ( 43 ) ‏ > Classes, Resources in RDF(S) RDFS defines rdfs:Resource, rdfs:Class as nodes; rdf:type, rdfs:subClassOf as properties  (these are all special URI-s, we just use an abbreviation)

44 , Introduction to the Semantic Web ( 44 /67) ‏ ( 44 ) ‏ > Ontologies RDFS is useful, but does not solve all possible requirements Complex applications may want more possibilities:  can a program reason about some terms? E.g.: “if «Person» resources «A» and «B» have the same « foaf:email » property, then «A» and «B» are identical”  if somebody else defines a set of terms: are they the same?  disjointness or equivalence of classes  etc.

45 , Introduction to the Semantic Web ( 45 /67) ‏ ( 45 ) ‏ > Ontologies (cont.) There is a need to support ontologies on the Semantic Web: We need various ontology languages  RDFS can be considered as a simple ontology language  OWL gives a much more complex set of possibilities based on logic  SKOS is more geared towards less formal taxonomies  … Applications choose what fits them most “defines the concepts and relationships used to describe and represent an area of knowledge”

46 , Introduction to the Semantic Web ( 46 /67) ‏ ( 46 ) ‏ > Querying Data: Graph Patterns The fundamental idea: RDF being a graph, use graph patterns:  the pattern contains unbound symbols  by binding the symbols (if possible), subgraphs of the RDF graph are selected  if there is such a selection, the query returns the bound resources SPARQLSPARQL: is a programming language-independent query language

47 , Introduction to the Semantic Web ( 47 /67) ‏ ( 47 ) ‏ > A Very Simple SPARQL Example SELECT ?p ?o WHERE {subject ?p ?o} The triplets in WHERE define the graph pattern, with ?p and ?o “unbound” symbols The query returns a list of matching p, o pairs

48 , Introduction to the Semantic Web ( 48 /67) ‏ ( 48 ) ‏ > A Bit More Complex Example… SELECT ?isbn ?price ?currency # note: not ?x! WHERE { ?isbn a:price ?x. ?x rdf:value ?price. ?x p:currency ?currency.}

49 , Introduction to the Semantic Web ( 49 /67) ‏ ( 49 ) ‏ > A Bit More Complex Example… SELECT ?isbn ?price ?currency # note: not ?x! WHERE { ?isbn a:price ?x. ?x rdf:value ?price. ?x p:currency ?currency.} Returns: [[,33,£], [,50,€], [,60,€], [,78,$]]

50 , Introduction to the Semantic Web ( 50 /67) ‏ ( 50 ) ‏ > Pattern Constraints SELECT ?isbn ?price ?currency # note: not ?x! WHERE { ?isbn a:price ?x. ?x rdf:value ?price. ?x p:currency ?currency. FILTER(?currency == € } Returns: [[,60,€], [,78,$]]

51 , Introduction to the Semantic Web ( 51 /67) ‏ ( 51 ) ‏ > SPARQL Usage in Practice Locally, i.e., bound to a programming environment Remotely, e.g., over the network or into a database  separate documents define the protocol and the result format SPARQL Protocol for RDF with HTTP and SOAP bindings SPARQL Protocol for RDF SPARQL results in XML or JSON formatsXMLJSON

52 , Introduction to the Semantic Web ( 52 /67) ‏ ( 52 ) ‏ > So where is the Semantic Web? (again…)

53 , Introduction to the Semantic Web ( 53 /67) ‏ ( 53 ) ‏ > A More Specific Example: Antibodies Demo Scenario: find the known antibodies for a protein in a specific species Combine (“scrape”…) three different data sources  “Entrez protein sequence” from National Center for Biotechnology Information; conversion to RDF  “Antibody Directory” from Alzheimer Research Forum; scraping RDF from HTML  “Taxonomy information” from Wikispecies; use XSLT to extract RDF from XHTML Use SPARQL as an integration tool (see also demo online)

54 , Introduction to the Semantic Web ( 54 /67) ‏ ( 54 ) ‏ > Example: Antibodies Demo

55 , Introduction to the Semantic Web ( 55 /67) ‏ ( 55 ) ‏ > Example: Ontology Controlled Annotation Annotation of different data formats all along the full drug discovery process…

56 , Introduction to the Semantic Web ( 56 /67) ‏ ( 56 ) ‏ > Example: Find the Right Experts at NASA Expertise locator for nearly 20,000 NASA civil servants using RDF integration techniques over 6 or 7 geographically distributed databases, data sources, and web services…

57 , Introduction to the Semantic Web ( 57 /67) ‏ ( 57 ) ‏ > SW Data Begins to Accumulate on the Web IgentaConnectIgentaConnect bibliographic metadata storage: over 200 million triplets Tracking the US CongressTracking the US Congress: data stored in RDF (around 25 million triplets) RDFS/OWL Representation of WordNetRDFS/OWL Representation of WordNet: also downloadable as 150MB of RDF/XML “Département/canton/commune” structure of France published by the French Statistical InstituteDépartement/canton/commune Geonames OntologyGeonames Ontology and associated RDF data: 6 million (and growing) geographical features RDF Book MashupRDF Book Mashup, integrating book data from, eg, Amazon “dbpedia”: get categorial data of Wikipedia into RDFdbpedia

58 , Introduction to the Semantic Web ( 58 /67) ‏ ( 58 ) ‏ > Semantic Web Applications The data integration is only one area of SW applications Let us see some more…

59 , Introduction to the Semantic Web ( 59 /67) ‏ ( 59 ) ‏ > Applications are not Always Very Complex… Eg: simple semantic annotations of patients’ data greatly enhances communications among doctors What is needed: some simple ontologies, a simple annotation editing environment Simple but powerful!

60 , Introduction to the Semantic Web ( 60 /67) ‏ ( 60 ) ‏ > Portals Vodafone's Live Mobile Portal  search application (e.g. ringtone, game) using RDF page views per download decreased 50% ringtone up 20% in 2 months Other portal examples: Sun’s White Paper Collections and System Handbook collections; Nokia’s S60 support portal; Harper’s Online magazine; Oracle’s Technology Network; Opera’s community site; Yahoo! Food, FAO's Food, Nutrition and Agriculture Journal portal,…

61 , Introduction to the Semantic Web ( 61 /67) ‏ ( 61 ) ‏ > Example: Oracle’s Technology Network

62 , Introduction to the Semantic Web ( 62 /67) ‏ ( 62 ) ‏ > Improved Search via Ontology: GoPubMed Improved search on top of pubmed.org  search results are ranked using the specialized ontologies  extra search terms are generated and terms are highlighted Importance of domain specific ontologies for search improvement

63 , Introduction to the Semantic Web ( 63 /67) ‏ ( 63 ) ‏ > Content labeling for better search (Segala) Use content labeling to qualify search results:

64 , Introduction to the Semantic Web ( 64 /67) ‏ ( 64 ) ‏ > Adobe’s XMP Adobe’s tool to add RDF-based metadata to most of their file formats  used for more effective organization  supported in Adobe Creative Suite  support from 30+ major asset management vendors, with separate XMP conferences The tool is available for all!

65 , Introduction to the Semantic Web ( 65 /67) ‏ ( 65 ) ‏ > Other Application Areas Come to the Fore Knowledge management Business intelligence Linking virtual communities Management of multimedia data (e.g., video and image depositories) Content adaptation and labeling (e.g., for mobile usage) etc

66 , Introduction to the Semantic Web ( 66 /67) ‏ ( 66 ) ‏ > Conclusions The Semantic Web is there to integrate data on the Web The goal is the creation of a Web of Data

67 , Introduction to the Semantic Web ( 67 /67) ‏ ( 67 ) ‏ > Thank you for your attention! These slides are publicly available (in Open Document Format, PDF, and HTML) on: http://www.w3.org/2007/Talks/0416-ESA-IH/


Download ppt "Introduction to the Semantic Web Ivan Herman, W3C ESA, Noordwijk, the Netherlands 16 th April, 2007 Ivan Herman."

Similar presentations


Ads by Google