Semantic Wikis and the Web of Data
Jens Lehmann, Sören Auer
AKSW Research Group, Institute of Computer Science
2009/02/03 Semantic Wikis and the Web of Data - Jens Lehmann, AKSW Leipzig
Overview
  - The Web of Data
      - Vision
      - Technology
      - DBpedia
  - Semantic Wikis
      - Basic Concepts
      - Overview
      - Semantic MediaWiki, IkeWiki, OntoWiki
From the Document Web to the Linked Open Data Web (and beyond)
  - Web (since 1992): HTTP, HTML/CSS/JavaScript
  - Semantic Web (vision 1998, starting ??): reasoning, logic, rules, trust
  - Data Web (since 2006): URI de-referenceability, RDF serializations
  - Social Web (since 2003): folksonomies/tagging; reputation, sharing; groups, relationships
Web 3.0
  - Web 1.0: many Web sites containing unstructured, textual content
  - Web 2.0: a few large Web sites specialized on specific content types (pictures, video, encyclopedic articles)
  - Web 3.0: many Web sites containing and semantically syndicating arbitrarily structured content
Long Tail of Information Domains
  - Currently supported structured content types: pictures, news, video, recipes, calendar, ...
  - SemWeb-supported structured content for special interest communities: gene sequences, itinerary of King George, talent management, requirements engineering, ...
  - The Long Tail by Chris Anderson (Wired, Oct. '04), adapted to information domains
Why Do We Need Another Web?
  - Try to search for this on the current Web:
      - Apartments near German-French bilingual childcare in Leipzig.
      - ERP service providers with offices in Vienna and Berlin.
      - Researchers working on DB-related topics in South-East Asia.
  - The information needed to answer such queries is available on the Web, but opaque to current Web search.
  - Leipzig.de has everything about childcare in Leipzig.
  - Immobilienscout.de knows all about real estate offers in Germany.
  - Diagram labels: DB, Web server, search engine, HTML, RDF
RDF Statement / Triple Paradigm
  - Subject (resource), predicate (property), object (resource or literal)
  - Example: predicate dc:creator, object "Sören Auer"
  - RDF/XML: <rdf:RDF xmlns=" syntax-ns#" xmlns:dc=" ore#"> Sören Auer
  - RDF/N3: "Sören Auer"
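The subject/predicate/object structure of a statement can be sketched in a few lines of Python. This is only an illustration: the subject URI `http://example.org/some-document` is a hypothetical placeholder (the slide's own URI is not preserved), and the `Triple` class and `to_ntriples` helper are names invented here. The Dublin Core element namespace is assumed for `dc:creator`.

```python
from typing import NamedTuple

# Assumption: dc: refers to the Dublin Core element set namespace.
DC = "http://purl.org/dc/elements/1.1/"


class Triple(NamedTuple):
    subject: str            # URI of the resource being described
    predicate: str          # URI of the property
    obj: str                # URI of another resource, or a literal value
    obj_is_literal: bool = False


def to_ntriples(t: Triple) -> str:
    """Serialize one statement as a single N-Triples line."""
    if t.obj_is_literal:
        # Escape backslashes and double quotes inside the literal.
        escaped = t.obj.replace("\\", "\\\\").replace('"', '\\"')
        obj = f'"{escaped}"'
    else:
        obj = f"<{t.obj}>"
    return f"<{t.subject}> <{t.predicate}> {obj} ."


# Hypothetical subject URI; predicate and object follow the slide.
statement = Triple("http://example.org/some-document", DC + "creator", "Sören Auer", True)
line = to_ntriples(statement)
print(line)
```

A set of such tuples is already a small RDF graph, which is the idea behind the "Document / Model / Graph" view.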
RDF Document / Model / Graph
  - Combines multiple RDF statements into a simple knowledge base
  - Example graph: properties dc:creator and foaf:name, with the literal "Sören Auer"
RDF-S Class & Property Hierarchies
  - Class hierarchy:
      Beer rdf:type rdfs:Class
      BottomFermentedBeer rdfs:subClassOf Beer
      Bock rdfs:subClassOf BottomFermentedBeer
      Lager rdfs:subClassOf BottomFermentedBeer
      Pilsner rdfs:subClassOf BottomFermentedBeer
  - Property hierarchy:
      hasContent rdf:type rdf:Property
      hasAlcoholicContent rdfs:subPropertyOf hasContent
      hasOriginalWortContent rdfs:subPropertyOf hasContent
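What a class hierarchy buys in practice is inference: anything typed as a Pilsner is also a BottomFermentedBeer and a Beer. A minimal sketch of that transitive rdfs:subClassOf reasoning, using the beer classes from the slide, is given here. The dictionary layout and the function names `superclasses` and `infer_types` are assumptions made for illustration, not part of any RDF library.

```python
# Direct rdfs:subClassOf edges taken from the slide (child -> parents).
SUBCLASS_OF = {
    "BottomFermentedBeer": {"Beer"},
    "Bock": {"BottomFermentedBeer"},
    "Lager": {"BottomFermentedBeer"},
    "Pilsner": {"BottomFermentedBeer"},
}


def superclasses(cls, direct=SUBCLASS_OF):
    """Return all direct and indirect superclasses of cls."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in direct.get(current, ()):
            if parent not in seen:   # also guards against cycles
                seen.add(parent)
                stack.append(parent)
    return seen


def infer_types(asserted_type, direct=SUBCLASS_OF):
    """All classes an instance belongs to, given one asserted rdf:type."""
    return {asserted_type} | superclasses(asserted_type, direct)


print(sorted(infer_types("Pilsner")))
```

The same traversal works for rdfs:subPropertyOf: a statement using hasAlcoholicContent also implies one using hasContent.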
Semantic Web Layer Cake
  - RDF as base
  - RDFS: simple structures
  - OWL: "real" ontologies
  - SPARQL: querying RDF
  - Logic: reasoning over ontologies
Linked Data - Paradigm
  1. Use URIs as names for things.
  2. Use HTTP URIs so that people can look up those names.
  3. When someone looks up a URI, provide useful information.
  4. Include links to other URIs, so more things can be discovered.
Linked Data - Publishing RDF
  - De-referenceable RDF URIs
  - Different HTTP responses depending on the HTTP Accept header
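Serving different responses depending on the Accept header is plain HTTP content negotiation. The sketch below shows one way a server might pick between an RDF document and an HTML page for the same resource URI. The `/data/` and `/page/` path layout, the list of RDF media types, and the function names are illustrative assumptions, not a description of any particular server.

```python
def parse_accept(header):
    """Return the acceptable media types of an Accept header, best first."""
    entries = []
    for position, part in enumerate(header.split(",")):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0  # default quality when no q parameter is given
        for field in fields[1:]:
            name, _, value = field.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        if q > 0:  # q=0 means "not acceptable"
            entries.append((-q, position, media_type))
    return [media_type for _, _, media_type in sorted(entries)]


# Assumed set of RDF serializations the server can produce.
RDF_TYPES = {"application/rdf+xml", "text/turtle", "text/n3"}
HTML_TYPES = {"text/html", "application/xhtml+xml", "*/*"}


def redirect_target(resource_name, accept_header):
    """Choose the document to redirect to for a resource URI (hypothetical paths)."""
    for media_type in parse_accept(accept_header):
        if media_type in RDF_TYPES:
            return f"/data/{resource_name}"
        if media_type in HTML_TYPES:
            return f"/page/{resource_name}"
    return f"/page/{resource_name}"  # fall back to the human-readable page


print(redirect_target("Berlin", "application/rdf+xml"))
print(redirect_target("Berlin", "text/html,application/xhtml+xml,*/*;q=0.8"))
```

An RDF browser and a Web browser can thus use the same URI as the name of a thing and each receive a representation it understands.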
Benefits of Using the RDF Data Model in the Linked Data Context
  - Clients can look up any URI in an RDF graph to retrieve additional information
  - Links between different sources can be set (owl:sameAs, rdfs:seeAlso)
  - Information from different sources merges naturally
  - The data model allows combining information expressed using different schemata
  - The model lets you use as much or as little structure as you need
  - Provenance via namespaces
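Why information "merges naturally" can be seen by treating each source as a set of triples: merging is a set union, and owl:sameAs links let a client collect facts stated about the same thing under different URIs. The following is a minimal sketch under that view. All `a.example` and `b.example` URIs and their values are made up for illustration, and `merge`, `same_as_closure` and `describe` are hypothetical helper names.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"

# Two hypothetical sources using different URIs and vocabularies.
source_a = {
    ("http://a.example/Leipzig", "http://a.example/label", "Leipzig"),
}
source_b = {
    ("http://b.example/city/42", "http://b.example/country", "Germany"),
    ("http://b.example/city/42", OWL_SAME_AS, "http://a.example/Leipzig"),
}


def merge(*graphs):
    """Merging RDF graphs is a union of their triple sets."""
    merged = set()
    for graph in graphs:
        merged |= graph
    return merged


def same_as_closure(uri, graph):
    """All URIs connected to uri through owl:sameAs, in either direction."""
    aliases = {uri}
    frontier = [uri]
    while frontier:
        current = frontier.pop()
        for s, p, o in graph:
            if p != OWL_SAME_AS:
                continue
            for a, b in ((s, o), (o, s)):
                if a == current and b not in aliases:
                    aliases.add(b)
                    frontier.append(b)
    return aliases


def describe(uri, graph):
    """Collect (predicate, object) pairs about uri and all of its aliases."""
    aliases = same_as_closure(uri, graph)
    return {(p, o) for s, p, o in graph if s in aliases and p != OWL_SAME_AS}


merged = merge(source_a, source_b)
facts = describe("http://a.example/Leipzig", merged)
print(sorted(facts))
```

No schema alignment is needed before merging; the two vocabularies simply coexist in one graph.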
Linking Open Data (LOD) Cloud
Transforming Wikipedia into a Knowledge Base
  - Strengths:
      - Wikipedia is the 8th most popular website (according to Alexa.com)
      - Maybe the finest example of truly collaboratively created content (>8M articles, >200 languages, > authors)
      - Covers many possible topics and domains; articles are the result of a "community consensus"
  - Weaknesses:
      - Many inconsistencies can be found across different pages and language versions
      - Not very well integrated with other data sources
      - Lacks structured representations of content that would facilitate querying and search
Transforming Wikipedia into a Knowledge Base
  - Simple questions that are hard to answer:
      - What do Innsbruck and Leipzig have in common?
      - Who are the mayors of central European towns at an elevation above 1000 m?
      - Which films are longer than 4 hours and had a budget of less than $1 million?
  - The information required to answer these is contained in Wikipedia!
  - How can we reveal the structure and semantics of Wikipedia content?
Structure in Wikipedia
  - Title
  - Abstract
  - Infoboxes
  - Geo-coordinates
  - Categories
  - Images
  - Links: to other language versions, to other Wikipedia pages, to the Web
  - Redirects
  - Disambiguations
Infobox Templates
  - Wikitext syntax:
      {{Infobox Korean settlement
      | title = Busan Metropolitan City
      | img = Busan.jpg
      | imgcaption = A view of the [[Geumjeong]] district in Busan
      | hangul = 부산 광역시
      ...
      | area_km2 =
      | pop =
      | popyear = 2006
      | mayor = Hur Nam-sik
      | divs = 15 wards (Gu), 1 county (Gun)
      | region = [[Yeongnam]]
      | dialect = [[Gyeongsang]]
      }}
  - RDF representation:
      dbp:Busan dbp:title "Busan Metropolitan City"
      dbp:Busan dbp:hangul "부산 광역시"
      dbp:Busan dbp:area_km2 "763.46"^^xsd:float
      dbp:Busan dbp:pop " "^^xsd:int
      dbp:Busan dbp:region dbp:Yeongnam
      dbp:Busan dbp:dialect dbp:Gyeongsang
      ...
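The mapping from infobox attributes to triples can be approximated with a short script. This is a simplified sketch of the idea and not the DBpedia extractor: it assumes one flat template, treats a value that is exactly one wiki link as a resource, guesses `xsd:int` or `xsd:float` from the value's shape, and skips empty attributes. The function name `infobox_to_triples` and the `dbp:` prefix handling are assumptions for illustration.

```python
import re

# Split on "|" unless the pipe sits inside a [[wiki link|label]].
FIELD_SEPARATOR = re.compile(r"\|(?![^\[]*\]\])")
WIKI_LINK = re.compile(r"\[\[([^\]|]+)(?:\|[^\]]*)?\]\]")


def infobox_to_triples(page_name, wikitext):
    """Turn a flat {{Infobox ...}} template into (subject, predicate, object) strings."""
    inner = wikitext.strip()
    if inner.startswith("{{") and inner.endswith("}}"):
        inner = inner[2:-2]
    subject = "dbp:" + page_name.replace(" ", "_")
    triples = []
    # The first field is the template name, the rest are key = value pairs.
    for field in FIELD_SEPARATOR.split(inner)[1:]:
        key, _, value = field.partition("=")
        key, value = key.strip(), value.strip()
        if not key or not value:
            continue  # empty attributes carry no information
        link = WIKI_LINK.fullmatch(value)
        if link:
            obj = "dbp:" + link.group(1).strip().replace(" ", "_")
        elif re.fullmatch(r"-?\d+", value):
            obj = f'"{value}"^^xsd:int'
        elif re.fullmatch(r"-?\d+\.\d+", value):
            obj = f'"{value}"^^xsd:float'
        else:
            obj = f'"{value}"'
        triples.append((subject, "dbp:" + key, obj))
    return triples


sample = """{{Infobox Korean settlement
| title = Busan Metropolitan City
| area_km2 = 763.46
| popyear = 2006
| region = [[Yeongnam]]
| pop =
}}"""

for triple in infobox_to_triples("Busan", sample):
    print(*triple)
```

Real infoboxes need considerably more care (nested templates, units, lists, inconsistent attribute names), which is what the parsers and ontology mappings in the extraction framework address.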
DBpedia Architecture
  - Page collections: Wikipedia dumps, Wikipedia database, live Wikipedia (via OAI-PMH, update stream, article queue)
  - Extraction manager running extraction jobs
  - Extractors: generic infobox, mapping-based infobox, label, geo, redirect, disambiguation, image, abstract, page link, Wikipedia category
  - Parsers: date/time, units, string list, numbers, geo; ontology mappings
  - Destinations: N-Triples dumps (N-Triples serializer), SPARQL-Update destination
  - Triple store: Virtuoso, providing a SPARQL endpoint and Linked Data
  - Consumers on the Web: RDF browsers, HTML browsers, SPARQL clients, DBpedia apps
Results
  - Extraction in 30 languages
  - DBpedia describes 2.6 million things, including at least 213,000 persons, 328,000 places, 57,000 music albums, 36,000 films, 20,000 companies
  - 274 million RDF triples
  - Manual tests reveal ~87% accuracy (9% redundant information, 2% irrelevant, 1% errors)
  - Very large multi-domain ontology

  Dataset (English only)                      Triples
  Titles                                      2.7M
  Abstracts                                   7.6M
  External links                              3.2M
  Categories                                  7.3M
  Infoboxes (generic)                         26.0M
  Infoboxes (mapped)                          7.0M
  Yago classes                                2M
  Geo-coordinates                             450k
  Properties                                  66k
  Mapping to Flickr, DBLP, Eurostat,
    CIA Factbook, MusicBrainz, Project
    Gutenberg, US Census, ...                 2.5M
  Mapping to OpenCyc                          45k

Hierarchies
  - DBpedia Ontology Schema: manually created for DBpedia (infoboxes); 170 classes properties; 7 million triples
  - YAGO: large hierarchy linking Wikipedia leaf categories to WordNet; 250,000 classes
  - UMBEL (Upper Mapping and Binding Exchange Layer): classes derived from OpenCyc
  - Wikipedia categories: not a class hierarchy (e.g. cycles), represented using SKOS; 415,000 categories
DBpedia SPARQL Endpoint
  - Hosted on an OpenLink Virtuoso server
  - Can answer SPARQL queries such as:
      - All sitcoms that are set in NYC?
      - All tennis players from Moscow?
      - All films by Quentin Tarantino?
      - All German musicians who were born in Berlin in the 19th century?
      - All soccer players with shirt number 11 who play for a club whose stadium has over 40,000 seats and who were born in a country with over 10 million inhabitants?
DBpedia SPARQL Endpoint
  SELECT ?name ?birth ?description ?person
  WHERE {
    ?person dbp:birthPlace dbp:Berlin .
    ?person skos:subject dbp:Cat:German_musicians .
    ?person dbp:birth ?birth .
    ?person foaf:name ?name .
    ?person rdfs:comment ?description .
    FILTER (LANG(?description) = 'en') .
  }
  ORDER BY ?name
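A query like the one above is sent to an endpoint as an ordinary HTTP request with the query text in a `query` parameter. The sketch below only builds such a request with the Python standard library and does not send it. It assumes the endpoint address `http://dbpedia.org/sparql`, assumes that the endpoint predefines the `dbp:`, `skos:`, `foaf:` and `rdfs:` prefixes (the slide does not declare them), and uses a hypothetical helper name `build_request`.

```python
from urllib.parse import parse_qs, urlencode, urlsplit
from urllib.request import Request

# Assumption: public DBpedia endpoint address; not stated on the slide.
ENDPOINT = "http://dbpedia.org/sparql"

# Query text as on the slide; prefixes are assumed to be predefined by the endpoint.
QUERY = """SELECT ?name ?birth ?description ?person
WHERE {
  ?person dbp:birthPlace dbp:Berlin .
  ?person skos:subject dbp:Cat:German_musicians .
  ?person dbp:birth ?birth .
  ?person foaf:name ?name .
  ?person rdfs:comment ?description .
  FILTER (LANG(?description) = 'en') .
}
ORDER BY ?name"""


def build_request(query, endpoint=ENDPOINT):
    """Build (but do not send) a SPARQL protocol GET request asking for JSON results."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


request = build_request(QUERY)

# Round-trip check: the query survives URL encoding unchanged.
decoded = parse_qs(urlsplit(request.full_url).query)["query"][0]
print(decoded == QUERY)
```

Passing the request to `urllib.request.urlopen` would execute it; whether the query still returns results depends on the vocabulary the endpoint currently uses.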
DBpedia Applications
  - DBpedia Mobile: location-aware mobile client for DBpedia
      - Uses the current location and DBpedia to display a map
      - Can navigate into other knowledge bases
  - DBpedia Query Builder: user front end for building queries
  - DBpedia Relationship Finder: finds relations between two objects in DBpedia
DBpedia Applications (3rd party)
  - Muddy Boots (BBC): annotates actors in BBC News with DBpedia identifiers
  - Open Calais (Reuters): named entity recognition; entities are connected via owl:sameAs to DBpedia, Freebase, Geonames
  - Faviki: social bookmarking tool that uses DBpedia in the backend to group tags etc. and for multi-language support
  - TopBraid Composer: ontology editor that links entities to DBpedia based on their labels
Semantic Wikis
  - Wiki with an underlying model of knowledge, often RDF/OWL
  - Wiki pages + knowledge base
  - Formal knowledge can be:
      - Included in Wiki pages (special syntax)
      - Derived from Wiki pages (DBpedia approach, controlled natural language)
      - The base of the Wiki itself (data centered, everything is RDF/OWL)
  - Semantic Wiki content is (partially) machine processable and allows (complex) querying and reasoning
  - Provides better search, browsing (facets), and re-use
Semantic Wiki Systems
  - AceWiki
  - Artificial Memory
  - BOWiki (MediaWiki extension)
  - Hypertext Knowledge Workbench
  - IkeWiki
  - KiWi
  - Knoodl
  - KnowWE
  - OntoWiki
  - OpenRecord
  - Semantic MediaWiki (MediaWiki extension)
  - Subleme
  - SweetWiki
  - SWiM (offshoot of IkeWiki)
  - SWOOKI (peer-to-peer Semantic Wiki)
Semantic MediaWiki
  - Developed mainly at AIFB Karlsruhe by Denny Vrandecic and Markus Krötzsch
  - Extension of MediaWiki (the software underlying Wikipedia)
  - Focus on editing instance data as opposed to creating an expressive ontology
  - Main goal is to deploy the system on Wikipedia
  - Scalability is crucial: requests per second in Wikipedia
  - Only lightweight reasoning possible
  - Class and property hierarchies
Semantic MediaWiki
  - Typed links: "Busan is a city in [[located in::Korea]] with a population of [[population::3,635,389]]."
  - Properties have their own pages where their type is specified (page, date, , number, ...)
  - The subject is always the current page
  - Inline queries:
      {{#ask: [[Category:City]] [[located in::Germany]]
       | ?population
       | ?area#km² = Size in km²
      }}
  - Semantic forms
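How typed links turn page text into statements about the current page can be sketched with a regular expression. This is an illustration of the annotation syntax only, not Semantic MediaWiki's parser: the pattern covers the simple `[[property::value]]` form plus an optional `|label`, and the function names `extract_statements` and `render_text` are hypothetical.

```python
import re

# [[property::value]] or [[property::value|display label]]
ANNOTATION = re.compile(r"\[\[([^\[\]:|]+)::([^\[\]|]+)(?:\|([^\[\]]*))?\]\]")


def extract_statements(page_title, wikitext):
    """The subject of every annotation is the page the text appears on."""
    return [
        (page_title, match.group(1).strip(), match.group(2).strip())
        for match in ANNOTATION.finditer(wikitext)
    ]


def render_text(wikitext):
    """What a reader sees: the label if one is given, otherwise the value."""
    return ANNOTATION.sub(lambda m: m.group(3) or m.group(2), wikitext)


text = ("Busan is a city in [[located in::Korea]] with a population of "
        "[[population::3,635,389]].")

print(extract_statements("Busan", text))
print(render_text(text))
```

The extracted tuples are what an inline `#ask` query later matches against, for example all pages with a `located in` value of Germany.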
Semantic MediaWiki: OpenResearch.org
  - Based on SMW
  - Support for scientific content types:
      - Events (conferences, workshops, etc.)
      - People, research groups, science genealogy
      - Journals
      - Funding calls
  - Additional categorization schemes include scientific field (not limited to CS) and location/region
  - Semantic annotation and structuring of these facilitate search (e.g. SE conferences by acceptance rate)
  - Already one of the largest KBs of science meta-information: more than pages/entities
IkeWiki
  - Main developer: Sebastian Schaffert
  - Founded in the KIWI EU FP7 project
      - Knowledge management with a focus on people
      - KIWI = Knowledge Management + Wiki Philosophy + Semantic Web
      - Diagram labels: People, Technology, Organisation
  - IkeWiki is used as a prototype for KIWI ideas; later there will be a KIWI system
  - Collaboration of domain experts and ontology engineers
  - Uses OWL-DL reasoning
IkeWiki
  - Screenshot labels: rich text editor, related resources, ontology creation
OntoWiki
  - Semantic Wiki?
  - Concepts: differences, similarities
  - Functionality
  - Use cases
  - OntoWiki maintainer: Sebastian Dietzold
Conceptual Differences
  - No Wiki code! But: forms
  - No Wiki pages! But: views on resources
Conceptual Similarities, or "Why do you call that thing a Wiki?"
  - Philosophy: make it easy to correct mistakes...
  - Versioning: everything can be undone
Other Features
  - Interfaces: SPARQL endpoint, WebDAV, REST, command line interface, LDAP
  - Extensibility: plugins, themes
  - Access control: ontology-based, action-based, (statement-based)
  - GUI: facet-based browsing, inline editing, resource auto-suggestion
Vision
  1. Generic RDF Wiki
      - No data model mismatch (structured vs. unstructured)
  2. Application framework for
      - Knowledge-intensive applications
      - Agile processes
      - Distributed user groups
Use Case I: SoftWiki
  - Collaborative requirements engineering
  - OntoWiki base system
  - Custom views
Use Case II: vakantieland.nl
  - Dutch tourism portal
  - Displays points of interest
  - OntoWiki as backend for data management
  - OntoWiki API and widgets used in the front end
Use Case III: Professor Catalogue
  - University of Leipzig celebrates its 600th anniversary
  - Professor catalogue with 800 entries and 60 schema elements (classes + object and data properties)
  - OntoWiki as backend for collecting data
  - Custom front end
OntoWiki Summary
  - OntoWiki: a somewhat different semantic Wiki
      - No Wiki code / no Wiki pages
      - Views and forms
      - Works directly on RDF (and OWL)
  - A framework for knowledge-intensive applications
      - Several interfaces
      - Extensible through plugins / themes
      - Access control
Main Semantic Wikis: Overview
Summary
  - Web of Data:
      - The Data Web is a step towards realising the Semantic Web vision
      - Linked Data principles are used to publish knowledge bases
      - DBpedia forms the nucleus of the Data Web
  - Semantic Wikis:
      - Combination of Wikis with formal knowledge (often RDF/OWL)
      - Text-centered Wikis (Semantic MediaWiki, IkeWiki)
      - Data-centered Wikis (OntoWiki)
The End
Thanks for your attention!