Georgi Kobilarov, Chris Bizer, Sören Auer, Jens Lehmann Freie Universität Berlin, Universität Leipzig.


1 Georgi Kobilarov, Chris Bizer, Sören Auer, Jens Lehmann Freie Universität Berlin, Universität Leipzig

2 Querying Wikipedia like a Database

3 Title Description Languages Web Links Categorization Domain specific Data Images Infoboxes

4 Infobox Extraction dbpedia:Albert_Einstein p:name "Albert Einstein" . dbpedia:Albert_Einstein p:birth_place dbpedia:Ulm . dbpedia:Albert_Einstein p:birth_date "1879-03-14" .
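
The extraction step on this slide can be illustrated with a minimal sketch. It is not the DBpedia extraction framework: the function name `extract_triples`, the simplified `| key = value` line format, and the sample wikitext are assumptions made for illustration, and real infoboxes use nested templates that this sketch ignores. The namespaces behind `dbpedia:` and `p:` are assumed to be `http://dbpedia.org/resource/` and `http://dbpedia.org/property/`.

```python
import re

# Assumed expansions of the dbpedia: and p: prefixes shown on the slide.
DBPEDIA = "http://dbpedia.org/resource/"
PROP = "http://dbpedia.org/property/"


def extract_triples(title, wikitext):
    """Turn `| key = value` infobox lines into (subject, predicate, object) triples."""
    subject = DBPEDIA + title.replace(" ", "_")
    triples = []
    for line in wikitext.splitlines():
        match = re.match(r"\s*\|\s*(\w+)\s*=\s*(.+?)\s*$", line)
        if not match:
            continue  # skip the template header, the closing braces, empty lines
        key, value = match.groups()
        # A value that is exactly one wiki link becomes a resource URI;
        # anything else is kept as a plain literal.
        link = re.fullmatch(r"\[\[([^\]|]+)(?:\|[^\]]*)?\]\]", value)
        if link:
            obj = DBPEDIA + link.group(1).strip().replace(" ", "_")
        else:
            obj = value
        triples.append((subject, PROP + key, obj))
    return triples


# Hypothetical, heavily simplified infobox source.
sample = """{{Infobox scientist
| name = Albert Einstein
| birth_place = [[Ulm]]
| birth_date = 1879-03-14
}}"""

for triple in extract_triples("Albert Einstein", sample):
    print(triple)
```

Each infobox row yields one triple whose subject is derived from the article title, which is how a single page turns into several queryable facts.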

5 Property Synonyms

6 Structuring Wikipedia's Knowledge Structuring actual data, not modeling the world. Bound to Wikipedia templates: parsers handle template values based on rules (property splitting, merging, transformation).

7 DBpedia Ontology DBpedia Ontology built from scratch: 170 classes, 900 properties

8 No living things

9 Class Hierarchy „Select all TV Episodes …“

10 Template Mapping Class TV Episode (Work) Wikipedia Templates: Television Episode UK Office Episode Simpsons Episode DoctorWhoBox

11 Template Mapping Infobox Cricketer Infobox Historic Cricketer Infobox Recent Cricketer Infobox Old Cricketer Infobox Cricketer Biography => Class Cricketer (Athlete)
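
The two template-mapping slides and the "Select all TV Episodes" query can be sketched as a pair of lookup tables. This is only an illustration: the dictionary layout, the function names `class_chain` and `select_all`, and the page titles are hypothetical, while the template and class names are the ones listed on the slides.

```python
# Many Wikipedia templates map to one ontology class (names from the slides).
TEMPLATE_TO_CLASS = {
    "Television Episode": "TV Episode",
    "UK Office Episode": "TV Episode",
    "Simpsons Episode": "TV Episode",
    "DoctorWhoBox": "TV Episode",
    "Infobox Cricketer": "Cricketer",
    "Infobox Historic Cricketer": "Cricketer",
    "Infobox Recent Cricketer": "Cricketer",
    "Infobox Old Cricketer": "Cricketer",
    "Infobox Cricketer Biography": "Cricketer",
}

# Class hierarchy: child -> parent, as given in parentheses on the slides.
SUPERCLASS = {"TV Episode": "Work", "Cricketer": "Athlete"}


def class_chain(template):
    """Return the mapped class followed by all of its ancestors."""
    cls = TEMPLATE_TO_CLASS.get(template)
    chain = []
    while cls is not None:
        chain.append(cls)
        cls = SUPERCLASS.get(cls)
    return chain


def select_all(cls, pages):
    """Select the titles of all pages whose template maps to cls or a subclass."""
    return [title for title, template in pages if cls in class_chain(template)]


# Hypothetical (title, template) pairs standing in for Wikipedia pages.
pages = [
    ("Episode A", "Simpsons Episode"),
    ("Episode B", "DoctorWhoBox"),
    ("Player C", "Infobox Old Cricketer"),
]

print(select_all("TV Episode", pages))
print(select_all("Athlete", pages))
```

Because the query is phrased against the class rather than against individual templates, it keeps working when a new episode template is added to the mapping.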

12

13 People Actors Athlete Journalist MusicalArtist Politician Scientist Writer

14 Places Airport City Country Island Mountain River

15 Organisations Band Company Educational Institution Radio Station Sports Team

16 Event Convention Military Conflict Music Event Sport Event

17 Work Book Broadcast Film Software Television

18 More structured data Categories in SKOS Intra-wiki links Disambiguation Redirects Links to Images (and Flickr) Links to external webpages

19 Data about 2.6 million “things”

20 274 million pieces of information (RDF triples)

21 Multilingual Abstracts – English: 2,613,000 – German: 391,000 – French: 383,000 – Dutch: 284,000 – Polish: 256,000 – Italian: 286,000 – Spanish: 226,000 – Japanese: 199,000 – Portuguese: 246,000 – Swedish: 144,000 – Chinese: 101,000

22

23

24

25

26 DBpedia as Linked Data Hub

27 Semantic Web “My document can point at your document on the Web, but my database can't point at something in your database without writing special purpose code. The Semantic Web aims at fixing that.” Prof. James Hendler

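28 Web of Documents [Diagram: HTML documents A, B, C, D connected by hyperlinks over HTTP; consumed by web browsers and search engines]

29 Web of Data [Diagram: things A, B, C, D, E connected by data links over HTTP; consumed by Linked Data browsers, Linked Data mashups and search engines]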

30 Linked Data 1. Use URIs as names for things. 2. Use HTTP URIs so that people can look up those names. 3. When someone looks up a URI, provide useful information. 4. Include links to other URIs so that they can discover more things. Wikipedia Article URI: http://en.wikipedia.org/wiki/Madrid DBpedia Resource URI: http://dbpedia.org/resource/Madrid

31 HTTP URIs Information Resources: http://dbpedia.org/page/Madrid, HTTP GET -> 200 OK. Real-World Resources: http://dbpedia.org/resource/Madrid, HTTP GET -> 303 See Other, redirecting to http://dbpedia.org/page/Madrid or http://dbpedia.org/data/Madrid -> 200 OK
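
The lookup behaviour on this slide can be modelled without touching the network. The sketch below is a simulation, not the server's code: the function `resolve` is hypothetical, and choosing between `/page/` and `/data/` by inspecting the `Accept` header is an assumption, since the slide only lists both redirect targets.

```python
BASE = "http://dbpedia.org"


def resolve(uri, accept="text/html"):
    """Return (status, location) for a GET on a DBpedia-style URI."""
    resource_prefix = BASE + "/resource/"
    if uri.startswith(resource_prefix):
        # A real-world thing cannot be sent over HTTP, so the server answers
        # 303 See Other and points at a document that describes the thing.
        name = uri[len(resource_prefix):]
        # Assumption: RDF clients get the data document, browsers the HTML page.
        target = "/data/" if "rdf" in accept else "/page/"
        return 303, BASE + target + name
    if uri.startswith((BASE + "/page/", BASE + "/data/")):
        # Information resources are served directly.
        return 200, uri
    return 404, None


print(resolve("http://dbpedia.org/resource/Madrid"))
print(resolve("http://dbpedia.org/resource/Madrid", accept="application/rdf+xml"))
print(resolve("http://dbpedia.org/page/Madrid"))
```

Keeping the resource URI distinct from both document URIs lets statements about Madrid the city stay separate from statements about the page that describes it.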

32

33

34

35 Life Sciences Publications Online Activities Music Geographic Cross-Domain

36 4.5 billion triples 180 million data links

37

38

39 Use Cases

40 1. Data source for web applications 2. Querying Wikipedia like a database 3. Tag web content with concepts instead of free-text tags 4. Vocabulary and semantic backbone for enterprise linked data integration

41 DBpedia as data source Embed structured information from Wikipedia into your web applications Build (mobile) maps applications using DBpedia data about places Display multilingual titles & descriptions in 15 languages

42 DBpedia Mobile

43

44

45

46 Sparql Endpoint http://dbpedia.org/sparql
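
A query against the endpoint is an ordinary HTTP GET with the SPARQL text in a `query` parameter. The sketch below only builds the request and does not send it. The helper `build_request` and the example query are assumptions; the property name `birth_place` is taken from the infobox slide and may differ in the current dataset.

```python
from urllib.parse import urlencode
from urllib.request import Request

ENDPOINT = "http://dbpedia.org/sparql"


def build_request(query):
    """Build a GET request that asks the endpoint for JSON results."""
    url = ENDPOINT + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


# Hypothetical query: where was Albert Einstein born?
QUERY = """
SELECT ?place WHERE {
  <http://dbpedia.org/resource/Albert_Einstein>
      <http://dbpedia.org/property/birth_place> ?place
}
"""

request = build_request(QUERY)
print(request.get_method())
print(request.full_url)
```

Passing `request` to `urllib.request.urlopen` would execute the query; that step is left out so the example runs offline.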

47 Wikipedia Query

48 Annotating Documents Use DBpedia concepts to annotate documents instead of free-text tags Named Entity Extraction Systems already use DBpedia URIs (OpenCalais, Muddy Boots) Social Bookmarking with DBpedia URIs as tags www.faviki.com

49 „Apple“ http://dbpedia.org/resource/Apple_Inc. http://dbpedia.org/resource/Apple_(fruit) http://dbpedia.org/resource/Apple_Records

50 Annotating Documents BBC editors tag news articles with DBpedia concepts DBpedia Lookup Service http://lookup.dbpedia.org

51

52 Linking Enterprise Data Take the Linking Open Data approach to the enterprises

53

54 Linking Enterprise Data Connect data sets with DBpedia as shared vocabulary. Enable meaningful navigation paths across BBC websites: browsing Madonna-related information across BBC News, BBC Music, BBC Programmes, … Make use of the rich background information: relate the release of a music album to a news article about the artist.

55

56

57 The Future of DBpedia

58 Improve Information Extraction

59 Crowd-source Information Extraction

60 Crowd-Sourced Extraction Where's the user benefit?

61 Data Fusion

62 Cross-Language Data Fusion 264 Wikipedia Editions in different languages – Italian Wikipedians know more about Italian villages – German Wikipedia contains more person infoboxes Augment the infobox dataset with facts from other Wikipedia editions.
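
One simple way to picture cross-language fusion is a priority merge: for each property, take the value from the first edition that has one. The slide does not describe the project's fusion strategy, so the function `fuse`, the priority order, and the placeholder facts below are all assumptions for illustration.

```python
def fuse(editions, priority):
    """Merge per-edition facts, preferring editions earlier in `priority`.

    Returns {property: (value, source_language)} so provenance is kept.
    """
    fused = {}
    for lang in priority:
        for prop, value in editions.get(lang, {}).items():
            # setdefault keeps the value from the highest-priority edition.
            fused.setdefault(prop, (value, lang))
    return fused


# Placeholder facts about a hypothetical Italian village.
editions = {
    "en": {"name": "Example Village"},
    "it": {"name": "Villaggio Esempio", "population": "1234", "mayor": "N. N."},
}

# For an Italian village, trust the Italian edition first.
print(fuse(editions, priority=["it", "en"]))
# With English first, Italian facts only fill the gaps.
print(fuse(editions, priority=["en", "it"]))
```

Recording the source language next to each value makes it possible to revisit a choice when two editions disagree.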

63 Augment DBpedia with External Data The Linking Open Data cloud provides more data than Wikipedia – EuroStat provides additional statistical information about countries. – MusicBrainz contains additional information about bands. – Geonames provides additional information about locations. Idea – Augment DBpedia with additional data from external sources.

64 Contribute back to Wikipedia Opportunity – Feed data back to Wikipedia. Extend the Wikipedia authoring environment with – Suggestions for infobox values – Cross-language consistency checking for infoboxes. Currently going on – New maps in Wikipedia based on DBpedia Mobile code (OpenStreetMap)

65 Contribute back to Wikipedia Initialize Wikipedia Clean-Up Cycles – Data-driven search interfaces expose the weaknesses of Wikipedia template system. – Preferred items not showing up in end-user interfaces may motivate Wikipedia editors to use templates more stringently.

66 Live Update Current Situation – DBpedia update cycle: 3 months – Wikipedia provides us with access to the live update stream Opportunity – Increase the currency of the DBpedia dataset using this update stream Result – DBpedia in synchronization with Wikipedia.

67 Open Source

68 Open Data

69 What is the Wikipedia for Data?

70 Wikipedia is the Wikipedia for Data

71 Summary

72 http://dbpedia.org georgi.kobilarov@fu-berlin.de

