Center for E-Business Technology Seoul National University Seoul, Korea Freebase: A Collaboratively Created Graph Database For Structuring Human Knowledge.


1 Center for E-Business Technology, Seoul National University, Seoul, Korea
  Freebase: A Collaboratively Created Graph Database For Structuring Human Knowledge
  Kurt Bollacker, Colin Evans, Praveen Paritosh, Tim Sturge, Jamie Taylor
  Metaweb Technologies, Inc., San Francisco
  International Conference on Management of Data (2008)
  2008. 11. 12. Summarized & presented by Babar Tareen, IDS Lab., Seoul National University

2 Copyright © 2008 by CEBT Motivation – Wikipedia
  - Free multilingual encyclopedia
  - Supports 264 languages
  - 854 volumes of English articles

3 Motivation – English Wikipedia Growth

4 Introduction
  - A public repository of the world's knowledge
  - Inspired by the Semantic Web and Wikipedia
  - Supports highly diverse and heterogeneous data
  - Tries to merge the scalability of structured databases with the diversity of collaborative wikis into a practical, scalable database of structured general human knowledge
  - The information contained in Freebase is open to anyone
  - However, the Freebase backend database is not open

5 Data Sources
  - User contributions
  - Metaweb bots
  - Incorporates facts from many large, publicly available information sources

6 Data Model
  - Freebase is a graph database
  - A set of nodes and a set of links that establish relationships between the nodes
  - Key concepts
    - Domains
      - Bases: collections of topics created by users
      - Commons: similar to bases but more general
      - Examples: Film, Religion, Computers
    - Types
      - Analogous to classes
      - Examples: Film Actor, Film Festival, Film Distribution, Film Rating, Film Format
    - Properties
      - Specific information elements within a type
      - Examples: Film Performances, Film Dubbing Performances, IMDb Entry
    - Topics
      - Analogous to objects
      - Instances of a type
      - Topics can be linked to other domains or other topics
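The node-and-link model on this slide can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Freebase's implementation: the `Topic` and `Graph` classes and their methods are hypothetical names, and the sample facts reuse the Spider-Man example from the Metaweb Query Language slide.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    """A node: one topic, which may carry several types (hypothetical class)."""
    name: str
    types: set = field(default_factory=set)


@dataclass
class Graph:
    """A set of nodes plus a set of links between them (hypothetical class)."""
    nodes: dict = field(default_factory=dict)
    links: list = field(default_factory=list)  # (source, property, target)

    def add_topic(self, name, *types):
        # Reuse the existing node if the topic is already known.
        topic = self.nodes.setdefault(name, Topic(name))
        topic.types.update(types)
        return topic

    def link(self, source, prop, target):
        self.links.append((source, prop, target))

    def targets(self, source, prop):
        # Follow every link labelled `prop` that starts at `source`.
        return [t for s, p, t in self.links if s == source and p == prop]


graph = Graph()
graph.add_topic("Spider-Man", "/fictional_universe/fictional_character")
graph.add_topic("Steve Ditko")
graph.add_topic("Stan Lee")
graph.link("Spider-Man", "character_created_by", "Steve Ditko")
graph.link("Spider-Man", "character_created_by", "Stan Lee")

print(graph.targets("Spider-Man", "character_created_by"))
```

Keeping types as a set on the topic reflects the point that a single topic can be an instance of more than one type.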

7 Data Model (2)

8 Key Components
  - A scalable tuple store
  - An HTTP/JSON-based API
    - MQL for read/write operations
  - A lightweight, collaborative typing system
    - Loose collection of structuring mechanisms and conventions
  - A large, diverse data set
    - 100 million assertions
    - 4000 types
  - A philosophy of "complete normalization"
    - Only one GUID for a real-world object
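The ideas of a tuple store and "complete normalization" can be illustrated with a small sketch. The slides do not describe the store's internals, so everything here is an assumption made for illustration: the `TupleStore` class, its method names, the use of a plain string as the identity key, and the choice of random UUIDs as GUIDs.

```python
import uuid


class TupleStore:
    """Toy tuple store: every fact is a (guid, property, value) tuple."""

    def __init__(self):
        self._guid_by_key = {}  # identity key -> GUID (assumed lookup scheme)
        self.tuples = []        # list of (guid, property, value)

    def guid_for(self, key):
        # Complete normalization: one real-world object gets exactly one GUID.
        if key not in self._guid_by_key:
            self._guid_by_key[key] = uuid.uuid4().hex
        return self._guid_by_key[key]

    def assert_fact(self, key, prop, value):
        fact = (self.guid_for(key), prop, value)
        self.tuples.append(fact)
        return fact

    def facts_about(self, key):
        # Unknown keys have no GUID and therefore no facts.
        guid = self._guid_by_key.get(key)
        return [(p, v) for g, p, v in self.tuples if g == guid]


store = TupleStore()
store.assert_fact("Spider-Man", "type", "/fictional_universe/fictional_character")
store.assert_fact("Spider-Man", "character_created_by", "Stan Lee")

# Both assertions hang off the same GUID.
distinct_guids = {guid for guid, _, _ in store.tuples}
print(len(distinct_guids))
```

Because all facts about an object attach to one identifier, contributions from different users or bots accumulate on the same node instead of creating duplicates.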

9 Data Entry

10 Schema Creation

11 Data Evaluation

12 Metaweb Query Language
  - Who created the comic character Spider-Man?

  QUERY

    [
      {
        "character_created_by" : null,
        "name" : "Spider-Man",
        "type" : "/fictional_universe/fictional_character"
      }
    ]

  RESPONSE

    {
      "code" : "/api/status/ok",
      "q1" : {
        "code" : "/api/status/error",
        "messages" : [
          {
            "code" : "/api/status/error/mql/result",
            "info" : {
              "count" : 2,
              "result" : [ "Steve Ditko", "Stan Lee" ]
            },
            "message" : "Unique query may have at most one result. Got 2",
            "path" : "character_created_by",
            "query" : [
              {
                "character_created_by" : null,
                "error_inside" : "character_created_by",
                "name" : "Spider-Man",
                "type" : "/fictional_universe/fictional_character"
              }
            ]
          }
        ]
      },
      "status" : "200 OK",
      "transaction_id" : "cache;cache01.p01.sjc1:8101;2008-11-11T05:54:45Z;0021"
    }
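The response on this slide is an error: in MQL a `null` value asks for a single result, and Spider-Man has two creators. Writing `[]` instead asks for a list of values. The sketch below shows the adjusted query and a way to read the two names out of the error envelope shown above. It makes no network call; the helper name `values_from_uniqueness_error` is hypothetical, and the `response` dictionary is an abbreviated copy of the slide's response.

```python
import json

# Same query as on the slide, but with [] so that several creators are allowed.
query = [{
    "character_created_by": [],
    "name": "Spider-Man",
    "type": "/fictional_universe/fictional_character",
}]

# Abbreviated copy of the response envelope from the slide.
response = {
    "code": "/api/status/ok",
    "q1": {
        "code": "/api/status/error",
        "messages": [{
            "code": "/api/status/error/mql/result",
            "info": {"count": 2, "result": ["Steve Ditko", "Stan Lee"]},
            "message": "Unique query may have at most one result. Got 2",
            "path": "character_created_by",
        }],
    },
    "status": "200 OK",
}


def values_from_uniqueness_error(envelope, query_name="q1"):
    """Collect the values reported inside result errors of one named query."""
    inner = envelope[query_name]
    if inner["code"] != "/api/status/error":
        return []
    values = []
    for message in inner["messages"]:
        if message["code"] == "/api/status/error/mql/result":
            values.extend(message["info"]["result"])
    return values


print(json.dumps(query, indent=2))
print(values_from_uniqueness_error(response))
```

Note that the outer `code` is `/api/status/ok` while the inner `q1` entry carries the error, so a client has to check the status of each named query, not just the envelope.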

13 MQL Queries
  - Characters created by Stan Lee
  - Foreign donations to 2008 US political candidates
  - Nikon cameras in order of resolution
  - Tropical storms in the 90's
  - Mountains of the Himalayas
  - African American authors and their books
  - Web browsers that run on the Mac
  - US cities named Canton
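Queries like these follow a fill-in-the-blanks pattern: known values constrain the match and `null` marks what should be returned. As a rough sketch, "Characters created by Stan Lee" can be expressed by turning the Spider-Man query around, and a toy matcher can show the idea on local data. The `match` function is a hypothetical stand-in written for this example, not MQL itself, and the sample records other than Spider-Man are illustrative data that does not come from the slides.

```python
# Template in the style of the Spider-Man query: fix the creator, ask for names.
template = {
    "character_created_by": "Stan Lee",
    "name": None,  # None plays the role of MQL's null: "fill this in"
    "type": "/fictional_universe/fictional_character",
}

# Illustrative sample records (only Spider-Man appears in the slides).
records = [
    {"name": "Spider-Man", "type": "/fictional_universe/fictional_character",
     "character_created_by": ["Steve Ditko", "Stan Lee"]},
    {"name": "Hulk", "type": "/fictional_universe/fictional_character",
     "character_created_by": ["Stan Lee", "Jack Kirby"]},
    {"name": "Batman", "type": "/fictional_universe/fictional_character",
     "character_created_by": ["Bob Kane", "Bill Finger"]},
]


def match(template, records):
    """Return the filled-in template for every record that satisfies it."""
    results = []
    for record in records:
        row = {}
        for key, wanted in template.items():
            actual = record.get(key)
            if wanted is None:
                row[key] = actual          # blank: report the stored value
            elif isinstance(actual, list):
                if wanted not in actual:   # multi-valued property
                    break
                row[key] = wanted
            elif actual != wanted:
                break
            else:
                row[key] = wanted
        else:
            # Reached only when no constraint failed.
            results.append(row)
    return results


names = [row["name"] for row in match(template, records)]
print(names)
```

The same template shape works for the other questions on the slide once the relevant types and properties are known.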

14 Applications
  - Parallax: Freebase browser, http://mqlx.com/~david/parallax/index.html
  - Powerset: semantic search engine, http://www.powerset.com/
  - ArchiPortal, http://dev.mqlx.com/~zak/arch/
  - Dipity Timelines, http://www.dipity.com/

15 Discussion
  - Simple architecture
  - Topics can be associated with multiple types
  - Analogous to having a database of knowledge
  - But now we have two knowledge bases to maintain
    - Wikipedia
    - Freebase

16 References
  - Freebase, http://www.freebase.com
  - The Semantic Edge (Web 2.0 Summit 2007), http://www.web2summit.com/cs/web2007/view/e_sess/15043
  - MQL Query Editor, http://www.freebase.com/tools/queryeditor/
  - Freebase Blog, http://blog.freebase.com/
  - Freebase Sample Queries, http://www.freebase.com/view/freebase/freebase_query

