Freebase: A Collaboratively Created Graph Database For Structuring Human Knowledge

Presentation transcript:

Freebase: A Collaboratively Created Graph Database For Structuring Human Knowledge
Kurt Bollacker, Colin Evans, Praveen Paritosh, Tim Sturge, Jamie Taylor
Metaweb Technologies, Inc., San Francisco
International Conference on Management of Data (2008)
Summarized & presented by Babar Tareen, IDS Lab., Seoul National University
Center for E-Business Technology, Seoul National University, Seoul, Korea

Motivation – Wikipedia
- Free multilingual encyclopedia
- Supports 264 languages
- 854 volumes of English articles
Copyright © 2008 by CEBT

Motivation – English Wikipedia Growth [figure]

Introduction
- A public repository of the world’s knowledge
- Inspired by the Semantic Web and Wikipedia
- Supports highly diverse and heterogeneous data
- Tries to merge the scalability of structured databases with the diversity of collaborative wikis into a practical, scalable database of structured general human knowledge
- The information contained in Freebase is open to anyone
- However, the Freebase backend database is not open

Data Sources
- User contributions
- Metaweb bots
- Incorporates facts from many large, publicly available information sources

Data Model
- Freebase is a graph database
- Set of nodes and a set of links that establish relationships between the nodes
- Key Concepts
  Domains
    – Bases: collections of topics created by users
    – Commons: similar to bases but more general
    – Examples: Film, Religion, Computers
  Types
    – Analogous to classes
    – Examples: Film Actor, Film Festival, Film Distribution, Film Rating, Film Format
  Properties
    – Specific information elements within a type
    – Examples: Film Performances, Film Dubbing Performances, IMDb Entry
  Topics
    – Analogous to objects
    – Instances of a type
    – Topics can be linked to other domains or other topics
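The node-and-link model on this slide can be illustrated with a small in-memory sketch. This is not Freebase's implementation; the class names (`Topic`, `Link`, `Graph`), the short identifiers `t1` to `t3`, and the `/people/person` type assigned to the creators are assumptions chosen for illustration. Only the character type and the `character_created_by` property come from the deck's own query example.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    """A node: analogous to an object, an instance of one or more types."""
    guid: str
    name: str
    types: set = field(default_factory=set)


@dataclass(frozen=True)
class Link:
    """A directed, labelled edge: source topic --property--> target topic."""
    source: str
    prop: str
    target: str


class Graph:
    """Hypothetical in-memory graph: a set of topics plus a list of links."""

    def __init__(self):
        self.topics = {}
        self.links = []

    def add_topic(self, guid, name, *types):
        topic = Topic(guid, name, set(types))
        self.topics[guid] = topic
        return topic

    def link(self, source, prop, target):
        self.links.append(Link(source, prop, target))

    def targets(self, source, prop):
        """Names of all topics reachable from `source` via property `prop`."""
        return [
            self.topics[l.target].name
            for l in self.links
            if l.source == source and l.prop == prop
        ]


g = Graph()
# Identifiers t1..t3 are placeholders, not real Freebase GUIDs.
g.add_topic("t1", "Spider-Man", "/fictional_universe/fictional_character")
g.add_topic("t2", "Stan Lee", "/people/person")
g.add_topic("t3", "Steve Ditko", "/people/person")
g.link("t1", "character_created_by", "t3")
g.link("t1", "character_created_by", "t2")

creators = sorted(g.targets("t1", "character_created_by"))
print(creators)
```

Because a property is just a label on a link, one topic can carry several values for the same property, which is why a "who created this character" lookup can return more than one name.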

Data Model (2) [figure]

Key Components
- A scalable tuple store
- An HTTP/JSON-based API
    – MQL for read / write operations
- A lightweight, collaborative typing system
    – Loose collection of structuring mechanisms and conventions
- A large, diverse data set
    – 100 million assertions
    – 4000 types
- A philosophy of “complete normalization”
    – Only one GUID for a real-world object
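The "complete normalization" idea, one GUID per real-world object with every fact stored as a tuple against that GUID, might look roughly like the sketch below. The `TopicRegistry` class, the identity key `"stan_lee"`, and the property names are hypothetical; the deck does not describe how Freebase actually allocates GUIDs or lays out its tuple store.

```python
import uuid


class TopicRegistry:
    """Hypothetical registry: one GUID per object, facts kept as tuples."""

    def __init__(self):
        self._guid_by_key = {}
        self.assertions = []  # (guid, property, value) tuples

    def guid_for(self, key):
        """Return the existing GUID for `key`, creating one only on first use."""
        if key not in self._guid_by_key:
            self._guid_by_key[key] = uuid.uuid4().hex
        return self._guid_by_key[key]

    def assert_fact(self, key, prop, value):
        """Record a fact against the single GUID that stands for `key`."""
        self.assertions.append((self.guid_for(key), prop, value))


registry = TopicRegistry()
# The same object gains a name and two types, but never a second GUID.
registry.assert_fact("stan_lee", "name", "Stan Lee")
registry.assert_fact("stan_lee", "type", "/people/person")
registry.assert_fact("stan_lee", "type", "/book/author")

guids = {guid for guid, _, _ in registry.assertions}
print(len(guids), len(registry.assertions))
```

Keeping a single identifier per object is what lets a topic be typed in several domains at once without duplicating the object itself.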

Data Entry [figure]

Schema Creation [figure]

Data Evaluation [figure]

Metaweb Query Language
- Who created the comic character Spider-Man?

Query:

    [
      {
        "character_created_by" : null,
        "name" : "Spider-Man",
        "type" : "/fictional_universe/fictional_character"
      }
    ]

Response:

    {
      "code" : "/api/status/ok",
      "q1" : {
        "code" : "/api/status/error",
        "messages" : [
          {
            "code" : "/api/status/error/mql/result",
            "info" : {
              "count" : 2,
              "result" : [ "Steve Ditko", "Stan Lee" ]
            },
            "message" : "Unique query may have at most one result. Got 2",
            "path" : "character_created_by",
            "query" : [
              {
                "character_created_by" : null,
                "error_inside" : "character_created_by",
                "name" : "Spider-Man",
                "type" : "/fictional_universe/fictional_character"
              }
            ]
          }
        ]
      },
      "status" : "200 OK",
      "transaction_id" : "cache;cache01.p01.sjc1:8101; T05:54:45Z;0021"
    }
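The response on this slide is an error: in MQL, a `null` value asks for exactly one result, and Spider-Man has two creators. Asking for a list with `[]` instead avoids the uniqueness check. The sketch below only builds the two query bodies and reads the error envelope copied from the slide. It makes no network call, and the helper names `creator_query` and `unique_result_error` are assumptions, not part of any Freebase client library.

```python
import json


def creator_query(expect_many):
    """Build the slide's query; `[]` asks for a list, `null` for one value."""
    return [{
        "character_created_by": [] if expect_many else None,
        "name": "Spider-Man",
        "type": "/fictional_universe/fictional_character",
    }]


def unique_result_error(envelope):
    """Return the values found when a unique slot matched several, else None."""
    for message in envelope.get("q1", {}).get("messages", []):
        if message.get("code") == "/api/status/error/mql/result":
            return message["info"]["result"]
    return None


# Abridged copy of the response envelope shown on the slide.
slide_envelope = {
    "code": "/api/status/ok",
    "q1": {
        "code": "/api/status/error",
        "messages": [{
            "code": "/api/status/error/mql/result",
            "info": {"count": 2, "result": ["Steve Ditko", "Stan Lee"]},
            "message": "Unique query may have at most one result. Got 2",
            "path": "character_created_by",
        }],
    },
}

single = json.dumps(creator_query(expect_many=False))
many = json.dumps(creator_query(expect_many=True))
found = unique_result_error(slide_envelope)

print(single)
print(many)
print(found)
```

Note that the outer `"code"` is `/api/status/ok` even though the inner query failed, so a client has to inspect the per-query status (`q1` here) rather than the envelope alone.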

MQL Queries
- Characters created by Stan Lee
- Foreign donations to 2008 US political candidates
- Nikon cameras in order of resolution
- Tropical storms in the 90's
- Mountains of the Himalayas
- African American authors and their books
- Web browsers that run on the Mac
- US cities named Canton

Applications
- Parallax: Freebase browser
- Powerset: semantic search engine
- ArchiPortal
- Dipity Timelines

Discussion
- Simple architecture
- Topics can be associated with multiple types
- Analogous to having a database of knowledge
- But now we have two knowledge bases to maintain: Wikipedia and Freebase
