DeepDive Model Dongfang Xu Ph.D student, School of Information, University of Arizona Dec 13, 2015.

1 DeepDive Model Dongfang Xu Ph.D student, School of Information, University of Arizona Dec 13, 2015

2 Agenda Overview Factor Graph Learning & Inference Reference: DeepDive: A Data Management System for Automatic Knowledge Base Construction. Ce Zhang. Ph.D. Dissertation, University of Wisconsin-Madison, 2015.

3 Overview What is DeepDive? DeepDive is a new type of data management system that enables one to tackle extraction, integration, and prediction problems in a single system. DeepDive makes good use of uncertainty to improve predictions during the probabilistic inference step. For example, DeepDive may find that a certain mention of "Barack" is only 60% likely to actually refer to "Barack Obama", and use this fact to discount the impact of that mention on the final result for the entity "Barack Obama".

4 Overview

5 What do users/developers need to do?
1. Generation and Extraction
--- User schema and correlation schema (the correlation schema captures correlations among tuples in the user schema).
--- Extraction features (100K features), each with a weighted value.
2. Distant Supervision
--- One way is for the user to create training data.
--- Take the known facts, and then label each training example true or false.
3. Inference and Learning

6 Overview  User Schema

7 Overview What will the system do?
1. Generation and Extraction
--- Extract mentions (entities) and candidate relations (based on features and supervision rules).
--- Entity linking (sometimes a glossary or a database of all known entities is available; sometimes sophisticated machine learning approaches are needed, with weighted value and boolean value).
--- Label some of these pairs as true or false according to the supervision rules (with weighted value and boolean value).

8 Overview What will the system do?
1. Generation and Extraction
2. Distant Supervision
--- Make use of an already existing database to collect examples for the relation.
--- Use these examples to automatically generate the training data, including both positive and negative examples (generating the negatives is the harder part).
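The distant supervision step above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not DeepDive's own code: the relation (spouse), the two lookup sets, the person names, and the helper `distant_label` are all hypothetical. Candidate pairs found in the existing database become positive examples, pairs that a second database rules out become negative examples, and everything else stays unlabeled.

```python
from typing import Dict, Optional, Tuple

Pair = Tuple[str, str]

# Hypothetical existing database of known spouse pairs (assumed for illustration).
KNOWN_SPOUSES = {("Alice", "Bob")}
# Hypothetical database of pairs that cannot be spouses (e.g. siblings),
# used here as the source of negative examples.
KNOWN_SIBLINGS = {("Alice", "Carol")}


def distant_label(pair: Pair) -> Optional[bool]:
    """Return True or False when an existing database decides the label, else None."""
    a, b = pair
    if (a, b) in KNOWN_SPOUSES or (b, a) in KNOWN_SPOUSES:
        return True   # positive training example
    if (a, b) in KNOWN_SIBLINGS or (b, a) in KNOWN_SIBLINGS:
        return False  # negative training example
    return None       # unlabeled candidate, left for inference


# Candidate relation mentions produced by the extraction step (made up here).
candidates = [("Alice", "Bob"), ("Carol", "Alice"), ("Bob", "Dave")]
labels: Dict[Pair, Optional[bool]] = {pair: distant_label(pair) for pair in candidates}
```

Pairs labeled `True` or `False` would serve as training data, while pairs labeled `None` are the ones whose truth value the system has to predict.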

9 Agenda Overview Factor Graph Learning & Inference Reference: DeepDive: A Data Management System for Automatic Knowledge Base Construction. Ce Zhang. Ph.D. Dissertation, University of Wisconsin-Madison, 2015.

10 Learning and Inference step 1. Factor graph grounding DeepDive relies heavily on factor graphs, a type of probabilistic graphical model, for its statistical inference and learning phase. A factor graph has two types of nodes: variable nodes and factor nodes. Factor Graph

11 Learning and Inference step Both the extracted features and the integrated domain knowledge (inference rules, which become factors in the factor graph) need a weight to indicate how strong an indicator they are for the target task. --- One way to do that is for the user to specify the weight manually. --- Another easier, more consistent, and more effective way is for DeepDive to learn the weight automatically with machine learning techniques (iteratively).

12 Learning and Inference step 1. Factor graph grounding --- Variables quantitatively describe an event; specifically, each variable describes a tuple in the user schema. A variable is an evidence variable when its value is known (from training data or defined by the user), or a query variable when its value must be predicted. --- A factor (a correlation relation, from the correlation schema) is a function of variables and is used to evaluate the relations among one or more variables. The main task that DeepDive performs on factor graphs is statistical inference, i.e., for a given node, what is the marginal probability that this node takes the value 1? Factor Graph

13 Learning and Inference step 1. Factor graph grounding The variable nodes of the factor graph are connected to factors according to inference rules specified by the user, who also defines the factor functions that describe how the variables are related. The user can specify whether the factor weights should be constant or learned by the system. Inference rules determine the edges in the graph. Each rule consists of three components: the input query, which specifies the variables to create (variable nodes); the factor function (factor nodes); and the factor weight, which describes the confidence in the relationship expressed by the factor. Factor Graph
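As a rough illustration of the structure described above, a grounded factor graph can be represented with two small Python classes. This is a minimal sketch and not DeepDive's internal representation: the class names, the two factor functions `equal` and `is_true`, the variable names, and the weights are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Variable:
    name: str
    value: Optional[int] = None  # 0 or 1 when known (evidence), None when it must be predicted

    @property
    def is_evidence(self) -> bool:
        return self.value is not None


@dataclass
class Factor:
    variables: List[str]          # names of the variable nodes this factor touches (its edges)
    function: Callable[..., int]  # factor function over those variables
    weight: float                 # confidence in the relationship
    learn_weight: bool = False    # True: weight is learned, False: weight stays constant


def equal(a: int, b: int) -> int:
    """Illustrative factor function: 1 when both variables agree."""
    return 1 if a == b else 0


def is_true(a: int) -> int:
    """Illustrative factor function: 1 when the variable is 1."""
    return 1 if a == 1 else 0


# One evidence variable and one query variable (hypothetical names).
variables: Dict[str, Variable] = {
    "v1": Variable("v1", value=1),
    "v2": Variable("v2"),
}

# Each grounded rule contributes a factor: variables, function, weight.
factors: List[Factor] = [
    Factor(["v1", "v2"], equal, weight=1.5, learn_weight=True),
    Factor(["v2"], is_true, weight=-0.5),
]


def neighbors(var_name: str) -> List[Factor]:
    """Factors connected to a variable node."""
    return [f for f in factors if var_name in f.variables]
```

Here `neighbors("v2")` returns both factors, which reflects that `v2` has an edge to each of them.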

14 Agenda Overview Factor Graph Learning & Inference

15 Learning and Inference step 2. How does it work? A. Each variable can take the value 0 or 1. Suppose there are two variables; then there are four possible worlds (a possible world is one assignment of values to all variables). B. Define the probability of a possible world through factor functions. Each factor function is given a weight that expresses the relative influence of that factor on the probability: Pr(I) ∝ measure{w1 f1(v1, v2) + w2 f2(v2)}. Learning & Inference
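The two-variable example can be worked through numerically. The sketch below makes several assumptions that are not on the slide: the measure is taken to be the exponential function, f1 is taken to be "v1 equals v2", f2 is taken to be "v2 is 1", and the weights w1 = 1.0 and w2 = 0.5 are arbitrary. It enumerates the four possible worlds and normalizes their scores into probabilities.

```python
import itertools
import math

# Assumed weights for the two factors (arbitrary illustrative values).
W1, W2 = 1.0, 0.5


def f1(v1: int, v2: int) -> int:
    """Assumed factor function: 1 when v1 and v2 agree."""
    return 1 if v1 == v2 else 0


def f2(v2: int) -> int:
    """Assumed factor function: 1 when v2 is 1."""
    return 1 if v2 == 1 else 0


def score(v1: int, v2: int) -> float:
    """Unnormalized score of a possible world, taking exp as the measure."""
    return math.exp(W1 * f1(v1, v2) + W2 * f2(v2))


# Two binary variables give 2 * 2 = 4 possible worlds.
worlds = list(itertools.product([0, 1], repeat=2))

# Normalizing constant: sum of the scores of all possible worlds.
Z = sum(score(v1, v2) for v1, v2 in worlds)

# Probability of each possible world.
prob = {world: score(*world) / Z for world in worlds}
```

With these assumed weights the world (v1=1, v2=1) receives the highest probability, because it satisfies both factor functions.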

16 Learning and Inference step 2. How does it work? B (continued). The probability of a possible world is then defined to be proportional to some measure of a weighted combination of factor functions. C. Now we can perform marginal inference on the factor graph, that is, infer the probability that one variable takes a particular value. This mirrors the relationship between marginal and joint probability: the marginal is obtained by summing the probabilities of all possible worlds in which the variable takes that value. Learning & Inference
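Marginal inference on a tiny factor graph can be sketched in two ways: exactly, by summing over possible worlds, and approximately, by Gibbs sampling, which is the kind of sampling-based approach that scales to graphs too large to enumerate. The setup is assumed for illustration only: two binary variables, exp as the measure, factor functions "v1 equals v2" and "v2 is 1", and weights 1.0 and 0.5. The function names are hypothetical.

```python
import itertools
import math
import random

W1, W2 = 1.0, 0.5  # assumed factor weights


def weight_sum(v1: int, v2: int) -> float:
    """Weighted combination of the two assumed factor functions."""
    return W1 * (1 if v1 == v2 else 0) + W2 * (1 if v2 == 1 else 0)


def exact_marginal_v2() -> float:
    """P(v2 = 1): sum the probabilities of all possible worlds where v2 is 1."""
    worlds = list(itertools.product([0, 1], repeat=2))
    z = sum(math.exp(weight_sum(*w)) for w in worlds)
    return sum(math.exp(weight_sum(*w)) for w in worlds if w[1] == 1) / z


def gibbs_marginal_v2(num_sweeps: int = 20000, burn_in: int = 1000, seed: int = 0) -> float:
    """Estimate P(v2 = 1) by resampling each variable given the other."""
    rng = random.Random(seed)
    v = [0, 0]
    hits = 0
    for sweep in range(num_sweeps + burn_in):
        for i in range(2):
            v[i] = 1
            s1 = math.exp(weight_sum(*v))  # score with variable i set to 1
            v[i] = 0
            s0 = math.exp(weight_sum(*v))  # score with variable i set to 0
            v[i] = 1 if rng.random() < s1 / (s0 + s1) else 0
        if sweep >= burn_in:
            hits += v[1]
    return hits / num_sweeps


exact = exact_marginal_v2()
approx = gibbs_marginal_v2()
```

The two numbers should agree closely; the exact sum is only feasible here because there are just four possible worlds.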

17 Learning and Inference step 2. How does it work? In DeepDive, you can assign factor weights manually, or you can let DeepDive learn the weights automatically. To learn weights automatically, you must have enough training data available. DeepDive chooses the weights that agree most with the training data. Formally, the training data is just a set of possible worlds, and the weights are chosen by maximizing the probabilities of these possible worlds. Learning & Inference
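Choosing weights by maximizing the probability of the training worlds can be sketched with plain gradient ascent on the average log-likelihood. This is a toy illustration under assumptions, not DeepDive's learning procedure: the measure is exp, the two factor functions are "v1 equals v2" and "v2 is 1", the four training worlds are made up, and the expectation is computed by brute-force enumeration, which only works for very small graphs.

```python
import itertools
import math

WORLDS = list(itertools.product([0, 1], repeat=2))  # all possible worlds


def features(world):
    """Values of the two assumed factor functions in a given world."""
    v1, v2 = world
    return (1.0 if v1 == v2 else 0.0, 1.0 if v2 == 1 else 0.0)


def dot(weights, feats):
    return sum(w * f for w, f in zip(weights, feats))


def log_likelihood(weights, data):
    """Average log-probability of the training worlds under the weights."""
    log_z = math.log(sum(math.exp(dot(weights, features(x))) for x in WORLDS))
    return sum(dot(weights, features(x)) - log_z for x in data) / len(data)


def expected_features(weights):
    """Expected factor values under the current model, by enumeration."""
    scores = [math.exp(dot(weights, features(x))) for x in WORLDS]
    z = sum(scores)
    return [sum(s * features(x)[k] for s, x in zip(scores, WORLDS)) / z
            for k in range(2)]


def learn(data, steps=500, lr=0.5):
    """Gradient ascent: move weights toward matching the observed factor averages."""
    weights = [0.0, 0.0]
    observed = [sum(features(x)[k] for x in data) / len(data) for k in range(2)]
    for _ in range(steps):
        expected = expected_features(weights)
        weights = [w + lr * (o - e) for w, o, e in zip(weights, observed, expected)]
    return weights


# Hypothetical training data: a set of fully observed possible worlds.
training_worlds = [(1, 1), (1, 1), (0, 0), (0, 1)]
learned = learn(training_worlds)
```

The gradient for each weight is the observed average of its factor function minus the model's expected value, so learning stops when the model reproduces the factor statistics seen in the training worlds.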

18

19 DeepDive Resource http://deepdive.stanford.edu https://www.youtube.com/watch?v=SfkLvExfl-s http://pages.cs.wisc.edu/~czhang/zhang.thesis.pdf

20 Thank you! Q&A

