1 Attribute Extraction and Scoring: A Probabilistic Approach Taesung Lee, Zhongyuan Wang, Haixun Wang, Seung-won Hwang Microsoft Research Asia Speaker: Bo Xu

2 Background
Goal of creating a knowledge base: provide background knowledge to machines, enabling them to perform certain types of inference as humans do.
The three backbone relations of the knowledge base:
IsA: a relationship between a sub-concept and a concept, e.g., an IT company is a company
isInstanceOf: a relationship between an entity and a concept, e.g., Microsoft is a company
isPropertyOf: a relationship between an attribute and a concept, e.g., color is a property/attribute of wine
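As a small illustration (not part of the original slides), the three backbone relations can be stored as plain (subject, relation, object) triples; the helper function and its names below are hypothetical.

```python
# A minimal sketch (not from the original slides): representing the three
# backbone relations of a knowledge base as (subject, relation, object) triples.
knowledge_base = [
    ("IT company", "isA", "company"),          # sub-concept -> concept
    ("Microsoft", "isInstanceOf", "company"),  # entity -> concept
    ("color", "isPropertyOf", "wine"),         # attribute -> concept
]

def attributes_of(concept, kb):
    """Return all attributes linked to a concept via isPropertyOf."""
    return [s for s, r, o in kb if r == "isPropertyOf" and o == concept]

print(attributes_of("wine", knowledge_base))  # ['color']
```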

3 Attributes (isPropertyOf) play a critical role in knowledge inference
Knowledge inference: given some attributes, guess which concept they describe
E.g., given "capital city, population", people may infer that they relate to country; given "color, body, smell", people may think of wine
However, in many cases the association may not be straightforward, and may require some guesswork even for humans

4 However, in order to perform inference, knowing that a concept can be described by a certain set of attributes is not enough. Instead, we must know how important, or how typical, each attribute is for the concept.
Typicality:
P(c|a) denotes how typical concept c is, given attribute a
P(a|c) denotes how typical attribute a is, given concept c
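A minimal sketch (not from the slides) of how the two typicality directions could be estimated from co-occurrence counts n(c, a); the simple count normalization is an illustrative assumption, not necessarily the paper's estimator.

```python
# Hedged sketch: estimating typicality in both directions from co-occurrence
# counts n(c, a). The plain count normalization below is an illustrative
# assumption for exposition only.
n = {("wine", "color"): 120, ("wine", "body"): 40, ("country", "capital city"): 300}

def p_attr_given_concept(a, c, counts):
    """P(a|c): how typical attribute a is, given concept c."""
    total = sum(v for (cc, _), v in counts.items() if cc == c)
    return counts.get((c, a), 0) / total if total else 0.0

def p_concept_given_attr(c, a, counts):
    """P(c|a): how typical concept c is, given attribute a."""
    total = sum(v for (_, aa), v in counts.items() if aa == a)
    return counts.get((c, a), 0) / total if total else 0.0

print(p_attr_given_concept("color", "wine", n))   # 0.75
print(p_concept_given_attr("wine", "color", n))   # 1.0
```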

5 Problem Definition Goal

6 Framework

7 Attribute extraction from the web corpus
Concept-based extraction (CB), e.g., "... the acidity of a wine is an essential component of the wine ...", yielding (c, a, n(c, a)) tuples
Instance-based extraction (IB), e.g., "the taste of Bordeaux is ...", yielding (i, a, n(i, a)) tuples
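A rough sketch of the pattern-based extraction idea; the single regular expression and the counting below are illustrative assumptions rather than the paper's actual extraction patterns.

```python
# Hedged sketch of pattern-based attribute extraction. The single regex
# ("the <a> of (the/a/an) <x>") is an illustrative assumption; the paper's
# actual syntactic patterns may differ.
import re
from collections import Counter

PATTERN = re.compile(r"\bthe (\w+) of (?:the |a |an )?(\w+)", re.IGNORECASE)

def extract_pairs(sentences):
    """Return co-occurrence counts n(x, a) for (x, attribute) pairs,
    where x is a concept (CB) or an instance (IB)."""
    counts = Counter()
    for s in sentences:
        for a, x in PATTERN.findall(s):
            counts[(x.lower(), a.lower())] += 1
    return counts

corpus = [
    "... the acidity of a wine is an essential component of the wine ...",
    "the taste of Bordeaux is ...",
]
print(extract_pairs(corpus))
# Counter({('wine', 'acidity'): 1, ('bordeaux', 'taste'): 1})
```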

8 Attribute extraction from an external knowledge base
Extract instance-based attributes from DBpedia entities (KB)
Use n(i, a) = 1 to produce (i, a, n(i, a)) tuples

9 Attribute extraction from query logs
Extract instance-based attributes from query logs (QB). We cannot expect concept-based attributes from query logs, as people are usually interested in a specific instance rather than a general concept.
Two-pass approach (a sketch follows below):
First pass: extract candidate instance-attribute pairs (i, a) from the pattern "the a of (the/a/an) i", and extend the candidate list IU to include the attributes obtained from IB and KB
Second pass: count the co-occurrence n(i, a) of i and a in the query log for each (i, a) ∈ IU to produce the (i, a, n(i, a)) list
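A rough sketch of the two-pass extraction over a query log; the pattern, the candidate list IU, and the matching below are simplified assumptions for illustration.

```python
# Hedged sketch of the two-pass query-log extraction (QB). The pattern and the
# co-occurrence counting are simplified; names like `ib_kb_attributes` are
# illustrative placeholders.
import re
from collections import Counter

PATTERN = re.compile(r"\bthe (\w+) of (?:the |a |an )?(\w+)", re.IGNORECASE)

def first_pass(queries, ib_kb_attributes):
    """Collect candidate (instance, attribute) pairs IU from the pattern,
    extended with attributes already obtained from IB and KB extraction."""
    iu = set()
    for q in queries:
        for a, i in PATTERN.findall(q):
            iu.add((i.lower(), a.lower()))
    iu |= ib_kb_attributes
    return iu

def second_pass(queries, iu):
    """Count co-occurrences n(i, a) of i and a in the query log for each (i, a) in IU."""
    counts = Counter()
    for q in queries:
        words = set(q.lower().split())
        for i, a in iu:
            if i in words and a in words:
                counts[(i, a)] += 1
    return counts

queries = ["the population of washington", "washington population 2010"]
iu = first_pass(queries, ib_kb_attributes={("washington", "area")})
print(second_pass(queries, iu))  # Counter({('washington', 'population'): 2})
```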

10 Attribute Distribution
E.g., the attribute "name" is frequently observed in the CB pattern, as in "the name of a state is ...", but people will not mention it with a specific state, as in "the name of Washington is ...".

11 Attribute Scoring
Computing typicality from a CB list
Computing typicality from an IB list

12 Computing Naive P(a|i, c) and P(i|c) using Probase
A simplifying assumption: one instance belongs to one concept, i.e., P(c|i) = 1 if this concept-instance pair is observed in Probase and 0 otherwise
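One plausible way to write the naive estimate implied by this slide; the 0/1 form of P(c|i) is stated above, while the aggregation over instances is an assumption added for illustration.

```latex
% A plausible form of the naive estimate (the 0/1 assumption on P(c|i) is from
% the slide; the aggregation over instances is an illustrative assumption).
P(a \mid c) \;=\; \sum_{i} P(a \mid i, c)\, P(i \mid c),
\qquad
P(c \mid i) \;=\;
\begin{cases}
1 & \text{if the pair } (c, i) \text{ is observed in Probase},\\
0 & \text{otherwise.}
\end{cases}
```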

13 However, in reality, the same instance name appears with many concepts:
[C1] an ambiguous instance related to dissimilar concepts, e.g., 'Washington' can refer to a president and a state
[C2] an unambiguous instance related to similar concepts, e.g., 'George Washington' is a president, a patriot, and a historical figure

14 Unbiasing P(a|i, c)
JC(a, c) (join count): the number of instances in c that have attribute a
Join ratio (JR): represents how likely a is associated with c
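A small sketch of how the join count and join ratio could be computed from instance-level observations; normalizing JR by the number of instances of c is an assumption made here for illustration.

```python
# Hedged sketch of join count / join ratio. Normalizing JR(a, c) by the number
# of instances of c is an illustrative assumption.
def join_count(a, c, concept_instances, instance_attrs):
    """JC(a, c): number of instances of concept c that have attribute a."""
    return sum(1 for i in concept_instances[c] if a in instance_attrs.get(i, set()))

def join_ratio(a, c, concept_instances, instance_attrs):
    """JR(a, c): how likely attribute a is associated with concept c."""
    n = len(concept_instances[c])
    return join_count(a, c, concept_instances, instance_attrs) / n if n else 0.0

concept_instances = {"wine": ["bordeaux", "chianti", "rioja"]}
instance_attrs = {"bordeaux": {"taste", "color"}, "chianti": {"color"}, "rioja": {"price"}}
print(join_ratio("color", "wine", concept_instances, instance_attrs))  # ~0.67
```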

15 Unbiasing P(c|i)
Probase frequency counts n_p(c, i)
However, using this count alone would not distinguish C2 from C1
sim(c, c′): compares the instance sets of the two concepts using Jaccard similarity
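A minimal sketch of the concept-to-concept similarity via Jaccard over instance sets; how sim(c, c′) is then folded into P(c|i) is not spelled out in the transcript, so only the similarity itself is shown, with made-up instance sets.

```python
# Hedged sketch: Jaccard similarity between two concepts' instance sets, used
# on the slide to distinguish case C2 (similar concepts) from C1 (dissimilar).
def jaccard_sim(instances_c, instances_c2):
    """sim(c, c') = |I_c ∩ I_c'| / |I_c ∪ I_c'|."""
    inter = len(instances_c & instances_c2)
    union = len(instances_c | instances_c2)
    return inter / union if union else 0.0

presidents = {"george washington", "abraham lincoln", "thomas jefferson"}
patriots = {"george washington", "thomas jefferson", "paul revere"}
states = {"washington", "texas", "ohio"}

print(jaccard_sim(presidents, patriots))  # 0.5 -> similar concepts (C2)
print(jaccard_sim(presidents, states))    # 0.0 -> dissimilar concepts (C1)
```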

17 Typicality Score Aggregation: P(a|c)
One CB list from web documents
Three IB lists from web documents, the query log, and the knowledge base
No single source can be the most reliable for all concepts
Formulate P(a|c) as a combination of the lists; the goal is to learn the weights for the given concept (a sketch follows below)
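A rough sketch of aggregating the per-source typicality lists with concept-specific weights; the convex combination below is an assumed functional form, and the scores and weights are made up.

```python
# Hedged sketch: aggregating P(a|c) estimates from the four source lists
# (CB, IB from web, IB from query log, IB from KB) with concept-specific
# weights. The convex combination is an assumed functional form.
def aggregate_typicality(attr, source_scores, weights):
    """source_scores: dict source -> {attribute: P(a|c) from that source}.
    weights: dict source -> learned weight for the given concept."""
    return sum(weights[s] * source_scores[s].get(attr, 0.0) for s in weights)

source_scores = {
    "CB": {"color": 0.30, "name": 0.25},
    "IB_web": {"color": 0.40, "taste": 0.20},
    "IB_query": {"price": 0.35, "taste": 0.30},
    "IB_kb": {"color": 0.10},
}
weights = {"CB": 0.4, "IB_web": 0.3, "IB_query": 0.2, "IB_kb": 0.1}  # sum to 1
print(aggregate_typicality("color", source_scores, weights))  # 0.25
```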

18 Learning the weights
Ranking SVM approach: pairwise learning vs. regression
Pairwise learning is chosen since the training data does not have to quantify an absolute typicality score

19 Pairwise Learning
Given x1 > x2 > x3, reduce ranking to binary classification:
Positive examples: x1 - x2, x1 - x3, x2 - x3
Negative examples: x2 - x1, x3 - x1, x3 - x2
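A compact sketch of the pairwise reduction: differences of feature vectors of ranked items become binary classification examples for a linear SVM; the feature vectors here are made up.

```python
# Hedged sketch of the pairwise (Ranking SVM style) reduction: for items with
# known order x1 > x2 > x3, feature-vector differences become binary examples.
import numpy as np
from sklearn.svm import LinearSVC

# Made-up feature vectors for three attributes, ranked x1 > x2 > x3.
x1, x2, x3 = np.array([0.9, 0.8]), np.array([0.5, 0.6]), np.array([0.1, 0.2])

pairs = [(x1, x2), (x1, x3), (x2, x3)]
X = [a - b for a, b in pairs] + [b - a for a, b in pairs]  # positives, then negatives
y = [1] * len(pairs) + [-1] * len(pairs)

clf = LinearSVC()
clf.fit(X, y)
print(clf.coef_)  # learned weight vector; scoring w·x recovers the ranking
```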

20 Features
A linear combination of the five features

21 Attribute Synonym Set
Leverage evidence from Wikipedia: Wikipedia redirects and Wikipedia internal links
We take each connected component as a synonymous attribute cluster (a sketch follows below)
Within each cluster, we set the most frequent attribute as the representative of the synonymous attributes
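A small sketch of turning Wikipedia redirect and internal-link evidence into synonym clusters via connected components (a simple union-find here), then picking the most frequent attribute as the representative; the edges and frequencies are made up.

```python
# Hedged sketch: synonym clustering via connected components over Wikipedia
# redirect / internal-link edges, using a simple union-find. Edge list and
# frequency counts are illustrative.
def find(parent, x):
    while parent.setdefault(x, x) != x:
        parent[x] = parent[parent[x]]  # path halving
        x = parent[x]
    return x

def union(parent, a, b):
    parent[find(parent, a)] = find(parent, b)

edges = [("birth date", "date of birth"), ("dob", "date of birth"), ("spouse", "wife")]
freq = {"birth date": 120, "date of birth": 300, "dob": 40, "spouse": 90, "wife": 60}

parent = {}
for a, b in edges:
    union(parent, a, b)

clusters = {}
for attr in freq:
    clusters.setdefault(find(parent, attr), []).append(attr)

# The most frequent attribute in each cluster becomes the representative.
for members in clusters.values():
    rep = max(members, key=freq.get)
    print(rep, "<-", members)
```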

22 Experiment: Experimental Settings

23 Experiment: Evaluation Measures and Baseline
The number of labeled attributes for evaluation is 4,846, drawn from 12 concepts
Evaluation measures: assign relevance scores of 1, 2/3, 1/3, and 0 to the four label groups, respectively
Since the absolute value is hard to acquire, approximate it by taking the set of labeled attributes as the universe
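One possible reading of the graded evaluation described on this slide, using the relevance grades 1, 2/3, 1/3, 0 and the labeled universe; the exact precision and recall definitions below are assumptions for illustration.

```python
# Hedged sketch of graded precision/recall over the labeled universe, using the
# slide's relevance grades 1, 2/3, 1/3, 0. The exact definitions in the paper
# may differ; this is an illustrative reading with made-up data.
def graded_precision(retrieved, relevance):
    """Average relevance grade of the retrieved attributes."""
    return sum(relevance.get(a, 0.0) for a in retrieved) / len(retrieved) if retrieved else 0.0

def graded_recall(retrieved, relevance):
    """Share of the total relevance mass (over the labeled universe) that is retrieved."""
    total = sum(relevance.values())
    return sum(relevance.get(a, 0.0) for a in retrieved) / total if total else 0.0

relevance = {"color": 1.0, "taste": 2/3, "vintage": 1/3, "weather": 0.0}
retrieved = ["color", "taste"]
print(graded_precision(retrieved, relevance))  # (1 + 2/3) / 2
print(graded_recall(retrieved, relevance))     # (1 + 2/3) / 2
```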

24 Precision

25 Recall

26 Thank You!

