A Brief Introduction to Distant Supervision

Jin Mao
Postdoc, School of Information, University of Arizona
March 8th, 2016

Outline
Background and Framework
An Example: Relation Extraction
Major Challenges

Supervised learning requires a large amount of labeled samples. Human annotation is tedious and costly in both time and money.

Solution
A set of seeds: semi-supervision.
Existing resources, such as knowledge bases (e.g., Wikipedia), databases, and ontologies.
Distant supervision: use an existing knowledge base as a distant source of labels to generate training samples (Craven & Kumlien, 1999). E.g., Snow et al. (2005) exploited WordNet to extract hypernym (is-a) relations between entities.

A High-Level Framework (Roth et al., 2013)

A simple distant supervision algorithm (Poon et al., 2015)
Require: A set of sentences, with entity mentions identified
Require: A database of relation triples (entity, relation, entity)
1: For each relation triple, find all sentences containing the entity pair
2: Annotate those sentences with the corresponding relation
3: Sample unannotated sentences with co-occurring proteins as negative examples
4: Train a classifier using the annotated dataset
5: return the resulting classifier
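Steps 1 to 3 of the algorithm can be sketched in a few lines of Python. This is a minimal illustration, not the implementation of Poon et al.: the function name `build_training_set`, the `NONE` label, the `negative_rate` parameter, and the toy sentences are all assumptions made for the example, and matching is done on exact entity strings. Training the classifier (steps 4 and 5) is left out.

```python
import random
from itertools import combinations

NEGATIVE = "NONE"  # hypothetical label meaning "no known relation"

def build_training_set(sentences, triples, negative_rate=1.0, seed=0):
    """Label sentences by matching entity pairs against a knowledge base.

    sentences: list of (text, [entity mentions found in the text])
    triples:   list of (entity, relation, entity)
    Returns a list of (text, entity1, entity2, label) rows.
    """
    rng = random.Random(seed)
    # Index the knowledge base by ordered entity pair.
    kb = {}
    for e1, relation, e2 in triples:
        kb.setdefault((e1, e2), set()).add(relation)

    positives, candidates = [], []
    for text, entities in sentences:
        for a, b in combinations(entities, 2):
            matched = False
            # Steps 1-2: annotate the sentence with every relation the pair has.
            for e1, e2 in ((a, b), (b, a)):
                for relation in sorted(kb.get((e1, e2), ())):
                    positives.append((text, e1, e2, relation))
                    matched = True
            if not matched:
                candidates.append((text, a, b, NEGATIVE))
    # Step 3: sample co-occurring but unannotated pairs as negatives.
    negatives = [c for c in candidates if rng.random() < negative_rate]
    return positives + negatives

sentences = [
    ("Barack Obama was born in Honolulu", ["Barack Obama", "Honolulu"]),
    ("Barack Obama visited Honolulu", ["Barack Obama", "Honolulu"]),
    ("Barack Obama met Angela Merkel", ["Barack Obama", "Angela Merkel"]),
]
triples = [("Barack Obama", "born_in", "Honolulu")]
training_set = build_training_set(sentences, triples)
for row in training_set:
    print(row)
```

Note that the second sentence is labeled `born_in` even though it does not express that relation. This is exactly the kind of noise that distant supervision produces.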

An Example: Relation Extraction (Mintz et al., 2009)
Knowledge base: Freebase, which provides a training set of relations and the entity pairs that participate in those relations.
Entity: all entities are identified in sentences using a named entity tagger that labels persons, organizations and locations.
Annotate: a sentence is annotated when it contains two entities and those entities are an instance of one of the Freebase relations.
Negative samples: randomly select 1% of the entity pairs that do not appear in any Freebase relation (and use the sentences holding those entity pairs).

Classifier: a multiclass logistic regression classifier
Input: an entity pair and a feature vector
Output: a relation name and a confidence score based on the probability of the entity pair belonging to that relation
Multiclass: to learn from noisy features
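To make the input and output of such a classifier concrete, here is a small pure-Python sketch of multiclass (softmax) logistic regression over sparse feature dictionaries. It is an illustration under stated assumptions, not the setup of Mintz et al.: the class name `SoftmaxClassifier`, the feature strings, the learning rate, and the use of plain stochastic gradient descent without regularization are all choices made here for brevity.

```python
import math
from collections import defaultdict

def softmax(scores):
    """Turn a dict of label -> score into a dict of label -> probability."""
    top = max(scores.values())
    exps = {label: math.exp(s - top) for label, s in scores.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}

class SoftmaxClassifier:
    def __init__(self, labels, learning_rate=0.5):
        self.labels = list(labels)
        self.learning_rate = learning_rate
        # One sparse weight vector per relation label.
        self.weights = {label: defaultdict(float) for label in self.labels}

    def probabilities(self, features):
        scores = {
            label: sum(self.weights[label].get(f, 0.0) * v for f, v in features.items())
            for label in self.labels
        }
        return softmax(scores)

    def fit(self, data, epochs=50):
        """data: list of (feature dict, gold relation label)."""
        for _ in range(epochs):
            for features, gold in data:
                probs = self.probabilities(features)
                for label in self.labels:
                    # Gradient of the log loss with respect to this label's score.
                    gradient = probs[label] - (1.0 if label == gold else 0.0)
                    for f, v in features.items():
                        self.weights[label][f] -= self.learning_rate * gradient * v
        return self

    def predict(self, features):
        """Return (relation name, confidence score)."""
        probs = self.probabilities(features)
        best = max(probs, key=probs.get)
        return best, probs[best]

# Hypothetical feature vectors, one per entity pair.
data = [
    ({"between:was born in": 1.0}, "born_in"),
    ({"between:visited": 1.0}, "NONE"),
    ({"between:works for": 1.0}, "employed_by"),
]
model = SoftmaxClassifier(["born_in", "employed_by", "NONE"]).fit(data)
relation, confidence = model.predict({"between:was born in": 1.0})
print(relation, round(confidence, 3))
```

The returned pair mirrors the slide: a relation name plus a confidence score, which here is simply the softmax probability of the winning label.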

Features
Named entity tag feature
Lexical features: each lexical feature consists of the conjunction of all these components.
Using conjunctions makes the number of features fewer; this relies on a large corpus.
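The idea of a conjunctive lexical feature can be sketched as follows. The component list used here (the words and part-of-speech tags between the two entities, a flag for which entity comes first, and windows of k tokens to the left and right) follows the description in Mintz et al. (2009) as best it can be reconstructed; since the slide's component figure is not part of this transcript, treat that list, the function name `lexical_features`, and the string layout of the feature as assumptions.

```python
def lexical_features(tokens, e1_span, e2_span, max_k=2):
    """Build one conjunctive lexical feature per window size k = 0..max_k.

    tokens:  list of (word, pos_tag)
    e*_span: (start, end) token indices of each entity, end exclusive
    """
    def render(chunk):
        return " ".join(f"{word}/{tag}" for word, tag in chunk)

    first, second = sorted([e1_span, e2_span])
    order = "E1_FIRST" if e1_span <= e2_span else "E2_FIRST"
    middle = render(tokens[first[1]:second[0]])

    features = []
    for k in range(max_k + 1):
        left = render(tokens[max(0, first[0] - k):first[0]])
        right = render(tokens[second[1]:second[1] + k])
        # All components are joined into a single string, so the feature
        # fires only when every component matches at once.
        features.append(f"{order}|[{left}]|{middle}|[{right}]")
    return features

tokens = [
    ("Barack", "NNP"), ("Obama", "NNP"), ("was", "VBD"), ("born", "VBN"),
    ("in", "IN"), ("Honolulu", "NNP"), (".", "."),
]
features = lexical_features(tokens, e1_span=(0, 2), e2_span=(5, 6))
for feature in features:
    print(feature)
```

Because every component must match, such a feature is very specific, which is why it only becomes useful when the corpus is large enough for the same conjunction to recur.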

Features
Syntactic features: also formed by conjunction

Major Challenges
Missing: incomplete knowledge base
Noisy: incorrect pairs

Major Challenges: Noise Reduction
Often only a small fraction of co-occurrence matches actually express the relation of the fact tuple. For example, the arguments of the fact tuple ("Barack Obama", born-in, "Honolulu") could match true positive contexts like "Barack Obama was born in Honolulu" as well as false positive contexts like "Barack Obama visited Honolulu". The goal is to improve the quality of the training data by reducing the amount of noise.

Major Challenges: Noise Reduction Methods
Methods (Intxaurrondo et al., 2013):
  mention frequency
  pointwise mutual information
  similarity between the centroids of all relation mentions and each individual mention (MC)
More complex methods (Roth et al., 2013):
  at-least-one constraints
  topic-based models
  pattern correlations
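As one illustration of the pointwise mutual information idea, the sketch below scores each distantly labeled mention by the PMI between its surface pattern and its assigned relation, and drops mentions that score at or below a threshold. This is a simplified reading, not the procedure of Intxaurrondo et al.: representing a mention as a (pattern, relation) pair, the function name `pmi_filter`, and the threshold of 0 are assumptions for the example.

```python
import math
from collections import Counter

def pmi_filter(mentions, threshold=0.0):
    """Keep mentions whose PMI(pattern, relation) exceeds the threshold.

    mentions: list of (pattern, relation) tuples from distant labeling.
    PMI(p, r) = log( P(p, r) / (P(p) * P(r)) ), estimated from counts.
    """
    total = len(mentions)
    pair_counts = Counter(mentions)
    pattern_counts = Counter(pattern for pattern, _ in mentions)
    relation_counts = Counter(relation for _, relation in mentions)

    def pmi(pattern, relation):
        joint = pair_counts[(pattern, relation)] * total
        independent = pattern_counts[pattern] * relation_counts[relation]
        return math.log(joint / independent)

    return [(p, r) for p, r in mentions if pmi(p, r) > threshold]

# Hypothetical distantly labeled mentions: "visited" rarely signals born_in.
mentions = (
    [("was born in", "born_in")] * 3
    + [("visited", "born_in")]
    + [("visited", "NONE")] * 3
    + [("was born in", "NONE")]
)
kept = pmi_filter(mentions)
print(len(mentions), "->", len(kept))
```

In this toy data the pattern "visited" co-occurs with `born_in` less often than chance would predict, so that mention is removed while the "was born in" mentions are kept.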

Major Challenges: Missing Data
Data is not missing at random (NMAR).
The missingness cannot be predicted from other variables; it depends on the missing data itself.

Ritter et al. (2013) propose a new latent-variable approach that models missing data.

References
Craven, M., & Kumlien, J. (1999, August). Constructing biological knowledge bases by extracting information from text sources. In ISMB (Vol. 1999).
Intxaurrondo, A., Surdeanu, M., de Lacalle, O. L., & Agirre, E. (2013). Removing noisy mentions for distant supervision. Procesamiento del lenguaje natural, 51.
Mintz, M., Bills, S., Snow, R., & Jurafsky, D. (2009, August). Distant supervision for relation extraction without labeled data. In Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 2. Association for Computational Linguistics.
Poon, H., Toutanova, K., & Quirk, C. (2015). Distant supervision for cancer pathway extraction from text. In Pac. Symp. Biocomput.
Reschke, K., Jankowiak, M., Surdeanu, M., Manning, C. D., & Jurafsky, D. (2014). Event extraction using distant supervision. In LREC.
Ritter, A., Zettlemoyer, L., & Etzioni, O. (2013). Modeling missing data in distant supervision for information extraction. Transactions of the Association for Computational Linguistics, 1.
Roth, B., Barth, T., Wiegand, M., & Klakow, D. (2013, October). A survey of noise reduction methods for distant supervision. In Proceedings of the 2013 workshop on Automated knowledge base construction. ACM.

Thank you! For further communication or error corrections, contact
