Automatically Labeled Data Generation for Large Scale Event Extraction


Automatically Labeled Data Generation for Large Scale Event Extraction
Yubo Chen, Shulin Liu, Xiang Zhang, Kang Liu, and Jun Zhao
National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences

Outline
Task & Related Work
Our Proposed Approach
Experiments

Task: Event & Event Extraction
In Automatic Content Extraction (ACE), an event is defined as a specific occurrence involving participants: an event is something that happens, and it can often be described as a change of state.
Event Extraction is a challenging Information Extraction task: it aims at detecting and typing events and at extracting arguments with different roles from natural-language text.

Task: Terminology
Event mention, event trigger, event argument, argument role.
Example: "Yesterday, some demonstrators threw stones at soldiers in Israel."
Trigger: threw (a "Conflict/Attack" event)
Arguments: demonstrators (Role = Attacker), stones (Role = Instrument), soldiers (Role = Target), Yesterday (Role = Time), Israel (Role = Place)
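The extracted structure for the example above can be sketched as a simple record; the field names here are illustrative, not ACE's official schema:

```python
# Illustrative representation of the extracted event from the example
# sentence; field names are hypothetical, not ACE's official schema.
event = {
    "trigger": "threw",
    "type": "Conflict/Attack",
    "arguments": [
        ("demonstrators", "Attacker"),
        ("stones", "Instrument"),
        ("soldiers", "Target"),
        ("Yesterday", "Time"),
        ("Israel", "Place"),
    ],
}

# Every argument carries a role label relative to the trigger.
roles = [role for _, role in event["arguments"]]
```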

Related Work: Supervised Methods
Feature-based methods convert classification clues into discrete features, e.g.: local features used to train the classifier (Ahn 2006); global features (Ji and Grishman 2008); cross-event and cross-entity inference (Gupta and Ji 2009; Liao and Grishman 2010; Hong et al. 2011).

Related Work: Supervised Methods
Structure-based methods predict the structure of the event in a sentence, e.g.: as a dependency parsing problem (McClosky 2011); structured perceptron with beam search (Li et al. 2013); constructing information networks (Li et al. 2014).

Related Work: Supervised Methods
Neural-based methods extract features from text automatically, e.g.: CNN-based models (Nguyen 2015; Chen 2015; Nguyen 2016) and RNN-based models (Nguyen 2016; Feng 2016).

Problems
Supervised methods rely on elaborate human-annotated data, and thus suffer from the high cost of manually labeling training data and the low coverage of predefined event types.
Unsupervised methods can extract large numbers of events without labeled data, but the extracted events may not be easy to map to a particular knowledge base.
Statistics of ACE 2005 English Data.

Motivation
For extracting large-scale events, especially in open-domain scenarios, how to automatically and efficiently generate sufficient training data is an important problem.
Recent improvements to Distant Supervision (DS) have proven effective for labeling training data for Relation Extraction.
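The DS heuristic for relation extraction, which this approach builds on, can be sketched as: for each knowledge-base triple, label every sentence mentioning both entities as an instance of that relation. A minimal sketch (the triple and sentences are made-up illustrations):

```python
def distant_label(kb_triples, sentences):
    """Distant supervision for relation extraction: a sentence mentioning
    both entities of a KB triple is assumed to express that relation."""
    labeled = []
    for e1, rel, e2 in kb_triples:
        for sent in sentences:
            if e1 in sent and e2 in sent:
                labeled.append((sent, rel))
    return labeled

kb = [("Barack Obama", "spouse", "Michelle Obama")]
sents = [
    "Barack Obama married Michelle Obama in 1992.",
    "Barack Obama was born in Hawaii.",
]
labeled = distant_label(kb, sents)
```

Only the first sentence contains both entities, so only it is labeled; this entity-pair trick is exactly what event extraction lacks, as the next slide explains.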

Challenges
Triggers are not given in existing knowledge bases.
RE: (entity1, relation, entity2); we can use Michelle Obama and Barack Obama to label back.
EE: (event instance, event type; role1, argument1; ...; rolen, argumentn); we cannot use m.02nqglv and Barack Obama to label back.
In ACE, an event instance is represented by a trigger word.

Challenges
Arguments for a specific event instance are usually mentioned in multiple sentences: only 0.02% of instances can find all argument mentions in one sentence.
Statistics of events in Freebase.

Outline Task & Related Work Our Proposed Approach Experiments 000

Method of Generating Training Data
The architecture for automatically labeling training data.

Method of Generating Training Data: Key Argument Detection
Role Saliency (RS): reflects how salient an argument is for representing a specific event instance of a given event type.
Event Relevance (ER): reflects how well an argument can discriminate between different event types.
Key Rate (KR): estimates the importance of an argument to a type of event.
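The two scores combine multiplicatively, in TF-IDF fashion. A minimal sketch, assuming a count-based RS and a log-inverse ER; these forms mirror the slide's descriptions, not necessarily the paper's exact formulas:

```python
import math

def key_rate(arg_count_in_type, type_instance_count,
             num_event_types, num_types_containing_arg):
    """KR = RS * ER. RS: fraction of the event type's instances in which the
    argument appears (a TF-like term). ER: a log-inverse penalty on arguments
    that occur across many event types (an IDF-like term). Both forms are
    assumptions sketched from the slide, not the paper's exact definitions."""
    rs = arg_count_in_type / type_instance_count
    er = math.log(num_event_types / (1 + num_types_containing_arg))
    return rs * er
```

An argument frequent within one event type but rare across types (e.g. a spouse name for a marriage event) thus scores higher than one spread over many types.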

Method of Generating Training Data: Trigger Word Detection
We use key arguments to label sentences in Wikipedia that may express events, and then use these labeled sentences to detect triggers.
Trigger Rate (TR), Trigger Candidate Frequency (TCF), Trigger Event Type Frequency (TETF).
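TR mirrors the KR formulation: TCF is a within-type frequency for a candidate word, and TETF an inverse-frequency score over event types. A sketch under the same TF-IDF assumption (exact forms are in the paper):

```python
import math

def trigger_rate(word_count_in_type_sents, type_sent_count,
                 num_event_types, num_types_with_word):
    """TR = TCF * TETF. TCF: fraction of this event type's labeled sentences
    that contain the candidate word. TETF: a log-inverse score over event
    types, penalizing words (e.g. function words) that appear for every
    type. Both forms are assumptions mirroring the KR sketch."""
    tcf = word_count_in_type_sents / type_sent_count
    tetf = math.log(num_event_types / (1 + num_types_with_word))
    return tcf * tetf
```

A word like "married", frequent in marriage-labeled sentences but tied to few event types, outranks a frequent word shared by nearly all types.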

Method of Generating Training Data: Trigger Word Filtering and Expansion
The initial trigger lexicon is noisy and contains only verbal triggers, so we use the linguistic resource FrameNet to filter noisy verbal triggers and to expand nominal triggers.

Method of Generating Training Data: Automatically Labeled Data Generation
We propose a Soft Distant Supervision and use it to automatically generate training data.
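The Soft Distant Supervision assumption: a sentence that mentions all key arguments of a Freebase event instance and contains a word from that type's trigger lexicon is labeled as a positive sample. A minimal sketch with naive substring matching (data and names hypothetical):

```python
def soft_distant_label(event_instances, sentences, trigger_lexicon):
    """Label a sentence for an event type if it mentions all key arguments
    of some instance of that type AND contains a trigger word from the
    (FrameNet-filtered) lexicon for that type. Substring matching here is a
    simplification of whatever matching the full pipeline uses."""
    labeled = []
    for event_type, key_args in event_instances:
        for sent in sentences:
            if all(arg in sent for arg in key_args):
                hits = [w for w in sent.split()
                        if w.lower().strip(".,") in trigger_lexicon.get(event_type, set())]
                if hits:
                    labeled.append((sent, event_type, hits[0]))
    return labeled

instances = [("people.marriage", ["Barack Obama", "Michelle Obama"])]
sents = ["Barack Obama married Michelle Obama in 1992.",
         "Barack Obama met Michelle Obama in Chicago."]
lexicon = {"people.marriage": {"married", "wed"}}
labeled = soft_distant_label(instances, sents, lexicon)
```

The second sentence mentions both key arguments but no trigger from the lexicon, so it is not labeled; this is what distinguishes the approach from argument-only labeling.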

Method of Event Extraction: DMCNN
We use a Dynamic Multi-pooling Convolutional Neural Network (DMCNN) as the basic model for event extraction.

Method of Event Extraction: DMCNN
To alleviate the wrong-label problem, we use Multi-instance Learning (MIL) for DMCNNs.
In the argument-classification stage, we take the sentences containing the same argument candidate and triggers with the same event type as a bag, and all instances in a bag are considered independently.
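The standard at-least-one heuristic of multi-instance learning can be sketched as scoring a bag by its most confident instance; the actual DMCNN-MIL training objective in the paper is more involved, so treat this as a simplified illustration:

```python
def bag_score(instance_scores):
    """Score a bag (sentences sharing the same argument candidate and an
    event-type-matched trigger) by its most confident instance, so that a
    single wrongly labeled sentence need not dominate training; this is the
    at-least-one assumption of multi-instance learning, not the paper's
    exact objective."""
    return max(instance_scores)
```

With per-sentence classifier confidences of, say, 0.2, 0.9, and 0.4 for the same (argument, event type) bag, the bag is credited with 0.9.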

Outline Task & Related Work Our Proposed Approach Experiments 000

Experiments: Our Automatically Labeled Data
Using only two key arguments to label data yields 421,602 labeled sentences, but these sentences lack labeled triggers. Using two key arguments plus trigger words yields 72,611 labeled sentences.
Compared with the nearly 6,000 human-annotated sentences in ACE, our method can automatically generate large-scale labeled training data.

Experiments: Manual Evaluations of Labeled Data
We randomly select 500 samples from our automatically labeled data; each sample is independently annotated by three annotators.

Experiments: Automatic Evaluations of Labeled Data
Data: ACE, Expanded Data (ED), ACE+ED.
Evaluation: 40 newswire articles from ACE 2005 as the test set; 30 other documents from different genres as the development set; the remaining 529 documents for training.
A trigger is correct if its event subtype and offsets match those of a reference trigger.
An argument is correctly identified if its event subtype and offsets match those of any of the reference argument mentions.
An argument is correctly classified if its event subtype, offsets, and argument role match those of any of the reference argument mentions.
Metrics: Precision, Recall, F-score.
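The matching criteria above can be sketched directly (the field names are hypothetical):

```python
def trigger_correct(pred, gold_triggers):
    """A predicted trigger is correct iff its event subtype and offsets
    both match some reference trigger (the criterion listed above)."""
    return any(pred["subtype"] == g["subtype"] and pred["offsets"] == g["offsets"]
               for g in gold_triggers)

def f_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0
```

Argument identification and classification follow the same pattern, adding the role field to the match for classification.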

Experiments: Automatic Evaluations of Labeled Data
Compared with all models, DMCNN trained with ACE+ED achieves the highest performance, demonstrating that our automatically generated labeled data can effectively expand the human-annotated training data.

Experiments: Automatic Evaluations of Labeled Data
Compared with Chen's DMCNN trained with ACE, DMCNN trained with ED only achieves competitive performance, demonstrating that our large-scale automatically labeled data is competitive with elaborately human-annotated data.

Experiments: Impact of Key Rate
ACE+KR achieves the best performance in both stages, demonstrating the effectiveness of our KR method.
With k = 2 we obtain 25,797 sentences labeled as people.marriage events; with k = 3, only 534.

Experiments: Impact of Trigger Rate and FrameNet
Compared with ACE+TCF and ACE+TETF, ACE+TR achieves higher performance in both stages, demonstrating the effectiveness of our TR method.

Experiments: Impact of Trigger Rate and FrameNet
When we use FrameNet to generate triggers, compared with ACE+TR, we get a 1.0-point improvement on trigger classification and a 1.7-point improvement on argument classification.

Experiments: Performance of DMCNN-MIL (Held-out Evaluation)
We hold out part of the Freebase event data during training and compare newly discovered event instances against this held-out data, for both event classification and argument classification.

Experiments: Performance of DMCNN-MIL (Human Evaluation)
Because of the incomplete nature of Freebase, held-out evaluation suffers from false negatives, so we manually check the newly discovered event instances that are not in Freebase.

Conclusion
We present an approach to automatically label data for large-scale event extraction via world knowledge and linguistic knowledge. The experimental results show that the quality of our large-scale automatically labeled data is competitive with elaborately human-annotated data.

Thanks
Email: yubo.chen@nlpr.ia.ac.cn