
1 Tree Kernel-based Semantic Relation Extraction using Unified Dynamic Relation Tree
Reporter: Longhua Qian, School of Computer Science and Technology, Soochow University, Suzhou, China. ALPIT2008, Dalian, China. Good afternoon, everyone! It is my great pleasure to share my research experience with everyone here. My name is Longhua Qian, I am from Soochow University in China, and my topic is tree kernel-based semantic relation extraction using a unified Dynamic Relation Tree.

2 Outline 1. Introduction 2. Dynamic Relation Tree
3. Unified Dynamic Relation Tree 4. Experimental Results 5. Conclusion and Future Work This is the outline of my presentation: first the introduction, then the Dynamic Relation Tree, the Unified Dynamic Relation Tree, the experimental results, and finally the conclusion and future work.

3 1. Introduction Information extraction is an important research topic in NLP. It attempts to find relevant information in the large amount of text documents available in digital archives and on the WWW. Information extraction as defined by NIST ACE comprises three subtasks: Entity Detection and Tracking (EDT), Relation Detection and Characterization (RDC), and Event Detection and Characterization (EDC). First, let's have a look at the introduction. According to the NIST ACE definition, information extraction subsumes these three subtasks. Our focus is on RDC, that is, relation extraction in general.

4 RDC Function RDC detects and classifies semantic relationships (usually of predefined types) between pairs of entities. Relation extraction is very useful for a wide range of advanced NLP applications, such as question answering and text summarization. For example, the sentence “Microsoft Corp. is based in Redmond, WA” conveys the relation “GPE-AFF.Based” between “Microsoft Corp.” (ORG) and “Redmond” (GPE).
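To make this data model concrete, here is a minimal sketch of how such a relation mention could be represented in code; the class and field names are mine, not from ACE or from this work.

```python
from dataclasses import dataclass

@dataclass
class EntityMention:
    text: str    # surface string, e.g. "Microsoft Corp."
    etype: str   # ACE entity major type, e.g. "ORG"

@dataclass
class RelationInstance:
    arg1: EntityMention
    arg2: EntityMention
    rtype: str   # relation type, e.g. "GPE-AFF.Based"

# The slide's example encoded with these (hypothetical) classes:
inst = RelationInstance(EntityMention("Microsoft Corp.", "ORG"),
                        EntityMention("Redmond", "GPE"),
                        "GPE-AFF.Based")
print(inst.rtype)
```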

5 Two approaches Typically, two approaches to relation extraction exist: feature-based methods and kernel-based methods.
Feature-based methods have dominated the research in relation extraction over the past years. However, relevant research shows that it is difficult to extract new effective features and further improve the performance. Kernel-based methods compute the similarity of two objects (e.g. parse trees) directly. The key problem is how to represent and capture the structured information in complex structures, such as the syntactic information in the parse tree for relation extraction.
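As background for what "computing the similarity directly" means, here is a minimal sketch of the Collins-Duffy convolution tree kernel (Collins and Duffy, 2001) over tuple-encoded trees. This is the standard textbook formulation, not this paper's exact implementation; the λ value is taken from the parameter setting reported later in the talk.

```python
# A minimal Collins-Duffy convolution tree kernel over tuple-encoded
# trees: (label, child1, child2, ...); a leaf is a plain string.
LAMBDA = 0.4  # decay factor; the value reported later in this talk

def nodes(t):
    """Yield every internal node of a tuple-encoded tree."""
    if isinstance(t, tuple):
        yield t
        for c in t[1:]:
            yield from nodes(c)

def production(t):
    """A node's production: its label plus its children's labels."""
    return (t[0],) + tuple(c[0] if isinstance(c, tuple) else c for c in t[1:])

def delta(n1, n2):
    """Weighted count of common subset trees rooted at n1 and n2."""
    if production(n1) != production(n2):
        return 0.0
    if all(not isinstance(c, tuple) for c in n1[1:]):   # pre-terminal node
        return LAMBDA
    prod = LAMBDA
    for c1, c2 in zip(n1[1:], n2[1:]):
        prod *= 1.0 + delta(c1, c2)   # no memoization, for clarity
    return prod

def tree_kernel(t1, t2):
    return sum(delta(a, b) for a in nodes(t1) for b in nodes(t2))

t1 = ("NP", ("DT", "the"), ("NN", "company"))
t2 = ("NP", ("DT", "the"), ("NN", "firm"))
print(tree_kernel(t1, t2))  # 0.96: shared fragments such as (DT the)
```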

6 Kernel-based related work
Kernel-based methods for relation extraction include the following work. Zelenko et al. (2003), Culotta and Sorensen (2004), and Bunescu and Mooney (2005) described several kernels between shallow parse trees or dependency trees to extract semantic relations. Zhang et al. (2006) and Zhou et al. (2007) proposed composite kernels consisting of a linear kernel and a convolution parse tree kernel, the latter of which can effectively capture the structured syntactic information inherent in parse trees.

7 Structured syntactic information
A tree span for a relation instance is the part of a parse tree used to represent the structured syntactic information for relation extraction. Two tree spans are currently used. One is PT (Path-enclosed Tree): the sub-tree enclosed by the shortest path linking the two entities in the parse tree. The other is CSPT (Context-Sensitive Path-enclosed Tree): determined dynamically by further extending PT with the necessary predicate-linked path information outside PT.
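A rough sketch of how PT could be extracted, assuming nltk-style parse trees: prune the parse to the leaf span between the two entities, then strip the unary chain above the lowest node that still covers both. The helper names and the use of nltk are my assumptions, not this paper's code.

```python
from nltk import Tree

def prune_to_span(t, lo, hi, base=0):
    """Keep only leaves whose indices fall inside [lo, hi]."""
    if isinstance(t, str):
        return t if lo <= base <= hi else None
    kids, i = [], base
    for c in t:
        width = 1 if isinstance(c, str) else len(c.leaves())
        if i <= hi and i + width - 1 >= lo:        # child overlaps the span
            kept = prune_to_span(c, lo, hi, i)
            if kept is not None:
                kids.append(kept)
        i += width
    return Tree(t.label(), kids) if kids else None

def path_enclosed_tree(tree, lo, hi):
    """lo/hi: leaf indices of the first and second entity's head words."""
    t = prune_to_span(tree, lo, hi)
    while isinstance(t, Tree) and len(t) == 1 and isinstance(t[0], Tree):
        t = t[0]                                   # descend to the spanning node
    return t

sent = Tree.fromstring(
    "(S (NP (NNP Microsoft) (NNP Corp.)) (VP (VBZ is) (VP (VBN based)"
    " (PP (IN in) (NP (NNP Redmond))))) (. .))")
print(path_enclosed_tree(sent, 0, 5))              # drops the final period
```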

8 Current problems However, there still exist several problems with these tree spans for relation extraction.
Noisy information: both PT and CSPT may still contain noisy information. In other words, more noise should be pruned away from a tree span.
Useful information: CSPT captures only the context-sensitive information relating to the predicate-linked path. That is to say, more information outside PT/CSPT may be recovered so as to discern the entities' relationships.

9 Our solution Our solution to these problems is twofold.
Dynamic Relation Tree (DRT): based on PT, we apply a variety of linguistics-driven rules to dynamically prune noisy information from the syntactic parse tree and to include the necessary contextual information.
Unified Dynamic Relation Tree (UDRT): instead of constructing composite kernels, various kinds of entity-related semantic information, including entity types/subtypes/mention levels etc., are unified into the Dynamic Relation Tree.

10 2. Dynamic Relation Tree Now, let's turn to the second section: the Dynamic Relation Tree.
Generation of DRT: starting from PT, we apply three kinds of operations (Remove, Compress, and Expansion) sequentially to reshape PT, giving rise to a Dynamic Relation Tree.
Remove operations:
DEL_ENT2_PRE: remove all the constituents (except the headword) of the 2nd entity
DEL_PATH_ADVP/PP: remove adverb or preposition phrases along the path
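As an illustration, here is a sketch of what DEL_ENT2_PRE might look like on an nltk tree. The "rightmost child is the head" heuristic is my simplification, not the paper's head-finding rules.

```python
from nltk import Tree

def del_ent2_pre(entity_np):
    """Keep only the head pre-terminal of the 2nd entity's NP."""
    head = entity_np[-1]                      # assumed head position
    while isinstance(head, Tree) and isinstance(head[0], Tree):
        head = head[-1]                       # descend to the pre-terminal
    return Tree(entity_np.label(), [head])

np2 = Tree.fromstring("(NP (DT the) (JJ big) (NN company))")
print(del_ent2_pre(np2))                      # -> (NP (NN company))
```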

11 DRT (cont'd) Compress operations:
CMP_NP_CC_NP: compress noun phrase coordination conjunctions
CMP_VP_CC_VP: compress verb phrase coordination conjunctions
CMP_SINGLE_INOUT: compress single in-and-out nodes
Expansion operations:
EXP_ENT2_POS: expand the possessive structure after the 2nd entity
EXP_ENT2_COREF: expand an entity coreferential mention before the 2nd entity
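A sketch of one plausible reading of CMP_SINGLE_INOUT: collapse unary chains (a node with a single non-terminal, non-pre-terminal child). Which node label survives is my assumption, not necessarily the paper's choice.

```python
from nltk import Tree

def is_preterminal(t):
    return isinstance(t, Tree) and len(t) == 1 and isinstance(t[0], str)

def compress_single_inout(t):
    if not isinstance(t, Tree):
        return t
    while len(t) == 1 and isinstance(t[0], Tree) and not is_preterminal(t[0]):
        t = t[0]                              # skip the single in-and-out node
    return Tree(t.label(), [compress_single_inout(c) for c in t])

t = Tree.fromstring("(NP (NP (NP (NN headquarters))))")
print(compress_single_inout(t))               # -> (NP (NN headquarters))
```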

12 Some examples of DRT These are some examples of DRT.
(a) shows how the constituents before the 2nd entity can be removed. (b) shows how all the conjuncts other than the entity may be reduced into a single conjunct. (c) shows how the possessive structure should be kept along with the entity's headword to differentiate the instance from positive instances.

13 3. Unified Dynamic Relation Tree
T1: DRT T2: UDRT-Bottom T3: UDRT-Entity T4: UDRT-Top This illustration shows the original DRT (T1) and three DRT setups incorporating entity major type information: UDRT-Bottom (T2), UDRT-Entity (T3), and UDRT-Top (T4).

14 Four UDRT setups T1: DRT, with no entity-related information except the entity order (i.e. “E1” and “E2”). T2: UDRT-Bottom, the DRT with entity-related information attached at the bottom of the two entity nodes. T3: UDRT-Entity, the DRT with entity-related information attached in the entity nodes. T4: UDRT-Top, the DRT with entity-related information attached at the top node of the tree.
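The three attachment positions can be sketched as simple tree rewrites on an nltk tree. The entity node labels "E1"/"E2" and the hyphenated encoding are illustrative assumptions, not the paper's exact encoding.

```python
from nltk import Tree

def entity_nodes(tree, types):
    """Collect entity nodes first so we can mutate the tree safely."""
    return [s for s in tree.subtrees() if s.label() in types]

def udrt_bottom(tree, types):    # T2: type hung beneath each entity node
    for s in entity_nodes(tree, types):
        s.append(Tree(types[s.label()], []))
    return tree

def udrt_entity(tree, types):    # T3: type folded into the entity label
    for s in entity_nodes(tree, types):
        s.set_label(s.label() + "-" + types[s.label()])
    return tree

def udrt_top(tree, types):       # T4: both types on the root node
    tree.set_label("-".join([tree.label(), types["E1"], types["E2"]]))
    return tree

drt = Tree.fromstring("(NP (E1 (NN company)) (PP (IN in) (E2 (NNP Redmond))))")
types = {"E1": "ORG", "E2": "GPE"}
print(udrt_top(drt.copy(deep=True), types))   # root becomes NP-ORG-GPE
```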

15 4. Experimental results This section presents the experimental results.
Corpus statistics: the ACE RDC 2004 data contains 451 documents and 5702 relation instances. It defines 7 entity major types, 7 major relation types and 23 relation subtypes. Entity major types: PER, ORG, GPE, LOC, FAC, VEH, WEA. Relation major types: PHY, PER-SOC, EMP-ORG, ART, OTHER-AFF, GPE-AFF, DISC. For comparison purposes, evaluation is done on 347 (nwire/bnews) documents and 4307 relation instances using 5-fold cross-validation.
Corpus processing: the corpus is first parsed using Charniak's parser (Charniak, 2001); relation instances are then generated by iterating over all pairs of entity mentions occurring in the same sentence.
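A sketch of the instance-generation step as described: every pair of entity mentions in the same sentence becomes one candidate instance, labeled "NONE" when the gold data records no relation. The field names and the "NONE" label are my assumptions.

```python
from itertools import combinations

def generate_instances(mentions, gold_relations):
    """mentions: entity mentions of one sentence, in textual order;
    gold_relations: dict mapping a mention pair to its relation type."""
    for m1, m2 in combinations(mentions, 2):
        yield m1, m2, gold_relations.get((m1, m2), "NONE")

mentions = ["Microsoft Corp.", "Redmond", "WA"]
gold = {("Microsoft Corp.", "Redmond"): "GPE-AFF.Based"}
print(list(generate_instances(mentions, gold)))
```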

16 Classifier Tools The tools we used are SVMLight (Joachims 1998) and the Tree Kernel Toolkit (Moschitti 2004). For comparison purposes, the training parameters C (SVM) and λ (tree kernel) are set to 2.4 and 0.4 respectively. For efficiency, we also apply the one vs. others strategy, which builds K basic binary classifiers so as to separate one class from all the others.
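This is not the SVMLight/Tree Kernel toolkit pipeline itself, but an equivalent one-vs-others setup can be sketched with scikit-learn: K binary SVMs over a precomputed tree-kernel Gram matrix, with C = 2.4 as in the talk. The toy label-intersection kernel below merely stands in for the convolution tree kernel sketched earlier.

```python
import numpy as np
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import SVC

def labels(t):   # all node labels of a tuple-encoded tree
    return [t[0]] + sum((labels(c) for c in t[1:] if isinstance(c, tuple)), [])

def k(a, b):     # histogram-intersection kernel over node labels
    la, lb = labels(a), labels(b)
    return float(sum(min(la.count(x), lb.count(x)) for x in set(la)))

trees = [("NP", ("NN", "x")), ("VP", ("VB", "y")), ("NP", ("DT", "z"))]
y = ["EMP-ORG", "PHY", "EMP-ORG"]
G = np.array([[k(a, b) for b in trees] for a in trees])   # Gram matrix

clf = OneVsRestClassifier(SVC(kernel="precomputed", C=2.4))
clf.fit(G, y)                # one binary classifier per relation class
print(clf.predict(G))
```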

17 Contribution of various operation rules
Each operation rule is incrementally applied to the previously derived tree span. A plus sign preceding a rule indicates that the rule is useful and is automatically added in the next round; otherwise its performance figures are unavailable (marked “-”).

Operation rules      P     R     F
PT (baseline)        76.3  59.8  67.1
+DEL_ENT2_PRE              62.1  68.5
DEL_PATH_PP          -     -     -
DEL_PATH_ADVP        -     -     -
+CMP_SINGLE_INOUT    76.4  63.1  69.1
+CMP_NP_CC_NP        76.1  63.3
CMP_VP_CC_VP         -     -     -
+EXP_ENT2_POS        76.6  63.8  69.6
+EXP_ENT2_COREF      77.1  64.3  70.1

This table shows the contribution of each rule to performance on the major relation types in the ACE RDC 2004 corpus. The eventual Dynamic Relation Tree (DRT) achieves the best performance of 77.1/64.3/70.1 in P/R/F respectively after applying all the operation rules, an increase in F-measure of 3.0 units over the baseline PT, both with entity-type information attached in the entity nodes.
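The procedure behind the table can be sketched as a greedy forward selection over the rules; evaluate() below is a toy stand-in for the 5-fold cross-validation runs, with a couple of numbers borrowed from the table.

```python
def greedy_rule_selection(rules, evaluate):
    """Keep a rule only if it improves F on top of the rules kept so far."""
    kept, best_f = [], evaluate([])          # start from the bare PT
    for rule in rules:
        f = evaluate(kept + [rule])
        if f > best_f:                       # the "+rule" rows of the table
            kept, best_f = kept + [rule], f  # applied in the next round
    return kept, best_f

scores = {(): 67.1, ("DEL_ENT2_PRE",): 68.5}   # toy numbers from the table
def evaluate(applied):
    return scores.get(tuple(applied), 66.0)    # unseen combinations: toy default

print(greedy_rule_selection(["DEL_ENT2_PRE", "DEL_PATH_PP"], evaluate))
```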

18 Comparison of different UDRT setups
Tree setups    P     R     F
DRT            68.7  53.5  60.1
UDRT-Bottom    76.2  64.4  69.8
UDRT-Entity    77.1  64.3  70.1
UDRT-Top       76.4  65.2  70.4

This table compares the performance of the different UDRT setups. Compared with DRT, the Unified Dynamic Relation Trees (UDRTs) with only entity type information significantly improve the F-measure by 10 units on average, owing to increases in both precision and recall. Among the three UDRTs, UDRT-Top achieves slightly better performance than the other two.

19 Improvements of different tree setups over PT
CSPT over PT 1.5 1.1 1.3 DRT over PT 0.1 5.4 3.3 UDRT-Top over PT 3.9 9.4 7.2 Dynamic Relation Tree (DRT) performs better that CSPT/PT setups. the Unified Dynamic Relation Tree with entity-related semantic features attached at the top node of the parse tree performs best. This tables compares the performance improvements of different tree setups over the original PT. It shows that … And …

20 Comparison with best-reported systems
Systems                          P     R     F
Zhou et al.: composite kernel    82.2  70.2  75.8
Ours: CTK with UDRT-Top          80.2  69.2  74.3
Zhang et al.: composite kernel   76.1  68.4  72.1
Zhou et al.: CS-CTK with CSPT    81.1  66.7  73.2
Zhao and Grishman                      70.5  70.4
Ours: CTK with PT                74.1  62.4  67.7

Finally, we compare our system with the best-reported relation extraction systems on the ACE RDC 2004 corpus. Our UDRT-Top performs best among the tree setups using one single kernel, and even outperforms two of the previously reported composite kernels.

21 5. Conclusion The last section is the conclusion. From the previous experimental results, we can draw the following conclusions. The Dynamic Relation Tree (DRT), which is generated by applying various linguistics-driven rules, can significantly improve the performance over the currently used tree spans for relation extraction. Integrating entity-related semantic information into the DRT can further improve the performance, especially when it is attached at the top node of the tree.

22 Future Work In future work, we will focus on semantic matching in computing the similarity between two parse trees, where the semantic similarity between content words (such as “hire” and “employ”) would be considered to achieve better generalization.
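One possible realization of this semantic matching, assuming WordNet as the similarity resource (the talk does not name one):

```python
# Requires the WordNet data: nltk.download("wordnet")
from nltk.corpus import wordnet as wn

def word_sim(w1, w2):
    """Best Wu-Palmer similarity over all verb-sense pairs."""
    pairs = [(a, b) for a in wn.synsets(w1, wn.VERB)
                    for b in wn.synsets(w2, wn.VERB)]
    return max((a.wup_similarity(b) or 0.0) for a, b in pairs) if pairs else 0.0

print(word_sim("hire", "employ"))   # near 1.0: the verbs share a synset
```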

23 References
Bunescu R. C. and Mooney R. J. 2005. A Shortest Path Dependency Kernel for Relation Extraction. EMNLP-2005.
Charniak E. 2001. Immediate-head Parsing for Language Models. ACL-2001.
Collins M. and Duffy N. 2001. Convolution Kernels for Natural Language. NIPS-2001.
Collins M. and Duffy N. 2002. New Ranking Algorithms for Parsing and Tagging: Kernels over Discrete Structures, and the Voted Perceptron. ACL-2002.
Culotta A. and Sorensen J. 2004. Dependency Tree Kernels for Relation Extraction. ACL-2004.
Joachims T. 1998. Text Categorization with Support Vector Machines: Learning with Many Relevant Features. ECML-1998.
Moschitti A. 2004. A Study on Convolution Kernels for Shallow Semantic Parsing. ACL-2004.
Zelenko D., Aone C. and Richardella A. 2003. Kernel Methods for Relation Extraction. Journal of Machine Learning Research, 2003.
Zhang M., Zhang J., Su J. and Zhou G.D. 2006. A Composite Kernel to Extract Relations between Entities with both Flat and Structured Features. COLING-ACL-2006.
Zhao S.B. and Grishman R. 2005. Extracting Relations with Integrated Information Using Kernel Methods. ACL-2005.
Zhou G.D., Su J., Zhang J. and Zhang M. 2005. Exploring Various Knowledge in Relation Extraction. ACL-2005.

24 End Thank You!

