R98922004 Yun-Nung Chen (陳縕儂), first-year M.S. student, Computer Science

 Non-projective Dependency Parsing using Spanning Tree Algorithms (HLT/EMNLP 2005)
 Ryan McDonald, Fernando Pereira, Kiril Ribarov, Jan Hajič

 Each word depends on exactly one parent
 Projective: words in linear order, with
▪ edges that do not cross
▪ each word and its descendants forming a contiguous substring of the sentence

 English
▪ mostly projective, some non-projective
 Languages with more flexible word order
▪ mostly non-projective (e.g., German, Dutch, Czech)

 Related work: dependency structures are useful for
▪ relation extraction
▪ machine translation

 Dependency parsing can be formalized as the search for a maximum spanning tree in a directed graph

 sentence: x = x_1 … x_n
 the directed graph G_x = (V_x, E_x) given by
▪ V_x = {x_0 = root, x_1, …, x_n}
▪ E_x = {(i, j) : i ≠ j, j ≠ 0}, i.e. an edge from every node (including root) to every word
 dependency tree for x: y
▪ the tree G_y = (V_y, E_y) with V_y = V_x and E_y = {(i, j) : there is a dependency from x_i to x_j}

 scores of edges
 score of a dependency tree y for sentence x
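Written out as in the paper's formulation, with w the weight vector and f(i, j) the feature vector of the edge from x_i to x_j:

```latex
\[
s(i, j) = \mathbf{w} \cdot \mathbf{f}(i, j),
\qquad
s(\mathbf{x}, \mathbf{y}) = \sum_{(i, j) \in \mathbf{y}} s(i, j)
                          = \sum_{(i, j) \in \mathbf{y}} \mathbf{w} \cdot \mathbf{f}(i, j)
\]
```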

 x = John hit the ball with the bat
▪ (figure: three candidate dependency trees y_1, y_2, y_3 for x)

1) How to learn the weight vector w
2) How to find the tree with the maximum score

 dependency trees for x = spanning trees for G_x
 the dependency tree with maximum score for x = the maximum spanning tree for G_x

 Input: graph G = (V, E)
 Output: a maximum spanning tree in G
 For each vertex, greedily select the incoming edge with the highest weight
▪ if the result is a tree: done
▪ if it contains a cycle: contract the cycle into a single vertex and recalculate the weights of edges going into and out of the cycle
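A runnable sketch of this greedy-select / contract / recurse loop, to make the procedure concrete. The graph encoding (a dict of edge scores keyed by (head, dependent) with integer node ids, 0 as root), the function names, and the demo weights at the bottom are my own illustrative choices, not code or data from the paper; the demo weights are picked to be consistent with the John-saw-Mary walkthrough on the following slides.

```python
# Minimal sketch of Chu-Liu-Edmonds: greedily pick the best incoming edge
# per word, contract any cycle, recurse on the contracted graph, expand.

def chu_liu_edmonds(scores, root=0):
    """Return {dependent: head} for the maximum spanning tree of the graph."""
    nodes = {j for (_, j) in scores if j != root}

    # 1. Greedily take the highest-scoring incoming edge for every node.
    best_head = {}
    for j in nodes:
        best_head[j] = max((s, i) for (i, jj), s in scores.items() if jj == j)[1]

    # 2. If the chosen edges contain no cycle, they already form the tree.
    cycle = find_cycle(best_head)
    if cycle is None:
        return best_head

    # 3. Contract the cycle into a fresh node C and rescore its edges.
    cycle = set(cycle)
    cycle_score = sum(scores[(best_head[v], v)] for v in cycle)   # s(C)
    C = max(nodes) + 1
    new_scores, enter, leave = {}, {}, {}
    for (i, j), s in scores.items():
        if i in cycle and j in cycle:
            continue
        if j in cycle:                        # edge entering the cycle
            adjusted = s - scores[(best_head[j], j)] + cycle_score
            if adjusted > new_scores.get((i, C), float('-inf')):
                new_scores[(i, C)] = adjusted
                enter[i] = j                  # cycle node it actually enters
        elif i in cycle:                      # edge leaving the cycle
            if s > new_scores.get((C, j), float('-inf')):
                new_scores[(C, j)] = s
                leave[j] = i                  # cycle node it actually leaves
        else:
            new_scores[(i, j)] = s

    # 4. Recurse on the contracted graph, then expand the cycle again.
    contracted_tree = chu_liu_edmonds(new_scores, root)
    heads = {}
    for j, i in contracted_tree.items():
        if j == C:
            heads[enter[i]] = i               # the edge chosen to break the cycle
        elif i == C:
            heads[j] = leave[j]
        else:
            heads[j] = i
    for v in cycle:                           # keep the rest of the cycle's edges
        if v not in heads:
            heads[v] = best_head[v]
    return heads


def find_cycle(head):
    """Return one cycle (list of nodes) in a {dependent: head} map, or None."""
    for start in head:
        seen, v = [], start
        while v in head and v not in seen:
            seen.append(v)
            v = head[v]
        if v in seen:
            return seen[seen.index(v):]
    return None


# Demo weights consistent with the "John saw Mary" walkthrough on the next
# slides (0 = root, 1 = John, 2 = saw, 3 = Mary); weights not shown on the
# slides are guesses that reproduce the same intermediate numbers.
scores = {(0, 1): 9, (0, 2): 10, (0, 3): 9,
          (1, 2): 20, (1, 3): 3,
          (2, 1): 30, (2, 3): 30,
          (3, 1): 11, (3, 2): 0}
print(chu_liu_edmonds(scores))   # John <- saw, saw <- root, Mary <- saw
```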

 x = John saw Mary
▪ (figure: the graph G_x with all candidate edges and their weights)

 For each word, find the highest-scoring incoming edge

 If the result is a
▪ Tree: terminate and output
▪ Cycle: contract and recalculate

 Contract and recalculate
▪ contract the cycle into a single node
▪ recalculate the weights of edges going into and out of the cycle

 Outgoing edges from the cycle

 Incoming edges to the cycle are rescored
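In general, the edge from a vertex x outside the cycle into the contracted node gets the best adjusted score over the cycle's nodes, where s(C) is the total weight of the cycle's edges and a(v) is the predecessor of v inside the cycle; the next two slides instantiate this for x = root and x = Mary:

```latex
\[
s(x, C) = \max_{v \in C} \bigl[\, s(x, v) - s(a(v), v) + s(C) \,\bigr]
\]
```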

 x = root
▪ s(root, John) − s(a(John), John) + s(C) = 9 − 30 + 50 = 29
▪ s(root, saw) − s(a(saw), saw) + s(C) = 10 − 20 + 50 = 40

 x = Mary
▪ s(Mary, John) − s(a(John), John) + s(C) = 11 − 30 + 50 = 31
▪ s(Mary, saw) − s(a(saw), saw) + s(C) = 0 − 20 + 50 = 30

 Keep the highest-scoring tree inside the cycle (all of the cycle's edges except the one displaced by the chosen incoming edge)
 Run the algorithm recursively on the contracted graph
▪ (figure: the contracted graph)

 Find the incoming edge with the highest score for each word
 Tree: terminate and output

 The maximum spanning tree of G_x
▪ (figure: the resulting tree: saw ← root, John ← saw, Mary ← saw)

 Each recursive call takes O(n²) to find the highest-scoring incoming edge for every word
 At most O(n) recursive calls (the graph can be contracted at most n times)
 Total: O(n³)
 Tarjan gives an efficient implementation of the algorithm that runs in O(n²) for dense graphs

 Eisner algorithm: O(n³)
 Bottom-up dynamic programming
 Maintains the nested structural constraint (no crossing edges)
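As a reference, one standard way to write the first-order Eisner recurrences behind this bottom-up DP (the notation is mine, not from the slides): I[i][j][→] are incomplete spans where the arc i → j has just been added, C[i][j][→] are complete spans where the dependent's subtree is fully built, and the leftward cases are symmetric.

```latex
\[
I[i][j][\rightarrow] = \max_{i \le k < j} \bigl( C[i][k][\rightarrow] + C[j][k+1][\leftarrow] \bigr) + s(i, j)
\]
\[
C[i][j][\rightarrow] = \max_{i < k \le j} \bigl( I[i][k][\rightarrow] + C[k][j][\rightarrow] \bigr)
\]
```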

 Supervised learning
 Target: learn the weight vector w for edge features defined over word pairs (PoS tags)
 Training data: sentences paired with their correct dependency trees, {(x_t, y_t)}
 Testing data: the sentence x only

 Margin Infused Relaxed Algorithm (MIRA)
 dt(x): the set of possible dependency trees for x
▪ keep the new weight vector as close as possible to the old one
▪ the final weight vector is the average of the weight vectors after each iteration
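The update alluded to here, in the paper's formulation, where L(y, y') is the loss (the number of words in y' with an incorrect head relative to y):

```latex
\[
\mathbf{w}^{(i+1)} = \arg\min_{\mathbf{w}} \bigl\lVert \mathbf{w} - \mathbf{w}^{(i)} \bigr\rVert
\quad \text{s.t.} \quad
s(\mathbf{x}_t, \mathbf{y}_t) - s(\mathbf{x}_t, \mathbf{y}') \ge L(\mathbf{y}_t, \mathbf{y}')
\;\; \forall\, \mathbf{y}' \in \mathrm{dt}(\mathbf{x}_t)
\]
```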

 Single-best MIRA: use only the single margin constraint for the tree with the highest score under the current weights, y' = argmax_{y'} s(x_t, y')

 Local constraints
▪ correct incoming edge for word j vs. any other incoming edge for j: a margin of 1
▪ correct spanning tree vs. an incorrect spanning tree: a margin equal to the number of incorrect edges
 More restrictive than the original constraints
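Spelled out (following the paper's factored version): for every word x_j whose correct head is x_l and every other candidate head x_k,

```latex
\[
s(l, j) - s(k, j) \ge 1
\qquad \forall\, (l, j) \in \mathbf{y}_t, \; (k, j) \notin \mathbf{y}_t
\]
```

Summing these constraints over a tree shows that the correct tree outscores any incorrect tree by at least the number of wrong edges, which is the margin noted on this slide.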

 Language: Czech
▪ more flexible word order than English, hence non-projective dependencies
 Features: Czech PoS tags
▪ standard PoS, case, gender, tense
 Ratio of non-projective to projective edges
▪ less than 2% of all edges are non-projective
▪ Czech-A: the entire PDT
▪ Czech-B: only the 23% of sentences that contain a non-projective dependency

 COLL1999: the projective lexicalized phrase-structure parser
 N&N2005: the pseudo-projective parser
 McD2005: the projective parser using the Eisner algorithm and 5-best MIRA
 Single-best MIRA and Factored MIRA: the non-projective parsers using Chu-Liu-Edmonds

Results on Czech-A (the entire PDT) and Czech-B (only the sentences with non-projective dependencies), reporting Accuracy and Complete for each system:
▪ COLL1999, O(n⁵)
▪ N&N2005
▪ McD2005, O(n³)
▪ Single-best MIRA, O(n²)
▪ Factored MIRA, O(n²)
(table: the per-system Accuracy / Complete numbers)

Results on English, reporting Accuracy and Complete for each system:
▪ McD2005, O(n³)
▪ Single-best MIRA, O(n²)
▪ Factored MIRA, O(n²)
(table: the per-system Accuracy / Complete numbers)
 English dependency trees are projective
 The Eisner algorithm uses the a priori knowledge that all trees are projective
