Presentation is loading. Please wait.

Presentation is loading. Please wait.

Dependency Parsing Some slides are based on:

Similar presentations


Presentation on theme: "Dependency Parsing Some slides are based on:"— Presentation transcript:

1 Dependency Parsing Some slides are based on:
PPT presentation on dependency parsing by Prashanth Mannem Seven Lectures on Statistical Parsing by Christopher Manning

2 Constituency parsing Breaks sentence into constituents (phrases), which are then broken into smaller constituents Describes phrase structure and clause structure ( NP, PP, VP, etc.) Structures often recursive

3 S NP VP VP NP mom is an amazing show

4 Dependency parsing Syntactic structure consists of lexical items, linked by binary asymmetric relations called dependencies Interested in grammatical relations between individual words (governing & dependent words) Does not propose a recursive structure, rather a network of relations These relations can also have labels

5

6 Dependency vs. Constituency
Dependency structures explicitly represent Head-dependent relations (directed arcs) Functional categories (arc labels) Possibly some structural categories (parts-of-speech) Constituency structure explicitly represent Phrases (non-terminal nodes) Structural categories (non-terminal labels) Possibly some functional categories (grammatical functions)

7 Dependency vs. Constituency
A dependency grammar has a notion of a head Officially, CFGs don’t But modern linguistic theory and all modern statistical parsers (Charniak, Collins, …) do, via hand-written phrasal “head rules”: The head of a Noun Phrase is a noun/number/… The head of a Verb Phrase is a verb/modal/…. Based on a slide by Chris Manning

8 Dependency vs. Constituency
The head rules can be used to extract a dependency parse from a CFG parse (follow the heads) A phrase structure tree can be got from a dependency tree, but dependents are flat Based on a slide by Chris Manning

9 Definition: dependency graph
An input word sequence w1…wn Dependency graph G = (V,E) where V is the set of nodes i.e. word tokens in the input seq. E is the set of unlabeled tree edges (i, j) i, j є V (ii, j) indicates an edge from i (parent, head, governor) to j (child, dependent)

10 Definition: dependency graph
A dependency graph is well-formed iff Single head: Each word has only one head Acyclic: The graph should be acyclic Connected: The graph should be a single tree with all the words in the sentence Projective: If word A depends on word B, then all words between A and B are also subordinate to B (i.e. dominated by B)

11 Non-projective dependencies
Ram saw a dog yesterday which was a Yorkshire Terrier

12 Parsing algorithms Dependency based parsers can be broadly categorized into Grammar driven approaches Parsing done using grammars Data driven approaches Parsing by training on annotated/un-annotated data

13 Unlabeled graphs Dan Klein recently showed that labeling is relatively easy and that the difficulty of parsing lies in creating bracketing (Klein, 2004) Therefore some parsers run in two steps: 1) bracketing; 2) labeling

14 Traditions Dynamic programming Deterministic search
e.g., Eisner (1996), McDonald (2006) Deterministic search e.g., Covington (2001), Yamada and Matsumoto, Nivre (2006) Constraints satisfaction e.g., Maruyama, Foth et al.

15 Data driven Two main approaches
Global, Exhaustive, Graph-based parsing Local, greedy, transition-based parsing

16 Graph-based parsing Assume there is a scoring function:
The score of a graph is Parsing for input string x is All dependency graphs

17 MST algorithm (McDonald, 2006)
Scores are based on features, independent of other dependencies Features can be Head and dependent word and POS separately Head and dependent word and POS bigram features Words between head and dependent Length and direction of dependency

18 MST algorithm (McDonald, 2006)
Parsing can be formulated as maximum spanning tree problem Use Chu-Liu-Edmonds (CLE) algorithm for MST (runs in , considers non-projective arcs) Uses online learning for determining weight vector w

19 Transition-based parsing
A transition system for dependency parsing defines: a set C of parser configurations, each of which defines a (partially built) dependency graph G a set T of transitions, each a function t :CC for every sentence x = w0,w1, ,wn a unique initial configuration cx a set Qx of terminal configurations

20 Transition sequence A transition sequence Cx,m = (cx, c1, , cm) for a sentence x is a sequence of configurations such that and, for every there is a transition such that The graph defined by is the dependency graph of x

21 Transition scoring function
The score of a transition t in a configuration c s(c, t) represents the likelihood of taking transition t out of configuration c Parsing is finding the optimal transition sequence ( )

22 Yamada and Matsumoto (2003)
A transition-based (shift-reduce) parser Considers two adjacent words Runs in iterations, continues as long as new dependencies are created In every iteration, consider 3 different actions and choose one using SVM (or other discriminative learning technique) Time complexity Accuracy was shown to be close to the state-of-the-art algorithms (e.g., Eisner’s)

23 Y&M (2003) Actions Shift Left Right

24 Y&M (2003) Learning Features (lemma, POS tag) are collected from the context

25 Stack-based parsing Introducing a stack and a buffer
The buffer is a queue of all input words (left to right) The stack begins empty; words are pushed to the stack by the defined actions Reduces Y&M complexity to linear time

26 2 stack-based parsers Nivre’s (2003, 2006) arc-standard Stack Buffer
i doesn’t have a head already j doesn’t have a head already Stack Buffer

27 2 stack-based parsers Nivre’s (2003, 2006) arc-eager

28 Borrowed from Dependency Parsing (P. Mannem)
Example (arc eager) _ROOT_ Red figures on the screen indicated falling stocks S Q Borrowed from Dependency Parsing (P. Mannem)

29 Borrowed from Dependency Parsing (P. Mannem)
Example _ROOT_ Red figures on the screen indicated falling stocks S Q Shift Borrowed from Dependency Parsing (P. Mannem)

30 Borrowed from Dependency Parsing (P. Mannem)
Example _ROOT_ Red figures on the screen indicated falling stocks S Q Left-arc Borrowed from Dependency Parsing (P. Mannem)

31 Borrowed from Dependency Parsing (P. Mannem)
Example _ROOT_ Red figures on the screen indicated falling stocks S Q Shift Borrowed from Dependency Parsing (P. Mannem)

32 Borrowed from Dependency Parsing (P. Mannem)
Example _ROOT_ Red figures on the screen indicated falling stocks S Q Right-arc Borrowed from Dependency Parsing (P. Mannem)

33 Borrowed from Dependency Parsing (P. Mannem)
Example _ROOT_ Red figures on the screen indicated falling stocks S Q Shift Borrowed from Dependency Parsing (P. Mannem)

34 Borrowed from Dependency Parsing (P. Mannem)
Example _ROOT_ Red figures on the screen indicated falling stocks S Q Left-arc Borrowed from Dependency Parsing (P. Mannem)

35 Borrowed from Dependency Parsing (P. Mannem)
Example _ROOT_ Red figures on the screen indicated falling stocks S Q Right-arc Borrowed from Dependency Parsing (P. Mannem)

36 Borrowed from Dependency Parsing (P. Mannem)
Example _ROOT_ Red figures on the screen indicated falling stocks S Q Reduce Borrowed from Dependency Parsing (P. Mannem)

37 Borrowed from Dependency Parsing (P. Mannem)
Example _ROOT_ Red figures on the screen indicated falling stocks S Q Reduce Borrowed from Dependency Parsing (P. Mannem)

38 Borrowed from Dependency Parsing (P. Mannem)
Example _ROOT_ Red figures on the screen indicated falling stocks S Q Left-arc Borrowed from Dependency Parsing (P. Mannem)

39 Borrowed from Dependency Parsing (P. Mannem)
Example _ROOT_ Red figures on the screen indicated falling stocks S Q Right-arc Borrowed from Dependency Parsing (P. Mannem)

40 Borrowed from Dependency Parsing (P. Mannem)
Example _ROOT_ Red figures on the screen indicated falling stocks S Q Shift Borrowed from Dependency Parsing (P. Mannem)

41 Borrowed from Dependency Parsing (P. Mannem)
Example _ROOT_ Red figures on the screen indicated falling stocks S Q Left-arc Borrowed from Dependency Parsing (P. Mannem)

42 Borrowed from Dependency Parsing (P. Mannem)
Example _ROOT_ Red figures on the screen indicated falling stocks S Q Right-arc Borrowed from Dependency Parsing (P. Mannem)

43 Borrowed from Dependency Parsing (P. Mannem)
Example _ROOT_ Red figures on the screen indicated falling stocks S Q Reduce Borrowed from Dependency Parsing (P. Mannem)

44 Borrowed from Dependency Parsing (P. Mannem)
Example _ROOT_ Red figures on the screen indicated falling stocks S Q Reduce Borrowed from Dependency Parsing (P. Mannem)

45 Graph (MSTParser) vs. Transitions (MaltParser)
Accuracy on different languages Characterizing the Errors of Data-Driven Dependency Parsing Models, McDonald and Nivre 2007

46 Graph (MSTParser) vs. Transitions (MaltParser)
Sentence length vs. accuracy Characterizing the Errors of Data-Driven Dependency Parsing Models, McDonald and Nivre 2007

47 Graph (MSTParser) vs. Transitions (MaltParser)
Dependency length vs. precision Characterizing the Errors of Data-Driven Dependency Parsing Models, McDonald and Nivre 2007

48 Known Parsers Stanford (constituency + dependency)
MaltParser (dependency) MSTParser (dependency) Hebrew Yoav Goldberg’s parser (


Download ppt "Dependency Parsing Some slides are based on:"

Similar presentations


Ads by Google