1 Syntax 3: Back to State Networks... Recursive Transition Networks
John Barnden, School of Computer Science, University of Birmingham
Natural Language Processing 1, 2010/11, Semester 2

2 Aim of this Short Section Give a brief sketch of an alternative mechanism for syntactic analysis: Recursive Transition Networks (RTNs). Most naturally viewed as top-down. Most naturally operated depth-first. Emphasize thereby a major limitation of FSAs (finite-state automata/networks). RTNs are finite-state in a static sense, but dynamically (i.e., while running) they can in effect lead to an unlimited number of network states, because they recursively generate copies of states. This is just a special case of a familiar fact: recursion in an ordinary program can lead to indefinitely many subprogram calls being in progress at any given moment, even though there are only finitely many subprograms.

3 RTN Example: part. [Diagram: the S network, with start state S1, further states S2 to S6, an exit state, and arcs labelled NP, Verb, is and PRED; plus the PRED network, with states PR1 and PR2 and arcs labelled Adj and PP.]

4 RTN Example: rest. [Diagram: the NP network, with states N1 to N4 and arcs labelled Det, Noun, ProperNoun and PP; plus the PP network, with states PP1 to PP3 and arcs labelled Prep and NP.]
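To make the mechanism concrete, here is a minimal RTN recognizer sketch in Python. The networks follow the S, PRED, NP and PP networks of the last two slides, but the exact arc layout is a reconstruction (the original diagrams are only partly recoverable) and the lexicon is an invented toy one. The key point is that an arc label may be the name of another network, which the recognizer enters recursively, top-down and depth-first.

```python
# Minimal RTN recognizer sketch. Assumption: the arc layout below is a
# reconstruction of the S / PRED / NP / PP networks on the slides, and the
# lexicon is a toy one invented for illustration.

LEXICON = {
    "john": "ProperNoun",
    "the": "Det",
    "dog": "Noun", "park": "Noun",
    "sleeps": "Verb", "saw": "Verb",
    "happy": "Adj",
    "in": "Prep",
}

# Each network: a start state, a set of exit states, and arcs mapping a
# state to (label, next-state) pairs. A label is a lexical category, a
# literal word, or the name of another network (the recursive case).
NETWORKS = {
    "S":    {"start": "S1", "exits": {"S3", "S5", "S6"},
             "arcs": {"S1": [("NP", "S2")],
                      "S2": [("Verb", "S3"), ("is", "S4")],
                      "S3": [("NP", "S6")],
                      "S4": [("PRED", "S5")]}},
    "PRED": {"start": "PR1", "exits": {"PR2"},
             "arcs": {"PR1": [("Adj", "PR2"), ("PP", "PR2")]}},
    "NP":   {"start": "N1", "exits": {"N3", "N4"},
             "arcs": {"N1": [("Det", "N2"), ("ProperNoun", "N3")],
                      "N2": [("Noun", "N3")],
                      "N3": [("PP", "N4")]}},
    "PP":   {"start": "PP1", "exits": {"PP3"},
             "arcs": {"PP1": [("Prep", "PP2")],
                      "PP2": [("NP", "PP3")]}},
}

def traverse(net_name, words, pos):
    """Yield every input position reachable after traversing the named
    network starting at words[pos] (top-down, depth-first)."""
    net = NETWORKS[net_name]

    def walk(state, i):
        if state in net["exits"]:
            yield i                                   # network may exit here
        for label, nxt in net["arcs"].get(state, []):
            if label in NETWORKS:                     # sub-network call: recurse
                for j in traverse(label, words, i):
                    yield from walk(nxt, j)
            elif i < len(words) and (LEXICON.get(words[i]) == label
                                     or words[i] == label):
                yield from walk(nxt, i + 1)           # consume one word

    yield from walk(net["start"], pos)

def recognize(sentence):
    words = sentence.lower().split()
    return any(j == len(words) for j in traverse("S", words, 0))

print(recognize("the dog sleeps"))                 # True
print(recognize("the dog in the park is happy"))   # True (PP inside NP)
print(recognize("dog the sleeps"))                 # False
```

Because sub-network calls nest, a more deeply embedded phrase simply produces deeper recursion in traverse; the static description stays finite, which is exactly the point made on slide 2.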

5 Comments
RTNs can be annotated to return syntactic structures, manipulate grammatical features, etc.
RTNs can parse the same inputs as CFGs.
RTNs and CFGs can deal with unrestricted centre embedding (the subscripts mark the matching subject/verb layers):
The book that the professor₁ recommended₁ is good.
The book that the professor₁ that the students₂ like₂ recommended₁ is good.
The book that the professor₁ that the students₂ that dogs₃ hate₃ like₂ recommended₁ is good.
English grammar supposedly allows this to unrestricted depth. But this is disputed...
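For comparison, here is a small CFG that licenses the centre-embedded relative clauses above. The rules are my own illustrative ones, not taken from the slides, and the sketch assumes the NLTK toolkit is available.

```python
import nltk

# Illustrative CFG (my own rules, not from the slides): the recursive
# NP / RelC rules allow a relative clause inside a relative clause, which
# is what produces centre embedding.
grammar = nltk.CFG.fromstring("""
  S    -> NP VP
  NP   -> Det N | Det N RelC
  RelC -> 'that' NP TV
  VP   -> 'is' Adj
  TV   -> 'recommended' | 'like'
  Det  -> 'the'
  N    -> 'book' | 'professor' | 'students'
  Adj  -> 'good'
""")

parser = nltk.ChartParser(grammar)
sentence = ("the book that the professor that the students like "
            "recommended is good").split()
for tree in parser.parse(sentence):
    tree.pretty_print()   # shows RelC nested inside RelC (depth 2)
```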

6 First, some variant examples
Without the "that"s:
The book is good.
The book the professor₁ recommended₁ is good.
The book the professor₁ the students₂ like₂ recommended₁ is good.
The book the professor₁ the students₂ dogs₃ hate₃ like₂ recommended₁ is good.
Formally similar, but much more difficult still (even if you put the "that"s back):
Dogs fight.
Dogs dogs fight fight.
Dogs dogs dogs fight fight fight.
Dogs dogs dogs dogs fight fight fight fight.

7 The Dispute about such examples We feel increasing difficulty with increasing depth of centre embedding, especially but not exclusively when semantic cues don't serve to keep the levels distinct. It's traditional to say that this difficulty is "just" a matter of performance, i.e. just a matter of practical restrictions on what we can handle by way of grammar, and doesn't affect our competence, i.e. the knowledge of grammar that we actually have. And certainly, it might be argued, if we think hard and long enough we can make sense of any amount of embedding. But another view is that such competence is illusory, in the sense that it's not part of our natural, unreflective use of language: we can only go beyond the limitations by special, conscious effort. On this view, the grammar involved in natural communication does not include things like unrestricted centre embedding. Many connectionists (neural-net theorists), and others, hold views like this.

8 Finite State Automata/Networks FSAs (whose arcs are labelled with lexical forms) cannot achieve unrestricted centre embedding. They are therefore traditionally said to be inadequate for capturing the grammars of natural languages. What is crucial is that RTNs can have arc labels that are network names, and moreover that recursion can happen. Caution: you can split an FSA into fragments and have arc labels that name FSA fragments, but if this cannot lead to recursion it is merely an organizational convenience.

9 Finite State Automata/Networks Notice that the centre-embedded examples above have the basic structure A^n B C^n, where A, B and C are syntactic categories of some sort and both occurrences of n are the same value. (See the subscripts annotating the examples: they show the n layers.) It can be proved that an FSA cannot recognize such structures while excluding structures of the form A^n B C^m where m differs from n. The basic problem with FSAs is that they cannot remember a count that could get indefinitely large, in this case n. Since the automaton can only ever be in finitely many different states, it cannot reliably know, once something of the form A^n B has been consumed, what n was in every case. So there is no way it could consume exactly the right number of Cs in all cases. (A counter-based sketch of the contrast follows below.)
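To see the contrast concretely, here is a tiny sketch, my own and for the simplified case where A, B and C are the single symbols a, b and c, of a recognizer for a^n b c^n. It uses one integer counter, and that counter does exactly the unbounded bookkeeping that a fixed finite set of states cannot do.

```python
# Sketch of why a counter (unavailable to an FSA) suffices for a^n b c^n.
# An FSA has only finitely many states, so it cannot store n exactly for
# arbitrarily large n; the single integer below is doing that bookkeeping.
def accepts_anbcn(s):
    i, n = 0, 0
    while i < len(s) and s[i] == 'a':      # count the leading a's
        i += 1; n += 1
    if i >= len(s) or s[i] != 'b':         # exactly one b must follow
        return False
    i += 1
    while i < len(s) and s[i] == 'c':      # check off one c per a
        i += 1; n -= 1
    return i == len(s) and n == 0          # all input used, counts match

print(accepts_anbcn("aaabccc"))   # True  (n = m = 3)
print(accepts_anbcn("aaabcc"))    # False (n = 3, m = 2)
```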

10 FSAs, contd. Exercise: satisfy yourself that if there is some fixed limit on n, then an FSA can do the job. Simple exercise: be sure you know how an FSA could recognize things of the form A^n B C^m where m may be different from, or equal to, n (a sketch of this second case is given below). In both tasks you can take A, B and C to be the single lexical forms a, b and c, for simplicity. The point of these exercises is to show that the real problem is the inability to remember a count that can become indefinitely large.
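By contrast, the second exercise really is finite-state. Here is a sketch under the same a/b/c simplification, with state names of my own choosing; note that it happily accepts both m equal to n and m different from n, which is precisely the distinction it cannot make.

```python
# A genuine finite-state automaton (fixed transition table, no counter)
# accepting a^n b c^m with n and m independent. State names are my own.
TRANSITIONS = {
    ("A", "a"): "A",      # loop: consume any number of a's
    ("A", "b"): "B",      # the single b
    ("B", "c"): "C",      # first c, if any
    ("C", "c"): "C",      # loop: consume any number of further c's
}
ACCEPTING = {"B", "C"}    # accept with zero or more trailing c's

def fsa_accepts(s):
    state = "A"
    for ch in s:
        state = TRANSITIONS.get((state, ch))
        if state is None:
            return False
    return state in ACCEPTING

print(fsa_accepts("aaabcc"))   # True  (n = 3, m = 2)
print(fsa_accepts("aaabccc"))  # True  (n = m = 3; the FSA cannot tell)
print(fsa_accepts("aaacc"))    # False (no b)
```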

