88-680 Natural Language Processing, Lesson Seven: Partial Parsing. Oren Glickman.


1 88-680 Natural Language Processing, Lesson Seven: Partial Parsing
Oren Glickman, Department of Computer Science, Bar-Ilan University

2 Syntax
"The study of grammatical relations between words and other units within the sentence." (The Concise Oxford Dictionary of Linguistics)
"The way in which linguistic elements (as words) are put together to form constituents (as phrases or clauses)." (Merriam-Webster Dictionary)

3 Brackets
"I prefer a morning flight"
[S [NP [Pro I]] [VP [V prefer] [NP [Det a] [Nom [N morning] [N flight]]]]]
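Bracketed notation of this kind can be read back into a tree with a small recursive reader. The following is a minimal sketch, not part of the original slides; the function names `parse_brackets` and `leaves` and the tuple representation `(label, children)` are illustrative assumptions.

```python
import re


def parse_brackets(text):
    """Read '[S [NP ...] ...]' into nested (label, children) tuples."""
    # Tokens are '[', ']' and bare symbols (category labels and words).
    tokens = re.findall(r"\[|\]|[^\[\]\s]+", text)
    pos = 0

    def parse_node():
        nonlocal pos
        # A node has the shape: '[' LABEL (node | word)* ']'
        if pos >= len(tokens) or tokens[pos] != "[":
            raise ValueError("expected '['")
        pos += 1
        if pos >= len(tokens) or tokens[pos] in ("[", "]"):
            raise ValueError("expected a category label after '['")
        label = tokens[pos]
        pos += 1
        children = []
        while pos < len(tokens) and tokens[pos] != "]":
            if tokens[pos] == "[":
                children.append(parse_node())
            else:
                children.append(tokens[pos])
                pos += 1
        if pos >= len(tokens):
            raise ValueError("missing ']'")
        pos += 1  # consume the closing bracket
        return (label, children)

    tree = parse_node()
    if pos != len(tokens):
        raise ValueError("unbalanced brackets: text continues after the tree")
    return tree


def leaves(tree):
    """Return the words of the sentence in left-to-right order."""
    _, children = tree
    words = []
    for child in children:
        if isinstance(child, tuple):
            words.extend(leaves(child))
        else:
            words.append(child)
    return words


tree = parse_brackets(
    "[S [NP [Pro I]] [VP [V prefer] [NP [Det a] [Nom [N morning] [N flight]]]]]"
)
print(tree[0])                 # root category
print(" ".join(leaves(tree)))  # the words of the sentence
```

A string with one closing bracket too many is rejected with a `ValueError` rather than silently accepted, which makes the reader usable as a balance check.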

4 Parse Tree
[Figure: parse tree for "I prefer a morning flight", with nodes S, NP, VP, Pronoun, Verb, Det, Nom, and Noun.]

5 Parsing
The problem of mapping from a string of words to its parse tree is called parsing.

6 Generative Grammar
A set of rules which indicate precisely what can and cannot be a sentence in a language.
A grammar which precisely specifies the membership of the set of all the grammatical sentences in the language in question and therefore excludes all the ungrammatical sentences.

7 Formal Languages
The set of all grammatical sentences in a given natural language.
Are natural languages regular?

8 English is not a regular language!
a^n b^n is not regular
Look at the following English sentences:
–John and Mary like to eat and sleep, respectively.
–John, Mary, and Sue like to eat, sleep, and dance, respectively.
–John, Mary, Sue, and Bob like to eat, sleep, dance, and cook, respectively.

9 Constituents
Certain groupings of words behave as constituents.
Constituents are able to occur in various sentence positions:
–ראיתי את הילד הרזה (I saw the thin boy)
–ראיתי אותו מדבר עם הילד הרזה (I saw him talking with the thin boy)
–הילד הרזה גר ממול (The thin boy lives across the street)

10 The Noun Phrase (NP)
Examples:
–He
–Ariel Sharon
–The prime minister
–The minister of defense during the war in Lebanon
They can all appear in a similar context: ___ was born in Kfar-Malal

11 Prepositional Phrases
Examples (prepositional phrase in brackets):
–the man [in the white suit]
–Come and look [at my paintings]
–Are you fond [of animals]?
–Put that thing [on the floor]

12 Verb Phrases
Examples:
–Getting to school on time was a struggle.
–He was trying to keep his temper.
–That woman quickly showed me the way to hide.

13 Chunking
Text chunking is dividing sentences into non-overlapping phrases.
Noun phrase chunking deals with extracting the noun phrases from a sentence.
While NP chunking is much simpler than parsing, it is still a challenging task to build an accurate and very efficient NP chunker.

14 What is it good for?
The importance of chunking derives from the fact that it is used in many applications:
–Information Retrieval & Question Answering
–Machine Translation
–Preprocessing before full syntactic analysis
–Text to speech
–Many other applications

15 What kind of structures should a partial parser identify?
Different structures are useful for different tasks:
–Partial constituent structure
  [NP I] [VP saw [NP a tall man in the park]].
–Prosodic segments
  [I saw] [a tall man] [in the park].
–Content word groups
  [I] [saw] [a tall man] [in the park].

16 Chunk Parsing
Goal: divide a sentence into a sequence of chunks.
Chunks are non-overlapping regions of a text:
–[I] saw [a tall man] in [the park].
Chunks are non-recursive
–a chunk cannot contain other chunks
Chunks are non-exhaustive
–not all words are included in chunks

17 Chunk Parsing Examples
Noun-phrase chunking:
–[I] saw [a tall man] in [the park].
Verb-phrase chunking:
–The man who [was in the park] [saw me].
Prosodic chunking:
–[I saw] [a tall man] [in the park].

18 Chunks and Constituency
Constituents: [a tall man in [the park]].
Chunks: [a tall man] in [the park].
Chunks are not constituents
–Constituents are recursive
Chunks are typically subsequences of constituents
–Chunks do not cross constituent boundaries

19 Chunk Parsing: Accuracy
Chunk parsing achieves higher accuracy than full parsing
–Smaller solution space
–Less word-order flexibility within chunks than between chunks
–Better locality:
  Fewer long-range dependencies
  Less context dependence
–No need to resolve ambiguity
–Less error propagation

20 Chunk Parsing: Domain Specificity
Chunk parsing is less domain specific:
Dependencies on lexical/semantic information tend to occur at levels "higher" than chunks:
–Attachment
–Argument selection
–Movement
Fewer stylistic differences within chunks

21 Chunk Parsing: Efficiency
Chunk parsing is more efficient
–Smaller solution space
–Relevant context is small and local
–Chunks are non-recursive
–Chunk parsing can be implemented with a finite state machine
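To illustrate the finite-state claim, here is a minimal sketch of a hand-written state machine that finds noun-phrase chunks in one left-to-right pass over the part-of-speech tags. It is an illustration rather than the method used in the course: the toy pattern (an optional determiner, any number of adjectives, then one noun), the Penn Treebank style tag names, and the state names `OUT`, `DET`, `ADJ` are all assumptions.

```python
# Transitions that extend a candidate chunk. A noun (NN) completes a chunk
# from any state, so it is handled separately in the loop below.
TRANSITIONS = {
    ("OUT", "DT"): "DET",  # a determiner opens a candidate chunk
    ("OUT", "JJ"): "ADJ",  # so does an adjective
    ("DET", "JJ"): "ADJ",  # adjectives may follow the determiner
    ("ADJ", "JJ"): "ADJ",  # any number of adjectives
}


def fsm_chunk(tags):
    """Return (start, end) token spans of chunks, end exclusive."""
    chunks = []
    state, start = "OUT", None
    for i, tag in enumerate(tags):
        if tag == "NN":
            # Accepting transition: close the candidate, or emit a bare noun.
            chunks.append((i if state == "OUT" else start, i + 1))
            state, start = "OUT", None
        elif (state, tag) in TRANSITIONS:
            if state == "OUT":
                start = i
            state = TRANSITIONS[(state, tag)]
        else:
            # Dead end: drop the candidate and retry this tag from OUT.
            state, start = "OUT", None
            if ("OUT", tag) in TRANSITIONS:
                state, start = TRANSITIONS[("OUT", tag)], i
    return chunks


tags = ["DT", "JJ", "NN", "VBD", "IN", "DT", "NN"]
print(fsm_chunk(tags))
```

Each tag is inspected once and the only memory kept is the current state and the start of the candidate chunk, so the running time is linear in sentence length.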

22 Psycholinguistic Motivations
Chunk parsing is psycholinguistically motivated:
Chunks as processing units
–Humans tend to read texts one chunk at a time
  Eye-movement tracking studies
Chunks are phonologically marked
–Pauses, stress patterns
Chunking might be a first step in full parsing

23 Chunk Parsing Techniques
Chunk parsers usually ignore lexical content
Only need to look at part-of-speech tags
Techniques for implementing chunk parsing:
–Regular expression matching / Finite State Machines
–Transformation Based Learning
–Memory Based Learning
–Others

24 Regular Expression Matching
Define a regular expression that matches the sequences of tags in a chunk
–A simple noun phrase chunk regexp: <DT>? <JJ>* <NN>
–Chunk all matching subsequences:
  the/DT little/JJ cat/NN sat/VBD on/IN the/DT mat/NN
  [the/DT little/JJ cat/NN] sat/VBD on/IN [the/DT mat/NN]
–If matching subsequences overlap, the first one gets priority
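Regular-expression chunking can be sketched with Python's standard `re` module by writing the tag sequence as a string and matching against it. This is a minimal sketch under stated assumptions: the pattern (optional DT, any number of JJ, one NN) is inferred from the chunked example above, and the helper names `regexp_chunk` and `render` are hypothetical.

```python
import re

# Assumed NP pattern over tags: optional determiner, adjectives, one noun.
NP_PATTERN = re.compile(r"(?:<DT>)?(?:<JJ>)*<NN>")


def regexp_chunk(tagged):
    """tagged: list of (word, tag). Returns (start, end) spans, end exclusive."""
    # Encode the tag sequence as one string, e.g. "<DT><JJ><NN><VBD>".
    tag_string = "".join(f"<{tag}>" for _, tag in tagged)
    spans = []
    # finditer scans left to right and never returns overlapping matches,
    # so when candidate chunks overlap, the earlier one wins.
    for match in NP_PATTERN.finditer(tag_string):
        # Counting '<' before an offset converts it to a token index.
        first = tag_string.count("<", 0, match.start())
        last = tag_string.count("<", 0, match.end())
        spans.append((first, last))
    return spans


def render(tagged, spans):
    """Show the chunks in the bracketed word/TAG notation."""
    starts = {start for start, _ in spans}
    ends = {end - 1 for _, end in spans}
    out = []
    for i, (word, tag) in enumerate(tagged):
        token = f"{word}/{tag}"
        if i in starts:
            token = "[" + token
        if i in ends:
            token = token + "]"
        out.append(token)
    return " ".join(out)


sentence = [("the", "DT"), ("little", "JJ"), ("cat", "NN"), ("sat", "VBD"),
            ("on", "IN"), ("the", "DT"), ("mat", "NN")]
print(render(sentence, regexp_chunk(sentence)))
# prints: [the/DT little/JJ cat/NN] sat/VBD on/IN [the/DT mat/NN]
```

Wrapping each tag in angle brackets keeps `<NN>` from matching inside a longer tag such as `<NNS>`, and guarantees that every match begins at a token boundary.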

25 Chunking as Tagging
Map part-of-speech tag sequences to {I,O,B}*
–I: tag is part of an NP chunk
–O: tag is not part of an NP chunk
–B: the first tag of an NP chunk which immediately follows another NP chunk
Example:
–Input: The little cat sat on the mat
–Output: I I I O O I I (if B instead marks the first tag of every chunk: B I I O O B I)
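The mapping between chunk spans and I/O/B tags can be sketched in a few lines. Under the definition as stated, B is needed only when one chunk immediately follows another; a common variant marks the first tag of every chunk with B, which gives the sequence B I I O O B I for the example sentence. The function names and the `b_on_every_chunk` flag below are illustrative assumptions.

```python
def spans_to_iob(n_tokens, spans, b_on_every_chunk=False):
    """Encode (start, end) chunk spans, end exclusive, as I/O/B tags."""
    tags = ["O"] * n_tokens
    prev_end = None
    for start, end in sorted(spans):
        for i in range(start, end):
            tags[i] = "I"
        # B separates a chunk from the chunk directly before it;
        # the variant scheme uses B at the start of every chunk.
        if b_on_every_chunk or start == prev_end:
            tags[start] = "B"
        prev_end = end
    return tags


def iob_to_spans(tags):
    """Decode I/O/B tags back into chunk spans (accepts both schemes)."""
    spans = []
    start = None
    for i, tag in enumerate(tags):
        if tag in ("O", "B") and start is not None:
            spans.append((start, i))  # the current chunk ends here
            start = None
        if tag == "B" or (tag == "I" and start is None):
            start = i                 # a new chunk begins here
    if start is not None:
        spans.append((start, len(tags)))
    return spans


words = "The little cat sat on the mat".split()
spans = [(0, 3), (5, 7)]  # [The little cat] sat on [the mat]
print(" ".join(spans_to_iob(len(words), spans)))
# prints: I I I O O I I
print(" ".join(spans_to_iob(len(words), spans, b_on_every_chunk=True)))
# prints: B I I O O B I
```

Because every token receives exactly one of three tags, chunking reduces to a sequence tagging problem, and the decoder recovers the same spans from either scheme.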

26 Chunking State of the Art
Depending on task specification and test set: 90-95%

27 Homework

28 Context Free Grammars
Putting the constituents together
Next Week…

