Published by Marjory Harriet Hood. Modified over 8 years ago.
1
10/31/00
Introduction to Cognitive Science: Linguistics Component
Topic: Formal Grammars: Generating and Parsing
Lecturer: Dr Bodomo
2
Introduction
In my previous lectures, we discussed how tacit linguistic knowledge can be represented at the levels of phonology, morphology, syntax, semantics, and pragmatics, and at their interfaces, including morphophonology, morphosyntax, and the syntax-semantics interface. In this lecture, we shall look closely at how these representations of linguistic knowledge can be formalised into an algorithm, i.e. a computational procedure for processing this linguistic knowledge.
3
Keywords
–Constituent structure rules
–Initial symbol
–Terminal symbol
–Non-terminal symbol
–Generative grammar
–Formal grammar
4
Formal devices and notation
The symbol ‘→’ indicates that a node is ‘rewritten as…’, ‘consists of’, or ‘has the constituents…’. It is used in rewrite rules of the type:
S → NP + VP
–a sentence, S, has the constituents noun phrase (NP) and verb phrase (VP)
A choice between alternatives is expressed as {X, Y}: apply either X or Y, but not both.
5
Formal devices and notation (cont.)
The symbol # is used to indicate a constituent boundary
–e.g. #_ is word-initial, while _# is word-final
The notation X (Y) means that X is obligatory and may optionally be followed by Y.
Initial symbol: the symbol from which rewriting begins (e.g. S).
Terminal symbol: an end symbol from which no constituent structure can be further developed (e.g. N, V, Art). All other symbols are non-terminal symbols (e.g. NP, VP).
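The abbreviating notations X (Y) and {X, Y} each stand for more than one plain rewrite rule. The following is a minimal sketch of how they could be spelled out. The encoding of optional items as `("opt", ...)` and of alternatives as `("alt", ...)`, and the function name `expansions`, are assumptions made for illustration only; they are not part of the lecture's notation.

```python
from itertools import product

def expansions(rhs):
    """Expand a right-hand side that contains optional or alternative items.

    Each item is a plain symbol "X", an optional symbol ("opt", "Y"),
    or a choice between alternatives ("alt", "X", "Y").
    (This encoding is hypothetical, chosen only for this sketch.)
    """
    options = []
    for item in rhs:
        if isinstance(item, str):
            options.append([[item]])                   # obligatory symbol
        elif item[0] == "opt":
            options.append([[item[1]], []])            # present or absent
        elif item[0] == "alt":
            options.append([[alt] for alt in item[1:]])  # exactly one of them
        else:
            raise ValueError(f"unknown item: {item!r}")
    # Pick one possibility per item and concatenate them in order.
    return [sum(choice, []) for choice in product(*options)]

print(expansions(["X", ("opt", "Y")]))   # X (Y)
print(expansions([("alt", "X", "Y")]))   # {X, Y}
```

Under this encoding, X (Y) yields the two right-hand sides "X Y" and "X", while {X, Y} yields "X" and "Y" but never both together.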
6
Generating and Parsing sentences
There are two main aspects of grammatical information processing: generating and parsing sentences. Before we begin, let us illustrate with a simple grammar and lexicon, using the following sentence:
–The students greeted the teacher.
7
The students greeted the teacher.
Grammar:
–S → NP + VP
–VP → V + NP
–NP → Art + N
Lexicon 1:
–Greeted: V, _NP
–Students: N
–The: Art
–Teacher: N
To generate further sentences, the lexicon has to be augmented (i.e. enlarged) as follows:
Lexicon 2:
–An: Art
–The: Art
–Greeted, scared, ate: V, _NP
–Apple: N
–Child: N
–Students: N
–Teacher: N
With Lexicon 2, this grammar can also generate (i.e. produce) the following sentences:
–The teacher greeted the students
–The teacher scared the students
–The child ate an apple
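To see what this small grammar can produce with Lexicon 2, the rules and the lexicon can be written down as data and expanded exhaustively. This is only a sketch: the dictionary layout and the function name `expand` are assumptions for illustration, and words are lower-cased for convenience.

```python
from itertools import product

# Phrase structure rules from the slide: mother -> ordered daughters.
RULES = {
    "S": ["NP", "VP"],
    "VP": ["V", "NP"],
    "NP": ["Art", "N"],
}

# Lexicon 2 from the slide, grouped by word class.
LEXICON = {
    "Art": ["the", "an"],
    "N": ["students", "teacher", "child", "apple"],
    "V": ["greeted", "scared", "ate"],
}

def expand(symbol):
    """Return every word sequence that `symbol` can be rewritten as."""
    if symbol in LEXICON:                       # terminal symbol: pick a word
        return [[word] for word in LEXICON[symbol]]
    daughters = [expand(d) for d in RULES[symbol]]
    # Combine one expansion per daughter, keeping the daughters in order.
    return [sum(combo, []) for combo in product(*daughters)]

sentences = {" ".join(words) for words in expand("S")}

for example in ("the teacher greeted the students",
                "the teacher scared the students",
                "the child ate an apple"):
    print(example, "->", example in sentences)
```

The three sentences listed on the slide are all in the set. So are strings such as "an students scared an apple", because nothing in these three rules checks number agreement or meaning.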
8
Sentence Generation: the algorithm
To produce a sentence we need three things:
–A set of phrase structure rules (as illustrated above)
–A lexicon (as illustrated above), and
–A lexical insertion rule (as explained below)
A lexical insertion rule is an instruction to select the right word from a lexicon. The following is an example of a lexical insertion rule:
9
Lexical insertion rule
For each terminal symbol of a phrase structure rule, select a word from the lexicon that satisfies the following conditions:
–It is a member of the class of the terminal symbol (e.g. N, V)
–Its subcategorization frame matches that of the terminal symbol (e.g. V, _NP).
Attach this word as the daughter of this terminal symbol.
The set of rules above constitutes what is known as a sentence generator.
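The two conditions of the lexical insertion rule can be sketched as a simple filter over lexicon entries. The representation of an entry as a (word, class, frame) triple, the use of `None` for "no frame", and the function name `lexical_insertion` are assumptions made for this illustration.

```python
# Hypothetical lexicon entries: (word, word class, subcategorization frame).
# A frame of None means the entry places no requirement on what follows.
LEXICON = [
    ("the", "Art", None),
    ("students", "N", None),
    ("teacher", "N", None),
    ("greeted", "V", "_NP"),   # transitive: must be followed by an NP
]

def lexical_insertion(terminal, frame=None):
    """Return the words that may be attached as daughter of `terminal`.

    A word qualifies if it belongs to the class of the terminal symbol
    and its subcategorization frame matches the one required.
    """
    return [
        word
        for word, word_class, word_frame in LEXICON
        if word_class == terminal and word_frame == frame
    ]

print(lexical_insertion("N"))          # candidate nouns
print(lexical_insertion("V", "_NP"))   # verbs that take an NP object
print(lexical_insertion("V"))          # verbs with no frame: none in this lexicon
```

Because "greeted" is listed only with the frame _NP, it is available for a V that is followed by an NP, as in VP → V + NP, but not for a V standing alone.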
10
The whole procedure of beginning with an initial symbol, working through the phrase structure rules, and then adding the lexical items via the lexical insertion rule is driven by an algorithm, i.e. a set of instructions. Let us set out an algorithm for the generation (production) of the sentence "The students greeted the teacher", given a grammar and a lexicon as follows:
11
The students greeted the teacher
Grammar:
PS Rule 1: S → NP + VP
PS Rule 2: VP → V + NP
PS Rule 3: NP → Art + N
Lexicon 1:
Greeted: V, _NP
Students: N
The: Art
Teacher: N
i. Start with the initial symbol, S.
ii. For every non-terminal symbol, X, find a phrase structure rule with X as the left-hand symbol and other symbols on the right-hand side, and develop a rewrite rule with X as the mother and the right-hand symbols as ordered daughters.
iii. Apply rule ii until all branches end in terminal symbols.
iv. Apply the lexical insertion rule iteratively until every terminal symbol is replaced by a lexical item.
12
Illustrating the algorithm
Applying rule i, we get:
S
Applying rules ii and iii, we get:
NP VP (PS Rule 1)
Art N V NP (PS Rules 3 and 2)
Art N V Art N (PS Rule 3)
Applying rule iv, we get:
The teacher greeted the students
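The four steps of the generation algorithm can be sketched in a few lines. This is an illustration under stated assumptions rather than a definitive implementation: the function names `derive` and `insert_words` are hypothetical, every non-terminal in a line is rewritten in the same pass (as in the illustration on the slide), and the words for step iv are supplied by hand instead of being chosen automatically.

```python
RULES = {"S": ["NP", "VP"], "VP": ["V", "NP"], "NP": ["Art", "N"]}
LEXICON = {"Art": ["the"], "N": ["teacher", "students"], "V": ["greeted"]}

def derive(initial="S"):
    """Steps i-iii: rewrite non-terminals until only terminal symbols remain."""
    line = [initial]                      # step i: start with the initial symbol
    steps = [line]
    while any(sym in RULES for sym in line):
        # Step ii: rewrite every non-terminal in the line; keep terminals as is.
        line = [d for sym in line for d in RULES.get(sym, [sym])]
        steps.append(line)                # step iii: repeat until done
    return steps

def insert_words(terminals, words):
    """Step iv: attach one word per terminal symbol, checking its word class."""
    if len(terminals) != len(words):
        raise ValueError("need exactly one word per terminal symbol")
    for sym, word in zip(terminals, words):
        if word not in LEXICON[sym]:
            raise ValueError(f"{word!r} is not listed as {sym}")
    return " ".join(words)

steps = derive()
for line in steps:
    print(" ".join(line))
print(insert_words(steps[-1], ["the", "teacher", "greeted", "the", "students"]))
```

The printed lines follow the derivation S, then NP VP, then Art N V NP, then Art N V Art N, and end with the sentence itself. Supplying "students" and "teacher" in the opposite order yields "the students greeted the teacher" from the same derivation.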
13
From the above we can see that we started from an initial symbol and ended with terminal symbols that have lexical items as their daughters. A sentence has thus been generated (produced), and the derivation tells us how this sentence is built up. Now, let us see how we can begin with an existing sentence and break it down into its component parts by applying rules.
14
Sentence parsing: the algorithm
To parse a sentence means to analyse it into its constituent parts by the systematic application of lexical insertion rules and phrase structure rules. It is like the reverse of the process of generation.
15
PARSER
Some sentence parsing rules, which together constitute a PARSER. For a sentence, S:
i. Determine from the lexicon the word class of every item, and develop a partial tree for each word in which the word class label dominates the word.
ii. Find a PS rule of the type X → Y, Z whose right-hand symbols match some sequence of categories in the structure so far, and develop a partial tree with X as the mother and the right-hand symbols as ordered daughters.
iii. Repeat rule ii until the root, S, is reached and there are no unattached strings.
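The three parsing rules can be sketched as a simple bottom-up reduction over category labels. This is a minimal sketch under assumptions: the function name `parse_categories` is hypothetical, the grammar and lexicon are those of the example sentence "The man drank the tea", and the procedure always takes the first match it finds without backtracking, which is enough for this small grammar but not for grammars in general.

```python
RULES = [("S", ["NP", "VP"]), ("VP", ["V", "NP"]), ("NP", ["Art", "N"])]
LEXICON = {"the": "Art", "man": "N", "tea": "N", "drank": "V"}

def parse_categories(sentence):
    """Reduce a sentence to categories, returning each intermediate line."""
    # Rule i: look up the word class of every word.
    line = [LEXICON[word.lower()] for word in sentence.split()]
    trace = [line]
    # Rules ii and iii: keep replacing a matching sequence by its mother.
    changed = True
    while changed and line != ["S"]:
        changed = False
        for mother, daughters in RULES:
            n = len(daughters)
            for i in range(len(line) - n + 1):
                if line[i:i + n] == daughters:
                    line = line[:i] + [mother] + line[i + n:]
                    trace.append(line)
                    changed = True
                    break
            if changed:
                break
    return trace

for line in parse_categories("The man drank the tea"):
    print(" ".join(line))
```

The parse succeeds if the last line of the trace is the single symbol S. For a fragment such as "drank the tea", the trace ends in VP, so the root is never reached and the string is not accepted as a sentence.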
16
The man drank the tea.
Grammar:
PS Rule 1: S → NP + VP
PS Rule 2: VP → V + NP
PS Rule 3: NP → Art + N
Lexicon 1:
Drank: V, _NP
Man: N
The: Art
Tea: N
Looking up the word class of each word in the lexicon (parser rule i), we get:
[Art The] [N man] [V drank] [Art the] [N tea]
Applying PS Rule 3, we get:
[NP [Art The] [N man]] [V drank] [NP [Art the] [N tea]]
17
Applying PS Rule 2, we get:
[NP [Art The] [N man]] [VP [V drank] [NP [Art the] [N tea]]]
Applying PS Rule 1, we get:
[S [NP [Art The] [N man]] [VP [V drank] [NP [Art the] [N tea]]]]
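The partial trees built during parsing can be kept as nested structures, so that the finished parse shows which constituent dominates which. The sketch below assumes a tree is a tuple of a label followed by its daughters, and prints it as a labelled bracketing; the names `parse_tree` and `bracket` are hypothetical, and, as a simplification, the first matching rule is always applied without backtracking.

```python
RULES = [("S", ["NP", "VP"]), ("VP", ["V", "NP"]), ("NP", ["Art", "N"])]
LEXICON = {"the": "Art", "man": "N", "tea": "N", "drank": "V"}

def bracket(tree):
    """Render a (label, daughter, ...) tree as a labelled bracketing."""
    label, *children = tree
    parts = [c if isinstance(c, str) else bracket(c) for c in children]
    return "[" + " ".join([label] + parts) + "]"

def parse_tree(sentence):
    """Build partial trees bottom-up; return the trees left at the end."""
    # One partial tree per word: the word class label dominates the word.
    forest = [(LEXICON[w.lower()], w) for w in sentence.split()]
    progress = True
    while progress:
        progress = False
        labels = [tree[0] for tree in forest]
        for mother, daughters in RULES:
            n = len(daughters)
            for i in range(len(forest) - n + 1):
                if labels[i:i + n] == daughters:
                    # Attach the matched trees as ordered daughters of `mother`.
                    forest[i:i + n] = [(mother, *forest[i:i + n])]
                    progress = True
                    break
            if progress:
                break
    return forest

forest = parse_tree("The man drank the tea")
print(bracket(forest[0]))
```

A successful parse leaves exactly one tree in the list, with S as its root. If more than one tree remains, there are unattached strings and the input has not been parsed as a sentence.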
18
Conclusion
The parsing and generation of natural language data is a very important area of linguistics, especially in computer applications of natural language, which have become an important part of the computer and information processing industry. In the next lecture, we shall look at the last topic of the linguistics segment, i.e. how linguistic knowledge is acquired or learnt by speakers of a language, from the point of view of spoken language and from the point of view of literacy (reading and writing).