Leonid Iomdin, Institute for Information Transmission Problems, Russian Academy of Sciences


1 Leonid Iomdin, Institute for Information Transmission Problems, Russian Academy of Sciences. iomdin@iitp.ru, iomdin@gmail.com

2 Program Overview: p. 1
1. Basic Principles of the Meaning-Text Theory by Igor Mel’čuk. Language as a Universal Translator of Senses to Texts and Texts to Senses. Text analysis and text generation. The theory of integral linguistic description by Juri Apresjan. The grammar and the dictionary of language.
2. Two syntactic levels of sentence representation: surface syntax and deep syntax.
December 18, 2009. Lectures 11-12

3 Program Overview: p. 2
3. The dependency tree structure as a syntactic representation of the sentence. Dependency tree vs. constituent tree: advantages and drawbacks of both types of representation. Limits of the dependency tree. The hypothesis of two syntactic starts.
4. The notion of syntactic relation. Major classes of syntactic relations: actant, attributive, coordinative and auxiliary relation classes.
5. The notion of syntactic feature. Syntactic features vs. semantic features.

4 Program Overview: p. 3
6. Actants and valencies. Active, passive and distant valencies. The government pattern of a dictionary entry. An overview of actant syntactic relations. The predicative relation. The agentive relation. Completive relations.
7. An overview of attributive syntactic relations. Grammatical Agreement. Numerals and Quantitative Constructions. The system of Quantification Syntax of Russian.
8. Grammatical coordination as a type of grammatical subordination. An overview of coordinative syntactic relations.

5 Program Overview: p. 4
9. Auxiliary syntactic relations. Analytical grammatical forms as an object of syntax.
10. Microsyntax of Language. Minor Type Sentences. Syntactic Idioms.
11. Lexical Functions in the Dictionary and the Grammar.
12. Syntactic description and syntactic rules. Dependency Syntax in NLP. Dependency Syntax in Machine Translation. Syntactically Tagged Corpus of Texts.

6 Syntactic description and syntactic rules: the parser of the ETAP-3 system
In order to understand how dependency syntax works in an NLP environment, we will discuss one concrete system, ETAP-3.

7 Theoretical Foundations
Igor A. Mel’čuk: Meaning-Text Linguistic Theory
Mel’čuk, I.A. (1974). Opyt teorii lingvisticheskix modelej klassa “Smysl - Tekst” [The Theory of Linguistic Models of the Meaning-Text Type]. Moscow, Nauka.
Jury D. Apresjan: Systematic Lexicography and the Theory of Integral Description of Language
Apresjan, Ju.D. (1995). Integral’noe opisanie jazyka i sistemnaja leksikografija [An Integrated Description of Language and Systematic Lexicography]. Moscow, Jazyki russkoj kul’tury.
Apresjan, Ju.D. (2000). Systematic Lexicography. Oxford University Press, XVIII p., 304 p.

8 ETAP-3 Options
1. Machine Translation
2. Deeply Annotated Text Corpus of Russian (SynTagRus)
3. Translation System Based on UNL (Universal Networking Language) Interlingua
4. Synonymous and Quasi-Synonymous Paraphrasing of Utterances
5. Computer-Aided Language Learning Tool
6. New Developments: Semantics and Ontologies

9 Machine Translation
Russian-English
130,000-strong morphological dictionaries
100,000-strong combinatorial dictionaries
Russian-German prototype
Russian-French prototype
Russian-Korean prototype
Russian-Spanish prototype
Arabic-English prototype

10 Major Features of ETAP Environment
Rule-based Approach
Strict Stratificational Approach
Syntactic Dependencies
Lexicalistic Focus
Self-Tuning
Maximum Reusability of Linguistic Resources

11 Self-Tuning: Grammar vs. Dictionary
General regularities: general rules that apply to very large classes of words and occur very often. Example: agreement N + V.
Restricted-scope regularities: specific rules that apply to restricted classes of words and have limited occurrence. Example: compound numerals.

12 General Architecture of Machine Translation Module in ETAP-3
[Figure not preserved in the transcript; surviving label: WinEtap]

13 A Sample Syntactic Analysis Rule
REG:1-COMPL.04
A GERUND AS FIRST COMPLEMENT:
N:01 STOPPED READING, WORTH MENTIONING
CHECK
//1.1 ^#(X,V,A)&VAL(2,X,ING)
//КЛ THEY HAD DIFFICULTY [X] FINDING AN APPROPRIATE BATTERY
1.1 ^#(X,V,S,A)&VAL(2,X,ING)&R-EQU(X,Y,3,ING)
1.2 ^=(X,PHAS,ING)
1.3 #(X,PP)/L-LEXA(X,*,3,HAVE)/=(X,ADRES)
4.1 DOM-NEQUN(Y,*,COORDIN,ING,COORD)/
* DOM-EQU(Y,Z,COORDIN,COORD)&
* ^DOM-EQU(Z,*,COORD-CONJ,ING)
DO
1 SVUZOT:(X,Y,1-COMPL)

14 Lexical Functions
Substitute LFs: synonyms, antonyms, converse terms, derivatives
Collocate LFs: MAGN = 'a high degree of what is denoted by X', OPER/FUNC ...

15 Lexical Functions: Magn
MAGN (disease) = grave
MAGN (fog) = heavy
MAGN (control) = strict
MAGN (болезнь 'disease') = тяжелый 'heavy'
MAGN (туман 'fog') = густой 'dense'
MAGN (контроль 'control') = строгий 'strict'
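
A minimal sketch of how collocate values like these might be stored and looked up. The table layout and the function name `magn` are assumptions made for illustration; they are not ETAP-3's actual dictionary format.

```python
from typing import Optional

# Hypothetical LF table: (language, keyword) -> value of MAGN.
# The entries are the examples from the slide; the layout is an assumption.
MAGN = {
    ("en", "disease"): "grave",
    ("en", "fog"): "heavy",
    ("en", "control"): "strict",
    ("ru", "болезнь"): "тяжелый",
    ("ru", "туман"): "густой",
    ("ru", "контроль"): "строгий",
}

def magn(lang: str, keyword: str) -> Optional[str]:
    """Return the intensifier recorded for the keyword, or None if unlisted."""
    return MAGN.get((lang, keyword))

print(magn("en", "fog"))    # heavy
print(magn("ru", "туман"))  # густой
```

The point of keying on the noun is that the intensifier is selected by the keyword and cannot be predicted from the meaning 'high degree' alone.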

16 Lexical Functions: Oper / Func Family

17 Examples of LF Oper
Oper1 (invitation) = issue
Oper2 (invitation) = receive
Oper1 (defeat) = suffer
Oper2 (resistance) = encounter
Oper2 (respect) = enjoy

18 Examples of LF Func
Func1 (fear) = possess
Func2 (decision) = concern
Func1 (responsibility) = rest (with)
Func2 (vengeance) = fall (upon)
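
The Oper and Func values differ in which element becomes the subject of the support verb: with Oper, the i-th actant of the keyword is the subject and the keyword is the complement; with Func, the keyword itself is the subject and the i-th actant is the complement. The sketch below illustrates this at the level of lemmas only. The table `LF_VALUES` holds the slide examples; the function `frame` and its (subject, verb, complement) output shape are assumptions for illustration.

```python
from typing import Tuple

# LF values from the Oper and Func example slides.
LF_VALUES = {
    ("Oper1", "invitation"): "issue",
    ("Oper2", "invitation"): "receive",
    ("Oper1", "defeat"): "suffer",
    ("Oper2", "resistance"): "encounter",
    ("Oper2", "respect"): "enjoy",
    ("Func1", "fear"): "possess",
    ("Func2", "decision"): "concern",
    ("Func1", "responsibility"): "rest (with)",
    ("Func2", "vengeance"): "fall (upon)",
}

def frame(lf: str, keyword: str, actant: str) -> Tuple[str, str, str]:
    """Build a lemma-level (subject, verb, complement) triple for an LF."""
    verb = LF_VALUES[(lf, keyword)]
    if lf.startswith("Oper"):
        # Oper_i: the i-th actant is the subject, the keyword is the complement.
        return (actant, verb, keyword)
    # Func_i: the keyword is the subject, the i-th actant is the complement.
    return (keyword, verb, actant)

print(frame("Oper2", "respect", "teachers"))  # ('teachers', 'enjoy', 'respect')
print(frame("Func1", "fear", "Peter"))        # ('fear', 'possess', 'Peter')
```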

19 General Properties of Lexical Functions
Universality
Intralinguistic idiomaticity: grave disease, heavy fog; *heavy disease, *grave fog.
Cross-linguistic idiomaticity: Rus. tjazhelaja bolezn’ ‘heavy disease’, Rus. gustoj tuman ‘dense fog’

20 General Properties of Lexical Functions (cont.)
Paraphrasing Potential:
He respects [X] his teachers
He has [OPER1(S0(X))] respect [S0(X)] for his teachers
He treats [LABOR12(S0(X))] his teachers with respect
His teachers enjoy [OPER2(S0(X))] his respect
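
The three paraphrases above can be generated mechanically once the LF values of the derived noun are known. The following is only a sketch: the entry `RESPECT`, the function `paraphrases`, and the decision to store ready-made word forms (with no morphological component) are simplifying assumptions, not a description of the ETAP-3 paraphraser.

```python
from typing import Dict, List

# Hypothetical entry for the verb "respect". The LF values come from the
# slide; storing them as inflected word forms is a simplification.
RESPECT = {
    "S0": "respect",      # the noun derived from the verb
    "OPER1": "has",
    "OPER2": "enjoy",
    "LABOR12": "treats",
}

def paraphrases(subject: str, possessive: str, obj: str,
                entry: Dict[str, str]) -> List[str]:
    """Return LF-based paraphrases of '<subject> respects <obj>'.

    The prepositions 'for' and 'with' are hard-coded for this one example.
    """
    noun = entry["S0"]
    return [
        f"{subject} {entry['OPER1']} {noun} for {obj}",
        f"{subject} {entry['LABOR12']} {obj} with {noun}",
        f"{obj[0].upper()}{obj[1:]} {entry['OPER2']} {possessive} {noun}",
    ]

for sentence in paraphrases("He", "his", "his teachers", RESPECT):
    print(sentence)
```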

21 LF in Practical Applications
Syntactic and Lexical Ambiguity Resolution in Parsers
Idiomatic Translation of a Large Class of Set Expressions in Machine Translation
Sentence Paraphrasing

22 Lexical Ambiguity Resolution
to draw a distinction - provodit’ razlichie
Both verbs are extremely ambiguous:
draw - more than 50 meanings
provodit’ - more than 10 meanings
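
One way to read this example: instead of choosing among dozens of senses of each verb, the system can recognize the verb as an LF value of its noun and translate the LF, not the verb. The sketch below assumes the relevant LF is Oper1 (the slide does not name it); all table and function names are hypothetical.

```python
from typing import Optional, Tuple

# Assumption: "draw" and "provodit'" are both recorded as Oper1 of their nouns.
LF_EN = {("OPER1", "distinction"): "draw"}
LF_RU = {("OPER1", "razlichie"): "provodit'"}
NOUN_EN_RU = {"distinction": "razlichie"}

def translate_support_verb(verb: str, noun: str) -> Optional[Tuple[str, str]]:
    """Translate verb + noun via a shared LF, bypassing verb sense selection."""
    for (lf, keyword), value in LF_EN.items():
        if keyword == noun and value == verb:
            ru_noun = NOUN_EN_RU[noun]
            return LF_RU[(lf, ru_noun)], ru_noun
    return None  # no LF link found; ordinary sense disambiguation is needed

print(translate_support_verb("draw", "distinction"))
```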

23 Syntactic Ambiguity Resolution
support of the army: 'support by the army' or 'support (given) to the army'
The president had [Y = OPER2(X)] the support [X] of the army
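
The reasoning here can be made explicit. Because "have" is OPER2 of "support", its subject (the president) fills the second actant of "support", the one being supported; the of-phrase can then only fill the first actant, which yields 'support by the army'. A small sketch under that reading, with a hypothetical table `LF` and function `of_phrase_actant`:

```python
from typing import Optional

# LF value taken from the slide: OPER2(support) = have.
LF = {("OPER2", "support"): "have"}

def of_phrase_actant(verb: str, noun: str) -> Optional[int]:
    """Return the actant slot (1 or 2) that an of-phrase on `noun` must fill,
    or None if the governing verb does not decide the question."""
    if LF.get(("OPER1", noun)) == verb:
        return 2  # the subject already fills actant 1
    if LF.get(("OPER2", noun)) == verb:
        return 1  # the subject already fills actant 2
    return None

print(of_phrase_actant("have", "support"))     # 1
print(of_phrase_actant("discuss", "support"))  # None
```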

24 Syntactic Ambiguity Resolution
The fear [X] of his wife possessed [Y = FUNC1(X)] Peter
The fears of his wife infected Peter.

25 Idiomatic translation: LF Temp
March: in - mart: v2
Tuesday: on - vtornik: v1
dawn: at - rassvet: na2
moment: at - moment: v1
Easter: at - pasxa: na1
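
The table shows that the temporal preposition is chosen by the noun in each language, so English "at" corresponds to three different Russian preposition lexemes. A minimal lookup sketch follows; the tuple layout and the function name `translate_temporal` are assumptions for illustration, and the labels v1, v2, na1, na2 are kept exactly as they appear on the slide.

```python
from typing import Optional, Tuple

# English noun -> (English Temp preposition, Russian noun, Russian Temp preposition).
TEMP = {
    "March": ("in", "mart", "v2"),
    "Tuesday": ("on", "vtornik", "v1"),
    "dawn": ("at", "rassvet", "na2"),
    "moment": ("at", "moment", "v1"),
    "Easter": ("at", "pasxa", "na1"),
}

def translate_temporal(prep: str, noun: str) -> Optional[Tuple[str, str]]:
    """Translate a temporal phrase by looking up the Temp value of the noun."""
    entry = TEMP.get(noun)
    if entry is None or entry[0] != prep:
        return None  # not a recorded Temp collocation
    _, ru_noun, ru_prep = entry
    return ru_prep, ru_noun

print(translate_temporal("at", "dawn"))    # ('na2', 'rassvet')
print(translate_temporal("at", "moment"))  # ('v1', 'moment')
```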

26 Sentence Paraphrasing
X = CONV12(X): This group consists of 20 persons – Twenty persons comprise this group
X + Y = ANTI1(X) + ANTI2(Y): He began to observe the rules – He stopped violating the rules
X = LABOR12 + S0(X): He respects his parents – He treats his parents with respect
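
The second rule replaces both members of a verb pair with their antonyms, which leaves the overall meaning unchanged. The sketch below works on lemmas only and ignores the indices on ANTI as well as the change of verb form (to observe vs. violating); the table `ANTI` and the function `double_antonym` are hypothetical.

```python
from typing import Optional, Tuple

# Antonym pairs taken from the slide example, stored as lemmas.
ANTI = {"begin": "stop", "observe": "violate"}

def double_antonym(x: str, y: str) -> Optional[Tuple[str, str]]:
    """Apply X + Y = ANTI(X) + ANTI(Y) when both antonyms are recorded."""
    if x in ANTI and y in ANTI:
        return ANTI[x], ANTI[y]
    return None  # the rule is not applicable

print(double_antonym("begin", "observe"))  # ('stop', 'violate')
```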

27 Next lecture
SynTagRus – a deeply annotated corpus of Russian. The hypothesis of two syntactic starts. Microsyntax of Language. Minor Type Sentences. Syntactic Idioms.

