Functional Generative Description (FGD)
Markéta Lopatková
Institute of Formal and Applied Linguistics, MFF UK


Basic characteristics of FGD
PDT – FGD, Lopatková

Petr Sgall (1967) Generativní popis jazyka a česká deklinace. Academia, Praha
since the 1970s … together with Eva Hajičová and Jarmila Panevová
Petr Sgall, Eva Hajičová, Jarmila Panevová (1986) The Meaning of the Sentence in Its Semantic and Pragmatic Aspects. Reidel, Dordrecht

motivation: machine translation
[Diagram labels: source language, target language; sentence ~ string of graphemes/phonemes; abstraction; language meaning; transfer; 'interlingua' ~ language-independent representation]


Basic characteristics of FGD (cont.)

'classical' version of FGD:
  dependency framework
  formal description
  suitable mathematical formalism

[Example dependency tree, node labels: sleeps.Pred, brother.Sb, study.Adv, my.Atr, often.Adv, his.Atr, in.AuxP]
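The dependency framework can be illustrated with a small data structure. The Python sketch below encodes the nodes of the example tree. The class name `Node`, the helper `edges`, and above all the attachments (which node depends on which) are assumptions made for illustration: the transcript preserves only the node labels, not the edges of the tree.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Node:
    """One node of a dependency tree: a word form plus its syntactic label."""
    form: str
    label: str
    children: List["Node"] = field(default_factory=list)

    def add(self, child: "Node") -> "Node":
        """Attach a dependent and return it, so that calls can be chained."""
        self.children.append(child)
        return child


def edges(head: Node) -> List[Tuple[str, str, str]]:
    """List (head form, dependent form, dependent label) triples, depth-first."""
    result = []
    for child in head.children:
        result.append((head.form, child.form, child.label))
        result.extend(edges(child))
    return result


# Assumed attachments for "My brother often sleeps in his study."
root = Node("sleeps", "Pred")
root.add(Node("brother", "Sb")).add(Node("my", "Atr"))
root.add(Node("often", "Adv"))
study = root.add(Node("study", "Adv"))
study.add(Node("in", "AuxP"))
study.add(Node("his", "Atr"))

for head, dependent, label in edges(root):
    print(f"{dependent}.{label} <- {head}")
```

Each edge carries the label of the dependent, which mirrors the `form.Label` notation used on the slide.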

Basic characteristics of FGD (cont.)

'classical' version of FGD:
  dependency framework
  stratificational approach
    language meaning ~ function
    string of graphemes/phonemes ~ form
    synonymy (one function, several forms), ambiguity (one form, several functions)


Basic characteristics of FGD (cont.)

'classical' version of FGD:
  dependency framework
  stratificational approach
  functional
    relation between a form and its function / a function and its form
    language meaning (not cognitive content)
  generative
    generative vs. analytical

Basic characteristics of FGD (cont.)

tradition of the Prague Linguistic Circle
  structural school, since 1926
  Mathesius, Trnka, Havránek, Mukařovský, Jakobson, Trubeckoj, Karcevskij, …
  language as a system ~ langue vs. individual utterances ~ parole
  stress on testable criteria for distinguishing linguistic phenomena
  higher layers of language description (syntax)
  topic-focus articulation as a part of language meaning

Two components of FGD

generative component ~ defines all formally correct meaning representations (of the possible sentences of a given language)
  formalism:
    1) phrase rules, phrase structure trees + functors
    2) dependency trees
  push-down automaton
translation component ~ translates meaning representations to lower layers
  sequence of push-down transducers plus a finite-state automaton

Main pillars of FGD

system of layers
valency theory
topic-focus articulation
anaphora / coreference

System of layers in FGD

meaning
  deep / underlying syntax (tectogrammar)
  surface syntax
  morphematics
  morphonology
  phonology / phonetics
expression

System of layers in FGD (cont.)

sentence … full representation on each layer of description
each layer ~ set of descriptions for all possible sentences
  finite set of elementary units
  finite set of operations and relations → set of complex units
  finite set of relations between sentence representations on a particular layer and their representations on adjacent layers

type C relations (composition): elementary units constitute complex units
  i.e., relations between units of the same layer
type R relations (representation): form-function relation
  i.e., relation between adjacent layers

[Diagram: layer n+1 (function) is linked by R to layer n (form); C relations hold within each layer]
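A type R relation pairs forms on layer n with functions on layer n+1, and it need not be one-to-one; this is where synonymy and ambiguity arise. The Python sketch below assumes the relation is stored as a plain set of (form, function) pairs. The English pairs are illustrative only and are not taken from the slides, and the function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Set, Tuple

# Illustrative (form, function) pairs; an assumption, not data from FGD.
R: Set[Tuple[str, str]] = {
    ("-s", "plural"),                # cats
    ("-en", "plural"),               # oxen
    ("-s", "3rd person singular"),   # sleeps
    ("-ed", "past tense"),           # walked
}


def group(pairs: Set[Tuple[str, str]], key_index: int) -> Dict[str, Set[str]]:
    """Collect, for each member at key_index, the set of its partners."""
    grouped: Dict[str, Set[str]] = defaultdict(set)
    for pair in pairs:
        grouped[pair[key_index]].add(pair[1 - key_index])
    return grouped


def ambiguous_forms(pairs: Set[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Forms that represent more than one function."""
    return {f: fs for f, fs in group(pairs, 0).items() if len(fs) > 1}


def synonymous_functions(pairs: Set[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Functions that are represented by more than one form."""
    return {f: fs for f, fs in group(pairs, 1).items() if len(fs) > 1}


print(ambiguous_forms(R))
print(synonymous_functions(R))
```

Storing R as a set of pairs rather than as a function keeps both directions of the relation (form to function, function to form) equally easy to query.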

System of layers in FGD (cont.)

layer of phonetics
  distinctive features … elementary units
  phones (~ speech sounds) … complex units
  suprasegmental units … prosody, intonation
layer of phonology
  distinctive features … elementary units
  phonemes (~ 'smallest' units that distinguish meaning) … complex units
  asymmetry … allophones ~ variants of a single phoneme
    language dependent: sing vs. sin

[Diagram: on each layer a C relation leads from distinctive features to the complex unit (phone in phonetics, phoneme in phonology); an R relation links phone and phoneme]

System of layers in FGD (cont.)

layer of morphonology
  morphoneme ~ set of phoneme variants, e.g. k|c|č|.k in "matka"
  morph ~ string of morphonemes
    lexical variants (matk, matc, matč, mat.k) … 4 allomorphs
    mat(k|c|č|.k) … 1 morph

[Diagram: phonology: distinctive feature, phoneme (C); morphonology: morphoneme, morph (C); R links the two layers]
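The relation between the single morph mat(k|c|č|.k) and its four allomorphs can be made concrete by expanding the alternation. This is a minimal Python sketch; it assumes the slide's bracket notation, in which a parenthesized group lists the variants of one morphoneme separated by `|`, and the function name `expand` is hypothetical.

```python
import itertools
import re
from typing import List


def expand(morph: str) -> List[str]:
    """Expand every (a|b|...) group in a morph into its concrete variants."""
    # re.split with a capturing group returns plain text at even indices
    # and the contents of each parenthesized group at odd indices.
    parts = re.split(r"\(([^()]*)\)", morph)
    options = [
        part.split("|") if index % 2 else [part]
        for index, part in enumerate(parts)
    ]
    return ["".join(choice) for choice in itertools.product(*options)]


print(expand("mat(k|c|č|.k)"))
```

The dot in `.k` is kept as written on the slide; the sketch treats it as an ordinary character.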

System of layers in FGD (cont.)

layer of morphematics
  morpheme ~ the smallest component that has semantic meaning
    lexical morpheme
      stems
        e.g. the lexical morpheme for matka consists of 4 allomorphs (matk, matc, matč, mat.k)
      derivational morphemes (affixes: prefixes, infixes, suffixes, …)
    grammatical morpheme
      inflectional suffixes
        e.g. Cz: case suffixes, Eng: plural -s, past tense -ed
  sema
  formeme: sequence of morphs realizing a single tagmeme / sentence member
    lexical formemes, case formemes (i.e., prep+case), conjunction formemes (i.e., conj+verb mood)

[Diagram: morphonology: morphoneme, morph; morphematics: sema, morpheme, formeme; C relations within each layer, R between the layers]


Two layers of syntax in FGD

tree-based dependency structure
  nodes for tagmemes or sememes (complex symbols)
  edges labeled with the type of the respective syntactic relation

layer of surface syntax
  tagmemes ~ structure of sentence members / tagmemes, incl. lexical and morphological information
  3 types of elementary units:
    lexical: units from a dictionary
    morphological: set of morphological features
      (a pair of) trousers … sema: plural
    syntactic: sentence members / tagmemes
      subject, object, attribute, adverbial, complement, …
  surface word order … linear ordering of tree nodes

The layer of surface syntax

My brother often sleeps in his study.
[Surface-syntactic tree, node labels: sleeps.pres.Pred, brother.sg.Sb, my.Atr, often.Adv, in study.sg.Adv, his.Atr]

Po babiččině příjezdu půjdou rodiče do divadla. [After grandma's arrival the parents will go to the theatre.]
[Surface-syntactic tree, node labels: půjdou.fut.Pred, rodiče.pl.Sb, po příjezdu.sg.Adv, babiččině.sg.Atr, do divadla.sg.Adv]

The layer of deep syntax

~ meaning of a sentence: semantemes
  semantemes: lexical (autosemantic) words, their lexical and morphological features and mutual relations
terminology: deep / underlying / tectogrammatical representation (TR)
3 basic types of elementary units:
  lexical: units from a (tectogrammatical) dictionary
  morphological: grammatemes
    meaning of individual morphological categories
    (a pair of) trousers … singular
    denominating (pojmenovávací) vs. correlating (usouvztažňující) categories
  syntactic: types of relation, functors and subfunctors
    Actor, Patient, Addressee, …
    local, temporal modifications …

The layer of deep syntax

deep word order
  increasing communicative dynamism: word order reflects the "relative degree of importance in comparison with other expressions in the sentence […]"
  topic-focus articulation
  condition of projectivity !!!
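The condition of projectivity can be checked mechanically. The Python sketch below assumes the usual definition (a tree is projective when every node lying between a head and its dependent in the linear order is itself dominated by that head) and a hypothetical encoding in which `heads[i]` gives the position of the head of the node at position `i + 1`, with `0` marking the root. The input is assumed to be a well-formed tree without cycles. The first example uses surface positions purely for illustration.

```python
from typing import List


def is_projective(heads: List[int]) -> bool:
    """Return True if no node between a head and its dependent escapes that head."""

    def dominated_by(node: int, ancestor: int) -> bool:
        # Walk up from node towards the root, looking for ancestor.
        while node != 0:
            if node == ancestor:
                return True
            node = heads[node - 1]
        return False

    for dependent in range(1, len(heads) + 1):
        head = heads[dependent - 1]
        if head == 0:
            continue  # the root has no incoming edge to check
        low, high = sorted((head, dependent))
        for between in range(low + 1, high):
            if not dominated_by(between, head):
                return False
    return True


# "My brother often sleeps": my -> brother, brother -> sleeps, often -> sleeps.
print(is_projective([2, 4, 4, 0]))
# A crossing configuration: node 2 depends on node 4 across the root at 3.
print(is_projective([3, 4, 0, 3]))
```

The check is quadratic in the worst case, which is adequate for sentence-sized trees.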

The layer of deep syntax

Po babiččině příjezdu půjdou rodiče do divadla. [After grandma's arrival the parents will go to the theatre.]
[Tectogrammatical tree, node labels: jít.fut.ind.Predicate, rodiče.sg.Actor, divadlo.sg.DIR-where, přijet.pret.kompl.TWHEN-after, babička.sg.Actor, [sem].DIR-where; a second copy of the tree on the slide has přijet.fut.kompl.TWHEN-after]

The layers of deep and surface syntax

different sets of elementary units
  'morphological' vs. tectogrammatical lemma
  morphological categories vs. grammatemes
  surface sentence members vs. functors
sentence members / tagmemes vs. semantemes
  only autosemantic / lexical words at TR
  – modal verbs
    Peter wants to attend the concert. [to attend + volitive]
    Charles has to pass the exam. [to pass + debitive]
  – nominalization
    After grandma's arrival … → [to arrive]
  – active / passive verbs → [active form]
    Tato krásná kniha byla vydána nakladatelstvím Albatros. [This beautiful book was published by the Albatros publishing house.]

The layers of deep and surface syntax

completeness of the representation
  – (surface) ellipses are restored
    omitted surface subject (Czech: pro-drop language):
    Czech: Vidíš bratra? Vidím. Přichází. → [Ty] vidíš bratra? [Já] vidím [ho]. [On] přichází [sem].
      [Do you see (your) brother? I see (him). He is coming (here).]
    Russian: Ты видел брата? Вижу [его]. Идёт.
      [Did you see (your) brother? I see [him]. He is coming.]
    Spanish: ¿Ves este tronco? [(Do) you see this log?]

The layers of deep and surface syntax

completeness of the representation
  – (surface) ellipses are restored
surface vs. deep word order:
  TR: projective trees
  increasing communicative dynamism

System of layers in FGD

Layers from meaning (top) to expression (bottom), with the units named on each:

  deep / underlying syntax (tectogrammar)   semanteme, proposition
  surface syntax                            tagmeme, sentence
  morphematics                              sema, morpheme, formeme
  morphonology                              morphoneme, morph
  phonology                                 distinctive feature, phoneme

C relations hold between the units of one layer; R relations link adjacent layers.

References

Sgall, P. (1967) Generativní popis jazyka a česká deklinace. Academia, Praha.
Sgall, P., Hajičová, E., Panevová, J. (1986) The Meaning of the Sentence in Its Semantic and Pragmatic Aspects. Reidel, Dordrecht.
Sgall, P. et al. (1986) Úvod do syntaxe a sémantiky. Academia, Praha.
Hajičová, E., Panevová, J., Sgall, P. (2002) Úvod do teoretické a počítačové lingvistiky, Vol. I. Karolinum, Praha.
Petkevič, V. (1995) A New Formal Specification of Underlying Structure. Theoretical Linguistics, Vol. 21, No. 1.
PDT guide
PDT documentation