The Syntax-Morphology Interface and Natural Language Processing


1 The Syntax-Morphology Interface and Natural Language Processing
Veronika Vincze, University of Szeged, Hungary
Thematic Training Course on Processing Morphologically Rich Languages, 11-15 April 2011

2 Outline
Introduction
Syntax vs. morphology from a linguistic viewpoint
Morphological coding systems in Hungarian
Morphosyntactic information in Hungarian corpora
Language-specific morphosyntactic problems
Effects on IE, NER and MT

3 Syntax vs. morphology
Typological differences among languages
Agglutinative languages: the role of morphology is stronger (a lot of information in morphemes)
Isolating languages: the role of syntax is stronger (fewer morphemes, more constructions)
Focus on Hungarian (agglutinative) and English (fusional/isolating)

4 Basic Hungarian syntax
A lot of information is encoded in morphemes
No fixed word order
Information structure is reflected in word order (theme-rheme, old-new)
Péter szereti Marit.
Peter love-3SgObj Mary-ACC
‘Peter loves Mary.’
Péter Marit szereti. ‘It is Mary who Peter loves.’
Marit szereti Péter. ‘It is Mary who Peter loves.’
Marit Péter szereti. ‘It is Peter who loves Mary.’
Szereti Péter Marit. ‘Peter LOVES Mary (and not hates).’
Szereti Marit Péter. ‘Peter LOVES Mary (and not hates).’

5 Morphosyntactic features of Hungarian
Nominal declension (nouns, adjectives, numerals)
Verbal conjugation
Several hundred word forms for each lemma
Grammatical relations are encoded primarily by morphemes -> morpho + syntactic

6 Nominal suffixes
A stem can be extended by:
  Derivational suffixes
  Plural
  Possessive
  Case suffixes
hat-ás-a-i-nak ‘to its effects’
stem-DERIV.SUFF-POSS-POSS.PL-DAT
egész-ség-ed-re ‘cheers’
stem-DERIV.SUFF-POSS.Sg2-SUB

7 Case suffixes in Hungarian
About 20 cases ("rare" cases are not always counted: distributive-temporal (-nte), associative (-stul/-stül…))
Always at the right end of the word form
Grammatical relations are encoded:
  Arguments of the verb
  Adjuncts (temporal and locative adverbials)

8 …and in English
Pisti szerdánként edzésre jár.
Steve Wednesday-DIST-TEMP training-SUB go-3Sg
‘Each Wednesday Steve goes to training.’
szerdánként: each Wednesday
edzésre: to training

9
Pisti bort iszik.
Steve wine-ACC drink-3Sg
‘Steve is drinking wine.’
Pisti (NOM): Steve, subject
bort: wine, object

10 Possessive in Hungarian
A fiú kutyája
the boy dog-POSS
‘the boy’s dog’
A(z ő) kutyája
the (he) dog-POSS
‘his dog’
Possessor in nominative, possessed noun with a possessive marker
A fiúnak a kutyája
the boy-DAT the dog-POSS
Possessor in dative, possessed noun with a possessive marker

11 …and in English
The boy’s dog
His dog
Possessor with a possessive marker (pronoun), possessed noun with no marker
The dog of the boy
Possessive relation is marked by a preposition

12 Hungarian vs. English - nouns
Number of word forms: several hundred (HU) vs. 2-3 (EN)
Means to express grammatical relations:
  Suffixes (HU)
  Preposition, fixed position (word order), suffix, determiner (EN)
Methods for morphological parsing are very different for Hungarian and English

13 Verbal suffixes
A stem can be extended by:
  Derivational suffixes
  Mood markers
  Tense markers
  Person/number suffixes
  Objective markers
vág-at-ná-k
cut-CAUS-COND-3PlObj
‘they would have it cut’

14 Mood and tense in Hungarian
Mood:
  Indicative: default (not marked)
  Conditional: suffixes (present), analytic form (past)
  Imperative: suffixes
Tense:
  Present: default (not marked)
  Past: suffixes
  Future: analytic (auxiliary fog)

15 …and in English
Mood:
  Indicative: default (not marked)
  Conditional: past tense forms + analytic forms (auxiliary would)
  Imperative: auxiliaries + grammatical structure
Tense:
  Present: default (not marked)
  Past: suffix / irregular forms (suppletion or ablaut, i.e. vowel change)
  Future: analytic (auxiliary will)

16 Person & Number
Hungarian: suffixes
  fut-ok, fut-sz, fut, fut-unk, fut-tok, fut-nak
  3Sg is the default (not marked!)
English: 3Sg suffix + pronouns / obligatory subject
  I run, you run, he runs, we run, they run
  3Sg is marked!

17 Derivational suffixes in Hungarian
Possibility/permission: fut-hat-ok, run-MOD-1Sg, ‘I may run’
Reflexive: mos-akod-unk, wash-REFL-1Pl, ‘we wash ourselves’
Frequentative: üt-öget-sz, hit-FREQ-2Sg, ‘you hit something repeatedly’
Causative: csinál-tat-nak, do-CAUS-3Pl, ‘they have something done’

18 …and in English
Possibility/permission: auxiliaries
Reflexive: pronominal objects
Frequentative: adverb
Causative: construction

19 Hungarian vs. English - verbs
Number of word forms: several hundred (HU) vs. 4-5 (EN)
Means to express grammatical relations:
  Suffixes + auxiliaries (HU)
  Auxiliaries + reflexive pronouns + constructions (EN)
A lot of syntactic information is encoded in Hungarian morphemes

20
Morphology       Syntax                   English
Nominal suffix   verb-argument relation   word order, preposition
                 possessive               suffix, preposition
Verbal suffix    tense                    suffix
                 agreement                pronoun, suffix
                 modality                 auxiliary
                 causation                construction
                 aspect
                 reflexivity              pronoun

21 Morphosyntactic coding systems
Language independent (?)
Language dependent
(Dis)advantages:
  comparability
  considering language-specific features
  complexity
Different information is necessary for each language

22 Hungarian coding systems
HUMOR
  Used in the Hungarian National Corpus (recall Thursday, Session 1)
MSD
  Used in the Szeged Treebank
  Parser and POS-tagger available
KR
  No database
  Parser and POS-tagger available

23 MSD
Morphosyntactic Description
International coding system: English, Romanian, Slovenian, Czech, Bulgarian, Estonian, Hungarian

24 MSD - 2
Positional codes
A given position encodes a given type of information
  Position 0: part of speech
  Position 1: (sub)type within POS
  Further positions: other grammatical information (person, number, case, etc.)
Irrelevant positions are marked with a hyphen (-)
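The positional scheme can be sketched in a few lines of Python. This is only an illustration: `decode_msd`, `label_noun` and the `NOUN_LABELS` table are hypothetical names, and the attribute labels are read off this deck's own example Nc-sa---s1 (‘my dog-ACC’) rather than taken from the official MSD specification.

```python
def decode_msd(code):
    """Map each filled position of an MSD code to its value; hyphens are skipped."""
    return {pos: ch for pos, ch in enumerate(code) if ch != "-"}

# Assumed labels for noun codes, inferred from the example Nc-sa---s1.
NOUN_LABELS = {
    0: "pos",
    1: "type",
    3: "number",
    4: "case",
    8: "owner_number",
    9: "owner_person",
}

def label_noun(code):
    """Attach the assumed labels; unknown positions keep a generic name."""
    return {
        NOUN_LABELS.get(pos, f"position_{pos}"): ch
        for pos, ch in decode_msd(code).items()
    }

print(label_noun("Nc-sa---s1"))
```

Because irrelevant positions are hyphens, the decoder only ever returns the features that actually apply to the word form.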

25 KR
Created for Hungarian
Hierarchical attribute-value matrices
Default values (3Sg, singular…)
Derivational information is encoded
Compounds are also segmented

26 MSD vs. KR
Differences between the two systems:
  derivation
  compounds
Harmonization efforts aim at a morphological parser (magyarlanc) whose output is fully consistent with the Szeged Treebank (Farkas et al. 2010)

27 Nouns in MSD
Word form        Lemma      MSD code     Gloss
kutya            kutya      Nc-sn        ‘dog’
kutyámat         kutya      Nc-sa---s1   ‘my dog-ACC’
kutyaházaikról   kutyaház   Nc-ph---p3   ‘about their doghouses’
Obamához         Obama      Np-st        ‘to Obama’

28 Verbs in MSD
Word form       Lemma    MSD code     Gloss
futok           fut      Vmip1s---n   ‘I run’
futhatsz        fut      Voip2s---n   ‘you can run’
ütögették       üt       Vfis3p---y   ‘they were hitting it’
csináltattunk   csinál   Vsis1p---n   ‘we had something made’

29 Morphosyntactically annotated Hungarian corpora
Hungarian National Corpus
  100-million-word balanced reference corpus of present-day Hungarian
  Word forms automatically annotated for stem, part of speech and inflectional information
Szeged Treebank
  1 million words, 82K sentences
  Manually annotated for lemma and POS tags
  Constituency and dependency trees

30 Szeged Treebank
Manually annotated treebank for Hungarian
Covers various linguistic styles: literature, newspapers, laws, student essays, computer books, etc.
Multilingual connection: Orwell’s 1984; Win2000 manual in Hungarian
Available free of charge for research
Developed by:
  University of Szeged, HLT group
  MorphoLogic Ltd.
  Academy of Sciences, Research Institute for Linguistics

31 Szeged Treebank 2.
TEI XML format
Manually annotated:
  sentence splitting and word segmentation
  morphological analysis
  PTB-style syntactic structure
  verb argument structure
Manually converted / extended to Dependency Grammar format

32 Szeged Treebank 3.
Several versions:
  Constituency and dependency versions
  Old MSD codes
  New (harmonized) MSD codes
(Dependency) parser under development
Being extended with folklore texts

33 Dependency vs. constituency
Each node corresponds to a word -> no virtual nodes (CP, I’…) in dependency trees
Constituency grammars are said to suit languages with fixed word order
Syntactic relations are determined:
  by the position in the tree (constituency grammar)
  by dependency relations, i.e. labeled edges (dependency grammar)

34 Constituency trees in SzT2.0
Based on generative syntax (É. Kiss et al. 1999)
Syntactic features of Hungarian are also considered (i.e. not hardcore Chomskyan trees)
Verb-argument relations are encoded by labels
Very detailed information: a different grammatical role for each case suffix
Semantic information can also be found (temporal and locative adverbials)

35
Aggie all relative-POSS-ACC the day before yesterday see-PAST-3Sg-Obj guest-ESS
‘Aggie received all of her relatives the day before yesterday.’

37 Dependency trees in Szeged Dependency Treebank
Based on SzT2.0
Automatic conversion and manual correction
Word forms are the nodes of the tree
Simplified relations for nominal arguments: SUBJ, OBJ, DAT, OBL, ATT
Semantic information kept
Sentences without a 3Sg copula are distinctively marked

38
Winston Smith, his chin nuzzled into his breast in an effort to escape the vile wind, slipped quickly through the glass doors of Victory Mansions.

39 Virtual nodes
No overt copula in present tense 3Sg
Only the subject and the predicative noun/adjective are manifest
No syntactic structure in SzT (grammatical roles are not marked)
Virtual nodes in SzDT

40
I like to go to school because it is good to be at school though not always.

41 Szeged Treebank vs. Szeged Dependency Treebank
Labeled relations in both cases -> not so sharp a contrast
Virtual nodes in SzDT -> grammatical structure marked for every sentence (IE, MT)
No word order constraints in SzDT
Word forms are marked
Other possibilities: morpheme-based syntax (Prószéky et al. (1989), Koutny, Wacha (1991))

42 Language-specific morphosyntactic problems
Morphology vs. syntax:
  Pseudo-subjects
  Pseudo-objects
  Pseudo-datives
Morphological analysis of unknown words
Lemmatization of named entities

43 Pseudo-subjects
A noun in nominative is not the subject of the sentence -> special attention required when parsing
Possessor:
  a kisfiú labdája
  the boy ball-3SgPOSS
  ‘the boy’s ball’
Predicative noun:
  István juhász maradt.
  Stephen shepherd remain-PAST
  ‘Stephen remained a shepherd.’
Object:
  A kutyám kergeti a macska.
  the dog-POSS chase-3SgObj the cat
  ‘The cat is chasing my dog.’ (garden path sentence)
  A fiam szereti a lányod.
  the son-1SgPOSS love-3SgObj the daughter-2SgPOSS
  ‘My son loves your daughter’ or ‘Your daughter loves my son’

44 Solutions
Possessor:
  SzT: one NP includes the possessor and the possessed ((a kisfiú) labdája)
  SzDT: ATT relation
Predicative noun:
  PRED relation
  Virtual node in SzDT
Object:
  OBJ relation
  Sometimes contextual information is needed even for humans…

45 Pseudo-objects
Adverbials with an apparently accusative ending:
  Futottam egy jót.
  run-PAST-1Sg a good-ACC
  ‘I have had a good run.’
  Nagyot aludtam.
  big-ACC sleep-PAST-1Sg
  ‘I have slept a lot.’
Intransitive verbs -> cannot be an object -> MODE relation

46 Pseudo-datives
Not all (semantic) subjects are in nominative
Dative subject:
  Sándornak kell elrendeznie az ügyeket.
  Alexander-DAT must arrange-INF-3Sg the issue-PL-ACC
  ‘Alexander has to arrange the issues.’
DAT in both corpora
Certain auxiliaries take dative subjects (exceptions)
Dative-nominative parallelism in the possessive as well

47 Unknown words
Unknown words can be:
  Compounds
  Named entities
  Derivations
Examples: fémkapunk, félmillió, csokinyúl, NATO-hoz
Methods for analysis (Zsibrita et al. 2010):
  Segmentation into two or more analyzable parts
    Expert rules to filter impossible combinations (*V+N)
    The analysis of the last part goes to the whole word
  Substitution for hyphenated words (pre-defined patterns for each morphological class)

48 félmillió
fél: N, ADJ, NUM ‘half’; V ‘be afraid’
millió: ‘million’
fél+millió: Mc-snl
Expert rules:
  NUM + NUM
  * non-NUM + NUM

49 fémkapunk
fém: N ‘metal’
kap: V ‘get’
kapu: ‘gate’
unk: S, 1Pl (verb)
nk: S, 1PlPoss (noun)
fém+kap+unk: Vmip1p---n
fém+kapu+nk: Nc-sn---p1
Expert rules:
  N + N
  N-nonNOM + V
  * N-NOM + V

50 csokinyúl
csoki: N ‘chocolate’
nyúl: N ‘rabbit’; V ‘stretch’
kinyúl: V ‘stretch out’
csoki+nyúl: Vmip3s---n or Nc-sn
cso+kinyúl (?)
Expert rules:
  N + N
  N-nonNOM + V
  * N-NOM + V
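The segmentation method can be sketched as follows. This is a toy illustration under loud assumptions: the `LEXICON` entries are hand-written (the NUM tag for millió is assumed), every listed noun counts as nominative, and only the two rules that apply to bare stems are encoded (NUM + NUM and N + N). The N-nonNOM + V rule is left out because the toy lexicon has no case-marked nouns, so every N + V combination is rejected as * N-NOM + V.

```python
from itertools import product

# Hand-written toy lexicon: stem -> possible POS tags (assumed for illustration).
LEXICON = {
    "fél": {"N", "ADJ", "NUM", "V"},
    "millió": {"NUM"},
    "csoki": {"N"},
    "nyúl": {"N", "V"},
}

# Allowed (first part, last part) POS pairs; anything else is filtered out.
ALLOWED = {("NUM", "NUM"), ("N", "N")}

def analyse_compound(word):
    """Return (left, right, POS) triples; the last part's POS goes to the whole word."""
    analyses = set()
    for i in range(1, len(word)):
        left, right = word[:i], word[i:]
        if left in LEXICON and right in LEXICON:
            for left_pos, right_pos in product(LEXICON[left], LEXICON[right]):
                if (left_pos, right_pos) in ALLOWED:
                    analyses.add((left, right, right_pos))
    return sorted(analyses)

print(analyse_compound("félmillió"))
print(analyse_compound("csokinyúl"))
```

With these rules the verbal reading of csoki+nyúl is discarded and only the nominal compound survives, which mirrors the filtering role the expert rules play on the slides.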

51 NATO-hoz
NATO: ?
hoz: V ‘bring’; S ‘to’
NATO-hoz, NATO: V -> Vmip3s---n
NATO-hoz (kalaphoz), NATO: N -> Np-st
Ordering of rules: substitution, segmentation
Expert rules:
  N + S
  N-nonNOM + V
  * N-NOM + V
  V + V
Substitution: NATO- -> kalap ‘hat’

52 Lemmatization
Lemmatization (i.e. dividing the word form into its root and affixes) is not a trivial task in morphologically rich languages such as Hungarian
Common nouns: relying on a good dictionary
NEs: cannot be listed
Problem: the NE ends in an apparent suffix

53 Lemmatization of NEs
Each ending that seems to be a possible suffix is cut off the NE in a step-by-step fashion
Citroenben:
  Citroenben (lemma)
  Citroen + ben ‘in (a) Citroen’
  Citroenb + en ‘on (a) Citroenb’
  Citroenbe + n ‘on (a) Citroenbe’
Each possible lemma undergoes a Google and a Yahoo search; the most frequent one is chosen (Farkas et al. 2008)
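The step-by-step suffix stripping can be sketched as below. The suffix list is a three-item illustrative subset, the function names are hypothetical, and the web search is replaced by a caller-supplied `hit_count` function; the numbers in `TOY_HITS` are placeholders made up for the demonstration, not real search counts.

```python
SUFFIXES = ["ben", "en", "n"]  # illustrative subset of possible endings

def candidate_lemmas(word_form, suffixes=SUFFIXES):
    """Return the word form itself plus every form with an apparent suffix cut off."""
    candidates = [word_form]
    for suffix in suffixes:
        if word_form.endswith(suffix) and len(word_form) > len(suffix):
            candidates.append(word_form[: -len(suffix)])
    return candidates

def pick_lemma(word_form, hit_count):
    """Choose the candidate with the highest frequency according to hit_count."""
    return max(candidate_lemmas(word_form), key=hit_count)

# Placeholder frequencies standing in for web search hit counts.
TOY_HITS = {"Citroen": 1000, "Citroenben": 10}

print(candidate_lemmas("Citroenben"))
print(pick_lemma("Citroenben", lambda c: TOY_HITS.get(c, 0)))
```

Keeping the frequency lookup as a parameter makes it easy to swap in any other frequency source without touching the candidate generation.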

54 NLP applications
NER
  NEs with suffixes
Information extraction
  Modality, uncertainty
  Causation
Machine translation
  Morphemes vs. structures

55 Named Entities
NEs should be recognized
They should be morphosyntactically tagged -> proper syntactic/semantic analysis
A Citroenben a Peugeot meghatározó tulajdonhányadot szerez.
Mini dictionary + suffix list + semantic frame

56
Mini dictionary:
  DET ‘the’
  meghatározó: ADJ ‘dominant’
  szerez: V ‘acquire’
  tulajdonrész: N ‘interest’
  Citroenben: ?
  Peugeot: ?
Suffix list:
  ben: S ‘in’
  en, n: ‘on’
  ot, t: ACC

57 Possible analyses
Citroenben:
  Citroen + ben ‘Citroen-INE’
  Citroenb + en ‘Citroenb-SUP’
  Citroenbe + n ‘Citroenbe-SUP’
Peugeot:
  Peugeo + t ‘Peugeo-ACC’
  Peuge + ot ‘Peuge-ACC’

58 A semantic frame
<event frame=transaction.ownerchange>[1=V("szerez"|"vásárol"|"vesz"|"megvesz"|"megvásárol"|"felvásárol")+subject=2+direct_object=3]
<rv role=buyer>[2=N]</rv>
[3=N("részesedés"|"tulajdon"|"tulajdonrész"|"rész"|"tulajdonhányad")+compl1=4+modified_by_adj=5]
<rv role=product>[4=N+case=ine+ceg]</rv>
<rv role=newshare>[5=A+measure+modified_by_number=6] [6=NB]</rv>
</event>

59 Analysis
A Citroenben a Peugeot meghatározó tulajdonhányadot szerez.
Tulajdonhányadot -> ACC/OBJ (3)
Citroenben -> INE (4)
Peugeot -> NOM/SUBJ (2)
‘Peugeot acquires a dominant interest in Citroen.’
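How case information feeds the frame can be sketched with a much simplified matcher. Everything here is an assumption made for illustration: the token dictionaries are hand-written stand-ins for parser output, `fill_ownerchange` is a hypothetical name, and the matcher only looks at verb lemma, POS and case, ignoring the adjective and number slots of the original frame.

```python
BUY_VERBS = {"szerez", "vásárol", "vesz", "megvesz", "megvásárol", "felvásárol"}
SHARE_NOUNS = {"részesedés", "tulajdon", "tulajdonrész", "rész", "tulajdonhányad"}

def fill_ownerchange(tokens):
    """Fill a simplified ownership-change frame from lemma/POS/case annotations."""
    if not any(t["pos"] == "V" and t["lemma"] in BUY_VERBS for t in tokens):
        return None  # the frame is triggered only by one of the listed verbs
    frame = {}
    for t in tokens:
        if t["pos"] != "N":
            continue
        case = t.get("case")
        if case == "NOM":
            frame["buyer"] = t["lemma"]        # subject
        elif case == "INE":
            frame["product"] = t["lemma"]      # company in inessive case
        elif case == "ACC" and t["lemma"] in SHARE_NOUNS:
            frame["share_noun"] = t["lemma"]   # direct object
    return frame

# Hand-annotated tokens for: A Citroenben a Peugeot meghatározó tulajdonhányadot szerez.
SENTENCE = [
    {"lemma": "Citroen", "pos": "N", "case": "INE"},
    {"lemma": "Peugeot", "pos": "N", "case": "NOM"},
    {"lemma": "meghatározó", "pos": "A"},
    {"lemma": "tulajdonhányad", "pos": "N", "case": "ACC"},
    {"lemma": "szerez", "pos": "V"},
]

print(fill_ownerchange(SENTENCE))
```

The sketch only works if the named entities were lemmatized correctly beforehand: with Peugeo-ACC instead of Peugeot-NOM the buyer slot would stay empty.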

60 Uncertainty
Text mining: derive facts from free text
Uncertainty and negation have an impact on the quality/nature of the information extracted
Applications have to treat sentences / clauses containing uncertain or negated information differently from factual information
Uncertainty: possible existence of a thing (neither its existence nor its non-existence is claimed)

61 Uncertainty detection
Uncertainty detection in English: cues (words with uncertain content)
One typical means to express uncertainty in Hungarian: -hat/het
  ‘High school grades may influence health.’
  A középiskolai jegyek kihathatnak az egészségre.
Morphological analysis should reflect modality (Voip3s---n)
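If the morphological code marks the -hat/-het suffix, a first-pass uncertainty cue detector reduces to a check on the code. The sketch below assumes, as in this deck's verb examples, that position 1 of a verb code holds `o` for the modal form; `has_modal_suffix` is a hypothetical helper name.

```python
def has_modal_suffix(msd):
    """True if the MSD code is a verb whose subtype position marks -hat/-het."""
    return len(msd) > 1 and msd[0] == "V" and msd[1] == "o"

for form, code in [("kihathatnak", "Voip3s---n"), ("futok", "Vmip1s---n")]:
    print(form, has_modal_suffix(code))
```

This only flags a candidate cue; whether the modal form expresses uncertainty or permission in a given sentence still needs context.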

62 Causation
Semantic/thematic relations have to be determined properly
AGENT != SUBJECT
Varrattam egy ruhát.
sew-CAUS-PAST-1Sg a dress-ACC
‘I had a dress sewn.’
Varrattam Marival egy ruhát.
sew-CAUS-PAST-1Sg Mari-INS a dress-ACC
‘I had Mary sew a dress.’
Varrtam Marival egy ruhát.
sew-PAST-1Sg Mari-INS a dress-ACC
‘I sewed a dress with Mary.’
Causative information should be encoded (Vsip3s---n)

63 Argument structure of causative verbs
Sentence                       Agent                    Beneficiary   Patient
Varrattam egy ruhát.           ?                        I (NOM)       ruha (ACC)
Varrattam Marival egy ruhát.   Mari (INS)
Varrtam Marival egy ruhát.     I (NOM) + Mari (INS)

64 Machine translation
Morpheme-based translation would be ideal
Easier alignment of translational units
Good morphological parser needed
Easier to execute in dependency grammar
Morpheme-based dependency structures

65 Alignments
[Figure: morpheme-level alignment of two dependency structures]
Hungarian: at | varr, t, ruha; ban | ház, am
English: have | sewn, dress; in | house, my

66 Problems
Not practical: no corpus available at the moment
Portmanteau morphs: alignment problems
Zero morphs: how many of them?
3 zero morphs in Hungarian nouns (Mel’cuk 2006):
  könyv-Ø-Ø-Ø   book-Ø-Ø-Ø
  könyveit      book-POSS-POSS.PL-ACC

67 Morphosyntactic codes might help
Csinálhattátok: Vois2p---y
Reordering rules:
  V    csinál   do
  o    hat      can
  i    -
  s    t        PAST
  2p   tok      you
  y    á        it
csinálhattátok -> ‘you could do it’
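The idea of generating English word order from a Hungarian verb code can be sketched for exactly the pattern shown on this slide. The position layout (1 = subtype, 3 = tense, 4 = person, 5 = number, 9 = definite object) is read off the deck's examples, and `english_gloss` and `PRONOUNS` are hypothetical; the sketch handles only the modal case and does not inflect the English verb itself.

```python
# Subject pronouns for the person/number values used in this sketch.
PRONOUNS = {("1", "s"): "I", ("2", "s"): "you", ("1", "p"): "we",
            ("2", "p"): "you", ("3", "p"): "they"}

def english_gloss(verb_en, msd):
    """Reorder the information in a Hungarian verb code into English word order."""
    subtype, tense = msd[1], msd[3]
    person, number, definite = msd[4], msd[5], msd[9]
    words = [PRONOUNS[(person, number)]]               # person/number suffix -> subject
    if subtype == "o":                                 # -hat/-het -> modal auxiliary
        words.append("could" if tense == "s" else "can")
    words.append(verb_en)                              # lemma translation
    if definite == "y":                                # objective conjugation -> 'it'
        words.append("it")
    return " ".join(words)

print(english_gloss("do", "Vois2p---y"))
print(english_gloss("run", "Voip2s---n"))
```

A single suffixed Hungarian word form thus expands into four English words, with the subject moved to the front and the object to the end.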

68 An example
[Figure: three dependency trees]
Hungarian: hat | csinál | t, á, tok
Morpheme-by-morpheme: can | do | d, Ø, you
English: could | you, do

69
Syntax vs. case suffix
  Pseudo-subject: extra rules; PRED, OBJ difficult for humans
  Pseudo-object: list of adverbs with accusative ending
  Pseudo-dative: list of verbs with dative subject
Unknown words (lemmas + suffixes): guessing (rules)
Information extraction
  Thematic/semantic relations: proper morphosyntactic codes + rules
  Uncertainty detection: proper morphosyntactic codes
Machine translation (morpheme-based)

70 Summary
Syntax-morphology interface in Hungarian
Morphological coding systems
Syntactic annotation in Hungarian corpora
Morphosyntactic problems: NER, IE, MT

71 References
É. Kiss K., Kiefer F., Siptár P.: Új magyar nyelvtan, Osiris Kiadó, Bp., 1999.
Farkas Richárd, Szeredi Dániel, Varga Dániel, Vincze Veronika 2010: MSD-KR harmonizáció a Szeged Treebank 2.5-ben. In: Tanács Attila, Vincze Veronika (szerk.): VII. Magyar Számítógépes Nyelvészeti Konferencia. Szeged, Szegedi Tudományegyetem, pp
Farkas, Richárd; Vincze, Veronika; Nagy, István; Ormándi, Róbert; Szarvas, György; Almási, Attila 2008: Web-based lemmatisation of Named Entities. In: Horák, Ales; Kopeček, Ivan; Pala, Karel; Sojka, Petr (eds.): Proceedings of the 11th International Conference on Text, Speech and Dialogue (TSD2008), Berlin, Heidelberg, Springer Verlag, LNCS 5246, pp
Koutny I., Wacha B.: Magyar nyelvtan függőségi alapon. Magyar Nyelv Vol. 87 No. 4. (1991) 393–404.
Mel’cuk, Igor 2006: Aspects of the Theory of Morphology. Mouton de Gruyter.
Prószéky, G., Koutny, I., Wacha, B.: Dependency Syntax of Hungarian. In: Maxwell, Dan; Klaus Schubert (eds.) Metataxis in Practice (Dependency Syntax for Multilingual Machine Translation), Foris, Dordrecht, The Netherlands (1989) 151–181
Zsibrita János, Vincze Veronika, Farkas Richárd 2010: Ismeretlen kifejezések és a szófaji egyértelműsítés. In: Tanács Attila, Vincze Veronika (szerk.): VII. Magyar Számítógépes Nyelvészeti Konferencia. Szeged, Szegedi Tudományegyetem, pp

