The PDT Morphology and Surface Syntax
Companions Semantic Representation and Dialog Interfacing Workshop, March 5, 2008


1 The PDT Morphology and Surface Syntax
  Jan Hajič
  Institute of Formal and Applied Linguistics
  School of Computer Science
  Faculty of Mathematics and Physics
  Charles University, Prague
  Czech Republic

2 Morphology (m-layer)
  Prerequisites for the manual annotation process:
    Tokenized data
    Annotation guidelines
    Annotation tool
    Manual decision making support
    Offline (or online) morphological analyzer
    Quality checking tool
    Process description
  Results (manually annotated data) to be used for:
    tagger training, linguistic research, basis for further annotation, ...

3 Morphological Attributes
  Tag: 13 categories (15 positions, two of them reserved)
  Example: AAFP3----3N---- for nejnezajímavějším "(to) the most uninteresting"
    Position  Value  Meaning
    1         A      Adjective
    2         A      Regular
    3         F      Feminine
    4         P      Plural
    5         3      Dative
    6         -      no poss. gender
    7         -      no poss. number
    8         -      no person
    9         -      no tense
    10        3      superlative
    11        N      negated
    12        -      no voice
    13        -      reserve1
    14        -      reserve2
    15        -      base var.
  Lemma: POS-unique identifier
    books/verb -> book-1, went -> go, to/prep. -> to-1
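
The positional reading of the example tag can be sketched in a few lines of Python. This is only an illustration: the field names in POSITIONS are labels chosen here to follow the slide's descriptions, and decode_tag and set_categories are hypothetical helpers, not part of any PDT tool.

```python
# Field names are illustrative labels for the 15 tag positions
# (assumption: they follow the order described on the slide).
POSITIONS = (
    "POS", "SubPOS", "Gender", "Number", "Case",
    "PossGender", "PossNumber", "Person", "Tense", "Grade",
    "Negation", "Voice", "Reserve1", "Reserve2", "Var",
)


def decode_tag(tag):
    """Map each character of a positional tag to its field name."""
    if len(tag) != len(POSITIONS):
        raise ValueError(f"expected {len(POSITIONS)} positions, got {len(tag)}")
    return dict(zip(POSITIONS, tag))


def set_categories(tag):
    """Return only the positions that carry a value (i.e. are not '-')."""
    return {name: value for name, value in decode_tag(tag).items() if value != "-"}


if __name__ == "__main__":
    for name, value in set_categories("AAFP3----3N----").items():
        print(f"{name}: {value}")
```

Keeping every tag the same length means a category is always found at the same index, so a tag can be compared position by position without parsing.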

4 Morphological Tagset
  13 categories, 4452 plausible tags (combinations)

5 Morphological Analysis
  Formally:
    MA: A⁺ → Pow(L × T)
    MA(f) = { [l, t] }; f ∈ A⁺ (the token), l ∈ L (lemma), t ∈ T (tag)
  tokens taken in isolation
  no attempt to solve e.g. auxiliaries vs. full verbs
  Ex. (má: lit. "has", "my"; mít: lit. "to have"; můj: lit. "my"):
    MA("má") = {
      [mít, VB-S---3P-AA---],
      [můj, PSFS1-S1------1],
      [můj, PSFS5-S1------1],
      [můj, PSNP1-S1------1],
      [můj, PSNP4-S1------1],
      [můj, PSNP5-S1------1] }
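
The formal definition above reads naturally as a lookup that returns a set of (lemma, tag) pairs for a token taken in isolation. The following is a minimal sketch under that reading; TOY_DICTIONARY and ma are hypothetical names, and the dictionary holds only the slide's entries for "má", not real analyzer data.

```python
# Toy dictionary containing only the slide's example entries (assumption:
# a real analyzer uses a far larger dictionary and its own lookup code).
TOY_DICTIONARY = {
    "má": {
        ("mít", "VB-S---3P-AA---"),
        ("můj", "PSFS1-S1------1"),
        ("můj", "PSFS5-S1------1"),
        ("můj", "PSNP1-S1------1"),
        ("můj", "PSNP4-S1------1"),
        ("můj", "PSNP5-S1------1"),
    },
}


def ma(form, dictionary=TOY_DICTIONARY):
    """Return every (lemma, tag) pair for a token, ignoring its context."""
    return set(dictionary.get(form, set()))


if __name__ == "__main__":
    for lemma, tag in sorted(ma("má")):
        print(lemma, tag)
```

Because the function sees no context, it cannot choose between the verb and the pronoun readings; that choice is left to the later disambiguation step.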

6 Morphological Analysis: Implementation
  Dictionary-based
    covers 800k words (lemmas), ~20 million forms (with tags)
  C code implementation
  standard (regular) derivations on-the-fly; ex.:
    spojit "join"
      spojený "joined", spojeně "joinedly", spojenost "joinedliness"
      spojitelný "joinable", spojitelně "joinably", spojitelnost "joinability"
  irregular forms listed in dictionary (with tags)
  no phonological processing (concatenation only)
  grammatical prefixes only: negation, superlative
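
Concatenation-only handling of the two grammatical prefixes can be illustrated as follows. This is a sketch, not the C implementation mentioned on the slide: prefix_analyses and known_forms are hypothetical, and the prefixes "nej" (superlative) and "ne" (negation) are taken from the example word nejnezajímavějším. A split is accepted only when the remainder is a known form, so a word that merely begins with "ne" is not misread as negated.

```python
def prefix_analyses(form, known_forms):
    """List (is_superlative, is_negated, base) splits whose base is known.

    Prefixes are removed by plain string slicing, with no phonological
    changes, mirroring the "concatenation only" constraint.
    """
    results = []
    for superlative in ("nej", ""):
        for negation in ("ne", ""):
            prefix = superlative + negation
            base = form[len(prefix):]
            if form.startswith(prefix) and base in known_forms:
                results.append((bool(superlative), bool(negation), base))
    return results


if __name__ == "__main__":
    # known_forms stands in for the analyzer's dictionary (assumption).
    print(prefix_analyses("nejnezajímavějším", {"zajímavějším"}))
    print(prefix_analyses("nebe", {"nebe"}))
```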

7 The Morphological Annotation Tool (LAW)

8 The Process of Morphological Annotation
  From tokenized to annotated text:
    tokenized text (auto, w-layer)
    -> (Auto) morphological analysis
    -> text with morphological interpretations
    -> Manual morphological disambiguation (DA)
    -> text with selected interpretation
    -> Manual adjudication
    -> annotated text (m-layer)
  Resources: morphological dictionary, annotation guidelines
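
The three stages above can be sketched as plain functions. Several things here are assumptions rather than facts from the slides: the manual steps are simulated by callables (choose, resolve), and adjudication is modelled as reconciling two independent annotation passes, which the slide does not state explicitly. All names are hypothetical.

```python
def analyze(tokens, dictionary):
    """Automatic step: attach every dictionary interpretation to each token."""
    return [(tok, sorted(dictionary.get(tok, set()))) for tok in tokens]


def disambiguate(analyzed, choose):
    """Manual step (simulated): `choose` stands in for a human annotator."""
    return [(tok, choose(tok, options)) for tok, options in analyzed]


def adjudicate(pass_a, pass_b, resolve):
    """Keep agreed choices; pass disagreements to `resolve` (the adjudicator).

    Assumption: two annotation passes exist; the slide only names the step.
    """
    result = []
    for (tok, a), (_, b) in zip(pass_a, pass_b):
        result.append((tok, a if a == b else resolve(tok, a, b)))
    return result


if __name__ == "__main__":
    toy = {"má": {("mít", "VB-S---3P-AA---"), ("můj", "PSFS1-S1------1")}}
    analyzed = analyze(["má"], toy)
    first = disambiguate(analyzed, lambda tok, opts: opts[0])
    second = disambiguate(analyzed, lambda tok, opts: opts[-1])
    print(adjudicate(first, second, lambda tok, a, b: a))
```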

9 PDT – Syntactic Annotation
  Surface syntax annotation
    Dependency surface syntax
    Comparable to Penn Treebank annotation
    Convertible: dependency ↔ parse trees
  Deep syntactic/semantic annotation
    Dependency trees
    Different topology
    High level of generalization and formalization
    Many node attributes

10 Analytical Syntax (a-layer)
  Dependency + Analytical Function
  [Figure: dependency tree with "dependent" and "governor" labels]
  The influence of the Mexican crisis on Central and Eastern Europe has apparently been underestimated.

11 Analytical Syntax: Functions
  Main (for [main] semantic lexemes): Pred, Sb, Obj, Adv, Atr, Atv(V), AuxV, Pnom
  "Double" dependency: AtrAdv, AtrObj, AtrAtr
  Special (function words, punctuation, ...):
    Reflexives, particles: AuxT, AuxR, AuxO, AuxZ, AuxY
    Prepositions/Conjunctions: AuxP, AuxC
    Punctuation, Graphics: AuxX, AuxS, AuxG, AuxK
  Structural ellipsis: ExD
  Coordination etc.: Coord, Apos

12 Example
  All came from Cray Research.

13 Surface Syntax Example
  Complete sentence: Sb, Pred, Obj
  Resistance needs courage.
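
The sentence above can be written down as a small dependency structure in which every word points to its governor and carries an analytical function. The sketch below is one possible encoding, not the PDT data format: ANode, its fields, and the convention that parent 0 marks the top of the tree are assumptions, and the final punctuation is left out for brevity.

```python
from dataclasses import dataclass


@dataclass
class ANode:
    position: int  # 1-based word order in the sentence
    form: str
    parent: int    # position of the governor; 0 = top of the tree (assumption)
    afun: str      # analytical function


# "Resistance needs courage": the predicate governs subject and object.
SENTENCE = [
    ANode(1, "Resistance", 2, "Sb"),
    ANode(2, "needs", 0, "Pred"),
    ANode(3, "courage", 2, "Obj"),
]


def dependents(nodes, governor_position):
    """Return the nodes directly governed by the node at the given position."""
    return [n for n in nodes if n.parent == governor_position]


if __name__ == "__main__":
    for node in dependents(SENTENCE, 2):
        print(node.form, node.afun)
```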

14 Surface Syntax Example
  Analytical verb form: he would be allowed to be enrolled

15 Surface Syntax Example
  Predicate with copula (state): you were fired

16 Surface Syntax Example
  Passive construction (action): (The) book has been translated [by Mr. X]

17 Surface Syntax Example
  Complement: she left crying

18 Surface Syntax Example
  Object: he gave Mary a book

19 Surface Syntax Example
  Object used for infinitive of analytical verb forms: he wants to learn

20 Surface Syntax Example
  Relative clause (embedded): the woman, who had a French accent, was very pretty

21 Surface Syntax Example
  Coordination: ... (to) magic, mysticism(,) etc.

22 Surface Syntax Example
  Apposition: cheap, i.e. under five dollars

23 Surface Syntax Example
  Incomplete phrases: Peter works well, but Paul badly

24 Surface Syntax Example
  Variants (equality): he bought shoes for his son

25 XML Annotation Layers (English)
  Strictly top-down links
  w+m+a can be easily "knitted"
  API for cross-layer access (programming)
  PML Schema / Relax NG
  [With slight modification, can be used for spoken data (audio as layer "-1")]
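
"Knitting" the layers can be pictured as following references downward from the a-layer through the m-layer to the w-layer. The sketch below uses plain Python dictionaries rather than PML or XML, and it assumes that "top-down" means each higher layer points to the one below it; the identifiers, the key names (m_ref, w_ref), and the knit function are all hypothetical.

```python
# One toy unit per layer, linked by made-up identifiers.
W_LAYER = {"w1": {"token": "má"}}
M_LAYER = {"m1": {"w_ref": "w1", "lemma": "mít", "tag": "VB-S---3P-AA---"}}
A_LAYER = {"a1": {"m_ref": "m1", "afun": "Pred"}}


def knit(a_layer, m_layer, w_layer):
    """Join the three layers into one flat record per a-layer node."""
    rows = []
    for a_id, a_node in a_layer.items():
        m_node = m_layer[a_node["m_ref"]]   # a-layer points down to m-layer
        w_node = w_layer[m_node["w_ref"]]   # m-layer points down to w-layer
        rows.append({
            "id": a_id,
            "token": w_node["token"],
            "lemma": m_node["lemma"],
            "tag": m_node["tag"],
            "afun": a_node["afun"],
        })
    return rows


if __name__ == "__main__":
    for row in knit(A_LAYER, M_LAYER, W_LAYER):
        print(row)
```

With links running in one direction only, a lower layer never needs to change when a higher one is added, which is also why an extra layer below the tokens (such as audio) would fit the same scheme.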

