1
April 26, 2007, Workshop on Treebanking, NAACL-HLT 2007, Rochester
Treebanks: Language-specific Issues
Czech
Jan Hajič
Institute of Formal and Applied Linguistics, School of Computer Science, Faculty of Mathematics and Physics, Charles University, Prague, Czech Republic
2
Czech I
Morphology (& tagging)
  300,000 dictionary entries ~ 20 million forms
  Rich, 13 categories to deal with
  Tagset: ~4500 plausible tags, 3000 observed
  Lemmatization essential
Syntax (surface)
  “Free” word order – non-projective constructions
  Frequent occurrence of…
  Agreement (several types)
  “Governance” (subcategorization) and Control
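The notion of a non-projective construction can be made concrete with a small check on a dependency tree. The following is a minimal sketch, not taken from the slides: it assumes a sentence is encoded as a list of head indices (1-based, with 0 for the root), and the function name `is_projective` and the toy inputs are hypothetical.

```python
def is_projective(heads):
    """Return True if the dependency tree is projective.

    heads[i] is the 1-based index of the head of token i+1; 0 marks the root.
    Assumes the input is a well-formed tree (no cycles).
    """
    n = len(heads)

    def dominated_by(node, ancestor):
        # Walk up from node until we reach the ancestor or the artificial root 0.
        while node != 0:
            if node == ancestor:
                return True
            node = heads[node - 1]
        return False

    for dep in range(1, n + 1):
        head = heads[dep - 1]
        if head == 0:
            continue
        lo, hi = sorted((dep, head))
        # Every token strictly between head and dependent
        # must belong to the subtree of the head.
        for between in range(lo + 1, hi):
            if not dominated_by(between, head):
                return False
    return True


# Toy encodings (assumed for illustration, not real treebank data):
projective_tree = [2, 0, 2]      # tokens 1 and 3 both depend on token 2
non_projective_tree = [3, 0, 2]  # edge 1-3 crosses over the root token 2

print(is_projective(projective_tree))
print(is_projective(non_projective_tree))
```

An edge counts as non-projective here when some token lying between the head and its dependent is not dominated by that head, which is the usual definition for dependency trees.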
3
Czech II
Deep syntax and semantics
  Information structure
    Topic, focus – each word (node) vs. global
    “Deep” word order ~ communicative function
  Valency, word sense
    Function and form strongly correlated
    Checking, constraints
Style
  Colloquial / standard
  Hard to assign to layers…
4
Examples
English (motivation)
John expects Mary to leave.
5
Example (~ he wanted her to leave)
Surface syntax: Nutil ji odejít (“He forced her to leave”) / Začal ji číst (“He began to read it”)
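The tree diagrams for this slide are not preserved in the transcript. As a rough sketch of the contrast the two sentences are presumably meant to show, the surface trees can be encoded as head-indexed token lists. The attachments below are an assumption based on a standard dependency analysis (the pronoun "ji" as an object of "Nutil" in the first sentence, and as an object of "číst" in the second), and the `Token` class and `governor` helper are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Token:
    form: str
    head: int  # 1-based index of the governing token; 0 = root


# Assumed surface analyses (not copied from the slide images):
# "Nutil ji odejít": both "ji" and "odejít" depend on the finite verb.
nutil = [Token("Nutil", 0), Token("ji", 1), Token("odejít", 1)]
# "Začal ji číst": "ji" depends on the infinitive "číst".
zacal = [Token("Začal", 0), Token("ji", 3), Token("číst", 1)]


def governor(sentence, form):
    """Return the form of the token governing the first token with this form."""
    tok = next(t for t in sentence if t.form == form)
    return "ROOT" if tok.head == 0 else sentence[tok.head - 1].form


print(governor(nutil, "ji"))
print(governor(zacal, "ji"))
```

Under these assumptions the two sentences have the same word order and the same parts of speech, yet "ji" attaches to a different verb in each, which is the kind of distinction a treebank annotation has to record explicitly.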
6
Example (~ he wanted her to leave)
Deep syntax: Nutil ji odejít / Začal ji číst
(figure legend: link to surface tree node)