

2
An Intuitive Representation of Human Languages for Translation
Gábor Prószéky
MorphoLogic & Faculty of Information Technology, Pázmány University
Kalmár Workshop, Szeged, October 1-2, 2003

3
Contents
- Some words on Prof. Kalmár’s activity in computational linguistics
- Problems of human language description with formal tools
- A new representation with patterns
- Introduction to machine translation methods
- Application of patterns to translation
Kalmár Workshop 2003 Gábor Prószéky: An Intuitive Representation of Human Languages for Translation

4
Kalmár & languages
- Kalmár’s paper in formal language theory: “An Intuitive Representation of Context-Free Languages”
- Kalmár’s activity in machine translation (conference in 1962): “Representation of Languages with the Help of Mathematical Structures”

5
Linguistic representation problems of the 60’s
- Dependency structure
- Constituent structure
- X-bar theory: X’ (P) X (Q)
- Related structures
- Using transformations

6
Structured symbols
- Linguistic categories: atomic symbols
- Not enough: subcategorization
- Semantic features: ± alive, ...
- Syntactic features: ± countable, ...
- Rule sets instead of rules
- ID/LP

7
Feature structures
- DAGs
- Unification problems
- Feature geometry, typed features
- LFG, GPSG, HPSG
- Parsing: CF-skeleton + features or feature structures only?
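
As an illustration of the unification mentioned on this slide, here is a minimal sketch in Python. It assumes feature structures are plain nested dictionaries, so it models trees only and does not capture the shared substructure (reentrancy) of true DAGs. The feature names `POS`, `AGR`, `NUM`, and `PER` are hypothetical examples, not taken from the talk.

```python
def unify(a, b):
    """Unify two feature structures given as nested dicts.

    Returns the merged structure, or None if two atomic values clash.
    Assumption: no reentrancy, so this is tree unification, not DAG unification.
    """
    if isinstance(a, dict) and isinstance(b, dict):
        result = dict(a)  # shallow copy so the inputs are not modified
        for key, b_val in b.items():
            if key in result:
                merged = unify(result[key], b_val)
                if merged is None:
                    return None  # clash somewhere below this feature
                result[key] = merged
            else:
                result[key] = b_val  # feature only present in b
        return result
    # Atomic values unify only if they are identical.
    return a if a == b else None


# Hypothetical example structures.
noun = {"POS": "n", "AGR": {"NUM": "sg"}}
agreement = {"AGR": {"NUM": "sg", "PER": "3"}}
plural = {"AGR": {"NUM": "pl"}}

compatible = unify(noun, agreement)   # the two structures are merged
clash = unify(noun, plural)           # NUM sg vs. pl cannot be unified
```

Compatible structures merge into one that carries the information of both, while a single conflicting atomic value makes the whole unification fail.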

8
Complexity of NL grammars
- RG/FSA: not enough
- CF/RTN: not enough
- CS?
- 0/ATN: Turing Machine
- Transformations and metarules
- Arguments for and against

9
NL grammar formalisms
- Competence and performance?
- Kornai number (left-recursion, center-embedding, “respectively” construction)
- Gradually from unrestricted to regular
- (i) a^n b^n -> a*b* (n is lost!)
- (ii) a^n b^n -> {ε, ab, aabb, aaabbb}
- “Finitization” by length
- No structure in FSA; finite systems, however, can produce structural output
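
The two approximations (i) and (ii) of a^n b^n can be compared in a short Python sketch. The function names and the bound of 3 are illustrative choices, not part of the talk; the bound is picked so that the finite set matches the one listed on the slide.

```python
import re


def in_anbn(s):
    """Exact membership test for the context-free language a^n b^n (n >= 0)."""
    n = len(s) // 2
    return s == "a" * n + "b" * n


# (i) Regular approximation: a*b* keeps the order of the symbols
# but forgets that the two counts must be equal (n is lost).
STAR_APPROX = re.compile(r"a*b*")


def finitize(max_n):
    """(ii) Finite approximation: all strings a^n b^n with n = 0..max_n."""
    return {"a" * n + "b" * n for n in range(max_n + 1)}


finite_set = finitize(3)  # the empty string stands for ε
```

Approximation (i) overgenerates: it accepts `aab`, which is not in a^n b^n. Approximation (ii) undergenerates: it is exact up to the chosen length and rejects everything longer, such as `aaaabbbb`.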

10
Syntax and semantics
- Logical representations (e.g. λx.dog(x), λx.run(x))
- World-knowledge representations (e.g. IS-A, PART-OF, INSTANCE-OF)
- Categorial grammar: early logical representations of syntax (Kalmár)
- DCG: interpretation & representation
- Rule-to-rule hypothesis

11
Conflict handling
- Lexicon meets syntax: who is right?
- Lexicon: off-line info coming from past experiences
- Which is more important in a specific situation?

12
Open classes
- Open vs. closed classes: that is, features can or cannot be overridden
- Proper names, jabbers, folk etymology, loanwords, ...
- Grammar of closed classes: minimal grammar

13
Finite morphology
- Finite patterns
- Finite number of entries
- Descriptions assigned to entries
- Finite & open vs. infinite & closed
- Underspecified entries for guessing

14
Finite syntax
- “Item and arrangement” (as in morphology)
- “Arrangement” describes a rather free constituent order
- Metawords in a meta-dictionary, e.g. ‘(Det (Adj (N)))’ ‘DAN’
- Cascades without loops

15
The “plastic box”
- John is a boy.
- “John” is a noun.
- Go is a verb.
- “Go” is a verb.
- is a sign.
- “ ” is a sign.
- is a . (where is a “plastic box”)

16
Real examples
(a) Unusual use: Go is a verb. POS [np] POS [v]
(b) Metaphor: My car drinks a lot. ANIMATE [+] ANIMATE [-]
(c) Unknown entry: Kalmár is a family name. POS [np]

17
Linguistic frames
- Psychology: “Gestalt”
- Morphologically complex structures treated as frames by humans
- Frames in AI: ‘shopping’, ‘walking’, ...
- As ‘high-level parsing’ relates to ‘detailed on-line analysis’

18
Translation of human languages
- old problems (50’s)
- direct (60’s)
- interlingual (70’s)
- transfer (80’s)
- examples (90’s)

19
Patterns: general linguistic information in lexicalized form
- Short, fully specified patterns are: lexical entries
- Longer, fully specified entries are: multi-word expressions
- Partially underspecified patterns are: collocations, phrasal verbs, idioms
- Totally underspecified patterns are: linguistic rules
- Pattern/interpretation pairs: Translation Description Language
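
The four kinds of pattern listed above can be told apart by how many of their positions are left open. The following Python sketch assumes a pattern is a list whose items are either literal words (strings) or open slots. The `Slot` class, the category labels, and the example patterns are hypothetical and are not the notation of the Translation Description Language.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slot:
    """An underspecified position, constrained only by a category label."""
    category: str


def classify(pattern):
    """Name the kind of pattern according to its degree of specification."""
    if not pattern:
        raise ValueError("a pattern needs at least one item")
    slots = sum(isinstance(item, Slot) for item in pattern)
    if slots == 0:
        # Fully specified: one word is a lexical entry, more is a multi-word expression.
        return "lexical entry" if len(pattern) == 1 else "multi-word expression"
    if slots == len(pattern):
        return "linguistic rule"  # totally underspecified
    return "partially underspecified pattern"  # collocation, phrasal verb, idiom


kinds = [
    classify(["dog"]),
    classify(["machine", "translation"]),
    classify(["give", Slot("NP"), "up"]),
    classify([Slot("Det"), Slot("Adj"), Slot("N")]),
]
```

On this view a dictionary entry and a grammar rule are the two ends of one scale, which is why they can be stored and matched in the same way.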

20
The MetaMorpho principles
- No single words but contextual expressions (in the form of patterns) only
- Pattern pairs: input/interpretation structure pairs
- Single pass: no separate transfer steps
- Target structure generation: by-product of parsing

21
Jabberwocky
‘Twas brillig, and the slighty toves
Did gyre and gimble in the wabe:
All mimsy were the borogroves,
And the mone raths outgrabe.

22
‘Twas , and the s
Did and in the :
All were the s,
And the s .

23
Translation rules for Jabberwocky
- ‘twas volt
- , and , és
- the s did a ok tak
- and és
- in the a ban
- all teljesen
- were the s k voltak az ok
- the s a ok tek
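
How such rules might apply can be sketched in Python as source/target pattern pairs with open slots. This is an illustration under stated assumptions, not the MetaMorpho implementation. The slot markers did not survive in the rule list above, so the slot name `X` and its positions are guesses; only the literal words and suffixes (`volt`, `és`, `a`, `-ban`, `teljesen`) come from the slide. Transliteration and Hungarian vowel harmony are ignored.

```python
import re

# Each rule pairs a source pattern with a target template.
# Upper-case items are slots (assumed notation); all other items are literal words.
# Longer patterns come first so that they win over shorter ones.
RULES = [
    (["in", "the", "X"], "a {X}ban"),
    (["'twas", "X"], "{X} volt"),
    (["and"], "és"),
    (["all"], "teljesen"),
]


def match_at(pattern, tokens, start):
    """Return the slot bindings if pattern matches tokens at start, else None."""
    if start + len(pattern) > len(tokens):
        return None
    bindings = {}
    for item, token in zip(pattern, tokens[start:]):
        if item.isupper():              # slot: binds any alphabetic word
            if not token.isalpha():
                return None
            bindings[item] = token
        elif item != token.lower():     # literal word must match exactly
            return None
    return bindings


def translate(text):
    """Single left-to-right pass: the target is built while the source is matched."""
    tokens = re.findall(r"'?\w+|[^\w\s]", text)
    out, i = [], 0
    while i < len(tokens):
        for pattern, template in RULES:
            bindings = match_at(pattern, tokens, i)
            if bindings is not None:
                out.append(template.format(**bindings))
                i += len(pattern)
                break
        else:
            out.append(tokens[i])       # no pattern applies: copy the word through
            i += 1
    return " ".join(out)


result = translate("'Twas brillig and all mimsy in the wabe")
```

Unknown words such as `brillig` and `wabe` fill the slots untranslated, which is the point of the Jabberwocky example: the patterns carry the structure even when the lexicon has no entry for the words.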

24
- ‘Twas , and the s Did and in the : All were the s, And the s .
- volt, és a ok tak és tek a ben: teljesen voltak a ok és a ok tek.

25
Translation of Jabberwocky
Dzsebervoki
Brillig volt, és a szlájti tóvok
gájertak és gimbeltek a vébben:
teljesen mimszik voltak a borogróvok
és a món rátok autgrébtek.

26
An intuitive representation...
1. X-bar based structures
2. Feature-based descriptions
3. Metarules (used off-line)
4. Rule-to-rule principle
5. Lexicon should be finite but open
6. Closed classes belong to the minimal grammar
7. Minimal grammar describes “basically” linguistic elements

27
An intuitive representation... (cont’d)
8. Linguistic constructions can be described by finite patterns
9. A huge & finite description set is used rather than a limited & infinite grammar
10. In case of conflict, lexical information is either redundant or contradictory to the actual description
11. Known constructions need no real-time analysis (Gestalt, frame)

28
An intuitive representation... (cont’d)
12. “Broken” frames are analyzed in real time
13. A structural (source/target) pattern pair is assigned to every frame to be translated
14. Target structure is computed while parsing source structure

