An Intuitive Representation of Human Languages for Translation. Gábor Prószéky, MorphoLogic & Faculty of Information Technology, Pázmány University.


2 An Intuitive Representation of Human Languages for Translation
Gábor Prószéky, MorphoLogic & Faculty of Information Technology, Pázmány University
Kalmár Workshop, Szeged, October 1-2, 2003

3 Contents
- Some words on Prof. Kalmár’s activity in computational linguistics
- Problems of human language description with formal tools
- A new representation with patterns
- Introduction to machine translation methods
- Application of patterns to translation
Kalmár Workshop 2003 Gábor Prószéky: An Intuitive Representation of Human Languages for Translation

4 Kalmár & languages
- Kalmár’s paper in formal language theory: “An Intuitive Representation of Context-Free Languages”
- Kalmár’s activity in machine translation (conference in 1962): “Representation of Languages with the Help of Mathematical Structures”

5 Linguistic representation problems of the 60’s
- Dependency structure
- Constituent structure
- X-bar theory: X’ → (P) X (Q)
- Related structures
- Using transformations

6 Structured symbols
- Linguistic categories: atomic symbols
- Not enough: subcategorization
- Semantic features: ± alive, ...
- Syntactic features: ± countable, ...
- Rule sets instead of rules
- ID/LP

7 Feature structures
- DAGs
- Unification problems
- Feature geometry, typed features
- LFG, GPSG, HPSG
- Parsing: CF-skeleton + features or feature structures only?
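
The unification mentioned on this slide can be illustrated with a minimal sketch. It assumes that nested Python dicts stand in for feature structures; the feature names (POS, AGR, NUM, PER) are illustrative and do not come from the slides. The sketch handles tree-shaped structures only and omits reentrancy (shared substructures), so it is a simplification of what DAG-based formalisms such as LFG or HPSG require.

```python
def unify(a, b):
    """Unify two feature structures (nested dicts or atomic values).

    Returns the merged structure, or None if the two clash.
    Assumption: structures are trees, with no shared substructures.
    """
    if isinstance(a, dict) and isinstance(b, dict):
        result = dict(a)
        for key, b_val in b.items():
            if key in result:
                merged = unify(result[key], b_val)
                if merged is None:
                    return None  # clash somewhere below this feature
                result[key] = merged
            else:
                result[key] = b_val
        return result
    # Atomic values unify only when they are identical.
    return a if a == b else None


noun = {"POS": "n", "AGR": {"NUM": "sg"}}
third_person = {"AGR": {"PER": "3"}}
compatible = unify(noun, third_person)   # features are merged
clash = unify({"POS": "n"}, {"POS": "v"})  # incompatible values
```

The clash case is where the slide's "unification problems" begin: plain unification can only fail, it cannot decide which side should win.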

8 Complexity of NL grammars
- RG/FSA: not enough
- CF/RTN: not enough
- CS?
- 0/ATN: Turing Machine
- Transformations and metarules
- Arguments for and against

9 NL grammar formalisms
- Competence and performance?
- Kornai number (left-recursion, center-embedding, “respectively” construction)
- Gradually from unrestricted to regular
- (i) a^n b^n → a*b* (n is lost!)
- (ii) a^n b^n → {ε, ab, aabb, aaabbb}
- “Finitization” by length
- No structure in FSA; finite systems, however, can produce structural output
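
The two approximations (i) and (ii) can be compared in a short sketch. The function names and the bound of 3 are assumptions chosen to reproduce the set shown on the slide; nothing here is taken from the author's implementation.

```python
import re


def finitize_anbn(max_n):
    """(ii) Finitization by length: the finite set {a^n b^n : 0 <= n <= max_n}."""
    return {"a" * n + "b" * n for n in range(max_n + 1)}


# (i) Regular over-approximation: the link between the two counts is lost.
A_STAR_B_STAR = re.compile(r"a*b*")


def in_regular_approx(s):
    """True if s is in a*b*, regardless of how many a's and b's it has."""
    return A_STAR_B_STAR.fullmatch(s) is not None


finite_lang = finitize_anbn(3)
```

With these definitions, a string such as "aab" is accepted by the regular approximation but is not in the finitized set, which is the sense in which "n is lost" under (i) and preserved, up to a bound, under (ii).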

10 Syntax and semantics
- Logical representations (e.g. λx.dog(x), λx.run(x))
- World-knowledge representations (e.g. IS-A, PART-OF, INSTANCE-OF)
- Categorial grammar: early logical representations of syntax (Kalmár)
- DCG: interpretation & representation
- Rule-to-rule hypothesis

11 Conflict handling
- Lexicon meets syntax: who is right?
- Lexicon: off-line info coming from past experiences
- Which is more important in a specific situation?

12 Open classes
- Open vs. closed classes: that is, features can or cannot be overridden
- Proper names, jabbers, folk etymology, loanwords, ...
- Grammar of closed classes: minimal grammar

13 Finite morphology
- Finite patterns
- Finite number of entries
- Descriptions assigned to entries
- Finite & open vs. infinite & closed
- Underspecified entries for guessing

14 Finite syntax
- “Item and arrangement” (as in morphology)
- “Arrangement” describes a rather free constituent-order
- Metawords in a meta-dictionary, e.g. ‘(Det (Adj (N)))’ → ‘DAN’
- Cascades without loops
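
A loop-free cascade over meta-dictionaries might look like the sketch below. It assumes the slide's ‘(Det (Adj (N)))’ can be read as the flat category sequence Det Adj N; the additional metawords "DN" and "S" and the two-stage layout are hypothetical and only serve to show how the output of one stage feeds the next.

```python
def apply_stage(tokens, meta_dictionary):
    """One left-to-right pass; the longest matching pattern wins at each position."""
    patterns = sorted(meta_dictionary, key=len, reverse=True)
    out, i = [], 0
    while i < len(tokens):
        for pat in patterns:
            if tuple(tokens[i:i + len(pat)]) == pat:
                out.append(meta_dictionary[pat])  # replace the span by its metaword
                i += len(pat)
                break
        else:
            out.append(tokens[i])  # no pattern starts here: copy the token
            i += 1
    return out


def cascade(tokens, stages):
    """Apply each stage exactly once, in order: a cascade without loops."""
    for meta_dictionary in stages:
        tokens = apply_stage(tokens, meta_dictionary)
    return tokens


# Hypothetical meta-dictionaries; only Det Adj N -> DAN is from the slide.
STAGES = [
    {("Det", "Adj", "N"): "DAN", ("Det", "N"): "DN"},
    {("DAN", "V", "DN"): "S"},
]

result = cascade(["Det", "Adj", "N", "V", "Det", "N"], STAGES)
```

Because every stage is a finite dictionary and the number of stages is fixed, the whole device stays finite-state while still recording which metaword covered which span.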

15 The “plastic box”
- John is a boy.
- “John” is a noun.
- Go is a verb.
- “Go” is a verb.
- is a sign.
- “ ” is a sign.
- □ is a □. (where □ is a “plastic box”)

16 Real examples
(a) Unusual use: Go is a verb. POS [np]  POS [v]
(b) Metaphor: My car drinks a lot. ANIMATE [+]  ANIMATE [-]
(c) Unknown entry: Kalmár is a family name. POS [np]
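
The three cases above can be read as one procedure: a slot in a pattern requires certain features, and the lexicon may agree, disagree, or be silent. The sketch below is one possible reading, not the author's algorithm. The toy lexicon, the "open" flag, and the function name are assumptions; the policy that open-class entries may be overridden while closed-class entries may not follows the "Open classes" slide.

```python
# Toy lexicon (hypothetical entries). "open" marks open-class words.
LEXICON = {
    "go": {"POS": "v", "open": True},
    "car": {"POS": "n", "ANIMATE": "-", "open": True},
    "the": {"POS": "det", "open": False},
}


def fit_into_slot(word, slot_features, lexicon=LEXICON):
    """Return (features, status) for a word placed in a slot that requires slot_features."""
    entry = lexicon.get(word.lower())
    if entry is None:
        # (c) Unknown entry: the slot supplies the description.
        return dict(slot_features), "unknown entry"
    features = {k: v for k, v in entry.items() if k != "open"}
    conflicts = {k for k, v in slot_features.items()
                 if k in features and features[k] != v}
    if not conflicts:
        return {**features, **slot_features}, "consistent"
    if entry["open"]:
        # (a), (b) Open class: the context overrides the lexical features.
        return {**features, **slot_features}, "overridden"
    return None, "rejected"  # closed class: features cannot be overridden


unusual_use = fit_into_slot("Go", {"POS": "np"})
metaphor = fit_into_slot("car", {"ANIMATE": "+"})
unknown = fit_into_slot("Kalmár", {"POS": "np"})
```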

17 Linguistic frames
- Psychology: “Gestalt”
- Morphological complex structures treated as frames by humans
- Frames in AI: ‘shopping’, ‘walking’, ...
- As ‘high-level parsing’ relates to ‘detailed on-line analysis’

18 Translation of human languages
- old problems (50’s)
- direct (60’s)
- interlingual (70’s)
- transfer (80’s)
- examples (90’s)

19 Patterns: general linguistic information in lexicalized form
- Short, fully specified patterns are: lexical entries
- Longer, fully specified entries are: multi-word expressions
- Partially underspecified patterns are: collocations, phrasal verbs, idioms
- Totally underspecified patterns are: linguistic rules
- Pattern/interpretation pairs: Translation Description Language
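
The four-way distinction on this slide depends only on how much of a pattern is underspecified, which a few lines of code can make explicit. This is an illustrative sketch: the Slot class, the category labels, and the example patterns are assumptions, not part of the Translation Description Language.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slot:
    """An underspecified position, constrained only by a category label."""
    category: str


def classify(pattern):
    """Classify a pattern by how many of its positions are underspecified."""
    if not pattern:
        raise ValueError("a pattern needs at least one position")
    slots = sum(isinstance(item, Slot) for item in pattern)
    if slots == 0:
        return "lexical entry" if len(pattern) == 1 else "multi-word expression"
    if slots < len(pattern):
        return "collocation, phrasal verb or idiom"
    return "linguistic rule"


examples = [
    ["dog"],                    # short, fully specified
    ["ad", "hoc"],              # longer, fully specified
    ["give", Slot("NP"), "up"],  # partially underspecified
    [Slot("NP"), Slot("VP")],   # totally underspecified
]
labels = [classify(p) for p in examples]
```

On this view a grammar rule and a dictionary entry are the same kind of object, differing only in degree of specification.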

20 The MetaMorpho principles
- No single words but contextual expressions (in form of patterns) only
- Pattern pairs: input/interpretation structure pairs
- Single pass: no separate transfer steps
- Target structure generation: by-product of parsing

21 Jabberwocky
’Twas brillig, and the slighty toves
Did gyre and gimble in the wabe:
All mimsy were the borogroves,
And the mone raths outgrabe.

22
’Twas □, and the □ □s
Did □ and □ in the □:
All □ were the □s,
And the □ □s □.

23 Translation rules for Jabberwocky
- ’twas □ → □ volt
- , and □ → , és □
- the □s did □ → a □ok □tak
- □ and □ → □ és □
- in the □ → a □ban
- all □ → teljesen □
- □ were the □s → □k voltak az □ok
- the □s □ → a □ok □tek
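
A few of these rules can be run as source/target pattern pairs in a short sketch. It assumes that each "plastic box" becomes a numbered slot ({0}, {1}, numbered left to right), that a slot matches exactly one word, and that an ASCII apostrophe is used in ’twas. Only four of the rules are included, and the matching strategy (first rule whose source side covers the whole phrase) is an assumption, not the MetaMorpho algorithm.

```python
import re

# (source pattern, target pattern) pairs; slot numbering is an assumption.
RULES = [
    ("'twas {0}", "{0} volt"),
    ("all {0}", "teljesen {0}"),
    ("in the {0}", "a {0}ban"),
    ("{0} and {1}", "{0} és {1}"),
]


def compile_rule(source):
    """Turn a source pattern into a regex in which each slot matches one word."""
    parts = re.split(r"(\{\d+\})", source)
    regex = "".join(
        r"(\w+)" if re.fullmatch(r"\{\d+\}", part) else re.escape(part)
        for part in parts
    )
    return re.compile(regex, re.IGNORECASE)


def translate(phrase, rules=RULES):
    """Return the target side of the first rule matching the whole phrase, else None."""
    for source, target in rules:
        match = compile_rule(source).fullmatch(phrase)
        if match:
            # The target is filled in directly from the match: no transfer step.
            return target.format(*match.groups())
    return None


translated = [translate(p) for p in
              ["'Twas brillig", "gyre and gimble", "in the wabe", "all mimsy"]]
```

The unknown words pass through the slots untouched, so the target structure is produced as a side effect of recognizing the source pattern. The sketch does not transliterate the words or adjust suffixes for vowel harmony.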

24
- ’Twas □, and the □ □s Did □ and □ in the □: All □ were the □s, And the □ □s □.
- □ volt, és a □ □ok □tak és □tek a □ben: teljesen □ voltak a □ok és a □ □ok □tek.

25 Translation of Jabberwocky
Dzsebervoki
Brillig volt, és a szlájti tóvok gájertak és gimbeltek a vébben: teljesen mimszik voltak a borogróvok és a món rátok autgrébtek.

26 An intuitive representation...
1. X-bar based structures
2. Feature-based descriptions
3. Metarules (used off-line)
4. Rule-to-rule principle
5. Lexicon should be finite but open
6. Closed classes belong to the minimal grammar
7. Minimal grammar describes “basically” linguistic elements

27 An intuitive representation... (cont’d)
8. Linguistic constructions can be described by finite patterns
9. A huge & finite description set is used rather than a limited & infinite grammar
10. In case of conflict, lexical information either is redundant or contradicts the actual description
11. Known constructions need no real-time analysis (Gestalt, frame)

28 An intuitive representation... (cont’d)
12. “Broken” frames are analyzed in real time
13. A structural (source/target) pattern pair is assigned to every frame to be translated
14. Target structure is computed while parsing the source structure
