TelMore: Telugu Morphological Generator. Madhavi Ganapathiraju and Lori Levin, Language Technologies Institute, Carnegie Mellon University, Pittsburgh, USA.

1 TelMore: Telugu Morphological Generator
  Madhavi Ganapathiraju and Lori Levin
  Language Technologies Institute, Carnegie Mellon University, Pittsburgh, USA
  ICUDL 2006: Second International Conference on Universal Digital Library, Alexandria, Egypt, November 17-19, 2006

2 (19th Nov, 2006, ICUDL2006: TelMore - Telugu Morphological Generator)
  UDL: machine translation, summarization, information retrieval, interface design, digital storage, OCR

3 Machine translation
  Example: "Rani gave the book to my mother"
  1. Output from English lexical analysis
    – gave → verb, past, root give
    – the book → noun phrase, singular, neutral
    – mother → noun, singular, feminine
    – my → possessive, root I
    – …
  2. English-Telugu dictionary for root forms of nouns and verbs
    – give → ichchut’a
    – book → pustakamu
    – mother → talli, amma
    – I → neinu
  3. TelMore: morphological generator for Telugu
    – ichchut’a → ichchaad’u (past masc), ichchinadi (past fem), ... istun’di (future fem), istaad’u (future masc)
    – pustakamu → pustakamu, pustakamutoo (with pustakamu), pustakamu loo (in pustakamu) …
    – amma → ammaki (to amma), amma cheita (by amma)
    – I → naa (possessive)
  OR
  1. Phrase match in EBMT
    – gave to → ki ichchaad’u
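The dictionary-lookup and generation steps on this slide can be sketched as two table lookups. This is a minimal illustration, not TelMore's implementation: the names `DICTIONARY`, `FORMS` and `generate` are hypothetical, the tables contain only the examples shown on the slide, and the apostrophes are written as plain ASCII.

```python
# Illustrative sketch only: hypothetical names, data limited to the slide's examples.

# Step 2: English root -> candidate Telugu roots (from the slide's dictionary).
DICTIONARY = {
    "give": ["ichchut'a"],
    "book": ["pustakamu"],
    "mother": ["talli", "amma"],
    "I": ["neinu"],
}

# Step 3: Telugu root -> {form label: generated form} (from the slide's examples).
FORMS = {
    "ichchut'a": {
        "past masc": "ichchaad'u",
        "past fem": "ichchinadi",
        "future fem": "istun'di",
        "future masc": "istaad'u",
    },
    "pustakamu": {"with": "pustakamutoo", "in": "pustakamu loo"},
    "amma": {"to": "ammaki", "by": "amma cheita"},
    "neinu": {"possessive": "naa"},
}


def generate(english_root: str, form: str) -> str:
    """Look up the Telugu root(s), then return the first matching inflected form."""
    for telugu_root in DICTIONARY[english_root]:
        forms = FORMS.get(telugu_root, {})
        if form in forms:
            return forms[form]
    raise KeyError(f"no '{form}' form recorded for '{english_root}'")


if __name__ == "__main__":
    print(generate("give", "past fem"))
    print(generate("mother", "to"))
    print(generate("I", "possessive"))
```

A real generator would compute the forms from rules rather than store them, which is the role TelMore plays in step 3.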

4 TelMore
  Generates morphological forms for nouns and verbs when the root word is given

5 About Telugu
  2nd largest spoken language in India (?)
  70 M native speakers
  World ranking
    – with Korean, Vietnamese, Marathi and Tamil
  7th century AD recorded origin
  Literary language in 11th century AD

6 Parts of Speech: Noun
  Number: singular, plural
  Gender: male, female, neutral
  Morphological forms (vibhaktulu):
    – nominative, genitive, dative, accusative, vocative, instrumental and locative
  14 forms for each noun
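The count of 14 is consistent with crossing the seven cases with the two numbers. The slide does not spell out this breakdown, so the sketch below treats it as an assumption; `CASES`, `NUMBERS` and `noun_form_slots` are hypothetical names.

```python
from itertools import product

# Cases and numbers as listed on the slide.
CASES = [
    "nominative", "genitive", "dative", "accusative",
    "vocative", "instrumental", "locative",
]
NUMBERS = ["singular", "plural"]


def noun_form_slots():
    """Enumerate every (number, case) slot a generator would need to fill."""
    return list(product(NUMBERS, CASES))


if __name__ == "__main__":
    slots = noun_form_slots()
    print(len(slots))  # 7 cases x 2 numbers = 14
```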

7 Plural formation
  General rule is to add “lu” as a suffix
  A series of rules is then applied to yield the final form: lu, l’l’u or n’d’lu
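A rule cascade of this kind might be sketched as follows. The slide does not list the actual rules, so the two ending-based conditions below are illustrative assumptions and not the rules TelMore applies. The test words are not taken from the slides, and `pluralize` is a hypothetical name.

```python
def pluralize(noun: str) -> str:
    """Return a plural for a transliterated noun using illustrative rules only."""
    # Assumed rule: stems ending in n'd'u or n'd'i take the n'd'lu ending.
    for ending in ("n'd'u", "n'd'i"):
        if noun.endswith(ending):
            return noun[: -len(ending)] + "n'd'lu"
    # Assumed rule: other stems ending in d'u or d'i take the l'l'u ending.
    for ending in ("d'u", "d'i"):
        if noun.endswith(ending):
            return noun[: -len(ending)] + "l'l'u"
    # General rule from the slide: add "lu" as a suffix.
    return noun + "lu"


if __name__ == "__main__":
    for word in ("pustakamu", "gud'i", "pan'd'u"):
        print(word, "->", pluralize(word))
```

Ordering matters here: the more specific ending is checked first so that the general "lu" suffix only applies when no special rule fires.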

8 Parts of Speech: Verb
  Number: singular, plural
  Gender: male, female, neutral
  Person: 1st person, 2nd person, 3rd person
  Morphological forms:
    – Present, past, future, aorist affirmative, aorist negative, imperative and prohibitive
    – Present participle, past participle: affirmative and negative
  Number of forms: 2 x 3 x 3 x forms for each verb

9 Features in TelMore (v.1)
  Morphological form generation
    – Nouns
    – Verbs
  System
    – Library module for integration elsewhere
    – Flat file input & output (plain text or HTML)
    – User-interactive through command line
    – Web interface for data addition with user validation
  Web Interface

10 Current Data Size
  words have been created by native speakers upon request

17 Linguistic Knowledge
  The linguistic rules are taken from a book by C.P. Brown
    – Rules are demonstrated through examples
    – No formal description

18 Noun: First Declension Morphs

19 Noun: Second Declension

20 Noun: Third Declension

21 Noun: Third Declension: Irregular 2

22 Noun: Third Declension: Irregular 3

23 Noun: Third Declension: Irregular 4

24 Noun: Third Declension: Irregular 5

25 Verb: First Conjugation

26 Verb: Second Conjugation

27 Verb: Third Conjugation

28 Alternate dialects and spellings
  Telugu is spoken in many dialects
    – Andhra Pradesh has long borders with 4 states, each of which speaks a different language, and one long coastal region
    – The dialects in each of these regions are different
    – The learned and the others speak different dialects
    – Urdu influence in Hyderabad due to Muslim rule
    – pure/poetic, formal/informal
  Telugu is written the way it is spoken
  Hence the different dialects result in different spellings of the words

29 Future work for this tool
  Causative, middle and passive voices to be added
  Morphology of adjectives, etc.
  Integration of Om → native font integration for flat file processing
  Integration with English lexicon to be of real use in multilingual applications

30 Acknowledgements Prof. Lori Levin Linguistics Advisor Prof. Raj Reddy Prof. N. Balakrishnan UDL Advisors R. Harsha Naveena Yanamala Web-interface creation Data Creation … V. Mythili Shyam G. Padmasree V. Abhinay B.V. Prashanth G. Ramana Lakshmi G. Padmavathy V. Nava Mallika
