2004 October 25, Vilnius
Slide No 1
A Tool: Morphological Analyzer / Synthesizer for Lithuanian
Vytautas Zinkevičius, VDU KLC


Slide No 2
Introduction
Importance of the morphology level for Lithuanian language technologies
Difficulties caused by Lithuanian morphology

Slide No 3

Slide No 4
Creating the Tool
Problem of Ambiguity
A Demo of the Tool

Slide No 5
Implementation of the tool in application programs and systems
Spelling checkers (e.g. Lithuanian spellcheckers for Microsoft Office)
Grammatical tagging of the Lithuanian text corpus at CCL VMU
Implementation of the tool in SProUT (a multilingual shallow text processing system, Language Technology Lab, DFKI)
Used in compiling the "Frequency Dictionary of Contemporary Lithuanian" (Grumadienė L., Žilinskienė V., Dažninis dabartinės rašomosios lietuvių kalbos žodynas, Vilnius)

Slide No 6
Morphological Analysis in SProUT

Lithuanian text:
Šimtų tūkstančių ar milijono ir daugiau metų, per kuriuos atsirado žmogus, procesas vyko toli nuo dabartinės Lietuvos teritorijos.

The result of the morphological analysis:

Šimtų
    TYPE=wordform
    LEMMA={POS=numeral }
    GRAMM_MEANING={POS=numeral + GROUP_OF_NUMERAL=cardinal + GENDER=masculine + NUMBER=plural + CASE=genitive}

tūkstančių
    TYPE=wordform
    LEMMA={POS=numeral }
    GRAMM_MEANING={POS=numeral + GROUP_OF_NUMERAL=cardinal + GENDER=masculine + CASE=genitive}

ar
    TYPE=wordform
    LEMMA={POS=particle } GRAMM_MEANING={POS=particle }
    LEMMA={POS=conjunction } GRAMM_MEANING={POS=conjunction }
    LEMMA={POS=onomatopoeic_interjection } GRAMM_MEANING={POS=onomatopoeic_interjection }
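The analysis output above can be read as a list of alternative readings per word form, each pairing a LEMMA block with a GRAMM_MEANING block of `KEY=value` features joined by `+`. The following is a minimal sketch of how such output might be parsed, assuming the notation is exactly as it appears on the slide. The function names `parse_features` and `parse_readings` are hypothetical and are not part of SProUT or of the analyzer itself.

```python
import re

# Hypothetical helper: turn "POS=numeral + CASE=genitive" into a dict.
# The "+"-separated KEY=value layout is inferred from the slide only.
def parse_features(body: str) -> dict[str, str]:
    features = {}
    for part in body.split("+"):
        part = part.strip()
        if not part:
            continue
        key, _, value = part.partition("=")
        features[key.strip()] = value.strip()
    return features

# One reading = a LEMMA={...} block followed by a GRAMM_MEANING={...} block.
READING = re.compile(
    r"LEMMA=\{(?P<lemma>[^}]*)\}\s*GRAMM_MEANING=\{(?P<gramm>[^}]*)\}"
)

# Hypothetical helper: collect every reading listed for one word form.
def parse_readings(analysis: str) -> list[dict[str, dict[str, str]]]:
    return [
        {
            "lemma": parse_features(m.group("lemma")),
            "gramm_meaning": parse_features(m.group("gramm")),
        }
        for m in READING.finditer(analysis)
    ]

# The analysis of "ar" as printed on the slide: three competing readings.
AR_ANALYSIS = (
    "TYPE=wordform "
    "LEMMA={POS=particle } GRAMM_MEANING={POS=particle } "
    "LEMMA={POS=conjunction } GRAMM_MEANING={POS=conjunction } "
    "LEMMA={POS=onomatopoeic_interjection } "
    "GRAMM_MEANING={POS=onomatopoeic_interjection }"
)

readings = parse_readings(AR_ANALYSIS)
for reading in readings:
    print(reading["gramm_meaning"]["POS"])
```

A word form with more than one entry in `readings`, such as "ar" here, is morphologically ambiguous, which illustrates the ambiguity problem named on Slide No 4.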

Slide No 7
Foundations and projects
Lithuanian State Science and Studies Foundation: 1994 reg. no /4D, contract no 41; 1995 reg. no /7E, contract no
"Lithuanian language recognition and generation at morphological level", in the National Lithuanian language committee program "Lithuanian language in informational society"

Slide No 8
The End
Thank You