Role of NLP in Linguistics 16-07-2010 Dipti Misra Sharma Language Technologies Research Centre International Institute of Information Technology Hyderabad.

Presentation transcript:

Role of NLP in Linguistics Dipti Misra Sharma Language Technologies Research Centre International Institute of Information Technology Hyderabad India

NLP and Linguistics
- Both have similar goals: understanding human language(s).
- NLP relies on the theoretical models provided by linguistics, so NLP definitely needs linguistics.
- What about linguistics? Does it benefit from NLP?

NLP is useful
- NLP tools can be useful for certain linguistic tasks, such as collecting, organizing, and classifying data, and providing statistics.
- This saves effort and brings out facts that help in making generalizations.
- It makes life easier for linguists.

NLP and Linguistics Resources
- NLP techniques are useful for creating linguistic resources such as verb frames, transfer grammars, and bilingual lexicons.
- Studies in computational linguistics (CL) have shown the usefulness of NLP techniques in historical linguistics as well (e.g. phylogenetic trees).
- Thus, NLP is useful not only for data-related tasks but also for the creation of linguistic resources.

What else?
- NLP researchers and linguists look at language from different perspectives.
- NLP researchers look for solutions that provide higher coverage; exceptions can be dealt with later.
- Linguistic researchers find exceptions more interesting, since these help identify problem areas for the theory.

However
- Resource creation for NLP involves a close study of large-scale real-world data (e.g. linguistic annotation).
- A close look at real-world data often raises linguistic issues that have theoretical implications.

Our experience
- Hindi has a long list of lexical items that are historically derived from Sanskrit verb roots but are categorized as adjectives in Hindi.
- For example: sthita (situated), sviikrita (accepted), sviikaarya (acceptable), likhita (written), kathit (told), ...

However
- These ‘adjectives’ of Hindi have modifiers which have argument-like properties, both semantically and syntactically.
- For example:

  dillii mein sthit    qutub miinaar ek  darshaniiy     sthal hai
  Delhi  in   situated Qutub Minar   one worth-watching place is
  ‘Qutub Minar, situated in Delhi, is a place worth visiting.’

  unke dvaaraa kathit kahaaniyaan bahut pracalit hain
  them by      told   stories     very  popular  are
  ‘The stories told by them are very popular.’

The issue (1/2)
- Both ‘dillii mein’ and ‘unke dvaaraa’ have appropriate case markers: ‘mein’ is locative and ‘dvaaraa’ agentive.
- These adjectives are historically non-finite verbs.
  - However, Hindi grammars no longer account for them as such.
  - They are not morphologically decomposable either.

The issue (2/2)
- Morphological decomposition of sthit (situated) and kathit (told) would lead to a Sanskrit analysis and NOT a Hindi analysis.
- Hindi, for example, does not have ‘sthaa’ or ‘kath’ as verb roots.
- It does not have ‘ita’ as an active participial suffix either.
- How do we explain the argument-like properties of their modifiers?

What does it indicate?
- Linguists understand the relation, but not through a linguistic process of Hindi.
- A linguistic process (or at least the roots and suffixes) from Sanskrit will have to be brought in.
- Is it that languages have elements which are at different stages of development/evolution?

Another example
- Indian languages show frequent use of complex predicates.
- Examples: pratiikshaa karnaa (wait do), kshamaa karnaa (forgive do).
- The problem: when is a noun-verb (NV) sequence a complex predicate, and when is it not?

Complex Predicates
- The problem has long been discussed in the linguistics literature.
- Several diagnostics have also been proposed.
- However, quite a few NV sequences are a single unit semantically, yet syntactically they fail the diagnostics.
- The question remains: do we consider such cases as ‘complex verbs’ or as instances of ‘verb argument’?
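
As a rough illustration of how a simple NLP tool might collect data and provide statistics on the NV question, the sketch below counts adjacent noun + verb pairs in a tiny hand-made corpus. Everything in it is an assumption made for illustration: the sentences, the lemmatized romanized forms, the tag names (PRON, N, V) and the function name nv_candidates are hypothetical and are not taken from the talk or from any treebank. Frequency alone does not decide whether a pair is a complex predicate; the counts only give a linguist a list of candidates to inspect.

```python
from collections import Counter

def nv_candidates(sentences):
    """Count adjacent (noun, verb) pairs in POS-tagged sentences.

    Each sentence is a list of (lemma, tag) tuples. The tag names
    "N" and "V" are a hypothetical toy tagset, not a real standard.
    """
    counts = Counter()
    for sent in sentences:
        # Walk over every pair of neighbouring tokens in the sentence.
        for (w1, t1), (w2, t2) in zip(sent, sent[1:]):
            if t1 == "N" and t2 == "V":
                counts[(w1, w2)] += 1
    return counts

# Toy, hand-made data (lemmatized, romanized); for illustration only.
toy_corpus = [
    [("vah", "PRON"), ("pratiikshaa", "N"), ("karnaa", "V")],
    [("ve", "PRON"), ("pratiikshaa", "N"), ("karnaa", "V")],
    [("vah", "PRON"), ("kshamaa", "N"), ("karnaa", "V")],
    [("vah", "PRON"), ("kitaab", "N"), ("padhnaa", "V")],
]

counts = nv_candidates(toy_corpus)
# List candidates from most to least frequent for manual inspection.
for pair, n in counts.most_common():
    print(pair, n)
```

A list like this cannot separate ‘complex verbs’ from ordinary ‘verb argument’ sequences on its own; it only narrows down where the proposed diagnostics need to be applied by hand.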

Conclusions
- NLP tools and techniques can be useful for linguists.
- NLP throws up rich examples which need to be handled.
- These pose challenges for the theory.