Introduction to Natural Language Processing (a.k.a. “Computational Linguistics”)



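Recall: Agents and Environment
[Figure: an agent and its environment. Sensors deliver percepts from the environment to the agent; actuators carry the agent's actions back to the environment.]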

Agents and Environments with NLP
[Figure: two agents in an environment, each with sensors and actuators, exchanging speech, handwriting, printed text, and digital text.]
1. What do the other agents claim to believe? (NL understanding)
2. What do the other agents actually believe or want? (Plan recognition, game theory)
3. How can I make the other agents believe X? (Planning, NL generation)

WHAT IS LANGUAGE?
Definition with respect to form: Language is a system of speech symbols. It is realized acoustically (sound waves), visually-spatially (sign language), and in written form.
Definition with respect to function: Language is the most important means of human communication. It is used to convey and exchange information (informative function).
Multiplicity of languages: We know of about 7000 languages, which is estimated to be about 1% of all the languages that ever existed.

LANGUAGE AND THE BRAIN

THEORIES OF LANGUAGE Noam Chomsky claims that language is innate. B. F. Skinner claims that language is learned; it is basically a stimulus-response mechanism.

WHAT IS GRAMMAR?
When we learn a language, we also learn the rules that govern how language elements, such as words, are combined to produce meaningful language.
These elements and rules constitute the Grammar of a language.
The Grammar is “what we know”: it represents our linguistic competence.

DESCRIPTIVE vs PRESCRIPTIVE GRAMMAR
– Prescriptive: how the language should be used
– Descriptive: how the language is actually used

Areas of Linguistics
– phonetics: the study of speech sounds
– phonology: the study of sound systems
– morphology: the rules of word formation
– syntax: the rules of sentence formation
– semantics: the study of word meanings
– pragmatics: the study of discourse meanings
– sociolinguistics: the study of language in society
– applied linguistics: the application of the methods and results of linguistics to such areas as language teaching, national language policies, lexicography, translation, language in politics, etc.

What is the meaning of ‘meaning’?
Learning a language includes:
– learning the “agreed upon” meanings of certain strings of sounds, and
– learning how to combine these meaningful units into larger units which also convey meaning.

Morphemes
A morpheme is the smallest linguistic unit that has meaning. It is a grammatical unit in which there is an arbitrary union of a sound and a meaning, and which cannot be further analysed (broken down into parts that have meaning).
A morpheme may be represented by:
– a single sound, e.g. the plural morpheme [s] in cat+s
– a single syllable (monosyllabic), e.g. child+ish
– more than one syllable (polysyllabic): two syllables, e.g. lady, water; three syllables, e.g. crocodile; or four syllables, e.g. salamander

Words
Two basic ways to form words:
– Inflectional (e.g. English verb + ending → another form of the verb)
  open + ed = opened
  open + ing = opening
– Derivational (e.g. adverbs from adjectives, nouns from adjectives)
  happy → happily (adverb from adjective)
  happy → happiness (noun from adjective)
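As a rough illustration of the two word-formation processes, here is a minimal Python sketch. The function names `inflect` and `derive` and the two spelling rules are assumptions made for this example; they cover only the regular cases shown on the slide, not English morphology in general.

```python
def inflect(verb: str, suffix: str) -> str:
    """Attach an inflectional ending to a regular verb (toy rule)."""
    # Drop a silent final "e" before a vowel-initial suffix: bake + ing -> baking.
    if verb.endswith("e") and suffix[0] in "aeiou":
        verb = verb[:-1]
    return verb + suffix


def derive(adjective: str, suffix: str) -> str:
    """Attach a derivational suffix to an adjective (toy rule)."""
    # A final "y" after a consonant becomes "i": happy + ly -> happily.
    if (adjective.endswith("y") and len(adjective) > 1
            and adjective[-2] not in "aeiou"):
        adjective = adjective[:-1] + "i"
    return adjective + suffix


if __name__ == "__main__":
    print(inflect("open", "ed"))    # opened
    print(inflect("open", "ing"))   # opening
    print(derive("happy", "ly"))    # happily
    print(derive("happy", "ness"))  # happiness
```

Inflection here yields another form of the same verb, while derivation yields a word of a different class.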

Syntax
The study of classes of words (nouns, verbs, etc.) and the rules that govern how the words can combine to make phrases and sentences.

Basic classes of words
Classes of words are also known as parts of speech (POS).
Open-class words:
– Nouns
– Verbs
– Adjectives
– Adverbs
Closed-class words, or function words:
– Articles, pronouns, prepositions, particles, quantifiers, conjunctions

Basic phrases
A word from an open class can be used to form the basis of a phrase.
The basis of a phrase is called the head.

Examples of phrases
Noun phrases
– The manager of the institute
– Her worry to pass the exams
– Several students from the English Department
Adjective phrases
– easy to understand
– mad as a dog
– glad that he passed the exam
Adverb phrases
– fast like the wind
– outside the building
Verb phrases
– ate her sandwich
– went to the doctor
– believed what I told him

Grammars and parsing
Syntactic parsing: determining the syntactic structure of a sentence.
Basic steps:
– Identify sentence boundaries
– Identify the part of speech of each word
– Identify pairs of words that form phrases
– Identify pairs of phrases that form larger phrases
– …

Context Free Grammar
S -> NP VP
NP -> Det (Adj) N
NP -> Proper N
NP -> N
VP -> V
VP -> V PP
VP -> V NP
VP -> V NP PP
VP -> V NP NP
PP -> Prep NP
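The grammar above can be turned into a small parser. The sketch below is one possible implementation, not the method used in the course: it enumerates every tree the rules license for a token span. The optional `(Adj)` is expanded into two NP rules, `Proper N` is written as the single symbol `ProperN`, and the `LEXICON` is a hypothetical word list added only so the example runs.

```python
from functools import lru_cache

# The slide's rules, with the optional (Adj) expanded into two NP rules.
GRAMMAR = {
    "S":  [("NP", "VP")],
    "NP": [("Det", "N"), ("Det", "Adj", "N"), ("ProperN",), ("N",)],
    "VP": [("V",), ("V", "PP"), ("V", "NP"), ("V", "NP", "PP"), ("V", "NP", "NP")],
    "PP": [("Prep", "NP")],
}

# Hypothetical lexicon: word -> possible parts of speech (not from the slides).
LEXICON = {
    "the": {"Det"}, "cat": {"N"}, "mat": {"N"}, "sat": {"V"}, "on": {"Prep"},
}


def parse(tokens):
    """Return every parse tree for `tokens` rooted at S, as nested tuples."""
    tokens = tuple(t.lower() for t in tokens)

    @lru_cache(maxsize=None)
    def spans(symbol, i, j):
        """All trees rooted at `symbol` that cover tokens[i:j]."""
        if symbol not in GRAMMAR:  # part-of-speech symbol: look the word up
            if j == i + 1 and symbol in LEXICON.get(tokens[i], ()):
                return ((symbol, tokens[i]),)
            return ()
        trees = []
        for rhs in GRAMMAR[symbol]:
            for children in expand(rhs, i, j):
                trees.append((symbol,) + children)
        return tuple(trees)

    def expand(rhs, i, j):
        """All ways to cover tokens[i:j] with the symbol sequence `rhs`."""
        if not rhs:
            return [()] if i == j else []
        first, rest = rhs[0], rhs[1:]
        results = []
        # Leave at least one token for each remaining symbol.
        for k in range(i + 1, j - len(rest) + 1):
            for left in spans(first, i, k):
                for right in expand(rest, k, j):
                    results.append((left,) + right)
        return results

    return spans("S", 0, len(tokens))


def bracket(tree):
    """Render a tree in labelled-bracket notation."""
    label, *children = tree
    if len(children) == 1 and isinstance(children[0], str):
        return f"[{label} {children[0]}]"
    return f"[{label} " + " ".join(bracket(c) for c in children) + "]"


if __name__ == "__main__":
    for tree in parse("The cat sat on the mat".split()):
        print(bracket(tree))
    # [S [NP [Det the] [N cat]] [VP [V sat] [PP [Prep on] [NP [Det the] [N mat]]]]]
```

Exhaustive enumeration is fine for a toy grammar; real parsers use chart-based algorithms to avoid repeating work on long sentences.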

Parses
The cat sat on the mat.
[S [NP [Det The] [N cat]] [VP [V sat] [PP [Prep on] [NP [Det the] [N mat]]]]]

Parses
Time flies like an arrow. (“flies” is the verb, “like” a preposition)
[S [NP [N Time]] [VP [V flies] [PP [Prep like] [NP [Det an] [N arrow]]]]]

Parses
Time flies like an arrow. (“time flies” is a noun phrase, “like” the verb)
[S [NP [N Time] [N flies]] [VP [V like] [NP [Det an] [N arrow]]]]
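The two readings of “Time flies like an arrow” can be counted mechanically. The following is a minimal CKY-style sketch that counts parse trees rather than building them. Several things here are assumptions for illustration: the rule `NP -> N N` (needed for “time flies” as a noun phrase, and not part of the grammar listed earlier), the restriction to binary rules, and the small `LEXICON` of ambiguous tags.

```python
from collections import defaultdict

# Binary rules as (parent, left child, right child).
BINARY_RULES = [
    ("S", "NP", "VP"),
    ("NP", "Det", "N"),
    ("NP", "N", "N"),      # assumption: noun-noun compound, e.g. "time flies"
    ("VP", "V", "NP"),
    ("VP", "V", "PP"),
    ("PP", "Prep", "NP"),
]

# Unary rules NP -> N and VP -> V, applied to single words only.
UNARY_RULES = {"N": "NP", "V": "VP"}

# Hypothetical lexicon with the ambiguous tags that matter for this sentence.
LEXICON = {
    "time": {"N"},
    "flies": {"N", "V"},
    "like": {"V", "Prep"},
    "an": {"Det"},
    "arrow": {"N"},
}


def count_parses(tokens, rules=BINARY_RULES, start="S"):
    """Count the distinct parse trees for `tokens` rooted at `start`."""
    n = len(tokens)
    # chart[i, j][symbol] = number of trees for `symbol` over tokens[i:j]
    chart = defaultdict(lambda: defaultdict(int))
    for i, word in enumerate(tokens):
        for tag in LEXICON[word.lower()]:
            chart[i, i + 1][tag] += 1
            if tag in UNARY_RULES:
                chart[i, i + 1][UNARY_RULES[tag]] += 1
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):  # split point between the two children
                for parent, left, right in rules:
                    chart[i, j][parent] += chart[i, k][left] * chart[k, j][right]
    return chart[0, n][start]


if __name__ == "__main__":
    print(count_parses("Time flies like an arrow".split()))  # 2
```

Dropping the assumed `NP -> N N` rule leaves only the reading in which “flies” is the verb.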

Semantics and Pragmatics
Semantics: the study of meaning that can be determined from a sentence, phrase, or word.
Pragmatics: the study of meaning as it depends on context (speaker, situation).

Language to Logic
John went to a book store.
  ∃s. bookstore(s) ∧ go(John, s)
Every boy loves a girl.
  ∀b. boy(b) → ∃g. girl(g) ∧ loves(b, g)
Who broke the vase?
  λx. broke(x, vase17)
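One way to see what these formulas mean is to evaluate them against a small model of the world. The sketch below does that in Python; every individual and fact in it (`tom`, `ann`, `store1`, who loves whom, who broke the vase) is invented for illustration.

```python
# A tiny hypothetical model: sets of individuals and relations between them.
BOYS = {"tom", "sam"}
GIRLS = {"ann", "eve"}
BOOKSTORES = {"store1"}
WENT_TO = {("john", "store1")}
LOVES = {("tom", "ann"), ("sam", "eve")}
BROKE = {("sam", "vase17")}
DOMAIN = BOYS | GIRLS | BOOKSTORES | {"john", "vase17"}


def john_went_to_a_bookstore():
    # ∃s. bookstore(s) ∧ go(John, s)
    return any(s in BOOKSTORES and ("john", s) in WENT_TO for s in DOMAIN)


def every_boy_loves_a_girl():
    # ∀b. boy(b) → ∃g. girl(g) ∧ loves(b, g)
    # "p → q" is evaluated as "(not p) or q".
    return all(
        (b not in BOYS) or any(g in GIRLS and (b, g) in LOVES for g in DOMAIN)
        for b in DOMAIN
    )


def who_broke(thing):
    # λx. broke(x, thing): applied to the model, the set of x that satisfy it
    return {x for x in DOMAIN if (x, thing) in BROKE}


if __name__ == "__main__":
    print(john_went_to_a_bookstore())  # True
    print(every_boy_loves_a_girl())    # True
    print(who_broke("vase17"))         # {'sam'}
```

The existential quantifier maps onto `any`, the universal quantifier onto `all`, and the lambda term onto a query that returns the individuals satisfying the predicate.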

Headlines
– Police Begin Campaign To Run Down Jaywalkers
– Iraqi Head Seeks Arms
– Teacher Strikes Idle Kids
– Miners Refuse To Work After Death
– Juvenile Court To Try Shooting Defendant

Language Families

NLP tends to focus on:
– Syntax: grammars, parsers, parse trees, dependency structures
– Semantics: subcategorization frames, semantic classes, ontologies, formal semantics
– Pragmatics: pronouns, reference resolution, discourse models

Issues in NLP
– Ambiguity
– Lack of knowledge: understanding language requires knowledge that computers don’t have

Ambiguity
– Computational linguists are obsessed with ambiguity
– Ambiguity is a fundamental problem of computational linguistics
– Resolving ambiguity is a crucial goal

Ambiguity
Find at least 5 meanings of this sentence: “I made her duck”
– I cooked waterfowl for her benefit (to eat)
– I cooked waterfowl belonging to her
– I created the (plaster?) duck she owns
– I caused her to quickly lower her head or body
– I waved my magic wand and turned her into undifferentiated waterfowl
– At least one other meaning that’s inappropriate for gentle company

Ambiguity is Pervasive
– I caused her to quickly lower her head or body
  Lexical category: “duck” can be a N or V
– I cooked waterfowl belonging to her
  Lexical category: “her” can be a possessive (“of her”) or dative (“for her”) pronoun
– I made the (plaster) duck statue she owns
  Lexical semantics: “make” can mean “create” or “cook”

Ambiguity is Pervasive
Grammar: “make” can be:
– Transitive (verb has a noun direct object): I cooked [waterfowl belonging to her]
– Ditransitive (verb has 2 noun objects): I made [her] (into) [undifferentiated waterfowl]
– Action-transitive (verb has a direct object and another verb): I caused [her] [to move her body]

Ambiguity is Pervasive
Phonetics!
– I mate or duck
– I’m eight or duck
– Eye maid; her duck
– Aye mate, her duck
– I maid her duck
– I’m aid her duck
– I mate her duck
– I’m ate her duck
– I’m ate or duck

Kinds of knowledge needed?
Consider the following interaction with HAL, the computer from 2001: A Space Odyssey:
Dave: Open the pod bay doors, Hal.
HAL: I’m sorry Dave, I’m afraid I can’t do that.

Knowledge needed to build HAL?
Speech recognition and synthesis
– Dictionaries (how words are pronounced)
– Phonetics (how to recognize/produce each sound of English)
Natural language understanding
– Knowledge of the English words involved: what they mean and how they combine (what is a ‘pod bay door’?)
– Knowledge of syntactic structure
[Figure: syntactic structure of “I’m sorry Dave, I’m afraid I can’t do that.”]

What’s needed?
Dialog and pragmatic knowledge
– “Open the door” is a REQUEST (as opposed to a STATEMENT or information-question)
– It is polite to respond, even if you’re planning to kill someone.
– It is polite to pretend to want to be cooperative (I’m afraid I can’t…)
– What is ‘that’ in ‘I can’t do that’?
Even a system to book airline flights needs much of this kind of knowledge.

Computational models of how natural languages work
These are sometimes called Language Models and sometimes Grammars.
Three main types (among many others):
1. Document models, or “topic” models
2. Sequence models: Markov models, HMMs, others
3. Context-free grammar models
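As a concrete instance of a sequence model, here is a minimal sketch of a first-order Markov (bigram) language model estimated by relative frequency. The three-sentence corpus, the function names, and the `<s>` / `</s>` boundary markers are assumptions made for this example; a practical model would also need smoothing so that unseen word pairs do not get probability zero.

```python
from collections import Counter, defaultdict


def train_bigram_model(sentences):
    """Estimate P(word | previous word) by relative frequency."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        for prev, word in zip(tokens, tokens[1:]):
            counts[prev][word] += 1
    return {
        prev: {word: c / sum(following.values()) for word, c in following.items()}
        for prev, following in counts.items()
    }


def sentence_probability(model, sentence):
    """Multiply the bigram probabilities along the sentence (no smoothing)."""
    tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
    prob = 1.0
    for prev, word in zip(tokens, tokens[1:]):
        prob *= model.get(prev, {}).get(word, 0.0)
    return prob


if __name__ == "__main__":
    corpus = ["the cat sat on the mat", "the cat ate", "the dog sat"]
    model = train_bigram_model(corpus)
    print(sentence_probability(model, "the cat sat"))  # 0.125
    print(sentence_probability(model, "the dog ate"))  # 0.0
```

The second sentence gets probability zero only because “dog ate” never occurs in the toy corpus, which is the sparsity problem that smoothing addresses.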

Computational models of how natural languages work
Most of the models I will show you are:
– Probabilistic models
– Graphical models
– Generative models
In other words, they are essentially Bayes Nets.
In addition, many (but not all) are:
– Latent variable models
This means that some variables in the model are not observed in data and must be inferred (like the hidden states in an HMM).
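To make the idea of inferring latent variables concrete, the sketch below runs the Viterbi algorithm on a tiny HMM whose hidden states are parts of speech and whose observations are words. All of the probabilities in `START`, `TRANS`, and `EMIT` are made-up numbers chosen for illustration, not estimates from data.

```python
STATES = ["N", "V", "Prep", "Det"]

# Hypothetical parameters (illustrative values only).
START = {"N": 0.7, "V": 0.1, "Det": 0.2}
TRANS = {
    "N":    {"N": 0.2, "V": 0.6, "Prep": 0.2},
    "V":    {"N": 0.2, "Prep": 0.4, "Det": 0.4},
    "Prep": {"N": 0.3, "Det": 0.7},
    "Det":  {"N": 1.0},
}
EMIT = {
    "N":    {"time": 0.4, "flies": 0.2, "arrow": 0.4},
    "V":    {"time": 0.1, "flies": 0.5, "like": 0.4},
    "Prep": {"like": 1.0},
    "Det":  {"an": 1.0},
}


def viterbi(words, states, start, trans, emit):
    """Return the most probable hidden-state sequence and its probability."""
    # best[t][s] = (probability of the best path ending in s at step t,
    #               state at step t - 1 on that path)
    best = [{s: (start.get(s, 0.0) * emit[s].get(words[0], 0.0), None)
             for s in states}]
    for word in words[1:]:
        column = {}
        for s in states:
            prob, prev = max(
                (best[-1][p][0] * trans[p].get(s, 0.0), p) for p in states
            )
            column[s] = (prob * emit[s].get(word, 0.0), prev)
        best.append(column)
    # Follow the back-pointers from the best final state.
    state = max(states, key=lambda s: best[-1][s][0])
    path = [state]
    for column in reversed(best[1:]):
        state = column[state][1]
        path.append(state)
    return list(reversed(path)), best[-1][path[0]][0]


if __name__ == "__main__":
    tags, prob = viterbi("time flies like an arrow".split(),
                         STATES, START, TRANS, EMIT)
    print(tags)  # ['N', 'V', 'Prep', 'Det', 'N']
```

The words are observed and the tags are not; Viterbi recovers the single most probable tag sequence under the model, which with these numbers is the reading in which “flies” is the verb.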