Introduction Chapter 1 Foundations of statistical natural language processing.

Presentation transcript:

Introduction Chapter 1 Foundations of statistical natural language processing

NLP and Statistical Approach
Why are many people adopting a statistical approach to natural language processing?
How should one go about it?
–We will begin with a discussion of some philosophical themes and leading ideas

Approaches to language
Between 1960 and 1985, most of linguistics, psychology, artificial intelligence and NLP was dominated by the rationalist approach
“A significant part of the knowledge in the human mind is not derived by the senses but is fixed in advance, presumably by genetic inheritance”

Rationalist approach
Dominated the field due to widespread acceptance of arguments by Noam Chomsky
Argument: the “problem of the poverty of the stimulus”
–It is difficult to see how children can learn something as complex as natural language from limited input
Questions?

Empiricist approach
Also begins by assuming that some cognitive abilities are present in the brain
The difference between the two approaches is one of degree
“The mind does not begin with detailed sets of principles and procedures for the various components of language, such as morphological structure, case marking, etc.”
A baby’s brain begins with general operations of association, pattern recognition, and generalization

Empiricist approach to NLP
Suggests that “we can learn complicated and extensive language structures by specifying an appropriate general language model” “and then applying statistical, pattern recognition and machine learning methods to a large amount of language use”

SNLP
In practice, people cannot work from directly observing a large amount of language use
–Instead, texts are simply used
–A body of texts is called a corpus (pl. corpora)
–The empiricist corpus-based approach is seen in the work of the American structuralists (e.g. Zellig Harris)
A language’s structure can be discovered automatically from a corpus
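
To make the idea of working from a corpus concrete, here is a minimal sketch of the simplest corpus statistic, a word-frequency count. The function name `word_frequencies`, the tokenization rule, and the two-sentence toy corpus are assumptions made up for illustration; they do not come from the slides.

```python
import re
from collections import Counter

def word_frequencies(corpus):
    """Count lowercase word tokens across a list of text strings."""
    counts = Counter()
    for text in corpus:
        # Crude tokenizer (an assumption): runs of letters and apostrophes.
        counts.update(re.findall(r"[a-z']+", text.lower()))
    return counts

# Toy corpus, invented for illustration only.
toy_corpus = [
    "The cat sat on the mat.",
    "The dog sat on the log.",
]

freqs = word_frequencies(toy_corpus)
print(freqs.most_common(3))
```

A real corpus would be far larger and would need a more careful tokenizer, but the counts produced this way are the raw material for the statistical methods discussed in the rest of the deck.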

Chomskyan linguistics seeks to describe the language module of the human mind (the I-language), for which texts (the E-language) provide only indirect evidence
Empiricist approaches describe the E-language as it actually occurs
–Chomsky distinguishes linguistic competence from linguistic performance

Chomskyan linguistics depends on categorical principles
–Sentences either do or do not satisfy them
–Same as American structuralism
–Relies on categorical judgments about rare types of sentences
Statistical NLP, the approach taken here, draws on the work of Shannon
–Assign probabilities to linguistic events to decide which sentences are ‘usual’ and which are ‘unusual’
–Associations and preferences that occur in the totality of language use
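
As one possible illustration of assigning probabilities to linguistic events, the sketch below trains a bigram model with add-one smoothing on three invented sentences and compares the log-probability of a ‘usual’ word order with an ‘unusual’ one. The bigram model, the smoothing scheme, the helper names `train_bigram` and `log_prob`, and the toy sentences are all assumptions chosen for brevity; the slides do not prescribe a particular model.

```python
import math
from collections import Counter

def train_bigram(sentences):
    """Collect context and bigram counts, padding each sentence with boundary markers."""
    contexts, bigrams = Counter(), Counter()
    vocab = {"</s>"}
    for sentence in sentences:
        words = sentence.lower().split()
        vocab.update(words)
        tokens = ["<s>"] + words + ["</s>"]
        contexts.update(tokens[:-1])             # every token that precedes another
        bigrams.update(zip(tokens, tokens[1:]))  # adjacent token pairs
    # One extra vocabulary slot reserves probability mass for unseen words.
    return contexts, bigrams, len(vocab) + 1

def log_prob(sentence, contexts, bigrams, vocab_size):
    """Add-one smoothed log-probability of a sentence under the bigram counts."""
    tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
    return sum(
        math.log((bigrams[(prev, word)] + 1) / (contexts[prev] + vocab_size))
        for prev, word in zip(tokens, tokens[1:])
    )

# Toy training data, invented for illustration only.
toy_sentences = ["the dog barks", "the cat sleeps", "the dog sleeps"]
model = train_bigram(toy_sentences)

usual = log_prob("the dog sleeps", *model)
unusual = log_prob("sleeps dog the", *model)
print(usual > unusual)
```

The model never rejects a sentence outright. It only ranks word sequences as more or less probable, which is the contrast with a categorical ‘do’ or ‘do not’ judgment.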

Scientific Content
Questions that linguistics should answer
–What kinds of things do people say?
–What do these things say/ask/request about the world?
Key point: how knowledge of language is acquired by humans, and how they actually understand and generate sentences in real time

Competence Grammar
–Said to underlie the language
–In the generative approach, it resides in the speaker’s head
It posits a set of grammatical sentences, with all other strings being ungrammatical
The concept of grammaticality
–Judged by whether a sentence is structurally well formed
–Not by whether people would say it or whether it is semantically anomalous, e.g. “Colorless green ideas sleep furiously”

Syntactic grammaticality is a binary choice
–A native speaker normally produces grammatical sentences
Two points
–A binary choice is plausible for simple sentences, but for complex ones it may be far-fetched
–Non-native speakers may say something grammatical but somehow odd: “In addition to this, she insisted that women were regarded as a different existence from men unfairly”

Non-categorical phenomena in language
A categorical view of language may be sufficient for many purposes, but it has its limitations
Frequency-based analysis is required
To see non-categorical phenomena, change in the language should be studied
–e.g. ‘while’ (noun), meaning ‘time’: “Take a while”
–‘while’ (complementizer): “While you were out”
After analyzing frequency, the category should be reanalyzed
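
A frequency-based analysis of category membership can be sketched as follows: given tokens tagged with a part of speech, compute how often each word appears in each category. The tag names, the hand-assigned tags, and the function name `category_proportions` are assumptions for illustration. The two example phrases are the ones on the slide, and two tokens are of course far too few to support any real conclusion.

```python
from collections import Counter, defaultdict

def category_proportions(tagged_tokens):
    """Map each word to the share of its occurrences in each category."""
    counts = defaultdict(Counter)
    for word, tag in tagged_tokens:
        counts[word.lower()][tag] += 1
    return {
        word: {tag: n / sum(tags.values()) for tag, n in tags.items()}
        for word, tags in counts.items()
    }

# Hand-tagged tokens from the slide's two examples (tags are assumptions).
tagged = [
    ("Take", "VERB"), ("a", "DET"), ("while", "NOUN"),
    ("While", "COMP"), ("you", "PRON"), ("were", "VERB"), ("out", "ADV"),
]

props = category_proportions(tagged)
print(props["while"])
```

Run over tagged corpora from different periods, proportions like these would show a word shifting gradually from one category to another instead of switching abruptly.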

‘near’: adjective or preposition?
–We will review that decision in the near future (adjective)
–He lives near the station (preposition)
–We live nearer the water than you thought (both at once)
Grammatically, adjectives and nouns do not take a direct object but require a preposition
–‘convenient for people’
Yet the comparative form (‘nearer’) is a property of adjectives and adverbs
Blending and language change
‘Kind of’, ‘sort of’
–We are kind of hungry

Summing up
Few attempts have been made to use statistical NLP to explain complex linguistic phenomena
–This new way of looking at language may be able to account for things such as non-categorical phenomena and language change
Supporting argument: that “human cognition is probabilistic and that language must be too”