Lecture 1, 7/21/2005 Natural Language Processing
CS60057 Speech & Natural Language Processing
  Autumn 2005
  Lecture 1, 21 July 2005
Course Information
  Instructor: Sudeshna Sarkar
  Course Web Page: http://www.facweb.iitkgp.ernet.in/~sudeshna/courses/nlp/
  Teaching Assistants: Monojit Choudhury
Today’s slides are adapted from:
  Ilyas Cicekli’s slides, http://www.cs.ucf.edu/~ilyas/Courses/CAP6640
  Jurafsky & Martin’s book
Preliminaries
  Required: basic formal language theory. We will introduce the basics for those not familiar with it.
  Knowledge of linguistic terminology will be useful.
  You can get a good overview of the field from “Survey of the State of the Art in Human Language Technology” (1996): http://cslu.cse.ogi.edu/HLTsurvey/HLTsurvey.html
  Assignment 1: Read the Survey.
Text Books
  Daniel Jurafsky and James H. Martin, "Speech and Language Processing", Prentice Hall, 2000.
Other References
  James Allen, "Natural Language Understanding", Second edition, The Benjamin/Cummings Publishing Company Inc., 1995.
  Christopher D. Manning and Hinrich Schütze, "Foundations of Statistical Natural Language Processing", The MIT Press, 1999.
Goal of NLP
  Develop techniques and tools to build practical and robust systems that can communicate with users in one or more natural languages.

              Natural Lang.                 Artificial Lang.
  Lexical     >100 000 words                ~100 words
  Syntax      Complex                       Simple
  Semantic    1 word --> several meanings   1 word --> 1 meaning
Course Topics
  Morphological Processing
  Part-of-Speech Tagging
  Parsing Algorithms for Context-Free Languages
  Features and Augmented Grammars
  Lexicalized and Probabilistic Parsing
  Semantic Analysis
  Lexical Semantics and Word Sense Disambiguation
  Discourse
  Natural Language Generation
  Machine Translation
  Probability & Information Theory
  Language Modeling: N-gram models, parameter estimation
  Some linguistics: phonology, morphology, syntax, semantics, discourse
  Words & the lexicon
NLP
  The ultimate research goal: to develop an automated language understanding system.
  What is NLP? The process of computer analysis of input provided in a human language (natural language), and conversion of this input into a useful form of representation.
  The field of NLP is concerned with:
    Primarily: getting computers to perform useful and interesting tasks with human languages.
    Secondarily: helping us come to a better understanding of human language.
  Why is this useful?
Motivation for Natural Language Processing / Understanding
  1. Getting computers to perform useful and interesting tasks with human languages. This enables communication:
       Human-computer interaction, e.g., IR (information retrieval)
       Computer-assisted human-human communication, e.g., MT (machine translation)
  2. Computer modeling of NLP helps us to:
       1. Understand language processing in humans
       2. Understand other human cognitive processes
  It is a challenging task. It requires:
       a high level of knowledge about the world
       the ability to represent that knowledge and reason with it
Forms of Natural Language
  The input/output of an NLP system can be:
    Written text: newspaper articles, letters, manuals, prose, …
    Speech: read speech (radio, TV, dictations), conversational speech, commands, …
  To process written text, we need:
    lexical, syntactic, and semantic knowledge about the language
    discourse information and real-world knowledge
Forms of Natural Language (continued)
  To process spoken language, we need everything required for written text (lexical, syntactic, and semantic knowledge; discourse information; real-world knowledge), plus:
    speech recognition
    speech synthesis
Components of NLP
  Natural Language Understanding
    Mapping the given input in the natural language into a useful representation.
    Different levels of analysis are required: morphological analysis, syntactic analysis, semantic analysis, discourse analysis, …
  Natural Language Generation
    Producing output in the natural language from some internal representation.
    Different levels of synthesis are required: deep planning (what to say), syntactic generation
  Which is harder?
Why is NL Understanding hard?
  Natural language is extremely rich in form and structure, and very ambiguous.
    How should meaning be represented?
    Which structures map to which meaning structures?
  One input can mean many different things. Ambiguity can be at different levels:
    Lexical (word-level) ambiguity -- different meanings of words
    Syntactic ambiguity -- different ways to parse the sentence
    Interpreting partial information -- how to interpret pronouns
    Contextual information -- the context of a sentence may affect the meaning of that sentence
  Many inputs can mean the same thing.
  Interaction among components of the input is not clear.
  Noisy input (e.g. speech)
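To make "different ways to parse the sentence" concrete, the following is a minimal sketch that counts the parse trees a toy grammar assigns to a sentence, using a CKY-style chart. The grammar, the lexicon, the example sentence, and the names `BINARY_RULES`, `LEXICON`, and `count_parses` are all assumptions made for illustration; none of them come from the slides.

```python
from collections import defaultdict

# Toy grammar in Chomsky normal form (hypothetical, for illustration only).
# Each rule is (parent, left child, right child).
BINARY_RULES = [
    ("S", "NP", "VP"),
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),   # PP attaches to the verb phrase
    ("NP", "Det", "N"),
    ("NP", "NP", "PP"),   # PP attaches to the noun phrase
    ("PP", "P", "NP"),
]

# Toy lexicon mapping each word to its possible categories (also hypothetical).
LEXICON = {
    "i": ["NP"],
    "saw": ["V"],
    "the": ["Det"],
    "man": ["N"],
    "telescope": ["N"],
    "with": ["P"],
}


def count_parses(words, start="S"):
    """Count the distinct parse trees rooted at `start` that cover all words."""
    n = len(words)
    # chart[(i, j)][A] = number of trees rooted at A spanning words[i:j]
    chart = defaultdict(lambda: defaultdict(int))

    # Fill in single-word spans from the lexicon.
    for i, word in enumerate(words):
        for category in LEXICON.get(word.lower(), []):
            chart[(i, i + 1)][category] += 1

    # Combine shorter spans into longer ones, trying every split point.
    for length in range(2, n + 1):
        for i in range(0, n - length + 1):
            j = i + length
            for k in range(i + 1, j):
                for parent, left, right in BINARY_RULES:
                    left_count = chart[(i, k)].get(left, 0)
                    right_count = chart[(k, j)].get(right, 0)
                    if left_count and right_count:
                        chart[(i, j)][parent] += left_count * right_count

    return chart[(0, n)].get(start, 0)


if __name__ == "__main__":
    sentence = "I saw the man with the telescope".split()
    print(count_parses(sentence))
```

Under this toy grammar the sentence receives two parses: one where "with the telescope" modifies the seeing, and one where it modifies "the man". A realistic grammar would typically license many more readings for longer sentences.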
Knowledge of Language
  Phonology – concerns how words are related to the sounds that realize them.
  Morphology – concerns how words are constructed from more basic meaning units called morphemes. A morpheme is the primitive unit of meaning in a language.
  Syntax – concerns how words can be put together to form correct sentences, and determines what structural role each word plays in the sentence and what phrases are subparts of other phrases.
  Semantics – concerns what words mean and how these meanings combine in sentences to form sentence meaning. The study of context-independent meaning.
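As a rough illustration of the morphology item above (words built from morphemes), here is a minimal sketch of a segmenter that splits a word into prefix, stem, and suffix. The affix lists, the stem set, and the function name `segment` are assumptions chosen for the example; real morphological analyzers also handle spelling changes and far larger lexicons, which this sketch does not attempt.

```python
# Tiny hand-picked inventories (hypothetical, for illustration only).
PREFIXES = ["un", "re"]
SUFFIXES = ["able", "ing", "ed", "s"]
STEMS = {"break", "cat", "walk", "read"}


def segment(word):
    """Split a word into [prefix, stem, suffix] morphemes, or return None.

    Tries every combination of an optional known prefix and an optional
    known suffix, and accepts the first one whose remainder is a known stem.
    """
    word = word.lower()
    for prefix in [""] + PREFIXES:
        if not word.startswith(prefix):
            continue
        rest = word[len(prefix):]
        for suffix in [""] + SUFFIXES:
            if suffix and not rest.endswith(suffix):
                continue
            stem = rest[:len(rest) - len(suffix)]
            if stem in STEMS:
                # Drop empty slots so "cat" yields ["cat"], not ["", "cat", ""].
                return [m for m in (prefix, stem, suffix) if m]
    return None


if __name__ == "__main__":
    for w in ["unbreakable", "cats", "walked", "rereading", "dog"]:
        print(w, segment(w))
```

Here "unbreakable" is analyzed as un + break + able, while "dog" returns `None` only because it is missing from the toy stem set.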
Knowledge of Language (continued)
  Pragmatics – concerns how sentences are used in different situations and how use affects the interpretation of the sentence.
  Discourse – concerns how the immediately preceding sentences affect the interpretation of the next sentence. For example, interpreting pronouns and interpreting the temporal aspects of the information.
  World Knowledge – includes general knowledge about the world, and what each language user must know about the other’s beliefs and goals.