Natural Language Processing, Rogelio Dávila Pérez, Research Professor


1 Natural Language Processing
Rogelio Dávila Pérez, Research Professor
rdav9@hotmail.com

2 Some terms …
• Speech recognition
• Natural language understanding
• Computational linguistics
• Natural language generation
• Speech synthesis
• Information retrieval
• Information extraction
• Inference

3 Application Areas
• Machine translation
• Information retrieval
• Knowledge acquisition
• User interfaces
  – Question-answering systems

4 Application Areas
• Advantages of Natural Language Interfaces
Natural language has several obvious and desirable properties (Patrick Doyle):
  – It provides an immediate vocabulary for talking about the contents of the computer.
  – It provides a means of accessing information in the computer independently of its structure and encodings.
  – It shields the user from the formal access language of the underlying system.
  – It is available with a minimum of training.

5 Natural Language Processing History
• ELIZA [Weizenbaum, 1966]
ELIZA, the most famous pattern-matching natural language program, was built at MIT in 1966. The program assumes the role of a Rogerian, or "nondirective," therapist in its dialog with the user. It operates by matching the left sides of its rules against the user's last sentence and using the appropriate right side to generate a response. Rules are indexed by keywords, so only a few have to be matched against a particular sentence. Some rules have no left side, so they can apply anywhere, with replies like "Tell me more about that."
Note that these rules are "approximate" matchers. This accounts for ELIZA's major strength, its ability to say something reasonable most of the time, as well as its major weaknesses: the superficiality of its understanding and the ease with which it can be led completely astray.
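
The rule-matching scheme described on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three rules, their keywords, and the function name `respond` are hypothetical and are not taken from Weizenbaum's original script, which also reordered keywords by rank and swapped pronouns.

```python
import re

# Hypothetical rules: (keyword, left-side pattern, right-side template).
# They are illustrative only, not ELIZA's original script.
RULES = [
    ("mother", re.compile(r".*\bmy mother\b(.*)", re.I),
     "Tell me more about your mother."),
    ("i am", re.compile(r".*\bi am (.*)", re.I),
     "Why do you say you are {0}?"),
    ("i feel", re.compile(r".*\bi feel (.*)", re.I),
     "How long have you felt {0}?"),
]

# A rule with no left side: it can answer any sentence.
DEFAULT_REPLY = "Tell me more about that."


def respond(sentence: str) -> str:
    """Return the reply of the first keyword-indexed rule that matches."""
    text = sentence.strip().rstrip(".!?")
    lowered = text.lower()
    for keyword, pattern, template in RULES:
        # Keyword index: skip rules whose keyword does not occur at all.
        if keyword not in lowered:
            continue
        match = pattern.match(text)
        if match:
            # Fill the right side with the text captured by the left side.
            return template.format(*(g.strip() for g in match.groups()))
    return DEFAULT_REPLY


if __name__ == "__main__":
    for line in ["I am sad.", "My mother hates me", "The weather is nice"]:
        print(line, "->", respond(line))
```

Because the match is purely textual, a sentence such as "I am sure my mother is fine" triggers the "mother" rule regardless of what the speaker meant, which is the superficiality the slide refers to.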

6 Natural Language Processing History
• LUNAR [William Woods, 1973]
LUNAR answered questions about the rock samples brought back from the moon using two databases: the chemical analyses and the literature references. Specifically, it helped geologists access, compare, and evaluate chemical analysis data on moon rocks and soil composition obtained from the Apollo 11 mission. It operated by translating a question entered in English into an expression in a formal query language. The translation was done with an augmented transition network (ATN) parser coupled with a rule-driven semantic interpretation procedure.

7 Natural Language Processing History
• SHRDLU [Winograd, 1972]
SHRDLU carried on a dialog with a user in which the system simulated a robot manipulating a set of simple objects on a tabletop. Knowledge was represented as procedures within the system. The design of the system was based on the belief that, to understand language, a program must deal in an integrated way with syntax, semantics, and reasoning. The basic viewpoint guiding its implementation was that meanings (of words, phrases, and sentences) can be embodied in procedural structures and that language is a way of activating appropriate procedures in the hearer.

8 Natural Language Processing History
• HEARSAY
Speech understanding for voice chess [CMU, 1976]. HEARSAY uses a blackboard architecture with knowledge sources posting constraints. HEARSAY-II understood spoken queries about computer science abstracts stored in a database. HEARSAY-III is a general blackboard architecture.
The HEARSAY project was meant to overcome the limitations of syntax-directed methods that parse from left to right. HEARSAY uses three knowledge sources: acoustics and phonetics, the syntax of legal utterances, and the semantics of the domain. Knowledge was constrained by using expected utterances.
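
To make the blackboard idea concrete, here is a toy sketch in Python, not a reconstruction of HEARSAY itself. Three functions stand in for the three knowledge sources named on this slide; each one reads a shared blackboard and either posts hypotheses or prunes them. All word lists, the utterance shape "piece to file rank", and every name in the code are assumptions made up for illustration.

```python
from itertools import product

# Hypothetical toy data: words an acoustic front end might confuse,
# one candidate list per time slot of the utterance.
ACOUSTIC_CANDIDATES = [["pawn", "dawn"], ["to", "two"],
                       ["king", "ring"], ["four", "for"]]
PIECES = {"pawn", "knight", "bishop", "rook", "queen", "king"}
FILES = {"king", "queen", "bishop", "knight", "rook"}
RANKS = {"one", "two", "three", "four", "five", "six", "seven", "eight"}


class Blackboard:
    """Shared store that every knowledge source reads and updates."""

    def __init__(self):
        self.hypotheses = None  # nothing posted yet
        self.log = []           # which source changed the board, in order


def acoustic_source(bb):
    """Post every word sequence compatible with the (toy) signal."""
    if bb.hypotheses is None:
        bb.hypotheses = list(product(*ACOUSTIC_CANDIDATES))
        bb.log.append("acoustics")
        return True
    return False


def prune(bb, name, keep):
    """Remove hypotheses that violate a constraint; report any change."""
    if bb.hypotheses is None:
        return False
    kept = [h for h in bb.hypotheses if keep(h)]
    changed = len(kept) != len(bb.hypotheses)
    bb.hypotheses = kept
    if changed:
        bb.log.append(name)
    return changed


def syntax_source(bb):
    # Assumed legal utterance shape: <word> "to" <word> <rank>.
    return prune(bb, "syntax", lambda h: h[1] == "to" and h[3] in RANKS)


def semantic_source(bb):
    # Assumed domain knowledge: first word is a piece, third names a file.
    return prune(bb, "semantics", lambda h: h[0] in PIECES and h[2] in FILES)


def run(sources):
    """Let sources act whenever they can, until the board stops changing."""
    bb = Blackboard()
    changed = True
    while changed:
        changed = False
        for source in sources:
            if source(bb):
                changed = True
    return bb


if __name__ == "__main__":
    board = run([semantic_source, syntax_source, acoustic_source])
    print(board.hypotheses)
    print(board.log)
```

The sources are deliberately listed in an order where the first two cannot act initially; the control loop simply retries them once the acoustic source has posted something, so no fixed left-to-right pipeline is imposed.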

9 Basic Definitions
• Computational linguistics (CL)
Computational linguistics is a discipline between linguistics and computer science which is concerned with the computational aspects of the human language faculty. It belongs to the cognitive sciences and overlaps with the field of artificial intelligence (AI), a branch of computer science aiming at computational models of human cognition. (Hans Uszkoreit)

10 Basic Definitions
• Natural Language (NL)
The languages that people speak: English, Spanish, Nahuatl, etc.
• Natural Language Processing (NLP)
NLP is concerned with making computers understand natural language.
• Machine translation
Machine translation is concerned with making computers translate automatically from one language into another.

11 Basic Definitions
• Grammar
A grammar of a language is a scheme for specifying the sentences in that language. It indicates the syntactic rules for combining words into well-formed phrases and clauses. The theory of generative grammar [Chomsky, 1957] had a profound effect on linguistic research, including AI work in computational linguistics. (Patrick Doyle)

12 Basic Definitions
• Parsing
Parsing is the "de-linearization" of linguistic input; that is, the use of grammatical rules and other knowledge sources to determine the functions of words in the input sentence. Usually a parser produces a data structure like a derivation tree to represent the structural meaning of a sentence. (Patrick Doyle)
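
As a rough illustration of the last two definitions, the sketch below writes a tiny grammar as rewrite rules and uses a top-down, backtracking parser to turn a flat word sequence into a derivation tree. The grammar, the lexicon, and the function names are assumptions chosen for the example; the approach only works for grammars without left-recursive rules.

```python
# Hypothetical toy grammar: each symbol maps to its possible expansions.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "N"]],
    "VP": [["V", "NP"]],
}
# Hypothetical lexicon: each word has exactly one category here.
LEXICON = {"the": "Det", "a": "Det", "dog": "N", "cat": "N", "chased": "V"}


def parse(symbol, words, start):
    """Yield (tree, next_position) for each way symbol can start at start."""
    if symbol not in GRAMMAR:
        # Word category: it must match the category of the next word.
        if start < len(words) and LEXICON.get(words[start]) == symbol:
            yield (symbol, words[start]), start + 1
        return
    for expansion in GRAMMAR[symbol]:
        for children, end in parse_sequence(expansion, words, start):
            yield (symbol, *children), end


def parse_sequence(symbols, words, start):
    """Yield (list_of_trees, next_position) for a sequence of symbols."""
    if not symbols:
        yield [], start
        return
    for first, middle in parse(symbols[0], words, start):
        for rest, end in parse_sequence(symbols[1:], words, middle):
            yield [first] + rest, end


def derivation_tree(sentence):
    """Return a tree covering the whole sentence, or None if there is none."""
    words = sentence.lower().split()
    for tree, end in parse("S", words, 0):
        if end == len(words):
            return tree
    return None


if __name__ == "__main__":
    print(derivation_tree("the dog chased a cat"))
    print(derivation_tree("dog the chased"))
```

The nested tuples make the "de-linearization" visible: the five words in a row become a structure in which "a cat" is grouped inside the verb phrase.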

19 Knowledge about language
• Phonetics and Phonology
  – The study of linguistic sounds.
• Morphology
  – The study of the meaningful components of words.
• Syntax
  – The study of the structural relationships between words.
• Semantics
  – The study of meaning.
• Pragmatics
  – The study of how language is used to accomplish goals.
• Discourse
  – The study of linguistic units larger than a single sentence.

20 Ambiguity
• We say that a linguistic construction is ambiguous if multiple alternative structures can be built for it.
  – Lexical ambiguity: a word or expression may have more than one meaning.
  – Syntactic ambiguity: a sentence may have more than one syntactic tree.
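
Syntactic ambiguity can be shown by enumerating every tree a grammar assigns to one sentence. The sketch below uses a CKY-style chart over a small binary grammar; the grammar, the lexicon, and the example sentence are assumptions picked for illustration, not material from the slides.

```python
from collections import defaultdict

# Hypothetical binary rules: (parent, left child, right child).
BINARY_RULES = [
    ("S", "NP", "VP"),
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),   # the phrase modifies the verb phrase
    ("NP", "NP", "PP"),   # the phrase modifies the noun phrase
    ("NP", "Det", "N"),
    ("PP", "P", "NP"),
]
# Hypothetical lexicon: word -> possible categories.
LEXICON = {
    "i": ["NP"], "saw": ["V"], "the": ["Det"],
    "man": ["N"], "with": ["P"], "telescope": ["N"],
}


def all_trees(sentence):
    """Return every S tree spanning the sentence (CKY-style chart)."""
    words = sentence.lower().split()
    n = len(words)
    chart = defaultdict(list)  # (start, end, symbol) -> list of trees
    for i, word in enumerate(words):
        for category in LEXICON.get(word, []):
            chart[i, i + 1, category].append((category, word))
    # Build longer spans from pairs of shorter, adjacent spans.
    for width in range(2, n + 1):
        for start in range(n - width + 1):
            end = start + width
            for split in range(start + 1, end):
                for parent, left, right in BINARY_RULES:
                    for left_tree in chart.get((start, split, left), []):
                        for right_tree in chart.get((split, end, right), []):
                            chart[start, end, parent].append(
                                (parent, left_tree, right_tree))
    return chart.get((0, n, "S"), [])


if __name__ == "__main__":
    for tree in all_trees("I saw the man with the telescope"):
        print(tree)
```

Under this grammar the sentence receives two trees: one attaches "with the telescope" to the verb phrase (the seeing was done with a telescope), the other attaches it to "the man" (the man has the telescope).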

21 State of the Art
• For more than 20 years, the Canadian computer program METEO has accepted daily weather data and generated weather reports that are passed along unedited to the public in English and French [Chandioux, 1976].
• The Babel Fish translation system from Systran handles over 1,000,000 translation requests a day from the AltaVista.com search engine site.
• A visitor to Cambridge, Massachusetts, asks a computer about places to eat using only spoken language. The system returns relevant information from a database of facts about the local restaurant scene [Zue et al., 1991].

22 Some Definitions
• Morpheme
A meaningful linguistic unit that contains no smaller meaningful parts. (Patrick Doyle)
• Anaphora
The use of a word to refer to previously mentioned entities, e.g., "The boys and I went over to Frank's, because they needed to talk to him." (Patrick Doyle)

