CSA2050 Introduction to Computational Linguistics Lecture 1 Overview.

1 CSA2050 Introduction to Computational Linguistics Lecture 1 Overview

2 Lecture 1 (Feb 2008 -- MR; CSA2050 - Lecture I: What Is CL?)
  Course Information
  What is CL?
  What is L?
  Course Contents

3 Course Information
  Web: http://www.cs.um.edu.mt/~mros/csa2050
  Lecturers: mike.rosner@um.edu.mt, ray.fabri@um.edu.mt, angelo.dalli@um.edu.mt
  Book (nominally): Jurafsky & Martin, Speech and Language Processing, Prentice Hall 2000, ISBN 0-13-095069-6
  Natural Language Toolkit (NLTK)

4 Human Language Technologies
  Natural Language Processing (NLP)
    Computational models of language analysis, interpretation, and generation
    Syntax/semantics interface
  Natural Language Engineering
    Emphasis on large-scale performance
    Example: Google
  Speech Technology
  Computational Linguistics
    Emphasis on mechanised linguistic theories
    Grew out of early Machine Translation efforts

5 CL: Two Main Disciplines
  COMP SCI
  LINGUISTICS

6 Linguistics
  Phonetics: The study of speech sounds
  Phonology: The study of sound systems
  Morphology: The study of word structure
  Syntax: The study of sentence structure
  Semantics: The study of meaning
  Pragmatics: The study of language use

7 Noam Chomsky
  Noam Chomsky’s work in the 1950s radically changed linguistics, making syntax central.
  Chomsky has been the dominant figure in linguistics ever since.
  Chomsky invented the generative approach to grammar.

8 Generative Grammar: Key Points
  A language is a possibly infinite set of strings.
  Grammar is a finite description of that set.
  Grammar is precisely defined.
  Theory of Grammar is a theory of human linguistic abilities.
  Grammar should generate all and only the strings of the language.
  [source: Sag & Wasow]

9 A Simple Grammar + Lexicon
  grammar:
    S → NP VP
    NP → N
    VP → V NP
  lexicon:
    V → kicks
    N → John
    N → Bill
  tree: [S [NP [N John]] [VP [V kicks] [NP [N Bill]]]]
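The grammar and lexicon on this slide are small enough to enumerate by machine. The following is a minimal sketch, not part of the original lecture: the dictionary layout and the names `GRAMMAR`, `LEXICON`, `expand` and `language` are assumptions made for illustration. It relies on the grammar being non-recursive, so the enumeration terminates.

```python
from itertools import product

# Rules copied from the slide; the dict layout is an assumption of this sketch.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["N"]],
    "VP": [["V", "NP"]],
}
LEXICON = {
    "V": ["kicks"],
    "N": ["John", "Bill"],
}

def expand(symbol):
    """Return every word sequence derivable from the given symbol."""
    if symbol in LEXICON:
        return [[word] for word in LEXICON[symbol]]
    results = []
    for rhs in GRAMMAR[symbol]:
        # Combine every expansion of each symbol on the right-hand side.
        for parts in product(*(expand(s) for s in rhs)):
            results.append([word for part in parts for word in part])
    return results

def language():
    """Return the (finite) set of sentences generated from S, sorted."""
    return sorted(" ".join(words) for words in expand("S"))

if __name__ == "__main__":
    for sentence in language():
        print(sentence)
```

The sketch yields four sentences, including "John kicks Bill", the one shown in the tree. A recursive rule (for example NP → NP PP) would make the language infinite and this naive enumeration would no longer terminate.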

10 Generative Power of a Grammar (G: strings generated by the grammar; L: strings of the language)
  Undergeneration: G generates only, but not all, strings of L
  Overgeneration: G generates all, but not only, strings of L
  Target: G generates all and only the strings of L
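For finite toy languages, the three cases can be checked with plain set comparisons. This is an illustrative sketch only: the function name `classify` and the "neither" outcome (G both misses strings of L and produces strings outside L) are assumptions added here, and real languages may be infinite, in which case this direct comparison does not apply.

```python
def classify(generated, language):
    """Compare the strings a grammar generates (G) with a target language (L)."""
    g, l = set(generated), set(language)
    if g == l:
        return "all and only"
    if g < l:  # proper subset: only, but not all
        return "undergeneration"
    if g > l:  # proper superset: all, but not only
        return "overgeneration"
    return "neither"  # misses some strings of L and adds some outside L

if __name__ == "__main__":
    target = {"John kicks Bill", "Bill kicks John"}
    print(classify({"John kicks Bill"}, target))
    print(classify(target | {"John kicks John"}, target))
    print(classify(target, target))
```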

11 Formal v. Natural Languages
  Formal Languages
    Numbers: 3290 1 1010101
    Logic: ∀x man(x) → mortal(x)
    C: if (i >10) exit(0);
  Natural Languages
    English: John saw the dog
    German: Johann hat den Hund gesehen
    Maltese: Gianni ra kelb

12 Points of Similarity
  A language is considered to be a (possibly infinite) set of sentences.
  Sentences are sequences of tokens.
  Formation rules determine which sequences are valid sentences.
  Sentences have a definite structure.
  Sentence structure is related to meaning.

13 Structure Affects Meaning
  I shot an elephant in my trousers
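One way to make the point concrete is to count the distinct tree structures a grammar assigns to this sentence. The sketch below uses a CKY-style chart over a tiny grammar in binary form. The grammar, the lexicon and the names `BINARY`, `LEXICAL` and `count_parses` are hypothetical and written for this example; they are not taken from the lecture.

```python
from collections import defaultdict

# Hypothetical toy grammar (binary rules only): (parent, left child, right child).
BINARY = [
    ("S", "NP", "VP"),
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),  # the PP modifies the shooting
    ("NP", "Det", "N"),
    ("NP", "NP", "PP"),  # the PP modifies the elephant
    ("PP", "P", "NP"),
]
# Hypothetical lexicon: word -> categories.
LEXICAL = {
    "I": ["NP"], "shot": ["V"], "an": ["Det"], "my": ["Det"],
    "elephant": ["N"], "trousers": ["N"], "in": ["P"],
}

def count_parses(sentence, start="S"):
    """Count the trees rooted in `start` that span the whole sentence."""
    words = sentence.split()
    n = len(words)
    # chart[i, j, cat] = number of trees of category cat over words[i:j]
    chart = defaultdict(int)
    for i, word in enumerate(words):
        for cat in LEXICAL.get(word, []):
            chart[i, i + 1, cat] += 1
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):  # split point between the two children
                for parent, left, right in BINARY:
                    chart[i, j, parent] += chart[i, k, left] * chart[k, j, right]
    return chart[0, n, start]

if __name__ == "__main__":
    print(count_parses("I shot an elephant in my trousers"))
```

Under this toy grammar the sentence receives two structures: one attaches "in my trousers" to the verb phrase, the other attaches it to "an elephant". Each structure corresponds to a different reading.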

14 Points of Difference
  Formal Languages
    The grammar defines the language
    Restricted application
    Non-ambiguous
  Natural Languages
    The language defines the grammar
    Universal application
    Highly ambiguous

15 Ambiguity
  Lexical ambiguity: Iraqi Head Seeks Arms
  Syntactic ambiguity: small animals and children laugh
  Semantic ambiguity: every girl loves a sailor
  Pragmatic ambiguity: can you pass the salt?
  The management of ambiguity is central to the success of CL.

16 Algorithms and Linguistics
  Pure linguistics deals with
    data
    grammar rules
    theories about grammar rules
  Putting knowledge to some use involves processing.
  Linguistic theory is silent about implementation issues.
  Implementation is central to Computational Linguistics.

17 Computational Linguistics – Issues
  Representation of grammar and a lexicon
  How is the structure of a given sentence actually discovered?
  Generation of a sentence to express a particular meaning?
  Learning a language with limited exposure to grammatical sentences?

18 Unimplemented theories can be dangerous
  Representational details omitted.
  Computer memory/complexity issues omitted.
  Nature of individual steps may be unclear.
  Difficult to test.
  Potentially unimplementable.

19 Computational Linguistics: Twin Goals
  Scientific Goal: Contribute to Linguistics by adding a computational dimension.
  Technological Goal: Develop the basis for machinery capable of handling human language that can support “language engineering”.

20 Applications of Computational Linguistics
  Machine Translation
  Information Retrieval/Extraction
  Document Classification
  Question Answering
  Style and Spell Checking
  Dialogue Systems
  Speech

21 The Information Food Chain
  1. Input format
  2. Tokenization
  3. Gross text structure
       paragraphs
       sentences
       words
  4. Morphological analysis
  5. Part of speech tagging
  6. Syntactic analysis
       parsing
       chunking
  7. Semantic analysis
       entities: people, locations, organisations
       anaphora resolution
       relations
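The early stages of this chain (steps 2 and 3) can be sketched with regular expressions alone. The code below is a rough illustration under simplifying assumptions: paragraphs are separated by blank lines, sentences end in ".", "!" or "?", and tokens are runs of word characters or single punctuation marks. The function names are invented for this sketch, and the rules will mis-split abbreviations such as "Dr."; later stages (morphology, tagging, parsing, semantics) are not covered.

```python
import re

def paragraphs(text):
    """Split raw text into paragraphs at blank lines."""
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]

def sentences(paragraph):
    """Split a paragraph after sentence-final punctuation followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", paragraph.strip()) if s]

def words(sentence):
    """Tokenize into word-character runs and single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", sentence)

def structure(text):
    """Return the text as paragraphs -> sentences -> tokens."""
    return [[words(s) for s in sentences(p)] for p in paragraphs(text)]

if __name__ == "__main__":
    sample = "John saw the dog. Bill laughed.\n\nCan you pass the salt?"
    for paragraph in structure(sample):
        print(paragraph)
```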

22 LECTURES
  1 Overview
  2 POS [RF]
  3 Tagging
  4
  5 Chunking
  6
  7 Syntax [RF]
  8 Parsing
  9
  10 Morphology [RF]
  11 Finite State
  12 Finite State
  13 Lexicon
  14 Revision

