CSA2050 Introduction to Computational Linguistics Lecture 1 What is Computational Linguistics?


1 CSA2050 Introduction to Computational Linguistics Lecture 1 What is Computational Linguistics?

Feb 2005 -- MR, CSA2050 - Lecture I: What Is CL?

2 Lecture 1
  Course Information
  What is CL?
  Linguistics
  CS
  Course Contents

3 Course Information
  Web: http://www.cs.um.edu.mt/~mros/csa 2050
  Lecturers: mike.rosner@um.edu.mt, ray.fabri@um.edu.mt
  Book (nominally): Jurafsky & Martin, Speech and Language Processing, Prentice Hall 2000, ISBN 0-13-095069-6

4 CL: Two Main Disciplines
  COMP SCI
  LINGUISTICS

5 Computers and Language
  Computational Linguistics
    Emphasis on mechanised linguistic theories.
    Grew out of early Machine Translation efforts.
  Natural Language Processing
    Computational models of language analysis, interpretation, and generation.
    Syntax/semantics interface
  Language Engineering
    Emphasis on large-scale performance
    Example: Google
  Speech Technology

6 Linguistics
  Phonetics: The study of speech sounds
  Phonology: The study of sound systems
  Morphology: The study of word structure
  Syntax: The study of sentence structure
  Semantics: The study of meaning
  Pragmatics: The study of language use

7 History of Grammar
  Until 50 years ago, most linguistic work concerned sound systems (phonology), word structure (morphology), and the historical relationships among languages.
  Writings on grammar go back at least 3000 years.
  Until 200 years ago, almost all of it was prescriptive.
  Scientific study of sentence grammar is comparatively recent.
  [source: Sag & Wasow]

8 Grammar: the rules of a language
  Prescriptive Grammar
    Subjective
    Rules for and against certain uses
    Proscribed forms that are in current use
    “don’t end a sentence with a preposition”
  Descriptive Grammar
    Objective
    Rules characterizing what people actually say
    Goal is to characterize all and only sentences that belong to the language.

9 Noam Chomsky
  Noam Chomsky’s work in the 1950s radically changed linguistics, making syntax central.
  Chomsky has been the dominant figure in linguistics ever since.
  Chomsky invented the generative approach to grammar.

10 Generative Grammar: What Follows?
  Grammars should be formulated precisely and explicitly.
  Grammar is a theory of linguistic knowledge.
  Mathematical definition of a grammar as a generative device.
  Grammar should generate exactly the strings of the language.
  [source: Sag & Wasow]

11 Generative Power of a Grammar
  [Diagrams relating the strings generated by a grammar G to a language L]
  Undergeneration: G generates only, but not all, strings of L
  Overgeneration: G generates all, but not only, strings of L
  G generates all and only the strings of L
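The three cases on this slide can be read as set comparisons between what a grammar generates and the target language. The following is a minimal sketch, assuming a tiny made-up language; the sets `L`, `under`, `over`, `exact` and the function `classify` are hypothetical names introduced only for illustration.

```python
# Hypothetical toy language L and three candidate grammars, each represented
# directly by the set of strings it generates (not by its rules).
L = {"John kicks Bill", "Bill kicks John"}


def classify(generated, language):
    """Compare the set a grammar generates with the target language."""
    if generated == language:
        return "all and only"
    if generated < language:       # proper subset: only, but not all
        return "undergeneration"
    if generated > language:       # proper superset: all, but not only
        return "overgeneration"
    return "neither"               # misses some strings and adds others


under = {"John kicks Bill"}
over = L | {"kicks John Bill"}
exact = set(L)

for name, g in [("under", under), ("over", over), ("exact", exact)]:
    print(name, "->", classify(g, L))
```

Real languages are usually infinite, so this comparison cannot be carried out by enumeration; the sketch only fixes the terminology.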

12 Theories of Sentence and Word Structure: Rewrite Rules
  Rewrite rules can be used to specify the sentences of a language.
  Rules have the form LHS → RHS
    LHS may be a sequence of symbols
    RHS may be a sequence of symbols or words.
  Lexicon specifies words and their categories

13 A Simple Grammar/Lexicon
  grammar:
    S → NP VP
    NP → N
    VP → V NP
  lexicon:
    V → kicks
    N → John
    N → Bill
  tree for "John kicks Bill":
    [S [NP [N John]] [VP [V kicks] [NP [N Bill]]]]
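As an illustration, the grammar and lexicon on this slide can be written down as plain data and expanded mechanically. This is a sketch only: the dictionary layout and the function name `expand` are assumptions, and the approach works here because the grammar has no recursive rules.

```python
from itertools import product

# Rewrite rules from the slide: each LHS maps to a list of possible RHS sequences.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["N"]],
    "VP": [["V", "NP"]],
}
# Lexicon from the slide: each category maps to the words it can rewrite to.
LEXICON = {
    "V": ["kicks"],
    "N": ["John", "Bill"],
}


def expand(symbol):
    """Return every word sequence derivable from symbol (non-recursive grammars only)."""
    if symbol in LEXICON:
        return [[word] for word in LEXICON[symbol]]
    results = []
    for rhs in GRAMMAR[symbol]:
        # Expand each RHS symbol, then combine the alternatives left to right.
        for parts in product(*(expand(s) for s in rhs)):
            results.append([w for part in parts for w in part])
    return results


sentences = sorted(" ".join(words) for words in expand("S"))
print(sentences)
```

The grammar generates four sentences, including "John kicks John", because nothing in the rules distinguishes the subject noun from the object noun.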

14 Grammar + Lexicon
  Defines language = (possibly infinite) set of sentences.
  But grammar is finite.
  Assigns structures that are generally "closer" to meaning than the sentence itself.
  Grammar/Lexicon = Linguistic knowledge?
  Learnability: grammar is a concrete entity that can be acquired.
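A finite grammar can define an infinite set only if some rule is recursive. The sketch below assumes a hypothetical pair of rules, NP → N and NP → N and NP, which do not appear in the slides, and counts the noun phrases they generate up to a given size; the function name `noun_phrases` is likewise illustrative.

```python
NOUNS = ["John", "Bill"]


def noun_phrases(max_nouns):
    """NPs from the assumed rules NP -> N | N 'and' NP, using at most max_nouns nouns."""
    if max_nouns == 1:
        return list(NOUNS)
    shorter = noun_phrases(max_nouns - 1)
    # Either a bare noun, or a noun conjoined with any shorter NP.
    return list(NOUNS) + [f"{n} and {np}" for n in NOUNS for np in shorter]


counts = [len(noun_phrases(k)) for k in range(1, 5)]
print(counts)
```

Two rules and two nouns already give a count that grows without bound as the size limit is raised, which is the sense in which the language is infinite while the grammar stays finite.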

15 Formal v. Natural Languages
  Formal Languages
    Numbers: 3290 1 1010101
    Logic: ∀x man(x) → mortal(x)
    C: if (i >10) exit(0);
  Natural Languages
    English: John saw the dog
    German: Johann hat den Hund gesehen
    Maltese: Gianni ra kelb

16 Points of Similarity
  A language is considered to be a (possibly infinite) set of sentences.
  Sentences are sequences of words.
  Formation rules determine which sequences are valid sentences.
  Sentences have a definite structure.
  Sentence structure is related to meaning.

17 Points of Difference
  Formal Languages
    The grammar defines the language
    Restricted application
    Non-ambiguous
  Natural Languages
    The language defines the grammar
    Universal application
    Highly ambiguous

18 Ambiguity
  Lexical Ambiguity: the sheep is in the pen
  Syntactic Ambiguity: small animals and children laugh
  Semantic Ambiguity: every girl loves a sailor
  Pragmatic Ambiguity: can you pass the salt?
  The management of ambiguity is central to the success of CL.
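Syntactic ambiguity can be made concrete by counting how many structures a grammar assigns to a sentence. The sketch below is one possible way to do that, assuming a small hypothetical grammar (the rules in `RULES` and the categories in `LEXICON` are illustrative and not taken from the slides). For "small animals and children laugh", "small" may attach to "animals" alone or to "animals and children".

```python
from functools import lru_cache

# Hypothetical rules chosen so that the slide's example has two analyses.
RULES = {
    "S": [("NP", "VP")],
    "NP": [("Adj", "NP"), ("NP", "Conj", "NP"), ("N",)],
    "VP": [("V",)],
}
LEXICON = {"small": "Adj", "animals": "N", "children": "N",
           "and": "Conj", "laugh": "V"}


def count_parses(words, start="S"):
    """Count the distinct parse trees that RULES assigns to the word list."""
    words = tuple(words)

    @lru_cache(maxsize=None)
    def count(symbol, i, j):
        # Number of ways symbol derives words[i:j].
        if symbol not in RULES:  # lexical category: must match exactly one word
            return int(j - i == 1 and LEXICON.get(words[i]) == symbol)
        return sum(count_seq(rhs, i, j) for rhs in RULES[symbol])

    @lru_cache(maxsize=None)
    def count_seq(rhs, i, j):
        # Number of ways the symbol sequence rhs derives words[i:j].
        if len(rhs) == 1:
            return count(rhs[0], i, j)
        # Try every split point that leaves at least one word per remaining symbol.
        return sum(count(rhs[0], i, k) * count_seq(rhs[1:], k, j)
                   for k in range(i + 1, j - len(rhs) + 2))

    return count(start, 0, len(words))


print(count_parses("small animals and children laugh".split()))
```

Every symbol is required to cover at least one word, so the left-recursive rule NP → NP Conj NP always recurses on a strictly shorter span and the computation terminates.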

19 Computer Science
  The study of basic concepts
    Algorithm
    Program
    Information
    Data
  The application of these concepts to practical tasks.
  Implementation of information processing models from other fields.

20 Unimplemented theories can be dangerous
  Representational details omitted.
  Computer memory requirements omitted.
  Nature of individual steps may be unclear.
  Difficult to test.
  Potentially unimplementable.

21 Psychological Memory Model

22 Algorithms and Linguistics
  Does linguistic theory make sense without implementing the concepts?
  Linguistic theory provides linguistic knowledge in the form of
    grammar rules
    theories about grammar rules
  Putting knowledge to some use involves processing issues:
    parsing
    generation

23 Computational Linguistics – Issues
  How are a grammar and a lexicon represented?
  How is the structure of a given sentence actually discovered?
  How can we actually generate a sentence to express a particular meaning?
  How can linguistic theory be made concrete enough to test algorithmically?
  Can an artificial system learn a language with limited exposure to grammatical sentences?

24 Computational Linguistics Twin Goals
  Scientific Goal: Contribute to Linguistics by adding a computational dimension.
  Technological Goal: Develop basis for machinery capable of handling human language that can support “language engineering”.

25 Applications of Computational Linguistics
  Machine Translation
  Information Retrieval/Extraction
  Document Classification
  Question Answering
  Style and Spell Checking
  Integrated Multimodal Tasks

26 Course Contents
  1 (MR) Overview
  2 (RF) Chomsky Hierarchy
  3 (MR) Examples
  4 (RF) Grammatical Categories
  5, 6 (MR) Tagging
  7 (RF) Morphology
  8, 9, 10 (MR) Comp Morphology
  11 (RF) Syntax
  12, 13, 14 (MR) Grammar Formalism

27 Computational Linguistics – Tools & Resources
  Grammar Formalisms, e.g. Definite Clause Grammars
  Parsing Algorithms: sentence → structure
  Generation Algorithms: structure → sentence
  Statistical Methods
  Linguistic Corpora
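To make the "sentence → structure" direction concrete, here is a minimal top-down parser for the simple grammar and lexicon from slide 13. It is a sketch under stated assumptions, not a method prescribed by the course: the function names `parse`, `parse_sequence` and `parse_sentence`, and the nested-tuple tree format, are all illustrative choices.

```python
# Grammar and lexicon from slide 13, written as plain data.
GRAMMAR = {"S": [["NP", "VP"]], "NP": [["N"]], "VP": [["V", "NP"]]}
LEXICON = {"kicks": "V", "John": "N", "Bill": "N"}


def parse(symbol, words, pos):
    """Yield (tree, next_pos) for each way symbol matches words starting at pos."""
    if symbol not in GRAMMAR:  # lexical category: consume one matching word
        if pos < len(words) and LEXICON.get(words[pos]) == symbol:
            yield (symbol, words[pos]), pos + 1
        return
    for rhs in GRAMMAR[symbol]:
        yield from parse_sequence(symbol, rhs, words, pos, [])


def parse_sequence(parent, rhs, words, pos, children):
    """Match the RHS symbols one after another, collecting their subtrees."""
    if not rhs:
        yield (parent, *children), pos
        return
    for tree, nxt in parse(rhs[0], words, pos):
        yield from parse_sequence(parent, rhs[1:], words, nxt, children + [tree])


def parse_sentence(sentence):
    """Return every complete parse tree for the sentence (empty list if none)."""
    words = sentence.split()
    return [tree for tree, end in parse("S", words, 0) if end == len(words)]


print(parse_sentence("John kicks Bill"))
print(parse_sentence("kicks John Bill"))
```

A plain top-down strategy like this one would loop forever on left-recursive rules; it is adequate here only because the slide's grammar has none.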

