Introduction to Computational Linguistics Dr. Radhika Mamidi ENG 270 Lecture 2.


1 Introduction to Computational Linguistics Dr. Radhika Mamidi ENG 270 Lecture 2

2 CL vs NLP
CL and NLP are related, but their focus differs.
Computational Linguistics aims to model language as people do.
Natural Language Processing processes language from a computational point of view in order to build different applications and tools.
Applications on the computer side

3 History: 1940s-1950s
Development of formal language theory (Chomsky, Kleene, Backus)
– Formal characterization of classes of grammar (context-free, regular)
– Association with relevant automata
Probability theory: language understanding as decoding through a noisy channel (Shannon)
– Use of information-theoretic concepts like entropy to measure the success of language models

4 1957-1983: Symbolic vs. Stochastic
Symbolic
– Use of formal grammars as a basis for natural language processing and learning systems (Chomsky, Harris)
– Use of logic and logic-based programming for characterizing syntactic or semantic inference (Kaplan, Kay, Pereira)
– First toy natural language understanding and generation systems (Woods, Minsky, Schank, Winograd)
– Discourse processing: role of intention, focus (Grosz, Sidner, Hobbs)
Stochastic Modeling
– Probabilistic methods for early speech recognition, OCR (Bledsoe and Browning, Jelinek, Black, Mercer)

5 1983-1993: Return of Empiricism
Use of stochastic techniques for part-of-speech tagging, parsing, word sense disambiguation, etc.
Comparison of stochastic, symbolic and other models for language understanding and learning tasks.

6 1993-Present
Advances in software and hardware create NLP needs for information retrieval (web), machine translation, spelling and grammar checking, speech recognition and synthesis.

7 Language and Intelligence: Turing Test
Turing test: machine, human, and human judge
Judge asks questions of computer and human.
-- Machine’s job is to act like a human
-- Human’s job is to convince the judge that he’s not the machine
Machine is judged “intelligent” if it can fool the judge.
Judgment of “intelligence” is linked to appropriate answers to questions from the system.

8 ELIZA
A simple “Rogerian Psychologist”
Uses pattern matching to carry on a limited form of conversation.
It gives a feeling that it is “human”
Seems to pass the “Turing Test”
It is one of the first chatbots.
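The pattern matching mentioned on this slide can be sketched roughly as follows. This is a minimal illustration only: the rules, the `respond` function and the default reply are hypothetical and are not taken from ELIZA's original script.

```python
import re

# Hypothetical rules for illustration; these are NOT ELIZA's original script.
# Each rule pairs a pattern with a reply template that reuses the matched text.
RULES = [
    (re.compile(r"\bI am (.+)", re.IGNORECASE), "Why do you say you are {0}?"),
    (re.compile(r"\bI feel (.+)", re.IGNORECASE), "How long have you felt {0}?"),
    (re.compile(r"\bmy (mother|father|family)\b", re.IGNORECASE),
     "Tell me more about your {0}."),
]
DEFAULT_REPLY = "Please go on."


def respond(utterance: str) -> str:
    """Return the reply of the first rule whose pattern matches the input."""
    text = utterance.strip().rstrip(".!?")
    for pattern, template in RULES:
        match = pattern.search(text)
        if match:
            return template.format(*match.groups())
    # No rule matched: fall back to a content-free prompt.
    return DEFAULT_REPLY


if __name__ == "__main__":
    for line in ["I am sad.", "My mother hates me", "Hello"]:
        print(line, "->", respond(line))
```

The program has no model of meaning; it only reflects fragments of the input back, which is why the conversation stays limited.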

9 Ambiguity - Mental processing
He showed me the mouse - rodent/object
The leopard was spotted - verb/adjective
She hit the boy with the umbrella
I am reading a book on films - now-a-days/right now
Mary promised Sally (i) to go to her (i) party
Mary (i) persuaded Sally to go to her (i) party

10 What’s involved in an “intelligent” Answer?
Analysis: Decomposition of the signal (spoken or written) eventually into meaningful units. This involves …
Phonology
Morphology
Syntax
Discourse Analysis
Semantics
Pragmatics

11 Levels of Language Processing
Phonology
Morphology
Syntax
Semantics
Pragmatics
Discourse Analysis

12 Examples
Pronounce “GHOTI”
I scream, A nameless man
change, kite, park, fine
Fine for parking!
Flying planes can be dangerous.
If the baby doesn’t thrive on raw milk, boil it!
How was it?

13 Speech/Character Recognition
Decomposition into words, segmentation of words into appropriate phones or letters
Requires knowledge of phonological patterns

14 Applications
Text to speech
Riyadh is the capital city of the Kingdom of Saudi Arabia. Riyadh is a beautiful place. I love living here.
http://tcts.fpms.ac.be/synthesis/mbrola/
Use: Public announcements – airport, railway stations
Speech Recognition
Use: Pronunciation dictionaries, mobile phones, voice commands on a PC

15 Some problems
Grapheme to Phoneme conversion
Different spellings – same pronunciation
Same spellings – different pronunciation
Example:
read, bow, dove
reed-read, bear-bare
Numbers, Names, Acronyms
1980, St., PSU
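The two spelling/pronunciation mismatches listed on this slide can be made concrete with a toy lexicon. This is a sketch only: the `LEXICON` entries use informal respellings invented for illustration (not a standard phonetic alphabet), and the function names are hypothetical.

```python
from collections import defaultdict

# Toy pronunciation lexicon using informal respellings. These respellings
# are an assumption for illustration, not a standard phonetic alphabet.
LEXICON = {
    "read": ["reed", "red"],
    "reed": ["reed"],
    "bow": ["boh", "bau"],
    "dove": ["duv", "dohv"],
    "bear": ["bair"],
    "bare": ["bair"],
}


def homographs(lexicon):
    """Same spelling, different pronunciation: words with several entries."""
    return sorted(word for word, prons in lexicon.items() if len(prons) > 1)


def homophones(lexicon):
    """Different spellings, same pronunciation: group words by pronunciation."""
    by_pron = defaultdict(set)
    for word, prons in lexicon.items():
        for pron in prons:
            by_pron[pron].add(word)
    return sorted(sorted(words) for words in by_pron.values() if len(words) > 1)


if __name__ == "__main__":
    print(homographs(LEXICON))
    print(homophones(LEXICON))
```

A lookup table alone cannot pick the right pronunciation for a word like "read"; that choice needs context from the surrounding sentence.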

16 Heterarchical model of Language Processing
[Diagram]
Memory: General Knowledge, Lexicon, Syntactic Rules, Semantic Rules, Discourse Rules
Processing: INPUTS, Lexical Processing, Syntactic Processing, Semantic Processing, Discourse Processing, OUTPUTS

17 Morphological Analysis
Inflectional morphology
: word variation reflects features like tense, number, degree, gender
: grammatical category remains the same
e.g. eat-eats, boy-boys, thin-thinner
Derivational morphology
: word variation changes grammatical category
e.g. act-actor, boy-boyish
: word variation maintains grammatical category
e.g. fair-unfair, like-dislike
Inflection follows Derivation: act--actor--actors
Morphological analyzer
identifies roots and affixes
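A morphological analyzer of the kind described on this slide can be sketched as simple affix stripping over a small root list. The root set, the affix lists and the `analyze` function are hypothetical, chosen only to cover the slide's examples; a real analyzer would need far richer rules.

```python
# Toy data covering only the slide's examples (an assumption for illustration).
ROOTS = {"eat", "boy", "thin", "act", "fair", "like"}
PREFIXES = ["un", "dis"]          # derivational prefixes
SUFFIXES = ["s", "er", "or", "ish"]  # inflectional and derivational suffixes


def analyze(word):
    """Return (prefixes, root, suffixes), or None if no analysis is found."""
    if word in ROOTS:
        return ([], word, [])
    for prefix in PREFIXES:
        if word.startswith(prefix):
            rest = analyze(word[len(prefix):])
            if rest:
                return ([prefix] + rest[0], rest[1], rest[2])
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) > len(suffix):
            stem = word[:-len(suffix)]
            candidates = [stem]
            # Undo consonant doubling, e.g. "thinn" + "er" -> "thin".
            if len(stem) >= 2 and stem[-1] == stem[-2]:
                candidates.append(stem[:-1])
            for candidate in candidates:
                rest = analyze(candidate)
                if rest:
                    return (rest[0], rest[1], rest[2] + [suffix])
    return None


if __name__ == "__main__":
    for w in ["eats", "thinner", "boyish", "unfair", "actors"]:
        print(w, "->", analyze(w))
```

For "actors" the suffix list comes out as "or" followed by "s", which matches the slide's point that inflection follows derivation.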

18 Syntactic Parsing
Process of identifying syntactic structure of a valid sentence
Represented by trees, rules and networks
Syntax Components
Phrase Structure Rules
Transformational Rules
Syntactic Parsers
e.g. Augmented Transition Networks

19 Syntax Component
Chomsky’s (1965) model of language
Phrase Structure rules generate deep structures
Deep Structure holds all the syntactic information needed to derive the meaning of a sentence
This is fed into the semantic component to obtain acceptable combinations
Transformational rules map deep structures to surface structure
Surface Structure has words in the right order
This is obtained after feeding surface structure into the phonological component

20 Chomsky’s model
[Diagram]
SYNTAX COMPONENT: Phrase Structure Rules, Deep structures, Transformational rules, Surface structures
SEMANTIC COMPONENT: Lexicon, Selection restriction rules
PHONOLOGICAL COMPONENT: Phonological rules

21 Augmented Transition Networks
Developed by Woods (1970)
Series of states with arrows (arcs) linking one state to the next
Works through a sentence from left to right
The arcs are labelled
Group of words stored temporarily in ‘register’
helps in look ahead - which arc to take next

22 [Diagram: three transition networks, each with states s1, s2, s3]
S: NP arc, then VP arc
NP: article arc (or an Empty arc), Adj loop, then noun arc
VP: verb arc, then NP arc
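The S, NP and VP networks on this slide can be sketched as a small recognizer. Several things here are assumptions: the Empty arc is taken to skip the article (s1 to s2) and the Adj loop is placed on s2, since the diagram itself is not recoverable; the word list is hypothetical; and the registers of a full ATN are left out, so this is a plain recursive transition network rather than an augmented one.

```python
# Each network maps a state to its outgoing arcs: (label, target state).
# A label of None is an empty arc; a label naming a network is a sub-network call.
# Arc placement for "Empty" and the "Adj loop" is an assumption (see lead-in).
NETWORKS = {
    "S": {"s1": [("NP", "s2")], "s2": [("VP", "s3")]},
    "NP": {
        "s1": [("article", "s2"), (None, "s2")],
        "s2": [("adjective", "s2"), ("noun", "s3")],
    },
    "VP": {"s1": [("verb", "s2")], "s2": [("NP", "s3")]},
}
FINAL = "s3"

# Hypothetical word list covering the lecture's example sentence.
LEXICON = {
    "riyadh": "noun", "place": "noun", "is": "verb",
    "a": "article", "beautiful": "adjective",
}


def traverse(network, state, words, pos):
    """Yield every input position at which `network` can reach its final state."""
    if state == FINAL:
        yield pos
        return
    for label, target in NETWORKS[network][state]:
        if label is None:                       # empty arc: consume nothing
            yield from traverse(network, target, words, pos)
        elif label in NETWORKS:                 # call a sub-network
            for end in traverse(label, "s1", words, pos):
                yield from traverse(network, target, words, end)
        elif pos < len(words) and LEXICON.get(words[pos]) == label:
            yield from traverse(network, target, words, pos + 1)


def accepts(sentence):
    """True if the S network consumes the whole sentence, left to right."""
    words = sentence.lower().rstrip(".").split()
    return any(end == len(words) for end in traverse("S", "s1", words, 0))


if __name__ == "__main__":
    print(accepts("Riyadh is a beautiful place."))
```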

23 [Parse tree]
[S [NP [N Riyadh]] [VP [V is] [NP [art a] [Adj beautiful] [Noun place]]]]

24 Example of syntactic analysis by ‘Link parser’.
Riyadh is a beautiful place.
(S (NP Riyadh) (VP is (NP a beautiful place)) .)
http://www.link.cs.cmu.edu/link/submit-sentence-4.html
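Bracketed output of this kind can be read back into a tree with a few lines of code. The sketch below assumes well-formed brackets; `read_tree` and `leaves` are hypothetical helper names and are not part of the Link parser.

```python
def read_tree(text):
    """Parse a bracketed string into nested (label, children) tuples.

    Assumes the brackets are well formed; children are either words
    (plain strings) or sub-trees.
    """
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()

    def parse(i):
        # tokens[i] is "(" and tokens[i + 1] is the constituent label.
        label = tokens[i + 1]
        children = []
        i += 2
        while tokens[i] != ")":
            if tokens[i] == "(":
                child, i = parse(i)
                children.append(child)
            else:
                children.append(tokens[i])
                i += 1
        return (label, children), i + 1

    tree, _ = parse(0)
    return tree


def leaves(tree):
    """Return the words of the tree in left-to-right order."""
    _, children = tree
    words = []
    for child in children:
        if isinstance(child, tuple):
            words.extend(leaves(child))
        else:
            words.append(child)
    return words


if __name__ == "__main__":
    tree = read_tree("(S (NP Riyadh) (VP is (NP a beautiful place)) .)")
    print(tree)
    print(leaves(tree))
```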

