CS621/CS449 Artificial Intelligence Lecture Notes

CS621/CS449 Artificial Intelligence Lecture Notes
Set 8: 27/10/2004

Outline
- Probabilistic Spell Checker (continued from the Noisy Channel Model)
- Confusion Matrix

Probabilistic Spell Checker: Noisy Channel Model

The problem formulation for the spell checker is based on the noisy channel model. The correct word w = (w_n, w_{n-1}, ..., w_1) passes through a noisy channel, which emits the wrongly spelt word t = (t_m, t_{m-1}, ..., t_1):

    correct word w --> [noisy channel] --> wrongly spelt word t

Given t, find the most probable w: the guess at the correct word is that ŵ for which P(w|t) is maximum, where t, w, and ŵ are strings of letters.
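The decoding step can be written as a one-line search. A minimal Python sketch (not from the original notes; the vocabulary and the scoring function standing in for P(w|t) are hypothetical placeholders, filled in by the following slides):

```python
def decode(t, vocabulary, score):
    """Return the w-hat in the vocabulary maximizing score(w, t) ~ P(w|t)."""
    return max(vocabulary, key=lambda w: score(w, t))
```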

Probabilistic Spell Checker: Applying Bayes' Rule

Applying Bayes' rule:

    ŵ = argmax_w P(w|t) = argmax_w P(t|w) P(w) / P(t) = argmax_w P(t|w) P(w)

since P(t) is the same for every candidate w.

Why apply Bayes' rule, i.e., why work with P(t|w) instead of P(w|t)? Either one would have to be estimated by counting c(w,t) or c(t,w) in a training corpus and then normalizing, but direct word-pair counts are far too sparse (see the example at the end of these notes); P(t|w) can instead be reduced to letter-level error counts.

Assumptions:
- t is obtained from w by a single error of one of the four types below (substitution, insertion, deletion, transposition).
- The words consist only of letters of the alphabet.
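A sketch of the factorized argmax, assuming a toy corpus for the prior P(w); the error model P(t|w) is passed in and is defined by the confusion matrices on the next slides:

```python
from collections import Counter

corpus = "an apple a day keeps the maple away".split()  # toy corpus (assumption)
unigram = Counter(corpus)
total = sum(unigram.values())

def correct(t, candidates, p_t_given_w):
    # P(t) is the same for every candidate w, so it drops out of the argmax;
    # the prior P(w) is the word's relative frequency in the corpus.
    return max(candidates, key=lambda w: p_t_given_w(t, w) * unigram[w] / total)
```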

Confusion Matrix

A confusion matrix is a 26x26 data structure that stores the error counts c(x,y) for pairs of letters. Separate matrices are kept for insertion, deletion, substitution, and transposition.

Substitution: the number of instances in the training corpus in which x is wrongly substituted by y (denoted sub(x,y)).

Confusion Matrix (continued)

Insertion: the number of times a letter y is wrongly inserted after x (denoted ins(x,y)).
Transposition: the number of times xy is wrongly transposed to yx (denoted trans(x,y)).
Deletion: the number of times y is wrongly deleted after x (denoted del(x,y)).

Confusion Matrix: Summary

If x and y are letters:
sub(x,y) = # times y is written for x (substitution)
ins(x,y) = # times x is written as xy (insertion)
del(x,y) = # times xy is written as x (deletion)
trans(x,y) = # times xy is written as yx (transposition)
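A sketch of how the four matrices might be filled from (correct, typo) training pairs. The single-error alignment below and the '#' word-start marker are illustrative assumptions; the notes only state that t differs from w by exactly one error and describe 26x26 matrices:

```python
from collections import defaultdict

def classify_edit(w, t):
    """Return (kind, x, y) for the single edit turning correct w into typo t."""
    if len(w) == len(t):
        diffs = [i for i in range(len(w)) if w[i] != t[i]]
        if len(diffs) == 1:                  # substitution: y written for x
            return "sub", w[diffs[0]], t[diffs[0]]
        i, j = diffs                         # transposition: xy written as yx
        return "trans", w[i], w[j]
    if len(t) == len(w) + 1:                 # insertion: y inserted after x
        i = next((k for k in range(len(w)) if w[k] != t[k]), len(w))
        return "ins", (t[i - 1] if i else "#"), t[i]
    # remaining case: len(t) == len(w) - 1 -- deletion: y deleted after x
    i = next((k for k in range(len(t)) if w[k] != t[k]), len(t))
    return "del", (w[i - 1] if i else "#"), w[i]

# One count table per error type, indexed by the letter pair (x, y).
matrices = {k: defaultdict(int) for k in ("sub", "ins", "del", "trans")}

training_pairs = [("apple", "aple"), ("maple", "mapel")]  # toy (correct, typo) data
for w, t in training_pairs:
    kind, x, y = classify_edit(w, t)
    matrices[kind][(x, y)] += 1              # e.g. del[('p','p')] for "aple"
```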

Probabilities

    P(t|w) = P(t|w)_S + P(t|w)_I + P(t|w)_D + P(t|w)_X

where
    P(t|w)_S = sub(x,y) / count of x
    P(t|w)_I = ins(x,y) / count of x
    P(t|w)_D = del(x,y) / count of x
    P(t|w)_X = trans(x,y) / count of x

The four error types are considered mutually exclusive events; under the single-error assumption, only the term corresponding to the error that actually occurred is non-zero for a given (w,t) pair.
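Putting the matrices to work, a sketch of the channel probability under the slide's simplification that every term is normalized by the count of x. It reuses classify_edit and matrices from the previous sketch; letter_count (the corpus frequency of each letter) is an assumed input:

```python
def p_t_given_w(t, w, letter_count):
    """P(t|w) per the slide: count of the observed error / count of x."""
    kind, x, y = classify_edit(w, t)   # single error, so exactly one term applies
    return matrices[kind][(x, y)] / letter_count[x]
```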

Example

The correct document contains the ws; the wrongly typed document contains the ts. Estimating P(w|t) directly:

    P(maple|aple) = # (maple was wanted instead of aple) / # (aple)

P(apple|aple) and P(applet|aple) are calculated similarly. This direct estimate leads to problems due to data sparsity: most specific (word, typo) pairs never occur in any training corpus. Hence, use Bayes' rule.
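For contrast, a toy sketch of the direct estimate on this slide, which makes the sparsity problem concrete; the paired corpus is invented for illustration:

```python
from collections import Counter

pairs = [("maple", "aple"), ("apple", "aple"), ("apple", "aple")]  # (wanted, typed)
typo_count = Counter(t for _, t in pairs)
pair_count = Counter(pairs)

def p_w_given_t(w, t):
    # Needs a count for every specific (wanted, typed) pair, which is why
    # this estimate is sparse in practice.
    return pair_count[(w, t)] / typo_count[t]

print(p_w_given_t("maple", "aple"))  # 1/3
print(p_w_given_t("apple", "aple"))  # 2/3
```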