CS621/CS449 Artificial Intelligence Lecture Notes Set 8 : 27/10/2004
Outline
- Probabilistic Spell Checker (continued from the Noisy Channel Model)
- Confusion Matrix
Probabilistic Spell Checker: Noisy Channel Model
The problem formulation for the spell checker is based on the Noisy Channel Model: the correct word w = (w_n, w_{n-1}, ..., w_1) passes through a noisy channel and comes out as the wrongly spelt word t = (t_m, t_{m-1}, ..., t_1).
[Diagram: correct word w --> Noisy Channel --> wrongly spelt word t; ŵ is the guess at the correct word]
Given t, find the most probable w: find that ŵ for which P(w|t) is maximum, where t, w and ŵ are strings:
ŵ = argmax_w P(w|t)
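A minimal sketch of this decoding step in Python, assuming hypothetical candidates(t) and posterior(w, t) functions for candidate generation and the model of P(w|t) (neither is specified in the notes):

# Minimal sketch of noisy-channel decoding for spelling correction.
# candidates(t) and posterior(w, t) are hypothetical placeholders:
# a generator of candidate corrections and a model of P(w|t).

def correct(t, candidates, posterior):
    """Return the w-hat that maximizes P(w|t) over candidate corrections of t."""
    return max(candidates(t), key=lambda w: posterior(w, t))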
Probabilistic Spell Checker
Applying Bayes rule,
ŵ = argmax_w P(w|t) = argmax_w P(t|w) P(w)
(P(t) is the same for every candidate w, so it can be dropped from the argmax).
Why apply Bayes rule? Finding P(w|t) vs P(t|w): either probability has to be computed by counting c(w,t) or c(t,w) and then normalizing. Direct counts of whole-word pairs are sparse (see the example at the end), whereas P(t|w) can be built from letter-level confusion counts (next slides).
Assumptions:
- t is obtained from w by a single error of one of the types below (substitution, insertion, deletion, transposition).
- The words consist only of letters of the alphabet.
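A hedged sketch of the Bayes-rule scoring, assuming hypothetical channel_prob (P(t|w)) and prior (P(w)) functions, e.g. built from the confusion matrices and a word-frequency list (these inputs are assumptions, not defined in the notes):

# Sketch of the Bayes-rule decomposition w-hat = argmax_w P(t|w) * P(w).
# candidates, channel_prob (P(t|w)) and prior (P(w)) are assumed inputs.

def correct_bayes(t, candidates, channel_prob, prior):
    return max(candidates(t), key=lambda w: channel_prob(t, w) * prior(w))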
Confusion Matrix
A confusion matrix is a 26x26 data structure that stores c(x,y).
There are different matrices for insertion, deletion, substitution and transposition.
Substitution: the number of instances in which x is wrongly substituted by y in the training corpus (denoted sub(x,y)).
Confusion Matrix (contd.)
Insertion: the number of times a letter y is wrongly inserted after x (denoted ins(x,y)).
Transposition: the number of times xy is wrongly transposed to yx (denoted trans(x,y)).
Deletion: the number of times y is wrongly deleted after x (denoted del(x,y)).
Confusion Matrix (summary)
If x and y are letters of the alphabet,
sub(x,y) = # times y is written for x (substitution)
ins(x,y) = # times x is written as xy (insertion)
del(x,y) = # times xy is written as x (deletion)
trans(x,y) = # times xy is written as yx (transposition)
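A small counting sketch under the single-error assumption; the (correct, typed) training pairs and the '#' word-start marker are assumptions of the sketch, not part of the notes:

from collections import defaultdict

# Sketch: the four confusion matrices as count tables, filled from
# (correct, typed) training pairs that differ by exactly one error
# (the single-error assumption). '#' marks the start of a word.

sub_m, ins_m, del_m, trans_m = (defaultdict(int) for _ in range(4))

def record_error(w, t):
    """Classify the single error turning the correct word w into the typed word t."""
    if len(t) == len(w):                                  # substitution or transposition
        i = next(i for i in range(len(w)) if w[i] != t[i])
        if i + 1 < len(w) and w[i] == t[i + 1] and w[i + 1] == t[i]:
            trans_m[(w[i], w[i + 1])] += 1                # xy written as yx
        else:
            sub_m[(w[i], t[i])] += 1                      # y written for x
    elif len(t) == len(w) + 1:                            # insertion: x written as xy
        i = next((i for i in range(len(w)) if w[i] != t[i]), len(w))
        ins_m[(t[i - 1] if i > 0 else '#', t[i])] += 1    # y inserted after x
    elif len(t) == len(w) - 1:                            # deletion: xy written as x
        i = next((i for i in range(len(t)) if w[i] != t[i]), len(t))
        del_m[(w[i - 1] if i > 0 else '#', w[i])] += 1    # y deleted after x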
Probabilities
P(t|w) = P(t|w)_S + P(t|w)_I + P(t|w)_D + P(t|w)_X
where
P(t|w)_S = sub(x,y) / count of x   (substitution)
P(t|w)_I = ins(x,y) / count of x   (insertion)
P(t|w)_D = del(x,y) / count of x   (deletion)
P(t|w)_X = trans(x,y) / count of x (transposition)
The four error types are considered to be mutually exclusive events.
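A sketch of turning the counts from the earlier sketch into P(t|w), normalizing by the count of x as in the formulas above; letter_count is a hypothetical dictionary of how often each letter occurs in the training corpus:

# Sketch of P(t|w) under the single-error assumption, reusing the count
# tables sub_m, ins_m, del_m, trans_m from the earlier sketch.
# error_type is one of 'sub', 'ins', 'del', 'trans'; letter_count[x] is
# assumed to hold the number of occurrences of letter x in the corpus.

def channel_prob(error_type, x, y, letter_count):
    matrix = {'sub': sub_m, 'ins': ins_m, 'del': del_m, 'trans': trans_m}[error_type]
    return matrix[(x, y)] / letter_count[x]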
Example
The correct document contains the w's; the wrong (typed) document contains the t's.
P(maple | aple) = #(maple was wanted instead of aple) / #(aple)
P(apple | aple) and P(applet | aple) are calculated similarly.
Estimating P(w|t) directly in this way leads to problems due to data sparsity. Hence, use Bayes rule.
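For contrast, a sketch of this direct estimate of P(w|t) by counting; the parallel list of (typed, intended) pairs is a hypothetical input, and most (t, w) pairs are never observed, which is the sparsity problem that motivates the Bayes-rule route:

from collections import Counter

# Sketch of the direct estimate P(w|t) = #(w wanted instead of t) / #(t),
# assuming a hypothetical list of (typed, intended) pairs.

def direct_estimate(pairs):
    pair_counts = Counter(pairs)                    # counts of (t, w) pairs
    typo_counts = Counter(t for t, _ in pairs)      # counts of each typed form t
    def p(w, t):
        return pair_counts[(t, w)] / typo_counts[t] if typo_counts[t] else 0.0
    return p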