Smoothing: Problem with MLE

Smoothing: Problem with MLE

Training data:

Beijing  Shanghai  …  MengCheng  CLASS = ‘china’
1        1         …  0          +
1        0         …  0          +
0        1         …  0          +
0        0         …  0          -

(MengCheng: a small city in the Anhui province)

Test document:

Beijing  Shanghai  …  MengCheng  CLASS = ‘china’
1        1         …  1          ?

Common reasons: data sparseness, rare features, …
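To see why the test document is a problem, the unsmoothed (MLE) naive Bayes score can be computed by hand. The sketch below is only an illustration: it keeps the three named columns and leaves out the elided "…" columns, and the names `TRAIN` and `mle_score` are made up for this example. MengCheng never occurs in a positive training row, so its MLE probability is 0 and the whole product collapses to 0.

```python
from fractions import Fraction

# Toy rows from the slide, keeping only the three named columns
# (Beijing, Shanghai, MengCheng). Dropping the elided "…" columns
# is an assumption of this sketch.
TRAIN = [
    ((1, 1, 0), "+"),
    ((1, 0, 0), "+"),
    ((0, 1, 0), "+"),
    ((0, 0, 0), "-"),
]


def mle_score(x, label):
    """Unsmoothed naive Bayes score: P(class) * prod_i P(x_i | class)."""
    rows = [feats for feats, y in TRAIN if y == label]
    score = Fraction(len(rows), len(TRAIN))  # MLE class prior
    for i, value in enumerate(x):
        matches = sum(1 for feats in rows if feats[i] == value)
        score *= Fraction(matches, len(rows))  # MLE estimate, may be 0
    return score


test_doc = (1, 1, 1)  # Beijing = 1, Shanghai = 1, MengCheng = 1
scores = {label: mle_score(test_doc, label) for label in ("+", "-")}
print(scores)
```

Both classes end up with a score of 0 here, so the classifier has no basis for a decision: one unseen feature value wipes out all the other evidence.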

Smoothing

Add-one smoothing (Laplace smoothing)
– Essentially, every possible value of a variable has a non-zero count in every class.

Beijing  Shanghai  …  MengCheng  CLASS = ‘china’
1        1         …  0          +

B = # of possible values for the variable in question.
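One common way to write add-one smoothing, given here as a sketch consistent with the definition of B above. The notation is an assumption of this note: count(x = v, c) is the number of times variable x takes value v in the training data of class c, and the sum in the denominator runs over all B possible values of x.

```latex
\hat{P}(x = v \mid c)
  = \frac{\mathrm{count}(x = v,\, c) + 1}
         {\sum_{v'} \mathrm{count}(x = v',\, c) + B}
```

Adding 1 to each of the B possible values adds B to the denominator, so the smoothed estimates still sum to 1 over the values of x, and none of them is 0.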

Bernoulli Training

Training documents:

Chinese Beijing Chinese     +
Chinese Chinese Shanghai    +
Chinese Macao               +
Tokyo Japan Chinese         -

Chinese  Beijing  Shanghai  Macao  Tokyo  Japan  CLASS
1        1        0         0      0      0      +
1        0        1         0      0      0      +
1        0        0         1      0      0      +
1        0        0         0      1      1      -

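Bernoulli Training

Chinese  Beijing  Shanghai  Macao  Tokyo  Japan  CLASS
1        1        0         0      0      0      +
1        0        1         0      0      0      +
1        0        0         1      0      0      +
1        0        0         0      1      1      -

Document counts per class (* n = number of training documents in the class):

Chinese  Beijing  Shanghai  Macao  Tokyo  Japan  CLASS
3        1        1         1      0      0      + * 3
1        0        0         0      1      1      - * 1

B = # of possible values for the variable in question = 2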
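The Bernoulli training step can be sketched in a few lines of Python. This is an illustration rather than the presenter's code: `DOCS`, `VOCAB` and `train_bernoulli` are hypothetical names, and the estimate used is the usual add-one form (documents in the class containing the term + 1) / (documents in the class + B) with B = 2.

```python
from fractions import Fraction

VOCAB = ["Chinese", "Beijing", "Shanghai", "Macao", "Tokyo", "Japan"]
DOCS = [
    ("Chinese Beijing Chinese", "+"),
    ("Chinese Chinese Shanghai", "+"),
    ("Chinese Macao", "+"),
    ("Tokyo Japan Chinese", "-"),
]
B = 2  # each binary variable has two possible values: 0 or 1


def train_bernoulli(docs):
    """Return {class: {term: smoothed P(term present | class)}}."""
    params = {}
    for label in sorted({y for _, y in docs}):
        # Bernoulli model: only presence/absence matters, so use sets.
        class_docs = [set(text.split()) for text, y in docs if y == label]
        n_docs = len(class_docs)
        params[label] = {
            term: Fraction(sum(term in d for d in class_docs) + 1, n_docs + B)
            for term in VOCAB
        }
    return params


params = train_bernoulli(DOCS)
for label, probs in params.items():
    print(label, {term: str(p) for term, p in probs.items()})
```

For the positive class this gives 4/5 for Chinese, 2/5 for Beijing, Shanghai and Macao, and 1/5 for Tokyo and Japan, all non-zero even though Tokyo and Japan never occur in a positive document.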

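Bernoulli Testing

Test document:

Chinese Chinese Chinese Tokyo Japan    ?

Document counts per class:

Chinese  Beijing  Shanghai  Macao  Tokyo  Japan  CLASS
3        1        1         1      0      0      + * 3
1        0        0         0      1      1      - * 1

Test document as a binary vector:

Chinese  Beijing  Shanghai  Macao  Tokyo  Japan  CLASS
1        0        0         0      1      1      ?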
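A minimal sketch of how the test vector could be scored under the Bernoulli model, using the counts in the table above with add-one smoothing (B = 2). Two things here are assumptions rather than something the slide states: the class priors are taken as the MLE values 3/4 and 1/4, and absent terms contribute a factor of 1 - P(term | class). The names `DOC_COUNTS` and `bernoulli_score` are illustrative.

```python
from fractions import Fraction

VOCAB = ["Chinese", "Beijing", "Shanghai", "Macao", "Tokyo", "Japan"]
# Per class: (number of documents containing each term, documents in class).
DOC_COUNTS = {
    "+": ([3, 1, 1, 1, 0, 0], 3),
    "-": ([1, 0, 0, 0, 1, 1], 1),
}
B = 2           # possible values of a binary variable
TOTAL_DOCS = 4  # training documents overall


def bernoulli_score(test_text, label):
    """P(class) * prod over the vocabulary of P(term present or absent | class)."""
    present = set(test_text.split())
    counts, n_docs = DOC_COUNTS[label]
    score = Fraction(n_docs, TOTAL_DOCS)  # assumed MLE class prior
    for term, count in zip(VOCAB, counts):
        p = Fraction(count + 1, n_docs + B)  # add-one smoothed estimate
        score *= p if term in present else 1 - p
    return score


test_text = "Chinese Chinese Chinese Tokyo Japan"
scores = {label: bernoulli_score(test_text, label) for label in DOC_COUNTS}
prediction = max(scores, key=scores.get)
print({label: float(s) for label, s in scores.items()}, prediction)
```

Under these assumptions the negative class wins: the Bernoulli model ignores how often Chinese is repeated and is penalised in the positive class by the presence of Tokyo and Japan.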

Multinomial Training

Training documents:

Chinese Beijing Chinese     +
Chinese Chinese Shanghai    +
Chinese Macao               +
Tokyo Japan Chinese         -

W1       W2       W3        CLASS
Chinese  Beijing  Chinese   +
Chinese  Chinese  Shanghai  +
Chinese  Macao              +
Tokyo    Japan    Chinese   -

Multinomial Training

W1       W2       W3        CLASS
Chinese  Beijing  Chinese   +
Chinese  Chinese  Shanghai  +
Chinese  Macao              +
Tokyo    Japan    Chinese   -

One row per token:

Wi        CLASS
Chinese   +
Beijing   +
Chinese   +
Chinese   +
Chinese   +
Shanghai  +
Chinese   +
Macao     +
Tokyo     -
Japan     -
Chinese   -

B = # of possible values for the variable in question = 6
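The multinomial training step can be sketched the same way. Here the variable is the word at a token position, so B is the vocabulary size (6), and the estimate is (occurrences of the word in the class + 1) / (tokens in the class + B). `DOCS` and `train_multinomial` are hypothetical names used only for this illustration.

```python
from collections import Counter
from fractions import Fraction

DOCS = [
    ("Chinese Beijing Chinese", "+"),
    ("Chinese Chinese Shanghai", "+"),
    ("Chinese Macao", "+"),
    ("Tokyo Japan Chinese", "-"),
]
VOCAB = sorted({word for text, _ in DOCS for word in text.split()})
B = len(VOCAB)  # 6 possible values for a token


def train_multinomial(docs):
    """Return {class: {word: smoothed P(word | class)}}."""
    params = {}
    for label in sorted({y for _, y in docs}):
        # Multinomial model: every token occurrence counts, repeats included.
        tokens = Counter(
            word for text, y in docs if y == label for word in text.split()
        )
        total = sum(tokens.values())
        params[label] = {
            word: Fraction(tokens[word] + 1, total + B) for word in VOCAB
        }
    return params


params = train_multinomial(DOCS)
for label, probs in params.items():
    print(label, {word: str(p) for word, p in probs.items()})
```

The positive class has 8 tokens, 5 of them Chinese, so P(Chinese | +) = 6/14 = 3/7, while the unseen words Tokyo and Japan each get 1/14.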

Multinomial Testing

Test document:

Chinese Chinese Chinese Tokyo Japan    ?

W1       W2       W3       W4     W5     CLASS
Chinese  Chinese  Chinese  Tokyo  Japan  ?

Training tokens:

W         CLASS
Chinese   +
Beijing   +
Chinese   +
Chinese   +
Chinese   +
Shanghai  +
Chinese   +
Macao     +
Tokyo     -
Japan     -
Chinese   -
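A minimal sketch of scoring the test document under the multinomial model, using the token counts from the training table with add-one smoothing (B = 6). As before, the class priors 3/4 and 1/4 are assumed MLE values that the slide does not spell out, and `TOKEN_COUNTS` and `multinomial_score` are illustrative names.

```python
from fractions import Fraction

# Token counts per class, read off the one-row-per-token training table.
TOKEN_COUNTS = {
    "+": {"Chinese": 5, "Beijing": 1, "Shanghai": 1, "Macao": 1},
    "-": {"Tokyo": 1, "Japan": 1, "Chinese": 1},
}
PRIORS = {"+": Fraction(3, 4), "-": Fraction(1, 4)}  # assumed MLE priors
B = 6  # vocabulary size


def multinomial_score(test_text, label):
    """P(class) * prod over test tokens of smoothed P(token | class)."""
    counts = TOKEN_COUNTS[label]
    total = sum(counts.values())
    score = PRIORS[label]
    for word in test_text.split():
        # Each occurrence contributes a factor, so repeats matter.
        score *= Fraction(counts.get(word, 0) + 1, total + B)
    return score


test_text = "Chinese Chinese Chinese Tokyo Japan"
scores = {label: multinomial_score(test_text, label) for label in TOKEN_COUNTS}
prediction = max(scores, key=scores.get)
print({label: float(s) for label, s in scores.items()}, prediction)
```

Under these assumptions the positive class wins, because the three occurrences of Chinese each contribute a factor of 3/7. This is the opposite of the Bernoulli decision on the same document, which only sees that Chinese is present.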
