Logistic Regression Rong Jin

Logistic Regression Model
- In the Gaussian generative model, the class decision depends on the ratio p(+|x) / p(-|x).
- Generalize the log of this ratio to a linear function of x.
- Parameters: w and c

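Logistic Regression Model
- The log-ratio of the positive class to the negative class: $\log \frac{p(+|x)}{p(-|x)} = x \cdot w + c$
- Results: $p(+|x) = \frac{1}{1 + \exp(-(x \cdot w + c))}$ and $p(-|x) = \frac{1}{1 + \exp(x \cdot w + c)}$, or jointly $p(y|x) = \frac{1}{1 + \exp(-y (x \cdot w + c))}$ for $y \in \{+1, -1\}$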
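A minimal sketch of the model above, assuming labels y in {+1, -1} and the linear log-ratio x·w + c. The function name and the parameter values are hypothetical and chosen only for illustration.

```python
import math

def p_y_given_x(x, w, c, y=+1):
    """Class probability p(y|x) = 1 / (1 + exp(-y * (x.w + c))), y in {+1, -1}."""
    score = sum(xi * wi for xi, wi in zip(x, w)) + c
    return 1.0 / (1.0 + math.exp(-y * score))

# Hypothetical parameters and input, not taken from the slides.
w, c = [0.5, -1.0], 0.25
x = [2.0, 1.0]

p_pos = p_y_given_x(x, w, c, +1)
p_neg = p_y_given_x(x, w, c, -1)

# The two class probabilities sum to 1, and their log-ratio
# recovers the linear score x.w + c.
log_ratio = math.log(p_pos / p_neg)
```

Writing both classes through a single formula with the sign of y keeps the likelihood expression compact when the labels are coded as +1 and -1.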

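Logistic Regression Model
- Assume the inputs and outputs are related through the log-linear function $p(y|x) = \frac{1}{1 + \exp(-y (x \cdot w + c))}$
- Estimate the weights with the MLE approach: choose w and c to maximize $\sum_{k=1}^{n} \log p(y_k|x_k) = -\sum_{k=1}^{n} \log\left(1 + \exp(-y_k (x_k \cdot w + c))\right)$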
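The MLE step has no closed-form solution, so it is carried out numerically. The sketch below uses plain gradient ascent on the log-likelihood for a one-dimensional input. The optimizer choice, the learning rate, the step count, and the toy data are all assumptions made for illustration; the slides do not say which numerical method was used.

```python
import math

def log_likelihood(data, w, c):
    """Sum of log p(y|x) over (x, y) pairs with y in {+1, -1}."""
    return -sum(math.log(1.0 + math.exp(-y * (x * w + c))) for x, y in data)

def fit(data, lr=0.05, steps=5000):
    """Gradient ascent on the average log-likelihood, starting from w = c = 0."""
    w, c = 0.0, 0.0
    n = len(data)
    for _ in range(steps):
        grad_w = grad_c = 0.0
        for x, y in data:
            # With s = x*w + c, d/ds log p(y|x) = y / (1 + exp(y*s)).
            g = y / (1.0 + math.exp(y * (x * w + c)))
            grad_w += g * x
            grad_c += g
        w += lr * grad_w / n
        c += lr * grad_c / n
    return w, c

# Hypothetical, non-separable toy data: (x, y) pairs.
data = [(1, -1), (2, -1), (3, -1), (4, -1), (3, +1), (4, +1), (5, +1), (6, +1)]
w, c = fit(data)
```

The log-likelihood is concave in (w, c), so a small enough step size increases it at every iteration. In practice a second-order method or a library optimizer would converge faster.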

Example 1: Heart Disease
- Input feature x: age group id
  1: 25-29, 2: 30-34, 3: 35-39, 4: 40-44, 5: 45-49, 6: 50-54, 7: 55-59, 8: 60-64
- Output y: having heart disease or not
  +1: having heart disease
  -1: no heart disease

Example 1: Heart Disease
- Logistic regression model
- Learning w and c: MLE approach
- Numerical optimization: w = 0.58, c = -3.34

Example 1: Heart Disease
- w = 0.58: an older person is more likely to have heart disease
- c = -3.34:
  i·w + c < 0 → p(+|i) < p(-|i)
  i·w + c > 0 → p(+|i) > p(-|i)
  i·w + c = 0 → decision boundary
- i* = 5.78 → 53 years old
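The decision boundary can be checked directly from the reported parameters by solving i·w + c = 0. The sketch below assumes the model p(+|i) = 1 / (1 + exp(-(i·w + c))); the function name is hypothetical. With the rounded values w = 0.58 and c = -3.34 the boundary comes out near 5.76, slightly below the 5.78 on the slide, which was presumably computed from unrounded parameters.

```python
import math

w, c = 0.58, -3.34  # rounded values reported on the slide

def p_disease(i):
    """p(+|i) for age group id i under the fitted model."""
    return 1.0 / (1.0 + math.exp(-(i * w + c)))

# Solve i*w + c = 0 for the age group id at which both classes are equally likely.
boundary = -c / w

below = p_disease(5)  # age group 45-49, below the boundary
above = p_disease(6)  # age group 50-54, above the boundary
```

Age groups 1 through 5 fall on the negative side of the boundary and groups 6 through 8 on the positive side.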

Naïve Bayes Solution
- Inaccurate fitting: non-Gaussian distribution
- i* = 5.59, close to the estimate from logistic regression
- Even though naïve Bayes does not fit the input patterns well, it still works fine for the decision boundary

Problems with Using Histogram Data?

Uneven Sampling for Different Ages

Solution
- w = 0.63, c = -3.56 → i* = 5.65

Example 2: Text Classification
- Input x: a binary vector
  Each word is a different dimension
  x_i = 0 if the ith word does not appear in the document
  x_i = 1 if it appears in the document
- Output y: interesting document or not
  +1: interesting
  -1: uninteresting

Example 2: Text Classification
Doc 1: The purpose of the Lady Bird Johnson Wildflower Center is to educate people around the world, …
Doc 2: Rain Bird is one of the leading irrigation manufacturers in the world, providing complete irrigation solutions for people…

term    the  world  people  company  center  …
Doc 1   1    1      1       0        1       …
Doc 2   1    1      1       1        0       …
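The binary representation can be sketched as follows. The tokenizer (lowercasing and splitting on non-letters) and the helper name are assumptions; the slides do not describe preprocessing. Only the Doc 1 excerpt is encoded here, because the Doc 2 excerpt is truncated before some of the terms its row marks as present.

```python
import re

# The five terms shown in the table; the full vocabulary is elided in the slides.
VOCAB = ["the", "world", "people", "company", "center"]

def binary_vector(text, vocab=VOCAB):
    """Return x with x[i] = 1 if vocab[i] occurs in text, else 0."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return [1 if term in words else 0 for term in vocab]

doc1 = ("The purpose of the Lady Bird Johnson Wildflower Center "
        "is to educate people around the world")
x1 = binary_vector(doc1)
```

The vector records only presence or absence, so repeated occurrences of "the" in Doc 1 still produce a single 1.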

Example 2: Text Classification
- Logistic regression model: every term t_i is assigned a weight w_i
- Learning parameters: MLE approach
- Need numerical solutions

Example 2: Text Classification
- Weight w_i
  w_i > 0: term t_i is positive evidence
  w_i < 0: term t_i is negative evidence
  w_i = 0: term t_i is irrelevant to whether the document is interesting
  The larger |w_i| is, the more important term t_i is in determining whether the document is interesting.
- Threshold c
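A minimal sketch of how term weights combine into a prediction. It assumes c enters the score additively, as in the model p(y|x) = 1 / (1 + exp(-y (x·w + c))) used earlier in the deck. The weights, the value of c, and the function name are hypothetical, not learned from any dataset.

```python
import math

# Hypothetical weights: positive evidence, negative evidence, irrelevant term.
weights = {"wildflower": 1.2, "irrigation": -0.8, "the": 0.0}
c = -0.3  # hypothetical offset

def p_interesting(terms_present):
    """p(+|document) from the set of terms that appear in the document."""
    score = sum(weights.get(t, 0.0) for t in terms_present) + c
    return 1.0 / (1.0 + math.exp(-score))

p_a = p_interesting({"the", "wildflower"})   # pushed toward "interesting"
p_b = p_interesting({"the", "irrigation"})   # pushed toward "uninteresting"
p_c = p_interesting({"the"})                 # zero-weight term changes nothing
```

Because the input is binary, each term that is present simply adds its weight to the score, which is why the sign and magnitude of w_i can be read directly as evidence.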

Example 2: Text Classification
- Dataset: Reuters-21578
- Classification accuracy
  Naïve Bayes: 77%
  Logistic regression: 88%

Why Does Logistic Regression Work Better for Text Classification?
- Common words
  Small weights in logistic regression
  Large weights in naïve Bayes
  Weight ~ p(w|+) - p(w|-)
- Independence assumption
  Naïve Bayes assumes that each word is generated independently
  Logistic regression is able to take into account the correlation between words

Comparison
Generative Model
- Model P(x|y)
- Model the input patterns
- Usually fast convergence
- Cheap computation
- Robust to noisy data
- But usually performs worse
Discriminative Model
- Model P(y|x) directly
- Model the decision boundary
- Usually good performance
- But slow convergence, expensive computation, and sensitive to noisy data
