1 Comparative Experiments on Sentiment Classification for Online Product Reviews. Hang Cui, Vibhu Mittal, and Mayur Datar. AAAI 2006.

2 Introduction A large amount of Web content is subjective and reflects people's opinions. Their research has two focuses: large-scale, real-world data sets, and unigrams vs. higher-order n-grams.

3 Contributions Conduct experiments on a corpus of over 200k online reviews with an average length of over 800 bytes. Study the impact of higher-order n-grams (n >= 3). Study multiple classification algorithms for processing large-scale data.

4 Previous Work Pang, Lee, and Vaithyanathan (2002), "Thumbs up? Sentiment classification using machine learning techniques": Naïve Bayes, Maximum Entropy, SVM (bigrams). PSP (Pang and Lee, 2005). PA algorithm, language model, Winnow classifier (Nigam and Hurst, 2004).

5 Classifiers - PA Passive-Aggressive (PA) Algorithm Based Classifier: the new classifier should stay in close proximity to the current one (passive update) while achieving at least a unit margin on the most recent example (aggressive update). This is cast as a constrained optimization problem, sketched below.
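
The optimization itself did not survive the transcript; for reference, this is the standard PA formulation (Crammer et al.), in the usual notation rather than the deck's. At round t, given example x_t with label y_t in {-1, +1}:

```latex
w_{t+1} = \arg\min_{w}\; \tfrac{1}{2}\,\lVert w - w_t \rVert^{2}
\quad \text{s.t.} \quad y_t \,(w \cdot x_t) \ge 1
```

The proximity term is the passive part and the unit-margin constraint is the aggressive part. The problem has the closed-form solution w_{t+1} = w_t + tau_t y_t x_t, with tau_t = max(0, 1 - y_t (w_t . x_t)) / ||x_t||^2.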

6 Classifiers - PA PA vs. SVM: PA follows an online learning pattern, which is attractive for Web applications, and it has a theoretical loss bound. Evaluated with 10-fold cross-validation. A sketch of the online update follows.
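
A minimal Python sketch of this online pattern, using the closed-form PA update given above; the function names and toy data are illustrative, not from the paper:

```python
import numpy as np

def pa_train(stream, dim):
    """Online Passive-Aggressive training over a stream of
    (feature_vector, label) pairs with labels in {-1, +1}."""
    w = np.zeros(dim)
    for x, y in stream:
        loss = max(0.0, 1.0 - y * np.dot(w, x))   # hinge loss on this example
        if loss > 0.0:                            # aggressive: restore the margin
            tau = loss / np.dot(x, x)             # closed-form step size
            w += tau * y * x                      # passive: smallest such move
    return w

# Illustrative usage with two tiny bag-of-words vectors.
stream = [(np.array([1.0, 0.0, 1.0]), +1),
          (np.array([0.0, 1.0, 1.0]), -1)]
w = pa_train(stream, dim=3)
print(np.sign(w @ np.array([1.0, 0.0, 0.0])))     # classify a new review
```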

7 Classifiers - LM Language Modeling (LM) Based Classifier: a generative method that calculates the probability of generating a given word sequence.
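
A minimal sketch of this generative scoring, reduced to unigrams with add-one smoothing for brevity (the paper uses higher-order n-grams with Good-Turing smoothing, per the next slide); all names and data here are illustrative:

```python
import math
from collections import Counter

def train_lm(docs):
    """Unigram language model with add-one smoothing: returns P(word)."""
    counts = Counter(w for d in docs for w in d.split())
    total = sum(counts.values())
    vocab = len(counts) + 1              # reserve one slot for unseen words
    return lambda w: (counts[w] + 1) / (total + vocab)

def classify(review, pos_lm, neg_lm):
    """Assign the class whose model is more likely to generate the review."""
    words = review.split()
    log_pos = sum(math.log(pos_lm(w)) for w in words)
    log_neg = sum(math.log(neg_lm(w)) for w in words)
    return "positive" if log_pos > log_neg else "negative"

pos_lm = train_lm(["great camera love the pictures", "excellent battery life"])
neg_lm = train_lm(["terrible camera broke quickly", "poor battery and screen"])
print(classify("love the battery life", pos_lm, neg_lm))  # -> positive
```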

8 Classifiers - LM Due to limited training data, n-gram language modeling often suffers from data sparseness, which is addressed by smoothing. The paper uses Good-Turing estimation:
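
The estimate itself did not survive the transcript; presumably the slide showed the standard Good-Turing formula, which replaces the raw count r of an n-gram with an adjusted count r*, where N_r is the number of distinct n-grams occurring exactly r times:

```latex
r^{*} = (r + 1)\,\frac{N_{r+1}}{N_{r}}
```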

9 Classifiers - Winnow Winnow learns a linear classifier over bag-of-words features to predict the polarity of a review x. Here c_w(x) = 1 if word w appears in x and 0 otherwise, and the standard Winnow score is h(x) = sum_w f_w * c_w(x), with x predicted positive when h(x) exceeds a threshold theta.

10 Classifiers - Winnow Training phase: calculate h(x). If the review is positive but predicted negative, promote: for each w with c_w(x) = 1, set f_w to f_w * 2. If the review is negative but predicted positive, demote: for each w with c_w(x) = 1, set f_w to f_w / 2. A sketch of this loop follows.
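
A minimal Python sketch of the multiplicative update above. The slides leave the initialization and threshold unspecified; the choices here (weights starting at 1, theta equal to the vocabulary size) are common defaults and are assumptions:

```python
def winnow_train(reviews, vocab, epochs=5):
    """Winnow with promotion/demotion by a factor of 2, per the slide."""
    f = {w: 1.0 for w in vocab}               # weights start at 1 (assumption)
    theta = len(vocab)                        # common threshold choice (assumption)
    for _ in range(epochs):
        for words, label in reviews:          # label: +1 positive, -1 negative
            h = sum(f[w] for w in words if w in f)
            pred = +1 if h > theta else -1
            if label == +1 and pred == -1:    # false negative: promote
                for w in words:
                    if w in f:
                        f[w] *= 2.0
            elif label == -1 and pred == +1:  # false positive: demote
                for w in words:
                    if w in f:
                        f[w] /= 2.0
    return f, theta

# Illustrative usage with tiny bag-of-words reviews.
vocab = {"great", "love", "terrible", "broken"}
data = [({"great", "love"}, +1), ({"terrible", "broken"}, -1)]
f, theta = winnow_train(data, vocab)
```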

11 N-grams as Linguistic Features "N-gram" in this paper means the union of 1-grams through N-grams, with N set to 6. Calculate chi-square scores for each n-gram (term vs. class) and take the top M ranked n-grams as features; a sketch of this ranking follows.
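
A minimal sketch of chi-square ranking over a 2x2 contingency table of n-gram presence vs. class; the statistic is the standard one for term-class association, and the counts below are illustrative, not from the paper:

```python
def chi_square(n11, n10, n01, n00):
    """Chi-square for a 2x2 table:
    n11 = positive docs containing the n-gram, n10 = negative docs containing it,
    n01 = positive docs without it,            n00 = negative docs without it."""
    n = n11 + n10 + n01 + n00
    num = n * (n11 * n00 - n10 * n01) ** 2
    den = (n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00)
    return num / den if den else 0.0

def top_m_features(stats, m):
    """Rank candidate n-grams by chi-square and keep the top M."""
    scored = [(chi_square(*counts), gram) for gram, counts in stats.items()]
    return [gram for score, gram in sorted(scored, reverse=True)[:m]]

# Hypothetical counts per n-gram: (n11, n10, n01, n00).
stats = {"highly recommend": (40, 5, 10, 45), "the": (30, 28, 20, 22)}
print(top_m_features(stats, m=1))   # -> ['highly recommend']
```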

12 Data Set Electronic products (digital cameras, laptops, PDAs, MP3 players, … ) from Froogle (http://froogle.google.com). Ratings run up to a maximum R of 5 or 10 depending on the scale; reviews rated 1 and R are used for training, and reviews rated 2 and R-1 for testing.

13 Results

14 Results Discussion Higher-order n-grams improve the performance of the classifiers, especially on negative instances. Discriminative models are more appropriate for sentiment classification than generative models (about 4% better). Reviews that mix positive and negative content confuse the generative models.

15 Results Discussion The performance of the PA classifier is not sensitive to the number of features. Filtering out objective sentences does not show an obvious advantage on this data set (possible factors: product category vs. movie reviews, filtering performance, testing rate level, … ).

16 Conclusion Large-scale data set. A discriminative classifier combined with high-order n-grams performs comparatively better. Online learning is feasible.

17 Future Work Better feature selection schemes (to handle noisy n-grams). Classification on different rating scales (Pang and Lee, 2005).

