Text Classification and Naïve Bayes: Text Classification: Evaluation

Text Classification and Naïve Bayes: Text Classification: Evaluation (slides by Dan Jurafsky)

Slide 2. More Than Two Classes: Sets of Binary Classifiers
Dealing with any-of or multivalue classification: a document can belong to 0, 1, or more than one class.
- For each class c ∈ C, build a classifier γ_c to distinguish c from all other classes c' ∈ C.
- Given a test document d, evaluate it for membership in each class using each γ_c.
- d belongs to any class for which γ_c returns true (sketched in code below).
(Sec. 14.5)
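A minimal sketch of the any-of scheme, building one binary classifier per class with scikit-learn. The label set, toy documents, and the choice of logistic regression (standing in for a generic γ_c) are illustrative assumptions, not part of the lecture:

```python
# Any-of classification: one binary classifier gamma_c per class.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

classes = ["wheat", "poultry", "trade"]                      # hypothetical classes
train_docs = ["wheat prices rose sharply",
              "poultry exports fell this quarter",
              "new trade deal covers wheat"]
train_labels = [{"wheat"}, {"poultry"}, {"trade", "wheat"}]  # a doc may have >1 class

vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(train_docs)

# Build one binary classifier gamma_c per class c: "c" vs. all other classes.
classifiers = {}
for c in classes:
    y = [1 if c in labels else 0 for labels in train_labels]
    classifiers[c] = LogisticRegression().fit(X, y)

# A test doc d belongs to every class whose classifier gamma_c returns true.
x_test = vectorizer.transform(["wheat tariffs dominate trade talks"])
predicted = {c for c, clf in classifiers.items() if clf.predict(x_test)[0] == 1}
print(predicted)
```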

Slide 3. More Than Two Classes: Sets of Binary Classifiers
One-of or multinomial classification: classes are mutually exclusive, so each document belongs to exactly one class.
- For each class c ∈ C, build a classifier γ_c to distinguish c from all other classes c' ∈ C.
- Given a test document d, evaluate it for membership in each class using each γ_c.
- d belongs to the one class with the maximum score.
(Sec. 14.5)
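The one-of case reuses the same per-class binary classifiers from the sketch above, but instead of thresholding each γ_c independently, it assigns the single class with the maximum score:

```python
# One-of classification: argmax over the per-class scores.
scores = {c: clf.decision_function(x_test)[0] for c, clf in classifiers.items()}
best_class = max(scores, key=scores.get)     # the one class with maximum score
print(best_class)
```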

Slide 4. Evaluation: Classic Reuters Data Set
- Most (over)used data set: 21,578 docs (each about 90 types, 200 tokens).
- 9,603 training and 3,299 test articles (the ModApte/Lewis split).
- 118 categories; an article can be in more than one category, so we learn 118 binary category distinctions.
- The average document (with at least one category) has 1.24 classes.
- Only about 10 of the 118 categories are large.

Common categories (#train, #test): Earn (2877, 1087), Acquisitions (1650, 179), Money-fx (538, 179), Grain (433, 149), Crude (389, 189), Trade (369, 119), Interest (347, 131), Ship (197, 89), Wheat (212, 71), Corn (182, 56).
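For hands-on exploration, a sketch of loading Reuters-21578 through NLTK (requires `nltk.download('reuters')` first). Note that NLTK distributes a 90-category, 10,788-document version of the ModApte split, so its counts differ somewhat from the 118-category split quoted on the slide:

```python
# Explore the Reuters-21578 corpus as packaged by NLTK.
from nltk.corpus import reuters

train_ids = [f for f in reuters.fileids() if f.startswith("training/")]
test_ids = [f for f in reuters.fileids() if f.startswith("test/")]
print(len(train_ids), len(test_ids), len(reuters.categories()))

# An article can be in more than one category.
avg = sum(len(reuters.categories(f)) for f in train_ids) / len(train_ids)
print(f"average categories per training doc: {avg:.2f}")
```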

Slide 5. Reuters Text Categorization Data Set (Reuters-21578): Sample Document
DATE: 2-MAR-1987 16:51:43.42
TOPICS: livestock, hog
TITLE: AMERICAN PORK CONGRESS KICKS OFF TOMORROW
CHICAGO, March 2 - The American Pork Congress kicks off tomorrow, March 3, in Indianapolis with 160 of the nation's pork producers from 44 member states determining industry positions on a number of issues, according to the National Pork Producers Council, NPPC. Delegates to the three day Congress will be considering 26 resolutions concerning various issues, including the future direction of farm policy and the tax law as it applies to the agriculture sector. The delegates will also debate whether to endorse concepts of a national PRV (pseudorabies virus) control and eradication program, the NPPC said. A large trade show, in conjunction with the congress, will feature the latest in technology in all areas of the industry, the NPPC added. Reuter

Slide 6. Confusion Matrix c
For each pair of classes (c1, c2): how many documents from c1 were incorrectly assigned to c2? The entry c_{i,j} counts the docs in the test set whose true class is i and whose assigned class is j. For example, c_{3,2} = 90: 90 wheat documents were incorrectly assigned to poultry.
The matrix on the slide has one row per true class and one column per assigned class, over the six classes UK, poultry, wheat, coffee, interest, and trade.

Slide 7. Per-Class Evaluation Measures
Using the confusion matrix c from the previous slide:
- Recall: fraction of docs in class i classified correctly: recall_i = c_{ii} / Σ_j c_{ij}
- Precision: fraction of docs assigned class i that are actually about class i: precision_i = c_{ii} / Σ_j c_{ji}
- Accuracy (1 - error rate): fraction of docs classified correctly: accuracy = Σ_i c_{ii} / Σ_i Σ_j c_{ij}
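These three measures fall straight out of the confusion matrix. In the sketch below the cell values are made up, except C[2][1] = 90, which echoes the slide's example (90 wheat docs assigned to poultry) with the classes ordered (UK, poultry, wheat):

```python
# Per-class recall/precision and overall accuracy from a confusion matrix C,
# where C[i][j] counts test docs with true class i that were assigned class j.
import numpy as np

C = np.array([[95,  5,  0],
              [10, 80, 10],
              [ 0, 90, 60]])                # C[2][1] = 90: wheat -> poultry errors

recall = np.diag(C) / C.sum(axis=1)         # correct / all docs truly in class i
precision = np.diag(C) / C.sum(axis=0)      # correct / all docs assigned class i
accuracy = np.diag(C).sum() / C.sum()       # correct / all docs
print(recall, precision, accuracy)
```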

Slide 8. Micro- vs. Macro-Averaging
If we have more than one class, how do we combine multiple performance measures into one quantity?
- Macroaveraging: compute performance for each class, then average over classes.
- Microaveraging: collect decisions for all classes into one contingency table, then evaluate that table.

Slide 9. Micro- vs. Macro-Averaging: Example

Class 1:
                  Truth: yes   Truth: no
Classifier: yes       10           10
Classifier: no        10          970

Class 2:
                  Truth: yes   Truth: no
Classifier: yes       90           10
Classifier: no        10          890

Micro-average (pooled) table:
                  Truth: yes   Truth: no
Classifier: yes      100           20
Classifier: no        20         1860

Macroaveraged precision: (0.5 + 0.9)/2 = 0.7
Microaveraged precision: 100/120 ≈ 0.83
The microaveraged score is dominated by the score on the common classes.
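The same computation in code, using the per-class true-positive and false-positive counts from the two tables above (tp = true positives, fp = false positives):

```python
# Macro- vs. micro-averaged precision from per-class contingency counts.
tables = [
    {"tp": 10, "fp": 10},   # class 1: classifier said yes 20 times, 10 correct
    {"tp": 90, "fp": 10},   # class 2: classifier said yes 100 times, 90 correct
]

# Macroaverage: compute precision per class, then average the precisions.
macro_p = sum(t["tp"] / (t["tp"] + t["fp"]) for t in tables) / len(tables)

# Microaverage: pool the raw counts across classes, then compute once.
tp = sum(t["tp"] for t in tables)
fp = sum(t["fp"] for t in tables)
micro_p = tp / (tp + fp)

print(macro_p)   # (0.5 + 0.9) / 2 = 0.7
print(micro_p)   # 100 / 120 ≈ 0.83
```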

Development Test Sets and Cross-Validation
- Metric: P/R/F1 or accuracy.
- Use an unseen test set to avoid overfitting ("tuning to the test set") and to get a more conservative estimate of performance.
- Typical split: training set | development test set | test set.
- Cross-validation over multiple splits handles sampling errors from different datasets: train and evaluate on each split, pool the results over all splits, and compute the pooled dev-set performance (see the sketch below).
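A minimal sketch of k-fold cross-validation with pooled results: classify each held-out fold, pool all dev-set decisions, then compute one overall score. It assumes X is a feature matrix and y a NumPy array of labels; the choice of MultinomialNB and macro-F1 is illustrative:

```python
# k-fold cross-validation with pooled dev-set performance.
from sklearn.model_selection import KFold
from sklearn.naive_bayes import MultinomialNB
from sklearn.metrics import f1_score

def pooled_cv_f1(X, y, k=10):
    all_true, all_pred = [], []
    for train_idx, dev_idx in KFold(n_splits=k, shuffle=True, random_state=0).split(X):
        clf = MultinomialNB().fit(X[train_idx], y[train_idx])
        all_true.extend(y[dev_idx])
        all_pred.extend(clf.predict(X[dev_idx]))
    # Pool decisions over all k splits, then evaluate once.
    return f1_score(all_true, all_pred, average="macro")
```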

Text Classification and Naïve Bayes: Text Classification: Evaluation