Neural Text Categorizer for Exclusive Text Categorization. Journal of Information Processing Systems, Vol. 4, No. 2, June 2008. Taeho Jo*. Presenter: 林昱志

Outline  Introduction  Related Work  Method  Experiment  Conclusion

Introduction  Two types of approaches to text categorization Rule based - Define manually in form of if-then-else  Advantage 1)High precision  Disadvantages 1)Poor recall 2)Poor flexibility

Introduction ② Machine learning - uses sample labeled documents  Advantage 1) Much higher recall  Disadvantages 1) Slightly lower precision than rule based 2) Poor flexibility

Introduction  Focuses on machine learning based, discarding rule based  All the raw data should be encoded into numerical vectors  Encoding documents leads to two main problems 1)Huge dimensionality 2)Sparse distribution

Introduction  Propose two way 1)String vector – Provide more transparency in classification 2)NTC (Neural Text Categorizer) – Classify documents with its sufficient robustness Solves the huge dimensionality

Related Work  Machine learning algorithms applied to text categorization 1)KNN (K Nearest Neighbor) 2)NB (Naïve Bayes) 3)SVM (Support Vector Machine) 4)BP (Back Propagation )

Related Work  KNN is evaluated as a simple and competitive algorithm with Support Vector Machine by Sebastiani in 2002  Disadvantage 1)Costs very much time for classifying objects

Related Work  Evaluated feature selection methods within the application of NB by Mladenic and Grobellink in 1999  NB for implementing a spam mail filtering system as a real system based on text categorization by Androutsopoulos in 2000  Requires encoding documents into numerical vectors

Related Work  SVM becomes more popular than the KNN and NB machine learning algorithms  Defining a hyper-plane as a boundary of classes  Applicable to only linearly separable distribution of training examples  Optimizes the weights of the inner products of training examples and input vector, called Lagrange multipliers

Related Work  Define two hyper-planes as a boundary of two classes with a maximal margin, figure 1. Figure 1.

Related Work  Advantage 1)Tolerant to huge dimensionality of numerical vectors  Disadvantage 1)Applicable to only binary classification 1)Fragile in representing documents into numerical vectors

Related Work  A hierarchical combination of BPs, called HME (Hierarchical Mixture of Experts), instead of a single BP by Ruiz and Srinivasan in 2002  Observed that HME is the better combination of BPs  Disadvantage 1)Cost much time and slowly 2)Not practical

Study Aim  Two problems 1) Huge dimensionality 2) Sparse distribution  Two proposed methods 1) String vectors 2) A new neural network (NTC)

Method  Numerical Vectors Figure 2.

Method  Each word w_k in a numerical vector is weighted by TF-IDF: weight(w_k) = tf_k · log(N / df_k), where tf_k is the frequency of the word w_k in the document, N is the total number of documents in the corpus, and df_k is the number of documents in the corpus that include the word (Figure 3)
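A minimal sketch of this weighting, the standard TF-IDF form that the slide's definitions describe (the function and variable names are mine):

```python
import math

def tfidf_weight(tf_k: int, N: int, df_k: int) -> float:
    """TF-IDF weight of word w_k in a document.
    tf_k  - frequency of w_k in the document
    N     - total number of documents in the corpus
    df_k  - number of documents in the corpus containing w_k
    """
    return tf_k * math.log(N / df_k)
```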

Method  Encoding a document into its string vector Figure 4.

Method  Text Categorization Systems  Proposed neural network (NTC)  Consists of the three layers 1)Input layer 2)Output layer 3)Learning layer

Method  Input Layer - C orresponds to each word in the string vector  Learning Layer - Corresponding to predefined categories  Output Layer - Generates categorical scores, and correspond to predefined categories. Figure 5.

Method  String vector is denoted by x = [t 1,t 2,...,td ], t i, 1 ≤ i ≤ d  Predefined categories is denoted by C = [c 1,c 2,…..c |c| ], 1≤ j ≤ |C|  W ji denote the weight by Figure 6.

Method  O j : Output node corresponding to the category, C j  Membership of the given input vector, x in the category, C j Figure 7.

Method  Each string vector in the training set has its own target label, C j  If its classified category, C k,, is identical to target category, C Figure 8.

Method  Inhibit weights for its misclassified category  Minimize the classification error Figure 9.

Experiment  Evaluate the five approaches on test bed, called ‘20NewsGroups  Each category contain identical number of test documents  Test bed consists of 20 categories and 20,000 documents  Using micro-averaged and macro-averaged average methods

Experiment  Back propagation is the best approach  NB is the worst approach with the decomposition of the task Figure 10. Evaluate the five text classifiers in 20Newsgroup with decomposition

Experiment  Classifier answers to each test document by providing one of 20 categories  Exits two groups 1)Better group - BP and NTC 2)Worse group – NB and KNN Figure 11. Evaluate the five text classifiers in 20Newsgroup without decomposition

Conclusion  Used a full inverted index as the basis for the operation on string vectors, instead of a restricted sized similarity matrix  Note trade-off between the two bases for the operation on string vectors  NB and BP are considered to be modified into their adaptable versions to string vetors, but may be insufficient for modifying other  Future research for modifying other machine learning algorithms