Special topics on text mining [Part I: text classification] Hugo Jair Escalante, Aurelio Lopez, Manuel Montes and Luis Villaseñor.

Multi-label text classification. Hugo Jair Escalante, Aurelio Lopez, Manuel Montes and Luis Villaseñor. Most of this material was taken from: G. Tsoumakas, I. Katakis and I. Vlahavas. Mining multi-label data. Data Mining and Knowledge Discovery Handbook, Part 6, O. Maimon, L. Rokach (Eds.), Springer, 2nd edition, 2010.

Machine learning approach to TC. Develop automated methods able to classify documents with a certain degree of success. The pipeline: labeled training documents are fed to a learning machine (an algorithm), which produces a trained machine; an unseen (test, query) document is then given to the trained machine, which outputs a labeled document.
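A minimal sketch of this pipeline, assuming scikit-learn and toy documents (the library choice, data and label names are illustrative assumptions, not part of the original slides):

```python
# Minimal text-classification pipeline sketch (scikit-learn assumed).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Labeled training documents
train_docs = ["cheap pills online now", "meeting agenda attached"]
train_labels = ["spam", "ham"]

# Learning machine (vectorizer + classifier); fitting yields the trained machine
model = make_pipeline(TfidfVectorizer(), MultinomialNB())
model.fit(train_docs, train_labels)

# Label an unseen (test, query) document
print(model.predict(["free pills, no meeting required"]))  # likely ['spam']
```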

What is a learning algorithm? A function: $h: \mathcal{X} \rightarrow \mathcal{Y}$, mapping documents to labels. Given: a training set $D = \{(\mathbf{x}_i, y_i)\}_{i=1}^{N}$ of labeled examples, the algorithm produces such a function.

Binary vs. multiclass classification. Binary classification: each document can belong to one of two classes. Multiclass classification: each document can belong to one of K classes.

Classification algorithms. (Some) classification algorithms for TC:
– Naïve Bayes
– K-Nearest Neighbors
– Centroid-based classification
– Decision trees
– Support Vector Machines
– Other linear classifiers
– Boosting, bagging and ensembles in general
– Random forests
– Neural networks
Some of these methods were designed for binary classification problems.

Linear models. Example: classification of DNA micro-arrays. [Figure: examples plotted in a two-dimensional feature space (x1, x2), with "Cancer" and "No cancer" classes separated by a linear decision boundary and unlabeled query points marked "?".]

Main approaches to multiclass classification:
– Single machine: learning algorithms able to deal with multiple classes directly (e.g., KNN, Naïve Bayes).
– Combining the outputs of several binary classifiers: one-vs-all (one classifier per class) or all-vs-all (one classifier per pair of classes), as in the sketch below.
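A hedged sketch of the one-vs-all strategy (LogisticRegression is an assumed stand-in for any binary learner; scikit-learn also ships ready-made OneVsRestClassifier and OneVsOneClassifier wrappers):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def train_one_vs_all(X, y, classes):
    # One binary classifier per class: class k vs. the rest
    return {k: LogisticRegression().fit(X, (np.asarray(y) == k).astype(int))
            for k in classes}

def predict_one_vs_all(models, X):
    # Pick the class whose binary classifier is most confident
    classes = list(models)
    scores = np.column_stack([models[k].decision_function(X) for k in classes])
    return [classes[i] for i in np.argmax(scores, axis=1)]
```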

Multilabel classification. To which categories do these documents belong?

Multilabel classification. A function: $h: \mathcal{X} \rightarrow 2^{L}$, mapping each document to a set of labels. Given: a training set $D = \{(\mathbf{x}_i, Y_i)\}_{i=1}^{N}$ with $Y_i \subseteq L$.

Conventions: the data matrix $X = \{x_{ij}\}$ has $n$ rows (one instance $\mathbf{x}_i$ per row) and $m$ columns (one feature per column); $\mathbf{y} = \{y_j\}$ is the vector of target labels and $\mathbf{w}$ a vector of model weights. Slide taken from I. Guyon. Feature and Model Selection. Machine Learning Summer School, Ile de Re, France, 2008.

Conventions (multilabel case): the data matrix $X = \{x_{ij}\}$ is as before, but the targets now form a matrix $Z = \{Z_j\}$ with one column per label, i.e. $|L|$ columns; $\mathbf{w}$ is a vector of model weights. Slide taken from I. Guyon. Feature and Model Selection. Machine Learning Summer School, Ile de Re, France, 2008.

Multi-label classification. Each instance can be associated with a set of labels instead of a single one. Specialized multilabel classification algorithms must be developed. How do we deal with the multilabel classification problem?

(Text categorization is perhaps the dominant multilabel application)

Multilabel classifiers:
– Transformation methods: transform the multilabel classification task into several single-label problems.
– Adaptation approaches: modify learning algorithms to support multilabel classification problems.

Transformation methods: copy transformation. Transforms each multilabel instance into several single-label ones: an instance with label set Y is copied |Y| times, once per label; in the weighted variant each copy receives weight 1/|Y| (see the sketch below). [Table: original ML problem vs. transformed problem, unweighted and weighted.]
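An illustrative sketch of both copy transformations on toy data (the instances and labels are made up for the example):

```python
# Each multilabel instance (x, Y) becomes |Y| single-label instances.
data = [("doc1", {"sports", "politics"}), ("doc2", {"economy"})]

# Unweighted copy
copied = [(x, label) for x, labels in data for label in labels]

# Weighted copy: each of the |Y| copies gets weight 1/|Y|
weighted = [(x, label, 1.0 / len(labels))
            for x, labels in data for label in labels]
```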

Transformation methods: select transformation. Replaces the label set of each instance by a single label, chosen as the most frequent (Max), the least frequent (Min), or a random (Rand) label in the set; the ignore approach simply discards multilabel instances. [Table: original ML problem vs. transformed problems.]

Transformation methods: label power set. Considers each unique set of labels in the ML problem as a single class of a new single-label problem (see the sketch below). Pruning can be applied to drop label sets that occur very few times. [Table: original vs. transformed ML problem.]
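A small sketch of the label power set transformation on toy data (the data and variable names are assumptions for illustration):

```python
# Each distinct label set becomes one class of a single-label problem.
data = [("d1", {"a", "b"}), ("d2", {"a"}), ("d3", {"a", "b"})]

lp_classes = {}   # maps each unique label set to a class id
transformed = []
for x, labels in data:
    cls = lp_classes.setdefault(frozenset(labels), len(lp_classes))
    transformed.append((x, cls))
# d1 and d3 share one class ({"a","b"}); d2 gets another ({"a"})
```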

Transformation methods: binary relevance. Learns a different classifier for each label. Classifier i is trained on the whole data set, taking examples of class i as positive and examples of all other classes (j ≠ i) as negative. How are labels assigned to new instances? Each binary classifier is applied, and the instance receives every label whose classifier predicts positive, as in the sketch below. [Table: original ML problem and the data sets generated by BR.]
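A hedged binary-relevance sketch (scikit-learn is an assumed choice of binary learner; real code would also handle labels that never occur in the training set):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def train_br(X, Y_sets, labels):
    # One independent binary problem per label: label present vs. absent
    return {l: LogisticRegression().fit(
                X, np.array([int(l in Y) for Y in Y_sets]))
            for l in labels}

def predict_br(models, X):
    # An instance receives every label whose binary classifier predicts 1
    X = np.asarray(X)
    return [{l for l, m in models.items()
             if m.predict(x.reshape(1, -1))[0] == 1} for x in X]
```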

Transformation methods: ranking by pairwise comparison. Learns a different classifier for each pair of labels, trained only on the instances that contain exactly one label of the pair; the pairwise predictions are combined (e.g., by voting) into a label ranking, as sketched below. [Table: original ML problem and the data sets generated by pairwise comparison.]
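A sketch of the pairwise training step (again with scikit-learn as an assumed stand-in for any binary learner):

```python
from itertools import combinations
import numpy as np
from sklearn.linear_model import LogisticRegression

def train_rpc(X, Y_sets, labels):
    # One classifier per label pair, trained only on instances that
    # contain exactly one of the two labels
    X = np.asarray(X)
    models = {}
    for a, b in combinations(labels, 2):
        idx = [i for i, Y in enumerate(Y_sets) if (a in Y) != (b in Y)]
        y = np.array([int(a in Y_sets[i]) for i in idx])
        models[(a, b)] = LogisticRegression().fit(X[idx], y)
    return models  # at prediction time, votes over all pairs rank the labels
```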

Algorithm adaptation techniques. Many variants, including:
– Decision trees
– Boosting ensembles
– Probabilistic generative models
– KNN
– Support vector machines

Algorithm adaptation techniques: MLkNN. For each test instance:
– Retrieve its top-k nearest neighbors in the training set
– Compute the frequency of occurrence of each label among those neighbors
– Assign a probability to each label and select the labels for the test instance (see the sketch below)
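A much-simplified sketch in the spirit of ML-kNN; the actual algorithm uses maximum a posteriori estimation with label priors, while this version just thresholds label frequencies among the neighbors:

```python
import numpy as np

def knn_multilabel_predict(X_train, Y_train, x, k=5, threshold=0.5):
    # Y_train[i] is the label set of training instance i
    dists = np.linalg.norm(np.asarray(X_train) - x, axis=1)
    neighbors = np.argsort(dists)[:k]
    candidate_labels = set().union(*(Y_train[i] for i in neighbors))
    # Keep labels occurring in more than `threshold` of the k neighbors
    return {l for l in candidate_labels
            if sum(l in Y_train[i] for i in neighbors) / k > threshold}
```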

Feature selection in multilabel classification. An (almost) unstudied topic = opportunities. Wrappers can be applied directly: define an objective function to optimize based on a multilabel classifier. [Figure, from M. Dash and H. Liu: the generic feature selection process, i.e. generation of a subset from the original feature set, evaluation of the subset, a stopping criterion (if not met, generate again), and validation of the selected subset.]

Feature selection in multilabel classification. An almost unstudied topic = opportunities. Existing filter methods transform the multilabel problem and then apply standard filters for feature selection, as in the sketch below.
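A hedged sketch of this idea (chi2 from scikit-learn is an assumed choice of standard filter, and max an assumed aggregation): transform via binary relevance, score features per label, then aggregate.

```python
import numpy as np
from sklearn.feature_selection import chi2

def multilabel_chi2_scores(X, Y_sets, labels):
    # X must be non-negative (e.g., term counts) for the chi2 filter
    per_label = [chi2(X, np.array([int(l in Y) for Y in Y_sets]))[0]
                 for l in labels]
    return np.max(per_label, axis=0)  # one aggregated score per feature
```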

Statistics. Label cardinality: the average number of labels per instance, $LC(D) = \frac{1}{n}\sum_{i=1}^{n} |Y_i|$. Label density: cardinality normalized by the number of labels, $LD(D) = \frac{1}{n}\sum_{i=1}^{n} \frac{|Y_i|}{|L|}$.
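Both statistics in a few lines of Python, on a toy data set (assumed for illustration):

```python
label_sets = [{"a", "b"}, {"a"}, {"a", "b", "c"}]
L = {"a", "b", "c"}

cardinality = sum(len(Y) for Y in label_sets) / len(label_sets)  # 2.0
density = cardinality / len(L)                                   # ~0.67
```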

Evaluation of multilabel learning. (New) conventions: a data set $D = \{(\mathbf{x}_i, Y_i)\}_{i=1}^{n}$; the set of labels $L$; and $Z_i = h(\mathbf{x}_i)$, the predictions of a ML classifier for the instances in $D$.

Evaluation of multilabel learning. Hamming loss: $HL = \frac{1}{n}\sum_{i=1}^{n} \frac{|Y_i \,\triangle\, Z_i|}{|L|}$, where $\triangle$ denotes the symmetric difference. Classification accuracy: $Acc = \frac{1}{n}\sum_{i=1}^{n} \frac{|Y_i \cap Z_i|}{|Y_i \cup Z_i|}$.

Evaluation of multilabel learning. Precision: $P = \frac{1}{n}\sum_{i=1}^{n} \frac{|Y_i \cap Z_i|}{|Z_i|}$. Recall: $R = \frac{1}{n}\sum_{i=1}^{n} \frac{|Y_i \cap Z_i|}{|Y_i|}$.

Evaluation of multilabel learning. F1-measure: $F_1 = \frac{1}{n}\sum_{i=1}^{n} \frac{2\,|Y_i \cap Z_i|}{|Y_i| + |Z_i|}$.
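The example-based measures above as straightforward Python over true label sets Y and predicted sets Z (a sketch; it assumes non-empty label sets and predictions):

```python
def hamming_loss(Y, Z, num_labels):
    # ^ is the symmetric difference of two sets
    return sum(len(y ^ z) for y, z in zip(Y, Z)) / (len(Y) * num_labels)

def accuracy(Y, Z):
    return sum(len(y & z) / len(y | z) for y, z in zip(Y, Z)) / len(Y)

def precision(Y, Z):
    return sum(len(y & z) / len(z) for y, z in zip(Y, Z)) / len(Y)

def recall(Y, Z):
    return sum(len(y & z) / len(y) for y, z in zip(Y, Z)) / len(Y)

def f1(Y, Z):
    return sum(2 * len(y & z) / (len(y) + len(z)) for y, z in zip(Y, Z)) / len(Y)
```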

Suggested readings:
– G. Tsoumakas, I. Katakis, I. Vlahavas. Mining multi-label data. Data Mining and Knowledge Discovery Handbook, Part 6, O. Maimon, L. Rokach (Eds.), Springer, 2nd edition, 2010.
– G. Tsoumakas, I. Katakis. Multi-label classification: an overview. International Journal of Data Warehousing and Mining, 3(3):1-13, 2007.
– M. Zhang, Z. Zhou. ML-kNN: A lazy learning approach to multi-label learning. Pattern Recognition, 40:2038-2048, 2007.
– M. Boutell, J. Luo, X. Shen, C. Brown. Learning multi-label scene classification. Pattern Recognition, 37:1757-1771, 2004.