Advanced data mining with TagHelper and Weka


Advanced data mining with TagHelper and Weka Carolyn Penstein Rosé Carnegie Mellon University Funded through the Pittsburgh Science of Learning Center and The Office of Naval Research, Cognitive and Neural Sciences Division

Outline: selecting a classifier, feature selection, optimization, and semi-supervised learning

Selecting a Classifier

Classifier Options * The three main types of classifiers are Bayesian models (Naïve Bayes), functions (SMO), and trees (J48)

Classifier Options Rules of thumb: SMO is state-of-the-art for text classification. J48 is best with small feature sets, and it also handles contingencies between features well. Naïve Bayes works well for models where decisions are made based on accumulating evidence rather than hard-and-fast rules.
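The same three classifiers can also be tried programmatically through Weka's Java API. The sketch below is an illustration rather than part of TagHelper: the file name features.arff and the class name ClassifierComparison are placeholders of my own, and it assumes the class label is the last attribute. It compares the three options with 10-fold cross-validation.

import java.util.Random;
import weka.classifiers.Classifier;
import weka.classifiers.Evaluation;
import weka.classifiers.bayes.NaiveBayes;
import weka.classifiers.functions.SMO;
import weka.classifiers.trees.J48;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class ClassifierComparison {
    public static void main(String[] args) throws Exception {
        // Load the feature table (placeholder path) and mark the class attribute.
        Instances data = DataSource.read("features.arff");
        data.setClassIndex(data.numAttributes() - 1);

        // The three classifier families discussed above.
        Classifier[] candidates = { new NaiveBayes(), new SMO(), new J48() };
        for (Classifier c : candidates) {
            Evaluation eval = new Evaluation(data);
            eval.crossValidateModel(c, data, 10, new Random(1));   // 10-fold CV
            System.out.printf("%s accuracy: %.2f%%%n",
                    c.getClass().getSimpleName(), eval.pctCorrect());
        }
    }
}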

Feature Selection

Why do irrelevant features hurt performance? They might confuse a classifier, and they waste time.

Two Solutions Use a feature selection algorithm, or only extract a subset of the possible features.

Feature Selection * Click on the AttributeSelectedClassifier

Feature Selection Feature selection algorithms pick out a subset of the features that work best. Usually they evaluate each feature in isolation.

Feature Selection * First click here * Then pick your base classifier just like before * Finally you will configure the feature selection

Setting Up Feature Selection

Setting Up Feature Selection The number of features you pick should not be larger than the number of features available. It should also not be larger than the number of coded examples you have.
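For readers working with Weka's Java API directly rather than through the TagHelper interface, the following sketch shows one plausible way to set up the same configuration: an AttributeSelectedClassifier that ranks features by information gain and keeps only the top k before handing the data to the base classifier. The choice of SMO as the base classifier and the class and method names are illustrative assumptions, not part of the slides.

import weka.attributeSelection.InfoGainAttributeEval;
import weka.attributeSelection.Ranker;
import weka.classifiers.functions.SMO;
import weka.classifiers.meta.AttributeSelectedClassifier;

public class FeatureSelectionSetup {
    public static AttributeSelectedClassifier build(int numFeatures) {
        AttributeSelectedClassifier asc = new AttributeSelectedClassifier();

        // Score each feature in isolation with information gain and rank them.
        InfoGainAttributeEval evaluator = new InfoGainAttributeEval();
        Ranker search = new Ranker();
        // Keep no more features than are available, and no more than
        // the number of coded examples you have.
        search.setNumToSelect(numFeatures);

        asc.setEvaluator(evaluator);
        asc.setSearch(search);
        asc.setClassifier(new SMO());   // base classifier, chosen as before
        return asc;
    }
}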

Examining Which Features are Most Predictive You can find a ranked list of features in the Performance Report if you use feature selection * Predictiveness score * Frequency
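Outside of TagHelper's Performance Report, a similar ranked list can be produced with Weka's attribute selection classes. This sketch is my own illustration under assumptions: features.arff is a placeholder path, and it prints only an information-gain score for each feature, not the frequency column that TagHelper's report also shows.

import weka.attributeSelection.AttributeSelection;
import weka.attributeSelection.InfoGainAttributeEval;
import weka.attributeSelection.Ranker;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class RankedFeatures {
    public static void main(String[] args) throws Exception {
        Instances data = DataSource.read("features.arff");   // placeholder path
        data.setClassIndex(data.numAttributes() - 1);

        AttributeSelection selector = new AttributeSelection();
        selector.setEvaluator(new InfoGainAttributeEval());
        selector.setSearch(new Ranker());
        selector.SelectAttributes(data);

        // Each row of rankedAttributes() is {attribute index, score};
        // a higher score means a more predictive feature.
        for (double[] entry : selector.rankedAttributes()) {
            System.out.printf("%-30s %.4f%n",
                    data.attribute((int) entry[0]).name(), entry[1]);
        }
    }
}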

Optimization

Key idea: combine multiple views on the same data in order to increase reliability

Boosting In boosting, a series of models is trained, and each model is influenced by the strengths and weaknesses of the previous one. New models should be experts in classifying examples that the previous model got wrong. Boosting specifically seeks to train multiple models that complement each other. In the final vote, each model's predictions are weighted based on that model's performance.

More about Boosting The more iterations, the more confident the trained classifier will be in its predictions (since it will have more experts voting). On the other hand, boosting sometimes overfits. Boosting can turn a weak classifier into a strong classifier.

Boosting Boosting is an option listed in the Meta folder, near the AttributeSelectedClassifier. It is listed as AdaBoostM1. Go ahead and click on it now.

Boosting * Now click here

Setting Up Boosting * Select a classifier * Set the number of cycles of boosting
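The same two choices, the base classifier and the number of boosting cycles, can also be made in code. A minimal sketch, assuming the standard Weka API; J48 as the weak learner and 10 cycles are arbitrary illustrative values:

import weka.classifiers.meta.AdaBoostM1;
import weka.classifiers.trees.J48;

public class BoostingSetup {
    public static AdaBoostM1 build() {
        AdaBoostM1 booster = new AdaBoostM1();
        booster.setClassifier(new J48());   // the weak learner to be boosted
        booster.setNumIterations(10);       // number of boosting cycles
        return booster;
    }
}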

Semi-Supervised Learning

Using Unlabeled Data If you have a small amount of labeled data and a large amount of unlabeled data, you can use a type of bootstrapping to learn a model that exploits regularities in the larger set of data. The stable regularities might be easier to spot in the larger set than in the smaller set, and the model is less likely to overfit your labeled data.

Co-training Train two different models based on a few labeled examples. Each model is learning the same labels but using different features. Use each of these to label the unlabeled data. For each approach, take the example most confidently labeled negative and the example most confidently labeled positive, and add them to the labeled data. Now repeat the process until all of the data is labeled.

Semi-supervised Learning Remember the basic idea: train on a small amount of data, add the positive and negative examples you are most confident about to the training data, retrain, and keep looping until you have labeled all the data.
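That loop can be sketched against the Weka API as below. This is an illustration of the basic idea rather than TagHelper's implementation: it assumes a binary class where index 1 is the positive label, that the labeled and unlabeled sets share the same attribute structure with the class index already set, and the class and helper names are hypothetical.

import weka.classifiers.Classifier;
import weka.classifiers.bayes.NaiveBayes;
import weka.core.Instance;
import weka.core.Instances;

public class SelfTraining {
    public static Classifier run(Instances labeled, Instances unlabeled) throws Exception {
        Classifier model = new NaiveBayes();
        while (unlabeled.numInstances() > 0) {
            // Train on the current labeled set.
            model.buildClassifier(labeled);

            // Find the unlabeled example the model is most confident is positive,
            // and the one it is most confident is negative.
            int bestPos = 0, bestNeg = 0;
            double bestPosConf = -1, bestNegConf = -1;
            for (int i = 0; i < unlabeled.numInstances(); i++) {
                double[] dist = model.distributionForInstance(unlabeled.instance(i));
                if (dist[1] > bestPosConf) { bestPosConf = dist[1]; bestPos = i; }
                if (dist[0] > bestNegConf) { bestNegConf = dist[0]; bestNeg = i; }
            }

            // Move those examples into the labeled set with their predicted labels,
            // removing the higher index first so the lower index stays valid.
            int hi = Math.max(bestPos, bestNeg);
            int lo = Math.min(bestPos, bestNeg);
            moveToLabeled(model, unlabeled, labeled, hi);
            if (lo != hi) {
                moveToLabeled(model, unlabeled, labeled, lo);
            }
        }
        // Retrain once more on the fully labeled data.
        model.buildClassifier(labeled);
        return model;
    }

    private static void moveToLabeled(Classifier model, Instances unlabeled,
                                      Instances labeled, int idx) throws Exception {
        Instance inst = unlabeled.instance(idx);
        inst.setClassValue(model.classifyInstance(inst));
        labeled.add(inst);
        unlabeled.delete(idx);
    }
}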

Questions?