Supervised Machine Learning: Classification Techniques
Chaleece Sandberg, Chris Bradley, Kyle Walsh

Supervised Machine Learning
 SML: the machine performs a function (e.g., classification) after training on a data set where inputs and desired outputs are provided
 Following training, the SML algorithm is able to generalize to new, unseen data
 Application: "Data Mining"
- Often, large amounts of data must be handled efficiently
- Look for relevant information and patterns in the data
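As a concrete sketch of this train-then-generalize workflow (the dataset and the choice of classifier are illustrative assumptions, not from the slides):

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Labeled data: inputs X with desired outputs y
X, y = load_iris(return_X_y=True)

# Hold out a test set of "new, unseen data" to check generalization
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)  # training phase
print("accuracy on unseen data:", clf.score(X_test, y_test))
```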

Decision Trees
 Logic-based algorithm
 Sort instances (data) according to feature values: a hierarchy of tests
 Nodes: features
- Root node: the feature that best divides the data
- Algorithms exist for determining the best root node
 Branches: values the node can assume
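A minimal sketch of learning and inspecting such a hierarchy of tests, assuming scikit-learn and an illustrative dataset:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.tree import DecisionTreeClassifier, export_text

data = load_breast_cancer()

# criterion="entropy" picks, at each node, the feature that best divides the data;
# max_depth limits the hierarchy of tests (and helps guard against overfitting)
tree = DecisionTreeClassifier(criterion="entropy", max_depth=3, random_state=0)
tree.fit(data.data, data.target)

# Print the learned hierarchy: the root node holds the best dividing feature
print(export_text(tree, feature_names=list(data.feature_names)))
```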

Decision Trees, an example
[Figure: a diagnostic decision tree. INPUT: data (a symptom, here low RBC count); internal nodes test features such as size of cells, bleeding, B12 deficiency, and a gastrin assay result; leaves give the OUTPUT: a category (the condition, e.g., anemia).]

Decision Trees: Assessment
 Advantages:
- Classification of data based on limiting features is intuitive
- Handles discrete/categorical features best
 Limitations:
- Danger of "overfitting" the data
- Not the best choice for accuracy

Bayesian Networks
 A graphical model that encodes the joint probability distribution of a data set
 Captures probabilistic relationships between variables
 Classification is based on the probability that an instance (data point) belongs to each category

Bayesian Networks, an example
[Figure: an example Bayesian network (Wikipedia, 2008)]

Bayesian Networks: Assessment
 Advantages:
- Takes into account prior information regarding relationships among features
- Probabilities can be updated based on outcomes
- Fast with respect to learning classification
- Can handle incomplete data sets
- Avoids "overfitting" of the data
 Limitations:
- Not suitable for data sets with many features
- Not the best choice for accuracy
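The naive Bayes classifier is the simplest special case of a Bayesian network, in which every feature is conditionally independent given the class. A minimal sketch, assuming scikit-learn and an illustrative dataset:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB

X, y = load_iris(return_X_y=True)

# Each feature contributes P(feature | class); prediction picks the class
# that maximizes the posterior P(class | features) via Bayes' rule
model = GaussianNB()
print("cross-validated accuracy:", cross_val_score(model, X, y, cv=5).mean())
```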

Neural Networks
 Used for:
- Classification
- Noise reduction
- Prediction
 Great because they are:
- Able to learn
- Able to generalize
 Examples:
- Kiran: Plaut's (1996) semantic neural network that could be lesioned and retrained, useful for predicting treatment outcomes
- Miikkulainen: an evolving neural network that could adapt to a gaming environment, a useful learning application

Neural Networks: Biological Basis

Feed-forward Neural Network
[Figure: a feed-forward network (perceptron) with an input layer, a hidden layer, and an output layer]

Neural Networks: Training
 Training means presenting the network with sample data and modifying the weights to better approximate the desired function
 Supervised learning:
- Supply the network with inputs and desired outputs
- Initially, the weights are set randomly
- Weights are modified to reduce the difference between actual and desired outputs (backpropagation)
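A minimal numpy sketch of this training loop (one hidden layer trained by backpropagation on a toy problem; the architecture, learning rate, and task are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy supervised task: learn XOR (inputs and desired outputs are both given)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

# Initially, the weights are set randomly
W1, b1 = rng.normal(size=(2, 4)), np.zeros(4)
W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 1.0
for _ in range(5000):
    # Forward pass: input layer -> hidden layer -> output
    hidden = sigmoid(X @ W1 + b1)
    out = sigmoid(hidden @ W2 + b2)

    # Backpropagation: push the output error back through the layers
    d_out = (out - y) * out * (1 - out)
    d_hidden = (d_out @ W2.T) * hidden * (1 - hidden)

    # Modify weights to reduce the difference between actual and desired output
    W2 -= lr * hidden.T @ d_out
    b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * X.T @ d_hidden
    b1 -= lr * d_hidden.sum(axis=0)

print(out.round(2))  # should approach [[0], [1], [1], [0]]
```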

Support Vector Machines

Perceptron Revisited: Linear Classifier
 Decision rule: y(x) = sign(w · x + b)
 [Figure: the separating hyperplane w · x + b = 0, with w · x + b > 0 on one side and w · x + b < 0 on the other]
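In code, that decision rule is a one-liner; a sketch with hypothetical weights:

```python
import numpy as np

def linear_classify(x, w, b):
    """Return +1 or -1 depending on which side of the hyperplane x falls."""
    return 1 if np.dot(w, x) + b > 0 else -1

# Hypothetical weights defining a hyperplane in 2-D
print(linear_classify(np.array([2.0, 1.0]), w=np.array([1.0, -1.0]), b=0.5))
```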

Which one is the best?
[Figure: several candidate separating hyperplanes for the same linearly separable data]

Notion of Margin
 Distance from a data point x to the separating hyperplane: r = |w · x + b| / ||w||
 Data points closest to the boundary are called support vectors
 The margin d is the width of the separation between the two classes

Maximizing Margin
 Maximizing the margin is a quadratic optimization problem
 Quadratic optimization problems are a well-known class of mathematical programming problems, and many (rather intricate) algorithms exist for solving them
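For reference, here is the standard primal form of that quadratic program (standard SVM material, stated here since the slide does not spell it out). With training points x_i and labels y_i in {-1, +1}:

```latex
\begin{aligned}
\min_{\mathbf{w},\, b} \quad & \tfrac{1}{2}\,\lVert \mathbf{w} \rVert^{2} \\
\text{subject to} \quad & y_i \left( \mathbf{w} \cdot \mathbf{x}_i + b \right) \ge 1, \qquad i = 1, \dots, n.
\end{aligned}
```

The constraints force every training point to lie on or outside the margin, and since the margin width is 2 / ||w||, minimizing ||w||^2 maximizes it.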

Kernel Trick
 What if the dataset is not linearly separable?
 We use a kernel to map the data to a higher-dimensional space
 [Figure: 1-D data on the x axis that is not linearly separable becomes separable after mapping to the (x, x²) plane]

Non-linear SVMs: Feature spaces
 General idea: the original space can always be mapped to some higher-dimensional feature space where the training set becomes separable: Φ: x → φ(x)

Examples of Kernel Trick
 For the example in the previous figure, the non-linear mapping is φ(x) = (x, x²)
 A more commonly used kernel is the radial basis function (RBF) kernel: K(x, x') = exp(−||x − x'||² / (2σ²))
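A minimal scikit-learn sketch contrasting a linear SVM with an RBF-kernel SVM on data that is not linearly separable (the dataset and parameters are illustrative assumptions):

```python
from sklearn.datasets import make_circles
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Concentric circles: no straight line separates the two classes
X, y = make_circles(n_samples=400, noise=0.1, factor=0.4, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for kernel in ("linear", "rbf"):
    clf = SVC(kernel=kernel).fit(X_train, y_train)
    print(kernel, "accuracy:", clf.score(X_test, y_test))
# Expected: the RBF kernel scores far higher, since it implicitly maps
# the data to a space where the two circles become separable
```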

Advantages and Applications of SVM
 Advantages of SVM:
- Unlike neural networks, the class boundaries do not change as the weights change
- Generalizability is high because the margin is maximized
- No local minima, and robustness to outliers
 Applications of SVM:
- Used in almost every conceivable situation where automatic classification of data is needed
- (Example from class) Raymond Mooney and his KRISPER natural language parser

The Future of Supervised Learning (1)
 Generation of synthetic data
- A major problem with supervised learning is the need for large amounts of training data to obtain a good result
- Why not create synthetic training data from real, labeled data?
- Example: use a 3D model to generate multiple 2D images of some object (such as a face) under different conditions (such as lighting)
- Labeling only needs to be done for the 3D model, not for every 2D image
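A minimal sketch of the same idea in 2-D, using brightness as a stand-in for the lighting variation described above (the synthesize helper and its parameters are hypothetical, for illustration only):

```python
import numpy as np

def synthesize(image, label, brightness_factors=(0.6, 0.8, 1.0, 1.2, 1.4)):
    """Create several synthetic training images from one real, labeled image.

    The single label is reused for every variant: labeling happens once,
    not once per generated image.
    """
    variants = [np.clip(image * f, 0.0, 1.0) for f in brightness_factors]
    return np.stack(variants), [label] * len(brightness_factors)

# One real labeled image (random pixels stand in for a real photo here)
image = np.random.default_rng(0).random((32, 32))
X_synth, y_synth = synthesize(image, label="face")
print(X_synth.shape, y_synth)  # (5, 32, 32) and five copies of the label "face"
```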

The Future of Supervised Learning (2)
 Future applications:
- Personal software assistants that learn the evolving interests of their users from past usage in order to highlight relevant news (e.g., filtering scientific journals for articles of interest)
- Houses that learn from experience to optimize energy costs based on the particular usage patterns of their occupants
- Analysis of medical records to assess which treatments are more effective for new diseases
- Robots that interact better with humans
