SUPERVISED AND UNSUPERVISED LEARNING
Presentation by Ege Saygıner
CENG 784

MACHINE LEARNING
Goals of ML:
- Learning from data
- Establishing relationships between multiple features
- Reasoning under uncertainty
- Extracting statistical patterns
Just like the BRAIN!

MACHINE LEARNING
Application Areas:
- Statistics
- Engineering
- Computer Science
- Cognitive Science
...

MACHINE LEARNING
Types of ML:
- Supervised Learning: Find the class label or value of a new input, given the dataset.
- Reinforcement Learning: Learn to act in a way that maximizes future rewards (or minimizes a cost function).
- In game theory: Learn to act in a way that maximizes future rewards, in an environment that contains other machines.
- Unsupervised Learning: The learner receives neither target outputs nor rewards from its environment.

SUPERVISED LEARNING
              INPUTS                                              OUTPUT
              Gender   Married   Job       Age   Salary (TL)      Trust
Customer 1    Male     No        Teacher   43    1500             good
Customer 2    Female   No        Lawyer    55    2500             good
Customer 3    Male     Yes       Doctor    26    1700             bad
...
Customer n    Male     Yes       Lawyer    35    1600             ???
Customer n+1  Female   No        Doctor    30    1400             ???
Customer n+2  Male     Yes       Retired   60    2000             ???
Instances are n-dimensional points in space, and the features of the instances correspond to the dimensions of that space.
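A minimal sketch (not from the slides) of how such customer records could be turned into numeric feature vectors, so that each instance becomes a point in n-dimensional space; the categorical-to-number mappings are illustrative assumptions:

# Minimal sketch: encoding the customer records above as feature vectors.
# The category encodings below are illustrative assumptions.
GENDER = {"Male": 0, "Female": 1}                              # binary feature
MARRIED = {"No": 0, "Yes": 1}                                  # binary feature
JOB = {"Teacher": 0, "Lawyer": 1, "Doctor": 2, "Retired": 3}   # categorical feature

def encode(gender, married, job, age, salary):
    """Map one customer record to a 5-dimensional feature vector."""
    return [GENDER[gender], MARRIED[married], JOB[job], age, salary]

# Labelled training instances (Trust is the output).
X_train = [
    encode("Male",   "No",  "Teacher", 43, 1500),
    encode("Female", "No",  "Lawyer",  55, 2500),
    encode("Male",   "Yes", "Doctor",  26, 1700),
]
y_train = ["good", "good", "bad"]

# Unlabelled instances whose Trust value (???) we want to predict.
X_new = [encode("Male", "Yes", "Lawyer", 35, 1600)]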

SUPERVISED LEARNING
- Features can be:
  - continuous
  - categorical
  - binary
- Training set: The output of each data point is known.
- Training Algorithms...
- Test set: The output of each data point is estimated.
- Output can be:
  - a class label
  - a real number

UNSUPERVISED LEARNING
- No supervised target outputs
- No rewards from the environment
- No feedback
SO?
- Build representations of the inputs
- Find patterns in the inputs
- Decision making
- Predict future inputs


SUPERVISED LEARNING
- Dataset Collection
- Feature Selection
- Algorithm Selection
- Training

SUPERVISED LEARNING: COLLECTING THE DATASET
Brute-force method: Measure everything available in the hope that the relevant & informative features can be isolated.
(-) contains a lot of noisy data
(-) missing features
(-) requires significant data pre-processing
(+) simple
OR: An expert decides which features to measure and use.

SUPERVISED LEARNING: COLLECTING THE DATASET
Possible problems in a dataset:
- handling the missing data
- outlier (noise) detection
- instance selection (in the case of large datasets)
- feature subset selection (in the case of redundant features and high dimensionality)
- feature construction/transformation
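A minimal sketch (not from the slides) of two of these pre-processing steps, assuming scikit-learn as the toolkit; the data and the choice of k are illustrative:

# Minimal sketch: imputing missing values and selecting a feature subset
# with scikit-learn (an assumed toolkit; the slides do not prescribe one).
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.feature_selection import SelectKBest, f_classif

X = np.array([[1.0, 2.0, np.nan],
              [4.0, np.nan, 6.0],
              [7.0, 8.0, 9.0],
              [1.5, 2.5, 3.5]])
y = np.array([0, 1, 1, 0])

# Handling missing data: replace NaNs with the per-column mean.
X_filled = SimpleImputer(strategy="mean").fit_transform(X)

# Feature subset selection: keep the 2 features most correlated with the label.
X_reduced = SelectKBest(score_func=f_classif, k=2).fit_transform(X_filled, y)
print(X_reduced.shape)  # (4, 2)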

SUPERVISED LEARNING: ALGORITHM SELECTION
Performance of the algorithm is determined by the prediction accuracy, given by:

    accuracy = (# correct predictions) / (# all predictions)

3 ways to estimate it:
- Hold-out: 2/3 of the data for training & 1/3 for estimating performance
- Cross-validation: the training set is divided into mutually exclusive, equal-sized subsets, and the error rates measured on the subsets are averaged.
- Leave-one-out validation: a special case of cross-validation where every subset contains only 1 instance.
Instability: Small changes in the training set result in large changes in the learned model.
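A minimal sketch (not from the slides) of the cross-validation scheme just described; `train` and `predict` are hypothetical stand-ins for any learning algorithm:

# Minimal sketch: k-fold cross-validation. Split the data into k mutually
# exclusive, equal-sized folds, train on k-1 folds, test on the held-out
# fold, and average the accuracies.
def cross_validate(X, y, k, train, predict):
    n = len(X)
    fold_size = n // k
    accuracies = []
    for i in range(k):
        test = set(range(i * fold_size, (i + 1) * fold_size))
        train_idx = [j for j in range(n) if j not in test]
        model = train([X[j] for j in train_idx], [y[j] for j in train_idx])
        correct = sum(predict(model, X[j]) == y[j] for j in test)
        accuracies.append(correct / fold_size)
    # Leave-one-out validation is the special case k == n
    # (each fold holds exactly 1 instance).
    return sum(accuracies) / k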

SUPERVISED LEARNING: LOGIC BASED ALGORITHMS: DECISION TREES
Decision trees are trees that classify instances by sorting them based on feature values.

SUPERVISED LEARNING: LOGIC BASED ALGORITHMS: DECISION TREES
The feature that best divides the training data becomes the root node of the tree.
To avoid overfitting:
i) Stop the training before it fits the training data perfectly.
ii) Prune the induced decision tree. The tree with the fewest leaves is preferred.
Zheng (2000) created at-least-M-of-N features: an instance is true if at least M of its N conditions are true, otherwise it is false.
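A minimal sketch (not from the slides) of both overfitting remedies, assuming scikit-learn's DecisionTreeClassifier; the dataset and parameter values are illustrative:

# Minimal sketch: the two anti-overfitting strategies above with scikit-learn.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# i) Stop the training before perfect fitting: cap the depth of the tree.
early_stopped = DecisionTreeClassifier(max_depth=3).fit(X, y)

# ii) Prune the induced tree: cost-complexity pruning trades training
#     accuracy for fewer leaves (larger ccp_alpha => smaller tree).
pruned = DecisionTreeClassifier(ccp_alpha=0.02).fit(X, y)

print(early_stopped.get_n_leaves(), pruned.get_n_leaves())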

SUPERVISED LEARNING: PERCEPTRON BASED ALGORITHMS
* Dataset:
  - x_1 to x_n are the input feature values.
  - w_1 to w_n are the connection weights / prediction vector.
* The perceptron computes the weighted sum ∑(x_i * w_i):
  if the sum is above the threshold, it outputs 1; otherwise it outputs 0.
* Run the algorithm repeatedly over the training set, until it finds a prediction vector that is correct on all of the training set.
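A minimal sketch (not from the slides) of this update loop: predict with a thresholded weighted sum, nudge the weights after every mistake, and repeat until a pass over the training set is error-free. The learning rate and bias term are illustrative assumptions:

# Minimal sketch: the perceptron training loop described above.
def predict(w, b, x):
    s = sum(wi * xi for wi, xi in zip(w, x)) + b  # weighted sum
    return 1 if s > 0 else 0                      # threshold at 0

def train_perceptron(X, y, lr=1.0):
    w, b = [0.0] * len(X[0]), 0.0
    while True:
        errors = 0
        for x, target in zip(X, y):
            delta = target - predict(w, b, x)     # 0 if correct, +/-1 if wrong
            if delta != 0:
                w = [wi + lr * delta * xi for wi, xi in zip(w, x)]
                b += lr * delta
                errors += 1
        if errors == 0:                           # correct on the whole set
            return w, b                           # (halts only if the data is linearly separable)

# Example: learning the AND function (linearly separable).
X = [(0, 0), (0, 1), (1, 0), (1, 1)]
y = [0, 0, 0, 1]
w, b = train_perceptron(X, y)
print([predict(w, b, x) for x in X])  # [0, 0, 0, 1]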

SUPERVISED LEARNING: PERCEPTRON BASED ALGORITHMS
* Can only classify linearly separable sets of instances.
* Binary => in the case of multiclass problems, the problem must be reduced to a set of multiple binary classification problems.
* Anytime online! (Can produce a useful answer regardless of how long they run.)
* Superior time complexity when dealing with irrelevant features.

SUPERVISED LEARNING: INSTANCE-BASED LEARNING
K-NN Algorithm: Assign the label of the nearest neighbour (if K > 1, take a majority vote among the K nearest neighbours).
It is a lazy-learning algorithm! Which means:
- No generalization process until classification is performed.
- It requires less computation time during the training phase than eager-learning algorithms (such as decision trees, neural networks and Bayes nets), but more computation time during the classification process.

SUPERVISED LEARNING: INSTANCE-BASED LEARNING
K-NN Algorithm: Different distance metrics to compare feature vectors:
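The slide's table of metrics is not preserved in this transcript; below is a minimal sketch (not from the slides) of K-NN with a pluggable distance metric and majority voting, showing Euclidean and Manhattan distance as two common choices:

# Minimal sketch: K-NN with interchangeable distance metrics.
import math
from collections import Counter

def euclidean(a, b):
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def manhattan(a, b):
    return sum(abs(ai - bi) for ai, bi in zip(a, b))

def knn_classify(X_train, y_train, x, k=3, metric=euclidean):
    """Label x by majority vote among its k nearest training instances."""
    neighbours = sorted(zip(X_train, y_train), key=lambda p: metric(p[0], x))
    votes = Counter(label for _, label in neighbours[:k])
    return votes.most_common(1)[0][0]

X_train = [(1, 1), (2, 1), (8, 9), (9, 8)]
y_train = ["bad", "bad", "good", "good"]
print(knn_classify(X_train, y_train, (3, 2), k=3))                    # bad
print(knn_classify(X_train, y_train, (3, 2), k=3, metric=manhattan))  # bad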

SUPERVISED LEARNING: INSTANCE-BASED LEARNING
Drawbacks of the K-NN algorithm:
i) it has large storage requirements
ii) it is sensitive to the choice of the distance metric
iii) it is hard to choose the best K

SUPERVISED LEARNING: SUPPORT VECTOR MACHINES
An optimization problem: Find a hyperplane that separates the sample space and:
1) Maximizes the separation of the classes
2) Maximizes the distance of the hyperplane to the closest samples on each side

SUPERVISED LEARNING: SUPPORT VECTOR MACHINES
A separation with a higher margin is preferred for generalization purposes.

SUPERVISED LEARNING: SUPPORT VECTOR MACHINES
If the training data is not linearly separable, the kernel trick is applied: map the input space to a higher-dimensional space where the data becomes linearly separable.
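A minimal sketch (not from the slides), assuming scikit-learn: a linear SVM fails on data that is not linearly separable, while an RBF-kernel SVM implicitly maps the inputs to a higher-dimensional space and separates them. The ring-shaped toy dataset is an illustrative assumption:

# Minimal sketch: the kernel trick in practice with scikit-learn's SVC.
import numpy as np
from sklearn.svm import SVC

# A tiny "ring" dataset: class 0 inside, class 1 outside (not linearly separable).
rng = np.random.default_rng(0)
angles = rng.uniform(0, 2 * np.pi, 200)
radii = np.concatenate([rng.uniform(0, 1, 100), rng.uniform(2, 3, 100)])
X = np.column_stack([radii * np.cos(angles), radii * np.sin(angles)])
y = np.concatenate([np.zeros(100), np.ones(100)])

linear_svm = SVC(kernel="linear").fit(X, y)
kernel_svm = SVC(kernel="rbf").fit(X, y)       # the kernel trick

print("linear:", linear_svm.score(X, y))       # well below 1.0
print("rbf:   ", kernel_svm.score(X, y))       # close to 1.0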

UNSUPERVISED LEARNING
- Extract information from unlabelled data.
- Learn a probabilistic model of the data. This can be useful for:
  - Outlier detection
  - Classification
  - Data compression
- Bayes Rule: P(model | data) = P(data | model) P(model) / P(data)
  (To have beliefs about the world, we trust the statistics.)
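A minimal sketch (not from the slides) of Bayes' rule applied to two candidate models of the data; the prior and likelihood numbers are made up for illustration:

# Minimal sketch: posterior beliefs over models via Bayes' rule.
prior = {"model_a": 0.5, "model_b": 0.5}          # P(model)
likelihood = {"model_a": 0.20, "model_b": 0.05}   # P(data | model)

evidence = sum(prior[m] * likelihood[m] for m in prior)           # P(data)
posterior = {m: prior[m] * likelihood[m] / evidence for m in prior}
print(posterior)  # {'model_a': 0.8, 'model_b': 0.2}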

UNSUPERVISED LEARNING: LATENT VARIABLE MODELS
- Factor Analysis
- Principal Component Analysis
- Independent Component Analysis
- Mixture of Gaussians
- K-Means

UNSUPERVISED LEARNING: LATENT VARIABLE MODELS: FACTOR ANALYSIS

UNSUPERVISED LEARNING: LATENT VARIABLE MODELS: PRINCIPAL COMPONENT ANALYSIS

UNSUPERVISED LEARNING: LATENT VARIABLE MODELS: INDEPENDENT COMPONENT ANALYSIS

UNSUPERVISED LEARNING: LATENT VARIABLE MODELS: MIXTURE OF GAUSSIANS

UNSUPERVISED LEARNING: LATENT VARIABLE MODELS: K-MEANS
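The K-Means slide's figure is not preserved in this transcript; below is a minimal sketch (not from the slides) of the standard K-Means loop: assign each point to its nearest centroid, move each centroid to the mean of its assigned points, and repeat until the centroids stop changing. The naive initialization is an illustrative assumption:

# Minimal sketch: the K-Means assignment/update loop.
import math

def kmeans(points, k, iters=100):
    centroids = points[:k]  # naive initialization: the first k points
    for _ in range(iters):
        # Assignment step: nearest centroid per point.
        clusters = [[] for _ in range(k)]
        for p in points:
            d = [math.dist(p, c) for c in centroids]
            clusters[d.index(min(d))].append(p)
        # Update step: each centroid becomes the mean of its cluster.
        new_centroids = [
            tuple(sum(xs) / len(xs) for xs in zip(*cl)) if cl else centroids[i]
            for i, cl in enumerate(clusters)
        ]
        if new_centroids == centroids:  # converged
            break
        centroids = new_centroids
    return centroids

print(kmeans([(0, 0), (0, 1), (10, 10), (10, 11)], k=2))
# [(0.0, 0.5), (10.0, 10.5)]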

UNSUPERVISED LEARNING: EM ALGORITHM
EM: Expectation-Maximisation
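A minimal sketch (not from the slides) of Expectation-Maximisation for a mixture of two 1-D Gaussians, tying the EM slide to the Mixture of Gaussians model above; the initial parameter guesses are illustrative assumptions:

# Minimal sketch: EM for a two-component 1-D Gaussian mixture.
import math

def normal_pdf(x, mu, var):
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def em_two_gaussians(data, iters=50):
    pi, mu = [0.5, 0.5], [min(data), max(data)]   # mixing weights, means
    var = [1.0, 1.0]                              # variances
    for _ in range(iters):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in data:
            p = [pi[k] * normal_pdf(x, mu[k], var[k]) for k in range(2)]
            total = sum(p)
            resp.append([pk / total for pk in p])
        # M-step: re-estimate parameters from the responsibilities.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            pi[k] = nk / len(data)
            mu[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var[k] = sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, data)) / nk
            var[k] = max(var[k], 1e-6)            # avoid degenerate variance
    return pi, mu, var

data = [0.9, 1.0, 1.1, 4.9, 5.0, 5.1]
pi, mu, var = em_two_gaussians(data)
print(mu)  # means near 1.0 and 5.0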

UNSUPERVISED LEARNING: MODELING TIME SERIES

UNSUPERVISED LEARNING: MODELING TIME SERIES: STATE-SPACE MODELS

UNSUPERVISED LEARNING: MODELING TIME SERIES: HIDDEN MARKOV MODELS

UNSUPERVISED LEARNING: NONLINEAR, FACTORIAL AND HIERARCHICAL MODELS

UNSUPERVISED LEARNING: NONLINEAR, FACTORIAL AND HIERARCHICAL MODELS

UNSUPERVISED LEARNING: NONLINEAR, FACTORIAL AND HIERARCHICAL MODELS

UNSUPERVISED LEARNING: INTRACTABILITY

UNSUPERVISED LEARNING: GRAPHICAL MODELS

UNSUPERVISED LEARNING: GRAPHICAL MODELS: UNDIRECTED GRAPHS

UNSUPERVISED LEARNING: GRAPHICAL MODELS: FACTOR GRAPHS

UNSUPERVISED LEARNING: GRAPHICAL MODELS: EXPRESSIVE POWER

UNSUPERVISED LEARNING: EXACT INFERENCE IN GRAPHS

UNSUPERVISED LEARNING: EXACT INFERENCE IN GRAPHS ELIMINATION

UNSUPERVISED LEARNING: EXACT INFERENCE IN GRAPHS BELIEF PROPAGATION

UNSUPERVISED LEARNING: EXACT INFERENCE IN GRAPHS FACTOR GRAPH PROPAGATION

UNSUPERVISED LEARNING: EXACT INFERENCE IN GRAPHS JUNCTION TREE ALGORITHM

UNSUPERVISED LEARNING: EXACT INFERENCE IN GRAPHS CUTSET CONDITIONING

UNSUPERVISED LEARNING: LEARNING IN GRAPHICAL MODELS

UNSUPERVISED LEARNING: LEARNING IN GRAPHICAL MODELS LEARNING GRAPH PARAMETERS

UNSUPERVISED LEARNING: LEARNING IN GRAPHICAL MODELS LEARNING GRAPH STRUCTURE

UNSUPERVISED LEARNING: BAYESIAN MODEL COMPARISON & OCCAM’S RAZOR

UNSUPERVISED LEARNING: APPROXIMATING POSTERIORS & MARGINAL LIKELIHOODS

UNSUPERVISED LEARNING: DISCUSSION
Semi-supervised learning: a small amount of labelled data and a large amount of unlabelled data.
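A minimal sketch (not from the slides) of the semi-supervised setting, assuming scikit-learn: label spreading propagates the few known labels to the many unlabelled points (marked with -1) through a similarity graph. The data and parameters are illustrative assumptions:

# Minimal sketch: semi-supervised label spreading with scikit-learn.
import numpy as np
from sklearn.semi_supervised import LabelSpreading

X = np.array([[0.0], [0.2], [0.4], [5.0], [5.2], [5.4]])
y = np.array([0, -1, -1, 1, -1, -1])  # only two points are labelled

model = LabelSpreading(kernel="knn", n_neighbors=2).fit(X, y)
print(model.transduction_)  # expected: [0 0 0 1 1 1]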

THE END