Foundational Issues: Machine Learning CMPT 726, Simon Fraser University

2 Outline
Functions vs. Probabilities
The Curse of Dimensionality
Reading: Bishop, Ch. 1

3 Learning Functions
Much learning is about predicting the value of a function from a list of features.
Classification: discrete function values.
Regression: continuous function values.
Mathematical representation: map a feature vector x to a target value y, i.e., f(x) = y.
Oliver's heel pain example.
It is often most intuitive to think in terms of function learning.
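A minimal sketch of the framing above, assuming NumPy and scikit-learn are available; the toy data, the model choices, and the query point x = 1 are illustrative and not from the slides. Both tasks fit a function f(x) = y, with a continuous target for regression and a discrete target for classification.

```python
# Sketch only: classification and regression as learning f(x) = y.
# Assumes NumPy and scikit-learn; the toy data below are made up.
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=(100, 1))            # feature vectors x

# Regression: continuous target values
y_reg = 3.0 * X[:, 0] + rng.normal(0, 0.5, size=100)
reg = LinearRegression().fit(X, y_reg)
print("regression  f(1.0) =", reg.predict([[1.0]])[0])

# Classification: discrete target values
y_cls = (X[:, 0] > 0).astype(int)
clf = LogisticRegression().fit(X, y_cls)
print("classification f(1.0) =", clf.predict([[1.0]])[0])
```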

4 Why Probabilities?
Another view: the goal is to learn the probability of an outcome.
Advantages:
Rank outcomes and quantify uncertainty.
Deal with noisy data.
Help with combining predictions and pipelining.
Incorporate base-rate information (e.g., only 10% of heel pain is caused by a tumor).
Incorporate knowledge about the inverse direction, e.g., from diagnosis to symptom.
Bayes' theorem: a single formula that combines base rates with inverse probabilities.
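A worked instance of Bayes' theorem in the spirit of the heel-pain example. Only the 10% base rate comes from the slide; the two likelihoods are made-up numbers used purely for illustration.

```python
# Worked Bayes' theorem example. The 10% base rate is from the slide;
# the two likelihoods below are assumed numbers, for illustration only.
p_tumor = 0.10                     # P(tumor) among heel-pain patients (slide)
p_sign_given_tumor = 0.80          # P(clinical sign | tumor)      -- assumed
p_sign_given_healthy = 0.20        # P(clinical sign | no tumor)   -- assumed

# Bayes' theorem: P(tumor | sign) = P(sign | tumor) P(tumor) / P(sign)
p_sign = p_sign_given_tumor * p_tumor + p_sign_given_healthy * (1 - p_tumor)
p_tumor_given_sign = p_sign_given_tumor * p_tumor / p_sign
print(f"P(tumor | sign) = {p_tumor_given_sign:.3f}")   # about 0.31
```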

5 Why Not Probabilities?
Disadvantage: exact probability values may be hard to obtain, and they may be more detail than the task actually needs.

6 From Functions to Probabilities
Function + noise = probability: a deterministic relationship observed with noise induces a distribution over outcomes.
See the scatterplot and the logistic regression example.
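A sketch of this idea, assuming NumPy and scikit-learn: a hard threshold function plus label noise is recovered by logistic regression as a smooth probability P(y = 1 | x). The 15% noise rate and the query points are illustrative choices, not from the slides.

```python
# Sketch: a threshold function plus label noise, recovered by logistic
# regression as a smooth probability P(y=1 | x). Noise rate is illustrative.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
x = rng.uniform(-3, 3, size=(500, 1))
true_y = (x[:, 0] > 0).astype(int)              # the underlying function f(x)
flip = rng.random(500) < 0.15                   # noisy observation of f(x)
y = np.where(flip, 1 - true_y, true_y)

model = LogisticRegression().fit(x, y)
for xi in (-2.0, 0.0, 2.0):
    print(f"P(y=1 | x={xi:+.1f}) = {model.predict_proba([[xi]])[0, 1]:.2f}")
```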

7 From Probabilities to Functions
Learning the probability of y given x can itself be cast as function learning: f(x, y) = P(y | x).
E.g., neural nets that compute probabilities.
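A minimal illustration of the point above: a small network with a softmax output, written so that the network is literally a function of (x, y) whose value is P(y | x). The weights are fixed, made-up values; in practice they would be learned from data.

```python
# Minimal illustration (made-up, fixed weights): a small network whose value
# at (x, y) is P(y | x), via a softmax output layer.
import numpy as np

def softmax(z):
    z = z - z.max()                  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

W1, b1 = np.array([[1.0, -1.0], [0.5, 2.0]]), np.array([0.0, 0.1])
W2, b2 = np.array([[2.0, -2.0], [-1.0, 1.0]]), np.array([0.0, 0.0])

def f(x, y):
    """Return P(y | x) for class index y, as computed by the network."""
    h = np.tanh(x @ W1 + b1)         # hidden layer
    return softmax(h @ W2 + b2)[y]   # softmax output: a distribution over y

x = np.array([0.3, -0.7])
print(f(x, 0), f(x, 1))              # the two values sum to 1
```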

8 The Curse of Dimensionality
In many applications we have an abundance of features, e.g., a 20x20 image already has 400 pixel values.
Scaling standard ML methods to high-dimensional feature spaces is hard, both computationally and statistically.
Statistically, the data do not cover the space: typically only a few of the possible feature configurations actually occur.
This motivates manifold learning, and learning aggregate, global, or high-level features.
Unsupervised learning of feature hierarchies: deep learning.
Discussion question: does the brain do deep learning?
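A small numerical sketch of the coverage problem, using NumPy; the sample size and the number of bins per dimension are arbitrary illustrative choices. With the data held fixed, the fraction of occupied grid cells collapses as the dimension grows, which is the statistical side of the curse described above.

```python
# Numerical sketch of the coverage problem. Sample size and bins per
# dimension are arbitrary illustrative choices.
import numpy as np

rng = np.random.default_rng(2)
n_samples, bins = 1000, 10
for d in (1, 2, 3, 5, 10):
    X = rng.random((n_samples, d))              # uniform data in [0, 1]^d
    cells = np.floor(X * bins).astype(int)      # grid cell of each point
    occupied = len({tuple(c) for c in cells})
    total = bins ** d
    print(f"d={d:2d}: {occupied} of {total} cells occupied ({occupied/total:.2%})")
```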