Machine Learning Usman Roshan Dept. of Computer Science NJIT

What is Machine Learning? "Machine learning is programming computers to optimize a performance criterion using example data or past experience." (Introduction to Machine Learning, Alpaydin, 2010)
Examples:
– Facial recognition
– Digit recognition
– Molecular classification

A little history
– 1946: ENIAC, the first general-purpose electronic computer, is built to perform numerical computations
– 1950: Alan Turing proposes the Turing test: can machines think?
– 1952: First game-playing program, for checkers, by Arthur Samuel at IBM
– 1957: Perceptron developed by Frank Rosenblatt; perceptrons can be combined to form a neural network
– 1960s–70s: Knowledge-based systems such as ELIZA and MYCIN
– Early 1990s: Statistical learning theory emphasizes learning from data instead of rule-based inference
– Current status: used widely in industry; a combination of various approaches, with data-driven methods prevalent

Example up-close
Problem: recognize images representing the digits 0 through 9
– Input: high-dimensional vectors representing images
– Output: 0 through 9, indicating the digit the image represents
– Learning: build a model from "training data", then predict "test data" with the model (see the sketch below)
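A minimal sketch of this train/predict workflow, using scikit-learn's bundled 8x8 digits dataset; the choice of a k-nearest-neighbor classifier and all parameter values here are illustrative assumptions, not course requirements.

    # Minimal sketch: build a model from training data, predict test data.
    # Classifier choice and parameters are illustrative, not prescribed.
    from sklearn.datasets import load_digits
    from sklearn.model_selection import train_test_split
    from sklearn.neighbors import KNeighborsClassifier

    digits = load_digits()                       # each 8x8 image is a 64-dimensional vector
    X_train, X_test, y_train, y_test = train_test_split(
        digits.data, digits.target, test_size=0.25, random_state=0)

    model = KNeighborsClassifier(n_neighbors=3)  # build a model from training data
    model.fit(X_train, y_train)
    print("test accuracy:", model.score(X_test, y_test))  # evaluate on held-out test data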

Data model
We assume the data is represented by a set of vectors, each of fixed dimensionality.
– Vector: an ordered set of numbers
– We may refer to each vector as a datapoint and to each dimension as a feature
Example:
– A bank wishes to classify loan applicants as risky or safe
– Each applicant is a datapoint represented by a vector
– Features may be age, income, mortgage/rent, education, family, current loans, and so on (a toy example follows below)
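A toy illustration of this representation in numpy; the features and all values below are invented for illustration.

    # Each applicant is a datapoint, i.e., a fixed-length feature vector.
    import numpy as np

    # features: age, income (k$/yr), rent (k$/mo), years of education, family size
    applicant_a = np.array([34, 72.0, 1.4, 16, 3])
    applicant_b = np.array([51, 48.5, 0.9, 12, 4])

    X = np.vstack([applicant_a, applicant_b])  # data matrix: one row per datapoint
    y = np.array([0, 1])                       # hypothetical labels: 0 = safe, 1 = risky
    print(X.shape)                             # (2, 5): 2 datapoints, 5 features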

Machine learning resources
Data:
– NIPS 2003 feature selection contest
– mldata.org
– UCI machine learning repository
Contests:
– Kaggle
Software:
– Python scikit-learn
– R
– Your own code

Machine Learning techniques we will learn in the course
– Bayesian classification: univariate and multivariate
– Linear regression
– Maximum likelihood estimation
– Naïve Bayes
– Feature selection
– Dimensionality reduction: PCA, Fisher discriminant, maximum margin criterion
– Clustering
– Nearest neighbor
– Perceptron and neural networks
– Linear discrimination: logistic regression
– Support vector machines
– Kernel methods
– Regularized risk minimization
– Hidden Markov models
– Decision trees and random forests (if time permits)
– Advanced topics (if time permits): boosting, deep learning

Textbook
Not required but highly recommended for beginners:
– Introduction to Machine Learning by Ethem Alpaydin (2nd edition, 2010, MIT Press). Written by a computer scientist; the material is accessible with a basic probability and linear algebra background.
– Applied Predictive Modeling by Kuhn and Johnson (2013, Springer). A more recent book that focuses on practical modeling.

Some practical techniques
– Combination of various methods
– Parameter tuning: trading off error against model complexity
– Data pre-processing: normalization and standardization (a short sketch follows below)
– Feature selection: discarding noisy features
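A sketch of the standardization step, which rescales each feature to zero mean and unit variance; the random data and split sizes here are illustrative assumptions.

    # Standardization sketch: estimate scaling on training data only,
    # then apply the same transform to test data.
    import numpy as np
    from sklearn.preprocessing import StandardScaler

    rng = np.random.default_rng(0)
    X_train = rng.normal(loc=10.0, scale=5.0, size=(100, 3))
    X_test = rng.normal(loc=10.0, scale=5.0, size=(20, 3))

    scaler = StandardScaler().fit(X_train)  # mean/std estimated from training data
    X_train_std = scaler.transform(X_train)
    X_test_std = scaler.transform(X_test)   # same transform applied to test data
    print(X_train_std.mean(axis=0).round(2))  # ~[0. 0. 0.]
    print(X_train_std.std(axis=0).round(2))   # ~[1. 1. 1.]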

Background
Basic linear algebra and probability:
– Vectors
– Dot products
– Eigenvectors and eigenvalues
See the appendix of the textbook for probability background:
– Mean
– Variance
– Gaussian/normal distribution
A short numpy refresher follows below.
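A short numpy refresher on these background concepts; all numbers are arbitrary examples.

    import numpy as np

    u = np.array([1.0, 2.0])
    v = np.array([3.0, -1.0])
    print(np.dot(u, v))            # dot product: 1*3 + 2*(-1) = 1.0

    A = np.array([[2.0, 0.0],
                  [0.0, 3.0]])
    vals, vecs = np.linalg.eig(A)  # columns of vecs satisfy A @ vecs[:, i] = vals[i] * vecs[:, i]
    print(vals)                    # [2. 3.]

    x = np.random.normal(loc=0.0, scale=1.0, size=100000)  # Gaussian/normal samples
    print(x.mean(), x.var())       # sample mean and variance, close to 0 and 1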

Assignments
Implementation of basic classification algorithms with Perl and Python:
– Nearest means (a sketch follows below)
– Naïve Bayes
– K-nearest neighbor
– Cross-validation scripts
Experiment with various algorithms on assigned datasets.
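One plausible sketch of the nearest means classifier: fitting computes each class mean, and prediction assigns a point to the class whose mean is closest in Euclidean distance. This is an assumption about the assignment's intent, not the official specification.

    # Nearest means: classify by the closest class mean (Euclidean distance).
    import numpy as np

    def nearest_means_fit(X, y):
        """Return a dict mapping each class label to its mean feature vector."""
        return {c: X[y == c].mean(axis=0) for c in np.unique(y)}

    def nearest_means_predict(means, X):
        """Assign each row of X to the class with the closest mean."""
        labels = list(means)
        centers = np.vstack([means[c] for c in labels])
        # pairwise distances: rows of X vs. class centers
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        return np.array([labels[i] for i in dists.argmin(axis=1)])

    # tiny made-up example: two well-separated classes
    X = np.array([[0.0, 0.0], [1.0, 1.0], [9.0, 9.0], [10.0, 10.0]])
    y = np.array([0, 0, 1, 1])
    means = nearest_means_fit(X, y)
    print(nearest_means_predict(means, np.array([[0.5, 0.5], [9.5, 9.5]])))  # [0 1]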

Project
Some ideas:
– Experiment with the Kaggle and NIPS 2003 feature selection datasets
– An experimental performance study of various machine learning techniques on a given dataset, for example a comparison of feature selection methods with a fixed classifier

Exams
– One exam at mid-semester
– Final exam
What to expect on the exams:
– Basic conceptual understanding of machine learning techniques
– Applying techniques to simple datasets
– Basic runtime and memory requirements
– Simple modifications of the techniques

Grade breakdown
– Assignments and project: 50%
– Exams: 50%