Machine Learning
Usman Roshan
Dept. of Computer Science, NJIT


What is Machine Learning?
“Machine learning is programming computers to optimize a performance criterion using example data or past experience.” (Introduction to Machine Learning, Alpaydin, 2010)
Examples:
- Facial recognition
- Digit recognition
- Molecular classification

A little history
- 1946: ENIAC, the first general-purpose electronic computer, performs numerical computations.
- 1950: Alan Turing proposes the Turing test: can machines think?
- 1952: Arthur Samuel at IBM writes the first game-playing program, for checkers.
- 1957: Frank Rosenblatt develops the perceptron; perceptrons can be combined to form a neural network.
- 1960s-70s: Knowledge-based systems such as ELIZA and MYCIN.
- Early 1990s: Statistical learning theory emphasizes learning from data instead of rule-based inference.
- Today: machine learning is used widely in industry; various approaches are combined, but data-driven methods are prevalent.

Example up-close
Problem: recognize images representing the digits 0 through 9.
Input: high-dimensional vectors representing images.
Output: 0 through 9, indicating the digit the image represents.
Learning:
- Build a model from “training data”
- Predict “test data” with the model
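As a concrete illustration of this train/predict workflow (not part of the original slides), here is a minimal Python sketch using scikit-learn's built-in 8x8 digit images; the choice of logistic regression as the model is an assumption made only for illustration.

from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

# Load 1,797 handwritten digit images, each an 8x8 image flattened
# into a 64-dimensional vector, with labels 0 through 9.
digits = load_digits()
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.25, random_state=0)

model = LogisticRegression(max_iter=5000)  # build a model from training data
model.fit(X_train, y_train)
print("test accuracy:", model.score(X_test, y_test))  # predict test data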

Data model
We assume the data is represented by a set of vectors, each of fixed dimensionality.
- Vector: an ordered set of numbers
- We refer to each vector as a datapoint and each dimension as a feature
Example: a bank wishes to classify loan applicants as risky or safe.
- Each applicant is a datapoint represented by a vector
- Features may be age, income, mortgage/rent, education, family, current loans, and so on
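To make the vector representation concrete, a minimal NumPy sketch; the example and all of its numbers are hypothetical, not from the slides.

import numpy as np

# Each row is a datapoint (one loan applicant); each column is a feature:
# age, income in $1000s, monthly mortgage/rent in $100s (hypothetical values).
X = np.array([[35, 80, 20],
              [52, 45, 12],
              [23, 30,  9]])
y = np.array([0, 1, 1])  # hypothetical labels: 0 = safe, 1 = risky

print(X.shape)  # (3, 3): three datapoints, each a 3-dimensional feature vector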

Machine learning resources
Data:
- NIPS 2003 feature selection contest
- mldata.org
- UCI machine learning repository
Contests:
- Kaggle
Software:
- Python scikit-learn
- R
- Your own code

Machine learning techniques and concepts we will learn in this course
- Bayesian classification: univariate and multivariate
- Linear regression
- Maximum likelihood estimation
- Naïve Bayes
- Perceptron and basic single-layer neural networks
- Linear discrimination and gradient descent optimization: least squares, logistic regression, support vector machines
- Kernel methods
- Regularized risk minimization
- Bayesian decision theory and error bounds
- Decision trees, random forests, and boosting
- Feature selection
- Dimensionality reduction: PCA, Fisher discriminant, maximum margin criterion
- Clustering
- Hidden Markov models
- Big data methods
- Representation learning
- Deep learning

Textbooks
Not required but highly recommended for beginners:
- Introduction to Machine Learning by Ethem Alpaydin (2nd edition, 2010, MIT Press). Written by a computer scientist; the material is accessible with a basic probability and linear algebra background.
- Foundations of Machine Learning by Mehryar Mohri, Afshin Rostamizadeh, and Ameet Talwalkar (2012, MIT Press).
- Applied Predictive Modeling by Kuhn and Johnson (2013, Springer). This book focuses on practical modeling.

Some practical techniques
- Combination of various methods
- Randomization methods
- Parameter tuning
- Trade-off between error and model complexity
- Data pre-processing: normalization, standardization
- Feature selection: discarding noisy features
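As an illustration of the last two items (scikit-learn assumed; the data is made up), a minimal sketch of standardization and one simple form of noisy-feature removal:

import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.feature_selection import VarianceThreshold

X = np.array([[1.0, 200.0, 5.0],
              [2.0, 180.0, 5.0],
              [3.0, 220.0, 5.0]])  # hypothetical data; the last feature is constant

# Standardization: rescale each feature to zero mean and unit variance.
X_std = StandardScaler().fit_transform(X)

# A simple feature-selection step: discard zero-variance features,
# which carry no information for discriminating between datapoints.
X_sel = VarianceThreshold().fit_transform(X)
print(X_std.shape, X_sel.shape)  # (3, 3) (3, 2)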

Background
Basic linear algebra and probability:
- Vectors
- Dot products
- Eigenvectors and eigenvalues
See the appendix of the textbook for probability background:
- Mean
- Variance
- Gaussian/Normal distribution
Also see the basic and applied statistics slides on the course website.
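A short NumPy sketch of these background concepts (illustrative numbers only, not from the slides):

import numpy as np

u = np.array([1.0, 2.0])
v = np.array([3.0, 4.0])
print(np.dot(u, v))  # dot product: 1*3 + 2*4 = 11

A = np.array([[2.0, 0.0],
              [0.0, 3.0]])
vals, vecs = np.linalg.eig(A)  # eigenvalues (2 and 3) and eigenvectors of A
print(vals)

x = np.random.normal(loc=0.0, scale=1.0, size=100000)  # Gaussian/Normal samples
print(x.mean(), x.var())  # sample mean and variance, close to 0 and 1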

Assignments
Implementation of basic classification algorithms with Perl and Python:
- Nearest means
- Naïve Bayes
- Gradient descent for least squares, hinge loss, and logistic loss
- CART algorithm for decision trees
- K-means clustering
- Optional feature learning assignment
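To give a feel for the first assignment, here is a minimal Python/NumPy sketch of a nearest means classifier (a sketch for illustration, not the official assignment solution): assign each test point to the class whose training mean is closest in Euclidean distance.

import numpy as np

def nearest_means_fit(X, y):
    """Return the class labels and one mean vector per class."""
    labels = np.unique(y)
    return labels, np.array([X[y == c].mean(axis=0) for c in labels])

def nearest_means_predict(X, labels, means):
    # Distance from every test point to every class mean, then pick the closest.
    dists = np.linalg.norm(X[:, None, :] - means[None, :, :], axis=2)
    return labels[dists.argmin(axis=1)]

# Tiny hypothetical example: two well-separated classes in 2D.
X = np.array([[0.0, 0.0], [1.0, 1.0], [9.0, 9.0], [10.0, 10.0]])
y = np.array([0, 0, 1, 1])
labels, means = nearest_means_fit(X, y)
print(nearest_means_predict(np.array([[2.0, 2.0], [8.0, 8.0]]), labels, means))  # [0 1]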

Project
Feature selection on high-dimensional genomic data

Exams
- One midterm exam
- Final exam
What to expect on the exams:
- Basic conceptual understanding of machine learning techniques
- Ability to apply techniques to simple datasets
- Basic runtime and memory requirements
- Simple modifications of the techniques

Grade breakdown
- Assignments and project: 50%
- Exams: 50%