Logistic Regression: Classification (Machine Learning)

Classification
Email: Spam / Not Spam?
Online Transactions: Fraudulent (Yes / No)?
Tumor: Malignant / Benign?

y ∈ {0, 1}
0: "Negative Class" (e.g., benign tumor)
1: "Positive Class" (e.g., malignant tumor)

(Figure: malignancy (1 = Yes, 0 = No) plotted against tumor size, with a straight-line fit through the data.)
Threshold the classifier output h_θ(x) at 0.5:
If h_θ(x) ≥ 0.5, predict "y = 1"
If h_θ(x) < 0.5, predict "y = 0"

Classification: y = 0 or 1, but a linear regression hypothesis h_θ(x) can be > 1 or < 0.
Logistic Regression: 0 ≤ h_θ(x) ≤ 1. (Despite the name, it is a classification algorithm, not regression.)

Logistic Regression: Hypothesis Representation (Machine Learning)

Logistic Regression Model
Want 0 ≤ h_θ(x) ≤ 1.
h_θ(x) = g(θ^T x), where g(z) = 1 / (1 + e^(−z)).
g(z) is the sigmoid function (also called the logistic function): it rises smoothly from 0 to 1 and crosses 0.5 at z = 0.
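As a minimal Octave sketch (the function name is illustrative, not from the slides):

function g = sigmoid(z)
  % Logistic function, applied elementwise; works for scalars, vectors, and matrices.
  g = 1 ./ (1 + exp(-z));
end

For example, sigmoid(0) returns 0.5, sigmoid(10) ≈ 1, and sigmoid(-10) ≈ 0.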

Interpretation of Hypothesis Output
h_θ(x) = estimated probability that y = 1 on input x.
Example: if x = [x_0; x_1] = [1; tumorSize] and h_θ(x) = 0.7, tell the patient there is a 70% chance of the tumor being malignant.
h_θ(x) = P(y = 1 | x; θ), the "probability that y = 1, given x, parameterized by θ".
Since y must be 0 or 1: P(y = 0 | x; θ) = 1 − P(y = 1 | x; θ).

Logistic Regression: Decision Boundary (Machine Learning)

Logistic regression
h_θ(x) = g(θ^T x), with g(z) = 1 / (1 + e^(−z)).
(Plot of g(z): g(z) ≥ 0.5 exactly when z ≥ 0.)
Suppose we predict "y = 1" if h_θ(x) ≥ 0.5, which happens exactly when θ^T x ≥ 0,
and predict "y = 0" if h_θ(x) < 0.5, i.e., when θ^T x < 0.

Decision Boundary
(Figure: two-class training data in the (x1, x2) plane, axes from 1 to 3.)
Take h_θ(x) = g(θ_0 + θ_1 x_1 + θ_2 x_2) with θ = [−3; 1; 1].
Predict "y = 1" if −3 + x_1 + x_2 ≥ 0, i.e., if x_1 + x_2 ≥ 3.
The line x_1 + x_2 = 3 is the decision boundary: a property of the hypothesis, not of the data.
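As a quick numeric check in Octave, using the slide's θ and the hypothetical sigmoid helper above:

theta = [-3; 1; 1];
x = [1; 2; 2];               % x0 = 1 (bias), (x1, x2) = (2, 2)
p = sigmoid(theta' * x);     % theta' * x = 1, so p ≈ 0.73: predict y = 1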

Non-linear decision boundaries
(Figure: data in the (x1, x2) plane, axes from −1 to 1; the positive examples lie outside a circle around the origin.)
With h_θ(x) = g(θ_0 + θ_1 x_1 + θ_2 x_2 + θ_3 x_1² + θ_4 x_2²) and θ = [−1; 0; 0; 1; 1]:
Predict "y = 1" if −1 + x_1² + x_2² ≥ 0, i.e., if x_1² + x_2² ≥ 1.
The decision boundary is the circle x_1² + x_2² = 1; higher-order polynomial features yield more complex boundaries.
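The same check works for the polynomial boundary; only the feature construction changes (again a sketch, reusing the hypothetical sigmoid):

theta = [-1; 0; 0; 1; 1];
x1 = 0.5; x2 = 0.5;                    % a point inside the unit circle
features = [1; x1; x2; x1^2; x2^2];    % bias plus raw and squared features
p = sigmoid(theta' * features);        % theta' * features = -0.5, p ≈ 0.38: predict y = 0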

Logistic Regression: Cost Function (Machine Learning)

Training set: {(x^(1), y^(1)), (x^(2), y^(2)), …, (x^(m), y^(m))} of m examples.
x = [x_0; x_1; …; x_n] with x_0 = 1, and y ∈ {0, 1}.
h_θ(x) = 1 / (1 + e^(−θ^T x)).
How do we choose the parameters θ?

Cost function
Linear regression used J(θ) = (1/m) Σ_{i=1}^{m} (1/2)(h_θ(x^(i)) − y^(i))², i.e., Cost(h_θ(x), y) = (1/2)(h_θ(x) − y)².
With the sigmoid hypothesis plugged in, this squared-error J(θ) is "non-convex" (many local optima), so gradient descent is not guaranteed to find the global minimum. We want a cost that makes J(θ) "convex".

Logistic regression cost function
Cost(h_θ(x), y) = −log(h_θ(x)) if y = 1; −log(1 − h_θ(x)) if y = 0.
If y = 1: (plot of −log(h_θ(x)) for 0 < h_θ(x) ≤ 1) the cost is 0 when h_θ(x) = 1 and grows without bound as h_θ(x) → 0, so confidently predicting y = 0 when in fact y = 1 is penalized very heavily.

If y = 0: (plot of −log(1 − h_θ(x)) for 0 ≤ h_θ(x) < 1) the cost is 0 when h_θ(x) = 0 and grows without bound as h_θ(x) → 1: the symmetric penalty for a confident wrong prediction.
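A small numeric illustration of the two branches:

h = 0.3;                      % model's estimate of P(y = 1 | x)
cost_if_y1 = -log(h);         % ≈ 1.20: y = 1 but h is low, large penalty
cost_if_y0 = -log(1 - h);     % ≈ 0.36: y = 0 and h is low, small penalty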

Logistic Regression: Simplified Cost Function and Gradient Descent (Machine Learning)

Logistic regression cost function
J(θ) = (1/m) Σ_{i=1}^{m} Cost(h_θ(x^(i)), y^(i))
Because y is always 0 or 1, the two cases collapse into a single expression:
Cost(h_θ(x), y) = −y log(h_θ(x)) − (1 − y) log(1 − h_θ(x))
(when y = 1 the second term vanishes; when y = 0 the first term vanishes).

So the full cost is
J(θ) = −(1/m) Σ_{i=1}^{m} [ y^(i) log(h_θ(x^(i))) + (1 − y^(i)) log(1 − h_θ(x^(i))) ]
To fit parameters θ: solve min_θ J(θ).
To make a prediction given a new x: output h_θ(x) = 1 / (1 + e^(−θ^T x)), the estimated P(y = 1 | x; θ).
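A vectorized Octave sketch of this cost and its gradient, assuming a design matrix X with one example per row (first column all ones), a 0/1 label vector y, and the sigmoid helper above (the function name is illustrative):

function [jVal, gradient] = logisticCost(theta, X, y)
  % Logistic regression cost J(theta) and its gradient, vectorized over all m examples.
  m = length(y);
  h = sigmoid(X * theta);                                  % m x 1 vector of predictions
  jVal = -(1/m) * (y' * log(h) + (1 - y)' * log(1 - h));   % scalar cost
  gradient = (1/m) * X' * (h - y);                         % (n+1) x 1 gradient
end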

Gradient Descent
J(θ) = −(1/m) Σ_{i=1}^{m} [ y^(i) log(h_θ(x^(i))) + (1 − y^(i)) log(1 − h_θ(x^(i))) ]
Want min_θ J(θ):
Repeat { θ_j := θ_j − α ∂J(θ)/∂θ_j } (simultaneously update all θ_j)

Working out the derivative gives
Repeat { θ_j := θ_j − α (1/m) Σ_{i=1}^{m} (h_θ(x^(i)) − y^(i)) x_j^(i) } (simultaneously update all θ_j)
The algorithm looks identical to linear regression! The difference is hidden in the hypothesis: h_θ(x) is now 1 / (1 + e^(−θ^T x)) instead of θ^T x.
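A minimal batch gradient descent loop, reusing the hypothetical logisticCost above (the learning rate and iteration count are illustrative, not prescribed by the slides):

alpha = 0.1;                              % learning rate
theta = zeros(size(X, 2), 1);             % start at theta = 0
for iter = 1:400
  [jVal, gradient] = logisticCost(theta, X, y);
  theta = theta - alpha * gradient;       % simultaneous update of all theta_j
end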

Logistic Regression: Advanced Optimization (Machine Learning)

Optimization algorithm
Cost function J(θ). Want min_θ J(θ).
Given θ, we have code that can compute J(θ) and ∂J(θ)/∂θ_j (for j = 0, 1, …, n).
Gradient descent: Repeat { θ_j := θ_j − α ∂J(θ)/∂θ_j }

Optimization algorithm
Given θ, we have code that can compute J(θ) and ∂J(θ)/∂θ_j (for j = 0, 1, …, n).
Optimization algorithms:
Gradient descent
Conjugate gradient
BFGS
L-BFGS
Advantages of the last three: no need to manually pick α; often faster than gradient descent.
Disadvantage: more complex.

Example: minimize J(θ) = (θ_1 − 5)² + (θ_2 − 5)²:

function [jVal, gradient] = costFunction(theta)
  jVal = (theta(1) - 5)^2 + (theta(2) - 5)^2;   % cost J(theta)
  gradient = zeros(2, 1);
  gradient(1) = 2 * (theta(1) - 5);             % dJ/dtheta_1
  gradient(2) = 2 * (theta(2) - 5);             % dJ/dtheta_2
end

options = optimset('GradObj', 'on', 'MaxIter', 100);
initialTheta = zeros(2, 1);
[optTheta, functionVal, exitFlag] = fminunc(@costFunction, initialTheta, options);
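Running this, fminunc should return optTheta ≈ [5; 5], functionVal ≈ 0, and exitFlag = 1 (converged). Setting 'GradObj' to 'on' tells fminunc to use the gradient returned by costFunction rather than approximating it numerically.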

To use these optimizers for logistic regression, supply θ = [θ_0; θ_1; …; θ_n] and a function of this shape:

function [jVal, gradient] = costFunction(theta)
  jVal = [code to compute J(θ)];
  gradient(1) = [code to compute ∂J(θ)/∂θ_0];
  gradient(2) = [code to compute ∂J(θ)/∂θ_1];
  ...
  gradient(n+1) = [code to compute ∂J(θ)/∂θ_n];

(Octave vectors are 1-indexed, so gradient(1) holds the derivative with respect to θ_0.)

Logistic Regression: Multi-class Classification, One-vs-all (Machine Learning)

Multiclass classification
Email foldering/tagging: Work, Friends, Family, Hobby
Medical diagnosis: Not ill, Cold, Flu
Weather: Sunny, Cloudy, Rain, Snow

Binary classification vs. multi-class classification
(Figure: left, a two-class data set in the (x1, x2) plane; right, a three-class data set in the same plane.)

One-vs-all (one-vs-rest)
(Figure: the three-class data set redrawn three times; in each copy one class is relabeled positive and the other two negative.)
For each class i, fit a standard binary logistic regression classifier on the relabeled data:
Class 1: h_θ^(1)(x)
Class 2: h_θ^(2)(x)
Class 3: h_θ^(3)(x)
where h_θ^(i)(x) = P(y = i | x; θ).

One-vs-all
Train a logistic regression classifier h_θ^(i)(x) for each class i to predict the probability that y = i.
On a new input x, to make a prediction, pick the class i that maximizes h_θ^(i)(x).
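A minimal Octave sketch of the prediction step, assuming Theta is a K x (n+1) matrix whose i-th row holds the fitted parameters of classifier i, and x is an (n+1) x 1 input with x(1) = 1 (the variable names are illustrative):

scores = sigmoid(Theta * x);      % K x 1 vector of estimated P(y = i | x)
[~, prediction] = max(scores);    % pick the class whose classifier is most confident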