Midterm Exam: 03/01 (Thursday), take-home; turn in by noon on 03/02 (Friday).

Project
03/15 (Phase 1): 10% of the training data is available for algorithm development
04/05 (Phase 2): full training data and test examples are available
04/18 (Submission): submit your predictions before 11:59pm on Apr. 18 (Wednesday)
04/24 and 04/26: project presentations; competition results announced
04/30: project report is due

Logistic Regression Rong Jin

Logistic Regression: Generative models often lead to a linear decision boundary. Logistic regression is a linear discriminative model: it directly models the linear decision boundary, where w is the parameter to be learned.
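A sketch of one standard way to write the model, assuming the ±1 label convention used in the later examples; the bias term b and the notation are assumptions:

$$p(y \mid \mathbf{x}; \mathbf{w}, b) = \frac{1}{1 + \exp\!\big(-y\,(\mathbf{w}^\top \mathbf{x} + b)\big)}, \qquad y \in \{-1, +1\}.$$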

Logistic Regression

Learn the parameter w by Maximum Likelihood Estimation (MLE), given the training data.
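Under the same assumed notation, a sketch of the log-likelihood that MLE maximizes, for a training set {(x_i, y_i)}, i = 1, …, n (notation assumed):

$$\ell(\mathbf{w}) = \sum_{i=1}^{n} \log p(y_i \mid \mathbf{x}_i; \mathbf{w}, b) = -\sum_{i=1}^{n} \log\!\big(1 + \exp\!\big(-y_i(\mathbf{w}^\top \mathbf{x}_i + b)\big)\big).$$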

Logistic Regression: the objective function is convex, so gradient descent converges to the global optimum; performance is measured by classification error.
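A sketch of the corresponding gradient ascent update, again under the assumed notation (σ is the sigmoid function and η the step size):

$$\nabla_{\mathbf{w}}\,\ell(\mathbf{w}) = \sum_{i=1}^{n} \big(1 - \sigma\big(y_i(\mathbf{w}^\top \mathbf{x}_i + b)\big)\big)\, y_i\, \mathbf{x}_i, \qquad \mathbf{w} \leftarrow \mathbf{w} + \eta\, \nabla_{\mathbf{w}}\,\ell(\mathbf{w}).$$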


Illustration of Gradient Descent

How to Decide the Step Size? Use backtracking line search.
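A minimal sketch of gradient descent with backtracking line search for logistic regression, minimizing the negative log-likelihood; the function names, Armijo constants, and other hyperparameters are illustrative assumptions, not the course's reference implementation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def neg_log_likelihood(w, X, y):
    # Negative log-likelihood for labels y in {-1, +1}; X has shape (n, d).
    return np.sum(np.log1p(np.exp(-y * (X @ w))))

def gradient(w, X, y):
    # Gradient of the negative log-likelihood.
    return -X.T @ (y * (1.0 - sigmoid(y * (X @ w))))

def fit_logistic_regression(X, y, max_iter=200, alpha=0.3, beta=0.5, tol=1e-6):
    """Gradient descent with backtracking (Armijo) line search."""
    w = np.zeros(X.shape[1])
    for _ in range(max_iter):
        g = gradient(w, X, y)
        f = neg_log_likelihood(w, X, y)
        step = 1.0
        # Shrink the step until it yields a sufficient decrease in the objective.
        while (neg_log_likelihood(w - step * g, X, y) > f - alpha * step * (g @ g)
               and step > 1e-12):
            step *= beta
        w_new = w - step * g
        if np.linalg.norm(w_new - w) < tol:
            break
        w = w_new
    return w
```

Here alpha and beta are the usual backtracking parameters: the step is repeatedly multiplied by beta until the observed decrease is at least alpha times the decrease predicted by the gradient.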

Example: Heart Disease. Input feature x: age group id (groups 1-8; group 8 corresponds to ages 60-64). Output y: whether the patient has heart disease (y = +1: has heart disease; y = -1: no heart disease).

Example: Heart Disease

Example: Text Categorization. Learn to classify text into two categories. Input d: a document, represented by a word histogram. Output y = ±1: +1 for a political document, -1 for a non-political document.
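A small illustrative sketch of the word-histogram representation; the vocabulary, document, and function name are made-up examples:

```python
from collections import Counter

def word_histogram(document, vocabulary):
    """Map a document (string) to a word-count vector over a fixed vocabulary."""
    counts = Counter(document.lower().split())
    return [counts.get(word, 0) for word in vocabulary]

# Hypothetical vocabulary and document:
vocab = ["election", "vote", "soccer", "goal"]
x = word_histogram("The election vote was close", vocab)   # -> [1, 1, 0, 0]
```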

Example: Text Categorization Training data

Example 2: Text Classification. Dataset: Reuters. Classification accuracy: Naïve Bayes 77%, logistic regression 88%.

Logistic Regression vs. Naïve Bayes. Both produce linear decision boundaries: Naïve Bayes derives its weights from the class-conditional word probabilities, while logistic regression learns its weights by MLE. Both can be viewed as modeling p(d|y): Naïve Bayes makes an independence assumption, whereas logistic regression assumes an exponential-family distribution for p(d|y) (a broad assumption).
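A sketch of why the Naïve Bayes boundary is also linear, assuming a multinomial Naïve Bayes model over word counts d = (d_1, …, d_V) (this specific variant is an assumption):

$$\log\frac{p(y=+1 \mid d)}{p(y=-1 \mid d)} = \sum_{j=1}^{V} d_j \log\frac{p(\text{word } j \mid y=+1)}{p(\text{word } j \mid y=-1)} + \log\frac{p(y=+1)}{p(y=-1)},$$

which is linear in the word counts d, i.e. of the form w^T d + b.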

Logistic Regression vs. Naïve Bayes

Discriminative vs. Generative
Discriminative models: model P(y|x). Pros: usually good performance. Cons: slow convergence, expensive computation, sensitive to noisy data.
Generative models: model P(x|y). Pros: usually fast convergence, cheap computation, robust to noisy data. Cons: usually worse performance.

Overfitting Problem. Consider text categorization: what is the weight for a word j that appears in only one training document d_k?
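A sketch of the reasoning behind this question, assuming for concreteness that the single document d_k containing word j has label y_k = +1: the weight w_j enters only the likelihood term for d_k, and

$$\frac{\partial \ell}{\partial w_j} = \big(1 - \sigma\big(y_k(\mathbf{w}^\top \mathbf{x}_k + b)\big)\big)\, y_k\, x_{k,j} > 0 \quad \text{for every } \mathbf{w},$$

so the unregularized log-likelihood keeps growing as w_j → +∞, and MLE assigns that word an unboundedly large weight.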

Overfitting Problem

Overfitting Problem
[Figure: classification accuracy on test data vs. iteration, with and without regularization] Without regularization, the classification accuracy on test data decreases as training proceeds.

Solution: Regularization. Regularized log-likelihood. The effects of the regularizer: it favors small weights, guarantees a bounded norm of w, and guarantees a unique solution.
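One common form of the regularized log-likelihood, with an ℓ2 penalty (the symbol λ for the regularization strength is an assumption):

$$\ell_{\text{reg}}(\mathbf{w}) = \sum_{i=1}^{n} \log p(y_i \mid \mathbf{x}_i; \mathbf{w}, b) \;-\; \frac{\lambda}{2}\,\|\mathbf{w}\|_2^2.$$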

Regularized Logistic Regression
[Figure: classification performance vs. iteration, with and without regularization]

Regularization as Robust Optimization. Assume each data point is unknown but bounded within a sphere of radius r centered at x_i.

Sparse Solution by Lasso Regularization. RCV1 collection: 800K documents, 47K unique words.

Sparse Solution by Lasso Regularization. How do we solve the optimization problem? Options: subgradient descent, or a minimax formulation.
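A sketch of the ℓ1-regularized (lasso) objective and the subgradient that the subgradient-descent option relies on (λ again assumed as the regularization strength):

$$\min_{\mathbf{w}} \; -\sum_{i=1}^{n} \log p(y_i \mid \mathbf{x}_i; \mathbf{w}, b) + \lambda \|\mathbf{w}\|_1, \qquad \partial_{w_j} |w_j| = \begin{cases} \operatorname{sign}(w_j), & w_j \neq 0,\\ [-1, 1], & w_j = 0. \end{cases}$$

The non-differentiability of the penalty at zero is what produces exactly-zero weights, i.e. the sparse solution.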

Bayesian Treatment. Compute the posterior distribution of w; use the Laplace approximation.
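A sketch of what the Laplace approximation looks like in this setting: the posterior over w is approximated by a Gaussian centered at the MAP estimate, with covariance given by the inverse Hessian of the negative log-posterior (notation assumed):

$$p(\mathbf{w} \mid D) \approx \mathcal{N}\big(\mathbf{w} \mid \mathbf{w}_{\text{MAP}},\, \mathbf{H}^{-1}\big), \qquad \mathbf{H} = -\nabla^2 \log p(\mathbf{w} \mid D)\,\big|_{\mathbf{w} = \mathbf{w}_{\text{MAP}}}.$$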

Bayesian Treatment. Laplace approximation.

Multi-class Logistic Regression. How do we extend the logistic regression model to multi-class classification?

Conditional Exponential Model. Let the classes be 1, …, K; we need to learn one weight vector per class. The normalization factor is the partition function.
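A sketch of the conditional exponential (softmax) model, assuming classes 1, …, K and one weight vector w_k per class (notation assumed):

$$p(y = k \mid \mathbf{x}) = \frac{\exp(\mathbf{w}_k^\top \mathbf{x})}{Z(\mathbf{x})}, \qquad Z(\mathbf{x}) = \sum_{k'=1}^{K} \exp(\mathbf{w}_{k'}^\top \mathbf{x}).$$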

Conditional Exponential Model. Learn the weight vectors w_1, …, w_K by maximum likelihood estimation. Any problem?

Modified Conditional Exponential Model