Eco 6380 Predictive Analytics For Economists Spring 2016

Eco 6380 Predictive Analytics For Economists, Spring 2016. Professor Tom Fomby, Department of Economics, SMU

Presentation 6: The K-Nearest-Neighbors Model for Prediction or Classification (Chapter 7 in SPB)

OUTLINE
I. K-NN: A Nonparametric Method
   A. No parameters to estimate, unlike in multiple linear regression
   B. Definition of Euclidean distance between input vectors
   C. Recommendation: standardize the input variables before proceeding
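A minimal Python sketch, not taken from the slides, of the two ingredients in part I: standardizing the inputs using the training data's means and standard deviations, and computing the Euclidean distance between vectors. The small arrays at the bottom are hypothetical, for illustration only.

import numpy as np

def standardize(X_train, X_new):
    """Standardize columns using the training data's means and standard deviations."""
    mu = X_train.mean(axis=0)
    sigma = X_train.std(axis=0, ddof=1)
    return (X_train - mu) / sigma, (X_new - mu) / sigma

def euclidean_distance(u, v):
    """Euclidean distance between two input vectors u and v."""
    return np.sqrt(np.sum((u - v) ** 2))

# Hypothetical example: distance from a new observation to each training observation
X_train = np.array([[1.0, 200.0], [2.0, 300.0], [3.0, 250.0]])
x_new = np.array([[2.5, 275.0]])
Z_train, z_new = standardize(X_train, x_new)
distances = [euclidean_distance(z_new[0], row) for row in Z_train]
print(distances)

Standardizing first keeps a variable measured on a large scale (here the second column) from dominating the distance calculation.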

OUTLINE
II. Un-weighted nearest neighbor scores: a simple average of the training neighbors' output values
III. Weighted nearest neighbor scores: a weighted average of the training neighbors' output values
IV. Therefore, K-NN is a sophisticated "step-function" predictor that relies on an average of neighborhood output values taken from the training data set
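A minimal Python sketch, not part of the presentation, contrasting the un-weighted score of part II with an inverse-distance-weighted score as in part III. Both are averages of the K nearest training neighbors' output values; the data below are hypothetical.

import numpy as np

def knn_score(x_new, X_train, y_train, k, weighted=False, eps=1e-8):
    """Predict the output for x_new as a (weighted) average of the output
    values of its k nearest training neighbors."""
    dists = np.sqrt(((X_train - x_new) ** 2).sum(axis=1))
    nearest = np.argsort(dists)[:k]           # indices of the k closest training points
    if not weighted:
        return y_train[nearest].mean()        # simple average (part II)
    w = 1.0 / (dists[nearest] + eps)          # inverse-distance weights (part III)
    return np.sum(w * y_train[nearest]) / np.sum(w)

# Hypothetical usage on standardized inputs
X_train = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [2.0, 2.0]])
y_train = np.array([10.0, 12.0, 11.0, 30.0])
x_new = np.array([0.5, 0.5])
print(knn_score(x_new, X_train, y_train, k=3))                 # un-weighted score
print(knn_score(x_new, X_train, y_train, k=3, weighted=True))  # weighted score

Because the prediction changes only when the set of nearest neighbors changes, the fitted surface is piecewise constant, which is the "step-function" character noted in part IV.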

OUTLINE
V. In the K-NN prediction problem, the neighborhood size K is the "tuning" parameter
VI. In the K-NN classification problem, there are two "tuning" parameters: the neighborhood size and the cut-off probability for choice selection
VII. The K-NN tuning parameters are often chosen so as to maximize the accuracy of scoring the validation data set
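A minimal Python sketch, not part of the presentation, of part VII for the classification case: searching over K and the cut-off probability and keeping the pair that maximizes accuracy on the validation data set. The simulated data and the grid of cut-off values below are hypothetical placeholders.

import numpy as np

def knn_class_prob(x_new, X_train, y_train, k):
    """Estimated probability of class 1: the share of the k nearest
    training neighbors whose label equals 1."""
    dists = np.sqrt(((X_train - x_new) ** 2).sum(axis=1))
    nearest = np.argsort(dists)[:k]
    return y_train[nearest].mean()

def validation_accuracy(X_train, y_train, X_val, y_val, k, cutoff):
    """Classify each validation case with the given K and cut-off, then score accuracy."""
    probs = np.array([knn_class_prob(x, X_train, y_train, k) for x in X_val])
    preds = (probs >= cutoff).astype(int)
    return (preds == y_val).mean()

# Hypothetical simulated training and validation sets
rng = np.random.default_rng(0)
X_train = rng.normal(size=(50, 2)); y_train = (X_train.sum(axis=1) > 0).astype(int)
X_val = rng.normal(size=(20, 2));   y_val = (X_val.sum(axis=1) > 0).astype(int)

# Grid search over the two tuning parameters
best = max(
    ((k, c, validation_accuracy(X_train, y_train, X_val, y_val, k, c))
     for k in range(1, 16)
     for c in (0.3, 0.4, 0.5, 0.6, 0.7)),
    key=lambda t: t[2],
)
print("best K = %d, cut-off = %.1f, validation accuracy = %.2f" % best)

For the prediction problem only K is tuned, typically by minimizing the validation-set prediction error rather than maximizing classification accuracy.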

For a discussion of the various parts of this outline, see the PDF file K-NN Method.pdf.

Classroom Exercise: Exercise 4