Comparison of Instance-Based Techniques for Learning to Predict Changes in Stock Prices iCML Conference December 10, 2003 Presented by: David LeRoux.

Presentation transcript:

Comparison of Instance-Based Techniques for Learning to Predict Changes in Stock Prices iCML Conference December 10, 2003 Presented by: David LeRoux

Goals of Paper
Analyze k-nearest neighbor classification methods to predict whether the S&P 500 stock index will increase by more than the median amount in a month, using WEKA.
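The prediction target described above can be sketched as a binary label. This is only an illustration: the slides do not say whether the change is measured in points or in percent, or over which months the median is taken, so the function name `label_above_median` and the use of simple differences are assumptions.

```python
from statistics import median

def label_above_median(prices):
    """Label each month 1 if its change exceeds the median monthly change, else 0.

    Assumption: 'change' is the simple difference between consecutive
    month-end index levels, and the median is taken over all months given.
    """
    changes = [b - a for a, b in zip(prices, prices[1:])]
    cutoff = median(changes)
    return [1 if c > cutoff else 0 for c in changes]

# Hypothetical index levels, not real S&P 500 data.
labels = label_above_median([100.0, 102.0, 101.0, 105.0, 106.0])
```

By construction, roughly half of the months fall in each class, which makes 50% accuracy a natural baseline.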

The Data
- 14 indexes from the Federal Reserve: interest rates, business conditions, employment
- 11.5 years of monthly data through 6/2003
- Actual values and month-to-month changes
- 28 features, 138 observations
- Timing issue: when are indexes published?
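One way to read "actual values and month-to-month changes" is that each index contributes two features per month, so 14 indexes give 28 features. The sketch below assumes that reading; `build_features` is a hypothetical helper and the numbers are made up.

```python
def build_features(series):
    """Build one feature row per month from the second month onward.

    series: list of monthly rows, each a list of index values.
    Each output row holds the current values followed by their changes
    from the previous month (assumed to be simple differences).
    """
    rows = []
    for prev, cur in zip(series, series[1:]):
        changes = [c - p for p, c in zip(prev, cur)]
        rows.append(list(cur) + changes)
    return rows

# Two hypothetical indexes over three months -> 2 rows of 4 features.
features = build_features([[1.0, 10.0], [2.0, 8.0], [4.0, 9.0]])
```

The timing issue raised above matters here: a row should only contain values that had actually been published by the date the prediction is made.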

Instance-Based Classifiers
- Lazy learners: don't develop a representation
- Simple rule: classify the same way as similar situations in the training data
- Problems: What is similar? How many neighbors? How to weight contributions?
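The simple rule and the three open questions can be made concrete with a minimal k-nearest-neighbor sketch. It is written in plain Python rather than WEKA, and the choices shown (Euclidean distance, inverse-distance weighting, the name `knn_predict`) are illustrative assumptions, not the settings used in the paper.

```python
import math
from collections import defaultdict

def knn_predict(train_X, train_y, query, k=3, weighted=False):
    """Classify query by a vote among its k nearest training instances.

    'What is similar?'   -> Euclidean distance (an assumption).
    'How many?'          -> the k parameter.
    'How to weight?'     -> equal votes, or 1/distance when weighted=True.
    """
    dists = sorted((math.dist(x, query), y) for x, y in zip(train_X, train_y))
    votes = defaultdict(float)
    for d, y in dists[:k]:
        votes[y] += 1.0 / (d + 1e-9) if weighted else 1.0
    return max(votes, key=votes.get)

# Toy data: two instances of class 0 near the origin, three of class 1 far away.
X = [[0, 0], [0, 1], [5, 5], [6, 5], [5, 6]]
y = [0, 0, 1, 1, 1]
```

With `k=5` every instance votes, so the unweighted rule returns the majority class 1 even for a query next to the origin, while the distance-weighted rule returns class 0. No training step exists; all work happens at query time, which is what "lazy" refers to.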

Similarity Metric
- Curse of dimensionality
- Identifying most important features
- Normalizing data
- Distance measurement
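Normalization matters because features on large scales otherwise dominate the distance. The sketch below uses min-max scaling followed by Euclidean distance; both are common choices assumed here for illustration, since the slides do not say which scheme was used.

```python
import math

def min_max_normalize(rows):
    """Rescale every column to the range [0, 1].

    A constant column has zero range; it is divided by 1.0 instead,
    which maps all of its values to 0.
    """
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    spans = [(max(c) - min(c)) or 1.0 for c in cols]
    return [[(v - lo) / s for v, lo, s in zip(row, lows, spans)] for row in rows]

# Hypothetical data: the second feature is ~100x larger than the first.
raw = [[1.0, 100.0], [2.0, 300.0], [3.0, 200.0]]
scaled = min_max_normalize(raw)
raw_distance = math.dist(raw[0], raw[1])        # dominated by the second feature
scaled_distance = math.dist(scaled[0], scaled[1])
```

After scaling, both features contribute on a comparable footing. Selecting the most important features would then reduce the number of dimensions that enter the distance.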

Number of Observations
- Trade-off between noise reduction and homogeneity
- Formula for estimating k
- Estimate noise error using the Central Limit Theorem
- Estimate heterogeneity error using bounds on the derivative of the function being estimated
- Choose k where the two errors are roughly equal
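The transcript does not reproduce the formula itself, so the following is only a sketch of the balancing idea under stated assumptions: the noise error of an average of k neighbors shrinks like sigma / sqrt(k) (the Central Limit Theorem scaling), and the heterogeneity error is bounded by a derivative bound times the distance to the k-th neighbor. The name `choose_k` and both error models are hypothetical.

```python
import math

def choose_k(sorted_dists, sigma, deriv_bound):
    """Pick the k at which the two assumed error estimates are closest.

    sorted_dists: distances from the query to its neighbors, ascending.
    sigma:        assumed standard deviation of the label noise.
    deriv_bound:  assumed bound on the derivative of the target function.
    """
    best_k, best_gap = 1, float("inf")
    for k, radius in enumerate(sorted_dists, start=1):
        noise_error = sigma / math.sqrt(k)          # falls as k grows
        heterogeneity_error = deriv_bound * radius  # rises as k grows
        gap = abs(noise_error - heterogeneity_error)
        if gap < best_gap:
            best_k, best_gap = k, gap
    return best_k

# Hypothetical neighbor distances for one query point.
k = choose_k([0.1, 0.2, 0.3, 0.4, 0.5, 0.6], sigma=1.0, deriv_bound=1.0)
```

A smaller k keeps the neighborhood homogeneous but leaves more noise; a larger k averages noise away but reaches into dissimilar regions. When the target is flat (`deriv_bound=0`), the sketch simply uses all available neighbors.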

Results