Module 04: Algorithms, Topic 07: Instance-Based Learning

CSCI 347 / CS 4206: Data Mining
Module 04: Algorithms, Topic 07: Instance-Based Learning

Instance-Based Learning
- The distance function defines what is learned.
- Most instance-based schemes use Euclidean distance between two instances a(1) and a(2) with k attributes:
  d(a(1), a(2)) = sqrt( (a1(1) - a1(2))² + (a2(1) - a2(2))² + ... + (ak(1) - ak(2))² )
  where ai(1) and ai(2) are the values of attribute i in the two instances.
- Taking the square root is not required when only comparing distances.
- Another popular metric is the city-block (Manhattan) metric, which adds the absolute differences without squaring them.
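Both metrics are short to state in code. Below is a minimal Python sketch; the function names (euclidean, cityblock, squared_euclidean) are illustrative rather than from the slides, and instances are assumed to be plain sequences of numeric attribute values.

```python
import math

def euclidean(a, b):
    # square root of the sum of squared attribute differences
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def cityblock(a, b):
    # city-block (Manhattan) metric: add absolute differences, no squaring
    return sum(abs(ai - bi) for ai, bi in zip(a, b))

def squared_euclidean(a, b):
    # when only *comparing* distances, the square root can be skipped
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
```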

Normalization and Other Issues
- Different attributes are measured on different scales, so they need to be normalized:
  ai = (vi - min vi) / (max vi - min vi)
  where vi is the actual value of attribute i.
- Nominal attributes: distance is either 0 or 1.
- Common policy for missing values: assume they are maximally distant (given normalized attributes).
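As a rough illustration, the normalization rule and the per-attribute difference rules might look like the sketch below. The treatment of missing values (take the largest possible difference from the known value) follows the "maximally distant" policy described above; all names are hypothetical.

```python
def normalise(v, v_min, v_max):
    # min-max normalisation: a_i = (v_i - min v_i) / (max v_i - min v_i)
    return (v - v_min) / (v_max - v_min) if v_max > v_min else 0.0

def attribute_difference(x, y, nominal=False):
    """Per-attribute contribution to the distance; numeric values are
    assumed to be already normalised to [0, 1]."""
    if x is None or y is None:
        if x is None and y is None:
            return 1.0                    # both missing: assume maximal distance
        known = x if y is None else y
        return max(known, 1.0 - known)    # one missing: largest possible difference
    if nominal:
        return 0.0 if x == y else 1.0     # nominal attributes: distance 0 or 1
    return abs(x - y)
```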

Finding Nearest Neighbors Efficiently
- The simplest way of finding the nearest neighbor is a linear scan of the data.
- Classification then takes time proportional to the product of the number of instances in the training and test sets.
- Nearest-neighbor search can be done more efficiently using appropriate data structures.
- We will discuss two methods that represent the training data in a tree structure: kD-trees and ball trees.
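The linear scan is the baseline the tree structures are meant to beat. A minimal sketch, assuming training instances are (attribute_vector, class_label) pairs and hypothetical names:

```python
import math

def nearest_neighbour(train, query):
    """Linear scan: every stored instance is compared with the query,
    so one classification costs O(n) distance computations."""
    best_label, best_dist = None, float("inf")
    for attrs, label in train:
        d = math.sqrt(sum((a - q) ** 2 for a, q in zip(attrs, query)))
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label
```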

kD-Tree Example

More on kD-Trees
- Complexity depends on the depth of the tree, given by the logarithm of the number of nodes.
- The amount of backtracking required depends on the quality of the tree ("square" vs. "skinny" nodes).
- How to build a good tree? Need to find a good split point and split direction:
  - Split direction: the direction with the greatest variance.
  - Split point: the median value along that direction.
  - Using the value closest to the mean (rather than the median) can be better if the data is skewed.
- Apply this recursively.
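A recursive build that follows exactly those two heuristics (greatest-variance direction, median split point) could look like this sketch; the dictionary-based node layout is just one possible representation.

```python
import statistics

def build_kd_tree(points):
    """points: list of equal-length numeric tuples."""
    if len(points) <= 1:
        return {'points': points}                          # leaf node
    k = len(points[0])
    # split direction: attribute with the greatest variance
    dim = max(range(k), key=lambda j: statistics.pvariance([p[j] for p in points]))
    # split point: median value along that direction
    split = statistics.median(p[dim] for p in points)
    left = [p for p in points if p[dim] <= split]
    right = [p for p in points if p[dim] > split]
    if not left or not right:                              # all values tied: stop splitting
        return {'points': points}
    return {'dim': dim, 'split': split,
            'left': build_kd_tree(left), 'right': build_kd_tree(right)}
```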

Building Trees Incrementally
- Big advantage of instance-based learning: the classifier can be updated incrementally. Just add the new training instance!
- Can we do the same with kD-trees? Heuristic strategy:
  - Find the leaf node containing the new instance.
  - Place the instance into the leaf if the leaf is empty.
  - Otherwise, split the leaf along its longest dimension (to preserve squareness).
- The tree should be re-built occasionally (e.g., if its depth grows to twice the optimum depth).
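One way to express that heuristic, assuming leaves hold at most one instance and attributes have been normalised to [0, 1]; every name here is hypothetical and this is only a sketch of the strategy above.

```python
def kd_insert(node, point, bounds):
    """bounds: one [low, high] pair per attribute for the whole space,
    e.g. [[0.0, 1.0]] * k after normalisation; it is narrowed during the
    descent so the leaf's region is known if a split becomes necessary."""
    while 'split' in node:                       # find the leaf containing the new instance
        d, s = node['dim'], node['split']
        if point[d] <= s:
            bounds[d] = [bounds[d][0], s]
            node = node['left']
        else:
            bounds[d] = [s, bounds[d][1]]
            node = node['right']
    if node.get('point') is None:                # empty leaf: just place the instance
        node['point'] = point
        return
    old = node.pop('point')
    # otherwise split along the leaf's longest dimension (to preserve squareness),
    # preferring a dimension on which the two points actually differ
    dims = sorted(range(len(point)),
                  key=lambda i: bounds[i][1] - bounds[i][0], reverse=True)
    d = next((i for i in dims if old[i] != point[i]), dims[0])
    s = (old[d] + point[d]) / 2.0
    node['dim'], node['split'] = d, s
    node['left'] = {'point': old if old[d] <= s else point}
    node['right'] = {'point': point if old[d] <= s else old}

# usage sketch: root = {'point': None}; kd_insert(root, (0.4, 0.7), [[0.0, 1.0], [0.0, 1.0]])
```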

Ball Trees
- Problem with kD-trees: the rectangular regions have awkward corners.
- Observation: there is no need to make sure that the regions don't overlap.
- So we can use balls (hyperspheres) instead of hyperrectangles.
- A ball tree organizes the data into a tree of k-dimensional hyperspheres.
- This normally allows a better fit to the data and thus more efficient search.
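There are several ways to build such a tree. The sketch below uses one simple top-down construction (centre = centroid, radius = distance to the farthest point, children formed around two far-apart anchor points); it is one possible scheme under those assumptions, not the specific algorithm from the slides.

```python
import math

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def build_ball_tree(points, leaf_size=2):
    # every node stores the hypersphere (centre, radius) enclosing its points
    centre = [sum(c) / len(points) for c in zip(*points)]
    radius = max(dist(centre, p) for p in points)
    node = {'centre': centre, 'radius': radius}
    if len(points) <= leaf_size:
        node['points'] = points
        return node
    # pick two far-apart anchors and assign each point to the nearer one
    a = max(points, key=lambda p: dist(centre, p))
    b = max(points, key=lambda p: dist(a, p))
    left = [p for p in points if dist(p, a) <= dist(p, b)]
    right = [p for p in points if dist(p, a) > dist(p, b)]
    if not left or not right:                 # duplicate points: keep them in one leaf
        node['points'] = points
        return node
    node['left'] = build_ball_tree(left, leaf_size)
    node['right'] = build_ball_tree(right, leaf_size)
    return node
```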

Ball Tree Example

Discussion of Nearest-Neighbor Learning
- Often very accurate.
- Assumes all attributes are equally important; remedy: attribute selection or attribute weights.
- Possible remedies against noisy instances:
  - Take a majority vote over the k nearest neighbors.
  - Remove noisy instances from the dataset (difficult!).
- Statisticians have used k-NN since the early 1950s: if n → ∞ and k/n → 0, the error approaches the minimum.
- kD-trees become inefficient when the number of attributes is too large (approximately > 10).
- Ball trees (which are instances of metric trees) work well in higher-dimensional spaces.
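The majority-vote remedy is a small change to the linear-scan classifier; a sketch with hypothetical names, again assuming (attribute_vector, class_label) pairs:

```python
import heapq
from collections import Counter

def knn_predict(train, query, k=3):
    # the k stored instances with the smallest squared distance to the query
    neighbours = heapq.nsmallest(
        k, train,
        key=lambda ex: sum((a - q) ** 2 for a, q in zip(ex[0], query)))
    # majority vote over their class labels
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]
```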

More Discussion
- Instead of storing all training instances, compress them into regions.
- Example: hyperpipes (from the discussion of 1R).
- Another simple technique, Voting Feature Intervals:
  - Construct intervals for each attribute:
    - Discretize numeric attributes.
    - Treat each value of a nominal attribute as an "interval".
  - Count the number of times each class occurs in each interval.
  - The prediction is generated by letting the intervals that contain the test instance vote.
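A much-simplified sketch of the interval-voting idea for numeric attributes only (equal-width discretization, raw class counts as votes): the real Voting Feature Intervals method also normalizes the votes, and every name below is made up for illustration.

```python
from collections import defaultdict

def train_intervals(data, labels, n_bins=5):
    """data: list of numeric attribute vectors; build one set of equal-width
    intervals per attribute, each interval holding class counts."""
    model = []
    for j in range(len(data[0])):
        column = [row[j] for row in data]
        lo = min(column)
        width = (max(column) - lo) / n_bins or 1.0
        counts = defaultdict(lambda: defaultdict(int))
        for value, label in zip(column, labels):
            b = min(int((value - lo) / width), n_bins - 1)
            counts[b][label] += 1
        model.append((lo, width, counts))
    return model

def predict_intervals(model, instance, n_bins=5):
    votes = defaultdict(int)
    # every attribute's interval that contains the test value casts its votes
    for (lo, width, counts), value in zip(model, instance):
        b = min(max(int((value - lo) / width), 0), n_bins - 1)
        for label, count in counts[b].items():
            votes[label] += count
    return max(votes, key=votes.get) if votes else None
```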

EXAMPLE

  Temperature  Humidity  Wind  Play
  45           10        50    Yes
  -20          0         30    No
  65           50        0     No

1. Normalize the data: new value = (original value - minimum value) / (max - min)

EXAMPLE

  Temperature  Humidity  Wind      Play
  45 (0.765)   10 (0.2)  50 (1)    Yes
  -20 (0)      0  (0)    30 (0.6)  No
  65 (1)       50 (1)    0  (0)    No

1. Normalize the data: new value = (original value - minimum value) / (max - min)
   So for Temperature:
   new = (45 - -20) / (65 - -20) = 0.765
   new = (-20 - -20) / (65 - -20) = 0
   new = (65 - -20) / (65 - -20) = 1

EXAMPLE

  Temperature  Humidity  Wind      Play  Distance
  45 (0.765)   10 (0.2)  50 (1)    Yes
  -20 (0)      0  (0)    30 (0.6)  No
  65 (1)       50 (1)    0  (0)    No

  New case:
  Temperature  Humidity  Wind      Play
  35 (0.647)   40 (0.8)  10 (0.2)  ???

1. Normalize the data in the new case (so it is on the same scale as the instance data):
   new value = (original value - minimum value) / (max - min)
2. Calculate the distance of the new case from each of the old cases (we are assuming linear storage rather than kD-trees or ball trees here).

EXAMPLE

  Temperature  Humidity  Wind      Play  Distance
  45 (0.765)   10 (0.2)  50 (1)    Yes   1.007
  -20 (0)      0  (0)    30 (0.6)  No    1.104
  65 (1)       50 (1)    0  (0)    No    0.452

  New case:
  Temperature  Humidity  Wind      Play
  35 (0.647)   40 (0.8)  10 (0.2)  ???

2. Calculate the distance of the new case from each of the old cases (we are assuming linear storage rather than kD-trees or ball trees here):
   d1 = sqrt((0.647 - 0.765)² + (0.8 - 0.2)² + (0.2 - 1)²) = 1.007
   d2 = sqrt((0.647 - 0)² + (0.8 - 0)² + (0.2 - 0.6)²) = 1.104
   d3 = sqrt((0.647 - 1)² + (0.8 - 1)² + (0.2 - 0)²) = 0.452

EXAMPLE

  Temperature  Humidity  Wind      Play  Distance
  45 (0.765)   10 (0.2)  50 (1)    Yes   1.007
  -20 (0)      0  (0)    30 (0.6)  No    1.104
  65 (1)       50 (1)    0  (0)    No    0.452

  New case:
  Temperature  Humidity  Wind      Play
  35 (0.647)   40 (0.8)  10 (0.2)  ???

3. Determine the nearest neighbor (the smallest distance). The new case is closest to the third training instance, so we use that instance's class as the prediction: Play = No.
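The three steps can be checked with a few lines of Python; this sketch simply re-does the arithmetic from the slides on the already-normalized values.

```python
import math

# normalized training instances: (Temperature, Humidity, Wind) -> Play
train = [((0.765, 0.2, 1.0), "Yes"),
         ((0.0,   0.0, 0.6), "No"),
         ((1.0,   1.0, 0.0), "No")]
new = (0.647, 0.8, 0.2)                      # normalized new case

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

distances = [(euclidean(attrs, new), label) for attrs, label in train]
print([round(d, 3) for d, _ in distances])   # [1.007, 1.104, 0.452]
print(min(distances)[1])                     # nearest neighbour -> 'No'
```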

The Mystery Sound
Yay! Made it to the mystery sound slide. What is this sound?