
1 Instance-Based Learning
KDD Group Presentation
Cecil P. Schmidt
Department of Computing and Information Sciences, KSU
http://www.cis.ksu.edu/~cps4444
CIS 890: Special Topics in Intelligent Systems
Wednesday, November 15, 2000

2 Presentation Outline
What is Instance-Based Learning?
k-Nearest Neighbor Learning
Distance-Weighted Nearest Neighbor Algorithm
Other IBL Methods
–Locally Weighted Regression
–Radial Basis Functions
–Case-Based Reasoning
Lazy Versus Eager Learning
Research Opportunities in IBL
Summary
Bibliography

3 What is Instance-Based Learning?
Description
–Instance-based learning (IBL) methods initially just store the presented training data.
–When a new query is encountered, a set of similar instances is retrieved and used to classify it.
Differences between IBL methods and others
–local rather than global approximation of the target function
–a distinct approximation to the target function for each query instance
Advantages
–a complex global approximation is replaced by simpler local approximations
–instances can be complex, symbolic representations
Disadvantages
–nearly all computation takes place at classification time
–classification typically considers all attributes of the query instance; when only some of them are actually relevant, the irrelevant ones can cause misclassification

4 k-Nearest Neighbor Learning
The most basic of the IBL methods
–Assumes all instances correspond to points in the n-dimensional space R^n.
–Nearest neighbors of a query instance are defined in terms of standard Euclidean distance.
Distance d definition
–Let an arbitrary instance x be described by the feature vector <a_1(x), a_2(x), …, a_n(x)>, where a_r(x) denotes the value of the rth attribute of instance x. The distance between x_i and x_j is d(x_i, x_j), where
d(x_i, x_j) ≡ sqrt( Σ_{r=1..n} (a_r(x_i) - a_r(x_j))^2 )
Distance calculation example (see the sketch below)
–find the distance between the vectors <2,2,4> and <2,2,2>
–sqrt((2-2)^2 + (2-2)^2 + (2-4)^2) = 2
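A minimal sketch of this distance computation in Python, assuming instances are plain numeric tuples (the function name is illustrative, not from the slides):

```python
import math

def euclidean_distance(xi, xj):
    """Standard Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(xi, xj)))

# The slide's example: distance between <2,2,4> and <2,2,2>.
print(euclidean_distance((2, 2, 4), (2, 2, 2)))  # 2.0
```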

5 k-Nearest Neighbor Learning Algorithm: Discrete-Valued Target Function
The algorithm consists of two parts: a training part and a classification part (a Python sketch of both follows below).
Training algorithm
–For each training example <x, f(x)>, add the example to the list training_examples.
Classification algorithm
–Given a query instance x_q to be classified:
Let x_1 … x_k denote the k instances from training_examples that are nearest to x_q.
Return f-hat(x_q) ← argmax_{v ∈ V} Σ_{i=1..k} δ(v, f(x_i)), where δ(a,b) = 1 if a = b and δ(a,b) = 0 otherwise.
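A compact Python sketch of both parts, under the same assumptions as before; train, classify, and training_examples mirror the slide's names, the rest is illustrative:

```python
import math
from collections import Counter

training_examples = []  # list of (feature_vector, label) pairs

def euclidean_distance(xi, xj):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(xi, xj)))

def train(x, fx):
    """Training part: simply store the presented example."""
    training_examples.append((x, fx))

def classify(xq, k=1):
    """Classification part: majority vote among the k nearest stored examples."""
    neighbors = sorted(training_examples,
                       key=lambda ex: euclidean_distance(ex[0], xq))[:k]
    votes = Counter(label for _, label in neighbors)  # sum of delta(v, f(xi)) per label v
    return votes.most_common(1)[0][0]                 # argmax over v in V
```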

6 k-Nearest Neighbor Learning Algorithm: Discrete-Valued Target Function, Classification Example
Classification example
–Assume the following training_examples:
–x_1 = <1,2,7,8> labeled +, x_2 = <1,3,5,6> labeled -
–Classify x_3 = <1,2,7,6>
(Step 1) Compute the distance from each point in training_examples:
d(x_1, x_3) = sqrt(0^2 + 0^2 + 0^2 + 2^2) = 2
d(x_2, x_3) = sqrt(0^2 + 1^2 + 2^2 + 0^2) = sqrt(5) ≈ 2.24
(Step 2) Classify x_3 based on the nearest k points. With k = 1, we classify x_3 as + since it is nearest x_1.
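Reusing the train/classify sketch from the previous slide, the example can be checked directly:

```python
train((1, 2, 7, 8), '+')
train((1, 3, 5, 6), '-')
print(classify((1, 2, 7, 6), k=1))  # '+': x1 (distance 2) beats x2 (distance ~2.24)
```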

7 k-Nearest Neighbor: Real-Valued Target Function
Used to approximate a continuous-valued target function.
Calculates the mean value of the k nearest training examples rather than their most common value.
Replaces the final line of the discrete-valued algorithm with the following (sketched below):
–f-hat(x_q) ← (Σ_{i=1..k} f(x_i)) / k
Example
–Given training_examples (x_1, 1), (x_2, 1), (x_3, 0)
–we wish to classify x_4 using its 2 nearest neighbors
–d(x_1, x_4) = 5, d(x_2, x_4) = 2, d(x_3, x_4) = 4
–therefore x_4 is nearest x_2 and x_3
–taking the mean, we get (1 + 0)/2 = 0.5
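A sketch of the real-valued variant under the same assumptions (illustrative names; ties in distance are broken arbitrarily by the sort):

```python
import math

def euclidean_distance(xi, xj):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(xi, xj)))

def knn_regress(training_examples, xq, k=2):
    """Return the mean f(xi) over the k training examples nearest to xq."""
    neighbors = sorted(training_examples,
                       key=lambda ex: euclidean_distance(ex[0], xq))[:k]
    return sum(fx for _, fx in neighbors) / k
```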

8 Distance-Weighted Nearest Neighbor Algorithm
Weights the contribution of each of the k neighbors according to its distance to the query point x_q, giving greater weight to closer neighbors.
Example
–weight each neighbor according to the inverse square of its distance from x_q:
f-hat(x_q) ← argmax_{v ∈ V} Σ_{i=1..k} w_i δ(v, f(x_i)), where w_i ≡ 1/d(x_q, x_i)^2
–if x_q exactly matches x_i (so d = 0 and w_i is undefined), assign f-hat(x_q) = f(x_i)
A modification of the real-valued target function normalizes the contributions of the various weights (sketched below):
f-hat(x_q) ← Σ_{i=1..k} w_i f(x_i) / Σ_{i=1..k} w_i
With distance weighting, all training examples can be used to influence the classification of x_q, but the classifier will run more slowly.
Referred to as a global method when all training points are used.
Referred to as a local method when only the k nearest training examples are used.
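A sketch of the distance-weighted, real-valued form with inverse-square weights and normalization; passing k=None uses all training examples (the global method), otherwise only the k nearest (the local method):

```python
import math

def euclidean_distance(xi, xj):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(xi, xj)))

def weighted_knn_regress(training_examples, xq, k=None):
    """Distance-weighted prediction; k=None means use every training example."""
    ranked = sorted(training_examples,
                    key=lambda ex: euclidean_distance(ex[0], xq))
    neighbors = ranked if k is None else ranked[:k]
    num, den = 0.0, 0.0
    for xi, fxi in neighbors:
        d = euclidean_distance(xi, xq)
        if d == 0:               # exact match: return f(xi) directly
            return fxi
        w = 1.0 / d ** 2         # w_i = 1 / d(x_q, x_i)^2
        num += w * fxi
        den += w
    return num / den             # sum(w_i f(x_i)) / sum(w_i)
```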

9 Other IBL Methods
Locally Weighted Regression
–Constructs an explicit approximation of f over a local region surrounding x_q.
–Uses nearby or distance-weighted training examples to form the local approximation.
–"local" refers to the fact that only data near the query point are used.
–"weighted" refers to the fact that the contribution of each training example is weighted by its distance from the query point.
–"regression" is used because that is the term widely used in the statistical learning community for approximating real-valued functions.
Radial Basis Functions
–Closely related to distance-weighted regression and to ANNs.
–Provide a global approximation to the target function, represented by a linear combination of many local kernel functions (a kernel function K is a function of distance used to determine the weight of each training example); a brief sketch follows below.
Case-Based Reasoning
–Uses rich symbolic descriptions to represent instances, rather than real-valued points in an n-dimensional Euclidean space.
–Requires more elaborate retrieval of similar instances (nearest neighbors).
–Has been applied to conceptual design problems, retrieving similar prior designs.
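A minimal sketch of a radial basis function approximation with Gaussian kernels, following the linear-combination form described above; the centers, weights, and kernel width are assumed given, and fitting the weights (e.g., by least squares) is omitted:

```python
import math

def gaussian_kernel(d, sigma=1.0):
    """Kernel K(d) = exp(-d^2 / (2 sigma^2)): weight decays with distance."""
    return math.exp(-d ** 2 / (2 * sigma ** 2))

def rbf_predict(x, centers, weights, w0=0.0, sigma=1.0):
    """Global approximation: f-hat(x) = w0 + sum_u w_u * K(d(x_u, x))."""
    total = w0
    for xu, wu in zip(centers, weights):
        d = math.sqrt(sum((a - b) ** 2 for a, b in zip(xu, x)))
        total += wu * gaussian_kernel(d, sigma)
    return total
```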

10 Lazy Versus Eager Learning
Lazy Learning
–Defers the decision of how to generalize beyond the training data until each new query instance is encountered.
–Examples include k-nearest neighbor, locally weighted regression, and case-based reasoning.
Eager Learning
–Generalizes beyond the training data before observing the new query.
–Examples include decision tree learning algorithms such as ID3, and ANNs.
Differences
–Computation time
Lazy methods generally require less computation during training.
Lazy methods require more computation at classification time.
–Classification differences
Lazy methods may consider the query instance x_q when deciding how to generalize beyond the training data D.
Eager methods have already chosen their (global) approximation to the target function.
A lazy learner has the option of representing the target function by a combination of many local approximations, whereas eager methods must commit to a single approximation at training time.

11 Research Opportunities in IBL
Improved methods for indexing instances, which may be rich relational descriptions (CBR).
Development of eager methods that employ multiple local approximations to achieve effects similar to lazy learning methods while reducing computation at classification time (RBF learning attempts this).
Applications (can you think of others?)
–Finding similar software patterns in currently implemented software, or in analysis or design models.
–Matching a database query to its most likely instances.

12 Summary
Nearest neighbor algorithms are examples of IBL methods, which delay much of the computation until classification time.
These algorithms use local rather than global approximation, which can have the effect of constructing a unique target-function approximation for each query instance.
They can model complex target functions by a collection of less complex local approximations.
Information present in the training examples is never lost.
The distance metric can become misleading because all attributes are considered, relevant or not.
They employ lazy rather than eager learning techniques.

13 Bibliography
Primary material for this presentation came from:
–Mitchell, T. (1997). Machine Learning. MIT Press and The McGraw-Hill Companies, Inc., Boston, MA.

