Classification Dr Eamonn Keogh Computer Science & Engineering Department University of California - Riverside Riverside,CA 92521 Who.


1 Classification. Dr Eamonn Keogh, Computer Science & Engineering Department, University of California - Riverside, Riverside, CA 92521, eamonn@cs.ucr.edu. Who is smarter, Humans or Pigeons?

2 Read in Detail: Section 1.1 (again), Section 4.1, Section 4.3. Glance over: Section 4.2.2, Section 4.34.

3 Examples of class A. Examples of class B. 1) What class is this object? 2) What class is this object?

4 Examples of class A. Examples of class B. 1) What class is this object? 2) What class is this object?

5 Examples of class A. Examples of class B. 1) What class is this object? 2) What class is this object?

6 The “game” we have just been playing is Supervised Classification. Why is it useful?

7 Examples of class A: People who contracted disease X. Examples of class B: People who are disease free. 1) What class is this person? Is this person at risk of getting the disease? 2) What class is this person? Is this person at risk of getting the disease?

Patient temperature  Blood count  Weight
99                   4214         167
98                   3214         179
97                   2763         121
99                   3234         117
97                   0012         190
99                   0114         202
98                   1014         345
99                   1214         190
97                   0118         280
99                   3452         99

8 Examples of class A. Examples of class B. 1) What class is this object? 2) What class is this object?

9 Examples of class A, as (height1, height2): (3, 4), (1.5, 5), (6, 8), (2.5, 5). Examples of class B: (5, 2.5), (5, 2), (8, 3), (4.5, 3). 1) What class is this object? (8, 1.5) 2) What class is this object? (4.5, 7)

10 Classification. There are many classification algorithms; in this class we will consider only: Simple Linear Classifier, Nearest Neighbor Classifier, Decision Tree, Naïve Bayes.

11 The classification problem. The classification algorithm is shown a number of labeled examples from the problem domain of interest (this collection of labeled data is called the training set). The algorithm builds a model that “explains” the labeling of the examples (this model may or may not be accessible to humans, depending on the algorithm). At some future time the algorithm is shown an unlabeled example and asked to classify it. [Figure panels: Shape Domain, Cat Domain]

12 Sample dataset for a credit worthiness problem

Class  Income   Savings  Num_credit_cards  Is_married
A      123,000  34,100   0                 N
B      24,000   -2,000   13                Y
A      45,200   12,100   3                 N
…      …        …        …                 …
B      423,020  23,440   0                 N
B      14,000   87,000   0                 Y
A      11,200   -2,000   2                 Y
?      123,000  34,100   0                 N

What is this instance's class? The number of rows is the size of the training set, the number of columns is the dimensionality of the training set, each row is called an instance (or exemplar), and each column is called a feature.
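As a minimal sketch of the terminology on this slide, the visible rows can be held in a plain Python list. Only the six rows actually shown are included (the elided rows are not reconstructed), and the names `FEATURES`, `training_set` and `query` are hypothetical, chosen for illustration.

```python
# Feature (column) names as they appear on the slide.
FEATURES = ["Income", "Savings", "Num_credit_cards", "Is_married"]

# Each entry is one instance (row): (class label, feature values).
# Only the six rows visible on the slide are listed here.
training_set = [
    ("A", (123_000, 34_100, 0, "N")),
    ("B", (24_000, -2_000, 13, "Y")),
    ("A", (45_200, 12_100, 3, "N")),
    ("B", (423_020, 23_440, 0, "N")),
    ("B", (14_000, 87_000, 0, "Y")),
    ("A", (11_200, -2_000, 2, "Y")),
]

# The unlabeled instance marked "?" on the slide.
query = (123_000, 34_100, 0, "N")

n_instances = len(training_set)    # size of the training set (rows)
dimensionality = len(FEATURES)     # number of features (columns)

print(f"{n_instances} instances, {dimensionality} features")
```

The class label is kept apart from the feature values, so the dimensionality counts only the feature columns.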

13 Visualizing classification algorithms. We can visualize some classification algorithms in 2D… Warning: This tends to make the problem look easy...

Class  feature 1 (height1)  feature 2 (height2)
A      3                    4
B      5                    2.5
A      1.5                  5
…      …                    …

14 A trivial machine learning example represented in 2D Euclidean Space. The blue circles and red squares represent the two classes in our training data, and the black shapes are the objects we are trying to classify. From now on we will only consider the 2D plots when explaining algorithms and problems. We should always remember that these plots are representations of real world objects.

15 Simple Linear Classifier A dataset which is not linearly separable

16 Piecewise Linear Classifier Simple Quadratic Classifier (or some other function)

17 This example is one for which we know a perfect rule, “above the diagonal is circle class, below the diagonal is square class”. (Don’t forget that for real world problems we can never know a perfect rule, even if there is one). What happens if we learn a piecewise linear classifier or a quadratic classifier on this dataset with a small training dataset? This problem is called overfitting. [Figure panels: Piecewise Linear Classifier, Simple Quadratic Classifier]
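The "perfect rule" quoted on this slide can be written as a linear decision function. The sketch below is an illustration only: the function name `linear_classify`, the weight values, and the choice to send points lying exactly on the diagonal to the square class are assumptions, not part of the slides.

```python
def linear_classify(point, w=(-1.0, 1.0), b=0.0):
    """Classify a 2D point by the sign of w . x + b.

    With w = (-1, 1) and b = 0 the score is y - x, so a positive
    score means the point lies above the diagonal y = x.
    Points exactly on the diagonal fall into 'square' (an arbitrary choice).
    """
    x, y = point
    score = w[0] * x + w[1] * y + b
    return "circle" if score > 0 else "square"


print(linear_classify((2, 5)))  # above the diagonal
print(linear_classify((6, 2)))  # below the diagonal
```

A learned linear classifier would estimate `w` and `b` from the training set rather than having them written down by hand as here.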

18 The Nearest Neighbor Algorithm. The nearest neighbor algorithm (NN) works by projecting the item to be classified into the same space as the training data, then finding the labeled exemplar which is closest. The class of that nearest neighbor is then assigned to the item to be classified. In this example, the item (6, 2) is correctly classified. In spite of its amazing simplicity, Nearest Neighbor is one of the best algorithms for many problems. We can use many different distance measures to measure the distance between objects. Typically Euclidean distance is used.
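The procedure described on this slide can be sketched in a few lines of Python. The training points below are illustrative assumptions (the points plotted on the slide are not available in the transcript), and `nearest_neighbor` is a hypothetical helper name. Euclidean distance is computed with `math.dist`, which requires Python 3.8 or later.

```python
import math

# Illustrative training set: (point, class label). These values are assumed.
training_set = [
    ((3.0, 4.0), "circle"),
    ((1.5, 5.0), "circle"),
    ((6.0, 8.0), "circle"),
    ((2.5, 5.0), "circle"),
    ((5.0, 2.5), "square"),
    ((5.0, 2.0), "square"),
    ((8.0, 3.0), "square"),
    ((4.5, 3.0), "square"),
]


def nearest_neighbor(train, query):
    """Return the label of the training exemplar closest to `query`."""
    closest_point, closest_label = min(
        train, key=lambda example: math.dist(example[0], query)
    )
    return closest_label


print(nearest_neighbor(training_set, (6, 2)))  # square
```

For this assumed data the closest exemplar to (6, 2) is (5, 2), at distance 1, so the query receives the square label. Swapping in another distance measure only requires changing the `key` function.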

19 Evaluation of Classification Leaving one out Cross fold validation
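The slide gives only the names of these evaluation schemes, so the following is a sketch of one common reading of "leaving one out": each labeled example is held out in turn, classified using all the others, and the fraction classified correctly is reported. The data, the use of a 1-nearest-neighbor classifier, and the function names are assumptions made for illustration.

```python
import math

# Illustrative labeled data: (point, class label). These values are assumed.
data = [
    ((3.0, 4.0), "circle"),
    ((1.5, 5.0), "circle"),
    ((6.0, 8.0), "circle"),
    ((2.5, 5.0), "circle"),
    ((5.0, 2.5), "square"),
    ((5.0, 2.0), "square"),
    ((8.0, 3.0), "square"),
    ((4.5, 3.0), "square"),
]


def nearest_neighbor(train, query):
    """Return the label of the training exemplar closest to `query`."""
    return min(train, key=lambda example: math.dist(example[0], query))[1]


def leave_one_out_accuracy(examples):
    """Hold out each example once and classify it with the remaining ones."""
    correct = 0
    for i, (point, label) in enumerate(examples):
        rest = examples[:i] + examples[i + 1:]   # everything except example i
        if nearest_neighbor(rest, point) == label:
            correct += 1
    return correct / len(examples)


print(leave_one_out_accuracy(data))  # 1.0 for this small, well-separated set
```

Cross fold validation follows the same pattern but holds out a block of examples at a time instead of a single one.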

20 Discussion of Nearest Neighbor I. It is sensitive to irrelevant features. One possible solution is to search for good feature subsets. It is sensitive to noise. One possible solution is to use KNN. Suppose there is a disease. Although we don’t know this, it happens that if your blood sugar is over 5.5 you have the disease and below you don’t….
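To illustrate the noise point, here is a minimal sketch of K nearest neighbors (KNN) with a majority vote. The data is invented for the example: one point inside the class A cluster carries a wrong label, standing in for noise. The function name `knn_classify` is hypothetical.

```python
import math
from collections import Counter

# Illustrative training set. The point (1.5, 1.5) sits inside the
# class A cluster but is labeled B, standing in for a noisy example.
training_set = [
    ((1.0, 1.0), "A"),
    ((1.0, 2.0), "A"),
    ((2.0, 1.0), "A"),
    ((2.0, 2.0), "A"),
    ((1.5, 1.5), "B"),   # noisy label
    ((8.0, 8.0), "B"),
    ((8.0, 9.0), "B"),
    ((9.0, 8.0), "B"),
]


def knn_classify(train, query, k=3):
    """Return the majority label among the k exemplars closest to `query`."""
    neighbors = sorted(train, key=lambda example: math.dist(example[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


query = (1.6, 1.6)
print(knn_classify(training_set, query, k=1))  # B: the noisy point wins
print(knn_classify(training_set, query, k=3))  # A: outvoted by two A neighbors
```

With two classes, an odd `k` avoids tied votes. This sketch does not otherwise handle ties in any principled way.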

21 Discussion of Nearest Neighbor II. It is sensitive to the units in which the features are measured. One possible solution is to normalize the features. [Figure panels: X axis measured in feet, Y axis measured in dollars; X axis measured in inches, Y axis measured in dollars]
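The unit sensitivity can be reproduced with a small sketch. The slide does not say which normalization is meant. The code below assumes z-normalization (subtract each feature's mean and divide by its standard deviation, both taken from the training set). The data values and helper names are invented for illustration.

```python
import math
from statistics import mean, pstdev


def nearest_label(train, query):
    """Label of the training exemplar closest to `query` (Euclidean distance)."""
    return min(train, key=lambda example: math.dist(example[0], query))[1]


def z_normalize(train, query):
    """Rescale every feature using the training set's mean and std deviation.

    Assumes no feature is constant (its standard deviation must be non-zero).
    """
    columns = list(zip(*(point for point, _ in train)))
    means = [mean(col) for col in columns]
    stds = [pstdev(col) for col in columns]

    def scale(point):
        return tuple((v - m) / s for v, m, s in zip(point, means, stds))

    return [(scale(point), label) for point, label in train], scale(query)


def feet_to_inches(point):
    x, y = point
    return (x * 12, y)   # only the X axis changes units; Y stays in dollars


# X in feet, Y in dollars (assumed values).
train_feet = [((1.0, 10.0), "A"), ((1.5, 9.0), "A"),
              ((4.0, 6.0), "B"), ((3.5, 7.0), "B")]
query_feet = (2.0, 6.0)

train_inches = [(feet_to_inches(p), label) for p, label in train_feet]
query_inches = feet_to_inches(query_feet)

print(nearest_label(train_feet, query_feet))                    # B
print(nearest_label(train_inches, query_inches))                # A
print(nearest_label(*z_normalize(train_feet, query_feet)))      # B
print(nearest_label(*z_normalize(train_inches, query_inches)))  # B
```

On the raw values the predicted class flips when the X axis is converted from feet to inches. After z-normalization both versions give the same answer, because rescaling a feature by a constant does not change its z-scores.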

22 Discussion of Nearest Neighbor III Scalability

23 A Famous Problem. R. A. Fisher’s Iris Dataset. 3 classes, 50 of each class. The task is to classify Iris plants into one of 3 varieties using the Petal Length and Petal Width. Iris Setosa, Iris Versicolor, Iris Virginica.

