1
**Pattern Recognition Ku-Yaw Chang canseco@mail.dyu.edu.tw**

Assistant Professor, Department of Computer Science and Information Engineering Da-Yeh University

2
**Outline**

- Introduction
- Features and Classes
- Supervised vs. Unsupervised
- Statistical vs. Structural (Syntactic)
- Statistical Decision Theory

2004/03/02 Pattern Recognition

3
**Supervised vs. Unsupervised**

- Supervised learning: using a training set of patterns of known class to classify additional similar samples
- Unsupervised learning: dividing samples into groups or clusters based on measures of similarity, without any prior knowledge of class membership
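The contrast between the two kinds of learning can be sketched in a few lines of Python. This is a minimal illustration under assumed names, not a method taken from the slides: `nearest_mean_classifier`, `two_means`, and the toy 1-D values are all hypothetical. The supervised function receives class labels with its training samples; the unsupervised one sees only the values and groups them by similarity.

```python
from statistics import mean


def nearest_mean_classifier(training):
    """Supervised: `training` maps each known class label to its sample values."""
    centers = {label: mean(values) for label, values in training.items()}

    def classify(x):
        # Assign x to the class whose training mean is closest.
        return min(centers, key=lambda label: abs(x - centers[label]))

    return classify


def two_means(samples, iterations=10):
    """Unsupervised: split unlabeled 1-D samples into two clusters by similarity."""
    low_center, high_center = min(samples), max(samples)  # initial cluster centers
    for _ in range(iterations):
        low_group = [x for x in samples
                     if abs(x - low_center) <= abs(x - high_center)]
        high_group = [x for x in samples
                      if abs(x - low_center) > abs(x - high_center)]
        if not high_group:  # all samples identical: nothing to split
            break
        low_center, high_center = mean(low_group), mean(high_group)
    return low_group, high_group


# Hypothetical toy values, chosen only for illustration.
classify = nearest_mean_classifier({"A": [1.0, 2.0, 3.0], "B": [7.0, 8.0, 9.0]})
print(classify(2.5))                               # A
print(two_means([1.0, 2.0, 3.0, 7.0, 8.0, 9.0]))   # ([1.0, 2.0, 3.0], [7.0, 8.0, 9.0])
```

Note that `two_means` recovers the same two groups here without ever seeing a label, which is the point of the unsupervised case.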

4
**Supervised vs. Unsupervised**

Dividing the class into two groups:

- Supervised learning
  - Male features
  - Female features
- Unsupervised learning
  - Male vs. Female
  - Tall vs. Short
  - With vs. Without glasses
  - …

5
**Statistical vs. Structural**

- Statistical PR: features are obtained by manipulating the measurements as purely numerical (or Boolean) variables
- Structural (Syntactic) PR: features are designed in some intuitive way corresponding to human perception of the objects

6
**Statistical vs. Structural**

Optical Character Recognition (OCR)

- Statistical PR
- Structural PR

7
**Statistical Decision Theory**

An automated classification system

- Classified data sets
- Selected features

8
**Statistical Decision Theory**

Hypothetical Basketball Association (HBA)

- apg: average number of points per game
- To predict the winner of the game
  - Based on the difference between the home team’s apg and the visiting team’s apg for previous games
- Training set
  - Scores of previously played games
  - Home team classified as a winner or a loser

9
**Statistical Decision Theory**

Given a game to be played, predict the home team to be a winner or a loser using the feature:

dapg = Home Team apg - Visiting Team apg

10
**Statistical Decision Theory**

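| Game | dapg | Home Team | Game | dapg | Home Team |
|------|-------|-----------|------|-------|-----------|
| 1 | 1.3 | Won | 16 | -3.1 | |
| 2 | -2.7 | Lost | 17 | 1.7 | |
| 3 | -0.5 | | 18 | 2.8 | |
| 4 | -3.2 | | 19 | 4.6 | |
| 5 | 2.3 | | 20 | 3.0 | |
| 6 | 5.1 | | 21 | 0.7 | |
| 7 | -5.4 | | 22 | 10.1 | |
| 8 | 8.2 | | 23 | 2.5 | |
| 9 | -10.8 | | 24 | 0.8 | |
| 10 | -0.4 | | 25 | -5.0 | |
| 11 | 10.5 | | 26 | 8.1 | |
| 12 | -1.1 | | 27 | -7.1 | |
| 13 | | | 28 | 2.7 | |
| 14 | -4.2 | | 29 | -10.0 | |
| 15 | -3.4 | | 30 | -6.5 | |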

11
**Statistical Decision Theory**

[Figure: a histogram of dapg]

12
**Statistical Decision Theory**

- The classification cannot be performed perfectly using the single feature dapg.
  - Probability of membership in each class
  - With the smallest expected penalty
- Decision boundary or threshold: the value T for dapg
  - Home Team Won: dapg is greater than T
  - Home Team Lost: dapg is less than or equal to T
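Choosing a threshold from a training set can be sketched as a simple search. The code below is an illustrative sketch, not the procedure used in the slides: it assumes the convention that the home team is predicted to win when dapg > T, and the function names and the `toy_games` data are hypothetical (they are not the HBA table, whose outcomes are not all listed here).

```python
def predict_home_team(dapg, threshold):
    """Predict 'Won' when the feature exceeds the threshold, otherwise 'Lost'."""
    return "Won" if dapg > threshold else "Lost"


def count_errors(samples, threshold):
    """Number of training samples that the threshold misclassifies."""
    return sum(1 for dapg, outcome in samples
               if predict_home_team(dapg, threshold) != outcome)


def best_threshold(samples):
    """Try each observed feature value as T and keep the first one with fewest errors."""
    candidates = sorted({dapg for dapg, _ in samples})
    # Also try a value below every sample, so "predict Won for everything" is considered.
    candidates.insert(0, candidates[0] - 1.0)
    return min(candidates, key=lambda t: count_errors(samples, t))


# Hypothetical labeled games (dapg, outcome), for illustration only.
toy_games = [(-4.0, "Lost"), (-2.0, "Lost"), (-1.0, "Won"),
             (0.5, "Lost"), (1.5, "Won"), (3.0, "Won")]

t = best_threshold(toy_games)
print(t, count_errors(toy_games, t))  # -2.0 1
```

On this toy data no threshold is error-free, and two thresholds (-2.0 and 0.5) tie at one error each; the sketch simply returns the smaller one.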

13
**Statistical Decision Theory**

- Home team’s apg = 103.4
- Visiting team’s apg = 102.1
- dapg = 103.4 - 102.1 = 1.3, and 1.3 > T
  - The home team will win the game
- T = 0.8 or T = -6.5
  - T = 0.8 achieves the minimum error rate
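The arithmetic in this example can be checked directly. The snippet is a small sketch using the numbers from the slide; the variable names are assumptions made for illustration.

```python
home_apg = 103.4
visiting_apg = 102.1
T = 0.8  # the threshold the slide reports as giving the minimum error rate

# Round to one decimal place to hide floating-point noise in the subtraction.
dapg = round(home_apg - visiting_apg, 1)
prediction = "Won" if dapg > T else "Lost"

print(dapg, prediction)  # 1.3 Won
```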

14
**Statistical Decision Theory**

Adding an additional feature to increase the accuracy of classification:

dwp = Home Team wp - Visiting Team wp

where wp denotes the winning percentage.

15
**Statistical Decision Theory**

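| Game | dapg | dwp | Home Team | Game | dapg | dwp | Home Team |
|------|-------|-------|-----------|------|-------|-------|-----------|
| 1 | 1.3 | 25.0 | Won | 16 | -3.1 | 9.4 | |
| 2 | -2.7 | -16.9 | Lost | 17 | 1.7 | 6.8 | |
| 3 | -0.5 | 5.3 | | 18 | 2.8 | 17.0 | |
| 4 | -3.2 | -27.5 | | 19 | 4.6 | 13.3 | |
| 5 | 2.3 | -18.0 | | 20 | 3.0 | -24.0 | |
| 6 | 5.1 | 31.2 | | 21 | 0.7 | -17.8 | |
| 7 | -5.4 | 5.8 | | 22 | 10.1 | 44.6 | |
| 8 | 8.2 | 34.3 | | 23 | 2.5 | -22.4 | |
| 9 | -10.8 | -56.3 | | 24 | 0.8 | 12.3 | |
| 10 | -0.4 | | | 25 | -5.0 | -3.8 | |
| 11 | 10.5 | 16.3 | | 26 | 8.1 | 36.0 | |
| 12 | -1.1 | -17.6 | | 27 | -7.1 | -20.6 | |
| 13 | | 5.7 | | 28 | 2.7 | 23.2 | |
| 14 | -4.2 | 16.0 | | 29 | -10.0 | -46.9 | |
| 15 | -3.4 | | | 30 | -6.5 | 19.7 | |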

16
**Statistical Decision Theory**

- Feature vector (dapg, dwp)
- Presented as a scatterplot

[Figure: scatterplot of the games in the (dapg, dwp) feature space, each point marked W (home team won) or L (home team lost)]

17
**Statistical Decision Theory**

- The feature space can be divided into two decision regions by a straight line
  - Linear decision boundary
- If a feature space cannot be perfectly separated by a straight line, a more complex boundary might be used.
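A linear decision boundary in the (dapg, dwp) feature space can be sketched as the sign of a weighted sum. The weights and bias below are hypothetical values picked for illustration; they are not fitted to the HBA data and the slides do not give the equation of the line.

```python
def linear_decision(dapg, dwp, w_dapg, w_dwp, bias):
    """Return 'Won' on the positive side of the line w_dapg*dapg + w_dwp*dwp + bias = 0."""
    score = w_dapg * dapg + w_dwp * dwp + bias
    return "Won" if score > 0 else "Lost"


# Hypothetical weights, chosen only to illustrate the idea.
W_DAPG, W_DWP, BIAS = 1.0, 0.1, 0.0

print(linear_decision(1.3, 25.0, W_DAPG, W_DWP, BIAS))    # game 1 features -> Won
print(linear_decision(-2.7, -16.9, W_DAPG, W_DWP, BIAS))  # game 2 features -> Lost
```

The two regions on either side of the line are the decision regions; a point exactly on the line is assigned to "Lost" here, which is an arbitrary tie-breaking choice.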

18
**Exercise One**

The values of a feature x for nine samples from class A are 1, 2, 3, 3, 4, 4, 6, 6, 8. Nine samples from class B have x values of 4, 6, 7, 7, 8, 9, 9, 10, 12. Make a histogram (with an interval width of 1) for each class and find a decision boundary (threshold) that minimizes the total number of misclassifications for this training data set.

19
**Exercise Two**

Can the feature vectors (x, y) = (2,3), (3,5), (4,2), (2,7) from class A be separated from four samples from class B located at (6,2), (5,4), (5,6), (3,7) by a linear decision boundary? If so, give the equation of one such boundary and plot it. If not, find a boundary that separates them as well as possible.
