This week: overview on pattern recognition (related to machine learning)


1 This week: overview on pattern recognition (related to machine learning)

2 Non-review of chapters 6/7: Z-transforms, convolution, sampling/aliasing, linear difference equations, resonances, FIR/IIR filtering, DFT/FFT

3 Speech Pattern Recognition: soft pattern classification plus temporal sequence integration. Supervised pattern classification: class labels are used in training. Unsupervised pattern classification: labels are not available (or at least not used).

4 (figure slide)

5 Training and testing. Training: learn the parameters of the classifier. Testing: classify an independent test set, compare the results with the labels, and score.
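The train/test protocol above can be sketched end to end on toy data (a minimal illustration assuming NumPy; the data, split sizes, and the class-mean "classifier" are hypothetical choices, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy two-class data: class 0 near (0, 0), class 1 near (3, 3).
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(3, 1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)

# Split into independent training and test sets.
idx = rng.permutation(len(y))
train, test = idx[:70], idx[70:]

# "Training": learn one parameter set per class (here, the class mean).
means = np.array([X[train][y[train] == c].mean(axis=0) for c in (0, 1)])

# "Testing": classify the held-out set, compare with the labels, and score.
d = np.linalg.norm(X[test][:, None, :] - means[None, :, :], axis=2)
accuracy = np.mean(d.argmin(axis=1) == y[test])
print(f"test accuracy: {accuracy:.2f}")
```

The key point is that the score is computed only on examples the training step never saw.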

6 F1 and F2 for various vowels

7 Swedish basketball players vs. speech recognition researchers

8 Feature extraction criteria: class discrimination, generalization, parsimony (efficiency)

9 (figure slide)

10 (figure slide)

11 Feature vector size: the best representations for discrimination on the training set are high-dimensional; the best representations for generalization to the test set tend to be succinct.

12 Dimensionality reduction: principal components (i.e., SVD, KL transform, eigenanalysis, ...); Linear Discriminant Analysis (LDA); application-specific knowledge; feature selection via PR evaluation
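The first option, principal components via SVD, can be sketched in a few lines (assuming NumPy; the data and the choice of k are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 10))
X[:, 0] *= 5.0  # give one direction much more variance than the rest

# Principal components via SVD of the mean-centered data matrix.
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)  # singular values in descending order

k = 2
Z = Xc @ Vt[:k].T  # project onto the top-k principal directions
print(Z.shape)  # (200, 2)
```

Note that this keeps the directions of largest variance; as slide 20 warns, variance-based reduction is not always the right choice for supervised problems.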

13 (figure slide)

14 (figure slide)

15 PR Methods: minimum distance; discriminant functions, linear and nonlinear (e.g., quadratic, neural networks), with SVMs sharing some aspects of each; statistical discriminant functions

16 Minimum Distance: represent each element as a vector or matrix; define a distance function; collect examples for each class; in testing, choose the class of the closest example. The choice of distance is equivalent to an implicit statistical assumption. Signals (e.g., speech) add temporal variability.
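The recipe on this slide can be sketched directly (a minimal illustration assuming NumPy and Euclidean distance; the stored examples are hypothetical):

```python
import numpy as np

# Stored labeled examples (a small collection per class).
examples = np.array([[0.0, 0.0], [0.2, 0.1], [3.0, 3.0], [2.9, 3.2]])
labels = np.array([0, 0, 1, 1])

def classify(x, examples, labels):
    """Choose the class of the closest stored example (Euclidean distance)."""
    d = np.linalg.norm(examples - x, axis=1)
    return labels[d.argmin()]

print(classify(np.array([0.1, 0.3]), examples, labels))  # -> 0
print(classify(np.array([2.5, 2.5]), examples, labels))  # -> 1
```

Swapping the Euclidean norm for another distance (e.g., Mahalanobis) changes the implicit statistical assumption the slide mentions.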

17 (figure slide)

18 Limitations: variable scale of the dimensions; variable importance of the dimensions; for high dimensions, a sparsely sampled space; for tough problems, resource limitations (storage, computation, memory access).
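A common mitigation for the first limitation, variable scale, is to standardize each dimension before computing distances (a sketch assuming NumPy; the example features are hypothetical):

```python
import numpy as np

# Two features on wildly different scales, e.g. height (cm) and salary.
X = np.array([[170.0, 60000.0],
              [180.0, 30000.0],
              [165.0, 90000.0]])

# Z-score each dimension so no feature dominates the distance by scale alone.
Xz = (X - X.mean(axis=0)) / X.std(axis=0)
print(Xz.std(axis=0))  # each column now has unit standard deviation
```

This addresses scale, but not the separate problem of variable *importance*, which needs supervision (e.g., LDA from slide 12).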

19 Decision Rules for Min Distance. Nearest Neighbor (NN): in the limit of infinite samples, at most twice the error of the optimal classifier. k-Nearest Neighbor (kNN): lots of storage for large problems; potentially large searches.
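A kNN rule extends the closest-example idea of slide 16 to a vote among the k closest examples (a brute-force sketch assuming NumPy; the training data are hypothetical, and the exhaustive distance computation is exactly the "large searches" cost noted above):

```python
import numpy as np
from collections import Counter

def knn_classify(x, X_train, y_train, k=3):
    """Majority vote among the k nearest stored examples (Euclidean distance)."""
    d = np.linalg.norm(X_train - x, axis=1)
    nearest = np.argsort(d)[:k]
    return Counter(y_train[nearest]).most_common(1)[0][0]

X_train = np.array([[0.0, 0.0], [0.1, 0.2], [0.3, 0.1],
                    [3.0, 3.0], [3.1, 2.9], [2.8, 3.2]])
y_train = np.array([0, 0, 0, 1, 1, 1])
print(knn_classify(np.array([0.2, 0.2]), X_train, y_train))  # -> 0
```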

20 Some Opinions: it is better to throw away bad data than to reduce its weight; dimensionality reduction based on variance is often a bad choice for supervised pattern recognition. Both of these are only true sometimes.

21 Discriminant Analysis. Discriminant functions: maximal for the correct class, minimal for the others. Decision surface between classes: a linear decision surface in 2 dimensions is a line, in 3 a plane; in general, a hyperplane. For 2 classes, the surface is at w^T x + w_0 = 0. In the 2-class quadratic case, the surface is at x^T W x + w^T x + w_0 = 0.

22 Two-prototype example:
D_i^2 = (x - z_i)^T (x - z_i) = x^T x + z_i^T z_i - 2 x^T z_i
D_1^2 - D_2^2 = 2 x^T z_2 - 2 x^T z_1 + z_1^T z_1 - z_2^T z_2
At the decision surface the distances are equal, so 2 x^T z_2 - 2 x^T z_1 = z_2^T z_2 - z_1^T z_1, or x^T (z_2 - z_1) = (1/2) (z_2^T z_2 - z_1^T z_1). And if the prototypes are all normalized to 1, the decision surface is x^T (z_2 - z_1) = 0, and each discriminant function is x^T z_i.
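The equivalence derived on this slide, nearest prototype if and only if the linear rule holds, can be checked numerically (a sketch assuming NumPy; the random prototypes z1, z2 are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)
z1, z2 = rng.normal(size=2), rng.normal(size=2)  # two arbitrary prototypes

for _ in range(1000):
    x = rng.normal(size=2)
    # Nearest-prototype decision: pick z1 iff D_1^2 < D_2^2.
    nearest_is_z1 = np.sum((x - z1) ** 2) < np.sum((x - z2) ** 2)
    # Equivalent linear rule: x^T (z2 - z1) < (1/2)(z2^T z2 - z1^T z1).
    linear_says_z1 = x @ (z2 - z1) < 0.5 * (z2 @ z2 - z1 @ z1)
    assert nearest_is_z1 == linear_says_z1
```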

23 System to discriminate between two classes

24 Training Discriminant Functions: minimum distance, Fisher linear discriminant, gradient learning

25 Generalized Discriminators - ANNs: McCulloch-Pitts neural model, Rosenblatt Perceptron, multilayer systems

26 (figure slide)

27 (figure slide)

28 (figure slide)

29 Typical unit for MLP (weighted sum Σ followed by a nonlinearity)
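Such a unit, the weighted sum from the figure followed by a squashing nonlinearity, can be written in one line (a sketch assuming NumPy and a sigmoid nonlinearity; the weights are hypothetical):

```python
import numpy as np

def unit(x, w, b):
    """Typical MLP unit: weighted sum (the figure's sigma) through a sigmoid."""
    return 1.0 / (1.0 + np.exp(-(w @ x + b)))

print(unit(np.array([1.0, 2.0]), np.array([0.5, -0.5]), 0.0))  # sigmoid(-0.5) ~ 0.378
```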

30 (figure slide)

31 Typical MLP
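As a concrete illustration of why multiple layers matter, a tiny MLP with hand-set weights can compute XOR, which no single linear discriminant can separate (a classic example, not from the slides; threshold units are used for clarity, and all weights are hypothetical):

```python
import numpy as np

def step(a):
    """Hard-threshold nonlinearity."""
    return (a > 0).astype(float)

def mlp_xor(x):
    """Two-layer MLP computing XOR: two hidden threshold units, one output unit."""
    h = step(np.array([[1.0, 1.0], [1.0, 1.0]]) @ x + np.array([-0.5, -1.5]))
    return step(np.array([1.0, -1.0]) @ h - 0.5)

for x in ([0, 0], [0, 1], [1, 0], [1, 1]):
    print(x, int(mlp_xor(np.array(x, dtype=float))))  # XOR truth table: 0, 1, 1, 0
```

The first hidden unit fires when at least one input is on, the second only when both are; the output subtracts the second from the first.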

32 Support Vector Machines (SVMs). High-dimensional feature vectors: transformed from simple features (e.g., polynomial); can potentially classify the training set arbitrarily well. Improve generalization by maximizing the margin. Via the "kernel trick", no explicit high-dimensional representation is needed, only an inner-product function between 2 points in the space. Slack variables allow for imperfect classification.
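The kernel trick can be demonstrated directly: a polynomial kernel evaluated in the original space equals the inner product after an explicit high-dimensional feature map (a sketch assuming NumPy; the degree-2 map for 2-D inputs is a standard small example, and the test points are hypothetical):

```python
import numpy as np

def phi(x):
    """Explicit feature map whose inner product matches the degree-2 kernel."""
    x1, x2 = x
    return np.array([x1 * x1, x2 * x2, np.sqrt(2) * x1 * x2])

x, y = np.array([1.0, 2.0]), np.array([3.0, 4.0])

explicit = phi(x) @ phi(y)   # inner product in the transformed space
kernel = (x @ y) ** 2        # kernel trick: computed in the original space
print(explicit, kernel)      # equal up to float rounding (both ~ 121)
```

For higher degrees and dimensions the explicit map grows combinatorially, while the kernel stays a single inner product, which is the point of the trick.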

33 Maximum margin decision boundary, SVM

34 Unsupervised clustering: a large and diverse literature, many methods. Next time: one method explained in the context of a larger, statistical system.

35 Some PR/machine learning issues: testing on the training set; training on the test set; number of parameters vs. number of training examples (overfitting and overtraining). For much more on machine learning, see CS 281A/B.
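The first issue, testing on the training set, is easy to demonstrate: a 1-NN classifier is perfect on its own training data even when the labels are pure noise (a sketch assuming NumPy; the noise data are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(3)
# Pure-noise labels: there is no real structure to learn.
X = rng.normal(size=(100, 5))
y = rng.integers(0, 2, size=100)
X_test = rng.normal(size=(100, 5))
y_test = rng.integers(0, 2, size=100)

def one_nn(x, X_train, y_train):
    """Label of the single nearest training example."""
    return y_train[np.linalg.norm(X_train - x, axis=1).argmin()]

train_acc = np.mean([one_nn(x, X, y) == t for x, t in zip(X, y)])
test_acc = np.mean([one_nn(x, X, y) == t for x, t in zip(X_test, y_test)])
print(train_acc, test_acc)  # 1.0 on the training set, roughly 0.5 (chance) on the test set
```

Each training point is its own nearest neighbor, so training accuracy is trivially perfect; only the independent test set reveals that nothing was learned.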

