Bayesian Classification



Presentation on theme: "Bayesian Classification"— Presentation transcript:

1 Bayesian Classification
A reference

2 Example 1:

3 Example 1 (continued)
Overall objective: count the number of people on the beach.
Intermediate objectives:
Reduce the search space.
Segment the image into three zones (classes): surf, beach, and building.

4 Example 1 (continued)
Consider a randomly selected pixel x from the image.
Suppose the a priori probabilities with respect to the three classes are:
P(x is in the building area) ≈ 0.17
P(x is in the beach area) ≈ 0.58
P(x is in the surf area) ≈ 0.25
What decision rule minimizes error?
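With only these priors available, the minimum-error rule is to assign every pixel to the most probable class; a quick sketch of the consequence, using the numbers above:

\[
\text{decide beach, since } P(\text{beach}) = 0.58 > P(\text{surf}) = 0.25 > P(\text{building}) = 0.17,
\qquad P(\text{error}) = 1 - 0.58 = 0.42 .
\]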

5 Example 1:
Suppose additional information regarding a property (such as color, brightness, or variability) of the pixel (or its neighborhood) is available. Can such knowledge aid classification?
What is p(the pixel x came from the beach area, given the pixel is red), i.e., P(beach | red)?

6 Example 1:
Consider the hypothetical, regional color distributions of hue h.

7 Example 1:
The joint probability that a randomly selected pixel is from the beach area and has hue h is written
p(beach, h) = p(h | beach) P(beach) = P(beach | h) p(h).
Solving for P(beach | h) we get
P(beach | h) = p(h | beach) P(beach) / p(h), where
p(h) = p(h | building) P(building) + p(h | beach) P(beach) + p(h | surf) P(surf).
Notes:
1. For a card-deck analogy with a priori probabilities, consider p(card is red and a king): p(red and king) = 2/52 = p(red | king) p(king) = (2/4)(4/52) = p(king | red) p(red) = (2/26)(26/52) = 1/26.
2. p(h) merely accumulates a weighted average of the occurrences of hue h, since the areas are assumed to be mutually exclusive (non-overlapping) in this example.
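As a concrete, purely hypothetical illustration of this computation, the sketch below uses the priors from the earlier slide and invented values for the hue likelihoods p(h | class):

```python
# Hypothetical sketch of the slide's Bayes computation.
# Priors are the ones given earlier; the likelihood values are invented for illustration.
priors = {"building": 0.17, "beach": 0.58, "surf": 0.25}
likelihood_h = {"building": 0.05, "beach": 0.40, "surf": 0.10}   # p(h | class), hypothetical

# Evidence: p(h) = sum over classes of p(h | class) * P(class)
p_h = sum(likelihood_h[c] * priors[c] for c in priors)

# Posterior for each class: P(class | h) = p(h | class) * P(class) / p(h)
posteriors = {c: likelihood_h[c] * priors[c] / p_h for c in priors}
print(posteriors)   # the posteriors sum to 1; the largest one drives the decision
```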

8 A General Formulation
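The general formulation presumably intended here is the usual statement of Bayes' rule for a class ωi and a measurement x:

\[
P(\omega_i \mid x) \;=\; \frac{p(x \mid \omega_i)\, P(\omega_i)}{p(x)},
\qquad
p(x) \;=\; \sum_{j=1}^{c} p(x \mid \omega_j)\, P(\omega_j).
\]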

9 A Casual Formulation
The prior probability reflects knowledge of the relative frequency of instances of a class.
The likelihood is a measure of the probability that a measurement value occurs in a class.
The evidence is a scaling term.

10 Forming a Classifier
Create discriminant functions gi(x) for each class i = 1,…,c.
The discriminants are not unique.
They partition the measurement space with crisp boundaries.
Assign x to class k if gk(x) > gj(x) for all j ≠ k.
For a minimum-error classifier, gi(x) = P(ωi | x).
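A minimal sketch of this decision rule; the discriminant functions used here are hypothetical stand-ins, not the ones derived later:

```python
# Assign x to the class whose discriminant value is largest.
def classify(x, discriminants):
    """discriminants: dict mapping class label -> callable g_i(x)."""
    return max(discriminants, key=lambda label: discriminants[label](x))

# Hypothetical one-dimensional discriminants for two classes.
g = {"w1": lambda x: -abs(x - 1.0), "w2": lambda x: -abs(x - 5.0)}
print(classify(2.0, g))   # -> "w1", since 2.0 is closer to 1.0 than to 5.0
```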

11 Equivalent Discriminants
If f is monotone increasing, the collection hi(x) = f(gi(x)), i = 1,…,c forms an equivalent family of discriminant functions; for example:
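One standard instance (assuming the minimum-error discriminant from the previous slide) takes f to be the natural logarithm:

\[
h_i(x) \;=\; \ln g_i(x) \;=\; \ln P(\omega_i \mid x)
\;=\; \ln p(x \mid \omega_i) + \ln P(\omega_i) - \ln p(x),
\]

and since ln p(x) is the same for every class it can be dropped, leaving the equivalent discriminant ln p(x | ωi) + ln P(ωi).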

12 Gaussian Distributions
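For reference, the density these slides rely on is presumably the standard multivariate normal with mean vector μ and covariance matrix Σ in d dimensions:

\[
p(x) \;=\; \frac{1}{(2\pi)^{d/2}\,|\Sigma|^{1/2}}
\exp\!\left[-\tfrac{1}{2}(x-\mu)^{\top}\Sigma^{-1}(x-\mu)\right].
\]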

13 Gaussian Distributions Details

14 Discriminants for Normal Density
Recall the classifier functions. Assuming the measurements are normally distributed, we have:
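Under that assumption, the (equivalent, scaled) minimum-error discriminant presumably takes the form

\[
g_i(x) \;=\; p(x \mid \omega_i)\, P(\omega_i)
\;=\; \frac{P(\omega_i)}{(2\pi)^{d/2}\,|\Sigma_i|^{1/2}}
\exp\!\left[-\tfrac{1}{2}(x-\mu_i)^{\top}\Sigma_i^{-1}(x-\mu_i)\right],
\]

where μi and Σi are the mean vector and covariance matrix of class ωi.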

15 Some Algebra to Simplify the Discriminants
Since the natural logarithm is monotone increasing, we may take the natural logarithm to re-write the first term:

16 Some Algebra to Simplify the Discriminants (continued)

17 The Discriminants (Finally!!)
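The standard end point of this algebra (assuming the Gaussian form above) is

\[
g_i(x) \;=\; -\tfrac{1}{2}(x-\mu_i)^{\top}\Sigma_i^{-1}(x-\mu_i)
\;-\; \tfrac{d}{2}\ln 2\pi \;-\; \tfrac{1}{2}\ln|\Sigma_i| \;+\; \ln P(\omega_i),
\]

where the constant term (d/2) ln 2π is common to all classes and is usually dropped.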

18 Special Case 1: Σi = σ²I

19 Special Case 1: Σi = σ²I
If the classes are equally likely, the discriminants depend only upon the distances to the means.
A diagonal covariance matrix implies the measurement components are statistically independent.
A constant diagonal implies the class measurements have identical variability in each dimension; hence the distributions are spherical in d-dimensional space.
The discriminant functions define hyperplanes orthogonal to the line segments joining the distribution means.
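In this case the general discriminant above reduces (dropping class-independent terms) to

\[
g_i(x) \;=\; -\frac{\lVert x-\mu_i\rVert^{2}}{2\sigma^{2}} \;+\; \ln P(\omega_i)
\;=\; \frac{1}{\sigma^{2}}\mu_i^{\top}x \;-\; \frac{1}{2\sigma^{2}}\mu_i^{\top}\mu_i \;+\; \ln P(\omega_i),
\]

where the second form follows because the xᵀx term is the same for every class; the result is linear in x.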

20 Special Case 1: Σi = σ²I

21 Special Case 2: Σi = Σ

22 Special Case 2: Σi = Σ
Since Σ may possess nonzero off-diagonal elements and varying diagonal elements, the measurement distributions lie in hyper-ellipsoids.
The discriminant hyperplanes are often not orthogonal to the segments joining the class means.

23 Special Case 2: Σi = Σ
The quadratic term is independent of i and may be eliminated.
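Concretely (again assuming the general Gaussian discriminant above, with class-independent terms dropped), when every class shares Σ the discriminant

\[
g_i(x) \;=\; -\tfrac{1}{2}(x-\mu_i)^{\top}\Sigma^{-1}(x-\mu_i) \;+\; \ln P(\omega_i)
\]

expands so that the quadratic piece xᵀΣ⁻¹x is the same for every class; dropping it leaves the linear discriminant

\[
g_i(x) \;=\; \mu_i^{\top}\Sigma^{-1}x \;-\; \tfrac{1}{2}\mu_i^{\top}\Sigma^{-1}\mu_i \;+\; \ln P(\omega_i).
\]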

24 Case 3: Σi arbitrary
This discriminant is quadratic in x.
The decision surfaces can arise from hyperplanes, hyperparaboloids, hyperellipsoids, hyperspheres, or combinations of these!!!
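Writing out the general discriminant makes the quadratic dependence explicit (standard form, assuming the Gaussian discriminant above):

\[
g_i(x) \;=\; x^{\top}W_i\,x \;+\; w_i^{\top}x \;+\; w_{i0},
\qquad
W_i = -\tfrac{1}{2}\Sigma_i^{-1},\quad
w_i = \Sigma_i^{-1}\mu_i,\quad
w_{i0} = -\tfrac{1}{2}\mu_i^{\top}\Sigma_i^{-1}\mu_i - \tfrac{1}{2}\ln|\Sigma_i| + \ln P(\omega_i).
\]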

25 Example 2: A Problem
Exemplars (transposed):
For w1 = {(2, 6), (3, 4), (3, 8), (4, 6)}
For w2 = {(1, -2), (3, 0), (3, -4), (5, -2)}
Calculated means (transposed):
m1 = (3, 6)
m2 = (3, -2)

26 Example 2: Covariance Matrices

27 Example 2: Covariance Matrices

28 Example 2: Inverse and Determinant for Each of the Covariance Matrices
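One way to compute the covariance matrices, their inverses, and their determinants from the exemplars above; this sketch assumes the 1/n (maximum-likelihood) covariance estimate, and the slides may instead have used 1/(n-1):

```python
import numpy as np

# Exemplars from the Example 2 problem statement (rows are measurement vectors).
w1 = np.array([[2, 6], [3, 4], [3, 8], [4, 6]], dtype=float)
w2 = np.array([[1, -2], [3, 0], [3, -4], [5, -2]], dtype=float)

for name, w in (("class 1", w1), ("class 2", w2)):
    m = w.mean(axis=0)                 # sample mean
    dev = w - m                        # deviations from the mean
    cov = dev.T @ dev / len(w)         # covariance, 1/n normalization (an assumption here)
    print(name, "mean:", m)
    print(name, "covariance:\n", cov)
    print(name, "inverse:\n", np.linalg.inv(cov))
    print(name, "determinant:", np.linalg.det(cov))
```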

29 Example 2: A Discriminant Function for Class 1
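With the 1/n estimates from the sketch above (Σ1 = diag(1/2, 2), so |Σ1| = 1) and equal priors assumed, the class 1 discriminant, up to class-independent constants, would be

\[
g_1(x) \;=\; -\tfrac{1}{2}(x-m_1)^{\top}\Sigma_1^{-1}(x-m_1) - \tfrac{1}{2}\ln|\Sigma_1|
\;=\; -(x_1-3)^{2} \;-\; \tfrac{1}{4}(x_2-6)^{2}.
\]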

30 Example 2

31 Example 2: A Discriminant Function for Class 2
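Under the same assumptions (Σ2 = diag(2, 2), so |Σ2| = 4), the class 2 discriminant would be

\[
g_2(x) \;=\; -\tfrac{1}{2}(x-m_2)^{\top}\Sigma_2^{-1}(x-m_2) - \tfrac{1}{2}\ln|\Sigma_2|
\;=\; -\tfrac{1}{4}\left[(x_1-3)^{2} + (x_2+2)^{2}\right] \;-\; \ln 2.
\]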

32 Example 2

33 Example 2: The Class Boundary

34 Example 2: A Quadratic Separator
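Continuing with the same assumptions (equal priors, 1/n covariance estimates), setting g1(x) = g2(x) gives one candidate form for the quadratic separator:

\[
g_1(x) - g_2(x) \;=\; -\tfrac{3}{4}(x_1-3)^{2} + 4x_2 - 8 + \ln 2 \;=\; 0
\quad\Longrightarrow\quad
x_2 \;=\; \tfrac{3}{16}(x_1-3)^{2} + 2 - \tfrac{\ln 2}{4},
\]

a parabola separating the cluster around m1 = (3, 6) from the cluster around m2 = (3, -2).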

35 Example 2: Plot of the Discriminant

36 Summary Steps for Building a Bayesian Classifier
Collect class exemplars.
Estimate class a priori probabilities.
Estimate class means.
Form covariance matrices; find the inverse and determinant for each.
Form the discriminant function for each class.
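A compact sketch of these steps; the function names and the 1/n covariance estimate are choices made here, not taken from the slides:

```python
import numpy as np

def fit_bayes_classifier(exemplars):
    """exemplars: dict mapping class label -> list of measurement vectors."""
    n_total = sum(len(v) for v in exemplars.values())
    params = {}
    for label, rows in exemplars.items():
        X = np.asarray(rows, dtype=float)
        mean = X.mean(axis=0)                      # class mean
        dev = X - mean
        cov = dev.T @ dev / len(X)                 # covariance (1/n estimate)
        params[label] = {
            "prior": len(X) / n_total,             # a priori probability from class frequency
            "mean": mean,
            "inv_cov": np.linalg.inv(cov),
            "log_det": np.log(np.linalg.det(cov)),
        }
    return params

def discriminant(x, p):
    # g_i(x) = -1/2 (x - m)^T S^{-1} (x - m) - 1/2 ln|S| + ln P(class)
    d = np.asarray(x, dtype=float) - p["mean"]
    return -0.5 * d @ p["inv_cov"] @ d - 0.5 * p["log_det"] + np.log(p["prior"])
```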

37 Using the Classifier
Obtain a measurement vector x.
Evaluate the discriminant function gi(x) for each class i = 1,…,c.
Decide x is in class j if gj(x) > gi(x) for all i ≠ j.

