
Slide 1: Bayesian Decision Theory
Shyh-Kang Jeng, Department of Electrical Engineering / Graduate Institute of Communication / Graduate Institute of Networking and Multimedia, National Taiwan University

Slide 2: Basic Assumptions
–The decision problem is posed in probabilistic terms
–All of the relevant probability values are known

Slide 3: State of Nature
–State of nature
–A priori probability (prior)
–Decision rule to judge just one fish
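The slide's formulas did not survive extraction; in standard notation, the decision rule based on priors alone (before any feature is measured) is:

```latex
% Decide using priors only, for two states of nature:
\text{Decide } \omega_1 \text{ if } P(\omega_1) > P(\omega_2);\quad
\text{otherwise decide } \omega_2 .
% The resulting probability of error is
P(\text{error}) = \min\left[\, P(\omega_1),\; P(\omega_2) \,\right].
```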

Slide 4: Class-Conditional Probability Density

Slide 5: Bayes Formula
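The formula itself was lost with the slide image; the standard form of Bayes' rule for c categories is:

```latex
P(\omega_j \mid \mathbf{x}) =
\frac{p(\mathbf{x} \mid \omega_j)\, P(\omega_j)}{p(\mathbf{x})},
\qquad
p(\mathbf{x}) = \sum_{j=1}^{c} p(\mathbf{x} \mid \omega_j)\, P(\omega_j).
```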

Slide 6: Posterior Probabilities

Slide 7: Bayes Decision Rule
–Probability of error
–Bayes decision rule
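For two categories, the rule and the per-observation error probability the slide refers to are conventionally written:

```latex
% Bayes decision rule (two categories):
\text{Decide } \omega_1 \text{ if } P(\omega_1 \mid x) > P(\omega_2 \mid x);
\quad \text{otherwise decide } \omega_2 .
% Probability of error for a given x:
P(\text{error} \mid x) = \min\left[\, P(\omega_1 \mid x),\; P(\omega_2 \mid x) \,\right].
```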

Slide 8: Bayes Decision Theory (1/3)
–Categories
–Actions
–Loss functions
–Feature vector

Slide 9: Bayes Decision Theory (2/3)
–Bayes formula
–Conditional risk

Slide 10: Bayes Decision Theory (3/3)
–Decision function: assumes one of the possible action values
–Overall risk
–Bayes decision rule: compute the conditional risk of every action, then select the action for which it is minimum
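The conditional risk and overall risk referenced here are, in standard notation:

```latex
% Conditional risk of taking action alpha_i given feature vector x:
R(\alpha_i \mid \mathbf{x}) = \sum_{j=1}^{c}
\lambda(\alpha_i \mid \omega_j)\, P(\omega_j \mid \mathbf{x}).
% Overall risk of a decision function alpha(x):
R = \int R(\alpha(\mathbf{x}) \mid \mathbf{x})\, p(\mathbf{x})\, d\mathbf{x}.
```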

Slide 11: Two-Category Classification
–Conditional risk
–Decision rule: decide ω1 when the likelihood ratio exceeds a threshold determined by the losses and priors
–Likelihood ratio
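The two-category risks and the likelihood-ratio form of the rule are, with λij = λ(αi | ωj):

```latex
R(\alpha_1 \mid \mathbf{x}) = \lambda_{11} P(\omega_1 \mid \mathbf{x})
                            + \lambda_{12} P(\omega_2 \mid \mathbf{x}),
\qquad
R(\alpha_2 \mid \mathbf{x}) = \lambda_{21} P(\omega_1 \mid \mathbf{x})
                            + \lambda_{22} P(\omega_2 \mid \mathbf{x}).
% Decide omega_1 if
\frac{p(\mathbf{x} \mid \omega_1)}{p(\mathbf{x} \mid \omega_2)} >
\frac{\lambda_{12} - \lambda_{22}}{\lambda_{21} - \lambda_{11}}
\cdot \frac{P(\omega_2)}{P(\omega_1)}.
```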

Slide 12: Minimum-Error-Rate Classification
–If action αi is taken and the true state is ωj, then the decision is correct if i = j and in error if i ≠ j
–Error rate (the probability of error) is to be minimized
–Symmetrical or zero-one loss function
–Conditional risk
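Under the zero-one loss, the conditional risk reduces to one minus the posterior, which is why minimizing risk becomes maximizing the posterior:

```latex
\lambda(\alpha_i \mid \omega_j) =
\begin{cases} 0 & i = j \\ 1 & i \neq j \end{cases}
\qquad
R(\alpha_i \mid \mathbf{x}) = \sum_{j \neq i} P(\omega_j \mid \mathbf{x})
= 1 - P(\omega_i \mid \mathbf{x}).
```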

Slide 13: Minimum-Error-Rate Classification

Slide 14: Minimax Criterion
–To perform well over a range of prior probabilities
–Minimize the maximum possible overall risk, so that the worst risk for any value of the priors is as small as possible

Slide 15: Minimaxing the Risk

Slide 16: Searching for the Minimax Boundary

Slide 17: Neyman-Pearson Criterion
–Minimize the overall risk subject to a constraint
–Example: minimize the total risk subject to a constraint

Slide 18: Discriminant Functions
–A classifier assigns a feature vector x to class ωi if gi(x) > gj(x) for all j ≠ i, where the gi(x) are called discriminant functions
–A discriminant function for a Bayes classifier
–Two discriminant functions for minimum-error-rate classification
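The discriminant functions this slide names are conventionally:

```latex
% General Bayes classifier:
g_i(\mathbf{x}) = -R(\alpha_i \mid \mathbf{x}).
% Two equivalent choices for minimum-error-rate classification:
g_i(\mathbf{x}) = P(\omega_i \mid \mathbf{x}),
\qquad
g_i(\mathbf{x}) = \ln p(\mathbf{x} \mid \omega_i) + \ln P(\omega_i).
```

Any monotonically increasing function of a discriminant function yields the same decision regions, which is what licenses the logarithmic form.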

Slide 19: Discriminant Functions

Slide 20: Two-Dimensional Two-Category Classifier

Slide 21: Dichotomizers
–Place a pattern in one of only two categories (cf. polychotomizers)
–More common to define a single discriminant function
–Some particular forms

Slide 22: Univariate Normal PDF

Slide 23: Distribution with Maximum Entropy and Central Limit Theorem
–Entropy for a discrete distribution
–Entropy for a continuous distribution
–Central limit theorem: the aggregate effect of the sum of a large number of small, independent random disturbances leads to a Gaussian distribution
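The two entropy definitions referred to above are, in standard form:

```latex
% Discrete distribution:
H = -\sum_{i} P(x_i)\, \ln P(x_i).
% Continuous distribution (differential entropy):
H = -\int p(x)\, \ln p(x)\, dx.
```

Among all continuous densities with a given mean and variance, the Gaussian maximizes the differential entropy.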

Slide 24: Multivariate Normal PDF
–μ: d-component mean vector
–Σ: d-by-d covariance matrix
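The density itself, lost with the slide image, is:

```latex
p(\mathbf{x}) =
\frac{1}{(2\pi)^{d/2}\, |\boldsymbol{\Sigma}|^{1/2}}
\exp\!\left[ -\tfrac{1}{2}
(\mathbf{x}-\boldsymbol{\mu})^{t}\,
\boldsymbol{\Sigma}^{-1}\,
(\mathbf{x}-\boldsymbol{\mu}) \right].
```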

Slide 25: Linear Combination of Gaussian Random Variables

Slide 26: Whitening Transform
–Φ: matrix whose columns are the orthonormal eigenvectors of Σ
–Λ: diagonal matrix of the corresponding eigenvalues
–Whitening transform
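A minimal sketch of the whitening transform Aw = ΦΛ^(-1/2), using Φ and Λ as defined on the slide; the covariance matrix below is an illustrative assumption, not a value from the slides:

```python
import numpy as np

# Example covariance matrix (an assumption for illustration).
sigma = np.array([[2.0, 0.8],
                  [0.8, 1.0]])

# eigh returns the eigenvalues (diagonal of Lambda) and the matrix Phi
# whose columns are the orthonormal eigenvectors of sigma.
eigvals, phi = np.linalg.eigh(sigma)

# Whitening transform A_w = Phi @ Lambda^{-1/2}.
a_w = phi @ np.diag(eigvals ** -0.5)

# Transforming the covariance with A_w yields the identity matrix,
# i.e. the whitened data has unit variance in every direction.
whitened_cov = a_w.T @ sigma @ a_w
print(np.round(whitened_cov, 6))
```

The identity-covariance result follows from Σ = ΦΛΦᵗ and the orthonormality of Φ.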

Slide 27: Bivariate Gaussian PDF

Slide 28: Mahalanobis Distance
–Squared Mahalanobis distance
–Volume of the hyperellipsoids of constant Mahalanobis distance r
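A small sketch of the squared Mahalanobis distance r² = (x − μ)ᵗ Σ⁻¹ (x − μ); all numeric values are illustrative assumptions:

```python
import numpy as np

# Illustrative mean vector and covariance matrix (assumptions).
mu = np.array([1.0, 2.0])
sigma = np.array([[2.0, 0.3],
                  [0.3, 1.0]])
x = np.array([2.0, 1.0])

# Squared Mahalanobis distance from x to the mean mu.
diff = x - mu
r2 = diff @ np.linalg.inv(sigma) @ diff
print(r2)  # about 1.885 for these values
```

Unlike the Euclidean distance, r² weights each direction by the inverse covariance, so points of equal r² lie on hyperellipsoids rather than spheres.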

Slide 29: Discriminant Functions for Normal Density

Slide 30: Case 1: Σi = σ²I
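For this case the discriminant is linear; in standard notation:

```latex
g_i(\mathbf{x}) =
-\frac{\|\mathbf{x}-\boldsymbol{\mu}_i\|^{2}}{2\sigma^{2}} + \ln P(\omega_i)
= \mathbf{w}_i^{t}\mathbf{x} + w_{i0},
\qquad
\mathbf{w}_i = \frac{\boldsymbol{\mu}_i}{\sigma^{2}},
\qquad
w_{i0} = -\frac{\boldsymbol{\mu}_i^{t}\boldsymbol{\mu}_i}{2\sigma^{2}}
         + \ln P(\omega_i).
```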

Slide 31: Decision Boundaries

Slide 32: Decision Boundaries when P(ωi) = P(ωj)

Slide 33: Decision Boundaries when P(ωi) and P(ωj) are unequal

Slide 34: Case 2: Σi = Σ
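With a shared covariance matrix the discriminant is again linear:

```latex
g_i(\mathbf{x}) =
-\tfrac{1}{2}(\mathbf{x}-\boldsymbol{\mu}_i)^{t}
\boldsymbol{\Sigma}^{-1}(\mathbf{x}-\boldsymbol{\mu}_i) + \ln P(\omega_i)
= \mathbf{w}_i^{t}\mathbf{x} + w_{i0},
\qquad
\mathbf{w}_i = \boldsymbol{\Sigma}^{-1}\boldsymbol{\mu}_i,
\qquad
w_{i0} = -\tfrac{1}{2}\boldsymbol{\mu}_i^{t}\boldsymbol{\Sigma}^{-1}
\boldsymbol{\mu}_i + \ln P(\omega_i).
```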

Slide 35: Decision Boundaries

Slide 36: Decision Boundaries

Slide 37: Case 3: Σi arbitrary
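In the general case the discriminant becomes quadratic in x:

```latex
g_i(\mathbf{x}) =
\mathbf{x}^{t}\mathbf{W}_i\mathbf{x} + \mathbf{w}_i^{t}\mathbf{x} + w_{i0},
\quad
\mathbf{W}_i = -\tfrac{1}{2}\boldsymbol{\Sigma}_i^{-1},
\quad
\mathbf{w}_i = \boldsymbol{\Sigma}_i^{-1}\boldsymbol{\mu}_i,
\quad
w_{i0} = -\tfrac{1}{2}\boldsymbol{\mu}_i^{t}\boldsymbol{\Sigma}_i^{-1}
\boldsymbol{\mu}_i
- \tfrac{1}{2}\ln|\boldsymbol{\Sigma}_i| + \ln P(\omega_i).
```

The resulting decision boundaries are hyperquadrics: hyperplanes, hyperspheres, hyperellipsoids, hyperparaboloids, or hyperhyperboloids, as the following slides illustrate.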

Slide 38: Decision Boundaries for the One-Dimensional Case

Slide 39: Decision Boundaries for the Two-Dimensional Case

Slide 40: Decision Boundaries for the Three-Dimensional Case (1/2)

Slide 41: Decision Boundaries for the Three-Dimensional Case (2/2)

Slide 42: Decision Boundaries for Four Normal Distributions

Slide 43: Example: Decision Regions for Two-Dimensional Gaussian Data

Slide 44: Example: Decision Regions for Two-Dimensional Gaussian Data

Slide 45: Bayes Decision Compared with Other Decision Strategies

Slide 46: Multicategory Case
–Probability of being correct
–The Bayes classifier maximizes this probability by choosing the regions so that the integrand is maximal for all x; no other partitioning can yield a smaller probability of error
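The probability of being correct, written over the decision regions Ri, is:

```latex
P(\text{correct}) = \sum_{i=1}^{c}
\int_{\mathcal{R}_i} p(\mathbf{x} \mid \omega_i)\, P(\omega_i)\, d\mathbf{x}.
```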

Slide 47: Error Bounds for Normal Densities
–Full calculation of the error probability is difficult for the Gaussian case, especially in high dimensions, because of the discontinuous nature of the decision regions
–An upper bound on the error can be obtained for the two-category case by approximating the error integral analytically

Slide 48: Chernoff Bound

Slide 49: Bhattacharyya Bound
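For Gaussian class-conditional densities, the Bhattacharyya bound takes the standard form:

```latex
P(\text{error}) \le \sqrt{P(\omega_1)\,P(\omega_2)}\; e^{-k(1/2)},
```

with

```latex
k(1/2) = \tfrac{1}{8}
(\boldsymbol{\mu}_2-\boldsymbol{\mu}_1)^{t}
\left[ \frac{\boldsymbol{\Sigma}_1+\boldsymbol{\Sigma}_2}{2} \right]^{-1}
(\boldsymbol{\mu}_2-\boldsymbol{\mu}_1)
+ \tfrac{1}{2}\ln
\frac{\left| \frac{\boldsymbol{\Sigma}_1+\boldsymbol{\Sigma}_2}{2} \right|}
     {\sqrt{|\boldsymbol{\Sigma}_1|\,|\boldsymbol{\Sigma}_2|}}.
```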

Slide 50: Chernoff Bound and Bhattacharyya Bound

Slide 51: Example: Error Bounds for Gaussian Distribution

Slide 52: Example: Error Bounds for Gaussian Distribution
–Bhattacharyya bound: k(1/2) = 4.11157, giving P(error) < 0.0087
–Chernoff bound: 0.008190, by numerical search
–Error rate by numerical integration: 0.0021 (numerical integration is impractical in higher dimensions)

Slide 53: Signal Detection Theory
–Internal signal in the detector: x
–x has mean μ2 when the external signal (pulse) is present
–x has mean μ1 when the external signal is not present
–p(x|ωi) ~ N(μi, σ²)

Slide 54: Signal Detection Theory

Slide 55: Four Probabilities
–Hit: P(x > x* | x in ω2)
–False alarm: P(x > x* | x in ω1)
–Miss: P(x < x* | x in ω2)
–Correct reject: P(x < x* | x in ω1)
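A minimal sketch computing the four probabilities for the Gaussian model p(x|ωi) ~ N(μi, σ²) of the previous slide; the numeric values of μ1, μ2, σ, and the threshold x* are illustrative assumptions:

```python
import math

def gaussian_tail(threshold, mean, std):
    """P(X > threshold) for X ~ N(mean, std^2), via the complementary error function."""
    z = (threshold - mean) / std
    return 0.5 * math.erfc(z / math.sqrt(2.0))

# Illustrative parameters: no-signal mean, signal mean, common std, threshold x*.
mu1, mu2, sigma, x_star = 0.0, 1.0, 1.0, 0.5

hit         = gaussian_tail(x_star, mu2, sigma)  # P(x > x* | omega_2)
false_alarm = gaussian_tail(x_star, mu1, sigma)  # P(x > x* | omega_1)
miss        = 1.0 - hit                          # P(x < x* | omega_2)
correct_rej = 1.0 - false_alarm                  # P(x < x* | omega_1)
print(hit, false_alarm, miss, correct_rej)
```

Sweeping x* and plotting hit rate against false-alarm rate traces out the ROC curve of the next slide.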

Slide 56: Receiver Operating Characteristic (ROC)

Slide 57: Bayes Decision Theory: Discrete Features

Slide 58: Independent Binary Features
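With pi = P(xi = 1 | ω1) and qi = P(xi = 1 | ω2), the class-conditional likelihood and the resulting linear discriminant are conventionally:

```latex
P(\mathbf{x} \mid \omega_1) = \prod_{i=1}^{d} p_i^{x_i}(1-p_i)^{1-x_i},
\qquad
g(\mathbf{x}) = \sum_{i=1}^{d} w_i x_i + w_0,
```

where

```latex
w_i = \ln \frac{p_i (1-q_i)}{q_i (1-p_i)},
\qquad
w_0 = \sum_{i=1}^{d} \ln \frac{1-p_i}{1-q_i}
      + \ln \frac{P(\omega_1)}{P(\omega_2)}.
```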

Slide 59: Discriminant Function

Slide 60: Example: Three-Dimensional Binary Data

Slide 61: Example: Three-Dimensional Binary Data

Slide 62: Illustration of Missing Features

Slide 63: Decision with Missing Features

Slide 64: Noisy Features

Slide 65: Example of Statistical Dependence and Independence

Slide 66: Example of Causal Dependence
State of an automobile:
–Temperature of the engine
–Pressure of the brake fluid
–Air pressure in the tires
–Voltages in the wires
–Oil temperature
–Coolant temperature
–Speed of the radiator fan

Slide 67: Bayesian Belief Nets (Causal Networks)

Slide 68: Example: Belief Network for Fish

Slide 69: Simple Belief Network 1

Slide 70: Simple Belief Network 2

Slide 71: Use of Bayes Belief Nets
–Seek to determine some particular configuration of the remaining variables, given the values of some of the variables (the evidence)
–Determine the values of several query variables x given the evidence of all other variables e

Slide 72: Example

Slide 73: Example

Slide 74: Naïve Bayes Rule (Idiot Bayes Rule)
–When the dependency relationships among the features are unknown, we generally make the simplest assumption: the features are conditionally independent given the category
–Often works quite well
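A minimal sketch of the naïve Bayes assumption on the deck's fish theme; all category names, feature names, and probabilities below are hypothetical illustrations, not values from the slides:

```python
# Naive Bayes: P(omega | x) is proportional to P(omega) * prod_i P(x_i | omega),
# treating the features as conditionally independent given the category.

priors = {"salmon": 0.6, "sea_bass": 0.4}

# Hypothetical per-feature conditional probabilities P(feature | category).
cond = {
    "salmon":   {"light": 0.7, "short": 0.8},
    "sea_bass": {"light": 0.2, "short": 0.3},
}

def posterior_scores(features):
    """Return normalized posterior probabilities for each category."""
    scores = {}
    for omega, prior in priors.items():
        p = prior
        for f in features:
            p *= cond[omega][f]   # conditional-independence assumption
        scores[omega] = p
    total = sum(scores.values())
    return {omega: p / total for omega, p in scores.items()}

result = posterior_scores(["light", "short"])
print(result)  # salmon wins for this (hypothetical) observation
```

Normalizing by the sum of the unnormalized scores plays the role of dividing by p(x) in Bayes' rule.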

Slide 75: Applications in Medical Diagnosis
–Uppermost nodes represent fundamental biological agents, such as the presence of a virus or bacteria
–Intermediate nodes describe diseases, such as flu or emphysema
–Lowermost nodes describe symptoms, such as high temperature or coughing
–A physician enters measured values into the net and finds the most likely disease or cause

Slide 76: Compound Bayesian Decision

