Download presentation

1
**Pattern Recognition and Machine Learning**

Chapter 1: Introduction

2
Example Handwritten Digit Recognition

3
**Polynomial Curve Fitting**

4
**Sum-of-Squares Error Function**

5
0th Order Polynomial

6
1st Order Polynomial

7
3rd Order Polynomial

8
9th Order Polynomial

9
Over-fitting Root-Mean-Square (RMS) Error:

10
**Polynomial Coefficients**

11
Data Set Size: 9th Order Polynomial

12
Data Set Size: 9th Order Polynomial

13
Regularization Penalize large coefficient values

14
Regularization:

15
Regularization:

16
Regularization: vs.

17
**Polynomial Coefficients**

18
Probability Theory Apples and Oranges

19
**Probability Theory Marginal Probability Conditional Probability**

Joint Probability

20
Probability Theory Sum Rule Product Rule

21
**The Rules of Probability**

Sum Rule Product Rule

22
Bayes’ Theorem posterior likelihood × prior

23
**Probability Densities**

24
**Transformed Densities**

25
**Expectations Conditional Expectation (discrete)**

Approximate Expectation (discrete and continuous)

26
**Variances and Covariances**

27
**The Gaussian Distribution**

28
**Gaussian Mean and Variance**

29
**The Multivariate Gaussian**

30
**Gaussian Parameter Estimation**

Likelihood function

31
**Maximum (Log) Likelihood**

32
Properties of and

33
**Curve Fitting Re-visited**

34
Maximum Likelihood Determine by minimizing sum-of-squares error,

35
**Predictive Distribution**

36
**MAP: A Step towards Bayes**

Determine by minimizing regularized sum-of-squares error,

37
**Bayesian Curve Fitting**

38
**Bayesian Predictive Distribution**

39
Model Selection Cross-Validation

40
**Curse of Dimensionality**

41
**Curse of Dimensionality**

Polynomial curve fitting, M = 3 Gaussian Densities in higher dimensions

42
Decision Theory Inference step Determine either or . Decision step For given x, determine optimal t.

43
**Minimum Misclassification Rate**

44
Minimum Expected Loss Example: classify medical images as ‘cancer’ or ‘normal’ Decision Truth

45
Minimum Expected Loss Regions are chosen to minimize

46
Reject Option

47
**Why Separate Inference and Decision?**

Minimizing risk (loss matrix may change over time) Reject option Unbalanced class priors Combining models

48
**Decision Theory for Regression**

Inference step Determine . Decision step For given x, make optimal prediction, y(x), for t. Loss function:

49
**The Squared Loss Function**

50
**Generative vs Discriminative**

Generative approach: Model Use Bayes’ theorem Discriminative approach: Model directly

51
**Entropy Important quantity in coding theory statistical physics**

machine learning

52
Entropy Coding theory: x discrete with 8 possible states; how many bits to transmit the state of x? All states equally likely

53
Entropy

54
Entropy In how many ways can N identical objects be allocated M bins? Entropy maximized when

55
Entropy

56
Differential Entropy Put bins of width ¢ along the real line Differential entropy maximized (for fixed ) when in which case

57
Conditional Entropy

58
**The Kullback-Leibler Divergence**

59
Mutual Information

Similar presentations

© 2020 SlidePlayer.com Inc.

All rights reserved.

To make this website work, we log user data and share it with processors. To use this website, you must agree to our Privacy Policy, including cookie policy.

Ads by Google