
1 Bayesian Learning, Cont’d

2 Administrivia
Various homework bugs:
Due: Oct 12 (Tues), not Oct 9 (Sat)
Problem 3 should read: (duh)
(Some) info on naive Bayes in Sec. 4.3 of text

3 Administrivia
Another bug in last time's lecture: the multivariate Gaussian should look like:
f(x; μ, Σ) = 1/((2π)^{d/2} |Σ|^{1/2}) exp(-(1/2) (x - μ)^T Σ^{-1} (x - μ))
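
A quick numerical sanity check of that density, a minimal sketch rather than anything from the deck: the parameter values below are made up, and it assumes numpy and scipy are available.

    import numpy as np
    from scipy.stats import multivariate_normal

    # Hypothetical 2-D parameters, chosen only for illustration.
    mu = np.array([0.0, 1.0])
    Sigma = np.array([[2.0, 0.3],
                      [0.3, 1.0]])
    x = np.array([0.5, 0.5])

    # Evaluate the formula above directly...
    d = len(mu)
    diff = x - mu
    norm = 1.0 / np.sqrt((2 * np.pi) ** d * np.linalg.det(Sigma))
    manual = norm * np.exp(-0.5 * diff @ np.linalg.solve(Sigma, diff))

    # ...and compare against scipy's reference implementation.
    print(np.isclose(manual, multivariate_normal(mu, Sigma).pdf(x)))  # True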

4 5 minutes of math...
Joint probabilities
Given d different random vars X_1, ..., X_d
The "joint" probability of them taking on the simultaneous values x_1, ..., x_d is given by Pr[X_1 = x_1, ..., X_d = x_d]
Or, for shorthand, Pr[x_1, ..., x_d]
Closely related to the "joint PDF" f(x_1, ..., x_d)

5 5 minutes of math...
Independence
Two random variables are statistically independent iff f(x, y) = f(x) f(y)
Or, equivalently (usually for discrete RVs): Pr[X = x, Y = y] = Pr[X = x] Pr[Y = y]
For multivariate RVs: f(x_1, ..., x_d) = f(x_1) f(x_2) ... f(x_d)
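
To make the factorization concrete, a small sketch (the table is hypothetical, chosen so the check passes): a discrete joint is independent iff it equals the outer product of its marginals.

    import numpy as np

    # Hypothetical joint distribution: rows index X, columns index Y.
    joint = np.array([[0.12, 0.18, 0.30],
                      [0.08, 0.12, 0.20]])  # entries sum to 1

    px = joint.sum(axis=1)  # marginal of X
    py = joint.sum(axis=0)  # marginal of Y

    # Independent iff joint == outer product of marginals, entrywise.
    print(np.allclose(joint, np.outer(px, py)))  # True for this table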

6 Exercise
Suppose you're given the PDF: (where z is a normalizing constant)
What must z be to make this a legitimate PDF?
Are the two variables independent? Why or why not?
What about the PDF:

7 Parameterizing PDFs
Given training data [X, Y], w/ discrete labels Y
Break data out into per-class sets, etc.
Want to come up with models f_1(), f_2(), etc., one per class
Suppose the individual f()s are Gaussian: need the params μ and σ
How do you get the params?
Now, what if the f()s are something really funky you've never seen before in your life, with parameters Θ, etc.?

8 Maximum likelihood
Principle of maximum likelihood: pick the parameters that make the data as probable (or, in general, "likely") as possible
Regard the probability function as a function of two variables, data and parameters: L(X; Θ) = f(X; Θ)
Function L is the "likelihood function"
Want to pick the Θ that maximizes L
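
A tiny illustration of "same function, two readings," using a made-up coin example (not from the slides): with the data fixed at 7 heads in 10 flips, the likelihood becomes a function of the parameter alone.

    # Likelihood of heads-probability theta, with the data held fixed.
    def L(theta, heads=7, flips=10):
        return theta**heads * (1 - theta)**(flips - heads)

    # A grid search shows the likelihood peaks at theta = 0.7 = heads/flips.
    best = max((t / 100 for t in range(1, 100)), key=L)
    print(best)  # 0.7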

9 Example
Consider the exponential PDF: f(x; τ) = (1/τ) e^{-x/τ}, for x ≥ 0
Can think of this as either a function of x or a function of τ

10 Exponential as a fn of x (plot)

11 Exponential as a fn of τ (plot)
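
A sketch that reproduces those two views, assuming matplotlib and the (1/τ) e^{-x/τ} parametrization used above:

    import numpy as np
    import matplotlib.pyplot as plt

    def f(x, tau):
        return (1.0 / tau) * np.exp(-x / tau)

    xs = np.linspace(0.0, 5.0, 200)
    taus = np.linspace(0.2, 5.0, 200)

    fig, (ax1, ax2) = plt.subplots(1, 2)
    ax1.plot(xs, f(xs, 1.0))      # PDF view: tau fixed at 1, x varies
    ax1.set_title("fn of x")
    ax2.plot(taus, f(1.0, taus))  # likelihood view: x fixed at 1, tau varies
    ax2.set_title("fn of tau")
    plt.show()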

12 Max likelihood params
So, for a fixed set of data X, want the parameter Θ that maximizes L
Hold X constant, optimize over Θ
How?
More important: f() is usually a function of a single data point (possibly a vector), but L is a function of a set of data
How do you extend f() to a set of data?

14 IID Samples
In supervised learning, we usually assume that data points are sampled independently and from the same distribution
IID assumption: data are independent and identically distributed
⇒ joint PDF can be written as a product of individual (marginal) PDFs: f(x_1, ..., x_n; Θ) = f(x_1; Θ) f(x_2; Θ) ... f(x_n; Θ)

15 The max likelihood recipe
Start with IID data
Assume model for individual data point, f(X; Θ)
Construct joint likelihood function (PDF): L(X; Θ) = ∏_i f(x_i; Θ)
Find the params Θ that maximize L
(If you're lucky): differentiate L w.r.t. Θ, set = 0, and solve
Repeat for each class
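
The recipe run end-to-end on the exponential example, as a sketch (it assumes the (1/τ) e^{-x/τ} parametrization from slide 9, and the data are simulated):

    import numpy as np

    rng = np.random.default_rng(0)
    data = rng.exponential(scale=2.0, size=1000)  # IID draws, true tau = 2.0

    # log L(tau) = -n log(tau) - sum(x_i)/tau. Differentiating w.r.t. tau
    # and setting to zero gives the closed form tau_hat = sample mean.
    tau_hat = data.mean()
    print(tau_hat)  # lands near 2.0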

16 Exercise
Find the maximum likelihood estimator of μ for the univariate Gaussian: f(x; μ, σ) = (1/(σ √(2π))) e^{-(x - μ)^2 / (2σ^2)}
Find the maximum likelihood estimator of β for the degenerate gamma distribution:
Hint: consider the log of the likelihood fns in both cases
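
For the Gaussian half only, a numerical cross-check with made-up data (the degenerate-gamma density didn't survive the transcript, so it isn't checked here): a grid maximizer of the log-likelihood in μ should match the sample mean.

    import numpy as np

    rng = np.random.default_rng(1)
    x = rng.normal(loc=3.0, scale=1.5, size=500)

    # Log-likelihood in mu, with sigma held fixed and constants dropped.
    def ll(mu):
        return -np.sum((x - mu) ** 2)

    grid = np.linspace(0.0, 6.0, 601)
    mu_numeric = grid[np.argmax([ll(m) for m in grid])]
    print(mu_numeric, x.mean())  # both approximately the sample mean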

17 Putting the parts together (figure: the complete training data [X, Y])

18 5 minutes of math...
Marginal probabilities
If you have a joint PDF f(x, y)... and want to know about the probability of just one RV (regardless of what happens to the others)
Marginal PDF of x: f(x) = ∫ f(x, y) dy
Marginal PDF of y: f(y) = ∫ f(x, y) dx
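
The discrete analogue in a few lines (hypothetical numbers; sums play the role of the integrals):

    import numpy as np

    # Hypothetical joint table: rows index x, columns index y.
    joint = np.array([[0.10, 0.25],
                      [0.30, 0.35]])

    # Marginalize by summing out the variable you don't care about.
    f_x = joint.sum(axis=1)  # [0.35, 0.65]
    f_y = joint.sum(axis=0)  # [0.40, 0.60]
    print(f_x, f_y)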

19 5 minutes of math...
Conditional probabilities
Suppose you have a joint PDF, f(H, W)
Now you get to see one of the values, e.g., H = "183cm"
What's your probability estimate of W, given this new knowledge?
Conditional PDF: f(W | H) = f(H, W) / f(H)

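The same idea on a discrete joint table (invented numbers; row 1 stands in for the observed height bin):

    import numpy as np

    # Hypothetical joint over height (rows) and weight (columns).
    joint = np.array([[0.05, 0.15, 0.10],
                      [0.10, 0.30, 0.30]])

    h = 1                          # observed height bin, e.g. H = "183cm"
    f_h = joint[h].sum()           # marginal probability of that height
    f_w_given_h = joint[h] / f_h   # f(W | H) = f(H, W) / f(H)
    print(f_w_given_h, f_w_given_h.sum())  # conditional sums to 1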

21 Everything's random...
Basic Bayesian viewpoint: treat (almost) everything as a random variable
Data/independent var: X vector
Class/dependent var: Y
Parameters: Θ (e.g., mean, variance, correlations, multinomial params, etc.)
Use Bayes' rule to assess probabilities of classes
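
Mechanically, "use Bayes' rule to assess probabilities of classes" looks like the sketch below; the priors and likelihood values are placeholders, not anything from the lecture.

    import numpy as np

    # Hypothetical two-class setup.
    prior = np.array([0.7, 0.3])         # Pr[Y = 0], Pr[Y = 1]
    likelihood = np.array([0.02, 0.05])  # f(x | Y = 0), f(x | Y = 1) at observed x

    # Bayes' rule: Pr[Y = i | x] proportional to f(x | Y = i) * Pr[Y = i].
    posterior = prior * likelihood
    posterior /= posterior.sum()
    print(posterior)  # approx. [0.483, 0.517]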

