Lecture Slides for Introduction to Machine Learning 2e, Ethem Alpaydın © 2010 The MIT Press (V1.0)
Rationale

In the Bayesian approach, the parameter θ is treated as a random variable with a prior distribution p(θ); observing the sample X updates our belief through Bayes' rule:

    p(θ|X) = p(X|θ) p(θ) / p(X)

Generative model: the prior p(θ) and the likelihood p(X|θ) together describe how the data are generated.
Estimating the Parameters of a Distribution: Discrete Case

x_i^t = 1 if instance t is in state i; the probability of state i is q_i.

Dirichlet prior, with hyperparameters α_i:
    p(q) = Dirichlet(q|α) ∝ ∏_i q_i^(α_i − 1)

Sample likelihood:
    p(X|q) = ∏_t ∏_i q_i^(x_i^t)

Posterior:
    p(q|X) = Dirichlet(q|α + n), where n_i = Σ_t x_i^t

The Dirichlet is a conjugate prior: the posterior has the same functional form as the prior. With K = 2 states, the Dirichlet reduces to the Beta distribution.
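The conjugate update above is just "add the state counts to the hyperparameters". A minimal numpy sketch, with assumed hyperparameters and a made-up three-state sample:

```python
import numpy as np

# Hypothetical 3-state example: posterior over q is Dirichlet(alpha + n),
# where n_i counts how often state i occurred in the sample.
alpha = np.array([2.0, 2.0, 2.0])          # Dirichlet hyperparameters (assumed)
X = np.array([0, 2, 1, 0, 0, 2, 0, 1, 0])  # observed states for t = 1..N

n = np.bincount(X, minlength=alpha.size)   # state counts n_i = sum_t x_i^t
alpha_post = alpha + n                     # conjugacy: posterior is Dirichlet
q_mean = alpha_post / alpha_post.sum()     # posterior mean of q

print(n)           # -> [5 2 2]
print(q_mean)
```

The posterior mean smoothly interpolates between the prior guess and the relative frequencies; with α_i → 0 it approaches the ML estimate n_i/N.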
Estimating the Parameters of a Distribution: Continuous Case

p(x^t) ~ N(μ, σ²), with σ² known. Gaussian prior for the mean μ: p(μ) ~ N(μ₀, σ₀²).

The posterior is also Gaussian, p(μ|X) ~ N(μ_N, σ_N²), where (with sample mean m)

    μ_N = σ²/(Nσ₀² + σ²) · μ₀ + Nσ₀²/(Nσ₀² + σ²) · m
    1/σ_N² = 1/σ₀² + N/σ²
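These two formulas can be checked numerically. A short sketch with assumed values for μ₀, σ₀², and σ²:

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed setup: known variance sigma2, Gaussian prior N(mu0, s0_2) on the mean.
mu_true, sigma2 = 3.0, 4.0
mu0, s0_2 = 0.0, 1.0
x = rng.normal(mu_true, np.sqrt(sigma2), size=50)

N, m = x.size, x.mean()
# Posterior p(mu|X) ~ N(mu_N, s_N2)
mu_N = (sigma2 / (N * s0_2 + sigma2)) * mu0 + (N * s0_2 / (N * s0_2 + sigma2)) * m
s_N2 = 1.0 / (1.0 / s0_2 + N / sigma2)

print(mu_N, s_N2)
```

Note that μ_N is a convex combination of the prior mean μ₀ and the sample mean m, and that the posterior variance σ_N² shrinks as N grows.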
Estimating the Parameters of a Function: Regression

r = w^T x + ε, where the noise has precision β: p(ε) ~ N(0, 1/β), so p(r^t|x^t, w, β) ~ N(w^T x^t, 1/β).

Log likelihood:
    L(w) = log ∏_t p(r^t|x^t, w) = −(β/2) Σ_t (r^t − w^T x^t)² + const

ML solution:
    w_ML = (X^T X)^{-1} X^T r

Gaussian conjugate prior p(w) ~ N(0, (1/α) I). Posterior: p(w|X) ~ N(μ_N, Σ_N), where
    Σ_N = (α I + β X^T X)^{-1}
    μ_N = β Σ_N X^T r
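The posterior mean μ_N is the familiar ridge-regularized least-squares solution. A minimal sketch on synthetic data, with assumed precisions α and β:

```python
import numpy as np

rng = np.random.default_rng(1)

alpha, beta = 1.0, 25.0            # prior and noise precisions (assumed values)
w_true = np.array([1.5, -0.8])     # made-up ground-truth weights
X = rng.normal(size=(100, 2))      # rows are the inputs x^t
r = X @ w_true + rng.normal(0, 1 / np.sqrt(beta), size=100)

# Posterior p(w|X) ~ N(mu_N, Sigma_N)
Sigma_N = np.linalg.inv(alpha * np.eye(2) + beta * X.T @ X)
mu_N = beta * Sigma_N @ X.T @ r

print(mu_N)
```

With plenty of data the prior term α I is dominated by β X^T X and μ_N approaches w_ML.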
Basis/Kernel Functions

For a new input x', the estimate r' is calculated as

    r' = x'^T μ_N = β x'^T Σ_N X^T r = Σ_t β x'^T Σ_N x^t r^t

This is the dual representation: the prediction is a weighted sum over the training instances, with weights given by the linear kernel K(x', x^t) = x'^T Σ_N x^t. For any other basis function φ(x), we can write K(x', x) = φ(x')^T φ(x).
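The primal and dual forms of the prediction are algebraically identical, which a few lines of numpy confirm (all values below are assumed for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)
alpha, beta = 1.0, 10.0            # assumed precisions
X = rng.normal(size=(30, 3))       # training inputs
r = rng.normal(size=30)            # training targets
x_new = rng.normal(size=3)         # new input x'

Sigma_N = np.linalg.inv(alpha * np.eye(3) + beta * X.T @ X)
mu_N = beta * Sigma_N @ X.T @ r

# Primal prediction: r' = x'^T mu_N
r_primal = x_new @ mu_N

# Dual prediction: r' = sum_t beta * K(x', x^t) * r^t, K(x', x) = x'^T Sigma_N x
K_vec = X @ Sigma_N @ x_new        # K(x', x^t) for all t (Sigma_N is symmetric)
r_dual = beta * K_vec @ r

print(np.isclose(r_primal, r_dual))  # True: the two forms agree
```

The dual form never needs the weights explicitly, only kernel evaluations between x' and the training instances, which is what permits swapping in other kernels.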
Kernel Functions
Gaussian Processes

Assume a Gaussian prior p(w) ~ N(0, (1/α) I) and let y = Xw; then E[y] = 0 and Cov(y) = K with K_ij = (1/α) x_i^T x_j.

K is the covariance function, here linear. With a basis function φ(x), K_ij = (1/α) φ(x_i)^T φ(x_j).

With noise precision β, the targets satisfy r ~ N_N(0, C_N), where C_N = (1/β) I + K.

With a new x' added as x^{N+1}, r_{N+1} ~ N_{N+1}(0, C_{N+1}), where

    C_{N+1} = [ C_N   k ]
              [ k^T   c ]

with k = [K(x', x^t)]_t and c = K(x', x') + 1/β. Conditioning on the observed r gives

    p(r'|x', X, r) ~ N(k^T C_N^{-1} r, c − k^T C_N^{-1} k)