Probabilistic Models for Linear Regression
Foundations of Algorithms and Machine Learning (CS60020), IIT KGP, 2017: Indrajit Bhattacharya
Regression Problem

N i.i.d. training samples $\{x_n, y_n\}_{n=1}^N$
Response / Output / Target: $y_n \in \mathbb{R}$
Input / Feature vector: $x_n \in \mathbb{R}^M$

Linear regression: $y_n = w^T x_n + \epsilon_n$
Polynomial regression: $y_n = w^T \phi(x_n) + \epsilon_n$ with $\phi_j(x) = x^j$; still a linear function of $w$
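Both models are linear in $w$; only the feature map $\phi$ changes. A minimal numpy sketch of the monomial basis (the helper name poly_features is illustrative, not from the slides):

```python
import numpy as np

def poly_features(x, degree):
    """Monomial basis phi_j(x) = x**j for j = 0..degree, one row per sample."""
    # increasing=True orders columns as [1, x, x**2, ..., x**degree]
    return np.vander(np.asarray(x, dtype=float), N=degree + 1, increasing=True)

# Example: expand 5 scalar inputs into a 5 x 4 cubic design matrix
Phi = poly_features(np.linspace(0.0, 1.0, 5), degree=3)
```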
Least Squares Formulation
Deterministic error term $\epsilon_n$
Minimize the total error $E(w) = \sum_n \epsilon_n^2 = \sum_n (y_n - w^T x_n)^2$
$w^* = \arg\min_w E(w)$
Find the gradient w.r.t. $w$ and equate it to 0: $w^* = (X^T X)^{-1} X^T y$
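A sketch of this closed form in numpy; np.linalg.lstsq solves the same normal equations without forming the inverse explicitly, which is better conditioned:

```python
import numpy as np

def least_squares(X, y):
    """Least-squares estimate w* = (X^T X)^{-1} X^T y.

    X: N x M design matrix (one sample per row); y: length-N target vector.
    lstsq avoids explicitly inverting X^T X, which can be ill-conditioned.
    """
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w
```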
Regularization for Regression
How does regression overfit?
Adding regularization to regression: minimize $E_1(w, D) + \lambda E_2(w)$
Regularization for Regression
Possibilities for regularizers:
$\ell_2$ norm $w^T w$ (ridge regression): quadratic, so continuous and convex; $w^* = (\lambda I + X^T X)^{-1} X^T y$
$\ell_1$ norm (lasso)
Choosing $\lambda$: cross validation (wastes training data) …
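A sketch of the ridge closed form; for $\lambda > 0$ the matrix $\lambda I + X^T X$ is always invertible, so a direct solve works (lam is an illustrative variable name):

```python
import numpy as np

def ridge(X, y, lam):
    """Ridge estimate w* = (lam * I + X^T X)^{-1} X^T y."""
    M = X.shape[1]
    # For lam > 0 the system is positive definite, hence uniquely solvable
    return np.linalg.solve(lam * np.eye(M) + X.T @ X, X.T @ y)
```

The $\ell_1$ (lasso) penalty has no such closed form and needs an iterative solver.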
Probabilistic formulation
Model X and Y as random variables
Directly model the conditional distribution of Y
IID: $Y_n \mid X_n = x \sim \mathrm{Dist}(y \mid x)$
Linear: $Y_n = w^T X_n + \epsilon_n$, $\epsilon_n \sim \mathrm{Dist}(\epsilon)$
Gaussian noise: $\epsilon_n \sim \mathcal{N}(0, \sigma^2)$, so $p(y \mid x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left\{-\frac{(y - w^T x)^2}{2\sigma^2}\right\}$
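To make the model concrete, a small sketch that samples from this generative story and evaluates the Gaussian log-likelihood (w_true, sigma, and the sizes are illustrative values, not fixed by the slides):

```python
import numpy as np

rng = np.random.default_rng(0)

# Generative model: y = w^T x + eps, eps ~ N(0, sigma^2)
N, M = 100, 3
w_true = rng.normal(size=M)      # illustrative "true" weights
sigma = 0.5                      # illustrative noise level
X = rng.normal(size=(N, M))
y = X @ w_true + rng.normal(scale=sigma, size=N)

def log_likelihood(w, X, y, sigma):
    """Sum of log p(y_n | x_n; w) under the Gaussian noise model."""
    resid = y - X @ w
    n = len(y)
    return -0.5 * n * np.log(2 * np.pi * sigma**2) - resid @ resid / (2 * sigma**2)
```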
Probabilistic formulation
[Figure omitted: image from Michael Jordan's book]
Maximum Likelihood Estimation
Formulate the likelihood:
$L(w) = \prod_n p(y_n \mid x_n; w) = \left(\frac{1}{2\pi\sigma^2}\right)^{N/2} \exp\left\{-\frac{1}{2\sigma^2} \sum_n (y_n - w^T x_n)^2\right\}$
Up to constants, maximizing $\log L(w)$ is minimizing $\ell(w) = \sum_n (y_n - w^T x_n)^2$
Recovers the LMS formulation!
Maximize to get the MLE:
$w_{ML} = (X^T X)^{-1} X^T y$
$\sigma^2_{ML} = \frac{1}{N} \sum_n (y_n - w_{ML}^T x_n)^2$
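A sketch of both ML estimates; $w_{ML}$ coincides with the least-squares solution, and $\sigma^2_{ML}$ is just the mean squared residual:

```python
import numpy as np

def mle(X, y):
    """Gaussian-noise ML estimates: w_ML = (X^T X)^{-1} X^T y and
    sigma^2_ML = (1/N) sum_n (y_n - w_ML^T x_n)**2."""
    w_ml, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ w_ml
    sigma2_ml = resid @ resid / len(y)
    return w_ml, sigma2_ml
```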
Bayesian Linear Regression
Model $W$ as a random variable with prior distribution $p(w) = \mathcal{N}(\mu_0, \Sigma_0)$; $w$ and $\mu_0$ are $M \times 1$, $\Sigma_0$ is $M \times M$
Derive the posterior distribution $p(w \mid y) = \mathcal{N}(\mu_N, \Sigma_N)$ (for some $\mu_N$, $\Sigma_N$)
Derive the mean of the posterior distribution: $w_B = E[W \mid y] = \mu_N$
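The slide leaves $\mu_N$ and $\Sigma_N$ unspecified. Assuming a known noise variance $\sigma^2$, the standard conjugate-Gaussian update (as in, e.g., Bishop's PRML) gives $\Sigma_N = (\Sigma_0^{-1} + \sigma^{-2} X^T X)^{-1}$ and $\mu_N = \Sigma_N (\Sigma_0^{-1} \mu_0 + \sigma^{-2} X^T y)$; a sketch under that assumption:

```python
import numpy as np

def bayes_posterior(X, y, mu0, Sigma0, sigma2):
    """Posterior N(mu_N, Sigma_N) over w, assuming known noise variance sigma2
    and the standard conjugate-Gaussian update (not spelled out on the slide)."""
    prec0 = np.linalg.inv(Sigma0)        # prior precision
    precN = prec0 + X.T @ X / sigma2     # posterior precision
    SigmaN = np.linalg.inv(precN)
    muN = SigmaN @ (prec0 @ mu0 + X.T @ y / sigma2)
    return muN, SigmaN                   # point estimate w_B = mu_N
```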
Iterative Solutions for Normal Equations
Direct solutions have limitations
Iterative solutions
First-order method: gradient descent
$w^{(t+1)} \leftarrow w^{(t)} + \rho \sum_n (y_n - w^{(t)T} x_n)\, x_n$
Convergence guarantees:
Convergence in probability to the correct solution for an appropriate fixed step size
Sure convergence with decreasing step sizes
Stochastic gradient descent:
Update based on a single data point at each step
Often converges faster
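A minimal sketch of both update rules; the step size rho and the iteration counts are illustrative choices, not values from the slides:

```python
import numpy as np

def gradient_descent(X, y, rho=0.01, steps=1000):
    """Batch rule from the slide: w += rho * sum_n (y_n - w^T x_n) x_n."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        w += rho * X.T @ (y - X @ w)
    return w

def sgd(X, y, rho=0.01, epochs=10, seed=0):
    """Stochastic variant: the same update computed from one data point per step."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for n in rng.permutation(len(y)):
            w += rho * (y[n] - w @ X[n]) * X[n]
    return w
```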
Advantages of Probabilistic Modeling
Makes assumptions explicit
Modularity: conceptually simple to change the model by swapping in appropriate distributions
Summary

Probabilistic formulation of linear regression
Recovers the least squares formulation
Iterative algorithms for training
Forms of regularization