
1 Probabilistic Models for Linear Regression
Foundations of Algorithms and Machine Learning (CS60020), IIT KGP, 2017: Indrajit Bhattacharya

2 Regression Problem
N iid training samples $\{x_n, y_n\}$
Response / Output / Target: $y_n \in \mathbb{R}$
Input / Feature vector: $X \in \mathbb{R}^d$
Linear regression: $y_n = w^T x_n + \epsilon_n$
Polynomial regression: $y_n = w^T \phi(x_n) + \epsilon_n$, with basis $\phi_j(x) = x^j$ (see the sketch below)
Still a linear function of $w$
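Since the basis expansion is what keeps polynomial regression linear in $w$, a minimal sketch may help; the snippet is illustrative (not from the slides), assuming NumPy and scalar inputs:

```python
import numpy as np

def poly_features(x, degree):
    """Map scalar inputs x to columns [x^0, x^1, ..., x^degree]."""
    # Each column j holds x**j, so y = Phi @ w is still linear in w.
    return np.vander(x, N=degree + 1, increasing=True)

x = np.linspace(0.0, 1.0, 5)
Phi = poly_features(x, degree=3)   # design matrix, shape (5, 4)
```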

3 Least Squares Formulation
Treat the error term $\epsilon_n$ as deterministic
Minimize the total error $E(w) = \sum_n \epsilon_n^2 = \sum_n (y_n - w^T x_n)^2$
$w^* = \arg\min_w E(w)$
Find the gradient with respect to $w$ and equate it to 0:
$w^* = (X^T X)^{-1} X^T y$
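A minimal sketch of this closed form (the synthetic data and function name are illustrative assumptions; in practice `np.linalg.lstsq` is numerically more robust than the normal equations):

```python
import numpy as np

def least_squares(X, y):
    """Solve the normal equations w* = (X^T X)^{-1} X^T y."""
    # Solving the linear system avoids forming an explicit inverse.
    return np.linalg.solve(X.T @ X, X.T @ y)

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true + 0.1 * rng.normal(size=100)
w_star = least_squares(X, y)   # recovers w_true up to noise
```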

4 Regularization for Regression
How does regression overfit?
Adding regularization to regression: minimize $E_1(w, D) + \lambda E_2(w)$

5 Regularization for Regression
Possibilities for regularizers:
$\ell_2$ norm $w^T w$ (ridge regression): quadratic, hence continuous and convex; $w^* = (\lambda I + X^T X)^{-1} X^T Y$ (sketched below)
$\ell_1$ norm (lasso)
Choosing $\lambda$: cross validation wastes training data …
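A minimal sketch of the ridge closed form above (names are illustrative; $\lambda = 0$ recovers plain least squares):

```python
import numpy as np

def ridge(X, y, lam):
    """Solve w* = (lam * I + X^T X)^{-1} X^T y."""
    d = X.shape[1]
    return np.linalg.solve(lam * np.eye(d) + X.T @ X, X.T @ y)
```

Larger $\lambda$ shrinks $w$ toward zero, trading higher bias for lower variance.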

6 Probabilistic formulation
Model X and Y as random variables
Directly model the conditional distribution of Y
IID: $Y_i \mid X_i = x \sim_{iid} p(y \mid x)$
Linear: $Y_i = w^T X_i + \epsilon_i$, where $\epsilon_i \sim_{iid} p(\epsilon)$
Gaussian noise: $p(\epsilon) = N(0, \sigma^2)$, giving $p(y \mid x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left\{ -\frac{(y - w^T x)^2}{2\sigma^2} \right\}$
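A minimal sketch of this generative story (the sampling code and names are illustrative assumptions, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)
w, sigma = np.array([2.0, -1.0]), 0.5

X = rng.normal(size=(200, 2))
eps = rng.normal(0.0, sigma, size=200)   # epsilon_i ~iid N(0, sigma^2)
y = X @ w + eps                          # Y_i = w^T X_i + epsilon_i

def log_p_y_given_x(y, x, w, sigma):
    """Log of the Gaussian conditional density above."""
    return -0.5 * np.log(2 * np.pi * sigma**2) - (y - x @ w)**2 / (2 * sigma**2)
```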

7 Probabilistic formulation
[Figure from Michael Jordan's book]

8 Maximum Likelihood Estimation
Formulate the likelihood:
$L(w) = \prod_n p(y_n \mid x_n; w) = \left( \frac{1}{2\pi\sigma^2} \right)^{N/2} \exp\left\{ -\frac{1}{2\sigma^2} \sum_n (y_n - w^T x_n)^2 \right\}$
so maximizing the log-likelihood $\ell(w)$ in $w$ amounts to minimizing $\sum_n (y_n - w^T x_n)^2$
Recovers the least-squares (LMS) formulation!
Maximize to get the MLE:
$w_{ML} = (X^T X)^{-1} X^T Y$
$\sigma^2_{ML} = \frac{1}{N} \sum_n (y_n - w_{ML}^T x_n)^2$
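A minimal sketch of both MLE formulas (illustrative names; assumes $X^T X$ is invertible):

```python
import numpy as np

def fit_mle(X, y):
    """MLE for Gaussian linear regression: returns (w_ML, sigma^2_ML)."""
    w_ml = np.linalg.solve(X.T @ X, X.T @ y)   # identical to least squares
    resid = y - X @ w_ml
    sigma2_ml = np.mean(resid ** 2)            # (1/N) * sum of squared residuals
    return w_ml, sigma2_ml
```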

9 Bayesian Linear Regression
Model W as a random variable with prior distribution $p(w) = N(m_0, S_0)$, where $w$ and $m_0$ are $M \times 1$ and $S_0$ is $M \times M$
Derive the posterior distribution $p(w \mid y) = N(m_N, S_N)$ (for some $m_N, S_N$)
Derive the mean of the posterior distribution: $w_B = E[W \mid y] = m_N$
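The slide leaves $m_N$ and $S_N$ unspecified; the standard Gaussian-conjugacy result (see, e.g., Bishop, PRML, Section 3.3) is $S_N^{-1} = S_0^{-1} + \sigma^{-2} X^T X$ and $m_N = S_N (S_0^{-1} m_0 + \sigma^{-2} X^T y)$. A minimal sketch, assuming the noise variance $\sigma^2$ is known:

```python
import numpy as np

def posterior(X, y, m0, S0, sigma2):
    """Posterior N(m_N, S_N) over w for a Gaussian prior and Gaussian noise."""
    S0_inv = np.linalg.inv(S0)
    SN = np.linalg.inv(S0_inv + (X.T @ X) / sigma2)
    mN = SN @ (S0_inv @ m0 + (X.T @ y) / sigma2)
    return mN, SN   # m_N is the posterior-mean estimate w_B
```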

10 Iterative Solutions for Normal Equations
Direct solutions have limitations
Iterative solutions:
First-order method: gradient descent
$w^{(t+1)} \leftarrow w^{(t)} + \rho \sum_n (y_n - w^{(t)T} x_n) x_n$
Convergence guarantees:
Convergence in probability to the correct solution for an appropriately chosen fixed step size
Sure convergence with decreasing step sizes
Stochastic gradient descent: update based on a single data point at each step; often converges faster (see the sketch below)
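A minimal sketch of the stochastic variant (the batch update above just sums these per-point steps; the hyperparameters are illustrative assumptions):

```python
import numpy as np

def sgd(X, y, rho=0.01, epochs=50, seed=0):
    """Stochastic gradient descent for least-squares linear regression."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for n in rng.permutation(len(y)):
            # Per-point update: w <- w + rho * (y_n - w^T x_n) * x_n
            w += rho * (y[n] - X[n] @ w) * X[n]
    return w
```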

11 Advantages of Probabilistic Modeling
Makes assumptions explicit
Modularity: conceptually simple to change a model by swapping in appropriate distributions

12 Summary
Probabilistic formulation of linear regression
Recovers the least-squares formulation
Iterative algorithms for training
Forms of regularization

