[slides taken from the UC Berkeley CS course (2006 / 2009)]
Classification (reminder)

X → Y

X can be anything:
– continuous (ℝ, ℝ^d, …)
– discrete ({0,1}, {1,…,k}, …)
– structured (tree, string, …)
– …

Y is discrete:
– {0,1}: binary
– {1,…,k}: multi-class
– tree, etc.: structured

Methods: Perceptron, Logistic Regression, Support Vector Machine, Decision Tree, Random Forest, Kernel trick
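As a concrete instance of the classifiers listed above, here is a minimal perceptron sketch; the toy data, learning schedule, and number of epochs are illustrative assumptions, not part of the course material.

```python
# Minimal perceptron sketch: learn w, b so that sign(X @ w + b)
# matches labels y in {-1, +1}. Toy data is an assumption.
import numpy as np

def perceptron(X, y, epochs=20):
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            if yi * (xi @ w + b) <= 0:  # misclassified: update
                w += yi * xi
                b += yi
    return w, b

# Linearly separable toy data
X = np.array([[2.0, 1.0], [1.0, 3.0], [-1.0, -2.0], [-2.0, -1.0]])
y = np.array([1, 1, -1, -1])
w, b = perceptron(X, y)
preds = np.sign(X @ w + b)
```

On linearly separable data like this, the update rule is guaranteed to converge to a separating hyperplane.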
Regression

X → Y

X can be anything:
– continuous (ℝ, ℝ^d, …)
– discrete ({0,1}, {1,…,k}, …)
– structured (tree, string, …)
– …

Y is continuous:
– ℝ, ℝ^d
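The simplest continuous-output case is least-squares linear regression; a sketch, where the toy data (points lying exactly on y = 2x + 1) is an assumption for illustration:

```python
# Least-squares linear regression sketch for a continuous target.
# Toy data is an assumption: points lie exactly on y = 2x + 1.
import numpy as np

X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([1.0, 3.0, 5.0, 7.0])

A = np.hstack([X, np.ones((len(X), 1))])      # append intercept column
coef, *_ = np.linalg.lstsq(A, y, rcond=None)  # solve min ||A @ coef - y||^2
slope, intercept = coef
```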
[Figure: polynomial fit of degree 15 (overfitting!)]
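The effect in the figure can be reproduced numerically: with 16 noisy samples, a degree-15 polynomial interpolates the training set and drives training error to essentially zero, while a low-degree fit keeps a residual near the noise level. The data-generating process (a sine plus Gaussian noise) is an illustrative assumption.

```python
# Overfitting sketch: training error of a degree-3 vs degree-15 fit.
# Data-generating process and noise level are assumptions.
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 16)
y = np.sin(2 * np.pi * x) + 0.2 * rng.standard_normal(16)

def train_mse(degree):
    coeffs = np.polyfit(x, y, degree)            # least-squares polynomial fit
    return np.mean((np.polyval(coeffs, x) - y) ** 2)

# 16 points, degree 15: exact interpolation, training error collapses
low_degree_err, high_degree_err = train_mse(3), train_mse(15)
```

The degree-15 training error is (numerically) near zero, yet the fitted curve oscillates wildly between the samples: low training error does not imply good generalization.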
Between two models / hypotheses that explain the data equally well, choose the simplest one.

In Machine Learning:
◦ we usually need to trade off between training error and model complexity
◦ this can be formalized precisely in statistics (bias-variance tradeoff, etc.)
Software:
◦ Weka (Java)
◦ RapidMiner (nicer GUI?)
◦ SciKit Learn (Python)

Books:
◦ Pattern Classification (Duda, Hart & Stork)
◦ Pattern Recognition and Machine Learning (Bishop)
◦ Data Mining (Witten, Frank & Hall)
◦ The Elements of Statistical Learning (Hastie, Tibshirani, Friedman)

Programming in Python:
◦ Dan Klein's CS188 course at Berkeley
Kernel Regression

[Figure: kernel regression fit (sigma=1)]
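A common form of kernel regression is the Nadaraya-Watson estimator: the prediction at a query point is a weighted average of the training targets, with weights given by a Gaussian kernel (here sigma=1, matching the plot title). A sketch, where the training data is an illustrative assumption:

```python
# Nadaraya-Watson kernel regression sketch with a Gaussian kernel.
# sigma=1 matches the figure; training data is an assumption.
import numpy as np

def kernel_regression(x_train, y_train, x_query, sigma=1.0):
    """Predict y at each query point as a kernel-weighted average of y_train."""
    d2 = (x_query[:, None] - x_train[None, :]) ** 2  # pairwise squared distances
    w = np.exp(-d2 / (2 * sigma ** 2))               # Gaussian kernel weights
    return (w @ y_train) / w.sum(axis=1)             # weighted average per query

x_train = np.array([0.0, 1.0, 2.0, 3.0])
y_train = np.array([0.0, 1.0, 4.0, 9.0])
y_hat = kernel_regression(x_train, y_train, np.array([1.5]))
```

Because the prediction is a convex combination of the training targets, it always stays within the range of the observed y values; larger sigma gives a smoother (flatter) fit, smaller sigma tracks the data more closely.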