Machine learning overview


1 Machine learning overview
A computational method that improves performance on a task by using training data. The slide's figure shows a neural network, but other ML methods can be substituted.

2 Task (prediction from input)
Possible tasks:
Classification – identify one of several categories, possibly with some inputs missing.
Regression/approximation – approximate a real-valued function of specified inputs.
Generation – produce output of a specified type as a function of the input (e.g., a description from a scene, or a scene from a description).
Image processing – denoising, inpainting, super-resolution.
Many others.

3 Performance (loss function)
The correct measure depends on the task.
Classification yields a probability distribution; use cross-entropy to compare it to the true distribution during training. What we really care about is accuracy (the fraction of correct classifications out of the total), but that is difficult to train on directly.
Image comparison often uses mean squared error: the sum of squared differences between pixels, divided by the number of pixels. The problem is that both good and bad images can lie at the same distance from a target image, which can lead to blurry output or artifacts. We may need a metric that weights some directions more heavily than others.
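A minimal NumPy sketch of both losses (not from the slides; the function and array names are illustrative):

import numpy as np

def cross_entropy(probs, target):
    # probs: predicted class probabilities; target: true distribution (often one-hot)
    eps = 1e-12                                  # avoid log(0)
    return -np.sum(target * np.log(probs + eps))

def mse(img_a, img_b):
    # sum of squared pixel differences divided by the number of pixels
    return np.mean((img_a.astype(float) - img_b.astype(float)) ** 2)

# Example: a 3-class prediction versus a one-hot true distribution.
print(cross_entropy(np.array([0.7, 0.2, 0.1]), np.array([1.0, 0.0, 0.0])))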

4 Datasets and learning type
Supervised learning: training data consists of pairs of inputs and targets. The goal is to learn the mapping from input to target, e.g., noisy image to clean image, or a sentence in English to its translation in French.
Unsupervised learning: training data is just a set of inputs. The goal is to learn something about the distribution of these inputs in the underlying space, e.g., learn a low-dimensional representation for a set of natural images, as in an autoencoder.
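A minimal unsupervised-learning sketch on synthetic data, using PCA via the SVD as a stand-in for the autoencoder example (a linear autoencoder recovers the same subspace):

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 50))      # 500 inputs, no targets

X_centered = X - X.mean(axis=0)
# Rows of Vt are the principal directions of the centered data.
U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
Z = X_centered @ Vt[:2].T           # low-dimensional (2-D) code for each input
print(Z.shape)                      # (500, 2)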

5 Linear regression Task: predict y from x, using ŷ = wᵀx or ŷ = wᵀx + b.
Other forms are possible, such as ŷ = w₀ + w₁x + w₂x² + ⋯ + wₚxᵖ. This is still linear in the parameters w.
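A sketch of fitting such a linear-in-the-parameters model by least squares (synthetic data; the coefficients below are made up for illustration):

import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(-1, 1, size=100)
y = 0.5 + 2.0 * x - 1.5 * x**2 + 0.1 * rng.normal(size=100)

# Feature matrix [1, x, x^2]: nonlinear in x, still linear in the parameters w.
Phi = np.stack([np.ones_like(x), x, x**2], axis=1)
w, *_ = np.linalg.lstsq(Phi, y, rcond=None)
print(w)                            # close to [0.5, 2.0, -1.5]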

6 Linear regression Divide data into training set and testing set.
Performance is the MSE on the training set: MSE(w) = (1/m) Σᵢ (wᵀxᵢ − yᵢ)². This is a function of w. The gradient of this function with respect to w gives the direction (vector) in w-space of maximum increase. Take a step in the negative gradient direction to decrease the function: gradient descent.
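A minimal gradient-descent sketch for this loss (step size and iteration count are illustrative choices, not from the slides):

import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(200, 3))                   # training inputs
true_w = np.array([1.0, -2.0, 0.5])
y = X @ true_w + 0.1 * rng.normal(size=200)     # training targets

w = np.zeros(3)
lr = 0.1                                        # step size
for _ in range(500):
    grad = (2.0 / len(y)) * X.T @ (X @ w - y)   # gradient of MSE(w) with respect to w
    w -= lr * grad                              # step in the negative gradient direction
print(w)                                        # close to true_w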

7 Linear regression (and ML in general)
Goal: perform well on test data using only the training data.
Assumption: test data and training data are drawn from the same distribution (i.i.d.: independent, identically distributed samples).
Refined goal: make the training error small, and make the difference between training error and test error small.
General strategy: decompose the error into these two pieces and study each piece separately, as in the sketch below.
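A sketch of measuring the two pieces on held-out data (synthetic linear-regression data, as in the earlier sketches):

import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(300, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=300)

X_train, y_train = X[:200], y[:200]             # training set
X_test, y_test = X[200:], y[200:]               # testing set

w, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)
train_err = np.mean((X_train @ w - y_train) ** 2)
test_err = np.mean((X_test @ w - y_test) ** 2)
print(train_err, test_err - train_err)          # training error, and the train/test gap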

8 Capacity and data fitting
Capacity: a measure of the ability to fit complex data. Increased capacity means we can make the training error small.
Overfitting: like memorizing the training inputs. Capacity is large enough to reproduce the training data, but the model does poorly on test data: too much capacity for the available data.
Underfitting: like ignoring details. Not enough capacity for the available detail.
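A sketch of varying capacity on fixed data, with polynomial degree standing in for capacity (degrees and sample sizes are illustrative):

import numpy as np

rng = np.random.default_rng(4)
x = rng.uniform(-1, 1, size=30)                         # training data
y = np.sin(3 * x) + 0.1 * rng.normal(size=30)
x_test = rng.uniform(-1, 1, size=200)                   # test data
y_test = np.sin(3 * x_test) + 0.1 * rng.normal(size=200)

for degree in (1, 4, 15):                               # underfit, good fit, overfit
    coeffs = np.polyfit(x, y, degree)
    train_err = np.mean((np.polyval(coeffs, x) - y) ** 2)
    test_err = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    print(degree, train_err, test_err)                  # high degree: tiny training error, large test error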

9 Capacity and data fitting

10 Capacity and generalization error

11 Regularization Sometimes minimizing the performance or loss function directly promotes overfitting, e.g., the Runge phenomenon when interpolating by a polynomial using evenly spaced points.
(Figure: red = target function, blue = degree-5 fit, green = degree-9 fit.) The output is linear in the coefficients.

12 Regularization Can get a better fit by using a penalty on the coefficients, e.g., minimize MSE(w) + λ‖w‖² instead of MSE(w) alone.
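A sketch of the penalized fit on the Runge setup (the λ‖w‖² penalty gives the ridge-regression solution; the target function and the λ value are illustrative):

import numpy as np

x = np.linspace(-1, 1, 10)              # evenly spaced sample points
y = 1.0 / (1.0 + 25.0 * x**2)           # Runge's function as the target

degree, lam = 9, 1e-3
Phi = np.vander(x, degree + 1)          # polynomial features of x
w = np.linalg.solve(Phi.T @ Phi + lam * np.eye(degree + 1), Phi.T @ y)
print(w)                                # coefficients stay modest; the fit oscillates less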

