Presentation on theme: "Decision making in episodic environments"— Presentation transcript:

1 Decision making in episodic environments
We have just looked at decision making in sequential environments. Now let’s consider the “easier” problem of episodic environments: the agent gets a series of unrelated problem instances and has to make some decision or inference about each of them. This is what most of “machine learning” is about.

2 Example: Image classification
[Figure: input images, each paired with its desired output label: apple, pear, tomato, cow, dog, horse]

3 Example: Spam Filter

4 Example: Seismic data
[Plot: surface wave magnitude vs. body wave magnitude; each point is a seismic event labeled as an earthquake or a nuclear explosion, and the classifier must separate the two classes]

5 The basic classification framework
y = f(x), where y is the output, f is the classification function, and x is the input.
Learning: given a training set of labeled examples {(x1, y1), …, (xN, yN)}, estimate the parameters of the prediction function f.
Inference: apply f to a never-before-seen test example x and output the predicted value y = f(x).
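To make the framework concrete, here is a minimal sketch in Python; the names Classifier, fit, and predict are illustrative choices, not from the slides:

```python
# Minimal sketch of the y = f(x) framework (illustrative names).
class Classifier:
    def fit(self, X_train, y_train):
        """Learning: estimate the parameters of f from labeled pairs (xi, yi)."""
        raise NotImplementedError

    def predict(self, x):
        """Inference: apply f to a never-before-seen example x, returning y."""
        raise NotImplementedError
```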

6 Example: Training and testing
Training set (labels known); test set (labels unknown).
Key challenge: generalization to unseen examples.
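A sketch of how testing is scored, assuming the Classifier interface above: accuracy on held-out labeled examples estimates how well f generalizes rather than memorizes.

```python
# Hypothetical held-out evaluation, assuming the Classifier sketch above.
def accuracy(clf, X_test, y_test):
    correct = sum(clf.predict(x) == y for x, y in zip(X_test, y_test))
    return correct / len(y_test)
```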

7 Naïve Bayes classifier
Decision rule: f(x) = argmax_y P(y) ∏i P(xi | y), where xi is a single dimension or attribute of x. The “naïve” assumption is that the attributes are conditionally independent given the class y.
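A minimal sketch of this classifier for discrete attributes, assuming the decision rule above; the class name and the Laplace smoothing are implementation choices, not from the slides:

```python
import math
from collections import Counter, defaultdict

# Naive Bayes sketch for examples given as tuples of discrete attribute values.
class NaiveBayes:
    def fit(self, X, y):
        self.n = len(y)
        self.class_counts = Counter(y)
        self.values = defaultdict(set)        # distinct values seen per attribute
        self.counts = defaultdict(Counter)    # (class, attribute index) -> value counts
        for xi, yi in zip(X, y):
            for j, v in enumerate(xi):
                self.values[j].add(v)
                self.counts[(yi, j)][v] += 1
        return self

    def predict(self, x):
        def log_posterior(c):
            lp = math.log(self.class_counts[c] / self.n)     # log prior P(y)
            for j, v in enumerate(x):
                num = self.counts[(c, j)][v] + 1             # Laplace smoothing
                den = self.class_counts[c] + len(self.values[j])
                lp += math.log(num / den)                    # log P(xi | y)
            return lp
        return max(self.class_counts, key=log_posterior)
```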

8 Decision tree classifier
Example problem: decide whether to wait for a table at a restaurant, based on the following attributes:
Alternate: is there an alternative restaurant nearby?
Bar: is there a comfortable bar area to wait in?
Fri/Sat: is today Friday or Saturday?
Hungry: are we hungry?
Patrons: number of people in the restaurant (None, Some, Full)
Price: price range ($, $$, $$$)
Raining: is it raining outside?
Reservation: have we made a reservation?
Type: kind of restaurant (French, Italian, Thai, Burger)
WaitEstimate: estimated waiting time (0-10, 10-30, 30-60, >60)
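To show how a learned tree is read as a classifier, here is one plausible hand-written tree over these attributes, written as nested conditionals; the tree actually learned on the following slides may branch differently:

```python
# Illustrative tree only; a learned tree may differ.
def will_wait(example):
    if example["Patrons"] == "None":
        return False                     # empty restaurant: don't wait
    if example["Patrons"] == "Some":
        return True                      # a table will free up soon
    # Patrons == "Full": consult the estimated waiting time
    if example["WaitEstimate"] == ">60":
        return False
    if example["WaitEstimate"] == "0-10":
        return True
    return example["Hungry"]             # 10-30 or 30-60: wait only if hungry

# Usage: will_wait({"Patrons": "Full", "WaitEstimate": "0-10"}) -> True
```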

9 Decision tree classifier

10 Decision tree classifier

11 Nearest neighbor classifier
[Figure: training examples from class 1 and class 2 in feature space, with a test example to be labeled]
f(x) = label of the training example nearest to x.
All we need is a distance function for our inputs. No training required!
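A minimal 1-nearest-neighbor sketch; Euclidean distance over numeric feature vectors is an assumption, and any distance function can be substituted:

```python
import math

# The "model" is just the stored training set plus a distance function.
def nearest_neighbor(X_train, y_train, x, dist=math.dist):
    # f(x) = label of the training example nearest to x
    i = min(range(len(X_train)), key=lambda j: dist(X_train[j], x))
    return y_train[i]
```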

12 Linear classifier Find a linear function to separate the classes
f(x) = sgn(w1x1 + w2x2 + … + wDxD) = sgn(w · x)
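A direct transcription of this rule as code; a sketch, not the slides’ implementation:

```python
# Linear classifier: sign of the dot product w . x. A bias term b can be
# folded in as an extra weight on a constant input of 1.
def sgn(t):
    return 1 if t >= 0 else -1

def linear_classify(w, x):
    return sgn(sum(wi * xi for wi, xi in zip(w, x)))
```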

13 Perceptron
[Diagram: inputs x1, x2, x3, …, xD with weights w1, w2, w3, …, wD feeding a single unit whose output is sgn(w · x + b)]

14 Linear separability

15 Multi-Layer Neural Network
Can learn nonlinear functions.
Training: find network weights to minimize the error between true and estimated labels of training examples, e.g. E(w) = Σi (yi − f(xi))².
Minimization can be done by gradient descent, provided f is differentiable. This training method is called back-propagation.
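A minimal back-propagation sketch: one hidden layer of sigmoid units, squared-error loss, plain gradient descent. The layer sizes, learning rate, and XOR data are illustrative assumptions, not from the slides:

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def train_mlp(X, y, hidden=4, lr=0.5, epochs=5000, seed=0):
    rng = np.random.default_rng(seed)
    W1 = rng.normal(scale=0.5, size=(X.shape[1], hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(scale=0.5, size=(hidden, 1))
    b2 = np.zeros(1)
    for _ in range(epochs):
        # Forward pass: compute the network's estimated labels
        h = sigmoid(X @ W1 + b1)
        out = sigmoid(h @ W2 + b2)
        # Backward pass: gradients of 0.5 * sum((out - y)^2) via the chain rule
        d_out = (out - y) * out * (1 - out)       # sigmoid' = s * (1 - s)
        d_h = (d_out @ W2.T) * h * (1 - h)
        W2 -= lr * (h.T @ d_out); b2 -= lr * d_out.sum(axis=0)
        W1 -= lr * (X.T @ d_h);   b1 -= lr * d_h.sum(axis=0)
    return W1, b1, W2, b2

# Usage: XOR is not linearly separable, but this network can typically fit it.
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([[0.], [1.], [1.], [0.]])
W1, b1, W2, b2 = train_mlp(X, y)
print(sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2).round(2))
```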

16 Differentiable perceptron
[Diagram: inputs x1, …, xd with weights w1, …, wd feeding a single unit whose output is σ(w · x + b), where the sigmoid function is σ(t) = 1 / (1 + e^−t)]
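A sketch of a single gradient-descent step for this unit, assuming squared-error loss; it relies on σ′(t) = σ(t)(1 − σ(t)), which is exactly the gradient that the non-differentiable sgn unit lacks:

```python
import math

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def sgd_step(w, b, x, y, lr=0.1):
    s = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)   # unit output
    g = (s - y) * s * (1 - s)                               # chain rule through sigmoid
    w = [wi - lr * g * xi for wi, xi in zip(w, x)]
    b -= lr * g
    return w, b
```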

17 Review: Types of classifiers
Naïve Bayes
Decision tree
Nearest neighbor
Linear classifier
Nonlinear classifier

