Artificial Intelligence 9. Perceptron


1 Artificial Intelligence 9. Perceptron
Japan Advanced Institute of Science and Technology (JAIST) Yoshimasa Tsuruoka

2 Outline
Feature space
Perceptrons
The averaged perceptron
Lecture slides

3 Feature space Instances are represented by vectors in a feature space

4 Feature space Instances are represented by vectors in a feature space
Positive example: <Outlook = sunny, Temperature = cool, Humidity = normal>
Negative example: <Outlook = rain, Temperature = high, Humidity = high>

5 Separating instances with a hyperplane
Find a hyperplane that separates the positive and negative examples

6 Perceptron learning Can always find such a hyperplane if the given examples are linearly separable

7 Linear classification
Binary classification with a linear model: an instance is represented by a feature vector x, and the model consists of a weight vector w and a bias b. If the inner product of the feature vector with the weight vector plus the bias (w·x + b) is greater than or equal to zero, the instance is classified as a positive example; otherwise it is classified as a negative example.
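The classification rule above can be sketched in a few lines of Python (NumPy). The function name and the example numbers are illustrative, not from the lecture:

```python
import numpy as np

def predict(w, x, b=0.0):
    """Classify x as positive (+1) if w.x + b >= 0, else negative (-1)."""
    return 1 if np.dot(w, x) + b >= 0 else -1

# Hypothetical weight vector and feature vectors
w = np.array([1.0, -2.0, 0.5])
print(predict(w, np.array([1.0, 0.0, 1.0])))  # w.x = 1.5 >= 0, so +1
print(predict(w, np.array([0.0, 1.0, 0.0])))  # w.x = -2.0 < 0, so -1
```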

8 The Perceptron learning algorithm
1. Initialize the weight vector (e.g., to all zeros).
2. Choose an example (randomly) from the training data.
3. If it is not classified correctly:
   if it is a positive example, add its feature vector to the weight vector;
   if it is a negative example, subtract its feature vector from the weight vector.
Steps 2 and 3 are repeated until all examples are correctly classified.
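A minimal sketch of these steps in Python (NumPy), assuming labels of +1/-1 and a bias folded into the feature vector; the function name and the epoch cap are my own choices, and the examples are visited in order rather than randomly:

```python
import numpy as np

def train_perceptron(X, y, max_epochs=100):
    """Perceptron learning with +1/-1 labels: add x on a misclassified
    positive, subtract x on a misclassified negative."""
    w = np.zeros(X.shape[1])              # step 1: initialize the weight vector
    for _ in range(max_epochs):
        mistakes = 0
        for x_i, y_i in zip(X, y):        # step 2: visit each training example
            if y_i * np.dot(w, x_i) <= 0: # step 3: not classified correctly?
                w += y_i * x_i            # +x for positive, -x for negative
                mistakes += 1
        if mistakes == 0:                 # all examples correctly classified
            return w
    return w
```

On linearly separable data (such as the OR examples below) the loop terminates with a separating weight vector.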

9 Learning the concept OR
Training data (feature vectors <bias, s, t>, where the concept is s OR t):
x1 = <1, 0, 0> : Negative
x2 = <1, 0, 1> : Positive
x3 = <1, 1, 0> : Positive
x4 = <1, 1, 1> : Positive

10 Iteration 1 x1 Wrong!

11 Iteration 2 x4 Wrong!

12 Iteration 3 x2 OK!

13 Iteration 4 x3 OK!

14 Iteration 5 x1 Wrong!

15 Separating hyperplane
Final weight vector. [Figure: the separating hyperplane plotted in the (s, t) plane.] s and t are the input (the second and the third elements of the feature vector).

16 Why the update rule works
When a positive example x has not been correctly classified, the score w·x was too small (negative). After the update w ← w + x, the new score is w·x + x·x, i.e., the original value plus x·x, which is always positive. The update rule therefore raises the score and makes it less likely for the perceptron to make the same mistake.
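A quick numeric check of this argument, with a made-up weight vector and positive example:

```python
import numpy as np

# A positive example x that w misclassifies: the score w.x is too small (< 0)
w = np.array([-1.0, 0.5, -0.5])
x = np.array([1.0, 1.0, 0.0])     # hypothetical positive example
before = np.dot(w, x)             # -0.5, so it is misclassified
w_new = w + x                     # perceptron update for a positive example
after = np.dot(w_new, x)          # 1.5
# The score rises by exactly x.x (= 2.0 here), which is always positive
print(before, after)
```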

17 Convergence The Perceptron training algorithm converges after a finite number of iterations to a hyperplane that perfectly classifies the training data, provided the training examples are linearly separable. The number of iterations can be very large. The algorithm does not converge if the training data are not linearly separable.
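The non-convergence on non-separable data can be seen with XOR, the classic non-separable example (my own illustration, not from the slides): the per-epoch mistake count never reaches zero, so the stopping condition is never met.

```python
import numpy as np

def epoch_mistakes(w, X, y):
    """One pass of perceptron updates (in place); return the mistake count."""
    mistakes = 0
    for x_i, y_i in zip(X, y):
        if y_i * np.dot(w, x_i) <= 0:
            w += y_i * x_i
            mistakes += 1
    return mistakes

# XOR with a bias feature: not linearly separable
X = np.array([[1, 0, 0], [1, 0, 1], [1, 1, 0], [1, 1, 1]], dtype=float)
y = np.array([-1, 1, 1, -1])
w = np.zeros(3)
history = [epoch_mistakes(w, X, y) for _ in range(50)]
print(min(history))  # never 0: no epoch classifies everything correctly
```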

18 Learning the PlayTennis concept
Feature space: 11 binary features. Perceptron learning converged in 239 steps.
Final weight vector:
Bias
Outlook = Sunny: -3
Outlook = Overcast: 5
Outlook = Rain: -2
Temperature = Hot
Temperature = Mild: 3
Temperature = Cool
Humidity = High: -4
Humidity = Normal: 4
Wind = Strong
Wind = Weak

19 Averaged Perceptron A variant of the Perceptron learning algorithm
Output the weight vector averaged over the iterations, rather than the final weight vector. Do not wait until convergence: determine when to stop by observing the performance on a validation set. Practical and widely used.
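A minimal sketch of the averaging variant in Python (NumPy), again assuming +1/-1 labels and a bias feature folded into the vector; the fixed epoch count stands in for the validation-based stopping the slide recommends:

```python
import numpy as np

def averaged_perceptron(X, y, epochs=10):
    """Averaged perceptron: return the weight vector averaged over all
    iterations, instead of the final weight vector."""
    w = np.zeros(X.shape[1])
    w_sum = np.zeros_like(w)
    n = 0
    for _ in range(epochs):
        for x_i, y_i in zip(X, y):
            if y_i * np.dot(w, x_i) <= 0:
                w += y_i * x_i    # ordinary perceptron update
            w_sum += w            # accumulate the weights at every step
            n += 1
    return w_sum / n              # averaged weight vector
```

Averaging damps the effect of the last few updates, which is why it tends to generalize better than the final weight vector.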

20 Naive Bayes vs Perceptrons
The naive Bayes model assumes conditional independence between features, so adding informative features does not necessarily improve its performance. Perceptrons allow one to incorporate diverse types of features, but the training takes longer.

