Download presentation

Presentation is loading. Please wait.

Published byAlec Delany Modified over 2 years ago

1
Single Neuron Hung-yi Lee

2
Single Neuron … bias Activation function

3
Learning to say “yes/no” Binary Classification

4
Learning to say yes/no Spam filtering Is an e-mail spam or not? Recommendation systems recommend the product to the customer or not? Malware detection Is the software malicious or not? Stock prediction Will the future value of a stock increase or not with respect to its current value? Binary Classification

5
Example Application: Spam filtering (http://spam-filter-review.toptenreviews.com/) E-mail Spam Not spam

6
Example Application: Spam filtering How to estimate P(yes|x)? What does the function f look like?

7
Example Application: Spam filtering To estimate P(yes|x), collect examples first Yes (Spam) No (Not Spam) Yes (Spam) ……. ….. Earn … free …… free Talk … Meeting … Win … free…… Some words frequently appear in the spam e.g., “free” Use the frequency of “free” to decide if an e-mail is spam x1x1 x2x2 x3x3 Estimate P(yes|x free = k) x free is the number of “free” in e-mail x

8
Regression Frequency of “Free” (x free ) in an e-mail x p(yes|x free ) Problem: What if one day you receive an e-mail with 3 “free” …. In training data, there is no e- mail containing 3 “free”. p(yes| x free = 1 ) = 0.4 p(yes| x free = 0 ) = 0.1

9
Regression f(x free ) = wx free + b (f(x free ) is an estimate of p(yes|x free ) ) Store w and b Frequency of “Free” (x free ) in an e-mail x Regression p(yes|x free )

10
Regression Problem: What if one day you receive an e-mail with 6 “free” …. Frequency of “Free” (x free ) in an e-mail x p(yes|x free ) Regression f(x free ) = wx free + b The output of f is not between 0 and 1

11
Logit vertical line: Probability to be spam p(yes|x free ) (p) p is always between 0 and 1 vertical line: logit(p) x free

12
Logit vertical line: Probability to be spam p(yes|x free ) (p) p is always between 0 and 1 vertical line: logit(p) x free f ’ (x free ) = w ’ x free + b ’ (f ’ (x free ) is an estimate of logit(p) )

13
Logit > 0.5, so “yes” “yes” vertical line: logit(p) x free Store w ’ and b ’ f ’ (x free ) = w ’ x free + b ’ (f ’ (x free ) is an estimate of logit(p) ) (perceptron)

14
Multiple Variables x free Consider two words “free” and “hello” compute p(yes|x free,x hello ) (p) x hello

15
Multiple Variables Regression x free x hello Consider two words “free” and “hello” compute p(yes|x free,x hello ) (p)

16
Multiple Variables Of course, we can consider all words {t 1, t 2, … t N } in a dictionary z is to approximate logit(p)

17
approximate Logistic Regression If the probability p = 1 or 0, ln(p/1-p) = +infinity or –infinity Can not do regression t 1 appears 3 times t 2 appears 0 time … t N appears 1 time x The probability to be spam p is always 1 or 0.

18
Logistic Regression Sigmoid Function

19
Logistic Regression Yes (Spam) x1x1 close to 1 No (not Spam) x2x2 close to 0

20
Logistic Regression … bias feature Yes 1 0 No

21
Logistic Regression … 1 0 Yes No

22
Reference for Linear Classifier http://grzegorz.chrupala.me/papers/ml4nlp/linear- classifiers.pdf

23
More than saying “yes/no” Multiclass Classification

24
More than saying “yes/no” Handwriting digit classification This is Multiclass Classification

25
More than saying “yes/no” Handwriting digit classification Simplify the question: whether an image is “2” or not … feature of an image Each pixel corresponds to one dimension in the feature Describe the characteristics of input object

26
More than saying “yes/no” Handwriting digit classification Simplify the question: whether an image is “2” or not … “2” or not

27
More than saying “yes/no” Handwriting digit classification Binary classification of 1, 2, 3 … … “1” or not “2” or not “3” or not …… If y 2 is the max, then the image is “2”.

28
Limitation of single neuron Motivation of cascading neurons

29
Limitation of Logistic Regression Input Output x1x1 x2x2 00No 01Yes 10 11No Yes No

30
Limitation of Logistic Regression XNOR gate AND OR AND NOT Input Output x1x1 x2x2 00No 01Yes 10 11No 0 0 1 1 0 0 1 0 1 1 1 0 0 1 1 0 1 1

31
Neural Network XNOR gate AND OR AND NOT Hidden Neurons Neural Network

32
a 2 =0.27 a 2 =0.73 a 2 =0.27 a 2 =0.05 a 1 =0.27 a 1 =0.05 a 1 =0.73

33
b (0.27, 0.27) (0.73, 0.05) (0.05,0.73) a 2 =0.27 a 2 =0.73 a 2 =0.27 a 2 =0.05 a 1 =0.27 a 1 =0.05 a 1 =0.73 a1a1 a2a2 a1a1 a2a2

34
Thank you for your listening!

35
Appendix

36
More reference http://www.ccs.neu.edu/home/vip/teach/MLcours e/2_GD_REG_pton_NN/lecture_notes/logistic_regr ession_loss_function/logistic_regression_loss.pdf http://www.ccs.neu.edu/home/vip/teach/MLcours e/2_GD_REG_pton_NN/lecture_notes/logistic_regr ession_loss_function/logistic_regression_loss.pdf http://mathgotchas.blogspot.tw/2011/10/why-is- error-function-minimized-in.html http://mathgotchas.blogspot.tw/2011/10/why-is- error-function-minimized-in.html https://cs.nyu.edu/~yann/talks/lecun-20071207- nonconvex.pdf https://cs.nyu.edu/~yann/talks/lecun-20071207- nonconvex.pdf http://www.cs.columbia.edu/~blei/fogm/lectures/g lms.pdf http://www.cs.columbia.edu/~blei/fogm/lectures/g lms.pdf

37
Logistic Regression x

Similar presentations

OK

Machine Learning Introduction Study on the Coursera All Right Reserved : Andrew Ng Lecturer:Much Database Lab of Xiamen University Aug 12,2014.

Machine Learning Introduction Study on the Coursera All Right Reserved : Andrew Ng Lecturer:Much Database Lab of Xiamen University Aug 12,2014.

© 2018 SlidePlayer.com Inc.

All rights reserved.

Ads by Google

Free ppt on brain machine interface insect Ppt on artificial intelligence for cse Ppt on credit default swaps news Ppt on earth hour chicago Ppt on ar verbs Ppt on solar system for class 8 Ppt on non biodegradable wastes Ppt on polarisation of light Download ppt on water level indicator Ppt on yoga and aerobics classes