Presentation is loading. Please wait.

Presentation is loading. Please wait.

CS621: Artificial Intelligence Lecture 11: Perceptrons capacity Pushpak Bhattacharyya Computer Science and Engineering Department IIT Bombay.

Similar presentations


Presentation on theme: "CS621: Artificial Intelligence Lecture 11: Perceptrons capacity Pushpak Bhattacharyya Computer Science and Engineering Department IIT Bombay."— Presentation transcript:

1 CS621: Artificial Intelligence Lecture 11: Perceptrons capacity Pushpak Bhattacharyya Computer Science and Engineering Department IIT Bombay

2 The Perceptron Model A perceptron is a computing element with input lines having associated weights and the cell having a threshold value. The perceptron model is motivated by the biological neuron. Output = y wnwn W n-1 w1w1 X n-1 x1x1 Threshold = θ

3 θ 1 y Step function / Threshold function y = 1 for Σw i x i >=θ =0 otherwise ΣwixiΣwixi

4 Threshold functions n # Boolean functions (2^2^n) #Threshold Functions (2 n2 ) 1 4 4 2 16 14 3 256 128 4 64K 1008 Functions computable by perceptrons - threshold functions #TF becomes negligibly small for larger values of #BF. For n=2, all functions except XOR and XNOR are computable.

5 Concept of Hyper-planes ∑ w i x i = θ defines a linear surface in the (W,θ) space, where W= is an n- dimensional vector. A point in this (W,θ) space defines a perceptron. y x1x1... θ w1w1 w2w2 w3w3 wnwn x2x2 x3x3 xnxn

6 Perceptron Property Two perceptrons may have different parameters but same functional values. Example of the simplest perceptron w.x>θ gives y=1 w.x≤θ gives y=0 Depending on different values of w and θ, four different functions are possible θ y x1x1 w1w1

7 Simple perceptron contd. 10101 11000 f4f3f2f1x θ≥0 w≤0 θ≥0 w>0 θ<0 w≤0 θ<0 W<0 0-function Identity Function Complement Function True-Function

8 Counting the number of functions for the simplest perceptron For the simplest perceptron, the equation is w.x=θ. Substituting x=0 and x=1, we get θ=0 and w=θ. These two lines intersect to form four regions, which correspond to the four functions. θ=0 w=θ R1 R2 R3 R4

9 Fundamental Observation The number of TFs computable by a perceptron is equal to the number of regions produced by 2 n hyper-planes,obtained by plugging in the values in the equation ∑ i=1 n w i x i = θ Intuition: How many lines are produced by the existing planes on the new plane? How many regions are produced on the new plane by these lines?

10 The geometrical observation Problem: m linear surfaces called hyper- planes (each hyper-plane is of (d-1)-dim) in d- dim, then what is the max. no. of regions produced by their intersection? i.e. R m,d = ?

11 Concept forming examples Max #regions formed by m lines in 2-dim isR m,2 = R m-1,2 + ? The new line intersects m-1 lines at m-1 points and forms m new regions. R m,2 = R m-1,2 + m, R 1,2 = 2 Max #regions formed by m planes in 3 dimensions is R m,3 = R m-1,3 + R m-1,2, R 1,3 = 2

12 Concept forming examples contd.. Max #regions formed by m planes in 4 dimensions is R m,4 = R m-1,4 + R m-1,3, R 1,4 = 2 R m,d = R m-1,d + R m-1,d-1 Subject to R 1,d = 2 R m,1 = 2

13 General Equation R m,d = R m-1,d + R m-1,d-1 Subject to R 1,d = 2 R m,1 = 2 All the hyperplanes pass through the origin.

14 Method of Observation for lines in 2-D R m,2 = R m-1,2 + m R m-1,2 = R m-2,2 + m-1 R m-2,2 = R m-3,2 + m-2 : R 2,2 = R 1,2 + 2 Therefore, R m,2 = R m-1,2 + m = 2 + m + (m-1) + (m-2) + …+ 2 = 1 + ( 1 + 2 + 3 + … + m) = 1 + [m(m+1)]/2

15 Method of generating function R m,2 = R m-1,2 + m f(x) = R 1,2 x + R 2,2 x 2 + R 3,2 x 3 + … + R i,2 x m + … + α –>Eq1 xf(x) = R 1,2 x 2 + R 2,2 x 3 + R 3,2 x 4 + … + R i,2 x m+1 + … + α –>Eq2 Observe that R m,2 - R m-1,2 = m

16 Method of generating functions cont… Eq1 – Eq2 gives (1-x)f(x) = R 1,2 x + (R 2,2 - R 1,2 )x 2 + (R 3,2 - R 2,2 )x 3 + … + (R m,2 - R m-1,2 )x m + … + α (1-x)f(x) = R 1,2 x + (2x 2 + 3x 3 + …+mx m +..) = 2x 2 + 3x 3 + …+mx m +.. f(x) = (2x 2 + 3x 3 + …+mx m +..)(1-x) -1

17 Method of generating functions cont… f(x) = (2x 2 + 3x 3 + …+mx m +..)(1+x+x 2 +x 3 …)  Eq3 Coeff of x m is R m,2 = (2 + 2 + 3 + 4 + …+m) = 1+[m(m+1)/2]

18 The general problem of m hyperplanes in d dimensional space c(m,d) = c(m-1,d) + c(m-1,d-1) subject to c(m,1)= 2 c(1,d)= 2

19 Generating function f(x,y) = R 1,1 xy + R 1,2 xy 2 + R 1,3 xy 3 + … + R 2,1 x 2 y + R 2,2 x 2 y 2 + R 2,3 x 2 y 3 +... + R 3,1 x 3 y + R 3,2 x 3 y 2 + ……… f(x,y) = ∑ m≥1 ∑ n≥1 R m,d x m y d

20 # of regions formed by m hyperplanes passing through origin in the d dimensional space c(m,d)= 2.Σ d-1 i=0 m-1 c i


Download ppt "CS621: Artificial Intelligence Lecture 11: Perceptrons capacity Pushpak Bhattacharyya Computer Science and Engineering Department IIT Bombay."

Similar presentations


Ads by Google