Presentation transcript: "Can Inductive Learning Work?"

1 Can Inductive Learning Work?
[Figure: the example set X, shown as a set of positive (+) and negative (−) examples]
Example set X
Hypothesis space H, of size |H|
Training set D, of size m
Inductive hypothesis h: a hypothesis that agrees with all examples in D
p(x): probability that example x is picked from X
L: the learning algorithm

2 Approximately Correct Hypothesis
h ∈ H is approximately correct (AC) with accuracy ε iff:
Pr[h(x) correct] > 1 − ε
where x is an example picked with probability distribution p from X.
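As a quick illustration (not from the slides), the following Python sketch estimates Pr[h(x) correct] by Monte Carlo sampling; the distribution sample_x, the true concept target, and the hypothesis h are all hypothetical placeholders:

import random

def sample_x():
    # Hypothetical example distribution p over X: integers in [0, 99], uniform.
    return random.randrange(100)

def target(x):
    # Hypothetical true concept: x is a positive example iff x < 60.
    return x < 60

def h(x):
    # Hypothetical learned hypothesis, slightly off the true boundary.
    return x < 55

def is_approximately_correct(hyp, eps, trials=100_000):
    # Estimate Pr[h(x) correct] by sampling x ~ p and comparing to the target.
    hits = sum(hyp(x) == target(x) for x in (sample_x() for _ in range(trials)))
    return hits / trials > 1 - eps

print(is_approximately_correct(h, eps=0.1))   # True: estimated error is about 0.05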

3 PAC Learning Algorithm
A learning algorithm L is Probably Approximately Correct (PAC) with confidence 1 − δ iff the probability that it generates a non-AC hypothesis h is at most δ:
Pr[h is non-AC] ≤ δ
Can L be PAC if the size m of the training set D is large enough? If yes, how big should m be?
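The PAC condition is a statement about the distribution of training sets, not about a single hypothesis. A minimal simulation sketch, assuming the same toy threshold concept as above and a hypothetical consistent learner learn standing in for L, estimates how often the output is non-AC:

import random

def sample_x():
    return random.randrange(100)          # hypothetical distribution p

def target(x):
    return x < 60                         # hypothetical true concept

def learn(data):
    # Hypothetical consistent learner standing in for L: predict positive
    # below a threshold placed just above the largest positive example seen.
    positives = [x for x, label in data if label]
    t = max(positives) + 1 if positives else 0
    return lambda x: x < t

def non_ac_frequency(m, eps, runs=500, trials=2000):
    # Fraction of random training sets for which the learned hypothesis is non-AC.
    bad = 0
    for _ in range(runs):
        data = [(x, target(x)) for x in (sample_x() for _ in range(m))]
        hyp = learn(data)
        err = sum(hyp(x) != target(x) for x in (sample_x() for _ in range(trials))) / trials
        bad += err >= eps
    return bad / runs

# L is PAC with confidence 1 - delta at this (m, eps) if the result is <= delta.
print(non_ac_frequency(m=50, eps=0.1))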

4 Intuition
If m is large enough and g ∈ H is not AC, it is unlikely that g agrees with all examples in the training set D.
So, if m is large enough, there should be few non-AC hypotheses that agree with all examples in D.
Hence, it is unlikely that L will pick one.

5 Can L Be PAC?
Let g be an arbitrary hypothesis in H that is not approximately correct.
Since g is not AC, we have: Pr[g(x) correct] ≤ 1 − ε.
The probability that g is consistent with all m examples in D is at most (1 − ε)^m.
The probability that there exists a non-AC hypothesis matching all examples in D is at most |H|(1 − ε)^m.
Therefore, L is PAC if m satisfies: |H|(1 − ε)^m ≤ δ.
(Reminders: L is PAC if Pr[h is non-AC] ≤ δ; h ∈ H is AC iff Pr[h(x) correct] > 1 − ε.)
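The key step, Pr[g agrees with all of D] ≤ (1 − ε)^m, can be checked numerically: a hypothesis whose error is exactly ε agrees with m independently drawn examples with probability (1 − ε)^m. A short sketch (my own, not from the slides):

import random

def survives(eps, m):
    # A hypothesis with error exactly eps disagrees with each i.i.d. example
    # independently with probability eps; it "survives" iff it agrees with all m.
    return all(random.random() >= eps for _ in range(m))

eps, m, trials = 0.1, 20, 100_000
estimate = sum(survives(eps, m) for _ in range(trials)) / trials
print(estimate, (1 - eps) ** m)   # both close to 0.9**20 ≈ 0.122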

6 Calculus
Let H = {h1, h2, …, h|H|}. For each hi:
Pr(hi is non-AC and agrees with D) ≤ (1 − ε)^m
Summing over all hypotheses (the union bound):
Pr(h1, or h2, …, or h|H| is non-AC and agrees with D) ≤ Σi=1,…,|H| Pr(hi is non-AC and agrees with D) ≤ |H|(1 − ε)^m
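Written in LaTeX as a single chain (a restatement of the slide's argument, not new material), this is:

\begin{align*}
\Pr[\exists\, i:\ h_i \text{ is non-AC and agrees with } D]
  &\le \sum_{i=1}^{|H|} \Pr[h_i \text{ is non-AC and agrees with } D] \\
  &\le |H| \, (1-\epsilon)^m
\end{align*}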

7 Size of Training Set
From |H|(1 − ε)^m ≤ δ we derive:
m ≥ ln(δ/|H|) / ln(1 − ε)
Since ln(1 − ε) < −ε for 0 < ε < 1, it suffices that:
m ≥ ln(δ/|H|) / (−ε), i.e., m ≥ ln(|H|/δ) / ε
So m increases only logarithmically with the size of the hypothesis space.
But how big is |H|?
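A small sketch of the resulting sample-size bound; the values of |H|, ε, and δ are illustrative choices of mine, not from the slides. sample_size implements the simplified bound, sample_size_exact the exact requirement:

import math

def sample_size(h_size, eps, delta):
    # Sufficient m from the simplified bound: m >= ln(|H|/delta) / eps.
    return math.ceil(math.log(h_size / delta) / eps)

def sample_size_exact(h_size, eps, delta):
    # Exact requirement from |H| (1 - eps)^m <= delta.
    return math.ceil(math.log(delta / h_size) / math.log(1 - eps))

print(sample_size(10**6, 0.1, 0.05))        # 169
print(sample_size_exact(10**6, 0.1, 0.05))  # 160: the simple bound is close but conservative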

8 Importance of KIS Bias
If H is the set of all logical sentences built from n observable predicates, then |H| = 2^(2^n), and m is exponential in n (there are 2^n possible examples over n predicates, and each subset of them defines a distinct boolean function).
If H is the set of all conjunctions of k << n observable predicates picked among the n predicates, then |H| = O(n^k), and m is logarithmic in n.
Hence the importance of choosing a "good" KIS (keep it simple) bias.
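To see the gap numerically, here is an illustration (my own; the values of ε, δ, k, and n are arbitrary) that evaluates the bound m ≥ (ln|H| + ln(1/δ))/ε for both hypothesis spaces, working with ln|H| so that 2^(2^n) never has to be formed explicitly:

import math

def m_bound(ln_h_size, eps, delta):
    # m >= (ln|H| + ln(1/delta)) / eps, computed from ln|H| to avoid
    # forming astronomically large |H| values.
    return math.ceil((ln_h_size + math.log(1 / delta)) / eps)

eps, delta, k = 0.1, 0.05, 3
for n in (5, 10, 20):
    ln_all = (2 ** n) * math.log(2)    # |H| = 2^(2^n): all boolean functions
    ln_conj = k * math.log(n)          # |H| = O(n^k), constant factor dropped
    print(n, m_bound(ln_all, eps, delta), m_bound(ln_conj, eps, delta))

Already at n = 20 the unrestricted space requires millions of examples under this bound, while the conjunctive space needs only about a hundred, which is the point of the KIS bias.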

