
1 PAC Learning adapted from Tom M. Mitchell, Carnegie Mellon University

2 Learning Issues Under what conditions is successful learning … possible? … assured for a particular learning algorithm?

3 Sample Complexity How many training examples are needed … for a learner to converge (with high probability) to a successful hypothesis?

4 Computational Complexity How much computational effort is needed … for a learner to converge (with high probability) to a successful hypothesis?

5 The world X is the sample space. Example: two dice, X = {(1,1), (1,2), …, (6,5), (6,6)}

6 A weighted world (X, 𝒟) is X together with a distribution 𝒟 over X. Example: biased dice, {(1,1; p_11), (1,2; p_12), …, (6,5; p_65), (6,6; p_66)}

7 An event E is a subset of X. Example: two dice, {(1,1), (1,2), …, (6,5), (6,6)}

8 An event E is a subset of X. Example: a pair in two dice, E = {(1,1), (2,2), (3,3), (4,4), (5,5), (6,6)}

9 A concept c is the indicator function of an event E. Example: a pair in two dice, c(x,y) := (x == y)
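A minimal Python sketch tying slides 5–9 together (the uniform probabilities below are illustrative; any non-negative weights summing to 1 would model the biased dice of slide 6):

```python
from itertools import product

# Sample space X: all outcomes of rolling two dice (slide 5).
X = list(product(range(1, 7), repeat=2))

# A distribution over X (slide 6). Uniform here for simplicity; biased dice
# would assign different weights p[(x, y)], as long as they sum to 1.
p = {outcome: 1.0 / len(X) for outcome in X}

# Concept c: the indicator function of the event "a pair" (slides 8-9).
def c(x, y):
    return x == y

# Probability of the event E = {(x, y) in X : c(x, y)}.
prob_pair = sum(p[(x, y)] for (x, y) in X if c(x, y))
print(prob_pair)  # 1/6 under the uniform distribution
```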

10 A hypothesis h is an approximation to a concept c. Example: a separating hyperplane, h(x,y) := 0.5·[1 + sign(a·x + b·y + c)]
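A sketch of the separating-hyperplane hypothesis (the coefficient values a, b, c below are placeholders, not learned ones):

```python
def h(x, y, a=1.0, b=-1.0, c=0.0):
    """Hyperplane hypothesis from slide 10: h(x, y) = 0.5 * [1 + sign(a*x + b*y + c)].
    Returns 1.0 on the positive side of the hyperplane and 0.0 otherwise.
    Ties (a*x + b*y + c == 0) are assigned to the negative side here."""
    sign = 1.0 if a * x + b * y + c > 0 else -1.0
    return 0.5 * (1.0 + sign)
```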

11 The dataset D is an i.i.d. sample from (X, 𝒟): D = {(x_i, c(x_i))}, i = 1, …, m (m examples)

12 An inductive learner L is an algorithm that uses the data D to produce a hypothesis h ∈ H. Example: the perceptron algorithm, h(x,y) := 0.5·[1 + sign(a(D)·x + b(D)·y + c(D))]
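A hedged sketch of slide 12's example, a perceptron producing the data-dependent coefficients a(D), b(D), c(D) (the dataset format, learning rate, and epoch count are assumptions made for illustration):

```python
def perceptron(D, epochs=100, lr=1.0):
    """Perceptron sketch: learn coefficients (a, b, c) from labelled data D,
    where D is a list of ((x, y), label) pairs with label in {0.0, 1.0}.
    The returned triple plugs into the hypothesis h(x, y, a, b, c) above."""
    a = b = c = 0.0
    for _ in range(epochs):
        for (x, y), label in D:
            pred = 1.0 if a * x + b * y + c > 0 else 0.0
            update = lr * (label - pred)   # zero when the example is already correct
            a += update * x
            b += update * y
            c += update
    return a, b, c
```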

13 Error Measures. Training error of hypothesis h: how often h(x) ≠ c(x) over the training instances. True error of hypothesis h: how often h(x) ≠ c(x) over future instances drawn at random from 𝒟.
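A sketch of the two measures in code (the helpers h, c, and a sampler for the instance distribution are assumed, e.g. from the sketches above; the true error is only estimated here, by Monte Carlo):

```python
def training_error(h, D):
    """Fraction of training examples ((x, y), label) that h misclassifies."""
    mistakes = sum(1 for x, label in D if h(*x) != label)
    return mistakes / len(D)

def estimated_true_error(h, c, sample, n=10_000):
    """Monte Carlo estimate of the true error: draw n fresh instances from the
    underlying distribution via sample() and count disagreements with the target c."""
    mistakes = 0
    for _ in range(n):
        x = sample()
        if h(*x) != c(*x):
            mistakes += 1
    return mistakes / n
```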

14 True error
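In symbols (following the standard definition in Mitchell's Machine Learning, Ch. 7, and writing 𝒟 for the distribution over X), the true error of h with respect to the target concept c is the probability that h misclassifies an instance drawn at random from 𝒟:

```latex
\mathrm{error}_{\mathcal{D}}(h) \;\equiv\; \Pr_{x \sim \mathcal{D}}\bigl[\, c(x) \neq h(x) \,\bigr]
```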


16 Learnability How should we describe learnability? One candidate: the number of training examples needed to learn a hypothesis h for which error_𝒟(h) = 0. Infeasible: the training sample is drawn at random and cannot guarantee zero error on unseen instances.

17 PAC Learnability Weaken the demands on the learner: ε bounds the true error (accuracy), δ bounds the failure probability; ε and δ can be made arbitrarily small. Probably Approximately Correct learning.
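In symbols, the probably-approximately-correct requirement on the learned hypothesis h is (with 𝒟 the distribution over X):

```latex
\Pr\bigl[\, \mathrm{error}_{\mathcal{D}}(h) \le \varepsilon \,\bigr] \;\ge\; 1 - \delta
```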

18 PAC Learnability C is PAC-learnable by L: true error < ε with probability ≥ (1 − δ), after a reasonable number of examples and reasonable time per example. "Reasonable" means polynomial in 1/ε, 1/δ, n (the size of the examples), and the encoding length of the target concept.

19 PAC Learnability
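Spelled out (a paraphrase of the standard definition in Mitchell's text; size(c) denotes the encoding length of the target concept): C is PAC-learnable by L using hypothesis space H if, for every target concept c ∈ C, every distribution 𝒟 over X, and every ε, δ with 0 < ε, δ < 1/2, the learner L outputs, with probability at least 1 − δ, a hypothesis h ∈ H satisfying

```latex
\mathrm{error}_{\mathcal{D}}(h) \;\le\; \varepsilon
```

in time polynomial in 1/ε, 1/δ, n, and size(c).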

20 C is PAC-learnable: each target concept in C can be learned from a polynomial number of training examples, and the processing time per example is also polynomially bounded; polynomial in 1/ε, 1/δ, n (the size of the examples), and the encoding length of the target concept c.

