
1 Ch 2. The Probably Approximately Correct Model and the VC Theorem (Section 2.3 of The Computational Nature of Language Learning and Evolution, Partha Niyogi, 2004). Summarized by J. Yang, Biointelligence Laboratory, Seoul National University, http://bi.snu.ac.kr/

2 Contents
2.3 The Probably Approximately Correct (PAC) Model and the VC Theorem
- 2.3.1 Sets and Indicator Functions
- 2.3.2 Graded Distance
- 2.3.3 Examples and Learnability
- 2.3.4 The Vapnik-Chervonenkis (VC) Theorem
- 2.3.5 Proof of Lower Bound for Learning
- 2.3.6 Implications
- 2.3.7 Complexity of Learning
- 2.3.8 Final Words

3 Sets, indicator functions, graded distance
Classes of functions: a concept class F and a hypothesis class H, whose members are functions f: X → Y, with X = Σ* (the set of all strings) and Y = {0, 1}.
For a language L ⊆ Σ*, the indicator function is defined as 1_L(x) = 1 if x ∈ L, and 1_L(x) = 0 otherwise.
The graded distance between two languages, given a probability measure P on Σ*, is the P-mass of their symmetric difference: d(L1, L2) = P(L1 Δ L2) = Σ_{x ∈ L1 Δ L2} P(x).
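To make the definitions concrete, here is a minimal Python sketch; the strings, languages, and distribution are invented for illustration and are not from the text.

```python
def indicator(L):
    """Return the indicator function 1_L of a language L (a set of strings)."""
    return lambda x: 1 if x in L else 0

def graded_distance(L1, L2, P):
    """d(L1, L2) = P-mass of the symmetric difference; P maps strings to probabilities."""
    return sum(p for x, p in P.items() if (x in L1) != (x in L2))

# Toy distribution on a few strings of Sigma*, and two toy languages.
P = {"a": 0.4, "b": 0.3, "ab": 0.2, "ba": 0.1}
L1, L2 = {"a", "ab"}, {"a", "ba"}
f = indicator(L1)
print(f("ab"), f("b"))             # 1 0
print(graded_distance(L1, L2, P))  # ~0.3, the mass of {"ab", "ba"}
```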

4 Examples, learner's hypothesis
Examples: pairs (x, y), where x ∈ Σ* is drawn according to P and y = 1_L(x).
Set of all k-example data streams: D_k = (Σ* × {0, 1})^k, with a typical stream t_k = ((x_1, y_1), ..., (x_k, y_k)).
Learner's hypothesis: h_k ∈ H, the hypothesis held after the k-th example.
A learning algorithm is an effective procedure mapping data streams to hypotheses: A: ∪_{k ≥ 1} D_k → H, so that h_k = A(t_k).
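A sketch of the example-generation and learning setup in Python; the memorizing learner A below is a toy stand-in, not an algorithm from the book.

```python
import random

def draw_examples(L, P, k, rng):
    """Draw a stream of k examples (x, y): x ~ P, labeled y = 1_L(x)."""
    xs, weights = zip(*P.items())
    return [(x, 1 if x in L else 0) for x in rng.choices(xs, weights=weights, k=k)]

def A(data):
    """Toy learner: hypothesize exactly the positively labeled strings seen."""
    return {x for x, y in data if y == 1}

P = {"a": 0.4, "b": 0.3, "ab": 0.2, "ba": 0.1}
target = {"a", "ab"}
stream = draw_examples(target, P, k=20, rng=random.Random(0))
print(A(stream))  # the hypothesis h_20 produced from one 20-example stream
```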

5 Weak convergence, sample complexity
Successful learning is formulated as weak convergence of the learner's hypothesis to the target (the PAC formulation): for every ε > 0 and δ > 0 there is an m(ε, δ) such that for all k ≥ m(ε, δ), P(d(h_k, L) > ε) < δ.
Sample complexity: the smallest such m(ε, δ), i.e., the number of examples needed to get within ε of the target with confidence at least 1 − δ.
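A small simulation of weak convergence, continuing the toy setup above (the distribution, target, and memorizing learner are all illustrative assumptions): the estimated failure probability P(d(h_k, L) > ε) shrinks as the stream length k grows.

```python
import random

P = {"a": 0.4, "b": 0.3, "ab": 0.2, "ba": 0.1}
target = {"a", "ab"}
EPS, RUNS = 0.15, 2000
rng = random.Random(0)
xs, weights = zip(*P.items())

def failed(k):
    """One run: is the memorizing learner's hypothesis farther than EPS from the target?"""
    data = [(x, x in target) for x in rng.choices(xs, weights=weights, k=k)]
    h = {x for x, y in data if y}
    d = sum(p for x, p in P.items() if (x in h) != (x in target))
    return d > EPS

for k in (1, 5, 20, 80):
    rate = sum(failed(k) for _ in range(RUNS)) / RUNS
    print(f"k={k:3d}  estimated P(d > {EPS}) = {rate:.3f}")  # decreases toward 0
```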

6 PAC framework, DFA, Gold learnable
(Deterministic) finite automata: an automaton is a mathematical model of a finite state machine (FSM). An FSM is a machine that, given an input string of symbols, "jumps", or transitions, through a series of states according to a transition function, which can be expressed as a table.
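A minimal DFA sketch in Python, with the transition function written as a table as the slide describes; the particular language (strings over {a, b} with an even number of a's) is an illustrative choice.

```python
DELTA = {                     # transition table: (state, symbol) -> next state
    ("even", "a"): "odd",
    ("even", "b"): "even",
    ("odd", "a"): "even",
    ("odd", "b"): "odd",
}
START, ACCEPT = "even", {"even"}

def accepts(w):
    """Run the DFA on string w and report whether it ends in an accepting state."""
    state = START
    for symbol in w:
        state = DELTA[(state, symbol)]  # the machine "jumps" to the next state
    return state in ACCEPT

print(accepts("abba"), accepts("ab"))  # True False
```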

7 Learnable language / family of languages
If there exists a learning algorithm A such that the corresponding learner's hypothesis converges to the target in probability, the target language is said to be learnable. A family of languages is learnable if one algorithm learns every language in the family in this sense.

8 Vapnik-Chervonenkis (VC) Theorem

9 Vapnik-Chervonenkis (VC) Theorem (from "Learning from Data"; Vapnik-Chervonenkis, 1968)
Risk: the expected loss of a hypothesis under the data distribution, R(h) = E[loss(h(x), y)]; the empirical risk R_emp(h) is the average loss over the k observed examples. The theorem bounds, uniformly over the hypothesis class, the probability that the empirical risk deviates from the true risk.
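As a rough numeric illustration, the bound can be evaluated directly. The form below follows the VC generalization bound as stated in "Learning from Data" (the book the slide cites), with the growth function bounded polynomially via the VC dimension; treating that form and its constants as given is an assumption of this sketch.

```python
import math

def vc_bound(k, eps, h):
    """P[sup over the class of |R_emp - R| > eps] <= 4 * m_H(2k) * exp(-eps^2 * k / 8),
    using the polynomial bound m_H(n) <= n**h + 1 for VC dimension h."""
    return 4 * ((2 * k) ** h + 1) * math.exp(-(eps ** 2) * k / 8)

for k in (10**3, 10**4, 10**5, 10**6):
    print(f"k={k:>7}  bound={vc_bound(k, eps=0.1, h=3):.3g}")
# The bound is vacuous (> 1) for small k and collapses toward 0 for large k.
```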

10 Vapnik-Chervonenkis (VC) Theorem
A set of functions (a learning machine) has VC dimension h if there exist h samples that can be shattered but no h + 1 samples can be shattered; equivalently, h is the maximum number of samples for which all possible binary labelings can be induced by the set of functions.
In terms of the growth function G(n), the VC dimension is the point n = h where G(n) stops growing linearly: for finite VC dimension h, G(n) = n ln 2 for n ≤ h, and G(n) ≤ h(ln(n/h) + 1) for n > h.
If instead G(n) = n ln 2 for all n, the VC dimension is infinite: any sample of size n can be split in all 2^n possible ways by the functions of the learning machine, so the minimum of the empirical risk is always zero. Such a machine is nonfalsifiable; it overfits and generalizes falsely.
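For small finite cases the shattering condition can be checked by brute force. A sketch in Python (the helper name and the threshold class are invented for the demo; thresholds on the line have VC dimension 1):

```python
from itertools import product

def shatters(functions, points):
    """True iff `functions` induces every one of the 2^n binary labelings of `points`."""
    realized = {tuple(f(x) for x in points) for f in functions}
    return realized == set(product((0, 1), repeat=len(points)))

# Threshold functions on the real line: f_t(x) = 1 iff x >= t.
thresholds = [lambda x, t=t: int(x >= t) for t in (-10, 0.5, 10)]
print(shatters(thresholds, [0]))     # True: a single point is shattered
print(shatters(thresholds, [0, 1]))  # False: the labeling (1, 0) is unrealizable
```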

11 Vapnik-Chervonenkis (VC) Theorem
The VC dimension of a linear classifier over R^2 is 3: three points in general position can be shattered, but no four points can (for instance, the XOR labeling of four points is not linearly separable).
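A brute-force illustration of the claim: sampling linear classifiers on a small grid of weights (a finite stand-in for the infinite class, an assumption of this demo) realizes all 8 labelings of three points in general position, while no linear classifier realizes the XOR labeling of four points.

```python
from itertools import product

def linear(w0, w1, w2):
    """Linear classifier on R^2: label 1 iff w0 + w1*x + w2*y >= 0."""
    return lambda p: int(w0 + w1 * p[0] + w2 * p[1] >= 0)

grid = [g / 2 for g in range(-6, 7)]  # weights in {-3.0, -2.5, ..., 3.0}
classifiers = [linear(a, b, c) for a, b, c in product(grid, repeat=3)]

def shattered(points):
    realized = {tuple(f(p) for p in points) for f in classifiers}
    return len(realized) == 2 ** len(points)

print(shattered([(0, 0), (1, 0), (0, 1)]))          # True: all 8 labelings occur
print(shattered([(0, 0), (1, 1), (1, 0), (0, 1)]))  # False: XOR labeling missing
```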

12 Proof of lower bound for learning
( ) follows directly.

13 Step 5

14 Implications

