Slide 1: VC Dimension – definition and impossibility result
Lecturer: Yishay Mansour
Eran Nir and Ido Trivizki
Slide 2: VC Dimension – Lecture Overview
PAC Model – Review
VC dimension – motivation
Definitions
Some examples of geometric concepts
Sample size lower bounds
More examples
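Slide 3: The PAC Model - Review
A fixed, unknown distribution $D$ over the instance space $X$, from which the examples are chosen independently.
The target concept is a computable function $c_t : X \to \{0,1\}$.
Our goal: finding $h$ such that $\mathrm{error}(h, c_t) = \Pr_{x \sim D}[h(x) \ne c_t(x)] \le \epsilon$.
$\epsilon$ - accuracy parameter; $\delta$ - confidence parameter.
An algorithm $A$ learns a family of concepts $C$ if for any $c_t \in C$ and any distribution $D$, $A$ outputs a function $h$ such that $\Pr[\mathrm{error}(h, c_t) \le \epsilon] \ge 1 - \delta$.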
Slide 4: VC Dimension - Motivation
Question: How many examples does a learning algorithm need?
For PAC and a finite concept class $C$ we proved: $m \ge \frac{1}{\epsilon} \ln \frac{|C|}{\delta}$.
We would like to be able to handle infinite concept classes. The VC dimension will provide us with a substitute for $\ln |C|$ for infinite concept classes.
Slide 5: VC Dimension - Definitions
Given a concept class $C$ defined over the instance space $X$, let $S = \{x_1, \ldots, x_m\} \subseteq X$.
The projection of $C$ on $S$ is all the possible functions that $C$ induces on $S$: $\Pi_C(S) = \{(c(x_1), \ldots, c(x_m)) : c \in C\}$.
A concept class $C$ shatters $S$ if $|\Pi_C(S)| = 2^m$.
In other words: a class shatters a set if every possible function on the set is in the class.
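Slide 6: VC Dimension – Definitions Cont.
VCdim (Vapnik-Chervonenkis dimension) of $C$: the maximum size of a set shattered by $C$: $\mathrm{VCdim}(C) = \max\{d : \exists S, |S| = d, C \text{ shatters } S\}$.
If a maximum value doesn't exist then $\mathrm{VCdim}(C) = \infty$.
For a finite class $C$: $\mathrm{VCdim}(C) \le \log_2 |C|$, because shattering a set of size $d$ requires at least $2^d$ distinct concepts.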
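The definitions above can be checked by brute force when both the class and the domain are finite. The following is a minimal sketch, not part of the lecture: the function names and the toy class of integer intervals are hypothetical, chosen only for illustration.

```python
from itertools import combinations

def projection(concepts, points):
    """Set of label vectors that the class induces on `points`."""
    return {tuple(c(x) for x in points) for c in concepts}

def shatters(concepts, points):
    """True if all 2^|points| labelings of `points` are realized."""
    return len(projection(concepts, points)) == 2 ** len(points)

def vc_dimension(concepts, domain):
    """Largest d such that some d-subset of the finite `domain` is shattered."""
    best = 0
    for d in range(1, len(domain) + 1):
        if any(shatters(concepts, s) for s in combinations(domain, d)):
            best = d
        else:
            break  # subsets of shattered sets are shattered, so we can stop
    return best

# Hypothetical toy class: closed intervals [a, b] with endpoints in the domain.
domain = [0, 1, 2, 3]
intervals = [lambda x, a=a, b=b: int(a <= x <= b)
             for a in domain for b in domain if a <= b]

print(vc_dimension(intervals, domain))  # 2
```

The result respects the finite-class bound: the toy class has 10 concepts, and $2 \le \log_2 10$.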
Slide 7: VC Dimension – Examples
In order to show that the VCdim of a class is $d$ we have to show:
$\mathrm{VCdim} \ge d$: find some shattered set of size $d$.
$\mathrm{VCdim} < d+1$: show that no set of size $d+1$ is shattered.
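Slide 8: VC Dimension – Examples: Half Lines (C1)
The concepts are $c_\theta$ for $\theta \in \mathbb{R}$, where: $c_\theta(x) = 1$ if $x \ge \theta$, and $c_\theta(x) = 0$ otherwise.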
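Slide 9: VC Dimension – Examples: Half Lines (C1) Cont.
Claim: $\mathrm{VCdim}(C_1) = 1$.
$\mathrm{VCdim}(C_1) \ge 1$: for a single point $x$, the threshold $\theta = x$ labels it 1 and any threshold $\theta > x$ labels it 0, thus $\{x\}$ is shattered.
$\mathrm{VCdim}(C_1) < 2$: for any set of size 2 there is an assignment which is not in the concept class: for $x < y$, the assignment which lets $x$ be 1 and $y$ be 0 is impossible.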
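The half-line argument can be checked by enumerating the labelings that thresholds induce on a few points. This is a small sketch that assumes the concepts have the form $c_\theta(x) = 1$ iff $x \ge \theta$; the helper name `halfline_labelings` is hypothetical.

```python
def halfline_labelings(points):
    """All labelings of `points` induced by c_theta(x) = 1 iff x >= theta."""
    # Assumption: thresholds at the points themselves, plus one above the
    # maximum, are enough to produce every distinct labeling.
    candidates = sorted(set(points)) + [max(points) + 1.0]
    return {tuple(int(x >= theta) for x in points) for theta in candidates}

single = halfline_labelings([2.0])
pair = halfline_labelings([1.0, 3.0])

print(sorted(single))  # [(0,), (1,)]
print(sorted(pair))    # [(0, 0), (0, 1), (1, 1)]
```

The single point receives both labels, while the pair never receives the labeling (1, 0), which matches the claim.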
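Slide 10: VC Dimension – Examples: Linear halfspaces (C2)
The concepts are $c_w$ where, for $w = (a, b, c) \in \mathbb{R}^3$, $c_w(x, y) = 1$ if $ax + by \ge c$ and $c_w(x, y) = 0$ otherwise.
The concepts are lines in the plane: points above or on the line are positive, and points below the line are negative.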
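Slide 11: VC Dimension – Examples: Linear halfspaces (C2) Cont.
Claim: $\mathrm{VCdim}(C_2) = 3$.
$\mathrm{VCdim}(C_2) \ge 3$: any three points that are not collinear can be shattered.
$\mathrm{VCdim}(C_2) < 4$: no set of four points can be shattered. If the four points are in convex position, labeling one diagonal pair positive and the other negative cannot be realized by a line. Otherwise one point lies in the convex hull of the other three, and labeling it negative and the other three positive cannot be realized.
Generally: halfspaces in $\mathbb{R}^n$ have VCdim of $n+1$.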
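For one concrete four-point set, the impossibility can be written out as a short calculation. This is an illustrative sketch only: it assumes a halfspace is parametrized as "label 1 iff $ax + by \ge c$", and the four points are chosen for this example; they do not come from the slides.

```latex
% Assumed parametrization: label 1 iff ax + by >= c.
% Points (0,0), (1,1) labeled 1; points (1,0), (0,1) labeled 0.
\begin{align*}
(0,0) \mapsto 1 &: \quad 0 \ge c, &
(1,1) \mapsto 1 &: \quad a + b \ge c, \\
(1,0) \mapsto 0 &: \quad a < c, &
(0,1) \mapsto 0 &: \quad b < c.
\end{align*}
% Adding the two positive constraints gives a + b >= 2c;
% adding the two negative constraints gives a + b < 2c: a contradiction.
```

This covers only the convex-position case for one set of points; the general statement needs the case analysis over all four-point sets.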
Slide 12: VC Dimension – Examples: Axis-aligned rectangles in the plane (C3)
Positive examples are points inside the rectangle, and negative examples are points outside the rectangle.
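Slide 13: VC Dimension – Examples: Axis-aligned rectangles in the plane (C3)
Claim: $\mathrm{VCdim}(C_3) = 4$.
$\mathrm{VCdim}(C_3) \ge 4$: a set of four points arranged in a diamond shape (one leftmost, one rightmost, one topmost and one bottommost point) can be shattered.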
Slide 14: VC Dimension – Examples: Axis-aligned rectangles in the plane (C3)
Claim: $\mathrm{VCdim}(C_3) < 5$: given a set of five points in the plane, there must be some point that is neither the extreme left, right, top nor bottom point of the five. If we label this non-extremal point negative and the remaining four extremal points positive, no rectangle can satisfy the assignment.
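Both parts of the rectangle claim can be spot-checked on concrete points. The sketch below relies on one observation: a labeling is realizable by an axis-aligned rectangle exactly when the bounding box of the positive points contains no negative point. The point sets and function names are hypothetical, and the five-point check covers only one configuration, not the general argument.

```python
from itertools import product

def rectangle_realizes(points, labels):
    """True if some axis-aligned rectangle contains exactly the positive points."""
    pos = [p for p, l in zip(points, labels) if l == 1]
    if not pos:
        return True  # a rectangle far from all points labels everything 0
    lo_x, hi_x = min(p[0] for p in pos), max(p[0] for p in pos)
    lo_y, hi_y = min(p[1] for p in pos), max(p[1] for p in pos)
    # Any consistent rectangle must contain the bounding box of the positives.
    return not any(lo_x <= x <= hi_x and lo_y <= y <= hi_y
                   for (x, y), l in zip(points, labels) if l == 0)

def shattered_by_rectangles(points):
    """True if every 0/1 labeling of `points` is realizable."""
    return all(rectangle_realizes(points, labels)
               for labels in product([0, 1], repeat=len(points)))

diamond = [(0, 1), (1, 0), (0, -1), (-1, 0)]
five = diamond + [(0, 0)]  # the extra point is not extremal

print(shattered_by_rectangles(diamond))  # True
print(shattered_by_rectangles(five))     # False
```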
Slide 15: VC Dimension – Examples: A finite union of intervals (C4)
For any set of points we could cover the positive points by choosing the intervals small enough to exclude the negative points, so $\mathrm{VCdim}(C_4) = \infty$.
Slide 16: VC Dimension – Examples: Convex Polygons on the plane (C5)
Points inside the convex polygon are positive and points outside are negative.
There is no bound on the number of edges.
Claim: $\mathrm{VCdim}(C_5) = \infty$.
Slide 17: VC Dimension – Examples: Convex Polygons on the plane (C5)
Proof: For every labeling of $d$ points on the circle perimeter, there exists a concept $c \in C_5$ that is consistent with the labeling: the polygon whose vertices are the positive points.
This is a polygon which includes all the positive examples and none of the negative ones. Thus the set of points is shattered.
This holds for every $d$, and so $\mathrm{VCdim}(C_5) = \infty$.
Slide 18: Sample Size Lower Bounds
Goal: we want to show that for a concept class with a finite VCdim $d$ there is a function $m$ of $\epsilon$, $\delta$ and $d$ such that if we sample fewer than $m(\epsilon, \delta, d)$ points, any PAC learning algorithm would fail.
Theorem: If a concept class $C$ has VCdim $d+1$ then: $m(\epsilon, \delta, d) = \Omega(d / \epsilon)$.
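Slide 19: Sample Size Lower Bounds - Proof
For contradiction: let $T = \{z_0, z_1, \ldots, z_d\}$ such that $C$ shatters $T$ (possible because $\mathrm{VCdim}(C) = d+1$).
Let $D(x)$ be: $D(z_0) = 1 - 8\epsilon$, $D(z_i) = 8\epsilon / d$ for $1 \le i \le d$, and $D(x) = 0$ otherwise.
Choose $c_t$ randomly so that $c_t(z_0) = 1$ and, for $1 \le i \le d$, $c_t(z_i)$ is 1 or 0 with probability $1/2$ each.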
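Slide 20: Sample Size Lower Bounds – Proof Cont.
$c_t$ is in $C$ because $C$ shatters $T$.
Claim: if we sample fewer than $d/2$ points out of $\{z_1, \ldots, z_d\}$ then the error is at least $2\epsilon$.
Proof: Let RARE be $\{z_1, \ldots, z_d\}$.
Sample size: with $m \le \frac{d}{32\epsilon}$ examples, the expected number of points we sample from RARE is at most $8\epsilon \cdot m \le d/4$.
Error: each unseen point of RARE has weight $8\epsilon / d$ and is misclassified with probability $1/2$, so with at least $d/2$ unseen points the expected error is at least $\frac{d}{2} \cdot \frac{8\epsilon}{d} \cdot \frac{1}{2} = 2\epsilon$.
This implies that with probability of at least $1/2$ we sample at most $d/2$ points of RARE and thus have error of at least $2\epsilon$.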
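The counting behind this claim can be written out as a short calculation. The specific constants used here (weight $8\epsilon/d$ on each rare point, sample size $m \le d/(32\epsilon)$, threshold $d/2$) are assumptions made for this sketch so that the arithmetic is concrete; other constants of the same order lead to the same $\Omega(d/\epsilon)$ rate.

```latex
% Assumed: D(z_0) = 1 - 8\epsilon, D(z_i) = 8\epsilon/d for i = 1..d,
% and a sample of size m <= d/(32\epsilon).
\begin{align*}
\mathbb{E}[\#\text{samples in RARE}]
  &= m \cdot 8\epsilon \le \frac{d}{32\epsilon} \cdot 8\epsilon = \frac{d}{4}, \\
\Pr\!\left[\#\text{samples in RARE} \ge \tfrac{d}{2}\right]
  &\le \frac{d/4}{d/2} = \frac{1}{2}
  \quad \text{(Markov's inequality)}, \\
\mathbb{E}\!\left[\mathrm{error}(h) \,\middle|\, \text{at least } \tfrac{d}{2}
  \text{ points of RARE unseen}\right]
  &\ge \frac{d}{2} \cdot \frac{8\epsilon}{d} \cdot \frac{1}{2} = 2\epsilon .
\end{align*}
```

The factor $1/2$ in the last line is the probability that the hypothesis disagrees with a uniformly random label on a point it never saw.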
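Slide 21: VC Dimension – Examples: Parity (C6)
Let $X = \{0,1\}^n$. The concept class is $C_6 = \{\chi_S : S \subseteq \{1, \ldots, n\}\}$ where $\chi_S(x) = \bigoplus_{i \in S} x_i$.
Claim: $\mathrm{VCdim}(C_6) = n$.
$\mathrm{VCdim}(C_6) \ge n$: Let $e_1, \ldots, e_n$ be the unit vectors. For any bit assignment $b_1, \ldots, b_n$ for the vectors we choose the set $S = \{i : b_i = 1\}$. We get $\chi_S(e_i) = b_i$, and so $\{e_1, \ldots, e_n\}$ is shattered.
$\mathrm{VCdim}(C_6) \le n$: There are $2^n$ parity functions, thus $\mathrm{VCdim}(C_6) \le \log_2 2^n = n$.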
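The lower-bound half of the parity claim is easy to verify mechanically for small $n$. The following is a minimal sketch with 0-based indices; the function names are hypothetical.

```python
from itertools import product

def parity(subset, x):
    """chi_S(x): XOR of the bits of x whose (0-based) indices are in `subset`."""
    return sum(x[i] for i in subset) % 2

def unit_vectors(n):
    """e_0, ..., e_{n-1} as tuples of bits."""
    return [tuple(int(i == j) for j in range(n)) for i in range(n)]

def parity_shatters_unit_vectors(n):
    """Check that every labeling of the unit vectors is realized by some chi_S."""
    vectors = unit_vectors(n)
    for labels in product([0, 1], repeat=n):
        subset = [i for i, b in enumerate(labels) if b == 1]  # S = {i : b_i = 1}
        if tuple(parity(subset, e) for e in vectors) != labels:
            return False
    return True

print(parity_shatters_unit_vectors(4))  # True
```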
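Slide 22: VC Dimension – Examples: OR of n literals (C7)
Let $X = \{0,1\}^n$. The concept class is the set of disjunctions (ORs) of literals over $x_1, \ldots, x_n$.
Claim: $\mathrm{VCdim}(C_7) = n$.
$\mathrm{VCdim}(C_7) \ge n$: use the $n$ unit vectors (see the previous proof).
$\mathrm{VCdim}(C_7) \le n$: use the ELIM algorithm to show that the $(n+1)$-th vector cannot be assigned 1, thus no set of $n+1$ vectors can be shattered.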
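Slide 23: Radon Theorem
Definitions:
Convex Set: $A$ is convex if for every $x, y \in A$ the line segment connecting $x$ and $y$ is in $A$.
Convex Hull: The convex hull of $S$ is the smallest convex set which contains all the points of $S$. We denote it as $\mathrm{conv}(S)$.
Theorem (Radon): Let $E$ be a set of $d+2$ points in $\mathbb{R}^d$. There is a subset $S$ of $E$ such that $\mathrm{conv}(S) \cap \mathrm{conv}(E \setminus S) \ne \emptyset$.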
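Slide 24: VC Dimension – Examples: Hyper-Planes (C8)
The concept class assigns 1 to a point if it is above or on a corresponding hyper-plane, and 0 otherwise.
Claim: $\mathrm{VCdim}(C_8) = n+1$ for hyper-planes in $\mathbb{R}^n$.
$\mathrm{VCdim}(C_8) \ge n+1$: use the $n$ unit vectors and the zero vector to form a set of $n+1$ points that can be shattered.
$\mathrm{VCdim}(C_8) < n+2$: use Radon's theorem (next page).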
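Slide 25: VC Dimension – Examples: Hyper-Planes (C8) Cont.
Assume a set $E$ of $d+2$ points in $\mathbb{R}^d$ can be shattered.
Use Radon's theorem to find $S \subseteq E$ such that $\mathrm{conv}(S) \cap \mathrm{conv}(E \setminus S) \ne \emptyset$.
Assume there is a separating hyper-plane that classifies the points in $S$ as 1 and the points not in $S$ as 0.
There is no way to classify the points in $\mathrm{conv}(S) \cap \mathrm{conv}(E \setminus S)$, which is a contradiction.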
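The final step can be spelled out with convex combinations. This is a sketch under an assumed notation that does not appear on the slides: the hyper-plane is given by a pair $(w, \theta)$ and a point $x$ is labeled 1 iff $w \cdot x \ge \theta$.

```latex
% Assumed notation: hyper-plane (w, \theta); label 1 iff w \cdot x \ge \theta.
Let $p \in \mathrm{conv}(S) \cap \mathrm{conv}(E \setminus S)$, so that
\[
  p = \sum_{s \in S} \lambda_s\, s = \sum_{t \in E \setminus S} \mu_t\, t,
  \qquad \lambda_s, \mu_t \ge 0, \quad
  \sum_{s \in S} \lambda_s = \sum_{t \in E \setminus S} \mu_t = 1 .
\]
\begin{align*}
  w \cdot p &= \sum_{s \in S} \lambda_s\, (w \cdot s) \ge \theta
    && \text{(every } s \in S \text{ is labeled 1)}, \\
  w \cdot p &= \sum_{t \in E \setminus S} \mu_t\, (w \cdot t) < \theta
    && \text{(every } t \in E \setminus S \text{ is labeled 0)}.
\end{align*}
% The two lines contradict each other, so no hyper-plane realizes this labeling.
```

So the labeling that assigns 1 to $S$ and 0 to $E \setminus S$ is not realizable, and $E$ is not shattered.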