# Informational Complexity Notion of Reduction for Concept Classes

Shai Ben-David (Cornell University and Technion). Joint work with Ami Litman (Technion).


## Measures of the Informational Complexity of a Class

- The VC-dimension of the class.
- The sample complexity of learning the class from random examples.
- The optimal mistake bound for learning the class online (or the query complexity of learning the class using membership and equivalence queries).
- The size of the minimal compression scheme for the class.

## Outline of the Talk

- Defining our reductions, and the induced notion of complete concept classes.
- Introducing a specific family of classes that contains many natural concept classes.
- Proving that the class of half spaces is complete w.r.t. that family.
- Demonstrating some non-reducibility results.
- Corollaries concerning the existence of compression schemes.

## Defining Reductions

We consider pairs of sets $(X, Y)$, where $X$ is a domain and $Y$ is a set of concepts. A concept class is a relation $R$ over $X \times Y$ (so each $y \in Y$ can be viewed as the subset $\{x : (x, y) \in R\}$ of $X$).

- An embedding of $C = (X, Y, R)$ into $C' = (X', Y', R')$ is a pair of functions $\psi : X \to X'$, $\phi : Y \to Y'$ such that $(x, y) \in R$ iff $(\psi(x), \phi(y)) \in R'$.
- $C$ reduces to $C'$, denoted $C \le C'$, if such an embedding exists.
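For finite classes, the embedding condition can be checked mechanically. The following is a minimal sketch, not part of the talk: the names `is_embedding` and `find_embedding` and the two toy classes are assumptions made for illustration, and the search is brute force, so it is only feasible for very small classes.

```python
from itertools import product

def is_embedding(X, Y, R, R2, psi, phi):
    """True iff (x, y) in R  <=>  (psi[x], phi[y]) in R2 for every x in X, y in Y."""
    return all(((x, y) in R) == ((psi[x], phi[y]) in R2)
               for x, y in product(X, Y))

def find_embedding(X, Y, R, X2, Y2, R2):
    """Brute-force search over all pairs of maps X -> X2, Y -> Y2.

    Returns (psi, phi) as dicts, or None if no embedding exists.
    """
    for psi_vals in product(X2, repeat=len(X)):
        psi = dict(zip(X, psi_vals))
        for phi_vals in product(Y2, repeat=len(Y)):
            phi = dict(zip(Y, phi_vals))
            if is_embedding(X, Y, R, R2, psi, phi):
                return psi, phi
    return None

# Hypothetical toy class C: two points, two singleton concepts.
X = [0, 1]
Y = ["a", "b"]
R = {(0, "a"), (1, "b")}

# Hypothetical toy class C': two points, all four subsets as concepts.
X2 = [0, 1]
Y2 = [frozenset(s) for s in ([], [0], [1], [0, 1])]
R2 = {(x, y) for x in X2 for y in Y2 if x in y}

forward = find_embedding(X, Y, R, X2, Y2, R2)    # C reduces to C'
backward = find_embedding(X2, Y2, R2, X, Y, R)   # C' does not reduce to C
```

In this toy case the search finds an embedding of the singleton class into the full class but none in the other direction, which matches the intuition that a reduction cannot go from a more complex class to a simpler one.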

## Relationship to Info Complexity

If $C \le C'$ then, for each of the complexity parameters mentioned above, $C'$ is at least as complex as $C$. E.g., if $C \le C'$ then, for every $\epsilon$ and $\delta$, the sample complexity of $(\epsilon, \delta)$-learning $C$ is at most that needed for learning $C'$. (This is in the agnostic prediction model.)

## Immediate Observations

- If we take into account the computational complexity of the embedding functions, then we can also bound the computational complexity of learning $C$ by that of learning $C'$.
- For every $k$, the class of all binary functions on a $k$-size domain is minimal w.r.t. the family of all classes having VC-dimension $k$.
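The minimality observation can be illustrated on finite classes: if a class shatters some $k$-point set, the class of all binary functions on $k$ points embeds into it through that set. The sketch below assumes concepts are represented as Python `frozenset`s of domain points; the function names and the interval example are hypothetical, chosen only for illustration.

```python
from itertools import combinations

def vc_dimension(X, concepts):
    """Largest d such that some d-subset of X is shattered by `concepts`."""
    best = 0
    for d in range(1, len(X) + 1):
        shattered = any(
            len({tuple(x in c for x in S) for c in concepts}) == 2 ** d
            for S in combinations(X, d)
        )
        if not shattered:
            break  # subsets of shattered sets are shattered, so we can stop
        best = d
    return best

def embed_full_class(X, concepts, k):
    """Embed the class of all subsets of {0, ..., k-1} via a shattered k-set.

    Returns (psi, phi), or None if no k-subset of X is shattered.
    """
    for S in combinations(X, k):
        by_pattern = {tuple(x in c for x in S): c for c in concepts}
        if len(by_pattern) == 2 ** k:
            psi = dict(enumerate(S))  # point i is sent to S[i]
            phi = {frozenset(i for i in range(k) if pat[i]): c
                   for pat, c in by_pattern.items()}
            return psi, phi
    return None

# Hypothetical example: intervals over a 4-point line, plus the empty set.
X = [0, 1, 2, 3]
intervals = [frozenset(range(a, b + 1)) for a in X for b in X if a <= b]
intervals.append(frozenset())

d = vc_dimension(X, intervals)
embedding = embed_full_class(X, intervals, d)
```

Each subset `A` of `{0, ..., k-1}` is sent to a concept that cuts out exactly the pattern of `A` on the shattered set, so membership is preserved in both directions.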

## Universal Classes

We say that a concept class $C$ is universal for a family of classes $F$ if every member of $F$ reduces to $C$. Universal classes play a role analogous to that of, say, NP-hard decision problems: they are at least as complex as any member of the family $F$.

## Some Important Classes

- For an integer $k$, let $HS_k$ denote the class of half spaces over $\mathbb{R}^k$. That is, $HS_k = (\mathbb{R}^k, \mathbb{R}^{k+1}, H)$, where $((x_1, \dots, x_k), (a_1, \dots, a_{k+1})) \in H$ iff $\sum_{i=1}^{k} a_i x_i + a_{k+1} \ge 0$.
- Let $PHS_k$ denote the class of positive half spaces, that is, half spaces in which $a_1 = 1$.
- Finally, let $HS_k^0$ denote the class of homogeneous half spaces (i.e., those having $a_{k+1} = 0$), and $PHS_k^0$ the class of positive and homogeneous half spaces.

## Half Spaces and Completeness

The first family of classes that comes to mind is $VC_n$, the family of all concept classes having VC-dimension $n$.

Theorem: For any $n > 2$, no class $HS_k$ is universal for $VC_n$. (This holds even if we consider only finite classes.)

## Dudley Classes (1)

Next, we define a rich subfamily of $VC_n$ for which classes of half spaces are universal. Let $F$ be a family of real valued functions over some domain set $X$. Let $h$ be any real valued function over $X$, and define a concept class $D_{F,h} = (X, F, R_{F,h})$, where $R_{F,h} = \{(x, f) : f(x) + h(x) \ge 0\}$. (Note that all the PPD's defined by Adam yesterday were of this form.)

## Dudley Classes (2)

Classes of the form $D_{F,h} = (X, F, R_{F,h})$ are called Dudley classes if the family of functions $F$ is a vector space over the reals (with respect to point-wise addition and scalar multiplication).

Examples of Dudley classes: $HS_k$, $PHS_k$, $HS_k^0$, $PHS_k^0$, and the class of all balls in any Euclidean space $\mathbb{R}^k$.
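To see why balls fit this form, one possible derivation is sketched below. The choice of $F$ as the affine functions on $\mathbb{R}^k$ and of $h(x) = -\lVert x \rVert^2$ is an assumption made here for illustration, not something stated on the slide, as is the convention that balls are closed.

```latex
% Closed ball with center c and radius r in R^k:
%   B(c, r) = { x : ||x - c||^2 <= r^2 }.
\begin{align*}
x \in B(c, r)
  &\iff r^2 - \lVert x - c \rVert^2 \ge 0 \\
  &\iff \underbrace{2\langle c, x \rangle + \bigl(r^2 - \lVert c \rVert^2\bigr)}_{f(x)}
        \; + \; \underbrace{\bigl(-\lVert x \rVert^2\bigr)}_{h(x)} \ge 0 .
\end{align*}
% Here f is affine in x, so f lies in F = span{x_1, ..., x_k, 1},
% a vector space of dimension k + 1, and h does not depend on the ball.
```

Under these assumptions every ball is a concept of $D_{F,h}$, with the center and radius encoded in the coefficients of the affine function $f$.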

## Dudley’s Theorem

Theorem: If a family of functions $F$ is a vector space, then, for every $h$, the VC-dimension of $D_{F,h}$ equals the (linear) dimension of the vector space $F$.

Corollary: Easy calculations of the VC-dimension of the classes $HS_k$, $PHS_k$, $HS_k^0$, $PHS_k^0$, and $k$-dimensional balls.

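## A Completeness Theorem

Theorem: For every $k$, $PHS_{k+1}^0$ is universal (and therefore complete) for the family of all $k$-dimensional Dudley classes.

Proof: Let $f_1, \dots, f_k$ be a basis for the vector space $F$. Define $\psi : X \to \mathbb{R}^{k+1}$ and $\phi : F \to \mathbb{R}^{k+2}$ by $\psi(x) = (f_1(x), \dots, f_k(x), h(x))$ and, for $f = \sum_{i=1}^{k} a_i f_i$, $\phi(f) = (a_1, \dots, a_k, 1, 0)$. Then $(\psi(x), \phi(f)) \in H$ iff $\sum_{i=1}^{k} a_i f_i(x) + h(x) \ge 0$, that is, iff $(x, f) \in R_{F,h}$. The coefficient fixed to 1 is the one multiplying $h(x)$, and the final 0 is the homogeneous offset, so up to reordering coordinates $\phi(f)$ is a positive homogeneous half space.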
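The construction in the proof can be checked numerically on a small instance. The sketch below is an illustration under stated assumptions: the Dudley class used ($F$ spanned by $1, x, x^2$ over the integers, with $h(x) = -x^3$) is a hypothetical example, the function names are made up here, and membership is taken to mean "$\ge 0$". Integer arithmetic is used so that every comparison is exact.

```python
import random

# Hypothetical 3-dimensional Dudley class over the integers:
# F = span{1, x, x^2} and h(x) = -x^3.
basis = [lambda x: 1, lambda x: x, lambda x: x * x]
h = lambda x: -x ** 3

def psi(x):
    """Map a domain point to (f_1(x), ..., f_k(x), h(x)) in R^{k+1}."""
    return [f(x) for f in basis] + [h(x)]

def phi(coeffs):
    """Map f = sum a_i f_i to (a_1, ..., a_k, 1, 0) in R^{k+2}."""
    return list(coeffs) + [1, 0]

def in_dudley(x, coeffs):
    """Membership in D_{F,h}: f(x) + h(x) >= 0."""
    return sum(a * f(x) for a, f in zip(coeffs, basis)) + h(x) >= 0

def in_half_space(point, weights):
    """Membership in a half space: sum_i w_i * p_i + offset >= 0."""
    *w, offset = weights
    return sum(wi * pi for wi, pi in zip(w, point)) + offset >= 0

def embedding_agrees(trials=200, seed=0):
    """Compare both membership tests on random integer coefficient vectors."""
    rng = random.Random(seed)
    for _ in range(trials):
        coeffs = [rng.randint(-5, 5) for _ in basis]
        for x in range(-10, 11):
            if in_dudley(x, coeffs) != in_half_space(psi(x), phi(coeffs)):
                return False
    return True
```

Calling `embedding_agrees()` compares the two membership tests on every sampled pair; agreement on all of them is what the "iff" in the embedding definition requires.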

## Corollaries

- $k$-size compression schemes for any $k$-dimensional Dudley class.
- Learning algorithms for all Dudley classes.
- An easy proof of Dudley’s theorem. (Show that for any $k$-dimensional $F$, the class $HS_k^0$ is embeddable into $D_{F,h}$, for $h = 0$.)
