
1 An Efficient Projection for ℓ1,∞ Regularization. Ariadna Quattoni, Xavier Carreras, Michael Collins, Trevor Darrell. MIT CSAIL

2 Goal: efficient training of jointly sparse models in high-dimensional spaces.
Why joint sparsity?
• Learn from fewer examples.
• Build more efficient classifiers.
• Interpretability.

3 [Image slides 3-5: the example scene categories used throughout the talk: Church, Airport, Grocery Store, Flower-Shop]

6 ℓ1,∞ Regularization
• How do we promote joint (i.e., row) sparsity?
[Figure: coefficient matrix with rows indexed by features and columns by tasks]
• An ℓ1 norm on the maximum absolute values of the coefficients across tasks promotes sparsity: use few features.
• The ℓ∞ norm within each row promotes non-sparsity: share parameters across tasks.
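Concretely, the norm the slide describes (a standard definition, restated here because the slide's formula did not survive extraction) sums the per-row maxima of a d × m coefficient matrix W:

$$ \|W\|_{1,\infty} \;=\; \sum_{i=1}^{d} \max_{1 \le j \le m} |W_{ij}| $$

Driving this norm down forces many row maxima, and hence entire rows (features), to zero, while the entries within a surviving row are free to stay uniformly large.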

7 Contributions
• An efficient projected gradient method for ℓ1,∞ regularization.
• Our projection runs in O(n log n) time, the same cost as the standard ℓ1 projection.
• Experiments on multitask image classification problems:
• We can discover jointly sparse solutions.
• ℓ1,∞ regularization leads to better performance than ℓ2 and ℓ1 regularization.

8 Multitask Application [Diagram: joint sparse approximation over a collection of tasks; content not preserved in the transcript]

9 ℓ1,∞ Regularization: Constrained Convex Optimization Formulation
[Formula: minimize a convex loss subject to the convex constraint ||W||1,∞ ≤ C]
• We use a projected subgradient method. Main advantages: simple, scalable, guaranteed convergence rates.
• Projected subgradient methods have recently been proposed for:
• ℓ2 regularization, i.e. the SVM [Shalev-Shwartz et al. 2007]
• ℓ1 regularization [Duchi et al. 2008]
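A minimal sketch of a generic projected subgradient loop, assuming a loss subgradient oracle `subgrad(W)` and a projection `project(W, C)` onto the constraint set; both names, and the diminishing step-size schedule, are our placeholders for illustration, not taken from the slides:

```python
import numpy as np

def projected_subgradient(subgrad, project, W0, C, eta0=1.0, T=1000):
    """Generic projected subgradient method: take a subgradient step,
    then project back onto the feasible set {W : ||W||_{1,inf} <= C}."""
    W = W0.copy()
    for t in range(1, T + 1):
        g = subgrad(W)                       # subgradient of the convex loss at W
        W = W - (eta0 / np.sqrt(t)) * g      # diminishing step size eta0 / sqrt(t)
        W = project(W, C)                    # Euclidean projection onto the ball
    return W
```

The whole method therefore reduces to computing the projection efficiently, which is what the next slides address.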

10 Euclidean Projection onto the ℓ1,∞ Ball [Formula slide: content not preserved in the transcript]
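The projection problem being set up here can be written out as follows (a reconstruction in standard notation, not the slide's own rendering): given a matrix A and radius C, find the closest feasible matrix B,

$$ \min_{B} \; \frac{1}{2} \sum_{i,j} (B_{ij} - A_{ij})^2 \quad \text{s.t.} \quad \sum_{i} \max_{j} |B_{ij}| \;\le\; C $$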

11 Characterization of the Solution [Formula slide: content not preserved in the transcript]

12 Characterization of the Solution [Figure: projected coefficients for Feature I, Feature II, Feature III, Feature IV]
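The characterization these two slides illustrate, restated here from the accompanying paper because the slide formulas were lost in extraction: when A lies outside the ball, the projection caps the absolute values in each row at a per-row maximum μi,

$$ B_{ij} \;=\; \mathrm{sign}(A_{ij}) \, \min(|A_{ij}|, \mu_i), \qquad \sum_i \mu_i = C, \quad \mu_i \ge 0 $$

so entries below the cap are left untouched and entries above it are clipped down to μi.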

13 Mapping to a simpler problem
• We can map the projection problem to the following problem, which finds the optimal row maxima μ. [The slide's formula did not survive extraction; a reconstruction follows.]
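Based on the characterization above, the reduced problem should take the following form (our reconstruction, to be checked against the paper): choose the caps μ that minimize the total clipping cost,

$$ \min_{\mu \ge 0} \; \sum_{i} \sum_{j \,:\, |A_{ij}| > \mu_i} \big(|A_{ij}| - \mu_i\big)^2 \quad \text{s.t.} \quad \sum_i \mu_i = C $$

which involves only the entries of A and a single unknown per row.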

14 Efficient Algorithm [Algorithm slide: derivation not preserved in the transcript]

15 Efficient Algorithm (continued) [Algorithm slide: derivation not preserved in the transcript]

16 Complexity
• The total cost of the algorithm is dominated by sorting the entries of A.
• The total cost is therefore on the order of O(n log n) for the n entries of A, matching the cost of the standard ℓ1 projection.
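To make the projection concrete, here is a small NumPy sketch. It is not the paper's sorting-based O(n log n) sweep; it solves the same fixed-point conditions with an outer bisection on the common per-row residual θ and an inner per-row water-filling step, which is slower but easier to state. All names here (`project_l1_inf`, `row_caps`) are ours, introduced for illustration:

```python
import numpy as np

def project_l1_inf(A, C, iters=100):
    """Project A onto the ball {B : sum_i max_j |B_ij| <= C}.

    Bisection sketch: for a trial residual theta, each row cap mu_i
    solves sum_j max(|A_ij| - mu_i, 0) = theta (water-filling); the
    total sum_i mu_i decreases in theta, so we bisect until it hits C.
    """
    S = np.abs(A)
    if S.max(axis=1).sum() <= C:
        return A.copy()                      # already inside the ball
    if C <= 0:
        return np.zeros_like(A)

    def row_caps(theta):
        caps = np.empty(S.shape[0])
        for i, row in enumerate(S):
            s = np.sort(row)[::-1]           # row sorted descending
            csum = np.cumsum(s)
            for k in range(len(s)):
                mu = (csum[k] - theta) / (k + 1)
                # valid if only the top k+1 entries exceed the cap
                if k + 1 == len(s) or mu >= s[k + 1]:
                    caps[i] = max(mu, 0.0)   # mu = 0 means the row is zeroed
                    break
        return caps

    lo, hi = 0.0, S.sum()                    # theta = 0 gives caps = row maxima
    for _ in range(iters):
        theta = 0.5 * (lo + hi)
        if row_caps(theta).sum() > C:
            lo = theta                       # caps sum too large: raise theta
        else:
            hi = theta
    mu = row_caps(hi)
    return np.sign(A) * np.minimum(S, mu[:, None])   # clip each row at mu_i
```

Plugging `project_l1_inf` in as the `project` argument of the subgradient loop sketched earlier gives a complete, if unoptimized, trainer for the constrained formulation.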

17 Synthetic Experiments
• Generate a jointly sparse parameter matrix W.
• For every task, generate training pairs (x, y). [The generative formulas did not survive extraction; an illustrative sketch follows.]
• We compared three different types of regularization: ℓ1,∞ projection, ℓ1 projection, ℓ2 projection.
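A purely hypothetical sketch of what such a generator could look like, since the slide's actual formulas are missing; the linear sign-labeling model and every parameter below are our assumptions, not the paper's protocol:

```python
import numpy as np

def make_jointly_sparse_data(d=200, m=20, k=10, n=100, seed=None):
    """Hypothetical generator: W has k nonzero rows shared by all m
    tasks; each task t gets n pairs (x, y) with y = sign(w_t . x)."""
    rng = np.random.default_rng(seed)
    W = np.zeros((d, m))
    active = rng.choice(d, size=k, replace=False)   # shared row support
    W[active] = rng.standard_normal((k, m))         # per-task weights on it
    X = rng.standard_normal((m, n, d))              # inputs for each task
    Y = np.sign(np.einsum('tnd,dt->tn', X, W))      # labels per task
    return W, X, Y
```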

18 Synthetic Experiments [Results plot: content not preserved in the transcript]

19 Dataset: Image Annotation
• 40 top content words.
• Raw image representation: vocabulary tree (Nister and Stewenius 2006).
• 11,000 dimensions.
[Example annotation words with images: president, actress, team]

20 Results [Plot not preserved in the transcript] Most of the differences are statistically significant.

21 Results (continued) [Plot not preserved in the transcript]

22 Dataset: Indoor Scene Recognition
• 67 indoor scenes.
• Raw image representation: similarities to a set of unlabeled images.
• 2,000 dimensions.
[Example scenes: bakery, bar, train station]

23 Results [Plot not preserved in the transcript]

24 Conclusions
• We proposed an efficient global optimization algorithm for ℓ1,∞ regularization.
• We presented experiments on image classification tasks and showed that our method can recover jointly sparse solutions.
• The projection is a simple and efficient tool for implementing an ℓ1,∞ penalty, analogous to the standard ℓ1 and ℓ2 penalties.

