Published by Elliot Davids. Modified about 1 year ago.

1
Learning an Attribute Dictionary for Human Action Classification
Qiang Qiu, Zhuolin Jiang, and Rama Chellappa, "Sparse Dictionary-based Representation and Recognition of Action Attributes", ICCV 2011
Qiang Qiu

2
Action Feature Representation
[Figure labels: Shape, Motion, HOG]

3
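Action Sparse Representation
[Figure: each action is written as a sparse weighted sum of dictionary atoms; the coefficients visible in the examples are 0.43, 0.64 and 0.35. Labels: Action, Dictionary, Sparse code]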

4
K-SVD [1]
Y = DX, with input signals Y = [y_1, y_2, …], dictionary D = [d_1, d_2, d_3, …] and sparse codes X = [x_1, x_2, …]
Input: signals Y, dictionary size, sparsity T
Output: dictionary D, sparse codes X
$\arg\min_{D,X} \|Y - DX\|_2^2 \quad \text{s.t.} \quad \forall i,\ \|x_i\|_0 \le T$
[1] M. Aharon, M. Elad, and A. Bruckstein, "K-SVD: An Algorithm for Designing Overcomplete Dictionaries for Sparse Representation", IEEE Trans. on Signal Processing, 2006
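The K-SVD problem on this slide can be sketched in a few lines of NumPy. This is a minimal illustration, not the reference implementation from [1]: the function names `omp` and `ksvd`, the random initialisation from input signals, and the fixed iteration count are all assumptions made for the example.

```python
import numpy as np

def omp(D, y, T):
    """Orthogonal matching pursuit: code y with at most T atoms of D."""
    residual, support = y.copy(), []
    x = np.zeros(D.shape[1])
    coef = np.zeros(0)
    for _ in range(T):
        # Pick the atom most correlated with the current residual.
        j = int(np.argmax(np.abs(D.T @ residual)))
        if j in support:
            break
        support.append(j)
        # Re-fit all selected coefficients by least squares.
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
    if support:
        x[support] = coef
    return x

def ksvd(Y, n_atoms, T, n_iter=10, seed=0):
    """Alternate sparse coding (OMP) and rank-1 atom updates (SVD)."""
    rng = np.random.default_rng(seed)
    # Assumption: initialise atoms from randomly chosen signals.
    D = Y[:, rng.choice(Y.shape[1], n_atoms, replace=False)].astype(float)
    D /= np.linalg.norm(D, axis=0, keepdims=True) + 1e-12
    for _ in range(n_iter):
        X = np.column_stack([omp(D, y, T) for y in Y.T])
        for k in range(n_atoms):
            users = np.flatnonzero(X[k])  # signals that use atom k
            if users.size == 0:
                continue
            # Error without atom k, restricted to the signals that use it.
            X[k, users] = 0.0
            E = Y[:, users] - D @ X[:, users]
            U, s, Vt = np.linalg.svd(E, full_matrices=False)
            D[:, k] = U[:, 0]            # new unit-norm atom
            X[k, users] = s[0] * Vt[0]   # matching coefficients
    return D, X

# Usage on synthetic signals (8-dimensional, 40 signals).
Y = np.random.default_rng(1).standard_normal((8, 40))
D, X = ksvd(Y, n_atoms=12, T=3, n_iter=5)
```

Because an atom update only rewrites coefficients that were already nonzero, every column of `X` keeps at most `T` nonzero entries, which matches the constraint on the slide.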

5
Objective: learn a compact and discriminative dictionary.

6
Probabilistic Model for Sparse Representation
- A Gaussian Process
- Dictionary Class Distribution

7
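More Views of Sparse Representation
[Figure: Y = DX with signals y_1, y_2, y_3, y_4 carrying class labels l_1, l_1, l_2, l_2; dictionary atoms d_1, d_2, d_3, …; sparse codes x_1, x_2, x_3, x_4. The row of X belonging to atom d_1, written x_{d_1}, has one entry per signal, and its entries carry the labels l_1, l_1, l_2, l_2]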

8
A Gaussian Process
[Figure: signals y_1, …, y_4 with labels l_1, l_1, l_2, l_2; atoms d_1, …, d_5; each atom d_i has a coefficient row x_{d_i} = (x_1, x_2, x_3, x_4) whose entries carry the labels l_1, l_1, l_2, l_2]
- Covariance function entry: $K(i,j) = \mathrm{cov}(x_{d_i}, x_{d_j})$
- $P(X_{d^*} \mid X_{D^*})$ is a Gaussian with a closed-form conditional variance
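For jointly Gaussian variables, the closed-form conditional variance is the standard expression K(d,d) - K(d,S) K(S,S)^{-1} K(S,d). The sketch below evaluates it with NumPy; the helper name `conditional_variance` and the 2x2 covariance matrix are hypothetical, and using `np.cov` on the rows of a sparse-code matrix is only one possible way to build K.

```python
import numpy as np

def conditional_variance(K, d, S):
    """Var(x_d | x_S) for jointly Gaussian variables with covariance K."""
    S = list(S)
    if not S:
        return float(K[d, d])            # nothing to condition on
    K_SS = K[np.ix_(S, S)]
    K_dS = K[d, S]
    return float(K[d, d] - K_dS @ np.linalg.solve(K_SS, K_dS))

# Hypothetical covariance between two atoms.
K_demo = np.array([[2.0, 1.0],
                   [1.0, 2.0]])
v = conditional_variance(K_demo, 0, [1])   # 2 - 1 * (1/2) * 1 = 1.5

# With a sparse-code matrix X (atoms in rows), K could be estimated as
# K = np.cov(X) + 1e-6 * np.eye(X.shape[0])   # jitter is an assumption
```

Conditioning never increases the variance, so an atom that is well predicted by the atoms in S gets a small value.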

9
Dictionary Class Distribution
[Figure: signals y_1, …, y_4 with labels l_1, l_1, l_2, l_2; atoms d_1, …, d_5; each atom d_i has a coefficient row x_{d_i} whose entries carry the labels l_1, l_1, l_2, l_2]
- $P(L \mid d_i)$, $L \in [1, M]$: aggregate $|x_{d_i}|$ based on class labels to obtain an M-sized vector
- $P(L = l_1 \mid d_5) = (\ldots)/(\ldots)$
- $P(L = l_2 \mid d_5) = (0 + 0.42)/(\ldots) = 0.37$
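The aggregation described on this slide can be sketched as follows. The coefficient values are hypothetical (the slide's own numbers are only partly legible), and the uniform fallback for an atom that no signal uses is an assumption of this example.

```python
import numpy as np

def class_distribution(x_d, labels, n_classes):
    """P(L | d): share of an atom's absolute coefficient mass per class."""
    mass = np.zeros(n_classes)
    # Sum |coefficient| over the signals of each class.
    np.add.at(mass, labels, np.abs(x_d))
    total = mass.sum()
    if total == 0:
        # Assumption: an unused atom gets a uniform distribution.
        return np.full(n_classes, 1.0 / n_classes)
    return mass / total

# Hypothetical coefficients of one atom over four signals
# labelled l1, l1, l2, l2 (encoded as 0, 0, 1, 1).
x_d = np.array([0.3, -0.5, 0.0, 0.2])
labels = np.array([0, 0, 1, 1])
p = class_distribution(x_d, labels, n_classes=2)   # approx. [0.8, 0.2]
```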

10
Dictionary Learning Approaches
- Maximization of Joint Entropy (ME)
- Maximization of Mutual Information (MMI)
  - Unsupervised Learning (MMI-1)
  - Supervised Learning (MMI-2)

11
Maximization of Joint Entropy (ME)
- Initialize the dictionary D^o using K-SVD
- Start with D^* = ∅
- Until |D^*| = k, iteratively choose d^* from D^o \ D^*: $d^* = \arg\max_{d} H(d \mid D^*)$
- The result is the ME dictionary D, a good approximation to the ME criterion $\arg\max_{D} H(D)$, where [equation not recoverable]
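The greedy ME selection above can be sketched under the Gaussian assumption, where the conditional entropy of an atom is H(d | D*) = 0.5 log(2πe σ²) with σ² the conditional variance of its coefficients. The names `cond_var` and `select_me` and the toy covariance matrix are hypothetical.

```python
import numpy as np

def cond_var(K, d, S):
    """Var(x_d | x_S) under a joint Gaussian with covariance K."""
    if not S:
        return float(K[d, d])
    K_dS = K[d, S]
    var = K[d, d] - K_dS @ np.linalg.solve(K[np.ix_(S, S)], K_dS)
    return max(float(var), 1e-12)        # guard against round-off

def gaussian_entropy(var):
    return 0.5 * np.log(2 * np.pi * np.e * var)

def select_me(K, k):
    """Greedily add the atom with the largest H(d | D*)."""
    selected, remaining = [], list(range(K.shape[0]))
    while len(selected) < k and remaining:
        scores = [gaussian_entropy(cond_var(K, d, selected)) for d in remaining]
        best = remaining[int(np.argmax(scores))]
        selected.append(best)
        remaining.remove(best)
    return selected

# Toy covariance: atoms 0 and 1 are strongly correlated, atom 2 is independent.
K_toy = np.array([[1.0, 0.9, 0.0],
                  [0.9, 1.0, 0.0],
                  [0.0, 0.0, 0.8]])
chosen = select_me(K_toy, k=2)   # [0, 2]
```

After atom 0 is chosen, atom 1 has conditional variance 1 - 0.81 = 0.19, below the 0.8 of atom 2, so the redundant atom is skipped.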

12
Maximization of Mutual Information for Unsupervised Learning (MMI-1)
- Initialize the dictionary D^o using K-SVD
- Start with D^* = ∅
- Until |D^*| = k, iteratively choose d^* from D^o \ D^*: $d^* = \arg\max_{d} \; H(d \mid D^*) - H(d \mid D^o \setminus (D^* \cup d))$
- The two terms are annotated Diversity and Coverage
- Closed form: [equation not recoverable]
- The result is the MMI dictionary, a near-optimal approximation to MMI, $\arg\max_{D} I(D;\, D^o \setminus D)$, within (1 - 1/e) of the optimum
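Under the Gaussian assumption the MMI-1 score reduces to a ratio of conditional variances, since H(d | A) - H(d | B) = 0.5 log(σ²(d | A) / σ²(d | B)). The sketch below uses that identity; it is one way to evaluate the criterion and is not claimed to be the closed form shown on the original slide. The names `cond_var` and `select_mmi` and the toy covariance are hypothetical.

```python
import numpy as np

def cond_var(K, d, S):
    """Var(x_d | x_S) under a joint Gaussian with covariance K."""
    if not S:
        return float(K[d, d])
    K_dS = K[d, S]
    var = K[d, d] - K_dS @ np.linalg.solve(K[np.ix_(S, S)], K_dS)
    return max(float(var), 1e-12)        # guard against round-off

def select_mmi(K, k):
    """Greedy MMI-1: maximise H(d | D*) - H(d | D^o minus (D* and d))."""
    n = K.shape[0]
    selected = []
    while len(selected) < min(k, n):
        best, best_score = None, -np.inf
        for d in range(n):
            if d in selected:
                continue
            # Atoms that are neither selected nor the candidate itself.
            rest = [j for j in range(n) if j != d and j not in selected]
            score = 0.5 * np.log(cond_var(K, d, selected) / cond_var(K, d, rest))
            if score > best_score:
                best, best_score = d, score
        selected.append(best)
    return selected

# Toy covariance: atoms 0 and 1 are correlated, atoms 2 and 3 are independent.
K_toy = np.array([[1.0, 0.9, 0.0, 0.0],
                  [0.9, 1.0, 0.0, 0.0],
                  [0.0, 0.0, 0.8, 0.0],
                  [0.0, 0.0, 0.0, 0.5]])
chosen = select_mmi(K_toy, k=2)
```

The first pick comes from the correlated pair, because such an atom is well explained by the unselected atoms; its partner then scores below the independent atoms, because it is already explained by the selected set.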

13
Revisit: Dictionary Class Distribution
[Figure: signals y_1, …, y_4 with labels l_1, l_1, l_2, l_2; atoms d_1, …, d_5; each atom d_i has a coefficient row x_{d_i} whose entries carry the labels l_1, l_1, l_2, l_2]
- $P(L \mid d_i)$, $L \in [1, M]$: aggregate $|x_{d_i}|$ based on class labels to obtain an M-sized vector
- $P(l_1 \mid d_5) = (\ldots)/(\ldots)$
- $P(l_2 \mid d_5) = (0 + 0.42)/(\ldots) = 0.37$
- $P(L_d) = P(L \mid d)$
- $P(L_D) = P(L \mid D)$, where [equation not recoverable]

14
Maximization of Mutual Information for Supervised Learning (MMI-2)
- Initialize the dictionary D^o using K-SVD
- Start with D^* = ∅
- Until |D^*| = k, iteratively choose d^* from D^o \ D^*: $d^* = \arg\max_{d} \; [H(d \mid D^*) - H(d \mid D^o \setminus (D^* \cup d))] + \lambda\,[H(L_d \mid L_{D^*}) - H(L_d \mid L_{D^o \setminus (D^* \cup d)})]$
- MMI-1 is a special case of MMI-2 with λ = 0.

15
Keck gesture dataset

16
Representation Consistency
[1] J. Liu and M. Shah, "Learning Human Actions via Information Maximization", CVPR 2008

17
Recognition Accuracy
The recognition accuracy using the initial dictionary D^o: (a) 0.23 (b) 0.42 (c) 0.71

18
Recognizing Realistic Actions
- Broadcast sports videos, 10 different actions
- Average recognition rate: 83.6%
- Best reported result: 86.6%
