School of Electronic Information Engineering, Tianjin University Human Action Recognition by Learning Bases of Action Attributes and Parts Jia pingping.


1 School of Electronic Information Engineering, Tianjin University Human Action Recognition by Learning Bases of Action Attributes and Parts Jia pingping

2 Outline: 1. Action Classification in Still Images; 2. Intuition: Action Attributes and Parts; 3. Algorithm: Learning Bases of Attributes and Parts; 4. Experiments: PASCAL & Stanford 40 Actions; 5. Conclusion

3 Low-level feature. Riding bike.

4 Action Classification in Still Images. Low-level feature; High-level representation; Riding bike. - Semantic concepts: Attributes (Riding a bike, Sitting on a bike seat, Wearing a helmet, Pedaling the pedals, …)

5 Action Classification in Still Images. Low-level feature; High-level representation; Riding bike. - Semantic concepts: Attributes (Riding a bike, Sitting on a bike seat, Wearing a helmet, Pedaling the pedals, …) - Objects

6 Action Classification in Still Images. Low-level feature; High-level representation; Riding bike. - Semantic concepts: Attributes (Riding a bike, Sitting on a bike seat, Wearing a helmet, Pedaling the pedals, …) - Objects - Human poses (Parts)

7 Action Classification in Still Images. Low-level feature; High-level representation; Riding bike. - Semantic concepts: Attributes (Riding a bike, Sitting on a bike seat, Wearing a helmet, Pedaling the pedals, …) - Objects - Human poses (Parts) - Contexts of attributes & parts

8 Action Classification in Still Images. Low-level feature; High-level representation; Riding bike. - Semantic concepts: Attributes (Riding a bike, Wearing a helmet, Pedaling the pedals, Sitting on a bike seat) - Objects - Human poses (Parts) - Contexts of attributes & parts. Benefits: incorporates human knowledge; more understanding of image content; a more discriminative classifier.

10 Outline: 1. Action Classification in Still Images; 2. Intuition: Action Attributes and Parts; 3. Algorithm: Learning Bases of Attributes and Parts; 4. Experiments: PASCAL & Stanford 40 Actions; 5. Conclusion

11 Action Attributes and Parts. Attributes: …… (semantic descriptions of human actions)

12 Action Attributes and Parts. Attributes: …… (semantic descriptions of human actions). Riding bike vs. Not riding bike: discriminative classifier, e.g. SVM

13 Action Attributes and Parts. Attributes: ……; Parts-Objects: ……; Parts-Poselets: ……. A pre-trained detector.

14 Action Attributes and Parts. Attributes: ……; Parts-Objects: ……; Parts-Poselets: ……. Attribute classification; Object detection; Poselet detection. a: Image feature vector

15 Action Attributes and Parts. Attributes: ……; Parts-Objects: ……; Parts-Poselets: ……. Attribute classification; Object detection; Poselet detection. a: Image feature vector; Action bases Φ

16 Action Attributes and Parts. Attributes: ……; Parts-Objects: ……; Parts-Poselets: ……. a: Image feature vector; Action bases Φ

17 Action Attributes and Parts. Attributes: ……; Parts-Objects: ……; Parts-Poselets: ……. a: Image feature vector; Action bases Φ

18 Action Attributes and Parts. Attributes: ……; Parts-Objects: ……; Parts-Poselets: ……. a: Image feature vector; Action bases Φ; Bases coefficients w; SVM

19 Action Attributes and Parts. Attributes: ……; Parts-Objects: ……; Parts-Poselets: ……. a: Image feature vector; Action bases Φ; Bases coefficients w; Riding bike
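
The slides above describe an image feature vector a built from attribute classification, object detection, and poselet detection. As a minimal sketch of how such a vector might be assembled, assuming each component contributes one confidence score per attribute, object, or poselet, the scores can simply be concatenated. The function name and the toy scores are hypothetical, and the slides do not say how the scores are normalized.

```python
from typing import List, Sequence


def build_feature_vector(
    attribute_scores: Sequence[float],
    object_scores: Sequence[float],
    poselet_scores: Sequence[float],
) -> List[float]:
    """Concatenate the three groups of confidence scores into one vector a.

    Assumption: the order is attributes, then objects, then poselets.
    """
    groups = (attribute_scores, object_scores, poselet_scores)
    return [float(score) for group in groups for score in group]


# Toy example: 2 attributes, 1 object, 3 poselets give a 6-dimensional vector.
a = build_feature_vector([0.9, 0.1], [0.7], [0.2, 0.4, 0.8])
print(len(a))
```

With the PASCAL setting listed later in the talk (14 attributes, 27 objects, 150 poselets), a vector built this way would have 191 entries.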

21 Outline: 1. Action Classification in Still Images; 2. Intuition: Action Attributes and Parts; 3. Algorithm: Learning Bases of Attributes and Parts; 4. Experiments: PASCAL & Stanford 40 Actions; 5. Conclusion

22 Bases of Attr. & Parts: Training. Diagram: a, Φ, w. Input: image feature vectors a. Output: action bases Φ and sparse coefficients W. Jointly estimate Φ and W: …
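
The formulas on this slide did not survive extraction. One standard sparse dictionary learning objective that is consistent with the slide's labels (input a, sparse output, joint estimation of Φ and W) is sketched below. The L1 penalty on the coefficients, the elastic-net style constraint set C on the bases, and the roles given to λ and γ are assumptions here, not a transcription of the slide, and should be checked against the original paper.

```latex
% Assumed notation: a_i is the feature vector of training image i (i = 1..N),
% \Phi = [\Phi_1, \dots, \Phi_M] holds the M action bases as columns,
% and w_i is the sparse coefficient vector of image i.
\min_{\Phi \in \mathcal{C},\; w_1, \dots, w_N}
  \sum_{i=1}^{N} \left( \frac{1}{2} \lVert a_i - \Phi w_i \rVert_2^2
  + \lambda \lVert w_i \rVert_1 \right),
\qquad
\mathcal{C} = \left\{ \Phi \;:\; \lVert \Phi_j \rVert_1
  + \frac{\gamma}{2} \lVert \Phi_j \rVert_2^2 \le 1
  \ \text{for all } j \right\}
```

Under this reading, λ controls how sparse each w_i is, and γ trades off sparsity against smoothness of each basis Φ_j.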

23 Bases of Attr. & Parts: Testing. Diagram: a, Φ, w. Input: a and the learned Φ. Output: sparse w. Estimate w: …
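
At test time the bases Φ are fixed and only the sparse coefficients w are estimated. A minimal sketch of one way to do this, assuming the objective is an L1-regularized least-squares problem (minimize ½‖a − Φw‖² + λ‖w‖₁), is the iterative shrinkage-thresholding algorithm (ISTA) below. ISTA is a swapped-in generic solver, not necessarily the one the authors used, and the function names are hypothetical.

```python
from typing import List, Optional


def soft_threshold(x: float, t: float) -> float:
    """Shrink x toward zero by t (the proximal operator of the L1 norm)."""
    if x > t:
        return x - t
    if x < -t:
        return x + t
    return 0.0


def sparse_code(
    a: List[float],
    phi: List[List[float]],
    lam: float = 0.1,
    step: Optional[float] = None,
    iters: int = 200,
) -> List[float]:
    """Estimate sparse w with ISTA. phi is a P x M matrix given as a list of rows."""
    num_rows, num_bases = len(phi), len(phi[0])
    if step is None:
        # 1 / squared Frobenius norm never exceeds 1 / Lipschitz constant,
        # so this step size is safe (assumes phi is not all zeros).
        step = 1.0 / sum(v * v for row in phi for v in row)
    w = [0.0] * num_bases
    for _ in range(iters):
        # residual = phi @ w - a
        residual = [
            sum(phi[p][m] * w[m] for m in range(num_bases)) - a[p]
            for p in range(num_rows)
        ]
        # gradient of the smooth term = phi^T @ residual
        grad = [
            sum(phi[p][m] * residual[p] for p in range(num_rows))
            for m in range(num_bases)
        ]
        w = [soft_threshold(w[m] - step * grad[m], step * lam) for m in range(num_bases)]
    return w


# Toy example with identity bases: small entries of a are driven exactly to zero.
print(sparse_code([1.0, 0.05], [[1.0, 0.0], [0.0, 1.0]], lam=0.1))
```

The resulting w can then be passed to a classifier such as an SVM in place of a.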

24 Outline: 1. Action Classification in Still Images; 2. Intuition: Action Attributes and Parts; 3. Algorithm: Learning Bases of Attributes and Parts; 4. Experiments: PASCAL & Stanford 40 Actions; 5. Conclusion

25 1. PASCAL Action Dataset http://pascallin.ecs.soton.ac.uk/challenges/VOC/voc2008/

26 1. PASCAL Action Dataset. Contains 9 classes, with 21,738 images in total; 50% of each class is randomly selected for training/validation and the remaining images are used for testing; 14 attributes, 27 objects, 150 poselets. The number of action bases is set to 400 and 600 respectively. The λ and γ values are set to 0.1 and 0.15.
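
The per-class random split described on this slide can be sketched as follows. This is a minimal illustration assuming one class label per image; the function name, the fixed seed, and the rounding rule for odd class sizes are hypothetical choices, not details from the talk.

```python
import random
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence, Tuple


def split_per_class(
    labels: Sequence[Hashable], train_fraction: float = 0.5, seed: int = 0
) -> Tuple[List[int], List[int]]:
    """Randomly assign a fixed fraction of each class to training, the rest to testing.

    Returns two lists of image indices. Assumes labels are sortable (e.g. strings).
    """
    rng = random.Random(seed)
    by_class: Dict[Hashable, List[int]] = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    train, test = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        cut = int(len(indices) * train_fraction)  # rounds down for odd class sizes
        train.extend(indices[:cut])
        test.extend(indices[cut:])
    return train, test


labels = ["phoning"] * 4 + ["reading"] * 6
train_idx, test_idx = split_per_class(labels, train_fraction=0.5)
print(len(train_idx), len(test_idx))
```

Splitting inside each class keeps the class proportions the same in both halves, which a single global shuffle would not guarantee.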

27 Classification Result [bar chart: average precision per class (Phoning, Playing instrument, Reading, Riding bike, Riding horse, Running, Taking photo, Using computer, Walking); methods: Our method, use “a”; POSELETS; SURREY_MK; UCLEAR_DOSP; diagram: a, Φ, w]

28 Classification Result [bar chart: average precision per class (Phoning, Playing instrument, Reading, Riding bike, Riding horse, Running, Taking photo, Using computer, Walking); methods: Our method, use “a”; Our method, use “w”; POSELETS; SURREY_MK; UCLEAR_DOSP; diagram: a, Φ, w]

29 Classification Result [bar chart: average precision per class (Phoning, Playing instrument, Reading, Riding bike, Riding horse, Running, Taking photo, Using computer, Walking); methods: Our method, use “a”; Our method, use “w”; Poselet, Maji et al, 2011; SURREY_MK; UCLEAR_DOSP; annotations: 400 action bases; attributes, objects, poselets; diagram: a, Φ, w]

30 Classification Result [bar chart: average precision per class (Phoning, Playing instrument, Reading, Riding bike, Riding horse, Running, Taking photo, Using computer, Walking); methods: Our method, use “a”; Our method, use “w”; Poselet, Maji et al, 2011; SURREY_MK; UCLEAR_DOSP; annotations: 400 action bases; attributes, objects, poselets; diagram: a, Φ, w]

31 Classification Result [bar chart: average precision per class (Phoning, Playing instrument, Reading, Riding bike, Riding horse, Running, Taking photo, Using computer, Walking); methods: Our method, use “a”; Our method, use “w”; Poselet, Maji et al, 2011; SURREY_MK; UCLEAR_DOSP; annotations: 400 action bases; attributes, objects, poselets; diagram: a, Φ, w]
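
The results on these slides are reported as average precision per class. A minimal sketch of one common way to compute it, assuming a list of classifier scores and binary ground-truth labels for a single class, is shown below. This is the plain non-interpolated variant; the official PASCAL VOC evaluation uses an interpolated precision/recall curve, so values from this sketch may differ slightly from the challenge numbers.

```python
from typing import Sequence


def average_precision(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Non-interpolated AP: mean of the precision measured at each positive example.

    scores: classifier confidence per image (higher means more likely positive).
    labels: 1 for images of the class, 0 otherwise.
    """
    ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    precisions = []
    for rank, index in enumerate(ranking, start=1):
        if labels[index]:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0


# Positives ranked 1st and 3rd: precisions are 1/1 and 2/3.
print(average_precision([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 0]))
```

Averaging this value over all classes gives the mean average precision (mAP) usually quoted as a single summary number.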

32 Control Experiment [chart comparing Use “a” and Use “w”; A: attribute, O: object, P: poselet; diagram: a, Φ, w]

33 2. Stanford 40 Actions: Applauding, Blowing bubbles, Brushing teeth, Calling, Cleaning floor, Climbing wall, Cooking, Cutting trees, Cutting vegetables, Drinking, Feeding horse, Fishing, Fixing bike, Gardening, Holding umbrella, Jumping, Playing guitar, Playing violin, Pouring liquid, Pushing cart, Reading, Repairing car, Riding bike, Riding horse, Rowing, Running, Shooting arrow, Smoking cigarette, Taking photo, Texting message, Throwing frisbee, Using computer, Using microscope, Using telescope, Walking dog, Washing dishes, Watching television, Waving hands, Writing on board, Writing on paper. http://vision.stanford.edu/Datasets/40actions.html

34 2. Stanford 40 Actions. Contains 40 diverse daily human actions; 180 ∼ 300 images for each class, 9532 real-world images in total; all the images are obtained from Google, Bing, and Flickr; large variations in human pose, appearance, and background clutter. [Example images: Cutting vegetables, Drinking, Feeding horse, Fixing bike, Gardening, Holding umbrella, Playing guitar, Playing violin, Pouring liquid, Reading, Repairing car, Riding bike, Shooting arrow, Smoking cigarette, Taking photo, Walking dog, Washing dishes, Watching television; Drinking, Gardening, Smoking cigarette]

35 Result: 100 images in each class are randomly selected for training, and the remaining images are used for testing. 45 attributes, 81 objects, 150 poselets. The number of action bases is set to 400 and 600 respectively. The λ and γ values are set to 0.1 and 0.15. Our method is compared with the Locality-constrained Linear Coding (LLC, Wang et al, CVPR 2010) baseline. [Chart: average precision]

36 Control Experiment [chart comparing Use “a” and Use “w”; A: attribute, O: object, P: poselet; diagram: a, Φ, w]

37 Outline: 1. Action Classification in Still Images; 2. Intuition: Action Attributes and Parts; 3. Algorithm: Learning Bases of Attributes and Parts; 4. Experiments: PASCAL & Stanford 40 Actions; 5. Conclusion

38 Partwise Bag-of-Words (PBoW) Representation: - Local feature - Body part localization - PBoW generation: head-wise BoW, limb-wise BoW, leg-wise BoW, foot-wise BoW
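
The PBoW generation step can be sketched as follows. This is a minimal illustration assuming local features have already been extracted and assigned to body parts, and that a single visual codebook is shared by all parts; the slide does not say whether each part has its own codebook, and the function names are hypothetical.

```python
from typing import Dict, List, Sequence


def nearest_codeword(feature: Sequence[float], codebook: Sequence[Sequence[float]]) -> int:
    """Index of the codeword closest to the feature in squared Euclidean distance."""
    return min(
        range(len(codebook)),
        key=lambda k: sum((f - c) ** 2 for f, c in zip(feature, codebook[k])),
    )


def partwise_bow(
    features_by_part: Dict[str, List[Sequence[float]]],
    codebook: Sequence[Sequence[float]],
) -> Dict[str, List[int]]:
    """Build one bag-of-words histogram per body part."""
    histograms = {}
    for part, features in features_by_part.items():
        histogram = [0] * len(codebook)
        for feature in features:
            histogram[nearest_codeword(feature, codebook)] += 1
        histograms[part] = histogram
    return histograms


# Toy example: a 2-word codebook and local features located on two parts.
codebook = [[0.0, 0.0], [1.0, 1.0]]
features = {"head": [[0.1, 0.0], [0.9, 1.0], [1.1, 0.9]], "leg": []}
print(partwise_bow(features, codebook))
```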

39 Local Action Attribute Method: 1. Label the action samples according to different parts. For each part, we define a new set of low-level semantics to re-classify the training action samples. Head: static, vertical move, horizontal move; Limb: static, swing, …; Leg: static, …; Foot: static, …

40 Local Action Attribute Method: 2. For each part, train a set of attribute classifiers according to the set of semantics we define. [Diagram: for each part, train … … …]

41 Local Action Attribute Method: 3. For each action sample, map its low-level representation to a middle-level representation through the following framework: one action sample; Head-wise BoW, Limb-wise BoW, Leg-wise BoW, Foot-wise BoW; combine these four parts to build a new histogram representation of the sample.
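
One way to read this mapping is that each part-wise BoW histogram is scored by that part's attribute classifiers, and the scores from the four parts are concatenated. The sketch below assumes linear attribute classifiers given as (weights, bias) pairs; the classifier form, the part order, and all names are hypothetical, since the slide only shows the framework as a diagram.

```python
from typing import Dict, List, Sequence, Tuple

PARTS = ("head", "limb", "leg", "foot")  # assumed fixed order for concatenation

LinearClassifier = Tuple[Sequence[float], float]  # (weights, bias)


def linear_score(weights: Sequence[float], bias: float, histogram: Sequence[float]) -> float:
    """Score of one attribute classifier on one part-wise histogram."""
    return sum(w * h for w, h in zip(weights, histogram)) + bias


def mid_level_descriptor(
    part_histograms: Dict[str, Sequence[float]],
    part_classifiers: Dict[str, List[LinearClassifier]],
) -> List[float]:
    """Concatenate the attribute scores of all four parts into one descriptor."""
    descriptor = []
    for part in PARTS:
        histogram = part_histograms[part]
        for weights, bias in part_classifiers[part]:
            descriptor.append(linear_score(weights, bias, histogram))
    return descriptor


# Toy example: 2-bin histograms, two head attributes and one attribute for each other part.
histograms = {part: [1, 2] for part in PARTS}
classifiers = {
    "head": [([1, 0], 0.0), ([0, 1], -1.0)],
    "limb": [([1, 1], 0.0)],
    "leg": [([1, 1], 0.0)],
    "foot": [([1, 1], 0.0)],
}
print(mid_level_descriptor(histograms, classifiers))
```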

42 Local Action Attribute Method: 4. Thus, based on local action attributes, we construct a new descriptor of action samples, which can be used for classification. [Diagram: Training set, Testing set: SVM; Training set, Testing set: K-NN]
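
Of the two classifiers named on this slide, K-NN is simple enough to sketch directly. The example below is a minimal illustration assuming squared Euclidean distance and majority voting over descriptors; the value of k, the distance, and the toy data are hypothetical, as the slide does not give them.

```python
from collections import Counter
from typing import List, Sequence


def knn_predict(
    train_x: Sequence[Sequence[float]],
    train_y: Sequence[str],
    query: Sequence[float],
    k: int = 3,
) -> str:
    """Predict the label of query by majority vote among its k nearest training samples."""
    distances = sorted(
        (sum((a - b) ** 2 for a, b in zip(x, query)), y)
        for x, y in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Toy descriptors for two actions.
train_x: List[List[float]] = [[0, 0], [0, 1], [5, 5], [6, 5]]
train_y = ["walk", "walk", "run", "run"]
print(knn_predict(train_x, train_y, [0.2, 0.3], k=3))
```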

43 School of Electronic Information Engineering, Tianjin University Thank you

