Recognizing Action at a Distance. A.A. Efros, A.C. Berg, G. Mori, J. Malik, UC Berkeley Computer Vision Group. ICCV 2003.


1 ICCV 2003, UC Berkeley Computer Vision Group: Recognizing Action at a Distance. A.A. Efros, A.C. Berg, G. Mori, J. Malik, UC Berkeley

2 Looking at People
Far field: 3-pixel man
– Blob tracking: vast surveillance literature
Near field: 300-pixel man
– Limb tracking: e.g. Yacoob & Black, Rao & Shah, etc.

3 Medium-field Recognition: The 30-Pixel Man

4 Appearance vs. Motion
[Image: Jackson Pollock, Number 21 (detail)]

5 Goals
Recognize human actions at a distance
– Low resolution, noisy data
– Moving camera, occlusions
– Wide range of actions (including non-periodic)

6 Our Approach
Motion-based approach
– Non-parametric; use large amount of data
– Classify a novel motion by finding the most similar motion from the training set
Related Work
– Periodicity analysis: Polana & Nelson; Seitz & Dyer; Bobick et al.; Cutler & Davis; Collins et al.
– Model-free: Temporal Templates [Bobick & Davis]; Orientation histograms [Freeman et al.; Zelnik & Irani]; Using MoCap data [Zhao & Nevatia; Ramanan & Forsyth]
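
The "find the most similar motion in the training set" step can be sketched as a nearest-neighbor lookup. This is only an illustration, not the authors' code: the `classify_knn` and `dot` helpers, the toy two-number descriptors, and the majority vote over `k` neighbors are all assumptions made for the example.

```python
from collections import Counter


def dot(a, b):
    """Toy similarity measure (assumed for this sketch): a plain dot product."""
    return sum(x * y for x, y in zip(a, b))


def classify_knn(query, training, similarity, k=1):
    """Label `query` by majority vote over its k most similar training items.

    `training` is a list of (descriptor, label) pairs; `similarity` returns a
    larger value for more similar descriptors.
    """
    ranked = sorted(training,
                    key=lambda item: similarity(query, item[0]),
                    reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical training set: two "run" descriptors and one "walk" descriptor.
training = [([1.0, 0.0], "run"), ([0.9, 0.1], "run"), ([0.0, 1.0], "walk")]
label = classify_knn([0.8, 0.2], training, dot, k=3)
```

Because the method is non-parametric, adding more labeled clips to `training` is the only "learning" step in this sketch.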

7 Gathering action data
Tracking
– Simple correlation-based tracker
– User-initialized
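
The slide does not say which correlation measure the tracker uses. As one possible reading, the sketch below searches a small window around the previous position for the patch with the highest normalized cross-correlation against a template. The function names, the `radius` parameter, and the choice of normalized cross-correlation are assumptions; the first `prev_top_left` would come from the user's initialization.

```python
import math


def ncc(patch, template):
    """Normalized cross-correlation between two equal-sized 2D lists."""
    a = [v for row in patch for v in row]
    b = [v for row in template for v in row]
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(sum((x - mean_a) ** 2 for x in a) *
                    sum((y - mean_b) ** 2 for y in b))
    return num / den if den > 0 else 0.0


def track(frame, template, prev_top_left, radius=2):
    """Return the (row, col) near `prev_top_left` that best matches `template`."""
    th, tw = len(template), len(template[0])
    fh, fw = len(frame), len(frame[0])
    r0, c0 = prev_top_left
    best_score, best_pos = -2.0, prev_top_left
    for r in range(max(0, r0 - radius), min(fh - th, r0 + radius) + 1):
        for c in range(max(0, c0 - radius), min(fw - tw, c0 + radius) + 1):
            patch = [row[c:c + tw] for row in frame[r:r + th]]
            score = ncc(patch, template)
            if score > best_score:
                best_score, best_pos = score, (r, c)
    return best_pos


# Toy 6x6 frame with a 2x2 pattern whose top-left corner is at row 3, col 2.
template = [[1, 2], [3, 4]]
frame = [[0] * 6 for _ in range(6)]
for dr in range(2):
    for dc in range(2):
        frame[3 + dr][2 + dc] = template[dr][dc]
position = track(frame, template, prev_top_left=(2, 1))
```

Cropping each frame around the tracked position is what would produce a figure-centric, stabilized sequence.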

8 Figure-centric Representation
Stabilized spatio-temporal volume
– No translation information
– All motion caused by person’s limbs
Good news: indifferent to camera motion
Bad news: hard!
Good test to see if actions, not just translation, are being captured

9 Remembrance of Things Past
“Explain” novel motion sequence by matching to previously seen video clips
– For each frame, match based on some temporal extent
Challenge: how to compare motions?
[Diagram: input sequence; motion analysis; database of clips labeled run, walk left, swing, walk right, jog]

10 How to describe motion?
Appearance
– Not preserved across different clothing
Gradients (spatial, temporal)
– Same problem (e.g. contrast reversal)
Edges/Silhouettes
– Too unreliable
Optical flow
– Explicitly encodes motion
– Least affected by appearance
– …but too noisy

11 Spatial Motion Descriptor
[Figure panels: image frame; optical flow; blurred]
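
The slide text only names the stages (image frame, optical flow, blurred). The sketch below fills in one step from the accompanying paper's description, which the slide text does not spell out: the horizontal and vertical flow components are half-wave rectified into four non-negative channels before blurring. The 3-tap binomial kernel stands in for a Gaussian, and the function names, zero padding, and omission of normalization are assumptions of this sketch. Computing the optical flow itself is not shown.

```python
def half_wave_rectify(fx, fy):
    """Split flow components into four non-negative channels: Fx+, Fx-, Fy+, Fy-."""
    def positive(channel):
        return [[max(v, 0.0) for v in row] for row in channel]

    def negative(channel):
        return [[max(-v, 0.0) for v in row] for row in channel]

    return [positive(fx), negative(fx), positive(fy), negative(fy)]


def blur(channel, kernel=(0.25, 0.5, 0.25)):
    """Separable 3-tap blur with zero padding (a rough stand-in for a Gaussian)."""
    def blur_rows(img):
        out = []
        for row in img:
            n = len(row)
            out.append([
                sum(kernel[k] * row[j + k - 1]
                    for k in range(3) if 0 <= j + k - 1 < n)
                for j in range(n)
            ])
        return out

    def transpose(img):
        return [list(col) for col in zip(*img)]

    # Blur along rows, then along columns.
    return transpose(blur_rows(transpose(blur_rows(channel))))


def spatial_motion_descriptor(fx, fy):
    """Return the four blurred, rectified flow channels for one frame."""
    return [blur(ch) for ch in half_wave_rectify(fx, fy)]


# Toy flow field: one pixel moving right (fx > 0) and up (fy < 0).
fx = [[0.0, 0.0, 0.0], [0.0, 4.0, 0.0], [0.0, 0.0, 0.0]]
fy = [[0.0, 0.0, 0.0], [0.0, -4.0, 0.0], [0.0, 0.0, 0.0]]
descriptor = spatial_motion_descriptor(fx, fy)
```

Rectifying before blurring keeps opposite motions from cancelling each other out when the noisy flow is smoothed.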

12 Spatio-temporal Motion Descriptor
[Figure panels: sequences A and B along time t, with temporal extent E; frame-to-frame similarity matrix (A × B); E × E identity matrix I and blurry I; motion-to-motion similarity matrix (A × B)]
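
One way to read the figure: the motion-to-motion similarity at (i, j) aggregates frame-to-frame similarities along the diagonal through (i, j) over the temporal extent E, which is what convolving with an E × E identity matrix does. The sketch below uses a plain identity kernel (not the blurry variant), a dot product as the per-frame similarity, and an odd `extent`; all names and these choices are assumptions.

```python
def frame_to_frame(seq_a, seq_b):
    """Similarity of every frame in A to every frame in B (dot product here)."""
    return [[sum(x * y for x, y in zip(a, b)) for b in seq_b] for a in seq_a]


def motion_to_motion(s_ff, extent):
    """Sum frame similarities along each diagonal over `extent` frames.

    Equivalent to convolving `s_ff` with an extent x extent identity matrix,
    with out-of-range entries treated as zero. `extent` is assumed odd.
    """
    half = extent // 2
    n, m = len(s_ff), len(s_ff[0])
    out = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            out[i][j] = sum(
                s_ff[i + t][j + t]
                for t in range(-half, half + 1)
                if 0 <= i + t < n and 0 <= j + t < m
            )
    return out


# Toy sequences of flattened per-frame descriptors (hypothetical values).
seq = [[1, 0], [0, 1], [1, 0]]
s_ff = frame_to_frame(seq, seq)
s_mm = motion_to_motion(s_ff, extent=3)
```

In the toy data, frames 0 and 2 look identical on their own, so `s_ff[0][0]` equals `s_ff[0][2]`; once the neighboring frames are taken into account, `s_mm[0][0]` is larger than `s_mm[0][2]`.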

13 Football Actions: matching
[Video panels: input sequence; matched frames]

14 Football Actions: classification
10 actions; 4500 total frames; 13-frame motion descriptor

15 Classifying Ballet Actions
16 actions; 24800 total frames; 51-frame motion descriptor. Men used to classify women and vice versa.

16 Classifying Tennis Actions
6 actions; 4600 frames; 7-frame motion descriptor. Woman player used as training, man as testing.

17 Classifying Tennis
Red bars show classification results

18 Querying the Database
[Diagram: input sequence; database of clips labeled run, walk left, swing, walk right, jog; outputs labeled "Action Recognition" (run, walk left, swing, walk right, jog) and "Joint Positions"]

19 2D Skeleton Transfer
We annotate database with 2D joint positions
After matching, transfer data to novel sequence
– Adjust the match for best fit
[Video panels: input sequence; transferred 2D skeletons]

20 3D Skeleton Transfer
We populate database with rendered stick figures from 3D Motion Capture data
Matching as before, we get 3D joint positions (kind of)!
[Video panels: input sequence; transferred 3D skeletons]

21 “Do as I Do” Motion Synthesis
Matching two things:
– Motion similarity across sequences
– Appearance similarity within sequence (like VideoTextures)
Dynamic Programming
[Diagram: input sequence; synthetic sequence]
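
The slide names dynamic programming but gives no cost function. The sketch below is one plausible formulation, not the authors' implementation: a Viterbi-style search picks one target frame per input frame, trading off motion similarity to the input against how well consecutive chosen frames follow one another in appearance. The matrices, the `alpha` weight, and the additive score are assumptions.

```python
def synthesize(motion_sim, transition, alpha=1.0):
    """Choose one target frame per input frame by dynamic programming.

    motion_sim[t][s]: similarity of input frame t to target frame s.
    transition[p][s]: how well target frame s follows target frame p
                      in appearance (higher is smoother).
    Returns the list of chosen target frame indices.
    """
    num_inputs, num_targets = len(motion_sim), len(motion_sim[0])
    score = [list(motion_sim[0])]
    back = []
    for t in range(1, num_inputs):
        row, pointers = [], []
        for s in range(num_targets):
            # Best predecessor for landing on target frame s at time t.
            best_p = max(range(num_targets),
                         key=lambda p: score[-1][p] + alpha * transition[p][s])
            row.append(score[-1][best_p] + alpha * transition[best_p][s]
                       + motion_sim[t][s])
            pointers.append(best_p)
        score.append(row)
        back.append(pointers)
    # Trace back from the best final state.
    s = max(range(num_targets), key=lambda i: score[-1][i])
    path = [s]
    for pointers in reversed(back):
        s = pointers[s]
        path.append(s)
    return path[::-1]


# Toy example: transitions reward playing target frames in their original order.
motion_sim = [[1.0, 0.0, 0.0], [0.0, 0.4, 0.5], [0.0, 0.0, 1.0]]
transition = [[1.0 if s == p + 1 else 0.0 for s in range(3)] for p in range(3)]
smooth_path = synthesize(motion_sim, transition, alpha=1.0)
greedy_path = synthesize(motion_sim, transition, alpha=0.0)
```

With `alpha=0.0` the search reduces to picking the best motion match per frame; a positive `alpha` lets a slightly worse motion match win when it gives a smoother appearance transition.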

22 “Do as I Do”
[Video panels: source motion; source appearance; result]
3400 frames

23 “Do as I Say” Synthesis
Synthesize given action labels
– e.g. video game control
[Diagram: action labels (run, walk left, swing, walk right, jog); synthetic sequence]

24 “Do as I Say”
Red box shows when constraint is applied

25 Actor Replacement
SHOW VIDEO (GregWorldCup.avi, DivX)

26 Conclusions
In the medium field, action is about motion
What we propose:
– A way of matching motions at coarse scale
What we get out:
– Action recognition
– Skeleton transfer
– Synthesis: “Do as I Do” & “Do as I Say”
What we learned:
– A lot to be said for the “little guy”!

