Recognizing Action at a Distance Alexei A. Efros, Alexander C. Berg, Greg Mori, Jitendra Malik Computer Science Division, UC Berkeley Presented by Pundik.

1 Recognizing Action at a Distance Alexei A. Efros, Alexander C. Berg, Greg Mori, Jitendra Malik Computer Science Division, UC Berkeley Presented by Pundik Dmitry Computer Vision Seminar, IDC

2 Motivation: recognizing human actions in a motion sequence; objects in the medium field, about 30 px tall; the objects are noisy; different view angles; non-periodic actions; moving camera; not appearance-based.

3 Additional Applications: classification of actions; action synthesis (“Do as I do”, “Do as I say”); action database (images, skeletons); figure correction.

4 Previous Work: large-scale objects, body-part recognition; periodic motion; stationary cameras, background subtraction; spatio-temporal gradients for video event recognition (high resolution, different motion classes).

5 Recognition Method: tracking a person (simple normalized-correlation-based tracker, user-initialized); stabilizing the figure’s center in the sequence; calculating spatio-temporal motion descriptors for each frame; measuring motion similarities between sequences.
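The tracking step can be sketched as follows. This is only an illustration of a normalized-correlation tracker, not the presenter's or the authors' code: the function names, the `radius` search window, and the use of a fixed template are all assumptions. The best-matching window position in each frame would then be used to crop a figure-centered, stabilized sequence.

```python
import numpy as np

def normalized_correlation(patch, template):
    """Zero-mean normalized correlation between two equally sized arrays."""
    p = patch - patch.mean()
    t = template - template.mean()
    denom = np.sqrt((p * p).sum() * (t * t).sum())
    return 0.0 if denom == 0 else float((p * t).sum() / denom)

def track(frame, template, prev_rc, radius=8):
    """Search around prev_rc (row, col of the window's top-left corner)
    for the window that best matches the template."""
    h, w = template.shape
    r0, c0 = prev_rc
    best_score, best_rc = -np.inf, prev_rc
    for r in range(max(0, r0 - radius), min(frame.shape[0] - h, r0 + radius) + 1):
        for c in range(max(0, c0 - radius), min(frame.shape[1] - w, c0 + radius) + 1):
            score = normalized_correlation(frame[r:r + h, c:c + w], template)
            if score > best_score:
                best_score, best_rc = score, (r, c)
    return best_rc, best_score
```

In use, the user would mark the figure in the first frame to obtain `template` and `prev_rc`, and `track` would be called once per subsequent frame with the previous result.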

6 Motion Descriptors: Use actual pixel values (appearance)? Use spatial gradients? Use temporal gradients? Edges? The choice is pixel-wise optical flow: it encodes motion and is least affected by appearance, but it is noisy.

7 Optical Flow Overview: pixel-wise stabilization of the video sequence, using the Lucas & Kanade registration method. The images are taken from http://www.apple.com/shake/imageprocessing.html

8 Optical Flow Overview – cont. For each pixel we have an intensity and a velocity at each time point. Assumption (1): the motion is small, so a first-order Taylor approximation of the intensity holds.

9 Optical Flow Overview – cont. Assumption (2): the intensity of the moving pixel remains the same. Together, (1) and (2) give a linear constraint on the pixel’s velocity.
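In the standard notation for this derivation (the symbols are assumptions here, not taken from the slides: $I(x,y,t)$ for intensity, $(u,v)$ for the pixel's velocity, and $I_x, I_y, I_t$ for the partial derivatives of $I$), the two assumptions combine as follows:

```latex
\begin{aligned}
I(x+u,\ y+v,\ t+1) &\approx I(x,y,t) + I_x u + I_y v + I_t && \text{(1) small motion} \\
I(x+u,\ y+v,\ t+1) &= I(x,y,t) && \text{(2) constant intensity} \\
\Rightarrow\quad I_x u + I_y v + I_t &= 0
\end{aligned}
```

This is one equation in two unknowns per pixel, which is why the constraint is collected over a block of pixels.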

10 Optical Flow Overview – cont. The constraint is written for all the pixels in a small block, which gives an overdetermined linear system in the block’s velocity.

11 Optical Flow Overview – cont. The system is solved in the least-squares sense.
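A minimal sketch of the block-wise solve, assuming the usual Lucas-Kanade setup: each pixel in the block contributes one row `[Ix, Iy]` to a matrix `A` and one entry `-It` to a vector `b`, and `A @ [u, v] = b` is solved by least squares. The function name, the block size, and the choice of finite differences for the derivatives are assumptions made for illustration.

```python
import numpy as np

def block_flow(frame0, frame1, top, left, size=5):
    """Estimate one (u, v) motion vector for a size x size block.
    u is the horizontal (column) component, v the vertical (row) one."""
    f0 = frame0.astype(float)
    f1 = frame1.astype(float)
    # np.gradient returns derivatives along rows (y) first, then columns (x).
    Iy, Ix = np.gradient(f0)
    It = f1 - f0  # temporal derivative between the two frames
    win = (slice(top, top + size), slice(left, left + size))
    A = np.stack([Ix[win].ravel(), Iy[win].ravel()], axis=1)
    b = -It[win].ravel()
    # Least-squares solution of the overdetermined system A @ [u, v] = b.
    (u, v), *_ = np.linalg.lstsq(A, b, rcond=None)
    return float(u), float(v)
```

The estimate is only reliable for sub-pixel or very small motions and for blocks whose gradients point in more than one direction; a block with a single edge direction leaves the system rank-deficient.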

12 Optical Flow Overview – cont. We now have a motion vector for each block. The images are taken from http://www.cs.otago.ac.nz/research/vision/Research/OpticalFlow/opticalflow.html

13 Back to Motion Descriptors: the optical flow results are noisy, so we would like to blur them. [Figure: the horizontal and vertical flow components, Fx and Fy.]

14 From Optical Flow to Descriptors: splitting the motion vectors V(X,Y) into positive and negative channels; Gaussian blurring and normalizing of the four channels.
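The two steps on this slide can be sketched as below. The split into positive and negative parts (half-wave rectification) and the Gaussian blur follow the slide; the blur width `sigma` and the particular normalization (unit L2 norm over all four channels together) are assumptions, since the slide does not specify them.

```python
import numpy as np

def gaussian_kernel(sigma):
    """1D Gaussian kernel truncated at 3 sigma, summing to 1."""
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x**2 / (2 * sigma**2))
    return k / k.sum()

def blur(channel, sigma):
    """Separable Gaussian blur; each image side must be at least as long
    as the kernel (6 * sigma + 1 samples)."""
    k = gaussian_kernel(sigma)
    rows = np.apply_along_axis(lambda r: np.convolve(r, k, mode="same"), 1, channel)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="same"), 0, rows)

def motion_descriptor(fx, fy, sigma=2.0):
    """Turn flow components fx, fy into four blurred, non-negative channels."""
    channels = [np.maximum(fx, 0), np.maximum(-fx, 0),
                np.maximum(fy, 0), np.maximum(-fy, 0)]
    blurred = np.stack([blur(c, sigma) for c in channels])
    norm = np.linalg.norm(blurred)
    return blurred / norm if norm > 0 else blurred
```

Splitting before blurring matters: blurring the signed flow directly would let opposite motions cancel, while the rectified channels keep them apart.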

15 Comparing Descriptors: in order to compare motions, we need to compare frames of two different sequences. The descriptors of all frames are compared using spatio-temporal correlation. The formula uses descriptor number c of frame i in sequence A; its inner term is the frame-to-frame similarity.
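One way to write the spatio-temporal correlation described on this slide, in notation assumed here rather than recovered from the slide ($a_c^{i}$ for descriptor channel $c$ of frame $i$ in sequence A, $b_c^{j}$ likewise for sequence B, $T$ for a temporal window, $I$ for the spatial extent of the stabilized figure):

```latex
S(i, j) = \sum_{t \in T}\;
  \underbrace{\sum_{c=1}^{4} \sum_{(x,y) \in I}
  a_c^{\,i+t}(x,y)\; b_c^{\,j+t}(x,y)}_{\text{frame-to-frame similarity of frames } i+t \text{ and } j+t}
```

The outer sum over $t$ is what makes the comparison temporal: two frames score highly only if their neighbors in time also match.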

16 Frame-to-frame Similarity: we start from the inner term, which is the frame-to-frame similarity function, where i indexes the frames of sequence A and j indexes the frames of sequence B.
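Assuming each frame's descriptor is stored as a 4-channel array, the frame-to-frame similarity for every pair (i, j) reduces to a dot product of flattened descriptors, so the whole matrix is a single matrix product. The array layout and the function name below are assumptions made for this sketch.

```python
import numpy as np

def frame_similarity_matrix(desc_a, desc_b):
    """desc_a: (n_a, 4, H, W) descriptors of sequence A.
    desc_b: (n_b, 4, H, W) descriptors of sequence B.
    Returns an (n_a, n_b) matrix whose entry (i, j) is the sum over
    channels and pixels of a_i * b_j."""
    a = desc_a.reshape(len(desc_a), -1)
    b = desc_b.reshape(len(desc_b), -1)
    return a @ b.T
```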

17 Frame-to-frame Similarity: [Figure: the frame-to-frame similarity matrix, with the frames a1 to a4 of sequence A on one axis and the frames b1 to b4 of sequence B on the other.] Similar motions will appear as diagonals. [Figure: the motion-to-motion similarity matrix.]

18 Motion-to-motion Similarity: similar motion patterns will appear as diagonals, or slanted diagonals. In order to examine the diagonals, we convolve the FF-similarity matrix with a diagonal kernel. [Figures: a typical FF-similarity matrix for running, and the resulting MM-similarity matrix, with axes i and j.]
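A minimal sketch of the diagonal convolution, assuming the simplest possible kernel: an unweighted identity-like kernel of width `2 * half_window + 1`. It handles only the main diagonal direction; the slanted diagonals mentioned on the slide (motions at different rates) would need additional kernels and are not covered here. The function name and parameter are assumptions.

```python
import numpy as np

def motion_similarity(ff, half_window=2):
    """Sum the frame-to-frame matrix along the diagonal direction:
    mm[i, j] = sum over t in [-half_window, half_window] of ff[i + t, j + t],
    treating entries outside the matrix as zero."""
    n_a, n_b = ff.shape
    w = half_window
    padded = np.pad(ff.astype(float), w)  # zero padding on all sides
    mm = np.zeros((n_a, n_b))
    for t in range(-w, w + 1):
        mm += padded[w + t:w + t + n_a, w + t:w + t + n_b]
    return mm
```

A run of high values along a diagonal of `ff` adds up in `mm`, while an isolated high value or an anti-diagonal run does not, which is what makes the result a motion-level rather than frame-level similarity.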

19 Classifying actions: each motion in a learning sequence has a label. Each row in an MM-similarity matrix represents a frame in a novel sequence. Construct the MM-similarity matrix for the novel sequence, look at the row corresponding to the current frame, and assign a label to the current frame according to a majority vote. [Figure labels: Current Frame; Label 1, Label 2, Label 3, Label 4.]
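The voting step can be sketched as a k-nearest-neighbor vote over one row of the MM-similarity matrix. Restricting the vote to the `k` most similar labeled frames is an assumption of this sketch; the slide only says "majority vote".

```python
import numpy as np
from collections import Counter

def classify_frames(mm, labels, k=3):
    """mm: (n_novel, n_train) motion-to-motion similarities.
    labels: one action label per training frame.
    Returns one label per novel frame."""
    result = []
    for row in mm:
        nearest = np.argsort(row)[::-1][:k]  # indices of the k best matches
        votes = Counter(labels[j] for j in nearest)
        result.append(votes.most_common(1)[0][0])
    return result
```

For example, a row whose three best matches are labeled run, run, walk is classified as run.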

20 Classification Examples

21 Skeleton Transfer: hand-mark the 2D database with joint locations; perform the classification on the sequences, and assign a skeleton to the novel sequence.

22 3D Motion Classification: render synthetic 2D images of a stick figure; perform classification of a 3D motion; it has many ambiguities.

23 Action Synthesis: we can use the visual quality of motion descriptors to generate actions; collect a large database of actions of a specific person (Charlie Chaplin); generate any action, based on the database.

24 “Do As I Do” Synthesis: we build a sequence S by picking frames from a given target sequence T according to a driver sequence D. S must match the sequence D (in terms of motion descriptors) and appear smooth and natural. We will need the MM-similarity matrix between D and T, and the similarity-in-appearance matrix (frame-to-frame normalized correlation) between all the frames in T.

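25 “Do As I Do” Synthesis – cont. Now we maximize a cost function with two parts: a match-to-driver term and a smoothness term. The smoothness term compares each frame chosen for S with the frame that follows the previously chosen frame in T. [Figure: Sequence T and Sequence S.]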
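One way to maximize a cost of this shape is dynamic programming over the choice of target frame for each driver frame. The sketch below is an illustration under assumptions, not the authors' implementation: the weights `w_motion` and `w_appear`, the function name, and the rule that the last frame of T cannot be followed by anything are all choices made here.

```python
import numpy as np

def synthesize(s_motion, s_appear, w_motion=1.0, w_appear=1.0):
    """s_motion: (n_d, n_t) motion similarity between driver and target frames.
    s_appear: (n_t, n_t) appearance similarity between target frames.
    Returns one target-frame index per driver frame."""
    n_d, n_t = s_motion.shape
    # smooth[p, q]: reward for picking q right after p, measured by how much
    # q looks like frame p + 1, the frame that really follows p in T.
    smooth = np.full((n_t, n_t), -np.inf)  # last frame has no successor
    smooth[:-1, :] = w_appear * s_appear[1:, :]
    score = w_motion * s_motion[0]
    back = np.zeros((n_d, n_t), dtype=int)
    for i in range(1, n_d):
        cand = score[:, None] + smooth  # rows: previous pick, cols: current
        back[i] = cand.argmax(axis=0)
        score = cand.max(axis=0) + w_motion * s_motion[i]
    # Trace the best path backwards from the best final frame.
    path = [int(score.argmax())]
    for i in range(n_d - 1, 0, -1):
        path.append(int(back[i, path[-1]]))
    return path[::-1]
```

Raising `w_appear` favors long runs of consecutive target frames (smoother output); raising `w_motion` favors following the driver more closely at the cost of visible jumps.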

26 “Do As I Do” Example

27 “Do As I Say” Synthesis: generate a motion sequence by issuing commands for an action; classify the target sequence T with the descriptors; use the same approach as in the “Do As I Do” algorithm; not a real-time application.

28 “Do As I Say” Example

29 Figure Correction: correct occlusions and background noise. Find k similar frames in the same sequence; the median image will be the estimate for the current frame. Given enough data, the common parts of the found images will be the figure itself.
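A minimal sketch of the median estimate, assuming the frames are already stabilized on the figure and that a frame-to-frame (or motion-to-motion) similarity matrix for the sequence is available. Excluding the current frame from its own neighbor set is an assumption of this sketch, as are the names and the default `k`.

```python
import numpy as np

def correct_frame(frames, similarity, index, k=5):
    """frames: (n, H, W) stabilized frames of one sequence.
    similarity: (n, n) similarity between the frames of that sequence.
    Returns the pixel-wise median of the k frames most similar to `index`."""
    order = np.argsort(similarity[index])[::-1]
    neighbours = [int(j) for j in order if j != index][:k]
    return np.median(frames[neighbours], axis=0)
```

Because the median is taken per pixel, an occluder or background patch that appears in fewer than half of the neighbors is voted out, while the figure, which is common to all of them, remains.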

30 Disadvantages: high complexity; scale-sensitive; unable to recognize motions performed at different speeds.

31 Video Examples That’s all folks…

32 Recognizing and Tracking Human Action (Preview) Josephine Sullivan and Stefan Carlsson Numerical Analysis and Computing Science, Royal Institute of Technology, Stockholm, Sweden Presented by Pundik Dmitry Computer Vision Seminar, IDC

33 Shape Correspondence: every point on the shape (contour) has a location and a tangent direction; we assume correspondence and a smooth transformation between frames; every set of four points creates a unique complex, which can help us build correspondence between points.

34 Topological Type: 1. point order; 2. line direction order; 3. relative intersection of the lines and the points.

35 Unique Correspondence: by considering every set of four points on a shape, we will detect the unique correspondence.

36 More In The Paper… Frame distance function; key-frame-based action recognition; tracking by point transfer; body joint locations.

