
Recognizing Action at a Distance Alexei A. Efros, Alexander C. Berg, Greg Mori, Jitendra Malik Computer Science Division, UC Berkeley Presented by Pundik Dmitry Computer Vision Seminar, IDC

Motivation: recognizing human actions in a motion sequence; objects in the medium field, about 30 px; the objects are noisy; different view angles; non-periodic actions; moving camera; not appearance-based.

Additional Applications: classification of actions; action synthesis (“Do as I do”, “Do as I say”); action database (images, skeletons); figure correction.

Previous Work: large-scale objects, body part recognition; periodic motion; stationary cameras, background subtraction; spatio-temporal gradients for video event recognition (high resolution, different motion classes).

Recognition Method: (1) track the person (simple normalized-correlation-based tracker, user-initialized); (2) stabilize the figure’s center throughout the sequence; (3) calculate spatio-temporal motion descriptors for each frame; (4) measure motion similarities between sequences.
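The tracking step can be illustrated with a minimal sketch of a normalized correlation score and a brute-force search over window positions. The names `normalized_correlation` and `track`, the zero-mean variant of the score, and the exhaustive search are assumptions made for illustration; the slides only say that a simple, user-initialized normalized correlation tracker is used.

```python
import numpy as np

def normalized_correlation(patch, template):
    """Zero-mean normalized correlation of two equally sized gray patches."""
    p = patch.astype(float) - patch.mean()
    t = template.astype(float) - template.mean()
    denom = np.sqrt((p * p).sum() * (t * t).sum())
    # A flat patch has no structure to correlate with; score it as 0.
    return float((p * t).sum() / denom) if denom > 0 else 0.0

def track(frame, template):
    """Return the (row, col) of the window in `frame` that best matches `template`."""
    h, w = template.shape
    best_score, best_pos = -np.inf, (0, 0)
    for r in range(frame.shape[0] - h + 1):
        for c in range(frame.shape[1] - w + 1):
            score = normalized_correlation(frame[r:r + h, c:c + w], template)
            if score > best_score:
                best_score, best_pos = score, (r, c)
    return best_pos
```

In practice the search would likely be restricted to a neighborhood of the previous position, with the first template supplied by the user.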

Motion Descriptors: Use actual pixel values (appearance)? Use spatial gradients? Use temporal gradients? Edges? Pixel-wise optical flow: encodes motion; least affected by appearance; …but noisy.

Optical Flow Overview: pixel-wise stabilization of the video sequence, using the Lucas-Kanade registration method. The images are taken from http://www.apple.com/shake/imageprocessing.html

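Optical Flow Overview – cont. For each pixel we have: intensity $I(x, y, t)$; velocity at each time point $(u, v) = (dx/dt,\ dy/dt)$. Assuming small motion (1), a first-order Taylor approximation gives: $I(x + dx,\ y + dy,\ t + dt) \approx I(x, y, t) + I_x\,dx + I_y\,dy + I_t\,dt$, where $I_x$, $I_y$, $I_t$ are the partial derivatives of $I$.

Optical Flow Overview – cont. Assuming the intensity of the moving pixel remains the same (2): $I(x + dx,\ y + dy,\ t + dt) = I(x, y, t)$. Therefore, combining (1) and (2) and dividing by $dt$: $I_x u + I_y v + I_t = 0$.

Optical Flow Overview – cont. For all the pixels $p_1, \dots, p_n$ in a small block, assumed to share one velocity $(u, v)$: $A\,(u, v)^T = b$, where row $i$ of $A$ is $(I_x(p_i),\ I_y(p_i))$ and $b_i = -I_t(p_i)$.

Optical Flow Overview – cont. Solving the equation in the least-squares sense: $(u, v)^T = (A^T A)^{-1} A^T b$.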
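The per-block solve can be sketched in a few lines. This is a minimal illustration, assuming gray-level blocks given as NumPy arrays, finite-difference derivatives from `np.gradient`, and a single velocity shared by the whole block; the function name `block_flow` is hypothetical and not taken from the slides.

```python
import numpy as np

def block_flow(prev, curr):
    """Estimate one (u, v) velocity for a small gray-level block by least squares."""
    prev = prev.astype(float)
    curr = curr.astype(float)
    # np.gradient returns the derivative along rows (y) first, then columns (x).
    Iy, Ix = np.gradient(prev)
    It = curr - prev                      # temporal derivative over one frame step
    # One equation Ix*u + Iy*v = -It per pixel in the block.
    A = np.stack([Ix.ravel(), Iy.ravel()], axis=1)
    b = -It.ravel()
    (u, v), *_ = np.linalg.lstsq(A, b, rcond=None)
    return u, v
```

If the block has gradients in only one direction, the system is rank deficient and `np.linalg.lstsq` returns the minimum-norm solution, which is one reason the resulting flow is noisy in flat or edge-only regions.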

Optical Flow Overview – cont. We now have a motion vector for each block. The images are taken from http://www.cs.otago.ac.nz/research/vision/Research/OpticalFlow/opticalflow.html

Back to Motion Descriptors: the optical flow results are noisy, so we would like to blur them. (Figure: the flow components Fx and Fy.)

From Optical Flow to Descriptors: split the x and y components of the motion vectors V(X,Y) into positive and negative channels; apply Gaussian blurring to the four channels and normalize them.
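A minimal sketch of this step, assuming the flow is given as two arrays `fx` and `fy` that are at least as wide and tall as the blur kernel. The blur width `sigma`, the zero padding at the borders, and the joint L2 normalization are assumptions; the slide does not specify how the channels are normalized.

```python
import numpy as np

def gaussian_kernel(sigma):
    """1-D Gaussian kernel truncated at three standard deviations, summing to 1."""
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x**2 / (2.0 * sigma**2))
    return k / k.sum()

def blur(channel, sigma):
    """Separable Gaussian blur built from 1-D convolutions (zero padding at borders)."""
    k = gaussian_kernel(sigma)
    rows = np.apply_along_axis(lambda r: np.convolve(r, k, mode="same"), 1, channel)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="same"), 0, rows)

def motion_descriptor(fx, fy, sigma=1.5):
    """Split flow into four non-negative channels, blur each, and normalize jointly."""
    channels = np.stack([
        np.maximum(fx, 0), np.maximum(-fx, 0),   # positive and negative x motion
        np.maximum(fy, 0), np.maximum(-fy, 0),   # positive and negative y motion
    ]).astype(float)
    blurred = np.stack([blur(c, sigma) for c in channels])
    norm = np.linalg.norm(blurred)
    return blurred / norm if norm > 0 else blurred
```

Splitting before blurring matters: blurring the signed flow directly would let opposite motions cancel, whereas the rectified channels keep them apart.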

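Comparing Descriptors: in order to compare motions, we need to compare frames of two different sequences. The descriptors of all frames are compared using spatio-temporal correlation: $S(i, j) = \sum_{t \in T} \sum_{c=1}^{4} \sum_{x, y \in I} a_c^{i+t}(x, y)\, b_c^{j+t}(x, y)$, where $a_c^i$ is descriptor number $c$ of frame $i$ in sequence A, $b_c^j$ is descriptor number $c$ of frame $j$ in sequence B, $T$ is the temporal window and $I$ is the spatial extent of the descriptor. The inner sums over $c$ and $(x, y)$ are the frame-to-frame similarity.

Frame-to-frame Similarity: we start from the inner term, $\sum_{c=1}^{4} \sum_{x, y \in I} a_c^{i}(x, y)\, b_c^{j}(x, y)$. This is the frame-to-frame similarity function, where $i$ indexes the frames of sequence A and $j$ indexes the frames of sequence B.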
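The frame-to-frame similarity for every pair of frames reduces to one matrix product once each frame's four channels are flattened into a single vector. A minimal sketch, assuming the descriptors are stored as arrays of shape (frames, 4, H, W); the function name is illustrative.

```python
import numpy as np

def frame_similarity(desc_a, desc_b):
    """Correlate every frame of A with every frame of B.

    desc_a: (n_a, 4, H, W) descriptors of sequence A
    desc_b: (n_b, 4, H, W) descriptors of sequence B
    Returns an (n_a, n_b) matrix whose entry (i, j) is the sum over channels
    and pixels of a_c^i * b_c^j.
    """
    a = desc_a.reshape(desc_a.shape[0], -1)
    b = desc_b.reshape(desc_b.shape[0], -1)
    return a @ b.T
```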

Frame-to-frame Similarity: the frame-to-frame similarity matrix has one axis for the frames of sequence A (a1, a2, a3, a4) and the other for the frames of sequence B (b1, b2, b3, b4). Similar motions will appear as diagonals. (Figures: the frame-to-frame similarity matrix and the motion-to-motion similarity matrix.)

Motion-to-motion Similarity: similar motion patterns appear as diagonals, or slanted diagonals. In order to examine the diagonals, we convolve the FF-similarity matrix with a diagonal kernel. (Figures: a typical FF-similarity matrix for running and the resulting MM-similarity matrix, with axes i and j.)
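Convolving with a diagonal kernel amounts to summing the FF-similarity matrix along short diagonal segments. The sketch below uses a uniform, unslanted kernel of half-width `half_window`; both the uniform weights and the window size are assumptions, and handling slanted diagonals would need additional kernels.

```python
import numpy as np

def motion_similarity(ff, half_window=2):
    """Sum ff[i + t, j + t] over t in [-half_window, half_window].

    Entries that fall outside the matrix are treated as zero.
    """
    n, m = ff.shape
    mm = np.zeros_like(ff, dtype=float)
    for t in range(-half_window, half_window + 1):
        if abs(t) >= min(n, m):
            continue  # the shifted matrix no longer overlaps the original
        # Rows and columns where both (i, j) and (i + t, j + t) are valid.
        i0, i1 = max(0, -t), min(n, n - t)
        j0, j1 = max(0, -t), min(m, m - t)
        mm[i0:i1, j0:j1] += ff[i0 + t:i1 + t, j0 + t:j1 + t]
    return mm
```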

Classifying actions: each motion in a learning sequence has a label, and each row in an MM-similarity matrix represents a frame in a novel sequence. Construct the MM-similarity matrix for the novel sequence, look at the row corresponding to the current frame, and assign a label to that frame according to a majority vote. (Figure: the current frame’s row, with its matches marked Label 1 to Label 4.)
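The majority vote for one frame can be sketched as follows. Restricting the vote to the `k` most similar database frames is an assumption (the slide only says "majority vote"), and `classify_frame` is a hypothetical name.

```python
from collections import Counter
import numpy as np

def classify_frame(mm_row, labels, k=5):
    """Label one novel frame by majority vote among its k best matches.

    mm_row: similarities between the novel frame and every database frame
    labels: one action label per database frame
    """
    top = np.argsort(mm_row)[::-1][:k]          # indices of the k largest scores
    votes = Counter(labels[i] for i in top)
    return votes.most_common(1)[0][0]
```

For example, with similarities `[0.1, 0.9, 0.8, 0.2, 0.7, 0.3]`, labels `["walk", "run", "run", "walk", "walk", "walk"]` and `k=3`, the three best matches vote run, run, walk, so the frame is labeled "run".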

Classification Examples

Skeleton Transfer: hand-mark the 2D database with joint locations; perform the classification on the sequences, and match the novel sequence to a skeleton.

3D Motion Classification: render synthetic 2D images of a stick figure; perform classification of a 3D motion; this has many ambiguities.

Action Synthesis: we can use the visual quality of the motion descriptors to generate actions. Collect a large database of actions of a specific person (Charlie Chaplin); generate any action, based on the database.

“Do As I Do” Synthesis: we build a sequence S by picking frames from a given target sequence T according to a driver sequence D. S must: match the sequence D (in terms of motion descriptors); appear smooth and natural. We will need: the MM-similarity matrix between D and T; the similarity-in-appearance matrix (frame-to-frame normalized correlation) between all the frames in T.

“Do As I Do” Synthesis – cont. We maximize a cost function with two terms: a match-to-driver term and a smoothness term. The smoothness term refers to the frame that follows the previously chosen frame in T. (Figure: sequence T and sequence S.)
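One way to maximize a cost of this kind is Viterbi-style dynamic programming over the target frames. The sketch below is an illustration under stated assumptions, not the presenters' formula: the cost is taken to be a weighted sum (weights `alpha` and `beta` are hypothetical), the smoothness of a transition from target frame p to target frame q is read from the appearance similarity between q and the frame after p, and the last frame of T is treated as its own successor.

```python
import numpy as np

def synthesize(match, appearance, alpha=1.0, beta=1.0):
    """Pick one target frame per driver frame.

    match:      (n_d, n_t) motion similarity between driver and target frames
    appearance: (n_t, n_t) appearance similarity between target frames
    Returns a list of n_d target frame indices.
    """
    n_d, n_t = match.shape
    # smooth[p, q]: how closely frame q resembles the frame that follows p in T.
    succ = np.minimum(np.arange(n_t) + 1, n_t - 1)
    smooth = appearance[succ, :]
    score = alpha * match[0].astype(float)
    back = np.zeros((n_d, n_t), dtype=int)
    for i in range(1, n_d):
        total = score[:, None] + beta * smooth   # total[p, q] for previous p, next q
        back[i] = total.argmax(axis=0)           # best predecessor for each q
        score = total.max(axis=0) + alpha * match[i]
    # Trace the best path backwards from the best final frame.
    path = [int(score.argmax())]
    for i in range(n_d - 1, 0, -1):
        path.append(int(back[i, path[-1]]))
    return path[::-1]
```

The table costs on the order of n_d times n_t squared operations, which is consistent with the method not being suited to real-time use.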

“Do As I Do” Example

“Do As I Say” Synthesis: generate a motion sequence by issuing commands for an action; classify the target sequence T with the descriptors; use the same approach as in the “Do as I do” algorithm; not a real-time application.

“Do As I Say” Example

Figure Correction: correct occlusions and background noise. Find k similar frames in the same sequence; the median image will be the estimate for the current frame. Given enough data, the common parts of the found images will be the figure itself.
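The median estimate can be sketched directly. This assumes the stabilized figure images are stacked in one array and that a row of similarities to the current frame is already available (for example from the MM-similarity matrix of the sequence with itself); the name `correct_frame` and the default `k` are illustrative.

```python
import numpy as np

def correct_frame(frames, similarity_row, k=5):
    """Estimate a clean version of the current frame.

    frames:         (n, H, W) stabilized figure images from one sequence
    similarity_row: similarity of the current frame to each of the n frames
    Returns the per-pixel median of the k most similar frames.
    """
    nearest = np.argsort(similarity_row)[::-1][:k]
    return np.median(frames[nearest], axis=0)
```

A pixel that is occluded or cluttered in only a minority of the k frames is replaced by the value the majority agree on, which is why the common parts converge to the figure.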

Disadvantages: high complexity; scale-sensitive; unable to recognize motions performed at different speeds.

Video Examples That’s all folks…

Recognizing and Tracking Human Action (Preview) Josephine Sullivan and Stefan Carlsson Numerical Analysis and Computing Science, Royal Institute of Technology, Stockholm, Sweden Presented by Pundik Dmitry Computer Vision Seminar, IDC

Shape Correspondence: every point on the shape (contour) has a location and a tangent direction. We assume correspondence and a smooth transformation between frames. Every four points form a unique complex, which can help us build a correspondence between points.

Topological Type: 1. point order; 2. line direction order; 3. relative intersection of the lines and the points.

Unique Correspondence: by choosing every four points on a shape, we will detect the unique correspondence.

More In The Paper…: frame distance function; key-frame-based action recognition; tracking by point transfer; body joint locations.
