Multiple Frame Motion Inference Using Belief Propagation. Jiang Gao, Jianbo Shi. Presented by: Gilad Kapelushnik. Visual Recognition, Spring 2005, Technion.

Presentation transcript:

Abstract
Find the "best fit" upper body joint configuration. The input is a 2D video. Each joint is described by its location on a 2D grid: S1(X,Y), S2(X,Y), S3(X,Y), S4(X,Y), S5(X,Y), S6(X,Y).
Let J be a joint configuration: J = {S1,S2,S3,S4,S5,S6}. We would like to find the configuration J with the highest probability.

Motion Energy Image
Step 1: Subtract two sequential frames.
Step 2: Apply a threshold.
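The two steps can be sketched in a few lines of NumPy. This is a minimal illustration, not the presenter's implementation: the function name `motion_energy_image` and the default threshold of 25 are assumptions chosen for the example, and the frames are assumed to be grayscale arrays.

```python
import numpy as np

def motion_energy_image(prev_frame, next_frame, threshold=25):
    """Binary motion energy image from two consecutive grayscale frames.

    The threshold value is an illustrative assumption, not taken from the slides.
    """
    # Step 1: subtract the frames. Cast to a signed type first so that
    # uint8 input cannot wrap around during the subtraction.
    diff = np.abs(next_frame.astype(np.int32) - prev_frame.astype(np.int32))
    # Step 2: pixels whose change exceeds the threshold become energy pixels.
    return (diff > threshold).astype(np.uint8)

# Tiny example: a 2x2 block changes brightness between the frames.
prev_frame = np.zeros((4, 4), dtype=np.uint8)
next_frame = prev_frame.copy()
next_frame[1:3, 1:3] = 200
energy = motion_energy_image(prev_frame, next_frame)
```

Only the four changed pixels are marked in `energy`; everything else stays zero.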

From #NrgPixels To Probability
Sum the energy pixels in the patch at a joint position, e.g. S5(10,60) or S6(40,30), and convert the sum to a probability.
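The patch sum is easy to sketch; the conversion to a probability is not, because the transcript does not preserve the slide's formula. In the sketch below, `patch_energy` follows the text, while `energy_to_probability` is a hypothetical stand-in (the fraction of the patch that is moving) and should not be read as the formula used in the talk. The patch half size is also an assumption.

```python
import numpy as np

def patch_energy(energy_image, center, half_size):
    """Count energy pixels in a square patch centered on (x, y), clipped to the image."""
    x, y = center
    height, width = energy_image.shape
    top, bottom = max(y - half_size, 0), min(y + half_size + 1, height)
    left, right = max(x - half_size, 0), min(x + half_size + 1, width)
    return int(energy_image[top:bottom, left:right].sum())

def energy_to_probability(count, half_size):
    """Hypothetical conversion (an assumption): fraction of the full patch that moved."""
    patch_area = (2 * half_size + 1) ** 2
    return count / patch_area

# Example: a 3x3 block of energy pixels in a 10x10 binary energy image.
energy_image = np.zeros((10, 10), dtype=np.uint8)
energy_image[4:7, 4:7] = 1
count_on_block = patch_energy(energy_image, center=(5, 5), half_size=1)
count_in_corner = patch_energy(energy_image, center=(0, 0), half_size=1)
probability = energy_to_probability(count_on_block, half_size=1)
```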

Main Idea
Find the configuration J with the highest probability. Computing all possible probabilities is inefficient. A priori data gives better and faster results: removing impossible configurations reduces inference time.

A Priori Data
A probability table is kept for each joint pair P(Sx,Sy). Probabilities are computed at the grid crossings; the rest of the image uses the nearest neighbor.
Example: for the right arm, P(S2,S3). Red: low probability. Green: high probability.
Table: P(S2,S3), with rows and columns indexed 1, 2, …, Ns^2.
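A nearest-neighbor lookup into such a table might look like the sketch below. It assumes Ns evenly spaced grid crossings per image axis (so Ns^2 crossings per joint and an Ns^2 x Ns^2 table per joint pair), 0-based indices, and hypothetical function names; the actual grid layout is not given in the transcript.

```python
import numpy as np

def nearest_grid_index(point, image_size, ns):
    """Index (0 .. ns*ns - 1) of the grid crossing nearest to a pixel position.

    Assumes crossings are evenly spaced from 0 to width-1 and from 0 to height-1.
    """
    x, y = point
    width, height = image_size
    col = int(round(x / (width - 1) * (ns - 1)))
    row = int(round(y / (height - 1) * (ns - 1)))
    return row * ns + col

def pairwise_prior(table, point_a, point_b, image_size, ns):
    """Look up P(Sa, Sb) for two pixel positions via their nearest grid crossings."""
    index_a = nearest_grid_index(point_a, image_size, ns)
    index_b = nearest_grid_index(point_b, image_size, ns)
    return table[index_a, index_b]

# Example: a 3x3 grid on a 101x101 image, so crossings sit at 0, 50 and 100.
ns = 3
image_size = (101, 101)
table = np.zeros((ns * ns, ns * ns))
table[1, 8] = 0.7  # illustrative entry, not real training data
value = pairwise_prior(table, (48, 3), (100, 100), image_size, ns)
```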

Detect Candidate States (1)
Even using BP there are too many possible states to go through.
The face is detected using a face detection algorithm.
Initial assumption of the shoulders from face and pose.
Candidates for elbows from the shoulders and the energy map.
Candidates for wrists from a skin color model.

Detect Candidate States (2)
Many states can be discarded: remove candidate states that are close to each other.
Pros: much faster inference. Cons: less accurate. Note: this is only an option.
Figure legend: pink for right wrist, red for left wrist, blue for elbow. Wrist candidates fit the skin color and wrist location.
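One simple way to remove close candidates is greedy pruning by distance. The slides do not say how the pruning was done, so the strategy, the function name `prune_close_candidates` and the `min_distance` parameter below are all assumptions.

```python
import math

def prune_close_candidates(candidates, min_distance):
    """Keep a candidate only if it is at least min_distance away from every kept one.

    Greedy, order-dependent pruning; an illustrative choice, not the talk's method.
    """
    kept = []
    for point in candidates:
        if all(math.dist(point, other) >= min_distance for other in kept):
            kept.append(point)
    return kept

# Example: two tight clusters and one isolated candidate.
candidates = [(10, 10), (11, 10), (30, 30), (31, 31), (10, 40)]
pruned = prune_close_candidates(candidates, min_distance=5)
```

Fewer candidates per joint means smaller message vectors and smaller pairwise tables, which is where the speedup comes from; the cost is that the true joint position may be represented only by a nearby candidate.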

The Markov Model
Empty circles: states (2D positions of joints).
Full circles: observations (computed from the energy map).
Each state corresponds to an observation.

Belief Propagation (1)
Solve the inference problem using an algorithm with linear complexity. Each joint (shoulder, elbow, wrist) has a vector with a probability for each candidate.

Belief Propagation (2)
Messages in the graph: m12, m21, m23, m32, m14, m41.
For each iteration, each node sends a message to its neighbor nodes containing the "wanted" probability (for each state). The message from i to j is computed from the following terms:
A normalization variable.
A sum over all candidates.
The a priori data for each state.
The observation (# of energy pixels in patch) for each state, converted to a probability.
The messages from k to i (all messages from the neighbors). Each is actually a vector with a probability for each state.
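The terms listed above match the standard sum-product message update, so a single message can be sketched as follows. The assumption here is that the slide's equation has that standard form: for each state of j, sum over the candidates of i the product of the a priori table entry, the observation probability of i, and the incoming messages to i from its neighbors other than j, then normalize. The function and argument names are hypothetical.

```python
import numpy as np

def send_message(prior_ij, observation_i, incoming_to_i):
    """Sum-product message from node i to node j.

    prior_ij:       array (n_i, n_j), a priori compatibility of each state pair
    observation_i:  array (n_i,), observation probability of each candidate of i
    incoming_to_i:  list of arrays (n_i,), messages from neighbors k of i, k != j
    """
    weight = observation_i.copy()
    for message in incoming_to_i:
        weight = weight * message
    # Sum over all candidates of i for every state of j.
    outgoing = prior_ij.T @ weight
    # Normalize so the message is a probability vector over the states of j.
    return outgoing / outgoing.sum()

# Example: node 1 has 4 candidate states, node 2 has 2.
prior_12 = np.array([[1.0, 0.0],
                     [1.0, 0.0],
                     [0.0, 1.0],
                     [0.0, 1.0]])
observation_1 = np.array([0.1, 0.2, 0.3, 0.4])
m12 = send_message(prior_12, observation_1, incoming_to_i=[])
```

The result is a vector with one probability per state of the receiving node, which is what each arrow m12, m21, ... in the graph carries.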

Belief Propagation (3) - Example
Message from node 1 to node 2; the figure shows one node with 4 states and one with 2 states.

Belief Propagation (4)
BP converges after 2-4 iterations (given the right a priori data). For every joint there is a probability vector with an entry for each candidate state.
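A compact sketch of the whole loop on a three-joint chain (shoulder, elbow, wrist) shows how the per-joint probability vectors come out. It assumes the standard synchronous sum-product schedule; the numbers are made up for illustration and `run_bp` is a hypothetical name.

```python
import numpy as np

def run_bp(observations, priors, n_iters=4):
    """Synchronous sum-product BP; returns a normalized belief vector per node.

    observations: {node: array (n_node,)}
    priors:       {(i, j): array (n_i, n_j)}, one entry per undirected edge
    """
    # Store each edge in both directions so that m_ij and m_ji both exist.
    pairwise = {}
    for (i, j), table in priors.items():
        pairwise[(i, j)] = table
        pairwise[(j, i)] = table.T
    # Start from uniform messages.
    messages = {(i, j): np.full(len(observations[j]), 1.0 / len(observations[j]))
                for (i, j) in pairwise}
    for _ in range(n_iters):
        updated = {}
        for (i, j), table in pairwise.items():
            weight = observations[i].copy()
            for (k, target) in pairwise:
                if target == i and k != j:
                    weight = weight * messages[(k, i)]
            outgoing = table.T @ weight
            updated[(i, j)] = outgoing / outgoing.sum()
        messages = updated
    # Belief: observation times all incoming messages, normalized.
    beliefs = {}
    for node, observation in observations.items():
        belief = observation.copy()
        for (k, target) in messages:
            if target == node:
                belief = belief * messages[(k, node)]
        beliefs[node] = belief / belief.sum()
    return beliefs

# Illustrative chain with 2 candidates per joint.
observations = {
    "shoulder": np.array([0.5, 0.5]),
    "elbow": np.array([0.7, 0.3]),
    "wrist": np.array([0.2, 0.8]),
}
priors = {
    ("shoulder", "elbow"): np.array([[0.9, 0.1], [0.2, 0.8]]),
    ("elbow", "wrist"): np.array([[0.6, 0.4], [0.3, 0.7]]),
}
beliefs = run_bp(observations, priors)
best_states = {node: int(np.argmax(belief)) for node, belief in beliefs.items()}
```

On a loop-free chain like this one the beliefs equal the exact marginals once messages have crossed the graph, which is consistent with convergence in a handful of iterations.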

Multiple Frame Probability
Multiple frame (8) is proposed for smoother transitions between configurations. It prevents a joint from changing its state to a different one which is "far away" (Euclidean distance). Though BP was designed to work with loop-free models, the authors stated that it worked fine.
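The transcript does not preserve the multiple frame formula, so the following is only an illustration of the idea of penalizing far-away jumps. The Gaussian falloff, the `sigma` value and the function name are assumptions, not the expression used by the authors.

```python
import math

def temporal_compatibility(position_prev, position_next, sigma=15.0):
    """Hypothetical temporal term: close to 1 for small moves, near 0 for far jumps.

    Uses a Gaussian of the Euclidean distance between a joint's positions
    in two consecutive frames. Both the form and sigma are assumptions.
    """
    distance = math.dist(position_prev, position_next)
    return math.exp(-distance ** 2 / (2 * sigma ** 2))

stay = temporal_compatibility((40, 30), (40, 30))
small_move = temporal_compatibility((40, 30), (43, 34))
far_jump = temporal_compatibility((40, 30), (100, 90))
```

A term like this would act as the pairwise table on edges that link the same joint across frames, which is also what introduces loops into the graph.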

2D to 3D
2D -> 3D conversion follows Taylor (2000). Assuming (u1,v1) and (u2,v2) are the projections of the two end points of a body segment, the relative depth between them can be retrieved.
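The slide's formula is not in the transcript. The relation usually associated with Taylor's scaled orthographic model is dZ^2 = l^2 - ((u1 - u2)^2 + (v1 - v2)^2) / s^2, where l is the known segment length and s the projection scale; the sketch below assumes that form. The function and parameter names are hypothetical.

```python
import math

def relative_depth(p1, p2, segment_length, scale):
    """Magnitude of the depth difference between the two end points of a segment.

    Assumes scaled orthographic projection: (u, v) = scale * (X, Y).
    The sign of the depth difference is not determined by this relation.
    """
    (u1, v1), (u2, v2) = p1, p2
    projected_sq = ((u1 - u2) ** 2 + (v1 - v2) ** 2) / scale ** 2
    depth_sq = segment_length ** 2 - projected_sq
    if depth_sq < 0:
        raise ValueError("projected segment is longer than the segment itself; "
                         "the scale or the segment length is inconsistent")
    return math.sqrt(depth_sq)

# Example: a segment of length 5 whose projection has length 3 at scale 1.
dz = relative_depth((0, 0), (3, 0), segment_length=5, scale=1.0)
```

Each segment yields two possible depth orderings (end point nearer or farther), so additional constraints are needed to choose between them.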

Results(1)

Results(2)

Results(3)
Errors occur when 2 joints intersect each other. On some occasions, even when limbs intersect, it was possible to infer correctly.

Q?