Wonwoo Lee, Youngmin Park, Vincent Lepetit, Woontack Woo, IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 21, NO. 10, OCTOBER 2011.

Outline:
- Introduction
- Online Target Learning
- Detection and Tracking
- Experimental Results
- Conclusion

Objective: augment a real-world scene with minimal user intervention on a mobile phone ("anywhere augmentation").
Considerations:
- Avoid reconstruction of the 3D scene
- Perspective patch recognition
- Mobile phone processing power
- Mobile phone accelerometers
- Mobile phone Bluetooth connectivity

The proposed method follows a standard procedure of target learning and detection: input image → online learning → real-time detection.

Input: an image of the target plane.
Output: patch data and camera poses.
Assumptions: known camera parameters; the target lies on a horizontal or vertical surface.
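As a concrete picture of what the learning stage produces, here is a minimal sketch; the type and field names are my own illustration, not the authors' data layout:

```python
from dataclasses import dataclass
from typing import List
import numpy as np

@dataclass
class LearnedTarget:
    """Output of online learning for one planar target (illustrative sketch)."""
    K: np.ndarray                # known 3x3 camera intrinsics (per the slides' assumption)
    templates: List[np.ndarray]  # normalized 32x32 template patches, one per sampled view
    poses: List[np.ndarray]      # camera pose associated with each template
```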

Pipeline: Input Image → Frontal View Generation → Blurred Patch Generation → Post-processing

We need a frontal view to create the patch data and their associated poses. (Figure: targets whose frontal views are available.)

However, frontal views are not always available in the real world. (Figure: targets whose frontal views are NOT available.)

Objective: generate a fronto-parallel view image from the input image. Approach: exploit the phone's built-in accelerometer. Assumption: the patch lies on a horizontal or vertical surface.

The orientation of a target (horizontal or vertical) is suggested based on the current pose of the phone, estimated from the gravity vector G detected by the accelerometer: when the viewing direction is within ±π/4 of parallel to the ground, a vertical target is suggested; otherwise a horizontal one.
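A minimal sketch of that decision rule, assuming the gravity vector is given in camera coordinates and the camera looks along -z (both assumptions of mine, not spelled out in the slides):

```python
import numpy as np

def suggest_orientation(gravity, view_dir=np.array([0.0, 0.0, -1.0])):
    """Suggest whether the target surface is horizontal or vertical.

    gravity:  3-vector from the accelerometer, in camera coordinates (assumption).
    view_dir: camera viewing direction; -z is a common convention, not from the paper.
    """
    g = gravity / np.linalg.norm(gravity)
    v = view_dir / np.linalg.norm(view_dir)
    # Angle between the viewing direction and gravity, folded into [0, pi/2].
    angle = np.arccos(np.clip(abs(np.dot(g, v)), 0.0, 1.0))
    # Looking along gravity (angle < pi/4): pointing at the ground/table -> horizontal.
    # Looking roughly parallel to the ground: pointing at a wall -> vertical.
    return "horizontal" if angle < np.pi / 4 else "vertical"
```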

Under the 1-degree-of-freedom assumption: frontal view camera [I | 0]; captured view camera [R | t], with t = -Rc for camera center c. A warping function maps the input image to the virtual frontal view [12]. [12] R. Hartley and A. Zisserman, Multiple View Geometry in Computer Vision. Cambridge, U.K.: Cambridge Univ. Press, 2000.
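The slides do not spell the warp out, but for two views of a plane it is the standard plane-induced homography from [12]; as a sketch, with K the known intrinsics, n the plane normal fixed by the horizontal/vertical assumption, and d the distance to the plane:

```latex
H \;=\; K \left( R \;-\; \frac{t\, n^{\top}}{d} \right) K^{-1},
\qquad t = -Rc .
```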

Pipeline: Input Image → Frontal View Generation → Blurred Patch Generation → Post-processing

Objective: learn the appearances of a target surface quickly. Approach: adopt the patch-learning approach of Gepard [6], which achieves real-time learning of a patch on a desktop computer. [6] S. Hinterstoisser, V. Lepetit, S. Benhimane, P. Fua, and N. Navab, Learning real-time perspective patch rectification, Int. J. Comput. Vis., vol. 91, pp. 107–130, Jan. 2011.

Gepard achieves fast patch learning by linearizing image warping with principal component analysis, and uses the mean patch as a patch descriptor. It is difficult to apply directly on a mobile phone platform: the phone CPU is slow, and a large amount of pre-computed data is required (about 90 MB).

Differences from Gepard:
- Remove the need for a fronto-parallel view, using the phone's accelerometer and limiting targets to two plane orientations.
- Skip the feature point detection step; use larger patches for robustness instead.
- Replace how templates are constructed: by blurring instead.
- Add Bluetooth sharing of the AR configuration.

Approach: use a blurred patch instead of a mean patch.

Blurred patches are generated through multi-pass rendering on the GPU; the GPU's parallelism makes the image processing much faster.

1st pass: warping. Render the input patch from a given viewpoint; much faster than on the CPU.

2nd pass: radial blurring of the warped patch, so that the blurred patch covers a range of poses close to the exact pose.

3rd pass: Gaussian blurring of the radially blurred patch, making the blurred patch robust to image noise.
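A minimal CPU sketch of what these three passes compute, using OpenCV in place of GPU shaders. The 128x128 patch size, 10-degree radial blur range, and 11x11 Gaussian kernel come from the slides; approximating the radial blur by averaging small in-plane rotations, and the rotation sampling density, are my assumptions:

```python
import cv2
import numpy as np

def blurred_patch(frontal, H, radial_range_deg=10.0, n_rot=7, ksize=11):
    """CPU approximation of the three GPU passes for one sampled viewpoint.

    frontal: frontal-view image (grayscale float32).
    H:       3x3 homography warping the frontal view into the sampled viewpoint.
    """
    size = (128, 128)                                # patch size from the slides
    warped = cv2.warpPerspective(frontal, H, size)   # pass 1: warping

    # Pass 2: radial blur, approximated by averaging small in-plane rotations
    # around the patch center, so the template covers poses near the exact one.
    center = (size[0] / 2.0, size[1] / 2.0)
    acc = np.zeros(size, np.float32)
    for a in np.linspace(-radial_range_deg, radial_range_deg, n_rot):
        M = cv2.getRotationMatrix2D(center, a, 1.0)
        acc += cv2.warpAffine(warped, M, size)
    radial = acc / n_rot

    # Pass 3: Gaussian blur for robustness to image noise.
    return cv2.GaussianBlur(radial, (ksize, ksize), 0)
```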

Fig. 7. Effectiveness of radial blur. Combining the radial blur and the Gaussian blur outperforms simple Gaussian blurring.

4th pass: accumulation of blurred patches in a texture. This reduces the number of readbacks from GPU memory to CPU memory.
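The gain here comes from reading many patches back in a single transfer instead of one readback per patch. A NumPy stand-in for that tiling (225 views and 128x128 patches are from the slides; the grid layout itself is my assumption):

```python
import numpy as np

def pack_atlas(patches, cols=15):
    """Tile 128x128 patches into one large image, mimicking the texture
    that would be read back from the GPU in a single transfer.
    225 views with cols=15 gives a 15x15 grid."""
    rows = (len(patches) + cols - 1) // cols
    atlas = np.zeros((rows * 128, cols * 128), np.float32)
    for i, p in enumerate(patches):
        r, c = divmod(i, cols)
        atlas[r * 128:(r + 1) * 128, c * 128:(c + 1) * 128] = p
    return atlas
```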

Pipeline: Input Image → Frontal View Generation → Blurred Patch Generation → Post-processing

Post-processing: downsample the blurred patches from 128x128 to 32x32, then normalize them to zero mean and a standard deviation of 1.
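In code, this step is just the following (a sketch; the interpolation choice and the epsilon guard are mine):

```python
import cv2
import numpy as np

def postprocess(patch):
    """Downsample a 128x128 blurred patch to 32x32 and normalize it
    to zero mean and unit standard deviation, per the slides."""
    small = cv2.resize(patch, (32, 32), interpolation=cv2.INTER_AREA)
    small = small.astype(np.float32)
    return (small - small.mean()) / (small.std() + 1e-8)
```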

The user points the camera at the target. A square patch at the center of the image is used for detection.

The initial pose is retrieved by comparing the input patch with the learned mean patches. ESM-Blur [20] is then applied for further pose refinement; NEON instructions are used to speed up the refinement. [20] Y. Park, V. Lepetit, and W. Woo, ESM-blur: Handling and rendering blur in 3D tracking and augmentation, in Proc. Int. Symp. Mixed Augment. Reality, 2009, pp. 163–166.
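Because every stored template is zero-mean and unit-variance after post-processing, the comparison reduces to a dot product per template. A sketch of that initialization, assuming a normalized-cross-correlation-style similarity (the slides do not name the measure):

```python
import numpy as np

def detect_pose(live_patch, templates, poses):
    """Return the pose whose learned template best matches the live patch.

    live_patch: 32x32 center patch from the camera image, normalized
                the same way as the templates.
    templates:  list of 225 normalized 32x32 template patches.
    poses:      camera pose associated with each template.
    """
    v = live_patch.ravel()
    # Dot product of normalized patches = NCC up to a constant factor.
    scores = [float(t.ravel() @ v) for t in templates]
    best = int(np.argmax(scores))
    return poses[best], scores[best]  # refined afterwards with ESM-Blur [20]
```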

Parameters:
- Patch size: 128 x 128
- Number of views used for learning: 225
- Maximum radial blur range: 10 degrees
- Gaussian blur kernel: 11 x 11
- Memory requirement: 900 KB for a target

Experimental setup:
           iPhone 3GS / iPhone 4    PC
CPU        600 MHz / 1 GHz          Intel Quad-Core 2.4 GHz
GPU        PowerVR SGX 535          GeForce 8800 GTX
Renderer   OpenGL ES 2.0            OpenGL 2.0
Video      480x360                  640x480

More views mean more rendering. Radial blur is slow on the mobile phone; shader optimization could improve speed. (Chart: learning times on the PC, iPhone 3GS, and iPhone 4.)

Comparison with Gepard [6]. [6] S. Hinterstoisser, V. Lepetit, S. Benhimane, P. Fua, and N. Navab, Learning real-time perspective patch rectification, Int. J. Comput. Vis., vol. 91, pp. 107–130, Jan. 2011. Fig. 11. Planar targets used for evaluation. (a) Sign-1. (b) Sign-2. (c) Car. (d) Wall. (e) City. (f) Cafe. (g) Book. (h) Grass. (i) MacMini. (j) Board. The patches delimited by the yellow squares are used as reference patches.

Our approach performs slightly worse in terms of recognition rates, but it is better adapted to mobile phones.

The mean-patch comparison takes about 3 ms with 225 views. The speed of pose estimation and tracking with ESM-Blur depends on the accuracy of the initial pose provided by patch detection.

Limitations: weak on repetitive textures and reflective surfaces; currently supports a single target only.

Potential applications:
- AR tagging on the real world
- AR apps anywhere, anytime
Future work:
- More optimization on mobile phones
- Detection of multiple targets at the same time