OverLay: Practical Mobile Augmented Reality


OverLay: Practical Mobile Augmented Reality
Puneet Jain (Duke University/UIUC), Justin Manweiler (IBM Research), Romit Roy Choudhury (UIUC)
Hello everyone, good morning! It is my great pleasure to speak here at XXX about my work. My name is Puneet Jain. Thank you for inviting me and for attending my talk; I look forward to meeting many of you later today.

Idea: Mobile Augmented Reality
Allow arbitrary indoor object tagging; others should be able to retrieve the tags.
Example tags: "Last year's tax statements", "Faulty Monitor", "Wish Mom Birthday", "Return CDs".
Mobile AR refers to one's ability to scan the surroundings via a smartphone's camera and see virtual information associated with an object on the phone's screen. The information could be anything from text annotations and web URLs to audio or video. This vision is certainly not new and has existed for a while; many designers, science-fiction writers, and researchers have imagined applications that could exist if it were realized as a holistic system.

Introduction
Going forward, I will set expectations for Mobile AR. A video demonstration of an AR system helps in understanding our objective better.

Why not a solved problem?
To answer why, we need to look at current-generation approaches. Mobile AR is currently done in two ways: vision/image-based AR and sensing-based AR. Both approaches are necessary, but neither is sufficient.

Vision: Accurate Algorithms are Slow
Feature extraction, then feature matching. Note that accuracy matters most for Mobile AR: unlike Google image search, where any match similar to the given image is acceptable, Mobile AR requires an exact match. Moreover, no two similar-looking objects are interchangeable; one exit sign differs from another exit sign in the same building, since they can indicate different things.

Vision: Offloading + GPU
With a GPU on the cloud and a 100-image DB: extraction ≈ 29 ms, network ≈ 302 ms, matching ≈ 1 s. End to end, that is roughly 1.3 s per frame: matching latency is too high for real-time.

Sensing
Requires user location (not possible indoors), precise orientation, and object location. (Example image: Brunelleschi's dome.) Inaccuracies in sensing quickly derail this, as does the question of how new objects would be added.

Vision vs. Sensing
Vision is accurate but slow; sensing is quick but inaccurate. Indoor location can accelerate vision and is a prerequisite for sensing, but indoor localization is not always available. Clearly there are tradeoffs: accuracy versus latency, sensing versus computer vision, to offload or not to offload. Navigating these tradeoffs is the primary agenda of this talk.

Location-free AR
Natural pauses, turns, and walks indicate spatial relationships between tags. (Diagram: tags A, B, C, D separated by walking times of 5, 7, and 10 seconds and turns of 80° and 110°.) Consider a museum scenario: a user walks through the museum and looks at paintings along the way. A few natural usage patterns emerge, indicating the separation between tags. Sensors can help in building such geometric layouts, and geometry, instead of location, can be used to reduce the computational burden on vision.

Primary Challenge: Matching Latency
Two levers: temporal relationships and rotational relationships.

Temporal Relationships
Timeline: T=0, saw A; T=7, saw B; T=15, saw C; T=21, saw E. Temporal separations can be captured on the cloud as bounds with slack:
T_AB ≤ 7 + E_TAB, T_AB ≥ 7 − E_TAB
T_AC ≤ 15 + E_TAC, T_AC ≥ 15 − E_TAC
E_TAB, E_TAC, T_AB, T_AC ≥ 0
When the phone has been moving long enough to plausibly reach C, C can be prioritized for matching; similarly, when the phone turns away from D, D can be removed from the candidate set.
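As a concrete illustration, bounds of this form make a small linear program. The sketch below is hypothetical, not OverLay's exact formulation: for a single tag pair it recovers a typical separation T from several crowdsourced observations by minimizing total slack, which is an L1 fit and lands on the median. The rotational bounds two slides later fit the same template.

```python
# A minimal LP sketch (assumed formulation, not the paper's) using SciPy.
# Variables: x = [T, E_1, ..., E_n]; minimize sum(E_i)
# subject to T <= obs_i + E_i and T >= obs_i - E_i, all variables >= 0.
import numpy as np
from scipy.optimize import linprog

def typical_separation(observations):
    n = len(observations)
    c = np.concatenate(([0.0], np.ones(n)))        # objective: total slack
    A_ub, b_ub = [], []
    for i, obs in enumerate(observations):
        row = np.zeros(n + 1); row[0] = 1.0; row[1 + i] = -1.0
        A_ub.append(row); b_ub.append(obs)         #  T - E_i <= obs_i
        row = np.zeros(n + 1); row[0] = -1.0; row[1 + i] = -1.0
        A_ub.append(row); b_ub.append(-obs)        # -T - E_i <= -obs_i
    res = linprog(c, A_ub=np.array(A_ub), b_ub=np.array(b_ub),
                  bounds=[(0, None)] * (n + 1), method="highs")
    return res.x[0]                                # typical T for the pair

print(typical_separation([7.0, 6.0, 9.0]))         # ≈ 7.0 (the median)
```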

Solving for Typical Time

Using Temporal Relationships
At the current time T_CURRENT, with anchor tag A seen at time T_A (the time when the object was viewed):
if (T_CURRENT − T_A) + E_TAB > T_AB, shortlist B.
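A hedged sketch of this shortlisting rule follows. All names (the Tag class, fields t_sep and err) are hypothetical; the idea is simply that once enough time has elapsed since seeing the anchor, within slack, a tag becomes a match candidate.

```python
# Shortlist tags whose typical separation from the anchor may have elapsed.
from dataclasses import dataclass

@dataclass
class Tag:
    name: str
    t_sep: float   # typical separation T_AB from anchor tag A (seconds)
    err: float     # learned slack E_TAB (seconds)

def shortlist(tags, t_current, t_anchor):
    elapsed = t_current - t_anchor
    return [tag for tag in tags if elapsed + tag.err > tag.t_sep]

tags = [Tag("B", 7.0, 2.0), Tag("C", 15.0, 3.0)]
print([t.name for t in shortlist(tags, t_current=6.0, t_anchor=0.0)])  # ['B']
```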

Rotational Relationships
The gyroscope captures angular changes. (Diagram: turns of 20° anticlockwise, 110° anticlockwise, and 90° clockwise between successive tags.) The bounds take the same form as the temporal ones:
R_B − R_A ≤ 20° + E_RBA, R_B − R_A ≥ 20° − E_RBA
R_C − R_A ≤ 130° + E_RCA, R_C − R_A ≥ 130° − E_RCA
E_RBA, E_RCA, R_A, R_B, R_C ≥ 0
When the phone is turning toward C, C can be prioritized for matching; similarly, when the phone turns away from D, D can be removed from the candidate set.

Using Rotational Relationships
R_CURRENT = R_A + integrated gyroscope rotation. (Diagram: headings R_A, R_B, R_D, R_E around the user, with R_CURRENT between R_A and R_B and an uncertainty band of E_RB/2.)
B's rotational distance = R_B − R_CURRENT + E_RB/2. Pick tags closer in rotational distance.
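A minimal sketch of this ranking, under assumptions: the current heading is dead-reckoned by integrating gyroscope yaw rate, and candidates are sorted by the slide's distance measure. The absolute value and all names are my additions (the slide writes the distance without an absolute value; taking it keeps the ranking symmetric for turns in either direction).

```python
# Rank candidate tags by rotational distance from the current heading.
import numpy as np

def r_current(r_anchor_deg, gyro_yaw_rates_deg_s, dt_s):
    # Dead-reckon heading by integrating gyro yaw rate since the anchor.
    return r_anchor_deg + np.sum(np.asarray(gyro_yaw_rates_deg_s) * dt_s)

def rank_by_rotation(candidates, r_now):
    # candidates: list of (name, R_tag_deg, E_R_deg); smallest distance first.
    return sorted(candidates, key=lambda c: abs(c[1] - r_now) + c[2] / 2.0)

r_now = r_current(0.0, gyro_yaw_rates_deg_s=[10.0] * 5, dt_s=0.1)  # 5° turned
cands = [("B", 20.0, 4.0), ("C", 130.0, 6.0), ("D", -90.0, 4.0)]
print([c[0] for c in rank_by_rotation(cands, r_now)])              # ['B', 'D', 'C']
```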

OverLay: Converged Architecture
Phone side: for each camera frame, cheap gates (Blur? Hand motion? Frame diff?) decide what crosses the network; (frames, sensors) are uploaded.
Cloud side: sensory geometry (time, orientation) feeds a macro-trajectory linear program, and visual geometry supports micro-trajectory spatial reasoning; together they select candidate tags from the annotation DB. A GPU-optimized SURF pipeline refines the match over the selected candidates and returns the annotation, e.g., "Botanist". The annotate path stores (image, "Botanist") in the annotation DB, and a learning component updates the modules. The diagram marks the portion covered in this talk.
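As one way to picture the phone-side gates, here is a hedged OpenCV sketch. The thresholds and the function name are illustrative assumptions, not OverLay's tuned values; variance of the Laplacian is a standard blur proxy, and mean absolute pixel difference stands in for the frame-diff check.

```python
# Phone-side frame gate: drop frames that are too blurry or too redundant.
import cv2
import numpy as np

def worth_uploading(frame, prev_frame, blur_thresh=100.0, diff_thresh=10.0):
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # Low variance of the Laplacian means few sharp edges, i.e., blur.
    if cv2.Laplacian(gray, cv2.CV_64F).var() < blur_thresh:
        return False
    if prev_frame is not None:
        prev_gray = cv2.cvtColor(prev_frame, cv2.COLOR_BGR2GRAY)
        # Low mean absolute difference means the frame adds little new.
        if np.mean(cv2.absdiff(gray, prev_gray)) < diff_thresh:
            return False
    return True
```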

Evaluation
Android app on a Samsung Galaxy S4. Server: GPU on the cloud, with 12 cores, 16 GB RAM, and a 6 GB NVIDIA GPU. 11 volunteers, 100+ tags, 4200 frame uploads.

System Variants
Approximate (quick computer vision): matching using approximate schemes, e.g., a KD-tree.
Conservative (slow computer vision): matching using brute-force schemes.
OverLay: Conservative + optimizations.
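For concreteness, the two matching styles map onto standard OpenCV matchers, as in the sketch below. This is an assumed pairing, not the paper's implementation: FLANN's KD-tree index stands in for "Approximate" and exhaustive BFMatcher for "Conservative". SURF descriptors (floating point) are assumed; SURF itself lives in opencv-contrib and may be patent-restricted in some builds.

```python
# Approximate (FLANN KD-tree) vs. Conservative (brute-force) matching.
import cv2

def make_matcher(approximate=True):
    if approximate:
        index_params = dict(algorithm=1, trees=5)   # 1 = FLANN_INDEX_KDTREE
        return cv2.FlannBasedMatcher(index_params, dict(checks=50))
    return cv2.BFMatcher(cv2.NORM_L2)               # exhaustive, slower

def good_matches(matcher, query_desc, db_desc, ratio=0.7):
    # Lowe's ratio test over 2-nearest-neighbor matches.
    pairs = matcher.knnMatch(query_desc, db_desc, k=2)
    return [p[0] for p in pairs
            if len(p) == 2 and p[0].distance < ratio * p[1].distance]
```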

Latency
Optimizations lead to a 4-fold improvement.

Accuracy: Precision
OverLay ≈ Bruteforce.

Accuracy: Recall
Approximate < OverLay < Bruteforce.

Conclusion
Vision- and sensing-based ARs. Geometric layouts: accelerated vision. OverLay: practical mobile AR.

Thank you
synrg.csl.illinois.edu/projects/MobileAR
Puneet Jain (Duke University/UIUC), Justin Manweiler (IBM Research), Romit Roy Choudhury (UIUC)

3D-OBJECTS

Handling 3D Objects: Learning
Tagged from a particular angle; retrieved from a different angle.

Accuracy: After Learning
Recall > Bruteforce, and Precision ≈ Bruteforce.