Real-Time 3D Model Acquisition


Real-Time 3D Model Acquisition. Szymon Rusinkiewicz (Princeton University), Olaf Hall-Holt and Marc Levoy (Stanford University)

3D Scanning. 3D scanning is becoming widely used, both in traditional areas like model building for movies and in new areas such as art history and archaeology.

Possible Research Goals: low noise, guaranteed high accuracy, high speed, low cost, automatic operation, no holes. There are many research goals for 3D scanning. In this paper we focus on one that has not often been explicitly addressed: coming up with a hole-free model.

3D Model Acquisition Pipeline: 3D Scanner. To understand the problem, let's look at the entire 3D model acquisition pipeline. We start with the 3D scanner, which typically returns a range image: the 3D shape as seen from a single point of view.

3D Model Acquisition Pipeline: 3D Scanner → View Planning. In order to cover the entire object, we need to move the scanner around the object (or move the object relative to the scanner). This requires figuring out where to put the scanner.

3D Model Acquisition Pipeline: 3D Scanner → View Planning → Alignment. Multiple scans need to be aligned.

3D Model Acquisition Pipeline: 3D Scanner → View Planning → Alignment → Merging. The scans are then merged into a single model.

3D Model Acquisition Pipeline: 3D Scanner → View Planning → Alignment → Merging → Done? Now, the user needs to plan further scans and determine whether the entire object has been covered.

3D Model Acquisition Pipeline: 3D Scanner → View Planning → Alignment → Merging → Done? → Display. To do this, the user needs feedback, so the final piece of the pipeline is some sort of rendering.

3D Model Acquisition Difficulties: much (often most) time is spent on the "last 20%"; the pipeline is not optimized for hole-filling; it is not sufficient just to speed up the scanner – the whole pipeline must be designed for fast feedback. With most versions of the previous pipeline, most of the effort goes into hole filling. Each iteration of the pipeline is slow, because the user has to do a scan, align it, render, set up another scan, and so on. The key point of this paper is that to speed up the pipeline and make it easier to use, it's not enough to just focus on the scanner – the entire pipeline must be designed to provide feedback to the user as efficiently as possible.

Real-Time 3D Model Acquisition [Video clip of hand moving object, then view of screen. 40 secs, no sound] Here is a new pipeline that *is* designed for fast feedback to the user. The user moves the object around by hand (notice the black glove so we don’t get fingers in the scan). You can always see the current state of the model on the screen, and can move the object to fill any holes.

Real-Time 3D Model Acquisition Pipeline: 3D Scanner → Alignment → Merging → Done? → Display, with a human doing the View Planning. Here's how we design this new pipeline. We want to retain the human in the loop, doing view planning, since that's what humans do best. However, we want the rest of the pipeline to be designed so as to use the human's time most efficiently: we don't want the user to be waiting around for scans to complete.

Real-Time 3D Model Acquisition Pipeline. Challenge: Real Time. So, the challenge is to design the rest of the pipeline to run in real time – to provide feedback to the user as quickly as possible. In the remainder of this talk I'll focus on the different stages of this pipeline. These all use variations of techniques that have been proposed previously, but by picking particular algorithms that work well together in this real-time pipeline, we get something that's more than the sum of its parts.

Real-Time 3D Model Acquisition Pipeline. Part I: Structured-Light Triangulation. As you saw in the video, the real-time system uses a range scanner based on structured light. The novel thing about it is that it has to work on objects that are moving during the scanning.

Real-Time 3D Model Acquisition Pipeline. Part II: Fast ICP. Next, we align the geometry from different views using a real-time variant of a classic algorithm called ICP.

Real-Time 3D Model Acquisition Pipeline. Part III: Voxel Grid. The merging and rendering stages are fairly simple: they are based on binning samples in a voxel grid.

Triangulation: project a laser stripe onto the object. [Figure: laser, camera, object] The first stage in the real-time pipeline is the range scanner. It's based on the idea of triangulation. In the simplest case, this consists of projecting a stripe of light onto a scene, looking at it from an angle…

Triangulation: depth from ray-plane triangulation. [Figure: laser plane, camera ray through pixel (x,y), object] … and triangulating between a plane from the point of view of the light source and a ray from the point of view of the camera. In the simplest case of a laser triangulation scanner, this yields data from a single contour on the object at a time, and you can sweep the line across the surface to stack up a bunch of these contours and get a scan of an entire patch of surface.
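The ray-plane intersection the slide describes can be sketched in a few lines. This is a hypothetical helper, assuming a calibrated camera and light plane expressed in a common world frame; it is not the system's actual code:

```python
import numpy as np

def triangulate(ray_origin, ray_dir, plane_point, plane_normal):
    """Intersect a camera ray with the laser's light plane.

    All inputs are 3-vectors in the same (calibrated) world frame:
    the ray through a pixel, and any point + normal of the plane.
    Returns the 3D point on the object surface.
    """
    ray_dir = ray_dir / np.linalg.norm(ray_dir)
    denom = np.dot(plane_normal, ray_dir)
    if abs(denom) < 1e-9:
        raise ValueError("ray is parallel to the light plane")
    t = np.dot(plane_normal, plane_point - ray_origin) / denom
    return ray_origin + t * ray_dir
```

For example, a camera at the origin looking along the ray (1, 0, 1) toward the plane x = 1 recovers the surface point (1, 0, 1).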

Triangulation. Faster acquisition: project multiple stripes. Correspondence problem: which stripe is which? For our real-time pipeline, though, we want to go faster – we need to get as much data at once as possible. The obvious way to do this is to project a bunch of stripes at once, but then we need to figure out which stripe is which.

Continuum of Triangulation Methods: from single-stripe, multi-frame (slow, robust) to multi-stripe, single-frame (fast, fragile). There's a continuum of methods we can use to establish these correspondences. At one extreme are single-stripe systems. These never have the problem of ambiguity, so they produce very high-quality data, but they take a long time to do it. At the other extreme are systems that try to get all the data at once by projecting lots of stripes (or, in this case, dots). They are fast, since they get everything with just a single frame, but they tend to be a lot more fragile and get confused by discontinuities. In the middle are methods that use a few frames to get depth, by flashing stripes on and off over time, and conveying a code about which stripe is which through this pattern of on/off flashing.

Time-Coded Light Patterns. Assign each stripe a unique illumination code over time [Posdamer 82]. [Figure: stripe patterns varying over time (vertical) and space (horizontal)] Just as an example, here's a very simple code based on binary numbers. If you look at a single position in space (i.e., a single pixel), you see a certain on/off pattern over these four frames. This conveys a code that tells you which stripe you are looking at, which gives you a plane in space with which you can then triangulate to find depths.
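Decoding such a code at a single pixel is just reading off a base-2 number from the observed on/off sequence. A minimal sketch of the plain binary scheme described above (real systems often prefer Gray codes, where adjacent stripes differ in only one frame):

```python
def decode_stripe_index(bits):
    """Turn the on/off pattern a pixel sees over K frames into a
    stripe index, most significant frame first.

    With K frames, 2**K stripes (and hence light planes) can be
    distinguished; the index selects the plane used for triangulation.
    """
    index = 0
    for b in bits:
        index = (index << 1) | int(b)
    return index
```

With four frames, the pattern on, off, on, on identifies stripe 11 of the 16 projected stripes.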

Codes for Moving Scenes. Assign time codes to stripe boundaries; perform frame-to-frame tracking of corresponding boundaries; propagate illumination history [Hall-Holt & Rusinkiewicz, ICCV 2001]. In our system, though, the objects are moving from frame to frame, so if you just used the simple algorithm you might get half of one code and half of another, and come up with the wrong code. So, we need to do some tracking from frame to frame. In fact, we do this based on looking at the boundaries between stripes. The stripes are tracked from frame to frame, and at the end of the day the illumination on both sides of the boundary is what conveys the code. Illumination history = (WB),(BW),(WB) → code.
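As a toy illustration of how an illumination history determines a code, here is a hypothetical helper that folds the tracked per-frame (left, right) observations of a boundary into per-side bit strings. The actual boundary code assignment is the scheme of [Hall-Holt & Rusinkiewicz, ICCV 2001], not this:

```python
def boundary_history_to_codes(history):
    """Convert a tracked stripe boundary's illumination history,
    e.g. [('W','B'), ('B','W'), ('W','B')], into the binary codes
    of the stripes on each side of the boundary (W=1, B=0).
    """
    bit = {'W': '1', 'B': '0'}
    left = ''.join(bit[l] for l, r in history)
    right = ''.join(bit[r] for l, r in history)
    return left, right
```

The history (WB),(BW),(WB) from the slide thus identifies the stripe pair ('101', '010'), and hence the boundary's position in the projected pattern.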

Designing a Code. Want many "features" to track: lots of black/white edges at each frame. Try to minimize ghosts – WW or BB "boundaries" that can't be seen directly. There's an extra little wrinkle, though. If you're looking for the boundaries between stripes, sometimes you have a "boundary" between two stripes of the same color, which we'll call a ghost. So, some boundaries are easy to track, because you can see them in both frames, but in some cases you have to infer the presence of a boundary you can't see directly. To make this at all feasible, we have to design a code that tries to minimize these ghosts and ensures, for example, that a ghost in one frame becomes visible in the next.

Designing a Code [Hall-Holt & Rusinkiewicz, ICCV 2001]. [Figure: graph over the 4-bit stripe codes 0000 through 1111] [Either omit this slide or just display it for a couple of seconds, as comic relief] Designing this code turns into a graph problem – since I'm sure you've all solved it in your heads by now, I'll just move on and won't waste your time with it…

Implementation. DLP projector illuminates the scene at 60 Hz; a synchronized NTSC camera captures video; the pipeline returns range images at 60 Hz. Pipeline: Project Code → Capture Images → Find Boundaries → Match Boundaries → Decode → Compute Range. So here's how the range scanner looks. We project stripes at 60 Hz, capture video frames at the same rate, and process them to get range images at 60 Hz.

Real-Time 3D Model Acquisition Pipeline. Part II: Fast ICP. Next, we align the geometry from different views using a real-time variant of a classic algorithm called ICP.

Aligning 3D Data. This range scanner can be used for any moving object. For rigid objects, range images can be aligned to each other as the object moves. So far, I haven't assumed that we're looking at rigid objects. For this real-time model acquisition pipeline, though, we assume that the range images are different views of the same rigid object, so we can align them to each other.

Aligning 3D Data. ICP (Iterative Closest Point): for each point on one scan, minimize the distance to the closest point on the other scan… We do this using a well-known algorithm called ICP. The idea is that for each point on the upper (green) scan, we find the closest point on the lower (red) scan, and minimize the distances between these pairs of points. This brings the scans into closer alignment, …

Aligning 3D Data: … and iterate to find the alignment. Iterative Closest Point (ICP) [Besl & McKay 92]. … and we can iterate this process until it converges.
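The basic loop can be sketched as follows (an illustrative point-to-point variant with brute-force matching and an SVD-based rigid solve; the paper's real-time system uses the faster variant described on the next slide, and a k-d tree or projection would replace the brute-force search in practice):

```python
import numpy as np

def icp_point_to_point(src, dst, iters=30):
    """Minimal point-to-point ICP sketch.

    src, dst: (N,3) and (M,3) point arrays, roughly pre-aligned.
    Returns rotation R (3x3) and translation t mapping src onto dst.
    """
    R, t = np.eye(3), np.zeros(3)
    for _ in range(iters):
        moved = src @ R.T + t
        # brute-force closest points (k-d tree / projection in practice)
        d = np.linalg.norm(moved[:, None, :] - dst[None, :, :], axis=2)
        pairs = dst[d.argmin(axis=1)]
        # best rigid transform for these pairs (Kabsch / SVD)
        mu_s, mu_d = moved.mean(0), pairs.mean(0)
        U, _, Vt = np.linalg.svd((moved - mu_s).T @ (pairs - mu_d))
        S = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
        R_step = Vt.T @ S @ U.T
        t_step = mu_d - R_step @ mu_s
        # compose the incremental transform with the running estimate
        R, t = R_step @ R, R_step @ t + t_step
    return R, t
```

Given two copies of the same point set offset by a small rigid motion (as consecutive 60 Hz range images are), this converges to the motion in a few iterations.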

ICP in the Real-Time Pipeline. Potential problem with ICP: local minima; in this pipeline, scans are close together, so ICP is very likely to converge to the correct (global) minimum. Basic ICP algorithm too slow (~ seconds); with point-to-plane minimization and projection-based matching, running time ~ milliseconds [Rusinkiewicz & Levoy, 3DIM 2001]. There are a couple of things that we need to consider when applying the basic ICP algorithm to our pipeline. First, a traditional problem with ICP is that it only converges to a local minimum, not necessarily the global minimum. In our pipeline, though, this turns out not to be a problem because we get range images at 60 Hz, so they tend to be very close to each other. The other issue is speed: the basic algorithm as I've described it takes on the order of seconds to converge. There are a couple of tweaks we can make, though: one changes the function that is minimized and the other replaces the closest-point finding with an approximation. With these tweaks, the algorithm becomes fast enough to fit in our pipeline.
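The point-to-plane tweak replaces point-to-point distances with distances along the destination surface normals, which is typically minimized one linearized step at a time. A sketch of that step under the usual small-angle approximation (matched pairs and normals are assumed given, e.g. by projection-based matching; this is an illustration, not the paper's implementation):

```python
import numpy as np

def point_to_plane_step(src, dst, dst_normals):
    """One linearized point-to-plane step.

    For matched pairs src[i] -> dst[i] with destination normals
    dst_normals[i] (unit length), solve for the 6-vector x = [omega, t]
    minimizing sum(((src + cross(omega, src) + t - dst) . n)^2),
    where omega is a small-angle rotation vector.
    """
    A = np.hstack([np.cross(src, dst_normals), dst_normals])  # (N,6)
    b = np.einsum('ij,ij->i', dst - src, dst_normals)         # (N,)
    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    return x  # rotation part x[:3], translation part x[3:]
```

Each step is a tiny 6-unknown least-squares solve, which is one reason this variant converges in milliseconds rather than seconds.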

Real-Time 3D Model Acquisition Pipeline. Part III: Voxel Grid. The merging and rendering stages are fairly simple: they are based on binning samples in a voxel grid.

Merging and Rendering. Goal: visualize the model well enough to be able to see holes. Cannot display all the scanned data – it accumulates linearly with time. Standard high-quality merging methods: processing time ~ 1 minute per scan. The third piece of the real-time pipeline is merging and rendering. This is necessary, as we saw, to provide feedback to the user. We can't use the "standard" methods (typically based on combining volumetric signed distance functions or on Voronoi diagrams on the 3D point cloud) because they take too long. So, instead, we adopt a much simpler algorithm.

Merging and Rendering. The idea is that we start with a range image, …

Merging and Rendering. … quantize the samples to a grid in 3D, …

Merging and Rendering. … and compute a surface normal at each voxel.

Merging and Rendering. We can then incorporate the samples corresponding to a new range image with the data we already have in the grid, …

Merging and Rendering: point rendering, using accumulated normals for lighting. … and come up with a merged result. We then do splat rendering from this combined grid, using the averaged normals for lighting.
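The grid-based merging just described can be sketched as follows. These are hypothetical helpers, with a dict standing in for the voxel grid and a per-voxel accumulated normal used for lighting the splats; the paper's implementation is not shown here:

```python
import numpy as np

def merge_into_grid(grid, points, normals, voxel_size=0.005):
    """Bin range-image samples into a voxel grid, accumulating normals.

    grid: dict mapping integer (i,j,k) voxel coordinates to the sum of
    the normals of all samples that fell in that voxel. Calling this
    repeatedly merges successive aligned range images into one model.
    """
    keys = np.floor(points / voxel_size).astype(int).tolist()
    for key, n in zip(keys, normals):
        key = tuple(key)
        grid[key] = grid.get(key, np.zeros(3)) + n
    return grid

def splat_normals(grid):
    """Per-voxel unit normals for lighting the point-rendered splats."""
    return {k: v / np.linalg.norm(v)
            for k, v in grid.items() if np.linalg.norm(v) > 0}
```

Because each voxel stores only an accumulated normal, memory stays bounded by the grid size rather than growing linearly with the number of scans, which is exactly what the real-time display needs.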

Example: Photograph (scale bar: 18 cm). Here's another example of the whole pipeline in action. This is a little turtle figurine, about 15 cm long.

Result [Video clip] Here’s what the user sees while scanning the turtle. Notice how easy it is to see and fill holes in this bumpy back.

Postprocessing. Real-time display: quality/speed tradeoff; goal: let the user evaluate coverage and fill holes. Offline postprocessing for high-quality models: global registration; high-quality merging (e.g., using VRIP [Curless 96]). Of course, the quality of the real-time pipeline doesn't really compare with other scanners – there's simply not enough CPU to be able to run high-quality algorithms. On the other hand, it doesn't need to – it only needs to be good enough to let the user see holes in the data. Later on, once the user is sure that all the data has been captured, we can run a high-quality postprocess. This can take as long as it needs to, since the user is no longer in the loop.

Postprocessed Model So, here’s the turtle after postprocessing. In this case we ran global registration, and produced a merged model using VRIP.

Recapturing Alignment

Summary. A 3D model acquisition pipeline optimized for obtaining complete, hole-free models; uses the human's time most efficiently. Pieces of the pipeline selected for real-time use: structured-light scanner for moving objects; fast ICP variant; simple grid-based merging and point rendering.

Limitations. Prototype noisier than commercial systems; could be made equivalent with careful engineering; ultimate limitations on quality: focus, texture. Scan-to-scan ICP not perfect → alignment drift; due to noise, miscalibration, degenerate geometry; reduced, but not eliminated, by "anchor scans"; possibly combine ICP with separate trackers.

Future Work. Faster scanning: better stripe boundary tracking; multiple cameras and projectors; high-speed cameras and projectors. Application in different contexts: cart- or shoulder-mounted for digitizing rooms; infrared for imperceptibility.

Acknowledgments. Collaborators: Li-Wei He, James Davis, Lucas Pereira, Sean Anderson. Sponsors: Sony, Intel, Interval.