3D Computer Vision: CSc 83020.

Name: 3D Computer Vision: CSc 83020.
Uploaded: 2017-07-05T11:43:54+00:00
Duration: PTM23S35
Channel: Dayton Keathley
Description: 3D Computer Vision: CSc 83020.

3D Computer Vision: CSc 83020
Instructor: Ioannis Stamos, istamos (at) hunter.cuny.edu
Office Hours: Tuesdays 4-6 (at Hunter) or by appointment
Office: 1090G Hunter North (69th Street between Park and Lex.)
Computer Vision Lab: 1090E Hunter North
Course web page:

Goals
To familiarize you with the basic techniques and jargon in the field
To enable you to solve computer vision problems
To let you experience (and appreciate!) the difficulties of real-world computer vision
To get you excited!

Class Policy
You have to:
Turn in all assignments (60% of grade)
Complete a final project (30% of grade)
Actively participate in class (10% of grade)
Late policy: Six late days (but not for final project)
Teaming: For final project you can work in groups of 2

About me
11th year at Hunter and the Graduate Center
Graduated from Columbia in ’01, CS Ph.D.
Research Areas: Computer Vision, 3D Modeling, Computer Graphics

Books
Computer Vision: Algorithms and Applications, Richard Szeliski, 2010 (available online for free)
Robot Vision, B. K. P. Horn, The MIT Press (great classic book)
Introductory Techniques for 3-D Computer Vision, Emanuele Trucco and Alessandro Verri, Prentice Hall, 1998 (algorithmic perspective)
Computer Vision: A Modern Approach, David A. Forsyth, Jean Ponce, Prentice Hall, 2003
An Invitation to 3-D Vision, Yi Ma, Stefano Soatto, Jana Kosecka, S. Shankar Sastry, Springer, 2004
Three-Dimensional Computer Vision: A Geometric Viewpoint, Olivier Faugeras, The MIT Press, 1996

Journals/Web
International Journal of Computer Vision.
Computer Vision and Image Understanding.
IEEE Trans. on Pattern Analysis and Machine Intelligence.
SIGGRAPH (mostly Graphics)
(CMU’s Robotics Institute)
(The Vision Home Page)
(CV Online)
(Annotated CV Bibliography)

Class History
Based on class taught at Columbia University by Prof. Shree Nayar.
New material reflects modern approach.
Taught similar class at Hunter
Taught “3D Photography” class at the Graduate Center of CUNY.
My active research area
Funded by the National Science Foundation

Class Schedule
Check class website
Final project proposals: Due Nov. 7. Design your own or check list of possible projects on class website
Final project presentations and report: May 16 (last class)

What is Computer Vision?
[Diagram labels: Illumination, Physical 3D World, Sensors, Images or Video, Vision System, Scene Description]
Measuring Visual Information

Computer Graphics Output Image Model Synthetic Camera
(slides courtesy of Michael Cohen)

Computer Vision Output Model Real Scene Real Cameras

Combined Output Image Model Real Scene Synthetic Camera Real Cameras

What is Computer Vision? (cont.)
Vision is automating visual processes (Ballard & Brown).
Vision is an information processing task (Marr).
Vision is inverting image formation (Horn).
Vision is inverse graphics.
Vision looks easy, but is difficult.
Vision is difficult, but it is fun (Kanade).
Vision is useful.

Some Applications Industrial Material Handling Inspection Assembly

Some Applications Autonomous Navigation

Some Applications Vision for Graphics Film Industry Urban Planning
E-commerce Virtual Reality

Some Applications Realistic 3D experience Google Earth
Microsoft Photosynth

More Applications! Optical Character Recognition (OCR)
Visual Databases (images or movies) Searching for image content Face Recognition (security) Iris Recognition (security) Traffic Monitoring Systems Many more…

Vision deals with images

Images Look Nice…

Images Look Nice… Ioannis Stamos – CSc Spring 2007

...Essentially a 2D array of numbers
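To make the "2D array of numbers" point concrete, here is a minimal sketch, assuming NumPy and a made-up 4x4 grayscale image (the values are illustrative, not taken from the slides). Each entry is one pixel intensity, and ordinary array indexing reads it back.

```python
import numpy as np

# A tiny hypothetical 4x4 grayscale "image": each entry is an intensity in 0..255.
image = np.array([
    [0, 0, 255, 255],
    [0, 0, 255, 255],
    [0, 0, 255, 255],
    [0, 0, 255, 255],
], dtype=np.uint8)

rows, cols = image.shape        # image height and width in pixels
pixel = image[1, 2]             # intensity at row 1, column 2
mean_intensity = image.mean()   # average brightness over all pixels

print(rows, cols, pixel, mean_intensity)
```

Everything later in the course (filtering, edges, stereo) can be read as arithmetic on arrays like this one.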

Low-Level or “Early” Vision
Considers local properties of an image “There’s an edge!” From: Szymon Rusinkiewicz, Princeton.

Mid-Level Vision Grouping and segmentation “There’s an object and a background!”

High-Level Vision Recognition “It’s a chair!”

Humans Vision is easy for us. But how do we do it?

Human Vision: Illusions
Fraser’s spiral (Fraser 1908)

Illusions Zöllner Illusion (1860) Hering Illusion (1861)
Wundt Illusion (1896)

Visual Ambiguities Young-Girl/Old-Woman

Visual Ambiguities From NALWA.

Visual Ambiguities

Visual Ambiguities Two lava cones with craters, perceived to be the image of two craters with mounds. Implicit assumption that lighting is from above.

Visual Ambiguities

Seeing and Thinking Kanizsa (1979)

Syllabus Overview

Image Formation and Optics
[Diagram labels: Light Source, Surface normal, Object Surface, P, Lens, CCD Array, p]
Projection of 3-D World on a 2-D plane

Lenses Ray of light Optical Axis

Image Sensors/Camera Models
Typical 512x512 CCD array
Imaging Area: 512 x 512 pixels (10.25mm x 10.25mm), 262,144 pixels
One Pixel: 20μm x 20μm
Convert Optical Images To Electrical Signals.
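The sensor numbers can be checked with a few lines of arithmetic. This is only a sketch using the slide's figures (512 pixels per side, 20 μm pixel pitch); the variable names are made up for illustration. Note that 512 x 20 μm works out to 10.24 mm, so the 10.25 mm on the slide is presumably a rounded or nominal value.

```python
# Sensor geometry from the slide: 512 x 512 pixels, each 20 micrometres square.
pixels_per_side = 512
pixel_pitch_um = 20.0

total_pixels = pixels_per_side ** 2                          # pixel count of the array
side_length_mm = pixels_per_side * pixel_pitch_um / 1000.0   # side length in millimetres

print(total_pixels, side_length_mm)
```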

Filtering
g = f * h (output image g, input image f, filter h)
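Linear filtering can be sketched as a discrete 2D convolution. The code below is a minimal illustration, assuming f is the input image, h the filter kernel and g the output (the roles of the slide's three symbols are an assumption here), and computing only the region where the kernel fits entirely inside the image. The helper name convolve2d and the test image are hypothetical.

```python
import numpy as np

def convolve2d(f, h):
    """Discrete 2D convolution of f with h, 'valid' region only (no padding)."""
    h_flipped = h[::-1, ::-1]              # convolution flips the kernel in both axes
    kh, kw = h.shape
    out_h = f.shape[0] - kh + 1
    out_w = f.shape[1] - kw + 1
    g = np.zeros((out_h, out_w), dtype=float)
    for i in range(out_h):
        for j in range(out_w):
            # Weighted sum of the image patch under the kernel.
            g[i, j] = np.sum(f[i:i + kh, j:j + kw] * h_flipped)
    return g

f = np.arange(25, dtype=float).reshape(5, 5)   # hypothetical 5x5 input image
h = np.full((3, 3), 1.0 / 9.0)                 # 3x3 box (averaging) filter
g = convolve2d(f, h)
```

Because this test image is a linear ramp, averaging each 3x3 patch returns the patch's center value, which makes the result easy to verify by hand.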

Image Features Detecting intensity changes in the image
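One simple way to detect intensity changes is to threshold the gradient magnitude. This is a sketch of that idea only, not necessarily the detector covered in class; it assumes NumPy, a made-up step-edge image, and a threshold picked for this toy case.

```python
import numpy as np

def gradient_magnitude(image):
    """Finite-difference gradient magnitude of a grayscale image."""
    gy, gx = np.gradient(image.astype(float))   # derivatives along rows, then columns
    return np.hypot(gx, gy)

# Hypothetical test image: dark left half, bright right half (a vertical step edge).
image = np.zeros((6, 6))
image[:, 3:] = 1.0

magnitude = gradient_magnitude(image)
edges = magnitude > 0.25                        # threshold chosen for this toy example only
edge_columns = np.unique(np.nonzero(edges)[1])  # columns where an edge was flagged

print(edge_columns)
```

With central differences the step between columns 2 and 3 produces a response in both of those columns, which is why practical detectors usually add smoothing and thinning steps.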

Grouping image features
Finding continuous lines from edge segments

Camera Calibration
[Diagram labels: World Coordinate Frame (Xw, Yw, Zw), Camera Coordinate Frame (Xc, Yc, Zc), Image Coordinate Frame, Pixel Coordinates]
Extrinsic Parameters
Intrinsic Parameters
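The chain of coordinate frames can be sketched with a standard pinhole model: extrinsic parameters (a rotation R and translation t) take a world point into the camera frame, and intrinsic parameters (a matrix K) take it to pixel coordinates. All numeric values below are hypothetical and chosen only to make the arithmetic easy to follow; lens distortion is ignored.

```python
import numpy as np

def project(point_world, K, R, t):
    """Project a 3D world point to pixel coordinates with a pinhole model."""
    point_cam = R @ point_world + t   # extrinsics: world frame -> camera frame
    x, y, w = K @ point_cam           # intrinsics: camera frame -> homogeneous pixels
    return np.array([x / w, y / w])   # perspective division

# Hypothetical parameters, for illustration only.
focal_px = 500.0                      # focal length expressed in pixels
cx, cy = 256.0, 256.0                 # principal point (image center of a 512x512 image)
K = np.array([[focal_px, 0.0, cx],
              [0.0, focal_px, cy],
              [0.0, 0.0, 1.0]])
R = np.eye(3)                         # camera axes aligned with world axes
t = np.array([0.0, 0.0, 2.0])         # world origin lies 2 units in front of the camera

pixel = project(np.array([0.1, -0.2, 0.0]), K, R, t)
print(pixel)
```

Camera calibration is the inverse task: estimating K, R and t from known world points and their observed pixel positions.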

Shape from X
Stereo
Motion
Shading
Texture foreshortening

Binocular Stereo depth map
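For a rectified stereo pair, a depth map follows from per-pixel disparity through the standard relation Z = f B / d. The sketch below assumes that setup; the focal length, baseline and disparity values are made up for illustration.

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Depth Z = f * B / d for a rectified stereo pair."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point in front of both cameras")
    return focal_px * baseline_m / disparity_px

# Hypothetical rig: 700-pixel focal length, 12 cm baseline, 21-pixel disparity.
depth = depth_from_disparity(focal_px=700.0, baseline_m=0.12, disparity_px=21.0)
print(depth)
```

Since depth is inversely proportional to disparity, a fixed one-pixel matching error costs far more accuracy for distant points than for nearby ones.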

Active Sensing
Sheet of light, CCD array, Lens
Sources of error: 1) grazing angle, 2) object boundaries.

Shape from Shading Three-dimensional shape from a single image.

Motion (optical flow) Determining the movement of scene objects

Reflectance and Color Why do these spheres look different?

Object Recognition Learning visual appearance.
Real-time object recognition.

Template-Based Methods
Cootes et al.

Some Vision Systems…

Example 2: Structure From Motion
Slide courtesy of Sebastian Thrun Stanford

Example 4: 3D Modeling Slide courtesy of Sebastian Thrun
Stanford Drago Anguelov

Example 5: Segmentation

Example 6: Classification

Real-world Applications
Osuna et al:

Range Scanning Outdoor Structures

Data Acquisition Spot laser scanner. Time of flight. Max Range: 100m.
Scanning time: minutes for x1000 points. Accuracy: 6mm.
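A time-of-flight scanner measures range from the round-trip travel time of a laser pulse. As a rough sketch of the numbers involved, assuming the simple relation range = c t / 2 and ignoring everything about the actual scanner's electronics, the slide's 100 m maximum range corresponds to a round trip of roughly 667 nanoseconds, and 6 mm of range to about 40 picoseconds. The function names are hypothetical.

```python
SPEED_OF_LIGHT = 299_792_458.0   # metres per second

def range_from_round_trip(time_s):
    """Distance to the surface: the pulse travels out and back, so halve the path."""
    return SPEED_OF_LIGHT * time_s / 2.0

def round_trip_time(range_m):
    """Inverse relation: round-trip time for a surface at the given range."""
    return 2.0 * range_m / SPEED_OF_LIGHT

t_max = round_trip_time(100.0)    # the slide's maximum range, 100 m
dt_6mm = round_trip_time(0.006)   # timing difference corresponding to 6 mm of range

print(t_max, dt_6mm)
```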

Latest Video

Inserting models in Google Earth

Dynamic Scenes Image sequence (CMU, Virtualized Reality Project)

Dynamic Scenes Dynamic 3D model.

Dynamic Scenes Dynamic texture-mapped model.

Scanning the David
Marc Levoy, Stanford
So here we are scanning our “piece de resistance”. The gantry is extended to nearly its full height. If you’ve looked at our web site, you know the little story about the gantry being too short initially, because we relied on art history books for the height of the David and they're all wrong, by 3 feet! So at the last minute we had to add a piece to the gantry (here) and load another few hundred pounds of ballast to the base (here) to keep the gantry from tipping over on the statue. But as you can see from the photo on the right, we did finally reach the top of the statue.
height of gantry: meters
weight of gantry: kilograms

Statistics about the scan
So here’s a rendering of our computer model of the David. We rolled the gantry, which now weighed about a ton, 480 times. 2 billion polygons; you can read the rest. It took 30 days to scan the statue, but we were only allowed to scan at night, when the museum was closed, so it was basically 30 all-nighters in a row.
480 individually aimed scans
2 billion polygons
7,000 color images
32 gigabytes
30 nights of scanning
22 people

Head of Michelangelo’s David
We haven’t assembled the David at full resolution yet. Here’s a 1 mm model of the head, watertight, with color. The photograph at the left was taken with uncalibrated lighting, so the two images don’t match exactly, but hopefully you agree that we’ve basically captured the statue’s appearance.
photograph, 1.0 mm computer model

David’s left eye
0.25 mm model, photograph
Here’s a closeup of David’s left eye at the maximum resolution of our model, ¼ mm. We can see some interesting things. These bumps are holes from Michelangelo’s drill, so that’s real geometry. Other features are artifacts from space carving, the hole-filling part of our range image merging algorithm; smoothing out those artifacts, while preserving the observed surfaces, is future work. Finally, these bumps are noise from laser subsurface scattering. Let’s zoom in some more and look at the triangle mesh.
noise from laser scatter
holes from Michelangelo’s drill
artifacts from space carving

What do you think?
