
IIIT-B Computer Vision, Fall 2006
Lecture 1: Introduction to Computer Vision
Arvind Lakshmikumar
Technology Manager, Sarnoff Corporation
Adjunct Faculty, IIIT-B

Course Overview
Introduction to vision
Case Studies of Applied Vision
–Automotive Safety
–Autonomous Navigation
–Industrial Inspection
–Medical Imaging
–Entertainment
Image Formation
About Cameras
Image Processing
Geometric Vision
Camera Motion
Paper readings

Computer Graphics: Model → Synthetic Camera → Image Output (slides courtesy of Michael Cohen)

Computer Vision: Real Scene → Real Cameras → Model Output (slides courtesy of Michael Cohen)

Combined: Real Scene → Real Cameras → Model → Synthetic Camera → Image Output (slides courtesy of Michael Cohen)

The Vision Problem
How to infer salient properties of the 3-D world from a time-varying 2-D image projection
–What is salient?
–How to deal with loss of information going from 3-D to 2-D?

Why study Computer Vision?
Images and movies are everywhere
Fast-growing collection of useful applications
–building representations of the 3D world from pictures
–automated surveillance (who’s doing what)
–movie post-processing
–face finding
Various deep and attractive scientific mysteries
–how does object recognition work?
Greater understanding of human vision

Properties of Vision
One can “see the future”
–Cricketers avoid being hit in the head: there is a reflex; when the right eye sees something going left and the left eye sees something going right, move your head fast.
–Gannets pull their wings back at the last moment: gannets are diving birds; they must steer with their wings, but wings break unless pulled back at the moment of contact. Area of target over rate of change of area gives time to contact.
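A brief derivation sketch of that last claim, assuming a fronto-parallel target approached at constant speed; here Z is the distance to the target, A its image area, and τ the time to contact (symbols introduced only for illustration). The area ratio recovers the time to contact up to a constant factor:

```latex
A(t) \propto \frac{1}{Z(t)^{2}}
\;\Rightarrow\;
\frac{dA}{dt} = -\frac{2A}{Z}\,\frac{dZ}{dt}
\;\Rightarrow\;
\frac{A}{dA/dt} = -\frac{Z}{2\,dZ/dt} = \frac{\tau}{2},
\qquad \tau \equiv -\frac{Z}{dZ/dt}.
```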

Properties of Vision
3D representations are easily constructed
–There are many different cues.
–Useful to humans (avoiding bumping into things, planning a grasp, etc.) and in computer vision (building models for movies).
–Cues include multiple views (motion, stereopsis), texture, and shading.

Properties of Vision
People draw distinctions between what is seen
–“Object recognition”
–This could mean “is this a fish or a bicycle?”
–It could mean “is this George Washington?”
–It could mean “is this someone I know?”
–It could mean “is this poisonous or not?”
–It could mean “is this slippery or not?”
–It could mean “will this support my weight?”
–Great mystery: how to build programs that can draw useful distinctions based on image properties.

Part I: The Physics of Imaging
How images are formed
–Cameras: what a camera does; how to tell where the camera was
–Light: how to measure light; what light does at surfaces; how the brightness values we see in cameras are determined
–Color: the underlying mechanisms of color; how to describe it and measure it
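A minimal sketch of the pinhole projection that this part builds toward, assuming NumPy; the intrinsic parameters and the 3-D point below are hypothetical and only for illustration.

```python
import numpy as np

# Hypothetical intrinsics: focal length f (pixels) and principal point (cx, cy).
f, cx, cy = 800.0, 320.0, 240.0
K = np.array([[f, 0.0, cx],
              [0.0, f, cy],
              [0.0, 0.0, 1.0]])

# Hypothetical 3-D point in camera coordinates (metres), 2 m in front of the lens.
X = np.array([0.2, -0.1, 2.0])

# Perspective projection: homogeneous image point, then divide by depth.
u, v, w = K @ X
print(u / w, v / w)   # pixel coordinates where the point appears
```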

Part II: Early Vision in One Image
Representing small patches of an image, for three reasons:
–We wish to establish correspondence between (say) points in different images, so we need to describe the neighborhood of the points
–Sharp changes are important in practice, known as “edges”
–Representing texture by giving some statistics of the different kinds of small patch present in the texture (tigers have lots of bars, few spots; leopards are the other way)

Representing an image patch
Filter outputs:
–essentially form a dot-product between a pattern and an image, while shifting the pattern across the image
–strong response → image locally looks like the pattern
–e.g. derivatives measured by filtering with a kernel that looks like a big derivative (bright bar next to dark bar)
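A minimal sketch of that filtering idea, assuming NumPy and SciPy are available; the tiny image and the derivative-like kernel are hypothetical.

```python
import numpy as np
from scipy.ndimage import convolve

# A small synthetic image containing a vertical step edge.
image = np.zeros((8, 8))
image[:, 4:] = 1.0

# A derivative-like kernel: dark bar next to bright bar.
kernel = np.array([[-1.0, 0.0, 1.0]])

# Shift the pattern across the image (convolution); a large |response| means
# the image locally looks like the pattern, i.e. an edge.
response = convolve(image, kernel, mode="nearest")
print(np.abs(response).argmax(axis=1))   # strongest response sits at the step edge
```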

Figure: convolve this image with this kernel to get this result.

Texture
Many objects are distinguished by their texture
–Tigers, cheetahs, grass, trees
We represent texture with statistics of filter outputs
–For tigers, bar filters at a coarse scale respond strongly
–For cheetahs, spots at the same scale
–For grass, long narrow bars
–For the leaves of trees, extended spots
Objects with different textures can be segmented
The variation in textures is a cue to shape
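A minimal sketch of the statistics-of-filter-outputs idea, assuming NumPy and SciPy; the crude bar and spot kernels stand in for a real filter bank and are only illustrative.

```python
import numpy as np
from scipy.ndimage import convolve

# Crude stand-ins for a filter bank: a horizontal-bar detector and a spot detector.
bar = np.array([[-1.0, -1.0, -1.0],
                [ 2.0,  2.0,  2.0],
                [-1.0, -1.0, -1.0]])
spot = np.array([[-1.0, -1.0, -1.0],
                 [-1.0,  8.0, -1.0],
                 [-1.0, -1.0, -1.0]])

def texture_descriptor(patch, filters=(bar, spot)):
    """Represent texture by the mean magnitude and spread of each filter's response."""
    stats = []
    for k in filters:
        r = convolve(patch, k, mode="reflect")
        stats.extend([np.abs(r).mean(), r.std()])
    return np.array(stats)

h_stripes = np.tile(np.array([[0.0], [1.0]]), (8, 16))   # horizontal stripes
v_stripes = h_stripes.T                                   # vertical stripes
print(texture_descriptor(h_stripes))   # large bar-filter statistics
print(texture_descriptor(v_stripes))   # bar-filter statistics near zero
```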

Part III: Early Vision in Multiple Images
The geometry of multiple views
–Where could it appear in camera 2 (3, etc.) given it was here in 1 (1 and 2, etc.)?
Stereopsis
–What we know about the world from having 2 eyes
Structure from motion
–What we know about the world from having many eyes or, more commonly, our eyes moving
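A minimal numeric sketch of the stereopsis idea for a calibrated pair of parallel cameras; the focal length, baseline, and disparity below are hypothetical.

```python
# For parallel cameras: depth Z = f * B / d, where f is the focal length (pixels),
# B the baseline between the cameras (metres), and d the disparity (pixels)
# between where the same point appears in the left and right images.
f = 700.0
B = 0.12
d = 14.0

Z = f * B / d
print(Z)   # 6.0 metres: a small disparity would mean the point is far away
```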

Part IV: Mid-Level Vision
Finding coherent structure so as to break the image or movie into big units
–Segmentation: breaking images and videos into useful pieces (e.g. finding video sequences that correspond to one shot; finding image components that are coherent in internal appearance)
–Tracking: keeping track of a moving object through a long sequence of views

Part V: High Level Vision (Geometry)
The relations between object geometry and image geometry
–Model based vision: find the position and orientation of known objects
–Smooth surfaces and outlines: how the outline of a curved object is formed, and what it looks like
–Aspect graphs: how the outline of a curved object moves around as you view it from different directions
–Range data

Part VI: High Level Vision (Probabilistic)
Using classifiers and probability to recognize objects
–Templates and classifiers: how to find objects that look the same from view to view with a classifier
–Relations: break up objects into big, simple parts, find the parts with a classifier, and then reason about the relationships between the parts to find the object
–Geometric templates from spatial relations: extend this trick so that templates are formed from relations between much smaller parts

Applications: Factory Inspection
Cognex’s “CapInspect” system:
–Low-level image analysis: identify edges, regions
–Mid-level: distinguish “cap” from “no cap”
–Estimation: what are the orientation of the cap and the height of the liquid?
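A minimal sketch of what the estimation step might look like for the liquid height, assuming the bottle is reduced to a single top-to-bottom intensity profile and that liquid appears darker than air; the values and threshold are hypothetical.

```python
import numpy as np

# Hypothetical intensity profile down the bottle (top -> bottom), 0 = dark, 1 = bright.
column = np.array([0.90, 0.90, 0.85, 0.30, 0.25, 0.30, 0.28, 0.27])

liquid = column < 0.5                 # dark rows are taken to be liquid
fill_height_px = int(liquid.sum())    # number of rows occupied by liquid
surface_row = int(np.argmax(liquid))  # first dark row = liquid surface

print(fill_height_px, surface_row)    # -> 5 3
```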

Applications: Face Detection (courtesy of H. Rowley). How is this like the bottle problem on the previous slide?

Applications: Text Detection & Recognition (from J. Zhang et al.). Similar to face finding: where is the text and what does it say? Viewing at an angle complicates things...

Applications: MRI Interpretation. Coronal slice of brain; segmented white matter (from W. Wells et al.)

Detection and Recognition: How?
Build models of the appearance characteristics (color, texture, etc.) of all objects of interest
–Detection: look for areas of the image with sufficiently similar appearance to a particular object
–Recognition: decide which of several objects is most similar to what we see
–Segmentation: “recognize” every pixel
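A minimal sketch of this build-models-then-compare pattern, assuming NumPy; the intensity-histogram “appearance model” and the made-up regions are hypothetical stand-ins for real colour/texture descriptors.

```python
import numpy as np

def appearance_model(region, bins=8):
    """A crude appearance model: a normalised histogram of pixel intensities."""
    h, _ = np.histogram(region, bins=bins, range=(0.0, 1.0))
    return h / max(h.sum(), 1)

def recognise(region, models):
    """Recognition: pick the stored object whose model is most similar."""
    h = appearance_model(region)
    return min(models, key=lambda name: np.abs(h - models[name]).sum())

rng = np.random.default_rng(0)
models = {
    "dark object":   appearance_model(rng.uniform(0.0, 0.3, (16, 16))),
    "bright object": appearance_model(rng.uniform(0.7, 1.0, (16, 16))),
}
print(recognise(rng.uniform(0.75, 0.95, (16, 16)), models))   # -> "bright object"
```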

Applications: Football First-Down Line courtesy of Sportvision

Applications: Virtual Advertising courtesy of Princeton Video Image

First-Down Line, Virtual Advertising: How?
Where should the message go?
–Sensors that measure pan, tilt, zoom and focus are attached to calibrated cameras at surveyed positions
–Knowledge of the 3-D position of the line, advertising rectangle, etc. can be directly translated into where it should appear in the image for a given camera
What pixels get painted?
–Objects such as the ball and players that occlude the region where the graphic is to be placed must be segmented out. They are recognized by being a sufficiently different color from the background at that point. This allows pixel-by-pixel compositing.
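A minimal sketch (not broadcast code) of the two steps just described, assuming NumPy: project a surveyed 3-D point into the image with a calibrated camera, and paint the graphic only where the live pixel still looks like the field colour, so occluding players are left untouched. All numbers are hypothetical.

```python
import numpy as np

# Hypothetical calibrated intrinsics for one camera state (pan/tilt/zoom known).
K = np.array([[1000.0, 0.0, 640.0],
              [0.0, 1000.0, 360.0],
              [0.0,    0.0,   1.0]])

def project(point_in_camera_coords):
    """Surveyed 3-D point (already in camera coordinates) -> pixel (u, v)."""
    u, v, w = K @ point_in_camera_coords
    return u / w, v / w

def composite_pixel(live_rgb, graphic_rgb, field_rgb, tol=30.0):
    """Paint the graphic only where the live pixel matches the field colour."""
    if np.linalg.norm(np.asarray(live_rgb, float) - field_rgb) < tol:
        return graphic_rgb            # unoccluded turf: draw the line / advert
    return live_rgb                   # a player or the ball: keep the live pixel

print(project(np.array([1.0, 0.5, 20.0])))                        # where the mark lands
print(composite_pixel((30, 140, 40), (255, 255, 0), np.array([35.0, 135.0, 45.0])))
```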

Applications: Inserting Computer Graphics with a Moving Camera How does motion complicate things? Opening titles from the movie “Panic Room”

Applications: Inserting Computer Graphics with a Moving Camera courtesy of 2d3

CG Insertion with a Moving Camera: How?
This technique is often called matchmove.
Once again, we need camera calibration, but also information on how the camera is moving (its egomotion). This allows the CG object to correctly move with the real scene, even if we don’t know the 3-D parameters of that scene.
Estimating camera motion:
–Much simpler if we know the camera is moving sideways (e.g., some of the “Panic Room” shots), because then the problem is only 2-D
–For general motions: by identifying and following scene features over the entire length of the shot, we can solve retrospectively for what 3-D camera motion would be consistent with their 2-D image tracks. We must also make sure to ignore independently moving objects like cars and people.
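A minimal sketch of the general-motion case, assuming OpenCV and known intrinsics K; pts_prev and pts_curr would be the 2-D tracks of the same scene features in two frames (placeholders here), and the recovered translation is only up to scale.

```python
import numpy as np
import cv2

# Hypothetical intrinsics of the calibrated film camera.
K = np.array([[1000.0, 0.0, 640.0],
              [0.0, 1000.0, 360.0],
              [0.0,    0.0,   1.0]])

def estimate_egomotion(pts_prev, pts_curr):
    """Camera rotation R and unit-scale translation t consistent with the tracks."""
    # RANSAC inside findEssentialMat helps reject independently moving objects.
    E, inlier_mask = cv2.findEssentialMat(pts_prev, pts_curr, K,
                                          method=cv2.RANSAC, threshold=1.0)
    _, R, t, _ = cv2.recoverPose(E, pts_prev, pts_curr, K)
    return R, t

# Usage sketch: pts_prev, pts_curr are float arrays of shape (N, 2) with N >= 5.
```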

Applications: Rotoscoping 2d3’s Pixeldust

Applications: Motion Capture Vicon software: 12 cameras, 41 markers for body capture; 6 zoom cameras, 30 markers for face

Applications: Motion Capture without Markers courtesy of C. Bregler What’s the difference between these two problems?

Motion Capture: How?
Similar to matchmove in that we follow features and estimate the underlying motion that explains their tracks
The difference is that the motion is not of the camera but rather of the subject (though the camera could be moving, too)
–A face/arm/person has more degrees of freedom than a camera flying through space, but is still constrained
Special markers make feature identification and tracking considerably easier
Multiple cameras gather more information
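A minimal sketch of the estimate-the-motion-from-tracked-markers step for a single rigid segment, assuming NumPy; this is the standard least-squares (Kabsch) fit, and the marker arrays are placeholders.

```python
import numpy as np

def fit_rigid_motion(markers_before, markers_after):
    """Least-squares rotation R and translation t with markers_after ~ R @ markers_before + t.

    Both inputs are (N, 3) arrays of corresponding marker positions in two frames.
    """
    mu_b = markers_before.mean(axis=0)
    mu_a = markers_after.mean(axis=0)
    H = (markers_before - mu_b).T @ (markers_after - mu_a)
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))        # guard against a reflection
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = mu_a - R @ mu_b
    return R, t
```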

Applications: Image-Based Modeling courtesy of P. Debevec Façade project: UC Berkeley Campanile

Image-Based Modeling: How?
–3-D model constructed from manually-selected line correspondences in images from multiple calibrated cameras
–Novel views generated by texture-mapping selected images onto the model

Applications: Robotics Autonomous driving: Lane & vehicle tracking (with radar)

Why is Vision Interesting?
Psychology
–~50% of cerebral cortex is for vision.
–Vision is how we experience the world.
Engineering
–Want machines to interact with the world.
–Digital images are everywhere.

Vision is inferential: Light (

Vision is inferential: Light (

Vision is Inferential: Geometry

Computer Vision Inference  Computation Building machines that see Modeling biological perception

Boundary Detection: Local cues

Boundary Detection

Boundary Detection Finding the Corpus Callosum (G. Hamarneh, T. McInerney, D. Terzopoulos)

(Sharon, Galun, Brandt, Basri)

Texture: photo; repeated pattern

Texture: computer generated; photo

Tracking (Comaniciu and Meer)

Understanding Action

Tracking and Understanding (

Tracking

Stereo

Stereo

Motion Courtesy Yiannis Aloimonos

Motion - Application (

Pose Determination Visually guided surgery

Recognition - Shading: lighting affects appearance

Classification (Funkhouser, Min, Kazhdan, Chen, Halderman, Dobkin, Jacobs)

Viola and Jones: Real-time Face Detection

Vision depends on:
–Geometry
–Physics
–The nature of objects in the world (this is the hardest part).

Modeling + Algorithms
–Build a simple model of the world (e.g., flat, uniform intensity).
–Find provably good algorithms.
–Experiment on the real world.
–Update the model.
Problem: too often, models are simplistic or intractable.

Bayesian inference
Bayes’ law: P(A|B) = P(B|A)*P(A)/P(B)
P(world|image) = P(image|world)*P(world)/P(image)
P(image|world) is computer graphics
–Geometry of projection.
–Physics of light and reflection.
P(world) means modeling objects in the world; this leads to statistical/learning approaches.
Problem: too often, probabilities can’t be known and are invented.
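A tiny numeric sketch of that Bayes-law pattern with made-up numbers, just to make the arithmetic concrete; “cat” and “dog” are hypothetical world states.

```python
# P(world | image) = P(image | world) * P(world) / P(image)
p_world = {"cat": 0.3, "dog": 0.7}               # prior: modeling objects in the world
p_image_given_world = {"cat": 0.8, "dog": 0.1}   # likelihood: the "computer graphics" term

p_image = sum(p_image_given_world[w] * p_world[w] for w in p_world)
posterior = {w: p_image_given_world[w] * p_world[w] / p_image for w in p_world}
print(posterior)   # {'cat': 0.774..., 'dog': 0.225...}
```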

Related Fields
–Graphics (“vision is inverse graphics”)
–Visual perception
–Neuroscience
–AI
–Learning
–Math: e.g., geometry, stochastic processes
–Optimization