Why is computer vision difficult?

1 Why is computer vision difficult?
Viewpoint variation, scale, illumination

2 Why is computer vision difficult?
Motion (Source: S. Lazebnik), intra-class variation, occlusion, background clutter

3 Challenges: local ambiguity
slide credit: Fei-Fei, Fergus & Torralba

4 But there are lots of cues we can exploit…
Source: S. Lazebnik

5 Bottom line
Perception is an inherently ambiguous problem. Many different 3D scenes could have given rise to a particular 2D picture. We often need to use prior knowledge about the structure of the world. Image source: F. Durand

6 Course overview (tentative)
Low-level vision: image processing, edge detection, feature detection, cameras, image formation
Geometry and algorithms: projective geometry, stereo, structure from motion, Markov random fields
Recognition: face detection / recognition, category recognition, segmentation
Light, color, and reflectance
Advanced topics

7 Projects (tentative)
Roughly five projects. The first one will be done solo, the others in groups. You can discuss the projects on a whiteboard, but all code must be your (or your group’s) own. First project to be released today or tomorrow.

8 Project: Image Scissors

9 Project: Feature detection and matching

10 Project: Creating panoramas

11 Project: Recognition
Location recognition, face recognition, object category recognition

12 Grading
Occasional quizzes (at the beginning of class). One prelim, one final exam. Rough grade breakdown: quizzes 5%, midterm 15%, programming projects 60%, final exam 15%.

13 Late policy
Two “late days” will be available for the semester. Late projects will be penalized by 25% for each day they are late, and no extra credit will be awarded.

14 Questions?

15 Lecture 1: Images and image filtering
CS4670/5670: Intro to Computer Vision Noah Snavely Lecture 1: Images and image filtering Hybrid Images, Oliva et al.,

19 Reading Szeliski, Chapter

20 What is an image?

21 What is an image?
Digital camera: we’ll focus on these in this class (more on this process later). The eye. Source: A. Efros

22 What is an image?
A grid (matrix) of intensity values (common to use one byte per value: 0 = black, 255 = white). [Figure: an image shown next to its matrix of intensity values]

23 What is an image?
We can think of a (grayscale) image as a function $f$ from $\mathbb{R}^2$ to $\mathbb{R}$: $f(x, y)$ gives the intensity at position $(x, y)$. A digital image is a discrete (sampled, quantized) version of this function. [Figure: 3D view of $f(x, y)$ plotted over $x$ and $y$]
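A minimal sketch of the "discrete (sampled, quantized)" idea, in plain Python so it has no dependencies. The continuous image `f` below is a hypothetical left-to-right ramp, and the grid size and the [0, 1] coordinate range are assumptions chosen only for illustration.

```python
def f(x, y):
    """Hypothetical continuous image: intensity in [0, 1], brighter to the right."""
    return x


def sample_and_quantize(func, width, height):
    """Sample func on a width x height grid over [0, 1]^2 and quantize to bytes."""
    grid = []
    for row in range(height):
        y = row / (height - 1)
        values = []
        for col in range(width):
            x = col / (width - 1)
            level = int(255 * func(x, y) + 0.5)      # quantize to the nearest level
            values.append(max(0, min(255, level)))   # clamp to one byte (0..255)
        grid.append(values)
    return grid


digital = sample_and_quantize(f, width=5, height=3)
for values in digital:
    print(values)
```

Each printed row is one scanline of the digital image: 0 is black on the left, 255 is white on the right.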

24 Image transformations
As with any function, we can apply operators to an image, for example $g(x, y) = f(x, y) + 20$ or $g(x, y) = f(-x, y)$. We’ll talk about a special kind of operator, convolution (linear filtering).
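The two operators on this slide can be sketched on a small grid of byte values. Two assumptions are made here that the slide does not state: results are clamped to 0..255, and $f(-x, y)$ is read as a mirror about the vertical axis through the image center. The function names are illustrative.

```python
def brighten(image, offset):
    """g(x, y) = f(x, y) + offset, clamped to the one-byte range 0..255."""
    return [[max(0, min(255, value + offset)) for value in row] for row in image]


def mirror_x(image):
    """g(x, y) = f(-x, y): reverse every row (flip about the vertical axis)."""
    return [row[::-1] for row in image]


image = [[10, 20, 250],
         [40, 50, 60]]

brighter = brighten(image, 20)
mirrored = mirror_x(image)
print(brighter)   # [[30, 40, 255], [60, 70, 80]]
print(mirrored)   # [[250, 20, 10], [60, 50, 40]]
```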

25 Question: Noise reduction
Given a camera and a still scene, how can you reduce noise? Answer: take lots of images and average them! What’s the next best thing? Source: S. Seitz
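A small simulation of the averaging answer, under an assumed noise model: each exposure is the true scene plus independent zero-mean Gaussian noise. Under that assumption, averaging N exposures should shrink the noise by roughly a factor of the square root of N. The scene, noise level, and number of exposures are made up.

```python
import random


def capture(scene, sigma, rng):
    """Simulate one exposure: the true scene plus independent Gaussian noise."""
    return [value + rng.gauss(0.0, sigma) for value in scene]


def average(images):
    """Pixel-wise mean of several exposures of the same still scene."""
    return [sum(pixels) / len(images) for pixels in zip(*images)]


def rms_error(estimate, truth):
    """Root-mean-square difference between an estimate and the true scene."""
    return (sum((e - t) ** 2 for e, t in zip(estimate, truth)) / len(truth)) ** 0.5


rng = random.Random(0)     # fixed seed so the run is repeatable
scene = [100.0] * 2000     # hypothetical flat gray scene, stored as one long row
sigma = 10.0               # assumed noise level of a single exposure

single = capture(scene, sigma, rng)
stacked = average([capture(scene, sigma, rng) for _ in range(25)])

err_single = rms_error(single, scene)
err_stacked = rms_error(stacked, scene)
print(err_single, err_stacked)   # the second error is about one fifth of the first
```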

26 Image filtering
Modify the pixels in an image based on some function of a local neighborhood of each pixel. [Figure: local image data, passed through some function, gives modified image data] Source: L. Zhang

27 Linear filtering
One simple version: linear filtering (cross-correlation, convolution). Replace each pixel by a linear combination (a weighted sum) of its neighbors. The prescription for the linear combination is called the “kernel” (or “mask”, “filter”). [Figure: local image data, kernel, modified image data] Source: L. Zhang

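28 Cross-correlation
Let $F$ be the image, $H$ be the kernel (of size $(2k+1) \times (2k+1)$), and $G$ be the output image:
$$G[i, j] = \sum_{u=-k}^{k} \sum_{v=-k}^{k} H[u, v]\, F[i+u, j+v]$$
This is called a cross-correlation operation: $G = H \otimes F$. Can think of it as a “dot product” between local neighborhood and kernel for each pixel.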
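A direct, unoptimized sketch of cross-correlation, writing F for the image, H for the kernel, and G for the output (symbol names assumed here). The slide does not say what happens at the image border. This sketch assumes pixels outside the image are zero.

```python
def cross_correlate(image, kernel):
    """G[i][j] = sum over u, v of H[u][v] * F[i+u][j+v], with zeros outside F."""
    k = len(kernel) // 2                      # kernel is (2k+1) x (2k+1)
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            total = 0.0
            for u in range(-k, k + 1):
                for v in range(-k, k + 1):
                    r, c = i + u, j + v
                    if 0 <= r < rows and 0 <= c < cols:
                        total += kernel[u + k][v + k] * image[r][c]
            out[i][j] = total
    return out


impulse = [[0, 0, 0],
           [0, 1, 0],
           [0, 0, 0]]
kernel = [[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]]

response = cross_correlate(impulse, kernel)
for row in response:
    print(row)
```

Cross-correlating a single bright pixel returns the kernel rotated by 180 degrees, which is the reason convolution flips the kernel first.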

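29 Convolution
Same as cross-correlation, except that the kernel is “flipped” (horizontally and vertically). With image $F$, kernel $H$ of size $(2k+1) \times (2k+1)$, and output image $G$:
$$G[i, j] = \sum_{u=-k}^{k} \sum_{v=-k}^{k} H[u, v]\, F[i-u, j-v]$$
This is called a convolution operation: $G = H * F$. Convolution is commutative and associative.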
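The commutative and associative claims can be checked numerically. The sketch below works in 1D for brevity and uses "full" convolution (the output grows by the kernel length minus one), which is an assumption made so that the properties hold exactly at the borders. Cross-correlation is written as convolution with the flipped kernel, up to an index shift.

```python
def convolve_full(a, b):
    """Full 1D convolution: out[n] = sum over i of a[i] * b[n - i]."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def correlate_full(a, h):
    """Full 1D cross-correlation, written as convolution with the flipped kernel."""
    return convolve_full(a, h[::-1])


a, b, c = [1, 2, 3], [4, 5], [1, -1]

commutes = convolve_full(a, b) == convolve_full(b, a)
associates = (convolve_full(convolve_full(a, b), c)
              == convolve_full(a, convolve_full(b, c)))
correlation_commutes = correlate_full(a, b) == correlate_full(b, a)
print(commutes, associates, correlation_commutes)   # True True False
```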

30 Convolution Adapted from F. Durand

31 Mean filtering
[Figure: worked example of mean filtering on a grid of intensity values]
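A minimal sketch of a 3x3 mean filter. The input grid is made up (a block of 90s on a background of 0s), and pixels outside the image are assumed to be zero.

```python
def mean_filter(image, k=1):
    """Mean over each (2k+1) x (2k+1) neighborhood, treating outside pixels as 0."""
    rows, cols = len(image), len(image[0])
    area = (2 * k + 1) ** 2
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            total = 0
            for r in range(max(0, i - k), min(rows, i + k + 1)):
                for c in range(max(0, j - k), min(cols, j + k + 1)):
                    total += image[r][c]
            out[i][j] = total / area
    return out


image = [
    [0,  0,  0,  0, 0],
    [0, 90, 90, 90, 0],
    [0, 90, 90, 90, 0],
    [0, 90, 90, 90, 0],
    [0,  0,  0,  0, 0],
]

smoothed = mean_filter(image)
for row in smoothed:
    print(row)
```

The hard 0-to-90 edge becomes a gradual ramp (10, 20, 30, ... up to 90 at the center), which is the blurring effect of the mean filter.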

32 Linear filters: examples
[Figure: original image * kernel = identical image] Source: D. Lowe

33 Linear filters: examples
[Figure: original image * kernel = image shifted left by 1 pixel] Source: D. Lowe

34 Linear filters: examples
[Figure: original image * kernel = blur (with a mean filter)] Source: D. Lowe

35 Linear filters: examples
Sharpening filter (accentuates edges). [Figure: original image * sharpening kernel = sharpened image] Source: D. Lowe
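The kernels behind these examples did not survive in the transcript, so the ones below are common textbook choices rather than the exact ones on the slides: an impulse for the identity, an impulse one pixel right of center for the left shift, and twice the impulse minus a 3x3 box for sharpening. They are applied with cross-correlation and zeros assumed outside the image.

```python
def correlate(image, kernel):
    """Cross-correlate with a (2k+1) x (2k+1) kernel, zeros outside the image."""
    k = len(kernel) // 2
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            for u in range(-k, k + 1):
                for v in range(-k, k + 1):
                    if 0 <= i + u < rows and 0 <= j + v < cols:
                        out[i][j] += kernel[u + k][v + k] * image[i + u][j + v]
    return out


identity = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
shift_left = [[0, 0, 0], [0, 0, 1], [0, 0, 0]]     # 1 sits right of center
box = [[1 / 9] * 3 for _ in range(3)]
sharpen = [[2 * identity[r][c] - box[r][c] for c in range(3)] for r in range(3)]

image = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
edge = [[10, 10, 50, 50] for _ in range(3)]        # vertical step edge

same = correlate(image, identity)        # unchanged
shifted = correlate(image, shift_left)   # each row moves one pixel to the left
sharpened = correlate(edge, sharpen)     # contrast across the edge increases
print(shifted)
print(sharpened[1][1], sharpened[1][2])
```

Across the step edge the original values differ by 40. After sharpening, the two middle pixels differ by more than that, which is what "accentuates edges" means here.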

36 Sharpening Source: D. Lowe

37 Smoothing with box filter revisited
I always walk through the argument on the left rather carefully; it gives some insight into the significance of impulse responses or point spread functions. Source: D. Forsyth

38 Gaussian Kernel Source: C. Rasmussen

39 Gaussian filters
$\sigma$ = 1 pixel, $\sigma$ = 5 pixels, $\sigma$ = 10 pixels, $\sigma$ = 30 pixels
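One way to build a discrete Gaussian kernel for a given width in pixels is sketched below. Truncating at about three standard deviations and normalizing the weights to sum to 1 are assumptions of this sketch, not something the slides specify.

```python
import math


def gaussian_kernel(sigma, radius=None):
    """Sampled 2D Gaussian of width sigma (in pixels), normalized to sum to 1."""
    if radius is None:
        radius = math.ceil(3 * sigma)     # assumption: truncate at about 3 sigma
    kernel = [
        [math.exp(-(u * u + v * v) / (2 * sigma * sigma))
         for v in range(-radius, radius + 1)]
        for u in range(-radius, radius + 1)
    ]
    total = sum(sum(row) for row in kernel)
    return [[weight / total for weight in row] for row in kernel]


kernel = gaussian_kernel(1.0)
print(len(kernel), "x", len(kernel[0]))   # 7 x 7
```

Unlike the box filter, the weights fall off smoothly with distance from the center, so nearby pixels count more than distant ones.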

40 Mean vs. Gaussian filtering

41 Gaussian filter
Removes “high-frequency” components from the image (low-pass filter). Convolution with self is another Gaussian: convolving twice with a Gaussian kernel of width $\sigma$ = convolving once with a kernel of width $\sigma\sqrt{2}$. Linear vs. quadratic in mask size. Source: K. Grauman
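The self-convolution property can be checked numerically in 1D: convolving a sampled Gaussian of width 2 with itself should closely match a single sampled Gaussian of width 2 times the square root of 2. The width, truncation radius, and function names are assumptions of this sketch, and the match is only approximate because the kernels are sampled and truncated.

```python
import math


def gaussian_1d(sigma, radius):
    """Sampled 1D Gaussian on -radius..radius, normalized to sum to 1."""
    weights = [math.exp(-x * x / (2 * sigma * sigma))
               for x in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def convolve_full(a, b):
    """Full 1D convolution of two sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


sigma, radius = 2.0, 8
g = gaussian_1d(sigma, radius)
twice = convolve_full(g, g)                             # support radius 16
once = gaussian_1d(sigma * math.sqrt(2), 2 * radius)    # one wider Gaussian
max_diff = max(abs(p - q) for p, q in zip(twice, once))
print(max_diff)   # small compared with the peak weight of the kernel
```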

42 Sharpening revisited
What does blurring take away? original − smoothed (5x5) = detail. Let’s add it back: original + α · detail = sharpened. Source: S. Lazebnik

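43 Sharpen filter
$f + \alpha (f - f * g) = (1 + \alpha) f - \alpha\, f * g = f * ((1 + \alpha) e - \alpha g)$, where $f$ is the image, $f * g$ is the blurred image, $e$ is the unit impulse (identity), and $g$ is the Gaussian. [Figure: scaled impulse minus Gaussian, labeled Laplacian of Gaussian]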

44 Sharpen filter
[Figure: unfiltered vs. filtered image]

45 “Optical” Convolution
Camera shake. [Figure: camera shake shown as a convolution] Source: Fergus, et al. “Removing Camera Shake from a Single Photograph”, SIGGRAPH 2006. Bokeh: blur in out-of-focus regions of an image. Source:

46 Questions? For next time: Read Szeliski, Chapter

