Gaussian Pyramid and Texture

Slides:



Advertisements
Similar presentations
Linear Filtering – Part II Selim Aksoy Department of Computer Engineering Bilkent University
Advertisements

Linear Filtering – Part II Selim Aksoy Department of Computer Engineering Bilkent University
3-D Computational Vision CSc Image Processing II - Fourier Transform.
November 12, 2013Computer Vision Lecture 12: Texture 1Signature Another popular method of representing shape is called the signature. In order to compute.
Pyramids and Texture. Scaled representations Big bars and little bars are both interesting Spots and hands vs. stripes and hairs Inefficient to detect.
Linear Filtering – Part II Selim Aksoy Department of Computer Engineering Bilkent University
Texture. Edge detectors find differences in overall intensity. Average intensity is only simplest difference. many slides from David Jacobs.
School of Computing Science Simon Fraser University
Image processing. Image operations Operations on an image –Linear filtering –Non-linear filtering –Transformations –Noise removal –Segmentation.
Sampling and Pyramids : Rendering and Image Processing Alexei Efros …with lots of slides from Steve Seitz.
Computer Vision - A Modern Approach
General Functions A non-periodic function can be represented as a sum of sin’s and cos’s of (possibly) all frequencies: F(  ) is the spectrum of the function.
Immagini e filtri lineari. Image Filtering Modifying the pixels in an image based on some function of a local neighborhood of the pixels
CSCE 641 Computer Graphics: Fourier Transform Jinxiang Chai.
Announcements For future problems sets: matlab code by 11am, due date (same as deadline to hand in hardcopy). Today’s reading: Chapter 9, except.
Introduction to Computer Vision CS / ECE 181B  Handout #4 : Available this afternoon  Midterm: May 6, 2004  HW #2 due tomorrow  Ack: Prof. Matthew.
CS443: Digital Imaging and Multimedia Filters Spring 2008 Ahmed Elgammal Dept. of Computer Science Rutgers University Spring 2008 Ahmed Elgammal Dept.
Texture Reading: Chapter 9 (skip 9.4) Key issue: How do we represent texture? Topics: –Texture segmentation –Texture-based matching –Texture synthesis.
2D Fourier Theory for Image Analysis Mani Thomas CISC 489/689.
CPSC 641 Computer Graphics: Fourier Transform Jinxiang Chai.
Computer Vision - A Modern Approach Set: Pyramids and Texture Slides by D.A. Forsyth Scaled representations Big bars (resp. spots, hands, etc.) and little.
Computational Photography: Fourier Transform Jinxiang Chai.
Computer Vision P. Schrater Spring 2003
Fourier Analysis : Rendering and Image Processing Alexei Efros.
Computer Vision Spring ,-685 Instructor: S. Narasimhan Wean 5409 T-R 10:30am – 11:50am.
Topic 7 - Fourier Transforms DIGITAL IMAGE PROCESSING Course 3624 Department of Physics and Astronomy Professor Bob Warwick.
Applications of Image Filters Computer Vision CS 543 / ECE 549 University of Illinois Derek Hoiem 02/04/10.
CSC589 Introduction to Computer Vision Lecture 8
CAP5415: Computer Vision Lecture 4: Image Pyramids, Image Statistics, Denoising Fall 2006.
EECS 274 Computer Vision Pyramid and Texture. Filter, pyramid and texture Frequency domain Fourier transform Gaussian pyramid Wavelets Texture Reading:
09/19/2002 (C) University of Wisconsin 2002, CS 559 Last Time Color Quantization Dithering.
Texture Key issue: representing texture –Texture based matching little is known –Texture segmentation key issue: representing texture –Texture synthesis.
Computer Vision Spring ,-685 Instructor: S. Narasimhan Wean 5403 T-R 3:00pm – 4:20pm.
THE LAPLACE TRANSFORM LEARNING GOALS Definition
Edge Detection and Geometric Primitive Extraction Jinxiang Chai.
Edges.
Fourier Transform.
Image hole-filling. Agenda Project 2: Will be up tomorrow Due in 2 weeks Fourier – finish up Hole-filling (texture synthesis) Image blending.
Instructor: Mircea Nicolescu Lecture 7
Last Lecture photomatix.com. Today Image Processing: from basic concepts to latest techniques Filtering Edge detection Re-sampling and aliasing Image.
Linear filters. CS8690 Computer Vision University of Missouri at Columbia What is Image Filtering? Modify the pixels in an image based on some function.
Fourier Transform (Chapter 4) CS474/674 – Prof. Bebis.
Miguel Tavares Coimbra
Edge Detection slides taken and adapted from public websites:
Sampling, Template Matching and Pyramids
Linear Filters and Edges Chapters 7 and 8
- photometric aspects of image formation gray level images
… Sampling … … Filtering … … Reconstruction …
Linear Filtering – Part II
Linear Filters and Edges Chapters 7 and 8
Linear Filters and Edges Chapters 7 and 8
Department of Computer Engineering
(C) 2002 University of Wisconsin, CS 559
Edge Detection CS 678 Spring 2018.
General Functions A non-periodic function can be represented as a sum of sin’s and cos’s of (possibly) all frequencies: F() is the spectrum of the function.
Fourier Transform.
All about convolution.
Outline Linear Shift-invariant system Linear filters
Computer Vision Lecture 16: Texture II
CSCE 643 Computer Vision: Thinking in Frequency
Texture.
Phase and Amplitude in Fourier Transforms, Meaning of frequencies
Oh, no, wait, sorry, I meant, welcome to the implementation of 20th-century science fiction literature, or a small part of it, where we will learn about.
Basic Image Processing
Linear Filtering CS 678 Spring 2018.
Department of Computer Engineering
IT472 Digital Image Processing
IT472 Digital Image Processing
DIGITAL CONTROL SYSTEM WEEK 3 NUMERICAL APPROXIMATION
DIGITAL IMAGE PROCESSING Elective 3 (5th Sem.)
Presentation transcript:

Gaussian Pyramid and Texture EECS 274 Computer Vision Gaussian Pyramid and Texture

Scaled representations Big bars (resp. spots, hands, etc.) and little bars are both interesting Stripes and hairs, say Inefficient to detect big bars with big filters And there is superfluous detail in the filter kernel Alternative: Apply filters of fixed size to images of different sizes Typically, a collection of images whose edge length changes by a factor of 2 (or root 2) This is a pyramid (or Gaussian pyramid) by visual analogy

A bar in the big images is a hair on the zebra’s nose; in smaller images, a stripe; in the smallest, the animal’s nose

Aliasing Can’t shrink an image by taking every second pixel If we do, characteristic errors appear In the next few slides Typically, small phenomena look bigger; fast phenomena can look slower Common phenomenon Wagon wheels rolling the wrong way in movies Checkerboards misrepresented in ray tracing Striped shirts look funny on colour television

Resample the checkerboard by taking one sample at each circle Resample the checkerboard by taking one sample at each circle. In the case of the top left board, new representation is reasonable. Top right also yields a reasonable representation. Bottom left is all black (dubious) and bottom right has checks that are too big. Sampling scheme is crucially related to frequency

Constructing a pyramid by taking every second pixel leads to layers that badly misrepresent the top layer

Open questions What causes the tendency of differentiation to emphasize noise? In what precise respects are discrete images different from continuous images? How do we avoid aliasing? General thread: a language for fast changes --- The Fourier Transform It’s worth elaborating the nature of the links to fast changes here; differentiation emphasizes noise, because derivatives are large at fast changes, and noise pixels tend to be different from their neighbours. The difference between discrete and continuous images is important here, because aliasing is related to passing from one to the other (the checkerboard example).

The Fourier Transform Represent function on a new basis Think of functions as vectors, with many components We now apply a linear transformation to transform the basis dot product with each basis element In the expression, u and v select the basis element, so a function of x and y becomes a function of u and v basis elements have the form Measure amount of the sinusoid with given frequency and orientation in the signal

as a function of x,y for some fixed u, v. To get some sense of what basis elements look like, we plot a basis element --- or rather, its real part --- as a function of x,y for some fixed u, v. We get a function that is constant when (ux+vy) is constant, e.g., (u,v)=(1,2) The magnitude of the vector (u, v) gives a frequency, and its direction gives an orientation. The function is a sinusoid with this frequency along the direction, and constant perpendicular to the direction. Spatial frequency component

Here u and v are larger than in the previous slide, (u,v)=(0,0.4)

And larger still...

Phase and Magnitude Fourier transform of a real function is complex difficult to plot, visualize instead, we can think of the phase and magnitude of the transform Phase is the phase of the complex transform Magnitude is the magnitude of the complex transform Curious fact all natural images have about the same magnitude transform hence, phase seems to matter, but magnitude largely doesn’t Demonstration Take two pictures, swap the phase transforms, compute the inverse - what does the result look like?

This is the magnitude transform of the cheetah pic

This is the phase transform of the cheetah pic

This is the magnitude transform of the zebra pic

This is the phase transform of the zebra pic

Reconstruction with zebra phase, cheetah magnitude Magnitude spctrum of an image is rather uninformative

Reconstruction with cheetah phase, zebra magnitude

Various Fourier Transform Pairs Important facts The Fourier transform is linear There is an inverse FT if you scale the function’s argument, then the transform’s argument scales the other way. This makes sense --- if you multiply a function’s argument by a number that is larger than one, you are stretching the function, so that high frequencies go to low frequencies The FT of a Gaussian is a Gaussian. The convolution theorem The Fourier transform of the convolution of two functions is the product of their Fourier transforms The inverse Fourier transform of the product of two Fourier transforms is the convolution of the two inverse Fourier transforms There’s a table in the book.

Sampling in 1D takes a continuous function and replaces it with a vector of values, consisting of the function’s values at a set of sample points. We’ll assume that these sample points are on a regular grid, and can place one at each integer for convenience.

Sampling in 2D does the same thing, only in 2D Sampling in 2D does the same thing, only in 2D. We’ll assume that these sample points are on a regular grid, and can place one at each integer point for convenience.

A continuous model for a sampled function We want to be able to approximate integrals sensibly Leads to the delta function model on right This slide needs the instructor to tell the story. The point is that modelling a sampled signal as a function that is zero everywhere except at the sample points, where it takes the value of the original function, doesn’t work, because in this model the integral of the sampled function is zero, which is a poor approximation to the original integral. If we use a delta function --- whose integral property you should show, as a limit of box functions (it’s in the book) --- then we can get something that integrates the right way. Zero everywhere except at integer points

The Fourier transform of a sampled signal The crucial missing bit here is that the Fourier transform of a bed-of-nails function is another bed-of-nails function. I mention this, but don’t prove it, as it takes some sophistication with limits and sums. It is usually a good idea to point out to students that the linearity of the Fourier transform doesn’t mean you can take an infinite sum of shifted versions of the Fourier transform of a single delta function and expect the right answer. F(u,v) is the Fourier transform of f(x,y)

Example Transform of box filter is sinc. Transform of Gaussian is Gaussian. (Trucco and Verri)

Fourier transform of the sampled signal consists of a sum of copies of the Fourier transform of the original signal, shifted with respect to each other by the sampling frequency. If the shifted copies do not overlap, the original signal can be reconstructed from the sampled signal. If they overlap, we cannot obtain a separate copy of the Fourier transform, and the signal is aliased

If they overlap, we cannot obtain a separate copy of the Fourier transform, and the signal is aliased This and the last slid constitute a “proof by drawing” of Nyquist’s theorem. It’s worth mentioning this, though we don’t use it explicitly. The crucial point to mention here is that sampling doesn’t work if you have frequency components that are too high.

Smoothing as low-pass filtering The message of the FT is that high frequencies lead to trouble with sampling. Solution: suppress high frequencies before sampling multiply the FT of the signal with something that suppresses high frequencies or convolve with a low-pass filter A filter whose FT is a box is bad, because the filter kernel has infinite support Common solution: use a Gaussian multiplying FT by Gaussian is equivalent to convolving image with Gaussian.

Sampling without smoothing Sampling without smoothing. Top row shows the images, sampled at every second pixel to get the next; bottom row shows the magnitude spectrum of these images.

Sampling with smoothing. Top row shows the images Sampling with smoothing. Top row shows the images. We get the next image by smoothing the image with a Gaussian with σ=1 pixel, then sampling at every second pixel to get the next; bottom row shows the magnitude spectrum of these images. Low pass filter suppresses high frequency components with less aliasing

Large σ  less aliasing, but with little detail Sampling with smoothing. Top row shows the images. We get the next image by smoothing the image with a Gaussian with σ=1.4 pixels, then sampling at every second pixel to get the next; bottom row shows the magnitude spectrum of these images. The phenomena at the coarsest scale are related to the fact that a Gaussian is a fairly poor low-pass filter. If the sigma is big, then the reconstruction error will be smaller (because the function is flatter in the relevant region of FT space), but aliasing will be larger, because it doesn’t die off quickly enough. Large σ  less aliasing, but with little detail Gaussian is not an ideal low-pass filter

Applications of scaled representations Search for correspondence look at coarse scales, then refine with finer scales, coarse-to-fine matching 4 × 4, 16 × 16, …, 1024 × 1024 versions of images Edge tracking a “good” edge at a fine scale has parents at a coarser scale Control of detail and computational cost in matching e.g. finding stripes terribly important in texture representation

The Gaussian pyramid Smooth with Gaussians, because Synthesis Analysis a Gaussian*Gaussian=another Gaussian Synthesis smooth and sample Analysis take the top image

Texture Key issue: representing texture Texture based matching little is known Texture segmentation key issue: representing texture Texture synthesis useful; also gives some insight into quality of representation Shape from texture cover superficially

Typical textured images. For materials such as brush, grass, foliage and water, our perception of what the material is is quite intimately related to the texture (for the figure on the left, what would the surface feel like if you ran your fingers over it? is it wet?, etc.). Notice how much information you are getting about the type of plants, their shape, etc. from the textures in the figure on the right. These textures are also made of quite stylized subelements, arranged in a rough pattern.

Representing textures Textures are made up of quite stylised subelements, repeated in meaningful ways Representation: find the subelements, and represent their statistics But what are the subelements, and how do we find them? recall normalized correlation find subelements by applying filters, looking at the magnitude of the response What filters? experience suggests spots and oriented bars at a variety of different scales details probably don’t matter What statistics? within reason, the more the merrier. At least, mean and standard deviation better, various conditional histograms.

Spots and bars at a fine scale for the butterfly; images show squared response for corresponding filter

ditto, for a coarser scale ditto, for a coarser scale. The size of the filter has stayed fixed, but the image is half the original size (this points to the utility of pyramids).

I’ve scaled up the coarse scale responses so they can be compared I’ve scaled up the coarse scale responses so they can be compared. Notice that these representations don’t respond uniquely --- i.e. there isn’t one peak at one orientation and one scale for a given bar --- but they do give quite a good idea of what is there. If they were tightly tuned, there would have to be a lot more filters. But looking at these pix gives quite a good idea of where there are big bars, where there are little ones, and at what orientation.

Gabor filters at different scales and spatial frequencies top row shows anti-symmetric (or odd) filters, bottom row the symmetric (or even) filters. Gabor filters are just another way to generate spots and bars.

The Laplacian Pyramid Synthesis Analysis preserve difference between upsampled Gaussian pyramid level and Gaussian pyramid level band pass filter - each level represents spatial frequencies (largely) unrepresented at other levels Analysis reconstruct Gaussian pyramid, take top layer

This is a Gaussian pyramid

And this a Laplacian pyramid And this a Laplacian pyramid. Notice that the lowest layer is the same in each case, and that if I upsample the lowest layer and add to the next, I get the Gauss layer. Notice also that each layer shows detail at a particular scale --- these are, basically, bandpass filtered versions of the image.

And this shows a (rough) representation of what happens in FT space when we form pyramids. In the Laplace pyramid, each layer represents a range of frequencies ---an annulus in FT space --- or, roughly, a scale. We’d actually like to break out each annulus into an orientation, which is equivalent to a wedge. We won’t discuss the design of the relevant filters in detail, just note that this is possible. If we do this, we have an oriented pyramid.

Oriented pyramids Laplacian pyramid is orientation independent Apply an oriented filter to determine orientations at each layer by clever filter design, we can simplify synthesis this represents image information at a particular scale and orientation

on Information Theory, 1992, copyright 1992, IEEE This is an example with four orientations at each layer Reprinted from “Shiftable MultiScale Transforms,” by Simoncelli et al., IEEE Transactions on Information Theory, 1992, copyright 1992, IEEE

Analysis

synthesis

Final texture representation Form an oriented pyramid (or equivalent set of responses to filters at different scales and orientations). Square the output Take statistics of responses e.g. mean of each filter output (are there lots of spots) std of each filter output mean of one scale conditioned on other scale having a particular range of values (e.g. are the spots in straight rows?)

Texture synthesis Use image as a source of probability model Choose pixel values by matching neighbourhood, then filling in Matching process look at pixel differences count only synthesized pixels

Figure from Texture Synthesis by Non-parametric Sampling, A Figure from Texture Synthesis by Non-parametric Sampling, A. Efros and T.K. Leung, Proc. Int. Conf. Computer Vision, 1999 copyright 1999, IEEE

Variations Texture synthesis at multiple scales Texture synthesis on surfaces Texture synthesis by tiles “Analogous” texture synthesis