CS 2750: Machine Learning Bias-Variance Trade-off (cont’d) + Image Representations Prof. Adriana Kovashka University of Pittsburgh January 20, 2016.


Announcement: Homework 1 is now due Feb. 1.

Generalization How well does a learned model generalize from the data it was trained on to a new test set? Training set (labels known) vs. test set (labels unknown). Slide credit: L. Lazebnik

Components of expected loss
– Noise in our observations: unavoidable
– Bias: how much the average model over all training sets differs from the true model; error due to inaccurate assumptions/simplifications made by the model
– Variance: how much models estimated from different training sets differ from each other
Underfitting: model is too “simple” to represent all the relevant class characteristics
– High bias and low variance
– High training error and high test error
Overfitting: model is too “complex” and fits irrelevant characteristics (noise) in the data
– Low bias and high variance
– Low training error and high test error
Adapted from L. Lazebnik
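The decomposition above can be illustrated numerically. A sketch (not from the slides) that fits polynomials of several orders to many small training sets drawn from a noisy sine curve — the running example in the Bishop slides that follow — and estimates bias^2 and variance empirically; the data-generating function, noise level, and set sizes here are all assumptions:

```python
# Empirical bias/variance: fit each model class on many training sets,
# then compare the average fit to the true function (bias) and the
# spread of fits around their average (variance).
import numpy as np

rng = np.random.default_rng(0)
x_test = np.linspace(0, 1, 50)
true_f = np.sin(2 * np.pi * x_test)

def avg_predictions(degree, n_sets=200, n_points=10, noise=0.3):
    """Fit `degree`-order polynomials on many training sets; return
    the per-x mean prediction and per-x variance of predictions."""
    preds = []
    for _ in range(n_sets):
        x = rng.uniform(0, 1, n_points)
        y = np.sin(2 * np.pi * x) + rng.normal(0, noise, n_points)
        w = np.polyfit(x, y, degree)
        preds.append(np.polyval(w, x_test))
    preds = np.array(preds)
    return preds.mean(axis=0), preds.var(axis=0)

results = {}
for d in (0, 3, 9):
    mean_pred, var_pred = avg_predictions(d)
    results[d] = (np.mean((mean_pred - true_f) ** 2), np.mean(var_pred))
    print(f"degree {d}: bias^2 ~ {results[d][0]:.3f}, variance ~ {results[d][1]:.3f}")
```

The 0th-order model should show high bias and low variance, the 9th-order model the reverse — the same trade-off pictured on the slides.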

Bias-Variance Trade-off Models with too few parameters are inaccurate because of a large bias (not enough flexibility). Models with too many parameters are inaccurate because of a large variance (too much sensitivity to the sample). Red dots = training data (all that we see before we ship off our model!). Green curve = true underlying model. Blue curve = our predicted model/fit. Purple dots = possible test points. Think about “squinting”… Adapted from D. Hoiem

Polynomial Curve Fitting Slide credit: Chris Bishop

Sum-of-Squares Error Function Slide credit: Chris Bishop

0th Order Polynomial Slide credit: Chris Bishop

1st Order Polynomial Slide credit: Chris Bishop

3rd Order Polynomial Slide credit: Chris Bishop

9th Order Polynomial Slide credit: Chris Bishop

Over-fitting Root-Mean-Square (RMS) Error: Slide credit: Chris Bishop
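The error measures on these Bishop slides are the sum-of-squares error E(w) = 1/2 * sum_n (y(x_n, w) - t_n)^2 and the root-mean-square error E_RMS = sqrt(2 * E(w*) / N), which is comparable across data set sizes. A minimal sketch of both:

```python
import math

def sum_of_squares_error(preds, targets):
    # Bishop's E(w) = 1/2 * sum_n (y(x_n, w) - t_n)^2
    return 0.5 * sum((p - t) ** 2 for p, t in zip(preds, targets))

def rms_error(preds, targets):
    # E_RMS = sqrt(2 * E / N): normalizes by data set size and is
    # measured in the same units as the target variable.
    n = len(targets)
    return math.sqrt(2.0 * sum_of_squares_error(preds, targets) / n)

print(rms_error([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))
```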

Data Set Size: 9th Order Polynomial Slide credit: Chris Bishop

Data Set Size: 9th Order Polynomial Slide credit: Chris Bishop

How to reduce over-fitting? Get more training data Slide credit: D. Hoiem

Regularization Penalize large coefficient values (Remember: We want to minimize this expression.) Adapted from Chris Bishop
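The penalized error minimized here is E~(w) = 1/2 * sum_n (y(x_n, w) - t_n)^2 + (lambda/2) * ||w||^2, which for polynomial features has the closed-form solution w = (Phi^T Phi + lambda * I)^(-1) Phi^T t. A sketch under assumptions (the synthetic data and the particular lambda are chosen only for illustration):

```python
import numpy as np

def fit_poly_ridge(x, t, degree, lam):
    """Regularized least squares: minimize
    1/2 * ||Phi w - t||^2 + lam/2 * ||w||^2,
    solved via w = (Phi^T Phi + lam * I)^(-1) Phi^T t."""
    Phi = np.vander(x, degree + 1, increasing=True)  # columns 1, x, x^2, ...
    A = Phi.T @ Phi + lam * np.eye(degree + 1)
    return np.linalg.solve(A, Phi.T @ t)

rng = np.random.default_rng(1)
x = rng.uniform(0, 1, 10)
t = np.sin(2 * np.pi * x) + rng.normal(0, 0.3, 10)

w_free = fit_poly_ridge(x, t, degree=9, lam=0.0)        # no regularization
w_reg = fit_poly_ridge(x, t, degree=9, lam=np.exp(-3))  # penalized
print(np.abs(w_free).max(), np.abs(w_reg).max())
```

The penalty shrinks the huge coefficients of the unregularized 9th-order fit — exactly the effect in the "Polynomial Coefficients" table below.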

Regularization: Slide credit: Chris Bishop

Regularization: Slide credit: Chris Bishop

Polynomial Coefficients Slide credit: Chris Bishop

Polynomial Coefficients Adapted from Chris Bishop No regularization Huge regularization

Regularization: vs. Slide credit: Chris Bishop

Bias-variance Figure from Chris Bishop

How to reduce over-fitting? Get more training data Regularize the parameters Slide credit: D. Hoiem

Bias-variance tradeoff As model complexity grows, training error falls while test error first falls then rises: underfitting (high bias, low variance) at low complexity, overfitting (low bias, high variance) at high complexity. Slide credit: D. Hoiem

Bias-variance tradeoff Test error vs. complexity, plotted for many vs. few training examples: with more data, more complex (low bias, high variance) models become viable. Slide credit: D. Hoiem

Effect of training size For a fixed prediction model, training error and test error converge toward the generalization error as the number of training examples grows. Adapted from D. Hoiem

How to reduce over-fitting? Get more training data Regularize the parameters Use fewer features Choose a simpler classifier Use a validation set to find when overfitting occurs (where test error starts rising). Adapted from D. Hoiem

Remember… Three kinds of error – Inherent: unavoidable – Bias: due to over-simplifications – Variance: due to inability to perfectly estimate parameters from limited data Try simple classifiers first Use increasingly powerful classifiers with more training data (bias-variance trade-off) Adapted from D. Hoiem

Image Representations Keypoint-based image description – Extraction / detection of keypoints – Description (via gradient histograms) Texture-based – Filter bank representations – Filtering

An image is a set of pixels… What we see vs. what a computer sees. Adapted from S. Narasimhan

Problems with pixel representation Not invariant to small changes – Translation – Illumination – etc. Some parts of an image are more important than others

Human eye movements Yarbus eye tracking D. Hoiem

Choosing distinctive interest points If you wanted to meet a friend would you say a)“Let’s meet on campus.” b)“Let’s meet on Green street.” c)“Let’s meet at Green and Wright.” – Corner detection Or if you were in a secluded area: a)“Let’s meet in the Plains of Akbar.” b)“Let’s meet on the side of Mt. Doom.” c)“Let’s meet on top of Mt. Doom.” – Blob (valley/peak) detection D. Hoiem

Interest points Suppose you have to click on some point, go away and come back after I deform the image, and click on the same points again. – Which points would you choose? original deformed D. Hoiem

Corners as distinctive interest points We should easily recognize the point by looking through a small window: shifting the window in any direction should give a large change in intensity. “Flat” region: no change in any direction; “edge”: no change along the edge direction; “corner”: significant change in all directions. A. Efros, D. Frolova, D. Simakov
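The window-shifting intuition is formalized by the Harris detector (applied on the next slide): accumulate a second-moment matrix M of gradient outer products over a window, then score R = det(M) - k * trace(M)^2. A rough sketch with a hand-rolled 3x3 box window; k = 0.04 is a common choice, and the synthetic test image is an assumption:

```python
import numpy as np

def harris_response(img, k=0.04):
    """Harris corner score per pixel: R = det(M) - k * trace(M)^2,
    where M sums gradient outer products over a 3x3 window."""
    Iy, Ix = np.gradient(img.astype(float))  # row- and column-direction gradients
    Ixx, Iyy, Ixy = Ix * Ix, Iy * Iy, Ix * Iy

    def box3(a):
        # 3x3 box-window sum via zero-padding and shifted adds
        p = np.pad(a, 1)
        return sum(p[i:i + a.shape[0], j:j + a.shape[1]]
                   for i in range(3) for j in range(3))

    Sxx, Syy, Sxy = box3(Ixx), box3(Iyy), box3(Ixy)
    det = Sxx * Syy - Sxy ** 2
    trace = Sxx + Syy
    return det - k * trace ** 2

# Synthetic image: a bright square on a dark background
img = np.zeros((20, 20))
img[5:15, 5:15] = 1.0
R = harris_response(img)
print(R[5, 5], R[5, 10], R[10, 10])  # square corner, edge midpoint, flat interior
```

Matching the slide's three cases: the response is large and positive at the corner, negative along the edge (change in only one direction), and zero in the flat region.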

K. Grauman Example of Harris application

Local features: desired properties Repeatability – The same feature can be found in several images despite geometric and photometric transformations Distinctiveness – Each feature has a distinctive description Compactness and efficiency – Many fewer features than image pixels Locality – A feature occupies a relatively small area of the image; robust to clutter and occlusion Adapted from K. Grauman

Overview of Keypoint Description 1. Find a set of distinctive keypoints. 2. Define a region around each keypoint. 3. Compute a local descriptor from the normalized region. Adapted from K. Grauman, B. Leibe

Gradients

SIFT Descriptor [Lowe, ICCV 1999] Histogram of oriented gradients Captures important texture information Robust to small translations / affine deformations K. Grauman, B. Leibe

HOG Descriptor Computes histograms of gradients per region of the image and concatenates them. N. Dalal and B. Triggs, Histograms of Oriented Gradients for Human Detection, CVPR 2005. Image credit: N. Snavely
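The common core of SIFT and HOG — a magnitude-weighted histogram of gradient orientations over a local region — can be sketched as follows. The bin count and the toy cell are assumptions, and real descriptors add normalization, interpolation, and spatial blocks:

```python
import numpy as np

def orientation_histogram(cell, n_bins=8):
    """Histogram of gradient orientations over one cell, with each
    pixel voting in proportion to its gradient magnitude."""
    gy, gx = np.gradient(cell.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.arctan2(gy, gx) % (2 * np.pi)            # map angles to [0, 2*pi)
    bins = (ang / (2 * np.pi) * n_bins).astype(int) % n_bins
    hist = np.zeros(n_bins)
    np.add.at(hist, bins.ravel(), mag.ravel())        # magnitude-weighted votes
    return hist

# A vertical step edge: all gradients point along +x, so one bin dominates
cell = np.tile([0, 0, 0, 0, 1, 1, 1, 1], (8, 1)).astype(float)
hist = orientation_histogram(cell)
print(hist.argmax(), hist)
```

This captures why the representation is robust to small translations: shifting the edge a pixel or two changes which pixels vote, but not the histogram they produce.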

What is this?

What is this?

What is this?

Image Representations Keypoint-based image description – Extraction / detection of keypoints – Description (via gradient histograms) Texture-based – Filter bank representations – Filtering [read the extra slides if interested]

Texture Marks and patterns, e.g. those caused by grooves; can include regular or more random patterns.

Texture representation Textures are made up of repeated local patterns, so: – Find the patterns Use “filters” that look like patterns (spots, bars, raw patches…) Consider magnitude of response – Describe their statistics within each image E.g. histogram of pattern occurrences Results in a d-dimensional feature vector, where d is the number of patterns/filters Adapted from Kristen Grauman
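A toy version of this pipeline, assuming a hypothetical two-filter bank (one horizontal and one vertical "bar" filter) instead of the full multi-scale, multi-orientation bank on the next slide, and summarizing each filter by its mean response magnitude rather than a full histogram:

```python
import numpy as np

def correlate2d_valid(img, filt):
    """'Valid' cross-correlation: slide filt over img, sum of products."""
    fh, fw = filt.shape
    h, w = img.shape[0] - fh + 1, img.shape[1] - fw + 1
    out = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(img[i:i + fh, j:j + fw] * filt)
    return out

# Hypothetical 2-filter bank: horizontal-bar and vertical-bar detectors
bank = [
    np.array([[-1, -1, -1], [2, 2, 2], [-1, -1, -1]], float),  # horizontal bar
    np.array([[-1, 2, -1], [-1, 2, -1], [-1, 2, -1]], float),  # vertical bar
]

def texture_descriptor(img):
    """d-dimensional descriptor: mean response magnitude per filter."""
    return np.array([np.abs(correlate2d_valid(img, f)).mean() for f in bank])

stripes_h = np.tile([[0.0], [1.0]], (4, 8))  # 8x8 horizontal stripes
desc = texture_descriptor(stripes_h)
print(desc)
```

Horizontal stripes light up the horizontal-bar filter and leave the vertical one silent, so the 2-dimensional descriptor already separates the two stripe orientations.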

Filter banks What filters to put in the bank? – Typically we want a combination of scales and orientations, different types of patterns. Matlab code available for these examples: scales orientations “Edges” “Bars” “Spots”

Image from Kristen Grauman

Showing magnitude of responses Kristen Grauman

[r1, r2, …, r38] Patch description: A feature vector formed from the list of responses at each pixel. Adapted from Kristen Grauman

You try: Can you match the texture to the response? Mean responses / Filters A, B, C. Derek Hoiem. Answer: 1 → B, 2 → C, 3 → A

How do we compute these responses? The remaining slides are optional (i.e. view them if you’re interested)

Next time Unsupervised learning: clustering

Image filtering Compute a function of the local neighborhood at each pixel in the image – Function specified by a “filter” or mask saying how to combine values from neighbors. Uses of filtering: – De-noise an image Expect pixels to be like their neighbors Expect noise processes to be independent from pixel to pixel – Extract information (texture, edges, etc) Adapted from Derek Hoiem

Moving Average In 2D Source: S. Seitz

Moving Average In 2D Source: S. Seitz

Moving Average In 2D Source: S. Seitz

Moving Average In 2D Source: S. Seitz

Moving Average In 2D Source: S. Seitz

Moving Average In 2D Source: S. Seitz

Correlation filtering Say the averaging window size is (2k+1) x (2k+1): loop over all pixels in the neighborhood around image pixel F[i,j] and attribute uniform weight to each pixel. Now generalize to allow different weights depending on the neighboring pixel’s relative position: non-uniform weights. Filtering an image = replace each pixel with a linear combination of its neighbors.
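The update just described — replace each pixel with a weighted sum over its (2k+1) x (2k+1) neighborhood — as a direct sketch in pure Python (border pixels are left untouched for simplicity):

```python
def correlate(F, H, k):
    """G[i, j] = sum over u, v in [-k, k] of H[u, v] * F[i+u, j+v].
    H is indexed with offsets shifted by +k; borders are skipped."""
    h, w = len(F), len(F[0])
    G = [[0.0] * w for _ in range(h)]
    for i in range(k, h - k):
        for j in range(k, w - k):
            G[i][j] = sum(H[u + k][v + k] * F[i + u][j + v]
                          for u in range(-k, k + 1)
                          for v in range(-k, k + 1))
    return G

# Uniform weights (moving average) over a 3x3 window, i.e. k = 1
box = [[1 / 9] * 3 for _ in range(3)]
F = [[0, 0, 0, 0, 0],
     [0, 0, 0, 0, 0],
     [0, 0, 9, 0, 0],
     [0, 0, 0, 0, 0],
     [0, 0, 0, 0, 0]]
G = correlate(F, box, k=1)
print(G[2][2])  # the bright pixel, averaged with its 8 dark neighbors
```

Swapping `box` for any other weight matrix H gives the non-uniform case the slide generalizes to.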

Correlation filtering (worked example, animated over several slides) F = image, H = filter. The filter slides over the image; at each output position (i, j), the products F[i+u, j+v] x H[u, v] are accumulated one offset at a time — u = -1 with v = -1, 0, +1, then u = 0, and so on — until the full weighted sum for that pixel is computed.

Practice with linear filters Original ? Source: D. Lowe

Practice with linear filters Original Filtered (no change) Source: D. Lowe

Practice with linear filters Original ? Source: D. Lowe

Practice with linear filters Original Shifted left by 1 pixel with correlation Source: D. Lowe

Practice with linear filters Original ? Source: D. Lowe

Practice with linear filters Original Blur Source: D. Lowe

Practice with linear filters Original ? Source: D. Lowe

Practice with linear filters Original Sharpening filter: accentuates differences with local average Source: D. Lowe

Filtering examples: sharpening

Gaussian filter What if we want the nearest neighboring pixels to have the most influence on the output? This kernel is an approximation of a 2D Gaussian function. Source: S. Seitz
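A sketch of building such a kernel (the window size and sigma below are assumptions; the kernel is normalized to sum to 1 so that filtering preserves overall image brightness):

```python
import math

def gaussian_kernel(k, sigma):
    """(2k+1) x (2k+1) kernel sampling exp(-(x^2 + y^2) / (2 sigma^2)),
    normalized so the weights sum to 1."""
    vals = [[math.exp(-(x * x + y * y) / (2 * sigma * sigma))
             for x in range(-k, k + 1)]
            for y in range(-k, k + 1)]
    s = sum(map(sum, vals))
    return [[v / s for v in row] for row in vals]

G = gaussian_kernel(1, sigma=1.0)
print(G[1][1], G[0][1], G[0][0])  # center > edge neighbor > corner neighbor
```

Unlike the uniform box filter, the center pixel gets the largest weight and influence falls off smoothly with distance, which is exactly the property the slide asks for.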