Image Priors and the Sparse-Land Model


Let's Start with a Virtual Experiment … What are we Expected to See? Suppose that we take a VERY LARGE set of small images – say that we have accumulated 10^12 patches, each of size 20×20 pixels. Clearly, every such image is a point in R^400. Let's put these points in this 400-dimensional Euclidean space, in the cube [0,1]^400. Now, let's step into this space and look at the cloud of points we have just generated. What are we expected to see? Deserts! Vast emptiness! Why? Points concentrate in certain regions, with different densities from one place to another, forming filaments and manifold-like structures. In this experiment we have actually created an empirical estimate of the Probability Density Function (PDF) of images – call it P(x).

So, Let's Talk about This P(x) We "experimented" with small images, but the same phenomena will be found in audio, seismic data, financial data, text files, … and practically any source of information you are familiar with. Nevertheless, we will stick to images for the discussion. Imagine this: a function that is given an image and returns its chances to exist! Amazing, no? Well, what could you do with such a function? Answer: EVERYTHING.

Signal/Image Prior P(x) What is it good for? Denoising: The measurement is y = x0 + v and we are trying to recover x0. The estimate should be pulled toward a region where P(x) is high. Recall that for random noise we have E{(y-x0)^T x0} = 0.

Signal/Image Prior P(x) What is it good for? Denoising: The measurement is y = x0 + v and we are trying to recover x0: Option 1: MAP – choose the most probable signal, x̂ = argmax_x P(x|y). Option 2: MMSE – choose the posterior mean, x̂ = E{x|y}.
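Below is a minimal illustrative sketch (not from the slides) contrasting the two options on a toy discrete prior with Gaussian noise; the candidate signals, prior probabilities, and noise level are all made-up assumptions.

```python
import numpy as np

# Toy MAP vs. MMSE comparison under an assumed discrete prior P(x).
rng = np.random.default_rng(0)
candidates = np.array([[0.0, 0.0], [1.0, 1.0], [1.0, -1.0]])  # possible signals x
prior = np.array([0.5, 0.3, 0.2])                              # P(x) for each candidate
sigma = 0.7                                                    # noise std (assumed)

x_true = candidates[1]
y = x_true + sigma * rng.standard_normal(2)                    # noisy measurement y = x0 + v

# Posterior P(x|y) ~ P(y|x) P(x), with a Gaussian likelihood
likelihood = np.exp(-np.sum((y - candidates) ** 2, axis=1) / (2 * sigma ** 2))
posterior = likelihood * prior
posterior /= posterior.sum()

x_map = candidates[np.argmax(posterior)]   # Option 1: MAP - the most probable x
x_mmse = posterior @ candidates            # Option 2: MMSE - the posterior mean

print("MAP :", x_map)
print("MMSE:", x_mmse)
```

The MMSE estimate is a weighted average of the candidates and generally does not coincide with any single one of them, whereas MAP always picks one candidate.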

Signal/Image Prior P(x) What is it good for? Inverse Problems: The measurement is y = Hx0 + v and we are trying to recover x0, as before. H could be a blur, a projection, downscaling, subsampling, …

Signal/Image Prior P(x) What is it good for? Compression: We are given x and a budget of B bits. Our goal is to get the best possible compression (i.e., minimize the error). The approach we take is to divide the whole domain into 2^B disjoint (Voronoi) sets and minimize the error w.r.t. the representation vectors (VQ).
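As a rough sketch of this VQ idea, the snippet below trains 2^B codewords with k-means on samples standing in for draws from P(x), and compresses each signal to the index of its nearest codeword; the dimensions, bit budget, and the Gaussian stand-in for P(x) are illustrative assumptions, not the slides' setup.

```python
import numpy as np
from scipy.cluster.vq import kmeans2, vq

rng = np.random.default_rng(1)
B = 4                                   # bit budget -> 2^B codewords
train = rng.standard_normal((5000, 8))  # stand-in for samples drawn from P(x)

# Lloyd / k-means training of the codebook (the Voronoi representation vectors)
codebook, _ = kmeans2(train, 2 ** B, iter=20, minit="points")

x = rng.standard_normal((10, 8))        # signals to compress
codes, _ = vq(x, codebook)              # each signal -> one B-bit index
x_hat = codebook[codes]                 # decompression: look up the representation vector

print("mean squared error:", np.mean((x - x_hat) ** 2))
```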

Signal/Image Prior P(x) What is it good for? Sampling: Our goal is to propose sampling and reconstruction strategies, each (or just the first) parameterized, and to optimize the parameters for the smallest possible error.

Signal/Image Prior P(x) What is it good for? Separation: We are given y = x1 + x2, where x1 and x2 are two different signals from two different distributions, and our goal is to separate the signal into its ingredients.

Signal/Image Prior P(x) What is it good for? Anomaly Detection: We are given x and we are supposed to say whether it is an anomaly. This is done by testing whether P(x) falls below some threshold.

Signal/Image Prior P(x) Question: What is it good for? Answer: Many great things. P(x)=?

The Evolution of Priors for Images A timeline spanning the 70's, 80's, 90's, and 00's: smoothness (WLS), PDE-based priors, robust statistics for images, transform (wavelet) sparsity, learned sparse models, FoE, GMM, co-sparse analysis, low-rank, … Major themes: L2 → L1 (linear vs. non-linear approximation), training on examples, a random generator for x.

Signal/Image Prior P(x) Here is an untold secret: the vast literature in image processing over the past 4-5 decades is NOTHING BUT an evolution of ideas on the identity of P(x), and on ways to use it in actual tasks. By the way, the same is true for many other data sources and signals …

Linear Versus Non-Linear Approximation Suppose that our prior is the following (T is unitary): P(x) ∝ exp(-λ||ΛTx||_2^2), where the diagonal matrix Λ weights the transform elements. Our goal: denoising a signal with this prior by solving x̂ = argmin_x (1/2)||x - y||_2^2 + λ||ΛTx||_2^2.

Linear Versus Non-Linear Approximation The solution is given by x̂ = T^H (I + 2λΛ^2)^{-1} T y, i.e., every transform coefficient is scaled by a fixed factor 1/(1 + 2λλ_i^2). Implication: we keep the transform coefficients with the small weights and suppress the ones with the high weights. Which coefficients survive the process is fixed in advance by Λ – this is Linear Approximation.
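A minimal sketch of this linear-approximation denoiser follows, using the orthonormal DCT as an assumed stand-in for the unitary T and frequency-growing weights as an assumed Λ; the signal, noise level, and λ are illustrative.

```python
import numpy as np
from scipy.fft import dct, idct

rng = np.random.default_rng(2)
n, lam = 128, 20.0
t = np.linspace(0, 1, n)
x0 = np.sin(2 * np.pi * 3 * t)                 # smooth clean signal (assumed)
y = x0 + 0.3 * rng.standard_normal(n)          # noisy measurement

weights = np.arange(n) / n                     # Lambda: penalize high frequencies
c = dct(y, norm="ortho")                       # T y
c_hat = c / (1.0 + 2.0 * lam * weights ** 2)   # fixed, index-dependent scaling
x_hat = idct(c_hat, norm="ortho")              # T^T (I + 2*lam*Lambda^2)^{-1} T y

print("noisy MSE   :", np.mean((y - x0) ** 2))
print("denoised MSE:", np.mean((x_hat - x0) ** 2))
```

Note that the scaling pattern is determined entirely by Λ and λ, independently of y: that is exactly what makes this a linear scheme.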

Linear Versus Non-Linear Approximation Suppose now that our prior is the following (T is unitary): P(x) ∝ exp(-λ||Tx||_1). Our goal: denoising a signal with this prior by solving x̂ = argmin_x (1/2)||x - y||_2^2 + λ||Tx||_1. We have seen that the solution to this problem is given by soft shrinkage in the transform domain. Implications: just like before, we filter the signal in the transform domain, but this time we keep the dominant coefficients and discard the small ones. This is known as Non-Linear Approximation.
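For comparison with the previous sketch, here is the same setup with the L1 prior: the denoiser becomes soft shrinkage, so which coefficients survive depends on the signal itself rather than on a fixed weighting; the threshold and test signal are assumptions.

```python
import numpy as np
from scipy.fft import dct, idct

def soft(c, thr):
    """Soft-shrinkage operator applied coefficient-wise."""
    return np.sign(c) * np.maximum(np.abs(c) - thr, 0.0)

rng = np.random.default_rng(3)
n, thr = 128, 0.8
t = np.linspace(0, 1, n)
x0 = np.sin(2 * np.pi * 3 * t) + 0.5 * np.sin(2 * np.pi * 7 * t)  # clean signal (assumed)
y = x0 + 0.3 * rng.standard_normal(n)

c_hat = soft(dct(y, norm="ortho"), thr)   # keep only the dominant coefficients
x_hat = idct(c_hat, norm="ortho")

print("noisy MSE   :", np.mean((y - x0) ** 2))
print("denoised MSE:", np.mean((x_hat - x0) ** 2))
```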

Sparse-Land Signal Generation – a generator M of signals from Sparse-Land: Draw k0 – the cardinality of the representation. Draw k0 non-zero values and k0 locations, and generate the representation α. Multiply α by the dictionary D. Add random iid (model) noise e, giving x = Dα + e.
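The following is a minimal sketch of such a generator; the dimensions, the cardinality range, the value distribution, and the noise level are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)
n, m, sigma_e = 50, 100, 0.01
D = rng.standard_normal((n, m))
D /= np.linalg.norm(D, axis=0)                       # dictionary with normalized atoms

def sparse_land_sample():
    k0 = rng.integers(1, 6)                          # draw the cardinality
    support = rng.choice(m, size=k0, replace=False)  # draw k0 locations
    alpha = np.zeros(m)
    alpha[support] = rng.standard_normal(k0)         # draw k0 non-zero values
    e = sigma_e * rng.standard_normal(n)             # iid model noise
    return D @ alpha + e, alpha                      # x = D*alpha + e

x, alpha = sparse_land_sample()
print("signal dim:", x.shape[0], "| non-zeros in alpha:", np.count_nonzero(alpha))
```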

Sparse-Land vs. Earlier Models Assume there is no noise in the model and that D is square and invertible; then we are back to a transform-based prior. The Sparse-Land model generalizes the previous method by adopting over-completeness***, and by daring to work with true sparsity and the L0 measure. *** What about a redundant T? This will be addressed later!

Geometrical Insight Gather the points in a small neighborhood of our cloud into a matrix E. The effective rank d of E (found by SVD) is expected to be very low: d << n. This is universally true for the signals we operate on. The orientation and dimension of this local subspace change from one point to another (smoothly?).

Geometrical Insight – Implications Given a noisy version y of x0, how shall we denoise it? By projecting onto the subspace around x0 (chicken and egg). How come y is not on the subspace itself? Because the relative volume of the subspace is negligible. Recall that for random noise we have E{(y-x0)^T x0} = 0.

Geometrical Insight – Denoising in Practice Given a noisy version of x0, how shall we denoise it? Non-parametric: Nearest Neighbor (NN), or K-NN. Local-parametric: group neighbors, estimate the subspace, and project onto it. Parametric: cluster the database into K subgroups and estimate a subspace per each; when a signal is to be denoised, assign it to the closest subgroup, and then project onto the corresponding subspace (K=1: PCA) – see the sketch below. Sparse-Land: one dictionary encapsulates many such clusters.
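Here is a hedged sketch of the parametric option: cluster a signal database into K groups, fit a d-dimensional PCA subspace per group, and denoise a new signal by projecting it onto the subspace of its closest group; the stand-in database, K, d, and noise level are assumptions.

```python
import numpy as np
from scipy.cluster.vq import kmeans2, vq

rng = np.random.default_rng(5)
n, K, d = 20, 4, 3
db = rng.standard_normal((2000, n)) @ rng.standard_normal((n, n)) * 0.2  # stand-in database

centers, labels = kmeans2(db, K, iter=20, minit="points")   # cluster into K subgroups

bases = []
for k in range(K):
    group = db[labels == k] - centers[k]
    _, _, Vt = np.linalg.svd(group, full_matrices=False)
    bases.append(Vt[:d])                              # top-d principal directions per group

y = db[0] + 0.1 * rng.standard_normal(n)              # noisy signal to denoise
idx = vq(y[None, :], centers)[0][0]                   # assign to the closest subgroup
B = bases[idx]
x_hat = centers[idx] + (y - centers[idx]) @ B.T @ B   # project onto that subspace
print("denoised signal shape:", x_hat.shape)
```

With K = 1 this reduces to projecting onto the global PCA subspace of the database.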

Union of Subspaces (UoS) We said that with Sparse-Land, one dictionary encapsulates many such clusters. Consider all the signals x that emerge from the same k atoms in D – all of them reside in the same subspace, spanned by these columns. Thus, every possible support (and there are m-choose-k of them) represents one such subspace to which the signal could belong. The pursuit task: given a noisy signal, we search for the "closest subspace" and project onto it. It is so hard because of the number of subspaces involved in this union.
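Rather than scanning all m-choose-k supports, practical pursuit algorithms approximate this search greedily; the sketch below uses Orthogonal Matching Pursuit (one well-known such method, not necessarily the one the slides have in mind), with illustrative dimensions.

```python
import numpy as np

def omp(D, y, k):
    """Greedy pursuit: pick one atom at a time, re-projecting y onto the chosen span."""
    residual, support = y.copy(), []
    for _ in range(k):
        j = int(np.argmax(np.abs(D.T @ residual)))      # atom most correlated with residual
        support.append(j)
        Ds = D[:, support]
        coef, *_ = np.linalg.lstsq(Ds, y, rcond=None)   # project y onto span of chosen atoms
        residual = y - Ds @ coef
    alpha = np.zeros(D.shape[1])
    alpha[support] = coef
    return alpha

rng = np.random.default_rng(6)
n, m, k = 30, 60, 4
D = rng.standard_normal((n, m))
D /= np.linalg.norm(D, axis=0)
alpha_true = np.zeros(m)
alpha_true[rng.choice(m, k, replace=False)] = rng.standard_normal(k)
y = D @ alpha_true + 0.01 * rng.standard_normal(n)      # noisy Sparse-Land signal

alpha_hat = omp(D, y, k)
print("recovered support:", np.flatnonzero(alpha_hat))
print("true support     :", np.flatnonzero(alpha_true))
```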

Processing Sparse-Land Signals (objective – given data – goal): Transform – given the signal – find the most effective transform, i.e., the sparsest possible set of iid coefficients. Denoising – given the (noisy) signal – recover the cleanest possible signal. Compression – we have a budget of B bits and we want to best represent the signal. Inverse problems – treat blur, subsampling, missing values, projection, compressed sensing. Separation – the two signals come from different sources and thus have different models.

Processing Sparse-Land Signals All these (and other) processing methods boil down to the solution of a sparse-recovery problem of the form min_α ||α||_0 s.t. ||Dα - y||_2 ≤ ε, for which we now know that (i) it is theoretically sensible, and (ii) there are numerical ways to handle it.

To Summarize The Sparse-Land model forms a general Union of Subspaces, all encapsulated by the concise matrix D. This follows many earlier works that aim to model signals using a union of subspaces (or a mixture of Gaussians – think about it – it is the same). Sparse-Land is rooted in solid modeling ideas, while improving on them thanks to its generality and its solid mathematical foundations.