Slide 1: Active Random Fields
Adrian Barbu, Florida State University

Slide 2: The MAP Estimation Problem
Estimation problem: given input data y, solve \(x^* = \arg\max_x p(x \mid y)\).
Example: image denoising. Given a noisy image y, find the denoised image x.
Issues:
- Modeling: how to approximate \(p(x \mid y)\)?
- Computing: how to find x fast?
[Figures: noisy image y; denoised image x]

Slide 3: MAP Estimation Issues
Popular approach:
- Find a very accurate model \(p(x \mid y)\).
- Find the best optimum x of that model.
Problems with this approach:
- It is hard to obtain a good \(p(x \mid y)\).
- The desired solution needs to be at the global maximum.
- For many models, the global maximum cannot be found in any reasonable time.
- Using suboptimal algorithms to find the maximum leads to suboptimal solutions.
E.g. Markov Random Fields.

Slide 4: Markov Random Fields
Bayesian models: posterior \(p(x \mid y) \propto p(y \mid x)\, p(x)\) with a Markov Random Field (MRF) prior \(p(x)\).
E.g. the image denoising model:
- Gaussian likelihood \(p(y \mid x)\).
- Fields of Experts (FOE) MRF prior: Lorentzian expert functions applied to the responses of learned image filters \(J_k\) (Roth and Black, 2005).
[Figure: the learned FOE image filters \(J_k\)]
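Written out, the denoising model above takes the following form (a reconstruction assuming the Lorentzian, i.e. Student-t, experts of Roth and Black, 2005; \(J_k * x\) denotes convolution and i ranges over pixels):

\[
p(x \mid y) \;\propto\; \exp\!\Big(-\frac{\lVert y - x\rVert^2}{2\sigma^2}\Big)\;
\prod_{k=1}^{K} \prod_{i} \Big(1 + \tfrac{1}{2}\big((J_k * x)_i\big)^2\Big)^{-\alpha_k}
\]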

Slide 5: MAP Estimation (Inference) in MRFs
Exact inference is too hard: for the Potts model, one of the simplest MRFs, it is already NP-hard (Boykov et al, 2001).
Approximate inference is suboptimal:
- Gradient descent
- Iterated Conditional Modes (Besag, 1986)
- Belief Propagation (Yedidia et al, 2001)
- Graph Cuts (Boykov et al, 2001)
- Tree-Reweighted Message Passing (Wainwright et al, 2003)

Slide 6: Gradient Descent for Fields of Experts
Energy function: \(E(x) = \frac{\lVert y - x\rVert^2}{2\sigma^2} + \sum_k \alpha_k \sum_i \log\big(1 + \tfrac{1}{2}(J_k * x)_i^2\big)\).
Analytic gradient (Roth and Black, 2005) and gradient descent iterations \(x \leftarrow x - \eta\,\nabla E(x)\): 3000 iterations with a small step size take more than 30 minutes per image on a modern computer.
[Figure: FOE filters]
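As a concrete illustration, here is a minimal sketch of that gradient descent in Python, assuming the Lorentzian-expert energy above; the symmetric boundary handling and default step size are illustrative choices, not the original implementation:

```python
import numpy as np
from scipy.signal import convolve2d

def foe_denoise(y, filters, alphas, sigma=25.0, eta=0.02, n_iter=3000):
    """Gradient descent on the FOE denoising energy (a sketch).

    E(x) = ||y - x||^2 / (2 sigma^2)
           + sum_k alpha_k * sum_i log(1 + (J_k * x)_i^2 / 2)
    filters: list of 2-D arrays J_k; alphas: the expert weights.
    """
    x = y.copy()
    for _ in range(n_iter):
        grad = (x - y) / sigma**2              # data (likelihood) term
        for J, a in zip(filters, alphas):
            r = convolve2d(x, J, mode='same', boundary='symm')
            # derivative of the Lorentzian log-expert: a*r / (1 + r^2/2)
            psi = a * r / (1.0 + 0.5 * r**2)
            # adjoint of convolution approximated by the flipped filter
            grad += convolve2d(psi, J[::-1, ::-1], mode='same',
                               boundary='symm')
        x -= eta * grad
    return x
```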

Slide 7: Training the MRF
Gradient update of the model parameters: minimize the KL divergence between the learned prior and the true probability, i.e. gradient ascent on the log-likelihood.
The gradient needs the normalization constant Z: the data expectation comes from the training set, while Z and the model expectation \(E_p\) must be obtained by MCMC, which makes training slow.
Training the FOE prior:
- Contrastive divergence (Hinton, 2002), an approximate maximum-likelihood technique.
- Initialize the Markov chains at the data points and run a fixed number of MCMC iterations.
- Takes about two days.
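For reference, the maximum-likelihood gradient for an energy-based model \(p_\theta(x) = e^{-E_\theta(x)} / Z(\theta)\) (the standard identity behind the slide's argument) is:

\[
\frac{\partial}{\partial\theta}\,\mathbb{E}_{x \sim \text{data}}\big[\log p_\theta(x)\big]
= -\,\mathbb{E}_{x \sim \text{data}}\Big[\frac{\partial E_\theta(x)}{\partial\theta}\Big]
+ \mathbb{E}_{x \sim p_\theta}\Big[\frac{\partial E_\theta(x)}{\partial\theta}\Big]
\]

The second expectation is over the model distribution, which is why MCMC (or its contrastive-divergence shortcut) is needed.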

Slide 8: Going to Real-Time Performance
Wainwright (2006): in computation-limited settings, MAP estimation is not the best choice; a biased model can compensate for a fast inference algorithm. How much can we gain from biased models?
Proposed denoising approach:
- 1-4 gradient descent iterations (not 3000).
- Takes less than a second per image, an orders-of-magnitude speedup vs. MAP estimation.
- Better accuracy than the FOE model.

Slide 9: Active Random Field
An Active Random Field is a pair (M, A) of:
- an MRF model M with parameters \(\theta_M\), and
- a fast and suboptimal inference algorithm A with parameters \(\theta_A\).
They cannot be separated, since they are trained together.
E.g. Active FOE for image denoising:
- Model: Fields of Experts.
- Algorithm: 1-4 iterations of gradient descent.
- Parameters: \(\theta = (\theta_M, \theta_A)\).

Slide 10: Training the Active Random Field
Discriminative training:
- Training examples: pairs of inputs \(y_i\) and desired outputs \(t_i\).
- Training = optimization of a loss function L (a.k.a. the benchmark measure), which evaluates accuracy on the training set.
- End-to-end training: covers the entire process from input image to final result.
A sketch of such a training loop follows.
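Here is a minimal end-to-end training sketch in Python/PyTorch under those definitions. It unrolls the 1-4 gradient-descent iterations of the Active FOE and maximizes PSNR on (noisy, clean) pairs. Using autodiff with Adam is a stand-in here, not the paper's actual optimization procedure, and all names, shapes, and hyperparameters are illustrative:

```python
import torch
import torch.nn.functional as F

def arf_denoise(y, filters, alphas, eta, n_iter):
    """Unrolled Active FOE inference: n_iter gradient-descent steps on the
    FOE prior energy (no data term). y: (B, 1, H, W); filters: (K, 1, k, k);
    alphas: (K,)."""
    pad = filters.shape[-1] // 2
    x = y
    for _ in range(n_iter):
        r = F.conv2d(x, filters, padding=pad)               # filter responses
        psi = alphas.view(1, -1, 1, 1) * r / (1 + 0.5 * r ** 2)
        # adjoint: convolve each psi channel with its flipped filter and sum
        flipped = torch.flip(filters, [-1, -2]).permute(1, 0, 2, 3)
        x = x - eta * F.conv2d(psi, flipped, padding=pad)
    return x

def psnr(x, t):
    # Assumes pixel values in [0, 255].
    mse = F.mse_loss(x, t)
    return 10 * torch.log10(255.0 ** 2 / mse)

def train_active_foe(pairs, n_filters=13, k=5, n_iter=4, steps=2000):
    """pairs: list of (noisy, clean) image tensors of shape (1, 1, H, W)."""
    filters = torch.randn(n_filters, 1, k, k) * 0.01
    filters.requires_grad_(True)
    alphas = torch.ones(n_filters, requires_grad=True)
    eta = torch.tensor(0.1, requires_grad=True)
    opt = torch.optim.Adam([filters, alphas, eta], lr=1e-3)
    for _ in range(steps):
        for y, t in pairs:
            opt.zero_grad()
            x = arf_denoise(y, filters, alphas, eta, n_iter)
            loss = -psnr(x, t)          # maximize PSNR = minimize -PSNR
            loss.backward()
            opt.step()
    return filters, alphas, eta
```

Note that PSNR serves as both the training loss and the evaluation measure here, which is exactly the point the deck makes against the related work on slides 12-13.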

Slide 11: Related Work
Energy-Based Models (LeCun and Huang, 2005):
- Train an MRF energy model to have minima close to the desired locations.
- Assumes exact inference (slow).
Shape Regression Machine (Zhou and Comaniciu, 2007):
- Trains a regressor to find an object, then uses a classifier to clean up the result.
- Aimed at object detection, not MRFs.

Slide 12: Related Work
Training model-algorithm combinations:
- A CRF based on pairwise potentials, trained for object classification (Torralba et al, 2004).
- AutoContext: a sequence of CRF-like boosted classifiers for object segmentation (Tu, 2008).
Both minimize one loss function but report results on another (suboptimal). Both train iterative classifiers that grow more complex at each iteration, so speed degrades quickly as accuracy improves.

Slide 13: Related Work
Training model-algorithm combinations and reporting results on the same loss function, for image denoising:
- Tappen and Orlando use the same type of training to obtain a stronger MAP optimum in image denoising.
- Gaussian Conditional Random Fields (Tappen et al, 2007): exact MAP, but hundreds of times slower; results comparable to the 2-iteration ARF.
Common theme: trying to obtain a strong MAP optimum.
This work: a fast, suboptimal estimator balanced by a complex model and appropriate training.

Slide 14: Training Active Fields of Experts
Training set: 40 images from the Berkeley dataset (Martin et al, 2001), the same as in Roth and Black (2005), with separate training for each noise level \(\sigma\).
Loss function: L = PSNR, the same measure used for reporting results: \(\mathrm{PSNR} = 20 \log_{10}(255 / \sigma_e)\), where \(\sigma_e\) is the standard deviation of the difference between the result and the ground truth.
[Figure: trained Active FOE filters, n_iter = 1]

Slide 15: Training the 1-Iteration ARF, \(\sigma = 25\)
Follow Marginal Space Learning: consider a sequence of parameter subspaces, represent the marginals by propagating particles between subspaces, and propagate only one particle (the mode).
1. Start with one filter of size 3x3 and train until there is no improvement; this finds the particle in this subspace.
2. Add another filter initialized with zeros and retrain to find the new mode.
3. Repeat step 2 until there are 5 filters.
4. Increase the filter size to 5x5 and retrain to find the new mode.
5. Repeat step 2 until there are 13 filters.
A sketch of this schedule follows.
[Figure: PSNR on the training set (blue) and test set (red) while training the 1-iteration ARF, \(\sigma = 25\); the trained filters]
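A compact sketch of that coarse-to-fine schedule in Python, with an abstract `retrain` callback standing in for the inner PSNR optimization (e.g. the training loop sketched at slide 10). The helper names are illustrative, and embedding the 3x3 filters in the center of the 5x5 ones is an assumption about how step 4 was done:

```python
import numpy as np

def center_pad(f, k):
    """Embed filter f in the center of a k x k zero filter (grow 3x3 -> 5x5)."""
    out = np.zeros((k, k))
    s = (k - f.shape[0]) // 2
    out[s:s + f.shape[0], s:s + f.shape[1]] = f
    return out

def train_schedule(retrain, n_filters=13):
    """Coarse-to-fine training of the 1-iteration ARF.

    `retrain(filters)` is assumed to optimize the given filter bank until
    the validation PSNR stops improving and return the updated bank.
    """
    rng = np.random.default_rng(0)
    filters = [rng.normal(scale=0.01, size=(3, 3))]   # step 1: one 3x3 filter
    filters = retrain(filters)
    while len(filters) < 5:                           # steps 2-3: grow to 5
        filters.append(np.zeros((3, 3)))              # new filter starts at 0
        filters = retrain(filters)
    filters = [center_pad(f, 5) for f in filters]     # step 4: grow to 5x5
    filters = retrain(filters)
    while len(filters) < n_filters:                   # step 5: grow to 13
        filters.append(np.zeros((5, 5)))
        filters = retrain(filters)
    return filters
```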

Slide 16: Training the Other ARFs
The other noise levels are initialized from already-trained ones, starting from the one-iteration ARF at \(\sigma = 25\). Each training step (an arrow in the original diagram) takes about one day on an 8-core machine. The 3-iteration ARFs can also be run for 4 iterations.
[Figure: diagram of the training order across noise levels and iteration counts]

Slide 17: Concerns about Active FOE
Avoiding overfitting:
- Use large patches to avoid boundary effects: full-size images instead of smaller patches, about 6 million nodes in total.
- Lots of training data.
- Use a validation set to detect overfitting.
Long training time:
- Easily parallelizable; 1-3 days on an 8-core PC.
- Good news: CPU power increases exponentially (Moore's law).

Slide 18: Results
[Figures: original image; the image corrupted with Gaussian noise, \(\sigma = 25\); the ARF result, PSNR = 28.94, t = 0.6 s; the 3000-iteration FOE result, PSNR = 28.67, t = 2250 s]

Slide 19: Standard Test Images
[Figures: the standard test images Lena, Barbara, Boats, House, and Peppers]

Slide 20: Evaluation on Standard Test Images, \(\sigma_{\text{noise}} = 25\)
PSNR compared on Lena, Barbara, Boats, House, and Peppers (plus the average) for:
- FOE (Roth and Black, 2005)
- Active FOE with 1, 2, 3, and 4 iterations
- Wavelet Denoising (Portilla et al, 2003)
- Overcomplete DCT (Elad et al, 2006)
- Globally Trained Dictionary (Elad et al, 2006)
- KSVD (Elad et al, 2006)
- BM3D (Dabov et al, 2007)

Slide 21: Evaluation on the Berkeley Dataset
68 images from the Berkeley dataset: not used for training and not overfitted by other methods. Roth and Black (2005) also evaluated on them. A more realistic evaluation than one on 5 images.

Slide 22: Evaluation on the Berkeley Dataset
Average PSNR on the 68 Berkeley images not used for training, comparing:
1. Wiener filter
2. Nonlinear diffusion
3. Non-local means (Buades et al, 2005)
4. FOE model, 3000 iterations
5-8. Our algorithm with 1, 2, 3, and 4 iterations
9. Wavelet-based denoising (Portilla et al, 2003)
10. Overcomplete DCT (Elad et al, 2006)
11. KSVD (Elad et al, 2006)
12. BM3D (Dabov et al, 2007)

Slide 23: Speed-Performance Comparison, \(\sigma_{\text{noise}} = 25\)
[Figure: PSNR vs. running time for the compared methods]

Slide 24: Performance on Different Levels of Noise
- Trained for a specific noise level.
- No data term.
- Band-pass behavior.

Slide 25: Adding a Data Term
The 1-iteration version of the Active FOE has no data term. A modified update that includes a data (fidelity) term tying the result to the noisy input y can be written in an equivalent form.
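The slide's equations were not preserved; presumably the modification adds the standard Gaussian-likelihood fidelity term to the FOE energy. Under that assumption (with weight \(\beta\)):

\[
E(x) = \frac{\beta}{2}\,\lVert x - y\rVert^2
+ \sum_k \alpha_k \sum_i \log\Big(1 + \tfrac{1}{2}(J_k * x)_i^2\Big)
\]

so each gradient-descent step gains a term \(\eta\beta\,(y - x)\) that pulls the estimate back toward the observed image.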

Slide 26: Performance with the Data Term
The data term removes the band-pass behavior: the 1-iteration ARF is as good as the 3000-iteration FOE over a range of noise levels.

Slide 27: Conclusion
An Active Random Field is a pair of:
- a Markov Random Field based model, and
- a fast, approximate inference algorithm (estimator).
Training = optimization of the MRF and algorithm parameters using:
- the benchmark measure on which the results will be reported, and
- training data given as pairs of inputs and desired outputs.
Pros: great speed and accuracy; good control of overfitting using a validation set.
Cons: slow to train.

Slide 28: Future Work
Extending image denoising:
- Learning filters over multiple channels.
- Learning the robust function.
- Learning filters for image sequences using temporal coherence.
Other applications:
- Computer vision: edge and road detection, image segmentation, stereo matching, motion, tracking, etc.
- Medical imaging: learning a discriminative anatomical network of organ and landmark detectors.

Slide 29: References
- A. Barbu. Training an Active Random Field for Real-Time Image Denoising. IEEE Transactions on Image Processing, 18, November 2009.
- Y. Boykov, O. Veksler, and R. Zabih. Fast Approximate Energy Minimization via Graph Cuts. IEEE Transactions on Pattern Analysis and Machine Intelligence, 23(11):1222-1239, 2001.
- A. Buades, B. Coll, and J.M. Morel. A Non-Local Algorithm for Image Denoising. Proc. CVPR, 2, 2005.
- K. Dabov, A. Foi, V. Katkovnik, and K. Egiazarian. Image Denoising by Sparse 3-D Transform-Domain Collaborative Filtering. IEEE Transactions on Image Processing, 16(8):2080-2095, 2007.
- M. Elad and M. Aharon. Image Denoising via Sparse and Redundant Representations over Learned Dictionaries. IEEE Transactions on Image Processing, 15(12):3736-3745, 2006.
- G.E. Hinton. Training Products of Experts by Minimizing Contrastive Divergence. Neural Computation, 14(8):1771-1800, 2002.
- Y. LeCun and F.J. Huang. Loss Functions for Discriminative Training of Energy-Based Models. Proc. of the 10th International Workshop on Artificial Intelligence and Statistics (AIStats), 2005.
- D. Martin, C. Fowlkes, D. Tal, and J. Malik. A Database of Human Segmented Natural Images and its Application to Evaluating Segmentation Algorithms. Proc. ICCV, 2:416-425, 2001.
- J. Portilla, V. Strela, M.J. Wainwright, and E.P. Simoncelli. Image Denoising Using Scale Mixtures of Gaussians in the Wavelet Domain. IEEE Transactions on Image Processing, 12(11):1338-1351, 2003.
- S. Roth and M.J. Black. Fields of Experts. International Journal of Computer Vision, 82(2):205-229, 2009.