Bayesian Inversion of Stokes Profiles. A. Asensio Ramos (IAC), M. J. Martínez González (LERMA), J. A. Rubiño Martín (IAC). Beaulieu Workshop (Beaulieu-sur-Mer, 8-10 October 2007)

Outline: Introduction, Bayesian Inversion, Markov Chain Monte Carlo, Applications, Conclusions.

Introduction. What is an inversion process? We have a set of observations and we propose a physical model. The FORWARD PROBLEM goes from the model to the observations and is typically unique; the INVERSION PROBLEM goes from the observations back to the model (is it unique?).

Introduction. If we were living in a perfect, ideal, Teletubbie world with no noise, no ambiguities and no degeneracies, there would be ONE model that best explains the observations, and one could safely say that this is THE correct model.

Introduction. However, we are (fortunately) living in an imperfect, non-ideal world with noise, ambiguities and degeneracies. There is more than ONE model that explains the observations equally well, and one cannot say that a single model is THE correct model.

Introduction. As a consequence, any inversion procedure carried out in our noisy and ambiguous world cannot return a single model as the solution; it has to return the set of models that are compatible with our observables. Any inversion problem has to be understood as a probabilistic problem and tackled with a statistical approach: we give the probability that any given model explains the observables.

Bayesian Inference. D represents our observables (Stokes profiles). M represents our model (Milne-Eddington, LTE, ...) and it is parameterized by a vector of parameters θ. We have some a priori knowledge of their values.

Bayesian Inference. The inductive inference problem is to update from our a priori knowledge of the parameters to the a posteriori knowledge after taking into account the information encoded in the observed dataset. Bayes theorem: p(θ|D,M) = p(D|θ,M) p(θ|M) / p(D|M), where p(θ|D,M) is the posterior probability, p(θ|M) is the prior probability and p(D|θ,M) is the likelihood.

Priors. Any Bayesian reasoning scheme introduces the prior probability (a priori information). Typical priors are the top-hat function (flat prior), constant between θ_min and θ_max and zero outside, and the Gaussian prior centered on θ_i (used when we know a highly probable value). Assuming statistical independence for all parameters, the total prior can be calculated as the product of the individual priors, p(θ) = ∏_i p(θ_i).
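A minimal Python/NumPy sketch of these two priors (not part of the talk); it works with log-probabilities and the bounds, mean and width are illustrative placeholders.

```python
import numpy as np

def log_flat_prior(theta, theta_min, theta_max):
    """Top-hat prior: constant inside [theta_min, theta_max], zero outside."""
    if np.all((theta >= theta_min) & (theta <= theta_max)):
        return 0.0          # any constant works; it cancels in posterior ratios
    return -np.inf

def log_gaussian_prior(theta, mu, sigma):
    """Gaussian prior centred on a highly probable value mu (unnormalised)."""
    return -0.5 * np.sum(((theta - mu) / sigma) ** 2)

# Statistical independence: the total log-prior is the sum over parameters,
# i.e. the total prior is the product of the individual priors.
```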

Likelihood. Assuming normal (Gaussian) noise, the likelihood can be calculated as p(D|θ) ∝ exp(-χ²/2), where the χ² function is defined as usual, χ² = Σ_i [D_i - S_i(θ)]² / σ_i². In this case, the χ² function is specific for the case of Stokes profiles, summing over the four Stokes parameters and all observed wavelengths.
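A short sketch of this χ² likelihood in Python/NumPy (an assumption-laden illustration, not code from the talk); the array layout for the Stokes profiles is a choice made here for concreteness.

```python
import numpy as np

def chi2(stokes_obs, stokes_syn, sigma):
    """Chi-squared between observed and synthetic Stokes profiles.

    stokes_obs, stokes_syn : arrays of shape (4, n_wavelengths) for I, Q, U, V
    sigma                  : noise standard deviation (scalar or broadcastable)
    """
    return np.sum(((stokes_obs - stokes_syn) / sigma) ** 2)

def log_likelihood(stokes_obs, stokes_syn, sigma):
    # ln p(D | theta) = -chi2 / 2, up to an additive normalisation constant
    return -0.5 * chi2(stokes_obs, stokes_syn, sigma)
```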

Bayesian Inference. In order to completely solve the inversion procedure, we NEED to know the complete posterior probability distribution p(θ|D). Sometimes we are interested only in a subset of parameters; the remaining ones are integrated out by marginalization, p(θ_i|D) = ∫ p(θ|D) dθ_1 ... dθ_{i-1} dθ_{i+1} ... dθ_N. In any case, we still need the complete posterior distribution.

Bayesian Inference. The naïve approach. Our model is parameterized with N parameters and we use M values for each parameter (to have a good coverage), so we end up with M^N evaluations of the forward problem to obtain the full posterior distribution. Example: if N = 10 (typical for Milne-Eddington models) and M = 10 (a relatively coarse grid), we end up with 10^10 evaluations, roughly 31 years if each model is evaluated in 0.1 s. Only one experiment is possible during a typical scientific life, so you better choose the correct experiment!
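A quick back-of-the-envelope check of the numbers quoted above (an illustrative snippet, not part of the slides):

```python
n_params, n_values = 10, 10                    # N parameters, M grid values each
n_evaluations = n_values ** n_params           # M**N = 10**10 forward syntheses
total_seconds = n_evaluations * 0.1            # 0.1 s per model evaluation
print(total_seconds / (3600 * 24 * 365.25))    # ~31.7 years
```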

Bayesian Inference. The practical approach: Markov Chain Monte Carlo. The "HAPPY IDEA": build a Markov chain whose equilibrium probability distribution equals the posterior distribution. The GOOD NEWS: the Markov chain rapidly converges towards the desired distribution using a reduced number of evaluations, which typically increases only linearly with the number of parameters.

MCMC. Technical details. Propose an initial set of parameters θ_0 and calculate the posterior p(θ_0|D). At each iteration, obtain a new set of parameters by sampling from the proposal density q(θ_i|θ_{i-1}), calculate the posterior p(θ_i|D) and the ratio r = [p(θ_i|D) q(θ_{i-1}|θ_i)] / [p(θ_{i-1}|D) q(θ_i|θ_{i-1})], and accept the set of parameters θ_i with probability α = min(1, r).
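A minimal sketch of this recipe in Python/NumPy (not the talk's implementation). It assumes a symmetric Gaussian random-walk proposal, so q cancels in the ratio r; `log_posterior` stands for the sum of the log-prior and log-likelihood sketched earlier, and all names are illustrative.

```python
import numpy as np

def metropolis(log_posterior, theta0, n_steps, step_size, rng=None):
    """Random-walk Metropolis sampler with a symmetric Gaussian proposal.

    log_posterior : callable returning ln p(theta | D) up to a constant
    theta0        : initial parameter vector
    step_size     : per-parameter width of the Gaussian proposal q
    """
    rng = np.random.default_rng() if rng is None else rng
    theta = np.atleast_1d(np.asarray(theta0, dtype=float))
    log_p = log_posterior(theta)
    chain = np.empty((n_steps, theta.size))
    for i in range(n_steps):
        # Propose theta_i from q(theta_i | theta_{i-1})
        proposal = theta + step_size * rng.standard_normal(theta.size)
        log_p_new = log_posterior(proposal)
        # Symmetric q cancels in r, so accept with probability min(1, r)
        if np.log(rng.uniform()) < log_p_new - log_p:
            theta, log_p = proposal, log_p_new
        chain[i] = theta
    return chain
```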

Bayesian Inference. Simple example (Andrieu et al. 2003).

Bayesian Inference. Proposal density. The proposal density is the key point in MCMC methods. It should ideally be as similar as possible to the posterior distribution while remaining easy to evaluate and sample from. Typical proposal densities are the uniform distribution and the multi-dimensional Gaussian. In the limit in which the proposal equals the distribution you want to sample, all proposed models are accepted.

MCMC. Possible post-facto analysis. BURN-IN: if the chain starts far from the region of large posterior probability, it takes some iterations to locate that region, so the first N burn-in iterations are thrown away. THINNING: keeping only every n-th sample reduces the size of the chain while hopefully maintaining its properties; it is less used now due to the increase in computer capabilities.
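Both operations are simple slices of the chain array; a sketch assuming the `(n_steps, n_params)` chain returned by the sampler above, with placeholder values for the burn-in length and thinning factor:

```python
def clean_chain(chain, n_burn=1000, thin=10):
    """Drop the first n_burn burn-in samples and keep every `thin`-th one.

    chain : array of shape (n_steps, n_params), e.g. from the sampler sketch.
    """
    return chain[n_burn::thin]
```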

Academic example  =10 -5 I c Original values B=100 G  B =45º

Academic example  =10 -5 I c Markov chains without burn-in and thinning Marginalization (multi-dimensional integration) is obtained by “making histograms”

Academic example  =10 -3 I c Original values B=100 G  B =45º B cos  B =cte

Academic example  =10 -3 I c

B cos  B =100 cos 45º = 70.7 G

Realistic example  =10 -4 I c Stokes profiles with a low flux (10 Mx/cm 2 ) B=1000 G f = 1% Fields between 500 and 1800 G are compatible with the observed Stokes profiles at 1  confidence level Some parameters are not constrained by the observables

Low flux region. The "thermodynamical" parameters of the non-magnetic component can be nicely constrained by the data.

Low flux region

High flux region  =10 -4 I c Stokes profiles with a high flux (200 Mx/cm 2 ) B=1000 G f = 20% Magnetic field strength and other parameters of the magnetic component are constrained by the data

High flux region  =10 -4 I c Broader confidence levels are seen in the non-magnetic component due to its reduced filling factor

Inclined fields. Only Stokes I and V have been considered until now. What happens if linear polarization is also taken into account? Do we see an improvement? We consider three values of θ_B: 20° (linear polarization is very weak and falls below the noise level), 70° (circular polarization is very weak and falls below the noise level) and 45° (intermediate case).

Inclined fields. Posterior distributions are very similar to those of vertical fields. Linear polarization appears, but circular polarization amplitudes come closer to the noise level.

Lack of information. Marginalized posterior distribution of the magnetic field strength. Note the similarity with the prior distribution.

Observed sunspot. Umbral profile observed with THÉMIS at 6302 Å.

Observed sunspot. Umbral profile observed with THÉMIS at 6301 Å.

Combination of information. Inclusion of new information is trivial under the Bayesian approach: simply multiply the posteriors. In this case, the 6302 Å line constrains the magnetic field vector better than the 6301 Å line.
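A hypothetical sketch of this combination for posteriors evaluated on a common parameter grid (names and gridded representation are assumptions, not the talk's code):

```python
import numpy as np

def combine_posteriors(grid, posterior_a, posterior_b):
    """Combine two independently obtained posteriors (e.g. from the 6301 and
    6302 Angstrom lines) evaluated on the same parameter grid, renormalising
    so the result integrates to one."""
    joint = posterior_a * posterior_b
    return joint / np.trapz(joint, grid)
```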

Conclusions. The inversion process is a statistical problem: give the set of model parameters that are compatible with the observables for a given noise level. Bayesian methods allow us to move from the a priori information to the a posteriori situation using the information encoded in the data. The posterior and its marginalized distributions are easily obtained with a Markov Chain Monte Carlo method. Applications to synthetic and real data show the potential of the technique and point out severe degeneracy problems.