Presentation transcript: Computing the Posterior Probability

1 Computing the Posterior Probability
The posterior probability distribution contains the complete information concerning the parameters, but we often need to integrate it: e.g., normalizing the posterior requires an n-dimensional integral for an n-parameter posterior.
The posterior can rarely be integrated analytically, and numerical integration is also difficult when n is more than a few (see the sketch below).
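A minimal Python sketch (not from the slides; the grid resolution m is an assumption) of why brute-force numerical integration breaks down: a grid with m points per dimension requires m**n posterior evaluations for an n-parameter posterior.

# Cost of normalizing an n-parameter posterior on a grid with
# m points per dimension: m**n posterior evaluations.
m = 100  # assumed grid resolution per parameter
for n in (1, 2, 5, 10):
    print(f"n = {n:2d} parameters -> {m**n:.1e} evaluations")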

2 Markov-Chain Monte Carlo
Instead of integrating, sample from the posterior in a way that results in a set of parameter values distributed identically to the posterior.
The histogram of chain values for a parameter is then a visual representation of the (marginalized) probability distribution for that parameter.
Can then easily compute confidence intervals (a sketch in code follows below):
1. Sum the histogram from the best-fit value (often the peak of the histogram) in both directions.
2. Stop when x% of the values have been summed, for an x% confidence interval.
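A minimal sketch of the interval construction just described, assuming a unimodal posterior and growing the interval outward from the histogram peak:

import numpy as np

def credible_interval(samples, frac=0.68, bins=100):
    # Histogram the chain values for one parameter.
    counts, edges = np.histogram(samples, bins=bins)
    lo = hi = int(np.argmax(counts))          # start at the peak bin
    total = counts[lo]
    target = frac * len(samples)
    while total < target:
        # Grow toward whichever neighboring bin contains more samples.
        left = counts[lo - 1] if lo > 0 else -1
        right = counts[hi + 1] if hi < len(counts) - 1 else -1
        if left >= right:
            lo -= 1
            total += counts[lo]
        else:
            hi += 1
            total += counts[hi]
    return edges[lo], edges[hi + 1]

For a chain column theta, credible_interval(theta) returns the 68% interval.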

3 MCMC Algorithms
MCMC is similar to a random-walk approach.
The most common techniques are Metropolis-Hastings and Gibbs sampling (see the Wikipedia entries on these).
Metropolis-Hastings is the simplest algorithm, since it only requires evaluations of the posterior for a given set of parameter values.

4 Gibbs Sampling
1. Derive the conditional probability of each parameter given values of the other parameters.
2. Pick a parameter at random.
3. Draw from the conditional probability of that parameter given the current values of all the other parameters.
4. Repeat until the chain converges.
A good approach is to form "hierarchical" models that result in simple conditional probabilities.
Example for a two-variable posterior p(x,y), where "~" means "distributed as"; note that each draw conditions on the most recent value of the other variable (see the sketch below). For each MCMC draw i:
x_i ~ p(x | y = y_{i-1})
y_i ~ p(y | x = x_i)
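A minimal Gibbs-sampling sketch for a toy two-variable posterior: a bivariate standard normal with correlation rho, an assumed example chosen because its conditionals are known in closed form.

import numpy as np

# Conditionals of a bivariate standard normal with correlation rho:
# x | y ~ N(rho * y, 1 - rho**2), and symmetrically for y | x.
rng = np.random.default_rng(42)
rho, n_steps = 0.8, 10_000
sd = np.sqrt(1.0 - rho**2)           # st. dev. of each conditional

x, y = 0.0, 0.0                      # arbitrary starting point
chain = np.empty((n_steps, 2))
for i in range(n_steps):
    x = rng.normal(rho * y, sd)      # x_i ~ p(x | y = y_{i-1})
    y = rng.normal(rho * x, sd)      # y_i ~ p(y | x = x_i)
    chain[i] = x, y

print(np.corrcoef(chain[1000:].T))   # after burn-in, off-diagonal ~ rho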

5 Metropolis-Hastings
Can be visualized as similar to the rejection method of random number generation.
Use a "proposal" distribution that is similar in shape (especially width) to the expected posterior distribution to generate new parameter values.
Accept the new step when the probability of the new values is higher; occasionally accept the new step otherwise, letting the chain go "up hill" in fit-statistic terms so it can escape relative (local) minima.

6 M-H with a Gaussian Proposal Distr.
1. For each parameter θ, estimate/determine a "step" size σ.
2. For each chain iteration, propose a new value θ′ = θ_{i-1} + N(σ), where N is a normal deviate with st. dev. = σ.
3. If p(θ′)/p(θ_{i-1}) > 1, set θ_i = θ′.
4. If p(θ′)/p(θ_{i-1}) < 1, accept θ_i = θ′ with probability p(θ′)/p(θ_{i-1}); otherwise θ_i = θ_{i-1}.
(A sketch in code follows below.)
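A minimal implementation of steps 1-4 in Python. The one-dimensional target (a standard normal, via its log density) is an assumed stand-in for the posterior; log probabilities are used to avoid numerical underflow.

import numpy as np

def log_post(theta):                 # stand-in for log p(theta)
    return -0.5 * theta**2

rng = np.random.default_rng(0)
sigma, n_steps = 1.0, 10_000         # step size and chain length
theta = 0.0                          # starting value
chain = np.empty(n_steps)
for i in range(n_steps):
    proposal = theta + rng.normal(0.0, sigma)   # theta' = theta_{i-1} + N(sigma)
    # Accept with probability min(1, p(theta') / p(theta_{i-1})),
    # evaluated on the log scale.
    if np.log(rng.uniform()) < log_post(proposal) - log_post(theta):
        theta = proposal
    chain[i] = theta

print(chain.mean(), chain.std())     # should approach 0 and 1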

7 M-H Issues
Can be very slow to converge, especially when there are correlated variables. Two remedies:
Use multivariate proposal distributions (done in the XSPEC approach).
Transform the correlated variables: if x and y are correlated, instead set y = ax + b, fit for x, a, and b, and compute y from the chain values afterwards (may also need to fix a or b with a tight prior).
Convergence: run multiple chains and compute convergence statistics (see the sketch below).
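One common convergence statistic is the Gelman-Rubin R-hat, sketched here for a single parameter traced by several independently started chains; the slide does not name a specific statistic, so this choice is an assumption. Values near 1 suggest convergence.

import numpy as np

def gelman_rubin(chains):
    # `chains` is an (m, n) array: m chains of n samples each.
    m, n = chains.shape
    chain_means = chains.mean(axis=1)
    B = n * chain_means.var(ddof=1)           # between-chain variance
    W = chains.var(axis=1, ddof=1).mean()     # mean within-chain variance
    var_post = (n - 1) / n * W + B / n        # pooled variance estimate
    return np.sqrt(var_post / W)              # R-hat; ~1 when converged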

8 MCMC Example
In Ptak et al. (2007) we used MCMC to fit the X-ray luminosity functions of normal galaxies in the GOODS area (see poster).
We first tested the code by fitting a published FIR luminosity function.
Key advantages:
visualizing the full probability space of the parameters;
the ability to derive quantities from the MCMC chain values (e.g., luminosity density); see the sketch below.
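A sketch of deriving a quantity from chain values: evaluate the quantity for every chain sample, and the results are samples from its own posterior. The column layout and the placeholder relation below are assumptions for illustration, not the actual Ptak et al. computation.

import numpy as np

rng = np.random.default_rng(1)
chain = rng.lognormal(size=(5000, 2))   # fake chain: columns (phi_star, L_star)

def luminosity_density(phi_star, L_star):
    return phi_star * L_star            # placeholder relation, for illustration

derived = luminosity_density(chain[:, 0], chain[:, 1])
lo, med, hi = np.percentile(derived, [16, 50, 84])
print(f"{med:.3g} (+{hi - med:.3g} / -{med - lo:.3g})")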

9 Sanity Check: Fitting the Local 60 μm LF
Fit of the Saunders et al. (1990) LF, assuming Gaussian errors and ignoring upper limits.
[Figure: Φ vs. log L/L⊙]

Param.   S1990           MCMC
α        1.09 ± 0.12     1.04 ± 0.08
σ        0.72 ± 0.03     0.75 ± 0.02
Φ*       0.026 ± 0.008   0.026 ± 0.003
log L*   8.47 ± 0.23     8.39 ± 0.15

10 (Ugly) Posterior Probabilities
[Figure: posterior probabilities for the z < 0.5 X-ray luminosity functions of early-type and late-type galaxies; red crosses show the 68% confidence interval]

11 Marginalized Posterior Probabilities
[Figure: marginalized posteriors for log L*, log φ*, α, and σ; dashed curves show a Gaussian with the same mean and st. dev. as the posterior; dotted curves show the prior]
Note: α and σ are tightly constrained by the (Gaussian) prior, rather than being "fixed".

12 MCMC in XSPEC
XSPEC MCMC is based on the Metropolis-Hastings algorithm. The chain proposal command is used to set the proposal distribution; the basic options are multivariate Gaussian or Cauchy, although user-defined distributions can be entered. The covariance matrix for these distributions can be calculated from the best fit, entered from the command line, or calculated from previous MCMC chain(s).
MCMC is integrated into other XSPEC commands: if chains are loaded, they are used to generate confidence regions on parameters, fluxes, and luminosities. This is more accurate than the current method for estimating errors on fluxes and luminosities.
The tclout simpars command returns a set of parameter values drawn from the probability distribution defined by the currently loaded chain(s).
Chains can be saved as either ascii or FITS files (a post-processing sketch follows below).
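A sketch of post-processing a chain saved as a FITS file from Python. The file name and the column name are assumptions (XSPEC chain tables carry one column per free parameter; inspect the actual names first):

import numpy as np
from astropy.io import fits

with fits.open("chain1.fits") as hdul:      # hypothetical file name
    hdul.info()                             # list HDUs; samples sit in a table HDU
    table = hdul[1].data
    ph_index = table["PhoIndex__2"]         # hypothetical column name
    lo, med, hi = np.percentile(ph_index, [16, 50, 84])
    print(f"photon index = {med:.2f} (+{hi - med:.2f} / -{med - lo:.2f})")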

13 XSPEC MCMC Output
[Figure: histogram and probability density plot (2-d histogram) of spectral-fit parameters from an XSPEC MCMC run, produced by fv (see https://astrophysics.gsfc.nasa.gov/XSPECwiki)]
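Plots of the same kind can be made from any saved chain; here is a sketch with matplotlib in place of fv, using a fabricated two-parameter chain for illustration:

import numpy as np
import matplotlib.pyplot as plt

# Fake (n_samples, 2) chain standing in for two spectral-fit parameters.
chain = np.random.default_rng(2).multivariate_normal(
    [1.8, 0.5], [[0.01, 0.004], [0.004, 0.04]], size=20_000)

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 3.5))
ax1.hist(chain[:, 0], bins=60, density=True)      # 1-d marginal histogram
ax1.set(xlabel="parameter 1", ylabel="marginal density")
ax2.hist2d(chain[:, 0], chain[:, 1], bins=60)     # 2-d probability density plot
ax2.set(xlabel="parameter 1", ylabel="parameter 2")
plt.tight_layout()
plt.show()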

14 Future
Use "physical" priors: have the posterior from previous work be the prior for the current work. For example, use the observed distribution of photon indices of nearby AGN when fitting for N_H in deep surveys.
Incorporate calibration uncertainty into the fitting (Kashyap AISR project).
XSPEC has a plug-in mechanism for user-defined proposal distributions; it would be good to also allow user-defined priors.
Code repository/wiki for MCMC analysis in astronomy.

