CMB Power spectrum likelihood approximations

Antony Lewis Institute of Astronomy, Cambridge

CMB Power spectrum likelihood approximations Antony Lewis, IoA Work with Samira Hamimeche

Start with the full sky and isotropic noise; assume the alm are Gaussian.

Integrate over the alm that give the same Chat: the result is a Wishart distribution. For temperature the distribution is non-Gaussian, with skewness ~ 1/l. For unbiased parameters we need bias << error bar, so we might need to be careful at all ell.
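For temperature on the full sky the exact likelihood is simple to evaluate, since (2l+1) Chat/C follows a chi-squared distribution with 2l+1 degrees of freedom. A minimal sketch (the function name is illustrative, not code from the talk):

```python
import numpy as np

def exact_temp_loglike(ell, C_hat, C_theory):
    """-2 ln(likelihood) for a single multipole, normalized so that the
    minimum (zero) is at C_theory = C_hat; constant terms are dropped.

    On the full sky (2l+1) C_hat/C is chi-squared with 2l+1 dof, giving
    -2 ln L = (2l+1) [C_hat/C - ln(C_hat/C) - 1].
    """
    x = C_hat / C_theory
    return (2 * ell + 1) * (x - np.log(x) - 1.0)
```

The asymmetry of this function in C about C_hat is the ~1/l skewness referred to above: theory values below C_hat are penalized more strongly than values above it by the same factor.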

Gaussian/quadratic approximation. Gaussian in what? What is the variance?
- Not a Gaussian in Chat: no determinant term; with a fixed fiducial variance it is exactly unbiased (the best fit is correct on average).
- Actual Gaussian in Chat, or change variable: Gaussian in log(C), C^(-1/3), etc…
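The fixed-fiducial-variance option can be sketched as follows (a hypothetical helper, not the implementation used in the talk); because the weighting does not depend on the theory C, the best fit is unbiased:

```python
import numpy as np

def fiducial_gaussian_chisq(ell, C_hat, C_theory, C_fid):
    """Quadratic (Gaussian) approximation to -2 ln L for one multipole,
    using a fixed fiducial model for the variance.

    The full-sky cosmic variance of C_hat is 2 C^2 / (2l+1); evaluating
    it at a fiducial C_fid rather than the theory C drops the determinant
    term and keeps the best fit unbiased, at the cost of error bars that
    depend on choosing a sensible fiducial model.
    """
    var = 2.0 * C_fid ** 2 / (2 * ell + 1)
    return (C_hat - C_theory) ** 2 / var
```

Note the result is symmetric in C_theory about C_hat, unlike the skewed exact likelihood.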

Do you get the answer right for the amplitude over the range lmin < l <= lmin+1?

Binning: skewness ~ 1/(number of modes) ~ 1/(l Δl), so any Gaussian approximation can be used for Δl >> 1.
Fiducial Gaussian: unbiased; the error bars depend on the right fiducial model, but one is easy to choose, and the result is accurate to 1/root(l).
Gaussian approximation with determinant: the best-fit amplitude is biased; almost always a good approximation for l >> 1, though somewhat slow to calculate.

New approximation. Can we write the exact likelihood in a form that generalizes to cut-sky estimators?
- correlations between TT, TE, EE
- correlations between l, l'
Would like: exact on the full sky with isotropic noise; uses the full covariance information; quick to calculate.

Matrices or vectors? Work with the vector of the n(n+1)/2 distinct elements of C, and its covariance. For symmetric A and B, the key result is a trace identity relating products of the matrices to quadratic forms in these vectors [equations on slide].
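The matrix-to-vector packing mentioned above can be sketched like this (the name vecp follows common usage; the slide's element ordering is an assumption):

```python
import numpy as np

def vecp(M):
    """Return the n(n+1)/2 distinct elements of a symmetric n x n matrix
    as a vector, taking the lower triangle row by row."""
    return M[np.tril_indices(M.shape[0])]
```

For the n = 2 temperature/E-polarization case, C = [[TT, TE], [TE, EE]] packs into (TT, TE, EE).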

For example, the exact likelihood function can be written in terms of X and M using this result. Try to write it as a quadratic form that can be generalized to the cut sky.

Likelihood approximation [equations on slide]: define a transformed variable, write the likelihood as a quadratic form in it, then re-write in terms of the vector of matrix elements…

For some fiducial model Cf [equations on slide], this now generalizes to the cut sky.
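The construction above can be sketched numerically. This follows the published Hamimeche & Lewis form, with g(x) = sign(x-1) sqrt(2 (x - ln x - 1)) applied to the eigenvalues of C^{-1/2} Chat C^{-1/2}; the function names are illustrative:

```python
import numpy as np

def sym_sqrtm(M):
    """Matrix square root of a symmetric positive-definite matrix."""
    w, V = np.linalg.eigh(M)
    return V @ np.diag(np.sqrt(w)) @ V.T

def hl_transformed_matrix(C_hat, C_theory, C_fid):
    """Transformed matrix X_g for one multipole:
        X_g = C_f^{1/2} U g(D) U^T C_f^{1/2},
    where U D U^T = C^{-1/2} C_hat C^{-1/2} and
        g(x) = sign(x-1) sqrt(2 (x - ln x - 1)).
    -2 ln L is then approximated by a quadratic form in the distinct
    elements of X_g, weighted by the inverse fiducial-model covariance.
    """
    C_inv_sqrt = np.linalg.inv(sym_sqrtm(C_theory))
    d, U = np.linalg.eigh(C_inv_sqrt @ C_hat @ C_inv_sqrt)
    g = np.sign(d - 1.0) * np.sqrt(2.0 * (d - np.log(d) - 1.0))
    Cf_sqrt = sym_sqrtm(C_fid)
    return Cf_sqrt @ U @ np.diag(g) @ U.T @ Cf_sqrt
```

In the 1x1 (temperature-only) case, with fiducial variance 2 Cf^2/(2l+1), the quadratic form reproduces the exact full-sky likelihood, which is the motivation for this choice of g.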

Other approximations are also good just for temperature, but they don't generalize. Can calculate the likelihood exactly for azimuthal cuts and uniform noise, to compare.

Unbiased on average

T and E: consistency with binned likelihoods (all Gaussian approximations accurate to 1/(l Δl) by the central limit theorem).

Test with realistic mask kp2, use pseudo-Cl directly

Isotropic noise test: ~143 GHz from the science case; red: the same realisation analysed on the full sky, all 1 < l < 2001. Provisional CosmoMC module at http://cosmologist.info/cosmomc/CMBLike.html

More realistic anisotropic Planck noise: /data/maja1/ctp_ps/phase_2/maps/cmb_symm_noise_all_gal_map_1024.fits. For the test, upgrade to Nside=2048 and smooth with a 7/3 arcmin beam. What is the noise level?

Science case vs phase2 sim (TT only, noise as-is)

Hybrid Pseudo-Cl estimators. Following GPE 2003, 2006 (+ numerous PCL papers), with a slight generalization to cross-weights. For n weight functions wi: X = Y gives n(n+1)/2 estimators; X ≠ Y gives n^2 estimators in general.

Covariance matrix approximations: small scales, large fsky, etc. … a straightforward generalization of GPE's results.

Also need all cross-terms…

Combine to a hybrid estimator? Find the best single (Gaussian) fit spectrum using the covariance matrix (GPE03). Keep it simple: do each Cl separately.
- Low noise: want uniform weight, to minimize cosmic variance.
- High noise: want inverse-noise weight, to minimize noise (but this increases the cosmic variance: lower effective fsky).
Most natural choice of window function set? w1 = uniform, w2 = inverse (smoothed-with-beam) noise.
Estimators like CTT,11 CTT,12 CTT,22 … For cross-spectra CTE,11 CTE,12 CTE,21 CTE,22, but polarization is much noisier than T, so is CTE,11 CTE,12 CTE,22 OK?
Low-l TT: force to uniform-only? Or maybe negative hybrid noise is fine, and doing better?
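Finding the best single (Gaussian) fit spectrum from several weighted estimators, as in GPE03, is a generalized least-squares combination. A minimal sketch at a single multipole (hypothetical helper, ignoring l-l' coupling):

```python
import numpy as np

def hybrid_cl(estimates, cov):
    """Combine several pseudo-Cl estimates of the same Cl into a single
    hybrid estimate: the generalized least-squares fit of one number to
    the estimate vector, weighted by the inverse covariance.

    The weights sum to one, so the combination is unbiased if each input
    estimator is unbiased.
    """
    cov_inv = np.linalg.inv(cov)
    ones = np.ones(len(estimates))
    weights = cov_inv @ ones / (ones @ cov_inv @ ones)
    return weights @ estimates
```

With noise-dominated covariances the inverse-noise-weighted estimator dominates the combination, and with cosmic-variance-dominated covariances the uniform-weight one does, which is the trade-off described above.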

TT cov diagonal, 2 weights

Does the weight1-weight2 estimator add anything useful? TT hybrid, diagonal covariance (dashed: binned): 2 weights (3 estimators) vs 3 weights (6 estimators) vs 2 weights diagonal-only (GPE), noise×1. Does it asymptote to the optimal value?

TE probably much more useful.. TE diagonal covariance

Hybrid estimator: cmb_symm_noise_all_gal_map_1024.fits sim with TT noise/16, N_QQ = N_UU = 4 N_TT, fwhm = 7 arcmin, 2 weights, kp2 cut.

l > 30, tau fixed: full-sky uniform-noise exact likelihood (science case, 153 GHz avg) vs the TT, TE, EE polarized hybrid (2 weights, 3 cross) estimator on the sim (noise/16). Somewhat cheating by using the exact fiducial model; chi-squared/2 is not very good: 3200 vs 2950.

Very similar result with Gaussian approx and (true) fiducial covariance

What about cross-spectra from maps with independent noise (Xfaster)?
- On the full sky the estimators no longer have a Wishart distribution (e.g. for temperature).
- Asymptotically, for large numbers of maps, they do -> the same likelihood approximation is probably OK when the information loss is small.

Conclusions
- Gaussian can be good at l >> 1, but MUST include the determinant: either a function of the theory, or a constant fixed fiducial model.
- New likelihood approximation: exact on the full sky; fast to calculate; uses Nl, the C-estimators, the fiducial Cl and the fiducial covariance.
- With good Cl-estimators it might even work at low l [MUCH faster than a pixel-based likelihood].
- Seems to work, but need to test for small biases.