# NASSP Masters 5003F - Computational Astronomy - 2009. Lecture 5: source detection

## Test the null hypothesis (NH)

- The NH says: let's suppose there is no source there, i.e. the model is just background.
- Calculate $\chi^2$ for this null model.
- Calculate the probability of $\chi^2$ exceeding the value obtained. (Sometimes called a P-value.)

[Figure annotation: cutoff at the 5% level.]
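The test above can be sketched as follows. This is a minimal illustration, assuming independent Gaussian errors of known standard deviation on each bin and an independently known background, so that the degrees of freedom equal the number of bins. The names `chi2_sf` and `null_test` and the sample numbers are made up for the example and are not from the lecture.

```python
import math

def chi2_sf(x, dof):
    """P(chi-squared with integer `dof` exceeds x), from the exact finite series."""
    if x <= 0:
        return 1.0
    h = x / 2.0
    if dof % 2 == 0:
        term = math.exp(-h)              # k = 0 term of exp(-h) * sum h^k / k!
        total = term
        for k in range(1, dof // 2):
            term *= h / k
            total += term
        return total
    total = math.erfc(math.sqrt(h))      # dof = 1 case
    term = math.exp(-h) * math.sqrt(h) / math.gamma(1.5)
    for k in range(1, (dof - 1) // 2 + 1):
        total += term
        term *= h / (k + 0.5)
    return total

def null_test(data, background, sigma, p_cutoff=0.05):
    """Return (chi2, P-value, detected?) for the background-only model."""
    chi2 = sum(((y - b) / s) ** 2 for y, b, s in zip(data, background, sigma))
    p_value = chi2_sf(chi2, len(data))   # background known: dof = number of bins
    return chi2, p_value, p_value < p_cutoff

# Illustrative data: a flat background of 10 with a bump in bins 2 and 3.
data = [10.2, 9.1, 14.8, 16.0, 10.9, 9.5]
background = [10.0] * 6
sigma = [1.0] * 6
chi2, p_value, detected = null_test(data, background, sigma)
print(chi2, p_value, detected)
```

A small P-value here only says that the background-only model is a poor description of these bins; it does not say where the source is.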

## Source detection

- If this probability (the P-value) is smaller than a previously chosen cutoff, call this a positive detection.

BUT! Note that there is no certainty.

- Sometimes the null model will by chance give a large $\chi^2$ => 'false positives'. For given data, background and cutoff, there will be a fixed number of false positives expected in the source list. => 'reliability'. More on this later.
- Sometimes a real source will give a small null-hypothesis $\chi^2$ => 'false negatives', real sources which are missed. => 'completeness'. More on this later.
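The expected number of false positives can be checked with a short Monte Carlo. The sketch below makes a simplifying assumption that is not in the lecture: each trial is a single pure-noise value with unit Gaussian scatter, tested one-sidedly, rather than a full $\chi^2$ test. The function name `false_positive_fraction` is hypothetical.

```python
import random
from statistics import NormalDist

def false_positive_fraction(n_trials, p_cutoff, seed=0):
    """Fraction of pure-noise trials that pass a one-sided cut at p_cutoff."""
    rng = random.Random(seed)
    # Threshold whose upper-tail probability under the null equals p_cutoff.
    threshold = NormalDist().inv_cdf(1.0 - p_cutoff)
    hits = sum(1 for _ in range(n_trials) if rng.gauss(0.0, 1.0) > threshold)
    return hits / n_trials

n_trials, p_cutoff = 100_000, 0.05
frac = false_positive_fraction(n_trials, p_cutoff)
print(f"expected false positives:  {n_trials * p_cutoff:.0f}")
print(f"simulated false positives: {frac * n_trials:.0f}")
```

With no sources present at all, roughly a fraction `p_cutoff` of the trials still end up in the 'source list', which is why the cutoff controls the reliability.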

## Problems with the NH approach

- We don't have exact knowledge of the background. We have to estimate it:
  - either from separate data, in which case we need separate data!
  - or from the same data… but this may be dominated by the source...
- Or our background model may be wrong.

Same issues as other model fitting. In particular:

- $\chi^2$ has to be used with care when the noise is Poisson.

## But where are the sources?

A low probability for the null hypothesis tells us, at best, that there is a source somewhere. Finding the source(s) consists rather of looking for peaks in a random signal. The simplest example is when the noise is uncorrelated and the source peaks have width = 0.

## A generic source-detection algorithm

We shall assume that:

- The data is 'binned' (e.g. CCD data).
- We have a good independent estimate of the background.
- The sources are sparsely distributed, such that we can deal with them one at a time.
- The shape of the source profile is known.
- The source position is unknown.
- The source amplitude is unknown (but > 0).

## Generic source-detection algorithm

First choose a $P_\mathrm{cutoff}$. The algorithm then has 3 steps:

1. Calculate a sliding-window map.
2. Find the peaks in this map.
3. For each peak, calculate the probability that it could arise by chance from the background (the null hypothesis P-value).

Decision for each peak: is $P < P_\mathrm{cutoff}$? Yes: it goes into the source list. No: it goes into the rejects.

[Figure, slide "1: The sliding window": data values y mapped to window outputs U.]

## 1: The sliding window

For each position of the sliding window, a single number U is calculated from the values falling within the window. The output is a map of the U values.

The intent is to (three ways of saying the same thing):

- Raise the signal-to-noise.
- Improve sensitivity.
- Amplify the sources at the expense of the noise.

Sliding-window processing only has value when the source has a width > 1 pixel.

Edges need special treatment.
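A sliding window in one dimension can be sketched as below. The function name `sliding_window_map` and the sample data are illustrative. The edge treatment chosen here (only window positions that fit entirely inside the data are evaluated, so the map is shorter than the input) is just one of several possible options.

```python
def sliding_window_map(y, width, statistic):
    """Apply `statistic` to every full window of `width` samples; return the U map."""
    return [statistic(y[i:i + width]) for i in range(len(y) - width + 1)]

# Illustrative data: a 3-pixel-wide bump on a low background.
y = [0, 1, 0, 2, 5, 2, 0, 1, 0]
u = sliding_window_map(y, 3, sum)   # simplest statistic: the plain sum ("sliding box")
print(u)
```

Passing the statistic as a function keeps the windowing logic separate from the choice of U, so a weighted sum, a per-window $\chi^2$ or a likelihood can be swapped in.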

## 1: Window functions

- A weighted sum (= a convolution).
  - Simplest with all weights = 1: "sliding box".
  - Optimum weights, a "matched filter": for uniform Gaussian noise, $w_\mathrm{opt} = s$. Trickier to optimize for Poisson noise.
- Per-window null-hypothesis $\chi^2$.
  - With either an independent value of the background (in which case degrees of freedom = number of pixels $N_w$ in the window), or…
  - …one fitted from the data (degrees of freedom = $N_w - 1$).
- Likelihood (same background provisions as $\chi^2$).
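To illustrate why the matched filter is the optimum weighted sum for uniform Gaussian noise, the sketch below compares the signal-to-noise of a sliding box with that of weights equal to the source profile s. It assumes independent noise of equal standard deviation in every pixel and a Gaussian-shaped profile; both function names and all numbers are made up for the example. The sum is written as a correlation, which coincides with a convolution for a symmetric profile.

```python
import math

def weighted_sum_map(y, weights):
    """U_i = sum_j w_j * y_(i+j), evaluated at every full window position."""
    n_w = len(weights)
    return [sum(w * v for w, v in zip(weights, y[i:i + n_w]))
            for i in range(len(y) - n_w + 1)]

def window_snr(weights, profile, amplitude, sigma):
    """S/N of U for a window centred on a source amplitude*profile,
    with independent Gaussian noise of standard deviation sigma per pixel."""
    signal = amplitude * sum(w * s for w, s in zip(weights, profile))
    noise = sigma * math.sqrt(sum(w * w for w in weights))
    return signal / noise

# Assumed source profile: a Gaussian of width 1.5 pixels sampled on 9 pixels.
profile = [math.exp(-0.5 * (r / 1.5) ** 2) for r in range(-4, 5)]
box = [1.0] * len(profile)

snr_box = window_snr(box, profile, amplitude=3.0, sigma=1.0)
snr_matched = window_snr(profile, profile, amplitude=3.0, sigma=1.0)
print(snr_box, snr_matched)

# Noise-free check: the matched-filter map peaks where the window lines up
# with the source, which starts at index 5 of this padded array.
y = [0.0] * 5 + [3.0 * s for s in profile] + [0.0] * 5
u = weighted_sum_map(y, profile)
print(u.index(max(u)))
```

The matched filter never does worse than the box under these assumptions, because the ratio of the weighted signal to the root-sum-square of the weights is maximised when the weights are proportional to the profile (Cauchy-Schwarz inequality).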

[Figure, slide "1: Window functions": the parent function and the data.]

[Figure, slide "1: Window functions": the parent function compared with three window maps: matched filter (size = 10), chi squared (size = 100), log-likelihood (size = 100).]

[Figure, slide "2: Peak finding": Gaussian noise, convolved with a Gaussian filter. …Don't get the Gaussians mixed up!]

## 2: Peak finding

No single neat prescription. Naive prescription:

- Pixel i is a peak pixel if $y_i$ is greater than every other y within a patch of pixels from i-j to i+j.

But what value to choose for j? Things to avoid are:

- j too small: results in more than 1 peak per source;
- j too large: misses a close adjacent source.
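The naive prescription, and the effect of the choice of j, can be sketched as follows. The function name `find_peaks` and the data are illustrative; near the edges the patch is simply clipped to the array, which is one possible edge treatment among several.

```python
def find_peaks(y, j):
    """Indices i where y[i] exceeds every other value in the patch i-j .. i+j."""
    peaks = []
    for i, value in enumerate(y):
        lo, hi = max(0, i - j), min(len(y), i + j + 1)   # clip the patch at the edges
        if all(value > y[k] for k in range(lo, hi) if k != i):
            peaks.append(i)
    return peaks

# Made-up map: one source around index 3 whose top is split by noise into
# two local maxima (indices 2 and 4), plus a separate source at index 8.
y = [0, 1, 3, 2, 3.5, 1, 0, 0, 5, 1, 0]

print(find_peaks(y, 1))   # j too small: both indices 2 and 4 are reported
print(find_peaks(y, 2))   # one peak per source
print(find_peaks(y, 4))   # j too large: the source at index 4 is lost
```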

[Figure, slide "2: Peak finding": results with the box too small and with the box too large.]

## 3: Decision time – is it a source or not?

To calculate a P-value we need the probability distribution of peaks in the post-window map of U values (given the null hypothesis).

- This is not the same as the probability distribution of the original data values…
- …nor is it even the same as the probability distribution of U values.

In fact, little work seems to have been done on $p_\mathrm{peaks}$. (Though there is quite a lot on the distribution of extrema, which is not quite the same thing.)
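The difference between the 'all pixels' and 'peaks' distributions shows up even in the simplest case. The sketch below is an assumption-laden illustration, not the lecture's calculation: it uses unsmoothed white Gaussian noise (no window applied) and defines a peak as a sample larger than both immediate neighbours. The function name `tail_fractions` is hypothetical.

```python
import random

def tail_fractions(n_pixels, threshold, seed=1):
    """Fraction of all pixels, and of peak pixels, exceeding `threshold`
    in pure unit-variance Gaussian noise."""
    rng = random.Random(seed)
    y = [rng.gauss(0.0, 1.0) for _ in range(n_pixels)]
    # Peak: strictly larger than both immediate neighbours (interior pixels only).
    peaks = [y[i] for i in range(1, n_pixels - 1)
             if y[i] > y[i - 1] and y[i] > y[i + 1]]
    frac_all = sum(v > threshold for v in y) / len(y)
    frac_peaks = sum(v > threshold for v in peaks) / len(peaks)
    return frac_all, frac_peaks

frac_all, frac_peaks = tail_fractions(50_000, threshold=2.0)
print(f"all pixels above threshold: {frac_all:.4f}")
print(f"peaks above threshold:      {frac_peaks:.4f}")
```

Because peaks are selected for being large, a given threshold is exceeded by a larger fraction of peaks than of pixels, so a P-value taken from the all-pixel distribution would understate the chance of a noise peak.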

[Figure, slide "3: The decision": 'map' vs 'peak' distributions for Gaussian noise. Black: all pixels. Red: peaks.]

## 3: Cash to the rescue

First of all, remember that our model m has p parameters $\theta = [\theta_1, \theta_2, \dots, \theta_p]$.

Cash theory: form a ratio between 2 likelihoods.

- The numerator is calculated with all p parameters fixed at their 'null hypothesis' values.
- For the denominator, a subset, q in number, of the parameters are adjusted to give the highest likelihood value.

$-2\log(\text{this ratio})$ behaves like $\chi^2$ with q degrees of freedom.
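Written out, the ratio described above takes the following form. The notation is an assumption made for this illustration: a superscript 0 marks a null-hypothesis value, and the q adjusted parameters are taken, without loss of generality, to be the first q.

```latex
C \;=\; -2 \ln
\frac{L\!\left(\theta_1^{0}, \dots, \theta_p^{0}\right)}
     {\displaystyle\max_{\theta_1, \dots, \theta_q}
      L\!\left(\theta_1, \dots, \theta_q,\; \theta_{q+1}^{0}, \dots, \theta_p^{0}\right)}
\;\sim\; \chi^2_q
```

Since the denominator can never be smaller than the numerator, C is never negative, and large values of C count against the null hypothesis.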

## 3: Cash to the rescue

A practical recipe for applying Cash to source detection goes as follows:

- Choose a window area surrounding each peak.
- Within this window, calculate $L_\mathrm{null}$ with model $m_i = b_i$ (the background map values).
- Calculate $L_\mathrm{best}$ by fitting a model $m_i = b_i + \theta_1 s(r_i - \theta_r)$. Degrees of freedom $\nu$ = 1 (the amplitude) + d (the dimensions of the spatial fit).
- The Cash statistic $2(L_\mathrm{best} - L_\mathrm{null})$ behaves like $\chi^2$ with 1 + d degrees of freedom.
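The recipe can be sketched in one dimension (d = 1, so 2 degrees of freedom) as below. Several things here are assumptions made for illustration and do not come from the lecture: Poisson-distributed counts, L taken to be the log-likelihood, a Gaussian source profile of known width, a brute-force grid search standing in for a proper fit, and all function names and data values.

```python
import math

def poisson_loglike(counts, model):
    """Poisson log-likelihood; the ln(n!) term is dropped because it cancels
    in the difference L_best - L_null."""
    return sum(n * math.log(m) - m for n, m in zip(counts, model))

def profile(r, width=1.0):
    """Assumed known source shape s(r): a unit-height Gaussian."""
    return math.exp(-0.5 * (r / width) ** 2)

def cash_statistic(counts, bkg, amp_grid, pos_grid):
    """Return (2*(L_best - L_null), best amplitude, best position) for the
    model m_i = b_i + amp * s(i - pos), maximised by grid search."""
    l_null = poisson_loglike(counts, bkg)
    l_best, best_amp, best_pos = l_null, 0.0, None   # amp = 0 is the null model
    for pos in pos_grid:
        shape = [profile(i - pos) for i in range(len(counts))]
        for amp in amp_grid:
            model = [b + amp * s for b, s in zip(bkg, shape)]
            l = poisson_loglike(counts, model)
            if l > l_best:
                l_best, best_amp, best_pos = l, amp, pos
    return 2.0 * (l_best - l_null), best_amp, best_pos

# Made-up window of counts around a peak, with a flat background map.
counts = [2, 3, 2, 9, 14, 8, 3, 2, 2]
bkg = [2.5] * len(counts)
amp_grid = [0.5 * k for k in range(1, 61)]    # amplitudes 0.5 .. 30
pos_grid = [0.25 * k for k in range(0, 33)]   # positions 0 .. 8

c_stat, best_amp, best_pos = cash_statistic(counts, bkg, amp_grid, pos_grid)
# For exactly 2 degrees of freedom the chi-squared tail probability is exp(-x/2).
p_value = math.exp(-c_stat / 2.0)
print(c_stat, best_amp, best_pos, p_value)
```

The grid search is only there to keep the example dependency-free; any maximiser of the likelihood over amplitude and position would serve.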

## 3: Cash to the rescue

The only difficult point (which is a problem for every method) is to calculate the fraction of pixels which are peaks.

- Monte Carlo
- Possibly a Fourier technique?

Also, we don't want to use the fit for final parameter values. A Mighell fit is better.
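A Monte Carlo estimate of the peak fraction might look like the sketch below. It is only a one-dimensional toy under stated assumptions: pure Gaussian noise, a peak defined as a sample larger than both immediate neighbours, and a sliding-box sum of width 5 as the window function. The function names are hypothetical.

```python
import random

def peak_fraction(u):
    """Fraction of interior samples that are larger than both neighbours."""
    n_peaks = sum(1 for i in range(1, len(u) - 1)
                  if u[i] > u[i - 1] and u[i] > u[i + 1])
    return n_peaks / (len(u) - 2)

def box_smooth(y, width):
    """Sliding-box sum over every full window position."""
    return [sum(y[i:i + width]) for i in range(len(y) - width + 1)]

rng = random.Random(2)
noise = [rng.gauss(0.0, 1.0) for _ in range(30_000)]

frac_white = peak_fraction(noise)                  # raw, uncorrelated noise
frac_smooth = peak_fraction(box_smooth(noise, 5))  # after the sliding window
print(frac_white, frac_smooth)
```

For uncorrelated noise about one pixel in three is a peak by this definition, since each of three neighbouring independent values is equally likely to be the largest. The window introduces correlations between neighbouring U values and changes the fraction, so the estimate has to be made on the post-window map.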

## Useful references

- W. Press et al., "Numerical Recipes in Fortran"
- P. Bevington, "Data Reduction and Error Analysis for the Physical Sciences"
- W. Cash, ApJ 228, 939 (1979)
- K. J. Mighell, ApJ 518, 380 (1999)
- I. M. Stewart, A&A 454, 997 (2006)
- I. M. Stewart, A&A, in print (2009)
- Wikipedia
