
1 Segmentation C. Phillips, Institut Montefiore, ULg, 2006

2 Definition In image analysis, segmentation is the partition of a digital image into multiple regions (sets of pixels) according to some criterion. The goal of segmentation is typically to locate certain objects of interest depicted in the image. Segmentation criteria can be arbitrarily complex and may combine global and local information. A common requirement is that each region must be connected in some sense.

3 A simple example of segmentation is thresholding a grayscale image with a fixed threshold t: each pixel p is assigned to one of two classes, P_0 or P_1, depending on whether I(p) < t or I(p) ≥ t. Here t = 0.5.
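A minimal sketch of such a fixed-threshold segmentation, assuming a NumPy array of intensities scaled to [0, 1] (the array contents and the threshold value are illustrative):

```python
import numpy as np

def threshold_segment(img: np.ndarray, t: float = 0.5) -> np.ndarray:
    """Assign each pixel to class P1 (True) if I(p) >= t, else to P0 (False)."""
    return img >= t

# Toy 3x3 "image" with intensities in [0, 1]
img = np.array([[0.1, 0.4, 0.6],
                [0.7, 0.2, 0.9],
                [0.5, 0.3, 0.8]])
print(threshold_segment(img, t=0.5).astype(int))
```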

4 Example: medical imaging...

5 How do we choose the threshold?

6 Goal of brain image segmentation Split the head volume into its "main" components: gray matter (GM), white matter (WM), cerebro-spinal fluid (CSF), and the rest/others (e.g. tumour).

7 Segmentation approaches Manual segmentation: an operator classifies the voxels manually.

8 Segmentation approaches Semi-automatic segmentation: an operator defines a set of parameters that are passed to an algorithm. Example: threshold at t = 200.

9 Automatic segmentation: no operator intervention ⇒ objective and reproducible.

10 Intensity-based segmentation Model the histogram of the image!

11 Segmentation - Mixture Model Intensities are modelled by a mixture of K Gaussian distributions, parameterised by: means, variances, and mixing proportions.

12 Segmentation - Algorithm Start with initial estimates of the belonging probabilities. Then iterate: compute the Gaussian parameters from the belonging probabilities; compute the belonging probabilities from the Gaussian parameters; if not converged, repeat, otherwise stop.
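A sketch of this loop for a one-dimensional intensity model, assuming a plain maximum-likelihood mixture of K Gaussians fitted with EM (variable and function names are illustrative, not SPM code):

```python
import numpy as np

def fit_mog(y, K=3, n_iter=100, tol=1e-6):
    """Fit a K-component Gaussian mixture to the intensities y with EM."""
    y = np.asarray(y, dtype=float)
    N = y.size
    # Starting estimates
    mu = np.linspace(y.min(), y.max(), K)      # means
    var = np.full(K, y.var())                  # variances
    gamma = np.full(K, 1.0 / K)                # mixing proportions
    prev_ll = -np.inf
    for _ in range(n_iter):
        # Belonging probabilities from the Gaussian parameters (E-step)
        dens = gamma * np.exp(-0.5 * (y[:, None] - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
        resp = dens / dens.sum(axis=1, keepdims=True)
        # Gaussian parameters from the belonging probabilities (M-step)
        nk = resp.sum(axis=0)
        mu = (resp * y[:, None]).sum(axis=0) / nk
        var = (resp * (y[:, None] - mu) ** 2).sum(axis=0) / nk
        gamma = nk / N
        # Converged?
        ll = np.log(dens.sum(axis=1)).sum()
        if ll - prev_ll < tol:
            break
        prev_ll = ll
    return mu, var, gamma, resp
```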

13 Segmentation - Problems Noise & Partial volume effect

14 Segmentation - Problems Intensity bias field: MR images are corrupted by a smooth intensity non-uniformity (bias). (Figure: image with bias artefact vs. corrected image.)

15 Segmentation - Priors Overlay prior belonging probability maps to assist the segmentation: the prior probability of each voxel being of a particular type is derived from segmented images of 151 subjects, assumed to be representative. Requires initial registration to standard space.

16 Unified approach: segmentation - correction - registration. Bias correction informs segmentation; registration informs segmentation; segmentation informs bias correction; bias correction informs registration; segmentation informs registration.

17 Unified Segmentation The solution to this circularity is to put everything in the same Generative Model. A MAP solution is found by repeatedly alternating among classification, bias correction and registration steps. The Generative Model involves: a Mixture of Gaussians (MOG), a bias correction component, and a warping (non-linear registration) component.

18 Gaussian Probability Density If intensities are assumed to be Gaussian with mean μ_k and variance σ_k², then the probability of a value y_i is:
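The formula on this slide (an image in the original) is presumably the standard univariate Gaussian density:

P(y_i | c_i = k, \mu_k, \sigma_k) = \frac{1}{\sqrt{2\pi\sigma_k^2}} \exp\!\left( -\frac{(y_i - \mu_k)^2}{2\sigma_k^2} \right)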

19 Non-Gaussian Probability Distribution A non-Gaussian probability density function can be modelled by a Mixture of Gaussians (MOG), with mixing proportions that are positive and sum to one.
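The mixture density referred to here (shown as an image in the original) is presumably:

P(y_i | \mu, \sigma, \gamma) = \sum_{k=1}^{K} \gamma_k \, P(y_i | c_i = k, \mu_k, \sigma_k), \qquad \gamma_k \ge 0, \quad \sum_{k=1}^{K} \gamma_k = 1

where each P(y_i | c_i = k, \mu_k, \sigma_k) is the Gaussian density of slide 18 and the \gamma_k are the mixing proportions.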

20 Mixing Proportions The mixing proportion γ_k represents the prior probability of a voxel being drawn from class k, irrespective of its intensity. So:
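The formula following "So:" (an image in the original) is presumably the prior and the resulting joint probability of an intensity and a class:

P(c_i = k | \gamma) = \gamma_k, \qquad P(y_i, c_i = k | \mu_k, \sigma_k, \gamma_k) = P(y_i | c_i = k, \mu_k, \sigma_k)\, \gamma_k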

21 Non-Gaussian Intensity Distributions Multiple Gaussians per tissue class allow non-Gaussian intensity distributions to be modelled.

22 Probability of Whole Dataset If the voxels are assumed to be independent, then the probability of the whole image is the product of the probabilities of each voxel. It is often easier to work with negative log-probabilities:
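The two formulas here (images in the original) are presumably the product over voxels and its negative logarithm:

P(\mathbf{y} | \mu, \sigma, \gamma) = \prod_i P(y_i | \mu, \sigma, \gamma), \qquad E = -\log P(\mathbf{y} | \mu, \sigma, \gamma) = -\sum_i \log \sum_{k=1}^{K} \gamma_k \, P(y_i | c_i = k, \mu_k, \sigma_k)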

23 Modelling a Bias Field A bias field is included, such that the required scaling at voxel i, parameterised by β, is ρ_i(β). Replace the means μ_k by μ_k / ρ_i(β) and the variances σ_k² by (σ_k / ρ_i(β))².
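Substituting these replacements into the Gaussian density shows why the "rearranged" form on the next slide is simply a Gaussian of the bias-scaled data (a short derivation sketch):

P(y_i | c_i = k, \mu_k, \sigma_k, \beta) = \frac{1}{\sqrt{2\pi (\sigma_k/\rho_i(\beta))^2}} \exp\!\left( -\frac{(y_i - \mu_k/\rho_i(\beta))^2}{2 (\sigma_k/\rho_i(\beta))^2} \right) = \frac{\rho_i(\beta)}{\sqrt{2\pi\sigma_k^2}} \exp\!\left( -\frac{(\rho_i(\beta)\, y_i - \mu_k)^2}{2\sigma_k^2} \right)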

24 Modelling a Bias Field After rearranging: P(y_i | c_i = k, μ_k, σ_k, β) = ρ_i(β)/√(2πσ_k²) · exp(-(ρ_i(β) y_i - μ_k)²/(2σ_k²)).

25 Tissue Probability Maps Tissue probability maps (TPMs) are used instead of the proportion of voxels in each Gaussian as the prior. ICBM Tissue Probabilistic Atlases: these tissue probability maps are kindly provided by the International Consortium for Brain Mapping, John C. Mazziotta and Arthur W. Toga.

26 “Mixing Proportions” Tissue probability maps for each class are available. The probability of obtaining class k at voxel i, given the weights γ, is then:
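The formula referred to here is presumably the one from the unified segmentation model, where b_ik denotes the tissue probability map value for class k at voxel i:

P(c_i = k | \gamma) = \frac{\gamma_k \, b_{ik}}{\sum_{j=1}^{K} \gamma_j \, b_{ij}}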

27 Deforming the Tissue Probability Maps Tissue probability images are deformed according to parameters α. The probability of obtaining class k at voxel i, given the weights γ and parameters α, is then obtained by replacing b_ik with the deformed map value b_ik(α).

28 The Extended Model - The Objective Function By combining the modified P(c_i = k | γ, α) and P(y_i | c_i = k, μ_k, σ_k, β), the overall objective function E becomes:
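Written out (presumably, following the unified segmentation formulation), the objective is the negative log-probability of the whole dataset under the combined model:

E = -\sum_i \log \left[ \sum_{k=1}^{K} \frac{\gamma_k \, b_{ik}(\alpha)}{\sum_{j} \gamma_j \, b_{ij}(\alpha)} \; \frac{\rho_i(\beta)}{\sqrt{2\pi\sigma_k^2}} \exp\!\left( -\frac{(\rho_i(\beta)\, y_i - \mu_k)^2}{2\sigma_k^2} \right) \right]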

29 Optimisation The "best" parameters are those that minimise this objective function. Optimisation involves finding them. Begin with starting estimates, and repeatedly change them so that the objective function decreases each time.

30 Schematic of optimisation (Iterated Conditional Modes) Repeat until convergence: hold γ, μ, σ² and α constant, and minimise E w.r.t. β (Levenberg-Marquardt strategy, using dE/dβ and d²E/dβ²); hold γ, μ, σ² and β constant, and minimise E w.r.t. α (Levenberg-Marquardt strategy, using dE/dα and d²E/dα²); hold α and β constant, and minimise E w.r.t. γ, μ and σ² (Expectation Maximisation (EM) strategy).

31 Levenberg-Marquardt Optimisation LM optimisation is used for the nonlinear registration and bias correction components. It requires the first and second derivatives of the objective function E. Parameters α and β are updated by the step below. Increase λ to improve stability (at the expense of slower convergence).
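The update rule (an image on the original slide) is presumably the standard Levenberg-Marquardt step, written here for a generic parameter vector \theta standing for \alpha or \beta, with \lambda the stabilisation parameter:

\theta^{(n+1)} = \theta^{(n)} - \left( \frac{\partial^2 E}{\partial \theta^2} + \lambda \mathbf{I} \right)^{-1} \frac{\partial E}{\partial \theta}

A larger \lambda makes the step smaller and closer to gradient descent, hence more stable but slower to converge.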

32 EM is used to update μ, σ² and γ For iteration (n), alternate between: E-step: estimate the belonging probabilities from the current parameters; M-step: set the parameters (μ, σ², γ)^(n+1) to values that reduce the objective function. The standard updates are sketched below.
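Written out for the plain mixture model (ignoring the bias and tissue-prior terms, which modify the densities and priors but not the structure of the updates), the standard EM steps are:

E-step: q_{ik}^{(n)} = \frac{\gamma_k^{(n)} P(y_i | c_i = k, \mu_k^{(n)}, \sigma_k^{(n)})}{\sum_{j=1}^{K} \gamma_j^{(n)} P(y_i | c_i = j, \mu_j^{(n)}, \sigma_j^{(n)})}

M-step: \mu_k^{(n+1)} = \frac{\sum_i q_{ik}^{(n)} y_i}{\sum_i q_{ik}^{(n)}}, \quad \sigma_k^{2\,(n+1)} = \frac{\sum_i q_{ik}^{(n)} (y_i - \mu_k^{(n+1)})^2}{\sum_i q_{ik}^{(n)}}, \quad \gamma_k^{(n+1)} = \frac{\sum_i q_{ik}^{(n)}}{\sum_i \sum_j q_{ij}^{(n)}}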

33 Bayesian Formulation Bayes rule states: p(q|e) ∝ p(e|q) p(q), where p(q|e) is the a posteriori probability of parameters q given errors e, p(e|q) is the likelihood of observing errors e given parameters q, and p(q) is the a priori probability of parameters q. The maximum a posteriori (MAP) estimate maximises p(q|e). Maximising p(q|e) is equivalent to minimising the Gibbs potential of the posterior distribution, H(q|e) = -log p(q|e). The posterior potential is the sum of the likelihood and prior potentials: H(q|e) = H(e|q) + H(q) + c. The likelihood potential, H(e|q) = -log p(e|q), is based upon the sum of squared differences between the images. The prior potential, H(q) = -log p(q), penalises unlikely deformations.

34 Linear Regularisation Some bias fields and distortions are more probable (a priori) than others. This is encoded using Bayes rule. The prior probability distributions can be modelled by multivariate normal distributions, with mean vectors m_a and m_b and covariance matrices S_a and S_b, so that -log[P(a)] = (a - m_a)^T S_a^{-1} (a - m_a) + const.

35 Voxels are assumed independent!

36 Hidden Markov Random Field Voxels are NOT independent: GM voxels are surrounded by other GM voxels, at least on one side. Model the intensity and classification of the image voxels by two random fields: a visible field y for the intensities and a hidden field c for the classifications. Modify the cost function E accordingly; at each voxel, the 6 neighbouring voxels are used to build U_mrf, imposing local spatial constraints.
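The slide does not show the modified cost function explicitly; a common choice in HMRF segmentation (an assumption here, not necessarily the exact form used in this work) is to add a Potts-like potential over the 6-neighbourhood N_i of each voxel:

E_{mrf} = E + \beta_{mrf} \sum_i U_{mrf}(c_i), \qquad U_{mrf}(c_i) = \sum_{j \in N_i} [\, c_j \neq c_i \,]

where [\cdot] equals 1 when the neighbouring voxel carries a different class label and 0 otherwise, so spatially inconsistent labellings are penalised.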

37 Hidden Markov Random Field (Figure: T1 image and T2 image.)

38 Hidden Markov Random Field White matter. (Figure: segmentation from T1 & T2 with MoG + HMRF vs. T1 only with MoG only.)

39 Hidden Markov Random Field Gray matter. (Figure: segmentation from T1 & T2 with MoG + HMRF vs. T1 only with MoG only.)

40 Hidden Markov Random Field CSF. (Figure: segmentation from T1 & T2 with MoG + HMRF vs. T1 only with MoG only.)

41 Perspectives Multimodal segmentation: one image is good, but two are better! Model the joint histogram using multi-dimensional normal distributions. Tumour detection: use contrast-enhanced images to modify the prior images; automatic detection of outliers?

42 Thank you for your attention!


Download ppt "SegmentationSegmentation C. Phillips, Institut Montefiore, ULg, 2006."

Similar presentations


Ads by Google