Bootstrap Segmentation Analysis and Expectation Maximization

Presentation transcript:

Bootstrap Segmentation Analysis and Expectation Maximization to Detect and Characterize Sources
Jeffrey.D.Scargle@nasa.gov, Space Science Division, NASA Ames Research Center
Diffuse Emission & LAT Source Catalog, SLAC Workshop: May 23, 2005

Problems (Solutions)
- Algorithm provides no error estimate (Bootstrap errors; block posterior probabilities)
- Point spread: overlapping sources (Expectation Maximization, EM)
- Point spread: function of energy (Maximum likelihood models?)

One-Dimensional Example: Swift GRB Data (but everything applies to 2D and higher!)

Expectation Maximization (EM)
Initialize: find a good guess at the mixture model:
- How many sources?
- Locations of the sources
- Source parameters (size, spectra, …)
Iterate:
1. From the model, divide the data into pieces that are relevant to each source separately
2. Re-determine the source locations and parameters by fitting to the data pieces from step 1
Repeat as needed.
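The slides give the EM recipe but no code. Below is a minimal Python sketch of that loop, assuming Gaussian-shaped sources over a flat background on an interval [0, T]; the function name and parameterization are illustrative, not from the talk.

```python
import numpy as np

def em_sources_plus_background(t, T, mu, sigma, n_iter=200):
    """EM for a 1D mixture of Gaussian 'sources' over a uniform background.

    t     : 1D array of event times on [0, T]
    T     : length of the observation interval
    mu    : initial guesses for the source locations
    sigma : initial guesses for the source widths
    Returns mixture weights, locations, widths, and per-event responsibilities.
    """
    t = np.asarray(t, dtype=float)
    mu, sigma = np.array(mu, float), np.array(sigma, float)
    K = len(mu)
    # Mixture weights: K sources plus one background component.
    w = np.full(K + 1, 1.0 / (K + 1))

    for _ in range(n_iter):
        # E-step: "divide the data into pieces" -- the responsibility of each
        # component for each event (a soft partitioning of the data).
        dens = np.empty((len(t), K + 1))
        for k in range(K):
            dens[:, k] = w[k] * np.exp(-0.5 * ((t - mu[k]) / sigma[k]) ** 2) \
                         / (np.sqrt(2 * np.pi) * sigma[k])
        dens[:, K] = w[K] / T                      # flat background
        resp = dens / dens.sum(axis=1, keepdims=True)

        # M-step: re-determine source locations and parameters by fitting
        # each component to its weighted piece of the data.
        nk = resp.sum(axis=0)
        w = nk / len(t)
        for k in range(K):
            mu[k] = np.sum(resp[:, k] * t) / nk[k]
            sigma[k] = np.sqrt(np.sum(resp[:, k] * (t - mu[k]) ** 2) / nk[k])

    return w, mu, sigma, resp
```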

The “Maximization” Step with Point Data
Maximize the unbinned log-likelihood:
    log L = Σ_i log[ a(t_i) + b ]  −  ∫ w(t) [ a(t) + b ] dt
where the model for each source (pulse) is X(t) = a(t) + b, with:
- a = source parameters (location, size, …)
- b = local background constant
- w(t) = the EM partitioning, or weighting, function
NB: the term Σ_i log[ w(t_i) ] in the full log-likelihood is a constant, irrelevant for model fitting.
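A direct transcription of this likelihood into code may help. The sketch below assumes the second term is the integral of the weighted model over the observation interval, approximated on a user-supplied grid, and takes the source model a(t) and weight w(t) as callables; all names are illustrative.

```python
import numpy as np

def unbinned_loglike(t_events, a, b, w, t_grid):
    """Weighted unbinned log-likelihood for one source plus local background.

    t_events : event times (softly) assigned to this source
    a        : callable, source model a(t) (e.g. a Gaussian pulse)
    b        : local constant background rate
    w        : callable, EM partitioning / weighting function w(t)
    t_grid   : grid over the observation interval for the integral term
    """
    # Data term: sum over events of log[a(t_i) + b].
    data_term = np.sum(np.log(a(t_events) + b))

    # Model term: integral of w(t) * [a(t) + b], trapezoidal approximation.
    integrand = w(t_grid) * (a(t_grid) + b)
    model_term = np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(t_grid))

    return data_term - model_term
```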

Science!

Bootstrap Method: Time Series of N Discrete Events
For many iterations:
- Randomly select N of the observed events, with replacement
- Analyze this sample just as if it were real data
Then compute the mean and variance of the bootstrap results:
- Bias = result for real data − bootstrap mean
- RMS error derived from the bootstrap variance
Caveat: the real data do not contain the repeated events that appear in bootstrap samples; I am not sure what effect this has.
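A hedged Python sketch of this resampling loop, for an arbitrary statistic of the event list (the function and argument names are illustrative):

```python
import numpy as np

def bootstrap_errors(events, statistic, n_boot=1000, rng=None):
    """Bootstrap bias and RMS error for a statistic of an event list.

    events    : array of N observed event times (or positions)
    statistic : callable mapping an event list to a number
                (e.g. a fitted source location)
    """
    rng = np.random.default_rng(rng)
    events = np.asarray(events)
    real_value = statistic(events)

    boot = np.empty(n_boot)
    for i in range(n_boot):
        # Resample N events with replacement and analyze as if real data.
        sample = rng.choice(events, size=len(events), replace=True)
        boot[i] = statistic(sample)

    bias = real_value - boot.mean()   # result for real data - bootstrap mean
    rms_error = boot.std(ddof=1)      # RMS error from the bootstrap variance
    return bias, rms_error
```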

Piecewise Constant Model (partitions the data space)
The signal is modeled as constant over each partition element (block).

Optimum Partitions in Higher Dimensions
- Blocks are collections of Voronoi cells (1D, 2D, …)
- Relax the condition that blocks be connected; cell location is now irrelevant
- Order the cells by volume
- Theorem: the optimum partition consists of blocks that are connected in this ordering
- The 1D algorithm, O(N²), can now be used
- Postprocessing: identify connected block fragments
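The slides do not show the 1D algorithm itself. The sketch below is one way to implement the O(N²) dynamic program on cells already ordered by volume, assuming an additive block fitness supplied by the caller; all names are illustrative.

```python
import numpy as np

def optimal_partition(cell_volumes, cell_counts, block_fitness):
    """O(N^2) dynamic program for the optimal partition of ordered cells.

    cell_volumes  : Voronoi cell volumes, already sorted by volume
                    (the ordering in which optimal blocks are connected)
    cell_counts   : event count in each cell (1 per cell for point data)
    block_fitness : callable F(total_count, total_volume) for one block;
                    the model fitness is the sum of F over its blocks.
    Returns the block start indices (change points) in the sorted ordering.
    """
    N = len(cell_volumes)
    cum_vol = np.concatenate(([0.0], np.cumsum(cell_volumes)))
    cum_cnt = np.concatenate(([0.0], np.cumsum(cell_counts)))

    best = np.zeros(N + 1)        # best[j] = best fitness of the first j cells
    last = np.zeros(N + 1, int)   # last[j] = start index of the final block

    for j in range(1, N + 1):
        # Try every possible start i of the block ending at cell j-1.
        fit = np.array([
            best[i] + block_fitness(cum_cnt[j] - cum_cnt[i],
                                    cum_vol[j] - cum_vol[i])
            for i in range(j)
        ])
        last[j] = int(np.argmax(fit))
        best[j] = fit[last[j]]

    # Backtrack to recover the change points.
    change_points, j = [], N
    while j > 0:
        change_points.append(last[j])
        j = last[j]
    return sorted(change_points)
```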

Blocks
- Block: a set of data cells
- Two cases: connected (cannot be broken into distinct parts), or not constrained to be connected
- Model = a set of blocks
- Fitness function: F(Model) = sum over blocks of F(Block)
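The slides leave F(Block) unspecified. One common choice for point data is the maximized Poisson log-likelihood of a constant-rate block, N·log(N/V), with a per-block penalty to control the number of blocks; the sketch below uses that choice, and the penalty value is illustrative.

```python
import numpy as np

def poisson_block_fitness(n_events, volume, penalty=4.0):
    """One possible additive block fitness: maximized constant-rate Poisson
    log-likelihood N*log(N/V), minus a constant penalty per block."""
    if n_events <= 0:
        return -penalty
    return n_events * np.log(n_events / volume) - penalty

# Usage with the partition sketch above: sort the Voronoi cells by volume,
# then call
#   cps = optimal_partition(sorted_volumes, counts_in_sorted_order,
#                           poisson_block_fitness)
```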

Connected vs. Arbitrary Blocks

2D Synthetic Bootstrap Example: Raw Data

Local Mean & Variance of Area/Energy (idea due to Bill Atwood)