
1 Particle Filtering

2 Sensors and Uncertainty
Real-world sensors are noisy and suffer from missing data (e.g., occlusions, GPS blackouts). Use sensor models to estimate the ground truth and unobserved variables, and to make forecasts.

3 Hidden Markov Model Use observations to get a better idea of where the robot is at time t. Hidden state variables: X0, X1, X2, X3. Observed variables: z1, z2, z3. Predict – observe – predict – observe…

4 Last Class Kalman Filtering and its extensions
Exact Bayesian inference for Gaussian state distributions, process noise, and observation noise. What about more general distributions? The key representational issue: how to represent and perform calculations on probability distributions?

5 Agenda Bayesian filtering, in more detail
Particle filtering: a Monte Carlo approach to Bayesian filtering with complex distributions. Reading: Principles Ch. 9.

6 Aside… why is this hard?

7 Bayesian Prediction on Markov Chain
The distribution of X1 depends on the value of X0, the distribution of X2 depends on the value of X1, and so on: X0 → X1 → X2 → X3. $P(X_t \mid X_{t-1})$ is known as the transition model.

8 Bayesian Prediction on MC
Prediction / forecasting: what’s the probability distribution over a future state Xt?

9 Bayesian Prediction on MC
Prediction / forecasting: what’s the probability distribution over a future state Xt? Need to marginalize over the possible values of the prior state: $P(X_t = x) = \int_{x_{t-1}} P(X_t = x \mid x_{t-1}) \, P(x_{t-1}) \, dx_{t-1}$, where $P(X_t = x \mid x_{t-1})$ is the transition model and $P(x_{t-1})$ is the distribution over the previous state.

10 Bayesian Prediction on MC
Prediction / forecasting: what’s the probability distribution over a future state Xt? Need to marginalize over the possible values of the prior state: $P(X_t = x) = \int_{x_{t-1}} P(X_t = x \mid x_{t-1}) \, P(x_{t-1}) \, dx_{t-1}$. Recursive inference: maintain a belief state $Bel_t(X) = P(X_t)$ and use the equation above to advance to $Bel_{t+1}(X)$.
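In a discrete state space the integral above becomes a sum, and the prediction step is a matrix-vector product. A minimal sketch, where the 3-state transition matrix T is a made-up example, not from the slides:

```python
# One Bayesian prediction step on a discrete-state Markov chain.
# T is a hypothetical 3-state transition matrix: T[i, j] = P(X_t = j | X_{t-1} = i).
import numpy as np

T = np.array([[0.8, 0.2, 0.0],
              [0.1, 0.8, 0.1],
              [0.0, 0.2, 0.8]])

bel = np.array([1.0, 0.0, 0.0])  # Bel_0: start in state 0 with certainty

# Marginalize over the previous state: Bel_1(x) = sum_{x'} P(x | x') Bel_0(x')
bel = bel @ T
print(bel)  # [0.8 0.2 0. ]
```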


12 Belief state evolution
$P(X_t) = \sum_{x_{t-1}} P(X_t \mid x_{t-1}) \, P(x_{t-1})$. The belief “blurs” over time and, if the domain is bounded, typically approaches a stationary distribution as t grows, which limits prediction power. The rate of blurring is known as the “mixing time.”
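Continuing the hypothetical 3-state chain from the earlier sketch, repeatedly applying the prediction equation shows the blurring: the belief converges to the chain's stationary distribution, here (0.25, 0.5, 0.25).

```python
# Repeated prediction steps blur the belief toward the stationary distribution.
import numpy as np

T = np.array([[0.8, 0.2, 0.0],
              [0.1, 0.8, 0.1],
              [0.0, 0.2, 0.8]])
bel = np.array([1.0, 0.0, 0.0])

for t in range(100):
    bel = bel @ T
print(bel)  # approaches [0.25, 0.5, 0.25], regardless of the initial belief
```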

13 History Dependence In Markov models, the state must be chosen so that the future distribution is determined entirely by the current state (history independence). Often this requires adding variables that cannot be directly observed. Examples: given a fragment of text, what comes next? Are these people walking toward you or away from you?

14 Partial Observability
Hidden Markov Model (HMM): hidden state variables X0, X1, X2, X3; observed variables z1, z2, z3. $P(Z_t \mid X_t)$ is called the observation model (or sensor model).

15 Bayesian Filtering The name comes from signal processing.
Query: the hidden state variables X0, X1, X2, X3. Observed variables: z1, z2, z3.

16 Bayesian Filtering The name comes from signal processing.
Maintain a belief over time: $Bel_t(x) = P(X_t = x \mid z_1, \ldots, z_t)$.

17 Bayesian Filtering $P(X_t = x \mid z_1, \ldots, z_t) = \int_{x_{t-1}} P(X_t = x \mid x_{t-1}, z_t) \, P(x_{t-1} \mid z_1, \ldots, z_{t-1}) \, dx_{t-1}$

18 Bayesian Filtering $P(X_t = x \mid z_1, \ldots, z_t) = \int_{x_{t-1}} P(X_t = x \mid x_{t-1}, z_t) \, Bel_{t-1}(x_{t-1}) \, dx_{t-1}$

19 Bayesian Filtering $P(X_t = x \mid z_1, \ldots, z_t) = \int_{x_{t-1}} P(X_t = x \mid x_{t-1}, z_t) \, Bel_{t-1}(x_{t-1}) \, dx_{t-1}$, where by Bayes’ rule $P(X_t = x \mid x_{t-1}, z_t) = P(z_t \mid x_{t-1}, X_t = x) \, P(X_t = x \mid x_{t-1}) \,/\, P(z_t \mid x_{t-1})$.

20 Bayesian Filtering $P(X_t = x \mid z_1, \ldots, z_t) = \int_{x_{t-1}} P(X_t = x \mid x_{t-1}, z_t) \, Bel_{t-1}(x_{t-1}) \, dx_{t-1}$, where $P(X_t = x \mid x_{t-1}, z_t) = P(z_t \mid x_{t-1}, X_t = x) \, P(X_t = x \mid x_{t-1}) \,/\, P(z_t \mid x_{t-1})$, and by the Markov property $P(z_t \mid x_{t-1}, X_t = x) = P(z_t \mid X_t = x)$.

21 Bayesian Filtering $P(X_t = x \mid z_1, \ldots, z_t) = \int_{x_{t-1}} P(X_t = x \mid x_{t-1}, z_t) \, Bel_{t-1}(x_{t-1}) \, dx_{t-1}$, where $P(X_t = x \mid x_{t-1}, z_t) = P(z_t \mid x_{t-1}, X_t = x) \, P(X_t = x \mid x_{t-1}) \,/\, P(z_t \mid x_{t-1})$, $P(z_t \mid x_{t-1}, X_t = x) = P(z_t \mid X_t = x)$, and $P(z_t \mid x_{t-1}) = \int_{x_t} P(z_t \mid x_t) \, P(x_t \mid x_{t-1}) \, dx_t$.

22 Bayesian Filtering Recap
$P(X_t = x \mid z_1, \ldots, z_t) = \int_{x_{t-1}} \frac{P(z_t \mid X_t = x)}{P(z_t \mid x_{t-1})} \, P(X_t = x \mid x_{t-1}) \, Bel_{t-1}(x_{t-1}) \, dx_{t-1}$, with $P(z_t \mid x_{t-1}) = \int_{x_t} P(z_t \mid x_t) \, P(x_t \mid x_{t-1}) \, dx_t$. Two-step interpretation: Predict $Bel'_t(x) = P(X_t = x \mid z_1, \ldots, z_{t-1})$ without $z_t$; Update using the information from $z_t$ to derive $Bel_t(x)$.
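For intuition, here is a minimal discrete Bayes filter showing the predict/update split; the transition matrix T and observation model O (with O[x, z] = P(z | x)) are made-up 3-state, 2-observation examples.

```python
# Discrete Bayes filter: predict with the transition model, update with the
# observation likelihood, then normalize.
import numpy as np

T = np.array([[0.8, 0.2, 0.0],    # T[i, j] = P(X_t = j | X_{t-1} = i)
              [0.1, 0.8, 0.1],
              [0.0, 0.2, 0.8]])
O = np.array([[0.9, 0.1],         # O[x, z] = P(z | x)
              [0.5, 0.5],
              [0.1, 0.9]])

def bayes_filter_step(bel, z):
    bel_pred = bel @ T            # predict: Bel'_t(x), without z_t
    bel_new = bel_pred * O[:, z]  # update: reweight by P(z_t | x)
    return bel_new / bel_new.sum()

bel = np.full(3, 1.0 / 3.0)
for z in [0, 0, 1]:               # a made-up observation sequence
    bel = bayes_filter_step(bel, z)
print(bel)
```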

23 Particle Filtering (aka Sequential Monte Carlo)
Represent distributions as a set of particles. Applicable to non-Gaussian, high-dimensional distributions. Convenient implementations. Widely used in vision and robotics.

24 Simultaneous Localization and Mapping (SLAM)
Mobile robots: odometry is locally accurate but drifts significantly over time; vision/lidar/sonar is locally inaccurate but tied to a global reference frame. Combine the two. State: (robot pose, map). Observations: sensor input.

25 General problem $x_t \sim Bel(x_t)$ (arbitrary p.d.f.), $x_{t+1} = f(x_t, u, \epsilon_p)$,
$z_{t+1} = g(x_{t+1}, \epsilon_o)$, where the process noise $\epsilon_p$ and the observation noise $\epsilon_o$ each follow an arbitrary p.d.f.
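As a concrete instance of this form, the later sketches use a made-up 1D system; the specific f, g, and Gaussian noise levels are illustrative assumptions, since the slides allow arbitrary noise distributions.

```python
# A hypothetical 1D instance of x_{t+1} = f(x_t, u, e_p), z_{t+1} = g(x_{t+1}, e_o).
import numpy as np

def f(x, u, rng):
    # nonlinear dynamics with additive Gaussian process noise e_p ~ N(0, 0.2^2)
    return x + u + 0.1 * np.sin(x) + rng.normal(0.0, 0.2, size=np.shape(x))

def g(x, rng):
    # nonlinear observation with additive Gaussian noise e_o ~ N(0, 0.5^2)
    return x**2 / 20.0 + rng.normal(0.0, 0.5, size=np.shape(x))
```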

26 Particle Representation
$Bel(x_t) = \{(w_k, x_k), k = 1, \ldots, N\}$, where the $w_k$ are weights and the $x_k$ are state hypotheses. The weights sum to 1. The set approximates the underlying distribution.

27 Monte Carlo Integration
More formally, if $P(x) \approx Bel(x) = \{(w_k, x_k), k = 1, \ldots, N\}$, then $E_P[\phi(x)] = \int \phi(x) \, P(x) \, dx \approx \sum_{k=1}^N w_k \, \phi(x_k)$ for any test function $\phi(x)$. What might you want to compute? Mean: use $\phi(x) = x$. Variance: use $\phi(x) = x^2$ (recover $Var(x) = E[x^2] - E[x]^2$). $P(y)$: use $\phi(x) = P(y \mid x)$, because $P(y) = \int P(y \mid x) \, P(x) \, dx$.
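A quick numerical sanity check of these estimators, using equally weighted particles drawn from a standard normal (an assumed test distribution):

```python
# Monte Carlo estimates of mean and variance from a weighted particle set.
import numpy as np

rng = np.random.default_rng(0)
xs = rng.normal(0.0, 1.0, size=10_000)  # state hypotheses x_k
ws = np.full(xs.size, 1.0 / xs.size)    # weights w_k, summing to 1

mean = np.sum(ws * xs)                  # phi(x) = x
second = np.sum(ws * xs**2)             # phi(x) = x^2
var = second - mean**2                  # Var(x) = E[x^2] - E[x]^2
print(mean, var)                        # close to 0 and 1
```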

28 Recovering the Distribution
Kernel density estimation: $P(x) \approx \sum_k w_k \, K(x, x_k)$, where $K(x, x_k)$ is the kernel function. The approximation improves as the number of particles and the kernel sharpness increase.
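A sketch of kernel density estimation over a weighted particle set, assuming a Gaussian kernel with bandwidth h (both the kernel choice and h = 0.2 are illustrative assumptions):

```python
# Kernel density estimate: P(x) ~= sum_k w_k K(x, x_k), with a Gaussian kernel.
import numpy as np

def kde(x, particles, weights, h=0.2):
    k = np.exp(-0.5 * ((x - particles) / h) ** 2) / (h * np.sqrt(2.0 * np.pi))
    return np.sum(weights * k)

rng = np.random.default_rng(0)
particles = rng.normal(0.0, 1.0, size=1000)
weights = np.full(1000, 1.0 / 1000)
print(kde(0.0, particles, weights))  # near the N(0, 1) density at 0, about 0.4
```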

29 Performing a transformation
Let $P(x) \approx Bel(x) = \{(w_k, x_k), k = 1, \ldots, N\}$ and let $y = f(x)$ be a general nonlinear transformation. We want to recover the distribution $Q(y)$ of $y = f(X)$. Hypothesis: $Bel(y) = \{(w_k, f(x_k)), k = 1, \ldots, N\}$ approximates $Q(y)$.

30 Particle Propagation

31 Performing a transformation
Hypothesis: $Bel(y) = \{(w_k, f(x_k)), k = 1, \ldots, N\}$ approximates $Q(y)$. Let $\phi$ be a test function: $E_Q[\phi(y)] = \int_y \phi(y) \, Q(y) \, dy = \int_y \phi(y) \int_x I[f(x) = y] \, P(x) \, dx \, dy$

32 Performing a transformation
Hypothesis: $Bel(y) = \{(w_k, f(x_k)), k = 1, \ldots, N\}$ approximates $Q(y)$. Let $\phi$ be a test function: $E_Q[\phi(y)] = \int_y \phi(y) \, Q(y) \, dy = \int_y \phi(y) \int_x I[f(x) = y] \, P(x) \, dx \, dy$. Now consider $I[f(x) = y]$ as a test function $\psi_y(x)$: $\int_x I[f(x) = y] \, P(x) \, dx \approx \sum_{k=1}^N w_k \, \psi_y(x_k) = \sum_{k=1}^N w_k \, I[f(x_k) = y]$, so $E_Q[\phi(y)] \approx \int_y \phi(y) \sum_{k=1}^N w_k \, I[f(x_k) = y] \, dy = \sum_{k=1}^N w_k \int_y \phi(y) \, I[f(x_k) = y] \, dy = \sum_{k=1}^N w_k \, \phi(y_k)$.
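An empirical check of the hypothesis, assuming f(x) = x² and particles drawn from N(1, 0.5²): the weighted average of the transformed particles should match E[f(X)] = μ² + σ² = 1.25.

```python
# Push particles through y = f(x) = x^2 and keep their weights; the resulting
# set estimates expectations under Q(y).
import numpy as np

rng = np.random.default_rng(0)
xs = rng.normal(1.0, 0.5, size=100_000)
ws = np.full(xs.size, 1.0 / xs.size)

ys = xs**2              # transformed particles (w_k, f(x_k))
print(np.sum(ws * ys))  # E_Q[y], close to 1.25
```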

33 Filtering Steps
Predict: compute $Bel'(x_{t+1})$, the distribution of $x_{t+1}$ obtained using the dynamics model alone. Update: compute a representation of $P(x_{t+1} \mid z_{t+1})$ via likelihood weighting for each particle in $Bel'(x_{t+1})$, then resample to produce $Bel(x_{t+1})$ for the next step.

34 Predict Step Given input particles $Bel(x_t)$,
the distribution of $x_{t+1} = f(x_t, u_t, \epsilon)$ is determined by sampling $\epsilon$ from its distribution and then propagating individual particles. This gives $Bel'(x_{t+1})$.

35 Update Step Goal: compute a representation of $P(x_{t+1} \mid z_{t+1})$ given $Bel'(x_{t+1})$ and $z_{t+1}$. $P(x_{t+1} \mid z_{t+1}) = \alpha \, P(z_{t+1} \mid x_{t+1}) \, P(x_{t+1})$, where $P(x_{t+1}) = Bel'(x_{t+1})$ (given). Each state hypothesis $x_k \in Bel'(x_{t+1})$ is reweighted by $P(z_{t+1} \mid x_{t+1})$. Likelihood weighting: $w_k \leftarrow w_k \, P(z_{t+1} \mid x_{t+1} = x_k)$, then renormalize so the weights sum to 1.

36 Update Step $w_k \leftarrow w_k \, P(z_{t+1} \mid x_{t+1} = x_k)$. 1D Gaussian example:
$g(x, \epsilon_o) = h(x) + \epsilon_o$ with $\epsilon_o \sim N(0, \sigma^2)$, giving $P(z_{t+1} \mid x_{t+1} = x_k) = C \exp(-(h(x_k) - z_{t+1})^2 / 2\sigma^2)$. In general, the distribution can be calibrated using experimental data.

37 Resampling Likelihood-weighted particles may no longer represent the distribution efficiently. Importance resampling: sample new particles proportionally to weight, as in the sketch below.
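A minimal sketch of importance resampling: draw particle indices in proportion to the weights, then reset to uniform weights.

```python
# Importance resampling: sample particle indices proportionally to weight.
import numpy as np

def resample(particles, weights, rng):
    idx = rng.choice(len(particles), size=len(particles), p=weights)
    return particles[idx], np.full(len(particles), 1.0 / len(particles))
```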

38 Sampling Importance Resampling (SIR) variant
Predict → Update → Resample
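Putting the three steps together for the hypothetical 1D system sketched at slide 25 (the dynamics, observation model, and noise levels are all illustrative assumptions, not from the slides):

```python
# One SIR step: predict, likelihood-weight, resample.
import numpy as np

def sir_step(particles, weights, u, z, rng, sigma_o=0.5):
    # Predict: propagate each particle through the dynamics with sampled noise
    particles = particles + u + 0.1 * np.sin(particles) \
                + rng.normal(0.0, 0.2, size=particles.shape)
    # Update: likelihood weighting under a Gaussian observation model
    predicted_z = particles**2 / 20.0
    weights = weights * np.exp(-0.5 * ((predicted_z - z) / sigma_o) ** 2)
    weights = weights / weights.sum()
    # Resample proportionally to weight, then reset to uniform weights
    idx = rng.choice(len(particles), size=len(particles), p=weights)
    return particles[idx], np.full(len(particles), 1.0 / len(particles))

rng = np.random.default_rng(0)
particles = rng.normal(0.0, 1.0, size=1000)
weights = np.full(1000, 1.0 / 1000)
particles, weights = sir_step(particles, weights, u=0.5, z=0.1, rng=rng)
print(np.sum(weights * particles))  # posterior mean estimate
```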

39 Particle Filtering Issues
Variance: the standard deviation of a quantity (e.g., the mean) computed as a function of the particle representation scales as $\sim 1/\sqrt{N}$. Loss of particle diversity: resampling will likely drop particles with low likelihood, yet they may turn out to be useful hypotheses in the future.

40 Other Resampling Variants
Selective resampling: keep the weights, and only resample when the number of “effective particles” falls below a threshold (a sketch follows). Stratified resampling: reduce variance using quasi-random sampling. Optimization: explicitly choose particles to minimize deviance from the posterior.
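A common estimate of the number of effective particles is $N_{eff} = 1 / \sum_k w_k^2$; the half-N trigger threshold below is a convention, not from the slides.

```python
# Effective sample size: resample only when N_eff drops below a threshold.
import numpy as np

def effective_n(weights):
    return 1.0 / np.sum(weights**2)

weights = np.array([0.7, 0.1, 0.1, 0.1])
if effective_n(weights) < 0.5 * len(weights):
    print("resample")  # N_eff ~ 1.92 < 2, so we would resample here
```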

41 Storing more information with the same number of particles
Unscented Particle Filter: each particle represents a local Gaussian and maintains a local covariance matrix; a combination of a particle filter and a Kalman filter. Rao-Blackwellized Particle Filter: for a state $(x_1, x_2)$, each particle contains a hypothesis for $x_1$ and an analytical distribution over $x_2$; this reduces variance.

42 Recap Bayesian mechanisms for state estimation are well understood;
the challenge is representation. Methods: Kalman filters give a highly efficient closed-form solution for Gaussian distributions; particle filters give approximate filtering for high-dimensional, non-Gaussian distributions. Implementation challenges differ across domains (localization, mapping, SLAM, tracking).

