Published by Zoe Houston. Modified over 3 years ago.

1
Using particle filters to fit state-space models of wildlife population dynamics
Len Thomas, Stephen Buckland, Ken Newman, John Harwood
University of St Andrews
16th September 2004
"I always wanted to be a model"

2
Outline
1. Introduction / background / recap
–State-space models
–Motivating example – grey seal model
–Methods of fitting the models
–Types of inference from time series data
2. Tutorial: nonlinear particle filtering / sequential importance sampling
–importance sampling
–sequential importance sampling (SIS)
–tricks
3. Results and discussion
–Potential of particle filtering in ecology and epidemiology

3
References
This talk: ftp://bat.mcs.st-and.ac.uk/pub/len/emstalk.zip
See also talk by David Elston, tomorrow
Books:
–Doucet, A., N. de Freitas and N. Gordon (eds). 2001. Sequential Monte Carlo Methods in Practice. Springer-Verlag.
–Liu, J.S. 2001. Monte Carlo Strategies in Scientific Computing. Springer-Verlag.
Papers from St Andrews group – see abstract

4
1. Introduction

5
State-space models
Describe the evolution of two parallel time series, for $t = 1, \ldots, T$
–$n_t$ – vector of true states (unobserved)
–$y_t$ – vector of observations on the states
State process density: $g_t(n_t \mid n_{t-1}; \Theta)$
Observation process density: $f_t(y_t \mid n_t; \Theta)$
Initial state density: $g_0(n_0; \Theta)$
Time points don't need to be equally spaced
States can be at the level of the individual

6
Hidden process models
An extension of state-space models
–state process can be higher than 1st-order Markov
–observation process can depend on previous states and observations

7
Example: British Grey Seal

8
Surveying seals
Hard to survey outside of breeding season: 80% of time at sea, 90% of this time underwater
Aerial surveys of breeding colonies since 1960s used to estimate pup production
(Other data: intensive studies, radio tracking, genetic, photo-ID, counts at haul-outs)
~6% per year overall increase in pup production

9
Estimated pup production

10
Orkney example colonies

11
Questions
What is the current population size?
What is the future population trajectory?
Biological interest in population processes
–E.g., movement of recruiting females
(What sort of data should be collected to most efficiently answer the above questions?)

12
Grey seal model: states
7 age classes
–pups ($n_0$)
–age 1 to age 5 females ($n_1$ to $n_5$)
–age 6+ females ($n_{6+}$) = breeders
48 colonies – aggregated into 4 regions

13
State process
Time step 1 year, which starts just after the breeding season
4 sub-processes
–survival
–age incrementation and sexing
–movement of recruiting females
–breeding
$n_{a,r,t-1}$ → survival → $u_{s,a,r,t}$ → age → $u_{i,a,r,t}$ → movement → $u_{m,a,r,t}$ → breeding → $n_{a,r,t}$

14
Survival
–density-independent adult survival
–density-dependent pup survival

15
Age incrementation and sexing
$u_{i,1,r,t} \sim \mathrm{Bin}(u_{s,0,r,t},\ 0.5)$
$u_{i,a+1,r,t} = u_{s,a,r,t}$, for $a = 1, \ldots, 4$
$u_{i,6+,r,t} = u_{s,5,r,t} + u_{s,6+,r,t}$
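The three rules above can be sketched in a few lines of Python. This is a minimal illustration for one region and one year, not the authors' implementation: the dictionary keys, the function names `binomial` and `age_and_sex`, and the example counts are all hypothetical.

```python
import random

def binomial(n, p, rng):
    """Draw from Bin(n, p) by summing n Bernoulli trials (adequate for modest n)."""
    return sum(rng.random() < p for _ in range(n))

def age_and_sex(u_s, rng):
    """Age incrementation and sexing for one region and year.

    u_s maps age class ("0" for pups, "1".."5", "6+") to survivor counts.
    Returns u_i, the female counts by age class after ageing.
    """
    u_i = {}
    # Surviving pups become age 1; each is female with probability 0.5.
    u_i["1"] = binomial(u_s["0"], 0.5, rng)
    # Ages 1-4 move up one class deterministically.
    for a in range(1, 5):
        u_i[str(a + 1)] = u_s[str(a)]
    # Age 5 survivors join the 6+ (breeder) class.
    u_i["6+"] = u_s["5"] + u_s["6+"]
    return u_i

# Hypothetical survivor counts, for illustration only.
rng = random.Random(1)
survivors = {"0": 1000, "1": 200, "2": 180, "3": 160, "4": 150, "5": 140, "6+": 2000}
aged = age_and_sex(survivors, rng)
```

Only the pup-to-age-1 transition is stochastic; the other transitions simply shift counts between classes.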

16
Movement of recruiting females
Only age 5 females move
Movement is fitness dependent
–females move if expected offspring survival is higher elsewhere
Expected proportion moving is proportional to
–difference in juvenile survival rates
–inverse of distance between colonies
–inverse of site faithfulness

17
Breeding
Density independent
$u_{b,0,r,t} \sim \mathrm{Bin}(u_{m,6+,r,t},\ \alpha)$

18
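Matrix representation
$E(n_t \mid n_{t-1}, \Theta) \approx B\, M_t\, A\, S_t\, n_{t-1}$
(matrices applied right to left: survival $S_t$, age incrementation $A$, movement $M_t$, breeding $B$)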

19
Matrix representation
$E(n_t \mid n_{t-1}, \Theta) \approx L_t\, n_{t-1}$

20
Life cycle graph
[Figure: life cycle graph with nodes pup, 1, 2, 3, 4, 5, 6+; label: density dependence]

21
Life cycle graph
[Figure: one life cycle graph (pup, 1, 2, 3, 4, 5, 6+) for each region: North Sea, Inner Hebrides, Outer Hebrides, Orkneys; labels: density dependence, density dependent movement, movement depends on distance, site faithfulness]

22
Observation process pup production estimates normally distributed, with variance proportional to expectation

23
Grey seal model: summary
Parameters:
–survival: $\phi_a$, $\phi_{pmax}$, $\beta_1, \ldots, \beta_4$
–breeding: $\alpha$
–movement: $\gamma_{dd}$, $\gamma_{dist}$, $\gamma_{sf}$
–observation CV: $\psi$
–total 7 + 4 = 11
States:
–7 age classes per region per year
–total 7 x 4 = 28 per year

24
Fitting state-space models
Analytic approaches
–Kalman filter (Gaussian linear model): Takis Besbeas
–Extended Kalman filter (Gaussian nonlinear model – approximate)
–Numerical maximization of the likelihood
Monte Carlo approximations
–Likelihood-based (Geyer 1996)
–Bayesian: Carmen Fernández
  Rejection sampling: Damien Clancy
  Markov chain Monte Carlo (MCMC): Philip O'Neill
  Monte Carlo particle filtering: Me!

25
Inference tasks for time series data
Observe data $y_{1:t} = (y_1, \ldots, y_t)$
We wish to infer the unobserved states $n_{1:t} = (n_1, \ldots, n_t)$ and parameters $\Theta$
Fundamental inference tasks:
–Smoothing: $p(n_{1:t}, \Theta \mid y_{1:t})$
–Prediction: $p(n_{t+x} \mid y_{1:t})$
–Filtering: $p(n_t, \Theta_t \mid y_{1:t})$

26
Filtering inference can be fast!
Suppose we have $p(n_t \mid y_{1:t})$
A new observation $y_{t+1}$ arrives. We want to update to $p(n_{t+1} \mid y_{1:t+1})$.
Can use the filtering recursion.
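A standard form of the filtering recursion, written with the state and observation densities $g$ and $f$ defined earlier and with $\Theta$ suppressed for brevity, is a prediction step followed by an update step. The exact form intended on this slide is an assumption.

$$p(n_{t+1} \mid y_{1:t}) = \int g_{t+1}(n_{t+1} \mid n_t)\, p(n_t \mid y_{1:t})\, dn_t$$

$$p(n_{t+1} \mid y_{1:t+1}) = \frac{f_{t+1}(y_{t+1} \mid n_{t+1})\, p(n_{t+1} \mid y_{1:t})}{\int f_{t+1}(y_{t+1} \mid n_{t+1})\, p(n_{t+1} \mid y_{1:t})\, dn_{t+1}}$$

Only the previous filtered distribution and the new observation are needed, which is why the update can be fast.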

27
Monte Carlo particle filters: online inference for evolving datasets
Particle filtering is used when fast online methods are required to produce updated (filtered) estimates as new data arrive:
–Tracking applications in radar, sonar, etc.
–Finance: stock prices and exchange rates arrive sequentially; online update of portfolios
–Medical monitoring: online monitoring of ECG data for sick patients
–Digital communications
–Speech recognition and processing

28
2. Monte Carlo Particle Filtering
Variants/Synonyms:
–Sequential Importance Sampling (SIS)
–Sequential Importance Sampling Resampling (SISR)
–Bootstrap Filter
–Interacting Particle Filter
–Auxiliary Particle Filter

29
Importance sampling
Want to make inferences about some function p(), but cannot evaluate it directly
Solution:
–Sample from another function q() (the importance function) which has the same support as p()
–Correct using importance weights

30
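Importance sampling algorithm
Want to make inferences about $p(n_{t+1} \mid y_{1:t+1})$
Prediction step: make K random draws $n_{t+1}^{(i)}$, $i = 1, \ldots, K$, from the importance function q()
Correction step:
–Calculate the importance weights: $w^{(i)} \propto p(n_{t+1}^{(i)} \mid y_{1:t+1}) / q(n_{t+1}^{(i)})$
–Normalize weights so that $\sum_{i=1}^{K} w^{(i)} = 1$
–Approximate the target function: $p(n_{t+1} \mid y_{1:t+1}) \approx \sum_{i=1}^{K} w^{(i)}\, \delta(n_{t+1} - n_{t+1}^{(i)})$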
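A minimal one-dimensional sketch of the prediction and correction steps, using only the Python standard library. The target N(2, 1) and importance function N(0, 3²) are toy choices made for illustration and do not come from the talk; `normal_pdf` and `importance_sample` are hypothetical helper names.

```python
import math
import random

def normal_pdf(x, mean, sd):
    """Density of a Normal(mean, sd^2) distribution at x."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))

def importance_sample(K, rng):
    # Prediction step: K draws from the importance function q = N(0, 3^2).
    draws = [rng.gauss(0.0, 3.0) for _ in range(K)]
    # Correction step: raw weights p/q for the toy target p = N(2, 1).
    raw = [normal_pdf(x, 2.0, 1.0) / normal_pdf(x, 0.0, 3.0) for x in draws]
    # Normalize the weights so that they sum to 1.
    total = sum(raw)
    weights = [w / total for w in raw]
    return draws, weights

rng = random.Random(42)
draws, weights = importance_sample(20000, rng)
# Weighted average approximates the mean of the target (here, 2).
post_mean = sum(w * x for w, x in zip(weights, draws))
```

The importance function is deliberately wider than the target, so every region where p() has mass is covered by draws from q().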

31
Sequential importance sampling
SIS is just repeated application of importance sampling at each time step
Basic sequential importance sampling:
–Proposal distribution $q() = g(n_{t+1} \mid n_t)$
–Leads to weights proportional to the likelihood of the new observation
To do basic SIS, need to be able to:
–Simulate forward from the state process
–Evaluate the observation process density (the likelihood)
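A short derivation of those weights, assuming the usual sequential weight update. The symbol $w_t^{(i)}$ for the weight of particle i at time t is introduced here for illustration, and $\Theta$ is suppressed.

$$w_{t+1}^{(i)} \propto w_t^{(i)}\, \frac{f(y_{t+1} \mid n_{t+1}^{(i)})\; g(n_{t+1}^{(i)} \mid n_t^{(i)})}{q(n_{t+1}^{(i)} \mid n_t^{(i)})}$$

With the proposal $q() = g(n_{t+1} \mid n_t)$ the state density cancels, leaving

$$w_{t+1}^{(i)} \propto w_t^{(i)}\, f(y_{t+1} \mid n_{t+1}^{(i)})$$

so the state process only has to be simulated, never evaluated.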

32
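Basic SIS algorithm
Generate K particles from the prior on $\{n_0, \Theta\}$, each with weight 1/K
For each time period t = 1, ..., T
–For each particle i = 1, ..., K
  Prediction step: simulate the particle's state forward from the state process
  Correction step: update the particle's weight using the observation process density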
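A minimal sketch of the basic SIS loop for a single-state growth model, using only the Python standard library. The priors, the multiplicative log-normal process noise, the Normal observation model and all numeric settings are assumptions chosen for illustration; they are not the model used in the talk. The observations 12 and 14 echo the worked example.

```python
import math
import random

def normal_pdf(x, mean, sd):
    """Density of a Normal(mean, sd^2) distribution at x."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))

def basic_sis(obs, K, rng, obs_sd=2.0, proc_sd=0.05):
    # Initialise: K particles (n_0, theta) from assumed uniform priors, weights 1/K.
    particles = [{"n": rng.uniform(5.0, 25.0), "theta": rng.uniform(0.8, 1.2)}
                 for _ in range(K)]
    weights = [1.0 / K] * K
    for y in obs:
        for i, p in enumerate(particles):
            # Prediction step: simulate forward from the (assumed) state process.
            p["n"] = p["theta"] * p["n"] * math.exp(rng.gauss(0.0, proc_sd))
            # Correction step: multiply the weight by the likelihood f(y | n).
            weights[i] *= normal_pdf(y, p["n"], obs_sd)
        # Renormalize so the weights sum to 1 after each time step.
        total = sum(weights)
        weights = [w / total for w in weights]
    return particles, weights

particles, weights = basic_sis([12.0, 14.0], 5000, random.Random(0))
# Filtered estimate of the current state: weighted mean over particles.
n_hat = sum(w * p["n"] for w, p in zip(weights, particles))
```

Each particle carries its own parameter value, so the weighted particles approximate the joint filtered distribution of state and parameter.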

33
Example of basic SIS
State-space model of exponential population growth
–State model
–Observation model
–Priors

34
Example of basic SIS: t=1
Sample from prior:
–$n_0$: 11 12 14 13 16 20 14 9 16
–$\Theta_0$: 1.055 1.107 1.195 0.974 0.936 1.029 1.081 1.201 1.000 0.958
–$w_0$: 0.1
Predict, giving the prior at t=1:
–$n_1$: 17 18 11 15 20 17 7 6 22
–$\Theta_0$: 1.055 1.107 1.195 0.974 0.936 1.029 1.081 1.201 1.000 0.958
–$w_0$: 0.1
Obs: 12 gives f(): 0.028 0.012 0.201 0.073 0.038 0.029 0.000 0.012
Correct, giving the posterior at t=1:
–$n_1$: 17 18 11 15 20 17 7 6 22
–$\Theta_1$: 1.055 1.107 1.195 0.974 0.936 1.029 1.081 1.201 1.000 0.958
–$w_1$: 0.063 0.034 0.558 0.202 0.010 0.063 0.000 0.003

35
Example of basic SIS: t=2
Posterior at t=1:
–$n_1$: 17 18 11 15 20 17 7 6 22
–$\Theta_1$: 1.055 1.107 1.195 0.974 0.936 1.029 1.081 1.201 1.000 0.958
–$w_1$: 0.063 0.034 0.558 0.202 0.010 0.063 0.000 0.003
Predict, giving the prior at t=2:
–$n_2$: 15 14 12 10 11 15 21 9 11 20
–$\Theta_1$: 1.055 1.107 1.195 0.974 0.936 1.029 1.081 1.201 1.000 0.958
–$w_1$: 0.063 0.034 0.558 0.202 0.010 0.063 0.000 0.003
Obs: 14 gives f(): 0.160 0.190 0.112 0.008 0.046 0.160 0.011 0.000 0.046 0.007
Correct, giving the posterior at t=2:
–$n_2$: 15 14 12 10 11 15 21 9 11 20
–$\Theta_2$: 1.055 1.107 1.195 0.974 0.936 1.029 1.081 1.201 1.000 0.958
–$w_2$: 0.105 0.068 0.691 0.015 0.005 0.105 0.007 0.000

36
Problem: particle depletion
Variance of weights increases with time, until few particles have almost all the weight
Results in large Monte Carlo error in approximation
Can quantify using the effective sample size (ESS)
From previous example:
Time  0     1    2
ESS   10.0  2.5  1.8
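One common definition of the effective sample size is ESS = K / (1 + CV²), where CV is the coefficient of variation of the weights. The sketch below assumes that definition and uses the sample variance (divisor K − 1); whether this is exactly the formula used in the talk is an assumption, and `effective_sample_size` is a hypothetical helper name.

```python
def effective_sample_size(weights):
    """ESS = K / (1 + CV^2) of the weights, using the sample variance."""
    K = len(weights)
    total = sum(weights)
    # Rescale so the weights have mean exactly 1; their variance is then CV^2.
    scaled = [K * w / total for w in weights]
    cv2 = sum((s - 1.0) ** 2 for s in scaled) / (K - 1)
    return K / (1.0 + cv2)

# Equal weights give ESS = K; a single dominant particle gives ESS near 1.
ess_equal = effective_sample_size([0.1] * 10)
ess_degenerate = effective_sample_size([1.0] + [0.0] * 9)
```

A falling ESS can be used as the trigger for the remedies discussed next, such as rejection control or resampling.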

37
Problem: particle depletion
Worse when:
–Observation error is small
–Lots of data at any one time point
–State process has little stochasticity
–Priors are diffuse or not congruent with observations
–State process model incorrect (e.g., time varying)
–Outliers in the data

38
Particle depletion: solutions
Pruning: throw out bad particles (rejection)
Enrichment: boost good particles (resampling)
–Directed enrichment (auxiliary particle filter)
–Mutation (kernel smoothing)
Other stuff
–Better proposals
–Better resampling schemes
–…

39
Rejection control
Idea: throw out particles with low weights
Basic algorithm, at time t:
–Have a pre-determined threshold, $c_t$, where $0 < c_t \le 1$
–For i = 1, …, K, accept particle i with a probability that depends on its weight and $c_t$
–If particle is accepted, update its weight
–Now we have fewer than K samples
Can make up samples by sampling from the priors, projecting forward to the current time point and repeating the rejection control
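A minimal sketch of one rejection control pass. It assumes the rule from the rejection control literature (accept with probability min(1, w/c), then set the weight to max(w, c)); whether this matches the talk's formulas exactly is an assumption, and `rejection_control` is a hypothetical helper name.

```python
import random

def rejection_control(particles, weights, c, rng):
    """Keep each particle with probability min(1, w/c); survivors get weight max(w, c)."""
    kept, new_weights = [], []
    for p, w in zip(particles, weights):
        accept_prob = min(1.0, w / c)
        if rng.random() < accept_prob:
            kept.append(p)
            # Low-weight survivors are raised to c to offset their thinning.
            new_weights.append(max(w, c))
    if not kept:
        return kept, new_weights
    total = sum(new_weights)
    return kept, [w / total for w in new_weights]

# Toy example: labels stand in for particles; threshold c = 0.1.
rng = random.Random(3)
labels = ["a", "b", "c", "d", "z"]
w = [0.5, 0.3, 0.15, 0.05, 0.0]
kept, new_w = rejection_control(labels, w, 0.1, rng)
```

Particles with weight at or above the threshold are always kept, so only the low-weight tail is thinned. Topping the sample back up to K particles, as described above, is left out of this sketch.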

40
Rejection control - discussion
Particularly useful at t=1 with diffuse priors
Can have a sequence of control points (not necessarily every year)
Check points don't need to be fixed – can trigger when variance of weights gets too high
Thresholds, $c_t$, don't need to be set in advance but can be set adaptively (e.g., mean of weights)
Instead of restarting at time t=0, can restart by sampling from particles at previous check point (= partial rejection control)

41
Resampling: pruning and enrichment
Idea: allow good particles to amplify themselves while killing off bad particles
Algorithm. Before and/or after each time step (not necessarily every time step)
–For j = 1, …, K
  Sample independently from the set of particles according to the resampling probabilities
  Assign new weights
Reduces particle depletion of states, as children particles with the same parent now evolve independently
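A minimal sketch of the simplest case, where the resampling probabilities equal the current weights and every resampled particle receives weight 1/K. This multinomial scheme is an illustrative choice, and `resample` is a hypothetical helper name.

```python
import copy
import random

def resample(particles, weights, rng):
    """Multinomial resampling: draw K particles with probability equal to their weights."""
    K = len(particles)
    # Sample K parent indices independently, with replacement.
    parents = rng.choices(range(K), weights=weights, k=K)
    # Copy each parent so that children can evolve independently afterwards.
    children = [copy.deepcopy(particles[j]) for j in parents]
    # After resampling in proportion to the weights, all weights are equal.
    return children, [1.0 / K] * K

rng = random.Random(7)
particles = [{"n": 17.0}, {"n": 11.0}, {"n": 15.0}, {"n": 7.0}]
weights = [0.1, 0.6, 0.3, 0.0]
children, new_weights = resample(particles, weights, rng)
```

The deep copy matters: children that share a parent must be separate objects, otherwise the next prediction step would move them together.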

42
Resample probabilities
Should be
–related to the weights (as in the bootstrap filter)
  α could vary according to the variance of weights
  α = ½ has been suggested
–related to future trend – as in auxiliary particle filter
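One way to write the weight-related option, assuming the power form suggested by the role of α. The symbols $a_i$ for the resampling probability and $w_i^{*}$ for the new weight are introduced here for illustration.

$$a_i \propto w_i^{\alpha}, \qquad w_i^{*} \propto \frac{w_i}{a_i} \propto w_i^{1-\alpha}, \qquad 0 < \alpha \le 1$$

With α = 1 the resampling probabilities equal the weights and the new weights are all equal, as in the bootstrap filter. Smaller values of α resample less aggressively and leave some of the weight variation on the surviving particles.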

43
Directed resampling: auxiliary particle filter
Idea: pre-select particles likely to have high weights in future
Example algorithm:
–For j = 1, …, K
  Sample independently from the set of particles according to the probabilities
  Predict:
  Correct:
If future observations are available, can extend to look >1 time step ahead – e.g., protein folding application

44
Kernel smoothing: enrichment of parameters through mutation
Idea: introduce small mutations into parameter values when resampling
Algorithm:
–Given a set of particles
–Let $V_t$ be the variance matrix of the parameter values
–For i = 1, …, K
  Sample a perturbed parameter value from a kernel centred on the current value, where h controls the size of the perturbations
–Variance of parameters is now $(1+h^2)V_t$, so need shrinkage to preserve the first 2 moments
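A minimal sketch of kernel smoothing with shrinkage for a single scalar parameter and equally weighted particles. It assumes a Normal kernel and the shrinkage factor a = sqrt(1 − h²), a common choice that keeps the mean and variance unchanged; the talk's exact kernel is not shown, so this is an assumption, and `kernel_smooth` is a hypothetical helper name.

```python
import math
import random

def kernel_smooth(thetas, h, rng):
    """Perturb parameter values with a Normal kernel, shrinking towards their mean."""
    K = len(thetas)
    mean = sum(thetas) / K
    var = sum((t - mean) ** 2 for t in thetas) / K
    # Shrinkage factor: a^2 * var + h^2 * var = var, so the variance is preserved.
    a = math.sqrt(1.0 - h * h)  # requires 0 < h < 1
    kernel_sd = h * math.sqrt(var)
    # Each new value is drawn around the shrunk location a*theta + (1 - a)*mean.
    return [rng.gauss(a * t + (1.0 - a) * mean, kernel_sd) for t in thetas]

rng = random.Random(11)
thetas = [rng.gauss(1.0, 0.1) for _ in range(20000)]
smoothed = kernel_smooth(thetas, 0.9, rng)

def mean_var(xs):
    m = sum(xs) / len(xs)
    return m, sum((x - m) ** 2 for x in xs) / len(xs)

m0, v0 = mean_var(thetas)
m1, v1 = mean_var(smoothed)
```

Without the shrinkage step (a = 1) the variance would inflate to (1 + h²) times its original value at every application.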

45
Kernel smoothing - discussion
Previous algorithm does not preserve the relationship between parameters and states
–Leads to poor smoothing inference
–Possibly unreliable filtered inference?
–Pragmatically – use as small a value of h as possible
Extensions:
–Kernel smooth states as well as parameters
–Local kernel smoothing

46
Other tricks
Better proposals:
–Rao-Blackwellization – the Gibbs sampling equivalent
–Importance sampling (e.g., from priors)
Better resampling:
–Residual resampling
–Stratified resampling
Alternative mutation algorithms:
–MCMC within SIS
Gradual focussing on posterior:
–Tempering/annealing
…

47
3. Results and discussion
"I'll make you fit into my model!!!"

48
Example: Grey seal model
Used pup production estimates for 1984-2002
Informative priors on parameters; 1984 data used with sampled parameters to provide priors for states
Algorithm:
–Auxiliary particle filter with kernel smoothing (h=0.9) and rejection control (c = 99th percentile) at the first time step
–400,000 particles
–Took ~6 hrs to run on a PC (in S-PLUS)

49
Posterior parameter estimates

50
Coplots

51
Smoothed pup production

52
Predicted adults

53
Model selection and multi-model inference
Can put priors on alternative models, and then sample from the models to initialize the particles
Proportion of particles with each model gives posterior model probabilities
Can also penalize for more parameters

Model                                        LnL      AIC      Akaike weight
Outer Hebrides = Western Isles
  Production, 1δ                             -625.0   1258.0   0.02
  Production, 3δs                            -624.1   1260.2   0.01
  Staff, 1δ                                  -624.5   1257.1   0.03
  Staff, 3δs                                 -623.8   1259.5   0.01
Outer Hebrides = Western Isles + Northwest
  Production, 1δ                             -621.5   1250.9   0.67
  Production, 3δs                            -621.9   1255.9   0.06
  Staff, 1δ                                  -622.9   1253.8   0.16
  Staff, 3δs                                 -622.9   1257.9   0.02
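Akaike weights follow from the AIC values by the standard formula: each weight is proportional to exp(−Δ/2), where Δ is the difference from the smallest AIC. The sketch below applies this to the eight AIC values as tabulated; `akaike_weights` is a hypothetical helper name, and the results need not match the tabulated weights exactly, for example because the AIC values are rounded.

```python
import math

def akaike_weights(aic_values):
    """Akaike weights: exp(-delta/2) normalized to sum to 1."""
    best = min(aic_values)
    rel = [math.exp(-0.5 * (a - best)) for a in aic_values]
    total = sum(rel)
    return [r / total for r in rel]

# AIC values in the order the eight models are listed.
aic = [1258.0, 1260.2, 1257.1, 1259.5, 1250.9, 1255.9, 1253.8, 1257.9]
weights = akaike_weights(aic)
best_model = weights.index(max(weights))
```

The fifth model in the list (Outer Hebrides = Western Isles + Northwest; Production, 1δ) has the lowest AIC and therefore the largest weight.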

54
Discussion: application in ecology and epidemiology
An alternative to MCMC?
–Debatable when there is plenty of time for fitting and main emphasis is on smoothing inference.
–Best suited to situations where fast filtered estimates are required – e.g.:
  foot and mouth outbreak?
  N. American West coast salmon harvest openings?
Disadvantages:
–Methods less well developed than for MCMC?
–No general software (no WinBUPF)

55
Current / future research
Efficient general algorithms (and software)
Comparison with MCMC and Kalman filter
Incorporating other types of data (e.g., mark-recapture)
Parallelization
Multi-model inference
Diagnostics
Other applications!

56
Just another particle
