Miodrag Bolic: ARCHITECTURES FOR EFFICIENT IMPLEMENTATION OF PARTICLE FILTERS. Department of Electrical and Computer Engineering, Stony Brook University.

1 ARCHITECTURES FOR EFFICIENT IMPLEMENTATION OF PARTICLE FILTERS
Miodrag Bolic
Department of Electrical and Computer Engineering, Stony Brook University
Advisor: Prof. Petar M. Djuric
Dissertation Defense

2 Outline
PART I: Introduction
  Motivation and goals
  Challenges
PART II: Theory of PFs
  Dynamic model
  Monte Carlo sampling
  Importance sampling
  Resampling
  Bearings-only tracking example
  Steps and complexity
PART III: Implementation of PFs
  VLSI signal processing architectures
  Methodology
  Non-parallel implementation
    Algorithm characteristics
    Modifications of the PF
    New resampling algorithms
    Architecture
    Implementation results
  Parallel implementation
    Propagation of particles
    Parallel resampling
    Architectures for parallel resampling
    Space exploration
  Gaussian PFs
Conclusions and future work

3 Introduction – Motivations and Goals
Goal: increase the speed of particle filters.
[Figure: a sensor supplies the observed signal over time t to a particle filter chip, which outputs the estimate over time t.]

4 Introduction – Challenges
Challenges:
  Reducing computational complexity
  Randomness: difficult to exploit regular structures in VLSI
  Exploiting temporal and spatial concurrency
Contributions:
  First hardware implementation of particle filters (50 times improvement in speed in comparison with DSP)
  New resampling algorithms suitable for hardware implementation
  Fast particle filtering algorithms that do not use memories
  First distributed algorithms and architectures for particle filters

5 Outline (same as slide 2)

6 Theory of PFs – Dynamic model
General dynamic model:
  State equation: $\mathbf{x}_k = f_x(\mathbf{x}_{k-1}, \mathbf{u}_k)$
  Observation equation: $z_k = f_z(\mathbf{x}_k, v_k)$
  $f_x$: state transition function; $\mathbf{u}_k$: process noise
  $f_z$: measurement function; $v_k$: observation noise
Example: bearings-only tracking
  States (position and velocity): $\mathbf{x}_k = [x_k, V_{xk}, y_k, V_{yk}]^T$
  Observations: angle $z_k$
  State equation: $\mathbf{x}_k = F\mathbf{x}_{k-1} + G\mathbf{u}_k$
  Observation equation: $z_k = \operatorname{atan}(y_k / x_k) + v_k$
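The bearings-only model above can be simulated in a few lines. This is a minimal sketch under stated assumptions: the slide does not give F, G, the sampling interval, or the noise levels, so a standard constant-velocity form with interval `T` and hypothetical standard deviations `sigma_u` and `sigma_v` is assumed. `math.atan2` is swapped in for atan(y/x) to avoid a division by zero at x = 0.

```python
import math
import random

T = 1.0  # sampling interval (assumed, not given on the slide)

def propagate(state, sigma_u, rng):
    """State equation x_k = F x_{k-1} + G u_k for an assumed constant-velocity F and G."""
    x, vx, y, vy = state
    ux = rng.gauss(0.0, sigma_u)  # process noise, x direction
    uy = rng.gauss(0.0, sigma_u)  # process noise, y direction
    return (
        x + T * vx + 0.5 * T * T * ux,
        vx + T * ux,
        y + T * vy + 0.5 * T * T * uy,
        vy + T * uy,
    )

def observe(state, sigma_v, rng):
    """Observation equation: bearing angle plus observation noise v_k."""
    x, _, y, _ = state
    return math.atan2(y, x) + rng.gauss(0.0, sigma_v)

def simulate(x0, steps, sigma_u=0.001, sigma_v=0.005, seed=0):
    """Generate a state trajectory and the matching bearing observations."""
    rng = random.Random(seed)
    states, bearings = [], []
    state = x0
    for _ in range(steps):
        state = propagate(state, sigma_u, rng)
        states.append(state)
        bearings.append(observe(state, sigma_v, rng))
    return states, bearings
```

Calling `simulate((1.0, 0.01, 1.0, 0.02), steps=5)` returns five states and five bearings; only the bearings would be available to the filter.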

7 Theory of PFs – Bayesian approach
Objective in the Bayesian approach: the posterior distribution $p(\mathbf{x}_{0:k} \mid z_{1:k})$ of the unknown state $\mathbf{x}_k$.
Use of knowing the posterior: all kinds of estimates can be calculated.
Gaussian processes and linear model: Kalman filter.
Non-Gaussian processes and/or non-linear model: particle filter.
Problem (state space model, estimate the posterior) and solution:
  Integrals are not tractable: Monte Carlo sampling
  Difficult to draw samples: importance sampling

8 Theory of PFs – Monte Carlo Sampling
Densities can be approximated by discrete random measures made of particles and weights.
The random measure χ approximates the density p(x).
Integrals simplify to summations.
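As a small illustration of "integrals simplify to summations", the sketch below approximates two expectations under a density by weighted sums over particles. The density N(2, 1), the particle count, and the helper name `weighted_mean` are assumptions chosen for the example; the slide's own formulas did not survive extraction.

```python
import random

def weighted_mean(particles, weights, f=lambda x: x):
    """Approximate the integral of f(x) p(x) dx by a weighted sum over particles."""
    total = sum(weights)
    return sum(w * f(x) for x, w in zip(particles, weights)) / total

rng = random.Random(1)
M = 10000
# Particles drawn directly from p(x) = N(2, 1), so all weights are equal to 1/M.
particles = [rng.gauss(2.0, 1.0) for _ in range(M)]
weights = [1.0 / M] * M

mean_est = weighted_mean(particles, weights)                               # approximates E[x] = 2
second_moment_est = weighted_mean(particles, weights, lambda x: x * x)     # approximates E[x^2] = 5
```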

9 Theory of PFs – Importance Sampling
Objective: approximate a density p(x) by a discrete random measure.
Steps:
  1. Generation of particles (from the proposal density)
  2. Updating of the weights (using Bayes' theorem)
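The two steps can be sketched for a generic, non-sequential case. Everything specific here is an assumption for illustration: the target is N(0, 1), the proposal is the wider N(0, 2^2), and the helper names are hypothetical. In a particle filter the weight update would also involve the likelihood of the new observation, which this sketch leaves out.

```python
import math
import random

def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def importance_sample(target_pdf, proposal_draw, proposal_pdf, M, rng):
    """Step 1: draw particles from the proposal. Step 2: weight them by target/proposal."""
    particles = [proposal_draw(rng) for _ in range(M)]
    raw = [target_pdf(x) / proposal_pdf(x) for x in particles]
    total = sum(raw)
    weights = [w / total for w in raw]  # normalize so the weights sum to one
    return particles, weights

rng = random.Random(2)
particles, weights = importance_sample(
    target_pdf=lambda x: normal_pdf(x, 0.0, 1.0),
    proposal_draw=lambda r: r.gauss(0.0, 2.0),
    proposal_pdf=lambda x: normal_pdf(x, 0.0, 2.0),
    M=20000,
    rng=rng,
)
# Weighted estimate of E[x^2] under the target; the exact value is 1.
second_moment = sum(w * x * x for x, w in zip(particles, weights))
```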

10 Theory of PFs – Resampling
Problems: weight degeneration; wastage of computational resources.
Solution: resampling. Replicate particles in proportion to their weights.
[Figure: particles after resampling, shown over time.]
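One common way to replicate particles in proportion to their weights is systematic resampling, sketched below. The slide does not say which scheme it has in mind, so this is an illustrative choice and not necessarily the author's; the function name is hypothetical and the weights are assumed to be normalized.

```python
import random

def systematic_resample(weights, rng):
    """Return M indices; index i is selected roughly M * weights[i] times."""
    M = len(weights)
    u = rng.random() / M           # one random offset in [0, 1/M)
    indices = []
    cumulative = weights[0]        # running sum of the weights
    i = 0
    for m in range(M):
        threshold = u + m / M      # evenly spaced thresholds across [0, 1)
        while threshold > cumulative and i < M - 1:
            i += 1
            cumulative += weights[i]
        indices.append(i)
    return indices
```

A particle with zero weight is never selected, and a particle carrying all of the weight is selected M times; with equal weights every particle survives once.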

11 Theory of PFs – Bearings-Only Tracking Example

12 Theory of PFs – Bearings-Only Tracking Example (Cont.)
[Figure: blue is the true trajectory, red is the estimates.]

13 Theory of PFs – Steps and Complexity
Flowchart blocks: Initialize particles; New observation; Particle generation (particles 1, 2, ..., M); Weight computation (particles 1, 2, ..., M); Normalize weights; Resampling; Output estimates; More observations? (yes: continue, no: Exit).
Complexity for the bearings-only tracking problem with M = 1000 particles:
  4M random number generations
  M exponential and arctangent functions
  Propagation of the particles
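The flowchart steps can be put together into one filter iteration. The sketch below is an assumption-laden illustration, not the dissertation's implementation: it uses the state transition as the proposal, multinomial resampling via `random.choices`, and outputs the estimate before resampling (the slide text does not fix that order). The demo uses a hypothetical scalar random-walk model instead of the bearings-only model to stay short.

```python
import math
import random

def sir_step(particles, z, propagate, likelihood, rng):
    """One iteration: particle generation, weight computation, normalization, resampling."""
    M = len(particles)
    # Particle generation: draw each new particle from the state transition.
    predicted = [propagate(p, rng) for p in particles]
    # Weight computation, followed by normalization.
    raw = [likelihood(z, p) for p in predicted]
    total = sum(raw)
    weights = [1.0 / M] * M if total == 0.0 else [w / total for w in raw]
    # Output estimate: weighted mean of the particles.
    estimate = sum(w * p for w, p in zip(weights, predicted))
    # Resampling: replicate particles in proportion to their weights.
    resampled = rng.choices(predicted, weights=weights, k=M)
    return resampled, estimate

def run_demo(steps=50, M=1000, seed=3):
    """Track a scalar random walk observed in noise (hypothetical test model)."""
    rng = random.Random(seed)
    sigma_u, sigma_v = 0.1, 0.5
    propagate = lambda x, r: x + r.gauss(0.0, sigma_u)
    likelihood = lambda z, x: math.exp(-0.5 * ((z - x) / sigma_v) ** 2)
    truth = 0.0
    particles = [rng.gauss(0.0, 1.0) for _ in range(M)]  # initialize particles
    errors = []
    for _ in range(steps):                                # loop while observations arrive
        truth += rng.gauss(0.0, sigma_u)
        z = truth + rng.gauss(0.0, sigma_v)               # new observation
        particles, estimate = sir_step(particles, z, propagate, likelihood, rng)
        errors.append(abs(estimate - truth))
    return errors
```

Each call to `sir_step` performs M proposal draws and M likelihood evaluations, which is where the per-particle cost counted on the slide comes from.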

14 Outline (same as slide 2)

15 Implementation of PFs – VLSI Signal Processing Architectures
Types of architectures:
  Programmable digital signal processors
  Application-domain specific processors
  Application specific processors
Focus: application specific processors
Approach:
  Temporal and spatial concurrency
  One-to-one mapping between operations and hardware blocks
  FPGA implementation
  Speed is the main goal
  Functionality of the system does not change

16 Implementation of PFs – Methodology
Design levels: system level, algorithmic level, architecture level, RT level, gate level.
[Figure: impact of a design decision and complexity across the design levels.]
Joint algorithmic and architectural design: to increase performance, algorithms must be matched to architectures.

17 Implementation of PFs – Algorithm Characteristics
Flowchart blocks: Start; New observation; Particle generation (particles 1, 2, ..., M); Weight computation (particles 1, 2, ..., M); Resampling; Propagation of particles; Exit.
Particle generation and weight computation:
  High computational complexity
  No data dependencies among particles
  Complexity depends on the state space model
  Suitable for parallel and pipelined implementation
Resampling:
  Data dependent algorithm
  Low complexity operations
  Propagation of particles: random
  Algorithm does not depend on the state space model

18 Implementation of PFs – Modifications of the PF
Modifications (architecture and algorithm): fine-grain pipelining; avoiding normalization; loop transformations; finite precision arithmetic; spatial concurrency; dedicated hardware; addressing schemes.
Parameter       Current      Limits
Sample period   ~2M T_clk    ~M T_clk
Memories        (2N+1)M      (N+1)M

19 Implementation of PFs – New Resampling Algorithms
Parameter       Algorithm 1                 Algorithm 2
Sample period   ~2M T_clk                   ~M T_clk
Memories        Particle memory: (N+1)M     Particle memory: (N+1)M
                Index memory: 2M            Index memory: 4M
Performance     Same                        Worse (deterministic algorithm)

20 Implementation of PFs – Architecture

21 Implementation of PFs – Implementation results
Setup: the hardware platform is Xilinx Virtex-II Pro; the clock period is 10 ns; the PF is applied to the bearings-only tracking problem; 1000 particles are used.
Resources: logic blocks 4%; memories 3%.
Percentage of utilization of the PF blocks:
               Particle generation   Weight computation   Resampling
Logic blocks   16%                   75%                  9%
Block RAMs     67%                   11%                  22%
Sampling frequency: DSP ~1 kHz; FPGA ~50 kHz.
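The reported FPGA figure is consistent with the sample period of about 2M T_clk given for the current design on the Modifications slide. The short calculation below checks that arithmetic for M = 1000 and T_clk = 10 ns; the function and variable names are hypothetical, and treating the period as exactly 2M T_clk (or M T_clk at the limit) is a simplifying assumption.

```python
def sampling_frequency_hz(M, t_clk_s, cycles_per_particle):
    """Sampling frequency when one sample period takes cycles_per_particle * M clock cycles."""
    return 1.0 / (cycles_per_particle * M * t_clk_s)

M = 1000        # number of particles (from the slide)
T_CLK = 10e-9   # clock period of 10 ns (from the slide)

current_hz = sampling_frequency_hz(M, T_CLK, 2)  # sample period ~2M T_clk: about 50 kHz
limit_hz = sampling_frequency_hz(M, T_CLK, 1)    # sample period ~M T_clk: about 100 kHz
```

Against the DSP figure of about 1 kHz, 50 kHz also matches the 50 times speed improvement listed among the contributions.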

22 Implementation of PFs – Parallelism
Universal architecture with a central unit: Processing Elements 1 to 4 connected to a Central Unit.
Processing elements (PE): particle generation; weight computation.
Central Unit: resampling; algorithm for particle propagation.
Flowchart blocks: Start; New observation; Particle generation (particles 1, 2, ..., M); Weight computation (particles 1, 2, ..., M); Resampling; Propagation of particles; Exit.

23 Implementation of PFs – Propagation of Particles
[Figure: Processing Elements 1 to 4 connected to a Central Unit; particles after resampling are distributed across PE 1 to PE 4 at time t.]
Disadvantages of the particle propagation step:
  Random communication pattern
  Decision about connections is not known before the run time
  Requires a dynamic type of network
  Speed-up is significantly affected

24 Implementation of PFs – Parallel Resampling
[Figure: three diagrams of four PEs (1 to 4), annotated with the particle counts N per PE and the numbers of particles exchanged between PEs.]
Solution: the way in which Monte Carlo sampling is performed is modified.
Advantages:
  Propagation is only local
  Propagation is controlled in advance by a designer
  Performance is the same as in the sequential applications
Result: speed-up is almost equal to the number of PEs (up to 8 PEs).
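One way to obtain local, designer-controlled propagation is to let every PE resample only its own particles and then exchange a fixed number of particles along a fixed pattern. The sketch below illustrates that idea and should be read as an assumption: the slide does not spell out the algorithm, and the ring pattern, the `exchange` parameter, and both function names are hypothetical. Each PE keeps its particle count, and its total weight is spread evenly over its resampled particles so that the overall weight is preserved.

```python
import random

def local_resample(particles, weights, rng):
    """Resample inside one PE; the PE keeps its particle count and its total weight."""
    n = len(particles)
    group_weight = sum(weights)
    if group_weight == 0.0:
        return list(particles), [0.0] * n
    chosen = rng.choices(particles, weights=weights, k=n)
    return chosen, [group_weight / n] * n

def parallel_resample(pe_particles, pe_weights, exchange, rng):
    """pe_particles[k] and pe_weights[k] hold the data of PE k."""
    K = len(pe_particles)
    results = [local_resample(p, w, rng) for p, w in zip(pe_particles, pe_weights)]
    new_p = [r[0] for r in results]
    new_w = [r[1] for r in results]
    # Fixed pattern known before run time: each PE sends its first
    # `exchange` particles (with their weights) to the next PE in a ring.
    sent_p = [p[:exchange] for p in new_p]
    sent_w = [w[:exchange] for w in new_w]
    for k in range(K):
        src = (k - 1) % K
        new_p[k] = sent_p[src] + new_p[k][exchange:]
        new_w[k] = sent_w[src] + new_w[k][exchange:]
    return new_p, new_w
```

Because the amount of data moved between PEs is the same in every iteration, the interconnect can be fixed at design time instead of being decided at run time.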

25 Implementation of PFs – Architectures for Parallel Resampling
Controlled particle propagation after resampling.
Architecture that allows adaptive connection among the processing elements.
[Figure: PE1 to PE4 connected to a Central Unit.]

26 Implementation of PFs – Space exploration
Setup: the hardware platform is Xilinx Virtex-II Pro; the clock period is 10 ns; PFs are applied to the bearings-only tracking problem.
[Figure: design space plot with two limits marked: available memory and logic blocks.]

27 Implementation of PFs – Gaussian PFs
Functionality:
  Propagates only the first two moments
  Approximates densities by Gaussians
  No need for resampling
Advantages:
  Sampling period is minimal (~M T_clk)
  No need for memories for storing particles
  Simple communication in parallel implementation
Disadvantages:
  Higher computational complexity
  Limited scope of applications
Flowchart blocks: Start; Particle generation (particles 1, 2, ..., M); Weight computation (particles 1, 2, ..., M); Computing the mean and the covariance matrix; Drawing conditioning particles; New observation? (Yes: continue, No: Exit).
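The flowchart blocks can be sketched for a scalar state, where the covariance matrix reduces to a variance. This is a minimal illustration under stated assumptions: the function name, the scalar model, and the use of the state transition as the proposal are not taken from the slide. Only the mean and variance are carried between iterations, so no particles are stored and no resampling is needed.

```python
import math
import random

def gpf_step(mean, var, z, propagate, likelihood, M, rng):
    """One Gaussian PF iteration for a scalar state."""
    # Drawing conditioning particles from the Gaussian approximation of the previous posterior.
    conditioning = [rng.gauss(mean, math.sqrt(var)) for _ in range(M)]
    # Particle generation: pass each conditioning particle through the state transition.
    particles = [propagate(x, rng) for x in conditioning]
    # Weight computation, followed by normalization.
    raw = [likelihood(z, x) for x in particles]
    total = sum(raw)
    weights = [1.0 / M] * M if total == 0.0 else [w / total for w in raw]
    # Computing the mean and the variance (the first two moments).
    new_mean = sum(w * x for w, x in zip(weights, particles))
    new_var = sum(w * (x - new_mean) ** 2 for w, x in zip(weights, particles))
    return new_mean, new_var

rng = random.Random(4)
sigma_u, sigma_v = 0.1, 0.5  # hypothetical noise levels
post_mean, post_var = gpf_step(
    mean=0.0, var=1.0, z=1.0,
    propagate=lambda x, r: x + r.gauss(0.0, sigma_u),
    likelihood=lambda z, x: math.exp(-0.5 * ((z - x) / sigma_v) ** 2),
    M=20000, rng=rng,
)
```

For this linear Gaussian test model the exact answer is given by the Kalman filter (mean about 0.80, variance about 0.20), which gives a convenient check on the sketch.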

28 Implementation of PFs – Gaussian PFs (cont.)
[Figure: minimum sampling period versus number of PEs for parallel GPFs and SIRs.]

29 Conclusions and Future Work
Summary:
  Modification of the algorithms to be suitable for hardware implementation
  Development of parallel algorithms and architectures
  Implementation of the particle filter in FPGA
  Analysis of the other types of particle filtering algorithms
Future work:
  Simplifying floating to fixed-point conversion
  Developing application-domain specific processor for PFs
  Developing reconfigurable architectures for PFs

30 Title slide repeated (same as slide 1)