
Lecture 4: The Particle Filter

Lecturer: Raghu Krishnapuram


Distinguished Member of Technical Staff
Robert Bosch Centre for Cyber-Physical Systems
kraghu@iisc.ac.in

* The Particle Filter

* Represent the posterior bel(x_t) by a set of random state samples drawn from the posterior.
* Since the representation is non-parametric, it is more broadly applicable.
* The samples of the posterior distribution are called particles:
  $X_t := x_t^{[1]}, x_t^{[2]}, \ldots, x_t^{[M]}$

* In general, M needs to be quite large for a good representation (e.g., 1,000).


* M can also be a function of t or of other quantities.
* Ideally, the likelihood of a state hypothesis $x_t$ being included in $X_t$ should be proportional to its Bayes filter posterior, i.e., $x_t^{[m]} \sim p(x_t \mid z_{1:t}, u_{1:t})$.

* Algorithm Particle_filter($X_{t-1}$, $u_t$, $z_t$):
  1. $\bar{X}_t = X_t = \emptyset$
  2. for $m = 1$ to $M$ do
  3.   sample $x_t^{[m]} \sim p(x_t \mid u_t, x_{t-1}^{[m]})$   – representation of $\overline{bel}(x_t)$
  4.   $w_t^{[m]} = p(z_t \mid x_t^{[m]})$   – compute the importance factor
  5.   $\bar{X}_t = \bar{X}_t + \langle x_t^{[m]}, w_t^{[m]} \rangle$
  6. endfor
  7. for $m = 1$ to $M$ do
  8.   draw $i$ with probability $\propto w_t^{[i]}$   – importance sampling
  9.   add $x_t^{[i]}$ to $X_t$
  10. endfor
  11. return $X_t$
* Importance sampling (the resampling step) changes the particle distribution from $\overline{bel}(x_t)$ to $bel(x_t) = \eta\, p(z_t \mid x_t^{[m]})\, \overline{bel}(x_t)$.
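To make the algorithm concrete, here is a minimal Python sketch of a single filter step for a one-dimensional state. The motion model, measurement likelihood, noise levels, and M = 1000 are hypothetical, illustrative choices, not part of the lecture.

import numpy as np

rng = np.random.default_rng(0)

def sample_motion_model(x_prev, u):
    # Hypothetical motion model p(x_t | u_t, x_{t-1}): a noisy shift by u.
    return x_prev + u + rng.normal(0.0, 0.5, size=x_prev.shape)

def measurement_likelihood(z, x):
    # Hypothetical measurement model p(z_t | x_t): Gaussian around the state.
    return np.exp(-0.5 * ((z - x) / 0.8) ** 2)

def particle_filter(particles_prev, u, z):
    M = len(particles_prev)
    # Lines 3-5: propagate each particle through the motion model and
    # weight it by the measurement likelihood (importance factor).
    particles_bar = sample_motion_model(particles_prev, u)
    weights = measurement_likelihood(z, particles_bar)
    weights /= weights.sum()
    # Lines 7-10: resample with replacement, with probability proportional to the weights.
    indices = rng.choice(M, size=M, replace=True, p=weights)
    return particles_bar[indices]

# Start from M = 1000 samples of the prior p(x_0) and run one filter step.
particles = rng.normal(0.0, 1.0, size=1000)
particles = particle_filter(particles, u=0.2, z=0.35)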

* Importance sampling

* Used when we need to work with a probability density function f, but we have access only to samples from a different pdf g. For example, suppose we need to estimate the probability that x ∈ A:
  $E_f[I(x \in A)] = \int f(x)\, I(x \in A)\, dx = \int \frac{f(x)}{g(x)}\, g(x)\, I(x \in A)\, dx = E_g[w(x)\, I(x \in A)]$,
  where $w(x) := f(x)/g(x)$.

* Here f is the target distribution and g is the proposal distribution; we require that g(x) > 0 whenever f(x) > 0.
* Importance sampling (contd...)

* When we have samples (particles) from g, we can compensate for the difference by weighting them according to
  $w^{[m]} = \frac{f(x^{[m]})}{g(x^{[m]})}$, and then
  $\left[ \sum_{m=1}^{M} w^{[m]} \right]^{-1} \sum_{m=1}^{M} I(x^{[m]} \in A)\, w^{[m]} \;\longrightarrow\; \int_A f(x)\, dx$
* In most cases, estimates with weighted particles converge to the desired $E_f$ at the rate of $O(1/\sqrt{M})$.
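As a small numerical illustration of this self-normalized estimator (the choice of target, proposal, and event below is arbitrary): estimate the probability that x > 1 under a standard normal target f, using only samples from the wider proposal g = N(0, 2).

import numpy as np

rng = np.random.default_rng(0)
M = 100_000

def normal_pdf(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

# Target f: N(0, 1); proposal g: N(0, 2). Both choices are purely illustrative.
x = rng.normal(0.0, 2.0, size=M)                          # samples from the proposal g
w = normal_pdf(x, 0.0, 1.0) / normal_pdf(x, 0.0, 2.0)     # weights w(x) = f(x)/g(x)
indicator = x > 1.0                                       # event A = {x > 1}

# Self-normalized importance sampling estimate of P_f(x in A); exact value is about 0.1587.
estimate = np.sum(w * indicator) / np.sum(w)
print(estimate)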

* Importance sampling in particle filtering

* The resampled particle set will contain many duplicates, since the sampling is done with replacement and M is fixed.
* Resampling has a tendency to “collapse” all samples to a single value (survival of the fittest).
* An alternative version does not resample, but instead iteratively updates a weight for each sample, starting with weight 1:
  $w_t^{[m]} = p(z_t \mid x_t^{[m]})\, w_{t-1}^{[m]}$

* In this approach, many particles will continue to exist in regions of low probability,
and hence the representation is not adaptive.

* Illustration of the “particle” representation used by particle filters

Samples of X are passed through the nonlinear function, resulting in samples of Y.
* Illustration of importance factors in particle filters

We need to approximate the target density f. However, we can only generate samples from g.
* Illustration of importance factors in particle filters (contd...)

A sample of f can be obtained by attaching a weight f(x)/g(x) to each sample x.

* Mathematical derivation of the PF

* Particle filters can be thought of as maintaining samples of entire state sequences, with the belief defined accordingly:
  $x_{0:t}^{[m]} = x_0^{[m]}, x_1^{[m]}, \ldots, x_t^{[m]}$
  $bel(x_{0:t}) = p(x_{0:t} \mid u_{1:t}, z_{1:t})$
* This gives us (using Bayes' rule, the Markov assumptions, and the fact that $x_{0:t-1}$ does not depend on $u_t$):
  $p(x_{0:t} \mid z_{1:t}, u_{1:t}) = \eta\, p(z_t \mid x_{0:t}, z_{1:t-1}, u_{1:t})\, p(x_{0:t} \mid z_{1:t-1}, u_{1:t})$
  $\qquad = \eta\, p(z_t \mid x_t)\, p(x_{0:t} \mid z_{1:t-1}, u_{1:t})$
  $\qquad = \eta\, p(z_t \mid x_t)\, p(x_t \mid x_{0:t-1}, z_{1:t-1}, u_{1:t})\, p(x_{0:t-1} \mid z_{1:t-1}, u_{1:t})$
  $\qquad = \eta\, p(z_t \mid x_t)\, p(x_t \mid x_{t-1}, u_t)\, p(x_{0:t-1} \mid z_{1:t-1}, u_{1:t-1})$

* Mathematical derivation of the PF (contd...)

* The first particle set is sampled from the prior p(x_0). The particle set at time t−1 is distributed according to bel(x_{0:t-1}).
* The sample $x_t^{[m]}$, which replaces the m-th particle in the set, is generated from the proposal distribution
  $p(x_t \mid x_{t-1}, u_t)\, bel(x_{0:t-1}) = p(x_t \mid x_{t-1}, u_t)\, p(x_{0:t-1} \mid z_{1:t-1}, u_{1:t-1})$, with
  $w_t^{[m]} = \frac{\text{target distribution}}{\text{proposal distribution}} = \frac{\eta\, p(z_t \mid x_t)\, p(x_t \mid x_{t-1}, u_t)\, p(x_{0:t-1} \mid z_{1:t-1}, u_{1:t-1})}{p(x_t \mid x_{t-1}, u_t)\, p(x_{0:t-1} \mid z_{1:t-1}, u_{1:t-1})} = \eta\, p(z_t \mid x_t)$
* Therefore, by using the importance weights $w_t^{[m]}$, we obtain state samples $x_t^{[m]}$ distributed according to bel(x_t).
* Practical considerations of particle filters

* Often we need continuous representations of density functions.


* This can be achieved by a Gaussian approximation (unimodal only; sketched below), a mixture of Gaussians (higher computational cost), histograms (high space complexity), density trees (expensive lookups), or kernel densities (smoother, but with complexity linear in the number of particles or kernels).
* The method chosen depends on the application, the dimensionality of state space,
the number of hypotheses to be supported, etc.
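As a sketch of the simplest of these options, assuming a particle set with normalized importance weights, the Gaussian approximation is just the weighted mean and covariance of the particles (the function name and toy data below are illustrative):

import numpy as np

def gaussian_from_particles(particles, weights):
    # particles: array of shape (M, d); weights: shape (M,), assumed normalized.
    mean = np.average(particles, axis=0, weights=weights)
    diff = particles - mean
    cov = (weights[:, None] * diff).T @ diff   # weighted sample covariance
    return mean, cov

# Usage with a toy 2-D particle set and uniform weights.
rng = np.random.default_rng(0)
pts = rng.normal(size=(1000, 2))
mu, sigma = gaussian_from_particles(pts, np.full(1000, 1.0 / 1000))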

* Different ways of extracting densities from particles

Gaussian approximation, kernel estimate, density and sample set approximation, and histogram approximation.
* Sampling variance

When the sample size is small, the variation in the resulting estimates is generally large. The sampling variance can be reduced by using larger sample sets.
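A quick way to see this effect, using a toy Monte Carlo estimate of the mean of N(0, 1) (the sample sizes and repetition count are arbitrary):

import numpy as np

rng = np.random.default_rng(0)

# Repeat the same estimate many times and look at the spread of the results:
# the standard deviation of the estimates shrinks roughly like 1/sqrt(M).
for M in (100, 10_000):
    estimates = [rng.normal(0.0, 1.0, size=M).mean() for _ in range(500)]
    print(M, np.std(estimates))   # roughly 0.1 for M = 100 and 0.01 for M = 10000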
* Resampling and variance reduction

* The resampling step in the particle filtering algorithm will not reproduce all the
particles in general.
* The particles with larger probabilities or weights tend to repeat and the ones with
smaller probabilities or weights may not appear in the next round.
* In theory, after repeated re-sampling, the diversity vanishes, and M copies of a
single particle will survive!
* The variance of the particle set as an estimator will increase with time.
* One possible approach to counter this effect is to reduce the frequency of resampling, and simply update the importance weights as follows:
  $w_t^{[m]} = \begin{cases} 1 & \text{if resampling took place} \\ p(z_t \mid x_t^{[m]})\, w_{t-1}^{[m]} & \text{if no resampling took place} \end{cases}$
* Resampling and variance reduction (contd...)

* Monitor the variance of the importance weights and decide when to resample (see the sketch after this list).
* Another option is low variance sampling
* This approach covers the samples systematically by cycling through them.
* Guarantees that if all particles have the same importance factor, all of them will be reproduced.
* The complexity of the low variance sampler is O(M).
* There are many alternative approaches, such as stratified sampling, which groups particles into subsets and ensures adequate representation from each subset.
* Sampling bias occurs because of the normalization of the weights, i.e., one degree of freedom is lost.
* Particle deprivation happens because of the variance in random sampling. It can generally be overcome with a larger set of particles.
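One standard way to make the weight-variance monitoring concrete (the effective sample size criterion, a common choice rather than something stated on the slide) is to resample only when $M_{\text{eff}} = 1 / \sum_m (w_t^{[m]})^2$ falls below a fraction of M, e.g., M/2. A minimal sketch; the function names and threshold are illustrative:

import numpy as np

def effective_sample_size(weights):
    # weights: normalized importance weights summing to 1.
    return 1.0 / np.sum(weights ** 2)

def maybe_resample(particles, weights, rng, threshold_fraction=0.5):
    # Resample only when the weights have degenerated (low effective sample size).
    M = len(weights)
    if effective_sample_size(weights) < threshold_fraction * M:
        idx = rng.choice(M, size=M, replace=True, p=weights)
        return particles[idx], np.full(M, 1.0 / M)   # weights reset after resampling
    return particles, weights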
* Algorithm Low_variance_sampler($X_t$, $W_t$):
  1. $\bar{X}_t = \emptyset$
  2. $r = \text{rand}(0;\, M^{-1})$
  3. $c = w_t^{[1]}$
  4. $i = 1$
  5. for $m = 1$ to $M$ do
  6.   $U = r + (m - 1) \cdot M^{-1}$
  7.   while $U > c$
  8.     $i = i + 1$
  9.     $c = c + w_t^{[i]}$
  10.  endwhile
  11.  add $x_t^{[i]}$ to $\bar{X}_t$
  12. endfor
  13. return $\bar{X}_t$
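A direct Python transcription of this sampler, assuming the weights are normalized to sum to 1 (the small guard against floating-point round-off is an addition, not part of the pseudocode):

import numpy as np

def low_variance_sampler(particles, weights, rng):
    # Systematic (low variance) resampling; weights are assumed normalized.
    M = len(weights)
    resampled = []
    r = rng.uniform(0.0, 1.0 / M)        # single random offset in [0, 1/M)
    c = weights[0]
    i = 0
    for m in range(M):
        U = r + m / M                    # M equally spaced pointers
        while U > c and i < M - 1:       # guard against round-off at the upper end
            i += 1
            c += weights[i]
        resampled.append(particles[i])
    return np.asarray(resampled)

# Usage: resample 5 particles in proportion to their weights.
rng = np.random.default_rng(0)
new_particles = low_variance_sampler(np.arange(5.0),
                                     np.array([0.1, 0.1, 0.1, 0.2, 0.5]), rng)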

