Speaker localisation and tracking: improved dynamics


This page presents some acoustic source tracking examples obtained with an improved particle filter (PF) algorithm. Specifically, the dynamics model used in the algorithm implementation has been optimised to better represent the range of possible human motions, rather than using an all-purpose dynamics model with standard parameter settings. The details of this particular work can be found in [1,2].

The movies below were generated from tracking results obtained in a real office room. The dimensions of the environment are 3.36 m x 4.43 m x 2.6 m, with eight microphones (represented as grey circles) located at a height of 1.55 m. The frequency-averaged reverberation time measured in the room was T60 = 0.5 s.
  1. Movie #1 (1.8MB)
  2. Movie #2 (1.9MB)
  3. Movie #3 (1.9MB)
In these movies, the star represents the true source position and the white circle is the speaker position estimate delivered by the particle filter. The dotted line shows the trajectory of the speaker, which was determined on the basis of the audio data itself using the high-accuracy beamforming approach described in [3]. The movies also show the area of uncertainty (ellipse), which becomes larger whenever the speaker is silent.
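As an illustration of how an uncertainty ellipse of this kind can be derived from a particle set, the sketch below computes the weighted mean and covariance of the particle positions and converts the covariance into ellipse semi-axes and an orientation. This is only one plausible construction: the function name `uncertainty_ellipse`, the `n_std` scaling factor and the use of the sample covariance are assumptions made for this example, and the ellipse in the movies may have been computed differently.

```python
import math


def uncertainty_ellipse(points, weights=None, n_std=2.0):
    """Return (centre, semi_major, semi_minor, angle) for 2-D particle positions.

    points  : list of (x, y) particle positions in metres
    weights : optional particle weights (uniform if omitted)
    n_std   : hypothetical scaling, number of standard deviations covered
    """
    n = len(points)
    if weights is None:
        weights = [1.0 / n] * n
    total = sum(weights)
    w = [wi / total for wi in weights]  # normalise the weights

    # Weighted mean of the particle positions (the position estimate).
    mx = sum(wi * p[0] for wi, p in zip(w, points))
    my = sum(wi * p[1] for wi, p in zip(w, points))

    # Weighted 2x2 covariance of the particle positions.
    sxx = sum(wi * (p[0] - mx) ** 2 for wi, p in zip(w, points))
    syy = sum(wi * (p[1] - my) ** 2 for wi, p in zip(w, points))
    sxy = sum(wi * (p[0] - mx) * (p[1] - my) for wi, p in zip(w, points))

    # Closed-form eigenvalues of a symmetric 2x2 matrix.
    half_trace = 0.5 * (sxx + syy)
    radius = math.hypot(0.5 * (sxx - syy), sxy)
    lam_major = half_trace + radius
    lam_minor = max(half_trace - radius, 0.0)

    # Orientation of the major axis, measured from the x axis.
    angle = 0.5 * math.atan2(2.0 * sxy, sxx - syy)

    return (mx, my), n_std * math.sqrt(lam_major), n_std * math.sqrt(lam_minor), angle
```

With this construction, a particle cloud that spreads out while no speech is available produces a larger covariance and therefore a larger ellipse, which matches the behaviour visible in the movies.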

The main difference between these movies and the results obtained with our previous particle filtering implementations (as demonstrated on this page, for instance) is in the evolution of the tracker's estimates during periods of silence. Whereas the estimates would simply appear "frozen" during such periods with previous implementations, the use of an optimised dynamics model here allows the tracker to keep following a silent speaker "blindly", to a certain extent, when no useful signal is available. This is demonstrated in the above movies as the white circle (PF estimate) tends to keep moving in the same general direction as the speaker during short breaks in the speech signal.
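The "blind" tracking behaviour can be illustrated with a minimal sketch of the prediction step of a particle filter, run repeatedly without any measurement update, as would happen while the speaker is silent. The example assumes a generic constant-velocity model driven by random accelerations; it is not necessarily the dynamics model optimised in [1,2], and all names and parameter values (`predict`, `simulate_silence`, `accel_std`, the time step, the initial particle cloud) are hypothetical.

```python
import random


def predict(particles, dt, accel_std, rng):
    """Propagate each particle (x, y, vx, vy) by one time step.

    Assumed model: constant velocity perturbed by a random acceleration.
    """
    moved = []
    for x, y, vx, vy in particles:
        ax = rng.gauss(0.0, accel_std)
        ay = rng.gauss(0.0, accel_std)
        moved.append((x + vx * dt + 0.5 * ax * dt * dt,
                      y + vy * dt + 0.5 * ay * dt * dt,
                      vx + ax * dt,
                      vy + ay * dt))
    return moved


def mean_position(particles):
    """Position estimate: average of the (equally weighted) particles."""
    n = len(particles)
    return (sum(p[0] for p in particles) / n,
            sum(p[1] for p in particles) / n)


def spread(particles):
    """RMS distance of the particles from their mean position."""
    mx, my = mean_position(particles)
    n = len(particles)
    return (sum((p[0] - mx) ** 2 + (p[1] - my) ** 2 for p in particles) / n) ** 0.5


def simulate_silence(n_particles=500, n_steps=20, dt=0.05, accel_std=1.0, seed=0):
    """Run prediction-only steps, i.e. no measurement update during silence."""
    rng = random.Random(seed)
    # Hypothetical starting cloud: near (1.0, 2.0) m, moving at about 0.5 m/s along x.
    particles = [(1.0 + rng.gauss(0.0, 0.05), 2.0 + rng.gauss(0.0, 0.05),
                  0.5 + rng.gauss(0.0, 0.1), rng.gauss(0.0, 0.1))
                 for _ in range(n_particles)]
    history = [(mean_position(particles), spread(particles))]
    for _ in range(n_steps):
        particles = predict(particles, dt, accel_std, rng)
        history.append((mean_position(particles), spread(particles)))
    return history


if __name__ == "__main__":
    for step, (pos, rms) in enumerate(simulate_silence()):
        print(step, round(pos[0], 3), round(pos[1], 3), round(rms, 3))
```

Because each particle carries a velocity, the estimate keeps drifting in the speaker's last direction of travel, while the growing spread of the particles reflects the increasing uncertainty. A model without a velocity state would leave the estimate roughly where it was when the speech stopped.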


References

[1] Eric A. Lehmann, Anders M. Johansson, and Sven Nordholm, Modeling of Motion Dynamics and its Influence on the Performance of a Particle Filter for Acoustic Speaker Tracking, Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA'07), pp. 98-101, New Paltz, NY, USA, October 2007.
[2] Eric A. Lehmann and Anders M. Johansson, Dynamics Models for Acoustic Speaker Tracking—Preliminary Results, NICTA/WATRI Technical Report PRJ-NICTA-PM-023, Western Australian Telecommunications Research Institute, Perth, Australia, August 2007.
[3] Eric A. Lehmann and Anders M. Johansson, Experimental Performance Assessment of a Particle Filter with Voice Activity Data Fusion for Acoustic Speaker Tracking, Proceedings of the IEEE Nordic Signal Processing Symposium (NORSIG'06), pp. 126-129, Reykjavik, Iceland, June 2006.

