Abstract
Sequential Monte Carlo probability hypothesis density (SMC-PHD) filtering has recently been used for audio-visual (AV) tracking of multiple speakers, where audio data inform the particle distribution and propagation in the visual SMC-PHD filter. However, the performance of the AV-SMC-PHD filter can be degraded by the mismatch between the proposal and the posterior distributions. In this paper, we present a new method to improve the particle distribution: audio information (i.e. direction-of-arrival (DOA) angles derived from microphone array measurements) is used to detect newborn particles, and visual information (i.e. histograms) is used to modify the particles with particle flow (PF). Particle flow has the benefit of migrating particles smoothly from the prior to the posterior distribution. We compare the proposed algorithm with the baseline AV-SMC-PHD algorithm in experiments on multi-speaker sequences from the AV16.3 dataset.
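The idea of migrating particles from the prior to the posterior can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes a linear-Gaussian measurement model and uses the exact Daum-Huang flow, dx/dλ = A(λ)x + b(λ), integrated with plain Euler steps over the pseudo-time λ from 0 to 1. The abstract's visual likelihood is histogram-based and therefore not of this form. All names (`exact_flow_step`, `particle_flow`, `P`, `H`, `R`, `z`) and all numerical values are hypothetical choices made for illustration.

```python
import numpy as np

def exact_flow_step(particles, prior_mean, P, H, R, z, lam, dlam):
    """One Euler step of the exact flow dx/dlam = A(lam) x + b(lam).

    Assumes a Gaussian prior N(prior_mean, P) and a linear measurement
    z = H x + noise with noise covariance R (illustrative assumption).
    """
    n = P.shape[0]
    I = np.eye(n)
    S = lam * H @ P @ H.T + R
    A = -0.5 * P @ H.T @ np.linalg.solve(S, H)
    b = (I + 2.0 * lam * A) @ (
        (I + lam * A) @ P @ H.T @ np.linalg.solve(R, z) + A @ prior_mean
    )
    # particles has shape (N, n); apply the affine drift to every row
    return particles + dlam * (particles @ A.T + b)

def particle_flow(particles, prior_mean, P, H, R, z, n_steps=200):
    """Move particles from the prior (lam = 0) towards the posterior (lam = 1)."""
    dlam = 1.0 / n_steps
    for k in range(n_steps):
        particles = exact_flow_step(
            particles, prior_mean, P, H, R, z, k * dlam, dlam
        )
    return particles

# Hypothetical 2-D state in which only the first component is observed.
rng = np.random.default_rng(0)
prior_mean = np.array([0.0, 0.0])
P = np.array([[1.0, 0.5], [0.5, 2.0]])
H = np.array([[1.0, 0.0]])
R = np.array([[0.5]])
z = np.array([1.5])

prior_particles = rng.multivariate_normal(prior_mean, P, size=5000)
flowed = particle_flow(prior_particles, prior_mean, P, H, R, z)

# Kalman update as a reference for the linear-Gaussian case.
K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)
kalman_mean = prior_mean + K @ (z - H @ prior_mean)
kalman_cov = (np.eye(2) - K @ H) @ P

print("flowed mean:", flowed.mean(axis=0))
print("Kalman mean:", kalman_mean)
```

In this linear-Gaussian setting the flowed particle cloud should approximate the Kalman posterior up to sampling and Euler discretisation error, with no weighting or resampling step involved.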
Y. Liu—This work was supported by the EPSRC Programme Grant S3A: Future Spatial Audio for an Immersive Listener Experience at Home (EP/L000539/1), the BBC as part of the BBC Audio Research Partnership, the China Scholarship Council (CSC), and the EPSRC grant EP/K014307/1 and the MOD University Defence Research Collaboration in Signal Processing.
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Liu, Y., Wang, W., Chambers, J., Kilic, V., Hilton, A. (2017). Particle Flow SMC-PHD Filter for Audio-Visual Multi-speaker Tracking. In: Tichavský, P., Babaie-Zadeh, M., Michel, O., Thirion-Moreau, N. (eds) Latent Variable Analysis and Signal Separation. LVA/ICA 2017. Lecture Notes in Computer Science, vol 10169. Springer, Cham. https://doi.org/10.1007/978-3-319-53547-0_33
DOI: https://doi.org/10.1007/978-3-319-53547-0_33
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-53546-3
Online ISBN: 978-3-319-53547-0
eBook Packages: Computer Science (R0)