Autonomous Robots, Volume 32, Issue 4, pp 351–368

PLISS: labeling places using online changepoint detection

  • Ananth Ranganathan


A shared vocabulary between humans and robots for describing spatial concepts is essential for effective human-robot interaction. Towards this goal, we present a novel technique for place categorization from visual cues called PLISS (Place Labeling through Image Sequence Segmentation). PLISS differs from existing place categorization systems in two major ways: it inherently works on video and image streams rather than single images, and it can detect “unknown” place labels, i.e. place categories that it does not know about. PLISS uses changepoint detection to temporally segment image sequences, and the resulting segments are then labeled. Changepoint detection and labeling are both performed inside a systematic probabilistic framework. Unknown place labels are detected by using a probabilistic classifier and keeping track of its label uncertainty. We present experiments and comparisons on the large VPC dataset. We also demonstrate results using models learned from images downloaded from Google’s image search.
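To make the idea of online changepoint detection concrete, the following is a minimal sketch of a generic Bayesian run-length filter on a one-dimensional signal. It is not the PLISS model: the Gaussian observation model with known variance, the constant hazard rate, the synthetic signal, and all names (`map_run_lengths`, `hazard`, `obs_var`, `prior_mean`, `prior_var`) are assumptions made purely for illustration, whereas PLISS operates on image features.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a univariate Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def map_run_lengths(data, hazard=0.02, obs_var=1.0, prior_mean=0.0, prior_var=100.0):
    """Return the most probable run length after each observation.

    Illustrative assumptions: Gaussian observations with known variance
    `obs_var`, a Gaussian prior on each segment mean, and a constant
    probability `hazard` of a changepoint at every step.
    """
    run_probs = [1.0]                      # P(run length = r), starting at r = 0
    params = [(prior_mean, prior_var)]     # posterior over the mean, per run length
    result = []
    for x in data:
        # Predictive probability of x under each run-length hypothesis.
        pred = [gaussian_pdf(x, m, v + obs_var) for m, v in params]
        joint = [p * q for p, q in zip(run_probs, pred)]
        # Either a changepoint occurs (run length resets to 0) or the run grows.
        changepoint_mass = hazard * sum(joint)
        growth = [(1.0 - hazard) * j for j in joint]
        run_probs = [changepoint_mass] + growth
        total = sum(run_probs)
        run_probs = [p / total for p in run_probs]
        # Conjugate Gaussian update of the segment-mean posterior with x.
        updated = []
        for m, v in params:
            post_var = 1.0 / (1.0 / v + 1.0 / obs_var)
            post_mean = post_var * (m / v + x / obs_var)
            updated.append((post_mean, post_var))
        params = [(prior_mean, prior_var)] + updated
        result.append(max(range(len(run_probs)), key=run_probs.__getitem__))
    return result


# Synthetic signal (hypothetical): the level jumps from about 0 to about 5 at index 30.
signal = [0.3 * math.sin(i) for i in range(30)]
signal += [5.0 + 0.3 * math.sin(i) for i in range(30, 60)]

run_lengths = map_run_lengths(signal)
# Index at which the current (latest) segment is estimated to have started.
segment_start = len(signal) - run_lengths[-1]
```

A sharp drop in the most probable run length marks the start of a new segment; in a place-labeling setting, each such segment would then be passed to a classifier for labeling.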


Place categorization · Semantic mapping · Computer vision · Bayesian · Probabilistic modeling · Place recognition

Supplementary material

(MOV 8.3 MB)



Copyright information

© Springer Science+Business Media, LLC 2012

Authors and Affiliations

  1. Honda Research Institute USA, Inc., Mountain View, USA
