Spatially-dependent Bayesian semantic perception under model and localization uncertainty

Abstract

Semantic perception can provide autonomous robots operating under uncertainty with a more efficient representation of their environment and a better ability to establish correct loop closures than geometric features alone. However, accurate inference of semantics requires measurement models that correctly capture properties of semantic detections such as viewpoint dependence, spatial correlations, and intra- and inter-class variations. Such models should also gracefully handle open-set conditions when they are encountered, keeping track of the resulting model uncertainty. We propose a method for robust visual classification of an object of interest observed from multiple views in the presence of significant localization uncertainty, classifier noise, and possible dataset shift. We use a viewpoint-dependent measurement model to capture viewpoint dependence and spatial correlations in classifier scores, and show how to use it in the presence of localization uncertainty. Assuming a Bayesian classifier that provides a measure of uncertainty, we show how its outputs can be fused in the context of the above model, allowing robust classification under model uncertainty when novel scenes are encountered. We present a statistical evaluation of our method both in synthetic simulation and in a 3D environment where rendered images are fed into a deep neural network classifier. We compare against baseline methods in scenarios of varying difficulty, showing improved robustness of our method to localization uncertainty and dataset shift. Finally, we validate our contribution with respect to localization uncertainty on a dataset of real-world images.
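The fusion scheme the abstract describes — sequentially updating a class posterior from multi-view classifier scores, with a viewpoint-dependent likelihood marginalized over an uncertain camera pose — can be illustrated with a minimal sketch. Everything here is a simplifying assumption for illustration: the Gaussian form of the score likelihood, its angle-dependent mean, and the sample-based pose marginalization all stand in for the learned models of the paper and are not its actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def score_likelihood(z, c, theta):
    """Hypothetical viewpoint-dependent score likelihood p(z | c, theta).

    Classifier scores for class c are modeled as Gaussian with a mean that
    varies with the viewing angle theta (an assumed, illustrative form).
    """
    mean = (0.8 if c == 0 else 0.4) + 0.1 * np.cos(theta)
    sigma = 0.1
    return np.exp(-0.5 * ((z - mean) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

def update_posterior(prior, z, pose_samples):
    """One Bayesian update of the class belief from a single view.

    Localization uncertainty is handled by marginalizing the viewpoint:
    the likelihood is averaged over samples from the pose belief.
    """
    posterior = np.array([
        prior[c] * np.mean([score_likelihood(z, c, th) for th in pose_samples])
        for c in range(len(prior))
    ])
    return posterior / posterior.sum()

# Two candidate classes with a uniform prior; fuse three noisy views.
belief = np.array([0.5, 0.5])
observations = [0.85, 0.9, 0.8]  # classifier scores from successive viewpoints
for z in observations:
    # Samples from the (noisy) pose belief for the current viewpoint.
    pose_samples = rng.normal(loc=0.3, scale=0.2, size=50)
    belief = update_posterior(belief, z, pose_samples)

print(belief)  # class belief after fusing all views; class 0 dominates
```

Averaging the likelihood over pose samples (rather than committing to a point estimate of the pose) is what keeps the update consistent under localization uncertainty: views whose likelihood is sharply viewpoint-dependent contribute less evidence when the pose is poorly known.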





Corresponding author

Correspondence to Yuri Feldman.


This work was partially supported by the Israel Ministry of Science & Technology (MOST)


About this article


Cite this article

Feldman, Y., Indelman, V. Spatially-dependent Bayesian semantic perception under model and localization uncertainty. Auton Robot (2020). https://doi.org/10.1007/s10514-020-09921-0


Keywords

  • Semantic perception
  • Object classification and pose estimation
  • SLAM